Real English: March 2015

Tuesday, 31 March 2015

Translation quality measurement in practice

Riccardo Schiaffino, Aliquantum

Franco Zearo, Lionbridge Technologies

Abstract: This paper provides an overview of the Translation Quality Index (TQI), a measurement methodology that can be used as a reliable indicator of translation quality. The authors have been developing the TQI methodology for the past five years. The TQI was first implemented for commercial use in 2004.

1. TRANSLATION QUALITY MEASUREMENT

1.1 How did you start working on translation quality measurement?

We have known each other for years—we both graduated in translation from the same university and currently live in the same area—but we had never worked together. In 2000, we were both giving presentations at the ATA Conference in Orlando, Florida. We started to talk about how people generally assumed that translation quality could not be measured. One thing led to another, and we decided to see if we could find a way to measure the quality of translation as a basis for process improvement.

In the following three years, we developed our research and gave three presentations on how to measure and control translation quality. At the same time, we applied our work at the companies for which we worked: Lionbridge in Franco’s case and J.D. Edwards in Riccardo’s. Riccardo’s career at J.D. Edwards ended on December 31, 2003, following PeopleSoft’s acquisition. However, he continued researching translation quality measurement, developing tools to help the assessment and measurement of translation quality, and developing what we called the Translation Quality Index, or TQI. In the meantime, Franco proceeded to implement the translation quality measurement methodology at Lionbridge. The spreadsheet that Lionbridge uses to measure translation quality is the result of our joint collaboration.

1.2 Why try to measure translation quality?

Some people say, “You cannot manage what you cannot measure.” Applied to translation, this means that without some means to assess the quality of translation, it is not possible to improve translation quality, nor is it possible to know if the translation quality is good; and, if it is good, how to keep it that way.

1.3 Is it possible to measure translation quality?

We believe that it is possible to measure translation quality, although perhaps not directly: When measuring translation quality, we really measure the incidence of various types of errors and defects in the translated material; for example, errors of terminology, grammar, spelling, meaning, and others. Therefore, a good translation is one in which fewer errors are made.

Experienced translators would summarize the criteria for recognizing a poor translation as follows: “I know it when I see it” (Note 1). However, this simplistic approach is not adequate to meet the demands of today’s fast-paced business environment. As with many business processes where the desired outcome is a product or service, quality measurements are not only possible, but necessary. Without objective ways to measure the quality of our work, we are left at the mercy of fickle evaluations by lay people who can be highly subjective and not entirely fair.

We believe it is the translation profession’s responsibility to develop criteria that constitute an objective and fair evaluation of translation quality. Having said that, we heed the ATA’s warning: “Although the use of points may impart a certain impression of objectivity, it is in truth still subjective” (Doyle, 2003).

1.4 Why measure errors when measuring translation quality?

One important thing to consider is that the assessment of translation quality should be as objective as possible. What I like and what you like may be very different, but we should have some means to agree on certain standards.

We believe it is easier to agree on what constitutes an error rather than on what constitutes “quality” in the abstract, and that an important factor in quality is the absence of errors.

We also believe that summarizing all of the error points in a single index value will help us to synthesize the translation quality of a given text. Moreover, we can use statistical methods to determine if a translation process is in statistical control, if special causes are present, or even if we are improving our translation process.
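As a rough illustration of what "statistical control" could mean for a series of index values, the sketch below computes the limits of an individuals (XmR) control chart and flags scores that fall outside them. The function names and the sample scores are made up for illustration; the paper does not say which control chart the authors use, so treat the choice of chart as an assumption.

```python
from statistics import mean

def control_limits(scores):
    """Return (center, lower, upper) for an individuals (XmR) control chart."""
    if len(scores) < 2:
        raise ValueError("need at least two scores")
    center = mean(scores)
    # Moving range: absolute difference between consecutive scores.
    moving_ranges = [abs(b - a) for a, b in zip(scores, scores[1:])]
    # 1.128 is the standard d2 constant for moving ranges of two points.
    sigma = mean(moving_ranges) / 1.128
    return center, center - 3 * sigma, center + 3 * sigma

def out_of_control(scores, baseline):
    """Indices of scores outside the limits computed from the baseline."""
    _, lower, upper = control_limits(baseline)
    return [i for i, s in enumerate(scores) if s < lower or s > upper]

# Illustrative numbers only, not measured results.
baseline = [90, 92, 88, 91, 89, 90]
new_scores = [91, 70, 89]
print(out_of_control(new_scores, baseline))  # [1]
```

A score outside the limits suggests a special cause worth investigating, while a sustained shift of the center line over time would be one way to see whether the process is improving.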

1.5 Do you believe there is one “ideal” translation process to ensure the best possible quality level?

The process does not really matter as long as it yields the desired result. We believe that the very purpose of translation measurement is to obtain useful information for benchmarking the relative merits of various translation processes.

The real question then is, “What is the most efficient process in terms of quality versus cost?” We believe that the ingredients of good quality translations are fairly reasonable, but very seldom found all together. These ingredients include the following:

• Good translators with a sound linguistic and specific technical background
• Detail-oriented editors and knowledgeable proofreaders
• Thorough terminology work up front
• Sufficient time to provide a good translation
• Meaningful feedback and support from the customer

1.6 How does translation quality measurement differ from other methods of translation quality assessment?

Over the past 30 years, many methods of evaluating translation quality have been developed and proposed. Malcolm Williams (2004) classifies these methods into two categories: quantitative-centered systems and argumentation-centered systems. Williams characterizes quantitative-centered methods by some method of error counting, while argumentation-centered methods take a more holistic approach. Each method has its advantages and disadvantages, which we cannot elaborate on here. Suffice it to say, the advantage of the quantitative-centered methods is that they lend themselves to quantifying errors and, therefore, make measurements possible.

2. THE TRANSLATION QUALITY INDEX (TQI)

2.1 What is the TQI methodology?

The TQI methodology (along with similar initiatives such as the LISA QA Model and SAE J2450) is a quantitative-based method of translation quality assessment. It measures the number and type of errors found in a text and calculates a score, or TQI, which is indicative of the quality of a given translation.

The distinctive traits of the TQI methodology are as follows:

• Translation Quality Index. The Translation Quality Index is a number that is indicative of the quality of a given translation. It is obtained by the rigorous application of a quality assurance methodology.

The Translation Quality Index attributes a value to a translated text, with 100 being an “error-free” translation. It is based on the number of error points in a given text or sample. Negative values are possible. The TQI is analogous to a temperature scale. We all have subjective interpretations of “cold,” “warm,” and “hot.” The use of a temperature scale (Fahrenheit, Celsius, or Kelvin) makes it possible to move from subjective perceptions to objective measurements.

• Separation between error type and severity. There are no pre-assigned penalties for the different error categories. Each error can be marked as critical, major, or minor, depending on its consequences. Sometimes, an error can be classified in different ways; for example, if I type “car” instead of “cab”, it could be classified as a mistranslation, a terminology error, or even a typo. While a precise classification of translation errors might be of interest in an academic setting, such as translation training programs, it is often unnecessary in a business environment.

• Strict criteria for the severity levels of errors. A TQI measurement should be objective, reproducible, and repeatable. To achieve these criteria, the evaluator has to follow certain rules when marking errors.

2.2 What are error points and how do error points differ from errors?

Using a typo as an example: if we find five typos, we count five errors. That is a rather simple form of error measurement, but not all errors are equal. There is a difference between a typo on the front cover of a manual and the same typo in a footnote. There are also typos that alter the meaning of a word, and typos that do not lead to confusion; for example, the word “*atttention” spelled with three ’t’s. This observation prompts us to assign different weights to errors depending on their consequences. In our previous example, we can decide to give minor typos a weight of “1,” and major typos a weight of “5,” “100,” or any other value we choose. We call these weights “error points.”
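The difference between counting errors and counting error points can be sketched in a few lines. The weights below (1, 5, 10), the names MarkedError, error_points, and quality_index, and the normalization per 1,000 words are all assumptions made for illustration; the paper states only that 100 means error-free, that the index is based on error points, and that negative values are possible, without giving the actual formula.

```python
from dataclasses import dataclass

# Hypothetical weights: the source leaves the actual values open.
SEVERITY_POINTS = {"minor": 1, "major": 5, "critical": 10}

@dataclass(frozen=True)
class MarkedError:
    category: str  # e.g. "typo", "terminology"; descriptive only
    severity: str  # "minor", "major", or "critical"

def error_points(errors):
    """Sum the weights of all marked errors; the category carries no penalty."""
    return sum(SEVERITY_POINTS[e.severity] for e in errors)

def quality_index(errors, word_count, per_words=1000):
    """Assumed formula: 100 minus error points per `per_words` words."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return 100 - error_points(errors) * per_words / word_count

errors = [
    MarkedError("typo", "minor"),
    MarkedError("typo", "minor"),
    MarkedError("terminology", "major"),
]
print(len(errors))                  # 3 errors
print(error_points(errors))         # 7 error points
print(quality_index(errors, 1000))  # 93.0
print(quality_index(errors, 50))    # -40.0
```

Because the penalty depends only on severity, an evaluator who is unsure whether “car” for “cab” is a typo or a terminology error still arrives at the same score, and a short sample with many error points drops below zero.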

2.3 What were the difficulties when you started to put the TQI into practice?

The purpose of the TQI and its ancillary tools is to make translation assessment as objective as possible. However, when we started to use the TQI tool, we realized that how we configured the score was not always a true representation of the translation quality. It is easy to form an idea about how good or bad a translation is and then semiconsciously try to convince oneself that a major error is minor, or a minor error is only a “preference,” so as not to push the TQI below the threshold that would make the translation fail. Also, accuracy errors are difficult to evaluate when there is a slight loss in meaning. Even grammatical errors are sometimes not as straightforward as one would think. Language, after all, is not a precise science.

2.4 What makes a good evaluator?

A good evaluator must be able to be as objective as possible. He or she must be able to distinguish between factual, tangible errors and stylistic preferences. We all have our pet peeves when it comes to translation choices. An objective evaluator realizes that he or she might have translated a sentence differently, but that the version chosen by the original translator is also acceptable.

You can roughly classify evaluators into purists and descriptivists. The purists are those who like to think of language in terms of how it ought to be used. Descriptivists, on the other hand, take into account how people use the language in their daily lives. Each point of view has its pros and cons, and they each lead to very different interpretations of what is considered “right” and “wrong.”

Moreover, if you give the same translation to two different evaluators, chances are that they will find a different number of errors or mark the same errors differently. A better solution would be to have the translation evaluated by a group of evaluators, in the same way that gymnastics competitions rely on a panel of judges. Unfortunately, this solution proves to be too expensive in most commercial settings.

What would be helpful is a certification program for evaluators, possibly sponsored by an independent, not-for-profit organization such as the ATA. This not-for-profit organization might create standards regarding error classification, severity levels, error points, and others.

2.5 How do you distinguish between errors and stylistic preferences?

Bruno Osimo (2004) says that translation is a process with one entry point and multiple exit points. As discussed earlier, there is more than one way to translate a given sentence, each version being roughly equivalent and any differences being a matter of style and personal preference.

By definition, stylistic preferences are not errors and are ignored in the computation of the quality score. Therefore, it is necessary to establish clear rules that define what is an error and what is not an error.

We have developed a three-pronged rule to determine whether a marked error is preferential or not. Basically, the evaluator has to answer the following three questions:

1. Is it grammatically correct?
2. Is the translation accurate?
3. Is the translation compliant with the glossary, style guide, guidelines, and client instructions?

Answering the first two questions is not as easy as it might seem. In the case of grammatical correctness, for example, some languages might have authoritative language bodies; for example, Real Academia de la Lengua Española in Spain; Académie française in France; Nederlandse Taalunie in The Netherlands, and so forth. Other languages that lack such language authorities, such as American English, might have to rely on commonly accepted language conventions as described in authoritative reference books; for example Merriam-Webster’s Dictionary, The Chicago Manual of Style, and others. A third group of languages does not have established language conventions, as is the case with many languages in India. In such cases, it is important to develop glossaries and style manuals.

Evaluating the degree of accuracy is another challenging task. We have developed flow charts similar to those created by the ATA for test evaluation for certification purposes. The intent is to see if there have been significant deviations in meaning.

The last question serves the business purpose of delivering quality that conforms to the client’s specifications. This is generally more straightforward: Either the term is in the translation glossary or it is not. Either the translators followed the style guide and the instructions, or they did not.
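The three questions can be read as a simple decision: if the answer to all three is yes, the marked item is a stylistic preference and is left out of the score. The sketch below encodes that reading. The names Finding and is_preferential are hypothetical, and the three answers still come from the evaluator's judgment; the code only combines them.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    """An evaluator's answers to the three questions for one marked item."""
    grammatically_correct: bool
    accurate: bool
    compliant: bool  # glossary, style guide, guidelines, client instructions

def is_preferential(finding):
    """True if the item passes all three questions, so it is not an error."""
    return (
        finding.grammatically_correct
        and finding.accurate
        and finding.compliant
    )

# A correct, accurate rendering that the evaluator would have worded differently.
print(is_preferential(Finding(True, True, True)))   # True
# An accurate, grammatical rendering that ignores the client's glossary.
print(is_preferential(Finding(True, True, False)))  # False
```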

2.6 Can the TQI help in assessing the quality of machine translation?

Absolutely. Some argue that human-based evaluations are too subjective and that MT should therefore not be evaluated using human-based methods. We do not agree. The TQI is a sort of Turing test. The Turing test was developed to indicate whether a machine was “intelligent” by testing its capability to perform human-like conversation. If a user cannot tell the difference between a text translated by a human and one by MT, then we could say that the two texts are equivalent. The TQI can help with this evaluation. If we agree that a TQI score of 80 or above is the mark of a good translation, it does not matter which localization process we used to obtain the score. In our experience, raw MT outputs have TQI scores below zero. Processes that combine MT with human post-editing can elevate the TQI scores to levels that are more acceptable.

2.7 Is there anything that the TQI methodology cannot measure?

Yes. In our experience, there are a couple of cases where relying on the TQI methodology would be inappropriate.

Because the TQI methodology is designed to measure tangible, factual errors, it shows its shortcomings when it comes to evaluating so-called “literal” or “word-for-word” translations. A literal translation might satisfy all three prongs of the preferential rule (grammaticality, accuracy, and compliance) and still be regarded as a poor translation.

Another case where the TQI methodology proves to be ineffective is when a high degree of creativity is expected on the translator’s part, which is often the case with translations for marketing and advertising. In these types of text, translators and copyeditors might have a certain degree of freedom. It is an acceptable practice to deviate from the source text as long as the translator maintains the core message. In contrast, the TQI system penalizes deviations from the source text as accuracy errors, because in other circumstances a translator is not allowed to deviate in this way.

In our experience with marketing texts, the translation might contribute 60-75% of the final version, the remainder coming from additions, deletions, and textual changes as deemed appropriate.

3. ADDITIONAL RESOURCES

Copies of our previous presentations, a translation quality web blog, and other materials can be found on our website, www.translationquality.com.

4. NOTES

1. We are referring to U.S. Supreme Court Justice Stewart’s remark about the difficulty in finding an objective definition for an obscene motion picture. In JACOBELLIS v. OHIO, 378 U.S. 184 (1964) he remarked:

“I shall not today attempt further to define the kinds of material I understand to be embraced within that shorthand description; and perhaps I could never succeed in intelligibly doing so. But I know it when I see it, and the motion picture involved in this case is not that.”

5. REFERENCES

1. Doyle, Michael Scott (2003). “Translation Pedagogy and Assessment: Adopting ATA's Framework for Standard Error Marking”, in The ATA Chronicle, November/December 2003.

2. Osimo, Bruno (2004). Traduzione e qualità: la valutazione in ambito accademico e professionale, Hoepli, Milano, p. 25.

3. Williams, Malcolm (2004). Translation Quality Assessment: An Argumentation-Centred Approach, University of Ottawa Press, Ottawa.

Translation difficulties from Arabic to English

Translators need an in-depth knowledge of two languages, and need, ideally, to be familiar with the subjects of the texts they are translating. This is especially true for translation in such fields as technology, science, law and medicine. In fact, many translators specialise in a particular field or fields in which they have expertise.
When translating literature, poetry, songs and similar material translators need to be familiar not only with the two languages involved, but also with the cultures of the people who speak them. One problem literary translators face is what to do with culture-specific references – they could translate them literally and provide footnotes or other explanations for readers not familiar with the source culture, or try to find equivalents specific to the target culture. Translating poems and songs is particularly challenging as not only do you need to translate the words, but you often need to find ones that rhyme as well.
Each language describes the world in a different way. For example, the colour spectrum is a continuum with no clear boundaries between the colours, and is divided up differently for different languages. Greek has separate words for light blue and dark blue, while other languages, such as Welsh and Japanese, have words that can mean blue or green, or something in between.
In English there are many verbs of motion that describe the manner of motion – he bounced out of the house and galloped up the street, for example. In other languages such as French and Spanish, you could add a phrase to a simple motion verb, like to go or to run, to describe the motion, but this would feel clumsy and unnatural, so translators would normally omit such descriptions.
Names of people and places are another translation challenge, especially if you’re translating between two very different languages such as Chinese and English. Do you provide English versions of the Chinese names, which are likely to be unfamiliar and difficult to remember for your readers, or do you just transliterate them? If you do the latter, do you put the surname first, as is the custom in Chinese, or last, as in English?
Ideally a translation will read as if it was originally written in the target language. This is hard to achieve, but certainly possible.

As you can see, translating text from one language into another is not merely difficult; it is genuinely complex. The two languages involved may not have equivalent words with the same meaning. As a result, translators often have to describe the context explicitly, which usually leads to longer texts.

Languages are used to express thoughts and ideas among people, which is what makes translation so important. Always make sure that the translator is not translating word for word, because this can cause many problems when the meanings of the original and target documents are compared after the work is finished. A single word can be used in various contexts, and a word-for-word translation may pick a meaning that does not match the original context. This applies to almost all translation, including Thai translation.

The process is very time-consuming, especially when the task is to translate a very large document. To save time, translators turn to translation software to help them complete the task without sacrificing quality. This is possible nowadays because a great deal of translation software is available, especially for Russian translation jobs.

Translation software has many shortcomings, and at times it does not work well for the text at hand. Software that translates word for word performs poorly, because the result ends up in a context different from the original. Problems also occur when the software cannot recognize the source text at all.

Cultural differences and traditional customs remain the main barrier in translating a document from one language to another. Because of them, a word in one language might have no equivalent in the other. When the translator cannot find an equivalent, an alternative has to be found. If you deliver the work without manual proofreading, you can land in trouble, because the discrepancies are clearly visible when the translation is compared with the original text.

Search engines are very helpful in translation tasks. Most of them incorporate translation software and can find the best available results on the internet in the shortest possible time. This can help the translator complete the job accurately and quickly. This applies to many language pairs, including Vietnamese translation.

The results obtained this way are far more suitable, although at times the text is phrased indirectly and uses complicated words, so the reader must think hard to grasp its meaning.

Despite all the available technology, you should give preference to a skilled translator. This helps you get an appropriate and correct text. It also shows the sharpness of the human mind, compared with machines and technology, in completing the task with ease.

Problems That Can Occur in Translation:

Translating a foreign language from a culture that existed thousands of years ago can be a difficult task. Can you imagine someone from biblical times attempting to translate a relatively simple phrase?
Phrase: I was so engrossed in my work, I ate a McDonald’s burger while continuing to rough out this draft on my computer.
Biblical Translation: 1. Not being froward in his ways, but working with diligence and wisdom unto the Lord, so that they may rejoice at the time of harvest

1. Cultural differences:
Actual phrase: In preparation for battle, the men braided the hair of his partner, also exchanging a pink handkerchief for luck.
Mistranslated as: In preparation for battle, the men pulled out braids of hair from a rugged companion, also exchanging the bloodied handkerchief for luck.
2. Misunderstanding a metaphor:
Actual phrase: When we meet at dawn, I will crush them like an egg.
Mistranslated as: When they sat down for breakfast, they had their eggs scrambled.
3. Literalizing a metaphor:
Actual phrase: The carpenter had a roll in the hay, finding the screw to jump-start his day.
Mistranslated as: The carpenter found the screw that had fallen in the hay and was able to get to work at the start of the day.
4. Incompetence:
Actual phrase: Yea, I say unto you, with love in your hearts, and resolution in your minds, embrace your neighbor and make peace upon the Earth.
Mistranslated as: Listen up! With resolution, I say, go out and perform open-heart surgery on your neighbor, lest ye die, and the Earth become as a cesspool.
5. Words with multiple meanings:
This is difficult to demonstrate with English to English examples. So let’s pretend that my example phrase is in Inuit. Many of us know that the indigenous people of Alaska have about 67 words for snow, some of which are not translatable.
Actual Phrase in Inuit: There is a clentos (made up word in Inuit for snow) on the way.
Mistranslated as: There is snow on the way.
Better mistranslation: There is a relatively mild snowstorm on the way, expected within the next 6 to 12 hours. Expect temperatures in the -10 to -20 F range, with dry light flakes of snow that are blown in from the Northeast, heralding even colder weather to come within several days. Expect winds of 5 to 15 mph, not a lot of drifting, and expected accumulation of 4 to 6 inches.
6. Words that can’t be translated:
These are words for which a language has no corresponding word. For example, the Inuit have no corresponding word for the IRS (Internal Revenue Service).
English Phrase: He has an appointment with the IRS this afternoon.
Inuit mistranslation: There is a clentos on the way.