Saturday, May 29, 2021

Marking Translations Positively

From time to time we delve into the archive of this blog to revive a post that deserves not to be forgotten amid the mass of hundreds. Here is one such post.

FRIDAY, AUGUST 3, 2012

Marking Positively: How to Score Natural Translations

 

This post is addressed particularly to researchers, but it's relevant too for teachers of translation. Note that Natural Translation (NT) is used here as a cover term for both Natural Translation and Native Translation.


At the Forli conference in May (enter forli in the Search box), I noticed that some people are still using the old subtractive scoring method to rate NT.

What is the subtractive method? It means starting from 100 points and knocking off a point, or several points, for each mistake of any kind; typically a point or two for minor errors of content or expression and up to five points for major ones. The 'pass mark' is usually expressed as a positive percentage, but it's really a 'failure score'. That's how students' written translations are marked, and likewise the examinations of professional associations such as the Canadian one to which I belong. It can also be used for interpretations, especially if they're transcribed.

Two objections can be raised. The first is a didactic one: that the approach is negative and therefore discouraging. True, mathematically speaking, -30% of mistakes is equivalent to +70% correct, but the psychological effect is different. Anyway, it's not so important as the second objection, which is that the approach reinforces 'nit-picking' by the markers, because small details are allowed to affect the score significantly. I still squirm at a sequence in an old film about an interpretation exercise for European Commission interpreters (see References) in which a student is berated in front of the other students for his translation of a single word.

When evaluating NT, we need to take the opposite approach. Although mistakes are of great interest insofar as they reveal the limitations and the 'pathology' of NT, in NT research our primary interest should be in what subjects can translate and not in what they can't. A score of only 40% because of numerous distortions and omissions would probably entail failure for an Expert or Professional translator or a translation school student; but for a Natural Translator it represents a non-negligible translating ability and we should focus on it and analyse what that 40% consists of.

How can we build a positive scoring method?

In the 1990s I became involved in the design of tests for candidates who wanted to work as community interpreters for public services in Ontario, Canada. These became known as the CILISAT tests and are still in use. The Government of Ontario funded the necessary research. The candidates were almost always Native Interpreters, because the pay was too low to attract Professional Experts and because the languages were not taught in Canada. We decided we needed a test instrument that would be better suited to Native, i.e. untrained, Interpreters than those used by the translation schools and in the profession. So we turned to a method called propositional analysis. It's used by psychologists among others, and in fact I'd been introduced to it by the late David Gerver, who was one of the pioneer researchers on interpreters and was also a clinical psychologist. The form of it we used can be described this way:

"To analyze the text, propositional analysis – a description of the text in terms of its semantic content – is used. The units of analysis are propositions, or units of meaning containing one verbal element plus one or more nouns. The corresponding units are then selected on the basis of meaning rather than structure."

In practice this meant that we broke down the scripts for the interpretation tests into simple, single-clause sentences representing propositions and then awarded points according to whether the meaning of each proposition as a whole was conveyed in translation: zero points for an omission or a meaning contrary to that of the proposition; 1 point for a meaning conveyed but not clearly or not completely; 2 points for a complete and true rendering. There was a weighting that distinguished between important and unimportant propositions. This scale was solely for meaning. Other factors, for example correct language, were scored separately and globally, not proposition by proposition.

For example, the statement, "At around 6 o'clock I saw a blue sports car waiting on the other side of the road," might be broken down into:

The time was approximately 6 pm.

I saw a car.

The car was blue.

The car was a sports car.

The car was waiting.

The car was on the other side of the road.

A paraphrase like, "I seed a sport car stopping at the kerb of our street before supper" would score 7 points for informational meaning before being weighted for importance. (Work it out! 1+2+0+2+1+1.)  The maximum possible points varied with each script. Small language mistakes like "seed" were relegated to a separate evaluation.
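For readers who want to see the arithmetic spelled out, here is a minimal sketch in Python of the proposition scoring applied to this example. The 0/1/2 marks are the ones worked out above; the importance weights are hypothetical placeholders, since the actual CILISAT weighting scheme is not reproduced in this post.

from dataclasses import dataclass

@dataclass
class Proposition:
    text: str       # single-clause proposition from the source script
    score: int      # 0 = omitted or contrary, 1 = partial or unclear, 2 = complete and true
    weight: float   # hypothetical importance weight (the post does not give the real ones)

propositions = [
    Proposition("The time was approximately 6 pm", 1, 1.0),
    Proposition("I saw a car", 2, 1.0),
    Proposition("The car was blue", 0, 0.5),
    Proposition("The car was a sports car", 2, 0.5),
    Proposition("The car was waiting", 1, 1.0),
    Proposition("The car was on the other side of the road", 1, 1.0),
]

raw_total = sum(p.score for p in propositions)                  # 7, as in the worked example
weighted_total = sum(p.score * p.weight for p in propositions)  # score weighted for importance
max_weighted = sum(2 * p.weight for p in propositions)          # best possible weighted score

print(f"Raw meaning score: {raw_total} / {2 * len(propositions)}")
print(f"Weighted meaning score: {weighted_total} / {max_weighted}")

As in the test itself, quality of target language would be assessed separately and globally rather than folded into the proposition-by-proposition meaning score.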

References
Guadalupe Barrera Valdes and Manuel Rosalinda Cardenas. Constructing matching tests in two languages: the application of propositional analysis. NABE: The Journal for the National Association for Bilingual Education, vol. 9, no. 1, pp. 3-19, 1984. There's an abstract here.

Roda P. Roberts. Interpreter assessment tools for different settings. In R. P. Roberts et al. (eds.), The Critical Link 2: Interpreters in the Community, Amsterdam, Benjamins, 1999. Most of it is here.

David Gerver. A psychological approach to simultaneous interpretation. Meta, vol. 20, no. 2, pp. 119-128, 1975. "A slightly altered version of a paper presented at the 18th International Congress of Applied Psychology in Montreal in July 1974". The text is here.

André Delvaux (director). Les Interprètes. Brussels: Commission of the European Communities. c1975. 16 mm film. c15 mins.

Comments

The post drew comments. Here are a couple of them.

To those of you who have commented on the post about positive marking...

I ought to have acknowledged that even before I heard about propositional analysis from David Gerver, I'd learnt about positive marking from Daniel Gouadec, a well-known French translation teacher who came to teach for a couple of years at the University of Ottawa in the late seventies (see References). He was working at the time on a marking system for the Canadian government Translation Bureau's quality assessment section, but I don't know whether they ever used it.

In reply to SEO Translator: the subtractive method is usually applied to short texts, say 300-500 words. For purposes of comparison, texts of about the same length as one another are used; and also, obviously, of the same level of difficulty. The 'pass mark' varies according to the expectations of the markers or examiners, taking account of the purpose of the exercise (professional examination, translation school assignment, etc.), the institution, the difficulty of the text, the level of the examinees, and so on. I've seen pass marks of 60% to 90%. Logically, tests for Expert Translators should have a high pass mark.

In the CILISAT tests, using positive scoring, we actually had two pass marks: one for 'ready to work' and a lower one for 'shows promise but needs training'. As I recall, they were 80 and 60 respectively, but that was after combining with the separate assessment for quality of target language. I haven't thought about automating these or other scorings. Possibly.
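If anyone does want to experiment with automating it, here is a purely hypothetical sketch in Python. The post does not record how the meaning score and the separate target-language assessment were combined, so the 70/30 weighting and the 80/60 cut-offs below are illustrative assumptions, not the CILISAT procedure.

def classify(meaning_pct: float, language_pct: float, meaning_weight: float = 0.7) -> str:
    """Combine two percentage scores and map the result to the two pass marks."""
    # The weighting here is an assumption; the post only recalls the final thresholds.
    combined = meaning_weight * meaning_pct + (1 - meaning_weight) * language_pct
    if combined >= 80:
        return "ready to work"
    if combined >= 60:
        return "shows promise but needs training"
    return "not yet ready"

print(classify(meaning_pct=85, language_pct=70))  # -> ready to work, under these assumptions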

1 comment:

  1. I want to express my appreciation for the way you described how to properly score a natural language translation. You made some excellent points, and both scholars and translation teachers can benefit from the strategies you outlined.
