Friday, August 20, 2010


This weekend, a majority of my fellow members of the International Committee on Computational Linguistics (ICCL) will be gathering in Beijing for COLING 2010, the 23rd International Conference on Computational Linguistics. COLING takes place every two years in a different part of the world. It’s to sponsor this prestigious conference that the ICCL exists.

The first COLING I attended was at Stockholm in 1969. The application of computational linguistics that drew me to it was machine translation (MT), then in its infancy. Today there are other major applications that compete with MT at COLING; information extraction is a hot topic. The size as well as the scope of COLING has increased with the years. When I was secretary of COLING 1976, which was held in Ottawa, we had about 100 papers submitted. (Incidentally, it was the first COLING at which the registrations were computerised and the poster produced by computer graphics.) This time there were an amazing 840 submissions!

The computational changes that have taken place in MT since 1969 have been twofold. On the one hand, innovations in its techniques: at first transformational grammar and more recently statistically based MT, in which large databases are searched for previous translations of the same or very similar pieces of text. And on the other hand, the breathtaking advances of computers in general. In fact the use of the large databases needed for statistical MT (called translation memories) only became feasible with the advent of cheap, fast mass storage devices in the 1980s. At the time of the Stockholm COLING, we were still using large reels of magnetic tape for storage!
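For readers who like to see the idea in miniature, here is a rough sketch of that kind of lookup: a stored collection of previous translations is searched for the same or a very similar piece of text. Everything in it is an assumption made for illustration: the three-sentence memory, the `lookup` function, the 0.75 threshold, and the use of Python's standard `difflib` as the similarity measure. Real systems work with vastly larger collections and far more sophisticated matching.

```python
from difflib import SequenceMatcher

# Hypothetical toy memory: English segments paired with stored French translations.
TRANSLATION_MEMORY = {
    "The road is closed.": "La route est fermée.",
    "The bridge is closed.": "Le pont est fermé.",
    "Police are investigating.": "La police enquête.",
}


def lookup(segment, memory=TRANSLATION_MEMORY, threshold=0.75):
    """Return (stored_source, stored_translation, score) for the closest
    stored segment, or None if no stored segment reaches the threshold."""
    best = None
    for source, target in memory.items():
        # Character-level similarity between 0.0 and 1.0, ignoring case.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if best is None or score > best[2]:
            best = (source, target, score)
    if best is not None and best[2] >= threshold:
        return best
    return None


if __name__ == "__main__":
    print(lookup("The road is closed."))        # exact match
    print(lookup("The road is closed today."))  # similar segment
    print(lookup("Completely unrelated sentence about weather."))
```

An exact match gives back the stored translation as it stands. A near match gives back the closest stored pair along with its score, which a translator or a program would then have to adapt. Anything too far from the stored material returns nothing.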

In addition, there's been a radical social change. In 1969 MT was envisaged as a professional product that would translate technical documentation or (in the case of Canada) government publications, for clients with funds to pay for it. That function remains operative, but to it has been added a new dimension: MT as a consumer product for DIY translation by anybody, universally available and free. With that comes an enormous increase in the volume of MT. Of course, in 1969 we couldn’t imagine the Internet and its impact.

Mind you, we all know that MT still has serious limitations; I’ve written about that before (March 24; April 2, 4, 6 and 14 posts). There was a case recently in Canada that serves as a lesson.
The RCMP, the national police, being an agency of the Federal Government, is obliged by law to produce all its news releases in both the official languages of the country, English and French. The RCMP in the far western province of British Columbia, where native French speakers number about 1.5% of the population but there are many Francophone tourists, decided to cut time and cost for itself and for the public by letting users have Google Translate do the work through a link on their website. Before long, however, there was a public protest by the French division of the Canadian Broadcasting Corporation about the abysmal quality of the French that was being turned out. Fearing possible legal and political repercussions, the British Columbia RCMP soon backtracked and announced it would send its translations to the government’s human translators again – at a cost, so they said, of $3,000 a day.
This cautionary tale illustrates how naive ill-informed clients may be about MT and about translation in general, but that was so in 1969 too.


COLING 2010.

Chad Skelton. B.C. RCMP removes Google French translator from website. Vancouver Sun, August 3, 2010, as well as numerous other press and media reports.
