What Did You Say?
January 7, 1954: The Georgetown-IBM experiment takes place. Georgetown University and IBM worked together to develop a machine capable of translating Russian into English. The experiment was run on an IBM 701 mainframe computer, which was programmed to use six grammar rules and had a vocabulary of 250 words. The sentences were entered by someone who spoke no Russian. Over 60 Russian statements, Romanized (converted from Cyrillic script to the Latin alphabet), were entered on a wide range of topics. They were concerned with politics, legal matters, mathematics, and scientific subjects. The resulting English translation then appeared on a printer. The experiment was considered a success and encouraged governments to invest in computational linguistics. The project managers claimed that machine translation would be a reality in three to five years.
The dream of easy translation predates computers and was referenced as early as the 1600s. With this experiment, more investment was supposed to bring it to fruition. Instead, machine translation (MT) has continued to fall short. In 1966, the ALPAC (Automatic Language Processing Advisory Committee) report described the many obstacles to successful MT, and funding was dramatically curtailed. The difficulty lies in the way humans use language. There is no one-to-one correspondence from one language to another. Rather, there are nuances held in the phrasing humans use. The translator must understand the nuances of both languages in order to preserve the meaning. It remains beyond our current technology.
Beginning in the late 1980s, when computer processing power increased dramatically, interest began to grow in statistical models for machine translation. The idea stems from information theory and is based on probability distributions. This approach was used in place of rule-based translation, with some benefits but also some problems. Word-based translation had been problematic because of differences in compound words, morphology, and idioms. Some words can be translated into more than one word in a second language, depending on usage. This issue is one of fertility, which in machine translation refers to the number of foreign words each native word produces.
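To make the idea of fertility concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the word alignments are hand-written toy data (English as the native language, French as the foreign one), and the function name `fertilities` is hypothetical. A real statistical system would estimate such counts from a large body of translated text rather than take them as given.

```python
def fertilities(source_words, alignment):
    """Count how many foreign words each native (source) word produces.

    alignment[j] is the index of the source word that produced
    foreign word j. This alignment is assumed to be known here;
    a real system would have to learn it from data.
    """
    counts = [0] * len(source_words)
    for source_index in alignment:
        counts[source_index] += 1
    return list(zip(source_words, counts))


# Toy example 1: "potato" -> "pomme de terre" (one word becomes three).
print(fertilities(["potato"], [0, 0, 0]))

# Toy example 2: "I do not know" -> "je ne sais pas".
# "je" comes from "I", "ne" and "pas" both come from "not",
# "sais" comes from "know", and "do" produces nothing.
print(fertilities(["I", "do", "not", "know"], [0, 2, 3, 2]))
```

In this toy data "potato" has a fertility of three, "not" has a fertility of two, and "do" has a fertility of zero, which is exactly the kind of mismatch that makes word-for-word translation unreliable.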
Hybrid MT (HMT) uses the strengths found in both statistical and rule-based translation methodologies. Even with this more advanced translation, there are problems with turning text of one language into another. A major stumbling block is that many words have more than one definition, so the machine must understand which meaning is being used before translating rather than just using a one-to-one correspondence. Another problem is the use of vernacular or colloquial language. A third major issue is with named entities. When a machine doesn’t recognize a proper noun, the noun can be translated as a common noun or sometimes left out altogether. More than half a century has passed and translators still have jobs today. MT may yet be in our future.
Why does a translator need a whole workday to translate five pages, and not an hour or two? ….. About 90% of an average text corresponds to these simple conditions. But unfortunately, there’s the other 10%.
It’s that part that requires six [more] hours of work. There are ambiguities one has to resolve. For instance, the author of the source text, an Australian physician, cited the example of an epidemic which was declared during World War II in a “Japanese prisoner of war camp”.
Was he talking about an American camp with Japanese prisoners or a Japanese camp with American prisoners? The English has two senses.
It’s necessary therefore to do research, maybe to the extent of a phone call to Australia. – Claude Piron explaining the pitfalls of translation
Also on this day: Zeus’s Lovers – In 1610, the four Galilean moons were discovered.
CQD – What? – In 1904, a new distress signal was called for.
Around the world in more than 80 days – In 1887, Thomas Stevens completed his trip around the world on a bike.
Fire! – In 1950, the Mercy Hospital in Davenport, Iowa caught fire.
Up on the Roof – In 1973, Mark Essex’s killing spree came to an end.