Thursday, October 10, 2019

Artificial Intelligence is helping to translate messages of long-lost languages

There are about 6,000-8,000 languages currently spoken in the world. That's less than a quarter of all the languages people have spoken over the course of human history. Every time a language is lost, so goes that way of thinking, of relating to the world.



While languages change, many of the symbols, and the way words and characters are distributed, stay relatively constant over time. Because of that, you could try to decode a long-lost language if you understood its relationship to a known progenitor language. This insight is what allowed the team, which included Jiaming Luo and Regina Barzilay from MIT and Yuan Cao from Google's AI lab, to use machine learning to decipher the early Greek script Linear B (from 1400 BC) and the cuneiform script of Ugaritic (a language related to early Hebrew), which is also over 3,000 years old.

Linear B was previously cracked by a human - in 1952, it was deciphered by Michael Ventris. But this was the first time the script was deciphered by a machine.

The researchers' approach focused on four key properties related to the context and alignment of the characters to be deciphered - distributional similarity, monotonic character mapping, structural sparsity and significant cognate overlap.
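To make two of those properties more concrete, here is a minimal Python sketch. It is not the researchers' model: the word lists are invented toy strings that are assumed to be already transliterated into a shared alphabet, and the function names `edit_distance` and `match_cognates` are hypothetical. Edit distance stands in for monotonic character mapping (characters are aligned left to right, never crossing), and a one-to-one pairing of words stands in for structural sparsity. The pairing is found by brute force here, whereas the paper's title points to a minimum-cost flow formulation.

```python
from itertools import permutations


def edit_distance(a, b):
    """Levenshtein distance: the cost of a monotonic, left-to-right alignment."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                # drop a character of a
                curr[j - 1] + 1,            # insert a character of b
                prev[j - 1] + (ca != cb),   # match (0) or substitute (1)
            ))
        prev = curr
    return prev[-1]


def match_cognates(lost_words, known_words):
    """Return the one-to-one pairing with the lowest total edit distance.

    Brute force over permutations, so only usable on tiny toy inputs.
    Assumes len(known_words) >= len(lost_words).
    """
    best_cost, best_pairs = None, None
    for perm in permutations(known_words, len(lost_words)):
        cost = sum(edit_distance(l, k) for l, k in zip(lost_words, perm))
        if best_cost is None or cost < best_cost:
            best_cost, best_pairs = cost, list(zip(lost_words, perm))
    return best_cost, best_pairs


# Invented toy vocabulary, not real Linear B or Greek data.
lost = ["pater", "mater", "duo"]
known = ["dyo", "meter", "pateer", "hippos"]

cost, pairs = match_cognates(lost, known)
for lost_word, known_word in pairs:
    print(lost_word, "->", known_word)
```

Each "lost" word is paired with at most one "known" word, and unmatched known words (here `hippos`) are simply left over, which mirrors the idea that only part of the vocabulary has cognates.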


They trained the AI network to look for these traits, achieving the correct translation of 67.3% of Linear B cognates (words of common origin) into their Greek equivalents.

What AI can potentially do better in such tasks, according to MIT Technology Review, is take a brute-force approach that would be too exhausting for humans. It can attempt to translate the symbols of an unknown alphabet by quickly testing them against symbols from one language after another, running them through everything that is already known.

Next for the scientists? Perhaps the translation of Linear A - the ancient Minoan script that nobody has succeeded in deciphering so far.

You can check out their paper "Neural Decipherment via Minimum-Cost Flow: from Ugaritic to Linear B" here.

