More than 60 years after philosopher Ludwig Wittgenstein’s theories on language were published, the artificial intelligence behind Google Translate has provided a practical example of his hypotheses. Patrick Hebron, who works on machine learning in design at Adobe and studied philosophy with Wittgenstein expert Garry Hagberg for his bachelor’s degree at Bard College, notes that the networks behind Google Translate are a very literal representation of Wittgenstein’s work.
Google employees have previously acknowledged that Wittgenstein’s theories gave them a breakthrough in making their translation services more effective, but somehow, this key connection between philosophy of language and artificial intelligence has long gone under-celebrated and overlooked.
Crucially, Google Translate functions by making sense of words in their context. The translation service relies on word2vec, an algorithm created by Google employees that produces “vector representations” for words; in essence, each word is represented as a list of numbers.
For the translations to work, programmers then have to create a “neural network,” a form of machine learning, that’s trained to understand how these words relate to each other. Most words have several meanings (“trunk,” for example, can refer to part of an elephant, tree, luggage, or car, notes Hebron), and so Google Translate has to understand the context. The neural network reads millions of texts, focusing on the two words before and the two words after any given word, so that it can predict a word from the words surrounding it. The artificial intelligence calculates probabilistic connections between each word, which form the coordinates of an impossible-to-imagine multi-dimensional vector space.
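To make the "two words on either side" idea concrete, here is a minimal sketch in Python of how a text might be cut into (context, target) examples before any training happens. The function name `context_pairs`, the sample sentence, and the whitespace tokenization are assumptions made for illustration; this is not Google's code, and it leaves out the neural network itself.

```python
def context_pairs(tokens, window=2):
    """Pair each word with up to `window` words on either side of it.

    Returns a list of (context_words, target_word) tuples, the kind of
    example a model could be trained on to predict the target from its
    surroundings. Illustrative only.
    """
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]      # words before the target
        right = tokens[i + 1:i + 1 + window]     # words after the target
        pairs.append((left + right, target))
    return pairs


# Hypothetical example sentence, split on whitespace.
sentence = "the elephant lifted its trunk over the fence".split()
pairs = context_pairs(sentence)

for context, target in pairs:
    print(target, "<-", context)
```

In this sketch the word "trunk" is paired with "lifted," "its," "over," and "the"; a sentence about a car would pair it with quite different neighbors, which is the signal the network learns from.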
Here’s the cool part: It turns out that algebra can be applied to the vector representations of the words to produce conceptually meaningful results. Hebron cites the canonical example, published in Computer Science in 2013: “If you take the word vector representing ‘king,’ minus the vector representing ‘man,’ plus ‘woman,’ you will land in the vector space that represents the word ‘queen.’” This is not a fluke; for example, the vector representations for “Beijing” and “China” stand in a similar relationship to each other as those for “Moscow” and “Russia,” notes Hebron.
“Similar words land in similar places,” says Hebron. “The spatial relationships between these words holds to the ways we think about the conceptual relations between them.”
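The arithmetic Hebron describes can be tried on a toy scale. The sketch below uses hand-picked three-number vectors, not vectors learned by word2vec, so it only illustrates the mechanics: subtract, add, then look for the closest remaining word by cosine similarity. The vocabulary, the numbers, and the helper names `cosine` and `nearest` are all assumptions chosen for the example.

```python
import math

# Hand-picked toy vectors (NOT learned embeddings). The three axes stand,
# very loosely, for "royalty", "maleness", and "femaleness".
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def cosine(a, b):
    """Cosine similarity: closer to 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(target, exclude=()):
    """Return the vocabulary word whose vector is most similar to `target`."""
    candidates = [w for w in vectors if w not in exclude]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# king - man + woman, computed coordinate by coordinate.
result = [k - m + w for k, m, w in
          zip(vectors["king"], vectors["man"], vectors["woman"])]

# The input words are excluded so the query cannot simply return itself.
answer = nearest(result, exclude={"king", "man", "woman"})
print(answer)
```

With these made-up numbers the closest remaining word is "queen." Real embeddings have hundreds of dimensions with no such tidy labels, but the lookup works the same way.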
This connection is a representation of Wittgenstein’s notion of language. In Philosophical Investigations, published posthumously in 1953, the philosopher argued that there are no standard, fixed meanings to words; instead, their meanings lie in their use. “[W]hen investigating meaning, the philosopher must ‘look and see’ the variety of uses to which the word is put,” notes the Stanford Encyclopedia of Philosophy’s explanation of Wittgenstein’s theory. He also emphasized that words must be understood by their “family resemblance” to other words: “There is no reason to look, as we have done traditionally—and dogmatically—for one, essential core in which the meaning of a word is located and which is, therefore, common to all uses of that word. We should, instead, travel with the word’s uses through ‘a complicated network of similarities overlapping and criss-crossing,’” notes the Stanford Encyclopedia of Philosophy.
And so Google Translate neatly maps on to Wittgenstein’s theories: “There’s a very literal connection between these two ideas because the ways we’re coming up with the representations of words within word2vec is that we are basically finding a place for them in space by looking at their surrounding words and pinpointing them as defined by the sum of all of their in-context uses,” says Hebron.
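Hebron's phrase "the sum of all of their in-context uses" can be illustrated with a deliberately simple stand-in: counting which words appear near a given word across a few sentences. This is a count-based sketch, not the trained network word2vec actually uses, and the three-sentence corpus and the function name `cooccurrence` are assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical mini-corpus showing "trunk" in three different uses.
corpus = [
    "the elephant raised its trunk",
    "the car has a large trunk",
    "the tree has a thick trunk",
]


def cooccurrence(sentences, window=2):
    """Count, for every word, the words seen within `window` positions of it."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            neighbours = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts[word].update(neighbours)   # add this use to the running total
    return counts


counts = cooccurrence(corpus)
trunk_profile = counts["trunk"]
print(trunk_profile.most_common())
```

The profile for "trunk" is nothing more than its accumulated neighbors from all three sentences, which is a crude picture of a word being located by its uses rather than by a fixed definition.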
This is far from the only example of artificial intelligence putting philosophical theories to the test. For example, Noam Chomsky argued that certain features of language, such as grammar, are biologically and innately rooted in the mind; but deep learning pioneer Yoshua Bengio has noted that deep learning so far entirely contradicts these theories. While machine learning is certainly useful in its own right, Hebron notes that it can also provide “case studies for otherwise abstract philosophical notions.” Given that philosophers such as George Boole and Gottlob Frege laid the logical foundations on which computer code was later built, it makes sense that advances in AI would continually circle back to philosophical theories.