Wild Ruminations on the Language of the Future

December 31, 2007

In the future computers will translate our language effortlessly. At first it will look like a hearing aid that can recognize speech, translate the words, and output the translation into your ear as a synthetic voice, roughly modulated to match the original signal (so people still sound the same, just speaking a different language).

This technology will allow an English speaker to have a fluent conversation with a Spanish speaker, a Mandarin speaker to speak with a Russian, etc.

That is just the first step.

At first new languages will be added manually, and available for wireless download such that you can select a “translate to” language that you understand, which will have an array of “translate from” definition files, just like a text translator works now — English to Spanish, Spanish to English, etc.

Next, professionals will grow tired of maintaining a quadratically growing repository of To and From definition files, since every new language needs a file for each language already supported. They will invent an intermediate language that is made up of symbolic constituent parts (theoretically human understandable, but not practical for use as a direct language). This is the same concept as modern programming platforms like Microsoft’s .NET. The reason it’s possible to use several different programming languages in the same system is that they are all translated to the same intermediate language. In the case of .NET, that language is called, creatively enough, “MSIL,” or “Microsoft Intermediate Language.” As with the language I’m talking about, technically you can read and write MSIL directly, but it’s not meant for that purpose, and it’s more difficult than just using the higher level language.

This intermediate language (IL) for translation will allow new language definitions to be produced by the linguists without regard for how they will be translated, because each language will only have to be translated to IL, and can then be translated from IL to any other language. This will make the maintenance of language files at least an order of magnitude less difficult.
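To make the bookkeeping concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the toy lexicons, the concept ids like `C_DOG`, and the function names. It only shows how the file counts compare and how a word could hop through the IL. It is not how a real translator would be built.

```python
def direct_file_count(n_languages: int) -> int:
    """One definition file per ordered (from, to) pair of languages."""
    return n_languages * (n_languages - 1)


def pivot_file_count(n_languages: int) -> int:
    """One 'to IL' file and one 'from IL' file per language."""
    return 2 * n_languages


# Hypothetical toy lexicons: each language maps its words to IL concept ids.
TO_IL = {
    "english": {"dog": "C_DOG", "water": "C_WATER"},
    "spanish": {"perro": "C_DOG", "agua": "C_WATER"},
    "russian": {"sobaka": "C_DOG", "voda": "C_WATER"},
}

# The 'from IL' files are just the same lexicons inverted.
FROM_IL = {
    lang: {concept: word for word, concept in lexicon.items()}
    for lang, lexicon in TO_IL.items()
}


def translate(word: str, source: str, target: str) -> str:
    """Translate a single word by going source -> IL -> target."""
    concept = TO_IL[source][word]
    return FROM_IL[target][concept]


if __name__ == "__main__":
    for n in (3, 10, 100):
        print(n, direct_file_count(n), pivot_file_count(n))
    print(translate("perro", "spanish", "russian"))
```

With three languages the two approaches tie at six files each, so the IL only starts to pay off from the fourth language onward. At a hundred languages it is 9,900 files against 200.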

Either following on the heels of or simultaneously with that development, a new context recognition engine will take hold that will intelligently add to and modify existing language definitions. For example, suppose you speak an unusual dialect of an obscure language, which has a phrase for which the universal IL has no conceptual match. A listener will ask: “What does that mean?” You will explain the meaning, and just as a human would, the engine will decipher the network of concepts you describe to arrive at a working definition of the previously unknown phrase. This definition (any definition, really) can be modified slightly over time given more information about the network of concepts that support it. That’s how real language works too.

But here’s where it gets interesting: how will the engine synthesize that new sound? What will the new word sound like? There are many options here.

For example, what if the word in question is a noun about which the listener has no knowledge, like some kind of exotic animal? Should the new translation simply use the original word since there is no analog, or should it try to translate relative to the accepted taxonomy of animal life? There’s a 25 foot tall ape in the jungle. He’s called Kong — should your translator call him “Kong” or “Very Large Gorilla”? Should it do something else? “Korilla”?

What if it’s a more complex web of ideas? In Hawaii the word “Aloha” is used for hello and goodbye. The actual translation of the word is “I love you because you exist,” which is a fascinating concept, and sheds much light on the culture given its common usage.

How will the engine translate it? Like a person, it could bear the definition in mind but continue using the word itself, appropriated directly from the original language. That would miss the subtle meaning of it, because unlike a human, the translator’s job is to convey a complete and culturally accurate picture of the meaning, and just because the translator knows the definition doesn’t mean that it’s clear to the listener. It might use the whole English phrase, such that whenever a native speaker says “Aloha,” you hear “I love you because you exist.” But that isn’t correct either, because it doesn’t convey the salutatory meaning that way.

The answer may be individual. Right now, foreign phrases are often misunderstood or ignored entirely. A human can decide that the conceptual difference between “soy” and “estoy” in Spanish isn’t important, and that he’ll just memorize the situations in which one or the other should be used to mean “I am.” Others might not recognize that there was ever an important distinction to begin with. Thus, complexity of translation within an individual is scaled perfectly according to the cognitive complexity of the person himself.

A general translation engine will not have such an option: it will be tasked with precisely and completely translating language at a very fundamental level, to all possible listeners. A person can choose to think of Aloha as hello and goodbye. A person can fail to understand what the literal translation even hints at. This is a limitation of humankind that we call “Lost in Translation.”

Translation engines are going to end this phenomenon, but an essential difficulty is this: in communication there is a sender, a medium, and a receiver. Even assuming the sender is clear, and the medium is relatively noise free, the receiver ultimately decides upon the meaning conveyed based on cultural factors, physical factors, intelligence factors, and others that I’m not thinking of. That means that each person’s translator will have to be calibrated to their particular strengths and limitations in order to deliver unfettered meaning.

It also means that that meaning will be different per person. My engine will translate Aloha differently than your engine, so even though we’ll all be having a conversation about the same thing, even if we both speak English, we’ll be hearing different words.

Consequently, as time moves toward infinity, our languages will diverge, no longer inhibited by the previous physical limitation of convention: in the past, language depended on shared meaning through similar voice modulations that would produce decodable strings of “words” that roughly matched in conceptual meaning between sender and receiver. Now that rough matching is no longer necessary, the symbols connected to our shared concepts will be tailored to the person.

I would hope that such tailoring could bring about a new age of thought. Our language determines our world view in many ways, and if the complex concepts we use could be encapsulated in individualized language, then we could jump up a perhaps limitless hierarchy of concepts very quickly.

Essentially we’ll keep our own language definition, updated in real time to be translatable to IL.

A pleasant side effect of such a system would be implicit debugging of our concepts as they are communicated. Something that Eli Yudkowsky talks about frequently, as in this post on Overcoming Bias, is the “Great Idea,” which normally turns out to be not as great as we had hoped.

I can postulate an idea and call it something new like “God” or “Dragon,” but something curious will happen when I try to tell other people about it. Their translation engine will choke, and they will get the word “Dragon” with no additional meaning, and their answer will be this: “What do you mean?”

Here’s the cool thing, though. In our world right now, “What do you mean” is not at all profound, because it’s hard to share meaning in our current system of language, and “What do you mean” is the primary way of forming the conceptual framework for whatever new concept we’re attempting to understand. But that will not be so with ubiquitous translation, because when someone makes a statement that is even remotely comprehensible given their current conceptual web, the translator will convey that meaning in a precise and penetrating way. That will all but eliminate “What do you mean,” as the translators do the work of conveying exactly what the speaker means without any additional effort. That will become the default condition.

That means that when a person has to ask “What do you mean,” it will mark either a truly new leap in concepts, or it will mark nonsense. It will also mean that “what do you mean” will be heard as a precise request for a specific set of information, because even if a person doesn’t exactly recognize where the conceptual disconnect is between his web of meaning and the new concept, the translation engine will know precisely that, and therefore when the speaker asks for more information, the translator will be able to formulate a much more precise question.

The end result is that a new concept will quickly either be connected with the conceptual framework that exists in the intermediate language space, or be sussed out as nonsense: an independent web of concepts with no bearing on reality. When such a web is a work of fiction, it’s interesting entertainment; when such a web is a culturally held belief system, it is dangerous.

Another side effect of this individualized language system is that depending on one’s expertise and interests, his concepts will be wildly divergent from another person’s. His concepts will encapsulate what he has already learned and mastered.

For example, a modern person has a concept of “desk.” When it is spoken to a caveman, it becomes clear that this desk concept has many subconcepts, which in turn have their own subconcepts. Eventually, a person would be able to explain a desk to a caveman, because the caveman has concepts in his mind like wood, the use of tools, maybe labor: all the concepts encapsulated by “desk.” The problem gets ever more difficult with higher order concepts.
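The dividing-up of “desk” can be sketched as a walk down a concept web. This is a rough illustration only: the web below, the concept names, and the `expand` function are all hypothetical, and a real conceptual web would be nothing like this tidy. The idea is to keep splitting a concept into its subconcepts until we hit things the listener already knows, and to report anything that can be neither recognized nor split further.

```python
# Hypothetical concept web: each concept maps to the subconcepts it encapsulates.
CONCEPT_WEB = {
    "desk": ["furniture", "writing surface"],
    "furniture": ["wood", "tool use", "labor"],
    "writing surface": ["flat surface", "writing"],
    "writing": ["marks", "meaning"],
}


def expand(concept, known, web, _seen=None):
    """Split `concept` until every piece is known to the listener.

    Returns (shared, gaps): `shared` lists the known concepts the explanation
    bottoms out in, and `gaps` lists concepts that are neither known nor
    decomposable any further in this web.
    """
    seen = set() if _seen is None else _seen
    if concept in seen:  # already handled on another branch
        return [], []
    seen.add(concept)

    if concept in known:  # the listener already has this concept
        return [concept], []

    parts = web.get(concept)
    if not parts:  # unknown, and nothing left to split it into
        return [], [concept]

    shared, gaps = [], []
    for part in parts:
        part_shared, part_gaps = expand(part, known, web, seen)
        shared.extend(part_shared)
        gaps.extend(part_gaps)
    return shared, gaps


if __name__ == "__main__":
    caveman = {"wood", "tool use", "labor", "flat surface", "marks", "meaning"}
    print(expand("desk", caveman, CONCEPT_WEB))
    print(expand("desk", {"desk"}, CONCEPT_WEB))
```

For the modern listener the walk stops immediately at “desk.” For the caveman it has to visit every node underneath, which is one way of seeing why the explanation takes so much longer than the word.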

Right now, when a man says “desk” to a caveman, it is his responsibility to divide the concept until the meaning is shared. With the ubiquitous translation engine, the man will say “desk,” and the caveman will hear what he needs to hear to understand what the man means. But that’s problematic in that it takes far longer to explain a desk than it does to simply say “desk,” so it would seem that there would be some lag in communication. The man talking about a desk might have to wait days or weeks to be understood.

This difficulty will be resolved by moving away from the “hearing aid” form factor.

Eventually, it won’t be a hearing aid device at all. It will be implanted, then later genetically installed prenatally, then simply passed down through generations of modified humans. If human beings were to catastrophically lose their technology and history, new generations wouldn’t recognize their mode of language as “technology” at all, just a natural state of being.

It will shortly pass over the auditory senses entirely, allowing us to pass vibrations to each other to be understood in a more direct way, eventually giving way to a medium that isn’t as prone to noise, perhaps like wireless computers now. This would, in effect, be indistinguishable from being telepathic.

This would also allow for much faster transfer of conceptual webs, so that our desk man and caveman would have a similar exchange, and even though the caveman would still have to form the web of concepts in his mind, such formation wouldn’t be constrained in any meaningful way by time as it is with auditory sensing. The information would still take time to propagate through the brain, but at a rate several orders of magnitude faster than the ear hearing the vibrations in order, the brain decoding them, then translating them, then interpreting them relative to existing knowledge.

The other question is, what effect will our individualized language have on babies? It seems plausible to have shared meaning, then diverge on the symbols we use to represent the meaning, but how will a being with no meaning at all create a symbolic system from scratch? Will the whines be translated as “I want something but I don’t know what”? and eventually to “I am hungry” or “I have a shitty diaper”? Will parents’ words in response be translated to comfortable cooing, or will it be necessary to calibrate a new translation engine to simply convey the sound offered, so that the child can form a basic foundation (just like humans do now) for future divergence? Can the noises that are currently nonsense to the infant be translated to nonsense that is tailored to be easily understood by his particular brain pattern?

One of the more titillating questions to me is whether such a thing has already happened. What aspects of our experience seem natural to us but at some point were invented and created by previous humans or other intelligent entities, only to be forgotten? What if what we think of as our immune system is an invention of nanotechnology that was seamlessly integrated with our DNA? What if our system of communication, or other sensory systems, are the result of ingenuity rather than nature?

What if we ourselves, in our entirety, were inventions of some intelligence that has since left, or exists in a way so fundamentally askew from our mode that we cannot perceive them readily?

Alright, so it’s a tangent, and it’s far from original, but when one traces the line from where we are to a possible future that looks an awful lot like the present, such a tangent seems all the more plausible.

Happy New Year.

3 Responses to “Wild Ruminations on the Language of the Future”

  1. emma Says:

    interesting theories. have one concern though.

    how would a personalized database look like? and what kind of system would be needed to capture my world view etc etc. because if you have personalized system, then the system would need data about me. and if i have say, 60 000 thoughts each day, i would then personally have to tell the computer my thoughts and formulations on flowers, books, people. everything. pretty complex thing to do? but it would be needed if the computer were to fully “understand” me so that it could tailor the meaning of aloha so that it made perfect sense to me.
    so if i did not personally tell the machine the contents of my inner landscape, how else would it know what the perfect translation for me would be? and the logic that is in my mind, would it be there for the computer to follow? or would i also have to give instructions about my personal logic?

    no machine can monitor my thoughts. machines can display the fact that i do have brainwaves, but it can not read them. as far as i know there is no science even close to accomplishing something like that.

    and i don’t think it ever will, because i get the clear notion that thoughts aren’t really in the physical universe at all. only their manifestations (waves). (well this is from my mystic experiences but they are not really relevant here)

    or maybe you could decipher those brainwaves? if a person was sitting in a lab and thinking i love my mother, i love my mother. then the exact shape of those brainwaves could be monitored and given a value in a computer. then the person could think i love my father, i love my father. and those too would be put into the system. and so the next time i thought “i love my mother” the computer could tell this.

    problem is that if you think, i love my mother, and i think, i love my mother. our brainwaves don’t look the same. there is no standard for this. so it would still have to be the individual who told the computer what they were thinking.

    well that is a bit off the track.

    so my question is, how do you imagine the personal base needed for the tailored translation to come about?

    oh, yeah, i am not a native english speaker so please, no need to point out the flaws in my language. =)

  2. sylvester Says:

    i have often thought about this theory, of universal intermediate language. however i must point out a few factors that i could not resolve

    1 paradoxical statements, would the translator state both meanings of the phrase?

    2 puns are simple linguistic features of the human language yet ever so complex as a logical pathway is needed to understand it.( as you said it would probably be lost in translation)

    3 most of human communication is done via nonverbal means ( or more specifically without the use of any symbolic constructs) the vast majority of non intellectual communication is done via body language and voice modulation. i am not sure how well onomatopoeias would translate.

    4 purposely ambiguous expressions would be extremely difficult to translate. as such is the nature of poetry, where each level of interpretation brings about new meanings to the same symbolic expressions!!

    the genetic traits u speak of embedding into the human genome , im quite interested in that idea, a fully comprehensive artinnary of vocabulary and grammar would be quite useful.

    a neural interface is quite an interesting idea but i must say i believe it is utterly pointless to consider it. from my ( very limited ) reading up on how our neural networks function, the pathway specified through the brain and the concentrations of different chemicals in the brain affect the thought patterns and the end result ( i could be more precise but i dont think the effort is worthwhile here in this post). to accomplish the neural synaptic linkages of your imagination would most likely require a microchip or nanochip be placed at every synapse of the brains communicative neurons, and then linked together somehow and then relayed to the processing unit( all of this taking place in the already cramped confines of the human cranial cavity)

    a much more simple solution i propose is 2 languages be taught to children at age 3 ( or close to that age where the brain is hardwiring itself for communication). 1st language is the parents native tongue the 2nd is the intermediate language. also the children should be taught how to effectively use non-verbal forms of communication( the type that may be taught at an advanced communication studies class in highschool???). at a younger age i thought i was telepathic due to mi ignorance of the fact that non verbal communication is an excellent medium to translate thoughts ( both conscious and subconscious) into messages for the receiver to comprehend relatively free of noise.

    having said this , i believe its time for mi to think about the fundamentals of our verbal communication and see if we can begin the framework of an intermediate language.

    the q i want to pose is shud an intermediate language be very simple in order to be very dynamic or should it be very complex in order to encompass all human ranges of thought?

    • James Says:

      Interesting as it may seem you only see the good points of this idea. While trying to improve and so on.

