On scaling the language barriers


Got a call from my advisor, which usually gets me thinking about science again. This time it looks like there’s an idea in there somewhere, so I guess I’ll write it down.

National borders are irrelevant on the internet; everyone ignores them just fine. Nobody even wants to hear of them. Cultural borders remain, however, not as conflicts of norms and values[1] but as language barriers.

I frequently notice that my Net is thoroughly different from the Net as other people in this very city perceive it. About 90% of the feed articles I read daily are in English. At least half the people I prefer to talk to about personal circumstances don’t speak Russian, and most of them don’t expect they ever will. When I tell my students about something new I read just a few days ago (in English, naturally), it normally comes as surprising news to them. And when I tell them to look up X, they look it up in Russian, on the local search engines, even if the information is definitely not available in that language yet, because it’s new.[2]

Not knowing English limits them to their own corner of the Net. But similarly, not knowing Russian bars Americans from knowing what’s going on around here, and not knowing Japanese leaves American anime fans to subsist on second-hand information.[3] The effect is universal disjointedness. People of the ex-USSR are united within the Net; hardly anyone tells Ukrainians from Russians, because they can communicate. Likewise, hardly anyone can tell apart all the former subjects of the British Empire.

But what is being done to solve this? Machine translation? It’s laughable, and I have the impression it will stay laughable for years. Only true human understanding of communication and all its cultural nuances may ever produce a meaningful translation. The barrier currently can’t be scaled with that kind of brute-force approach.

One notable alternative approach limits the message instead of trying to tackle the complexity directly. See Phantasy Star Online. Something like Esperanto in software. If anyone can build a machine translation system on this principle, constructing sentences from predefined structures that are known to have direct equivalents in other languages, they’ll definitely make a lot of money.
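To make the principle a bit more concrete, here is a minimal sketch of what "sentences from predefined structures" could look like. Everything in it is made up for illustration: the template names, the tiny vocabulary, and the `render` helper are my own assumptions, not how Phantasy Star Online or any real system does it. The point is only that the message itself is stored as a language-neutral structure, and each reader gets it rendered in their own language.

```python
from dataclasses import dataclass, field

# Hypothetical phrase book: each template has a hand-written
# equivalent in every supported language.
TEMPLATES = {
    "greet": {"en": "Hello!", "ru": "Привет!", "ja": "こんにちは!"},
    "need": {
        "en": "I need {item}.",
        "ru": "Мне нужно: {item}.",
        "ja": "{item}が必要です。",
    },
}

# Hypothetical vocabulary for filling template slots.
VOCAB = {
    "help": {"en": "help", "ru": "помощь", "ja": "助け"},
}


@dataclass
class Message:
    """A language-neutral message: a template id plus slot values."""
    template: str
    slots: dict = field(default_factory=dict)


def render(message: Message, lang: str) -> str:
    """Render the message in the reader's language.

    Raises KeyError if the template, word, or language is not
    in the phrase book, which is exactly the limitation of the approach.
    """
    pattern = TEMPLATES[message.template][lang]
    words = {slot: VOCAB[key][lang] for slot, key in message.slots.items()}
    return pattern.format(**words)


if __name__ == "__main__":
    msg = Message("need", {"item": "help"})
    for lang in ("en", "ru", "ja"):
        print(lang, render(msg, lang))
```

The sender picks `need` and `help` from menus; nobody ever types free text, so nothing has to be "understood" by the machine. Anything outside the phrase book simply cannot be said.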

But will the users of the Net at large accept this limitation? They are used to communicating freely in their own language, the one in which their thoughts and ideas are originally conceived. Such an approach will also be unsuitable for translating static content, since it only works on sentences composed with it in the first place. While immediacy is a big part of the Net now, it gets in the way of most practical work, which will still require static content. In short, it’s a measure that will help, but won’t solve the problem.

Back in the days when Europe was still large, it was the norm for an educated person to know at least three languages, often more. This approach would solve the problem much better, making the individual language-Nets intermingle on a much wider scale. But no education system can handle this kind of thing now, and the strain on students is high enough already.[4]

I don’t see a solution to this anywhere close, but it is one of the problems that has to be solved before a true Net culture can emerge, one based on merit and respect instead of money. Without this, the national pockets will take much longer to develop such cultures, and might not have time to do so before the big businesses catch up with them and make them buy their stuff instead of creating their own.


  1. We get enough of those within cultures, really, since communication now allows relatively free transfer of ideas, which once translated into new languages, flourish all by themselves.

  2. Well, students have this annoying quality of trying to sidestep any work at all, so it’s hardly surprising.

  3. Lots and lots of problems tracking down that elusive X if you don’t know how to spell it.

  4. They can’t spell in their own languages these days, darnit…