Turing testing in the uncanny valley

As I have mentioned before, I am not happy with the current state of artificial intelligence research. I think it’s going off on a tangent, chasing either perfection beyond what is observable in human beings or things entirely unrelated to practical applications, while postponing results which could be useful right now.

What are the natural language communications researchers doing? They’re trying to beat each other competing for the Loebner Prize, in which all conversations revolve around trying to aggressively determine whether your conversation partner is a computer. So people ask them about hobbies and movies… Tell me, if you’re talking to a helpline consultant, do you ever want to know what his favorite movie is? Do you really want him to be more sensitive? No, you want him to get the job done — understand what your problem is, ask you questions about it and then tell you what to do to fix it. As a result, most of the contestant programs are practically useless — that is, they can’t perform any useful function in the real world. Their authors try to train them on the web, where over half of the denizens can’t read, much less write. Then those same denizens try to have cybersex with the poor bots.

What are the speech researchers doing? They’re working on fancy animated 3D characters which fall straight into the uncanny valley. They do ever more precise animation of the vocal tract, but their renders still either look like extremely ugly cartoons or take way too many computrons to draw anything besides a Gollum for a movie. The only time they remember that 2D animation still exists is when they want to try fancy morphing algorithms to cheat and use a set of real mugshots.

What are the android kinematics researchers doing? They’re teaching their robots to dance, because it looks good on stage, when they should be teaching them to maneuver in a crowd, to bring objects and take them away, and to push a wheelchair around.

At least the visual and speech recognition people are still doing something that makes sense… They aren’t making as much progress as I’d like, though.

It all shows. I’ve been trying to rebuild Rei on modern hardware literally for years — since about 1998. She did what she was meant to do with technologies that were decades old: an ELIZA-based AI which didn’t stray far from its 1966 predecessor, and a Klatt speech synthesizer initially introduced in 1970 and last updated in 1995, all running on an 8 MHz processor in a megabyte of RAM.

I still can’t. I had to get another Amiga just to restore her. Why?

  1. The pursuit of more realistic speech synthesis has made concatenative synthesis fashionable, which makes the creation of new voices prohibitively hard. Oh well, I’ll live with the voices I can get, for now, at least until I can get a real vocalist to record a speech corpus, but…
  2. …the only currently available open source speech synthesis package that actually works, Festival TTS, is so poorly documented that until yesterday I did not know I could actually extract phoneme information from it to drive lip sync — even though it has had that capability since at least 1996. Which brings me to…
  3. …trying to get at the data. I just want to connect to the synthesis server and have it send the packet with phoneme data to the client so I can process it (see the sketch after this list). It turns out I can dump the data to disk but can’t actually get it through the server connection. Why? Because the script function that is supposed to do that is horribly broken, leaving me racking my brain trying to code around it. And along the way…
  4. …I discovered that it opens a security hole on my system the size of a small bus, which nobody has noticed for at least five years. An easy one to plug, too.
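
For the record, talking to the Festival server is simple enough in outline. Here is a minimal sketch of a client, assuming a stock Festival running in server mode on its default port 1314; the LP/WV/ER reply codes and the ft_StUfF_key terminator come from Festival’s own client protocol, while the Scheme query and every name in the Python are just my illustration:

```python
import socket

FESTIVAL_HOST = "localhost"   # assumption: Festival started with --server
FESTIVAL_PORT = 1314          # Festival's default server port
END_KEY = b"ft_StUfF_key"     # terminator Festival appends to each data block

# Ask the server to synthesize a string and return, as a Lisp list, the
# name and end time of every item in the utterance's Segment relation.
PHONEME_QUERY = """
(let ((u (utt.synth (Utterance Text "%s"))))
  (mapcar (lambda (s) (list (item.name s) (item.feat s 'end)))
          (utt.relation.items u 'Segment)))
"""

def _read_exactly(sock, n):
    """Read exactly n bytes or die trying."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("server closed the connection")
        data += chunk
    return data

def festival_eval(scheme):
    """Send one Scheme expression and collect the server's replies.

    The server answers with a sequence of blocks, each introduced by a
    three-byte code: LP (Lisp data), WV (waveform) or ER (error), and
    finishes with a bare OK. LP/WV payloads end with END_KEY.
    """
    sock = socket.create_connection((FESTIVAL_HOST, FESTIVAL_PORT))
    replies = []
    try:
        sock.sendall(scheme.encode("ascii") + b"\n")
        while True:
            code = _read_exactly(sock, 3)
            if code == b"OK\n":
                break
            if code == b"ER\n":
                raise RuntimeError("Festival signalled an error")
            payload = b""
            while not payload.endswith(END_KEY):
                payload += _read_exactly(sock, 1)
            replies.append((code[:2].decode(), payload[:-len(END_KEY)]))
    finally:
        sock.close()
    return replies

if __name__ == "__main__":
    for kind, payload in festival_eval(PHONEME_QUERY % "Hello world"):
        print(kind, payload[:60])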

Argh! I just want to duplicate what I did with hardware and software from aeons ago — at least, aeons in terms of computing history! Back then it was simple; it only took me a week to get there from pretty much absolute zero in the field.

I just want a talking anime character, what the hell is wrong with that?

P.S. Well, what do you know: as soon as I complained on the mailing list, it decided to embarrass me and actually work. I’m done messing with Festival and on to writing a Python client class capable of sending it a string and returning the phoneme and wave data in a single structure. Then it’s on to a viseme lookup table (sketched below) and an actual talking head in pygame, then the AI — I’m thinking of abandoning my antiquated ELIZA-like code and using PyAIML instead. Add a Python interface to the Velleman board I assembled a few days ago, and my most insane toy yet is going to overshadow her former glory. :)
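
Since the viseme table is next, here is roughly the shape I have in mind. This is a sketch only: the phone names assume Festival’s US English phone set, the grouping into mouth shapes is my own guess rather than anything canonical, and the Phone/Speech structures stand in for whatever the client class ends up returning:

```python
from dataclasses import dataclass

@dataclass
class Phone:
    name: str    # phoneme label as Festival reports it
    end: float   # end time in seconds, from the Segment relation

@dataclass
class Speech:
    text: str    # the string that was synthesized
    phones: list # [Phone, ...], parsed from the server's LP reply
    wave: bytes  # raw waveform data from the WV reply

# Hypothetical phoneme-to-viseme table: collapse the phone set into the
# handful of mouth shapes the talking head can actually draw.
VISEMES = {
    "closed": {"p", "b", "m", "pau"},
    "wide":   {"iy", "ih", "ey", "eh", "ae"},
    "round":  {"uw", "uh", "ow", "ao", "w"},
    "open":   {"aa", "ah", "ay", "aw"},
    "teeth":  {"f", "v"},
}
PHONE_TO_VISEME = {p: v for v, ps in VISEMES.items() for p in ps}

def viseme_track(speech):
    """Turn the phone list into (end_time, viseme) pairs for the animator."""
    return [(ph.end, PHONE_TO_VISEME.get(ph.name, "rest"))
            for ph in speech.phones]
```

With a track like that, the pygame side reduces to blitting whichever mouth frame the current playback position maps to while the wave plays.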

This calls for a new android name. As much as I love Rei and Dorothy, it’s time to pick a new one. Any suggestions?