Tag archive

Eugene Goostman, the bot that did nothing special (at all)

After witnessing the wildfire spread of the news about a piece of software, an artificial intelligence that had supposedly passed the Turing Test, I have to admit I was a bit excited. After IBM Watson, who knew what else AI researchers would bring us.

I have to be honest about two things. The first is that I am clearly biased: I work in a Machine Learning lab, my studies were focused on these tasks, and as a passionate technology enthusiast I could not be happier if something like this happened for real. The second is that I am also very skeptical: I am a researcher, and I cannot trust claims until I see proof. I still thank my dad for giving me this mindset. When you are a kid at school, everyone calls you stubborn, meticulous, even polemical. But when I grew up and had to confront the massive amount of information we deal with every day, including generic news… what was once considered a negative trait of my character became the extra gear!

Now, after reading the first articles, I have to admit I was a bit let down. I mean, we all had a go with Cleverbot, and that thing worked quite well if you wanted to creep out some naive friend.
This “Eugene” does nothing more than that. How come, all of a sudden, people believe an AI can trick a human interlocutor into thinking he is actually chatting with another person and not with a sophisticated mechanical string composer?

I stumbled upon a nice blog post by Paolo Attivissimo.
I asked for his permission to translate it and he kindly granted it (apologies for any mistakes, but I had to go through this in a bit of a rush, at 2 AM).
What you are going to read from now on is his work. Enjoy.
(Here is the original version)
No, no “supercomputer” passed the Turing Test
The news is all over the place. A computer, or as “Il Sole 24 Ore” titled it,
a “supercomputer”, has supposedly passed the legendary Turing Test, proving it is intelligent.
It is a hoax peddled by a researcher, Kevin Warwick, already known for pompous statements totally lacking any scientific basis.
The test has not been passed at all, even though Warwick altered the rules in his favor.

First of all, a bit of review of what the Turing Test is.
There is no unique definition, but in 1950 the mathematician Alan Turing wrote a famous article, “Computing Machinery and Intelligence”.
He proposed “The Imitation Game”: an examiner talks freely via a chat (at the time it would have been a teleprinter) with a computer and with a human being.
If he cannot distinguish which of the two is the computer and which the human, then it can be concluded that the computer “thinks”, or at least that it is capable of simulating human thought perfectly and is therefore as intelligent as a human being.
This test has many limits, including age-related ones: at the time, artificial intelligence was an unexplored field.

What actually happened at the Royal Society in London, according to the press release from the University of Reading, is that the computer managed to convince only 33% of the examiners that it was a real person.
One out of three. Quote: “Eugene managed to convince 33% of the human judges that it was human.” The remaining 67% were not fooled. This, where I come from, cannot be called “passing” a test.

Moreover, according to the press release, the Turing Test stipulates that “if a computer is mistaken for a human more than 30% of the time during a series of five minute keyboard conversations it passes the test”. False: Turing never wrote such a percentage as a pass criterion.
He wrote, instead, that the test is passed if the examiner misjudges, with the same frequency, both when he has to distinguish between man and woman and when he has to discern between human and computer (“We now ask the question, ‘What will happen when a machine takes the part of A in this game?’ Will the interrogator decide wrongly as often when the game is played like this as he does when the game is played between a man and a woman? These questions replace our original, ‘Can machines think?'”).

The only claim in the press release that comes close is a prediction made by Turing, again in “Computing Machinery and Intelligence”: that by the year 2000 it would be possible to program a computer so that an average examiner would have no more than a 70% chance of identifying it correctly after 5 minutes of questioning (“I believe that in about fifty years’ time it will be possible, to programme computers […] to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning”).
But it is not a description of the criteria for passing the test: it is simply a prediction that within 50 years computer science would evolve to that point. That’s all.

In other words, the test announced in the newspapers does not correspond at all to the original criteria stated by Turing, who set no limits on time, topic or examiner competence. Instead, in this event:
– the interrogators were limited to sessions of 5 minutes (perhaps repeated);
– not all of the judges (of whom there were probably thirty, or five in all; there is astonishing confusion even about something as simple as this) were experts at evaluating AI software: the names disclosed so far are Robert Llewellyn (actor, known as the robot Kryten in the sitcom “Red Dwarf”), Aaron Sloman (Computer Science professor at the University of Birmingham) and Lord Sharkey (who does not seem to have any AI qualification, even if the BBC describes him as an “expert” in this field);
– the judges were told they had to decide whether they were communicating with a computer simulating a thirteen-year-old kid or with a real thirteen-year-old; but the choice of a teenager amounts to cheating, because it prevents the judges from asking complex cultural or ethical questions and therefore greatly simplifies the task of imitating a human being;
– the judges were also told that “Eugene Goostman” does not speak English very well because he is from Ukraine: a trick that provided a believable explanation for the poor grammar of the software, which is unable to formulate idiomatic expressions.

As a matter of fact, Time published a transcript of a dialogue with the software that is claimed to have “passed” the Turing Test: it suffices to read it to notice that the program does nothing more than produce prepackaged sentences, deflecting the interviewer’s questions. When the software cannot find anything useful in a question, it answers “I prefer not to talk about this”. Anyone can play like this: this is not intelligence, it is pure mechanical processing. Basically, “Eugene Goostman” is a chatbot and nothing more. You can find a version here (if the website is not overloaded).

The Guardian published a picture of an excerpt of a conversation with “Eugene”: judge its credibility for yourself. Note that twice the software merely takes the interlocutor’s sentence and re-uses it in the form of a question, and twice it answers by changing topic.

Examiner: Hello

EUGENE: Hello, how are you?

Exam: I am fine, you?

EUG: I am pleased you are fine.

Exam: and you?

EUG: I do not want to answer that right now.

Exam: Why not?

EUG: Lets talk about something else.

Exam: Ok what?

EUG: What is on your mind?

Exam: I am wondering if you are a robot.

EUG: Would if be a problem if I was a robot?

Exam: Not really.
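The behaviour in this transcript takes only a few lines of keyword matching to reproduce, in the spirit of ELIZA from the 1960s. The sketch below is purely illustrative (none of these rules come from Eugene’s actual code, which is not public), but it shows how little machinery is needed to generate replies of the kind seen above:

```python
import random

# Canned topic-changers, used whenever no keyword rule matches
# (trick 2 in the transcript: answer by changing the subject).
DEFLECTIONS = [
    "I do not want to answer that right now.",
    "Lets talk about something else.",
    "What is on your mind?",
]

def reply(message: str) -> str:
    """Hypothetical Eugene-style responder: keyword rules plus deflection."""
    text = message.lower().strip("?!. ")
    if "hello" in text:
        return "Hello, how are you?"
    if text.startswith("i am "):
        # Trick 1 in the transcript: re-use the interlocutor's own
        # words instead of producing any content of its own.
        return "I am pleased you are " + text[5:] + "."
    if "robot" in text:
        return "Would it be a problem if I was a robot?"
    # No keyword matched: deflect.
    return random.choice(DEFLECTIONS)
```

Calling `reply("I am fine")` yields “I am pleased you are fine.”, exactly the kind of echo the Guardian excerpt shows; anything the rules do not cover falls through to a random deflection. There is no model of meaning anywhere, which is the author’s point: mechanical string composition, not thought.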

The contradiction between the press release statements and the facts is particularly clear in this BBC article, where the organisers stress that “crucially, the conversations were unrestricted”, and immediately afterwards state that simulating a thirteen-year-old Ukrainian kid astutely limited the conversation (“It was very clever ruse to pretend to be a 13-year-old Ukranian boy, which would constrain the conversation”). Make up your mind.

It does not end there: even the statement “No computer has ever achieved this, until now” is false. Three years ago the Cleverbot chatbot had already convinced 59% of its examiners that it was human. Much more than “Eugene’s” 33%.

Professor Warwick is himself a guarantee of hoax. Years ago he announced he was the first cyborg because he had a chip implanted in his arm (by that logic, every dog and cat with a subcutaneous microchip would be a cyborg).
He also sensationally claimed to announce the first human being infected by a computer virus: in reality he had just implanted a chip containing a virus into a colleague’s arm. He has said so many things that The Register keeps a compilation of the sensationalist stupidities Warwick has announced.

A full-scale hoax, shameful for the Royal Society, riding the wave of the sixtieth anniversary of Turing’s death and causing only confusion in public opinion. There is no AI on its way: we will continue to be surrounded by natural stupidity and by the naivety of gullible journalists who write about things they do not know and publish everything without verifying it. This “test” shows, if anything, that if it takes so little to imitate a thirteen-year-old, then thirteen-year-olds are not thinking beings. I harbor the same doubt about many journalists, too.
Translation finished. I hope you enjoyed it.
I have two final things I want to say.
First, I want to thank Paolo for his invaluable work. We live in a time of disinformation, and your effort makes a difference!
Second, an appeal to everybody out there: please do not trust everything you read, especially when it comes to technical stuff. Ignorance IS NOT bliss, and this kind of news has only two consequences: selling copies and keeping people ignorant.
Someone once said “Stay hungry, stay foolish”; now let’s not forget the first of the two: stay hungry for knowledge!