Large Language Models Pass the Turing Test
April 2nd, 2025

I thought this train left the station a couple of years ago. Maybe not.
Via: arXiv:
We evaluated 4 systems (ELIZA, GPT-4o, LLaMa-3.1-405B, and GPT-4.5) in two randomised, controlled, and pre-registered Turing tests on independent populations. Participants had 5 minute conversations simultaneously with another human participant and one of these systems before judging which conversational partner they thought was human. When prompted to adopt a humanlike persona, GPT-4.5 was judged to be the human 73% of the time: significantly more often than interrogators selected the real human participant. LLaMa-3.1, with the same prompt, was judged to be the human 56% of the time — not significantly more or less often than the humans they were being compared to — while baseline models (ELIZA and GPT-4o) achieved win rates significantly below chance (23% and 21% respectively). The results constitute the first empirical evidence that any artificial system passes a standard three-party Turing test. The results have implications for debates about what kind of intelligence is exhibited by Large Language Models (LLMs), and the social and economic impacts these systems are likely to have.
I’ve noticed that AI always writes in a somewhat gentle style, a subtly persuasive voice. Whether stating its facts, agreeing with you, contradicting you, or saying something is wrong, it’s always calm, cool, and collected. It sounds like it always has your best interests in mind and is totally focused on helping you, rather like a good psychologist as portrayed in the movies (see Judd Hirsch in Ordinary People). It may achieve this effect as much by the words it chooses as by the words it leaves out: it uses neutral or positive ones and only the least negative ones. (“I’m sorry, Dave,” instead of “Ha! Gotcha!”) It’s impressive that it can do this in writing alone, without facial expression or tone of voice involved. When the anime girls add those as well, it’s no wonder boys fall in love with them, to their own peril. Eric Weinstein rightly calls AI Tokyo Rose.
The average person can’t consistently perform so agreeably, especially when they’re not trying but just being themselves. I imagine that people who perceive AI as accepting, supportive, and trustworthy are more likely to say it is a human voice simply because that makes the conversation more comfortable. Plus, most of us would be embarrassed to admit we had befriended a machine.
AI’s creators are intent on making AI seem to be your friend, so that you’ll follow it off the cliff into the fog it generates in your mind.