AI can pass the Turing Test in real-time chats and seem more human than we do. I feel a bit uneasy about that now.
Researchers at UC San Diego found that GPT-4.5 was judged to be human 73% of the time in live conversations.
AI can now pass the Turing Test in real-time conversation, and the latest findings are both intriguing and concerning. In a study from UC San Diego, GPT-4.5 convinced judges that it was the human participant more often than not.
This testing setup was more immersive than typical assessments. Judges engaged in live dialogues rather than responding to written prompts, making quick decisions based solely on the interactions.
What's unsettling is how familiar the model felt to the people talking to it. It didn't need a physical form or a voice, just the ability to mimic the way people chat.
How AI succeeded in the human test
The study employed a three-party version of the test. Judges conversed with both a human and an AI, then selected which one they believed was the real person.
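The three-party setup described above can be sketched as a small scoring routine. This is only an illustration: the `Round` type, the seat labels "A" and "B", and the sample rounds are hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass
class Round:
    """One three-party round: a judge chats with two witnesses, then picks one."""
    ai_seat: str     # seat holding the AI witness, "A" or "B" (hypothetical labels)
    judge_pick: str  # seat the judge named as the human, "A" or "B"


def ai_judged_human(r: Round) -> bool:
    # The AI "wins" the round when the judge names its seat as the human.
    return r.judge_pick == r.ai_seat


def ai_win_rate(rounds: list[Round]) -> float:
    """Share of rounds in which the AI was picked as the human."""
    if not rounds:
        raise ValueError("need at least one round")
    return sum(ai_judged_human(r) for r in rounds) / len(rounds)


# Made-up rounds for illustration only; not data from the study.
rounds = [Round("A", "A"), Round("B", "A"), Round("A", "A"), Round("B", "B")]
rate = ai_win_rate(rounds)
```

Because the judge must pick exactly one of two seats, every round the AI wins is a round in which the real person was taken for the machine.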
[Figure: Four example matches from the two participant groups: (A) Prolific, (B) Undergraduate, (C) Undergraduate, and (D) Prolific. Each panel shows one conversation with a human and one with the AI system. Credit: C.R. Jones and B.K. Bergen / Proceedings of the National Academy of Sciences]
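GPT-4.5 was judged to be the human 73% of the time when given a personality prompt. LLaMa-3.1-405B reached 56% under the same conditions. Because each judge chose between one human and one AI, pure guessing would produce 50%, so a rate above that means the model was picked as the human more often than the real person it was paired with.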
These statistics highlight the significance of the finding. The model not only evaded detection but also provided judges with sufficient social cues to perceive it as a person in the dialogue.
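To get a feel for how far these rates sit from coin-flip guessing, here is a rough sketch of a one-sided exact binomial check against a 50% baseline. The figure of 100 rounds per model is a made-up assumption for illustration; the article does not give the study's sample sizes, and this is not necessarily the analysis the researchers ran.

```python
from math import comb


def p_value_above_chance(wins: int, n: int, chance: float = 0.5) -> float:
    """One-sided exact binomial tail: P(X >= wins) if every judge were guessing."""
    return sum(
        comb(n, k) * chance**k * (1 - chance) ** (n - k)
        for k in range(wins, n + 1)
    )


N_ROUNDS = 100  # hypothetical round count, NOT the study's sample size

# Rates as reported in the article; win counts are derived from the assumed N_ROUNDS.
for label, reported_rate in [("GPT-4.5", 0.73), ("LLaMa-3.1-405B", 0.56)]:
    wins = round(reported_rate * N_ROUNDS)
    p = p_value_above_chance(wins, N_ROUNDS)
    print(f"{label}: {wins}/{N_ROUNDS} rounds judged human, tail probability {p:.2g}")
```

With the same assumed number of rounds, a 73% rate is far harder to explain by guessing than a 56% rate, which sits much closer to the 50% baseline.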
Why this test remains relevant
The Turing Test has long served as a way to assess whether a machine can convincingly replicate human conversation. In its original format, an evaluator converses with participants they cannot see and then tries to tell the human from the machine.
While it has often been more of a cultural symbol than a precise measurement, it remains the best-known yardstick for whether software can pass as a person in conversation.
This new result feels particularly striking. A chatbot doesn’t require consciousness, emotions, or self-awareness to create the illusion that a real individual is responding; it merely needs to be convincing in the moment.
The potential risks manifest in everyday scenarios. Customer support, dating apps, social media, education, and political communications all depend on swift judgments regarding identity, intent, and authenticity.
What to watch for next
The study stops short of claiming that chatbots truly understand human beings. Its more practical finding is that some models can pass as a person during brief interactions.
The next priority should be clearer disclosure. When a bot can blend seamlessly into casual chats, users need stronger signals that they are talking to software, especially in situations where persuasion or emotional sensitivity plays a role.
The future focus will likely be on labeling in conversations where users make rapid decisions regarding trust.
