ChatGPT, Claude, Gemini, and Grok are not prepared to inform American voters.

      A new group of voters will consult ChatGPT, Claude, Gemini, and Grok for information on how to vote, locate polling stations, and discern the truth. Research consistently shows that these models struggle to accurately answer such queries. Nonetheless, the election is approaching.

      In spring 2025, researchers from the Tow Center at Columbia Journalism School conducted a controlled study that, in hindsight, should have settled a running debate in the industry. The team gave eight AI search tools, including ChatGPT Search, Perplexity, Gemini, Copilot, Grok-2, and Grok-3, a selection of 200 news articles from twenty different publishers and asked each tool to identify the articles and attribute them correctly. Across 1,600 queries, the models produced incorrect answers over 60% of the time.

      ChatGPT Search, the only tool to respond to all 200 queries, was accurate for 28% of them but incorrect for 57%. Perplexity, touted as the research-grade choice, had the lowest error rate at 37%.

      These findings were released over a year ago, and the picture has not improved since. A Bloomberg study summary from May 20 confirmed that ChatGPT, Claude, Gemini, and Grok still give unreliable responses about news, including election-related updates. Nieman Lab's analysis of the same data found that ChatGPT continues to struggle the most when attributing information to news sources. A separate report from NewsGuard found that the top ten generative-AI chatbots returned false claims in response to news inquiries 35% of the time in August 2025, up from 18% the previous year.

      The 2026 US midterm elections are 167 days away. In November, the first group of American voters likely to use chatbots as their primary source of news will go to the polls.

      ChatGPT and Claude will reportedly influence this election, yet no one, including their developers, has a solid plan for managing the repercussions when these systems give confident, articulate answers that are wrong.

      Taken together, the research points to something more specific and more dangerous than occasional inaccuracy: chatbots often misattribute quotes, generate links that lead to dead ends, and cite AI-summarized or syndicated versions of articles in preference to the originals, severing the connection to the journalists who did the reporting.

      They struggle to distinguish among a Reuters article, a rewritten piece from a content farm, and a Russian disinformation site presented under similar syndication banners. NewsGuard’s monitoring found that the top ten generative-AI models repeated Russian disinformation claims about one-third of the time, treating the sites behind them as credible sources.

      The underlying reason for this issue is clear, and the labs acknowledge it. The training data for today’s advanced models has drawn extensively from the open web, incorporating both reputable sources like the New York Times and the cleaned-up outputs from disinformation campaigns.

      The retrieval systems meant to supply up-to-date information operate over a search index in which the top results for many news queries are AI-generated rewrites of earlier AI-generated rewrites.

      Lawfare's earlier analysis of 'data voids' explains the mechanism: where authentic stories have minimal coverage, propaganda fills the gaps, and the chatbot treats that propaganda as the main source because it is what retrieval returns.

      This context serves as the backdrop for the labs' negotiations over publisher-licensing agreements. OpenAI has made deals with various publishers including the Financial Times, Axel Springer, News Corp, and Le Monde, among others; Google has done similarly; and Anthropic and Perplexity have developed their own partnerships.

      Both parties argue that access to licensed content will improve citations, enhance summarization accuracy, and create a healthier relationship between chatbots and publishers. This argument is plausible, yet published evidence as of May 2026 does not support it.

      ChatGPT Search's 57% failure rate was measured on a dataset that included articles from publishers with existing licensing agreements. Those agreements did not improve retrieval accuracy; they lent an appearance of legitimacy to incorrect information.

      The specific issue regarding the midterms is that the current chatbots' failure modes align almost perfectly with the spread of election misinformation. A voter asking ChatGPT, "Where is my polling place?" may receive a confident answer with a seemingly credible citation, the validity of which hinges on whether the model's most recent source for that address is accurate.

      Similarly, if a voter asks Gemini whether a Republican candidate in their district has been charged with crimes, the accuracy of the answer will depend on which version of the news article the retrieval layer reaches: the AP wire story or a rewritten piece that omits critical details.

      Asking Grok, "Who is winning this race?" will yield an answer shaped by the model's training cut-off and by the mix of polling aggregator sites in the retrieval index.

      None of these errors appear as "hallucinations" to

