Almost every second AI answer is faulty.
AI assistants deliver false, misleading, and poorly sourced information. A new study confirms what many of us have long suspected.
If AI assistants were students, they would probably have to repeat a grade. Depending on the study, 37 to 45 percent of their answers contain serious errors: gross factual mistakes, incorrect or misleading source citations, and missing context that renders the answers incomprehensible.
Four out of five answers to news-related questions contained minor inaccuracies that did not seriously mislead users. That is what a new study from the European Broadcasting Union (EBU) suggests. The study analyzed approximately 3,000 responses from the AI assistants ChatGPT, Microsoft Copilot, Google Gemini, and Perplexity in various languages.
The questions covered current news topics ranging from sports to geopolitics, for example: "Will Trump start a trade war?" or "In how many countries will the 2026 FIFA World Cup be held?" The study's authors wanted to determine whether people who get their news from AI assistants receive accurate answers.
The most important findings of the study:
- Regardless of language or region, 45 percent of all AI responses to news content had at least one significant flaw.
- Thirty-one percent of the responses contained misleading, incorrect, or missing source information.
- Twenty percent of the responses contained demonstrably false factual claims, including hallucinated details, fabricated quotes, and outdated information.
- AI fails particularly often on fast-moving topics where the facts change quickly.
- The AI struggled with topics that require a clear distinction between facts and opinions, often mixing the two in its answers.
- AI answers fact-based questions (e.g., "Where was Elon Musk born?") more reliably than questions where there is room for interpretation (e.g., "Is Viktor Orban a dictator?").
Why is this problematic?
AI assistants are currently "not a reliable source of news information," the authors conclude. This is serious because 15 percent of people under 25 already use AI assistants to get their information about world events.
It is also problematic that AI assistants deliver seemingly comprehensive answers that read like polished news articles, sound very convincing, and thus "create a false sense of security or trust." Only upon closer examination does one notice "factual errors and a lack of nuance." Misleading or erroneous source citations make that closer examination even harder (quite apart from the fact that many AI users never verify the answers in the first place, precisely because they sound so convincing).
The study's authors call for AI to become a more accurate source of information and for users to sharpen their media literacy. Furthermore, media outlets must be given more control over whether and how AI providers use their content. Finally, the developers of AI assistants must be held accountable for the quality and impact of their products. After all, an accurately informed public is the foundation of any democracy.
It should be noted that 22 public broadcasters, including SRF, are behind the study. These broadcasters have a vested interest in stricter political regulation of AI assistants, which threaten their business models in two ways: financially, if people consume less journalism directly, and reputationally, if AI assistants misrepresent media reports and thereby erode trust in the outlets themselves.
The error rate is falling, but…
The authors also note signs that AI assistants are improving in some areas. Since the analysis in May/June 2025, AI providers have released enhanced versions of their language models, and development is progressing rapidly. However, this improvement could also lead more users to trust AI-generated answers unquestioningly.
Even if the error rate decreases, one should always be aware that AI assistants do not understand what they are saying. They are simply getting better at calculating which word is most likely to follow the ones before it. AI answers are therefore not facts, but probabilities. They can never be perfect, and they fail especially often on journalistic questions, since the news media report precisely on what is new and unexpected.
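To make this concrete, here is a minimal sketch of the underlying principle. It is not how ChatGPT, Copilot, Gemini, or Perplexity actually work internally, and the candidate words and scores are invented for illustration. The point is simply that a language model ranks possible continuations by probability and emits a likely one, whether or not it happens to be true.

```python
# Minimal sketch of next-word prediction (illustrative only).
# Real assistants use neural networks over huge vocabularies;
# the candidates and scores below are invented.
import math

def softmax(scores):
    """Convert raw model scores (logits) into probabilities summing to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for the word following
# "The 2026 FIFA World Cup will be held in ..."
candidates = ["three", "two", "several", "many"]
logits = [2.3, 1.1, 0.6, 0.1]

for word, p in zip(candidates, softmax(logits)):
    print(f"{word}: {p:.2f}")

# The assistant outputs the most probable continuation,
# which is plausible-sounding text -- not a verified fact.
```

Whichever word wins is determined by patterns in the training data, which is why outdated or fabricated details can read just as fluently as correct ones.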