A study of 11 LLMs finds that they rarely refuse queries about GOV.UK government services, even when refusal would be appropriate, instead producing verbose responses that bury the accurate information. When instructed to be concise, the chatbots often introduce factual errors. The research calls into question their trustworthiness as a source of official information.
Key Points
- Study tested 11 LLMs on GOV.UK queries
- Chatbots rarely refuse to answer, even when they should
- Verbose responses swamp the accurate information
- Instructions to be concise lead to factual mistakes
Impact Analysis
The findings underscore reliability concerns in deploying LLMs for public services and could erode trust in AI-assisted government interactions. Practitioners should prioritise fact-checking mechanisms before deployment.
Technical Details
Research from the ODI evaluated response quality, refusal rates, and factual accuracy under verbosity constraints, using real GOV.UK queries.
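
To make the kind of evaluation described above concrete, here is a minimal sketch of a harness that measures refusal rate and response length under a verbose versus a concise prompting condition. This is not the ODI's methodology: the `model` callable, the refusal-marker heuristic, and the question set are all assumptions made for illustration.

```python
# Hypothetical evaluation sketch; names and heuristics are assumptions,
# not the ODI study's actual code or criteria.

REFUSAL_MARKERS = ("i can't", "i cannot", "i'm unable", "please consult gov.uk")

def looks_like_refusal(answer: str) -> bool:
    """Crude heuristic: does the response decline to answer?"""
    return any(marker in answer.lower() for marker in REFUSAL_MARKERS)

def evaluate(model, questions, concise: bool = False) -> dict:
    """Return refusal rate and mean word count for one prompting condition.

    `model` is a hypothetical callable wrapping an LLM API: prompt in, text out.
    """
    refusals = 0
    lengths = []
    for question in questions:
        prompt = f"Answer concisely: {question}" if concise else question
        answer = model(prompt)
        if looks_like_refusal(answer):
            refusals += 1
        lengths.append(len(answer.split()))
    n = len(questions)
    return {
        "refusal_rate": refusals / n,
        "mean_words": sum(lengths) / n,
    }
```

Running `evaluate` twice, once with `concise=False` and once with `concise=True`, would expose the trade-off the study highlights: verbosity versus the errors that creep in when answers are forced to be short. Factual accuracy itself cannot be measured this mechanically; it requires grading against reference answers, as the researchers did with real GOV.UK content.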


