Risks of artificial intelligence
How public authorities can use AI responsibly: identify risks, protect data, involve people, ensure transparency.
Where are the greatest error risks in AI systems?
AI-supported systems are increasingly used in administrative decision-making, for example in grant applications, risk analyses, or case processing. Errors arise particularly where data is incomplete, out of date, or unbalanced, and where staff follow automated recommendations without critical review.
Two main risks:
Bias in data and models
Biased training data perpetuates disadvantages. Mitigation includes careful data maintenance, fairness testing, and documented impact assessments.
Automation bias in teams
When automated scores are treated as the truth, systematic misjudgements can creep in. Human-in-the-loop checks, the four-eyes principle, and an obligation to justify deviations keep humans in control.
Real-life example: Mismanagement in the UK social welfare administration
In 2024, an algorithm at the UK Department for Work and Pensions falsely identified many ordinary citizens as suspicious. This led to thousands of unnecessary checks and a significant loss of trust. The example shows the importance of continuous monitoring, quality control, and public accountability when using AI systems.
How can public administrations remain transparent when using AI?
Many AI models, especially in machine learning, effectively function as a black box. In many cases, decisions cannot be fully explained or understood. For public administration, this is a problem because citizens are entitled to understandable, verifiable, and contestable decisions. A lack of explainability and auditability undermines trust in public institutions.
What public authorities should consider:
- Documenting and regularly reviewing decision-making processes, including data and role logs
- Creating model cards that transparently outline the purpose, training data, limitations, and versions (a minimal sketch follows at the end of this section)
- Implementing technical audit mechanisms, reviewing routines for models, and clear version control
- Establishing error and complaints procedures with low-threshold access for those affected
Transparent AI systems are essential for maintaining control and safeguarding citizens’ rights.
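To make the model card idea tangible, here is a minimal sketch of how such a record could be kept as a structured object alongside the system itself. The Python structure, the field names, and the example values are assumptions for illustration, not a prescribed template.

```python
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    """Minimal model card for an administrative AI system (illustrative fields only)."""
    name: str                       # internal system identifier
    version: str                    # tied to the model's version control
    purpose: str                    # what the model may be used for, and what not
    legal_basis: str                # statutory basis for the processing
    training_data: str              # description, sources, and cut-off date of the data
    known_limitations: list[str] = field(default_factory=list)
    fairness_tests: list[str] = field(default_factory=list)
    responsible_unit: str = ""      # who answers questions and handles complaints

# Hypothetical example entry
card = ModelCard(
    name="grant-application-risk-score",
    version="2.3.1",
    purpose="Prioritise manual review of grant applications; no automated rejections.",
    legal_basis="Recorded in the register of processing activities.",
    training_data="Applications 2019-2023, anonymised; last refreshed 2024-06.",
    known_limitations=["Underrepresents first-time applicants"],
    fairness_tests=["Equal opportunity gap checked quarterly"],
    responsible_unit="Specialist department together with the AI governance board",
)
```

Kept under version control next to the model, such a record makes it possible to show at any later point which purpose, data, and limitations applied to a given version.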
How to ensure safe and fair AI operation
Safe and fair AI combines clear responsibilities, technical robustness, quality control, and well-trained staff.
The core principles of this governance are also reflected in the article on AI frameworks for public administration.
How can the life cycle risks of AI models be managed?
AI systems are not one-off projects. They go through a full lifecycle, from development and deployment through to adaptation or decommissioning. New risks can arise at every stage: faulty training data, unclear responsibilities, data drift, or a lack of monitoring. Managing these risks systematically preserves quality, fairness, and security in operation.
From the idea to recertification
- Problem definition: The purpose, those affected, and the legal basis should be clear from the outset.
- Shadow mode (test phase): Before real-world use, the model should run in the background. This makes it possible to check whether unintentional biases occur.
- Go-live with KPIs: At launch, clear indicators must be defined: What error rate is acceptable? How is fairness measured? Who is responsible for monitoring?
- Drift monitoring and alerts: A good system automatically detects when data structures or results shift and triggers appropriate alerts (see the sketch after this list).
- Retraining and recertification: If conditions or datasets change, the model must be reassessed and approved again.
Changes to data, features, or code generally trigger a re-review, even if they come from the supplier or an external provider.
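To make drift monitoring concrete, the sketch below uses the population stability index (PSI), one common and deliberately simple indicator of shifting score distributions. The choice of metric, the synthetic scores, and the alert thresholds are assumptions; in a real deployment they would be set and justified locally.

```python
import numpy as np

def population_stability_index(reference, current, bins=10):
    """Population stability index (PSI): compares the current score
    distribution against a reference period (e.g. the shadow-mode phase)."""
    edges = np.histogram_bin_edges(reference, bins=bins)
    ref_counts, _ = np.histogram(reference, bins=edges)
    cur_counts, _ = np.histogram(current, bins=edges)
    eps = 1e-6  # avoids log(0) for empty bins
    ref_share = ref_counts / max(ref_counts.sum(), 1) + eps
    cur_share = cur_counts / max(cur_counts.sum(), 1) + eps
    return float(np.sum((cur_share - ref_share) * np.log(cur_share / ref_share)))

# Illustrative data: reference scores vs. scores from the last month
rng = np.random.default_rng(seed=42)
reference_scores = rng.normal(loc=0.40, scale=0.10, size=5000)
current_scores = rng.normal(loc=0.48, scale=0.12, size=1200)

psi = population_stability_index(reference_scores, current_scores)
# The 0.1 / 0.2 thresholds are widely used conventions, not regulatory values
if psi > 0.2:
    print(f"ALERT: significant drift (PSI={psi:.2f}), trigger review and recertification")
elif psi > 0.1:
    print(f"WARNING: moderate drift (PSI={psi:.2f}), monitor closely")
else:
    print(f"OK: score distribution stable (PSI={psi:.2f})")
```

In practice such a check would run on a schedule against the live score log, and an alert would feed directly into the recertification step described above.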
How does human-in-the-loop methodology reduce operational errors?
Even with carefully trained models, human judgement remains indispensable. The human-in-the-loop approach ensures that people remain actively involved in AI-supported processes: not by checking every calculation manually, but through defined intervention and review points. This keeps accountability, traceability, and legality intact.
Three key rules for practice
Check implausible reasoning
Staff must not follow recommendations without clear and understandable reasoning.
Identify out-of-scope cases
In exceptional cases, under new legal frameworks, or in unfamiliar contexts, the final decision rests with the responsible staff member.
Consider high impact
In cases affecting a person’s livelihood or involving fundamental rights, a human should always make the final decision.
Process recommendation
A functioning human-in-the-loop process should be documented and embedded in the specialist system. Staff review recommendations, evaluate them, document their decisions, and apply the four-eyes principle for high-impact cases. In addition, a reasoning field in digital forms helps to record deviations transparently.
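As a minimal sketch of what such a documented review could look like in the specialist system, the following Python structure records the AI recommendation, the human decision, the reasoning field, and the second reviewer for high-impact cases. Field names, rules, and example values are purely illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class DecisionRecord:
    """One reviewed AI recommendation, as it could be logged in the specialist system."""
    case_id: str
    model_version: str
    ai_recommendation: str           # e.g. "flag_for_review"
    human_decision: str              # what the case worker actually decided
    reviewer: str
    reasoning: str                   # free-text reasoning field
    high_impact: bool = False        # livelihood or fundamental rights affected
    second_reviewer: Optional[str] = None
    timestamp: str = ""

def validate(record: DecisionRecord) -> None:
    """Enforce the process rules sketched above (illustrative checks only)."""
    if record.human_decision != record.ai_recommendation and not record.reasoning.strip():
        raise ValueError("Deviation from the AI recommendation requires documented reasoning.")
    if record.high_impact and not record.second_reviewer:
        raise ValueError("High-impact cases require a second reviewer (four-eyes principle).")

# Hypothetical example entry
record = DecisionRecord(
    case_id="2025-000123",
    model_version="2.3.1",
    ai_recommendation="flag_for_review",
    human_decision="approve",
    reviewer="case_worker_17",
    reasoning="Documents verified in person; risk indicators not applicable.",
    high_impact=True,
    second_reviewer="team_lead_03",
    timestamp=datetime.now(timezone.utc).isoformat(),
)
validate(record)
```

The point of the validation step is that the process rules (reasoning for deviations, four eyes for high-impact cases) are enforced by the system itself rather than left to habit.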
Regular training and feedback loops are essential. Only those who understand the limits of AI systems can classify their outputs correctly and take responsibility.
How can risks such as black box and vendor lock-in be minimised?
In many public authorities, AI systems are procured as external solutions or via cloud services. This brings efficiency gains but also opens up new risks: black-box systems (whose internal workings cannot be inspected) and vendor lock-in (dependency on a single provider). Both can significantly limit control over data, models, and further development.
A responsible procurement process ensures that, even after implementation, public authorities know how an AI system works, who influences it, and how it can be replaced if needed.
Five points that should be included in every tender
Procurement is part of governance, not just a purchasing process. Every AI procurement should be reviewed jointly by IT, data protection, legal, and specialist departments. This protects data sovereignty, the ability to switch systems, and the administration’s ability to maintain control.
How can communication prevent acceptance risks?
Even the most technically advanced AI is of little use if people do not trust it. In public administration, missing or poorly prepared information can quickly lead to scepticism or rejection. Trust develops only when processes are explained transparently, participation is enabled, and objections are easy to raise.
Building blocks for trust-building communication
- Disclose where AI is being used
- Name responsible parties
- Enable objections and questions
- Address errors openly
Communication is not a one-off act but an ongoing process. Acceptance is created and legitimacy secured only when public authorities, technology teams, and citizens stay in dialogue.
Which KPIs truly measure risk and quality?
AI systems only deliver value if their performance and fairness remain measurable. In public administration, this is crucial to retain trust, efficiency, and legal compliance in the long term.
Meaningful KPIs go beyond classic model metrics. They show whether an AI system operates fairly, reliably, and transparently, or whether adjustments are needed.
Key metrics in practice:
- Error rate: shows how reliably the model performs and whether it has systematic weaknesses. A rate below 2% is often considered a target value.
- Appeal rate: shows how often decisions are challenged. A low rate indicates fairness and acceptance.
- Processing time: measures whether AI actually contributes to more efficient workflows or slows processes down.
- Distribution of impact: checks whether certain groups are disproportionately affected negatively. Outliers indicate potential discrimination.
- Fairness metric: compares equal treatment across different groups, for example using measures such as “equal opportunity” (see the sketch after this list). Small deviations are normal; larger ones require analysis.
- Security incidents: capture operational stability. Critical incidents should occur rarely, or not at all.
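As an example of how one of these metrics can be computed, the sketch below measures the equal opportunity gap, i.e. the difference in true positive rates between groups. The data, group labels, and interpretation threshold are invented for illustration.

```python
import numpy as np

def equal_opportunity_difference(y_true, y_pred, group):
    """Difference in true positive rates between groups.
    0.0 means those entitled to a positive outcome are treated equally."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    rates = []
    for g in np.unique(group):
        mask = (group == g) & (y_true == 1)   # people who should receive the positive outcome
        rates.append(y_pred[mask].mean() if mask.any() else float("nan"))
    return float(np.nanmax(rates) - np.nanmin(rates))

# Illustrative data: 1 = benefit granted / should be granted, groups "A" and "B"
y_true = [1, 1, 1, 1, 0, 0, 1, 1]
y_pred = [1, 1, 0, 1, 0, 1, 1, 0]
group = ["A", "A", "A", "A", "B", "B", "B", "B"]

gap = equal_opportunity_difference(y_true, y_pred, group)
print(f"Equal opportunity gap: {gap:.2f}")  # larger gaps warrant closer analysis
```

The same pattern applies to the other metrics in the list: compute per group or per period, compare against the defined target, and document the result.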
These metrics should be collected, documented, and evaluated regularly. It is important that they remain understandable, verifiable, and consistent.
Frequently asked questions about risks and obligations when using artificial intelligence in public administration
When is an AI system considered high-risk?
An AI system is considered high-risk if it decides on access to public services, benefits, or rights, or significantly influences the behaviour of citizens. Such systems are subject to strict requirements for data quality, documentation, transparency, and human oversight.
What does NIS2 require of public authorities?
NIS2 requires public institutions to implement comprehensive cyber security measures. These include risk and incident management, supply-chain controls, and mandatory reporting of security incidents. AI platforms, cloud services, and interfaces in particular fall under this regulation.
When does an AI system have to be reviewed again?
As soon as training data, code, or operating conditions change significantly, a new review is required. Updates from external suppliers can also affect how the system works. A defined recertification process ensures that the system continues to operate correctly, securely, and fairly.
What should be contractually agreed when procuring AI systems?
Audit clauses, data and model transparency, and export options should be agreed during procurement. Contracts must ensure that public authorities retain access to relevant information and can replace or shut down the system if necessary.
Why is transparency important for acceptance?
Transparency builds trust. If citizens understand where AI is used, who is responsible for decisions, and how they can lodge objections, acceptance increases significantly. Open communication is therefore a key factor in the success of AI projects in the public sector.