
How does Artificial Intelligence work?

An introduction to machine learning, neural networks and large language models (LLMs).

Explained briefly: 

  • Modern AI does not “think” but rather identifies patterns in vast quantities of data using neural networks and deep learning.
  • Generative AI (such as LLMs) builds on this technology to produce new, statistically plausible content (for example text or images).
  • Understanding how this works is essential for using AI securely, lawfully and with digital sovereignty in public administration and regulated environments.

Why the “how” determines trust and security

Artificial intelligence is now embedded in everyday life – from customer service chatbots to systems supporting complex data analysis within public authorities. Public debate often focuses on the “what” – which new tools are available? – and the “why” – what transformative or disruptive benefits might they bring? However, for strategic decision-making, particularly in governmental or highly regulated settings, the “how” must take centre stage. How exactly does AI “learn”? How does it “know” what it is supposed to do?

On what basis does it reach a decision? 

These questions are not merely technical details. They are central to trust, security and digital sovereignty. Only those who understand the underlying mechanisms can manage the real risks effectively, including: 

  • algorithmic bias
  • the proliferation of deepfakes
  • the opacity of “black-box” systems 

A sound technical understanding is also a prerequisite for ensuring lawful deployment in line with regulatory frameworks such as the EU AI Act. This article explains three core concepts of modern AI: machine learning, neural networks and generative AI.

The foundation of AI: what is machine learning (ML)?

The fundamental difference between traditional software and AI lies in the learning process. Conventional software is explicitly programmed: “If X happens, do Y.” An AI system, by contrast, “learns”. Machine learning (ML) is the technical term for this paradigm shift: Rather than being given fixed rules, an algorithm is trained on data. It independently extracts patterns and statistical relationships from large volumes of training data. 

The resulting trained model can then be used to make accurate predictions, classify data or support complex processes.

How does training work?

In supervised learning, a system is shown thousands of examples – for instance images of dogs and cats – together with the correct labels. The system adjusts its internal parameters, effectively thousands of small numerical settings, until it can reliably detect the statistical features (such as fur texture or ear shape) that distinguish a “dog” from a “cat”. 
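
To make this concrete, here is a minimal, illustrative sketch in Python. The feature values, feature names and labels are invented purely for illustration, and scikit-learn's LogisticRegression merely stands in for the far larger models used in practice; the point is only that the program is given examples with correct labels rather than explicit rules.

```python
# Minimal sketch of supervised learning: the model is not given rules,
# it adjusts its internal parameters to fit labelled examples.
from sklearn.linear_model import LogisticRegression

# Invented toy features, purely for illustration: [ear pointiness, snout length]
X_train = [[0.9, 0.4], [0.8, 0.5], [0.2, 0.9], [0.3, 0.8]]
y_train = ["cat", "cat", "dog", "dog"]           # the correct labels ("supervision")

model = LogisticRegression()
model.fit(X_train, y_train)                      # training: parameters are adjusted to fit the examples

print(model.predict([[0.85, 0.45]]))             # -> ['cat'], the statistically most likely label
```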

This method immediately reveals the biggest challenge: AI systems are entirely dependent on their training data. The principle of “garbage in, garbage out” applies without exception. If training data is flawed, incomplete or biased, the resulting model may be ineffective at best and harmful at worst.

Example 

If certain population groups are underrepresented in the training data, the AI “learns” this bias and reproduces it as an apparently objective truth.

In automated public-sector decision-making – for example in the assessment of applications – such discriminatory outcomes are unacceptable and legally indefensible. Moreover, where personal data is used for training purposes, the highest standards of data protection, anonymisation and purpose limitation must be applied.

The “brain” of AI: neural networks and deep learning

What is the technology – the “brain” – that enables this learning process? In most modern AI systems, it is a neural network: an architecture inspired by the human brain, but by no means a replica of it. Such a network consists of multiple layers of artificial “neurons”. These are simple mathematical nodes that receive, process and transmit signals – broadly analogous to their biological counterparts. Each connection between these nodes carries a numerical “weight”. During training, these weights are adjusted so that certain patterns are given more or less significance.
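
The sketch below shows, in deliberately simplified form, what such weights look like in code. It is not how real systems are built: the two weight matrices are simply filled with random numbers, whereas in practice training would adjust every one of these values across millions or billions of connections.

```python
import numpy as np

# A deliberately tiny network: 2 inputs, one hidden layer with 3 "neurons", 1 output.
# Every connection carries a numerical weight; training would adjust each of these numbers.
rng = np.random.default_rng(seed=0)
W1 = rng.normal(size=(2, 3))          # weights: inputs -> hidden layer
W2 = rng.normal(size=(3, 1))          # weights: hidden layer -> output

def forward(x):
    hidden = np.tanh(x @ W1)                   # each hidden "neuron" combines its weighted inputs
    return 1 / (1 + np.exp(-(hidden @ W2)))    # squashed to a value between 0 and 1

print(forward(np.array([0.2, 0.9])))  # one prediction from untrained (random) weights
```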

How does such a network “think”?

A neural network does not operate with words or pixels. Instead, it converts each piece of information – a word or a pixel, for example – into a so-called meaning vector: a long sequence of numbers representing statistical meaning and contextual relationships. You might think of this as a mathematical postcode for a concept. Similar concepts (e.g. the vectors for “authority” and “office”) are close together in this multidimensional mathematical space. The core task of AI is to calculate and interpret the relationships between billions of these vectors.
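
A small sketch can illustrate this idea of proximity in vector space. The three-dimensional vectors below are hand-written for illustration only; real models learn vectors with hundreds or thousands of dimensions from data, but the distance measure (here, cosine similarity) works the same way.

```python
import numpy as np

# Invented toy "meaning vectors"; real embeddings are learned, not hand-written.
vectors = {
    "authority": np.array([0.9, 0.8, 0.1]),
    "office":    np.array([0.8, 0.9, 0.2]),
    "banana":    np.array([0.1, 0.2, 0.9]),
}

def similarity(a, b):
    """Cosine similarity: values close to 1.0 mean the vectors point in a similar direction."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(similarity(vectors["authority"], vectors["office"]))   # high: related concepts
print(similarity(vectors["authority"], vectors["banana"]))   # low: unrelated concepts
```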

What is deep learning?

If a neural network contains a very large number of layers, this is referred to as “deep learning”. This depth enables the system to identify highly complex and abstract patterns. For example, it can recognise not merely an eye, but a specific facial expression; not just individual words, but their meaning within a particular context. Nearly all modern AI applications – from image recognition to speech recognition – rely on deep learning.
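
As a rough sketch of what “depth” means in code, the example below stacks several layers using PyTorch, which is used here purely as an illustrative framework. Real deep learning models stack far more layers, which is what allows early layers to pick up simple patterns and later layers to combine them into increasingly abstract ones.

```python
import torch.nn as nn

# Sketch only: "deep" simply means many stacked layers.
deep_net = nn.Sequential(
    nn.Linear(784, 512), nn.ReLU(),   # layer 1
    nn.Linear(512, 256), nn.ReLU(),   # layer 2
    nn.Linear(256, 128), nn.ReLU(),   # layer 3 (real models stack dozens or hundreds of layers)
    nn.Linear(128, 10),               # output layer, e.g. scores for 10 classes
)
```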

It is precisely this ability to learn and faithfully imitate complex human patterns such as faces or voices that forms the technical basis for deepfakes. An AI system can learn the patterns of a real person’s face or voice so accurately that it can generate highly convincing synthetic replicas. Such developments pose a direct challenge to trust in digital content. The response must therefore include verifiability. Cryptographic provenance mechanisms can provide tamper-resistant evidence of the authenticity of content.

The current frontier: how generative AI (LLMs) works

The current hype around ChatGPT and similar tools is driven by generative AI (GenAI). Technically, these are deep learning models trained to produce new content, rather than simply classify existing data. It is, however, more precise to describe this process as synthesis rather than creation. An AI system does not create something from nothing, nor does it possess creative intent. It synthesises a new, statistically plausible output based on patterns learned from billions of training examples.

This is how ChatGPT and the like work.


How do transformer-based large language models (LLMs) such as ChatGPT (developed by OpenAI) or Gemini (developed by Google) work? They do not “understand” language in the human sense. Instead, they abstract language into meaning vectors and process these vectors using highly advanced statistical pattern recognition within neural networks. The output – a sequence of multi-dimensional vectors – is ultimately converted back into natural language.

Based on billions of texts, images and videos, these LLMs have learned which word is statistically most likely in a given context, what things look like, how natural and technical systems behave, how concepts are expressed in different languages, and much more. Large Language Models are therefore masters of probability and linguistic form, not of understanding. 
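
This core mechanism can be caricatured in a few lines of Python. The context and the probabilities below are invented for illustration; a real LLM derives such probabilities from billions of learned parameters and a vocabulary of many thousands of tokens, but the principle of picking a statistically likely continuation is the same.

```python
import random

# Invented toy probabilities for the words that might follow the context ("the", "public").
next_word_probs = {
    ("the", "public"): {"administration": 0.55, "sector": 0.35, "banana": 0.01},
}

def continue_text(context):
    candidates = next_word_probs[context]
    words, weights = list(candidates), list(candidates.values())
    return random.choices(words, weights=weights)[0]   # pick a statistically likely continuation

print(continue_text(("the", "public")))   # usually "administration" or "sector", rarely "banana"
```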

LLMs are not limited to providing entertainment or answering questions. Rather, LLMs function as a communication interface between humans and machines: For example, they enable users to assign tasks to connected systems or applications using natural language. Conversely, technical outputs – such as diagnostic feedback, data queries or analytical results – can be translated by the LLM into human-readable language. 

Where such instructions involve complex tasks composed of multiple recurring subtasks, the term agentic AI is used. This refers to AI systems capable of autonomously planning and executing structured sequences of actions.
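
As a very rough sketch of that idea, the outline below breaks a task into subtasks and works through them one by one. The functions ask_llm and run_tool are hypothetical placeholders rather than a real API; they only mark where a language model and a connected system would be called.

```python
# `ask_llm` and `run_tool` are hypothetical placeholders, not a real API.
def ask_llm(prompt: str) -> str:
    raise NotImplementedError("stand-in for a call to a language model")

def run_tool(step: str) -> str:
    raise NotImplementedError("stand-in for a call to a connected system, e.g. a database query")

def handle_task(task: str) -> list[str]:
    # 1. Plan: let the model break the overall task into subtasks.
    plan = ask_llm(f"Break this task into numbered subtasks: {task}").splitlines()
    # 2. Execute: work through the plan step by step and collect the results.
    return [run_tool(step) for step in plan]
```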

Hallucination: when LLMs produce plausible but false output

A pressing risk inherent in this approach is “hallucination”. Because the model does not understand content but instead generates statistically plausible sequences of words, it may produce information that sounds convincing but is factually incorrect. It optimises for plausibility, not necessarily for truth.

For public administration – where decision-making must be fully evidence-based – this represents an unacceptable risk. 

The problem of traceability starts at the technical level. In deep neural networks with billions of parameters, even developers are often unable to fully reconstruct how internal weightings combine to produce a specific output. The result may be statistically coherent, yet the path leading to it remains opaque. This is also referred to as the technical “black box”. For Europe, however, a second “black box” is equally significant: the strategic opacity of commercial models operated by non-European providers. Recent developments point to the emergence of closed ecosystems and create substantial security-related vulnerabilities:

  • Opaque training data: on what data – and with which potential biases – were these large language models trained?
  • Uncertain data usage: are sensitive inputs being read, reused or even used for training by third parties (a risk known as “data selling”)?
  • Strategic dependency: reliance on providers with opaque governance structures creates direct technological dependency.

For public administration, such a triple “black box” – hallucinations, technical opacity and strategic dependency – is not a viable option. 

Public-sector decision-making must be transparent, traceable and evidence-based, without strategic dependency. The solution lies in the development and deployment of sovereign AI platforms, purpose-built for the public sector, trained on verified and evidence-based data, and operated under full institutional control.

Conclusion: technical understanding as the basis for sovereign decision-making

Understanding how AI works – whether machine learning, deep learning or large language models – is not a marginal technical detail. The underlying mechanics directly determine the risks involved (bias, deepfakes, hallucinations, black-box effects) and therefore shape the necessary regulatory and strategic safeguards. 

The Bundesdruckerei Group focuses on enabling the responsible use of AI for government and business alike. It addresses inherent risks by drawing on its core competencies: 

  • Secure digital identities for people and machines
  • Cryptographic trust services to protect data integrity and provenance
  • The development and operation of sovereign systems for Germany’s and Europe’s digital transformation 

A deep understanding of how AI works is therefore not merely an IT concern. It is the foundation for every sovereign, strategic decision in both the public and private sectors in the age of Artificial Intelligence.

Frequently asked questions about how AI works

What is artificial intelligence (AI)?

AI refers to the capability of a machine to imitate intelligent, human-like behaviour such as logical reasoning, learning, planning and creativity. AI systems analyse their environment (data), process that information and respond to it in order to achieve defined objectives.

How do AI systems work?

AI systems typically rely on algorithms (step-by-step computational procedures) and data. They receive input data (such as text, images or sensor data), process it using complex mathematical and computational techniques – often involving artificial neural networks – and produce an output (for example a response, prediction or action).

What is machine learning?

Machine learning is a central approach within AI. An algorithm is trained to perform a task more effectively over time by analysing large datasets, identifying patterns and continuously refining its outputs based on feedback and evaluation. In other words, it learns from experience instead of being explicitly programmed for every possible case.

Does AI work like the human brain?

AI systems are inspired by the human brain, in particular through artificial neural networks, which attempt to replicate how neurons are interconnected. However, even the most advanced systems (including deep learning models) represent simplified and specialised simulations of specific cognitive processes. Today’s AI systems do not achieve the breadth of capabilities, consciousness or emotional intelligence of human beings.