EU GMP Annex 22: what the first AI annex means for your GxP practice

Reading time ~10 min · Daniel Herrmann

[Figure: EU GMP Annex 22 (draft), artificial intelligence. Panels: AI model (static, deterministic, frozen, tested); human oversight (review gate: a human approves, attributed and with a reason); GMP record (documented and versioned, traceable audit trail). Labels: intended use, test split, explainability, monitoring. Footer: critical applications · generative AI only with human review.]

EU GMP Annex 22 is the first annex of the EU GMP guide dedicated entirely to artificial intelligence. The draft was published for consultation in July 2025, together with a draft revision of Annex 11. It addresses AI models in critical GMP applications and requires, among other things, a defined intended use, controlled data, independent test data, explainability, performance monitoring and consistent human oversight. Generative AI is not foreseen for critical decisions; in supporting use, its output needs documented human review.

What this is about: the first GMP annex dedicated to AI

Until 2025 there was no place in the EU GMP guide where the use of artificial intelligence was regulated in its own right. AI-assisted systems ran under Annex 11 (Computerised Systems), a framework written before machine-learning models were a practical GMP concern. That is changing: on 7 July 2025 the European Commission published the draft of a new Annex 22 “Artificial Intelligence” for targeted consultation, together with a draft revision of Annex 11. The consultation window closed in early October 2025.

With it, AI in the GMP environment gets an explicit set of expectations for the first time. That is good news for everyone who wants to use AI in a controlled way: instead of uncertainty (“is this even allowed?”) there are now named requirements against which a deployment can be designed and checked.

Important for context: at the time of writing, Annex 22 exists only as a draft, and the final version may change in detail. The direction of travel (risk-based, data-disciplined, supervised by humans) is unlikely to change, because it matches GAMP 5 (2nd Edition) and the FDA CSA guidance. Always check the current status of the document before making decisions.

Scope: which AI Annex 22 means — and which it doesn't

The draft deliberately keeps the scope narrow. It addresses AI/ML models in critical applications of GMP-regulated manufacturing — wherever model output can directly touch product quality, patient safety or data integrity.

Three boundary lines matter most:

  • Static models: for critical applications the draft expects models with deterministic behaviour — the same input leads to the same output. The model is trained, frozen, tested and then operated in a defined state.
  • Dynamic models: systems that keep learning in operation and continuously change their behaviour are not foreseen for critical applications — their validated state could not be demonstrated as stable.
  • Generative AI and LLMs: because of their probabilistic behaviour they sit outside the critical Annex 22 scope; the draft does not foresee their use in critical GMP applications. In non-critical, supporting applications they remain possible, provided a qualified human reviews and documents the output before it takes regulatory effect.
Annex 22 does not ban generative AI — it makes human review the condition under which its use becomes defensible.
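
The static-model expectation (trained, frozen, tested, same input gives same output) can be turned into a routine check. The following is a minimal Python sketch, not something the draft prescribes: the JSON-serialised weights, the `predict` stand-in and all function names are hypothetical, chosen only to show the idea of comparing a released fingerprint and repeating a prediction.

```python
import hashlib
import json


def fingerprint(artifact: bytes) -> str:
    """SHA-256 of the serialised model artifact, recorded once at release."""
    return hashlib.sha256(artifact).hexdigest()


def predict(weights: dict, features: dict) -> float:
    """Hypothetical stand-in for a static model: a fixed linear score."""
    return sum(weights[name] * features[name] for name in sorted(weights))


def check_frozen_and_deterministic(artifact: bytes, released_fingerprint: str,
                                   sample: dict, runs: int = 5) -> bool:
    """True only if the artifact is unchanged and repeated runs agree."""
    if fingerprint(artifact) != released_fingerprint:
        return False  # the model in operation is not the one that was tested
    weights = json.loads(artifact)
    outputs = {predict(weights, sample) for _ in range(runs)}
    return len(outputs) == 1  # same input, same output


# Illustrative values only (assumption, not real process data).
artifact = json.dumps({"ph": 0.4, "temp_c": 0.1}, sort_keys=True).encode()
released = fingerprint(artifact)
sample = {"ph": 7.0, "temp_c": 25.0}

print(check_frozen_and_deterministic(artifact, released, sample))   # True
tampered = artifact.replace(b"0.4", b"0.5")
print(check_frozen_and_deterministic(tampered, released, sample))   # False
```

A check like this does not validate the model; it only shows that the model in operation is byte-identical to the one that was tested and that it behaves repeatably on a reference input.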

The core requirements at a glance

The draft's requirements condense into seven disciplines — every validation team already knows each of them in spirit; what is new is their consistent application to models:

  • Intended use: the model's purpose is precisely described — task, limits, input data, affected processes.
  • Data quality & governance: training, validation and test data are controlled, representative and traceably managed.
  • Independent test data: performance is assessed on data that was not used in training — the separation is demonstrable.
  • Performance & acceptance criteria: metrics and thresholds are fixed before testing and aligned with the intended use.
  • Explainability & confidence: where possible, it becomes visible which features drive a result and how confident the model is.
  • Human oversight: human supervision is built into the process — with defined roles and documented decisions.
  • Monitoring & change control: model performance is monitored in operation (drift); changes run under control.
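
Two of these disciplines lend themselves to a small illustration: demonstrating that test data is separate from training data, and fixing acceptance criteria before testing. The Python sketch below is one possible approach under stated assumptions; the record layout, metric names and thresholds are hypothetical and not taken from the draft.

```python
import hashlib
from dataclasses import dataclass


def record_id(record: dict) -> str:
    """Stable content hash of one data record (keys sorted for reproducibility)."""
    canonical = repr(sorted(record.items()))
    return hashlib.sha256(canonical.encode()).hexdigest()


def overlap(train: list, test: list) -> set:
    """Hashes of records that appear in both sets; empty means the split is clean."""
    return {record_id(r) for r in train} & {record_id(r) for r in test}


@dataclass(frozen=True)
class AcceptanceCriterion:
    """A threshold fixed before testing; frozen so it cannot be edited afterwards."""
    metric: str
    minimum: float


def evaluate(results: dict, criteria: list) -> dict:
    """Pass/fail per criterion; a metric that was not measured counts as failed."""
    return {c.metric: results.get(c.metric, float("-inf")) >= c.minimum
            for c in criteria}


# Illustrative data and thresholds (assumptions).
train = [{"batch": "A1", "ph": 7.0}, {"batch": "A2", "ph": 6.8}]
test = [{"batch": "B1", "ph": 7.1}]
criteria = [AcceptanceCriterion("sensitivity", 0.95),
            AcceptanceCriterion("specificity", 0.90)]
results = {"sensitivity": 0.97, "specificity": 0.88}

print(overlap(train, test))          # set()
print(evaluate(results, criteria))   # {'sensitivity': True, 'specificity': False}
```

Exact-match hashing only catches identical records. Near-duplicates, or samples from the same batch appearing on both sides, would need a domain-specific grouping rule that this sketch does not attempt.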

The pattern behind it

All seven disciplines follow a principle GxP teams know well: claim nothing you cannot prove. Annex 22 transfers validation's discipline of evidence onto models — data, behaviour and decisions must remain traceable.

What this means concretely for QA and validation teams

Even with the final version pending: anyone using or planning AI in GxP processes can do four things right away.

First: build an inventory. Which AI runs in your processes today — including unofficially? A copilot in document drafting is AI use, even if it appears in no system register. Without an inventory there is no risk assessment.

Second: classify criticality. Does an application separate critical decisions (approval, specification, assessment) from supporting work (draft, research, structuring)? Annex 22 will draw its line exactly there.
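
The first two steps can start as a very small data structure. The sketch below, in Python, assumes a hypothetical `AiTouchpoint` record and uses the activity names from the paragraph above as the classification rule; a real criticality assessment would be a documented risk assessment, not a set lookup.

```python
from dataclasses import dataclass
from enum import Enum


class Criticality(Enum):
    CRITICAL = "critical"
    SUPPORTING = "supporting"


# Activity names taken from the text above; the rule itself is a simplification.
CRITICAL_ACTIVITIES = {"approval", "specification", "assessment"}


@dataclass(frozen=True)
class AiTouchpoint:
    """One place where AI is used, whether or not it is officially registered."""
    name: str
    process: str
    activities: frozenset
    registered: bool


def classify(touchpoint: AiTouchpoint) -> Criticality:
    """Critical as soon as one activity is a critical decision."""
    if touchpoint.activities & CRITICAL_ACTIVITIES:
        return Criticality.CRITICAL
    return Criticality.SUPPORTING


# Hypothetical inventory entries for illustration.
inventory = [
    AiTouchpoint("batch-release classifier", "QC release",
                 frozenset({"assessment"}), registered=True),
    AiTouchpoint("document copilot", "SOP drafting",
                 frozenset({"draft", "structuring"}), registered=False),
]

for tp in inventory:
    flag = "" if tp.registered else " (not in system register)"
    print(f"{tp.name}: {classify(tp).value}{flag}")
```

The `registered` flag is there on purpose: an unregistered copilot still belongs in the inventory, and surfacing it is the point of the exercise.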

Third: build human oversight as a process, not a claim. “A human looks over it” is not enough. What holds up: defined review steps, attributed approvals with a reason, and an audit trail that keeps AI suggestion and human decision distinguishable.
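
What "attributed approvals with a reason" and a distinguishable audit trail might look like in code is sketched below. It is a minimal Python illustration under assumptions: the `AuditTrail` class, the entry kinds `ai_suggestion` and `human_decision`, and the hash chaining are hypothetical design choices, not requirements quoted from Annex 22.

```python
import hashlib
import json
from datetime import datetime, timezone


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class AuditTrail:
    """Append-only log; each entry carries the hash of the previous one."""

    def __init__(self):
        self.entries = []

    def append(self, kind: str, actor: str, payload: dict) -> dict:
        body = {
            "kind": kind,          # "ai_suggestion" or "human_decision"
            "actor": actor,
            "payload": payload,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "previous": self.entries[-1]["hash"] if self.entries else "",
        }
        body["hash"] = _digest(body)  # digest is computed before "hash" is set
        self.entries.append(body)
        return body

    def verify(self) -> bool:
        """Recompute every hash; any later edit to an entry breaks the chain."""
        previous = ""
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["previous"] != previous or entry["hash"] != _digest(body):
                return False
            previous = entry["hash"]
        return True


def record_decision(trail: AuditTrail, reviewer: str, approved: bool, reason: str) -> dict:
    """A human decision is only accepted with a named reviewer and a reason."""
    if not reviewer.strip() or not reason.strip():
        raise ValueError("a decision needs a named reviewer and a reason")
    return trail.append("human_decision", reviewer,
                        {"approved": approved, "reason": reason})


# Hypothetical workflow: the AI suggests, a human decides.
trail = AuditTrail()
trail.append("ai_suggestion", "model:draft-assistant",
             {"text": "Deviation classified as minor."})
record_decision(trail, "j.doe", True, "Matches SOP criteria; sources checked.")
print([e["kind"] for e in trail.entries])  # ['ai_suggestion', 'human_decision']
print(trail.verify())                      # True
```

Keeping suggestion and decision as separate entry kinds is what makes them distinguishable afterwards. An in-memory list is of course not a compliant record store; the sketch shows only the shape of the data.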

Fourth: establish source binding. When AI output flows into regulated documents, every statement must be traceable to the source it rests on. A draft without verifiable sources costs more in review than no draft at all.
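
Source binding can be checked mechanically, at least at the level of "does every statement cite an approved source". The Python sketch below assumes a hypothetical `Statement` record and made-up source IDs; it does not describe any particular product's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One sentence of an AI draft together with the source IDs it cites."""
    text: str
    source_ids: tuple


def check_source_binding(statements: list, approved_sources: dict) -> list:
    """Mark a statement 'supported' only if it cites at least one source
    and every cited source is in the approved set; otherwise 'unverified'."""
    report = []
    for s in statements:
        supported = bool(s.source_ids) and all(
            sid in approved_sources for sid in s.source_ids)
        report.append((s.text, "supported" if supported else "unverified"))
    return report


# Hypothetical approved-source register and draft statements.
approved = {
    "SRC-001": "Draft EU GMP Annex 22",
    "SRC-002": "Draft revision of Annex 11",
}
draft = [
    Statement("Annex 22 was published as a draft in July 2025.", ("SRC-001",)),
    Statement("Dynamic models are not foreseen for critical applications.",
              ("SRC-001", "SRC-999")),
    Statement("The final text will be identical to the draft.", ()),
]

for text, status in check_source_binding(draft, approved):
    print(f"[{status}] {text}")
```

The check is deliberately strict: one unknown source ID is enough to mark a statement unverified. It confirms that a citation exists and points into the approved set, not that the source actually says what the statement claims; that judgement stays with the human reviewer.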

Start small

The cleanest entry is a tightly scoped, non-critical process with full human review — that is where you collect the evidence and working patterns Annex 22 will demand for bigger steps anyway.

Annex 22, GAMP 5 2nd Edition, FDA CSA: one coherent picture

Annex 22 does not stand alone. Three frameworks from different directions have converged on the same principles in recent years:

  • GAMP 5 2nd Edition (2022) accepts AI-assisted work within the risk-based lifecycle and demands critical thinking instead of template documentation.
  • FDA CSA (final guidance 2025) shifts effort from documenting to defensible reasoning — test depth follows risk.
  • EU GMP Annex 22 (draft 2025) applies the same logic specifically to AI for the first time: controlled data, demonstrable model behaviour, human oversight.

For your strategy this means: you do not have to react to three rulebooks separately. If you build your AI work on source binding, attributed human approval and a complete audit trail, you serve all three frameworks with the same architecture — and you are robust against the final Annex 22 text, because those principles are the common core, not the negotiable detail.

How to prepare today — without waiting for the final version

A realistic preparation path in four steps:

  • 01 · AI inventory and criticality map — capture every AI touchpoint and classify each application: critical / supporting.
  • 02 · Define the review gate — for every AI-assisted workflow, fix who reviews, what is reviewed and how the decision is documented.
  • 03 · Control the source and data space — define which approved sources AI drafts may be built from, and secure the separation of working data and training data contractually and technically.
  • 04 · Carry evidence from the start — versions, approvals and reasons are created in the workflow, not in a re-documentation loop before the audit.

That is exactly the pattern traqx is built on: drafts are created exclusively from approved sources, every regulatory statement ends on a clickable citation, a deterministic check marks anything unsupported as unverified, and every approval is an attributed human decision with an audit trail. That is not an Annex 22 certificate — no such certification exists. It is a way of working that puts the draft's principles into practice today.

Key takeaways

  • Annex 22 (draft, July 2025) is the first EU GMP annex dedicated to AI — published together with the draft revision of Annex 11.
  • Critical applications: only static, deterministically operated models — systems that keep learning in operation are not foreseen for them.
  • Generative AI and LLMs are not foreseen for critical decisions; in supporting use, their output needs a documented human review.
  • The seven core disciplines (intended use, data quality, independent test data, acceptance criteria, explainability, human oversight, monitoring and change control) follow the same logic as GAMP 5 2nd Edition and FDA CSA.
  • Whoever establishes source binding, attributed approvals and an audit trail today is robust against the final version — always check the document's current status before deciding.

Author

Daniel Herrmann

Daniel Herrmann is a founder of traqx and has worked in GxP validation and quality assurance for more than 15 years. This article summarises publicly accessible regulation (draft EU GMP Annex 22, the Annex 11 revision, GAMP 5 2nd Edition, FDA CSA) in his own assessment — based on the consultation status; the final version may differ. It is orientation, not legal or compliance advice, and does not replace an assessment for your specific scope. Where traqx is mentioned, the text describes the verifiable way of working — sources first, AI as a suggestion, a human decides, the audit trail remains — and no claim beyond that.
