May 27, 2026 · 15 min

EU AI Act: what every agent operator needs to know before August 2, 2026, in plain language

On 2 August 2026, sixty-seven days after this post is published, the European Union's AI Act gets its real teeth. Until that day, most of the law has been on paper. From that day forward, the obligations for high-risk systems become enforceable, with penalties of up to €15 million or 3% of global annual revenue (whichever is higher); the worst infractions carry penalties of up to €35 million or 7%. The law applies to anyone whose AI system reaches a person in the European Union, regardless of where the operator is based: a small operator in Buenos Aires or San Francisco running an agent that serves an EU user is, technically, in scope. This post is the version of the law we wish someone had written for us when we first read it: every technical term explained, every requirement mapped to a concrete thing you have to do, no legalese, and a one-page checklist at the end.

What the law is, in three sentences

The EU AI Act is the first law in the world that regulates AI systems based on the level of risk they pose to people. It does not ban AI; it sorts AI systems into four tiers and attaches different obligations to each. The higher the risk, the more documentation, oversight, and accountability the law requires from the people building and deploying the system.

Think of it like food regulation: a bakery selling muffins has lighter obligations than a hospital pharmacy, but both have rules. The Act does the same for AI. A chatbot that recommends recipes lives in a different tier than an algorithm that scores someone's loan application, even though both technically involve "an AI making a decision."

Who is in scope

The Act uses two key roles that get treated differently. We will spend a lot of time on these two words because they are the only legal terminology that genuinely matters for an operator.

Provider = the entity that builds and puts the AI system on the market under its own name. In our world, if you generate an agent in Agent Builder and ship it to a counterparty under your own brand, you are the provider of that agent. Anthropic, OpenAI and Google are providers of the underlying models; you are the provider of the agent built on top.

Deployer = the entity that uses an AI system. If your client uses your agent to screen their job applicants, your client is the deployer; you are still the provider. The two roles can be the same entity (you build it and use it for your own business) or different entities (you build, they use).

The Act's reach is extraterritorial, the same trick GDPR used in 2018. The trigger is not where you are based; the trigger is whether the output of your AI system is used in the EU, or whether the people using your system are in the EU. An operator in Argentina running an agent that serves European customers is in scope. An operator in California whose agent's output reaches an EU user is in scope. The only way to be safely out of scope is to have no EU touchpoint at all, and few operators can say that, because the EU has a lot of customers.

The four risk tiers, with examples

Everything in the Act sits on top of a four-tier classification. Understanding the tiers tells you which obligations apply to your agent.

Tier 1 — Unacceptable risk (banned outright). A small list of practices the EU has decided are incompatible with fundamental rights and bans entirely. Social scoring of citizens by governments. Real-time biometric identification by police in public spaces (with narrow exceptions). Cognitive behavioural manipulation that exploits vulnerabilities. Emotion recognition in workplaces and schools (with exceptions for safety). Predictive policing based solely on profiling. If your agent does any of these, it is illegal in the EU. Most operators will never touch this tier.

Tier 2 — High risk. The category that absorbs most of the Act's substantive obligations and the category most agent operators need to read carefully. We will spend the whole next section on it. Examples drawn from Annex III of the Act (the official enumerated list): recruiting and HR tools, education (admissions, grading), credit scoring and creditworthiness, biometric identification, law enforcement support, migration and border control, judicial and democratic processes, access to essential public services, and management of critical infrastructure.

The deciding question for whether an agent falls in this tier is not "is it powerful," it is "does it touch one of the use cases on the list." A frontier model running on a researcher's laptop is not high-risk on its own. The same model used to score loan applications is.

Tier 3 — Limited risk (transparency obligations). Any AI system that interacts with a natural person or generates content has obligations under Article 50. Chatbots must disclose that they are AI. AI-generated images, audio, video and text must be marked machine-readably as AI-generated (deepfakes specifically must be labelled when not used for legitimate purposes). Emotion-recognition and biometric-categorisation systems must inform the people exposed. The obligations are lighter than high-risk but still mandatory; an agent that lies about being human is not compliant.

Tier 4 — Minimal risk. Everything else. Spam filters, AI in video games, recommendation systems. The Act does not impose specific obligations on this tier; it only encourages voluntary codes of conduct.

A separate category, General-Purpose AI (GPAI), applies to the underlying foundation models (the Claudes, the GPTs, the Geminis), and it matters to anyone who builds on top of them. GPAI providers have their own set of obligations around documentation, training data summaries and safety evaluations. Most operators are not GPAI providers; they are providers of high-risk or limited-risk systems built on top of someone else's GPAI.

The seven concrete obligations for high-risk systems

If your agent falls in the high-risk tier, the Act requires seven things. We are going to translate each one from legal language into a concrete artifact you have to produce.

1. Risk management system

The legal text: you must establish a continuous, documented process to identify, estimate and mitigate foreseeable risks throughout the system's lifecycle.

What that means in practice: a written document, kept up to date, that lists what could go wrong with your agent, how likely each thing is, what you are doing about it, and who is responsible. Not a one-off audit; a living register. Think of it like a software bug tracker, but for AI risks instead of code bugs. Sample entries: "agent hallucinates incorrect tax code → likelihood medium → mitigation: human review on outputs above $1,000 of tax impact." "Agent fails on non-English documents → likelihood high → mitigation: detect language and refuse to process unsupported ones."
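A minimal sketch of such a register in Python, using the two sample entries above. The field names, the owners and the 90-day review window (standing in for a quarterly refresh) are assumptions for illustration, not anything the Act prescribes.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class RiskEntry:
    """One row of the risk register. Field names are illustrative."""
    risk: str
    likelihood: str        # "low", "medium" or "high"
    mitigation: str
    owner: str             # the person accountable for the mitigation
    last_reviewed: date

    def is_stale(self, today: date, max_age_days: int = 90) -> bool:
        """True when the entry has gone more than max_age_days without review."""
        return today - self.last_reviewed > timedelta(days=max_age_days)


register = [
    RiskEntry(
        risk="Agent hallucinates incorrect tax code",
        likelihood="medium",
        mitigation="Human review on outputs above $1,000 of tax impact",
        owner="tax-ops lead",            # hypothetical owner
        last_reviewed=date(2026, 5, 1),
    ),
    RiskEntry(
        risk="Agent fails on non-English documents",
        likelihood="high",
        mitigation="Detect language and refuse to process unsupported ones",
        owner="platform lead",           # hypothetical owner
        last_reviewed=date(2026, 1, 15),
    ),
]

# Flag entries that have not been reviewed within the window.
today = date(2026, 5, 27)
stale = [entry for entry in register if entry.is_stale(today)]
for entry in stale:
    print(f"Needs review: {entry.risk} (owner: {entry.owner})")
```

The staleness check is what turns the register from a one-off audit into a living document: anything nobody has looked at in a quarter surfaces on its own.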

2. Data governance

The legal text: training, validation and testing data must meet quality criteria for relevance and representativeness, and must be examined for bias.

What that means in practice: if you train or fine-tune your own model, you have to be able to describe where the data came from, how you sampled it, and what biases you checked for. If you are using somebody else's model (the common case for most operators), the obligation is much lighter — you have to be able to describe the data you feed into the system, document any retrieval-augmented generation (RAG) corpus your agent uses, and show that your test cases cover the populations the agent will actually be used on.

3. Technical documentation

The legal text: detailed pre-market documentation demonstrating compliance, kept up to date.

What that means in practice: a document (the Act calls it a "technical file") that any EU regulator can ask to see. It should describe what your agent does, what models it uses, what tools it has access to, what the input/output flows are, what testing you did, what risk-mitigation measures you have in place, and how someone can use it correctly. Think of it as the "operator's manual + safety datasheet" for your agent. For most agent operators, this is a 10-20 page document, refreshed quarterly.

4. Automatic logging

The legal text: high-risk AI systems must automatically record events relevant to identifying risks or modifications.

What that means in practice: every meaningful agent action must be logged, including the input that triggered it, what tools it called, what it produced, when it happened, and on whose behalf. The logs must be retained long enough to support post-hoc investigations (the Act sets a floor of at least six months; keep them longer if you operate in finance or healthcare). Agent Builder ships this by default for every agent in your fleet; if you operate outside Agent Builder, you need to wire it up.
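If you are wiring this up yourself, a log record might look like the sketch below. The record fields, the identifiers and the 183-day retention constant are assumptions for illustration; the `traceparent` value follows the W3C Trace Context layout (version, 32-hex trace id, 16-hex parent id, flags). This is not the Agent Builder log format.

```python
import json
import secrets
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 183  # assumption: roughly six months; set this per your own policy


def new_traceparent() -> str:
    """Return a W3C Trace Context traceparent: version-traceid-parentid-flags."""
    trace_id = secrets.token_hex(16)   # 32 lowercase hex characters
    parent_id = secrets.token_hex(8)   # 16 lowercase hex characters
    return f"00-{trace_id}-{parent_id}-01"   # 01 = sampled flag


def build_log_record(agent_id, on_behalf_of, trigger_input, tool_calls, output, now=None):
    """Assemble one JSON-serialisable record for a single agent action."""
    now = now or datetime.now(timezone.utc)
    return {
        "timestamp": now.isoformat(),
        "traceparent": new_traceparent(),
        "agent_id": agent_id,
        "on_behalf_of": on_behalf_of,
        "input": trigger_input,
        "tool_calls": tool_calls,
        "output": output,
        # Earliest date on which the record may be deleted.
        "retain_until": (now + timedelta(days=RETENTION_DAYS)).isoformat(),
    }


record = build_log_record(
    agent_id="invoice-agent-01",             # hypothetical identifiers
    on_behalf_of="customer-4821",
    trigger_input="Classify invoice A-17",
    tool_calls=[{"tool": "ocr.extract", "status": "ok"}],
    output="category=office-supplies",
    now=datetime(2026, 5, 27, 12, 0, tzinfo=timezone.utc),
)
print(json.dumps(record, indent=2))
```

In a real deployment the trace id would be propagated from the incoming request rather than generated fresh for each record, so that every action in one task shares a trace.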

5. Transparency to deployers

The legal text: instructions enabling deployers to understand capabilities and limitations.

What that means in practice: you have to give whoever is using your agent (your customer, your client, the person who plugs your agent into their workflow) clear documentation of what the agent does well, what it does poorly, and what it should not be used for. A one-page "fact sheet" that ships with the agent. A short list of "do not use for X, Y, Z." A summary of typical error modes.

6. Human oversight

The legal text: the system must be designed to allow effective human oversight, including the ability to monitor operation, intervene, override and disable the system.

What that means in practice: a human in the loop, or at least a human with the ability to step into the loop, for high-stakes decisions. Not necessarily a human approving every output, but a clear path for the human to see what the agent did, to override a decision the agent made, and to shut the agent down. The Act explicitly calls for a way to interrupt the system, such as a stop button, that brings it to a halt in a safe state. An agent without an off-switch is not compliant.
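One way to picture the oversight path is a thin wrapper around the agent that exposes the four affordances: monitor, intervene, override, disable. The class and method names below are hypothetical, and the agent is a stand-in lambda; treat it as a sketch of the shape, not a reference design.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Decision:
    task_id: str
    agent_output: str
    final_output: Optional[str] = None   # None while a human review is pending
    decided_by: str = "agent"


class OversightControl:
    """Wraps an agent callable with monitor / intervene / override / disable hooks."""

    def __init__(self, agent: Callable[[str], str]):
        self.agent = agent
        self.enabled = True
        self.history: List[Decision] = []   # monitor: every decision is visible here

    def run(self, task_id: str, high_stakes: bool = False) -> Decision:
        if not self.enabled:
            raise RuntimeError("agent has been disabled by a human operator")
        decision = Decision(task_id, self.agent(task_id))
        if not high_stakes:
            decision.final_output = decision.agent_output
        # intervene: high-stakes outputs stay pending until a human resolves them
        self.history.append(decision)
        return decision

    def approve(self, decision: Decision, reviewer: str) -> None:
        decision.final_output = decision.agent_output
        decision.decided_by = reviewer

    def override(self, decision: Decision, new_output: str, reviewer: str) -> None:
        decision.final_output = new_output   # the agent's output is kept for the record
        decision.decided_by = reviewer

    def disable(self) -> None:
        self.enabled = False                 # the off-switch


control = OversightControl(agent=lambda task_id: f"approve loan for {task_id}")
decision = control.run("application-7", high_stakes=True)
control.override(decision, "refer to underwriter", reviewer="j.doe")
control.disable()
```

Keeping `agent_output` alongside `final_output` means the log shows both what the agent wanted to do and what the human decided, which is the evidence an oversight review would ask for.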

7. Accuracy, robustness, and cybersecurity

The legal text: the system must achieve appropriate levels of accuracy and be resilient against errors, faults and attempts at unauthorised manipulation.

What that means in practice: three things bundled together. Accuracy: declared performance metrics that you can defend with evidence (not "this agent is great" but "this agent achieves an F1 score of 0.84 on the documented test set"). Robustness: the system tolerates the kinds of input variation it will encounter in the real world — capitalisation, typos, low-quality images, partial information. Cybersecurity: the system is resistant to attempts at unauthorised manipulation, which in agent-land includes prompt injection, data poisoning, and supply-chain attacks against the tools the agent uses. We are publishing the threat-model post that goes deeper on this tomorrow.
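To make the accuracy claim concrete, here is how an F1 score is computed from test-set counts. The counts are invented so that the result lands on the 0.84 used in the example above; they are not real measurements.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when there are no true positives."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Illustrative counts: 84 correct positives, 16 false alarms, 16 misses.
score = f1_score(true_pos=84, false_pos=16, false_neg=16)
print(f"F1 = {score:.2f}")
```

The point of recording the raw counts next to the headline number is that a regulator, or your own reviewer, can recompute it.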

The Article 50 transparency layer (applies even to limited-risk systems)

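Even if your agent is not high-risk, if it talks to a person or generates content, you have transparency obligations: tell users they are dealing with an AI, mark generated content in a machine-readable way, and label deepfakes.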
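A rough sketch of the two most common cases: disclosing AI nature at the start of a chat, and attaching a machine-readable marker to generated content. The disclosure wording, the class name and the JSON field names are assumptions for illustration; the Act does not prescribe this envelope, and production systems may rely on watermarking or metadata standards instead.

```python
import json

AI_DISCLOSURE = "You are talking to an AI system, not a human."  # illustrative wording


class ChatSession:
    """Prepends the AI disclosure to the first reply of a conversation."""

    def __init__(self):
        self.disclosed = False

    def reply(self, text: str) -> str:
        if self.disclosed:
            return text
        self.disclosed = True
        return f"{AI_DISCLOSURE}\n\n{text}"


def mark_as_generated(content: str, model: str) -> str:
    """Wrap generated content in a JSON envelope that flags it as AI-generated."""
    return json.dumps({"content": content, "ai_generated": True, "generator": model})


session = ChatSession()
first = session.reply("Hello, how can I help?")
second = session.reply("Anything else?")
marked = mark_as_generated("Quarterly summary draft", model="example-model")
print(first)
print(marked)
```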

The penalties, made concrete

Three penalty tiers map to three classes of violation.

Penalty structure (whichever is higher):

  Prohibited practices (Tier 1 violations)
    €35,000,000  OR  7% of global annual revenue

  High-risk obligations violated (Tier 2 violations)
    €15,000,000  OR  3% of global annual revenue

  Misleading information to authorities
    €7,500,000   OR  1% of global annual revenue

The top tier sits above GDPR's maximum (€20 million or 4% of global annual revenue), a signal of how seriously the EU takes the prohibited practices. The percentages are also deliberately scaled to revenue, not profit, so that they bite even loss-making startups.

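For an operator the size of a small business (€1M annual revenue), the 3% figure works out to €30,000; for SMEs the Act caps the fine at the lower of the two amounts, so the percentage is the number that applies. Not catastrophic, but enough that you do not want to be the test case. For a mid-size operator at €100M revenue, the same percentage is €3M, but if the operator does not qualify as an SME, the whichever-is-higher rule puts the ceiling at the €15M fixed amount. For Salesforce at $35B+, you are looking at the kind of penalty that makes the front page.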
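The arithmetic can be sketched as a small function. The tier keys and function name are our own; the amounts and percentages come from the penalty structure above. The `is_sme` flag reflects our reading that the Act caps SME fines at the lower of the two amounts rather than the higher, and whether a given operator counts as an SME is a question for a lawyer, not for this function.

```python
# (fixed amount in EUR, percent of global annual revenue)
PENALTY_TIERS = {
    "prohibited_practice": (35_000_000, 7),
    "high_risk_obligation": (15_000_000, 3),
    "misleading_information": (7_500_000, 1),
}


def fine_ceiling(revenue_eur: int, tier: str, is_sme: bool = False) -> int:
    """Maximum fine in EUR for a violation class, given global annual revenue."""
    fixed, percent = PENALTY_TIERS[tier]
    from_revenue = revenue_eur * percent // 100   # integer maths avoids float rounding
    # Assumption: SMEs face the lower of the two amounts, everyone else the higher.
    return min(fixed, from_revenue) if is_sme else max(fixed, from_revenue)


small = fine_ceiling(1_000_000, "high_risk_obligation", is_sme=True)
mid_size = fine_ceiling(100_000_000, "high_risk_obligation")
large = fine_ceiling(1_000_000_000, "high_risk_obligation")
print(small, mid_size, large)
```

For the non-SME cases the fixed amount dominates until revenue passes €500M, which is where 3% overtakes €15M.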

The one-page operator checklist

This is the version we would print and tape to the wall. Each item maps to one of the seven high-risk obligations or the transparency layer.

EU AI Act compliance checklist — for the agent operator

[ ] 1. AI Inventory
        For every agent you operate, you can name: what it does,
        which model(s) it uses, which tools it has access to,
        who built it, who deploys it.

[ ] 2. Annex III Classification per agent
        For each agent, documented decision: is it high-risk?
        If yes, which Annex III category? If no, what was the
        reasoning? Keep this written down.

[ ] 3. Risk Register
        Living document. Foreseeable risks, likelihood, mitigation,
        owner. Refreshed quarterly minimum.

[ ] 4. Technical File
        10-20 pages per high-risk agent. Architecture, models,
        data sources, testing, mitigations, instructions for use.

[ ] 5. Automatic Logging on
        Every meaningful action logged. Retention >= 6 months
        (longer for regulated industries). W3C Trace Context.

[ ] 6. Deployer Instructions
        One-page fact sheet per agent: capabilities, limitations,
        do-not-use-for list, known error modes.

[ ] 7. Human Oversight Path
        For every high-risk agent: documented human who can
        monitor / intervene / override / disable. No agent in
        production without an off-switch.

[ ] 8. Performance Metrics
        Documented accuracy, robustness, cybersecurity testing
        results. Defensible with evidence.

[ ] 9. Transparency Disclosures
        Every chatbot tells the user it is AI. Every AI-generated
        output is marked. Deepfakes are labelled.

[ ] 10. Conformity Assessment (for high-risk)
        Self-assessment for most categories; third-party
        assessment for some. CE marking on the system. EU
        database registration.

[ ] 11. Provider/Deployer Matrix
        Written assignment: for each agent, who is the provider,
        who is the deployer, who carries which obligations.

[ ] 12. Incident Response Plan
        How you discover, document, and report serious incidents
        to the relevant national authority within the required
        timeline.

If you can tick all twelve before 2 August 2026, you are in a defensible position. If you cannot tick all twelve, prioritise: Inventory (item 1), Classification (item 2), Risk Register (item 3), Logging (item 5), Human Oversight (item 7), and Transparency (item 9) are the load-bearing ones. The rest are paperwork that you can build out in the months after the deadline, as long as those six are in place.

How Agent Builder maps to the obligations

The honest sales pitch: the reason we have been writing for a month about protocols (MCP, A2A, AP2, x402, ERC-8004) is that the protocols themselves give an operator most of the technical capabilities the AI Act demands. The mapping is direct.

Logging (Obligation 4) is automatic. Every MCP call, every A2A message, every AP2 mandate, every x402 payment is logged by Agent Builder with W3C Trace Context propagation. The 6-month retention is on by default; 12-month is one toggle.

Transparency (Article 50) ships by default. Every agent generated by Agent Builder discloses its AI nature at the start of an interaction. Outputs are tagged with the SynthID standard where the underlying model supports it. Deepfake-style outputs are gated behind explicit operator opt-in with watermarking enforced.

Human Oversight (Obligation 6) is the dashboard. The multi-agent operator dashboard we shipped lets a human monitor any agent, intervene in any task, override any decision and disable any agent — exactly the four affordances the Act requires.

Technical Documentation (Obligation 3) auto-generates. The Agent Builder export produces a Technical File draft for every agent: model used, tools connected, data flows, test results, risk register entries. The operator's job is to review and add the contextual narrative; the structured content is filled in.

Cybersecurity (part of Obligation 7) sits on microVM isolation. Every tool execution runs in a sandboxed environment. Prompt injection has dedicated defences (see tomorrow's threat-model post). Supply-chain provenance of MCP servers is tracked.

The pieces an Agent Builder operator still has to do themselves: the AI Inventory (it is your inventory), the Annex III Classification (only the operator knows the use case), the deployer-side instructions (you write them, the catalog includes templates), the conformity assessment (self-attested for most agents; we provide the template), the Provider/Deployer Matrix (operator-specific). These are the things no platform can do for you because they depend on your specific deployment context.

What to do this week

If you have 67 days and you have not started, here is the order we would tackle this in.

Week 1 (today through Sunday): Make the AI Inventory. Just a spreadsheet. One row per agent. Columns: name, what it does, model used, tools connected, who built it, who uses it, where the users are. If you do not know what you have, nothing else matters.
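If a spreadsheet feels like too much ceremony, the same inventory can start life as a CSV. The column names below mirror the ones listed above; the two rows are made-up examples, and the "EU" substring check is a deliberately crude way to spot agents with EU users.

```python
import csv
import io

COLUMNS = [
    "name", "what_it_does", "model_used", "tools_connected",
    "who_built_it", "who_uses_it", "where_users_are",
]

# Hypothetical rows, one per agent.
rows = [
    {
        "name": "cv-screener",
        "what_it_does": "Ranks job applicants",
        "model_used": "example-model-a",
        "tools_connected": "ats-api; email",
        "who_built_it": "us",
        "who_uses_it": "client HR team",
        "where_users_are": "EU; UK",
    },
    {
        "name": "recipe-bot",
        "what_it_does": "Suggests recipes",
        "model_used": "example-model-b",
        "tools_connected": "none",
        "who_built_it": "us",
        "who_uses_it": "our own site visitors",
        "where_users_are": "US",
    },
]

buffer = io.StringIO()   # swap for open("inventory.csv", "w", newline="") to write a file
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())

# Agents with any EU users are the ones to classify first.
eu_facing = [row["name"] for row in rows if "EU" in row["where_users_are"]]
print("EU-facing agents:", eu_facing)
```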

Week 2: Classify each agent. For each row in the inventory, ask: "is the agent making decisions in any of the Annex III categories (recruiting and HR, education, credit scoring, biometric identification, law enforcement, migration and border control, judicial and democratic processes, essential public services, critical infrastructure)?" If yes, the agent is high-risk and the seven obligations apply. If no, and the agent talks to people or generates content, you are in the limited-risk tier with only Article 50 transparency obligations; otherwise it is minimal risk. Write down the answer. The act of writing down the classification is what the law requires.
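The classification question can be turned into a first-pass screen. The area tags are our own shorthand for the Annex III categories named earlier in this post, and the function is only a triage aid: it does not check for prohibited practices, and the written decision still needs a human's reasoning behind it.

```python
# Shorthand tags for the Annex III areas named earlier in this post.
ANNEX_III_AREAS = {
    "recruiting_hr", "education", "credit_scoring", "biometric_identification",
    "law_enforcement", "migration_border_control", "justice_democratic_processes",
    "essential_public_services", "critical_infrastructure",
}


def classify(use_cases, interacts_with_people, generates_content):
    """Return (tier, matched_areas) as a first-pass screen for one agent."""
    matched = sorted(set(use_cases) & ANNEX_III_AREAS)
    if matched:
        return "high-risk", matched
    if interacts_with_people or generates_content:
        return "limited-risk", []
    return "minimal-risk", []


# Hypothetical agents from an inventory.
screener = classify({"recruiting_hr"}, interacts_with_people=False, generates_content=True)
chatbot = classify({"recipes"}, interacts_with_people=True, generates_content=True)
spam_filter = classify({"spam_filtering"}, interacts_with_people=False, generates_content=False)
print(screener, chatbot, spam_filter)
```

Storing the matched areas next to the tier gives you the "which Annex III category?" answer for free when you write the decision down.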

Week 3-4: For each high-risk agent, start the Technical File. Use the auto-generated draft from Agent Builder if you are on Agent Builder. Add the narrative context. Have a human review the output for at least the first three weeks of operation and document what they saw. The Technical File is alive, not a one-time write-up.

Week 5-6: Turn on logging if it is not already on, verify retention. Write the operator-facing instructions for each agent (the one-page fact sheet). Set up the human-oversight path with documented escalation rules.

Week 7-8: Conformity assessment. CE marking. EU database registration. These are the formal-paperwork items that you cannot do until the underlying obligations are in place, which is why they come last.

Week 9: Have a real human read your work and find the gaps. Ideally a lawyer who has read the Act, but in the absence of that, anyone with discipline who has not been involved in building the agents. The fresh eyes will catch the obligations you missed.

Closing — the unglamorous truth

The EU AI Act is not the most exciting piece of legislation ever written. It is, however, the most consequential one for anyone running AI agents in the next decade. The deadline is real, the penalties are real, and the operators who treat it as a paperwork exercise are going to regret that framing the first time a national authority asks to see their Technical File.

The operators we work with who are taking it seriously have all said the same thing: once you do the inventory and the classification, the rest of the work is not heavy — it is mostly documentation of things you should be doing anyway. Risk register, automatic logging, human oversight, transparency disclosures. These are the practices of a serious operator regardless of regulation. The Act just forces you to write them down.

If you are running agents through Agent Builder, you are starting from a position where the technical implementation of most obligations is already done. The operator's job is to write the documentation that says "here is what my deployment of this technology actually does, here are the risks I have considered, here is how I supervise it." That is the work for the next 67 days.

The European Commission's overview of the Act, which links to the official text, is at digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai. The official FAQ is at ai-act-service-desk.ec.europa.eu/en/faq. Read both. If you operate at any meaningful scale, talk to a lawyer who specialises in the AI Act. The €15M penalty is not the kind of mistake you fix later.

Tomorrow we publish the security threat-model post that goes deep on Obligation 7 (cybersecurity) — the kinds of attacks agent operators have to defend against today, and the ones we expect to see in the next eighteen months. Read that one too; the two posts are halves of the same conversation.