What happens when the AI gets it wrong? An honest answer

AI agents make mistakes. How UK businesses design for failure, what guardrails really look like, and how to handle the moment something goes wrong.

[Illustration: a safety net underneath a tightrope walker]

One of the smartest questions a UK business owner asks on the first call is, "What happens when the AI gets it wrong?" Not "will it" but "when". Anyone who tells you the agent will be perfect is selling you a story.

The honest answer is the kind of thing nobody puts on their website, so let us put it here.

AI agents do make mistakes

They are statistical systems. Statistical systems are right most of the time and wrong some of the time. The job of designing one is to make the wrong cases as cheap and as recoverable as possible. Not zero. Cheap and recoverable.

The right question is not "Can we eliminate errors?" It is "What kinds of errors are acceptable, and what is our recovery plan when they happen?"

The four kinds of mistakes

For practical purposes, AI agents go wrong in four distinct ways.

1. Wrong answer, confidently

The most famous one. The model is sure of itself and also wrong. With a well-scoped agent, on a job it has been built for, this is rare. With a generic chatbot doing whatever a customer asks, it is common.

2. Wrong action

The agent does something it should not have done. Sent a message it should have queued. Categorised an invoice incorrectly. Booked the wrong slot.

3. Misses the point

The customer is upset and the agent is being cheerfully helpful. The customer is asking about a refund and the agent is talking about delivery times. Tone-deaf, not technically wrong.

4. Goes silent

The agent does not respond, or sends an unhelpful "I am not sure how to help with that". Frustrating for the customer, but at least nothing harmful goes out.

Each of these has different consequences and different design responses.

What guardrails actually look like

"Guardrails" is the term of art. In plain English, it means the safety nets you build into the agent so the wrong cases do not cost you customers or money. The good ones come in layers.

Input filters

Recognising when a request is genuinely outside the agent's scope, or when a customer is upset, or when something looks suspicious. The agent steps back rather than ploughing on.
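If you are curious what that looks like under the bonnet, here is a minimal sketch in Python. The topic and distress keyword lists are invented for illustration; a real filter would use a proper classifier rather than keyword matching.

```python
# Minimal input-filter sketch. The categories and keyword lists are
# illustrative placeholders, not a production classifier.

IN_SCOPE_TOPICS = {"delivery", "order", "booking", "opening hours"}
DISTRESS_SIGNALS = {"furious", "complaint", "unacceptable", "solicitor"}

def filter_input(message: str) -> str:
    """Return a routing decision: 'handle', 'escalate', or 'decline'."""
    text = message.lower()

    # An upset customer should reach a human, not a cheerful bot.
    if any(word in text for word in DISTRESS_SIGNALS):
        return "escalate"

    # Only proceed if the request touches something the agent was built for.
    if any(topic in text for topic in IN_SCOPE_TOPICS):
        return "handle"

    # Anything else is out of scope: step back rather than plough on.
    return "decline"

print(filter_input("When will my order arrive?"))             # handle
print(filter_input("This is unacceptable, I want a refund"))  # escalate
print(filter_input("Write me a poem about tax law"))          # decline
```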

Output checks

Reviewing what the agent is about to send. Some checks are automated (does this contain a price the customer did not pay, is this a refund larger than X, does this look like a complaint waiting to happen). Some are human, especially in early days.
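As a rough sketch, an automated output check can be a short list of rules a draft must pass before it leaves. The refund cap and the pattern below are made up for the example; real rules come from your own policies.

```python
import re

# Illustrative output checks. The refund cap and the £-amount pattern are
# invented for this example; real checks come from the business's policies.
REFUND_CAP = 100.00

def review_draft(draft: str, order_total: float) -> list[str]:
    """Return reasons to hold the draft back; an empty list means send."""
    holds = []
    amounts = [float(m) for m in re.findall(r"£(\d+(?:\.\d{2})?)", draft)]

    # Does the draft quote a figure the customer never actually paid?
    for amount in amounts:
        if amount != order_total:
            holds.append(f"quotes £{amount:.2f}; the order total is £{order_total:.2f}")

    # Is it promising a refund above the agent's limit?
    if "refund" in draft.lower() and any(a > REFUND_CAP for a in amounts):
        holds.append(f"offers a refund above the £{REFUND_CAP:.0f} cap")

    return holds

print(review_draft("Your order came to £45.00 and is on its way.", 45.00))  # []
print(review_draft("We have issued a refund of £250.00.", 45.00))           # two holds
```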

Action limits

The agent only has the permissions it needs. If you do not want it issuing refunds over £100, it cannot. If you do not want it sending external emails without human approval, it cannot. The boundaries are technical, not just policy.
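Here is what a hard limit looks like sketched in code. The £100 cap mirrors the example above, and process_refund() is a hypothetical stand-in for a payment-provider call, not a real API.

```python
# A hard action limit, sketched. The cap mirrors the £100 example above;
# process_refund() is a placeholder, not a real payment API.

MAX_REFUND = 100.00

class RefundLimitExceeded(Exception):
    pass

def process_refund(order_id: str, amount: float) -> None:
    # Stand-in for the real payment-provider call.
    print(f"refunded £{amount:.2f} on order {order_id}")

def issue_refund(order_id: str, amount: float) -> None:
    # The limit lives in code, where the model cannot talk its way past it.
    if amount > MAX_REFUND:
        raise RefundLimitExceeded(
            f"£{amount:.2f} is over the £{MAX_REFUND:.0f} limit; route to a human"
        )
    process_refund(order_id, amount)

issue_refund("ORD-1042", 30.00)  # goes through
try:
    issue_refund("ORD-1043", 250.00)
except RefundLimitExceeded as err:
    print("blocked:", err)
```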

Human in the loop

For everything sensitive, especially in early operation, the agent's output goes to a human before it goes out. As confidence builds, you turn off the approval for the simple cases. You keep it on for the ones where mistakes are expensive.
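A minimal sketch of that routing, assuming a simple keyword test for "sensitive" and an in-memory queue standing in for a real review tool:

```python
# Human-in-the-loop routing, sketched. The sensitivity markers and the
# in-memory queue are placeholders for a real review workflow.

review_queue: list[dict] = []
SENSITIVE_MARKERS = ("refund", "complaint", "cancel", "legal")

def dispatch(reply: str, send) -> str:
    """Send simple replies straight out; hold sensitive ones for a person."""
    if any(marker in reply.lower() for marker in SENSITIVE_MARKERS):
        review_queue.append({"draft": reply, "status": "awaiting approval"})
        return "queued"
    send(reply)
    return "sent"

print(dispatch("Your order ships tomorrow.", send=print))             # sent
print(dispatch("We can cancel and refund your order.", send=print))   # queued
print(len(review_queue), "draft(s) awaiting human approval")
```

As confidence builds, the marker list shrinks: the simple cases go straight out and the expensive ones keep their human check.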

Escalation

The agent recognises when it should hand to a human, summarises the conversation, and stops. Customers get a person. The team gets the context. The agent does not pretend to be something it is not.
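Sketched in code, the handoff is just a structured package and a full stop. The one-line summary below is a placeholder for whatever summarisation the agent actually uses:

```python
# Escalation, sketched. The summary line is a trivial stand-in for the
# agent's real summarisation step.

def escalate(conversation: list[str]) -> dict:
    """Package the conversation for a human, then stop."""
    return {
        "summary": f"{len(conversation)} messages; last: {conversation[-1]!r}",
        "transcript": conversation,
        "next_step": "human reply required",
        # Crucially, nothing further is sent to the customer from here on.
    }

ticket = escalate([
    "Where is my order?",
    "It was due Tuesday and nobody has replied.",
])
print(ticket["summary"])
```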

A good AI agent is not the one that never makes a mistake. It is the one whose mistakes are cheap, visible, and quickly fixed.

The practical recovery plan

When something does go wrong (and at some point it will), the response is mostly about logistics, not technology.

  • Logging. You need to be able to see exactly what the agent saw, what it decided, and what it did. If you cannot, you cannot fix anything. (A minimal sketch of what this looks like follows this list.)
  • Monitoring. Quiet, in the background. Flagging anomalies, escalating any output that looks off-pattern. We use a mix of automated checks and weekly human review.
  • Recovery process. A simple, written procedure for the team. If a customer reports the agent made a mistake, here is the apology, here is the correction, here is who fixes the underlying issue.
  • Iteration. Every issue feeds back into the agent's design. After three months, the same kind of issue should be rare.
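To make the logging point concrete, here is a minimal sketch. The three fields are an illustration of "what it saw, what it decided, what it did", not a fixed schema:

```python
import datetime
import json

# Structured logging, sketched. Each step records what the agent saw,
# what it decided, and what it did, so issues can be traced afterwards.

def log_step(saw: str, decided: str, did: str) -> None:
    record = {
        "at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "saw": saw,          # the input the agent received
        "decided": decided,  # the route or action it chose
        "did": did,          # what actually went out
    }
    print(json.dumps(record))  # in production: append to a durable store

log_step(
    saw="customer asked for a £250 refund",
    decided="escalate (over refund limit)",
    did="queued for human review",
)
```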

The risk profile by use case

Some agents have a much wider tolerance for error than others. It is worth being clear-eyed about which category yours sits in.

  • Low risk. Internal knowledge agents, document drafting, lead qualification. Wrong answers cost five minutes.
  • Medium risk. Customer support, scheduling, sales follow-up. Wrong answers can cost a customer.
  • High risk. Financial actions, regulated communications, anything compliance-adjacent. Wrong answers can cost a fine. These need much tighter design.

The risk profile shapes how much guardrail you build in. Not every project needs maximum safeguards. None should have none.
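If it helps to picture it, here is one illustrative way to express that as configuration. The tiers follow the list above; every specific setting is an invented example, not our actual defaults:

```python
# An illustrative mapping from risk tier to guardrail settings. The tiers
# follow the list above; every specific value is an invented example.

GUARDRAILS_BY_RISK = {
    "low":    {"human_approval": "none", "output_checks": "light", "review": "monthly"},
    "medium": {"human_approval": "sensitive cases", "output_checks": "standard", "review": "weekly"},
    "high":   {"human_approval": "everything", "output_checks": "strict", "review": "daily"},
}

def guardrails_for(risk: str) -> dict:
    # No tier gets zero safeguards; anything unknown defaults to the strictest.
    return GUARDRAILS_BY_RISK.get(risk, GUARDRAILS_BY_RISK["high"])

print(guardrails_for("medium"))
```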

What we will not pretend

We will not pretend our agents never make mistakes. They do. We will tell you the kind of mistakes the agent we are proposing might make, and how we have designed for them. If a vendor cannot have that conversation honestly, that is a tell.

If you would like to talk through the failure modes for a specific project you are considering, drop us a line. We will be honest about where the risks are and how we would handle them. For the technical side of how guardrails actually work, our piece on AI guardrails goes deeper.

Could AI help your business?

If you'd like to talk it through, the first call is 30 minutes, free, and there's no sales pitch. We'll tell you honestly whether AI is worth your time and money.