How does an AI agent work?
So far, we've discussed AI systems that respond to requests—you provide input, you get output. AI agents take this a step further: the outputs trigger actions in other systems, not just text on a screen.
From Chatbot to Agent
A standard chatbot works like this:
You provide an input
The system generates a response
You read the response and decide what to do with it
An AI agent adds additional steps:
You provide a goal or task
The system generates a plan as text
That text is used to automatically trigger actions: sending emails, booking appointments, writing files, querying databases
The results feed back into the system as new input
The loop continues until a completion condition is met (or the system gets stuck)
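If you're curious what that loop looks like in code, here is a minimal sketch. The function names (call_llm, parse_tool_call, run_tool) are placeholders for whatever model API and tool integrations a real system would use; they are assumptions for illustration, not any particular product's interface.

```python
# A minimal agent loop, for illustration only.
# call_llm, parse_tool_call, and run_tool are hypothetical stand-ins.

def run_agent(goal: str, max_steps: int = 10) -> str:
    context = [f"User goal: {goal}"]

    for _ in range(max_steps):
        # The LLM generates text from everything accumulated so far.
        output = call_llm("\n".join(context))
        context.append(output)

        # The surrounding software checks whether that text requests an action.
        tool_call = parse_tool_call(output)
        if tool_call is None:
            return output  # no action requested: treat the text as the final answer

        # Execute the requested action and feed the result back in as new input.
        result = run_tool(tool_call)  # dispatch to the relevant tool (hypothetical helper)
        context.append(f"Tool result: {result}")

    return "Stopped: step limit reached before the task completed."
```

Note that nothing in this loop is intelligent in itself; it is plain control flow wrapped around repeated calls to the model.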
The key difference: agent outputs are connected to other systems that execute actions, rather than just being displayed to you.
Put another way: with a standard chatbot, a human is always in the loop. You see the output and decide whether to act on it (copy the text, send the email, run the code).
An agentic system removes this checkpoint. The user hands responsibility for deciding what to do with the LLM's output over to the software itself. This means you must place real trust in the LLM's output, and in the infrastructure around it, because no human reviews each action before it happens.
What Makes an Agent an "Agent"?
An AI agent is typically a chatbot (powered by an LLM) that engineers have extended with:
Tools: External systems the agent can invoke, such as web browsers, code interpreters, email clients, calendars, databases, and APIs.
Permissions: Authorisation to act on your behalf, such as reading your files, sending messages as you, making purchases, or modifying documents.
A loop: Rather than running once and stopping, agent software runs in a cycle: pass the current state to the LLM, parse the output for actions, execute those actions, append the results to the context, repeat. This continues until the output signals completion or the system hits a limit.
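In practice, "tools" and "permissions" often amount to a registry: a fixed list of functions the agent is allowed to invoke, each tied to a scope of authorisation. The sketch below is illustrative only; the tool names, scopes, and stubbed functions are assumptions, not any specific framework's design.

```python
# Illustrative tool registry. The agent can only invoke what is listed here,
# and each entry records what it is authorised to touch. All names are made up,
# and the functions are stubbed so the example runs; a real system would call real APIs.

ALLOWED_TOOLS = {
    "read_calendar": {"func": lambda **kw: "stubbed availability", "scope": "calendar:read"},
    "create_event":  {"func": lambda **kw: "stubbed confirmation", "scope": "calendar:write"},
    "send_email":    {"func": lambda **kw: "stubbed send receipt", "scope": "email:send"},
}

def dispatch_tool(name: str, args: dict, granted_scopes: set) -> str:
    entry = ALLOWED_TOOLS.get(name)
    if entry is None:
        return f"Error: unknown tool '{name}'"        # the LLM asked for something not offered
    if entry["scope"] not in granted_scopes:
        return f"Error: '{name}' is not permitted"    # permission check before anything real happens
    return entry["func"](**args)                      # only now does an external action occur
```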
How the Loop Works in Practice
First, it's important to understand "context." Every time an LLM generates output, it does so based on all the text it has been given as input; this is the context.
In an agentic system, the context grows with each iteration: the original request, the tool instructions, the first output, the results from the first tool call, the second output, the results from the second tool call, and so on. The LLM doesn't remember previous runs; it just processes whatever text is in the current context window.
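Concretely, the context is just accumulated text. Here is a rough illustration of what it might contain after a couple of iterations; the format and tool names are invented, and real systems use more structured message formats.

```python
# Illustration only: the context is accumulated text, re-sent to the model every iteration.

context = [
    "Instructions: you may request a tool by emitting a JSON command.",
    "User: summarise the three most recent files in the reports folder.",
    # Iteration 1: the model's output and the tool's result both become new input.
    'Model: {"tool": "list_files", "args": {"path": "reports"}}',
    "Tool result: q1.md, q2.md, q3.md",
    # Iteration 2: the model now sees all of the above before producing its next output.
    'Model: {"tool": "read_file", "args": {"path": "reports/q3.md"}}',
    "Tool result: ...",
]
```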
Imagine you ask an agent: "Find a time next week when both Sarah and I are free, and schedule a 30-minute meeting about the project update."
Here's what happens inside the system:
Step 1: Initial Processing
The LLM receives your request along with instructions (written by engineers) about available tools and how to format tool calls. It generates text that includes a structured command to check your calendar.
Step 2: Tool Execution
The surrounding software parses this output, extracts the calendar query, and calls the calendar API. The results (your availability) are appended to the context.
Step 3: Next Iteration
The LLM now runs again with the updated context. It generates another tool call to check Sarah's calendar. The software executes this and appends those results.
Step 4: Processing and Action
With both calendars in context, the LLM generates output identifying overlapping slots and a command to create a meeting at a specific time.
Step 5: Execution and Completion
The software parses the meeting creation command, calls the calendar API to create the event, and appends confirmation. The LLM generates a final response reporting what was done, and the loop terminates.
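Logged in a structured form, the tool calls produced by this run might look roughly like the trace below; the tool names, argument format, and times are all invented for illustration.

```python
# Hypothetical trace of the calendar example; every name and value here is invented.

trace = [
    {"step": 1, "tool": "read_calendar", "args": {"person": "me", "week": "next"}},
    {"step": 2, "tool": "read_calendar", "args": {"person": "sarah", "week": "next"}},
    {"step": 3, "tool": "create_event",
     "args": {"title": "Project update", "start": "Tuesday 14:00", "minutes": 30,
              "attendees": ["me", "sarah"]}},
    {"step": 4, "tool": None,
     "summary": "Booked a 30-minute project update with Sarah on Tuesday at 14:00."},
]
```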
Throughout this process, the LLM is doing the same thing it always does: generating statistically probable text. The "agency" comes from the software infrastructure that parses that text and routes it to real systems.
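The routing itself is ordinary software. A simplified version of the parsing step might look like the sketch below; real systems typically rely on stricter structured-output formats and validation, so treat this purely as an illustration.

```python
import json
import re

def parse_tool_call(model_output: str):
    """Find a JSON object in the model's text and treat it as a tool request.

    Illustration only: scanning free text with a regex is far cruder than what
    production systems do, but the principle is the same.
    """
    match = re.search(r"\{.*\}", model_output, re.DOTALL)
    if not match:
        return None            # no command found: the text is just a reply for the user
    try:
        call = json.loads(match.group())
    except json.JSONDecodeError:
        return None            # looked like a command but wasn't valid JSON
    return call if "tool" in call else None   # e.g. {"tool": "read_calendar", "args": {...}}
```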
Why Agents Introduce New Risks
Connecting LLM outputs to real systems amplifies both capabilities and risks:
Compounding Errors: When a chatbot produces incorrect output, you see it and can ignore it. When an agent produces incorrect output, that output may trigger actions, and subsequent iterations build on those flawed results. Small errors cascade before anyone notices.
Expanded Attack Surface: Every connected tool is a potential vulnerability. If the system can send emails, a manipulated prompt could trigger harmful messages. If it can execute code, it could be tricked into running malicious commands.
Reduced Oversight: The value proposition of agents is that they operate without constant human supervision. But this means problems may not be caught until after real-world consequences have occurred.
Specification Gaming: Systems optimise for whatever completion conditions engineers define. If those conditions don't perfectly capture the intended goal, the system may find unexpected paths that technically satisfy the criteria but violate the spirit of what was wanted.
Permission Creep: There's pressure to connect agents to more systems so they can do more useful things. But broader permissions mean larger potential consequences when things go wrong.
The Trust Question for Agents
Agentic AI makes trustworthiness more critical. When an AI system only generates text, mistakes mean bad advice or wasted time. When outputs connect to real systems, mistakes can mean sent emails, deleted files, unauthorised purchases, or leaked information.
The characteristics of trustworthy AI (validity, reliability, safety, security, transparency, accountability) become even more important when AI is connected to systems that affect the real world. You're not just trusting the system to generate useful text; you're trusting the entire pipeline to behave appropriately when acting on your behalf.
The Bottom Line
AI agents are chatbots connected to external systems: tools that execute actions, permissions that authorise access, and loops that iterate until tasks complete.
The underlying technology is the same LLM we discussed earlier: a system that generates statistically probable text based on its training data and current context. What's different is the infrastructure humans have built around it: the tools, the parsers, the permissions, the guardrails.
This means the quality of an agent depends heavily on the quality of that human-designed infrastructure. Robust systems with appropriate permissions and adequate oversight can be genuinely useful. Poorly designed systems with excessive permissions and inadequate safeguards can cause real harm.
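For instance, one basic form of oversight is to pause for human confirmation before actions the designers have flagged as hard to undo. A rough sketch, where the set of "high-risk" actions and the dispatch function are assumptions for illustration:

```python
# Illustrative guardrail: ask a human before any action flagged as hard to undo.
# Which actions count as high risk is a design decision; this set is invented.

HIGH_RISK = {"send_email", "delete_file", "make_purchase"}

def execute_with_oversight(tool_name: str, args: dict, dispatch) -> str:
    """dispatch is whatever function actually runs the tool (passed in for illustration)."""
    if tool_name in HIGH_RISK:
        answer = input(f"Agent wants to run {tool_name} with {args}. Allow? [y/N] ")
        if answer.strip().lower() != "y":
            return "Action declined by user."
    return dispatch(tool_name, args)
```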
As with all AI systems, the LLM generates text based on statistical patterns. Any appearance of judgment, planning, or intentionality comes from how engineers have structured the surrounding system — not from the model itself.