Everybody talks about agents, and a lot of people assume they’re some new kind of model. They aren’t. An agent is a small amount of plumbing around an LLM you already understand. Let’s build one from scratch in Python and see exactly what that plumbing is.
The formula
An agent is: Model + Instructions + Memory + Tools + Execution Loop.
Five parts. None of them is magic. The model is a brain in a jar: useful, fast, but stateless. It generates text; the code around it decides what to do with that text. That second half is the entire job and it’s code we can reason about.
I made the same argument about the control layer being the real product. Here it is as a program.
Start with the model. A real one calls an LLM API; we use a fake one that satisfies the same interface:
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Say:
    text: str


@dataclass(frozen=True)
class Call:
    tool: str
    arg: str


Reply = Say | Call


class Model(Protocol):
    def respond(self, system: str, history: list[str]) -> Reply: ...
The Model protocol has a single method, respond, which takes the system prompt and the conversation history and returns a Reply. It’s a Protocol, so any object with a matching respond method counts as a Model, no inheritance required.
For this minimal agent, the Reply type captures the two actions we support: say something to the user, or call a tool with an argument. The model is free to return either one, and the agent will execute it. (Real models can also emit plans, ask clarifying questions, or request several tool calls at once; we keep it to two to stay legible.)
The agent’s entire decision space is those two variants. The match in the loop below reads as a clean two-way branch, one case per reply, instead of a tangle of flags.
from dataclasses import dataclass, field
from typing import Callable

Tool = Callable[[str], str]


@dataclass
class Agent:
    model: Model                                          # 1. Model
    system: str                                           # 2. Instructions
    history: list[str] = field(default_factory=list)      # 3. Memory
    tools: dict[str, Tool] = field(default_factory=dict)  # 4. Tools
In this example, a tool is a function taking a string and returning a string. The agent holds the first four parts of the formula as plain fields:
- The model is any object satisfying the Model protocol: a fake model goes in for testing and a real one for production.
- The system prompt is a string that tells the model what to do.
- The history is the agent’s working memory: the conversation and tool outputs that get replayed back into the model. Real agents often add retrieval, summarization, or external state on top, because context windows are finite.
- The tools field is a mapping of tool names to functions that implement them.
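A tool mapping can be as plain as a dict literal. The sketch below uses two made-up tools, shout and word_count, purely for illustration, and looks one up by name the same way the agent does:

```python
from typing import Callable

Tool = Callable[[str], str]


def shout(text: str) -> str:
    """Hypothetical tool: upper-case the argument."""
    return text.upper()


def word_count(text: str) -> str:
    """Hypothetical tool: count whitespace-separated words."""
    return str(len(text.split()))


# Tool names are the strings the model is expected to emit.
tools: dict[str, Tool] = {"shout": shout, "word_count": word_count}

print(tools["shout"]("hello"))            # HELLO
print(tools["word_count"]("a b c"))       # 3
print(tools.get("delete_everything"))     # None
```

Because every tool shares the str-to-str shape, adding one is a single new dict entry and no change to the agent.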
The loop is the agent
The part that turns a well-instructed chatbot into something agent-like is the fifth piece: an execution loop that lets the model observe outcomes and decide what to do next. Observe, think, act, check, repeat. Greatly simplified, of course, but this is the piece that does the work.
Because the model is stateless, the agent must keep track of what happened and feed the history back into the model until the model decides the job is done.
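Replaying the full history on every call means the input grows with each turn. One simple way to bound it, sketched here under the assumption that dropping the oldest lines is acceptable, is a sliding window. The window function is hypothetical and not part of the agent in this post:

```python
def window(history: list[str], max_entries: int) -> list[str]:
    """Return at most max_entries of the most recent history lines."""
    if max_entries <= 0:
        # Guard: history[-0:] would return the whole list, not an empty one.
        return []
    return history[-max_entries:]


history = ["user: read a.txt", "tool[read_file]: 12 bytes", "agent: Done"]
print(window(history, 2))  # ['tool[read_file]: 12 bytes', 'agent: Done']
```

A real agent would likely want something smarter, such as summarizing what it drops, but the idea is the same: the code decides what the model gets to see.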
    def run(self, user_input: str) -> str:
        self.history.append(f"user: {user_input}")
        while True:  # real agents cap the iterations; see termination guards below
            match self.model.respond(self.system, self.history):
                case Say(text):
                    self.history.append(f"agent: {text}")
                    return text
                case Call(tool, arg):
                    fn = self.tools.get(tool)
                    result = fn(arg) if fn else f"no such tool: {tool}"
                    self.history.append(f"tool[{tool}]: {result}")
                    # loop again: the model sees the result and decides what's next
Read it as the cycle:
- Observe: append the input.
- Think: ask the model.
- Act: if it asked for a tool, run the tool.
- Check and repeat: feed the result back into the history and loop, so the model sees what happened and decides whether it needs another tool or can finally answer.
There is no separate "check" block in the code. The check happens implicitly when the loop restarts and calls respond again with the new history. That step is the one that matters, because a model has no native sense of when a job is finished, and nothing stops it from asking for one more tool forever. The loop keeps going until the model returns Say instead of Call.
To run the whole thing without an API key, swap in a fake model and a real tool:
from pathlib import Path


def read_file(path: str) -> str:
    try:
        # read_bytes so the count really is bytes, not decoded characters
        return f"{len(Path(path).read_bytes())} bytes"
    except OSError as e:
        return f"error: {e}"
class FakeModel:
    def respond(self, system: str, history: list[str]) -> Reply:
        last = history[-1] if history else ""
        if last.startswith("tool["):
            return Say(f"Done: {last}")
        if last.startswith("user: read "):
            return Call("read_file", last.removeprefix("user: read ").strip())
        return Say("I can read files. Try: read <path>")
Wire it into a small main that builds the agent, reads a line, calls agent.run, and prints the reply:
def main() -> None:
    agent = Agent(
        model=FakeModel(),
        system="You can read files.",
        tools={"read_file": read_file},
    )
    while True:
        try:
            line = input("> ")
        except EOFError:
            break
        print(agent.run(line.strip()))


if __name__ == "__main__":
    main()
Now you can talk to it with no API key. Run it with python agent.py and type at the prompt:
> read pyproject.toml
Done: tool[read_file]: 76 bytes
That one exchange is a complete agent loop: the model asked for a tool, the loop ran it, fed the byte count back, and the model wrapped up on the second pass. The main thing standing between it and a real one is replacing FakeModel.respond with an HTTP call that returns the same Reply.
The whole thing as one runnable file is here as a GitHub gist. Save it, run python agent.py, and type at the prompt.
What this earns you
Sure, this is a simplified example, and the hard parts are exactly what FakeModel stubs out: prompt design, retries, tool schemas, context compaction, error recovery, and termination guards that stop the loop when a model keeps hallucinating tools. But the core of an agent is 60 lines and easy to reason about. The engineering lives in the control layer around the model.
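To give one of those hard parts a concrete shape: in the loop above, a tool that raises would crash the agent. A rough sketch of error recovery, using a hypothetical safe_call helper, turns the failure into text the model can read on its next pass:

```python
from typing import Callable

Tool = Callable[[str], str]


def safe_call(tools: dict[str, Tool], name: str, arg: str) -> str:
    """Hypothetical wrapper: never raises, always returns text for the history."""
    fn = tools.get(name)
    if fn is None:
        return f"no such tool: {name}"
    try:
        return fn(arg)
    except Exception as e:  # broad on purpose: any tool failure becomes feedback
        return f"error: {type(e).__name__}: {e}"


def parse_int(text: str) -> str:
    """Made-up tool that fails on bad input."""
    return str(int(text))


tools = {"parse_int": parse_int}
print(safe_call(tools, "parse_int", "42"))  # 42
print(safe_call(tools, "missing", "x"))     # no such tool: missing
```

Whether to catch everything or only specific exceptions is a judgment call; the point is that the failure lands in the history instead of ending the run.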
Build the loop by hand once and frameworks stop feeling magical. LangChain’s agent executor, AutoGen’s shared memory, a coding agent’s plan mode are all variations on these same five parts: engineering tradeoffs, not magic.
Keep reading
- The control layer is the product
- How an AI expense agent is actually structured
- Building an AI Agent in 6 Weeks (and Finally Understanding How They Work)