Control Flow in AI Agents
Lorenzo Fontoura — December 31st, 2024
Agents are autonomous systems that leverage AI tools such as LLMs and embeddings. Agentic systems and frameworks have been gaining more attention as the base of the AI tech stack becomes more commoditised, with various options for LLMs and vector DBs, and high-quality open-source models capable of decent reasoning, such as Llama 3.3 and Qwen 2.5.
Agent development will likely see significant growth in 2025. Developers need stable and flexible frameworks for building solid agents that are capable of handling production tasks (and not just producing slop).
But what are the different paradigms in which agent software can be designed? One of the things to consider is how the flow of control happens in these agents. Control flow refers to the order in which operations are performed in software, and it seems like a major design piece in agent software.
I believe there are two methods of agentic control flow:
- Structured flow
- LLM flow
In structured flow, the source code defines which operations a program performs and in what order. In LLM flow, the LLM is responsible for deciding which operations should happen and when.
An agent can use either or both methods of control flow.
How does agentic "structured flow" work?
It's a regular program that uses LLMs and other AI tools such as embeddings. In structured flow, however, the LLM output does not decide the flow of the program; it only manipulates data.
For example:
- periodically fetch unread emails from API
- summarise emails with an LLM
- convert unread emails to embeddings and perform AI search to show related emails
- send summary and relevant emails as a notification somewhere via an API
This has a structured control flow but it can still be considered an AI agent.
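The steps above can be sketched in Python as a minimal illustration. The `Email` type, the function names and the stand-in lambdas are all assumptions made for this example, not part of any real email or LLM API. The point is that the order of operations is fixed in source code, and the LLM output is treated purely as data.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Email:
    subject: str
    body: str


def run_structured_flow(
    fetch_unread: Callable[[], list[Email]],
    summarise: Callable[[str], str],
    find_related: Callable[[Email], list[Email]],
    notify: Callable[[str], None],
) -> int:
    """Run one iteration. The order of the steps is fixed here, in source code."""
    emails = fetch_unread()
    if not emails:
        return 0
    lines = []
    for email in emails:
        summary = summarise(email.body)   # LLM output is used only as data
        related = find_related(email)     # embedding search result, also only data
        related_subjects = ", ".join(r.subject for r in related) or "none"
        lines.append(f"{email.subject}: {summary} (related: {related_subjects})")
    notify("\n".join(lines))
    return len(emails)


# Hypothetical stand-ins so the sketch runs without any external service.
inbox = [
    Email("Invoice #12", "Your invoice for December is attached."),
    Email("Lunch?", "Are you free on Friday?"),
]
sent: list[str] = []

processed = run_structured_flow(
    fetch_unread=lambda: inbox,
    summarise=lambda text: text.split(".")[0],  # placeholder for an LLM call
    find_related=lambda email: [],              # placeholder for a vector search
    notify=sent.append,                         # placeholder for a notification API
)
```

In a real agent, each lambda would be replaced by a call to the relevant API, and `run_structured_flow` would be invoked on a timer. Nothing the LLM returns can change which step runs next.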
How does agentic "LLM flow" work?
With agentic LLM flow, the LLM is responsible for deciding what to execute.
For example, imagine a program that periodically queries an LLM with the following prompt:
Fetch unread emails. Write a list of all emails that are related to incoming or outgoing charges, billing receipts or invoices. For each of the filtered emails (if any), write a one-sentence summary and the amount labeled as debit or credit. One line per email. Compile the list with all debits and credits and the net amount at the bottom line. Send the result as a text message to my phone number.
This example prompt provides instructions for the LLM to follow to achieve a desired outcome (receiving a report via SMS). But how this gets executed exactly is up to the LLM to decide.
This prompt could be given to a software developer and they could write this program in a structured way like the previous example, and run the program periodically to achieve the same outcome.
The LLM could also use this prompt to generate the structured program and run it periodically (e.g. write the program in Cursor). This approach would essentially take the prompt and produce source code to run as structured flow.
Another way is to expose the required tools and APIs to the LLM, and periodically prompt the LLM to have it run the procedure dynamically. This would be a dynamic way of running the agent, where the LLM inference decides what tools to call and when/how to call them based on the prompt.
Therefore, there are two kinds of LLM flow: static and dynamic. In static LLM flow, the LLM produces source code. In dynamic LLM flow, the LLM has access to tools and can map data from and to tools at its own discretion based on the input prompt.
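Static LLM flow can be sketched roughly as follows. Here `generate_program` is a hypothetical stand-in that returns a fixed string where a real system would call an LLM, and the email dictionaries are made-up sample data. The source is generated once and then runs like any structured program.

```python
from typing import Any, Callable


def generate_program(prompt: str) -> str:
    """Stand-in for an LLM call that returns Python source for the prompt.

    A real implementation would send `prompt` to a model; this one returns
    a fixed program so that the sketch is reproducible.
    """
    return (
        "def run(emails):\n"
        "    billing = [e for e in emails if e['kind'] in ('debit', 'credit')]\n"
        "    net = sum(e['amount'] if e['kind'] == 'credit' else -e['amount']\n"
        "              for e in billing)\n"
        "    return f'{len(billing)} billing emails, net {net:.2f}'\n"
    )


def load_program(source: str) -> Callable[[list[dict[str, Any]]], str]:
    """Turn generated source into a callable.

    exec() on model output is unsafe in general; it is used here only
    because the source is a fixed stand-in. Real generated code would need
    review or sandboxing before being run.
    """
    namespace: dict[str, Any] = {}
    exec(source, namespace)
    return namespace["run"]


# The program is generated once...
source = generate_program("Summarise billing emails and report the net amount.")
run = load_program(source)

# ...and then executed on every iteration like any structured flow.
emails = [
    {"subject": "Receipt", "kind": "debit", "amount": 30.0},
    {"subject": "Refund", "kind": "credit", "amount": 12.5},
    {"subject": "Lunch?", "kind": "other", "amount": 0.0},
]
report = run(emails)
```

Once generated, the control flow can be read and analysed from `source`, which is what distinguishes this from the dynamic approach.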
For the dynamic approach to work, the agent needs access to the relevant systems as plugins and the LLM is responsible for calling them correctly and following the prompted logic. In the example above, the agent would need an email plugin, a mathematics plugin (for reliably calculating the amounts, as LLMs are bad at maths) and an SMS plugin. These plugins would require authentication with external systems, which is another major aspect of agent development.
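A minimal sketch of dynamic LLM flow, under loud assumptions: the three plugins are hypothetical in-memory stubs, and `scripted_model` is a stand-in that replays a fixed sequence of decisions where a real agent would run LLM inference. What it illustrates is that the loop itself contains no task logic. The model picks each tool call, and the code only dispatches it.

```python
from typing import Any, Callable

outbox: list[str] = []


# Hypothetical plugins; real ones would authenticate against external systems.
def fetch_unread_emails() -> list[dict[str, Any]]:
    return [
        {"subject": "Receipt", "kind": "debit", "amount": 30.0},
        {"subject": "Refund", "kind": "credit", "amount": 12.5},
    ]


def net_amount(entries: list[dict[str, Any]]) -> float:
    """Mathematics plugin: credits minus debits, computed in code, not by the LLM."""
    return sum(e["amount"] if e["kind"] == "credit" else -e["amount"] for e in entries)


def send_sms(text: str) -> str:
    outbox.append(text)
    return "sent"


TOOLS: dict[str, Callable[..., Any]] = {
    "fetch_unread_emails": fetch_unread_emails,
    "net_amount": net_amount,
    "send_sms": send_sms,
}


def scripted_model(prompt: str, history: list[tuple[str, Any]]) -> dict[str, Any]:
    """Stand-in for LLM inference: returns the next tool call, or a stop signal.

    A real model would read `prompt` and `history`; this one ignores the
    prompt and follows a fixed script.
    """
    if len(history) == 0:
        return {"tool": "fetch_unread_emails", "args": {}}
    if len(history) == 1:
        return {"tool": "net_amount", "args": {"entries": history[0][1]}}
    if len(history) == 2:
        return {"tool": "send_sms", "args": {"text": f"Net amount: {history[1][1]:.2f}"}}
    return {"tool": None, "args": {}}


def run_dynamic_flow(
    prompt: str,
    model: Callable[[str, list[tuple[str, Any]]], dict[str, Any]],
    max_steps: int = 10,
) -> list[tuple[str, Any]]:
    """Dispatch whatever tool the model picks until it stops or the step cap is hit."""
    history: list[tuple[str, Any]] = []
    for _ in range(max_steps):
        decision = model(prompt, history)
        if decision["tool"] is None:
            break
        tool = TOOLS[decision["tool"]]  # raises KeyError for an unknown tool name
        history.append((decision["tool"], tool(**decision["args"])))
    return history


history = run_dynamic_flow("Report my net billing amount by SMS.", scripted_model)
```

The `max_steps` cap is a design choice for this sketch: since the code cannot know in advance how many calls the model will make, a bound keeps a misbehaving model from looping forever.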
If the LLM can handle this task 100% reliably, then the only difference between this example and the structured example is that the developer does not have to write the code, but rather just explain in simple words how the flow of information should happen in the program (prompt engineering).
Some early multi-agent systems, such as Baby AGI, used only LLM flow, without any function calling or structured outputs. LLM flow provides maximum flexibility.
Adding a feedback loop
Whether the main agent is using structured flow or LLM flow, it is possible to add an LLM flow component to your agent so that it can act on feedback (which may be human feedback or systematically LLM-generated feedback) to alter system parameters and routines over time.
For example, the LLM flow example above could have a parameter workflowIntervalMinutes describing how often, in minutes, to run the workflow. The human could respond to the SMS with natural language such as "send these notifications less often", which can be consumed as feedback and result in the agent increasing the value of workflowIntervalMinutes. Likewise, source code can be treated as parameters too if the agent has access to its own source code.
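A minimal sketch of such a feedback loop, with several assumptions: `interpret_feedback` is a keyword-matching stand-in for what would be an LLM call returning a structured label, and the doubling, halving and bounds are arbitrary choices for illustration. Only the parameter name workflowIntervalMinutes comes from the example above.

```python
from dataclasses import dataclass


@dataclass
class AgentParams:
    workflowIntervalMinutes: int = 60  # default value is an assumption


def interpret_feedback(message: str) -> str:
    """Stand-in for an LLM call that maps free text to one of a few labels."""
    text = message.lower()
    if "less often" in text:
        return "decrease_frequency"
    if "more often" in text:
        return "increase_frequency"
    return "no_change"


def apply_feedback(params: AgentParams, message: str) -> AgentParams:
    """Return new parameters adjusted according to the interpreted feedback."""
    action = interpret_feedback(message)
    interval = params.workflowIntervalMinutes
    if action == "decrease_frequency":
        interval = min(interval * 2, 24 * 60)  # run less often, at least once a day
    elif action == "increase_frequency":
        interval = max(interval // 2, 5)       # run more often, at most every 5 minutes
    return AgentParams(workflowIntervalMinutes=interval)


params = apply_feedback(AgentParams(), "send these notifications less often")
```

Keeping the adjustment itself in ordinary code, with hard bounds, means the LLM only classifies the feedback and cannot set the interval to an arbitrary value.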
Which method to choose?
In a world where AGI is available and cheap enough, there are not many scenarios where LLM flow wouldn't be preferred, as it would essentially be a self-developing program created with little human effort. Humans need to program structured flow, but not LLM flow (other than some prompt engineering).
But today, there is no AGI, and LLMs hallucinate too often for them to run autonomously and provide value over extended periods of time without any sort of human in the loop. Therefore LLMs shouldn't be fully in charge of sensitive operations, because they cannot (yet, whether technically or economically) ensure a high enough degree of order and structure like humans or computers can.
These issues are resolved with structured flow. However, it's a sensitive balance. From a functional perspective, LLMs are capable of two things: natural language parsing and reasoning. Reasoning by LLMs can be very powerful and allow agents to become self-developing over time.
How to achieve this balance? This depends on the needs of each project, but the decision is also certainly influenced by the current state of the art.
The 2D visualisation below shows two axes: structured to LLM flow and static to dynamic.
Static in this context means that the behaviour of the program is consistent over time, whereas dynamic means that the behaviour of the program can change over time. With structured flow, the control flow of a specific iteration of the software can be analysed via its source code, whereas with LLM flow the control flow cannot be analysed from the source code because it's up to the LLM to decide during inference.
While building asterai.io, a platform for building and deploying AI agents, I noticed that people develop agentic software under different paradigms, and that the results are all referred to as agents even when the approaches differ significantly. As more of these programs are created, these distinctions are important to be aware of when discussing agents. asterai lets you create modular LLM-flow agents powered by WASM/WASI plugins, and we are also working on a visual programming tool for building agents with a structured flow approach.