I’ve been experimenting with building LLM agents that can use tools, plan multi-step tasks, and recover from errors. Here are some patterns that work and some that don’t.

The ReAct loop

The most reliable pattern I’ve found is a simple Reason-Act-Observe loop:

  1. The model reasons about what to do next
  2. It chooses an action (tool call) and executes it
  3. It observes the result
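  4. Repeat until the model answers without requesting a tool

def agent_loop(query: str, tools: dict, llm, max_steps: int = 10) -> str:
    # `llm` is any chat client whose chat() returns an object with
    # .content and .tool_calls (each call exposing .name and .args).
    messages = [{"role": "user", "content": query}]

    for _ in range(max_steps):
        response = llm.chat(messages, tools=tools)

        if not response.tool_calls:
            return response.content  # no tool requested: this is the final answer

        # Record the assistant turn so the next step sees what was requested
        messages.append({"role": "assistant", "content": response.content,
                         "tool_calls": response.tool_calls})

        for call in response.tool_calls:
            try:
                result = tools[call.name](**call.args)
            except Exception as exc:
                # Feed the failure back to the model instead of crashing the loop
                result = f"Error: {type(exc).__name__}: {exc}"
            messages.append({"role": "tool", "name": call.name, "content": str(result)})

    return "Max steps reached"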

What works

  • Explicit tool descriptions with examples in the docstring
  • Structured output (JSON mode) for tool arguments
  • Retry with error context: when a tool call fails, feed the error message back into the conversation and let the model try again
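
To make the three points above concrete, here is a minimal sketch. It assumes tools are plain Python functions and that the model sends arguments as a JSON string. The names get_weather, describe_tool and run_tool_call are hypothetical and only for illustration, and the description format is not any particular provider's schema.

```python
import inspect
import json


def get_weather(city: str, unit: str = "celsius") -> str:
    """Return the current weather for a city.

    Example: get_weather(city="Paris", unit="celsius")
    """
    # Hypothetical stand-in: a real tool would call a weather API here.
    return f"18 degrees {unit} in {city}"


def describe_tool(fn) -> dict:
    """Build a model-facing description from the signature and docstring."""
    return {
        "name": fn.__name__,
        "signature": str(inspect.signature(fn)),
        "description": inspect.getdoc(fn),
    }


def run_tool_call(tools: dict, name: str, raw_args: str) -> dict:
    """Run one tool call; on failure, return the error text as the tool message."""
    try:
        args = json.loads(raw_args)  # arguments arrive as a JSON string
        if not isinstance(args, dict):
            raise TypeError("tool arguments must be a JSON object")
        content = str(tools[name](**args))
    except Exception as exc:
        # Keep the exception type and message so the model can correct itself
        content = f"Error calling {name}: {type(exc).__name__}: {exc}"
    return {"role": "tool", "name": name, "content": content}


tools = {"get_weather": get_weather}
ok = run_tool_call(tools, "get_weather", '{"city": "Paris"}')
bad = run_tool_call(tools, "get_weather", '{"town": "Paris"}')  # wrong argument name
```

Here ok carries the normal tool result, while bad carries a readable TypeError message. Appending bad to the message list gives the model what it needs to retry with the right argument name.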

What doesn’t work

  • Overly complex planning: models are unreliable at making 10-step plans upfront. It works better to plan one step at a time and re-plan after each observation.
  • Too many tools: in my experience, performance degrades noticeably above ~15 tools. Group related tools or add a tool-selection step that narrows the list before each call.
  • Autonomous loops without guardrails: always set a max iteration count and budget limits.
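
As a rough sketch of the last two points, the snippet below pairs a budget guard with a deliberately naive tool-selection step. Everything in it is an assumption made for illustration: the Budget class, the BudgetExceeded exception, the specific limits, and the word-overlap ranking in shortlist_tools (a real system might use embeddings or a cheap model call instead).

```python
import time
from dataclasses import dataclass, field


class BudgetExceeded(RuntimeError):
    """Raised when an agent run goes past one of its limits."""


@dataclass
class Budget:
    max_steps: int = 10
    max_tokens: int = 20_000
    max_seconds: float = 60.0
    steps: int = 0
    tokens: int = 0
    started: float = field(default_factory=time.monotonic)

    def charge(self, tokens_used: int) -> None:
        """Record one model call and raise if any limit is now exceeded."""
        self.steps += 1
        self.tokens += tokens_used
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit of {self.max_steps} exceeded")
        if self.tokens > self.max_tokens:
            raise BudgetExceeded(f"token limit of {self.max_tokens} exceeded")
        if time.monotonic() - self.started > self.max_seconds:
            raise BudgetExceeded(f"time limit of {self.max_seconds}s exceeded")


def shortlist_tools(query: str, descriptions: dict, k: int = 5) -> list:
    """Naive tool selection: rank tools by word overlap with the query."""
    words = set(query.lower().split())

    def score(name: str) -> int:
        return len(words & set(descriptions[name].lower().split()))

    return sorted(descriptions, key=score, reverse=True)[:k]


# Usage: charge the budget once per model response inside the agent loop.
budget = Budget(max_steps=3, max_tokens=1000)
try:
    for _ in range(5):
        budget.charge(tokens_used=200)
except BudgetExceeded as exc:
    stop_reason = str(exc)  # the fourth call trips the step limit

descriptions = {
    "get_weather": "current weather forecast for a city",
    "send_email": "send an email message",
    "search_docs": "search internal documentation",
}
chosen = shortlist_tools("what is the weather in Paris", descriptions, k=1)
```

Raising an exception rather than returning a flag means the loop cannot quietly ignore an exhausted budget. Passing only the shortlisted tools to the model keeps the tool list small even when the full registry is large.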