What Is an AI Agent?
An AI agent is an autonomous system where the LLM goes beyond simple text generation to make its own decisions, use tools, and accomplish goals. It’s like delegating a task to an assistant who independently creates a plan, finds and uses the necessary tools, and verifies the results.
Core components of an agent:
| Component | Role | Analogy |
|---|---|---|
| LLM (Brain) | Reasoning, judgment, planning | Human brain |
| Tools | Interaction with external systems | Toolbox |
| Memory | Storing past actions and results | Notepad |
| Planning | Establishing steps to achieve goals | To-do list |
| Observation | Interpreting tool execution results | Eyes and ears |
Pattern 1: ReAct (Reasoning + Acting)
ReAct is the most fundamental and widely used agent pattern. It solves problems by repeating a Thought, Action, Observation loop.
Workflow
```text
User question → [Thought] Reason about what needs to be done
              → [Action] Select and execute appropriate tool
              → [Observation] Check tool execution result
              → [Thought] Decide if additional action is needed
              → ... (repeat)
              → [Final Answer] Generate final response
```
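The loop above can be sketched in plain Python before reaching for a framework. Everything below is illustrative: `fake_llm` is a scripted stand-in for a real model call, and `lookup_weather` is a toy tool.

```python
def lookup_weather(city: str) -> str:
    """Toy tool: returns canned weather data."""
    return {"Paris": "18C, cloudy"}.get(city, "unknown")

TOOLS = {"lookup_weather": lookup_weather}

def fake_llm(transcript: str) -> str:
    """Stand-in for an LLM call: emits a Thought/Action step,
    or a Final Answer once an Observation is present."""
    if "Observation:" not in transcript:
        return ("Thought: I should check the weather\n"
                "Action: lookup_weather\n"
                "Action Input: Paris")
    return ("Thought: I now know the final answer\n"
            "Final Answer: It is 18C and cloudy in Paris.")

def react_loop(question: str, max_steps: int = 5) -> str:
    """Runs the Thought → Action → Observation loop until a Final Answer."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = fake_llm(transcript)
        transcript += "\n" + step
        if "Final Answer:" in step:
            return step.split("Final Answer:")[1].strip()
        # Parse the Action / Action Input lines and run the chosen tool
        action = step.split("Action:")[1].split("\n")[0].strip()
        arg = step.split("Action Input:")[1].split("\n")[0].strip()
        observation = TOOLS[action](arg)
        transcript += f"\nObservation: {observation}"
    return "Stopped: step limit reached"

print(react_loop("What is the weather in Paris?"))
# It is 18C and cloudy in Paris.
```

The framework version below does the same thing, with the model driving the loop and the prompt enforcing the Thought/Action/Observation format.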
Implementation Example
```python
from langchain_openai import ChatOpenAI
from langchain.agents import create_react_agent, AgentExecutor
from langchain_core.prompts import PromptTemplate
from langchain_core.tools import tool

@tool
def search_database(query: str) -> str:
    """Searches for user information in the database."""
    # Actual DB query logic
    db = {"user:1001": {"name": "John", "plan": "Pro", "usage": "85%"}}
    return str(db.get(query, "No results"))

@tool
def send_notification(message: str) -> str:
    """Sends a notification to the user."""
    return f"Notification sent: {message}"

# ReAct prompt (Thought/Action/Observation structure)
react_prompt = PromptTemplate.from_template("""
You can use the following tools:
{tools}

Tool names: {tool_names}

Use the following format:

Thought: Think about what needs to be done
Action: Name of the tool to use
Action Input: Input to pass to the tool
Observation: Tool execution result
... (repeat as needed)
Thought: I now know the final answer
Final Answer: The final answer to deliver to the user

Question: {input}
{agent_scratchpad}
""")

llm = ChatOpenAI(model="gpt-4o", temperature=0)
tools = [search_database, send_notification]

agent = create_react_agent(llm, tools, react_prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

result = executor.invoke({
    "input": "Check the usage for user:1001, and send a notification if it's over 80%"
})
# Thought: I need to check the usage for user:1001 first
# Action: search_database
# Action Input: user:1001
# Observation: {'name': 'John', 'plan': 'Pro', 'usage': '85%'}
# Thought: Usage is 85%, which exceeds 80%, so I need to send a notification
# Action: send_notification
# Action Input: John, your current usage is at 85%. Please consider upgrading your plan.
# Observation: Notification sent
# Final Answer: John's usage is at 85%, and a notification has been sent.
```
ReAct pros and cons:
- Pros: Simple to implement, transparent reasoning process, easy to debug
- Cons: Slow due to sequential step-by-step execution, can get stuck in loops on complex tasks
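The loop risk can be contained with hard limits. Below is a minimal, framework-agnostic sketch of such guardrails; the function names are illustrative, though LangChain's `AgentExecutor` exposes similar `max_iterations` and `max_execution_time` parameters.

```python
import time

def run_with_guardrails(step_fn, max_iterations=5, max_seconds=30.0):
    """Runs an agent step function until it signals completion,
    stopping early on an iteration cap or a wall-clock budget."""
    start = time.monotonic()
    for i in range(max_iterations):
        if time.monotonic() - start > max_seconds:
            return "Stopped: time budget exceeded"
        done, answer = step_fn(i)  # step_fn returns (finished?, answer)
        if done:
            return answer
    return "Stopped: iteration limit reached"

# A toy step function that never finishes, to show the cutoff working
endless = lambda i: (False, None)
print(run_with_guardrails(endless, max_iterations=3))
# Stopped: iteration limit reached
```

The same cap-and-budget idea applies regardless of framework; the key point is that the limit lives outside the LLM, so a confused model cannot talk its way past it.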
Pattern 2: Plan-and-Execute
Plan-and-Execute first creates an overall plan, then executes each step in order. It’s more stable than ReAct for complex tasks.
Workflow
```text
User question → [Planner] Create overall plan (steps 1, 2, 3...)
              → [Executor] Execute step 1
              → [Executor] Execute step 2
              → [Re-planner] Revise plan if needed
              → [Executor] Execute step 3
              → [Final] Synthesize results and respond
```
Implementation Example
```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from typing import TypedDict

# Illustrative state schema for this pattern; the simple function below
# tracks the same fields with local variables.
class AgentState(TypedDict):
    task: str           # Original task
    plan: list[str]     # Execution plan
    current_step: int   # Current step
    results: list[str]  # Results of each step
    final_answer: str   # Final answer

llm = ChatOpenAI(model="gpt-4o", temperature=0)

# Step 1: Create plan
planner_prompt = ChatPromptTemplate.from_template("""
Create a step-by-step plan to accomplish the following task.
Write each step on one line, numbered.

Task: {task}

Plan:
""")
planner_chain = planner_prompt | llm | StrOutputParser()

# Step 2: Execute each step
executor_prompt = ChatPromptTemplate.from_template("""
Execute the following step and report the result.

Overall task: {task}
Current step: {step}
Previous step results: {previous_results}
""")
executor_chain = executor_prompt | llm | StrOutputParser()

# Plan-and-Execute execution
def plan_and_execute(task: str) -> str:
    """Creates a plan and executes it step by step."""
    # Create plan
    plan_text = planner_chain.invoke({"task": task})
    steps = [s.strip() for s in plan_text.strip().split("\n") if s.strip()]
    print(f"Plan created: {len(steps)} steps")

    # Execute each step
    results = []
    for i, step in enumerate(steps):
        print(f"\nExecuting: Step {i+1} - {step}")
        result = executor_chain.invoke({
            "task": task,
            "step": step,
            "previous_results": "\n".join(results) if results else "None"
        })
        results.append(f"Step {i+1}: {result}")

    return "\n".join(results)

# Execute
answer = plan_and_execute("Design a user CRUD API with Python FastAPI")
# Plan created: 4 steps
# Executing: Step 1 - Define data models (User schema)
# Executing: Step 2 - Design CRUD endpoints
# Executing: Step 3 - Add error handling
# Executing: Step 4 - Write API documentation and test code
```
Plan-and-Execute pros and cons:
- Pros: Stable for complex tasks, provides overview of the entire flow upfront
- Cons: Initial planning cost, requires re-planning when changes are needed
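The re-planning step from the workflow diagram can be grafted onto this pattern by letting a re-planner revise the remaining steps after each result. A hedged sketch follows, with plain callables (`plan_fn`, `execute_fn`, `replan_fn`) standing in for the planner and executor LLM chains; the names and toy stand-ins are illustrative, not a LangChain API.

```python
def plan_and_replan(task, plan_fn, execute_fn, replan_fn, max_steps=10):
    """Plan, execute step by step, and let a re-planner revise
    the remaining steps after each result."""
    remaining = plan_fn(task)  # initial plan: list of step strings
    results = []
    while remaining and len(results) < max_steps:
        step = remaining.pop(0)
        results.append(execute_fn(task, step, results))
        # Revise what's left in light of the results so far
        remaining = replan_fn(task, results, remaining)
    return results

# Toy stand-ins: the re-planner drops the docs step after the first result
plan = lambda t: ["gather requirements", "draft design", "write docs"]
execute = lambda t, s, r: f"done: {s}"
replan = lambda t, r, rem: [s for s in rem if "docs" not in s] if len(r) == 1 else rem

print(plan_and_replan("design an API", plan, execute, replan))
# ['done: gather requirements', 'done: draft design']
```

With real chains, `replan_fn` would prompt the model with the task, the results so far, and the remaining steps, and parse a revised step list from its reply.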
Pattern 3: Multi-Agent
Multi-Agent involves multiple specialized agents collaborating to complete a task. Each agent has a unique role and tools, and an orchestrator distributes the work.
Architecture
```text
User question → [Orchestrator] Analyze and assign tasks
                  ├─→ [Research Agent] Gather information
                  ├─→ [Code Agent] Write code
                  └─→ [Review Agent] Review results
              → [Orchestrator] Integrate results and deliver final answer
```
Implementation Example
```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

llm = ChatOpenAI(model="gpt-4o", temperature=0)

# Define specialist agents
def create_specialist(role: str, expertise: str):
    """Creates a specialist agent for each role."""
    prompt = ChatPromptTemplate.from_messages([
        ("system", f"You are a {role}. You are an expert in {expertise}."),
        ("human", "Task: {task}\n\nPrevious results:\n{context}")
    ])
    return prompt | llm | StrOutputParser()

# Create agents by role
researcher = create_specialist(
    "Researcher", "technology trend analysis and document research"
)
architect = create_specialist(
    "Architect", "system design and architecture decisions"
)
reviewer = create_specialist(
    "Reviewer", "code quality review and improvement suggestions"
)

# Orchestrator: distributes work and integrates results
def orchestrate(task: str) -> str:
    """Executes the multi-agent workflow."""
    # Step 1: Researcher investigates
    print("Researcher working...")
    research = researcher.invoke({"task": task, "context": "None"})

    # Step 2: Architect designs
    print("Architect working...")
    design = architect.invoke({"task": task, "context": research})

    # Step 3: Reviewer evaluates
    print("Reviewer working...")
    review = reviewer.invoke({
        "task": f"Please review the following design: {task}",
        "context": design
    })

    # Integrate results
    return f"## Research Results\n{research}\n\n## Design\n{design}\n\n## Review\n{review}"

result = orchestrate("Design a microservice authentication system")
print(result)
```
Multi-Agent pros and cons:
- Pros: Specialized role division, parallel processing of complex tasks possible
- Cons: Inter-agent communication overhead, coordination complexity, increased cost
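The example above runs its agents sequentially because each depends on the previous result. When specialists are independent, they can run concurrently. A minimal sketch, using stand-in callables in place of LLM chains (with real LangChain chains, `.batch()` or async invocation achieves the same effect):

```python
from concurrent.futures import ThreadPoolExecutor

def make_specialist(role: str):
    """Stand-in for a specialist agent chain (illustrative only)."""
    return lambda task: f"[{role}] analysis of: {task}"

# Three independent specialists whose outputs don't feed each other
specialists = [make_specialist(r)
               for r in ("Researcher", "SecurityAuditor", "CostEstimator")]

def orchestrate_parallel(task: str) -> list[str]:
    """Fans the task out to all specialists concurrently,
    then collects the results in order."""
    with ThreadPoolExecutor(max_workers=len(specialists)) as pool:
        futures = [pool.submit(agent, task) for agent in specialists]
        return [f.result() for f in futures]

print(orchestrate_parallel("authentication system"))
```

The fan-out/fan-in shape is what matters: parallelism only pays off when the agents' inputs don't depend on each other's outputs, which is exactly the coordination question the orchestrator has to answer.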
Pattern Comparison Summary
| Criteria | ReAct | Plan-and-Execute | Multi-Agent |
|---|---|---|---|
| Complexity | Low | Medium | High |
| Suited for | Simple tool calls | Multi-step tasks | Combining specialized domains |
| LLM calls | 1 per step | Planning + 1 per step | Agents x Steps |
| Debugging | Easy | Moderate | Difficult |
| Cost | Low | Medium | High |
| Stability | Loop risk | Stable | Depends on coordination |
Framework Support
| Framework | ReAct | Plan-and-Execute | Multi-Agent |
|---|---|---|---|
| LangChain/LangGraph | Built-in | Via LangGraph | Via LangGraph |
| CrewAI | - | - | Core feature |
| AutoGen | - | - | Core feature |
| OpenAI Assistants | Built-in | Custom implementation | Custom implementation |
Practical Tips
- Start with the simplest pattern: Most tasks can be handled with ReAct. Only introduce complex patterns when ReAct reaches its limits.
- Set guardrails: Always configure maximum iterations (`max_iterations`), timeouts, and cost limits to prevent infinite loops.
- Write specific tool descriptions: For an agent to select the right tool, the tool’s docstring must be clear. Describe both what it does and when to use it.
- Limit observation results: If tools return too much data, it wastes the context window. Summarize results or extract only the needed parts.
- Consider LangGraph: For complex agent workflows, LangGraph can clearly express state management and branching logic as a graph.
- Build an evaluation system: Creating an evaluation pipeline that quantitatively measures agent accuracy, cost, and speed helps with pattern selection and improvement.
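The observation-limiting tip can be implemented as a thin wrapper around any tool. `truncate_observation` below is an illustrative decorator, not a LangChain API; `fetch_logs` is a toy tool that simulates a large log dump.

```python
def truncate_observation(max_chars: int = 500):
    """Decorator that caps how much of a tool's raw output
    reaches the agent's context window."""
    def decorator(tool_fn):
        def wrapped(*args, **kwargs):
            result = str(tool_fn(*args, **kwargs))
            if len(result) > max_chars:
                return (result[:max_chars]
                        + f"... [truncated, {len(result)} chars total]")
            return result
        return wrapped
    return decorator

@truncate_observation(max_chars=100)
def fetch_logs(service: str) -> str:
    """Toy tool: pretend this returns a huge log dump."""
    return "ERROR timeout in auth-service\n" * 50

print(len(fetch_logs("auth")))  # bounded, regardless of raw log size
```

Summarizing with a cheap model instead of hard truncation is a common refinement: the wrapper stays the same, only the reduction step changes.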