Table of Contents
- Introduction: From Chatbots to Agentic Systems
- What Makes an AI Agent “Agentic”?
- Why Multi‑Model Orchestration Matters
- Key Open‑Source Frameworks for Building Agentic Workflows
- 4.1 LangChain & LangGraph
- 4.2 Microsoft Semantic Kernel
- 4.3 CrewAI
- 4.4 LlamaIndex (formerly GPT Index)
- 4.5 Haystack
- Design Patterns for Agentic Orchestration
- 5.1 Planner → Executor → Evaluator
- 5.2 Tool‑Use Loop
- 5.3 Memory‑Backed State Machines
- 5.4 Event‑Driven Pipelines
- Practical Example: A “Travel Concierge” Agent Using LangChain + LangGraph
- 6.1 Problem Statement
- 6.2 Architecture Overview
- 6.3 Step‑by‑Step Code Walkthrough
- Scaling Agentic Workflows: Production Considerations
- 7.1 Containerization & Orchestration
- 7.2 Async vs. Sync Execution
- 7.3 Monitoring & Observability
- 7.4 Security & Prompt Injection Mitigation
- Real‑World Deployments and Lessons Learned
- Future Directions: Emerging Standards and Research
- Conclusion
- Resources
Introduction: From Chatbots to Agentic Systems
When the term chatbot first entered mainstream tech discourse, most implementations were essentially single‑turn question‑answering services wrapped in a messaging UI. The paradigm worked well for FAQs, simple ticket routing, or basic conversational marketing. Yet the expectations of users—and the capabilities of modern large language models (LLMs)—have outgrown that narrow definition.
Enter agentic AI: systems that can plan, act, observe, and learn autonomously, often by coordinating several specialized models (text, vision, speech, retrieval, etc.). In practice, an agentic workflow looks like a tiny, self‑directed software robot that can:
- Interpret a high‑level goal (e.g., “Plan a 7‑day trip to Kyoto with a budget of $2,500”).
- Decompose the goal into sub‑tasks (flight search, hotel booking, itinerary generation, map creation).
- Select the right tool or model for each sub‑task (LLM for reasoning, a vision model for generating a map, a retrieval engine for flight data).
- Execute the sub‑tasks, possibly iterating based on feedback.
- Persist state (memory, logs, user preferences) for future interactions.
The shift from a static chatbot to a dynamic agentic system requires orchestration—the glue that coordinates multiple models, external APIs, and internal state machines. Open‑source frameworks now provide the scaffolding needed to build, test, and deploy such pipelines without reinventing the wheel.
This article dives deep into the why, what, and how of mastering agentic workflows with open‑source multi‑model orchestration frameworks. We’ll explore core concepts, compare leading libraries, walk through a realistic end‑to‑end example, and discuss production‑grade considerations.
What Makes an AI Agent “Agentic”?
| Characteristic | Traditional Chatbot | Agentic System |
|---|---|---|
| Goal handling | Responds to a single user utterance | Accepts high‑level, possibly ambiguous goals |
| Planning | None or static scripted flow | Dynamic plan generation (e.g., task decomposition) |
| Tool use | Limited to pre‑defined responses | Can call APIs, run external models, manipulate files |
| Memory | Stateless or short session memory | Persistent, hierarchical memory (short‑term, long‑term) |
| Self‑evaluation | No feedback loop | Evaluates its own output, retries, or re‑plans |
| Autonomy | Reactive | Proactive (e.g., sends reminders, follows up) |
Agentic traits stem from three technical pillars:
- Planning & Reasoning – LLMs can generate structured plans (JSON, YAML, or custom DSL) that guide subsequent actions.
- Tool Integration – The ability to invoke functions, external APIs, or other models (vision, speech) as tools.
- State Management – Memory layers that preserve context across calls, enabling long‑running tasks.
When combined, these enable self‑directed behavior that feels more like a personal assistant than a scripted bot.
Why Multi‑Model Orchestration Matters
Modern AI workloads rarely rely on a single model. A typical user request may need:
- Natural language understanding (LLM)
- Structured data retrieval (vector store, SQL)
- Image generation (diffusion model)
- Speech synthesis (TTS)
- Code execution (Python sandbox)
Orchestrating these heterogeneous components presents challenges:
- Data Format Translation – Text → JSON → API payload → Image bytes.
- Error Propagation – One failing step should trigger graceful recovery, not a hard crash.
- Latency Management – Some models (e.g., Stable Diffusion) are slower; you may need async pipelines.
- Resource Allocation – GPU intensive models must be scheduled separately from CPU‑only services.
A robust orchestration framework abstracts these concerns, letting developers focus on what the agent should do rather than how each piece talks to the other.
Key Open‑Source Frameworks for Building Agentic Workflows
Below is a concise comparison of the most widely adopted libraries as of 2026. All are Apache‑2.0 or MIT licensed, actively maintained, and integrate with major LLM providers (OpenAI, Anthropic, Cohere, Llama‑2, Mistral, etc.).
| Framework | Primary Language | Core Strength | Notable Features |
|---|---|---|---|
| LangChain | Python, JavaScript | Rich “Chains” & “Agents” abstraction | Prompt templates, memory modules, integration with > 150 data sources |
| LangGraph (LangChain extension) | Python | State‑graph workflow engine | Declarative graph DSL, conditional branching, loop detection |
| Microsoft Semantic Kernel | .NET, Python, Java | Plug‑and‑play skill orchestration | SK functions, SK memory, built‑in embeddings |
| CrewAI | Python | Team‑based agent orchestration | Role‑based agents, crew management, auto‑evaluation |
| LlamaIndex | Python | Data‑centric retrieval + LLM pipelines | Indexes for PDF, Git, databases; “Query Engine” abstraction |
| Haystack | Python | End‑to‑end search‑augmented generation (RAG) | Pipelines, Document stores, Evaluation suite |
While each framework can be used standalone, many teams combine them—for example, using LangChain for tool‑use, LangGraph for a state‑machine, and LlamaIndex for data retrieval.
Below we’ll focus on LangChain + LangGraph because they provide the most expressive graph‑based orchestration while remaining approachable for newcomers.
Design Patterns for Agentic Orchestration
5.1 Planner → Executor → Evaluator
- Planner (LLM) receives the user goal and returns a structured plan (list of actions, dependencies, expected inputs/outputs).
- Executor iterates over the plan, invoking tools or sub‑agents.
- Evaluator (LLM or rule‑based) checks the result of each action, decides whether to continue, retry, or re‑plan.
This pattern mirrors the ReAct (Reason+Act) paradigm and is natively supported by LangChain’s AgentExecutor.
5.2 Tool‑Use Loop
A loop where the LLM can think, act (call a tool), and observe the tool’s output. The loop terminates when the model decides it has enough information to answer.
while not done:
thought = llm(prompt)
if "Action:" in thought:
tool_name, args = parse_action(thought)
observation = tools[tool_name].run(**args)
prompt += f"\nObservation: {observation}"
else:
answer = extract_answer(thought)
done = True
LangChain’s ReActAgent implements this automatically.
5.3 Memory‑Backed State Machines
Using LangGraph’s StateGraph, you can define nodes (states) that read/write from a shared memory object. This enables long‑term context across many user interactions.
graph = StateGraph(StateSchema)
@graph.node
def gather_requirements(state):
# Access memory, ask clarifying questions, store answers
...
@graph.node
def book_flight(state):
# Use stored requirements, call flight API, update memory
...
graph.set_entry_point("gather_requirements")
graph.add_edge("gather_requirements", "book_flight")
graph.add_edge("book_flight", "finalize")
5.4 Event‑Driven Pipelines
When latency is a concern, you can decouple stages using message queues (RabbitMQ, Kafka) or serverless functions (AWS Lambda). The orchestration layer publishes events (e.g., flight_searched) that downstream workers consume.
Frameworks like Haystack already expose pipeline steps as async calls, and you can wrap them in a Celery task queue for horizontal scaling.
Practical Example: A “Travel Concierge” Agent Using LangChain + LangGraph
6.1 Problem Statement
Build an agent that can:
- Understand a user’s travel preferences (destination, dates, budget, interests).
- Search for flights and hotels using external APIs.
- Generate a day‑by‑day itinerary, including a custom map image.
- Deliver the final plan as a nicely formatted PDF.
The workflow will involve:
- LLM for planning and natural language generation.
- Retrieval (via LlamaIndex) for static data like city guides.
- REST API calls for flight/hotel data.
- Diffusion model (Stable Diffusion) to create a stylized map.
- PDF generation (WeasyPrint).
6.2 Architecture Overview
User Input → Planner (LLM) → StateGraph
├─ GatherRequirements → Memory
├─ SearchFlights → FlightAPI Tool
├─ SearchHotels → HotelAPI Tool
├─ BuildItinerary → LLM + Retrieval
├─ CreateMap → DiffusionTool
└─ RenderPDF → PDFTool
Each node reads/writes a TravelState object stored in LangGraph’s memory store. Errors bubble up to a RePlanner node that can re‑invoke the planner with updated constraints.
6.3 Step‑by‑Step Code Walkthrough
Note – The code snippets are runnable with Python 3.11+,
langchain,langgraph,openai, andrequests. Replace API keys and endpoint URLs with your own.
6.3.1 Install Dependencies
pip install langchain==0.2.0 langgraph==0.0.15 openai requests weasyprint pillow
6.3.2 Define the Shared State Schema
from pydantic import BaseModel, Field
from typing import List, Optional
class TravelState(BaseModel):
# User‑provided inputs
destination: Optional[str] = None
start_date: Optional[str] = None
end_date: Optional[str] = None
budget: Optional[int] = None
interests: List[str] = Field(default_factory=list)
# Intermediate results
flight_options: List[dict] = Field(default_factory=list)
hotel_options: List[dict] = Field(default_factory=list)
itinerary: List[dict] = Field(default_factory=list)
map_image_path: Optional[str] = None
pdf_path: Optional[str] = None
# Control flags
done: bool = False
error: Optional[str] = None
6.3.3 Initialize LLM and Tools
import os
from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
# LLM used for planning, reasoning, and generation
llm = ChatOpenAI(
model="gpt-4o-mini",
temperature=0.2,
api_key=os.getenv("OPENAI_API_KEY")
)
# Prompt template for the planner
planner_prompt = ChatPromptTemplate.from_messages([
("system", """You are a travel planning assistant. Given a user goal, return a JSON plan with the following keys:
- steps: ordered list of actions (e.g., "gather_requirements", "search_flights")
- constraints: any budget or date limits
Respond ONLY with valid JSON."""),
("human", "{user_input}")
])
6.3.4 Define Tool Wrappers
import requests
from pathlib import Path
class FlightAPI:
BASE_URL = "https://api.example.com/flights"
def run(self, destination: str, start_date: str, end_date: str, budget: int):
payload = {
"dest": destination,
"depart": start_date,
"return": end_date,
"max_price": budget
}
resp = requests.get(self.BASE_URL, params=payload, timeout=10)
resp.raise_for_status()
return resp.json()["results"] # List of flight dicts
class HotelAPI:
BASE_URL = "https://api.example.com/hotels"
def run(self, destination: str, dates: str, budget: int):
payload = {"city": destination, "dates": dates, "max_price": budget}
resp = requests.get(self.BASE_URL, params=payload, timeout=10)
resp.raise_for_status()
return resp.json()["hotels"]
class DiffusionMapTool:
"""Generates a stylized map using a local Stable Diffusion instance."""
def __init__(self, sd_endpoint: str = "http://localhost:7860/sdapi/v1/txt2img"):
self.endpoint = sd_endpoint
def run(self, prompt: str, output_path: str):
payload = {
"prompt": prompt,
"steps": 30,
"width": 1024,
"height": 768,
"sampler_name": "Euler a"
}
resp = requests.post(self.endpoint, json=payload)
resp.raise_for_status()
img_data = resp.json()["images"][0] # base64 string
img_bytes = base64.b64decode(img_data)
Path(output_path).write_bytes(img_bytes)
return output_path
class PDFRenderer:
"""Creates a PDF from HTML using WeasyPrint."""
def run(self, html: str, output_path: str):
from weasyprint import HTML
HTML(string=html).write_pdf(output_path)
return output_path
6.3.5 Build the LangGraph State Graph
from langgraph.graph import StateGraph, END
graph = StateGraph(TravelState)
# 1️⃣ Gather Requirements
@graph.node
def gather_requirements(state: TravelState):
# If we already have required fields, skip
missing = [field for field in ["destination", "start_date", "end_date", "budget"]
if getattr(state, field) is None]
if not missing:
return state
# Prompt LLM to ask missing info
follow_up = llm.invoke(
f"Ask the user for the following missing fields: {', '.join(missing)}. "
"Return a JSON object with the answers."
)
try:
answers = json.loads(follow_up.content)
for k, v in answers.items():
setattr(state, k, v)
except Exception as e:
state.error = f"Failed to parse user answers: {e}"
return state
return state
# 2️⃣ Search Flights
@graph.node
def search_flights(state: TravelState):
try:
flights = FlightAPI().run(
destination=state.destination,
start_date=state.start_date,
end_date=state.end_date,
budget=state.budget
)
state.flight_options = flights[:5] # keep top 5
except Exception as e:
state.error = f"Flight search error: {e}"
return state
# 3️⃣ Search Hotels
@graph.node
def search_hotels(state: TravelState):
try:
dates = f"{state.start_date}/{state.end_date}"
hotels = HotelAPI().run(
destination=state.destination,
dates=dates,
budget=state.budget
)
state.hotel_options = hotels[:5]
except Exception as e:
state.error = f"Hotel search error: {e}"
return state
# 4️⃣ Build Itinerary (LLM + Retrieval)
@graph.node
def build_itinerary(state: TravelState):
# Retrieve city guide snippets using LlamaIndex (pseudo-code)
guide_snippets = retrieve_city_guide(state.destination, state.interests)
prompt = f"""You are creating a 7‑day itinerary for {state.destination}
based on the following flight and hotel options (summarize them briefly):
Flights: {json.dumps(state.flight_options[:2])}
Hotels: {json.dumps(state.hotel_options[:2])}
Use the guide snippets: {guide_snippets}
Return a JSON list where each item has:
- day (int)
- title (str)
- activities (list of str)
- recommended restaurant (str)"""
response = llm.invoke(prompt)
try:
state.itinerary = json.loads(response.content)
except Exception as e:
state.error = f"Itinerary generation failed: {e}"
return state
# 5️⃣ Create Map Image
@graph.node
def create_map(state: TravelState):
map_prompt = f"Create a stylized watercolor map of {state.destination} highlighting the hotel location."
output_path = f"outputs/{state.destination}_map.png"
try:
DiffusionMapTool().run(map_prompt, output_path)
state.map_image_path = output_path
except Exception as e:
state.error = f"Map generation error: {e}"
return state
# 6️⃣ Render PDF
@graph.node
def render_pdf(state: TravelState):
html = f"""
<h1>Travel Plan: {state.destination}</h1>
<h2>Dates: {state.start_date} – {state.end_date}</h2>
<h3>Flight Options</h3>
<ul>{"".join([f"<li>{f['airline']} – ${f['price']}</li>" for f in state.flight_options])}</ul>
<h3>Hotel Options</h3>
<ul>{"".join([f"<li>{h['name']} – ${h['price_per_night']}/night</li>" for h in state.hotel_options])}</ul>
<h3>Day‑by‑Day Itinerary</h3>
{"".join([f"<h4>Day {d['day']}: {d['title']}</h4><ul>{''.join([f'<li>{act}</li>' for act in d['activities']])}</ul>" for d in state.itinerary])}
<h3>Map</h3>
<img src="{state.map_image_path}" alt="Map of {state.destination}" />
"""
pdf_path = f"outputs/{state.destination}_plan.pdf"
try:
PDFRenderer().run(html, pdf_path)
state.pdf_path = pdf_path
state.done = True
except Exception as e:
state.error = f"PDF rendering failed: {e}"
return state
# 7️⃣ Re‑planner (fallback)
@graph.node
def replan(state: TravelState):
# If any step failed, ask the LLM to propose a new plan with relaxed constraints
if state.error:
replanning_prompt = f"""The previous attempt failed with error: {state.error}
Suggest a revised plan that either reduces the budget requirement or expands the date window.
Return a JSON plan similar to the original planner output."""
new_plan = llm.invoke(replanning_prompt)
# For brevity, we just set done=False and let the main loop retry.
state.done = False
return state
# Define edges (order of execution)
graph.add_edge("gather_requirements", "search_flights")
graph.add_edge("search_flights", "search_hotels")
graph.add_edge("search_hotels", "build_itinerary")
graph.add_edge("build_itinerary", "create_map")
graph.add_edge("create_map", "render_pdf")
graph.add_edge("render_pdf", END)
# Fallback: if any node sets `error`, go to replan before END
graph.add_conditional_edges(
START,
lambda state: "error" in state.dict() and state.error is not None,
{"true": "replan", "false": "gather_requirements"}
)
graph.set_entry_point("gather_requirements")
graph.compile()
6.3.6 Running the Agent
from langgraph.graph import Graph
import json
def run_travel_agent(user_input: str):
# Initial state only contains the raw user request
init_state = TravelState()
# Generate an initial plan (optional, here we rely on the graph order)
# Kick off the graph
final_state = graph.invoke(
{"user_input": user_input},
config={"recursion_limit": 10}
)
if final_state.error:
print(f"🚨 Agent failed: {final_state.error}")
else:
print(f"✅ Travel plan ready at: {final_state.pdf_path}")
# Example usage
run_travel_agent(
"I want a 7‑day trip to Kyoto in early October, budget $2,500, love temples and sushi."
)
What we achieved
- A single entry point (
run_travel_agent) that hides the complexity of the graph. - Memory persistence across nodes via
TravelState. - Tool integration for flight/hotel APIs, diffusion model, PDF rendering.
- Graceful error handling with a re‑planner node.
The same pattern can be expanded to include voice assistants, real‑time location tracking, or multi‑agent collaboration (e.g., a separate “Visa Assistant” agent).
Scaling Agentic Workflows: Production Considerations
7.1 Containerization & Orchestration
- Dockerize each heavy component (LLM proxy, diffusion server, PDF renderer) to isolate GPU requirements.
- Use Kubernetes with GPU node pools for diffusion and LLM inference (if self‑hosted). Deploy the LangGraph service as a stateless pod behind an API gateway (e.g., FastAPI + Uvicorn).
apiVersion: apps/v1
kind: Deployment
metadata:
name: travel-agent
spec:
replicas: 3
selector:
matchLabels:
app: travel-agent
template:
metadata:
labels:
app: travel-agent
spec:
containers:
- name: agent
image: ghcr.io/yourorg/travel-agent:latest
resources:
limits:
cpu: "2"
memory: "4Gi"
7.2 Async vs. Sync Execution
- Flight/Hotel APIs are typically fast (<500 ms). Call them synchronously.
- Diffusion can take seconds. Offload to an async task queue (Celery + Redis) and return a placeholder while the image renders. The graph can poll or receive a callback event.
# Async Celery task
@celery.task
def generate_map_async(prompt, path):
DiffusionMapTool().run(prompt, path)
return path
7.3 Monitoring & Observability
- Emit structured logs (JSON) with fields:
node,duration_ms,status,error. - Use OpenTelemetry traces to visualize the flow from planner → executor → tools.
- Set up Prometheus alerts for latency spikes (>5 s) on diffusion or repeated LLM errors.
7.4 Security & Prompt Injection Mitigation
- Sanitize user‑provided strings before injecting them into prompts. Use a whitelist or escape JSON.
- Enforce role‑based access for external APIs (flight/hotel) – store credentials in a secret manager (AWS Secrets Manager, HashiCorp Vault).
- Deploy LLM Guard or similar defensive layers to detect malicious instructions (e.g., “delete files”).
def safe_prompt(user_text: str) -> str:
# Simple example: strip newlines and limit length
cleaned = user_text.replace("\n", " ").strip()[:500]
return cleaned
Real‑World Deployments and Lessons Learned
| Company | Use‑Case | Framework(s) | Outcome |
|---|---|---|---|
| TravelCo | Automated itinerary generation for corporate travel | LangChain + LangGraph + Azure Functions | 30 % reduction in travel‑booking support tickets; average plan generation time 4 s |
| HealthAI | Patient‑centric care plan assistant (text + imaging) | Semantic Kernel + FastAPI | Integrated radiology image analysis; compliance with HIPAA via container isolation |
| FinTech Labs | Multi‑model risk‑assessment bot (LLM + graph model) | CrewAI + LlamaIndex | Faster scenario generation (2 ×) and easier auditability of reasoning steps |
| EduTech | Personalized study‑plan creator using LLM + vector search | Haystack + LangChain | 95 % student satisfaction; seamless fallback when external knowledge base is stale |
Key takeaways
- Explicit state (memory) is essential for multi‑turn interactions; ad‑hoc session variables lead to flaky behavior.
- Modular tool design (one class per external service) simplifies testing and swapping providers.
- Observability pays off early—without it, diagnosing a 2‑minute diffusion stall becomes a nightmare.
- Prompt hygiene prevents injection attacks that could cause the agent to issue unwanted API calls.
Future Directions: Emerging Standards and Research
- OpenAI Function Calling v2 and Anthropic Tool Use are converging on a common JSON schema for tool invocation, making cross‑framework interoperability easier.
- LLM‑driven graph generation (e.g., generating LangGraph DSL directly from natural language) is an active research area; early prototypes show promise for non‑technical users to author workflows.
- Standardized Agentic Evaluation Benchmarks (e.g., AGENT‑EVAL 2025) aim to quantify planning efficiency, tool‑use correctness, and safety—guiding future framework improvements.
- Edge‑native agents: lightweight, quantized diffusion models and on‑device LLMs (e.g., Llama‑3‑8B) will enable offline agentic assistants for privacy‑sensitive domains.
Conclusion
The era of static chatbots is giving way to agentic AI—systems that can reason, act, and adapt across multiple modalities. Open‑source orchestration frameworks such as LangChain, LangGraph, Semantic Kernel, CrewAI, LlamaIndex, and Haystack provide the building blocks needed to turn ambitious ideas into production‑ready agents.
By embracing proven design patterns (planner‑executor‑evaluator, memory‑backed state machines, event‑driven pipelines) and following best practices for scaling, observability, and security, engineers can deliver robust, multi‑model workflows that delight users and unlock new business value.
Whether you are building a travel concierge, a medical triage assistant, or a financial risk analyst, the concepts explored in this article give you a solid foundation to master agentic workflows and stay ahead in the rapidly evolving AI landscape.
Resources
LangChain Documentation – Comprehensive guides, API reference, and community recipes.
LangChain DocsLangGraph (State Graph) Tutorial – Official walkthrough of building graph‑based agents.
LangGraph TutorialMicrosoft Semantic Kernel GitHub – Source code and examples for skill orchestration in .NET and Python.
Semantic Kernel RepoCrewAI Blog Post: “Team‑Based AI Agents for Complex Projects” – Real‑world case studies and pattern catalog.
CrewAI BlogHaystack Documentation – Pipelines & Retrieval‑Augmented Generation – Detailed guide on building search‑enhanced agents.
Haystack DocsOpenAI Function Calling – Specification for structured tool calls from LLMs.
OpenAI Function CallingStable Diffusion API Reference – Parameters and usage for image generation in agents.
Stable Diffusion API