By Adrien Payong and Shaoni Mukherjee

CrewAI is a lightweight, lightning-fast Python framework for orchestrating autonomous AI agents that work together as a “crew” to complete a task. It is built from the ground up as a standalone framework, without heavy abstractions, and is 100% independent of LangChain or any other agent library. CrewAI gives developers high-level simplicity with low-level control, and it is optimized for production-ready multi-agent workflows where reliability, observability, and cost efficiency matter.
This crash course walks you from “hello world” to a production-grade multi-agent workflow with CrewAI. You’ll be introduced to the key concepts, set up a project, code a real workflow, plug in some powerful tools, and learn best practices for reliability and monitoring. By the end of this, you’ll know how to decide if CrewAI is right for your use case, and how it compares against similar frameworks like LangGraph and AutoGen. Let’s get started!
CrewAI is a role-based multi-agent framework for Python. It lets you declare multiple LLM-powered agents, each with a defined expertise and purpose, and organize them to collaborate on a structured workflow. A crew's members work semi-autonomously within their areas of specialization, and an optional coordinating manager agent can oversee the overall workflow.
Organizing AI agents in a crew with clearly defined roles allows you to avoid making single agents do everything, as is the case in simpler systems. For example, one agent may excel at research, another at writing, another at making decisions. CrewAI handles the messaging and organization between agents, enabling them to collaborate on various subtasks to get a final result.
Prerequisites: Python 3.10 to 3.13. CrewAI requires Python >=3.10 and <3.14, so make sure your interpreter is in this range (check with python3 --version) and upgrade Python first if it isn't. Note also that CrewAI uses the OpenAI API by default, so you must have an OpenAI API key (or a key from another supported LLM provider) available.
Step 1: Install CrewAI. CrewAI is distributed through PyPI. The core library can be installed via pip. This installs CrewAI’s package and CLI.
pip install crewai
If you want full support for tools, you can also install the optional tools package. This ensures you have CrewAI’s library of pre-built tools.
pip install crewai[tools]
Note: CrewAI uses uv, a fast Python package and environment manager, to handle dependencies for you. Installing it is optional, but recommended by the CrewAI docs.
curl -LsSf https://astral.sh/uv/install.sh | sh
This installs the uv command.
After installation, you can check if things work by verifying the version. You should see an output like crewai v0.x.x confirming it’s installed.
crewai --version
Step 2: Create a CrewAI project scaffold. CrewAI ships with a CLI to scaffold a new project with the recommended structure. In a terminal run:
crewai create crew my_project_name
Replace my_project_name with the name you want to give your project. This will generate a folder my_project_name/ with a ready-to-use template:
my_project_name/
├── .gitignore
├── pyproject.toml
├── README.md
├── .env                  # for API keys and config
├── knowledge/            # (optional knowledge base)
└── src/
    └── my_project_name/
        ├── crew.py       # Crew orchestration code
        ├── main.py       # Entry point to run your crew
        ├── agents.yaml   # Define your agents here
        ├── tasks.yaml    # Define tasks/workflows here
        └── tools/        # Custom tools (with __init__.py and example tool file)
This scaffold gives you a structured starting point. The agents.yaml and tasks.yaml files are where you can declaratively specify your crew’s configuration (roles, goals, task sequence, etc. ), while main.py and crew.py are Python entry points in case you prefer code-based configuration. The .env is where you’ll store API keys (e.g., OPENAI_API_KEY=…) so they aren’t hardcoded.
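For reference, a minimal .env might look like the following (the key names are examples; SERPER_API_KEY is only needed if you later use the Serper web search tool, and other providers use their own variable names):
# .env (example only; never commit real keys)
OPENAI_API_KEY=your-openai-key-here
SERPER_API_KEY=your-serper-key-here   # optional, for SerperDevTool web search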
Step 3: Install project dependencies. Inside your project directory, run:
crewai install
This command runs the CrewAI CLI to install any dependencies the project requires (setting up the environment, making sure the crewai-tools package is installed, and so on). If your project needs additional Python packages, you can add them with uv add <package> (when using uv) or install them with pip as usual.
Step 4: Run your first crew. Now you’re ready to execute the default example crew. Simply run:
crewai run
This will run the crew defined in your scaffold (by default, it may run a simple two-agent collaboration example). If everything is working, you should see the agents thinking/acting in the console.
Common installation gotchas include running a Python version outside the supported 3.10–3.13 range and forgetting to set your LLM provider key (e.g., OPENAI_API_KEY) in the .env file or environment before running a crew.
It’s worth looking at the four main building blocks of CrewAI. A CrewAI workflow consists of four main components: Agents, Tasks, Tools, and the Crew itself (the orchestrator that combines agents and tasks).
An Agent in CrewAI is an LLM-powered worker with a designated role, a goal to achieve, and (optionally) some background context. Agents work autonomously, making decisions within their area of expertise. For instance, you might have an agent with the role of Research Analyst whose goal is to gather and analyze information, and another agent with the role of Writer whose goal is to produce a report. You create an agent by specifying its role, goal, and other parameters. Here’s an example:
from crewai import Agent

researcher = Agent(
    role="Senior Research Analyst",
    goal="Identify emerging AI trends and their business implications",
    backstory="You have 10 years of experience in AI research...",
    llm=your_llm_instance,   # (Optional) use a specific LLM for this agent
    tools=[web_search_tool, document_loader_tool],
    memory=True,             # Enable short/long-term memory for this agent
    verbose=True             # Log the agent's thought process and decisions
)
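The llm=your_llm_instance argument above is just a placeholder. One way to construct it is with CrewAI's LLM class, which routes requests to your configured provider; the model name and temperature below are illustrative choices, not requirements:
from crewai import LLM

# Build an LLM handle for a specific provider/model (the API key is read from the environment)
gpt_llm = LLM(model="gpt-4o-mini", temperature=0.2)

# Then pass it to an agent, e.g. Agent(..., llm=gpt_llm)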
A Task in CrewAI represents a unit of work to be done by an agent, including instructions and an expected outcome. You can think of it as assigning a to-do item to one of your agents. Here’s how you define a Task in code:
from crewai import Task

analysis_task = Task(
    description="Analyze the top 5 AI trends in 2025 and assess their market potential.",
    expected_output="A structured analysis including each trend's name, market size, key players, and growth forecast.",
    agent=researcher,
    async_execution=False,      # False = this task runs synchronously (waits for previous tasks if any)
    output_file="analysis.md",  # save the output to a file
    context=[research_task]     # (Optional) task dependency: use research_task's output as context
)
Key attributes of a Task include the description (the instructions given to the agent), the expected_output (which guides the agent and can serve as a validation target), the assigned agent, async_execution, an optional output_file, and context for consuming the outputs of other tasks.
Tools are how you provide agents with additional capabilities. A Tool is essentially a function or an API integration that an agent can call as part of its reasoning process. For example, a web search tool would allow an agent to query the internet, a database tool would allow an agent to fetch data, and so on.
CrewAI comes with some built-in tools and allows you to define your own custom tools. Attaching tools to an agent essentially means the agent can decide to use those tools as it deems necessary to achieve its goal.
Here’s an example of using one of CrewAI’s built-in tools, a web search tool, as well as the file I/O tools:
from crewai_tools import SerperDevTool, FileReadTool, FileWriterTool

# Initialize some tools
web_search = SerperDevTool()     # web search via Serper (Google)
file_reader = FileReadTool()     # read from files
file_writer = FileWriterTool()   # write to files

# Give our researcher agent the ability to use these tools
researcher.tools = [web_search, file_reader, file_writer]
A custom tool is simply a Python function (or class) decorated with @tool from CrewAI, made available to agents:
from crewai.tools import tool
import json

@tool("Database Query")
def query_database(sql: str) -> str:
    """Execute an SQL query against the company database."""
    # Simple validation example:
    if "DROP" in sql.upper():
        return "Error: Destructive queries are not allowed."
    # (Pretend we run the query and get results)
    results = execute_sql(sql)
    return json.dumps(results)
In this example, we created a tool named “Database Query.” The function name and docstring tell the agent what the tool does, and we included a safety check for potentially destructive SQL commands. At run time, the agent could invoke the tool with a call like query_database("SELECT * FROM Customers").
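To make the tool available, pass the decorated function in an agent's tools list, just like a built-in tool (the role, goal, and backstory below are illustrative):
from crewai import Agent

# Hypothetical analyst agent that can call the Database Query tool defined above
data_analyst = Agent(
    role="Data Analyst",
    goal="Answer business questions using the company database",
    backstory="You translate business questions into SQL and interpret the results.",
    tools=[query_database],   # the @tool-decorated function from above
    verbose=True
)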
CrewAI supports different types of tools and interactions, ranging from the built-in tool library and custom @tool functions to external integrations such as MCP servers (covered later in this guide).
The crew coordinates agents with tasks and controls the execution flow. When instantiating a Crew object, you provide the list of agents, the list of tasks, and the configuration of how you want the process to run. For example:
from crewai import Crew, Process

crew = Crew(
    agents=[researcher, writer, manager],
    tasks=[research_task, writing_task, review_task],
    process=Process.hierarchical,   # The manager agent will delegate tasks to the others
    manager_agent=manager,          # Specify which agent is the manager (for the hierarchical process)
    memory=True,                    # Enable shared memory among agents (they can see each other's outputs)
    verbose=True                    # Enable detailed logging for debugging
)

result = crew.kickoff(inputs={"topic": "AI Safety"})
In this example, we are initializing a crew with 3 agents and 3 tasks. We also opted for a hierarchical process, where a Manager agent is responsible for the workflow and assigns tasks to worker agents. An input (topic: “AI Safety”) is also provided, which can be used to fill in the placeholder in task descriptions or agents’ goals.
Process types: CrewAI provides multiple execution strategies via the Process setting. The two main options are Process.sequential, where tasks run one after another in the order you list them, and Process.hierarchical, where a manager agent delegates tasks to the other agents and coordinates the workflow.
Under the hood, crew.kickoff() directs the workflow based on the process type: in a sequential crew it runs each task in turn and passes outputs forward as context, while in a hierarchical crew the manager agent assigns tasks to workers and consolidates their results.
Now that we’ve covered the basics of CrewAI’s components, let’s see an example of a multi-agent workflow you could build.
Let’s say we want to build an automated workflow for research and writing content, for example, “Research a topic and generate a summary report”. We will configure two agents: one performs the research (the “Researcher”) and the other uses the findings to write a summary (the “Writer”). This is a simple example of a real-world use case that involves sequential collaboration, tool use (for the research), and handing off results between agents.
Step 1: Define the Agents. We need a Researcher Agent and a Writer Agent. In YAML (agents.yaml) or Python, we’d specify:
# agents.yaml
agents:
  - name: researcher
    role: Researcher
    goal: "Gather relevant information on a given topic"
    backstory: "An expert at finding and analyzing information."
    tools: [WebSearchTool]   # allow web search capability
    allow_delegation: false
  - name: writer
    role: Writer
    goal: "Produce a clear, concise summary of the research findings"
    backstory: "Skilled at summarizing information into reports."
    tools: []                # no external tools, just relies on LLM
    allow_delegation: false
Notice how we assigned the Researcher a WebSearchTool, a placeholder name for a web-search capability (CrewAI's real SerperDevTool performs Google searches through an API; you would get an API key and configure it as needed). The Writer doesn't need a tool; it uses the underlying LLM to generate text. Both agents have allow_delegation: false for simplicity's sake (we're not going to reassign tasks to other agents in this flow). If we wanted the agents to retain information, we could also set memory: true on each agent, or just once at the crew level so that all agents inherit the behavior.
Step 2: Define the Tasks and Process. We want the Researcher to run first, then the Writer. This is a sequential process: Task 1 (research) -> Task 2 (writing). In CrewAI, we can define tasks in a YAML (tasks.yaml) like:
# tasks.yaml
tasks:
  - id: do_research
    description: "Research the topic: {{topic}}"
    expected_output: "Key points and relevant facts about {{topic}}."
    agent: researcher
  - id: write_report
    description: |
      Write a summary report on '{{topic}}' using the research findings.
    expected_output: "A well-structured summary of the topic."
    agent: writer
    context: [do_research]   # use output of research task as context
process: sequential
A couple of things to keep in mind:
- Descriptions can contain placeholders (e.g., {{topic}}) that are filled in when we execute the crew with parameters.
- The write_report task has context: [do_research], which automatically provides the Writer agent with the Researcher's output as input. This is how data and context flow between agents.
- process: sequential makes the execution order explicit.
- The expected_output fields guide the agents and can double as validation criteria.
Step 3: Write the Orchestration Code (crew.py / main.py). With YAML config, if we run crewai run, CrewAI will perform the orchestration for us. Let’s take a look at what it looks like in Python so we can better understand the internals:
# crew.py (conceptual example)
from crewai import Agent, Crew, Task, Process
from crewai_tools import SerperDevTool

# Initialize tools (SerperDevTool reads its API key from .env)
search_tool = SerperDevTool()

# Create Agent instances
researcher = Agent(
    role="Researcher",
    goal="Gather relevant info on a topic",
    backstory="An expert at finding and analyzing information.",
    tools=[search_tool],
    allow_delegation=False
)
writer = Agent(
    role="Writer",
    goal="Summarize research into a report",
    backstory="Skilled at summarizing information into reports.",
    tools=[],
    allow_delegation=False
)

# Define Tasks
research_task = Task(
    description="Research the topic: {topic}",
    agent=researcher,
    expected_output="Key facts about {topic}",
)
write_task = Task(
    description="Write a summary report on '{topic}' using the research findings.",
    agent=writer,
    context=[research_task],   # link output of research_task
    expected_output="Summary report for {topic}"
)

# Create a Crew with sequential process
crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, write_task],
    process=Process.sequential,   # sequential execution
    memory=True                   # enable memory if we want agents to retain info
)
Now, to execute this crew, we would call crew.kickoff(inputs={"topic": "Climate Change Impacts"}), providing the topic value that our task descriptions expect. CrewAI will then run the research task first (the Researcher uses the search tool to gather facts), pass that output into the writing task as context, and return the Writer's summary report as the final result.
Step 4: Run and test the workflow. If you are using the CLI scaffold with YAML, set the topic input in main.py and run crewai run; otherwise, call crew.kickoff(inputs={"topic": "Climate Change Impacts"}) from a Python script. Then watch the console output as the agents work.
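For reference, a minimal main.py could look like this (a sketch that assumes the crew object is defined in a local crew.py, as in the conceptual example above):
# main.py (conceptual example)
from crew import crew  # the Crew instance defined in crew.py

def run():
    # The inputs dict fills the {topic} placeholders in the task descriptions
    result = crew.kickoff(inputs={"topic": "Climate Change Impacts"})
    print(result)

if __name__ == "__main__":
    run()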
Production tips for building workflows, such as bounding agent behavior and adding observability, are covered in the reliability and monitoring sections later in this guide.
Extending the example: You could easily expand this workflow by adding a third agent, e.g., an “Editor” agent that checks the Writer’s report for quality and style adherence, or a “FactChecker” agent that verifies facts using a second round of web searches. You’d add an additional task, create the necessary context links, and perhaps allow the Editor agent to request revisions (delegation) if necessary, as in the sketch below.
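One way to add the Editor, building on the crew.py example above (the role, goal, and wording are illustrative):
# Conceptual extension: an Editor agent that reviews the Writer's report
editor = Agent(
    role="Editor",
    goal="Review the summary report for clarity, accuracy, and style",
    backstory="A meticulous editor who enforces the house style guide.",
    allow_delegation=True        # may hand work back to the Writer if revisions are needed
)

review_task = Task(
    description="Review the report on '{topic}' and fix any quality or style issues.",
    expected_output="A polished final version of the report.",
    agent=editor,
    context=[write_task]         # receives the Writer's draft as input
)

crew = Crew(
    agents=[researcher, writer, editor],
    tasks=[research_task, write_task, review_task],
    process=Process.sequential
)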
CrewAI includes a broad built-in tool library that helps agents interact with files, the web, databases, vector stores, and external services. The table below lists some of the most commonly used tools and what each one does.
| Tool | Description |
|---|---|
| ScrapeWebsiteTool | Extracts content from websites by fetching and parsing HTML pages. |
| SerperDevTool | Performs Google searches via the Serper API for real-time web results. |
| FileReadTool | Reads content from local files of various types. |
| RagTool | Enables Retrieval-Augmented Generation by querying external documents. |
| CodeInterpreterTool | Executes Python code snippets safely within the agent environment. |
| DallETool | Generates images from text descriptions using OpenAI’s DALL-E model. |
| PGSearchTool | Queries PostgreSQL databases for data retrieval (semantic/RAG-style search support). |
Integrations and Advanced Tools: CrewAI also integrates with the Model Context Protocol (MCP), an ecosystem of community-built tools exposed through a common interface. Enabling MCP integration means an agent can access the wide range of tools running on MCP servers (complex data-crunching services, specialized APIs, and so on). This requires installing the crewai-tools[mcp] extra and running an MCP server or connecting to an existing one.
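A minimal sketch of what this can look like for a local MCP server launched over stdio; the server script path is hypothetical, and the MCPServerAdapter/StdioServerParameters usage is an assumption based on the crewai-tools[mcp] extra:
import os
from mcp import StdioServerParameters
from crewai_tools import MCPServerAdapter
from crewai import Agent

# Describe how to launch the (hypothetical) local MCP server
server_params = StdioServerParameters(
    command="python3",
    args=["servers/my_mcp_server.py"],   # hypothetical server script
    env=os.environ.copy(),
)

# The adapter exposes the server's tools to CrewAI agents
with MCPServerAdapter(server_params) as mcp_tools:
    analyst = Agent(
        role="Data Cruncher",
        goal="Process datasets using MCP-provided tools",
        backstory="A specialist at delegating heavy computation to external services.",
        tools=mcp_tools,
    )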
In production, you will want to account for a variety of failure modes: agents getting stuck in infinite loops, tools erroring or being misused, costs spiraling, and so on. CrewAI provides patterns and built-in settings to improve reliability. Here are the most important ones to know about (a short code sketch illustrating several of them follows the table):
| Reliability pattern | What it prevents | How to apply in CrewAI |
|---|---|---|
| Guardrailed tool results | Hallucinations after a correct tool output — the LLM “rephrases” and introduces errors even when the tool returns the truth. | For authoritative tools (DB query, calculator, retrieval), configure the workflow so tool output becomes the final answer (or is copied verbatim into the response). Also, write tasks that explicitly require tool usage for facts. |
| Iteration limits | Infinite loops and token burn — agents keep thinking, retrying, or re-planning without converging. | Set agent max_iter to cap thought/tool cycles. Keep it low for simple tasks; increase only for complex multi-step reasoning. Treat frequent “hit max_iter” events as a prompt/task design smell. |
| Execution timeouts | Hung runs — slow web scraping, network stalls, or long tool calls block the entire crew. | Set max_execution_time per agent/task. Inside custom tools, enforce timeouts (HTTP timeouts, Selenium page-load limits). Prefer “timeout + retry” over waiting indefinitely. |
| Retries + backoff | Transient failures — rate limits, flaky APIs, intermittent connectivity. | Use max_retry_limit and implement exponential backoff in tool code (e.g., wait 1s → 2s → 4s). After bounded retries, return a structured error payload so downstream tasks can handle it deterministically. |
| Bounded delegation | Delegation ping-pong — agents keep handing work back and forth and never finish. | Enable allow_delegation only for a manager/coordinator agent; keep worker agents non-delegating. Add explicit prompt rules like “delegate only when missing data/tool access,” and require workers to return concrete outputs, not requests. |
| Deterministic inputs | Unstable outputs — small prompt variance causes different formats/results, breaking downstream automation. | Use strict task specs (expected_output), fixed schemas (JSON), and structured prompts. For extraction/classification agents, reduce randomness (lower temperature) and constrain outputs to defined fields. |
| Output validation | Silent quality failures — outputs “look OK” but miss required fields, violate constraints, or contain contradictions. | Add a validator/reviewer task that checks schema, required sections, citations, and constraints. If validation fails, re-run with targeted corrective feedback (what’s missing + exact format expected). |
| Cost controls | Runaway spend — multi-agent workflows amplify tokens and tool calls quickly. | Use smaller/cheaper models for routine steps and reserve stronger models for critical reasoning. Cap steps/time, cache repeated context, and log tokens/cost per task to identify the cost hotspots for prompt/tool optimization. |
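As a concrete illustration of several of these patterns, the sketch below caps an agent's iterations and execution time, bounds retries, and pins a task's output to a schema (the parameter values are arbitrary examples, and the TrendReport model is hypothetical):
from pydantic import BaseModel
from crewai import Agent, Task

class TrendReport(BaseModel):
    trend_name: str
    market_size: str
    key_players: list[str]

bounded_researcher = Agent(
    role="Research Analyst",
    goal="Produce schema-conformant trend analyses",
    backstory="A careful analyst who reports tool results verbatim.",
    max_iter=5,               # cap thought/tool cycles to avoid loops
    max_execution_time=120,   # seconds before the run is cut off
    max_retry_limit=2,        # bounded retries on failures
    allow_delegation=False    # worker agents do not delegate
)

structured_task = Task(
    description="Analyze the single most important AI trend of 2025.",
    expected_output="A TrendReport with every field populated.",
    agent=bounded_researcher,
    output_pydantic=TrendReport   # validate and structure the final output
)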
This table summarizes the three most practical observability layers for CrewAI workflows: start with simple console visibility during development, evolve toward structured tracing and external monitoring in production, and finally add debug hooks and evaluation to continuously improve reliability and output quality over time (a step_callback sketch follows the table).
| Observability area | What it gives you (why it matters) | How to use it in CrewAI (practical, production-oriented) |
|---|---|---|
| Basic logging & verbosity | Immediate, readable debugging — you can see agent reasoning steps, tool invocations, and decision flow in real time. Best for development and quick triage, but limited for long-term analysis. | Set verbose=True on the Crew or specific Agents to print step-by-step logs (THINKING → ACTION → OBSERVATION). Use it locally to reproduce issues. In production, keep verbosity low and emit structured logs (JSON) where possible so you can aggregate them. |
| Tracing + external observability + production monitoring | Structured evidence trail — step-by-step execution captured with timestamps, inputs/outputs, tool calls, and run IDs. Enables post-mortems, performance tuning, and cost governance (token spikes, slow tools). | Enable CrewAI tracing (config/flag/env per docs) so every run generates trace data with a unique identifier. Ship traces/metrics to an observability backend (e.g., Langfuse for LLM traces, Datadog for logs/metrics/alerts, Arize/Phoenix for model monitoring, LangSmith/LangTrace for unified tracing, Neatlogs for shareable run replays). Add alerts on failures/latency, track tokens and cost per task, and monitor throughput for scaling decisions. |
| Debug hooks + evaluation workflows | Actionable diagnosis and quality control — custom hooks expose patterns (delegation loops, repeated retries, tool errors), while evaluation prevents silent regressions in output quality. | Use step_callback to annotate/record critical events after each step (delegation triggered, tool error, retry count, schema violations). Add an automated evaluation stage (rule checks + optional LLM scoring) on final outputs and persist scores. Use trends to refine prompts/tools and to gate releases (AgentOps-style continuous improvement). |
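For example, a minimal step_callback hook can simply log every step for later analysis; the sketch below reuses the agents and tasks from the workflow example above and records each step as a string, since the exact shape of the step object varies by CrewAI version:
import json, logging, time
from crewai import Crew, Process

logging.basicConfig(level=logging.INFO)

def log_step(step_output):
    # Emit a structured log record after every agent step
    logging.info(json.dumps({"ts": time.time(), "step": str(step_output)}))

observed_crew = Crew(
    agents=[researcher, writer],         # reuse agents from the earlier example
    tasks=[research_task, write_task],
    process=Process.sequential,
    step_callback=log_step,              # called after each agent step
    verbose=False                        # rely on structured logs instead of console chatter
)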
Different frameworks have different philosophies, architectures, and features, and the right fit depends on your use case. Three of the most popular multi-agent frameworks are CrewAI, LangGraph, and AutoGen. Let's briefly compare them to point out key similarities and differences:
| Aspect | CrewAI (Role-Based Teams) | LangGraph (Graph Workflows) | AutoGen (Conversational Agents) |
|---|---|---|---|
| Approach | Role-playing agents in a structured crew; tasks & flows combine autonomy with control. | Directed acyclic graph of steps (often via LangChain); explicit state management. | Multi-agent chat sessions; emergent behavior with no fixed process by default. |
| Ease of Use | Clear Agent/Crew/Task model; quick to start. Logging can be tricky without proper tooling. | Steeper learning curve; more boilerplate and upfront state design. | Fast to prototype, but requires careful prompt design to coordinate agents. |
| Workflow Control | Dual modes: autonomous crews and deterministic flows; good balance of flexibility and structure. | Strong control via graphs; great for complex branching, less adaptive at runtime. | Limited built-in control; sequence emerges from conversation, harder to enforce. |
| State & Memory | Built-in memory modules (short/long/entity); robust out of the box. | Often relies on LangChain memory; state must be predefined and managed in code. | Primarily conversation history, unless you implement additional memory. |
| Tools & Integration | 100+ tools; not tied to LangChain, easy to integrate any API/custom tools; can still use LangChain tools if needed. | Tight coupling to LangChain; many integrations, but added complexity and adaptation overhead. | Tooling via functions and AutoGen Studio UI; tool use is shaped by a conversational paradigm. |
| Performance | Minimal overhead; optimized core; reported faster execution on some tasks; strong for both simple and complex workflows. | More overhead (LangChain + graph runtime); can be slower but effective for structured flows. | Depends on the number of conversation turns; it may be slower if agents take many iterations. |
| Production Readiness | Built for production: observability hooks, enterprise features, fine-grained controls, maintainable systems. | Production-possible with careful engineering; may need extra glue due to LangChain quirks. | More research/demo oriented; production hardening (logging, errors, controls) is largely DIY. |
| Best For | Role-based workflows with known steps but flexible execution (enterprise automation, analytics pipelines, multi-step RAG with oversight). | Static/predictable workflows that require complex branching (e.g., ETL/QA pipelines), especially if already invested in LangChain. | Open-ended problem solving via conversation (brainstorming, critique loops, human-in-the-loop chat), where steps aren’t fixed. |
CrewAI is a framework designed to orchestrate multiple AI agents that work together using clearly defined roles, goals, and responsibilities. Unlike traditional single-agent setups, CrewAI emphasizes collaboration and task delegation, similar to how human teams operate. This makes it particularly effective for complex, multi-step workflows that require planning, execution, and validation.
In CrewAI, each agent is assigned a specific role, such as researcher, writer, or reviewer, along with a defined objective. Tasks are distributed based on these roles, ensuring agents focus only on what they are best suited for. This structured approach improves efficiency, reduces redundancy, and produces more coherent outputs.
CrewAI is widely used for content generation, research automation, data analysis pipelines, and AI-driven product workflows. It is especially useful in scenarios where tasks must be completed sequentially or collaboratively. Examples include building RAG systems, automating reports, and coordinating multiple LLM-powered tools.
Basic familiarity with Python and large language models is helpful but not mandatory. CrewAI is designed to be developer-friendly, with clear abstractions for agents, tasks, and workflows. Beginners can start with simple examples and gradually move to more complex multi-agent orchestration patterns.
CrewAI can also be integrated with popular LLM providers, APIs, and external tools such as vector databases and workflow engines. This flexibility allows it to fit seamlessly into existing AI pipelines, so teams can enhance their current systems without rewriting everything from scratch.
CrewAI is most useful when you treat a “crew” as an engineered workflow, rather than a chat experiment. Assign roles and task boundaries, ground key steps with tools, and enforce limits on runaway behavior so the system cannot degenerate into loops, failures, or surprise costs. Then instrument everything—logs, traces, and lightweight evaluations—so you can quickly debug issues and drive up quality release after release. If you need predictable multi-step automation with clear ownership and production controls, CrewAI is a strong default. However, if your problem is primarily complex branching logic or open-ended conversational collaboration, LangGraph or AutoGen may be a better fit.