Advanced Multi-Agent Systems on Linux: Build Collaborative AI Teams

Last Updated: November 4, 2024 | Reading Time: 20 minutes | Difficulty: Advanced

Introduction to Multi-Agent Systems

Single agents are powerful, but multi-agent systems go further: by combining specialized AI agents that collaborate and delegate subtasks, they can solve problems too complex for any single agent to handle well.

In this advanced guide, you’ll learn to:

  • Design multi-agent architectures with specialized roles
  • Implement agent communication and coordination
  • Build production systems with LangGraph and CrewAI
  • Deploy scalable multi-agent applications on Linux

Multi-Agent Architecture Patterns

1. Hierarchical Pattern (Manager-Worker)


        [Manager Agent]
              |
    +---------+---------+
    |         |         |
[Worker 1] [Worker 2] [Worker 3]
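
A minimal sketch of this pattern in plain Python, with no framework: the manager decomposes a task and routes each piece to a worker. The routing rule and worker names here are illustrative only; in practice the manager would be an LLM call.

def research_worker(subtask: str) -> str:
    return f"[research] {subtask}"

def code_worker(subtask: str) -> str:
    return f"[code] {subtask}"

WORKERS = {"research": research_worker, "code": code_worker}

def manager(task: str) -> list[str]:
    # A real manager agent would use an LLM to decompose the task;
    # here we split on semicolons and route on a keyword prefix.
    results = []
    for subtask in task.split(";"):
        kind, _, body = subtask.strip().partition(":")
        results.append(WORKERS.get(kind, research_worker)(body.strip()))
    return results

print(manager("research: agent frameworks; code: a retry decorator"))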

2. Sequential Pattern (Pipeline)


[Agent 1] → [Agent 2] → [Agent 3] → [Agent 4]
 Research     Draft      Review      Publish
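
As a sketch, the pipeline is just function composition: each stage consumes the previous stage's output. The stages below are placeholders for LLM-backed agents.

def run_pipeline(task, stages):
    result = task
    for stage in stages:
        result = stage(result)  # each stage feeds the next
    return result

stages = [
    lambda t: f"notes on {t}",            # research
    lambda notes: f"draft from {notes}",  # draft
    lambda draft: draft.strip() + ".",    # review (placeholder edit)
]
print(run_pipeline("AI agents", stages))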

3. Collaborative Pattern (Swarm)


    [Agent 1] ← → [Agent 2]
        ↑  ↘     ↗  ↑
        ↓    ↘ ↗    ↓
    [Agent 3] ← → [Agent 4]

Building with LangGraph

Install Dependencies

pip install langgraph langchain-openai langchain-anthropic langchain-community
pip install duckduckgo-search python-dotenv redis

Example: Research Team with Three Specialized Agents and a Quality Gate

#!/usr/bin/env python3
"""
Multi-Agent Research Team
- Researcher: Gathers information
- Analyzer: Processes and analyzes data
- Writer: Creates final report
"""

import os
from typing import TypedDict, Annotated
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.tools import DuckDuckGoSearchRun
import operator

# Load environment
from dotenv import load_dotenv
load_dotenv()

# Define shared state
class AgentState(TypedDict):
    task: str
    research_data: Annotated[list, operator.add]
    analysis: str
    final_report: str
    iteration: int

# Initialize LLM
llm = ChatOpenAI(model="gpt-4-turbo-preview", temperature=0.7)

# Agent 1: Researcher
def researcher_agent(state: AgentState) -> AgentState:
    """Research agent gathers information"""
    print(f"\n🔍 RESEARCHER: Starting research on: {state['task']}")

    search = DuckDuckGoSearchRun()
    query = state['task']

    # Perform search
    search_results = search.run(query)

    # Process results with LLM
    # Use template variables instead of f-strings so braces in the
    # search results can't break the prompt template
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a research agent. Extract key facts and insights."),
        ("human", "Task: {query}\n\nSearch Results:\n{results}\n\nProvide 5 key findings:")
    ])

    response = llm.invoke(prompt.format_messages(query=query, results=search_results))

    print("✅ RESEARCHER: Research complete")

    # Return a partial update; the operator.add reducer on research_data
    # appends this finding instead of overwriting (or duplicating) the list
    return {"research_data": [{
        "agent": "researcher",
        "findings": response.content
    }]}

# Agent 2: Analyzer
def analyzer_agent(state: AgentState) -> AgentState:
    """Analyzer processes research data"""
    print(f"\n📊 ANALYZER: Analyzing research data")

    research_summary = "\n".join([
        f"- {item['findings']}"
        for item in state['research_data']
    ])

    prompt = ChatPromptTemplate.from_messages([
        ("system", """You are an analytical agent. Your job is to:
        1. Identify patterns and trends
        2. Draw meaningful conclusions
        3. Highlight important insights
        4. Note any contradictions or gaps"""),
        ("human", "Task: {task}\n\nResearch Data:\n{data}\n\nProvide analysis:")
    ])

    response = llm.invoke(prompt.format_messages(task=state['task'], data=research_summary))

    print("✅ ANALYZER: Analysis complete")
    # Partial update: only the key this agent writes
    return {"analysis": response.content}

# Agent 3: Writer
def writer_agent(state: AgentState) -> AgentState:
    """Writer creates final report"""
    print(f"\n✍️ WRITER: Creating final report")

    prompt = ChatPromptTemplate.from_messages([
        ("system", """You are a professional writer. Create a comprehensive,
        well-structured report with:
        - Executive summary
        - Key findings
        - Detailed analysis
        - Conclusions and recommendations"""),
        ("human", """Task: {task}

Research Findings:
{findings}

Analysis:
{analysis}

Create a professional report:""")
    ])

    response = llm.invoke(prompt.format_messages(
        task=state['task'],
        findings=state['research_data'],
        analysis=state['analysis']
    ))

    print(f"✅ WRITER: Report complete ({len(response.content)} characters)")
    # Partial update: only the key this agent writes
    return {"final_report": response.content}

# Quality control agent
def quality_control(state: AgentState) -> AgentState:
    """Check if the report meets quality standards"""
    print("\n🔍 QUALITY CONTROL: Reviewing report")

    # Revise short reports, up to a maximum of 3 attempts
    if len(state['final_report']) < 500 and state['iteration'] < 3:
        print("⚠️ Report too short, requesting revision")
        # Re-run the writer directly and bump the iteration counter
        # (a conditional edge back to the writer node would be the
        # more idiomatic LangGraph approach)
        update = writer_agent(state)
        update['iteration'] = state['iteration'] + 1
        return update

    print("✅ QUALITY CONTROL: Report approved")
    return {}

# Build workflow graph
workflow = StateGraph(AgentState)

# Add nodes
workflow.add_node("researcher", researcher_agent)
workflow.add_node("analyzer", analyzer_agent)
workflow.add_node("writer", writer_agent)
workflow.add_node("quality_control", quality_control)

# Define flow
workflow.set_entry_point("researcher")
workflow.add_edge("researcher", "analyzer")
workflow.add_edge("analyzer", "writer")
workflow.add_edge("writer", "quality_control")
workflow.add_edge("quality_control", END)

# Compile
app = workflow.compile()

# Run the multi-agent system
def run_research_team(task: str):
    """Execute research team workflow"""
    print("=" * 60)
    print("MULTI-AGENT RESEARCH TEAM")
    print("=" * 60)

    initial_state = {
        "task": task,
        "research_data": [],
        "analysis": "",
        "final_report": "",
        "iteration": 0
    }

    # Execute workflow
    final_state = app.invoke(initial_state)

    print("\n" + "=" * 60)
    print("FINAL REPORT")
    print("=" * 60)
    print(final_state['final_report'])

    return final_state

# Example usage
if __name__ == "__main__":
    task = "Latest developments in AI agents and autonomous systems in 2024"
    result = run_research_team(task)

Building with CrewAI

Install CrewAI

pip install crewai crewai-tools

Example: Content Creation Crew

#!/usr/bin/env python3
"""
Content Creation Crew
- SEO Researcher
- Content Writer
- Editor
"""

from crewai import Agent, Task, Crew, Process
from crewai_tools import SerperDevTool, ScrapeWebsiteTool
from langchain_openai import ChatOpenAI

# Initialize LLM
llm = ChatOpenAI(model="gpt-4")

# Initialize tools (SerperDevTool needs a SERPER_API_KEY in the environment)
search_tool = SerperDevTool()
scrape_tool = ScrapeWebsiteTool()

# Define agents
seo_researcher = Agent(
    role='SEO Research Specialist',
    goal='Find trending topics and keywords with high search volume',
    backstory="""You are an expert in SEO and content strategy.
    You excel at finding topics that will drive organic traffic.""",
    tools=[search_tool],
    llm=llm,
    verbose=True
)

content_writer = Agent(
    role='Technical Content Writer',
    goal='Write engaging, technically accurate articles',
    backstory="""You are a skilled technical writer who specializes
    in Linux, DevOps, and system administration topics.""",
    tools=[scrape_tool],
    llm=llm,
    verbose=True
)

editor = Agent(
    role='Content Editor',
    goal='Review and improve content quality',
    backstory="""You are a meticulous editor who ensures content
    is clear, accurate, and engaging.""",
    llm=llm,
    verbose=True
)

# Define tasks
research_task = Task(
    description="""Research trending Linux and DevOps topics.
    Find 3 topics with:
    - High search volume
    - Low competition
    - Relevant to Linux administrators

    Provide keyword data and content angles.""",
    agent=seo_researcher,
    expected_output="List of 3 topics with SEO data"
)

writing_task = Task(
    description="""Based on the research, write a comprehensive
    tutorial article (1500+ words) on the top topic.

    Include:
    - Step-by-step instructions
    - Code examples
    - Best practices
    - Troubleshooting tips""",
    agent=content_writer,
    expected_output="Complete article in markdown format"
)

editing_task = Task(
    description="""Review the article and improve:
    - Technical accuracy
    - Clarity and readability
    - SEO optimization
    - Structure and flow

    Provide the final polished version.""",
    agent=editor,
    expected_output="Edited article ready for publication"
)

# Create crew
content_crew = Crew(
    agents=[seo_researcher, content_writer, editor],
    tasks=[research_task, writing_task, editing_task],
    process=Process.sequential,
    verbose=True
)

# Run crew
if __name__ == "__main__":
    result = content_crew.kickoff()
    print("\n" + "=" * 60)
    print("FINAL ARTICLE")
    print("=" * 60)
    print(result)

Agent Communication Patterns

1. Message Passing

from queue import Queue
import threading

class AgentMessaging:
    def __init__(self):
        self.queues = {}

    def create_queue(self, agent_name):
        self.queues[agent_name] = Queue()

    def send(self, to_agent, message):
        self.queues[to_agent].put(message)

    def receive(self, agent_name):
        return self.queues[agent_name].get()
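
A quick usage sketch, assuming the AgentMessaging class above: one agent thread blocks on receive() while another sends to its queue.

bus = AgentMessaging()
bus.create_queue("writer")

def writer_loop():
    msg = bus.receive("writer")  # blocks until a message arrives
    print(f"writer received: {msg}")

t = threading.Thread(target=writer_loop)
t.start()
bus.send("writer", {"from": "researcher", "findings": ["fact 1", "fact 2"]})
t.join()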

2. Shared State (Redis)

import redis
import json

class SharedState:
    """JSON-serialized shared state in Redis, readable by any agent."""

    def __init__(self):
        self.redis_client = redis.Redis(host='localhost', port=6379, db=0)

    def set_state(self, key, value):
        self.redis_client.set(key, json.dumps(value))

    def get_state(self, key):
        value = self.redis_client.get(key)
        return json.loads(value) if value else None

    def update_state(self, key, update_func):
        # Note: this read-modify-write is not atomic; with concurrent
        # writers, use WATCH/MULTI or a Lua script instead
        current = self.get_state(key) or {}
        updated = update_func(current)
        self.set_state(key, updated)
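
Usage sketch, with illustrative key names: agents read and write the same Redis key, so any process pointed at the same Redis instance sees the updates.

shared = SharedState()
shared.set_state("research", {"status": "in_progress", "findings": []})
shared.update_state("research", lambda s: {**s, "status": "done"})
print(shared.get_state("research"))  # {'status': 'done', 'findings': []}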

Production Deployment Architecture


┌─────────────────────────────────────────┐
│         Load Balancer (Nginx)           │
└────────────┬────────────────────────────┘
             │
    ┌────────┴────────┐
    │                 │
┌───▼────┐      ┌────▼───┐
│ Agent  │      │ Agent  │
│ API 1  │      │ API 2  │
└───┬────┘      └────┬───┘
    │                │
    └────────┬───────┘
             │
     ┌───────▼────────┐
     │  Redis Cluster │
     │  (Shared State)│
     └───────┬────────┘
             │
     ┌───────▼────────┐
     │   PostgreSQL   │
     │  (Persistence) │
     └────────────────┘
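
To make the "Agent API" boxes concrete, here is a minimal sketch of one stateless replica, assuming FastAPI and the SharedState class from the previous section; the endpoints and key names are illustrative.

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
shared = SharedState()  # all state lives in Redis, not in the replica

class TaskRequest(BaseModel):
    task: str

@app.post("/tasks/{task_id}")
def submit_task(task_id: str, req: TaskRequest):
    # Because replicas hold no local state, the load balancer can
    # route any request to any instance
    shared.set_state(f"task:{task_id}", {"status": "queued", "task": req.task})
    return {"status": "queued"}

@app.get("/tasks/{task_id}")
def task_status(task_id: str):
    return shared.get_state(f"task:{task_id}") or {"status": "unknown"}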

Monitoring and Observability

from prometheus_client import Counter, Histogram, start_http_server
import time

# Metrics
agent_requests = Counter('agent_requests_total', 'Total agent requests', ['agent_name'])
agent_duration = Histogram('agent_duration_seconds', 'Agent execution time', ['agent_name'])
agent_errors = Counter('agent_errors_total', 'Total agent errors', ['agent_name', 'error_type'])

def monitored_agent(agent_name, agent_func):
    """Wrapper to add monitoring to agents"""
    def wrapper(*args, **kwargs):
        agent_requests.labels(agent_name=agent_name).inc()

        start = time.time()
        try:
            result = agent_func(*args, **kwargs)
            agent_duration.labels(agent_name=agent_name).observe(time.time() - start)
            return result
        except Exception as e:
            agent_errors.labels(agent_name=agent_name, error_type=type(e).__name__).inc()
            raise

    return wrapper

# Start metrics server
start_http_server(8000)
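
For example, the research-team nodes from the earlier LangGraph script could be registered with the wrapper applied (names as defined there); the metrics are then scrapeable at http://localhost:8000/metrics.

workflow.add_node("researcher", monitored_agent("researcher", researcher_agent))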

Best Practices

  • Clear Responsibilities: Each agent should have one well-defined role
  • Explicit Communication: Use structured messages between agents
  • Error Recovery: Implement retry logic and fallback strategies (see the retry sketch after this list)
  • State Management: Use Redis or similar for shared state
  • Monitoring: Track agent performance and errors
  • Testing: Test individual agents before integration
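
A minimal retry sketch with exponential backoff, framework-agnostic; the attempt count and delays are arbitrary defaults.

import time

def with_retries(agent_func, max_attempts=3, base_delay=1.0):
    """Retry an agent call, doubling the delay after each failure."""
    def wrapper(*args, **kwargs):
        for attempt in range(1, max_attempts + 1):
            try:
                return agent_func(*args, **kwargs)
            except Exception as e:
                if attempt == max_attempts:
                    raise  # out of attempts; surface the error
                delay = base_delay * 2 ** (attempt - 1)
                print(f"Attempt {attempt} failed ({e}); retrying in {delay}s")
                time.sleep(delay)
    return wrapper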

Performance Optimization

Optimization          Impact                  Implementation
--------------------  ----------------------  -----------------------------------------
Parallel Execution    3-5x faster             asyncio or threading (see sketch below)
Caching               50-70% cost reduction   Redis with TTL
Prompt Optimization   30-40% token savings    Shorter, precise prompts
Model Selection       Varies by task          Smaller models where possible
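
A sketch of the parallel-execution row: independent agents run concurrently with asyncio.to_thread, which helps because agent time is dominated by waiting on LLM and API responses. The agent functions are placeholders.

import asyncio

def research_agent(topic):   # placeholder blocking agent
    return f"findings on {topic}"

def analysis_agent(topic):   # placeholder blocking agent
    return f"analysis of {topic}"

async def run_parallel(topic):
    # Blocking calls run in worker threads; total wall-clock time is
    # roughly that of the slowest agent rather than the sum of all
    findings, analysis = await asyncio.gather(
        asyncio.to_thread(research_agent, topic),
        asyncio.to_thread(analysis_agent, topic),
    )
    return findings, analysis

print(asyncio.run(run_parallel("AI agents")))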

Real-World Use Cases

  1. Customer Support: Triage → Specialist → Resolution agents
  2. Content Creation: Research → Write → Edit → Publish
  3. Data Analysis: Collect → Clean → Analyze → Visualize
  4. DevOps: Monitor → Diagnose → Fix → Verify
  5. Security: Detect → Analyze → Respond → Report

Next Steps

Continue your journey with:

  • Memory and Knowledge Management (next article)
  • Production Deployment on Linux
  • Real-World Case Studies

About Ramesh Sundararamaiah

Red Hat Certified Architect

Expert in Linux system administration, DevOps automation, and cloud infrastructure. Specializing in Red Hat Enterprise Linux, CentOS, Ubuntu, Docker, Ansible, and enterprise IT solutions.