Why This Matters
The right architecture determines whether your agent scales from prototype to production.
The Problem: Monolithic Agent Code
```typescript
// ❌ Everything in one file
const agent = async (userInput: string) => {
  // Model selection hardcoded
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: userInput }],
  });
  const content = response.choices[0].message.content;

  // Tools hardcoded
  if (content.includes('weather')) {
    const weather = await fetch('https://api.weather.com...');
  }

  // Application logic mixed in
  await db.save(content);
  await slack.post(content);
  return content;
};
```
Issues:
- Can't swap models without rewriting code
- Tools tightly coupled to agent logic
- Testing requires mocking everything (see the sketch after this list)
- No separation of concerns
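The testing problem is concrete: with everything in one function, a unit test must mock OpenAI, the database, and Slack at once. Here is a minimal sketch of the alternative, assuming a Vitest-style test runner and an illustrative `LLM` interface (the layered version is developed later in this chapter):

```typescript
// agent.test.ts — illustrative sketch using Vitest
import { describe, it, expect } from 'vitest';

// With a layer boundary, the agent accepts its LLM dependency,
// so the test injects a stub instead of mocking HTTP, db, and Slack.
interface LLM {
  generate(model: string, messages: { role: string; content: string }[]): Promise<string>;
}

const makeAgent = (llm: LLM) => ({
  run: (input: string) => llm.generate('gpt-4o', [{ role: 'user', content: input }]),
});

describe('agent', () => {
  it('returns the model response', async () => {
    const fakeLLM: LLM = { generate: async () => 'stubbed summary' };
    const agent = makeAgent(fakeLLM);
    expect(await agent.run('Summarize this')).toBe('stubbed summary');
  });
});
```

No network, no database, no global mocks: the test exercises the agent's logic against a fake that satisfies the same interface as the real provider.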
The Solution: Four-Layer Architecture
┌─────────────────────────────────────────────────────────────┐
│ APPLICATION LAYER │
│ • UI components │
│ • API routes │
│ • Workflows │
│ • Business logic │
└─────────────────────┬───────────────────────────────────────┘
│
┌─────────────────────┴───────────────────────────────────────┐
│ FRAMEWORK LAYER │
│ • Agent orchestration (Mastra) │
│ • State management │
│ • Tool execution │
│ • Memory management │
└─────────────────────┬───────────────────────────────────────┘
│
┌─────────────────────┴───────────────────────────────────────┐
│ TOOLS LAYER │
│ • Data retrieval │
│ • Actions (send email, create ticket) │
│ • Integrations (CRM, Slack, etc) │
└─────────────────────┬───────────────────────────────────────┘
│
┌─────────────────────┴───────────────────────────────────────┐
│ LLM LAYER │
│ • Model providers (OpenAI, Anthropic) │
│ • Model selection │
│ • Token management │
└─────────────────────────────────────────────────────────────┘
The Four-Layer Architecture
Layer 1: LLM Layer - The Engine of Reasoning
The foundation that powers agent intelligence.
```typescript
// lib/llm-provider.ts
import { openai } from '@ai-sdk/openai';
import { anthropic } from '@ai-sdk/anthropic';
import { generateText, streamText, type LanguageModel } from 'ai';

export class LLMProvider {
  private models: Record<string, LanguageModel> = {
    'gpt-4o': openai('gpt-4o'),
    'gpt-4o-mini': openai('gpt-4o-mini'),
    'claude-3-5-sonnet': anthropic('claude-3-5-sonnet-20241022'),
  };

  async generate(
    model: string,
    messages: Message[],
    options?: GenerateOptions
  ): Promise<Response> {
    const provider = this.models[model];
    if (!provider) {
      throw new Error(`Unknown model: ${model}`);
    }
    return await generateText({
      model: provider,
      messages,
      ...options,
    });
  }

  async stream(
    model: string,
    messages: Message[],
    options?: StreamOptions
  ): Promise<StreamResponse> {
    const provider = this.models[model];
    if (!provider) {
      throw new Error(`Unknown model: ${model}`);
    }
    return await streamText({
      model: provider,
      messages,
      ...options,
    });
  }
}

// Usage - decoupled from agent logic
const llm = new LLMProvider();
const response = await llm.generate('gpt-4o', messages);
```
Key Responsibilities:
- Model abstraction and selection
- Provider management (OpenAI, Anthropic, etc.)
- Token counting and management
- Rate limiting and quotas (both sketched below)
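The provider above omits the last two responsibilities. A minimal sketch of a budget guard, assuming a rough characters-per-token heuristic rather than a real tokenizer (the class and its numbers are illustrative):

```typescript
// lib/token-budget.ts — illustrative sketch, not a real tokenizer
// Approximates tokens as chars/4 and enforces a per-minute budget.
export class TokenBudget {
  private used = 0;
  private windowStart = Date.now();

  constructor(private tokensPerMinute: number) {}

  // Rough heuristic: ~4 characters per token for English text.
  estimate(text: string): number {
    return Math.ceil(text.length / 4);
  }

  consume(text: string): void {
    const now = Date.now();
    if (now - this.windowStart > 60_000) {
      // New window: reset the counter.
      this.used = 0;
      this.windowStart = now;
    }
    const cost = this.estimate(text);
    if (this.used + cost > this.tokensPerMinute) {
      throw new Error('Token budget exceeded for this window');
    }
    this.used += cost;
  }
}
```

`LLMProvider.generate` could call `budget.consume(...)` on the outgoing messages before dispatching, and again on the response text afterwards, keeping quota logic out of agent code.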
Layer 2: Framework Layer - Orchestration and State Management
Mastra sits here, orchestrating agents, tools, and state.
```typescript
// src/mastra/index.ts
import { Mastra } from '@mastra/core';
import { summarizerAgent } from './agents/summarizer';
import { researchAgent } from './agents/researcher';
import * as tools from './tools';

export const mastra = new Mastra({
  agents: {
    summarizer: summarizerAgent,
    researcher: researchAgent,
  },
  tools: {
    searchWeb: tools.searchWeb,
    fetchCRM: tools.fetchCRM,
    sendSlack: tools.sendSlack,
  },
});

// src/mastra/agents/summarizer.ts
// Agent definition - framework handles orchestration
import { Agent } from '@mastra/core';

export const summarizerAgent = new Agent({
  name: 'Article Summarizer',
  instructions: `You summarize articles concisely.`,
  model: {
    provider: 'OPEN_AI',
    name: 'gpt-4o-mini',
  },
  tools: {
    searchWeb: true, // Framework provides tool
    fetchCRM: true,
  },
});
```
Key Responsibilities:
- Agent lifecycle management
- Tool registration and execution
- State persistence
- Memory management
- Event handling
- Workflow orchestration
Why This Layer Matters:
```typescript
// ❌ Without framework - manual orchestration
const response = await model.generate(prompt);
const toolCall = parseToolCall(response);
const toolResult = await executeTool(toolCall);
const finalResponse = await model.generate(addToolResult(prompt, toolResult));

// ✅ With framework - automatic orchestration
const response = await agent.generate(prompt);
// Framework handles tool calls, loops, state
```
Layer 3: Tools Layer - Extending Agent Capabilities
Tools connect agents to the outside world.
```typescript
// src/mastra/tools/index.ts

// Data Retrieval Tools
export { searchWeb } from './search-web';
export { fetchCRMContact } from './fetch-crm';
export { getCalendarEvents } from './get-calendar';

// Action Tools
export { sendEmail } from './send-email';
export { createTicket } from './create-ticket';
export { postSlack } from './post-slack';

// Integration Tools
export { queryDatabase } from './query-db';
export { uploadFile } from './upload-file';
export { analyzeImage } from './analyze-image';

// Example tool
// src/mastra/tools/search-web.ts
import { createTool } from '@mastra/core';
import { z } from 'zod';

export const searchWeb = createTool({
  id: 'search-web',
  description: 'Search the web for current information',
  inputSchema: z.object({
    query: z.string(),
    limit: z.number().default(5),
  }),
  execute: async ({ context }, { query, limit }) => {
    const results = await tavily.search(query, { limit });
    return { success: true, results };
  },
});
```
Tool Organization:
tools/
├── data/ # Read-only tools
│ ├── search-web.ts
│ ├── fetch-crm.ts
│ └── get-calendar.ts
├── actions/ # Write/mutate tools
│ ├── send-email.ts
│ ├── create-ticket.ts
│ └── post-slack.ts
└── integrations/ # Third-party APIs
├── hubspot.ts
├── salesforce.ts
└── stripe.ts
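One way to keep the barrel file aligned with this layout (a suggestion, not a Mastra requirement) is to re-export from the grouped paths:

```typescript
// src/mastra/tools/index.ts — re-exports grouped to mirror the directories
export { searchWeb } from './data/search-web';
export { fetchCRMContact } from './data/fetch-crm';
export { getCalendarEvents } from './data/get-calendar';

export { sendEmail } from './actions/send-email';
export { createTicket } from './actions/create-ticket';
export { postSlack } from './actions/post-slack';

export * as hubspot from './integrations/hubspot';
export * as salesforce from './integrations/salesforce';
export * as stripe from './integrations/stripe';
```

Consumers keep importing from `tools`, so moving a tool between `data/` and `actions/` never touches agent definitions.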
Layer 4: Application Layer - User Experience and Integration
Where agents meet users and business logic.
```typescript
// app/api/chat/route.ts - API endpoint
import { mastra } from '@/mastra';

export async function POST(request: Request) {
  const { message, agentId } = await request.json();
  const agent = mastra.getAgent(agentId);
  const response = await agent.generate(message);
  return Response.json({ response });
}

// app/api/workflows/summarize-article/route.ts - Workflow
export async function POST(request: Request) {
  const { articleUrl } = await request.json();

  // 1. Fetch article
  const article = await fetch(articleUrl).then(r => r.text());

  // 2. Summarize with agent
  const summary = await mastra.getAgent('summarizer').generate(
    `Summarize this article:\n\n${article}`
  );

  // 3. Post to Slack
  await mastra.getTool('postSlack').execute({
    channel: '#content',
    message: summary,
  });

  // 4. Save to database
  await db.summary.create({
    data: { url: articleUrl, summary, createdAt: new Date() },
  });

  return Response.json({ success: true, summary });
}

// components/ChatInterface.tsx - UI
import { FloatingCedarChat } from 'cedar-os';

export function ChatInterface() {
  return (
    <FloatingCedarChat
      title="AI Assistant"
      agentContext={{
        systemPrompt: 'You are a helpful assistant.',
      }}
    />
  );
}
```
Key Responsibilities:
- User interfaces (web, mobile, CLI)
- API endpoints
- Business workflows
- Integration with existing systems
- Authentication and authorization (sketched below)
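The chat route shown earlier skips the last responsibility. A minimal sketch of gating it, assuming a hypothetical `verifySession` helper backed by whatever auth provider you use:

```typescript
// app/api/chat/route.ts — auth-gated variant (verifySession is illustrative)
import { mastra } from '@/mastra';
import { verifySession } from '@/lib/auth'; // hypothetical helper

export async function POST(request: Request) {
  // Reject unauthenticated callers before any agent work runs.
  const session = await verifySession(request.headers.get('authorization'));
  if (!session) {
    return Response.json({ error: 'Unauthorized' }, { status: 401 });
  }

  const { message, agentId } = await request.json();
  const agent = mastra.getAgent(agentId);
  const response = await agent.generate(message);
  return Response.json({ response });
}
```

Keeping the check in the route means the framework and tools layers never need to know about sessions at all.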
Complete Stack Example
```typescript
// ============================================
// Layer 1: LLM Provider Configuration
// ============================================
// lib/llm.ts
export const llmConfig = {
  primary: {
    provider: 'openai',
    model: 'gpt-4o',
    temperature: 0.3,
  },
  fallback: {
    provider: 'openai',
    model: 'gpt-4o-mini',
    temperature: 0.3,
  },
};

// ============================================
// Layer 2: Framework - Agent Definitions
// ============================================
// src/mastra/agents/customer-support.ts
import { Agent } from '@mastra/core';

export const customerSupportAgent = new Agent({
  name: 'Customer Support Agent',
  instructions: `You are a customer support agent.

Your capabilities:
- Look up customer information
- Check order status
- Create support tickets
- Escalate to human agents when needed`,
  model: {
    provider: 'OPEN_AI',
    name: 'gpt-4o',
  },
  tools: {
    getCustomer: true,
    getOrders: true,
    createTicket: true,
    escalateToHuman: true,
  },
});

// ============================================
// Layer 3: Tools
// ============================================
// src/mastra/tools/get-customer.ts
export const getCustomer = createTool({
  id: 'get-customer',
  description: 'Retrieve customer details by ID or email',
  inputSchema: z.object({
    customerId: z.string().optional(),
    email: z.string().email().optional(),
  }),
  execute: async ({ context }, args) => {
    const customer = await db.customer.findFirst({
      where: {
        OR: [
          { id: args.customerId },
          { email: args.email },
        ],
      },
      include: { orders: true },
    });
    return { success: true, data: customer };
  },
});

// ============================================
// Layer 4: Application
// ============================================
// app/api/support/chat/route.ts
export async function POST(request: Request) {
  const { message, sessionId } = await request.json();

  // Get conversation history
  const history = await getConversationHistory(sessionId);

  // Generate response with agent
  const response = await mastra
    .getAgent('customer-support')
    .generate(message, {
      context: { history },
    });

  // Save to history
  await saveToHistory(sessionId, { message, response });

  return Response.json({ response });
}

// components/SupportChat.tsx
export function SupportChat() {
  const [sessionId] = useState(() => crypto.randomUUID());

  return (
    <FloatingCedarChat
      title="Customer Support"
      endpoint="/api/support/chat"
      sessionId={sessionId}
    />
  );
}
```
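The support route assumes `getConversationHistory` and `saveToHistory` helpers, which are left undefined above. A minimal in-memory sketch (swap the `Map` for a database table in production):

```typescript
// lib/history.ts — in-memory sketch; use a real store in production
type Turn = { message: string; response: string };

const sessions = new Map<string, Turn[]>();

export async function getConversationHistory(sessionId: string): Promise<Turn[]> {
  return sessions.get(sessionId) ?? [];
}

export async function saveToHistory(sessionId: string, turn: Turn): Promise<void> {
  const history = sessions.get(sessionId) ?? [];
  history.push(turn);
  sessions.set(sessionId, history);
}
```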
Common Pitfalls
1. Skipping the Framework Layer
```typescript
// ❌ Direct LLM calls everywhere
const response1 = await openai.chat.completions.create(/*...*/);
const response2 = await openai.chat.completions.create(/*...*/);

// ✅ Use framework
const response = await agent.generate(prompt);
```
Why it matters: Frameworks provide orchestration, state, tools, memory.
2. Tools in Application Layer
```typescript
// ❌ Business logic mixed with tools
async function handleRequest(req) {
  const customer = await db.customer.findUnique(/*...*/);
  const orders = await db.order.findMany(/*...*/);
  const response = await agent.generate(/*...*/);
}

// ✅ Tools in Tools layer, business logic in Application
async function handleRequest(req) {
  const response = await agent.generate(req.message);
  // Agent uses getCustomer and getOrders tools
}
```
3. Hardcoded Model Selection
```typescript
// ❌ Model hardcoded
const agent = new Agent({
  model: { provider: 'OPEN_AI', name: 'gpt-4o' },
});

// ✅ Configurable
const agent = new Agent({
  model: getModelConfig(process.env.AGENT_MODEL),
});
```
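`getModelConfig` is left undefined above. One possible implementation, mapping an environment variable onto the provider/name pairs used elsewhere in this chapter, with a cheap default when the variable is unset (the mapping is illustrative):

```typescript
// lib/model-config.ts — illustrative mapping; adjust to your providers
const MODEL_CONFIGS: Record<string, { provider: string; name: string }> = {
  'gpt-4o': { provider: 'OPEN_AI', name: 'gpt-4o' },
  'gpt-4o-mini': { provider: 'OPEN_AI', name: 'gpt-4o-mini' },
  'claude-3-5-sonnet': { provider: 'ANTHROPIC', name: 'claude-3-5-sonnet-20241022' },
};

export function getModelConfig(model?: string) {
  // Fall back to the cheap model when the env var is unset or unknown.
  return MODEL_CONFIGS[model ?? ''] ?? MODEL_CONFIGS['gpt-4o-mini'];
}
```

Now switching a deployment to a different model is a config change, not a code change.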
4. No Layer Boundaries
```typescript
// ❌ Everything mixed
const agent = {
  async run(input) {
    const llm = await openai.create(/*...*/);
    const data = await db.query(/*...*/);
    await slack.post(/*...*/);
    return llm.response;
  },
};

// ✅ Clear boundaries
// Application calls Framework calls Tools calls LLM
```
Migration Path
From Monolith to Layered
Step 1: Extract LLM Layer
```typescript
// Before
const response = await openai.chat.completions.create(/*...*/);

// After
const llm = new LLMProvider();
const response = await llm.generate('gpt-4o', messages);
```
Step 2: Extract Tools
```typescript
// Before
const customer = await db.customer.findUnique(/*...*/);

// After
const getCustomer = createTool({/*...*/});
```
Step 3: Add Framework
```typescript
// Before
const response = await llm.generate(/*...*/);

// After
const agent = new Agent({/*...*/});
const response = await agent.generate(prompt);
```
Step 4: Clean Application Layer
```typescript
// Before
async function handler(req) {
  // 200 lines of mixed logic
}

// After
async function handler(req) {
  return await agent.generate(req.message);
}
```
Key Takeaways
- Four layers - LLM, Framework, Tools, Application
- Separation of concerns - Each layer has clear responsibilities
- Mastra is the framework - Orchestration, state, tools, memory
- Tools extend capabilities - Clean, reusable, testable
- Application layer is thin - Delegates to framework
The modern agentic stack enables scalable, maintainable, production-ready systems.