Why This Matters
The right architecture determines whether your agent scales from prototype to production.
The Problem: Monolithic Agent Code
```typescript
// ❌ Everything in one file
const agent = async (userInput: string) => {
  // Model selection hardcoded
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: userInput }],
  });
  const content = response.choices[0].message.content;

  // Tools hardcoded
  if (content.includes('weather')) {
    const weather = await fetch('https://api.weather.com...');
  }

  // Application logic mixed in
  await db.save(content);
  await slack.post(content);
  return content;
};
```
Issues:
- Can't swap models without rewriting code
- Tools tightly coupled to agent logic
- Testing requires mocking everything (see the sketch after this list)
- No separation of concerns
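The testing problem is concrete: with everything in one function, a unit test must mock OpenAI, the database, and Slack at once. Here is a minimal sketch of the alternative, assuming a Vitest-style test runner and an illustrative `LLM` interface (the layered version is developed later in this chapter):

```typescript
// agent.test.ts — illustrative sketch using Vitest
import { describe, it, expect } from 'vitest';

// With a layer boundary, the agent accepts its LLM dependency,
// so the test injects a stub instead of mocking HTTP, db, and Slack.
interface LLM {
  generate(model: string, messages: { role: string; content: string }[]): Promise<string>;
}

const makeAgent = (llm: LLM) => ({
  run: (input: string) => llm.generate('gpt-4o', [{ role: 'user', content: input }]),
});

describe('agent', () => {
  it('returns the model response', async () => {
    const fakeLLM: LLM = { generate: async () => 'stubbed summary' };
    const agent = makeAgent(fakeLLM);
    expect(await agent.run('Summarize this')).toBe('stubbed summary');
  });
});
```

No network, no database, no global mocks: the test exercises the agent's logic against a fake that satisfies the same interface as the real provider.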
The Solution: Four-Layer Architecture
┌─────────────────────────────────────────────────────────────┐
│ APPLICATION LAYER │
│ • UI components │
│ • API routes │
│ • Workflows │
│ • Business logic │
└─────────────────────┬───────────────────────────────────────┘
│
┌─────────────────────┴───────────────────────────────────────┐
│ FRAMEWORK LAYER │
│ • Agent orchestration (Mastra) │
│ • State management │
│ • Tool execution │
│ • Memory management │
└─────────────────────┬───────────────────────────────────────┘
│
┌─────────────────────┴───────────────────────────────────────┐
│ TOOLS LAYER │
│ • Data retrieval │
│ • Actions (send email, create ticket) │
│ • Integrations (CRM, Slack, etc) │
└─────────────────────┬───────────────────────────────────────┘
│
┌─────────────────────┴───────────────────────────────────────┐
│ LLM LAYER │
│ • Model providers (OpenAI, Anthropic) │
│ • Model selection │
│ • Token management │
└─────────────────────────────────────────────────────────────┘
The Four-Layer Architecture
Layer 1: LLM Layer - The Engine of Reasoning
The foundation that powers agent intelligence.
```typescript
// lib/llm-provider.ts
import { openai } from '@ai-sdk/openai';
import { anthropic } from '@ai-sdk/anthropic';
import { generateText, streamText, type LanguageModel } from 'ai';

export class LLMProvider {
  private models: Record<string, LanguageModel> = {
    'gpt-4o': openai('gpt-4o'),
    'gpt-4o-mini': openai('gpt-4o-mini'),
    'claude-3-5-sonnet': anthropic('claude-3-5-sonnet-20241022'),
  };

  async generate(
    model: string,
    messages: Message[],
    options?: GenerateOptions
  ): Promise<Response> {
    const provider = this.models[model];
    if (!provider) {
      throw new Error(`Unknown model: ${model}`);
    }
    return await generateText({
      model: provider,
      messages,
      ...options,
    });
  }

  async stream(
    model: string,
    messages: Message[],
    options?: StreamOptions
  ): Promise<StreamResponse> {
    const provider = this.models[model];
    if (!provider) {
      throw new Error(`Unknown model: ${model}`);
    }
    return await streamText({
      model: provider,
      messages,
      ...options,
    });
  }
}

// Usage - decoupled from agent logic
const llm = new LLMProvider();
const response = await llm.generate('gpt-4o', messages);
```
Key Responsibilities:
- Model abstraction and selection
- Provider management (OpenAI, Anthropic, etc.)
- Token counting and management
- Rate limiting and quotas (both sketched below)
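The provider above omits the last two responsibilities. A minimal sketch of a budget guard, assuming a rough characters-per-token heuristic rather than a real tokenizer (the class and its numbers are illustrative):

```typescript
// lib/token-budget.ts — illustrative sketch, not a real tokenizer
// Approximates tokens as chars/4 and enforces a per-minute budget.
export class TokenBudget {
  private used = 0;
  private windowStart = Date.now();

  constructor(private tokensPerMinute: number) {}

  // Rough heuristic: ~4 characters per token for English text.
  estimate(text: string): number {
    return Math.ceil(text.length / 4);
  }

  consume(text: string): void {
    const now = Date.now();
    if (now - this.windowStart > 60_000) {
      // New window: reset the counter.
      this.used = 0;
      this.windowStart = now;
    }
    const cost = this.estimate(text);
    if (this.used + cost > this.tokensPerMinute) {
      throw new Error('Token budget exceeded for this window');
    }
    this.used += cost;
  }
}
```

`LLMProvider.generate` could call `budget.consume(...)` on the outgoing messages before dispatching, and again on the response text afterwards, keeping quota logic out of agent code.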
Layer 2: Framework Layer - Orchestration and State Management
Mastra sits here, orchestrating agents, tools, and state.
```typescript
// src/mastra/index.ts
import { Mastra } from '@mastra/core';
import { summarizerAgent } from './agents/summarizer';
import { researchAgent } from './agents/researcher';
import * as tools from './tools';

export const mastra = new Mastra({
  agents: {
    summarizer: summarizerAgent,
    researcher: researchAgent,
  },
  tools: {
    searchWeb: tools.searchWeb,
    fetchCRM: tools.fetchCRM,
    sendSlack: tools.sendSlack,
  },
});

// src/mastra/agents/summarizer.ts
// Agent definition - framework handles orchestration
import { Agent } from '@mastra/core';

export const summarizerAgent = new Agent({
  name: 'Article Summarizer',
  instructions: `You summarize articles concisely.`,
  model: {
    provider: 'OPEN_AI',
    name: 'gpt-4o-mini',
  },
  tools: {
    searchWeb: true, // Framework provides tool
    fetchCRM: true,
  },
});
```
Key Responsibilities:
- Agent lifecycle management
- Tool registration and execution
- State persistence
- Memory management
- Event handling
- Workflow orchestration
Why This Layer Matters:
```typescript
// ❌ Without framework - manual orchestration
const response = await model.generate(prompt);
const toolCall = parseToolCall(response);
const toolResult = await executeTool(toolCall);
const finalResponse = await model.generate(addToolResult(prompt, toolResult));

// ✅ With framework - automatic orchestration
const response = await agent.generate(prompt);
// Framework handles tool calls, loops, state
```
Layer 3: Tools Layer - Extending Agent Capabilities
Tools connect agents to the outside world.
```typescript
// src/mastra/tools/index.ts

// Data Retrieval Tools
export { searchWeb } from './search-web';
export { fetchCRMContact } from './fetch-crm';
export { getCalendarEvents } from './get-calendar';

// Action Tools
export { sendEmail } from './send-email';
export { createTicket } from './create-ticket';
export { postSlack } from './post-slack';

// Integration Tools
export { queryDatabase } from './query-db';
export { uploadFile } from './upload-file';
export { analyzeImage } from './analyze-image';

// Example tool
// src/mastra/tools/search-web.ts
import { createTool } from '@mastra/core';
import { z } from 'zod';

export const searchWeb = createTool({
  id: 'search-web',
  description: 'Search the web for current information',
  inputSchema: z.object({
    query: z.string(),
    limit: z.number().default(5),
  }),
  execute: async ({ context }, { query, limit }) => {
    const results = await tavily.search(query, { limit });
    return { success: true, results };
  },
});
```
Tool Organization:
tools/
├── data/ # Read-only tools
│ ├── search-web.ts
│ ├── fetch-crm.ts
│ └── get-calendar.ts
├── actions/ # Write/mutate tools
│ ├── send-email.ts
│ ├── create-ticket.ts
│ └── post-slack.ts
└── integrations/ # Third-party APIs
├── hubspot.ts
├── salesforce.ts
└── stripe.ts
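One way to keep the barrel file aligned with this layout (a suggestion, not a Mastra requirement) is to re-export from the grouped paths:

```typescript
// src/mastra/tools/index.ts — re-exports grouped to mirror the directories
export { searchWeb } from './data/search-web';
export { fetchCRMContact } from './data/fetch-crm';
export { getCalendarEvents } from './data/get-calendar';

export { sendEmail } from './actions/send-email';
export { createTicket } from './actions/create-ticket';
export { postSlack } from './actions/post-slack';

export * as hubspot from './integrations/hubspot';
export * as salesforce from './integrations/salesforce';
export * as stripe from './integrations/stripe';
```

Consumers keep importing from `tools`, so moving a tool between `data/` and `actions/` never touches agent definitions.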
Layer 4: Application Layer - User Experience and Integration
Where agents meet users and business logic.
```typescript
// app/api/chat/route.ts - API endpoint
import { mastra } from '@/mastra';

export async function POST(request: Request) {
  const { message, agentId } = await request.json();
  const agent = mastra.getAgent(agentId);
  const response = await agent.generate(message);
  return Response.json({ response });
}

// app/api/workflows/summarize-article/route.ts - Workflow
export async function POST(request: Request) {
  const { articleUrl } = await request.json();

  // 1. Fetch article
  const article = await fetch(articleUrl).then(r => r.text());

  // 2. Summarize with agent
  const summary = await mastra.getAgent('summarizer').generate(
    `Summarize this article:\n\n${article}`
  );

  // 3. Post to Slack
  await mastra.getTool('postSlack').execute({
    channel: '#content',
    message: summary,
  });

  // 4. Save to database
  await db.summary.create({
    data: { url: articleUrl, summary, createdAt: new Date() },
  });

  return Response.json({ success: true, summary });
}

// components/ChatInterface.tsx - UI
import { FloatingCedarChat } from 'cedar-os';

export function ChatInterface() {
  return (
    <FloatingCedarChat
      title="AI Assistant"
      agentContext={{
        systemPrompt: 'You are a helpful assistant.',
      }}
    />
  );
}
```
Key Responsibilities:
- User interfaces (web, mobile, CLI)
- API endpoints
- Business workflows
- Integration with existing systems
- Authentication and authorization (sketched below)
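The chat route shown earlier skips the last responsibility. A minimal sketch of gating it, assuming a hypothetical `verifySession` helper backed by whatever auth provider you use:

```typescript
// app/api/chat/route.ts — auth-gated variant (verifySession is illustrative)
import { mastra } from '@/mastra';
import { verifySession } from '@/lib/auth'; // hypothetical helper

export async function POST(request: Request) {
  // Reject unauthenticated callers before any agent work runs.
  const session = await verifySession(request.headers.get('authorization'));
  if (!session) {
    return Response.json({ error: 'Unauthorized' }, { status: 401 });
  }

  const { message, agentId } = await request.json();
  const agent = mastra.getAgent(agentId);
  const response = await agent.generate(message);
  return Response.json({ response });
}
```

Keeping the check in the route means the framework and tools layers never need to know about sessions at all.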
Complete Stack Example
```typescript
// ============================================
// Layer 1: LLM Provider Configuration
// ============================================
// lib/llm.ts
export const llmConfig = {
  primary: {
    provider: 'openai',
    model: 'gpt-4o',
    temperature: 0.3,
  },
  fallback: {
    provider: 'openai',
    model: 'gpt-4o-mini',
    temperature: 0.3,
  },
};

// ============================================
// Layer 2: Framework - Agent Definitions
// ============================================
// src/mastra/agents/customer-support.ts
import { Agent } from '@mastra/core';

export const customerSupportAgent = new Agent({
  name: 'Customer Support Agent',
  instructions: `You are a customer support agent.

Your capabilities:
- Look up customer information
- Check order status
- Create support tickets
- Escalate to human agents when needed`,
  model: {
    provider: 'OPEN_AI',
    name: 'gpt-4o',
  },
  tools: {
    getCustomer: true,
    getOrders: true,
    createTicket: true,
    escalateToHuman: true,
  },
});

// ============================================
// Layer 3: Tools
// ============================================
// src/mastra/tools/get-customer.ts
export const getCustomer = createTool({
  id: 'get-customer',
  description: 'Retrieve customer details by ID or email',
  inputSchema: z.object({
    customerId: z.string().optional(),
    email: z.string().email().optional(),
  }),
  execute: async ({ context }, args) => {
    const customer = await db.customer.findFirst({
      where: {
        OR: [
          { id: args.customerId },
          { email: args.email },
        ],
      },
      include: { orders: true },
    });
    return { success: true, data: customer };
  },
});

// ============================================
// Layer 4: Application
// ============================================
// app/api/support/chat/route.ts
export async function POST(request: Request) {
  const { message, sessionId } = await request.json();

  // Get conversation history
  const history = await getConversationHistory(sessionId);

  // Generate response with agent
  const response = await mastra
    .getAgent('customer-support')
    .generate(message, {
      context: { history },
    });

  // Save to history
  await saveToHistory(sessionId, { message, response });

  return Response.json({ response });
}

// components/SupportChat.tsx
export function SupportChat() {
  const [sessionId] = useState(() => crypto.randomUUID());

  return (
    <FloatingCedarChat
      title="Customer Support"
      endpoint="/api/support/chat"
      sessionId={sessionId}
    />
  );
}
```
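The support route assumes `getConversationHistory` and `saveToHistory` helpers, which are left undefined above. A minimal in-memory sketch (swap the `Map` for a database table in production):

```typescript
// lib/history.ts — in-memory sketch; use a real store in production
type Turn = { message: string; response: string };

const sessions = new Map<string, Turn[]>();

export async function getConversationHistory(sessionId: string): Promise<Turn[]> {
  return sessions.get(sessionId) ?? [];
}

export async function saveToHistory(sessionId: string, turn: Turn): Promise<void> {
  const history = sessions.get(sessionId) ?? [];
  history.push(turn);
  sessions.set(sessionId, history);
}
```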
Common Pitfalls
1. Skipping the Framework Layer
```typescript
// ❌ Direct LLM calls everywhere
const response1 = await openai.chat.completions.create(/*...*/);
const response2 = await openai.chat.completions.create(/*...*/);

// ✅ Use framework
const response = await agent.generate(prompt);
```
Why it matters: Frameworks provide orchestration, state, tools, memory.
2. Tools in Application Layer
```typescript
// ❌ Business logic mixed with tools
async function handleRequest(req) {
  const customer = await db.customer.findUnique(/*...*/);
  const orders = await db.order.findMany(/*...*/);
  const response = await agent.generate(/*...*/);
}

// ✅ Tools in Tools layer, business logic in Application
async function handleRequest(req) {
  const response = await agent.generate(req.message);
  // Agent uses getCustomer and getOrders tools
}
```
3. Hardcoded Model Selection
```typescript
// ❌ Model hardcoded
const agent = new Agent({
  model: { provider: 'OPEN_AI', name: 'gpt-4o' },
});

// ✅ Configurable
const agent = new Agent({
  model: getModelConfig(process.env.AGENT_MODEL),
});
```
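`getModelConfig` is left undefined above. One possible implementation, mapping an environment variable onto the provider/name pairs used elsewhere in this chapter, with a cheap default when the variable is unset (the mapping is illustrative):

```typescript
// lib/model-config.ts — illustrative mapping; adjust to your providers
const MODEL_CONFIGS: Record<string, { provider: string; name: string }> = {
  'gpt-4o': { provider: 'OPEN_AI', name: 'gpt-4o' },
  'gpt-4o-mini': { provider: 'OPEN_AI', name: 'gpt-4o-mini' },
  'claude-3-5-sonnet': { provider: 'ANTHROPIC', name: 'claude-3-5-sonnet-20241022' },
};

export function getModelConfig(model?: string) {
  // Fall back to the cheap model when the env var is unset or unknown.
  return MODEL_CONFIGS[model ?? ''] ?? MODEL_CONFIGS['gpt-4o-mini'];
}
```

Now switching a deployment to a different model is a config change, not a code change.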
4. No Layer Boundaries
```typescript
// ❌ Everything mixed
const agent = {
  async run(input) {
    const llm = await openai.create(/*...*/);
    const data = await db.query(/*...*/);
    await slack.post(/*...*/);
    return llm.response;
  },
};

// ✅ Clear boundaries
// Application calls Framework calls Tools calls LLM
```
Migration Path
From Monolith to Layered
Step 1: Extract LLM Layer
```typescript
// Before
const response = await openai.chat.completions.create(/*...*/);

// After
const llm = new LLMProvider();
const response = await llm.generate('gpt-4o', messages);
```
Step 2: Extract Tools
```typescript
// Before
const customer = await db.customer.findUnique(/*...*/);

// After
const getCustomer = createTool({/*...*/});
```
Step 3: Add Framework
```typescript
// Before
const response = await llm.generate(/*...*/);

// After
const agent = new Agent({/*...*/});
const response = await agent.generate(prompt);
```
Step 4: Clean Application Layer
```typescript
// Before
async function handler(req) {
  // 200 lines of mixed logic
}

// After
async function handler(req) {
  return await agent.generate(req.message);
}
```
Key Takeaways
- Four layers - LLM, Framework, Tools, Application
- Separation of concerns - Each layer has clear responsibilities
- Mastra is the framework - Orchestration, state, tools, memory
- Tools extend capabilities - Clean, reusable, testable
- Application layer is thin - Delegates to framework
The modern agentic stack enables scalable, maintainable, production-ready systems.