Why This Matters
Your agent needs to run 24/7, serve real users, and handle production traffic.
Local vs Production
✅ Local development
- Single user (you)
- Can restart anytime
- No cost concerns
- Manual testing

🚀 Production
- Thousands of users
- Must be reliable
- Cost matters
- Automated everything
Deployment Strategies
Recommended: Serverless (Vercel + Mastra)
Why serverless:
- Auto-scales to zero and infinity
- Pay per request
- No server management
- Global edge deployment
```bash
# Your Next.js app deploys to Vercel
# Agents run as serverless API routes
# Mastra handles orchestration
vercel deploy
```
Step 1: Environment Setup
Required Environment Variables
```bash
# .env.production

# LLM keys
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-...

# Database
DATABASE_URL=postgresql://...

# Optional: observability
SENTRY_DSN=https://...

# Optional: authentication
NEXTAUTH_SECRET=...
NEXTAUTH_URL=https://yourapp.com
```
Vercel Environment Variables
```bash
# Add secrets to Vercel
vercel env add OPENAI_API_KEY
vercel env add DATABASE_URL
vercel env add SENTRY_DSN

# Or via the Vercel dashboard:
# Project Settings > Environment Variables
```
Step 2: Database Setup
Recommended: Neon (Serverless Postgres)
```bash
# 1. Create a Neon project at neon.tech

# 2. Get the connection string
DATABASE_URL="postgresql://user:pass@xxx.neon.tech/dbname?sslmode=require"

# 3. Run migrations
npx prisma migrate deploy

# 4. Generate the client
npx prisma generate
```
Prisma Configuration
```prisma
// prisma/schema.prisma
generator client {
  provider = "prisma-client-js"
}

datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
}

// Connection pooling for serverless:
// append to DATABASE_URL: ?connection_limit=1&pool_timeout=0
```
Step 3: Deploy Frontend (Vercel)
Next.js Configuration
```javascript
// next.config.mjs
/** @type {import('next').NextConfig} */
const nextConfig = {
  // Enable Server Actions (only needed as an experimental flag
  // on Next.js 13; stable and on by default from Next.js 14)
  experimental: {
    serverActions: true,
  },
};

export default nextConfig;
```
Deploy
```bash
# Install the Vercel CLI
npm i -g vercel

# Login
vercel login

# Preview deployment
vercel

# Production deployment
vercel --prod
```
Custom Domain
```bash
# Add a domain via the Vercel dashboard:
# Project Settings > Domains > Add Domain

# Or via the CLI
vercel domains add yourapp.com
```
Step 4: Deploy Agents (Mastra API Routes)
Agent API Endpoint
```typescript
// app/api/agents/[agentId]/generate/route.ts
import { mastra } from '@/mastra';

export async function POST(
  request: Request,
  { params }: { params: { agentId: string } }
) {
  try {
    const { message } = await request.json();
    const agent = mastra.getAgent(params.agentId);

    if (!agent) {
      return Response.json(
        { error: 'Agent not found' },
        { status: 404 }
      );
    }

    const response = await agent.generate(message);
    return Response.json({ response });
  } catch (error) {
    console.error('Agent error:', error);
    return Response.json(
      { error: 'Internal server error' },
      { status: 500 }
    );
  }
}
```
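From the browser or another service, the endpoint above is just a JSON POST. A minimal client helper might look like this — `agentUrl` and `callAgent` are hypothetical names, and the `{ response }` shape simply mirrors what the route handler returns:

```typescript
// Build the route path that matches app/api/agents/[agentId]/generate/route.ts
export function agentUrl(agentId: string, base = ''): string {
  return `${base}/api/agents/${encodeURIComponent(agentId)}/generate`;
}

// POST a message and unwrap the { response } body the route returns
export async function callAgent(
  agentId: string,
  message: string,
  base = ''
): Promise<string> {
  const res = await fetch(agentUrl(agentId, base), {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message }),
  });
  if (!res.ok) throw new Error(`Agent request failed: ${res.status}`);
  const data = await res.json();
  return data.response;
}
```

Encoding the agent id keeps the URL valid even if ids ever contain spaces or slashes.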
Stream Endpoint
```typescript
// app/api/agents/[agentId]/stream/route.ts
import { mastra } from '@/mastra';

export async function POST(
  request: Request,
  { params }: { params: { agentId: string } }
) {
  const { message } = await request.json();
  const agent = mastra.getAgent(params.agentId);

  // Depending on your Mastra version, you may need to pass
  // the result's text stream (e.g. stream.textStream) instead
  const stream = await agent.stream(message);

  return new Response(stream, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive',
    },
  });
}
```
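On the client, the `text/event-stream` response is read incrementally. This sketch assumes the server emits standard SSE frames (`data: <chunk>\n\n`); `extractSseData` and `readAgentStream` are illustrative helpers, not part of Mastra:

```typescript
// Pull the payloads out of one or more SSE frames
export function extractSseData(frame: string): string[] {
  return frame
    .split('\n')
    .filter((line) => line.startsWith('data: '))
    .map((line) => line.slice('data: '.length));
}

// POST the message, then feed each streamed chunk to a callback
export async function readAgentStream(
  url: string,
  message: string,
  onChunk: (text: string) => void
): Promise<void> {
  const res = await fetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    for (const chunk of extractSseData(decoder.decode(value, { stream: true }))) {
      onChunk(chunk); // e.g. append to the chat UI
    }
  }
}
```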
Step 5: Scheduled Jobs (Cron)
Configure Cron Jobs
```json
// vercel.json
{
  "crons": [
    {
      "path": "/api/cron/daily-summary",
      "schedule": "0 9 * * *"
    },
    {
      "path": "/api/cron/poll-feeds",
      "schedule": "*/15 * * * *"
    }
  ]
}
```
Cron Handler
```typescript
// app/api/cron/daily-summary/route.ts
import { mastra } from '@/mastra';

export async function GET(request: Request) {
  // Verify the Vercel cron secret
  const authHeader = request.headers.get('authorization');
  if (authHeader !== `Bearer ${process.env.CRON_SECRET}`) {
    return Response.json({ error: 'Unauthorized' }, { status: 401 });
  }

  try {
    // Run the workflow
    const result = await mastra
      .getWorkflow('daily-summary')
      .execute({ date: new Date() });

    return Response.json({ success: true, result });
  } catch (error) {
    console.error('Cron job failed:', error);
    return Response.json({ error: 'Failed' }, { status: 500 });
  }
}
```
Step 6: Monitoring
Error Tracking
```typescript
// lib/sentry.ts
import * as Sentry from '@sentry/nextjs';

export function initSentry() {
  Sentry.init({
    dsn: process.env.SENTRY_DSN,
    environment: process.env.VERCEL_ENV,
    tracesSampleRate: 0.1,
  });
}

// Call initSentry() once on the client — useEffect only works in a
// 'use client' component, so render a small one from app/layout.tsx:
useEffect(() => {
  initSentry();
}, []);
```
Health Checks
```typescript
// app/api/health/route.ts
// checkDatabase / checkOpenAI are helpers you define for your stack
export async function GET() {
  const checks = {
    database: await checkDatabase(),
    openai: await checkOpenAI(),
  };

  const isHealthy = Object.values(checks).every((c) => c === 'up');

  return Response.json({
    status: isHealthy ? 'healthy' : 'unhealthy',
    checks,
  });
}
```
Going Live Checklist
Pre-Launch
- [ ] Environment variables set
- [ ] Database migrated
- [ ] All tests passing
- [ ] Error tracking configured (Sentry)
- [ ] Health checks working
- [ ] Custom domain configured
- [ ] SSL certificate active
Launch
```bash
# 1. Deploy to production
vercel --prod

# 2. Verify the deployment
curl https://yourapp.com/api/health

# 3. Test critical flows
# - User can chat with agent
# - Agent can use tools
# - Cron jobs running

# 4. Monitor for errors
# Check the Sentry dashboard
```
Post-Launch
- [ ] Monitor error rates (first hour)
- [ ] Check response times
- [ ] Verify cron jobs executing
- [ ] Monitor costs (first day)
- [ ] Test from different locations
Common Issues & Solutions
Issue: Database Connection Errors
```typescript
// Problem: too many connections

// Solution 1: connection pooling via the URL
// DATABASE_URL="postgresql://...?connection_limit=1&pool_timeout=0"

// Solution 2: reuse a single PrismaClient across invocations
import { PrismaClient } from '@prisma/client';

const globalForPrisma = global as unknown as { prisma: PrismaClient };

export const db = globalForPrisma.prisma || new PrismaClient();

if (process.env.NODE_ENV !== 'production') {
  globalForPrisma.prisma = db;
}
```
Issue: Cold Starts
```typescript
// Problem: the first request takes 5+ seconds

// Solution 1: keep functions warm with a cron job
// vercel.json:
// {
//   "crons": [{ "path": "/api/warmup", "schedule": "*/5 * * * *" }]
// }

// Solution 2: use the edge runtime (faster cold starts)
export const runtime = 'edge';
```
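The `/api/warmup` path referenced by that cron entry needs a route handler of its own. Hitting the function is the whole point, so a minimal sketch is enough (the response body shown here is arbitrary):

```typescript
// app/api/warmup/route.ts
// Invoked every 5 minutes by the cron above; loading the function
// keeps the serverless instance warm, so a tiny JSON reply suffices.
export async function GET(): Promise<Response> {
  return Response.json({ ok: true, warmedAt: new Date().toISOString() });
}
```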
Issue: Timeout Errors
```typescript
// Problem: the agent takes longer than 30s (Vercel hobby limit)

// Solution 1: upgrade to Vercel Pro (60s timeout)

// Solution 2: use streaming
export async function POST(request: Request) {
  const stream = await agent.stream(message);
  return new Response(stream);
}

// Solution 3: use a queue for long-running tasks
await queue.add('long-task', { input });
return Response.json({ jobId: 'xxx' });
```
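The queue pattern in Solution 3 boils down to: return a job id immediately, do the work in the background, and let the client poll a status endpoint. This sketch uses an in-memory `Map` to show the shape only — it survives just within one warm instance, so in production you would back it with Redis or a real queue (BullMQ, Inngest, etc.); `submitJob` and `getJob` are hypothetical names:

```typescript
import { randomUUID } from 'node:crypto';

type Job<T> = { status: 'pending' | 'done' | 'failed'; result?: T };

const jobs = new Map<string, Job<string>>();

// Kick off the work and return a job id without awaiting the result
export function submitJob(run: () => Promise<string>): string {
  const id = randomUUID();
  jobs.set(id, { status: 'pending' });
  run()
    .then((result) => jobs.set(id, { status: 'done', result }))
    .catch(() => jobs.set(id, { status: 'failed' }));
  return id;
}

// Polled by a status endpoint, e.g. GET /api/jobs/[id]
export function getJob(id: string): Job<string> | undefined {
  return jobs.get(id);
}
```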
Issue: High Costs
```typescript
// Problem: $100/day in API costs

// Solution 1: use cheaper models
const agent = new Agent({
  model: {
    provider: 'OPEN_AI',
    name: 'gpt-4o-mini', // ~15x cheaper than gpt-4o
  },
});

// Solution 2: cache responses
const cached = await redis.get(cacheKey);
if (cached) return cached;

// Solution 3: rate limiting
import { Ratelimit } from '@upstash/ratelimit';
```
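To make the caching idea in Solution 2 concrete, here is a minimal in-memory TTL cache standing in for the `redis.get`/`redis.set` calls. All names (`cacheKey`, `getCached`, `setCached`) are illustrative; in a real deployment you would use Upstash Redis so the cache survives across serverless instances:

```typescript
type Entry = { value: string; expiresAt: number };

const cache = new Map<string, Entry>();

// Identical prompts to the same agent share one cache slot
export function cacheKey(agentId: string, message: string): string {
  return `${agentId}:${message}`;
}

// Return the cached value, evicting it if the TTL has elapsed
export function getCached(key: string, now = Date.now()): string | undefined {
  const entry = cache.get(key);
  if (!entry) return undefined;
  if (entry.expiresAt <= now) {
    cache.delete(key);
    return undefined;
  }
  return entry.value;
}

export function setCached(
  key: string,
  value: string,
  ttlMs: number,
  now = Date.now()
): void {
  cache.set(key, { value, expiresAt: now + ttlMs });
}
```

Checking the cache before calling `agent.generate` skips the LLM call entirely on a hit, which is where the savings come from.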
Scaling Considerations
Phase 1: 0-100 Users
- ✅ Vercel hobby plan ($0)
- ✅ Neon free tier
- ✅ Basic monitoring
Phase 2: 100-1,000 Users
- ✅ Vercel Pro ($20/month)
- ✅ Neon scale tier
- ✅ Redis caching (Upstash)
- ✅ Better error tracking
Phase 3: 1,000+ Users
- ✅ Vercel Enterprise
- ✅ Connection pooling
- ✅ Multiple agent replicas
- ✅ Advanced monitoring
- ✅ A/B testing
Key Takeaways
- Serverless first - Vercel for easy deployment and scaling
- Environment variables - Never hardcode secrets
- Database pooling - Essential for serverless
- Error tracking - Sentry catches production issues
- Health checks - Monitor system status
- Start small - Scale infrastructure as you grow
Your agent is ready for production! 🚀