Chapter 14

Prompting Mastery

By Zavier Sanders · September 21, 2025

Advanced prompt engineering techniques with real-world examples and iteration strategies.

Prefer something you can ship today? Start with the Quickstart: Ship One Agent with Mastra, then come back here to deepen the concepts.

Structured Prompts

Well-structured prompts are the foundation of reliable agent behavior.

The Problem with Unstructured Prompts

// ❌ Bad: Vague, unstructured
const badPrompt = "Summarize this article";

// ❌ Bad: Kitchen sink
const messyPrompt = `
Summarize this article. Make it short. Include key points.
Don't be too technical. Be engaging. Focus on main ideas.
${article}
`;

Issues:

  • Inconsistent output
  • Model confusion about priorities
  • Hard to debug
  • Difficult to improve

The Structured Prompt Pattern

// ✅ Good: Clear structure
const structuredPrompt = `
# ROLE
You are a professional content analyst.

# TASK
Summarize the following article for a business audience.

# REQUIREMENTS
- Length: 2-3 paragraphs
- Focus: Key insights and actionable takeaways
- Tone: Professional but accessible
- Format: Markdown with bullet points for key takeaways

# INPUT
${article}

# OUTPUT FORMAT
## Summary
[2-3 paragraph summary]

## Key Takeaways
- [Takeaway 1]
- [Takeaway 2]
- [Takeaway 3]
`;

Template Anatomy

Every structured prompt should have:

1. Role Definition - Who is the AI?

# ROLE
You are a {role} with expertise in {domain}.
Your communication style is {style}.

2. Task Statement - What should they do?

# TASK
{Clear, specific task description}

3. Context - What do they need to know?

# CONTEXT
- {Relevant background}
- {Constraints}
- {Goals}

4. Requirements - How should they do it?

# REQUIREMENTS
- Length: {specification}
- Format: {structure}
- Tone: {voice}
- Constraints: {limitations}

5. Input - The actual data

# INPUT
{data}

6. Output Format - Expected structure

# OUTPUT FORMAT
{template or schema}
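
Assembled by hand, these sections tend to drift apart across agents. A minimal sketch of a section-based builder keeps them consistent; the PromptSections shape and buildPrompt name are illustrative, not from any framework:

// A hypothetical section-based builder; adapt the fields to your needs
interface PromptSections {
  role: string;
  task: string;
  context?: string[];
  requirements?: string[];
  input: string;
  outputFormat: string;
}

function buildPrompt(s: PromptSections): string {
  const parts = [`# ROLE\n${s.role}`, `# TASK\n${s.task}`];

  if (s.context?.length) {
    parts.push(`# CONTEXT\n${s.context.map((c) => `- ${c}`).join('\n')}`);
  }
  if (s.requirements?.length) {
    parts.push(
      `# REQUIREMENTS\n${s.requirements.map((r) => `- ${r}`).join('\n')}`
    );
  }

  parts.push(`# INPUT\n${s.input}`, `# OUTPUT FORMAT\n${s.outputFormat}`);
  return parts.join('\n\n');
}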

Real-World Example: Code Review Agent

const codeReviewPrompt = `
# ROLE
You are a senior software engineer conducting a code review.
You prioritize code quality, security, and maintainability.

# TASK
Review the provided code changes and provide constructive feedback.

# CONTEXT
- This is production code for a web application
- Team follows TypeScript best practices
- Security is a top priority
- Performance matters

# REQUIREMENTS
1. Identify bugs and security vulnerabilities
2. Suggest improvements for readability
3. Check for proper error handling
4. Verify type safety
5. Be constructive and specific

# INPUT
## File: ${fileName}
## Changes:
\`\`\`typescript
${diff}
\`\`\`

# OUTPUT FORMAT
## Critical Issues
- [Issue 1]: {description} → {suggestion}

## Improvements
- [Area]: {current approach} → {better approach}

## Positive Points
- [What was done well]

## Overall Assessment
{1-2 sentence summary}
`;

Prompt Sections Best Practices

Role:

  • Be specific about expertise level
  • Define personality/communication style
  • Set expectations for behavior

Task:

  • One clear objective
  • Measurable success criteria
  • Explicit constraints

Context:

  • Only relevant information
  • Business goals
  • User needs
  • Technical constraints

Requirements:

  • Numbered list for priority
  • Specific, measurable criteria
  • Edge cases to handle
  • What NOT to do

Output Format:

  • Use Markdown structure
  • Show example structure
  • Specify data types
  • Use XML/JSON for parsing

Context Management

Effective context management keeps agents focused and efficient.

The Context Window Challenge

// Models have limited context windows
// GPT-4o: 128k tokens (~96k words)
// GPT-4o-mini: 128k tokens
// Claude 3.5 Sonnet: 200k tokens

// Problem: Your agent needs to process 500 documents
// Solution: Smart context management
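
Before reaching for chunking, a rough token estimate tells you whether the content needs it at all. This sketch uses the same ~1.3 tokens-per-word approximation as the chunker below; a real tokenizer such as tiktoken gives exact counts, and fitsInContext is a hypothetical helper:

// Rough pre-flight check (approximation: tokens ≈ words × 1.3)
function fitsInContext(content: string, maxTokens: number): boolean {
  const estimatedTokens = Math.ceil(content.split(/\s+/).length * 1.3);
  return estimatedTokens <= maxTokens;
}

// Leave headroom for prompt scaffolding and the model's response;
// rawDocument stands in for your own content
const canSendDirectly = fitsInContext(rawDocument, 128_000 - 8_000);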

Chunking Strategy

// lib/context-chunker.ts
export class ContextChunker {
  chunk(
    content: string,
    maxTokens: number = 4000
  ): string[] {
    // Simple word-based chunking (tokens ≈ words × 1.3)
    const maxWords = Math.floor(maxTokens / 1.3);
    const words = content.split(/\s+/);
    const chunks: string[] = [];

    for (let i = 0; i < words.length; i += maxWords) {
      const chunk = words.slice(i, i + maxWords).join(' ');
      chunks.push(chunk);
    }

    return chunks;
  }

  semanticChunk(
    content: string,
    chunkSize: number = 1000
  ): string[] {
    // Split on natural boundaries
    const sections = content.split(/\n\n+/);
    const chunks: string[] = [];
    let currentChunk = '';

    for (const section of sections) {
      if ((currentChunk + section).length > chunkSize && currentChunk) {
        chunks.push(currentChunk.trim());
        currentChunk = section;
      } else {
        // Avoid a leading separator when the chunk is still empty
        currentChunk = currentChunk
          ? currentChunk + '\n\n' + section
          : section;
      }
    }

    if (currentChunk) {
      chunks.push(currentChunk.trim());
    }

    return chunks;
  }
}
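
In practice, semanticChunk is usually the better default because it respects paragraph boundaries. Note its size parameter is in characters, while chunk() budgets tokens (longDocument stands in for your own content):

const chunker = new ContextChunker();
const chunks = chunker.semanticChunk(longDocument, 2000);
console.log(`Split into ${chunks.length} chunks of ~2000 characters`);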

Hierarchical Summarization

// Process large documents hierarchically
export async function summarizeLargeDocument(
  document: string
): Promise<string> {
  const chunker = new ContextChunker();
  const chunks = chunker.semanticChunk(document, 3000);

  // Step 1: Summarize each chunk
  const chunkSummaries = await Promise.all(
    chunks.map(async (chunk) => {
      const result = await agent.generate(`
# TASK
Summarize this section concisely.

# SECTION
${chunk}

# OUTPUT
2-3 sentences capturing the main points.
      `);
      
      return result;
    })
  );

  // Step 2: Combine summaries
  const combined = chunkSummaries.join('\n\n');

  // Step 3: Final summary
  const finalSummary = await agent.generate(`
# TASK
Create a comprehensive summary from these section summaries.

# SECTION SUMMARIES
${combined}

# OUTPUT
A cohesive 2-3 paragraph summary of the entire document.
  `);

  return finalSummary;
}
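
This is a classic map-reduce pattern: the map step (per-chunk summaries) runs in parallel via Promise.all, and the reduce step spends one more model call stitching the partial summaries into a coherent whole. If the combined summaries still exceed the context window, apply the reduce step recursively until they fit.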

Context Prioritization
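
The prioritizer below scores each piece of context and keeps the best within a token budget. It assumes a simple Context shape along these lines (illustrative; adapt the fields to your own data):

interface Context {
  content: string;
  timestamp: Date;
  priority?: 'high' | 'medium' | 'low';
  type: 'user_message' | 'error' | 'crm' | 'document';
}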

export class ContextPrioritizer {
  prioritize(
    contexts: Context[],
    maxTokens: number
  ): Context[] {
    // Score each context by relevance
    const scored = contexts.map((ctx) => ({
      context: ctx,
      score: this.calculateRelevance(ctx),
      tokens: this.estimateTokens(ctx.content),
    }));

    // Sort by score descending
    scored.sort((a, b) => b.score - a.score);

    // Take highest scoring contexts within token limit
    const selected: Context[] = [];
    let totalTokens = 0;

    for (const item of scored) {
      if (totalTokens + item.tokens <= maxTokens) {
        selected.push(item.context);
        totalTokens += item.tokens;
      }
    }

    return selected;
  }

  private calculateRelevance(context: Context): number {
    let score = 0;

    // Recency
    const age = Date.now() - context.timestamp.getTime();
    const daysSinceCreation = age / (1000 * 60 * 60 * 24);
    score += Math.max(0, 10 - daysSinceCreation);

    // Importance
    if (context.priority === 'high') score += 20;
    if (context.priority === 'medium') score += 10;

    // Type
    if (context.type === 'user_message') score += 15;
    if (context.type === 'error') score += 12;

    return score;
  }

  private estimateTokens(text: string): number {
    // Rough estimate: 1 token ≈ 0.75 words
    return Math.ceil(text.split(/\s+/).length * 1.3);
  }
}

Dynamic Context Injection

export class DynamicContextBuilder {
  async buildContext(
    userQuery: string,
    availableContext: Context[]
  ): Promise<string> {
    // Analyze query to determine needed context
    const analysis = await this.analyzeQuery(userQuery);

    const relevantContext = availableContext.filter((ctx) => {
      // Time-based filtering
      if (analysis.timeframe === 'recent' && this.isOld(ctx)) {
        return false;
      }

      // Type-based filtering
      if (analysis.needsCRM && ctx.type !== 'crm') {
        return false;
      }

      return true;
    });

    // Prioritize and fit in context window
    const prioritizer = new ContextPrioritizer();
    const selected = prioritizer.prioritize(relevantContext, 4000);

    // Format for prompt
    return this.formatContext(selected);
  }

  // Assumed helpers (illustrative): keyword-based query analysis and a
  // simple recency cutoff; swap in your own logic or an LLM classifier.
  private async analyzeQuery(
    query: string
  ): Promise<{ timeframe: 'recent' | 'any'; needsCRM: boolean }> {
    return {
      timeframe: /recent|today|latest/i.test(query) ? 'recent' : 'any',
      needsCRM: /customer|account|deal|crm/i.test(query),
    };
  }

  private isOld(ctx: Context): boolean {
    const SEVEN_DAYS = 7 * 24 * 60 * 60 * 1000;
    return Date.now() - ctx.timestamp.getTime() > SEVEN_DAYS;
  }

  private formatContext(contexts: Context[]): string {
    return contexts
      .map((ctx) => `
## ${ctx.type.toUpperCase()} (${ctx.timestamp.toLocaleDateString()})
${ctx.content}
      `)
      .join('\n\n---\n\n');
  }
}

Few-Shot Examples

Examples dramatically improve output quality and consistency.

Zero-Shot vs Few-Shot

// ❌ Zero-shot: No examples
const zeroShot = `
Classify this email as: sales, support, billing, or general.

Email: ${email}
`;

// ✅ Few-shot: Include examples
const fewShot = `
Classify emails into: sales, support, billing, or general.

# EXAMPLES

Email: "I'm interested in the enterprise plan pricing"
Classification: sales

Email: "My login isn't working"
Classification: support

Email: "When will I be charged for my subscription?"
Classification: billing

Email: "What are your office hours?"
Classification: general

# NOW CLASSIFY

Email: ${email}
Classification:
`;

Few-Shot Best Practices

1. Use 3-5 Examples

// Too few (1-2): Not enough pattern
// Too many (10+): Waste tokens, may confuse
// Sweet spot: 3-5 diverse examples

2. Cover Edge Cases

const examples = [
  // Typical case
  { input: 'Normal scenario', output: 'Expected result' },
  
  // Edge case 1
  { input: 'Ambiguous scenario', output: 'How to handle' },
  
  // Edge case 2
  { input: 'Complex scenario', output: 'Detailed handling' },
  
  // Counter-example
  { input: 'What NOT to do', output: 'Correct approach' },
];

3. Show Output Format

const fewShotWithFormat = `
# EXAMPLES

Input: "The product launch was delayed due to supply chain issues"
Output:
\`\`\`json
{
  "summary": "Product launch delayed",
  "reason": "Supply chain issues",
  "impact": "medium",
  "actionable": true
}
\`\`\`

Input: "Team meeting scheduled for next Tuesday"
Output:
\`\`\`json
{
  "summary": "Meeting scheduled",
  "reason": null,
  "impact": "low",
  "actionable": false
}
\`\`\`

# YOUR TURN
Input: ${input}
Output:
`;

Dynamic Few-Shot Selection

import OpenAI from 'openai';

const openai = new OpenAI();

interface Example {
  input: string;
  output: string;
  embedding: number[]; // precomputed with the same embedding model
}

export class FewShotSelector {
  async selectExamples(
    query: string,
    exampleBank: Example[],
    k: number = 3
  ): Promise<Example[]> {
    // Embed the query
    const queryEmbedding = await this.embed(query);

    // Calculate similarity to all examples
    const withSimilarity = exampleBank.map((example) => ({
      example,
      similarity: this.cosineSimilarity(
        queryEmbedding,
        example.embedding
      ),
    }));

    // Return top K most similar
    return withSimilarity
      .sort((a, b) => b.similarity - a.similarity)
      .slice(0, k)
      .map((item) => item.example);
  }

  private async embed(text: string): Promise<number[]> {
    const response = await openai.embeddings.create({
      model: 'text-embedding-3-small',
      input: text,
    });

    return response.data[0].embedding;
  }

  private cosineSimilarity(a: number[], b: number[]): number {
    const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
    const magA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
    const magB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
    
    return dotProduct / (magA * magB);
  }
}

// Usage
const selector = new FewShotSelector();
const relevantExamples = await selector.selectExamples(
  userQuery,
  exampleBank,
  3
);

const prompt = buildPromptWithExamples(userQuery, relevantExamples);
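
buildPromptWithExamples is assumed above; a minimal version just splices the selected examples into the few-shot structure from the email classification example:

function buildPromptWithExamples(
  query: string,
  examples: Example[]
): string {
  const exampleBlock = examples
    .map((ex) => `Email: "${ex.input}"\nClassification: ${ex.output}`)
    .join('\n\n');

  return `Classify emails into: sales, support, billing, or general.

# EXAMPLES

${exampleBlock}

# NOW CLASSIFY

Email: ${query}
Classification:`;
}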

Error Handling in Prompts

Teach agents how to handle errors and edge cases through prompts.

Explicit Error Handling

const promptWithErrorHandling = `
# TASK
Extract company information from text.

# INSTRUCTIONS
1. Look for company name, industry, location, employee count
2. If information is missing, use "Unknown" - DO NOT guess
3. If text is not about a company, return error
4. If text is unclear, ask for clarification

# EDGE CASES

## Missing Information
If any field is not mentioned:
\`\`\`json
{ "company_name": "Acme Corp", "industry": "Unknown", ... }
\`\`\`

## Not a Company
If text doesn't describe a company:
\`\`\`json
{ "error": "Not a company description" }
\`\`\`

## Ambiguous
If multiple interpretations exist:
\`\`\`json
{ "error": "Ambiguous - please clarify if you mean X or Y" }
\`\`\`

# INPUT
${text}

# OUTPUT
Return JSON with company info or error.
`;
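
On the consuming side you still have to handle both shapes the prompt allows. A sketch, assuming the model returns a single bare JSON object (the ExtractionResult type and extractCompany name are illustrative):

interface ExtractionResult {
  company_name?: string;
  industry?: string;
  location?: string;
  employee_count?: string;
  error?: string;
}

async function extractCompany(): Promise<ExtractionResult> {
  // promptWithErrorHandling embeds the input text as shown above
  const raw = await agent.generate(promptWithErrorHandling);

  try {
    const parsed = JSON.parse(raw) as ExtractionResult;
    if (parsed.error) {
      console.warn('Extraction declined:', parsed.error);
    }
    return parsed;
  } catch {
    // The model ignored the format instructions; surface it as an error
    return { error: 'Unparseable model output' };
  }
}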

Validation Prompts

export async function extractWithValidation(
  text: string
): Promise<CompanyInfo> {
  // Step 1: Extract
  const extraction = await agent.generate(extractionPrompt);

  // Step 2: Validate
  const validation = await agent.generate(`
# TASK
Validate this extracted company information.

# ORIGINAL TEXT
${text}

# EXTRACTED DATA
${extraction}

# VALIDATION RULES
1. Company name must be explicitly mentioned in text
2. Industry must match actual business description
3. Numbers must be accurate (don't round or estimate)
4. Location must be specific (not just "USA")

# INSTRUCTIONS
Check each field against the original text.
For any field that seems incorrect or guessed, mark it as "needs_review".

# OUTPUT FORMAT
\`\`\`json
{
  "valid": true/false,
  "issues": ["list of problems found"],
  "corrections": { "field": "corrected value" }
}
\`\`\`
  `);

  // Step 3: Apply corrections (standalone helper, sketched below)
  return applyCorrections(extraction, validation);
}
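
applyCorrections is referenced above but not shown. A minimal sketch that merges the validator's corrections back into the extraction; it assumes both model calls return bare JSON as instructed:

function applyCorrections(
  extraction: string,
  validation: string
): CompanyInfo {
  const data = JSON.parse(extraction) as CompanyInfo;
  const check = JSON.parse(validation) as {
    valid: boolean;
    issues: string[];
    corrections?: Record<string, string>;
  };

  if (!check.valid && check.corrections) {
    // Overwrite only the fields the validator flagged
    Object.assign(data, check.corrections);
  }

  return data;
}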

Self-Correction Prompts

const selfCorrectionPrompt = `
# TASK
Analyze the following article and extract key points.

# PROCESS
1. Read the article carefully
2. Extract 5 key points
3. Review your extraction for:
   - Accuracy (did you misrepresent anything?)
   - Completeness (did you miss major points?)
   - Clarity (are points clear and specific?)
4. Revise any points that don't meet these standards

# ARTICLE
${article}

# OUTPUT FORMAT
## Initial Extraction
1. [Point 1]
2. [Point 2]
...

## Self-Review
- Issues found: [list]
- Corrections needed: [list]

## Final Key Points
1. [Revised point 1]
2. [Revised point 2]
...
`;

Iteration and A/B Testing

Systematic improvement through testing and iteration.

Prompt Versioning

// prompts/summarizer.ts
export const summarizerPrompts = {
  v1: {
    version: '1.0.0',
    prompt: `Summarize this article: ${article}`,
    performance: { accuracy: 0.65, speed: 'fast' },
  },
  
  v2: {
    version: '2.0.0',
    prompt: `
# TASK
Summarize the article below for a business audience.

# REQUIREMENTS
- 2-3 paragraphs
- Focus on key insights

# ARTICLE
${article}
    `,
    performance: { accuracy: 0.78, speed: 'medium' },
  },
  
  v3: {
    version: '3.0.0',
    prompt: `
# ROLE
You are a business analyst summarizing market research.

# TASK
Summarize for executives who need actionable insights.

# FORMAT
- Executive Summary (2-3 sentences)
- Key Insights (3-5 bullet points)
- Implications (2-3 sentences)

# ARTICLE
${article}
    `,
    performance: { accuracy: 0.89, speed: 'medium' },
  },
} as const;

A/B Testing Framework

// lib/prompt-tester.ts
export class PromptTester {
  async abTest(
    variantA: string,
    variantB: string,
    testCases: TestCase[]
  ): Promise<TestResults> {
    const resultsA: Result[] = [];
    const resultsB: Result[] = [];

    for (const testCase of testCases) {
      // Run variant A
      const resultA = await this.runTest(variantA, testCase);
      resultsA.push(resultA);

      // Run variant B
      const resultB = await this.runTest(variantB, testCase);
      resultsB.push(resultB);
    }

    return {
      variantA: this.calculateMetrics(resultsA),
      variantB: this.calculateMetrics(resultsB),
      winner: this.determineWinner(resultsA, resultsB),
    };
  }

  private calculateMetrics(results: Result[]): Metrics {
    const scores = results.map((r) => r.score);
    const latencies = results.map((r) => r.latency);

    return {
      avgScore: scores.reduce((a, b) => a + b, 0) / scores.length,
      avgLatency: latencies.reduce((a, b) => a + b, 0) / latencies.length,
      successRate: results.filter((r) => r.success).length / results.length,
    };
  }

  // Simple decision rule: the higher average score wins. For production,
  // prefer a proper statistical test before declaring a winner.
  private determineWinner(
    resultsA: Result[],
    resultsB: Result[]
  ): 'A' | 'B' {
    const avg = (rs: Result[]) =>
      rs.reduce((sum, r) => sum + r.score, 0) / rs.length;
    return avg(resultsA) >= avg(resultsB) ? 'A' : 'B';
  }

  private async runTest(
    prompt: string,
    testCase: TestCase
  ): Promise<Result> {
    const start = Date.now();

    try {
      const output = await agent.generate(prompt);
      const score = await this.scoreOutput(output, testCase.expected);

      return {
        success: true,
        score,
        latency: Date.now() - start,
        output,
      };
    } catch (error) {
      return {
        success: false,
        score: 0,
        latency: Date.now() - start,
        error: (error as Error).message,
      };
    }
  }

  private async scoreOutput(
    output: string,
    expected: string
  ): Promise<number> {
    // Use another LLM to score quality (0-1)
    const result = await scoringAgent.generate(`
Rate how well this output matches the expected result (0-1 score).

Expected: ${expected}
Actual: ${output}

Score (0-1):
    `);

    // Guard against non-numeric responses from the scoring model
    const score = parseFloat(result);
    return Number.isFinite(score) ? Math.min(1, Math.max(0, score)) : 0;
  }
}

Evaluation Dataset

// test/prompts/summarizer.test.ts
const evaluationDataset: TestCase[] = [
  {
    id: 'tech-article-1',
    input: 'Long technical article about AI...',
    expected: 'Should focus on technical insights and innovations',
    criteria: ['accuracy', 'technical_depth', 'clarity'],
  },
  {
    id: 'business-article-1',
    input: 'Business strategy article...',
    expected: 'Should focus on business implications and strategy',
    criteria: ['business_focus', 'actionability', 'clarity'],
  },
  {
    id: 'edge-case-1',
    input: 'Very short article with minimal content...',
    expected: 'Should handle gracefully, not hallucinate',
    criteria: ['accuracy', 'no_hallucination'],
  },
];

// Run evaluation
const tester = new PromptTester();
const results = await tester.abTest(
  summarizerPrompts.v2.prompt,
  summarizerPrompts.v3.prompt,
  evaluationDataset
);

console.log('Results:', results);
// Deploy winner to production
if (results.winner === 'B') {
  deployPrompt(summarizerPrompts.v3);
}

Continuous Improvement

export class PromptMonitor {
  async trackProduction(
    promptVersion: string,
    input: string,
    output: string,
    userFeedback?: number
  ): Promise<void> {
    await db.promptExecution.create({
      data: {
        promptVersion,
        input: input.substring(0, 1000),
        output: output.substring(0, 1000),
        userFeedback,
        timestamp: new Date(),
      },
    });
  }

  async analyzePerformance(
    promptVersion: string,
    timeframe: 'day' | 'week' | 'month'
  ): Promise<PerformanceReport> {
    const executions = await this.getExecutions(promptVersion, timeframe);

    return {
      totalExecutions: executions.length,
      avgUserRating: this.avgRating(executions),
      errorRate: this.errorRate(executions),
      commonIssues: this.identifyIssues(executions),
      recommendations: this.generateRecommendations(executions),
    };
  }

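  // NOTE: getExecutions, avgRating, errorRate, identifyIssues, and
  // findPatterns are assumed storage/analysis helpers, elided here.
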
  private generateRecommendations(
    executions: Execution[]
  ): string[] {
    const recommendations: string[] = [];

    // Analyze failures
    const failures = executions.filter(
      (e) => e.userFeedback !== undefined && e.userFeedback < 3
    );
    if (failures.length > executions.length * 0.1) {
      recommendations.push(
        'High failure rate - consider adding more examples or clearer instructions'
      );
    }

    // Analyze patterns
    const commonFailures = this.findPatterns(failures);
    if (commonFailures.length > 0) {
      recommendations.push(
        `Common failure pattern: ${commonFailures[0]} - add specific handling`
      );
    }

    return recommendations;
  }
}
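
Wiring the monitor into the request path might look like this (userInput, agentOutput, and userRating come from your own app):

const monitor = new PromptMonitor();

// After each production call
await monitor.trackProduction('3.0.0', userInput, agentOutput, userRating);

// On a schedule (e.g. a daily cron job)
const report = await monitor.analyzePerformance('3.0.0', 'week');
console.log(report.recommendations);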

Production Prompt Template

Expressed as a reusable builder rather than a loose template literal (the PromptParts shape is illustrative):

interface PromptParts {
  role: string;
  task: string;
  context: string;
  requirements: string[];
  examples: { input: string; output: string }[];
  errorHandling: { condition: string; response: string }[];
  input: string;
  outputFormat: string;
  validationChecks: string[];
}

export function buildProductionPrompt(p: PromptParts): string {
  return `
# ROLE
${p.role}

# TASK
${p.task}

# CONTEXT
${p.context}

# REQUIREMENTS
${p.requirements.map((r, i) => `${i + 1}. ${r}`).join('\n')}

# EXAMPLES
${p.examples.map((ex) => `
Input: ${ex.input}
Output: ${ex.output}
`).join('\n')}

# ERROR HANDLING
${p.errorHandling.map((e) => `- If ${e.condition}: ${e.response}`).join('\n')}

# INPUT
${p.input}

# OUTPUT FORMAT
${p.outputFormat}

# VALIDATION
Before responding, verify:
${p.validationChecks.map((c, i) => `${i + 1}. ${c}`).join('\n')}
`;
}
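
Filling the builder for the v3 summarizer from earlier might look like:

const prompt = buildProductionPrompt({
  role: 'You are a business analyst summarizing market research.',
  task: 'Summarize for executives who need actionable insights.',
  context: 'Quarterly market research for the leadership team.',
  requirements: ['2-3 paragraphs', 'Focus on actionable insights'],
  examples: [],
  errorHandling: [
    { condition: 'the article is empty', response: 'return an error message' },
  ],
  input: article,
  outputFormat: 'Executive Summary, Key Insights, Implications',
  validationChecks: ['Every insight is supported by the article'],
});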

Key Takeaways

  1. Structure your prompts - Role, task, context, requirements, format
  2. Manage context wisely - Chunk, prioritize, inject dynamically
  3. Use few-shot examples - 3-5 diverse examples dramatically improve quality
  4. Handle errors in prompts - Teach agents how to fail gracefully
  5. Test and iterate - A/B test, measure, improve continuously

Prompting is software engineering for AI systems!
