## Structured Prompts
Well-structured prompts are the foundation of reliable agent behavior.
### The Problem with Unstructured Prompts
```typescript
// ❌ Bad: Vague, unstructured
const badPrompt = "Summarize this article";

// ❌ Bad: Kitchen sink
const messyPrompt = `
Summarize this article. Make it short. Include key points.
Don't be too technical. Be engaging. Focus on main ideas.
${article}
`;
```
Issues:
- Inconsistent output
- Model confusion about priorities
- Hard to debug
- Difficult to improve
### The Structured Prompt Pattern

```typescript
// ✅ Good: Clear structure
const structuredPrompt = `
# ROLE
You are a professional content analyst.

# TASK
Summarize the following article for a business audience.

# REQUIREMENTS
- Length: 2-3 paragraphs
- Focus: Key insights and actionable takeaways
- Tone: Professional but accessible
- Format: Markdown with bullet points for key takeaways

# INPUT
${article}

# OUTPUT FORMAT
## Summary
[2-3 paragraph summary]

## Key Takeaways
- [Takeaway 1]
- [Takeaway 2]
- [Takeaway 3]
`;
```
### Template Anatomy
Every structured prompt should have six sections (a builder sketch that assembles them follows the list):
1. Role Definition - Who is the AI?
```
# ROLE
You are a {role} with expertise in {domain}.
Your communication style is {style}.
```
2. Task Statement - What should they do?
```
# TASK
{Clear, specific task description}
```
3. Context - What do they need to know?
```
# CONTEXT
- {Relevant background}
- {Constraints}
- {Goals}
```
4. Requirements - How should they do it?
```
# REQUIREMENTS
- Length: {specification}
- Format: {structure}
- Tone: {voice}
- Constraints: {limitations}
```
5. Input - The actual data
```
# INPUT
{data}
```
6. Output Format - Expected structure
```
# OUTPUT FORMAT
{template or schema}
```
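These six sections compose mechanically, so it helps to capture them in code. A minimal builder sketch; the `PromptSections` shape and `buildStructuredPrompt` name are illustrative, not a library API:

```typescript
// Hypothetical helper that assembles the six sections above.
interface PromptSections {
  role: string;
  task: string;
  context?: string[];
  requirements?: string[];
  input: string;
  outputFormat: string;
}

export function buildStructuredPrompt(s: PromptSections): string {
  const sections = [
    `# ROLE\n${s.role}`,
    `# TASK\n${s.task}`,
    s.context?.length &&
      `# CONTEXT\n${s.context.map((c) => `- ${c}`).join('\n')}`,
    s.requirements?.length &&
      `# REQUIREMENTS\n${s.requirements
        .map((r, i) => `${i + 1}. ${r}`)
        .join('\n')}`,
    `# INPUT\n${s.input}`,
    `# OUTPUT FORMAT\n${s.outputFormat}`,
  ];

  // Drop optional sections that were not provided.
  return sections.filter((x): x is string => Boolean(x)).join('\n\n');
}
```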
### Real-World Example: Code Review Agent

```typescript
const codeReviewPrompt = `
# ROLE
You are a senior software engineer conducting a code review.
You prioritize code quality, security, and maintainability.

# TASK
Review the provided code changes and provide constructive feedback.

# CONTEXT
- This is production code for a web application
- Team follows TypeScript best practices
- Security is a top priority
- Performance matters

# REQUIREMENTS
1. Identify bugs and security vulnerabilities
2. Suggest improvements for readability
3. Check for proper error handling
4. Verify type safety
5. Be constructive and specific

# INPUT
## File: ${fileName}
## Changes:
\`\`\`typescript
${diff}
\`\`\`

# OUTPUT FORMAT
## Critical Issues
- [Issue 1]: {description} → {suggestion}

## Improvements
- [Area]: {current approach} → {better approach}

## Positive Points
- [What was done well]

## Overall Assessment
{1-2 sentence summary}
`;
```
### Prompt Sections Best Practices
Role:
- Be specific about expertise level
- Define personality/communication style
- Set expectations for behavior
Task:
- One clear objective
- Measurable success criteria
- Explicit constraints
Context:
- Only relevant information
- Business goals
- User needs
- Technical constraints
Requirements:
- Numbered list for priority
- Specific, measurable criteria
- Edge cases to handle
- What NOT to do
Output Format:
- Use Markdown structure
- Show example structure
- Specify data types
- Use XML or JSON when the output must be parsed programmatically (see the validation sketch below)
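When a prompt's output format is JSON, it pays to validate what comes back. A minimal sketch using zod, assuming a prompt whose OUTPUT FORMAT section asks for a JSON object with `summary` and `key_takeaways` fields, and the same `agent.generate` call used throughout this chapter:

```typescript
import { z } from 'zod';

// Mirrors a hypothetical OUTPUT FORMAT section that asks for JSON.
const SummaryOutput = z.object({
  summary: z.string(),
  key_takeaways: z.array(z.string()).min(1),
});

export async function generateValidatedSummary(prompt: string) {
  const raw = await agent.generate(prompt);
  const parsed = SummaryOutput.safeParse(JSON.parse(raw));

  if (!parsed.success) {
    // Options: retry, re-prompt with the errors, or flag for review.
    throw new Error(`Output failed validation: ${parsed.error.message}`);
  }

  return parsed.data;
}
```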
## Context Management
Effective context management keeps agents focused and efficient.
### The Context Window Challenge

```typescript
// Models have limited context windows:
// GPT-4o:            128k tokens (~96k words)
// GPT-4o-mini:       128k tokens
// Claude 3.5 Sonnet: 200k tokens

// Problem: your agent needs to process 500 documents.
// Solution: smart context management.
```
### Chunking Strategy

```typescript
// lib/context-chunker.ts
export class ContextChunker {
  chunk(content: string, maxTokens: number = 4000): string[] {
    // Simple word-based chunking (tokens ≈ words × 1.3)
    const maxWords = Math.floor(maxTokens / 1.3);
    const words = content.split(/\s+/);
    const chunks: string[] = [];

    for (let i = 0; i < words.length; i += maxWords) {
      const chunk = words.slice(i, i + maxWords).join(' ');
      chunks.push(chunk);
    }

    return chunks;
  }

  semanticChunk(content: string, chunkSize: number = 1000): string[] {
    // Split on natural boundaries (blank lines between sections)
    const sections = content.split(/\n\n+/);
    const chunks: string[] = [];
    let currentChunk = '';

    for (const section of sections) {
      if ((currentChunk + section).length > chunkSize && currentChunk) {
        chunks.push(currentChunk.trim());
        currentChunk = section;
      } else {
        currentChunk += '\n\n' + section;
      }
    }

    if (currentChunk) {
      chunks.push(currentChunk.trim());
    }

    return chunks;
  }
}
```
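A quick usage sketch (`longDocument` stands in for any string you need to split). Note the units differ: `chunk` sizes by estimated tokens, `semanticChunk` by characters.

```typescript
const chunker = new ContextChunker();

const tokenChunks = chunker.chunk(longDocument, 4000);         // ≈4k tokens each
const sectionChunks = chunker.semanticChunk(longDocument, 1000); // ≤~1000 chars each

console.log(`${tokenChunks.length} token chunks, ${sectionChunks.length} semantic chunks`);
```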
### Hierarchical Summarization

```typescript
// Process large documents hierarchically.
// `agent` exposes generate(prompt): Promise<string> (framework-specific).
export async function summarizeLargeDocument(document: string): Promise<string> {
  const chunker = new ContextChunker();
  const chunks = chunker.semanticChunk(document, 3000);

  // Step 1: Summarize each chunk
  const chunkSummaries = await Promise.all(
    chunks.map(async (chunk) => {
      const result = await agent.generate(`
# TASK
Summarize this section concisely.

# SECTION
${chunk}

# OUTPUT
2-3 sentences capturing the main points.
`);
      return result;
    })
  );

  // Step 2: Combine summaries
  const combined = chunkSummaries.join('\n\n');

  // Step 3: Final summary
  const finalSummary = await agent.generate(`
# TASK
Create a comprehensive summary from these section summaries.

# SECTION SUMMARIES
${combined}

# OUTPUT
A cohesive 2-3 paragraph summary of the entire document.
`);

  return finalSummary;
}
```
### Context Prioritization

```typescript
// The shape assumed throughout this section (adapt to your own context store).
interface Context {
  content: string;
  type: string;
  priority: 'high' | 'medium' | 'low';
  timestamp: Date;
}

export class ContextPrioritizer {
  prioritize(contexts: Context[], maxTokens: number): Context[] {
    // Score each context by relevance
    const scored = contexts.map((ctx) => ({
      context: ctx,
      score: this.calculateRelevance(ctx),
      tokens: this.estimateTokens(ctx.content),
    }));

    // Sort by score descending
    scored.sort((a, b) => b.score - a.score);

    // Take highest scoring contexts within token limit
    const selected: Context[] = [];
    let totalTokens = 0;

    for (const item of scored) {
      if (totalTokens + item.tokens <= maxTokens) {
        selected.push(item.context);
        totalTokens += item.tokens;
      }
    }

    return selected;
  }

  private calculateRelevance(context: Context): number {
    let score = 0;

    // Recency: newer contexts score higher
    const age = Date.now() - context.timestamp.getTime();
    const daysSinceCreation = age / (1000 * 60 * 60 * 24);
    score += Math.max(0, 10 - daysSinceCreation);

    // Importance
    if (context.priority === 'high') score += 20;
    if (context.priority === 'medium') score += 10;

    // Type
    if (context.type === 'user_message') score += 15;
    if (context.type === 'error') score += 12;

    return score;
  }

  private estimateTokens(text: string): number {
    // Rough estimate: 1 word ≈ 1.3 tokens
    return Math.ceil(text.split(/\s+/).length * 1.3);
  }
}
```
### Dynamic Context Injection

```typescript
export class DynamicContextBuilder {
  async buildContext(
    userQuery: string,
    availableContext: Context[]
  ): Promise<string> {
    // Analyze query to determine needed context
    // (analyzeQuery and isOld are assumed helpers, omitted here)
    const analysis = await this.analyzeQuery(userQuery);

    const relevantContext = availableContext.filter((ctx) => {
      // Time-based filtering
      if (analysis.timeframe === 'recent' && this.isOld(ctx)) {
        return false;
      }

      // Type-based filtering
      if (analysis.needsCRM && ctx.type !== 'crm') {
        return false;
      }

      return true;
    });

    // Prioritize and fit in context window
    const prioritizer = new ContextPrioritizer();
    const selected = prioritizer.prioritize(relevantContext, 4000);

    // Format for prompt
    return this.formatContext(selected);
  }

  private formatContext(contexts: Context[]): string {
    return contexts
      .map(
        (ctx) => `
## ${ctx.type.toUpperCase()} (${ctx.timestamp.toLocaleDateString()})
${ctx.content}
`
      )
      .join('\n\n---\n\n');
  }
}
```
## Few-Shot Examples
Examples dramatically improve output quality and consistency.
### Zero-Shot vs Few-Shot

```typescript
// ❌ Zero-shot: No examples
const zeroShot = `
Classify this email as: sales, support, billing, or general.

Email: ${email}
`;

// ✅ Few-shot: Include examples
const fewShot = `
Classify emails into: sales, support, billing, or general.

# EXAMPLES

Email: "I'm interested in the enterprise plan pricing"
Classification: sales

Email: "My login isn't working"
Classification: support

Email: "When will I be charged for my subscription?"
Classification: billing

Email: "What are your office hours?"
Classification: general

# NOW CLASSIFY

Email: ${email}
Classification:
`;
```
### Few-Shot Best Practices
1. Use 3-5 Examples
```typescript
// Too few (1-2):  not enough signal to establish the pattern
// Too many (10+): wastes tokens and can confuse the model
// Sweet spot:     3-5 diverse examples
```
2. Cover Edge Cases
```typescript
const examples = [
  // Typical case
  { input: 'Normal scenario', output: 'Expected result' },

  // Edge case 1
  { input: 'Ambiguous scenario', output: 'How to handle' },

  // Edge case 2
  { input: 'Complex scenario', output: 'Detailed handling' },

  // Counter-example
  { input: 'What NOT to do', output: 'Correct approach' },
];
```
3. Show Output Format
```typescript
const fewShotWithFormat = `
# EXAMPLES

Input: "The product launch was delayed due to supply chain issues"
Output:
\`\`\`json
{
  "summary": "Product launch delayed",
  "reason": "Supply chain issues",
  "impact": "medium",
  "actionable": true
}
\`\`\`

Input: "Team meeting scheduled for next Tuesday"
Output:
\`\`\`json
{
  "summary": "Meeting scheduled",
  "reason": null,
  "impact": "low",
  "actionable": false
}
\`\`\`

# YOUR TURN

Input: ${input}
Output:
`;
```
### Dynamic Few-Shot Selection

```typescript
import OpenAI from 'openai';

const openai = new OpenAI();

// Pre-embedded examples (compute example.embedding once, offline).
interface Example {
  input: string;
  output: string;
  embedding: number[];
}

export class FewShotSelector {
  async selectExamples(
    query: string,
    exampleBank: Example[],
    k: number = 3
  ): Promise<Example[]> {
    // Embed the query
    const queryEmbedding = await this.embed(query);

    // Calculate similarity to all examples
    const withSimilarity = exampleBank.map((example) => ({
      example,
      similarity: this.cosineSimilarity(queryEmbedding, example.embedding),
    }));

    // Return top K most similar
    return withSimilarity
      .sort((a, b) => b.similarity - a.similarity)
      .slice(0, k)
      .map((item) => item.example);
  }

  private async embed(text: string): Promise<number[]> {
    const response = await openai.embeddings.create({
      model: 'text-embedding-3-small',
      input: text,
    });
    return response.data[0].embedding;
  }

  private cosineSimilarity(a: number[], b: number[]): number {
    const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
    const magA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
    const magB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
    return dotProduct / (magA * magB);
  }
}

// Usage
const selector = new FewShotSelector();
const relevantExamples = await selector.selectExamples(userQuery, exampleBank, 3);
const prompt = buildPromptWithExamples(userQuery, relevantExamples);
```
## Error Handling in Prompts
Teach agents how to handle errors and edge cases through prompts.
### Explicit Error Handling

```typescript
const promptWithErrorHandling = `
# TASK
Extract company information from text.

# INSTRUCTIONS
1. Look for company name, industry, location, employee count
2. If information is missing, use "Unknown" - DO NOT guess
3. If text is not about a company, return error
4. If text is unclear, ask for clarification

# EDGE CASES

## Missing Information
If any field is not mentioned:
\`\`\`json
{
  "company_name": "Acme Corp",
  "industry": "Unknown",
  ...
}
\`\`\`

## Not a Company
If text doesn't describe a company:
\`\`\`json
{ "error": "Not a company description" }
\`\`\`

## Ambiguous
If multiple interpretations exist:
\`\`\`json
{ "error": "Ambiguous - please clarify if you mean X or Y" }
\`\`\`

# INPUT
${text}

# OUTPUT
Return JSON with company info or error.
`;
```
### Validation Prompts

```typescript
export async function extractWithValidation(text: string): Promise<CompanyInfo> {
  // Step 1: Extract
  const extraction = await agent.generate(extractionPrompt);

  // Step 2: Validate
  const validation = await agent.generate(`
# TASK
Validate this extracted company information.

# ORIGINAL TEXT
${text}

# EXTRACTED DATA
${extraction}

# VALIDATION RULES
1. Company name must be explicitly mentioned in text
2. Industry must match actual business description
3. Numbers must be accurate (don't round or estimate)
4. Location must be specific (not just "USA")

# INSTRUCTIONS
Check each field against the original text.
For any field that seems incorrect or guessed, mark it as "needs_review".

# OUTPUT FORMAT
\`\`\`json
{
  "valid": true/false,
  "issues": ["list of problems found"],
  "corrections": { "field": "corrected value" }
}
\`\`\`
`);

  // Step 3: Apply corrections
  // (applyCorrections is an assumed helper that merges validated fixes)
  return applyCorrections(extraction, validation);
}
```
### Self-Correction Prompts

```typescript
const selfCorrectionPrompt = `
# TASK
Analyze the following article and extract key points.

# PROCESS
1. Read the article carefully
2. Extract 5 key points
3. Review your extraction for:
   - Accuracy (did you misrepresent anything?)
   - Completeness (did you miss major points?)
   - Clarity (are points clear and specific?)
4. Revise any points that don't meet these standards

# ARTICLE
${article}

# OUTPUT FORMAT

## Initial Extraction
1. [Point 1]
2. [Point 2]
...

## Self-Review
- Issues found: [list]
- Corrections needed: [list]

## Final Key Points
1. [Revised point 1]
2. [Revised point 2]
...
`;
```
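Since the self-review is part of the model's output, callers usually want only the final section. A small post-processing sketch, assuming `selfCorrectionPrompt` was built for the article in question and `agent.generate` returns the raw text:

```typescript
// Hypothetical post-processing: keep only the final section of the output.
export async function extractFinalKeyPoints(): Promise<string> {
  const output = await agent.generate(selfCorrectionPrompt);

  // The marker matches the OUTPUT FORMAT heading in the prompt above.
  const marker = '## Final Key Points';
  const idx = output.indexOf(marker);

  // Fall back to the full output if the model skipped the heading.
  return idx === -1 ? output.trim() : output.slice(idx + marker.length).trim();
}
```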
## Iteration and A/B Testing
Improve prompts systematically by testing, measuring, and iterating.
### Prompt Versioning

```typescript
// prompts/summarizer.ts
export const summarizerPrompts = {
  v1: {
    version: '1.0.0',
    prompt: `Summarize this article: ${article}`,
    performance: { accuracy: 0.65, speed: 'fast' },
  },
  v2: {
    version: '2.0.0',
    prompt: `
# TASK
Summarize the article below for a business audience.

# REQUIREMENTS
- 2-3 paragraphs
- Focus on key insights

# ARTICLE
${article}
`,
    performance: { accuracy: 0.78, speed: 'medium' },
  },
  v3: {
    version: '3.0.0',
    prompt: `
# ROLE
You are a business analyst summarizing market research.

# TASK
Summarize for executives who need actionable insights.

# FORMAT
- Executive Summary (2-3 sentences)
- Key Insights (3-5 bullet points)
- Implications (2-3 sentences)

# ARTICLE
${article}
`,
    performance: { accuracy: 0.89, speed: 'medium' },
  },
} as const;
```
### A/B Testing Framework

```typescript
// lib/prompt-tester.ts
export class PromptTester {
  async abTest(
    variantA: string,
    variantB: string,
    testCases: TestCase[]
  ): Promise<TestResults> {
    const resultsA: Result[] = [];
    const resultsB: Result[] = [];

    for (const testCase of testCases) {
      // Run variant A
      const resultA = await this.runTest(variantA, testCase);
      resultsA.push(resultA);

      // Run variant B
      const resultB = await this.runTest(variantB, testCase);
      resultsB.push(resultB);
    }

    return {
      variantA: this.calculateMetrics(resultsA),
      variantB: this.calculateMetrics(resultsB),
      winner: this.determineWinner(resultsA, resultsB),
    };
  }

  private calculateMetrics(results: Result[]): Metrics {
    const scores = results.map((r) => r.score);
    const latencies = results.map((r) => r.latency);

    return {
      avgScore: scores.reduce((a, b) => a + b, 0) / scores.length,
      avgLatency: latencies.reduce((a, b) => a + b, 0) / latencies.length,
      successRate: results.filter((r) => r.success).length / results.length,
    };
  }

  private async runTest(prompt: string, testCase: TestCase): Promise<Result> {
    const start = Date.now();

    try {
      const output = await agent.generate(prompt);
      const score = await this.scoreOutput(output, testCase.expected);

      return { success: true, score, latency: Date.now() - start, output };
    } catch (error) {
      return {
        success: false,
        score: 0,
        latency: Date.now() - start,
        error: (error as Error).message,
      };
    }
  }

  private async scoreOutput(output: string, expected: string): Promise<number> {
    // Use another LLM to score quality (0-1)
    const result = await scoringAgent.generate(`
Rate how well this output matches the expected result (0-1 score).

Expected: ${expected}
Actual: ${output}

Score (0-1):
`);
    return parseFloat(result);
  }
}
```
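For reference, the tester above leans on a few supporting types. Plausible shapes follow; the names and fields are inferred from usage in this chapter, not a fixed API:

```typescript
interface TestCase {
  id: string;
  input: string;
  expected: string;
  criteria: string[];
}

interface Result {
  success: boolean;
  score: number;   // 0-1, from the scoring LLM
  latency: number; // milliseconds
  output?: string;
  error?: string;
}

interface Metrics {
  avgScore: number;
  avgLatency: number;
  successRate: number;
}

interface TestResults {
  variantA: Metrics;
  variantB: Metrics;
  winner: 'A' | 'B' | 'tie';
}
```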
### Evaluation Dataset

```typescript
// test/prompts/summarizer.test.ts
const evaluationDataset: TestCase[] = [
  {
    id: 'tech-article-1',
    input: 'Long technical article about AI...',
    expected: 'Should focus on technical insights and innovations',
    criteria: ['accuracy', 'technical_depth', 'clarity'],
  },
  {
    id: 'business-article-1',
    input: 'Business strategy article...',
    expected: 'Should focus on business implications and strategy',
    criteria: ['business_focus', 'actionability', 'clarity'],
  },
  {
    id: 'edge-case-1',
    input: 'Very short article with minimal content...',
    expected: 'Should handle gracefully, not hallucinate',
    criteria: ['accuracy', 'no_hallucination'],
  },
];

// Run evaluation
const tester = new PromptTester();
const results = await tester.abTest(
  summarizerPrompts.v2.prompt,
  summarizerPrompts.v3.prompt,
  evaluationDataset
);

console.log('Results:', results);

// Deploy winner to production
if (results.winner === 'B') {
  deployPrompt(summarizerPrompts.v3);
}
```
### Continuous Improvement

```typescript
export class PromptMonitor {
  async trackProduction(
    promptVersion: string,
    input: string,
    output: string,
    userFeedback?: number
  ): Promise<void> {
    // `db` is your persistence layer (e.g., a Prisma-style client).
    await db.promptExecution.create({
      data: {
        promptVersion,
        input: input.substring(0, 1000),
        output: output.substring(0, 1000),
        userFeedback,
        timestamp: new Date(),
      },
    });
  }

  async analyzePerformance(
    promptVersion: string,
    timeframe: 'day' | 'week' | 'month'
  ): Promise<PerformanceReport> {
    const executions = await this.getExecutions(promptVersion, timeframe);

    return {
      totalExecutions: executions.length,
      avgUserRating: this.avgRating(executions),
      errorRate: this.errorRate(executions),
      commonIssues: this.identifyIssues(executions),
      recommendations: this.generateRecommendations(executions),
    };
  }

  private generateRecommendations(executions: Execution[]): string[] {
    const recommendations: string[] = [];

    // Analyze failures (user ratings below 3)
    const failures = executions.filter((e) => e.userFeedback < 3);
    if (failures.length > executions.length * 0.1) {
      recommendations.push(
        'High failure rate - consider adding more examples or clearer instructions'
      );
    }

    // Analyze patterns
    const commonFailures = this.findPatterns(failures);
    if (commonFailures.length > 0) {
      recommendations.push(
        `Common failure pattern: ${commonFailures[0]} - add specific handling`
      );
    }

    return recommendations;
  }
}
```
### Production Prompt Template

```typescript
// Wrapped in a function so the interpolated fields are explicit.
export function productionPromptTemplate(p: {
  role: string;
  task: string;
  context: string;
  requirements: string[];
  examples: { input: string; output: string }[];
  errorHandling: { condition: string; response: string }[];
  input: string;
  outputFormat: string;
  validationChecks: string[];
}): string {
  return `
# ROLE
${p.role}

# TASK
${p.task}

# CONTEXT
${p.context}

# REQUIREMENTS
${p.requirements.map((r, i) => `${i + 1}. ${r}`).join('\n')}

# EXAMPLES
${p.examples
  .map((ex) => `Input: ${ex.input}\nOutput: ${ex.output}`)
  .join('\n\n')}

# ERROR HANDLING
${p.errorHandling
  .map((e) => `- If ${e.condition}: ${e.response}`)
  .join('\n')}

# INPUT
${p.input}

# OUTPUT FORMAT
${p.outputFormat}

# VALIDATION
Before responding, verify:
${p.validationChecks.map((c, i) => `${i + 1}. ${c}`).join('\n')}
`;
}
```
## Key Takeaways
- **Structure your prompts** - Role, task, context, requirements, format
- **Manage context wisely** - Chunk, prioritize, inject dynamically
- **Use few-shot examples** - 3-5 diverse examples dramatically improve quality
- **Handle errors in prompts** - Teach agents how to fail gracefully
- **Test and iterate** - A/B test, measure, improve continuously
Prompting is software engineering for AI systems!