A/B Test Tracker
I'm A/B testing two versions of a prompt for {{task_description}}.

Version A:
{{version_a}}

Version B:
{{version_b}}

Evaluation criteria:
{{evaluation_criteria}}

Design an A/B test plan that includes:
1. Sample size recommendation (number of test cases)
2. Test cases that cover normal, edge, and adversarial inputs
3. Scoring rubric for each evaluation criterion (1-5 scale with descriptions)
4. Statistical method for determining a winner
5. A results template I can fill in as I run tests
6. Decision framework: when to pick A, pick B, or iterate further
Variables to customize
Why this prompt works
Including adversarial test cases and a decision framework prevents the common mistake of picking a winner based only on happy-path performance.
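The "statistical method for determining a winner" step can be as simple as a paired sign test over per-case rubric scores: count how often each version wins on the same test case and check whether that split could be chance. A minimal stdlib-only sketch (the scores below are hypothetical examples on the 1-5 rubric scale):

```python
import math

def sign_test(scores_a, scores_b):
    """Paired sign test: does one version win more often than chance?

    scores_a[i] and scores_b[i] are rubric scores for the same test case.
    Ties are dropped; the remaining wins are compared against a fair coin.
    Returns (wins_a, wins_b, two_sided_p_value).
    """
    wins_a = sum(a > b for a, b in zip(scores_a, scores_b))
    wins_b = sum(b > a for a, b in zip(scores_a, scores_b))
    n = wins_a + wins_b          # non-tied cases
    k = max(wins_a, wins_b)
    # Two-sided binomial p-value under H0: each non-tied case is a coin flip.
    p = 2 * sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return wins_a, wins_b, min(p, 1.0)

# Hypothetical per-case rubric scores from a completed results template.
scores_a = [4, 5, 3, 4, 4, 5, 3, 4, 5, 4]
scores_b = [3, 4, 3, 3, 4, 4, 2, 3, 4, 3]
wins_a, wins_b, p = sign_test(scores_a, scores_b)
print(f"A wins {wins_a}, B wins {wins_b}, p = {p:.3f}")
```

A low p-value (conventionally below 0.05) supports picking the winning version outright; a high one suggests iterating further, which maps directly onto the decision framework the prompt asks for.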
Related prompts
- Forcing the agent to plan before acting prevents premature execution and wasted steps. Explicit dependency mapping enables parallel execution and catches logical gaps early.
- Tool Selection Agent: The ReAct pattern (Reason + Act) creates an explicit reasoning trace that improves tool selection accuracy. The error-handling rule prevents infinite retry loops.
- Prompt Compressor: Explicitly requiring all functional requirements to be preserved prevents the model from over-compressing and losing critical instructions.
- Memory Management Agent: Explicit memory read/write instructions create agents that improve over time. Categorization keeps memories organized, and the deduplication rule prevents context bloat.