Data Extraction from Documents
Gemini's multimodal document understanding handles PDFs, images of forms, and scanned documents. The explicit normalization rules and NOT_FOUND convention make the output immediately usable in spreadsheets or databases.
Extract structured data from the attached {{document_type}}.

Extract these fields: {{fields_list}}

Output as a {{output_format}} with one row per {{entity_unit}}.

Extraction rules:
- If a field appears multiple times, take the most recent / most specific value
- If a field is missing, use "NOT_FOUND" (not null, not empty)
- For dates, normalize to YYYY-MM-DD format regardless of input format
- For currency, normalize to numbers without symbols (include a "currency" column)
- For names, use "LastName, FirstName" format
- Flag any field where the extraction is ambiguous with a trailing " [AMBIGUOUS]" marker

After the data, provide:
- Total records extracted
- Fields with the highest "NOT_FOUND" rate
- Any patterns or anomalies you noticed in the data
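Variables to customize
- {{document_type}}: the kind of document attached
- {{fields_list}}: the fields to extract
- {{output_format}}: the format of the extracted output
- {{entity_unit}}: what one output row represents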
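The four {{...}} placeholders in the prompt can be filled programmatically before the prompt is sent. The following is a minimal sketch, not part of the original prompt: the helper name fill_template and the example values (an invoice with four fields) are hypothetical, and only the opening sentences of the prompt are included to keep it short.

```python
import re

# Opening sentences of the prompt; the extraction rules would follow unchanged.
TEMPLATE = (
    "Extract structured data from the attached {{document_type}}. "
    "Extract these fields: {{fields_list}} "
    "Output as a {{output_format}} with one row per {{entity_unit}}."
)


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace every {{name}} placeholder, failing loudly if a value is missing."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder {name!r}")
        return values[name]

    return re.sub(r"\{\{(\w+)\}\}", substitute, template)


# Hypothetical example values; replace them with your own.
prompt = fill_template(
    TEMPLATE,
    {
        "document_type": "invoice",
        "fields_list": "invoice_number, invoice_date, vendor_name, total",
        "output_format": "CSV table",
        "entity_unit": "invoice",
    },
)
print(prompt)
```

Raising on a missing value, rather than leaving the placeholder in place, keeps a half-filled prompt from reaching the model unnoticed.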
Why this prompt works
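The prompt pairs Gemini's multimodal document understanding, which handles PDFs, images of forms, and scanned documents, with explicit output conventions: YYYY-MM-DD dates, currency amounts without symbols, a fixed name format, a "NOT_FOUND" value for missing fields, and an " [AMBIGUOUS]" marker for uncertain ones. Because every row follows the same rules, the output can go into a spreadsheet or database with little cleanup.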
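Because the conventions are fixed, they can also be checked mechanically before the data is loaded anywhere. The sketch below assumes the output format was CSV and that the model followed the rules; the function name summarize, the column names, and the sample rows are all hypothetical. It counts "NOT_FOUND" values per field, collects cells carrying the " [AMBIGUOUS]" marker, and reports date cells that are not in YYYY-MM-DD form.

```python
import csv
import io
import re
from collections import Counter

AMBIGUOUS_SUFFIX = " [AMBIGUOUS]"
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")

# Hypothetical model output; names contain a comma, so they must be quoted.
SAMPLE = """\
invoice_number,invoice_date,vendor_name,total,currency
INV-001,2024-03-05,"Lovelace, Ada",1200.50,USD
INV-002,NOT_FOUND,"Hopper, Grace",980.00 [AMBIGUOUS],USD
INV-003,03/07/2024,NOT_FOUND,15.00,EUR
"""


def summarize(csv_text: str, date_fields: set[str]) -> dict:
    """Check extracted rows against the prompt's output conventions."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    not_found = Counter()
    ambiguous = []  # (row index, field) pairs flagged by the model
    bad_dates = []  # (row index, field, value) triples not in YYYY-MM-DD form
    for index, row in enumerate(rows):
        for field, raw in row.items():
            if field is None:  # surplus cells beyond the header; skipped here
                continue
            value = raw or ""
            if value.endswith(AMBIGUOUS_SUFFIX):
                ambiguous.append((index, field))
                value = value[: -len(AMBIGUOUS_SUFFIX)]
            if value == "NOT_FOUND":
                not_found[field] += 1
            elif field in date_fields and not ISO_DATE.fullmatch(value):
                bad_dates.append((index, field, value))
    return {
        "records": len(rows),
        "not_found": dict(not_found),
        "ambiguous": ambiguous,
        "bad_dates": bad_dates,
    }


summary = summarize(SAMPLE, date_fields={"invoice_date"})
print(summary)
```

The date check only verifies the shape of the value, not that it is a real calendar date; stricter validation would be a separate step.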