Multimodal Image Analysis

General Productivity · gemini-prompts · image_type · output_format

Gemini processes images natively alongside text, which makes it well suited to complex image analysis. The numbered extraction steps and the "[uncertain]" tag keep the output structured and easy to parse.

Prompt

Analyze the attached {{image_type}} and provide a comprehensive breakdown:

1. **Visual inventory**: List every distinct element you can identify (objects, text, colors, layout)
2. **Text extraction**: Transcribe ALL text visible in the image exactly as written
3. **Spatial relationships**: Describe how elements are positioned relative to each other
4. **Context clues**: What can you infer about when, where, and why this was created?
5. **Data extraction**: If this contains charts, tables, or diagrams, extract the data into a structured {{output_format}} format

For any element you're uncertain about, say "[uncertain]" rather than guessing.

Finally, suggest 3 follow-up questions I could ask about this image to get deeper insights.

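Variables to customize

{{image_type}}: the kind of image you are attaching, as named in the first line of the prompt
{{output_format}}: the structured format that step 5 should use for extracted data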
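The two placeholders in the prompt, {{image_type}} and {{output_format}}, need to be filled before the prompt is sent. The following is a minimal sketch of one way to do that, assuming the template is held as a plain Python string. The function name `fill_prompt`, the shortened template, and the example values are illustrative assumptions, not part of the original prompt page.

```python
import re

# Matches placeholders written as {{name}}, the style used in the prompt.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def fill_prompt(template: str, **values: str) -> str:
    """Replace each {{name}} placeholder; fail loudly if a value is missing."""
    missing = set(PLACEHOLDER.findall(template)) - set(values)
    if missing:
        raise KeyError(f"missing values for: {sorted(missing)}")
    return PLACEHOLDER.sub(lambda m: values[m.group(1)], template)


# Shortened copy of the prompt: only the two lines that contain variables.
template = (
    "Analyze the attached {{image_type}} and provide a comprehensive breakdown:\n"
    "5. **Data extraction**: If this contains charts, tables, or diagrams, "
    "extract the data into a structured {{output_format}} format\n"
)

# Example values are made up for illustration.
prompt = fill_prompt(
    template,
    image_type="sales dashboard screenshot",
    output_format="JSON",
)
print(prompt)
```

Raising on a missing value, rather than leaving the placeholder in place, keeps a literal `{{output_format}}` from reaching the model unnoticed.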

Why this prompt works

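Gemini's native multimodal processing handles complex image analysis well. The prompt splits the analysis into five numbered steps, so each kind of information (inventory, text, layout, context, data) lands in a predictable place, and step 5 pins extracted data to a named {{output_format}}. The "[uncertain]" tag gives the model an explicit alternative to guessing and gives you a literal marker to search for when reviewing the output.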
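Because the prompt asks for the literal tag "[uncertain]", a response can be screened mechanically before anyone relies on it. The sketch below assumes the response is available as plain text and that the model places the tag on the same line as the doubtful item. The function name `split_by_certainty` and the sample response are hypothetical.

```python
def split_by_certainty(response: str) -> tuple[list[str], list[str]]:
    """Split non-empty lines into (confident, flagged) by the "[uncertain]" tag."""
    confident: list[str] = []
    flagged: list[str] = []
    for line in response.splitlines():
        line = line.strip()
        if not line:
            continue
        # Case-insensitive match, in case the model capitalizes the tag.
        if "[uncertain]" in line.lower():
            flagged.append(line)
        else:
            confident.append(line)
    return confident, flagged


# Made-up response text, for illustration only.
sample = """\
Title: Q3 Revenue by Region
Axis label: [uncertain] possibly "USD (millions)"
Legend: North, South, East, West
"""

confident, flagged = split_by_certainty(sample)
print(len(confident), "confident;", len(flagged), "to review")
```

Flagged lines can then be checked against the image by hand, or turned into the follow-up questions the prompt asks for.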
