ChatGPT Prompts for Data Analysis

Data analysis with AI is only as good as the questions you ask. A prompt like "analyze this data" gives you a surface-level summary that misses the insights that actually drive decisions. The analysts getting real value from ChatGPT are writing prompts that specify the dataset structure, the business question, the type of analysis needed, and the output format — whether that is a summary table, a visualization recommendation, or a SQL query they can run directly.

For data cleaning, describe the specific issues you are dealing with: missing values, duplicates, inconsistent formats, or outliers. Tell the AI what your columns represent and what clean data looks like for your use case. For SQL generation, include the table schema, relationships between tables, and the exact question you need answered — the more context about your database structure, the more accurate the query. Statistical analysis prompts should specify the test or method you want (regression, hypothesis testing, clustering), the variables involved, and your significance threshold. Visualization prompts work best when you describe the audience, the story you want the chart to tell, and the tool you are using (Python matplotlib, R ggplot, Excel).

Build a library of your most effective analysis prompts and refine them as you learn what produces the best results. PromptingBox lets you save, version, and organize your data analysis prompts so you can reuse proven templates across projects and share them with your team.

Data Analysis Prompts You Can Copy Right Now

Practical prompts for SQL, statistics, visualization, and reporting. Copy, fill in your dataset details, and paste into ChatGPT.

SQL Query Writer

Write a SQL query for {{database_type}} to answer this question: "{{business_question}}"

Table schema:
{{table_schema}}

Table relationships:
{{relationships}}

Requirements:
- Use clear aliases for all tables and columns
- Add comments explaining each major section of the query
- Handle NULL values appropriately
- Optimize for readability over cleverness
- If the query requires aggregation, include a GROUP BY with the most useful breakdown
- Return results ordered by {{order_preference}}

After the query, explain what each section does and flag any assumptions you made about the data.

Why it works: Providing the full schema and relationships eliminates guesswork. Asking for comments and assumption flags makes the output reviewable and trustworthy before running against production.
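To make the expected output concrete, here is a minimal sketch of the kind of query this prompt should produce, run against a hypothetical two-table schema (the `customers`/`orders` tables, column names, and values are invented for illustration; sqlite3 stands in for your database):

```python
import sqlite3

# Hypothetical schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'East'), (2, 'West'), (3, NULL);
INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 75.0), (4, 3, 20.0);
""")

# The style the prompt asks for: clear aliases, section comments,
# explicit NULL handling, aggregation with a useful breakdown, ordering.
query = """
SELECT
    COALESCE(c.region, 'Unknown') AS region,        -- treat missing regions explicitly
    COUNT(o.id)                   AS order_count,
    SUM(o.amount)                 AS total_revenue
FROM customers AS c
LEFT JOIN orders AS o ON o.customer_id = c.id       -- keep customers with no orders
GROUP BY COALESCE(c.region, 'Unknown')
ORDER BY total_revenue DESC;
"""
rows = conn.execute(query).fetchall()
print(rows)
```

A query in this shape is easy to review before it touches production: every alias is readable, and the COALESCE makes the NULL-handling assumption visible instead of implicit.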

Data Cleaning Script

Write a {{language}} script to clean this dataset. Here is the structure:

Columns: {{column_definitions}}
Row count: approximately {{row_count}}
Known issues:
{{data_issues}}

For each issue, implement a cleaning step that:
1. Logs how many rows are affected before the fix
2. Applies the transformation
3. Validates the result

After cleaning, generate a summary report showing:
- Rows before/after
- Columns modified
- Values imputed or removed
- Any rows flagged for manual review

Use {{library}} for data manipulation. Add error handling so the script does not fail silently on unexpected values.

Why it works: Requiring before/after logging and a summary report turns a cleaning script into an auditable process. Flagging rows for manual review prevents silent data loss.
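As a sketch of what the logging-and-validation pattern looks like in practice, here is a toy pandas example (the columns, values, and the median-imputation choice are invented assumptions, not part of the prompt):

```python
import pandas as pd

# Toy dataset with the kinds of issues the prompt describes.
df = pd.DataFrame({
    "email": ["a@x.com", "a@x.com", None, "b@x.com"],
    "age":   [34, 34, 29, -5],   # -5 is an impossible value
})

report = {"rows_before": len(df)}

# Issue 1: exact duplicates -- log affected rows, apply the fix, validate.
dupes = int(df.duplicated().sum())
print(f"duplicate rows: {dupes}")
df = df.drop_duplicates()
assert not df.duplicated().any()

# Issue 2: missing emails -- flag for manual review instead of dropping silently.
flagged = df[df["email"].isna()]
df = df.dropna(subset=["email"])

# Issue 3: impossible negative ages -- impute with the median of valid ages.
bad_age = int((df["age"] < 0).sum())
valid_median = df.loc[df["age"] >= 0, "age"].median()
df.loc[df["age"] < 0, "age"] = valid_median

report.update(rows_after=len(df), values_imputed=bad_age,
              flagged_for_review=len(flagged))
print(report)
```

The summary `report` dict is the auditable artifact: anyone reviewing the pipeline can see exactly how many rows each step touched without rereading the code.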

Visualization Recommendation

I need to visualize {{data_description}} for an audience of {{audience}}. The key insight I want to communicate is: "{{key_insight}}"

Data shape: {{num_rows}} rows, {{num_columns}} columns
Key variables: {{variables_list}}
Tool: {{visualization_tool}}

Recommend the top 3 chart types for this data and insight. For each recommendation:
1. Chart type and why it works for this specific data + insight combination
2. Which variables to map to which visual encodings (x-axis, y-axis, color, size)
3. Specific design choices (color palette for {{audience}}, annotations to highlight {{key_insight}})
4. Code snippet to create it in {{visualization_tool}}
5. One common mistake to avoid with this chart type

Rank them by how clearly they communicate {{key_insight}} to {{audience}}.

Why it works: Starting from the insight rather than the chart type ensures the visualization serves the story. Asking for common mistakes prevents the most frequent data viz errors.
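For instance, if the insight were "signups spiked in March after the launch", a matplotlib answer to step 2 and 3 might look like this sketch (the data and annotation wording are invented for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt

# Hypothetical data: monthly signups; the insight is the March spike.
months = ["Jan", "Feb", "Mar", "Apr"]
signups = [120, 135, 310, 180]

fig, ax = plt.subplots()
bars = ax.bar(months, signups, color="lightgray")
bars[2].set_color("tab:blue")  # color encodes the insight, not decoration
ax.annotate("Launch spike", xy=(2, 310), xytext=(2.2, 335))
ax.set_ylim(0, 360)
ax.set_ylabel("New signups")
ax.set_title("Signups jumped 2.3x in March after the launch")
fig.savefig("signups.png")
```

Note how the title states the insight outright and color singles out the one bar that carries the story, rather than coloring every bar differently.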

Statistical Analysis Guide

I need to analyze whether {{hypothesis}} using the following data:

Dataset: {{dataset_description}}
Variables:
- Dependent variable: {{dependent_var}} ({{dv_type}})
- Independent variable(s): {{independent_vars}}
- Potential confounders: {{confounders}}
Sample size: {{sample_size}}
Significance level: {{alpha}}

Walk me through:
1. Which statistical test is appropriate and why (consider alternatives)
2. Assumptions to check before running the test, with code to check each one
3. The {{language}} code to run the analysis
4. How to interpret the output — what numbers matter and what they mean in plain English
5. How to report the results in a paper or presentation (APA format if applicable)
6. Limitations of this analysis and what could strengthen the conclusion

Why it works: Listing confounders and asking for assumption checks prevents the common mistake of running a test on data that violates its assumptions. The plain-English interpretation ensures you understand the results.
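A minimal sketch of steps 2 and 3 with scipy, assuming a hypothetical two-group comparison (the session-length numbers are invented; your test and data will differ):

```python
import numpy as np
from scipy import stats

# Hypothetical example: does a new onboarding flow lengthen sessions (minutes)?
control   = np.array([9.1, 10.2, 8.8, 10.5, 9.9, 10.1, 9.4, 10.0])
treatment = np.array([11.0, 12.1, 11.5, 12.4, 10.9, 11.8, 12.0, 11.3])

# Check assumptions before running the test: normality and equal variances.
_, p_norm_c = stats.shapiro(control)
_, p_norm_t = stats.shapiro(treatment)
_, p_var    = stats.levene(control, treatment)
print(f"normality p: {p_norm_c:.2f}, {p_norm_t:.2f}; equal-variance p: {p_var:.2f}")

# Welch's t-test is a safe default when equal variances are in doubt.
result = stats.ttest_ind(treatment, control, equal_var=False)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")

alpha = 0.05  # significance threshold from the prompt
if result.pvalue < alpha:
    print("Reject the null: treatment sessions are longer at the 5% level.")
```

Running the assumption checks first is exactly what the prompt asks the model to walk you through; if Shapiro or Levene failed here, the appropriate move would be a nonparametric alternative such as Mann-Whitney, not forcing the t-test.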

Dashboard Design Spec

Design a dashboard for {{dashboard_purpose}} used by {{target_users}}.

Data sources available: {{data_sources}}
Key metrics to track: {{key_metrics}}
Update frequency: {{update_frequency}}
Tool: {{dashboard_tool}}

Provide:
1. Dashboard layout (which charts go where, with wireframe description)
2. For each chart/widget:
   - Metric shown
   - Chart type and why
   - Filters or drill-down capability needed
   - Thresholds or conditional formatting rules
3. Global filters (date range, {{filter_dimensions}})
4. A "north star" KPI that should be the most prominent element
5. Mobile considerations if applicable

Design for a {{expertise_level}} audience — choose complexity accordingly. The dashboard should answer these questions at a glance: {{key_questions}}

Why it works: Defining the key questions the dashboard must answer at a glance prevents scope creep and ensures every widget earns its space. Specifying the audience expertise level calibrates complexity.

Report Narrative Writer

Write an executive summary and narrative for a data analysis report. Context:

Analysis topic: {{analysis_topic}}
Audience: {{audience}} ({{technical_level}} technical level)
Key findings:
{{key_findings}}

Business impact: {{business_impact}}
Recommended actions: {{recommendations}}

Structure:
1. Executive summary (3-4 sentences, lead with the most important finding)
2. Context paragraph (why this analysis was done)
3. Key findings section (use bullet points, each finding should have a data point and a "so what")
4. Implications and recommended actions
5. Methodology note (1-2 sentences, what data, what timeframe, what method)
6. Caveats and limitations

Tone: {{tone}}. Use specific numbers from the findings — no vague language like "significant improvement" without the actual percentage. Total length: 400-500 words.

Why it works: Requiring a 'so what' for each finding transforms a data dump into actionable insight. The ban on vague language forces specificity that executives can act on.