Python Data Science

Cursor rules for Python data science with pandas, scikit-learn, and visualization best practices.

Prompt
You are an expert in Python, pandas, NumPy, scikit-learn, and data visualization.

Code Style:
- Use type hints for all function signatures (annotate pandas DataFrames as pd.DataFrame)
- Write docstrings in NumPy format for all functions
- Keep notebooks clean: one concept per cell, markdown headers between sections
- Use pathlib for all file paths, never os.path
- Prefer method chaining for pandas operations
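
A minimal sketch of how the code style rules above might look together. The function name, column names, and file path are hypothetical and chosen only for illustration:

```python
from pathlib import Path

import pandas as pd


def summarize_sales(df: pd.DataFrame, min_units: int = 1) -> pd.DataFrame:
    """Total revenue per region for rows meeting a unit threshold.

    Parameters
    ----------
    df : pd.DataFrame
        Must contain the columns ``region``, ``units`` and ``price``.
    min_units : int, default 1
        Rows with fewer units are excluded.

    Returns
    -------
    pd.DataFrame
        One row per region with a ``revenue`` column, sorted descending.
    """
    # Method chaining: each step returns a new DataFrame, no temporaries.
    return (
        df.loc[df["units"] >= min_units]
        .assign(revenue=lambda d: d["units"] * d["price"])
        .groupby("region", as_index=False)["revenue"]
        .sum()
        .sort_values("revenue", ascending=False)
        .reset_index(drop=True)
    )


# pathlib composes paths with "/" instead of os.path.join.
DATA_DIR = Path("data")  # hypothetical directory
sales_path = DATA_DIR / "raw" / "sales.csv"  # hypothetical file; not read here

# Invented sample data standing in for the contents of sales_path.
sales = pd.DataFrame(
    {
        "region": ["north", "south", "north", "south"],
        "units": [3, 0, 2, 5],
        "price": [10.0, 4.0, 10.0, 4.0],
    }
)
summary = summarize_sales(sales)
```

The chain filters with .loc first, so the threshold is visible at the top of the pipeline and not buried in a later step.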

Data Processing:
- Always inspect data first: df.info(), df.describe(), df.isnull().sum()
- Handle missing values explicitly (impute, flag, or drop with a stated reason); never silently drop rows
- Use .loc[] and .iloc[] for explicit indexing, never chained indexing
- Create reproducible pipelines with sklearn Pipeline and ColumnTransformer
- Set random seeds for reproducibility: random_state=42
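
The data processing rules above can be sketched as follows. The toy DataFrame, its column names, and the choice of imputation strategies and classifier are assumptions made for illustration, not a recommended model:

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data with deliberate gaps (all values invented for illustration).
df = pd.DataFrame(
    {
        "age": [25.0, np.nan, 47.0, 35.0, 52.0, 29.0],
        "city": ["Oslo", "Rome", np.nan, "Oslo", "Rome", "Oslo"],
        "churned": [0, 1, 0, 0, 1, 0],
    }
)

# Inspect first: count missing values per column.
missing_counts = df.isnull().sum()

# Explicit indexing with .loc (boolean mask + column label), no chained indexing.
oslo_ages = df.loc[df["city"] == "Oslo", "age"]

# Missing values are handled explicitly inside the pipeline, not dropped.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])
preprocess = ColumnTransformer([
    ("num", numeric, ["age"]),
    ("cat", categorical, ["city"]),
])
model = Pipeline([
    ("preprocess", preprocess),
    # random_state is pinned so solvers that shuffle data give repeatable results.
    ("clf", LogisticRegression(random_state=42)),
])

X = df[["age", "city"]]
y = df["churned"]
model.fit(X, y)
predictions = model.predict(X)
```

Keeping imputation inside the Pipeline means the same fitted statistics are reused at prediction time, which is the main reason to prefer it over ad hoc fillna calls.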

Visualization:
- Use matplotlib for publication-quality figures
- Use seaborn for statistical visualizations
- Always label axes, add titles, and include units
- Use consistent color palettes across related charts
- Save figures as both PNG (for preview) and SVG (for publications)
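
A small matplotlib sketch of the labeling and export rules above. The data is synthetic, the output goes to a temporary directory, and the file name and dpi value are arbitrary assumptions:

```python
import tempfile
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Synthetic signal standing in for real measurements.
time_s = np.linspace(0.0, 10.0, 50)
temperature_c = 20.0 + 2.0 * np.sin(time_s)

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(time_s, temperature_c, color="tab:blue", label="sensor A")
ax.set_xlabel("Time (s)")            # axis label with units
ax.set_ylabel("Temperature (°C)")    # axis label with units
ax.set_title("Sensor temperature over time")
ax.legend()
fig.tight_layout()

# Save the same figure as PNG (preview) and SVG (publication).
out_dir = Path(tempfile.mkdtemp())
saved = []
for suffix in (".png", ".svg"):
    target = out_dir / f"temperature{suffix}"
    fig.savefig(target, dpi=300)
    saved.append(target)
plt.close(fig)
```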

Performance:
- Use vectorized operations; avoid iterating over DataFrame rows (iterrows, row-wise apply)
- For large datasets, use chunked reading: pd.read_csv(path, chunksize=10_000)
- Measure before optimizing: time snippets with %%timeit and profile with %prun
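
The vectorization and chunked-reading rules above can be illustrated with a short sketch. An in-memory CSV stands in for a large file, and the column names and the small chunk size are assumptions chosen to keep the example readable:

```python
import io

import pandas as pd

df = pd.DataFrame({"units": [3, 1, 4], "price": [2.0, 5.0, 1.5]})

# Vectorized: one column-wise expression instead of looping over rows.
df["revenue"] = df["units"] * df["price"]

# Chunked reading: passing chunksize makes read_csv return an iterator of
# DataFrames, so only one chunk is held in memory at a time.
csv_text = "value\n" + "\n".join(str(i) for i in range(25))
total = 0
n_chunks = 0
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=10):
    total += int(chunk["value"].sum())
    n_chunks += 1
```

With a real file, the io.StringIO object would be replaced by a pathlib.Path and the chunk size raised to something like 10_000 rows.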
