Add comprehensive safety and robustness guidelines to system prompt
- Add data validation checks (empty dataframes, missing values)
- Add error handling with try-except blocks
- Add city/location validation before filtering
- Add proper handling of empty results after filtering
- Add numerical formatting (.round(2)) to avoid long decimals
- Add division by zero protection
- Add date range validation
- Add proper units formatting (μg/m³)
- Add memory management (plt.close())
- Add column name validation
These rules should make the generated code more robust against empty data, missing values, and invalid inputs.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
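
As a rough illustration of the division-by-zero, rounding, and units bullets in this commit message, generated code might follow a pattern like the one below. The helper names `safe_ratio` and `format_concentration` and the sample readings are hypothetical and do not come from src.py; this is a sketch under those assumptions, not the project's implementation.

```python
from typing import Optional

UNIT = "μg/m³"  # unit string named in the commit message


def safe_ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator rounded to 2 decimals, or None if the denominator is zero."""
    if denominator == 0:
        return None
    return round(numerator / denominator, 2)


def format_concentration(value: Optional[float]) -> str:
    """Format a value with its unit, falling back to a message when no value is available."""
    if value is None:
        return "No data available"
    return f"{value:.2f} {UNIT}"


# Hypothetical usage: mean of a few readings, guarded against an empty list.
readings = [41.237, 38.912, 45.5]
answer = format_concentration(safe_ratio(sum(readings), len(readings)))
```

Returning `None` instead of raising keeps the decision about the fallback message in one place, which matches the prompt's requirement that the final result always ends up as a string in `answer`.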
src.py
CHANGED
@@ -295,12 +295,25 @@ WHEN TO CREATE PLOTS vs TEXT ANSWERS:
 - Questions asking for comparisons of many items → PLOTS
 - Simple direct questions → TEXT ANSWERS
 
-
+SAFETY & ROBUSTNESS RULES:
+- Always check if data exists before processing: if df.empty: answer = "No data available"
+- Handle missing values: use .dropna() or .fillna() appropriately
+- Use try-except blocks for risky operations like indexing
+- Validate city/location names exist in data before filtering
+- Check for empty results after filtering: if filtered_df.empty: answer = "No data found for specified criteria"
+- Use .round(2) for numerical results to avoid long decimals
+- Handle division by zero: check denominators before division
+- Validate date ranges exist in data
+- Use proper string formatting for answers with units (μg/m³)
+
+TECHNICAL REQUIREMENTS:
 - Save final result in variable called 'answer'
 - For TEXT: Store the direct answer as a string in 'answer'
 - For PLOTS: Save with unique filename f"plot_{{uuid.uuid4().hex[:8]}}.png" and store filename in 'answer'
 - Convert numpy types to int when using as indices: int(value)
 - Always use .iloc or .loc properly for pandas indexing
+- Close matplotlib figures with plt.close() to prevent memory leaks
+- Use proper column name checks before accessing columns
 """
 
 query = f"""{system_prompt}
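
For reference, code that follows the data-validation rules added in this hunk could look roughly like the sketch below. The function name `average_for_city`, the column names `City` and `PM2.5`, and the sample rows are assumptions made for illustration; the diff does not show the real dataset schema.

```python
import pandas as pd


def average_for_city(df: pd.DataFrame, city: str,
                     city_col: str = "City", value_col: str = "PM2.5") -> str:
    """Return a text answer with the mean of value_col for one city.

    Column names are hypothetical defaults, not taken from src.py.
    """
    # Check that data exists before processing.
    if df.empty:
        return "No data available"
    # Check column names before accessing columns.
    missing = [c for c in (city_col, value_col) if c not in df.columns]
    if missing:
        return f"Missing columns: {', '.join(missing)}"
    # Validate that the city exists before filtering.
    if city not in df[city_col].unique():
        return f"City '{city}' not found in data"
    # Drop missing values, then check for an empty result after filtering.
    filtered = df.loc[df[city_col] == city].dropna(subset=[value_col])
    if filtered.empty:
        return "No data found for specified criteria"
    # Round to two decimals and attach units.
    mean_value = round(float(filtered[value_col].mean()), 2)
    return f"{mean_value} μg/m³"


# Hypothetical sample data, not taken from the project.
sample = pd.DataFrame({
    "City": ["Delhi", "Delhi", "Mumbai"],
    "PM2.5": [100.0, 101.0, None],
})
answer = average_for_city(sample, "Delhi")
```

Each guard returns early with a plain string, so whichever branch runs, the caller can assign the result directly to `answer` as the TECHNICAL REQUIREMENTS section expects.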