Prompts matching the #pandas tag
Optimize a pandas data processing pipeline. Techniques: 1. Vectorize operations (avoid Python-level loops). 2. Use appropriate data types (e.g., int8, category). 3. Process large datasets in chunks. 4. Parallelize processing with Dask or Swifter. 5. Use efficient file formats (Parquet/Feather). 6. Profile memory usage. 7. Optimize indexes for merging. 8. Cache intermediate results. Include benchmark comparisons.
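A response to this prompt might start from something like the sketch below, which covers only three of the listed techniques: vectorization, smaller dtypes, and memory profiling. The column names (`qty`, `price`, `region`), the synthetic data, and the helper names are assumptions made up for illustration. Timings depend on the machine, so no benchmark numbers are shown.

```python
import time

import numpy as np
import pandas as pd


def make_frame(n: int = 10_000, seed: int = 0) -> pd.DataFrame:
    """Build a synthetic frame (hypothetical columns, for illustration only)."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "qty": rng.integers(0, 100, size=n),
        "price": rng.random(n) * 50,
        "region": rng.choice(["north", "south", "east", "west"], size=n),
    })


def total_loop(df: pd.DataFrame) -> list:
    """Row-by-row version: one Python-level multiplication per row."""
    return [row.qty * row.price for row in df.itertuples(index=False)]


def total_vectorized(df: pd.DataFrame) -> pd.Series:
    """Vectorized version: a single column-wise multiplication."""
    return df["qty"] * df["price"]


def shrink(df: pd.DataFrame) -> pd.DataFrame:
    """Downcast integers and turn a low-cardinality text column into a category."""
    out = df.copy()
    out["qty"] = pd.to_numeric(out["qty"], downcast="integer")
    out["region"] = out["region"].astype("category")
    return out


def timed(fn, *args) -> float:
    """Return the wall-clock seconds taken by a single call to fn."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


df = make_frame()
small = shrink(df)

# Memory profile before and after the dtype changes (bytes, including strings).
bytes_before = int(df.memory_usage(deep=True).sum())
bytes_after = int(small.memory_usage(deep=True).sum())

# Rough single-run timings; use timeit with repeats for a real benchmark.
loop_seconds = timed(total_loop, df)
vectorized_seconds = timed(total_vectorized, df)

print(f"memory: {bytes_before} -> {bytes_after} bytes")
print(f"loop: {loop_seconds:.4f}s, vectorized: {vectorized_seconds:.4f}s")
```

Chunked reads, Dask or Swifter, Parquet/Feather, and merge-index tuning are left out here because they need external files or extra dependencies.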
Build a robust data cleaning pipeline for a messy CSV dataset. Requirements: 1. Handle missing values using forward-fill, backward-fill, and mean imputation strategies. 2. Detect and remove outliers using the IQR method. 3. Standardize date formats across multiple columns. 4. Remove duplicate rows based on composite keys. 5. Generate a data quality report showing before/after statistics. Follow pandas best practices, using method chaining for readability.
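As a rough illustration of what this prompt asks for, here is a minimal method-chained sketch. The sample data, the column names (`customer_id`, `order_date`, `amount`, `status`), the two date formats, and all helper names are assumptions invented for the example. One design choice worth noting: outliers are removed before mean imputation so that extreme values do not skew the imputed mean.

```python
import numpy as np
import pandas as pd

KEYS = ["customer_id", "order_date"]  # hypothetical composite key


def parse_dates(s: pd.Series, formats=("%Y-%m-%d", "%d/%m/%Y")) -> pd.Series:
    """Try each assumed format in turn; values matching none become NaT."""
    parsed = pd.to_datetime(s, format=formats[0], errors="coerce")
    for fmt in formats[1:]:
        parsed = parsed.fillna(pd.to_datetime(s, format=fmt, errors="coerce"))
    return parsed


def drop_iqr_outliers(df: pd.DataFrame, col: str, k: float = 1.5) -> pd.DataFrame:
    """Drop rows outside [Q1 - k*IQR, Q3 + k*IQR]; keep missing values for later."""
    q1, q3 = df[col].quantile([0.25, 0.75])
    iqr = q3 - q1
    keep = df[col].between(q1 - k * iqr, q3 + k * iqr) | df[col].isna()
    return df[keep]


def quality_report(df: pd.DataFrame) -> pd.Series:
    """Summarize row count, missing cells, and duplicated composite keys."""
    return pd.Series({
        "rows": len(df),
        "missing_cells": int(df.isna().sum().sum()),
        "duplicate_keys": int(df.duplicated(subset=KEYS).sum()),
    })


def clean(df: pd.DataFrame) -> pd.DataFrame:
    return (
        df
        # Standardize dates first so the same day in two formats dedupes.
        .assign(order_date=lambda d: parse_dates(d["order_date"]))
        .drop_duplicates(subset=KEYS)
        .pipe(drop_iqr_outliers, "amount")
        .assign(
            # Mean imputation for the numeric column.
            amount=lambda d: d["amount"].fillna(d["amount"].mean()),
            # Forward-fill, then backward-fill to cover a leading gap.
            status=lambda d: d["status"].ffill().bfill(),
        )
        .reset_index(drop=True)
    )


# Made-up sample standing in for the messy CSV.
raw = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 4, 5, 6, 7],
    "order_date": ["2024-01-05", "05/01/2024", "2024-01-06", "07/01/2024",
                   "2024-01-08", "2024-01-09", "2024-01-10", "2024-01-11"],
    "amount": [10.0, 10.0, 12.0, np.nan, 11.0, 13.0, 500.0, 9.0],
    "status": [np.nan, "paid", "paid", np.nan, "open", np.nan, "paid", "open"],
})

cleaned = clean(raw)
report = pd.DataFrame({"before": quality_report(raw), "after": quality_report(cleaned)})
print(report)
```

With a real file, `raw` would come from `pd.read_csv(...)`, and the list of date formats would need to match what the data actually contains.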