Prompts matching the #data-engineering tag
Architect a real-time data pipeline using Apache Kafka. Components: 1. Producer sending clickstream events (JSON). 2. Kafka topic with 3 partitions for scalability. 3. Consumer group processing events in parallel. 4. Stream processing with Kafka Streams for aggregations. 5. Sink connector to write to Elasticsearch. Include error handling, exactly-once semantics, and monitoring with Kafka lag metrics.
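As a rough illustration of two ideas this prompt touches on (keyed partitioning across 3 partitions and lag monitoring), here is a minimal standard-library sketch. It does not talk to a real broker. The function names, the event fields (`user_id`, `page`, `ts_ms`), and the use of CRC32 are assumptions made for the example; Kafka's own default partitioner uses a murmur2 hash, so the partition numbers below will not match a real cluster.

```python
import json
import zlib
from typing import Dict

NUM_PARTITIONS = 3  # the prompt asks for a topic with 3 partitions


def serialize_event(user_id: str, page: str, ts_ms: int) -> bytes:
    """Encode a clickstream event as UTF-8 JSON (field names are hypothetical)."""
    event = {"user_id": user_id, "page": page, "ts_ms": ts_ms}
    return json.dumps(event, sort_keys=True).encode("utf-8")


def choose_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a record key to a partition so events with the same key stay ordered.

    CRC32 is a stand-in chosen for this sketch; it is NOT Kafka's partitioner.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def consumer_lag(end_offsets: Dict[int, int], committed: Dict[int, int]) -> Dict[int, int]:
    """Per-partition lag: log end offset minus the group's committed offset.

    A partition with no committed offset is treated as committed at 0.
    """
    return {p: max(end - committed.get(p, 0), 0) for p, end in end_offsets.items()}


if __name__ == "__main__":
    payload = serialize_event("user-1", "/home", 1700000000000)
    print(choose_partition("user-1"), payload)
    print(consumer_lag({0: 100, 1: 50, 2: 75}, {0: 90, 1: 50}))
```

Keying by user keeps each user's events in one partition, which is what lets a consumer group process partitions in parallel without reordering a single user's clicks. A lag value that keeps growing on one partition is the usual signal that a consumer is falling behind.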
Design a production-grade Airflow DAG for daily ETL. Workflow: 1. Extract data from PostgreSQL and REST API. 2. Transform using pandas (clean, join, aggregate). 3. Load to data warehouse (Snowflake/BigQuery). 4. Send Slack notification on success/failure. 5. Implement retry logic and SLA monitoring. Use TaskGroups for organization, XComs for data passing, and proper error handling with callbacks.
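The retry and callback behaviour this prompt asks for can be sketched without Airflow installed. The helper below is a plain-Python illustration, not Airflow code: `run_with_retries` and its parameters are hypothetical names that loosely mirror the Airflow task arguments `retries`, `retry_delay`, `on_success_callback`, and `on_failure_callback`. In a real DAG the scheduler handles this, and the callbacks would typically post to Slack.

```python
import time
from typing import Any, Callable, Optional


def run_with_retries(
    task: Callable[[], Any],
    retries: int = 2,
    retry_delay: float = 1.0,
    on_success: Optional[Callable[[Any], None]] = None,
    on_failure: Optional[Callable[[Exception], None]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Run `task`, retrying up to `retries` times with exponential backoff.

    `on_success` receives the result; `on_failure` receives the final
    exception once all attempts are exhausted, after which it is re-raised.
    """
    attempts = retries + 1
    for attempt in range(1, attempts + 1):
        try:
            result = task()
        except Exception as exc:
            if attempt == attempts:
                if on_failure is not None:
                    on_failure(exc)
                raise
            # Back off: retry_delay, 2 * retry_delay, 4 * retry_delay, ...
            sleep(retry_delay * 2 ** (attempt - 1))
        else:
            if on_success is not None:
                on_success(result)
            return result


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_load() -> str:
        # Fails twice, then succeeds, imitating a transient warehouse error.
        state["calls"] += 1
        if state["calls"] < 3:
            raise RuntimeError("transient error")
        return "loaded"

    print(run_with_retries(flaky_load, retries=2, retry_delay=0.01, on_success=print))
```

Passing `sleep` as a parameter is a choice made for this sketch so that the backoff can be skipped in tests.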