Pinterest launched a next-generation CDC-based database ingestion framework using Kafka, Flink, Spark, and Iceberg. The system reduces data availability latency from 24+ hours to 15 minutes, processes ...
A Python-based workflow for profiling data and assessing data quality in SQLite databases, powered by CrewAI. This project automatically generates SQL queries for profiling, identifies data quality ...