What is Data Observability?
Data observability, also known as monitoring, is the practice of continuously collecting metrics about your data. Typical metrics include the number of rows, columns, and properties of each dataset. You can also track metadata about a dataset, such as when it was last updated.
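To make these metrics concrete, here is a minimal sketch in Python of the kind of profile an observability check might collect. The function name `profile_rows`, the sample data, and the metric names are hypothetical and chosen for illustration; real tools gather far more than this.

```python
import csv
import io
from datetime import datetime, timezone


def profile_rows(rows, last_updated):
    """Return basic observability metrics for a list of dict rows."""
    # Collect every column name that appears in any row.
    columns = sorted({key for row in rows for key in row})
    # Count missing values (None or empty string) per column.
    null_counts = {
        col: sum(1 for row in rows if row.get(col) in (None, ""))
        for col in columns
    }
    return {
        "row_count": len(rows),
        "column_count": len(columns),
        "null_counts": null_counts,
        "last_updated": last_updated.isoformat(),
    }


# Hypothetical sample data standing in for a real table.
sample_csv = "id,email\n1,a@example.com\n2,\n3,c@example.com\n"
rows = list(csv.DictReader(io.StringIO(sample_csv)))

# In practice the timestamp would come from the warehouse or file metadata.
metrics = profile_rows(rows, datetime(2024, 1, 1, tzinfo=timezone.utc))
print(metrics)
```

Running a profile like this on a schedule and comparing each result with the previous one is what turns a one-off check into continuous monitoring.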
In the great article Choosing a Data Quality Tool, Sarah Krasnik also describes different categories of observability:
- Auto-profiling data
  - Bigeye: unique in its wide range of ML-driven automatic threshold tests and alerts
  - Datafold: unique GitHub integration presenting Data Diff between environments with custom tests
  - Monte Carlo: unique in being the most enterprise-ready, with many data lake integrations
  - Lightup: unique self-hosted deployment option, appealing to highly regulated industries
  - Metaplane: unique in its high level of configuration for a hosted tool, with both out-of-the-box and custom tests
- Pipeline testing
- Infrastructure monitoring
- A little bit of everything