Data Glossary 🧠
What is the Modern Data Stack?
The Modern Data Stack (MDS) is a collection of open-source tools that together provide end-to-end analytics: ingestion, transformation, and ML on top of a columnar data warehouse or lake, feeding BI dashboards for analytics. The stack is extendable like Lego blocks. Usually, it consists of a data integration tool, a transformation tool, an Orchestrator, and a Business Intelligence Tool. With growing data, you might add Data Quality and observability tools, a Data Catalog, a Semantic Layer, and more.
In a way, it is unbundling the data stack, as Gorkem says:
Products start small, in time, add adjacent verticals and functionality to their offerings, and become a platform. Once these platforms become big enough, people begin to figure out how to serve better-neglected verticals or abstract out functionality to break it down into purpose-built chunks, and the unbundling starts.
The goal of an MDS is to get insight from data using the best-suited tool for each part of the stack. It’s noteworthy that MDS is a relatively new term.
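The Lego-block idea can be made concrete with a small sketch. This is only an illustration, not how any particular MDS tool works: every name here (`ingest`, `transform`, `check_quality`, `run_stack`) is hypothetical, the data is hard-coded, and each function merely stands in for a whole category of tool.

```python
from typing import Callable, Dict, List

Rows = List[Dict[str, object]]
Stage = Callable[[Rows], Rows]


def ingest(_: Rows) -> Rows:
    # Stand-in for a data integration tool: returns hard-coded raw records.
    return [
        {"customer": "a", "amount": "10.5"},
        {"customer": "b", "amount": "4.5"},
        {"customer": "a", "amount": "5.0"},
    ]


def transform(rows: Rows) -> Rows:
    # Stand-in for a transformation tool: cast amounts and total them per customer.
    totals: Dict[str, float] = {}
    for row in rows:
        customer = str(row["customer"])
        totals[customer] = totals.get(customer, 0.0) + float(str(row["amount"]))
    return [{"customer": c, "total": t} for c, t in sorted(totals.items())]


def check_quality(rows: Rows) -> Rows:
    # Optional block added as data grows: a stand-in for a data quality tool.
    for row in rows:
        if float(str(row["total"])) < 0:
            raise ValueError(f"negative total for customer {row['customer']}")
    return rows


def run_stack(stages: List[Stage]) -> Rows:
    # Stand-in for an orchestrator: runs the stages in order,
    # passing each stage's output to the next one.
    rows: Rows = []
    for stage in stages:
        rows = stage(rows)
    return rows


# A minimal stack, and the same stack extended with one more block.
report = run_stack([ingest, transform])
report_checked = run_stack([ingest, transform, check_quality])
print(report)
```

Because every stage shares the same shape (rows in, rows out), adding a block such as `check_quality` does not require changing the other stages, which is the property the Lego comparison points at.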
New Terms popping up
There are already newer terms, such as ngods (new generation open-source data stack) and DataStack 2.0, the latter from Dagster’s recent blog post.
# The Future of MDS
Looking a little into the future, Barr Moses describes in her article What’s In Store For The Future Of The Modern Data Stack? further developments such as data sharing, universal Data Governance, the Data Lake and Data Warehouse becoming equals, and a newer evolution of predictive analysis: