Data Glossary 🧠
What is Apache Parquet?
Apache Parquet is a free and open-source, column-oriented data lake file format in the Apache Hadoop ecosystem. It is similar to RCFile and ORC, the other columnar storage file formats in Hadoop, and is compatible with most data processing frameworks in the Hadoop ecosystem.
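To make "column-oriented" concrete, here is a minimal pure-Python sketch of the idea. It is only a toy illustration and does not reproduce Parquet's actual on-disk format; the sample records and the `to_columns` helper are hypothetical names made up for this example.

```python
# Toy illustration of row-oriented vs. column-oriented layout.
# This is NOT Parquet's on-disk format; the data and names are hypothetical.

rows = [
    {"user_id": 1, "country": "DE", "amount": 9.5},
    {"user_id": 2, "country": "US", "amount": 20.0},
    {"user_id": 3, "country": "DE", "amount": 5.25},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over a single field only needs that field's values
# when the data is laid out by column.
total_amount = sum(columns["amount"])

# With the row layout, the same aggregate has to walk every record.
total_amount_rows = sum(record["amount"] for record in rows)
```

Both totals are equal; the difference is which values have to be read to compute them. That access pattern, reading a few columns out of many, is the typical analytical workload a columnar format such as Parquet is designed around.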
To learn how to build a data lake on top of Parquet, see our Data Lake and Lakehouse Guide.