Lightning talk presented at HPTS 2015: http://hpts.ws/ Apache Parquet is the de facto standard columnar storage for big data. Open source and proprietary SQL engines already integrate with it as their users donāt want to load and duplicate their data in every tool. Users want an open, interoperable, efficient format to experiment with the many options they have. The format is defined by the open source community integrating feedback from many teams working on query engines (including but not limited to Impala, Drill, Hawq, SparkSQL, Presto, Hive, etc) or on infrastructure at scale (Twitter, Netflix, Stripe, Criteo, ...). Building on its initial success, the Parquet community is defining new features for the next iteration of the format. For example: improved metadata layout, type system completude or mergeable statistics used for planning.