Avro

Avro

Performance comparison of different file formats and storage engines in the Hadoop ecosystem

Submitted by zbaranow on

TOPIC

 

This post reports performance tests for a few popular data formats and storage engines available in the Hadoop ecosystem: Apache Avro, Apache Parquet, Apache HBase and Apache Kudu. This exercise evaluates space efficiency, ingestion performance, analytic scans and random data lookup for a workload of interest at CERN Hadoop service.

 

 

INTRO

 

Disclaimer

The views expressed in this blog are those of the authors and cannot be regarded as representing CERN’s official position.

CERN Social Media Guidelines

 

Blogroll