Performance

Building an Apache Spark Performance Lab: Tools and Techniques for Spark Optimization

Apache Spark is renowned for its speed and efficiency in handling large-scale data processing. However, optimizing Spark to achieve maximum performance requires a precise understanding of its inner workings. This blog post will guide you through establishing a Spark Performance Lab with essential tools and techniques aimed at enhancing Spark performance through detailed metrics analysis.
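
As a concrete starting point, here is a hedged sketch of instrumenting a small PySpark job with the sparkMeasure library to collect stage-level metrics. The package coordinates and version, as well as the toy workload, are assumptions to adapt to your environment.

```python
# A minimal sketch of collecting stage-level metrics with sparkMeasure
# (pip install sparkmeasure); the spark-measure package version is an assumption.
from pyspark.sql import SparkSession
from sparkmeasure import StageMetrics

spark = (SparkSession.builder
         .appName("spark-performance-lab")
         .config("spark.jars.packages",
                 "ch.cern.sparkmeasure:spark-measure_2.12:0.24")  # assumed version
         .getOrCreate())

stagemetrics = StageMetrics(spark)
stagemetrics.begin()

# The workload under test: a deliberately heavy aggregation.
spark.sql("SELECT count(*) FROM range(1000) CROSS JOIN range(100000)").show()

stagemetrics.end()
# Prints aggregated task/stage metrics: elapsed time, executor run and CPU time,
# shuffle and I/O metrics, garbage collection time, and more.
stagemetrics.print_report()
```

Collected this way, the metrics can be compared across runs, configurations, and code versions.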

Enhancing Apache Spark and Parquet Efficiency: A Deep Dive into Column Indexes and Bloom Filters

Apache Spark and Apache Parquet continue to introduce features that matter for performance: Parquet column indexes and bloom filters let selective queries skip large amounts of data at read time.
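
For a flavour of what this looks like in practice, here is a hedged sketch of writing a Parquet table from Spark with a bloom filter on one column; the column name, expected number of distinct values, and output path are illustrative placeholders, and the options need a Spark build that bundles Parquet 1.12 or later.

```python
# A sketch of enabling a Parquet bloom filter on one column at write time;
# the column name, ndv and path below are placeholders for illustration.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("parquet-bloom-filter-demo").getOrCreate()

df = spark.range(100_000_000).withColumnRenamed("id", "customer_id")

(df.write
   .mode("overwrite")
   .option("parquet.bloom.filter.enabled#customer_id", "true")
   .option("parquet.bloom.filter.expected.ndv#customer_id", "100000000")
   .parquet("/tmp/customers_bloom"))

# Column indexes (per-page min/max statistics) are written by default with
# recent Parquet versions, so a selective filter like this one can skip
# pages via the column index and entire row groups via the bloom filter.
(spark.read.parquet("/tmp/customers_bloom")
      .filter("customer_id = 12345")
      .show())
```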

Enhancing Apache Spark Performance with Flame Graphs: A Practical Example Using Grafana Pyroscope

TL;DR: Explore a step-by-step example of troubleshooting Apache Spark job performance using flame graph visualization and profiling, and discover how Grafana Pyroscope integrates with Spark for data collection and visualization.
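
As a taste of the setup, the sketch below shows one way of attaching the Pyroscope Java agent to the Spark executors from PySpark; the agent jar path, application name, and server address are assumptions, and the jar is assumed to be available at the same local path on every executor node.

```python
# A hedged sketch: load the Pyroscope Java agent in each executor JVM so that
# stack traces are continuously profiled and can be browsed as flame graphs
# in Grafana / Pyroscope. Paths and addresses below are placeholders.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("spark-flamegraph-demo")
         # The agent jar is assumed to be present at this path on all executor nodes.
         .config("spark.executor.extraJavaOptions",
                 "-javaagent:/path/to/pyroscope.jar")
         # Agent configuration passed as environment variables to the executors.
         .config("spark.executorEnv.PYROSCOPE_APPLICATION_NAME", "spark-executor-demo")
         .config("spark.executorEnv.PYROSCOPE_SERVER_ADDRESS", "http://pyroscope-server:4040")
         .getOrCreate())

# Run the workload to be profiled, then look up the flame graph in
# Grafana / Pyroscope under the configured application name.
spark.range(1_000_000_000).selectExpr("avg(sqrt(id))").show()
```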

Performance Comparison of 5 JDKs on Apache Spark

Dive into a load-testing comparison of five JDKs running the same CPU-intensive Apache Spark workload.
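
The sketch below illustrates the shape of such a test: a purely CPU-bound PySpark job that can be re-run unchanged on each JDK while recording the elapsed time. The data volume, partitioning, and hashing transformation are illustrative assumptions, not the exact workload from the post.

```python
# A hedged sketch of a CPU-bound Spark workload for comparing JDKs: hash and
# aggregate generated data, avoiding I/O so the JVM is the dominant factor.
import time

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("jdk-cpu-loadtest").getOrCreate()

start = time.time()

result = (spark.range(0, 200_000_000, numPartitions=64)   # scale to taste
          .withColumn("h", F.sha2(F.col("id").cast("string"), 256))
          .select(F.max("h"))
          .collect())

print(f"Elapsed: {time.time() - start:.1f} s, max hash prefix: {result[0][0][:16]}")
```

Running the same script several times on each JDK gives directly comparable elapsed times.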

Apache Spark 3.0 Memory Monitoring Improvements

TL;DR: Apache Spark 3.0 comes with many improvements, including new features for memory monitoring.
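
One of the additions is that per-executor peak memory values are exposed through the Spark REST API. The hedged sketch below reads them for a running application; the UI address is an assumption, and the exact set of metric names depends on the Spark version and configuration.

```python
# A hedged sketch: query the Spark REST API of a running application and
# print the peak memory metrics reported per executor (new in Spark 3.0).
import requests

base = "http://localhost:4040/api/v1"   # Spark UI address of the running app (assumed)

app_id = requests.get(f"{base}/applications").json()[0]["id"]
executors = requests.get(f"{base}/applications/{app_id}/executors").json()

for e in executors:
    peaks = e.get("peakMemoryMetrics", {})   # may be absent for idle executors
    print(e["id"],
          "JVMHeapMemory:", peaks.get("JVMHeapMemory"),
          "OnHeapExecutionMemory:", peaks.get("OnHeapExecutionMemory"))
```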

Disclaimer

The views expressed in this blog are those of the authors and cannot be regarded as representing CERN’s official position.
