Understanding Apache Spark Failures and Bottlenecks

When everything goes according to plan, it's easy to write and understand applications in Apache Spark. However, sometimes a well-tuned application might fail due to a data change or a data layout change — or an application that had been running well so far, might start behaving badly due to resource starvation. It's important to understand underlying runtime components like disk usage, network usage, contention, and so on, so that we can make an informed decision when things go bad.

Understanding Apache Spark Failures and Bottlenecks

Categories

Tags