Random notes from crashing and hanging EMR Spark job

It sometimes happens that your EMR job crashes or hangs indefinitely with no meaningful log. You can try to capture memory dump but it is not very useful when your cluster machines have hundreds gigabytes of memory each. Below are “fixes” which worked for me. If it just crashes with lost slave or lost task, … Continue reading Random notes from crashing and hanging EMR Spark job