What debugging and troubleshooting steps can you use to tackle Spark memory errors, such as 'executor out of memory' or 'executor running beyond its physical memory limit'?

First, analyse the data and check whether any single partition can be larger than the Spark executor memory. If so, repartition the data so that each partition is significantly smaller than the executor memory.
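A minimal PySpark sketch of this check, assuming a DataFrame read from a hypothetical path `/data/events` and an illustrative target of 400 partitions:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("repartition-example").getOrCreate()

# Hypothetical input; substitute your own source.
df = spark.read.parquet("/data/events")

# Row counts per partition, computed without pulling partition contents
# back to the driver; a heavily skewed distribution points at the culprit.
counts = df.rdd.mapPartitions(lambda it: [sum(1 for _ in it)]).collect()
print("partitions:", len(counts), "max rows in one partition:", max(counts))

# Repartition so each partition is far smaller than executor memory.
# 400 is illustrative; derive it from total data size / target partition size.
df = df.repartition(400)
```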

Then, analyse the Spark UI to understand the flow of the job and its stages, and check whether anything is wrong with that flow, such as excessive shuffle steps.
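For instance, if the UI shows a heavy shuffle coming from a join between a large table and a small lookup table, broadcasting the small side is one common fix. A sketch, assuming hypothetical `orders` and `countries` datasets:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("broadcast-join").getOrCreate()

orders = spark.read.parquet("/data/orders")        # large table (hypothetical path)
countries = spark.read.parquet("/data/countries")  # small lookup table

# Broadcasting the small side ships a full copy to every executor,
# removing the shuffle the UI would otherwise show for this join.
result = orders.join(broadcast(countries), on="country_code")
```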

Finally, try increasing the executor memory incrementally to identify how much memory the job actually needs to fit its largest data partition and complete, then settle on the smallest setting that works.
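A sketch of that tuning in PySpark, with illustrative values; `spark.executor.memory` sets the executor JVM heap, while `spark.executor.memoryOverhead` adds off-heap headroom, which is what the 'physical memory' limit checks on YARN account for:

```python
from pyspark.sql import SparkSession

# Illustrative values: raise them step by step until the failing stage
# completes, then back off to the smallest configuration that works.
spark = (
    SparkSession.builder
    .appName("memory-tuning")
    .config("spark.executor.memory", "8g")          # JVM heap per executor
    .config("spark.executor.memoryOverhead", "2g")  # off-heap headroom; too little triggers "physical memory" kills
    .getOrCreate()
)
```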