Advanced Spark Job Monitoring (Cloud)

By default, the Pepperdata PepAgent monitors container-launched Spark jobs where the Spark driver and the PepAgent run on the same host. You can, however, run container-launched Spark jobs whose driver sends its metrics data to a PepAgent on a remote host (a host other than the one where the Spark driver runs). To enable such monitoring for a container-launched Spark job, add the following Pepperdata configuration override to the launch command: --conf spark.force.data.toRemoteHost=true.

Procedure

  • Add the following Pepperdata configuration override to the launch command, whether the command is entered on the command line, scripted, or supplied through a proprietary framework's user interface (see the example sketch after this step):

    --conf spark.force.data.toRemoteHost=true
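
    For example, a spark-submit launch command with the override might look like the following. This is a minimal sketch, assuming a YARN cluster-mode submission of the standard SparkPi example application; the class name, jar path, and argument are placeholders, not required values.

        # Placeholder spark-submit command; only the --conf override is Pepperdata-specific
        spark-submit \
          --master yarn \
          --deploy-mode cluster \
          --class org.apache.spark.examples.SparkPi \
          --conf spark.force.data.toRemoteHost=true \
          /path/to/spark-examples.jar 1000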