Apache Hadoop with YARN: Configuring Pepperdata Activation (RPM/DEB)

Supported versions: See the Apache Hadoop entries for Pepperdata 7.1.x in the table of Supported Platforms by Pepperdata Version.

To activate Pepperdata (that is, to inject the necessary instrumentation) across a manually configured cluster, add a snippet that sources the Pepperdata activation script to each Hadoop service’s environment shell file, such as yarn-env.sh or spark-env.sh.

Procedure

  1. Beginning with any host, add the snippet below to the appropriate environment shell executable file(s), based on which services are configured to run on the host.

    • YARN ResourceManager/NodeManager: yarn-env.sh, typically located in /etc/hadoop/conf

    • Apache Spark: spark-env.sh (all hosts, including edge/client), typically located in /etc/spark/conf. If you are running multiple versions of Spark, add the snippet to the configuration file for each Spark version.

    # Source the Pepperdata activation script if it is installed on this host.
    PEPPERDATA_ACTIVATE_SCRIPT_PATH="/opt/pepperdata/supervisor/lib/pepperdata-activate.sh"
    if [ -e "$PEPPERDATA_ACTIVATE_SCRIPT_PATH" ]; then
        . "$PEPPERDATA_ACTIVATE_SCRIPT_PATH"
    fi
    
  2. Repeat step 1 on every host in the cluster.

  3. Restart the following application daemons:

    • YARN ResourceManager
    • YARN NodeManagers (all in the cluster)
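
Because step 1 must be repeated on every host, it can help to script the edit so that it is safe to rerun. The following is a minimal sketch, not a Pepperdata-provided tool: the function name add_pepperdata_activation is hypothetical, and the sketch assumes that an environment file needs the snippet only if it does not already mention PEPPERDATA_ACTIVATE_SCRIPT_PATH. It only defines the function; nothing is modified until you call it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Pepperdata): append the activation
# snippet to one environment file, at most once.
add_pepperdata_activation() {
  local env_file="$1"

  # Skip files that do not exist, for example when the service
  # is not configured on this host.
  [ -f "$env_file" ] || return 0

  # Assumption: a file that already mentions the variable has the snippet.
  if grep -q 'PEPPERDATA_ACTIVATE_SCRIPT_PATH' "$env_file"; then
    return 0
  fi

  # The quoted 'EOF' delimiter keeps "$..." literal in the appended text.
  cat >> "$env_file" <<'EOF'

PEPPERDATA_ACTIVATE_SCRIPT_PATH="/opt/pepperdata/supervisor/lib/pepperdata-activate.sh"
if [ -e "$PEPPERDATA_ACTIVATE_SCRIPT_PATH" ]; then
    . "$PEPPERDATA_ACTIVATE_SCRIPT_PATH"
fi
EOF
}

# Example call, using the typical locations listed in step 1
# (adjust the paths for your installation):
#   add_pepperdata_activation /etc/hadoop/conf/yarn-env.sh
#   add_pepperdata_activation /etc/spark/conf/spark-env.sh
```

Run the calls with sufficient privileges to write to the configuration directories, then restart the daemons as described in step 3. If a host runs multiple versions of Spark, call the function once for each version's spark-env.sh.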