(Platform Spotlight and Dependent Products) Configure Hadoop to Use Pepperdata (Parcel)

This procedure is required:

  • For Platform Spotlight

  • If you’re using a dependent Pepperdata product—a product that requires Platform Spotlight to be installed and configured—in a Hadoop cluster:

    • Application Spotlight
    • Capacity Optimizer
    • Query Spotlight
    • Streaming Spotlight

If Streaming Spotlight is the only Pepperdata product that you’re installing, and the cluster is Kafka-only (no Hadoop), do not perform this procedure. Instead, skip to Configure Streaming Spotlight.

Supported versions: See the CDH and CDP Private Cloud Base entries for Pepperdata 8.1.x in the table of Supported Platforms by Pepperdata Version.

Task 1: Activate Pepperdata on a Cloudera Manager Configured Cluster

To activate Pepperdata—inject the necessary instrumentation—across a Cloudera Manager configured cluster, add Pepperdata activation environment variables or Java options to each Hadoop service’s configuration options.

Procedure

Use Cloudera Manager to add the snippets described in this procedure’s steps.

  1. Instrument the ResourceManager.

    Add the following snippet to the YARN (MR2 Included) > Configuration > ResourceManager > Java Configuration Options for ResourceManager template.

    -Dpepperdata.class.transformer=com.pepperdata.supervisor.classpatcherlib.ClassTransformerForYarn3Daemon -Dpepperdata.class.patching.enabled=true -javaagent:/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/lib/PepperdataSupervisor.jar
    
    Note: This snippet differs from the NodeManager snippet in step 2, which additionally sets -Dpepperdata.nodemanager.connector.enabled=true.
  2. Instrument the NodeManager.

    Add the following snippet to the YARN (MR2 Included) > Configuration > NodeManager > Java Configuration Options for NodeManager template.

    -Dpepperdata.nodemanager.connector.enabled=true -Dpepperdata.class.transformer=com.pepperdata.supervisor.classpatcherlib.ClassTransformerForYarn3Daemon -Dpepperdata.class.patching.enabled=true -javaagent:/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/lib/PepperdataSupervisor.jar
    
    Note: This snippet differs from the ResourceManager snippet in step 1. It additionally sets -Dpepperdata.nodemanager.connector.enabled=true.
  3. Instrument the Spark service.

    Add the following snippet to the Spark > Configuration > Gateway > Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh template.

    PD_HOME=/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR
    PEPPERDATA_ACTIVATE_SCRIPT_PATH="/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/supervisor/lib/pepperdata-activate.sh"
    # Source the Pepperdata activation script only if it exists on this host.
    if [ -e "$PEPPERDATA_ACTIVATE_SCRIPT_PATH" ]; then
     . "$PEPPERDATA_ACTIVATE_SCRIPT_PATH"
    fi
    
  4. (Spark 2) Instrument the Spark 2 service.

    Add the following snippet to the Spark 2 > Configuration > Gateway > Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh template.

    PD_HOME=/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR
    PEPPERDATA_ACTIVATE_SCRIPT_PATH="/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/supervisor/lib/pepperdata-activate.sh"
    # Source the Pepperdata activation script only if it exists on this host.
    if [ -e "$PEPPERDATA_ACTIVATE_SCRIPT_PATH" ]; then
     . "$PEPPERDATA_ACTIVATE_SCRIPT_PATH"
    fi
    
  5. Restart the following application daemons:

    • YARN ResourceManager
    • YARN NodeManagers (all in the cluster)
  6. Select the Restart action for the Pepperdata service.
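
After the restarts, one way to sanity-check that the Java options from steps 1 and 2 took effect is to look for the Pepperdata flags in a daemon's command line. The following is a minimal sketch, not a Pepperdata-provided tool. The function name has_pepperdata_flags is hypothetical, and it only inspects a command-line string that you pass in; it does not confirm that the agent loaded successfully.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Pepperdata): succeeds only if the given
# JVM command line contains the flags that steps 1 and 2 both add.
has_pepperdata_flags() {
  local cmdline="$1"
  local flag
  for flag in \
    "-Dpepperdata.class.transformer=com.pepperdata.supervisor.classpatcherlib.ClassTransformerForYarn3Daemon" \
    "-Dpepperdata.class.patching.enabled=true" \
    "-javaagent:/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/lib/PepperdataSupervisor.jar"
  do
    case "$cmdline" in
      *"$flag"*) ;;  # flag is present; continue with the next one
      *) echo "missing: $flag"; return 1 ;;
    esac
  done
  echo "ok"
}
```

On a Linux ResourceManager host, you might feed it the running daemon's arguments, for example by finding the process ID with pgrep -f org.apache.hadoop.yarn.server.resourcemanager.ResourceManager and passing the contents of /proc/PID/cmdline (with NUL bytes translated to spaces) as the argument. This assumes that the process is visible to your user account.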

Task 2: (Rarely Required) Open Port for Listening

PepAgents listen on port 50505, whether they’re running on ResourceManager hosts, as we recommend, or on NodeManager hosts.

In most environments this port is available for use and is not blocked by internal firewalls. However, in rare situations, you might need to open/unblock this port or reconfigure which port Pepperdata uses.
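
To find out whether a firewall is blocking the port, you can try a TCP connection to it from another host in the cluster. The sketch below is one possible check, not a Pepperdata utility. The function name check_port is hypothetical, and it assumes that bash and the coreutils timeout command are installed on the host where you run it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reports whether a TCP port accepts connections.
# Relies on bash's built-in /dev/tcp redirection and coreutils `timeout`.
check_port() {
  local host="$1" port="$2"
  # Reject empty or non-numeric ports before attempting a connection.
  case "$port" in
    ''|*[!0-9]*) echo "invalid port: $port" >&2; return 2 ;;
  esac
  # Try to open the connection in a child bash; give up after 3 seconds.
  if timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
    echo "open"
  else
    echo "closed or filtered"
    return 1
  fi
}
```

For example, check_port rm-host.example.com 50505 (the hostname is a placeholder) prints "open" when a PepAgent is listening and reachable. A "closed or filtered" result can mean either that a firewall is blocking the port or that no PepAgent is running on that host.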

To enable SSL support, see Configure SSL Near Real-Time Monitoring on Port 50505.

For information about accessing the stats that are provided via the Web servlets associated with this port, with either HTTP or SSL-secured HTTPS communication, see Pepperdata Status Views via Web Servlets.