(Platform Spotlight and Dependent Products) Configure Hadoop to Use Pepperdata (Parcel)
This procedure is required:
- For Platform Spotlight
- If you’re using a dependent Pepperdata product (a product that requires Platform Spotlight to be installed and configured) in a Hadoop cluster:
  - Application Spotlight
  - Capacity Optimizer
  - Query Spotlight
  - Streaming Spotlight
Supported versions: See the CDH and CDP Private Cloud Base entries for Pepperdata 7.0.x in the table of Supported Platforms by Pepperdata Version
Task 1: Activate Pepperdata on a Cloudera Manager Configured Cluster
To activate Pepperdata—inject the necessary instrumentation—across a Cloudera Manager configured cluster, add Pepperdata activation environment variables or Java options to each Hadoop service’s configuration options.
Procedure
Use Cloudera Manager to add the snippets described in this procedure’s steps.
- Instrument the ResourceManager.
Add the following snippet to the YARN (MR2 Included) > Configuration > ResourceManager > Java Configuration Options for ResourceManager template.
Important: Enter all values on a single line.
    -Dpepperdata.class.transformer=com.pepperdata.supervisor.classpatcherlib.ClassTransformerForYarn3Daemon -Dpepperdata.class.patching.enabled=true -javaagent:/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/lib/PepperdataSupervisor.jar
The snippets are different for instrumenting the ResourceManager and NodeManager.
- Instrument the NodeManager.
Add the following snippet to the YARN (MR2 Included) > Configuration > NodeManager > Java Configuration Options for NodeManager template.
Important: Enter all values on a single line.
    -Dpepperdata.nodemanager.connector.enabled=true -Dpepperdata.class.transformer=com.pepperdata.supervisor.classpatcherlib.ClassTransformerForYarn3Daemon -Dpepperdata.class.patching.enabled=true -javaagent:/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/lib/PepperdataSupervisor.jar
The snippets are different for instrumenting the NodeManager and ResourceManager.
- Instrument the Spark service.
Add the following snippet to the Spark > Configuration > Gateway > Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh template.
Important: Add the snippet to the end of the template.
This ensures that the activation script’s variable appends (YARN_NODEMANAGER_OPTS, YARN_RESOURCEMANAGER_OPTS, HBASE_REGIONSERVER_OPTS, and SPARK_SUBMIT_OPTS) are not overwritten by other assignments in the template.
    PD_HOME=/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR
    PEPPERDATA_ACTIVATE_SCRIPT_PATH="/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/supervisor/lib/pepperdata-activate.sh"
    if [ -e $PEPPERDATA_ACTIVATE_SCRIPT_PATH ]; then
      . $PEPPERDATA_ACTIVATE_SCRIPT_PATH
    fi
- (Spark 2) Instrument the Spark 2 service.
Add the following snippet to the Spark 2 > Configuration > Gateway > Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh template.
Important: Add the snippet to the end of the template.
This ensures that the activation script’s variable appends (YARN_NODEMANAGER_OPTS, YARN_RESOURCEMANAGER_OPTS, HBASE_REGIONSERVER_OPTS, and SPARK_SUBMIT_OPTS) are not overwritten by other assignments in the template.
    PD_HOME=/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR
    PEPPERDATA_ACTIVATE_SCRIPT_PATH="/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/supervisor/lib/pepperdata-activate.sh"
    if [ -e $PEPPERDATA_ACTIVATE_SCRIPT_PATH ]; then
      . $PEPPERDATA_ACTIVATE_SCRIPT_PATH
    fi
- Restart the following application daemons:
  - YARN ResourceManager
  - YARN NodeManagers (all in the cluster)
- Select the Restart action for the Pepperdata service.
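After the restarts, it can be useful to confirm that the Java options from the ResourceManager and NodeManager steps actually reached the running daemons. The following is a minimal sketch, not a Pepperdata-provided tool. `has_pepperdata_agent` and `check_daemon` are hypothetical helper names. The sketch assumes a Linux host where `pgrep -af` (procps-ng) prints full command lines, and that the daemon's main class name appears on its command line.

```shell
# Hypothetical helper (not part of Pepperdata): succeeds if the given
# command-line string carries the agent options added in this procedure.
has_pepperdata_agent() {
  cmdline=$1
  # The -javaagent entry must point at the parcel's supervisor jar.
  case $cmdline in
    *"-javaagent:/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/lib/PepperdataSupervisor.jar"*) ;;
    *) return 1 ;;
  esac
  # Class patching must be switched on as well.
  case $cmdline in
    *"-Dpepperdata.class.patching.enabled=true"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper: look up the first process whose command line matches
# the pattern and report whether it is instrumented.
# Assumption: pgrep supports -a (list full command line) and -f (match on it).
check_daemon() {
  pattern=$1
  cmdline=$(pgrep -af "$pattern" | head -n 1)
  if [ -z "$cmdline" ]; then
    echo "not running: $pattern"
    return 2
  fi
  if has_pepperdata_agent "$cmdline"; then
    echo "instrumented: $pattern"
  else
    echo "NOT instrumented: $pattern"
    return 1
  fi
}
```

On a ResourceManager host, for example, this could be called as `check_daemon org.apache.hadoop.yarn.server.resourcemanager.ResourceManager`. A "NOT instrumented" result usually suggests that the snippet was not saved in the right template or that the daemon has not been restarted since the change.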
Task 2: (Rarely Required) Open Ports for Listening
Supervisor listens on port 50510 for communication on the ResourceManager host.
PepAgents listen on port 50505, whether they’re running on ResourceManager hosts, as we recommend, or on NodeManager hosts.
In most environments these ports are available for use and are not blocked by internal firewalls. However, in rare situations, you might need to open/unblock these ports or reconfigure which port Pepperdata uses.
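To see whether something is already listening on these ports, a check along the following lines may help. This is a sketch only: it assumes a Linux host with `ss` (from iproute2) installed, and `port_is_listening` and `check_pepperdata_ports` are hypothetical helper names, not Pepperdata commands.

```shell
# Hypothetical helper: reads `ss -ltn`-style output on stdin and succeeds
# if any local address (4th column) ends in ":PORT".
port_is_listening() {
  port=$1
  awk -v p="$port" '
    { n = split($4, parts, ":"); if (parts[n] == p) found = 1 }
    END { exit !found }
  '
}

# Hypothetical wrapper: report the state of both default Pepperdata ports.
# Assumption: `ss -ltn` is available and lists listening TCP sockets.
check_pepperdata_ports() {
  for port in 50510 50505; do
    if ss -ltn | port_is_listening "$port"; then
      echo "port $port: a listener is present"
    else
      echo "port $port: no listener"
    fi
  done
}
```

Run before Pepperdata is started, a listener on either port points to a conflict with another service. Run afterward, a listener is the expected result. The check says nothing about firewalls between hosts, which have to be tested from a remote machine.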
If port 50510 or port 50505 is used by another service, you can reconfigure which port Pepperdata uses by redefining the pepperdata.supervisor.rpc.server.port and pepperdata.agent.rpc.server.port properties, respectively, in the Pepperdata site file, pepperdata-site.xml.
- After you reconfigure the pepperdata.supervisor.rpc.server.port property (default=50510), restart the ResourceManagers.
- After you reconfigure the pepperdata.agent.rpc.server.port property (default=50505), restart the PepAgents.
(By default, the Pepperdata site file, pepperdata-site.xml, is located in /etc/pepperdata. If you customized the location, the file is specified by the PD_CONF_DIR environment variable. See Change the Location of pepperdata-site.xml for details.)
To enable SSL support, see Configure SSL Near Real-Time Monitoring on Ports 50510 and 50505.
For information about accessing the stats that are provided via the Web servlets associated with these ports, with either HTTP or SSL-secured HTTPS communication, see Pepperdata Status Views via Web Servlets.
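When checking which ports are actually in effect, the site-file lookup described above can be sketched in shell. This is an illustration under stated assumptions, not an official utility: it assumes that `PD_CONF_DIR`, when set, names the directory that contains pepperdata-site.xml, and that the file uses Hadoop-style `<property><name>…</name><value>…</value></property>` entries. `pepperdata_site_file` and `pepperdata_port` are hypothetical helper names.

```shell
# Assumption: PD_CONF_DIR, when set, is the directory holding the site file;
# otherwise the default location /etc/pepperdata applies.
pepperdata_site_file() {
  printf '%s/pepperdata-site.xml\n' "${PD_CONF_DIR:-/etc/pepperdata}"
}

# Hypothetical helper: print the value configured for a property, or the
# supplied default when the file or the property is missing.
# Assumption: Hadoop-style <name>/<value> pairs; this is a plain text match,
# not a real XML parse, so commented-out entries are not filtered.
pepperdata_port() {
  name=$1
  default=$2
  file=$3
  value=
  if [ -r "$file" ]; then
    # Join the file into one line so name and value may sit on separate lines.
    value=$(tr -d '\n' < "$file" |
      sed -n "s:.*<name>$name</name>[[:space:]]*<value>\([^<]*\)</value>.*:\1:p")
  fi
  printf '%s\n' "${value:-$default}"
}
```

For example, `pepperdata_port pepperdata.supervisor.rpc.server.port 50510 "$(pepperdata_site_file)"` prints the overridden Supervisor port if one is defined, and 50510 otherwise.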