(Platform Spotlight and Dependent Products) Configure Pepperdata Collector (RPM/DEB)
The Pepperdata Collector (the pepcollectd daemon) collects the metrics that the other Pepperdata daemons obtain from hosts, and uploads the aggregated (collected) data to the Pepperdata dashboard.
The pepcollectd daemon should be configured and started on every host in the cluster, including the NodeManagers and the ResourceManager.
If the pepcollectd daemon is not running on a host, Pepperdata cannot collect metrics for that host or upload the host data to the Pepperdata dashboard.
This procedure is required in the following cases:
- For Platform Spotlight
- If you’re using a dependent Pepperdata product (a product that requires Platform Spotlight to be installed and configured) in a Hadoop cluster:
  - Application Spotlight
  - Capacity Optimizer
  - Query Spotlight
  - Streaming Spotlight
- If you’re installing Streaming Spotlight in a Kafka-only cluster (without Hadoop)
To omit a host from Pepperdata data collection, add the Pepperdata Collector to the host as described in this procedure, and assign a value of 0 to the PD_COLLECT_AND_UPLOAD environment variable.
If you skip the Collector configuration for a host, the Collector cannot manage the data retention and expiration, and you run the risk of filling up disk space.
To disable data collection for a host after initial installation, perform the Disable/Enable Pepperdata Data Collection for a Host procedure.
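As an illustration of the host-omission setting described above, the relevant line in /etc/pepperdata/pepperdata-config.sh might look like the following sketch. The use of `export` is an assumption; follow whatever assignment syntax your copy of the file already uses.

```shell
# Sketch only: omit this host from Pepperdata data collection.
# Assumption: the config file assigns variables with "export".
export PD_COLLECT_AND_UPLOAD=0   # 0 = omit this host; 1 = collect data from this host
```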
Procedure
- Open the Pepperdata configuration file, /etc/pepperdata/pepperdata-config.sh, for editing.
- Read the inline comments and edit the variable values for all the TODO items, as applicable for your cluster.
  Be sure to specify only one value for the PD_COLLECT_AND_UPLOAD environment variable: 1 to enable Pepperdata to collect data from the host, or 0 to omit the host from data collection.
  At a minimum, be sure to update the values of the following variables:
  - JAVA_HOME
  - PD_CLUSTER_NAME
  - PD_COLLECT_AND_UPLOAD
  - PD_LICENSE_KEY: Set the value to the location of your Pepperdata license file. For details, see Manage the License Key File.
  We recommend setting the JAVA_HOME environment variable to the JVM used by your Hadoop installation because it can be different from system-scope JVMs, especially when running a managed instance.
- Save your changes and close the file.
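After saving the file, a quick sanity check can catch a missed TODO item. The following is a minimal sketch, not a Pepperdata-provided tool: `check_pd_config` is a hypothetical helper, and it assumes the configuration file is a plain shell script that sets the four variables listed above and can be sourced by bash without side effects.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Pepperdata): verify that a config file
# sets the minimum variables and that PD_COLLECT_AND_UPLOAD is 0 or 1.
check_pd_config() {
  local config_file="$1"
  # Run in a subshell so the caller's environment is left untouched.
  (
    # Clear inherited values so only the file's own settings are checked.
    unset JAVA_HOME PD_CLUSTER_NAME PD_COLLECT_AND_UPLOAD PD_LICENSE_KEY
    # shellcheck disable=SC1090
    . "$config_file" || exit 2
    status=0
    for name in JAVA_HOME PD_CLUSTER_NAME PD_COLLECT_AND_UPLOAD PD_LICENSE_KEY; do
      # ${!name} is bash indirect expansion: the value of the variable named by $name.
      if [ -z "${!name:-}" ]; then
        echo "missing: $name"
        status=1
      fi
    done
    case "${PD_COLLECT_AND_UPLOAD:-}" in
      0|1|"") ;;  # valid, or already reported as missing
      *) echo "invalid: PD_COLLECT_AND_UPLOAD must be 0 or 1"; status=1 ;;
    esac
    exit "$status"
  )
}

# Example call (commented out so the sketch has no side effects):
# check_pd_config /etc/pepperdata/pepperdata-config.sh
```

The function returns 0 when all four variables are set and PD_COLLECT_AND_UPLOAD is 0 or 1, 1 when a variable is missing or invalid, and 2 when the file cannot be sourced. It does not check that the JAVA_HOME or license file paths actually exist.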