(Platform Spotlight and Dependent Products) Configure Pepperdata Collector (RPM/DEB)

The Pepperdata Collector—the pepcollectd daemon—gathers the metrics that the other Pepperdata daemons obtain from hosts, and uploads the aggregated (collected) data to the Pepperdata dashboard. The pepcollectd daemon must be configured and started on every host in the cluster, including the NodeManager and ResourceManager hosts. If the pepcollectd daemon is not running on a host, Pepperdata cannot collect metrics for that host or upload the host's data to the Pepperdata dashboard.

This procedure is required:

  • For Platform Spotlight

  • If you’re using a dependent Pepperdata product—a product that requires Platform Spotlight to be installed and configured—in a Hadoop cluster:
    • Application Spotlight
    • Capacity Optimizer
    • Query Spotlight
    • Streaming Spotlight
  • If you’re installing Streaming Spotlight in a Kafka-only cluster (without Hadoop)

To omit a host from Pepperdata data collection, still add the Pepperdata Collector to the host as described in this procedure, but assign a value of 0 to the PD_COLLECT_AND_UPLOAD environment variable. If you skip the Collector configuration for a host entirely, nothing manages the retention and expiration of the metrics data on that host, and you risk filling up its disk space.
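For example, to install the Collector on a host but exclude that host from data collection, the relevant line in /etc/pepperdata/pepperdata-config.sh would look like the following sketch (the comments are illustrative, not part of the shipped file):

```shell
# /etc/pepperdata/pepperdata-config.sh (excerpt)
# 0 = do not collect metrics from, or upload data for, this host.
# The Collector still runs and manages metrics-file retention/expiration.
export PD_COLLECT_AND_UPLOAD=0
```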

To disable data collection for a host after initial installation, perform the Disable/Enable Pepperdata Data Collection for a Host procedure.


  1. Open the Pepperdata configuration file, /etc/pepperdata/pepperdata-config.sh, for editing.

  2. Read the inline comments and edit the variable values for all the TODO items, as applicable for your cluster.

    Be sure to assign exactly one of two values to the PD_COLLECT_AND_UPLOAD environment variable: 1 to enable Pepperdata to collect data from the host, or 0 to omit the host from data collection.

    At a minimum, be sure to update the following variables’ values:

    • PD_LICENSE_KEY: Set the value to the location of your Pepperdata license file. For details, see Manage the License Key File.

    • JAVA_HOME: We recommend setting the value to the JVM used by your Hadoop installation, because it can differ from the system-scope JVM, especially when you're running a managed instance.

  3. Save your changes and close the file.
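As a reference point, a minimally edited configuration file might contain lines like the following. This is a sketch: the license-file path and JVM path are example values—substitute the locations used on your own hosts.

```shell
# /etc/pepperdata/pepperdata-config.sh (excerpt)

# Location of your Pepperdata license file (example path; use your own).
export PD_LICENSE_KEY=/etc/pepperdata/pepperdata.license

# 1 = collect data from this host and upload it; 0 = omit this host.
export PD_COLLECT_AND_UPLOAD=1

# Point JAVA_HOME at the JVM your Hadoop installation uses
# (example path; it may differ from the system-scope JVM).
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
```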