Upgrade the Hadoop Distribution

To upgrade the Hadoop distribution on a cluster, follow this sequence: stop the Pepperdata agents, remove the Pepperdata software, upgrade the Hadoop distribution, reinstall the Pepperdata software, and restart the Pepperdata agents. You can begin with any host; repeat the process on every ResourceManager host and NodeManager host in your cluster.

This upgrade procedure is for Cloudera Distribution of Hadoop (CDH) and Cloudera Data Platform (CDP) Private Cloud Base.

If you want to perform an upgrade in a CDP Public Cloud environment, you must create a new environment and Data Hub cluster, and install the Pepperdata Supervisor version that you want; see Installing Pepperdata (CDP Public Cloud).

Run all commands as the root user.


  • Ensure that the currently installed version of the Pepperdata Supervisor supports the Hadoop distribution to which you’re upgrading (see Pepperdata-Platform Support). If it does not, do not use this procedure. Instead, upgrade both the distribution and Pepperdata; see Upgrade Hadoop Distribution and Pepperdata.

Task 1: Stop the Pepperdata Agents


  • In Cloudera Manager, select the Stop action for the Pepperdata service.
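
For scripted upgrades, the same Stop action can in principle be issued through the Cloudera Manager REST API instead of the UI. The following is a minimal sketch, not a Pepperdata-documented procedure. It assumes the API's service command endpoint (`POST .../clusters/{cluster}/services/{service}/commands/{command}`). The host, port, API version, cluster name, and service name in the usage line are hypothetical placeholders, and the credentials are read from the assumed environment variables `CM_USER` and `CM_PASSWORD`.

```shell
#!/usr/bin/env bash

# Build the Cloudera Manager REST API URL for a service-level command.
# Arguments: base URL, API version, cluster name, service name, command.
# Names containing spaces or special characters must be URL-encoded by the caller.
cm_service_command_url() {
  local base_url="$1" api_version="$2" cluster="$3" service="$4" command="$5"
  printf '%s/api/%s/clusters/%s/services/%s/commands/%s\n' \
    "${base_url%/}" "$api_version" "$cluster" "$service" "$command"
}

# Send the command as an authenticated POST request.
# CM_USER and CM_PASSWORD are assumed to be set in the environment.
cm_service_command() {
  curl --silent --show-error --fail \
    --user "${CM_USER}:${CM_PASSWORD}" \
    --request POST \
    "$(cm_service_command_url "$@")"
}

# Example call (all values are placeholders; check the real service name in Cloudera Manager):
#   cm_service_command https://cm.example.com:7183 v41 cluster1 pepperdata stop
```

Passing `start` instead of `stop` as the last argument issues the corresponding Start action.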

Task 2: Remove/Deactivate the Old Pepperdata Supervisor


  • In Cloudera Manager, deactivate the existing (old) Pepperdata Supervisor parcel. (For details, see the Cloudera documentation for your version of Cloudera Manager.)

Task 3: Upgrade Your Hadoop Distribution


  1. Upgrade the Hadoop distribution according to the distribution’s instructions.

  2. Verify that the snippet below is still included in the appropriate template(s), based on which services are configured to run on the host.

    • YARN ResourceManager/NodeManager: YARN > Configs > Advanced > Advanced yarn-env > yarn-env template
    • HBase Master or HBase RegionServer: HBase > Configs > Advanced > Advanced hbase-env > hbase-env template
    • Apache Spark: Spark > Configs > Advanced > Advanced spark-env > spark-env template
    • Apache Spark 2: Spark 2 > Configuration > Gateway > Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh

Task 4: Reinstall the Pepperdata Software


The Cloudera Manager (CM) agent creates the pepperdata user and the pepperdata log directories when the parcel is activated and when the Pepperdata service is added. Both operations require the CM agent to run as the root user, which in turn requires that one of the following permissions was granted during the initial CM installation:

  • Access to the root user account using a password or SSH key file.

  • Passwordless sudo access for a specific user.


  1. Download the following artifacts from the Downloads page to any local directory, and copy them to the Cloudera Manager Server.

  2. Extract the contents of the TGZ archives and move the files as follows:

    • Move the parcel (the *.parcel file) and corresponding SHA checksum file (*.parcel.sha) to the /opt/cloudera/parcel-repo directory.
    • Move the CSD JAR file to the /opt/cloudera/csd directory.
  3. Restart the Cloudera Service and Configuration Manager (SCM) server (service: cloudera-scm-server).

    service cloudera-scm-server restart

    After the restart, the new parcels and the Pepperdata service (in the CSD JAR file) are available for activation.

  4. In Cloudera Manager, distribute and activate the Pepperdata Supervisor parcel (the *.parcel file).
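
The extract-and-move step (step 2 above) can be sketched as a small shell function. This is an illustrative sketch only: the function name `stage_pepperdata_artifacts` is hypothetical, and it assumes that the downloaded archives end in `.tgz` or `.tar.gz` and that the only JAR file inside them is the CSD JAR. The target directories default to the paths named in the procedure and can be overridden for a dry run.

```shell
#!/usr/bin/env bash

# Usage: stage_pepperdata_artifacts DOWNLOAD_DIR [PARCEL_REPO_DIR] [CSD_DIR]
stage_pepperdata_artifacts() {
  local download_dir="$1"
  local parcel_repo="${2:-/opt/cloudera/parcel-repo}"
  local csd_dir="${3:-/opt/cloudera/csd}"
  local work_dir archive

  # Extract into a scratch directory so nothing partial lands in the targets.
  work_dir="$(mktemp -d)" || return 1

  for archive in "$download_dir"/*.tgz "$download_dir"/*.tar.gz; do
    [ -e "$archive" ] || continue   # skip patterns that matched nothing
    tar -xzf "$archive" -C "$work_dir" || return 1
  done

  # The parcel and its SHA checksum file go to the parcel repository.
  find "$work_dir" -type f \( -name '*.parcel' -o -name '*.parcel.sha' \) \
    -exec mv {} "$parcel_repo"/ \;

  # The CSD JAR goes to the CSD directory (assumes it is the only JAR present).
  find "$work_dir" -type f -name '*.jar' -exec mv {} "$csd_dir"/ \;

  rm -rf "$work_dir"
}
```

Run as root on the Cloudera Manager Server, `stage_pepperdata_artifacts /path/to/downloads` stages the files in the default directories; the cloudera-scm-server restart in step 3 then follows.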

Task 5: Restart Hadoop YARN Services


  1. In Cloudera Manager, navigate to your cluster’s YARN (MR2 Included) service > Instances, select all ResourceManager and NodeManager hosts, and from the Actions for Selected menu, select Restart.

  2. (If using HBase) Navigate back to the cluster view, and for the HBase service, select the Restart action.

Task 6: Restart the Pepperdata Agents


  • In Cloudera Manager, select the Start action for the Pepperdata service.