Upgrade the Hadoop Distribution

To upgrade the Hadoop distribution on a cluster, start with any host: stop the Pepperdata agents, remove the Pepperdata software, upgrade the Hadoop distribution, reinstall the Pepperdata software, restart the Hadoop YARN services, and restart the Pepperdata agents. Repeat this process on every ResourceManager host, NodeManager host, and Edge/Job Submission host in your cluster.

On This Page

Task 1: Stop the Pepperdata Agents
Task 2: Remove/Deactivate the Old Pepperdata Supervisor
Task 3: Upgrade Your Hadoop Distribution
Task 4: Reinstall the Pepperdata Software
Task 5: Restart Hadoop YARN Services
Task 6: Restart the Pepperdata Agents

Important: Be sure to repeat all the tasks of the upgrade process on every ResourceManager host, NodeManager host, and Edge/Job Submission host in your cluster.

Run all commands as the root user.

Task 1: Stop the Pepperdata Agents

Procedure

Stop the PepAgent. You can use either the service (if provided by your OS) or systemctl command:
- sudo service pepagentd stop
- sudo systemctl stop pepagentd
Stop the Pepperdata Collector. You can use either the service (if provided by your OS) or systemctl command:
- sudo service pepcollectd stop
- sudo systemctl stop pepcollectd
(Only if running PepMetrics; otherwise not applicable. Do not restart PepMetrics, because it is now unsupported and unneeded.)
1. Stop the PepMetrics agent.
  
  You can use either the service (if provided by your OS) or systemctl command:
  - sudo service pepmetricsd stop
  - sudo systemctl stop pepmetricsd
2. Disable the PepMetrics agent.
  
  You can use either the chkconfig (if provided by your OS) or systemctl command:
  - sudo chkconfig pepmetricsd off
  - sudo systemctl disable pepmetricsd
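
The stop steps above all follow one pattern: the same action expressed for either `service` or `systemctl`. As a minimal sketch, the helpers below pick whichever tool is on the PATH and build the matching stop command. The function names `pick_init_tool` and `stop_command_for` are hypothetical and not part of the Pepperdata software, and preferring `systemctl` when both tools exist is an assumption.

```shell
#!/bin/sh
# Hypothetical helpers; not part of the Pepperdata software.

# Print "systemctl" or "service", depending on which command is on the PATH.
# Assumption: systemctl is preferred when both are available.
pick_init_tool() {
    if command -v systemctl >/dev/null 2>&1; then
        echo systemctl
    elif command -v service >/dev/null 2>&1; then
        echo service
    else
        echo "neither systemctl nor service found" >&2
        return 1
    fi
}

# Print the stop command for a given tool and service name, without running it.
stop_command_for() {
    tool=$1
    svc=$2
    case "$tool" in
        systemctl) echo "systemctl stop $svc" ;;
        service)   echo "service $svc stop" ;;
        *)         return 1 ;;
    esac
}
```

For example, `sudo $(stop_command_for "$(pick_init_tool)" pepagentd)` would stop the PepAgent. Printing the command first makes it easy to review before you run it on each host.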

Task 2: Remove/Deactivate the Old Pepperdata Supervisor

If you are not upgrading your Hadoop distribution, or you are upgrading your Hadoop distribution to a version that is compatible with your current Pepperdata package, removing the Pepperdata package is optional. If you want the ability to quickly roll back to your previously installed Pepperdata version, you can skip this procedure and leave the original version installed. When you install the new Pepperdata package, it will not overwrite any existing Pepperdata packages.

Warning: If you are upgrading your Hadoop distribution and do not want to remove the Pepperdata package, be sure that the Pepperdata package is compatible with your new Hadoop distribution. If they are incompatible, Pepperdata and/or Hadoop might fail to start.

Procedure

Depending on the management of the cluster, remove the RPM/DEB package by running the appropriate command for your environment or by using site-specific administrative tools.
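
Which command is appropriate depends on the package format. As a rough sketch, the helpers below list installed packages whose names contain `pepperdata` and build the matching removal command. The function names are hypothetical, and the name filter is an assumption about how the installed package is named. Sites that use configuration management or a different package front end should use those tools instead.

```shell
#!/bin/sh
# Hypothetical helpers; not part of the Pepperdata software.

# List installed packages whose names contain "pepperdata" (assumed naming).
list_pepperdata_packages() {
    if command -v rpm >/dev/null 2>&1; then
        rpm -qa | grep -i pepperdata
    elif command -v dpkg-query >/dev/null 2>&1; then
        dpkg-query -W -f='${Package}\n' | grep -i pepperdata
    fi
}

# Print the removal command for a package format ("rpm" or "deb") and name.
remove_command_for() {
    case "$1" in
        rpm) echo "rpm -e $2" ;;
        deb) echo "dpkg -r $2" ;;
        *)   echo "unknown package format: $1" >&2; return 1 ;;
    esac
}
```

Run `list_pepperdata_packages` first, then pass the exact name it reports to `remove_command_for` and run the resulting command as root.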

Task 3: Upgrade Your Hadoop Distribution

Procedure

Upgrade the Hadoop distribution according to the distribution’s instructions.
Verify that the snippet below is still included in the appropriate template(s), based on which services are configured to run on the host.
- YARN ResourceManager/NodeManager: YARN > Configs > Advanced > Advanced yarn-env > yarn-env template
- HBase Master or HBase RegionServer: HBase > Configs > Advanced > Advanced hbase-env > hbase-env template
- Apache Spark: Spark > Configs > Advanced > Advanced spark-env > spark-env template
- Apache Spark 2: Spark 2 > Configuration > Gateway > Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh
Important: Add the snippet to the end of the template(s).

This ensures that the activation script’s variable appends (YARN_NODEMANAGER_OPTS, YARN_RESOURCEMANAGER_OPTS, HBASE_REGIONSERVER_OPTS, and SPARK_SUBMIT_OPTS) are not overwritten by other assignments in the template(s).
```
PEPPERDATA_ACTIVATE_SCRIPT_PATH="/opt/pepperdata/supervisor/lib/pepperdata-activate.sh"
if [ -e "$PEPPERDATA_ACTIVATE_SCRIPT_PATH" ]; then
 . "$PEPPERDATA_ACTIVATE_SCRIPT_PATH"
fi
```
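
One way to check that nothing after the snippet overwrites those variables is sketched below. The function name `check_activate_snippet` is hypothetical, and the sketch assumes you have a rendered copy of the template as a file on the host (the exact path depends on your distribution). The check is deliberately conservative: it flags any later assignment to one of the four variables, even one that only appends.

```shell
#!/bin/sh
# Hypothetical helper; not part of the Pepperdata software.

# Exit 0 if the file references pepperdata-activate.sh and no later line
# assigns one of the four *_OPTS variables; exit 1 otherwise.
check_activate_snippet() {
    awk '
        /pepperdata-activate\.sh/ { seen = 1; next }
        seen && /^[ \t]*(export[ \t]+)?(YARN_NODEMANAGER_OPTS|YARN_RESOURCEMANAGER_OPTS|HBASE_REGIONSERVER_OPTS|SPARK_SUBMIT_OPTS)=/ { clobbered = 1 }
        END { exit (seen && !clobbered) ? 0 : 1 }
    ' "$1"
}
```

A non-zero exit status means either the snippet is missing or an assignment follows it, so the template needs a manual look.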

Task 4: Reinstall the Pepperdata Software

A single RPM/DEB package contains all the Pepperdata products.

Procedure

Obtain the appropriate Pepperdata <supervisor-package-name> RPM/DEB package. There are version-specific Pepperdata packages for some Hadoop versions. In such cases, the Pepperdata package name includes the Hadoop version number. See the Downloads page.

Depending on the management of the cluster, install the RPM/DEB package by running the appropriate command for your environment or by using site-specific administrative tools.
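
As an illustration of choosing the command by package type, the hypothetical helper below builds an install command from the package file's extension. Using `rpm -ivh` (install) rather than an upgrade option is an assumption, and the file names in the usage example are placeholders, not real Pepperdata package names.

```shell
#!/bin/sh
# Hypothetical helper; not part of the Pepperdata software.

# Print the install command that matches the package file's extension.
install_command_for() {
    case "$1" in
        *.rpm) echo "rpm -ivh $1" ;;
        *.deb) echo "dpkg -i $1" ;;
        *)     echo "unrecognized package file: $1" >&2; return 1 ;;
    esac
}
```

For example, `install_command_for /tmp/<supervisor-package-name>.rpm` prints the `rpm` form of the command, which you can then review and run as root.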

The table describes the locations of the Pepperdata files after you install the package. Except for the primary installation target, the locations are created by symlinks.

| Directory | Description |
|---|---|
| `/opt/pepperdata/supervisor-<your-version>` | Primary installation target, containing many subdirectories and files |
| `/opt/pepperdata/lib/` | JAR and library files |
| `/etc/init.d/` | Initialization scripts |
| `/etc/pepperdata/` | Configuration files, configuration templates, and site-specific configuration files |

Merge the contents of the new Pepperdata configuration file template, /etc/pepperdata/pepperdata-config.sh-template, into your existing Pepperdata configuration file, /etc/pepperdata/pepperdata-config.sh.
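
Before merging, it can help to see which lines the new template contains that your existing configuration does not. The sketch below, with the hypothetical function name `template_only_lines`, prints those lines using exact whole-line comparison. It is only a starting point for a manual merge and does not modify either file.

```shell
#!/bin/sh
# Hypothetical helper; not part of the Pepperdata software.

# Print lines that appear in the template but not, verbatim, in the config.
# Exits 1 when there are no such lines (standard grep behavior).
template_only_lines() {
    template=$1
    config=$2
    # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file
    grep -Fxv -f "$config" "$template"
}
```

A typical call would be `template_only_lines /etc/pepperdata/pepperdata-config.sh-template /etc/pepperdata/pepperdata-config.sh`. Lines that differ only in their values also show up, which is usually what you want to review.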

If the installation fails on any host, contact Pepperdata Support.

Task 5: Restart Hadoop YARN Services

Note: If you are using a management framework such as Cloudera Manager or Ambari, it might ask you to restart additional services (such as Hive and Pig). In this case, follow the recommendations of your management framework.

Procedure

Restart the affected YARN services:
- YARN ResourceManager
- YARN NodeManagers
(If using HBase) Restart the HBase RegionServers.
(If using Query Spotlight) On every Hive server, restart the hiveserver2 service.

Task 6: Restart the Pepperdata Agents

Procedure

Start the Pepperdata Collector.

You can use either the service (if provided by your OS) or systemctl command:
- sudo service pepcollectd start
- sudo systemctl start pepcollectd
If any of the process’s startup checks fail, an explanatory message appears and the process does not start. Address the issues and try again to start the process.
Start the PepAgent.

You can use either the service (if provided by your OS) or systemctl command:
- sudo service pepagentd start
- sudo systemctl start pepagentd
If the service successfully starts, the Pepperdata Version [VERSION-STRING] OK message appears in the /var/log/pepperdata/pepagent/pepagent.log file.

If any of the process’s startup checks fail, an explanatory message appears and the process does not start. Address the issues and try again to start the process.
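
A quick way to confirm the startup message is to search the log for it. The sketch below uses a hypothetical function name, `pepagent_started_ok`, and a pattern that assumes the message appears on one line as `Pepperdata Version`, then the version string, then `OK`. It checks only that such a line exists, so an entry from an earlier start would also match.

```shell
#!/bin/sh
# Hypothetical helper; not part of the Pepperdata software.

# Exit 0 if the log contains a "Pepperdata Version ... OK" line.
# The default path is the log file named in this procedure.
pepagent_started_ok() {
    log=${1:-/var/log/pepperdata/pepagent/pepagent.log}
    grep -q 'Pepperdata Version .* OK' "$log"
}
```

For example, `pepagent_started_ok && echo "PepAgent reported OK"` prints a confirmation only when the message is present.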

Important: Be sure to repeat the preceding upgrade tasks on every ResourceManager host, NodeManager host, and Edge/Job Submission host in your cluster.