Upgrade Hadoop Distribution and Pepperdata
Supported Pepperdata upgrade paths: from any earlier Pepperdata 6.x version
To upgrade both a cluster’s Hadoop distribution and the Pepperdata software, you can begin with any host. Stop the Pepperdata agents, remove the Pepperdata software, upgrade the Hadoop distribution and the Pepperdata software, and then restart the Hadoop services and the Pepperdata agents.
If you want to perform an upgrade in a CDP Public Cloud environment, you must create a new environment and Data Hub cluster, and install the Pepperdata Supervisor version that you want; see Installing Pepperdata (CDP Public Cloud).
On This Page
- Task 1: Stop the Pepperdata Agents
- Task 2: Upgrade Your Hadoop Distribution
- Task 3: Install the New Pepperdata Supervisor Parcel and/or CSD
- Task 4: (Upgrades from Supervisor v6.2 or earlier) Update Paths to Pepperdata JARs and Scripts
- Task 5: Restart the Required Services
- Task 6: Restart the Pepperdata Agents
- Task 7: (Parcel Upgrade) Remove the Old Pepperdata Parcels
Task 1: Stop the Pepperdata Agents
Procedure
- In Cloudera Manager, select the Stop action for the Pepperdata service.
Task 2: Upgrade Your Hadoop Distribution
Procedure
- Upgrade the Hadoop distribution according to the distribution’s instructions.
- Verify that the snippet below is still included in the appropriate template(s), based on which services are configured to run on the host.
  - YARN ResourceManager/NodeManager: YARN > Configs > Advanced > Advanced yarn-env > yarn-env template
  - HBase Master or HBase RegionServer: HBase > Configs > Advanced > Advanced hbase-env > hbase-env template
  - Apache Spark: Spark > Configs > Advanced > Advanced spark-env > spark-env template
  - Apache Spark 2: Spark 2 > Configuration > Gateway > Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh
Important: Add the snippet to the end of the template(s). This ensures that the activation script’s variable appends (YARN_NODEMANAGER_OPTS, YARN_RESOURCEMANAGER_OPTS, HBASE_REGIONSERVER_OPTS, and SPARK_SUBMIT_OPTS) are not overwritten by other assignments in the template(s).
    PD_HOME=/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR
    PEPPERDATA_ACTIVATE_SCRIPT_PATH="/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/supervisor/lib/pepperdata-activate.sh"
    if [ -e $PEPPERDATA_ACTIVATE_SCRIPT_PATH ]; then
      . $PEPPERDATA_ACTIVATE_SCRIPT_PATH
    fi
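To illustrate why the snippet belongs at the end of a template, here is a minimal sketch that simulates a template in plain shell. It does not call the real pepperdata-activate.sh: the `activate` function, the `-Xmx1g` option, and the `/example/agent.jar` path are stand-ins invented for this illustration. Only the variable name YARN_NODEMANAGER_OPTS comes from the text above.

```shell
#!/bin/sh
# Hypothetical stand-in for the activation script: it appends to the variable.
activate() {
    YARN_NODEMANAGER_OPTS="$YARN_NODEMANAGER_OPTS -javaagent:/example/agent.jar"
}

# Case 1: the snippet runs before another assignment in the template.
snippet_first() {
    YARN_NODEMANAGER_OPTS=""
    activate
    YARN_NODEMANAGER_OPTS="-Xmx1g"   # a later plain assignment discards the append
    printf '%s\n' "$YARN_NODEMANAGER_OPTS"
}

# Case 2: the snippet runs at the end of the template.
snippet_last() {
    YARN_NODEMANAGER_OPTS="-Xmx1g"
    activate                          # the append is the last change, so it survives
    printf '%s\n' "$YARN_NODEMANAGER_OPTS"
}

snippet_first   # prints: -Xmx1g
snippet_last    # prints: -Xmx1g -javaagent:/example/agent.jar
```

In the first case the agent option is lost, which is the situation that placing the snippet last is meant to avoid.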
Task 3: Install the New Pepperdata Supervisor Parcel and/or CSD
This procedure includes migration-upgrade steps that are applicable only if you are upgrading from a PEPPERDATA-1.x.jar CSD for the Pepperdata Supervisor (that is, from Supervisor v6.4.17 or earlier).
For such migration-upgrades, this procedure explains how to use the PepperdataMigration service (the PEPPERDATA-MIGRATION-X.Y.jar file) before you actually install Pepperdata 8.1 (which uses a non-root user, PD_USER).
The PepperdataMigration service manages the directory ownership changes that are required for upgrades from user root to any other user.
If you do not need to perform a migration-upgrade, your upgrade process is referred to in these procedures as the standard upgrade.
Procedure
- Download the installation files that you need (Parcel, CSD(s), or both) from the Downloads page to any local directory, and copy them to the Cloudera Manager Server.
  - Parcel upgrade: The appropriate PepperdataSupervisor parcel for your distro; see Downloads: CDP Private Cloud Base and CDP Public Cloud or Downloads: CDH.
  - CSD upgrade:
    - The latest PEPPERDATA-X.Y.Z.jar CSD (custom service descriptor) for Pepperdata Supervisor 8.1.
    - (For migration-upgrades only) The latest PEPPERDATA-MIGRATION-X.Y.jar CSD; see Downloads: CDP Private Cloud Base and CDP Public Cloud or Downloads: CDH.
- Install the installation files for your upgrade (parcel and/or CSD).
  - (Parcel) Extract the contents of the TGZ archive, and move the parcel (the *.parcel file) and corresponding SHA checksum file (*.parcel.sha) to the /opt/cloudera/parcel-repo directory.
  - (CSD) Remove any existing PEPPERDATA-X.Y.Z.jar (and PEPPERDATA-X.Y.jar) files from the /opt/cloudera/csd directory.
  - (CSD) Move one custom service descriptor (CSD) file to the /opt/cloudera/csd directory:
    - (Standard upgrade) Move the PEPPERDATA-X.Y.Z.jar CSD JAR file
    - (Migration-upgrade) Move the PEPPERDATA-MIGRATION-X.Y.jar CSD JAR file
- Restart the Cloudera Service and Configuration Manager (SCM) server (service: cloudera-scm-server).
  Note: Restarting the SCM server is not the same as restarting the Cloudera Management Service by using the Cloudera Manager interface. Unless you use the command line to explicitly restart the SCM server (the cloudera-scm-server service), you will be unable to use Cloudera Manager to add the Pepperdata service.
      service cloudera-scm-server restart
  After the restart, the new parcel and the Pepperdata service (in the CSD JAR file that you moved: PEPPERDATA-X.Y.Z.jar or PEPPERDATA-MIGRATION-X.Y.jar) are available for activation.
- (Parcel upgrade) In Cloudera Manager, distribute and activate the Pepperdata Supervisor parcel (the *.parcel file).
  Note: The remaining steps in this procedure apply only to migration-upgrades; that is, upgrading from a Pepperdata installation where the Pepperdata user [PD_USER] was root, or upgrading from a Pepperdata installation where pepperdata (the PD_USER) is not the default user that you want in effect after the upgrade.
- (Migration-upgrade only) Perform the migration.
  - Run the PepperdataMigration service.
    In the configuration wizard, you can optionally change the values of any or all of the following variables. If you change them, be sure to make note of your changed (non-default) values; you’ll need them later, in step 7.e.
    - Pepperdata user (default=pepperdata)
    - Pepperdata YARN group; typically this should be the group that the ResourceManager/NodeManager daemons are running as
    - Logging and configuration directories
    After running the PepperdataMigration service, the directory permissions will correctly support running Pepperdata as a non-root user.
  - Remove the PepperdataMigration service.
    - In Cloudera Manager, stop and remove the PepperdataMigration service. (For details, see the Cloudera documentation for your version of Cloudera Manager.)
    - Delete the PepperdataMigration service CSD JAR (PEPPERDATA-MIGRATION-X.Y.jar) from the /opt/cloudera/csd directory.
  - Ready the Pepperdata service installation.
    Move the PEPPERDATA-X.Y.Z.jar CSD JAR to the /opt/cloudera/csd directory.
- (Migration-upgrade only) Install the Pepperdata service, which is contained in the PEPPERDATA-X.Y.Z.jar CSD JAR file that you already moved to the csd directory.
  - Restart the Cloudera Service and Configuration Manager (SCM) server (service: cloudera-scm-server).
    Note: Restarting the SCM server is not the same as restarting the Cloudera Management Service by using the Cloudera Manager interface. Unless you use the command line to explicitly restart the SCM server (the cloudera-scm-server service), you will be unable to use Cloudera Manager to add the Pepperdata service.
        service cloudera-scm-server restart
    After the restart, the new Pepperdata service (in the PEPPERDATA-X.Y.Z.jar CSD JAR file) is available for activation.
  - Use the Cloudera Manager interface to restart the Cloudera Management service.
- (Migration-upgrade only) Add the Pepperdata service to Cloudera Manager.
  Use Cloudera Manager to perform this procedure, which adds the Pepperdata service and the custom service descriptor (CSD) to the Cloudera Manager environment.
  - Select your cluster, and click Actions > Add Service. In the Service Type column, select Pepperdata, and click Continue.
  - Select Dependencies page.
    - (Kerberized clusters) If the core services of the ResourceManagers and the MapReduce Job History Server are Kerberized (secured with Kerberos), select Optional Dependencies. (The YARN dependency is required so that Pepperdata can fetch YARN-related values to use for the Pepperdata configuration.)
    - (Clusters without Kerberos) Select No Optional Dependencies.
  - Assign Roles page. Customize the Role Assignments:
    - Click PepAgent, select all hosts, and click OK.
    - Click Supervisor, select all the ResourceManager hosts, click OK, and click Continue.
    Do not assign the PepMetrics role. It is now unsupported and unneeded.
  - In the Review Changes page, enter your custom information.
    - For the Pepperdata License Specification, enter data:// and then (without any additional spaces) the contents of the license file that we emailed you. If the data:// string is already shown, do not enter it a second time.
    - For the Pepperdata Dashboard Cluster Realm Name, enter the cluster name exactly as shown in the license email. Be sure to use the same capitalization.
    - (Non-Hadoop Clusters) If you’re installing Pepperdata on a cluster without Hadoop, such as a Kafka-only cluster for Streaming Spotlight, the Pepperdata PepAgent must be configured to run without Hadoop.
      If you’re installing Pepperdata in a cluster that has Hadoop, skip this substep. If you perform this substep in a Hadoop cluster, Pepperdata will not operate correctly.
      Locate the Run Pepperdata in Non-Hadoop Environment parameter, and select it.
    - (Kerberized clusters) If the core services of the ResourceManagers and the MapReduce Job History Server are Kerberized (secured with Kerberos), locate the Enable Access to Kerberized Cluster Components parameter, and ensure that it is selected.
      - Newer versions of Cloudera Manager automatically detect that Kerberos is enabled on a cluster. In this case, the option will already be selected, and you must be careful to not cancel the option by selecting (clicking) it again.
      - Older versions of Cloudera Manager do not detect that Kerberos is enabled, so you must select this option.
    - Click Continue.
  - Complete the steps as prompted by the Add Service wizard, all the way through (and including) clicking Finish.
    Important: The Pepperdata service will fail to start because of the change in default user from the earlier installation. You will reconfigure the user as needed, and restart the service, in the next substeps.
  - In Cloudera Manager, navigate to Pepperdata > Configuration, and reconfigure the user.
    If you changed the System Group and logging and configuration directories when you ran the PepperdataMigration service (step 5.a), be sure to update the System Group and logging and configuration directories variables, as well.
    - System User: Set the value to match the value of the Pepperdata user (default=pepperdata).
    - System Group: Set the value to match the Pepperdata YARN group.
    - Logging and configuration directories.
  - In Cloudera Manager, select the Start action for the Pepperdata service.
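The file-staging steps near the start of this procedure (move the parcel and its checksum into parcel-repo, clear out old Pepperdata CSD JARs, and place exactly one CSD JAR) could be scripted roughly as follows. This is a sketch under stated assumptions, not a Pepperdata-supplied tool: the function name `stage_pepperdata_files` and its argument order are hypothetical, and the directories are passed as arguments so the function can be tried against scratch directories first.

```shell
#!/bin/sh
# Hypothetical helper (name and arguments are illustrative only).
# Usage: stage_pepperdata_files SRC_DIR PARCEL_REPO_DIR CSD_DIR CSD_JAR_NAME
stage_pepperdata_files() {
    src=$1
    parcel_repo=$2
    csd_dir=$3
    csd_jar=$4

    # Move the parcel and its SHA checksum file together, if either is present.
    for f in "$src"/*.parcel "$src"/*.parcel.sha; do
        if [ -e "$f" ]; then
            mv "$f" "$parcel_repo"/
        fi
    done

    # Remove existing Pepperdata CSD JARs so that only one is present afterward.
    # Note: this glob also matches a leftover PEPPERDATA-MIGRATION-X.Y.jar.
    rm -f "$csd_dir"/PEPPERDATA-*.jar

    # Move exactly one CSD JAR (standard or migration) into place.
    mv "$src/$csd_jar" "$csd_dir"/
}

# Example call with the directories named in this procedure (left commented
# out; the source directory is an assumption, and X.Y.Z is a placeholder):
# stage_pepperdata_files /tmp/pepperdata-download /opt/cloudera/parcel-repo /opt/cloudera/csd PEPPERDATA-X.Y.Z.jar
```

After the files are staged, the SCM server restart described in the procedure is still required before Cloudera Manager recognizes the new parcel or CSD.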
Task 4: (Upgrades from Supervisor v6.2 or earlier) Update Paths to Pepperdata JARs and Scripts
To update the paths for Pepperdata JARs and scripts to their new locations, use Cloudera Manager to edit the snippets in the applicable templates.
Procedure
- Revise the paths for YARN instrumentation.
  Use Cloudera Manager to edit the snippets in the templates. If there is no corresponding entry in a given template (for example, you are not using Spark 2, so the Spark 2 template is empty), do not add the new snippet.
  - YARN (MR2 Included) > Configuration > ResourceManager > Java Configuration Options for ResourceManager:
    Old value:
        -javaagent:/opt/pepperdata/lib/PepperdataSupervisor.jar
    New value:
        -javaagent:/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/lib/PepperdataSupervisor.jar
  - YARN (MR2 Included) > Configuration > NodeManager > Java Configuration Options for NodeManager:
    Old value:
        -javaagent:/opt/pepperdata/lib/PepperdataSupervisor.jar
    New value:
        -javaagent:/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/lib/PepperdataSupervisor.jar
  - Spark > Configuration > Gateway > Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh:
    Old value:
        PEPPERDATA_ACTIVATE_SCRIPT_PATH="/opt/pepperdata/supervisor/lib/pepperdata-activate.sh"
    New value (entered as two separate lines):
        PD_HOME=/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR
        PEPPERDATA_ACTIVATE_SCRIPT_PATH="/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/supervisor/lib/pepperdata-activate.sh"
  - Spark2 > Configuration > Gateway > Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh:
    Old value:
        PEPPERDATA_ACTIVATE_SCRIPT_PATH="/opt/pepperdata/supervisor/lib/pepperdata-activate.sh"
    New value (entered as two separate lines):
        PD_HOME=/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR
        PEPPERDATA_ACTIVATE_SCRIPT_PATH="/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/supervisor/lib/pepperdata-activate.sh"
- (Query Spotlight) If Query Spotlight is enabled for the cluster, use Cloudera Manager to edit the snippets in the templates.
  - Hive > Configuration > Client Java Configuration Options:
    Old value:
        -javaagent:/opt/pepperdata/supervisor/lib/internal-jars/query/hive/PLACEHOLDER-FOR-YOUR-HIVE-QUERY-JAR-NAME
    New value:
        -javaagent:/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/supervisor/lib/internal-jars/query/hive/PLACEHOLDER-FOR-YOUR-HIVE-QUERY-JAR-NAME
  - Hive > Configuration > Java Configuration Options for HiveServer2:
    Old value:
        -javaagent:/opt/pepperdata/supervisor/lib/internal-jars/query/hive/PLACEHOLDER-FOR-YOUR-HIVE-QUERY-JAR-NAME
    New value:
        -javaagent:/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/supervisor/lib/internal-jars/query/hive/PLACEHOLDER-FOR-YOUR-HIVE-QUERY-JAR-NAME
- (HBase monitoring) If the cluster includes HBase and you’ve enabled Pepperdata to monitor it, use Cloudera Manager to edit the snippets in the templates.
  - HBase > Configuration > RegionServer > Java Configuration Options for HBase RegionServer:
    Old value:
        -javaagent:/opt/pepperdata/lib/PepperdataSupervisor.jar
    New value:
        -javaagent:/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/lib/PepperdataSupervisor.jar
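Every old-to-new pair in this task amounts to replacing the /opt/pepperdata prefix with the parcel prefix. The following sketch computes the new value from an old one, so that you can compare it with what you paste into Cloudera Manager. The helper name `update_pepperdata_path` is hypothetical, and the sketch only transforms strings; it does not read or write any Cloudera Manager configuration. For the Spark and Spark 2 snippets, the separate PD_HOME line shown above must still be added by hand.

```shell
#!/bin/sh
# Prefixes taken from the old and new values listed in this task.
OLD_PREFIX=/opt/pepperdata
NEW_PREFIX=/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR

# Hypothetical helper: print the given setting with the old prefix replaced.
# Values that already use the new prefix pass through unchanged.
update_pepperdata_path() {
    printf '%s\n' "$1" | sed "s|$OLD_PREFIX/|$NEW_PREFIX/|g"
}

update_pepperdata_path '-javaagent:/opt/pepperdata/lib/PepperdataSupervisor.jar'
update_pepperdata_path 'PEPPERDATA_ACTIVATE_SCRIPT_PATH="/opt/pepperdata/supervisor/lib/pepperdata-activate.sh"'
```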
Task 5: Restart the Required Services
Procedure
- In Cloudera Manager, navigate to your cluster’s YARN (MR2 Included) service > Instances, select all ResourceManager and NodeManager hosts, and in the Actions for Selected list, select Restart.
- (If using HBase) Navigate back to the cluster view, and for the HBase service, select the Rolling Restart action, and then select only the HBase RegionServers.
- (If using Hive) Restart the required service according to your version of Cloudera’s Distribution of Hadoop (CDH) or Cloudera Data Platform (CDP) Runtime.
  - CDP 7.x: Navigate back to the cluster view, and for the Hive on Tez service, select the Restart action.
  - CDH 6.x: Navigate to the Hive Service instances, select all HiveServer2 hosts, and in the Actions for Selected list, select Restart.
Task 6: Restart the Pepperdata Agents
Procedure
- In Cloudera Manager, select the Start action for the Pepperdata service.
Task 7: (Parcel Upgrade) Remove the Old Pepperdata Parcels
Procedure
- In Cloudera Manager, remove all old Pepperdata Supervisor parcels. (For details, see the Cloudera documentation for your version of Cloudera Manager.)