Upgrade Hadoop Distribution and Pepperdata
Supported Pepperdata upgrade paths: from any earlier Pepperdata 6.x version
To upgrade a cluster’s Hadoop distribution and the Pepperdata software, begin with any host: stop the Pepperdata agents, remove the Pepperdata software, upgrade the Hadoop distribution and the Pepperdata software, and then restart the Hadoop services and the Pepperdata agents.
If you want to perform an upgrade in a CDP Public Cloud environment, you must create a new environment and Data Hub cluster, and install the Pepperdata Supervisor version that you want; see Installing Pepperdata (CDP Public Cloud).
On This Page
- Task 1: Stop the Pepperdata Agents
- Task 2: Upgrade Your Hadoop Distribution
- Task 3: Install the New Pepperdata Supervisor Parcel and/or CSD
- Task 4: (Upgrades from Supervisor v6.2 or earlier) Update Paths to Pepperdata JARs and Scripts
- Task 5: Restart the Required Services
- Task 6: Restart the Pepperdata Agents
- Task 7: (Parcel Upgrade) Remove the Old Pepperdata Parcels
Task 1: Stop the Pepperdata Agents
Procedure
- In Cloudera Manager, select the Stop action for the Pepperdata service.
Task 2: Upgrade Your Hadoop Distribution
Procedure
- Upgrade the Hadoop distribution according to the distribution’s instructions.
- Verify that the snippet below is still included in the appropriate template(s), based on which services are configured to run on the host.
    - YARN ResourceManager/NodeManager: YARN > Configs > Advanced > Advanced yarn-env > yarn-env template
    - HBase Master or HBase RegionServer: HBase > Configs > Advanced > Advanced hbase-env > hbase-env template
    - Apache Spark: Spark > Configs > Advanced > Advanced spark-env > spark-env template
    - Apache Spark 2: Spark 2 > Configuration > Gateway > Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh
    Important: Add the snippet to the end of the template(s). This ensures that the activation script’s variable appends (YARN_NODEMANAGER_OPTS, YARN_RESOURCEMANAGER_OPTS, HBASE_REGIONSERVER_OPTS, and SPARK_SUBMIT_OPTS) are not overwritten by other assignments in the template(s).
        PD_HOME=/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR
        PEPPERDATA_ACTIVATE_SCRIPT_PATH="/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/supervisor/lib/pepperdata-activate.sh"
        if [ -e $PEPPERDATA_ACTIVATE_SCRIPT_PATH ]; then
          . $PEPPERDATA_ACTIVATE_SCRIPT_PATH
        fi
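The requirement to place the snippet at the end of a template can be illustrated with a small, self-contained sketch. The function append_agent is a hypothetical stand-in for the activation script (its real contents are not shown in this guide), and the JAR path and heap options are placeholders chosen only for the demonstration.

```shell
#!/bin/sh
# Hypothetical stand-in for the activation script: it appends an option to a
# variable that the template may already have set. The JAR path is a placeholder.
append_agent() {
  YARN_NODEMANAGER_OPTS="${YARN_NODEMANAGER_OPTS} -javaagent:/example/agent.jar"
}

# Case 1: the snippet runs mid-template, and a later plain assignment follows it.
YARN_NODEMANAGER_OPTS="-Xmx1g"
append_agent
YARN_NODEMANAGER_OPTS="-Xmx2g"   # plain assignment discards the appended option
middle_result="$YARN_NODEMANAGER_OPTS"

# Case 2: the snippet runs at the end of the template, after every assignment.
YARN_NODEMANAGER_OPTS="-Xmx1g"
YARN_NODEMANAGER_OPTS="-Xmx2g"
append_agent
end_result="$YARN_NODEMANAGER_OPTS"

echo "snippet in middle: $middle_result"
echo "snippet at end:    $end_result"
```

In the first case the appended option is lost; in the second it survives, which is the behavior that the Important note is protecting.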
Task 3: Install the New Pepperdata Supervisor Parcel and/or CSD
This procedure includes migration-upgrade steps that are applicable only if you are upgrading from a PEPPERDATA-1.x.jar CSD for the Pepperdata Supervisor (that is, from Supervisor v6.4.17 or earlier).
For such migration-upgrades, this procedure explains how to use the PepperdataMigration service (the PEPPERDATA-MIGRATION-X.Y.jar file) before you actually install Pepperdata 8.2 (which uses a non-root user, PD_USER).
The PepperdataMigration service manages the directory ownership changes that are required for upgrades from user root to any other user.
If you do not need to perform a migration-upgrade, your upgrade process is referred to in these procedures as the standard upgrade.
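Because a migration-upgrade hinges on directory ownership, it can be useful to check ownership yourself after the PepperdataMigration service has run. The following is a minimal sketch, not part of the Pepperdata tooling: the function name owned_by is hypothetical, and you supply the directory, user, and (optionally) group, since the actual logging and configuration directories depend on your configuration.

```shell
#!/bin/sh
# Succeeds if DIR exists and is owned by USER (and by GROUP, when one is given).
# USER and GROUP can be names or numeric IDs.
owned_by() {
  dir="$1"
  user="$2"
  group="${3:-}"
  [ -d "$dir" ] || return 1
  if [ -n "$group" ]; then
    [ -n "$(find "$dir" -maxdepth 0 -user "$user" -group "$group" 2>/dev/null)" ]
  else
    [ -n "$(find "$dir" -maxdepth 0 -user "$user" 2>/dev/null)" ]
  fi
}

# Example call (the directory is a placeholder; pepperdata is the default user):
# owned_by /path/to/pepperdata/log/dir pepperdata && echo "ownership OK"
```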
Procedure
- Download the installation files that you need (parcel, CSD(s), or both) from the Downloads page to any local directory, and copy them to the Cloudera Manager Server.
    - Parcel upgrade: The appropriate Pepperdata Supervisor parcel for your distro; see Downloads: CDP Private Cloud Base and CDP Public Cloud or Downloads: CDH.
    - CSD upgrade:
        - The latest PEPPERDATA-X.Y.Z.jar CSD (custom service descriptor) for Pepperdata Supervisor 8.2.
        - (For migration-upgrades only) The latest PEPPERDATA-MIGRATION-X.Y.jar CSD; see Downloads: CDP Private Cloud Base and CDP Public Cloud or Downloads: CDH.
- Install the installation files for your upgrade (parcel and/or CSD).
    - (Parcel) Extract the contents of the TGZ archive, and move the parcel (the *.parcel file) and corresponding SHA checksum file (*.parcel.sha) to the /opt/cloudera/parcel-repo directory.
    - (CSD) Remove any existing PEPPERDATA-X.Y.Z.jar (and PEPPERDATA-X.Y.jar) files from the /opt/cloudera/csd directory.
    - (CSD) Move one custom service descriptor (CSD) file to the /opt/cloudera/csd directory:
        - (Standard upgrade) Move the PEPPERDATA-X.Y.Z.jar CSD JAR file.
        - (Migration-upgrade) Move the PEPPERDATA-MIGRATION-X.Y.jar CSD JAR file.
- Restart the Cloudera Service and Configuration Manager (SCM) server (service: cloudera-scm-server).
    Note: Restarting the SCM server is not the same as restarting the Cloudera Management Service by using the Cloudera Manager interface. Unless you use the command line to explicitly restart the SCM server (the cloudera-scm-server service), you will be unable to use Cloudera Manager to add the Pepperdata service.
        service cloudera-scm-server restart
    After the restart, the new parcel and the Pepperdata service (in the CSD JAR file that you moved: PEPPERDATA-X.Y.Z.jar or PEPPERDATA-MIGRATION-X.Y.jar) are available for activation.
- (Parcel upgrade) In Cloudera Manager, distribute and activate the Pepperdata Supervisor parcel (the *.parcel file).
- (Migration-upgrade only) Perform the migration. This step applies if you are upgrading from a Pepperdata installation where the Pepperdata user [PD_USER] was root, or upgrading from a Pepperdata installation where pepperdata (the PD_USER) is not the default user that you want in effect after the upgrade.
    - Run the PepperdataMigration service. In the configuration wizard, you can optionally change the values of any or all of the following variables. If you change them, be sure to note your changed (non-default) values; you’ll need them later, when you reconfigure the user for the Pepperdata service.
        - Pepperdata user (default=pepperdata)
        - Pepperdata YARN group; typically this should be the group that the ResourceManager/NodeManager daemons are running as
        - Logging and configuration directories
        After running the PepperdataMigration service, the directory permissions will correctly support running Pepperdata as a non-root user.
    - Remove the PepperdataMigration service.
        - In Cloudera Manager, stop and remove the PepperdataMigration service. (For details, see the Cloudera documentation for your version of Cloudera Manager.)
        - Delete the PepperdataMigration service CSD JAR (PEPPERDATA-MIGRATION-X.Y.jar) from the /opt/cloudera/csd directory.
    - Ready the Pepperdata service installation. Move the PEPPERDATA-X.Y.Z.jar CSD JAR to the /opt/cloudera/csd directory.
- (Migration-upgrade only) Install the Pepperdata service, which is contained in the PEPPERDATA-X.Y.Z.jar CSD JAR file that you already moved to the csd directory.
    - Restart the Cloudera Service and Configuration Manager (SCM) server (service: cloudera-scm-server).
        Note: Restarting the SCM server is not the same as restarting the Cloudera Management Service by using the Cloudera Manager interface. Unless you use the command line to explicitly restart the SCM server (the cloudera-scm-server service), you will be unable to use Cloudera Manager to add the Pepperdata service.
            service cloudera-scm-server restart
        After the restart, the new Pepperdata service (in the PEPPERDATA-X.Y.Z.jar CSD JAR file) is available for activation.
    - Use the Cloudera Manager interface to restart the Cloudera Management service.
- (Migration-upgrade only) Add the Pepperdata service to Cloudera Manager. Use Cloudera Manager to perform this procedure, which adds the Pepperdata service and the custom service descriptor (CSD) to the Cloudera Manager environment.
    - Select your cluster, and click Actions > Add Service. In the Service Type column, select Pepperdata, and click Continue.
    - Select Dependencies page.
        - (Kerberized clusters) If the core services of the ResourceManagers and the MapReduce Job History Server are Kerberized (secured with Kerberos), select Optional Dependencies. (The YARN dependency is required so that Pepperdata can fetch YARN-related values to use for the Pepperdata configuration.)
        - (Clusters without Kerberos) Select No Optional Dependencies.
    - Assign Roles page. Customize the Role Assignments:
        - Click PepAgent, select all hosts, and click OK.
        - Click Supervisor, select all the ResourceManager hosts, click OK, and click Continue.
        Do not assign the PepMetrics role. It is now unsupported and unneeded.
    - In the Review Changes page, enter your custom information.
        - For the Pepperdata License Specification, enter data:// and then (without any additional spaces) the contents of the license file that we emailed you. If the data:// string is already shown, do not enter it a second time.
        - For the Pepperdata Dashboard Cluster Realm Name, enter the cluster name exactly as shown in the license email. Be sure to use the same capitalization.
        - (Non-Hadoop Clusters) If you’re installing Pepperdata on a cluster without Hadoop, such as a Kafka-only cluster for Streaming Spotlight, the Pepperdata PepAgent must be configured to run without Hadoop. If you’re installing Pepperdata in a cluster that has Hadoop, skip this substep. If you perform this substep in a Hadoop cluster, Pepperdata will not operate correctly. Locate the Run Pepperdata in Non-Hadoop Environment parameter, and select it.
        - (Kerberized clusters) If the core services of the ResourceManagers and the MapReduce Job History Server are Kerberized (secured with Kerberos), locate the Enable Access to Kerberized Cluster Components parameter, and ensure that it is selected.
            - Newer versions of Cloudera Manager automatically detect that Kerberos is enabled on a cluster. In this case, the option will already be selected, and you must be careful to not cancel the option by selecting (clicking) it again.
            - Older versions of Cloudera Manager do not detect that Kerberos is enabled, so you must select this option.
        - Click Continue.
    - Complete the steps as prompted by the Add Service wizard, all the way through (and including) clicking Finish. Important: The Pepperdata service will fail to start because of the change in default user from the earlier installation. You will reconfigure the user as needed, and restart the service, in the next substeps.
    - In Cloudera Manager, navigate to Pepperdata > Configuration, and reconfigure the user. If you changed the System Group and logging and configuration directories when you ran the PepperdataMigration service, be sure to update the System Group and logging and configuration directories variables as well.
        - System User: Set the value to match the value of the Pepperdata user (default=pepperdata).
        - System Group: Set the value to match the Pepperdata YARN group.
        - Logging and configuration directories.
    - In Cloudera Manager, select the Start action for the Pepperdata service.
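The file-staging substeps earlier in this task (moving the parcel files to /opt/cloudera/parcel-repo and swapping the CSD JAR in /opt/cloudera/csd) can be sketched as two small shell functions. This is an illustrative sketch, not a Pepperdata-supplied script: the function names and the PARCEL_REPO and CSD_DIR override variables are hypothetical, and only the two default directories come from the procedure. The version pattern PEPPERDATA-[0-9]*.jar is an assumption about how the X.Y and X.Y.Z placeholders expand.

```shell
#!/bin/sh
# Target directories named in the procedure; override them to rehearse the
# steps against scratch directories before touching the real ones.
PARCEL_REPO="${PARCEL_REPO:-/opt/cloudera/parcel-repo}"
CSD_DIR="${CSD_DIR:-/opt/cloudera/csd}"

# Move the parcel and its checksum file out of the extracted archive directory.
stage_parcel() {
  src_dir="$1"
  mv "$src_dir"/*.parcel "$src_dir"/*.parcel.sha "$PARCEL_REPO"/
}

# Remove existing versioned Pepperdata CSD JARs, then move in one new CSD JAR.
# The pattern assumes version-numbered names such as PEPPERDATA-X.Y.Z.jar.
stage_csd() {
  new_csd="$1"
  rm -f "$CSD_DIR"/PEPPERDATA-[0-9]*.jar
  mv "$new_csd" "$CSD_DIR"/
}

# Example calls (paths are placeholders):
# stage_parcel /tmp/pepperdata-extracted
# stage_csd /tmp/PEPPERDATA-X.Y.Z.jar
```

Keeping the directories overridable makes it possible to confirm which files would move before running the functions as a user with write access to the real directories. The SCM server restart described in the procedure is still required afterward.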
 
Task 4: (Upgrades from Supervisor v6.2 or earlier) Update Paths to Pepperdata JARs and Scripts
To update the paths for Pepperdata JARs and scripts to their new locations, use Cloudera Manager to edit the snippets in the applicable templates.
Procedure
- Revise the paths for YARN instrumentation. Use Cloudera Manager to edit the snippets in the templates. If there is no corresponding entry in a given template (for example, you are not using Spark 2, so the Spark 2 template is empty), do not add the new snippet.
    - YARN (MR2 Included) > Configuration > ResourceManager > Java Configuration Options for ResourceManager
        Old value: -javaagent:/opt/pepperdata/lib/PepperdataSupervisor.jar
        New value: -javaagent:/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/lib/PepperdataSupervisor.jar
    - YARN (MR2 Included) > Configuration > NodeManager > Java Configuration Options for NodeManager
        Old value: -javaagent:/opt/pepperdata/lib/PepperdataSupervisor.jar
        New value: -javaagent:/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/lib/PepperdataSupervisor.jar
    - Spark > Configuration > Gateway > Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh
        Old value: PEPPERDATA_ACTIVATE_SCRIPT_PATH="/opt/pepperdata/supervisor/lib/pepperdata-activate.sh"
        New value (entered as two separate lines):
            PD_HOME=/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR
            PEPPERDATA_ACTIVATE_SCRIPT_PATH="/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/supervisor/lib/pepperdata-activate.sh"
    - Spark2 > Configuration > Gateway > Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh
        Old value: PEPPERDATA_ACTIVATE_SCRIPT_PATH="/opt/pepperdata/supervisor/lib/pepperdata-activate.sh"
        New value (entered as two separate lines):
            PD_HOME=/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR
            PEPPERDATA_ACTIVATE_SCRIPT_PATH="/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/supervisor/lib/pepperdata-activate.sh"
- (Query Spotlight) If Query Spotlight is enabled for the cluster, use Cloudera Manager to edit the snippets in the templates.
    - Hive > Configuration > Client Java Configuration Options
        Old value: -javaagent:/opt/pepperdata/supervisor/lib/internal-jars/query/hive/PLACEHOLDER-FOR-YOUR-HIVE-QUERY-JAR-NAME
        New value: -javaagent:/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/supervisor/lib/internal-jars/query/hive/PLACEHOLDER-FOR-YOUR-HIVE-QUERY-JAR-NAME
    - Hive > Configuration > Java Configuration Options for HiveServer2
        Old value: -javaagent:/opt/pepperdata/supervisor/lib/internal-jars/query/hive/PLACEHOLDER-FOR-YOUR-HIVE-QUERY-JAR-NAME
        New value: -javaagent:/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/supervisor/lib/internal-jars/query/hive/PLACEHOLDER-FOR-YOUR-HIVE-QUERY-JAR-NAME
- (HBase monitoring) If the cluster includes HBase and you’ve enabled Pepperdata to monitor it, use Cloudera Manager to edit the snippets in the templates.
    - HBase > Configuration > RegionServer > Java Configuration Options for HBase RegionServer
        Old value: -javaagent:/opt/pepperdata/lib/PepperdataSupervisor.jar
        New value: -javaagent:/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/lib/PepperdataSupervisor.jar
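Every old-to-new pair in this task follows the same pattern: the /opt/pepperdata/ prefix becomes /opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/. The sketch below previews the new value for an old value that you copy out of Cloudera Manager. The function name rewrite_pd_path is hypothetical, and the sketch does not edit any Cloudera Manager configuration; you still enter the result in the relevant template. For the Spark templates, remember that the new value also needs the separate PD_HOME line, which this prefix rewrite does not add.

```shell
#!/bin/sh
OLD_PREFIX="/opt/pepperdata/"
NEW_PREFIX="/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/"

# Print the given configuration value with the old install prefix replaced.
rewrite_pd_path() {
  printf '%s\n' "$1" | sed "s#${OLD_PREFIX}#${NEW_PREFIX}#g"
}

rewrite_pd_path "-javaagent:/opt/pepperdata/lib/PepperdataSupervisor.jar"
# prints: -javaagent:/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/lib/PepperdataSupervisor.jar
```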
 
Task 5: Restart the Required Services
Procedure
- In Cloudera Manager, navigate to your cluster’s YARN (MR2 Included) service > Instances, select all ResourceManager and NodeManager hosts, and in the Actions for Selected list, select Restart.
- (If using HBase) Navigate back to the cluster view, and for the HBase service, select the Rolling Restart action, and then select only the HBase RegionServers.
- (If using Hive) Restart the required service according to your version of Cloudera’s Distribution of Hadoop (CDH) or Cloudera Data Platform (CDP) Runtime.
    - CDP 7.x: Navigate back to the cluster view, and for the Hive on Tez service, select the Restart action.
    - CDH 6.x: Navigate to the Hive Service instances, select all HiveServer2 hosts, and in the Actions for Selected list, select Restart.
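After the restarts, one way to spot-check a host is to look for the Pepperdata -javaagent option on the running daemon's command line. The following is a rough sketch that assumes instrumentation is configured through the -javaagent option shown in Task 4; the function names are hypothetical, and the daemon name you pass (for example, NodeManager) is matched as plain text against the process list.

```shell
#!/bin/sh
AGENT_JAR="PepperdataSupervisor.jar"

# Succeeds if the given JVM command line carries a -javaagent option
# that references the Pepperdata Supervisor JAR.
has_pd_agent() {
  case "$1" in
    *"-javaagent:"*"$AGENT_JAR"*) return 0 ;;
    *) return 1 ;;
  esac
}

# For each running process whose command line mentions the daemon name,
# report whether the agent option is present.
report_daemon() {
  daemon="$1"
  ps -eo args | grep "$daemon" | grep -v grep | while IFS= read -r cmdline; do
    if has_pd_agent "$cmdline"; then
      echo "$daemon: agent option present"
    else
      echo "$daemon: agent option missing"
    fi
  done
}

# Example call on a NodeManager host:
# report_daemon NodeManager
```

The check only confirms that the option is on the command line; it does not confirm that the agent loaded successfully.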
 
Task 6: Restart the Pepperdata Agents
Procedure
- In Cloudera Manager, select the Start action for the Pepperdata service.
Task 7: (Parcel Upgrade) Remove the Old Pepperdata Parcels
Procedure
- In Cloudera Manager, remove all old Pepperdata Supervisor parcels. (For details, see the Cloudera documentation for your version of Cloudera Manager.)
