System Requirements

Before installing the Pepperdata software, ensure that your system meets the requirements described here, such as the supported operating system and JDK (Java Development Kit) versions. Also be sure that the Pepperdata version you are installing supports the Pepperdata products that you will use.

For information about platform distro support, see Pepperdata-Platform Support.

Environment (YARN/Kubernetes)

Pepperdata support for each environment varies by product, as shown in the table. Before configuring a Pepperdata product, be sure that it is supported in your environment.

Supported Product-Environment Combinations by Pepperdata Version
Product-Environment Combination* Pepperdata
8.0.x 7.1.x 7.0.x 6.5.x 6.4.x 6.3.x 6.2.x
Capacity Optimizer
YARN (non-cloud)
Cloud (HDaaS) **
Kubernetes
Application Spotlight
YARN (non-cloud)
Cloud (HDaaS) †
Kubernetes ‡
Platform Spotlight
YARN (non-cloud)
Cloud (HDaaS) †
Kubernetes ‡
Query Spotlight
YARN (non-cloud) §
Cloud (HDaaS) ‖
Kubernetes
Streaming Spotlight (deprecated)
Bare metal (non-cloud)
Cloud (HDaaS)
Kubernetes

Notes

* Streaming Spotlight is deprecated as of August 5, 2022.

** Support for autoscaling optimization depends on the cloud distribution and the Pepperdata version; see System Requirements: Cloud Environments for Autoscaling Optimization.

† Support for Application Spotlight depends on the cloud distribution; see Pepperdata-Platform Support.

‡ Not all features are applicable for Kubernetes environments. Details are provided in the relevant documentation.

§ Support depends on the combination of Hive version and Hadoop distro; see Query Spotlight in Pepperdata-Platform Support.

‖ (Supervisor v6.5.10 and later) Supported for Amazon EMR and Google Dataproc; for supported distro versions, see Pepperdata-Platform Support.

Operating System

Pepperdata supports the operating systems shown in the table.

Supported Operating Systems by Pepperdata Version
Operating System Pepperdata
8.0.x 7.1.x 7.0.x 6.5.x 6.4.x 6.3.x 6.2.x
Red Hat Enterprise Linux 8/CentOS-8
RHEL 8.2/CentOS-8.2.x
RHEL 8.1/CentOS-8.1.x
RHEL 8.0/CentOS-8.0.x
Red Hat Enterprise Linux 7/CentOS 7/Oracle Linux 7
RHEL 7.9/CentOS-7.9/OL-7.9
RHEL 7.8/CentOS-7.8/OL-7.8
RHEL 7.7/CentOS-7.7/OL-7.7
RHEL 7.6/CentOS-7.6/OL-7.6
RHEL 7.5/CentOS-7.5/OL-7.5
RHEL 7.4/CentOS-7.4/OL-7.4
RHEL 7.3/CentOS-7.3
RHEL 7.2/CentOS-7.2 *
RHEL 7.1/CentOS-7.1
RHEL 7.0/CentOS-7.0
Red Hat Enterprise Linux 6/CentOS 6
RHEL 6.9/CentOS-6.9
RHEL 6.8/CentOS-6.8
RHEL 6.7/CentOS-6.7
RHEL 6.6/CentOS-6.6
RHEL 6.5/CentOS-6.5
RHEL 6.4/CentOS-6.4
RHEL 6.3/CentOS-6.3
RHEL 6.2/CentOS-6.2
RHEL 6.1/CentOS-6.1
RHEL 6.0/CentOS-6.0
SUSE Linux Enterprise Server (SLES)
SLES 15
SLES 12
Debian
Debian 8
Ubuntu
Ubuntu 18.04
Ubuntu 16.04 LTS

* Package systemd-219-19.el7 is unsupported. Use packages older than systemd-219-19.el7, or use package systemd-sysv-219-30.el7.x86_64.rpm or newer.

JDK (Java Development Kit)

The table lists the Pepperdata-supported versions of JDK.

JDK Support by Pepperdata Version
JDK Pepperdata
8.0.x 7.1.x 7.0.x 6.5.x 6.4.x 6.3.x 6.2.x
Java 17
JDK 17.0.11 or later **
Java 11
JDK 11.0.3 or later *
Java 8
JDK build 8u131 or newer
Java 7
JDK build 7u65 or newer

* Support for distros other than CDH 6.3.3 requires Pepperdata Supervisor v6.3.16 or later.

** Requires a Pepperdata Supervisor version later than 8.0.26.
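The minimum builds in the JDK table can be checked mechanically. The following is a minimal sketch, not a Pepperdata tool: the function names are hypothetical, and the check covers only the minimum build per Java major version. It does not encode which Pepperdata versions support each Java version, nor the Supervisor requirements in the footnotes.

```python
import re

# Minimum builds from the JDK table, keyed by Java major version.
# Values are (minor, update) as produced by parse_jdk_version() below.
MINIMUM_JDK = {
    7: (0, 65),    # JDK build 7u65 or newer
    8: (0, 131),   # JDK build 8u131 or newer
    11: (0, 3),    # JDK 11.0.3 or later
    17: (0, 11),   # JDK 17.0.11 or later
}


def extract_version(java_version_output: str) -> str:
    """Pull the quoted version string out of `java -version` output."""
    match = re.search(r'version "([^"]+)"', java_version_output)
    if match is None:
        raise ValueError("no quoted version string found")
    return match.group(1)


def parse_jdk_version(version: str) -> tuple:
    """Return (major, minor, update) for '1.8.0_131' or '11.0.3' style strings."""
    # Legacy scheme used through Java 8: 1.<major>.<minor>_<update>
    legacy = re.fullmatch(r"1\.(\d+)\.(\d+)(?:_(\d+))?(?:-.*)?", version)
    if legacy:
        return (int(legacy.group(1)), int(legacy.group(2)),
                int(legacy.group(3) or 0))
    # Modern scheme (Java 9 and later): <major>.<minor>.<update>
    modern = re.fullmatch(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?(?:[.+-].*)?", version)
    if modern is None:
        raise ValueError(f"unrecognized JDK version: {version!r}")
    return (int(modern.group(1)), int(modern.group(2) or 0),
            int(modern.group(3) or 0))


def meets_minimum(version: str) -> bool:
    """True if the version is at or above the table's minimum for its major."""
    major, minor, update = parse_jdk_version(version)
    minimum = MINIMUM_JDK.get(major)
    if minimum is None:
        return False  # Java major version is not listed in the table
    return (minor, update) >= minimum
```

To check a host, pass in the output of `java -version`, which the JDK writes to standard error; for example, capture it with `subprocess.run(["java", "-version"], capture_output=True, text=True).stderr` and then call `meets_minimum(extract_version(output))`.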

Apache Spark (Application Spotlight only)

Application Spotlight supports the Spark versions listed in the table.

Supported Spark-Environment Combinations by Pepperdata Version
Spark-Environment Combination Pepperdata
8.0.x 7.1.x 7.0.x 6.5.x 6.4.x 6.3.x 6.2.x
Apache Spark 3.2.x-3.5.x ‡
YARN
Kubernetes
Apache Spark 3.0.x-3.1.x
YARN
Kubernetes
Apache Spark 2.0.x-2.4.x
YARN
Kubernetes †
Apache Spark 1.6
YARN
Kubernetes *

* Spark 1.x does not support Kubernetes.

† Only Spark 2.3 and later support Kubernetes.

‡ Requires Pepperdata Supervisor v7.0.8 or later.
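The constraints in the Spark notes can be expressed as a small pre-flight check. The following is a sketch under stated assumptions: the function names are hypothetical, version strings are assumed to be purely numeric (such as "3.3.2"), and only the rules from the notes are encoded, not the per-version support shown in the table.

```python
def _numeric(version: str) -> tuple:
    """Turn '3.2.1' into (3, 2, 1). Non-numeric suffixes are not handled."""
    return tuple(int(part) for part in version.split("."))


def spark_issues(spark: str, environment: str, supervisor: str) -> list:
    """Return the notes from the Spark table that this combination violates."""
    issues = []
    spark_v = _numeric(spark)

    # Spark 1.x has no Kubernetes support; in 2.x it starts with Spark 2.3.
    if environment.lower() == "kubernetes" and spark_v < (2, 3):
        issues.append("Spark earlier than 2.3 does not support Kubernetes")

    # Spark 3.2.x through 3.5.x needs Supervisor v7.0.8 or later.
    if (3, 2) <= spark_v[:2] <= (3, 5) and _numeric(supervisor) < (7, 0, 8):
        issues.append(
            "Spark 3.2.x-3.5.x requires Pepperdata Supervisor v7.0.8 or later"
        )
    return issues
```

An empty list means that none of the noted restrictions apply; it does not by itself confirm that the combination is supported for a given Pepperdata version.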

Stream Processors (Streaming Spotlight only)

Streaming Spotlight is deprecated as of August 5, 2022.

Streaming Spotlight supports the stream processors listed in the table.

Stream Processor Support by Streaming Spotlight Version
Stream Processor Streaming Spotlight
8.0.x 7.1.x 7.0.x 6.5.x 6.4.x 6.3.x 6.2.x
Apache® Kafka
2.7.x *
2.6.x *
2.5.x *
2.4.x
2.3.x
2.2.x
2.1.x
2.0.x
1.1.x
1.0.x
Confluent® Kafka
6.1.x *
5.5.x *
5.3.3

* Requires Pepperdata Supervisor v6.4.17 or later.

GPU (Spark on Kubernetes applications only)

Pepperdata supports the GPUs listed in the table.

Supported GPU-Platform Combinations by Pepperdata Version
GPU-Platform Combination Pepperdata
8.0.x 7.1.x 7.0.x 6.5.x
NVIDIA
Microsoft AKS
Amazon EKS *
HPE Ezmeral Container Platform 5.3.x
HPE Ezmeral Container Platform 5.2.x

* Requires Pepperdata Supervisor v6.5.22 or later.

HBase

Pepperdata supports the HBase versions listed in the table.

HBase Support by Pepperdata Version
HBase Pepperdata
8.0.x 7.1.x 7.0.x 6.5.x 6.4.x 6.3.x 6.2.x
HBase
2.x *
1.x

* Requires Pepperdata Supervisor v6.4.15 or later.

PepMetrics Agent Support for HBase

The Pepperdata PepMetrics agent supports the HBase versions shown in the table. Very old (now unsupported) versions of Pepperdata used this agent to gather HBase metrics; it has since been replaced by fetchers invoked by PepAgents.

HBase Support by PepMetrics Version
HBase PepMetrics Agent *
8.0.x 7.1.x 7.0.x 6.5.x 6.4.x 6.3.x 6.2.x
HBase
1.2
1.1
1.0

* PepMetrics agent is unsupported in Pepperdata v6.5.x and later.

Cloud Environments for Autoscaling Optimization

The table lists the cloud environments where autoscaling optimization is supported.

Autoscaling Support in Cloud Environments by Pepperdata (Capacity Optimizer) Version
Cloud Distro Pepperdata
8.0.x 7.1.x 7.0.x 6.5.x 6.4.x 6.3.x
Amazon EKS
Cluster Autoscaler
Horizontal Pod Autoscaler
Amazon EMR ‖
6.5.x ¶ (EMR-managed scaling)
6.5.x ¶ (custom automatic scaling policy)
6.4.x § (EMR-managed scaling)
6.4.x § (custom automatic scaling policy)
6.3.x § (EMR-managed scaling)
6.3.x § (custom automatic scaling policy)
6.2.0 † (EMR-managed scaling)
6.2.0 † (custom automatic scaling policy)
6.1.0 (EMR-managed scaling)
6.1.0 (custom automatic scaling policy)
5.34.x ¶ (EMR-managed scaling)
5.34.x ¶ (custom automatic scaling policy)
5.33.0 (EMR-managed scaling)
5.33.0 (custom automatic scaling policy)
5.32.0 † (EMR-managed scaling)
5.32.0 † (custom automatic scaling policy)
5.31.0 (EMR-managed scaling)
5.31.0 (custom automatic scaling policy)
5.30.0 (EMR-managed scaling)
5.30.0 (custom automatic scaling policy)
5.3.x-5.29.x ‡ (custom automatic scaling policy)
Google Dataproc
2.0.x-debian10
1.5-debian10
1.4-debian10
1.3-debian10
Qubole®
R59 (AWS) *

* Requires Pepperdata Supervisor v6.3.13 or later.

† Bucket names cannot include any dot characters (.).

‡ EMR-managed scaling is unavailable in EMR 5.29.0 and earlier.

§ Amazon EMR 6.3.x and EMR 6.4.x require Pepperdata Supervisor v6.5.24 or later.

‖ For information about EMR-managed scaling vs. custom automatic scaling policy, see How can I configure automatic scaling in Amazon EMR?

¶ Amazon EMR 5.34.x and EMR 6.5.x require Pepperdata Supervisor v7.0.8 or later.

Swappiness and Swap Space (Capacity Optimizer only)

Pepperdata Capacity Optimizer requires that your cluster be swap-enabled, with sufficient swap space available.

Capacity Optimizer pushes hosts to do more work and to use more physical memory than they otherwise would. If a host runs out of physical memory, the Linux Out of Memory (OOM) killer can be invoked, which risks service outages or application (job) failures. To minimize this risk, your cluster hosts must adhere to the vendor-recommended settings for swap and swappiness for your Hadoop distribution and operating system. (Although Capacity Optimizer does include a number of safety valves for avoiding this scenario, swap space is required for best results.)

Hadoop Distribution Recommendation Vendor Documentation
Cloudera CDH swappiness = 1
swap space >= 4 GB *
Swappiness setting recommendation

* Or as recommended by your operating system’s vendor documentation.

You should also consult your operating system’s documentation for recommendations. For example, Red Hat recommends swappiness >= 10 (Tuning Virtual Memory) and that you configure the swap space based on the overall system RAM (Swap Space).
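A host's current settings can be compared against the recommendations with a short script. The following is a minimal sketch, assuming a Linux host and the Cloudera CDH values from the table above; the function names and constants are hypothetical, and the thresholds should be replaced with your own vendor's recommendations where they differ.

```python
import re

# Values from the Cloudera CDH row of the table above (assumed defaults).
RECOMMENDED_SWAPPINESS = 1
MIN_SWAP_GB = 4


def swap_total_gb(meminfo_text: str) -> float:
    """Extract SwapTotal from /proc/meminfo content.

    /proc/meminfo reports the value in kB (units of 1024 bytes), so
    dividing by 1024 * 1024 yields GiB.
    """
    match = re.search(r"^SwapTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("SwapTotal not found in meminfo content")
    return int(match.group(1)) / (1024 * 1024)


def check_swap(swappiness_text: str, meminfo_text: str) -> list:
    """Return warnings for settings that differ from the recommendations."""
    warnings = []
    swap_gb = swap_total_gb(meminfo_text)
    if swap_gb == 0:
        warnings.append("Swap is not enabled")
    elif swap_gb < MIN_SWAP_GB:
        warnings.append(
            f"Swap space is {swap_gb:.1f} GB; at least {MIN_SWAP_GB} GB is recommended"
        )

    swappiness = int(swappiness_text.strip())
    if swappiness != RECOMMENDED_SWAPPINESS:
        warnings.append(
            f"Swappiness is {swappiness}; recommended value is {RECOMMENDED_SWAPPINESS}"
        )
    return warnings
```

On a Linux host, the two inputs are the contents of /proc/sys/vm/swappiness and /proc/meminfo; read both files and pass their text to check_swap(). An empty list means that the host matches the values assumed in the sketch.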