About the Pepperdata Products
Pepperdata products are independent of each other yet complementary. You can configure almost any combination or subset of products to focus on a given part of your tech stack; for example, use Capacity Optimizer to make the best use of available resources, and use Query Spotlight to optimize your query plans. For a given scenario, such as running Hive queries, you can use multiple Pepperdata Spotlight products together: Application Spotlight to optimize the runtime parameters and resources of the Hive query engine, and Query Spotlight to optimize the components of the query itself (databases, tables, explain plan, partitions, and so on).
Pepperdata Platform Spotlight continuously monitors and collects unique data, and provides a 360° cluster view that enables you to quickly diagnose performance issues and make resource decisions about your big data platform based on user priorities and needs. Platform Spotlight leverages AI-driven resource management to automatically tune YARN clusters and recapture wasted capacity. With Platform Spotlight alerting, you can identify root causes and receive recommendations to rightsize containers, queues, and other resources.
Platform Spotlight is required by Capacity Optimizer, Application Spotlight, and Query Spotlight. In Supervisor v6.2.x, it is also required by Cloudera Manager Parcel installations of Streaming Spotlight.
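The dependency rule above can be expressed as a simple check when planning which products to deploy. The following Python sketch is illustrative only: the product names come from this page, but the `missing_prerequisites` function and its data structure are hypothetical and are not part of any Pepperdata tool or API. It also leaves out the version-specific Streaming Spotlight case.

```python
# Hypothetical illustration of the dependency rule described above;
# this is not a Pepperdata API. Product names are taken from this page.
PLATFORM = "Platform Spotlight"

# Products that this page says require Platform Spotlight.
# (The Streaming Spotlight case applies only to Supervisor v6.2.x Cloudera
# Manager Parcel installations, so this simplified sketch omits it.)
REQUIRES_PLATFORM = {
    "Capacity Optimizer",
    "Application Spotlight",
    "Query Spotlight",
}


def missing_prerequisites(selected):
    """Return the set of prerequisite products absent from `selected`."""
    chosen = set(selected)
    if chosen & REQUIRES_PLATFORM and PLATFORM not in chosen:
        return {PLATFORM}
    return set()


if __name__ == "__main__":
    print(missing_prerequisites(["Capacity Optimizer", "Query Spotlight"]))
    print(missing_prerequisites(["Platform Spotlight", "Query Spotlight"]))
```

An empty result means the selection already satisfies the rule; a non-empty result names what to add.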
For information about the Platform Spotlight Overviews and tips for troubleshooting your clusters, see Platform Spotlight User’s Guide.
Pepperdata Capacity Optimizer improves the resource usage and throughput in YARN and Kubernetes clusters.
For details, see Capacity Optimizer User’s Guide: 7.1 (or the comparable page for a Supervisor version other than the latest).
Application Spotlight provides a 360° view of your clusters. With this data, you can gauge cluster and application performance within the context of the entire cluster, quickly diagnose application performance issues, and apply Pepperdata job-specific recommendations to improve overall efficiency.
For information about Application Spotlight overviews (Applications and Workflows), reports (Application Profiler and Application Status), how to analyze and compare applications, and how to use Pepperdata recommendations to improve future runs, see Application Spotlight User’s Guide.
With Query Spotlight, IT operators can avoid missed SLAs due to inefficient and slow queries. Application developers and data analysts can avoid scheduling bottlenecks and inefficiently organized data that degrade performance. Query Spotlight provides visibility into database metrics (which lets you optimize performance), highlights which queries are poor performers, and provides operational context.
For total coverage, use Query Spotlight to optimize the components of the query itself (databases, tables, explain plan, partitions, and so on), and use Application Spotlight to optimize the runtime parameters and resources of the query engine.
For information about monitoring and analyzing queries and databases, and using Pepperdata recommendations to improve future query runs, see Query Spotlight User’s Guide.
Streaming Spotlight enables IT operators to maintain SLAs for near real-time stream processing applications; the dashboard highlights which brokers are falling behind. For application developers and data analysts, Streaming Spotlight highlights bottlenecks and failures that degrade performance. Streaming Spotlight shows metadata that other tools cannot access, and shows it in the same dashboard as the rest of your cluster.
- Overloaded brokers signal heavy traffic, and a need for more brokers.
- Unconsumed topics point to issues on the consumer side.
- Starved topics are cues to finding failed producers.
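The three signals above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `TopicStats` fields, the interpretation of "starved" as a topic receiving no data and "unconsumed" as a topic whose data is not being read, and the 80% overload threshold. It does not describe how Streaming Spotlight itself computes these states.

```python
# Hypothetical sketch of the three signals listed above. The field names,
# thresholds, and functions are assumptions, not Streaming Spotlight's logic.
from dataclasses import dataclass


@dataclass
class TopicStats:
    name: str
    bytes_in_per_sec: float   # rate at which producers write to the topic
    bytes_out_per_sec: float  # rate at which consumers read from the topic


def classify_topic(stats: TopicStats) -> str:
    """Label a topic as 'starved', 'unconsumed', or 'healthy'."""
    if stats.bytes_in_per_sec == 0:
        return "starved"      # nothing is being produced: check the producers
    if stats.bytes_out_per_sec == 0:
        return "unconsumed"   # data arrives but is not read: check the consumers
    return "healthy"


def is_overloaded(broker_bytes_in_per_sec: float,
                  capacity_bytes_per_sec: float,
                  threshold: float = 0.8) -> bool:
    """Flag a broker whose inbound traffic exceeds a fraction of its capacity."""
    return broker_bytes_in_per_sec > threshold * capacity_bytes_per_sec
```

In practice the rates would come from whatever broker and topic metrics you already collect, and the threshold would be tuned to your cluster.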
For information about Overview and Detail pages for brokers and topics, see Streaming Spotlight User’s Guide.
Product Availability by Environment
In general, your choice of Pepperdata products is not limited by the specifics of your big data environment. With only a few exceptions, you can use Pepperdata products on-premises/bare metal and on the leading cloud platforms. Likewise, most Pepperdata products support both YARN and Kubernetes environments.
|Cloud (HDaaS) **|
|Cloud (HDaaS) †|
|Cloud (HDaaS) †|
|YARN (non-cloud) §|
|Cloud (HDaaS) ‖|
|Streaming Spotlight (deprecated)|
|Bare metal (non-cloud)|
* Streaming Spotlight is deprecated as of August 5, 2022.
** Support for autoscaling optimization depends on the cloud distribution and the Pepperdata version; see System Requirements: Cloud Environments for Autoscaling Optimization.
† Support for Application Spotlight depends on the cloud distribution; see Pepperdata-Platform Support.
‡ Not all features are applicable for Kubernetes environments. Details are provided in the relevant documentation.
§ Support depends on the combination of Hive version and Hadoop distro; see Query Spotlight in Pepperdata-Platform Support.
‖ (Supervisor v6.5.10 and later) Supported for Amazon EMR and Google Dataproc; for supported distro versions, see Pepperdata-Platform Support.