(Application Spotlight) Configure Application Profiler (Parcel)
Supported versions: See the CDH and CDP Private Cloud Base entries for Pepperdata 7.1.x in the table of Supported Platforms by Pepperdata Version
On This Page
- Prerequisites
- Supported Authentication Protocols for Application Profiler
- Task 1: Configure Pepperdata to Monitor Application History
- Task 2: (Kerberized Clusters) Configure HTTP/HTTPS Endpoint Authentication
- Task 3: (Basic Access Authentication) Add BA Authentication Credentials
- Task 4: Activate Application History Monitoring by Restarting PepAgent
- Task 5: Access the Application Profiler on the Pepperdata Dashboard
- Task 6: (Hadoop 2) Confirm Near Real-Time Data Collection
Prerequisites
Before you begin configuring Application Profiler, ensure that your system meets the required prerequisites.
- Pepperdata must be installed on the host running the MapReduce Job History Server
- MapReduce Job History Server must be running
- (Spark Monitoring) Spark History Server must be running
- Your cluster uses a supported authentication protocol; see Supported Authentication Protocols for Application Profiler, below
Supported Authentication Protocols for Application Profiler
To enable Application Profiler to fetch application data from the MapReduce Job History Server/Spark History Server, your cluster must use a Pepperdata-supported authentication protocol:
- No authentication.
- Pseudo auth (also known as Hadoop’s simple authentication): the server authenticates requests based on the user.name query string parameter contained in the request.
- Kerberos.
- Basic access (BA) authentication: uses standard fields in the HTTP header to specify the user name and password; for details, see https://en.wikipedia.org/wiki/Basic_access_authentication.
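To make the difference between pseudo auth and Basic access authentication concrete, the following sketch shows how each one carries the caller's identity in a request. The function names (pseudo_auth_url, basic_auth_header) are hypothetical helpers for illustration only, not part of Pepperdata or Hadoop, and the sketch assumes a user name that needs no URL encoding.

```shell
#!/usr/bin/env bash
# Illustrative helpers only; the function names are assumptions, not a Pepperdata API.

# Pseudo auth: the user name travels in the user.name query string parameter.
# Assumes the user name needs no URL encoding.
pseudo_auth_url() {
  local base_url="$1" user="$2"
  # Use '&' if the URL already has a query string, '?' otherwise.
  case "$base_url" in
    *\?*) printf '%s&user.name=%s\n' "$base_url" "$user" ;;
    *)    printf '%s?user.name=%s\n' "$base_url" "$user" ;;
  esac
}

# Basic access authentication: "user:password" is Base64-encoded and sent
# in the standard Authorization request header.
basic_auth_header() {
  local user="$1" password="$2"
  # The user name cannot contain ':' because ':' separates it from the password.
  case "$user" in
    *:*) echo "user name must not contain ':'" >&2; return 1 ;;
  esac
  printf 'Authorization: Basic %s\n' \
    "$(printf '%s:%s' "$user" "$password" | base64 | tr -d '\n')"
}
```

For example, basic_auth_header Aladdin 'open sesame' prints Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==, which shows that Base64 only encodes the credentials and does not encrypt them.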
Task 1: Configure Pepperdata to Monitor Application History
Procedure
- In Cloudera Manager, locate the Enable JobHistory Monitoring parameter, and select it.
- To enable Spark application monitoring, enable the Spark dependency, and enable the associated Pepperdata CSD parameter.
  - Locate the SPARK_ON_YARN Service dependency, and for Pepperdata (Service-Wide), select Spark.
  - Locate the Enable Spark Application History parameter, and select it.
  Note: If you’re using Application Profiler to fetch history data for Spark apps, you can customize the connection timeout value and/or add a second Spark History Server for monitoring. See Configure Spark History Servers.
Task 2: (Kerberized Clusters) Configure HTTP/HTTPS Endpoint Authentication
If the core services of the ResourceManagers or the MapReduce Job History Server are Kerberized (secured with Kerberos), add the authentication type for the auxiliary HTTP/HTTPS endpoint service to the Pepperdata configuration.
Even when the core services use kerberos authentication, the auxiliary services, such as HTTP/HTTPS, can use either simple or kerberos authentication.
Prerequisites
- Be sure that the Kerberos principal has access to the ResourceManager and MapReduce Job History Server endpoints (HTTP or HTTPS).
- (CDH and CDP Private Cloud Base) Be sure that you selected Enable Access to Kerberized Cluster Components during the installation process (Task 2: Add Pepperdata Service to Cloudera Manager). (For CDP Public Cloud, Pepperdata automatically enables this for Kerberized clusters.)
Procedure
- For your Kerberized ResourceManager host, determine its authentication type by running the following cURL command, where {your-protocol} is http or https:
  curl --tlsv1.2 -kI {your-protocol}://RM_HOST:PORT/ws/v1/cluster/info | grep WWW-Authenticate
  - If the returned response is WWW-Authenticate: Negotiate, the authentication type (your-rm-auth-type) is kerberos.
  - Otherwise nothing is returned, and the authentication type (your-rm-auth-type) is simple.
- For your Kerberized MapReduce Job History Server host, determine its authentication type by running the following cURL command, where {your-protocol} is http or https:
  curl --tlsv1.2 -kI {your-protocol}://JHS_HOST:PORT/ws/v1/history | grep WWW-Authenticate
  - If the returned response is WWW-Authenticate: Negotiate, the authentication type (your-jhs-auth-type) is kerberos.
  - Otherwise nothing is returned, and the authentication type (your-jhs-auth-type) is simple.
- (simple authentication type) Add the environment variables for the HTTP/HTTPS endpoint’s simple authentication type for the ResourceManager and the MapReduce Job History Server.
  By default, Pepperdata sets the authentication type to kerberos.
  - If your authentication type is kerberos, skip this step.
  - If your authentication type is simple, perform this step to override the default values and set them to simple.
  Use Cloudera Manager to add the following snippet to the Pepperdata > Configuration > PepAgent > PepAgent Environment Advanced Configuration Snippet (Safety Valve) template.
  # For ResourceManager:
  PD_JOBHISTORY_RESOURCE_MANAGER_HTTP_AUTH_TYPE=simple
  # For MapReduce Job History Server:
  PD_JOBHISTORY_MR_HISTORY_SERVER_HTTP_AUTH_TYPE=simple
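The detection rule in this procedure can be sketched as a small script: read the response headers, report kerberos when a WWW-Authenticate: Negotiate header is present, and report simple otherwise. The function names classify_auth_type and auth_override_line are hypothetical helpers, not Pepperdata tooling; only the environment variable names and the cURL endpoints come from this page.

```shell
#!/usr/bin/env bash
# Illustrative helpers only; the function names are assumptions, not Pepperdata tooling.

# Classify an endpoint's HTTP auth type from response headers read on stdin.
# Prints "kerberos" when a "WWW-Authenticate: Negotiate" header is present,
# and "simple" otherwise. Header names are matched case-insensitively.
classify_auth_type() {
  if grep -qi '^WWW-Authenticate:[[:space:]]*Negotiate'; then
    echo kerberos
  else
    echo simple
  fi
}

# Print the PepAgent environment override for one service, or nothing when
# the type is kerberos (the default needs no override).
# $1 = environment variable name from this page, $2 = detected auth type.
auth_override_line() {
  local var_name="$1" auth_type="$2"
  if [ "$auth_type" = simple ]; then
    printf '%s=simple\n' "$var_name"
  fi
}

# Example usage (RM_HOST and PORT are placeholders; not run here):
#   rm_type=$(curl --tlsv1.2 -skI "https://RM_HOST:PORT/ws/v1/cluster/info" | classify_auth_type)
#   auth_override_line PD_JOBHISTORY_RESOURCE_MANAGER_HTTP_AUTH_TYPE "$rm_type"
```

Because an unreachable host also returns no WWW-Authenticate header, a result of simple is only meaningful if the cURL request itself succeeded; it is worth checking the HTTP status line before adding an override.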
Task 3: (Basic Access Authentication) Add BA Authentication Credentials
For Basic access (BA) authentication, add the BA authentication credentials for the monitored applications’ servers to the Pepperdata configuration.
Procedure
- Use Cloudera Manager to add the following snippet to the Pepperdata > Configuration > PepAgent > PepAgent Environment Advanced Configuration Snippet (Safety Valve) template.
  Be sure to substitute your user name and password for the your-username and your-password placeholders. (The same environment variables are used to configure the BA authentication credentials for the ResourceManager and MapReduce Job History Server.)
  # For ResourceManager and MapReduce Job History Server
  PD_AGENT_SIMPLE_OR_BASIC_AUTH_USERNAME=your-username
  PD_AGENT_BASIC_AUTH_PASSWORD=your-password
  # For Spark History Server
  PD_JOBHISTORY_SPARK_HISTORY_BASIC_AUTH_USERNAME=your-username
  PD_JOBHISTORY_SPARK_HISTORY_BASIC_AUTH_PASSWORD=your-password
Task 4: Activate Application History Monitoring by Restarting PepAgent
On the MapReduce Job History Server host, use Cloudera Manager to start/restart the PepAgent.
Procedure
- If the Pepperdata services are not yet running, select the Start action for the Pepperdata service.
- Otherwise, select the Restart action for the Pepperdata service.
Task 5: Access the Application Profiler on the Pepperdata Dashboard
The Application Profiler interface is integrated into Application Spotlight in the Pepperdata Dashboard.
Task 6: (Hadoop 2) Confirm Near Real-Time Data Collection
To confirm that Application Profiler is correctly configured for near real-time monitoring in Hadoop 2, view the data collection process stats (MapReduce Job History Server retrieval).
Be sure to replace the your-jobhistory-server-host placeholder with the host name of your actual MapReduce Job History Server.
http://your-jobhistory-server-host:50505/JobHistoryMonitor
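As a minimal sketch, the stats URL can also be built and fetched from a shell instead of a browser. The helper name jobhistory_monitor_url is hypothetical; the port 50505 and the /JobHistoryMonitor path are taken from the URL above, and the host you pass in is your own.

```shell
#!/usr/bin/env bash
# Illustrative helper only; the function name is an assumption.

# Build the JobHistoryMonitor stats URL for a given host.
# Port 50505 and the /JobHistoryMonitor path come from this page.
jobhistory_monitor_url() {
  local host="$1"
  if [ -z "$host" ]; then
    echo "usage: jobhistory_monitor_url HOST" >&2
    return 1
  fi
  printf 'http://%s:50505/JobHistoryMonitor\n' "$host"
}

# Example usage (not run here; requires network access to the host):
#   curl -s "$(jobhistory_monitor_url your-jobhistory-server-host)"
```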