Configure Query Spotlight: Impala (Parcel)
Prerequisites
Before you begin configuring Query Spotlight, ensure that your system meets the required prerequisites.
- Pepperdata must be installed on the host(s) to be configured for Query Spotlight.
- Your cluster uses a supported combination of Impala version and platform; see the entries for Query Spotlight 8.0.x in the table of Supported Impala-Distro Combinations by Query Spotlight Version.
Task 1: Enable Fetching of Impala Query Data
To enable Pepperdata to fetch Impala query data, add the required properties to the Pepperdata configuration.
Procedure
-
Add the Impala query properties to the Pepperdata configuration to enable query monitoring and to configure where to read the query profiles from.
-
Locate the Pepperdata > Configuration > Pepperdata (Service-Wide) > Enable Impala query monitoring parameter, and select it.
-
(Optional) To read the query profiles from a non-default location (any location other than /var/log/impalad/profiles/), add the following snippet to the Pepperdata > Service Wide > PepAgent Advanced Configuration Snippet (Safety Valve) for conf/pepperdata-site.xml template, as an XML block. Be sure to substitute the location of your query profiles for the your-impalad-profiles-location placeholder.
<property>
  <name>pepperdata.impala.query.queryLogDir</name>
  <value>your-impalad-profiles-location</value>
</property>
-
-
(HTTPS impalad daemon endpoints) If your impalad daemon is configured for HTTPS instead of HTTP, add the pepperdata.agent.genericJsonFetch.impala.httpsEnabled property so that the fetcher for information about in-flight Impala queries uses the HTTPS endpoint instead of the default HTTP endpoint (http://LOCALHOST:25000/queries?json).
Add the following snippet to the Pepperdata > Service Wide > PepAgent Advanced Configuration Snippet (Safety Valve) for conf/pepperdata-site.xml template, as an XML block.
<property>
  <name>pepperdata.agent.genericJsonFetch.impala.httpsEnabled</name>
  <value>true</value>
</property>
-
(Digest authentication for the Impala Web UI for debugging) If the impalad daemon for your Impala Web UI for debugging is secured by digest authentication, add the authentication credentials.
Note: This step is for using digest auth to secure the impalad daemon of the Impala Web UI for debugging, not for securing the Impala core services with Kerberos or LDAP.
Add the following snippet to the Pepperdata > Service Wide > PepAgent Advanced Configuration Snippet (Safety Valve) for conf/pepperdata-site.xml template, as an XML block. Be sure to substitute your username and password for the your-username and your-password placeholders in the following code snippet.
<property>
  <name>pepperdata.agent.genericJsonFetch.impala.http.authentication.type</name>
  <value>digest</value>
</property>
<property>
  <name>pepperdata.agent.genericJsonFetch.impala.auth.username</name>
  <value>your-username</value>
</property>
<property>
  <name>pepperdata.agent.genericJsonFetch.impala.auth.password</name>
  <value>your-password</value>
</property>
<property>
  <name>pepperdata.agent.genericJsonFetch.impala.httpsEnabled</name>
  <value>true</value>
</property>
-
(Kerberos for the Impala Web UI for debugging) If the impalad daemon for your Impala Web UI for debugging is Kerberized, add the authentication credentials.
Note: This step is for using Kerberos to secure the impalad daemon of the Impala Web UI for debugging, not for securing the Impala core services with Kerberos or LDAP.
Add the following snippet to the Pepperdata > Service Wide > PepAgent Advanced Configuration Snippet (Safety Valve) for conf/pepperdata-site.xml template, as an XML block.
Be sure to substitute your Kerberos principal and the path of the corresponding keytab file for the your-kerberos-principal and your-kerberos-keytab-pathname placeholders in the following code snippet.
If you already configured the cluster for Kerberos during the installation process (that is, you selected Enable Access to Kerberized Cluster Components in Task 2: Add Pepperdata Service to Cloudera Manager), do not assign the fetcher properties except to override the cluster-level assignments.
The fetcher properties (pepperdata.agent.genericJsonFetch.impala.kerberos.principal and pepperdata.agent.genericJsonFetch.impala.keytab.location) are inherited from the cluster-level properties that were automatically assigned when you configured the cluster.
<property>
  <name>pepperdata.agent.genericJsonFetch.impala.http.authentication.type</name>
  <value>kerberos</value>
</property>
<property>
  <name>pepperdata.agent.genericJsonFetch.impala.kerberos.principal</name>
  <value>your-kerberos-principal</value>
</property>
<property>
  <name>pepperdata.agent.genericJsonFetch.impala.keytab.location</name>
  <value>your-kerberos-keytab-pathname</value>
</property>
<property>
  <name>pepperdata.agent.genericJsonFetch.impala.httpsEnabled</name>
  <value>true</value>
</property>
-
(Re)start the PepAgent.
-
If the Pepperdata services are not yet running, in Cloudera Manager, select the Start action for the Pepperdata service.
-
Otherwise, in Cloudera Manager, select the Restart action for the Pepperdata service.
-
-
Contact Pepperdata Support to request that Impala query metrics be activated for your Pepperdata dashboard.
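As an optional sanity check for the endpoint settings in this task, the following minimal shell sketch prints the URL that the fetcher is expected to read. It is not part of the Pepperdata tooling: the function name impala_queries_url is hypothetical, and the sketch assumes that an HTTPS-enabled impalad serves the same port (25000) and path as the default HTTP endpoint shown above.

```shell
# Hypothetical helper: print the URL of the impalad endpoint that lists
# in-flight queries.
#   $1  host (defaults to localhost)
#   $2  "true" if the impalad web server is configured for HTTPS
# Assumption: HTTPS uses the same port (25000) and path as the HTTP default.
impala_queries_url() (
  host="${1:-localhost}"
  if [ "${2:-false}" = "true" ]; then
    scheme="https"
  else
    scheme="http"
  fi
  printf '%s://%s:25000/queries?json\n' "$scheme" "$host"
)

# Example probes (commented out so that nothing contacts the network here):
#   curl -sS "$(impala_queries_url localhost false)"
#   curl -sS --digest -u 'your-username:your-password' "$(impala_queries_url localhost true)"
#   curl -sS --negotiate -u : "$(impala_queries_url localhost true)"   # needs a valid Kerberos ticket
```

If a probe run from the PepAgent host returns JSON, the matching httpsEnabled and authentication settings are probably the ones to configure; treat this as a quick check, not a substitute for verifying the dashboard.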
Task 2: (Optional) Encrypt the Connect String for the Hive Metastore
If you want to encrypt the connect string for the Hive metastore, regardless of whether you’ll store it in the Pepperdata site file or an external file, use the Pepperdata password encryption script.
At a minimum, the unencrypted connect string must include the jdbc:hive2://YOUR-HOSTNAME:YOUR-PORTNUM/ string.
You can add as many connection properties/parameters as you need for your environment, separating them with a semicolon (;).
Example Connect Strings
- Without properties/parameters:
jdbc:hive2://localhost:10000/
- Add properties for authenticated environments:
jdbc:hive2://localhost:10000/;user=YOUR-USERNAME;password=YOUR-PASSWORD
- Multiple properties/parameters:
jdbc:hive2://<zookeeper quorum>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=<hiveserver2_namespace>
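The pattern behind these examples can be sketched as a small shell function. This is an illustration only: hive_connect_string is a hypothetical name, and the sketch simply appends each property after the trailing slash with a leading semicolon, without escaping values that themselves contain semicolons.

```shell
# Hypothetical helper: assemble a HiveServer2 JDBC connect string.
#   $1  hostname
#   $2  port number
#   $3… optional key=value connection properties
hive_connect_string() (
  host="$1"
  port="$2"
  shift 2
  url="jdbc:hive2://${host}:${port}/"
  # Append each property, separated by a semicolon as described above.
  for prop in "$@"; do
    url="${url};${prop}"
  done
  printf '%s\n' "$url"
)

# Usage:
#   hive_connect_string localhost 10000
#   hive_connect_string localhost 10000 user=YOUR-USERNAME password=YOUR-PASSWORD
```

The second call reproduces the authenticated example above; the ZooKeeper form takes a quorum in place of a single host and port, so it is not covered by this sketch.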
Procedure
-
Run the Pepperdata encryption script.
/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/supervisor/encrypt_password.sh
-
At the "Enter the password to encrypt:" prompt, enter your connect string.
-
Copy (or make note of) the resulting encrypted connect string.
For example, in the following output from the script, the encrypted connect string is W+ONY3ZcR6QLP5sqoRqcpA=2.
Encrypted password is W+ONY3ZcR6QLP5sqoRqcpA=2
Use this encrypted result as the value for the pepperdata.jdbcfetch.hive.connect.string.encrypted property (which you’ll configure later), or store it in the external file specified by the pepperdata.jdbcfetch.hive.connect.string.encrypted.file property.
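If you choose the external-file option, the copy step can be sketched in shell as follows. Both function names are hypothetical, and the sketch makes assumptions that this guide does not confirm: that the value is stored on a single line without a trailing newline, and that owner-only permissions are acceptable (the file must remain readable by the account that runs the PepAgent).

```shell
# Hypothetical helper: read the encryption script's output on stdin and print
# only the value from the "Encrypted password is ..." line.
extract_encrypted_value() {
  sed -n 's/^Encrypted password is //p'
}

# Hypothetical helper: write the value to a file that only its owner can read.
#   $1  encrypted value
#   $2  destination file
# Assumption: no trailing newline is written; adjust if your setup needs one.
save_encrypted_value() (
  umask 077
  printf '%s' "$1" > "$2"
)

# Usage, after saving the script's output to a file of your choosing:
#   value="$(extract_encrypted_value < script-output.txt)"
#   save_encrypted_value "$value" YOUR-PATH-TO-JDBCSTRING-FILE
```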
Task 3: Enable Fetching of Hive Databases and Tables’ Metadata
To enable Pepperdata to fetch data from the Hive metastore, add the required properties to the Pepperdata configuration.
Procedure
-
Use Cloudera Manager to configure the hostname by adding the following snippet to the PepAgent Advanced Configuration Snippet (Safety Valve) for conf/pepperdata-site.xml template, as an XML block.
-
Select a host that is configured to be a Hive client (and from which you launch Hive queries).
-
Be sure to substitute the fully-qualified, canonical hostname of the selected host for the YOUR.CANONICAL.HOSTNAME placeholder in the following code snippet.
<property>
  <name>pepperdata.jdbcfetch.hive.pepagent.host</name>
  <value>YOUR.CANONICAL.HOSTNAME</value>
  <description>Host where the fetching should be enabled.</description>
</property>
-
-
Configure the connect string.
Add one of the following properties, depending on your environment and security requirements.
Be sure to substitute your information for the YOUR... placeholders.
-
Plain text connect string stored in the Pepperdata site file.
At a minimum, the connect string must include the jdbc:hive2://YOUR-HOSTNAME:YOUR-PORTNUM/ string. You can add as many connection properties/parameters as you need for your environment, separating them with a semicolon (;).
Example Connect Strings
- Without properties/parameters:
jdbc:hive2://localhost:10000/
- Add properties for authenticated environments:
jdbc:hive2://localhost:10000/;user=YOUR-USERNAME;password=YOUR-PASSWORD
- Multiple properties/parameters:
jdbc:hive2://<zookeeper quorum>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=<hiveserver2_namespace>
<property>
  <name>pepperdata.jdbcfetch.hive.connect.string</name>
  <value>jdbc:hive2://YOUR-HOSTNAME:YOUR-PORTNUM/${;OPTIONAL-ADDITIONAL-PROPERTY}</value>
  <description>JDBC Connect string to be used.</description>
</property>
-
Plain text connect string stored in an external file:
<property>
  <name>pepperdata.jdbcfetch.hive.connect.string.file</name>
  <value>YOUR-PATH-TO-JDBCSTRING-FILE</value>
  <description>Path to file containing JDBC Connect string.</description>
</property>
-
Encrypted connect string (the result of Task 2) stored in the Pepperdata site file:
<property>
  <name>pepperdata.jdbcfetch.hive.connect.string.encrypted</name>
  <value>YOUR-ENCRYPTED-TEXT</value>
  <description>Encrypted JDBC Connect string to be used.</description>
</property>
-
Encrypted connect string (the result of Task 2) stored in an external file:
<property>
  <name>pepperdata.jdbcfetch.hive.connect.string.encrypted.file</name>
  <value>YOUR-PATH-TO-JDBCSTRING-FILE</value>
  <description>Path to file containing encrypted JDBC Connect string.</description>
</property>
-
-
(Kerberized Clusters) If the hiveserver2 service is Kerberized, add the properties for the Kerberos principal and keytab to the Pepperdata site file.
-
Enable fetching from a Kerberized HiveServer2.
<property>
  <name>pepperdata.jdbcfetch.hive.kerberos.enabled</name>
  <value>true</value>
  <description>Should kerberos be used when connecting to Hive?</description>
</property>
-
(CDH and CDP Private Cloud Base) Configure the principal and keytab.
• If you already configured the cluster for Kerberos during the installation process (that is, you selected Enable Access to Kerberized Cluster Components in Task 2: Add Pepperdata Service to Cloudera Manager), you do not need to manually configure the principal and keytab, and you should skip this substep.
• For installations on CDP Public Cloud, Pepperdata automatically configures this for Kerberized clusters; therefore you should skip this substep.
Be sure to substitute your information for the YOUR... placeholders.
<property>
  <name>pepperdata.jdbcfetch.hive.kerberos.principal</name>
  <value>YOUR_PRINCIPAL/HOST@DOMAIN.COM</value>
  <description>The Kerberos principal to use to authenticate with the Hive client.</description>
</property>
<property>
  <name>pepperdata.jdbcfetch.hive.kerberos.keytab.location</name>
  <value>YOUR-PATH-TO-KEYTAB-FILE</value>
  <description>Path to the keytab file for the specified principal.</description>
</property>
-
-
Add the hive-jdbc-*standalone.jar JAR file to the PepAgent’s classpath on the host that you selected in step 1.
-
Find the fully-qualified name of the JAR, which depends on the cluster’s distro.
-
The filename pattern is hive-jdbc-*standalone.jar.
-
For Parcel installations on Cloudera CDH/CDP Runtime, the JAR file is located in /opt/cloudera/parcels/CDH/jars/.
-
You can use the find command to locate all available JAR files, and output their names to the console; for example:
find /opt/cloudera/parcels/CDH/jars/ /usr/lib/hive/lib/ /usr/lib/hive/jdbc/ -name "hive-jdbc-*standalone.jar" 2>/dev/null
/usr/lib/hive/lib/hive-jdbc-standalone.jar
Make a note of the JAR file to use. You’ll need this information in the next substep, as the value for the YOUR-HIVE-JDBC-JAR placeholder.
-
-
Use Cloudera Manager to add the following snippet to the PepAgent Environment Advanced Configuration Snippet (Safety Valve) template.
Be sure to substitute the actual path and filename for the YOUR-HIVE-JDBC-JAR placeholder.
PD_EXTRA_CLASSPATH_ITEMS=YOUR-HIVE-JDBC-JAR
-
-
In Cloudera Manager, select the Restart action for the PepAgent service.
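The JAR lookup and the classpath line from the earlier step can be combined in a short shell sketch. The function names are hypothetical, and picking the first match in sorted order is an arbitrary choice made for illustration; if several JAR files are found, choose the one that matches your Hive version.

```shell
# Hypothetical helper: print the first hive-jdbc standalone JAR (in sorted
# order) found under the given directories; prints nothing if none exists.
first_hive_jdbc_jar() {
  find "$@" -name 'hive-jdbc-*standalone.jar' 2>/dev/null | sort | head -n 1
}

# Hypothetical helper: print the line to paste into the PepAgent Environment
# Advanced Configuration Snippet (Safety Valve).
#   $1  full path of the JAR file
classpath_line() {
  printf 'PD_EXTRA_CLASSPATH_ITEMS=%s\n' "$1"
}

# Usage, on the host selected in step 1:
#   jar="$(first_hive_jdbc_jar /opt/cloudera/parcels/CDH/jars/ /usr/lib/hive/lib/ /usr/lib/hive/jdbc/)"
#   [ -n "$jar" ] && classpath_line "$jar"
```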