Adding Apache® Impala Query Metrics (RPM/DEB)
Supported Versions of Impala: Versions supported by any Cloudera CDH/CDP distribution that Pepperdata supports; see Pepperdata-Platform Support
Cluster administrators often need precise resource usage information, even down to the query level of detail, in order to create accurate chargeback reports. If you’re using Apache Impala, you can enable Pepperdata to collect Impala query metrics for CPU and memory usage. When the queries are finished, Pepperdata reads the Impala query profiles to calculate the resource usage.
For more information about Apache® Impala query metrics, see Monitoring Apache® Impala Query Metrics.
• When you enable Impala query monitoring, the metrics appear on the Charts page.
• When you enable both Impala query monitoring and Query Spotlight, the metrics appear on the Charts page and in Query Spotlight.
• If you enable Query Spotlight but not Impala query monitoring, Impala metrics do not appear in Query Spotlight, although Query Spotlight still monitors any other query types that you’ve configured Pepperdata to gather.
Prerequisites
Before you enable Impala query monitoring, ensure that your system meets the required prerequisites.
- Pepperdata PepAgent (pepagentd) and PepCollector (pepcollectd) must be installed and running on all hosts on which the Impala impalad daemon is running.
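One way to confirm this prerequisite on a host is sketched below. It assumes that the services are managed by systemd under the names pepagentd and pepcollectd. The function names (is_active, check_services) are illustrative and are not part of Pepperdata.

```shell
# Sketch: report whether the Pepperdata services named in the prerequisite are active.
# Assumption: the services are systemd units named pepagentd and pepcollectd.

# is_active SERVICE: succeeds if systemd reports the unit as active.
is_active() {
  systemctl is-active --quiet "$1"
}

# check_services CHECKER SERVICE...: runs CHECKER on each service, prints one
# status line per service, and returns non-zero if any service is not active.
check_services() {
  local checker=$1
  shift
  local rc=0 svc
  for svc in "$@"; do
    if "$checker" "$svc"; then
      echo "$svc: running"
    else
      echo "$svc: NOT running"
      rc=1
    fi
  done
  return "$rc"
}

# Usage on a host that runs impalad:
#   check_services is_active pepagentd pepcollectd
```

Passing the checker as an argument keeps the reporting logic separate from systemd, so a different status command can be substituted on hosts that do not use systemctl.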
Procedure
-
On any coordinator (a host on which the Impala impalad daemon is running), open the host’s Pepperdata site file, pepperdata-site.xml, for editing.
By default, the Pepperdata site file, pepperdata-site.xml, is located in /etc/pepperdata. If you customized the location, the file is specified by the PD_CONF_DIR environment variable. See Change the Location of pepperdata-site.xml for details.
-
Add the properties to enable query monitoring, and (optionally) to configure a non-default location from which to read the query profiles.
By default, the PepAgent reads profiles of completed queries from /var/log/impalad/profiles/.
  - To use the default location, omit the pepperdata.impala.query.queryLogDir property.
  - To use a different location, add the pepperdata.impala.query.queryLogDir property, and be sure to substitute your location for the your-impalad-profiles-location placeholder.
<property>
  <name>pepperdata.impala.query.monitoring.enabled</name>
  <value>true</value>
</property>
<property>
  <name>pepperdata.impala.query.queryLogDir</name>
  <value>your-impalad-profiles-location</value>
</property>
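Before relying on either location, it can help to confirm that the profiles directory actually exists on the host. The following is a minimal sketch. The function name profile_dir_status is illustrative, and the check runs as the current user, which is not necessarily the account that the PepAgent runs under.

```shell
# Sketch: check that the Impala query profiles directory exists and is readable.
# Assumption: run this as (or on behalf of) the account that needs to read the
# profiles; this document does not state which account the PepAgent uses.

# profile_dir_status [DIR]: prints a one-line status for DIR, which defaults to
# the default profiles location, and returns non-zero if DIR is not usable.
profile_dir_status() {
  local dir=${1:-/var/log/impalad/profiles/}
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    return 1
  fi
  if [ ! -r "$dir" ] || [ ! -x "$dir" ]; then
    echo "unreadable: $dir"
    return 1
  fi
  echo "ok: $dir"
}

# Usage:
#   profile_dir_status                                   # default location
#   profile_dir_status your-impalad-profiles-location    # custom location
```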
-
(HTTPS impalad daemon endpoints) If your impalad daemon is configured for HTTPS instead of HTTP, add the pepperdata.agent.genericJsonFetch.impala.httpsEnabled property so that the fetcher for information about in-flight Impala queries uses the HTTPS endpoint instead of the default HTTP endpoint (http://LOCALHOST:25000/queries?json).
<property>
  <name>pepperdata.agent.genericJsonFetch.impala.httpsEnabled</name>
  <value>true</value>
</property>
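If you are unsure which scheme your impalad daemon serves, a quick probe from the coordinator host can tell you. The sketch below assumes that the daemon listens on localhost port 25000, as in the default endpoint above. The function names are illustrative and are not part of Pepperdata.

```shell
# Sketch: build the impalad queries endpoint URL and probe it with curl.
# Assumption: impalad's debug Web UI listens on localhost:25000.

# impala_queries_url SCHEME [HOST] [PORT]: prints the queries endpoint URL.
impala_queries_url() {
  local scheme=$1 host=${2:-localhost} port=${3:-25000}
  echo "${scheme}://${host}:${port}/queries?json"
}

# probe_impala SCHEME: prints the HTTP status code returned by the endpoint,
# or 000 if the connection or TLS handshake fails.
probe_impala() {
  curl --silent --output /dev/null --max-time 5 \
       --write-out '%{http_code}\n' "$(impala_queries_url "$1")"
}

# Usage on a coordinator host:
#   probe_impala http
#   probe_impala https
```

A 200 for one scheme and 000 for the other suggests which endpoint is in use. A 401 suggests that the Web UI requires authentication. With a self-signed certificate, curl reports 000 unless you pass it a CA bundle (--cacert) or, for a one-off test only, --insecure.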
-
(Digest authentication for the Impala Web UI for debugging) If the impalad daemon for your Impala Web UI for debugging is secured by digest authentication, add the authentication credentials.
Note: This step is for using digest authentication to secure the impalad daemon of the Impala Web UI for debugging, not for securing the Impala core services with Kerberos or LDAP.
Be sure to substitute your username and password for the your-username and your-password placeholders in the following code snippet.
<property>
  <name>pepperdata.agent.genericJsonFetch.impala.http.authentication.type</name>
  <value>digest</value>
</property>
<property>
  <name>pepperdata.agent.genericJsonFetch.impala.auth.username</name>
  <value>your-username</value>
</property>
<property>
  <name>pepperdata.agent.genericJsonFetch.impala.auth.password</name>
  <value>your-password</value>
</property>
<property>
  <name>pepperdata.agent.genericJsonFetch.impala.httpsEnabled</name>
  <value>true</value>
</property>
Malformed XML files can cause operational errors that can be difficult to debug. To prevent such errors, we recommend that you use a linter, such as xmllint, after you edit any .xml configuration file.
-
(Kerberos for the Impala Web UI for debugging) If the impalad daemon for your Impala Web UI for debugging is Kerberized, add the authentication credentials.
Support for using Kerberos to secure the impalad daemon of the Impala Web UI for debugging requires Supervisor v7.0.12 or later.
Be sure to substitute your Kerberos principal and the path of the corresponding keytab file for the your-kerberos-principal and your-kerberos-keytab-pathname placeholders in the following code snippet.
If you already configured the PD_AGENT_PRINCIPAL and PD_AGENT_KEYTAB_LOCATION environment variables during the installation process (Task 4. (Kerberized clusters) Enable Kerberos Authentication), you do not need to add the principal and keytab properties, except to override the cluster-level assignments. In that case, the fetcher properties (pepperdata.agent.genericJsonFetch.impala.kerberos.principal and pepperdata.agent.genericJsonFetch.impala.keytab.location) are inherited from the properties that were automatically assigned when you installed Pepperdata in the cluster.
<property>
  <name>pepperdata.agent.genericJsonFetch.impala.http.authentication.type</name>
  <value>kerberos</value>
</property>
<property>
  <name>pepperdata.agent.genericJsonFetch.impala.kerberos.principal</name>
  <value>your-kerberos-principal</value>
</property>
<property>
  <name>pepperdata.agent.genericJsonFetch.impala.keytab.location</name>
  <value>your-kerberos-keytab-pathname</value>
</property>
<property>
  <name>pepperdata.agent.genericJsonFetch.impala.httpsEnabled</name>
  <value>true</value>
</property>
Malformed XML files can cause operational errors that can be difficult to debug. To prevent such errors, we recommend that you use a linter, such as xmllint, after you edit any .xml configuration file.
-
Save your changes and close the file.
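After you save the file, a well-formedness check can catch XML mistakes before the restart. The sketch below uses xmllint for that check. It assumes that PD_CONF_DIR, when set, names the directory that contains pepperdata-site.xml. The function names are illustrative.

```shell
# Sketch: locate pepperdata-site.xml and check that it is well-formed XML.
# Assumption: PD_CONF_DIR (if set) is the directory holding the site file;
# otherwise the default /etc/pepperdata is used.

# site_file_path: prints the expected path of the Pepperdata site file.
site_file_path() {
  echo "${PD_CONF_DIR:-/etc/pepperdata}/pepperdata-site.xml"
}

# check_site_file [FILE]: runs xmllint on FILE (default: site_file_path).
# Returns 0 if well-formed, 1 if not, and 2 if xmllint is not installed.
check_site_file() {
  local file=${1:-$(site_file_path)}
  if ! command -v xmllint >/dev/null 2>&1; then
    echo "xmllint not found on this host" >&2
    return 2
  fi
  # --noout suppresses the parsed document; only parse errors are printed.
  if xmllint --noout "$file"; then
    echo "OK: $file is well-formed"
  else
    echo "ERROR: $file is not well-formed XML" >&2
    return 1
  fi
}

# Usage:
#   check_site_file
```

This check verifies only that the XML parses. It does not confirm that property names or values are correct.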
-
Restart the PepAgent.
You can use either the service (if provided by your OS) or systemctl command:
sudo service pepagentd restart
sudo systemctl restart pepagentd
If any of the process’s startup checks fail, an explanatory message appears and the process does not start. Address the issues and try again to start the process.
Tip: Any time you modify the YAML rules file, you must reload the rules file by restarting PepAgent.
-
Repeat steps 1–7 on every coordinator host in your cluster.
Important: If you skip the configuration process on a coordinator host, Pepperdata cannot collect metrics for queries that run on that host.
-
Contact Pepperdata Support to request that Impala query metrics be activated for your Pepperdata dashboard.
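When a cluster has many coordinators, the restart portion of the procedure can be scripted once each host's site file has been edited. The following is a sketch under stated assumptions: coordinators.txt is a hypothetical file with one hostname per line, you have SSH access to each host with passwordless sudo, and the hosts use systemctl. The function names are illustrative. The script does not edit pepperdata-site.xml for you.

```shell
# Sketch: run a per-host action for every coordinator listed in a file.
# Assumptions: a hypothetical hosts file with one hostname per line, and SSH
# access with passwordless sudo on each host.

# restart_pepagent HOST: restarts the PepAgent on HOST over SSH.
# ssh -n keeps ssh from consuming the hosts file on stdin.
restart_pepagent() {
  ssh -n "$1" sudo systemctl restart pepagentd
}

# for_each_host HOSTS_FILE ACTION: calls ACTION with each hostname in
# HOSTS_FILE, skipping blank lines and lines that start with '#'. Reports
# hosts where ACTION failed and returns non-zero if any did.
for_each_host() {
  local hosts_file=$1 action=$2 host rc=0
  while IFS= read -r host || [ -n "$host" ]; do
    case $host in
      ''|'#'*) continue ;;
    esac
    if ! "$action" "$host" </dev/null; then
      echo "FAILED: $host" >&2
      rc=1
    fi
  done < "$hosts_file"
  return "$rc"
}

# Usage:
#   for_each_host coordinators.txt restart_pepagent
```

Because the loop continues past failures and lists the affected hosts, a coordinator that was skipped or failed to restart is visible at the end of the run rather than silently missing from the metrics.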