(Streaming Spotlight) Configure Streaming Spotlight (Parcel)

Supported versions: See the entries for Streaming Spotlight 8.0.x in the table of Streaming Processor Support by Streaming Spotlight Version

Prerequisites

Before you begin configuring Streaming Spotlight, ensure that your system meets the required prerequisites.

  • Pepperdata must be installed on the host(s) to be configured for Streaming Spotlight.

  • Your cluster uses a supported stream processor (see Stream Processors in the System Requirements) that is already installed.

  • (Kafka streaming) Be sure that the configuration file on each Kafka host, server.properties, has the correct ZooKeeper connection string (host:port) for the zookeeper.connect property, and the correct broker ID for the broker.id property.

  • (Kafka streaming) Kafka brokers must enable local, unencrypted, unauthenticated access to the JMX port.
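    As a sketch, the two server.properties entries might look like the following (the broker ID and ZooKeeper hosts are examples; use your cluster's actual values):

    ```shell
    # Hypothetical excerpt of server.properties for one Kafka broker.
    # broker.id must be unique per broker; zookeeper.connect lists the
    # ZooKeeper ensemble as host:port pairs (example values shown).
    cat > /tmp/server.properties.example <<'EOF'
    broker.id=1
    zookeeper.connect=zk1.example.com:2181,zk2.example.com:2181
    EOF

    # Quick sanity check that both properties are set.
    grep -cE '^(broker\.id|zookeeper\.connect)=' /tmp/server.properties.example
    ```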

Task 1: (Kafka) Configure the Environment to Enable Streaming Spotlight

To enable Streaming Spotlight to operate with your Kafka stream processing, the environment must be configured as follows:

  • JAVA_HOME must point to Java 8, which Kafka requires.
  • JMX_PORT must be set to a unique port number, to tell Kafka that JMX metrics are to be exported through this port.
  • Enable JMX local access (not remote access): com.sun.management.jmxremote.ssl=false and com.sun.management.jmxremote.local.only=true
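  The requirements above can be sketched as environment settings on a broker host. This is only an illustration for hosts where you manage the environment directly (the Java path and port number are example values; KAFKA_JMX_OPTS is the variable Kafka's launch scripts read for extra JMX options):

  ```shell
  # Example environment for a Kafka broker host (adjust paths and port).
  export JAVA_HOME=/usr/java/jdk1.8.0_252   # Java 8, as Kafka requires
  export JMX_PORT=9393                      # unique port for exported JMX metrics

  # Local-only, unencrypted JMX access (no remote access):
  export KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=true"
  ```

  In a Cloudera Manager deployment, the equivalent settings are made through the Kafka configuration pages, as described in the procedure below.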

Procedure

  1. Using Cloudera Manager, configure Kafka to be compatible with the Streaming Spotlight requirements.

    • Set the JMX_PORT to a unique port number, to tell Kafka that JMX metrics are to be exported through this port. Navigate to Kafka > Configuration > JMX Port, and for the Kafka Broker Default Group, enter the port number.

    • Enable JMX local access (not remote access). Navigate to Kafka > Configuration > Additional Broker Java Options broker_java_opts, and add -Dcom.sun.management.jmxremote.ssl=false and -Dcom.sun.management.jmxremote.local.only=true.

  2. In Cloudera Manager, select the Restart action for the Kafka service.

Task 2: (Non-Hadoop Clusters) Enable Pepperdata PepAgent

If you’re installing Streaming Spotlight on a cluster without Hadoop, such as a Kafka-only cluster, the Pepperdata PepAgent must be configured to run without Hadoop.

If you’re installing Streaming Spotlight in a cluster that has Hadoop, skip this task. If you perform this task in a Hadoop cluster, Pepperdata will not operate correctly.

Procedure

  • In Cloudera Manager, locate the Pepperdata > Configuration > PepAgent Default Group > Run Pepperdata in Non-Hadoop Environment parameter, and select it.

    If you already selected the Run Pepperdata in Non-Hadoop Environment parameter when you added the Pepperdata service to Cloudera Manager, be careful not to clear the option by selecting (clicking) it again.

Task 3: (Optional) Configure Password Authentication for JMX Connections

To enable Pepperdata to fetch data from Kafka servers whose JMX connections are secured with password authentication, add the required variables to the PepAgent configuration.

Procedure

  1. Encrypt your JMX connection password, and copy/note the result.

    1. Run the Pepperdata password encryption script.

      /opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/supervisor/encrypt_password.sh

    2. At the Enter the password to encrypt: prompt, enter your JMX connection password.

    3. Copy (or make note of) the resulting encrypted password.

      For example, in the following output from the script, the encrypted password is the string W+ONY3ZcR6QLP5sqoRqcpA=2.

      Encrypted password is W+ONY3ZcR6QLP5sqoRqcpA=2

  2. Add the properties for password authentication.

    In Cloudera Manager, navigate to Pepperdata > Configuration, and locate and configure the following values for the PepAgent Default Group.

    • Kafka JMX Remote Authentication Enabled—Select it.

    • Kafka JMX Remote User—Enter your user name.

    • Kafka JMX Remote Encrypted Password—Enter your encrypted password.

Task 4: Enable Kafka Monitoring

To enable Pepperdata to fetch data from the stream processors, add the required variables to the PepAgent configuration on all the hosts where Streaming Spotlight is to run.

Prerequisites

  • (SASL_SSL-secured Kafka clusters) For Simple Authentication and Security Layer (SASL) protocol with SSL, encrypt the truststore password, which will be used in its encrypted form when you enter the Kafka Client Trust Store File Password later in the procedure.

    1. Run the Pepperdata encryption script.

      /opt/pepperdata/supervisor/encrypt_password.sh

    2. At the Enter the password to encrypt: prompt, enter your password.

    3. Copy (or make note of) the resulting encrypted password.

      For example, in the following output from the script, the encrypted password is the string W+ONY3ZcR6QLP5sqoRqcpA=2.

      Encrypted password is W+ONY3ZcR6QLP5sqoRqcpA=2

  • (SASL_SSL-secured and SASL_PLAINTEXT-secured Kafka clusters) For Simple Authentication and Security Layer (SASL) protocol, whether authentication is SSL-secured or not, format the SASL Java Authentication and Authorization Service (JAAS) configuration as required for the Kafka SASL JAAS Configuration value that you’ll enter later in the procedure.

    1. Locate your cluster’s jaas.conf file that contains the Client role. The file can be role-specific or define multiple roles, such as KafkaServer and Client. This typical snippet is from a file that has multiple roles.

      KafkaServer {
         com.sun.security.auth.module.Krb5LoginModule required
         doNotPrompt=true
         useKeyTab=true
         storeKey=true
         keyTab="kafka.keytab"
         principal="kafka/broker.pepperdata.demo@PEPPERDATA.DEMO";

         org.apache.kafka.common.security.scram.ScramLoginModule required;
      };

      Client {
         com.sun.security.auth.module.Krb5LoginModule required
         useKeyTab=true
         storeKey=true
         keyTab="kafka.keytab"
         principal="kafka/broker.pepperdata.demo@PEPPERDATA.DEMO";
      };
      
    2. Copy the portion of the Client configuration that is between the curly braces ({ and }) to a text editor, remove all the newline and extra/padding whitespace characters, and add serviceName="kafka" immediately before the terminating semicolon (;).

      • Continuing with the example snippet, the result with GSSAPI SASL mechanism would be:

        com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true storeKey=true keyTab="kafka.keytab" principal="kafka/broker.pepperdata.demo@PEPPERDATA.DEMO" serviceName="kafka";
        
      • If the cluster uses the PLAIN SASL mechanism, you must add the pdkafka. prefix to the path for the PlainLoginModule class; for example:

        pdkafka.org.apache.kafka.common.security.plain.PlainLoginModule required username="client" password="client-secret";
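      The flattening described in step 2 can also be scripted. This is only a sketch, assuming a jaas.conf laid out like the Client example above (the file path is hypothetical):

      ```shell
      # Create a sample jaas.conf to work on (example content).
      cat > /tmp/jaas.conf <<'EOF'
      Client {
         com.sun.security.auth.module.Krb5LoginModule required
         useKeyTab=true
         storeKey=true
         keyTab="kafka.keytab"
         principal="kafka/broker.pepperdata.demo@PEPPERDATA.DEMO";
      };
      EOF

      # Extract the lines between "Client {" and "};", collapse them to one
      # line, squeeze the whitespace, and insert serviceName="kafka" before
      # the terminating semicolon.
      sed -n '/^Client {/,/^};/p' /tmp/jaas.conf \
        | sed -e '1d' -e '$d' \
        | tr '\n' ' ' \
        | tr -s ' ' \
        | sed -e 's/^ //' -e 's/ *; *$/ serviceName="kafka";/'
      ```

      The output is the single-line value to paste into the Kafka SASL JAAS Configuration field.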
        
  • (SSL-secured Kafka clusters) Encrypt the following passwords, which you’ll enter in their encrypted forms later in the procedure.

    • Password for accessing the Kafka clients truststore (Kafka Client Trust Store File Password)
    • Password for accessing the Kafka clients keystore (Kafka Client Keystore File Password)
    • Password for accessing the Kafka clients key (Kafka Client Keystore Key Password)

    1. Run the Pepperdata encryption script.

      /opt/pepperdata/supervisor/encrypt_password.sh

    2. At the Enter the password to encrypt: prompt, enter your password.

    3. Copy (or make note of) the resulting encrypted password.

      For example, in the following output from the script, the encrypted password is the string W+ONY3ZcR6QLP5sqoRqcpA=2.

      Encrypted password is W+ONY3ZcR6QLP5sqoRqcpA=2

Procedure

  1. Enable monitoring of the Kafka brokers, and specify the JMX port.

    In Cloudera Manager, navigate to Pepperdata > Configuration, and locate and configure the following values for the PepAgent Default Group.

    • Enable Kafka Monitoring—Select it.

    • Kafka JMX Port—(Default=9393) If you’ve set the JMX_PORT to anything other than 9393 in Task 1, enter that same port number here.

  2. Enable monitoring of the Kafka cluster via the Kafka Admin API.

    Unlike typical Pepperdata monitoring (such as monitoring Kafka brokers), which uses a PepAgent on every host, Kafka Admin monitoring uses only a single PepAgent per cluster. This minimizes resource usage and prevents receipt of duplicate data from multiple hosts.

    In Cloudera Manager, navigate to Pepperdata > Configuration, and locate and configure the following values for the PepAgent Default Group.

    • Kafka Admin PepAgent Host—(No default) Fully-qualified host name/IP address. You can choose any host in the cluster to serve as the host where the Kafka Admin Fetcher runs.

    • Kafka Admin Brokers—(Default=localhost:9092) Comma-separated list of Kafka brokers, formatted as host1:port1, host2:port2, ..., from which Pepperdata collects data.

    • Kafka Configuration Data Fetch Interval—(Default=6 hours) Interval for fetching Kafka cluster configuration data from Kafka brokers. Recommended values are long intervals, such as six hours, because changes to the configuration of brokers and topics occur infrequently.
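    Before saving, you can sanity-check the Kafka Admin Brokers value with a quick shell one-liner. This is only a sketch; the broker names are examples:

    ```shell
    # Example Kafka Admin Brokers value (comma-separated host:port entries).
    BROKERS='broker1.example.com:9092, broker2.example.com:9092'

    # Split on commas, trim leading spaces, and flag any entry that is not
    # in host:port form; report success if none are flagged.
    echo "$BROKERS" | tr ',' '\n' | sed 's/^ *//' \
      | grep -vE '^[A-Za-z0-9.-]+:[0-9]+$' \
      || echo "All broker entries look valid"
    ```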

  3. (SASL_SSL-secured Kafka clusters) If your Kafka cluster is secured with Simple Authentication and Security Layer (SASL) with SSL encryption, enable Pepperdata communication with the Kafka hosts.

    In Cloudera Manager, navigate to Pepperdata > Configuration, and locate and configure the following values for the PepAgent Default Group.

    • Kafka Security Enabled—Select it.

    • Kafka Security Protocol—(Default=SASL_SSL) Be sure that SASL_SSL is selected.

    • SSL Endpoint Identification Algorithm—(No default) SSL endpoint identification algorithm. If the cluster does not support host identification, leave this blank.

    • Kafka Client Trust Store File Location—(No default) Valid filename that identifies the truststore of the Kafka clients. If an invalid location is specified, Pepperdata cannot collect the metrics.

    • Kafka Client Trust Store File Password—(No default) The encrypted password for accessing the Kafka clients truststore. For encryption instructions, see this task’s Prerequisites.

    • Kafka SASL Mechanism—(Default=PLAIN) The authentication mechanism that SASL_SSL is using. Select any of the provided options (PLAIN, GSSAPI, SCRAM-SHA-256, OAUTHBEARER).

    • Kafka SASL JAAS Configuration—(No default) The SASL Java Authentication and Authorization Service (JAAS) configuration used by the Kafka cluster. For formatting instructions, see this task’s Prerequisites.

  4. (SASL_PLAINTEXT-secured Kafka clusters) If your Kafka cluster is secured with Simple Authentication and Security Layer (SASL) with no SSL encryption, enable Pepperdata communication with the Kafka hosts.

    In Cloudera Manager, navigate to Pepperdata > Configuration, and locate and configure the following values for the PepAgent Default Group.

    • Kafka Security Enabled—Select it.

    • Kafka Security Protocol—(Default=SASL_SSL) Select SASL_PLAINTEXT.

    • SSL Endpoint Identification Algorithm—(No default) SSL endpoint identification algorithm. Leave this blank.

    • Kafka SASL Mechanism—(Default=PLAIN) The authentication mechanism that SASL_PLAINTEXT is using. Generally you should select PLAIN from the provided options (PLAIN, GSSAPI, SCRAM-SHA-256, OAUTHBEARER).

    • Kafka SASL JAAS Configuration—(No default) The SASL Java Authentication and Authorization Service (JAAS) configuration used by the Kafka cluster. For formatting instructions, see this task’s Prerequisites.

  5. (SSL-secured Kafka clusters) If your Kafka cluster is secured with SSL, enable Pepperdata communication with the host.

    In Cloudera Manager, navigate to Pepperdata > Configuration, and locate and configure the following values for the PepAgent Default Group.

    • Kafka Security Enabled—Select it.

    • Kafka Security Protocol—(Default=SASL_SSL) Select SSL.

    • SSL Endpoint Identification Algorithm—(No default) SSL endpoint identification algorithm. If the cluster does not support host identification, leave this blank.

    • Kafka Client Trust Store File Location—(No default) Valid filename that identifies the truststore of the Kafka clients. If an invalid location is specified, Pepperdata cannot collect the metrics.

    • Kafka Client Trust Store File Password—(No default) The encrypted password for accessing the Kafka clients truststore. For encryption instructions, see this task’s Prerequisites.

    • Kafka Client Keystore File Location—(No default) Valid filename that identifies the keystore of the Kafka clients. If an invalid location is specified, Pepperdata cannot collect the metrics.

    • Kafka Client Keystore File Password—(No default) The encrypted password for accessing the Kafka clients keystore. For encryption instructions, see this task’s Prerequisites.

    • Kafka Client Keystore Key Password—(No default) The encrypted password for accessing the Kafka clients key. For encryption instructions, see this task’s Prerequisites.

  6. Click Save Changes.

  7. (Re)start the PepAgent.

    • If the Pepperdata services are not yet running, in Cloudera Manager, select the Start action for the Pepperdata service.

    • Otherwise, in Cloudera Manager, select the Restart action for the Pepperdata service.