(Streaming Spotlight) Configure Streaming Spotlight (RPM/DEB)

Supported versions: See the entries for Streaming Spotlight 8.0.x in the table of Streaming Processor Support by Streaming Spotlight Version

Prerequisites

Before you begin configuring Streaming Spotlight, ensure that your system meets the required prerequisites.

  • Pepperdata must be installed on the host(s) to be configured for Streaming Spotlight.

  • Your cluster uses a supported stream processor (see Stream Processors in the System Requirements) that is already installed.

  • (Kafka streaming) Be sure that the configuration file on each Kafka host, server.properties, has the correct IP address for the zookeeper.connect property and the correct broker ID for the broker.id property.

  • (Kafka streaming) Kafka brokers must enable local, unencrypted, unauthenticated access to the JMX port.
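
The server.properties check in the prerequisites can be scripted. The following is a minimal sketch, not a Pepperdata tool: it writes a made-up sample file to a temporary location and prints the two properties so that you can compare them with the values you expect. The function name get_prop and the sample values are illustrative assumptions; on a real host, point props_file at your actual server.properties.

```shell
# Illustrative sample only; on a real host, set props_file to the path of
# your Kafka server.properties instead of creating a temporary file.
props_file=$(mktemp)
cat > "$props_file" <<'EOF'
# sample values, not from a real cluster
broker.id=1
zookeeper.connect=192.0.2.10:2181
EOF

# Print the value of a key=value property; commented-out lines are ignored
# because they do not start with the key.
get_prop() {
  sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" | tail -n 1
}

broker_id=$(get_prop 'broker\.id' "$props_file")
zk_connect=$(get_prop 'zookeeper\.connect' "$props_file")

echo "broker.id=$broker_id"
echo "zookeeper.connect=$zk_connect"

rm -f "$props_file"
```

Running the same check on every Kafka host makes it easier to spot a duplicated broker ID or a stale ZooKeeper address before you configure Streaming Spotlight.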

Task 1: (Kafka) Configure the Environment to Enable Streaming Spotlight

To enable Streaming Spotlight to operate with your Kafka stream processing, the environment must be configured as follows:

  • JAVA_HOME must point to Java 8, which is what Kafka requires.
  • JMX_PORT must be set to a unique port number, which tells Kafka to export JMX metrics through that port.
  • Enable JMX local access (not remote access):
    • com.sun.management.jmxremote.ssl=false
    • com.sun.management.jmxremote.local.only=true

Procedure

  1. Beginning with any host that is to be configured for Streaming Spotlight, restart Kafka with the required environment variables and arguments.

    • You can set the options on the command line as shown, or by using your usual cluster management/framework approach.

    • If your Java 8 installation, Kafka start script, or Kafka configuration file (server.properties) is located somewhere other than /usr/java/java-1.8.0, bin, or config, respectively, or if you’re using a port other than 9999 for JMX, substitute your locations and port number for the defaults that are used in this example. Likewise, if your script to start the Kafka server is named something other than kafka-server-start.sh, use your script’s name instead.

    cd kafka_2.11-2.1.1
    KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=true" JMX_PORT=9999 JAVA_HOME=/usr/java/java-1.8.0 ./bin/kafka-server-start.sh -daemon config/server.properties
    
  2. Repeat step 1 on every host where Streaming Spotlight is to run.
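
Because the start command is long, it can help to keep the paths and port in variables so that they are changed in one place. The following sketch only prints the assembled command for review; it does not start Kafka. The variable names JAVA8_HOME, KAFKA_DIR, and JMX_PORT_NUM are illustrative assumptions, and their defaults are taken from the example in step 1.

```shell
# Defaults mirror the example in step 1; override them for your environment.
JAVA8_HOME=${JAVA8_HOME:-/usr/java/java-1.8.0}
KAFKA_DIR=${KAFKA_DIR:-kafka_2.11-2.1.1}
JMX_PORT_NUM=${JMX_PORT_NUM:-9999}

# Local-only JMX without SSL or authentication, as in the example command.
JMX_OPTS="-Dcom.sun.management.jmxremote"
JMX_OPTS="$JMX_OPTS -Dcom.sun.management.jmxremote.authenticate=false"
JMX_OPTS="$JMX_OPTS -Dcom.sun.management.jmxremote.ssl=false"
JMX_OPTS="$JMX_OPTS -Dcom.sun.management.jmxremote.local.only=true"

# Dry run: build and print the command instead of executing it.
start_cmd="cd $KAFKA_DIR && KAFKA_JMX_OPTS=\"$JMX_OPTS\" JMX_PORT=$JMX_PORT_NUM JAVA_HOME=$JAVA8_HOME ./bin/kafka-server-start.sh -daemon config/server.properties"
printf '%s\n' "$start_cmd"
```

Once the printed command looks right for a host, you can run it there or carry the same values into your cluster management tooling.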

Task 2: (Non-Hadoop Clusters) Enable Pepperdata PepAgent

If you’re installing Streaming Spotlight on a cluster without Hadoop, such as a Kafka-only cluster, the Pepperdata PepAgent must be configured to run without Hadoop.

If you’re installing Streaming Spotlight in a cluster that has Hadoop, skip this task. If you perform this task in a Hadoop cluster, Pepperdata will not operate correctly.

Procedure

  1. Beginning with any host in the Kafka cluster, open the Pepperdata configuration file, /etc/pepperdata/pepperdata-config.sh, for editing.

  2. Add the following variables.

    export PD_CONFIG_FILE=/etc/pepperdata/pepperdata-site.xml
    export PD_HADOOP_ENABLED=0
    export PD_MAPRED_USER=root
    export PD_YARN_USER=root
    
  3. Save your changes and close the file.

  4. Repeat steps 1–3 on every host in the Kafka cluster.
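
When the same edit must be repeated on many hosts, an idempotent script avoids adding duplicate lines if it is run more than once. The following is a sketch under stated assumptions: the function names ensure_line and apply_no_hadoop_settings are hypothetical, and the demonstration writes to a temporary file rather than to /etc/pepperdata/pepperdata-config.sh.

```shell
# Append a line to a file only if that exact line is not already present.
ensure_line() {
  grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Add the four variables from step 2 to the given file.
apply_no_hadoop_settings() {
  ensure_line "$1" 'export PD_CONFIG_FILE=/etc/pepperdata/pepperdata-site.xml'
  ensure_line "$1" 'export PD_HADOOP_ENABLED=0'
  ensure_line "$1" 'export PD_MAPRED_USER=root'
  ensure_line "$1" 'export PD_YARN_USER=root'
}

# Demonstrated on a temporary file; on a real host, pass
# /etc/pepperdata/pepperdata-config.sh (with suitable privileges).
config_file=$(mktemp)

# Running it twice still leaves a single copy of each line.
apply_no_hadoop_settings "$config_file"
apply_no_hadoop_settings "$config_file"

line_count=$(wc -l < "$config_file" | tr -d ' ')
cat "$config_file"
rm -f "$config_file"
```

Remember that these settings apply only to clusters without Hadoop, as noted at the start of this task.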

Task 3: (Optional) Configure Password Authentication for JMX Connections

To enable Pepperdata to fetch data from Kafka servers whose JMX connections are secured with password authentication, add the required variables to the PepAgent configuration on all the hosts where Streaming Spotlight is to run.

Prerequisites

  • Encrypt your JMX connection password, and copy/note the result.

    1. Run the Pepperdata password encryption script.

      /opt/pepperdata/supervisor/encrypt_password.sh

    2. At the Enter the password to encrypt: prompt, enter your JMX connection password.

    3. Copy (or make note of) the resulting encrypted password.

      For example, in the following output from the script, the encrypted password is the string W+ONY3ZcR6QLP5sqoRqcpA=2.

      Encrypted password is W+ONY3ZcR6QLP5sqoRqcpA=2
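
If you save the script's output, the encrypted value can be pulled out without copying it by hand. This sketch assumes that the output line has exactly the form shown in the example above; the variable names are illustrative, and the sample string stands in for real script output.

```shell
# Sample line copied from the example output above. On a real host, this
# text would come from running the encryption script interactively.
script_output='Encrypted password is W+ONY3ZcR6QLP5sqoRqcpA=2'

# Keep only the text that follows the fixed prefix.
encrypted_password=$(printf '%s\n' "$script_output" |
  sed -n 's/^Encrypted password is //p')

printf '%s\n' "$encrypted_password"
```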

Procedure

  1. Beginning with any host on which Streaming Spotlight is to run, open the Pepperdata site file, pepperdata-site.xml, for editing.

    By default, the Pepperdata site file, pepperdata-site.xml, is located in /etc/pepperdata. If you customized the location, the file is specified by the PD_CONF_DIR environment variable. See Change the Location of pepperdata-site.xml for details.

  2. Add the properties for password authentication.

    Be sure to substitute your user name and encrypted password for the your-username and your-encrypted-password placeholders in the following code snippet.

    <!-- For Kafka Broker JMX Password Authentication-->
    <property>
      <name>pepperdata.kafka.jmxremote.authentication</name>
      <value>true</value>
    </property>
    <property>
      <name>pepperdata.kafka.jmxremote.username</name>
      <value>your-username</value>
    </property>
    <property>
      <name>pepperdata.kafka.jmxremote.encrypted.password</name>
      <value>your-encrypted-password</value>
    </property>
    
  3. Be sure that the XML properties that you added are correctly formatted.

    Malformed XML files can cause operational errors that can be difficult to debug. To prevent such errors, we recommend that you use a linter, such as xmllint, after you edit any .xml configuration file.

  4. Save your changes and close the file.

  5. Repeat steps 1–4 on every host that Streaming Spotlight is to monitor.
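
The formatting check in step 3 can be wrapped in a small helper. The sketch below uses xmllint --noout when xmllint is installed and otherwise falls back to a rough tag-count comparison, which is only a heuristic and not a real XML parser. The function name check_site_xml, the sample files, and the configuration root element are assumptions made for illustration.

```shell
# Return 0 if the file looks well formed, non-zero otherwise.
check_site_xml() {
  if command -v xmllint >/dev/null 2>&1; then
    # --noout suppresses the parsed document; only errors are reported.
    xmllint --noout "$1"
    return $?
  fi
  # Fallback heuristic: opening and closing tag counts must match for the
  # tags that the snippets in this document use.
  for tag in property name value; do
    opens=$(grep -o "<$tag>" "$1" | wc -l | tr -d ' ')
    closes=$(grep -o "</$tag>" "$1" | wc -l | tr -d ' ')
    [ "$opens" -eq "$closes" ] || return 1
  done
  return 0
}

# Sample files for the demonstration. The root element name is an
# assumption (Hadoop-style site file).
good=$(mktemp)
bad=$(mktemp)
cat > "$good" <<'EOF'
<configuration>
  <property>
    <name>pepperdata.kafka.jmxremote.authentication</name>
    <value>true</value>
  </property>
</configuration>
EOF
# Same content with a damaged closing tag.
sed 's#</value>#/value>#' "$good" > "$bad"

check_site_xml "$good" && good_result=ok || good_result=malformed
check_site_xml "$bad" && bad_result=ok || bad_result=malformed
echo "good file: $good_result, bad file: $bad_result"
rm -f "$good" "$bad"
```

On a real host, you would call check_site_xml with the path to your pepperdata-site.xml after each edit and before you restart any Pepperdata service.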

Task 4: Enable Kafka Monitoring

To enable Pepperdata to fetch data from the stream processors, add the required variables to the PepAgent configuration on all the hosts where Streaming Spotlight is to run.

Prerequisites

  • (SASL_SSL-secured Kafka clusters) For Simple Authentication and Security Layer (SASL) protocol with SSL, encrypt the truststore password. Its encrypted form replaces the your-truststore-encrypted-password placeholder in the snippet that you add to the Pepperdata site file, pepperdata-site.xml, later in the procedure.

    1. Run the Pepperdata encryption script.

      /opt/pepperdata/supervisor/encrypt_password.sh

    2. At the Enter the password to encrypt: prompt, enter your password.

    3. Copy (or make note of) the resulting encrypted password.

      For example, in the following output from the script, the encrypted password is the string W+ONY3ZcR6QLP5sqoRqcpA=2.

      Encrypted password is W+ONY3ZcR6QLP5sqoRqcpA=2

  • (SASL_SSL-secured and SASL_PLAINTEXT-secured Kafka clusters) For Simple Authentication and Security Layer (SASL) protocol, whether authentication is SSL-secured or not, format the SASL Java Authentication and Authorization Service (JAAS) configuration as required for the pepperdata.kafka.sasl.jaas.config property’s value (the your-jaas-config placeholder that is shown later in the procedure).

    1. Locate your cluster’s jaas.conf file that contains the Client role. The file can be role-specific or define multiple roles, such as KafkaServer and Client. This typical snippet is from a file that has multiple roles.

      KafkaServer {
      
                com.sun.security.auth.module.Krb5LoginModule required
                doNotPrompt=true
                useKeyTab=true
                storeKey=true
                keyTab="kafka.keytab"
                principal="kafka/broker.pepperdata.demo@PEPPERDATA.DEMO";
      
                org.apache.kafka.common.security.scram.ScramLoginModule required
                ;
      
      };
      
      Client {
         com.sun.security.auth.module.Krb5LoginModule required
         useKeyTab=true
         storeKey=true
         keyTab="kafka.keytab"
         principal="kafka/broker.pepperdata.demo@PEPPERDATA.DEMO";
         };
      
    2. Copy the portion of the Client configuration that is between the curly-braces ({ and }) to a text editor, remove all the newline and extra/padding whitespace characters, and add serviceName="kafka" immediately before the terminating semicolon (;).

      • Continuing with the example snippet, the result with GSSAPI SASL mechanism would be:

        com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true storeKey=true keyTab="kafka.keytab" principal="kafka/broker.pepperdata.demo@PEPPERDATA.DEMO" serviceName="kafka";
        
      • If the cluster uses PLAIN SASL mechanism, you must add the pdkafka. prefix to the path for the PlainLoginModule class; for example:

        pdkafka.org.apache.kafka.common.security.plain.PlainLoginModule required username="client" password="client-secret";
        
  • (SSL-secured Kafka clusters) Encrypt the following passwords, which will be used in their encrypted forms to replace the corresponding your-*-encrypted-password placeholders in the snippet shown later in the procedure (that will be added to the Pepperdata site file, pepperdata-site.xml).

    • Password for accessing the Kafka clients truststore (your-truststore-encrypted-password)
    • Password for accessing the Kafka clients keystore (your-keystore-encrypted-password)
    • Password for accessing the Kafka clients key (your-key-encrypted-password)

    1. Run the Pepperdata encryption script.

      /opt/pepperdata/supervisor/encrypt_password.sh

    2. At the Enter the password to encrypt: prompt, enter your password.

    3. Copy (or make note of) the resulting encrypted password.

      For example, in the following output from the script, the encrypted password is the string W+ONY3ZcR6QLP5sqoRqcpA=2.

      Encrypted password is W+ONY3ZcR6QLP5sqoRqcpA=2
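
The JAAS formatting described in the prerequisites (collapsing the Client block to one line and adding serviceName="kafka" before the final semicolon) can be done with standard text tools instead of a text editor. The following is a sketch for the GSSAPI example only, using a temporary file that holds the sample configuration from this page; the variable names are illustrative, and a real jaas.conf with a different layout might need adjustments.

```shell
# Sample jaas.conf with two roles, based on the example in this task.
jaas_file=$(mktemp)
cat > "$jaas_file" <<'EOF'
KafkaServer {
          org.apache.kafka.common.security.scram.ScramLoginModule required
          ;
};

Client {
   com.sun.security.auth.module.Krb5LoginModule required
   useKeyTab=true
   storeKey=true
   keyTab="kafka.keytab"
   principal="kafka/broker.pepperdata.demo@PEPPERDATA.DEMO";
   };
EOF

# 1. awk keeps only the lines between "Client {" and the closing "};".
# 2. tr joins those lines into one line.
# 3. sed squeezes whitespace, trims both ends, and inserts serviceName
#    immediately before the terminating semicolon.
jaas_config=$(
  awk '/^[ \t]*Client[ \t]*\{/ {inside=1; next}
       inside && /^[ \t]*\};/ {inside=0}
       inside {print}' "$jaas_file" |
  tr '\n' ' ' |
  sed -e 's/[[:space:]][[:space:]]*/ /g' -e 's/^ //' -e 's/ $//' \
      -e 's/;$/ serviceName="kafka";/'
)

printf '%s\n' "$jaas_config"
rm -f "$jaas_file"
```

The printed line is the value to use for the your-jaas-config placeholder. For the PLAIN mechanism, the pdkafka. prefix still has to be added to the login module class as described above; this sketch does not do that.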

Procedure

  1. Beginning with any host on which Streaming Spotlight is to run, open the Pepperdata site file, pepperdata-site.xml, for editing.

    By default, the Pepperdata site file, pepperdata-site.xml, is located in /etc/pepperdata. If you customized the location, the file is specified by the PD_CONF_DIR environment variable. See Change the Location of pepperdata-site.xml for details.

  2. Enable monitoring of the Kafka brokers, and specify the JMX port, by adding the required properties.

    The snippet shows the default values. You can change them according to your environment:

    • If you’ve set the JMX_PORT to anything other than 9999 in Task 1, assign that same port number to the kafka.jmx.port value.

    • For Streaming Spotlight to work, the pepperdata.kafka.monitoring.enabled setting must be true.

    <property>
      <name>pepperdata.kafka.monitoring.enabled</name>
      <value>true</value>
    </property>
    <property>
      <name>kafka.jmx.port</name>
      <value>9999</value>
    </property>
    
  3. Enable monitoring of the Kafka cluster via the Kafka Admin API by adding the required properties.

    Unlike typical Pepperdata monitoring (such as monitoring Kafka brokers), which uses a PepAgent on every host, Kafka Admin monitoring uses only a single PepAgent per cluster. This minimizes resource usage and prevents receipt of duplicate data from multiple hosts.

    Choose any PepAgent host in the Kafka cluster, and be sure to substitute your host and Kafka brokers’ names for the your-kafka-admin-host and your-brokers-list placeholders, respectively, in the following code snippet.

    • your-kafka-admin-host—(No default) Fully-qualified host name/IP address. You can choose any host in the cluster to serve as the host where the Kafka Admin Fetcher runs.

    • your-brokers-list—(Default=localhost:9092) Comma-separated list of Kafka brokers, formatted as host1:port1, host2:port2, ..., from which Pepperdata collects data.

    • intervalMillis—(Default=21600000 [6 hours]) Interval in milliseconds (expressed as an unsigned, positive integer) for fetching Kafka cluster configuration data from Kafka brokers. Recommended values are long intervals, such as six hours, because changes to the configuration of brokers and topics occur infrequently.

    <property>
      <name>pepperdata.kafka.admin.pepagent.host</name>
      <value>your-kafka-admin-host</value>
    </property>
    <property>
      <name>pepperdata.kafka.admin.brokers</name>
      <value>your-brokers-list</value>
    </property>
    <property>
      <name>pepperdata.agent.genericJsonFetch.kafka.configData.intervalMillis</name>
      <value>21600000</value>
    </property>
    
  4. (SASL_SSL-secured Kafka clusters) If your Kafka cluster is secured with Simple Authentication and Security Layer (SASL) with SSL encryption, enable Pepperdata communication with the Kafka hosts by adding the required properties.

    • The pepperdata.kafka.security.enabled property must be set to true. If it is false, the remaining properties that are shown in the snippet are ignored, and Pepperdata is unable to communicate with an SASL_SSL-secured Kafka cluster.

    • your-id-algorithm—(Default="" [an empty string, including the quote marks]) SSL endpoint identification algorithm. If the cluster does not support host identification, use the default value of an empty string (that includes the quote marks).

    • your-truststore-location—(Default="" [an empty string, including the quote marks]) Valid filename that identifies the truststore of the Kafka clients. If an invalid location is specified, Pepperdata cannot collect the metrics.

    • your-truststore-encrypted-password—(No default) The encrypted password for accessing the Kafka clients truststore. For encryption instructions, see this task’s Prerequisites.

    • your-sasl-mechanism—(Default=PLAIN) The authentication mechanism that SASL_SSL is using; any of the following: PLAIN, GSSAPI, SCRAM-SHA-256, OAUTHBEARER.

    • your-jaas-config—(Default="" [an empty string, including the quote marks]) The SASL Java Authentication and Authorization Service (JAAS) configuration used by the Kafka cluster. For formatting instructions, see this task’s Prerequisites.

    Be sure to substitute your cluster’s configuration values for the placeholders in the following snippet (your-id-algorithm, your-truststore-location, your-truststore-encrypted-password, your-sasl-mechanism, and your-jaas-config).

    <property>
      <name>pepperdata.kafka.security.enabled</name>
      <value>true</value>
    </property>
    <property>
      <name>pepperdata.kafka.security.protocol</name>
      <value>SASL_SSL</value>
    </property>
    <property>
      <name>pepperdata.kafka.ssl.endpoint.id.algorithm</name>
      <value>your-id-algorithm</value>
    </property>
    <property>
      <name>pepperdata.kafka.ssl.truststore.location</name>
      <value>your-truststore-location</value>
    </property>
    <property>
      <name>pepperdata.kafka.ssl.truststore.password</name>
      <value>your-truststore-encrypted-password</value>
    </property>
    <property>
      <name>pepperdata.kafka.sasl.mechanism</name>
      <value>your-sasl-mechanism</value>
    </property>
    <property>
      <name>pepperdata.kafka.sasl.jaas.config</name>
      <value>your-jaas-config</value>
    </property>
    
  5. (SASL_PLAINTEXT-secured Kafka clusters) If your Kafka cluster is secured with Simple Authentication and Security Layer (SASL) with no SSL encryption, enable Pepperdata communication with the Kafka hosts by adding the required properties.

    • The pepperdata.kafka.security.enabled property must be set to true. If it is false, the remaining properties that are shown in the snippet are ignored, and Pepperdata is unable to communicate with an SASL_PLAINTEXT-secured Kafka cluster.

    • your-sasl-mechanism—(Default=PLAIN) The authentication mechanism that SASL_PLAINTEXT is using; generally you should use the default, PLAIN. (Other options are: GSSAPI, SCRAM-SHA-256, OAUTHBEARER.)

    • your-jaas-config—(Default="" [an empty string, including the quote marks]) The SASL Java Authentication and Authorization Service (JAAS) configuration used by the Kafka cluster. For formatting instructions, see this task’s Prerequisites.

    Be sure to substitute your cluster’s configuration values for the placeholders in the following snippet (your-sasl-mechanism and your-jaas-config).

    <property>
      <name>pepperdata.kafka.security.enabled</name>
      <value>true</value>
    </property>
    <property>
      <name>pepperdata.kafka.security.protocol</name>
      <value>SASL_PLAINTEXT</value>
    </property>
    <property>
      <name>pepperdata.kafka.sasl.mechanism</name>
      <value>your-sasl-mechanism</value>
    </property>
    <property>
      <name>pepperdata.kafka.sasl.jaas.config</name>
      <value>your-jaas-config</value>
    </property>
    
  6. (SSL-secured Kafka cluster) If your Kafka cluster is secured with SSL, enable Pepperdata communication with the Kafka hosts by adding the required properties.

    • The pepperdata.kafka.security.enabled property must be set to true. If it is false, the remaining properties that are shown in the snippet are ignored, and Pepperdata is unable to communicate with an SSL-secured Kafka cluster.

    • The pepperdata.kafka.security.protocol property must be set to "SSL" (a string, including the quote marks). If you do not explicitly set this property, its default value of "SASL_SSL" will apply, which will not work for an SSL-secured Kafka cluster.

    • your-id-algorithm—(Default="" [an empty string, including the quote marks]) SSL endpoint identification algorithm. If the cluster does not support host identification, use the default value of an empty string (that includes the quote marks).

    • your-truststore-location—(Default="" [an empty string, including the quote marks]) Valid filename that identifies the truststore of the Kafka clients. If an invalid location is specified, Pepperdata cannot collect the metrics.

    • your-truststore-encrypted-password—(No default) The encrypted password for accessing the Kafka clients truststore. For encryption instructions, see this task’s Prerequisites.

    • your-keystore-location—(Default="" [an empty string, including the quote marks]) Valid filename that identifies the keystore of the Kafka clients. If an invalid location is specified, Pepperdata cannot collect the metrics.

    • your-keystore-encrypted-password—(No default) The encrypted password for accessing the Kafka clients keystore. For encryption instructions, see this task’s Prerequisites.

    • your-key-encrypted-password—(No default) The encrypted password for accessing the Kafka clients key. For encryption instructions, see this task’s Prerequisites.

    Be sure to substitute your cluster’s configuration values for the placeholders in the following snippet (your-id-algorithm, your-truststore-location, your-truststore-encrypted-password, your-keystore-location, your-keystore-encrypted-password, and your-key-encrypted-password).

    <property>
      <name>pepperdata.kafka.security.enabled</name>
      <value>true</value>
    </property>
    <property>
      <name>pepperdata.kafka.security.protocol</name>
      <value>SSL</value>
    </property>
    <property>
      <name>pepperdata.kafka.ssl.endpoint.id.algorithm</name>
      <value>your-id-algorithm</value>
    </property>
    <property>
      <name>pepperdata.kafka.ssl.truststore.location</name>
      <value>your-truststore-location</value>
    </property>
    <property>
      <name>pepperdata.kafka.ssl.truststore.password</name>
      <value>your-truststore-encrypted-password</value>
    </property>
    <property>
      <name>pepperdata.kafka.ssl.keystore.location</name>
      <value>your-keystore-location</value>
    </property>
    <property>
      <name>pepperdata.kafka.ssl.keystore.password</name>
      <value>your-keystore-encrypted-password</value>
    </property>
    <property>
      <name>pepperdata.kafka.ssl.key.password</name>
      <value>your-key-encrypted-password</value>
    </property>
    
  7. Be sure that the XML properties that you added are correctly formatted.

    Malformed XML files can cause operational errors that can be difficult to debug. To prevent such errors, we recommend that you use a linter, such as xmllint, after you edit any .xml configuration file.

  8. Save your changes and close the file.

  9. Repeat steps 1–8 on every host that Streaming Spotlight is to monitor.

  10. Prevent dependency collisions when Streaming Spotlight starts by configuring Streaming Spotlight to load the required JAR file.

    1. Beginning with any host in the Kafka cluster, open the Pepperdata configuration file, /etc/pepperdata/pepperdata-config.sh, for editing.

    2. Add the following variable.

      export PD_LOAD_KAFKA_JAR=1
      
    3. Save your changes and close the file.

    4. Restart the PepAgent.

      You can use either the service (if provided by your OS) or systemctl command:

      • sudo service pepagentd restart
      • sudo systemctl restart pepagentd

      If any of the process’s startup checks fail, an explanatory message appears and the process does not start. Address the issues and try again to start the process.

    5. Repeat sub-steps 1–4 on every host in the Kafka cluster.
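
Before you copy the configuration to every host, it can be worth sanity-checking the value that you plan to use for pepperdata.kafka.admin.brokers in step 3. The following sketch checks only the host:port, comma-separated shape that is described there; it does not contact any broker. The function name valid_brokers_list is hypothetical, and the sample host names are placeholders.

```shell
# Return 0 if every comma-separated entry looks like host:port.
valid_brokers_list() {
  # Drop spaces so that "host1:9092, host2:9092" is accepted too.
  rest=$(printf '%s' "$1" | tr -d ' ')
  [ -n "$rest" ] || return 1
  while [ -n "$rest" ]; do
    # Take the first entry and keep the remainder for the next pass.
    entry=${rest%%,*}
    case "$rest" in
      *,*) rest=${rest#*,} ;;
      *) rest= ;;
    esac
    # Each entry needs a colon that separates host and port.
    case "$entry" in
      *:*) ;;
      *) return 1 ;;
    esac
    host=${entry%:*}
    port=${entry##*:}
    [ -n "$host" ] || return 1
    # The port must be all digits and within the valid TCP range.
    case "$port" in
      ''|*[!0-9]*) return 1 ;;
    esac
    [ "$port" -ge 1 ] && [ "$port" -le 65535 ] || return 1
  done
  return 0
}

valid_brokers_list 'localhost:9092' && r1=valid || r1=invalid
valid_brokers_list 'host1:9092, host2:9092' && r2=valid || r2=invalid
valid_brokers_list 'host1,host2:9092' && r3=valid || r3=invalid
echo "$r1 $r2 $r3"
```

The third sample is rejected because its first entry has no port. A list that passes this check can still name unreachable brokers, so treat it as a typo filter only.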