Preconfigured Custom Program Monitoring

By default, Pepperdata software is preconfigured to monitor Apache Impala®, Apache Spark History Server, and MapReduce Job History Server processes. If you do not want to monitor these programs, you can override their program matching rules (a dictionary of key/value pairs in a YAML file) in your own YAML file of program matching rules. Alternatively, you can disable custom program monitoring altogether.
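
As a minimal sketch of such an override, the fragment below turns off monitoring for Impala only. It assumes that an entry in your own rules file with the same program name (impala) takes precedence over the default entry, and that "active: no" is the accepted counterpart of the "active: yes" value used in the default rules. Neither assumption is confirmed on this page, so check YAML Sections: Program Matching Rules before relying on it.

```yaml
programs:
    impala:
      # Assumption: "no" disables monitoring for this program,
      # mirroring the "active: yes" setting in the default rules.
      active: no
```

If partial entries are not merged with the defaults, you would also need to repeat the domain and rules keys from the default impala entry.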

When you write your own program matching rules, it can help to examine the default rules file for the preconfigured programs, /opt/pepperdata/supervisor/lib/pepagent-program-monitor-config-default.yaml, which is reproduced below.

For details of all the keys, see YAML Sections: Program Matching Rules.

programs:
    impala:
      active: yes # optional and can be overridden.
      domain: "impala"
      rules:
      # List of possible locations for pid files.
      # pid match based rule
        - pid-locations:
            - /var/run/impala/impalad-impala.pid
          command-match:
            regex: "^.*impalad "
        # non pid based rule in case pid based matching fails.
        - command-match:
            regex: "^.*impalad "
    sparkhistory:
      domain: "yarn-daemon"
      rules:
        - pid-locations:
          - /var/run/spark/spark-history-server.pid
          - /var/run/spark/spark-spark-org.apache.spark.deploy.history.HistoryServer-1.pid
          - /var/run/spark2/spark-spark-org.apache.spark.deploy.history.HistoryServer-1.pid
          command-match:
            substring: "org.apache.spark.deploy.history.HistoryServer"
    mapreducehistory:
      domain: "yarn-daemon"
      rules:
        - pid-locations:
          - /var/run/hadoop-mapreduce/mapred-mapred-historyserver.pid
          - /var/run/hadoop-mapreduce/mapred/mapred-mapred-historyserver.pid
          command-match:
            substring: "-Dproc_historyserver"
        - command-match:
            substring: "-Dproc_historyserver"
    nodemanager:
      active: yes
      domain: "yarn-daemon"
      rules:
        - pid-locations: # could be a list of alternate locations for pid files.
            - /var/run/hadoop-yarn/yarn-yarn-nodemanager.pid
          command-match:
            substring: "-Dproc_nodemanager"
          ignore-match:
            substring: "container"  # helps ignore matches with any child container process.
        - command-match:
            substring: "-Dproc_nodemanager"
          ignore-match:
            substring: "container" # Helps ignore matches with any child container process.
    resourcemanager:
      active: yes
      domain: "yarn-daemon"
      rules:
        - pid-locations:
            - /var/run/hadoop-yarn/yarn-yarn-resourcemanager.pid
          command-match:
            substring: "-Dproc_resourcemanager"
        - command-match:
            substring: "-Dproc_resourcemanager"
    namenode:
      active: yes
      domain: "yarn-daemon"
      rules:
        - pid-locations:
            - /var/run/hadoop-hdfs/hadoop-hdfs-namenode.pid
          command-match:
            substring: "-Dproc_namenode"
        - command-match:
            substring: "-Dproc_namenode"
    datanode:
      active: yes
      domain: "yarn-daemon"
      rules:
        - pid-locations:
            - /var/run/hadoop-hdfs/hadoop-hdfs-datanode.pid
          command-match:
            substring: "-Dproc_datanode"
        - command-match:
            substring: "-Dproc_datanode"
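
Following the same pattern, a rule set for a program of your own might look like the sketch below. Only the key names are taken from the default file above. The program name (myservice), the pid file path, the command-line marker, and the reuse of the "yarn-daemon" domain value are hypothetical placeholders chosen for illustration.

```yaml
programs:
    myservice:                       # hypothetical program name
      active: yes
      domain: "yarn-daemon"          # assumption: reuses a domain value from the defaults
      rules:
        # Preferred rule: pid file location plus a command-line check.
        - pid-locations:
            - /var/run/myservice/myservice.pid   # hypothetical path
          command-match:
            substring: "-Dproc_myservice"        # hypothetical command-line marker
        # Fallback rule without a pid file, in case pid-based matching fails.
        - command-match:
            substring: "-Dproc_myservice"
```

As in the default entries, listing a pid-based rule first and a command-only rule second gives a fallback when the pid file is missing or stale.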

Related Topics

Next: Default Metrics Collection for Custom-Monitored Programs