Configure History Fetcher Retries (RPM/DEB)

To ensure that application history is successfully fetched from the applicable component (MapReduce Job History Server for MapReduce apps, Spark History Server for Spark apps, or YARN Timeline Server for Tez apps), the Pepperdata Supervisor uses a two-phase approach. Phase 1 makes an initial attempt to fetch the history and, if that attempt fails, makes up to three retries by default. Phase 2 makes one additional attempt followed by up to five retries by default, with the interval between retries increasing by a factor of five each time. You can customize the number of retries for each phase, which might be necessary in environments with high network latency or frequent connectivity issues.
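
The arithmetic behind the two phases can be sketched in shell. This is an illustration only, not the Supervisor's implementation: the helper names `max_attempts` and `phase2_multipliers` are hypothetical, and because this topic does not state the base interval, the waits are shown as multiples of an unspecified base, assuming the first Phase 2 retry waits one base interval.

```shell
#!/bin/sh
# Illustrative model of the retry counts described above.
# Defaults (3 and 5) come from this topic; the helper names are hypothetical.

first_retries=${PD_JOBHISTORY_MONITOR_FIRST_RETRY_COUNT:-3}
second_retries=${PD_JOBHISTORY_MONITOR_SECOND_RETRY_COUNT:-5}

# Maximum fetch attempts: one try plus the retries, for each phase.
# $1 = Phase 1 retries, $2 = Phase 2 retries
max_attempts() {
  echo $(( (1 + $1) + (1 + $2) ))
}

# Relative wait before each Phase 2 retry, growing by a factor of five.
# ASSUMPTION: the first retry waits 1x an unspecified base interval.
# $1 = Phase 2 retries
phase2_multipliers() {
  pm_retries=$1; pm_mult=1; pm_i=0; pm_out=""
  while [ "$pm_i" -lt "$pm_retries" ]; do
    pm_out="$pm_out${pm_mult}x "
    pm_mult=$(( pm_mult * 5 ))
    pm_i=$(( pm_i + 1 ))
  done
  echo "${pm_out% }"   # drop the trailing space
}

echo "Maximum attempts: $(max_attempts "$first_retries" "$second_retries")"
echo "Phase 2 relative intervals: $(phase2_multipliers "$second_retries")"
```

With the default values, this model gives at most 10 attempts in total (4 in Phase 1 and 6 in Phase 2). Because the Phase 2 wait grows geometrically, each extra Phase 2 retry adds far more elapsed time than an extra Phase 1 retry would.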

Related Topics

Configure Connection Timeout for Spark History Server

Procedure

Add the environment variables for the number of history fetcher retries, for either or both fetching phases.
1. On any host in the cluster, open the Pepperdata configuration file, /etc/pepperdata/pepperdata-config.sh, for editing.
2. Add the environment variables in the following format, replacing the default values for the first and second phases (3 and 5, respectively) with your custom values.
```
export PD_JOBHISTORY_MONITOR_FIRST_RETRY_COUNT=3
export PD_JOBHISTORY_MONITOR_SECOND_RETRY_COUNT=5
```
3. Save your changes and close the file.
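
Before restarting any services, it can help to confirm that the file contains the lines you intended. The following is a minimal sketch, assuming the values are non-negative integers written exactly in the format shown in step 2; `show_retry_settings` is a hypothetical helper, not a Pepperdata command.

```shell
#!/bin/sh
# Hypothetical helper: list the retry-count exports found in a config file.
# Matches only lines in the format shown in step 2, with an integer value.
# $1 = path to the configuration file
show_retry_settings() {
  grep -E '^export PD_JOBHISTORY_MONITOR_(FIRST|SECOND)_RETRY_COUNT=[0-9]+$' "$1"
}

config_file=/etc/pepperdata/pepperdata-config.sh
if [ -r "$config_file" ]; then
  # grep exits non-zero when nothing matches, so report that case explicitly.
  show_retry_settings "$config_file" || echo "No retry overrides found in $config_file"
else
  echo "Cannot read $config_file" >&2
fi
```

A line with a typo in the variable name or a non-numeric value is not printed, which makes such mistakes easy to spot.
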
On every host in the cluster, restart the Pepperdata service(s).

Although restarting the PepAgent is optional, we recommend restarting it.
1. Restart the Pepperdata Collector.
  
  You can use either the service (if provided by your OS) or systemctl command:
  - sudo service pepcollectd restart
  - sudo systemctl restart pepcollectd
2. Restart the PepAgent.
  
  You can use either the service (if provided by your OS) or systemctl command:
  - sudo service pepagentd restart
  - sudo systemctl restart pepagentd
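
If you script the restarts, the choice between the two commands can be made automatically. The sketch below is a dry run that only prints the command it would use for each service; `restart_cmd` is a hypothetical helper, and it assumes that finding `systemctl` on the `PATH` means the host uses systemd, which is worth verifying on your hosts.

```shell
#!/bin/sh
# Hypothetical helper: print the restart command suited to this host.
# Prefers systemctl when it is on the PATH, otherwise falls back to service.
# $1 = service name (for example, pepcollectd or pepagentd)
restart_cmd() {
  if command -v systemctl >/dev/null 2>&1; then
    echo "sudo systemctl restart $1"
  else
    echo "sudo service $1 restart"
  fi
}

# Dry run: Collector first, then the PepAgent, in the order given above.
for svc in pepcollectd pepagentd; do
  echo "Would run: $(restart_cmd "$svc")"
done
```

Review the printed commands, then run them on each host in the cluster.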