Obtain Hadoop Container Logs

The Pepperdata dashboard can tell you a lot about how your application is performing, but sometimes you need to answer a “why” question such as “Why did my reducers start seeing heavy GC pauses?” or “Why are my mappers taking an hour to complete?” The best way to answer these sorts of questions is to go to the source: the Hadoop container logs.

Unfortunately, the container logs are typically discarded as soon as an application finishes. However, you can configure YARN to retain the logs for a length of time that you specify.

Configure Log Retention

  1. Starting with any NodeManager, configure the yarn.nodemanager.delete.debug-delay-sec property.

    • For manually configured clusters, add the property to the host’s /etc/hadoop/conf/yarn-site.xml file, as shown in the example that follows these steps.

      Malformed XML files can cause operational errors that can be difficult to debug. To prevent such errors, we recommend that you use a linter, such as xmllint, after you edit any .xml configuration file.
    • If you are using Cloudera Manager, you can set the property value individually for each NodeManager or set the same value for all NodeManagers by using the service configuration page.

      • For an individual NodeManager, navigate to YARN (MR2 INCLUDED) > the-NodeManager-node > Configuration > Localized Dir Deletion Delay.

      • For the service configuration page, navigate to YARN (MR2 INCLUDED) > Configuration > Localized Dir Deletion Delay.

    • If you are using Ambari, navigate to YARN > Configs > Advanced > Advanced yarn-site > yarn.nodemanager.delete.debug-delay-sec.

    Set the property’s value as described in the YARN r2.7.0 documentation:

    “Number of seconds after an application finishes before the NodeManager’s DeletionService will delete the application’s localized file directory and log directory. To diagnose Yarn application problems, set this property’s value large enough (for example, to 600 = 10 minutes) to permit examination of these directories. After changing the property’s value, you must restart the NodeManager in order for it to have an effect. The roots of Yarn applications’ work directories is configurable with the yarn.nodemanager.local-dirs property … and the roots of the Yarn applications’ log directories is configurable with the yarn.nodemanager.log-dirs property.”

  2. Repeat step 1 on every NodeManager in your cluster.

  3. Restart all NodeManagers in the cluster.
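
For example, on a manually configured host, the stanza you add inside the <configuration> element of /etc/hadoop/conf/yarn-site.xml might look like the following; the 600-second value is just the 10-minute illustration from the YARN documentation quoted above.

<property>
  <name>yarn.nodemanager.delete.debug-delay-sec</name>
  <!-- Keep localized file and log directories for 10 minutes after the application finishes -->
  <value>600</value>
</property>

After saving the file, confirm that it is still well-formed XML and restart the NodeManager. The commands below use the stock Apache Hadoop 2.x scripts; if your cluster is managed by Cloudera Manager or Ambari, restart the NodeManager role from the manager instead.

xmllint --noout /etc/hadoop/conf/yarn-site.xml
yarn-daemon.sh stop nodemanager
yarn-daemon.sh start nodemanager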

Log Files and Locations

The Hadoop container logs directory is configured by the yarn.nodemanager.log-dirs property (its default value is defined in yarn-default.xml; override it in yarn-site.xml). The directory naming format is ${yarn.nodemanager.log-dirs}/<applicationID>/<containerID>/, with files for syslog (the container’s Hadoop log4j-style output), stdout, and stderr.
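
For example, for a hypothetical application with ID application_1479891843973_0001, listing one of its container log directories shows the three log files. The IDs below are placeholders; substitute your own application and container IDs, and your yarn.nodemanager.log-dirs value for the log root.

ls ${yarn.nodemanager.log-dirs}/application_1479891843973_0001/container_1479891843973_0001_01_000002
stderr  stdout  syslog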

The top level of the local-dirs directory looks as follows:

${yarn.nodemanager.local-dirs}
|--filecache
|--nmPrivate
|--registeredExecutors.ldb
|--usercache

The usercache directory is our primary interest. An application’s localized file directory is ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}, which contains subdirectories for the individual containers’ work directories, container_${contid}.
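
For example, for a hypothetical user alice running the application from the earlier example, the work directory for one of its containers would be:

${yarn.nodemanager.local-dirs}/usercache/alice/appcache/application_1479891843973_0001/container_1479891843973_0001_01_000002

While the deletion delay keeps this directory on disk, you can examine items such as the container’s launch script and the symlinks to its localized resources there.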