Hadoop Logging
Logging Overview
Log4J Log Levels
Find the Log4j log level of a daemon using http://<namenode>:50070/logLevel or http://<resource manager>:8088/logLevel
You can temporarily change the log level through the above pages. Log levels reset to the values in log4j.properties after a restart
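The /logLevel page can also be driven non-interactively. The sketch below assumes the servlet's usual log and level query parameters and uses placeholder host names with the default NameNode HTTP port; verify both against your cluster and Hadoop version:
$ curl "http://<namenode>:50070/logLevel?log=org.apache.hadoop.hdfs.server.namenode.NameNode&level=DEBUG"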
You can also get and temporarily set the log level from the shell using the hadoop daemonlog command
$ hadoop daemonlog -getlevel <resource manager>:8088 <daemon class name as it appears in jps -l; see the example after the jps output below>
Find the daemon class name using the jps command
$ sudo jps -l
5826 org.apache.hadoop.yarn.server.nodemanager.NodeManager
5401 org.apache.hadoop.hdfs.server.namenode.NameNode
5537 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode
6115 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
7454 org.apache.spark.deploy.history.HistoryServer
5725 org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer
5310 org.apache.hadoop.hdfs.qjournal.server.JournalNode
5203 org.apache.hadoop.hdfs.server.datanode.DataNode
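For example, to read and temporarily raise the NameNode's log level (placeholder host name; 50070 is the default NameNode HTTP port, adjust to your cluster):
$ hadoop daemonlog -getlevel <namenode>:50070 org.apache.hadoop.hdfs.server.namenode.NameNode
$ hadoop daemonlog -setlevel <namenode>:50070 org.apache.hadoop.hdfs.server.namenode.NameNode DEBUG
As with the web UI, the change is in-memory only and reverts to log4j.properties on the next restart.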
HDFS Audit Logging
Enable HDFS audit logging by setting the environment variable below in hadoop-env.sh and configuring the matching log4j appender (RFAAUDIT) in log4j.properties
export HDFS_AUDIT_LOGGER="INFO, RFAAUDIT"
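A minimal RFAAUDIT appender sketch for log4j.properties; the stock Hadoop log4j.properties ships a similar block, and the file path and rollover settings here are illustrative:
log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.RFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log
log4j.appender.RFAAUDIT.MaxFileSize=256MB
log4j.appender.RFAAUDIT.MaxBackupIndex=20
log4j.appender.RFAAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.RFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger}
log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false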
Job History
Job history is maintained for all completed jobs by the job history service
Job history is stored in HDFS at the location specified in mapreduce.jobhistory.done-dir [mapred-site.xml]
Job history records are retained for one week, after which the system deletes them
Job history covers the job, its tasks, and task attempts, stored in JSON format
View job history information at http://<job history server>:19888
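A mapred-site.xml sketch for the history location and web UI address; the HDFS path shown is illustrative, not a required default:
<property>
  <name>mapreduce.jobhistory.done-dir</name>
  <value>/mr-history/done</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value><job history server>:19888</value>
</property>
From the shell, mapred job -history <job id> prints a text summary of a completed job (on recent Hadoop versions; older ones expect the history file path instead).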
MapReduce Task Logs
MapReduce task logs are produced by log4j and, by default, stored locally on the node that ran the task
Task logs can be aggregated to an HDFS location by setting the yarn.log-aggregation-enable property [yarn-site.xml]
Aggregated task logs are retained for 3 hours, or as otherwise configured in yarn.log-aggregation.retain-seconds [yarn-site.xml]
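A yarn-site.xml sketch enabling aggregation; the retention value (10800 seconds = 3 hours) is shown only to match the note above:
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>10800</value>
</property>
With aggregation enabled, logs for a finished application can be fetched from HDFS with:
$ yarn logs -applicationId <application id>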
The log level for individual map and reduce tasks can be altered using the following properties
mapreduce.map.log.level
mapreduce.reduce.log.level
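These properties can also be set per job on the command line, assuming the driver goes through ToolRunner/GenericOptionsParser so -D options are honored (jar, class, and path names below are placeholders):
$ hadoop jar <your job jar> <your driver class> -Dmapreduce.map.log.level=DEBUG -Dmapreduce.reduce.log.level=DEBUG <input> <output>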