1. Configuration
To enable the audit log in Hadoop, modify hadoop-env.sh:
export HADOOP_NAMENODE_OPTS=.... -Dhdfs.audit.logger=INFO,RFAAUDIT
(The RFAAUDIT appender is already configured in log4j.properties by default.)
2. Source code analysis
The audit logger is defined in the FSNamesystem class:
public static final Log auditLog = LogFactory.getLog(
    FSNamesystem.class.getName() + ".audit");
This corresponds to the following configuration in log4j.properties:
#
# hdfs audit logging
#
hdfs.audit.logger=INFO,NullAppender
hdfs.audit.log.maxfilesize=256MB
hdfs.audit.log.maxbackupindex=20
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger}
log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false
log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.RFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log
log4j.appender.RFAAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.RFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.RFAAUDIT.MaxFileSize=${hdfs.audit.log.maxfilesize}
log4j.appender.RFAAUDIT.MaxBackupIndex=${hdfs.audit.log.maxbackupindex}
The log line is emitted by the logAuditEvent method, which has several overloads.
The final implementation in FSNamesystem is:
private void logAuditEvent(boolean succeeded,
    UserGroupInformation ugi, InetAddress addr, String cmd, String src,
    String dst, HdfsFileStatus stat) {
  FileStatus status = null;
  if (stat != null) {
    Path symlink = stat.isSymlink() ? new Path(stat.getSymlink()) : null;
    Path path = dst != null ? new Path(dst) : new Path(src);
    status = new FileStatus(stat.getLen(), stat.isDir(),
        stat.getReplication(), stat.getBlockSize(), stat.getModificationTime(),
        stat.getAccessTime(), stat.getPermission(), stat.getOwner(),
        stat.getGroup(), symlink, path);
  }
  for (AuditLogger logger : auditLoggers) {
    if (logger instanceof HdfsAuditLogger) {
      HdfsAuditLogger hdfsLogger = (HdfsAuditLogger) logger;
      hdfsLogger.logAuditEvent(succeeded, ugi.toString(), addr, cmd, src, dst,
          status, ugi, dtSecretManager);
    } else {
      logger.logAuditEvent(succeeded, ugi.toString(), addr,
          cmd, src, dst, status);
    }
  }
}
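The dispatch in logAuditEvent can be illustrated in isolation. The following is a simplified sketch, not Hadoop code: BasicLogger, ExtendedLogger, and AuditDispatchSketch are hypothetical names that only mirror the roles of AuditLogger and HdfsAuditLogger. It shows how a logger that supports the richer overload receives the extra argument while a plain logger gets the basic call.

```java
import java.util.ArrayList;
import java.util.List;

final class AuditDispatchSketch {
    // Call the richer overload when the logger supports it, else the basic one.
    static void dispatch(List<BasicLogger> loggers, String user, String cmd,
                         String src, String trackingId) {
        for (BasicLogger logger : loggers) {
            if (logger instanceof ExtendedLogger) {
                ((ExtendedLogger) logger).log(user, cmd, src, trackingId);
            } else {
                logger.log(user, cmd, src);
            }
        }
    }

    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        List<BasicLogger> loggers = new ArrayList<>();
        loggers.add((user, cmd, src) -> out.add("basic:" + cmd));
        loggers.add(new ExtendedLogger() {
            @Override
            void log(String user, String cmd, String src, String trackingId) {
                out.add("extended:" + cmd + ":" + trackingId);
            }
        });
        dispatch(loggers, "cdh5", "create", "/a.COPYING", "t-1");
        System.out.println(out);
    }
}

// Hypothetical stand-in for AuditLogger: the basic callback.
interface BasicLogger {
    void log(String user, String cmd, String src);
}

// Hypothetical stand-in for HdfsAuditLogger: adds a richer overload.
abstract class ExtendedLogger implements BasicLogger {
    abstract void log(String user, String cmd, String src, String trackingId);

    @Override
    public void log(String user, String cmd, String src) {
        // The basic call falls back to the richer one without a tracking id.
        log(user, cmd, src, null);
    }
}
```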
auditLoggers is configured through the dfs.namenode.audit.loggers property in hdfs-site.xml (this parameter is currently not set).
The default implementation class is:
private static class DefaultAuditLogger extends HdfsAuditLogger {
  @Override
  public void logAuditEvent(boolean succeeded, String userName,
      InetAddress addr, String cmd, String src, String dst,
      FileStatus status, UserGroupInformation ugi,
      DelegationTokenSecretManager dtSecretManager) {
    if (auditLog.isInfoEnabled()) {
      final StringBuilder sb = auditBuffer.get();
      sb.setLength(0);
      sb.append("allowed=").append(succeeded).append("\t");
      sb.append("ugi=").append(userName).append("\t");
      sb.append("ip=").append(addr).append("\t");
      sb.append("cmd=").append(cmd).append("\t");
      sb.append("src=").append(src).append("\t");
      sb.append("dst=").append(dst).append("\t");
      if (null == status) {
        sb.append("perm=null");
      } else {
        sb.append("perm=");
        sb.append(status.getOwner()).append(":");
        sb.append(status.getGroup()).append(":");
        sb.append(status.getPermission());
      }
      if (logTokenTrackingId) {
        sb.append("\t").append("trackingId=");
        String trackingId = null;
        if (ugi != null && dtSecretManager != null
            && ugi.getAuthenticationMethod() == AuthenticationMethod.TOKEN) {
          for (TokenIdentifier tid: ugi.getTokenIdentifiers()) {
            if (tid instanceof DelegationTokenIdentifier) {
              DelegationTokenIdentifier dtid =
                  (DelegationTokenIdentifier)tid;
              trackingId = dtSecretManager.getTokenTrackingId(dtid);
              break;
            }
          }
        }
        sb.append(trackingId);
      }
      sb.append("\t").append("proto=");
      sb.append(NamenodeWebHdfsMethods.isWebHdfsInvocation() ? "webhdfs" : "rpc");
      logAuditMessage(sb.toString());
    }
  }
  // ... (remaining members of the class are omitted from this excerpt)
}
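The message-building pattern used by DefaultAuditLogger (a per-thread StringBuilder that is reset with setLength(0) and filled with tab-separated key=value pairs) can be tried outside Hadoop. The sketch below is a minimal stand-in under stated assumptions: AuditMessageSketch and its format method are hypothetical names, the permission is passed as a pre-formatted string, and the trackingId and proto fields are left out.

```java
final class AuditMessageSketch {
    // One reusable buffer per thread, playing the role of auditBuffer.
    private static final ThreadLocal<StringBuilder> BUFFER =
            ThreadLocal.withInitial(StringBuilder::new);

    // Build a tab-separated key=value message; null values are rendered as "null".
    static String format(boolean allowed, String ugi, String ip, String cmd,
                         String src, String dst, String perm) {
        StringBuilder sb = BUFFER.get();
        sb.setLength(0); // reset the buffer instead of allocating a new one
        sb.append("allowed=").append(allowed).append('\t');
        sb.append("ugi=").append(ugi).append('\t');
        sb.append("ip=").append(ip).append('\t');
        sb.append("cmd=").append(cmd).append('\t');
        sb.append("src=").append(src).append('\t');
        sb.append("dst=").append(dst).append('\t');
        sb.append("perm=").append(perm);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(format(true, "cdh5 (auth:SIMPLE)", "/10.1.251.52",
                "create", "/a.COPYING", null, "cdh5:supergroup:rw-r--r--"));
    }
}
```

Reusing one buffer per thread avoids allocating a new StringBuilder for every audited operation, which matters on a busy NameNode.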
Therefore, to enable audit logging only for selected operations, either modify this method or provide a custom implementation of org.apache.hadoop.hdfs.server.namenode.AuditLogger.
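As one possible shape for such selective logging, the decision can be kept in a small predicate that a custom AuditLogger consults inside its logAuditEvent before writing anything. The sketch below is self-contained and does not depend on Hadoop; SelectiveAuditFilter, its command set, and the path prefix are hypothetical and only illustrate the idea.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class SelectiveAuditFilter {
    private final Set<String> commands; // commands worth auditing (assumed policy)
    private final String pathPrefix;    // only paths under this prefix are audited

    SelectiveAuditFilter(Set<String> commands, String pathPrefix) {
        this.commands = commands;
        this.pathPrefix = pathPrefix;
    }

    // True when the event should be written to the audit log.
    boolean shouldLog(String cmd, String src) {
        if (cmd == null || !commands.contains(cmd)) {
            return false;
        }
        return src != null && src.startsWith(pathPrefix);
    }

    public static void main(String[] args) {
        SelectiveAuditFilter filter = new SelectiveAuditFilter(
                new HashSet<>(Arrays.asList("create", "delete", "rename")), "/user/");
        System.out.println(filter.shouldLog("create", "/user/cdh5/a.txt"));
        System.out.println(filter.shouldLog("open", "/user/cdh5/a.txt"));
        System.out.println(filter.shouldLog("delete", "/tmp/x"));
    }
}
```

A custom logger built this way would be registered through dfs.namenode.audit.loggers and would call shouldLog(cmd, src) at the top of logAuditEvent, returning early when it yields false.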
The Hadoop audit log format is as follows:
2014-04-30 10:19:13,173 INFO FSNamesystem.audit: allowed=true ugi=cdh5 (auth:SIMPLE) ip=/10.1.251.52 cmd=create src=/a.COPYING dst=null perm=cdh5:supergroup:rw-r--r--
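Because DefaultAuditLogger joins its fields with "\t", a line like the one above can be split back into key=value pairs. The parser below is a minimal sketch: AuditLineParser is a hypothetical helper, and it assumes the fields in the real file are tab-separated (as in the Java code) and that the payload follows the "FSNamesystem.audit: " prefix produced by the configured ConversionPattern.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class AuditLineParser {
    private static final String MARKER = "FSNamesystem.audit: ";

    // Parse the tab-separated key=value payload; returns an empty map if the marker is absent.
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        int start = line.indexOf(MARKER);
        if (start < 0) {
            return fields;
        }
        String payload = line.substring(start + MARKER.length());
        for (String pair : payload.split("\t")) {
            int eq = pair.indexOf('='); // split at the first '=' only
            if (eq > 0) {
                fields.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "2014-04-30 10:19:13,173 INFO FSNamesystem.audit: "
                + "allowed=true\tugi=cdh5 (auth:SIMPLE)\tip=/10.1.251.52\t"
                + "cmd=create\tsrc=/a.COPYING\tdst=null\t"
                + "perm=cdh5:supergroup:rw-r--r--";
        Map<String, String> fields = parse(line);
        System.out.println(fields.get("cmd") + " " + fields.get("src"));
    }
}
```

Splitting on tabs rather than spaces keeps values such as "cdh5 (auth:SIMPLE)" intact.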
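3. Log shipping
We considered the ELK stack for shipping the logs, but found during evaluation that Logstash has a high performance overhead (see https://blog.csdn.net/u010871982/article/details/79035317/). The official Logstash documentation recommends collecting logs with Filebeat and forwarding them through Logstash to ES. Filebeat is part of the Elastic Stack (written in Go), so it integrates with Logstash, Elasticsearch, and Kibana. Because the collection service has to run on the Hadoop NameNode host, and we do not need to filter the log content, Filebeat can write directly to ES without Logstash.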
3.1 Installing Filebeat
Download: https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-6.3.2-x86_64.rpm
Install: rpm -vi filebeat-6.3.2-x86_64.rpm
Configuration:
The configuration file is /etc/filebeat/filebeat.yml. Configure the input and output there (when a custom index name is set, setup.template.name and setup.template.pattern must be set as well; see the official documentation: https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-template.html).
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /usr/local/hadoop-2.6.5/logs/yarn-hadoop-resourcemanager-luckybinamenode01.intranet.test01.tj1.log
output.elasticsearch:
  hosts: ["10.104.106.232:9200","10.104.106.233:9200","10.104.106.234:9200"]
  index: "hadoopaudit"
  username: "sdk"
  password: "pass4Sdk"
  loadbalance: true
setup.template.name: "hadoopaudit"
setup.template.pattern: "hadoopaudit-*"
Template configured in ES:
{
  "template" : "hadoopaudit",
  "index_patterns": ["hadoopaudit-*"],
  "settings": {
    "number_of_shards": "5"
  },
  "mappings": {
    "doc": {
      "properties": {
        "message": {
          "type": "string",
          "index": "analyzed",
          "analyzer": "ik"
        },
        "@timestamp": {
          "format": "strict_date_optional_time||epoch_millis",
          "type": "date"
        },
        "input": {
          "properties": {
            "type": {
              "type": "string"
            }
          }
        },
        "beat": {
          "properties": {
            "name": {
              "type": "string"
            },
            "hostname": {
              "type": "string"
            },
            "version": {
              "type": "string"
            }
          }
        },
        "host": {
          "properties": {
            "name": {
              "type": "string"
            }
          }
        },
        "source": {
          "type": "string"
        },
        "prospector": {
          "properties": {
            "type": {
              "type": "string"
            }
          }
        },
        "offset": {
          "type": "long"
        }
      }
    }
  }
}
Start: service filebeat start
Filebeat log directory: /var/log/filebeat