Crawler log collection (Flume + Kafka + ELK)

(1) Flume 1.6

1.1 Flume configuration (ship logs to HDFS for offline analysis and to Kafka for real-time analysis)

a1.sources = r1

a1.sinks = k2 k1

a1.channels = c2 c1

# Describe/configure the source

a1.sources.r1.type = exec

a1.sources.r1.command=tail -n +0 -f /usr/lang/log.log

a1.sources.r1.channels = c1 c2

# Describe the sink

a1.sinks.k1.type = hdfs

a1.sinks.k1.channel = c1

a1.sinks.k1.hdfs.path = hdfs://lang:8020/user/flume

a1.sinks.k1.hdfs.filePrefix = events-

a1.sinks.k1.hdfs.round = true

a1.sinks.k1.hdfs.roundValue = 10

a1.sinks.k1.hdfs.roundUnit = minute

a1.sinks.k2.channel=c2

a1.sinks.k2.type=org.apache.flume.sink.kafka.KafkaSink

a1.sinks.k2.topic=lang

a1.sinks.k2.brokerList=node1:9092

a1.sinks.k2.requiredAcks=1

a1.sinks.k2.batchSize=20

# Use a channel which buffers events in memory

a1.channels.c1.type = memory

a1.channels.c1.capacity = 1000

a1.channels.c1.transactionCapacity = 100

a1.channels.c2.type = memory

a1.channels.c2.capacity = 1000

a1.channels.c2.transactionCapacity = 100
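One caveat with the HDFS sink above: with Flume's default roll settings (rollInterval=30, rollSize=1024, rollCount=10) it produces many tiny files on HDFS. A sketch of additional roll properties worth tuning — the values below are illustrative assumptions, not part of the original config:

```properties
a1.sinks.k1.hdfs.rollInterval = 600     # roll a new file every 10 minutes...
a1.sinks.k1.hdfs.rollSize = 134217728   # ...or once the file reaches 128 MB
a1.sinks.k1.hdfs.rollCount = 0          # disable rolling by event count
a1.sinks.k1.hdfs.fileType = DataStream  # write plain text instead of SequenceFile
```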

1.2 Starting Flume

bin/flume-ng agent -c conf -f conf/flume-conf -n a1 -Dflume.root.logger=DEBUG,console
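To smoke-test the pipeline, append a few timestamped lines to the tailed file and watch them appear in the Flume console (and in a Kafka console consumer). A minimal sketch — it writes to /tmp here so it is safe to try anywhere; on the real host substitute the /usr/lang/log.log path from the source config:

```shell
# Write three timestamped test lines into the tailed log file.
LOG=/tmp/log.log            # on the real host: /usr/lang/log.log
: > "$LOG"                  # start from an empty file for the test
for i in 1 2 3; do
  echo "$(date '+%Y-%m-%d %H:%M:%S') INFO crawler test-event $i" >> "$LOG"
done
wc -l < "$LOG"              # line count of the test file
```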

(2) Kafka 0.11 cluster

2.1 Key configuration files

server.properties:

        broker.id=0  (assign 0, 1, or 2 according to the actual host)

        listeners=PLAINTEXT://:9092

        zookeeper.connect=192.168.205.11:2181,192.168.205.12:2181,192.168.205.13:2181

producer.properties

        bootstrap.servers=192.168.205.11:9092,192.168.205.12:9092,192.168.205.13:9092

consumer.properties

        zookeeper.connect=192.168.205.11:2181,192.168.205.12:2181,192.168.205.13:2181

2.2 Distribute the configuration files to all nodes (changing broker.id on each)

2.3 Common commands

Start ZooKeeper first.

Start Kafka:    bin/kafka-server-start.sh config/server.properties &

Stop Kafka:    bin/kafka-server-stop.sh

Create a topic:    bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic lang

List topics:    bin/kafka-topics.sh --list --zookeeper localhost:2181

Describe a topic:    bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic lang

Console producer:    bin/kafka-console-producer.sh --broker-list node1:9092 --topic lang

Console consumer:    bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic lang --from-beginning

Delete a topic:    bin/kafka-topics.sh --delete --zookeeper 130.51.23.95:2181 --topic topicname

(3) Logstash 5.5.1

3.1 Configuration (file input, Elasticsearch output)

input {

file {

path => ["/usr/lang/log.log"]

start_position => "beginning"

}

}

filter {

date {

match => [ "timestamp" , "YYYY-MM-dd HH:mm:ss" ]

}

}

output {

elasticsearch {

hosts => ["192.168.205.14:9200"]

}

stdout {

codec => rubydebug

}

}
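Note that the date filter above matches a field named `timestamp`, but the plain file input never creates that field, so the filter is a no-op unless something extracts it first. A sketch of a grok stage that would populate it, assuming log lines start with `YYYY-MM-dd HH:mm:ss` (the actual log format is not shown in this post, so adjust the pattern to match yours):

```
filter {
  grok {
    # extract the leading timestamp of the line into a `timestamp` field
    match => { "message" => "^%{TIMESTAMP_ISO8601:timestamp} %{GREEDYDATA:msg}" }
  }
  date {
    match => [ "timestamp", "YYYY-MM-dd HH:mm:ss" ]
  }
}
```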

3.2 Configuration (Kafka input, Elasticsearch output)

input {

kafka {

#workers =>2

bootstrap_servers => "node1:9092,node2:9092,node3:9092"    # Kafka broker list (not the ZooKeeper address)

topics => ["lang"]    # topic name in Kafka; remember to create the topic first

#group_id => "logstash"    # defaults to "logstash"

#consumer_threads => 2    # number of consumer threads

#auto_offset_reset => "earliest"    # where to start when no offset is committed (replaces the pre-5.x reset_beginning option)

#decorate_events => true    # add message metadata (topic, partition, consumer group, etc.) to each event

#type => "nginx-access-log"

}

}

filter {

date {

match => [ "timestamp" , "YYYY-MM-dd HH:mm:ss" ]

}

}

output {

elasticsearch {

hosts => ["192.168.205.14:9200"]

#index => "kafkaindex-%{+YYYY.MM.dd}"

}

stdout {

codec => rubydebug

}

}

(4) Elasticsearch

4.1 Memory settings: config/jvm.options
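A common sizing rule for the -Xms/-Xmx values in jvm.options — this rule is general guidance, not from the original post — is to give the heap about half of the machine's RAM, capped below 32 GB so the JVM keeps using compressed object pointers. A tiny helper to make the rule concrete:

```python
def suggest_heap_gb(ram_gb: int) -> int:
    """Suggested -Xms/-Xmx in GB: half of RAM, capped at 31 GB."""
    return min(ram_gb // 2, 31)

# e.g. an 8 GB node gets -Xms4g -Xmx4g; a 128 GB node is capped at 31g
print(suggest_heap_gb(8), suggest_heap_gb(128))
```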

4.2 Configuration file: config/elasticsearch.yml

cluster.name: my-application

node.name: node-1  (must be different on each node in the cluster)

network.host: 192.168.205.14

http.port: 9200

bootstrap.system_call_filter: false

http.cors.enabled: true

http.cors.allow-origin: "*"
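Once Elasticsearch is up, you can sanity-check it over HTTP. A small probe sketch — host and port are taken from the config above; it returns the error instead of raising, so it is safe to run even when the cluster is down:

```python
import json
import urllib.request

def cluster_health(host="192.168.205.14", port=9200, timeout=3):
    """GET /_cluster/health; return the parsed JSON, or the error as a dict."""
    url = f"http://{host}:{port}/_cluster/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.load(resp)
    except OSError as exc:  # connection refused, timeout, DNS failure...
        return {"error": str(exc)}

print(cluster_health())  # look for "status": "green" (or "yellow" on a single node)
```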

4.3 Gotchas: the JVM memory parameters, and whitespace in the YAML config (each key needs a space after the colon)

4.4 elasticsearch-head (web UI for index management)

(5) Kibana

Nothing much to configure; just start it.

If you run into problems, contact me on QQ: 1146941596
