Logstash reads data from a topic on a Kafka cluster, parses out the fields, and writes them to ES. The logstash.conf configuration is as follows:
input {
kafka {
bootstrap_servers => "ip1:9094,ip2:9094,ip3:9094"
auto_offset_reset => "latest"
group_id => "lo"
id => "8.0.6"
client_id => "logstash-1"
check_crcs => "false"
topics => ["exception_log"]
codec => "json"
}
}
filter{
json {
source => "body"
}
}
output {
elasticsearch {
hosts => ["es1:9200","es2:9200","es3:9200"]
index => "exception-%{+YYYY.MM.dd}"
}
}
After the write completes, query the result:
{
"_index": "exception-2019.04.30",
"_type": "doc",
"_id": "ssvZbGoB9BdDCt57XEyk",
"_score": 1,
"_source": {
"nanos": "0",
"success": true,
"msg": "Add TimelineEntry success",
"@timestamp": "2019-04-30T06:05:44.661Z",
"errorInfo": "",
"timestamp": "1556604344651",
"type": "ShareTimelineImpl",
"priority": "INFO",
"@version": "1",
"userId": 521585010,
"body": """{"errorInfo":"","msg":"Add TimelineEntry success","success":true,"type":"ShareTimelineImpl","userId":521585010}""",
"fields": {
"_ds_unique_id": "452:65025:3752073:21855ca9:244477118",
"HOSTNAME": "timeline11.server.163.org"
},
"host": ""
}
},
Search for userId=526952388:
GET /lofter-exception-2019.04.30/_search
{"query": {
"bool": {
"must": [
{"match": {
"userId": 526952388
}}
]
}
}
}
There are 626 hits, and HOSTNAME is timeline11.server.163.org in every one of them:
"hits": {
"total": 626,
"max_score": 1,
"hits": [
{
"_index": "lofter-exception-2019.04.30",
"_type": "doc",
"_id": "t5_cbGoBnt41zOK2Ltpc",
"_score": 1,
"_source": {
"nanos": "0",
"success": true,
"msg": "Add TimelineEntry success",
"@timestamp": "2019-04-30T06:08:50.150Z",
"errorInfo": "",
"timestamp": "1556604530017",
"type": "ShareTimelineImpl",
"priority": "INFO",
"@version": "1",
"userId": 526952388,
"body": """{"errorInfo":"","msg":"Add TimelineEntry success","success":true,"type":"ShareTimelineImpl","userId":526952388}""",
"fields": {
"_ds_unique_id": "452:65025:3752065:c732bf64:243554598",
"HOSTNAME": "timeline11.server.163.org"
},
"host": ""
}
},
I want to filter on userId, HOSTNAME, and success.
With the HOSTNAME condition set to timeline111.server.163.org, the expected result is empty.
GET /exception-2019.04.30/_search
{"query": {
"bool": {
"must": [
{
"match": {
"fields.HOSTNAME": "timeline111.server.163.org"
}
},
{"match": {
"userId": 526952388
}},
{
"match": {
"success": true
}
}
]
}
}
, "_source": ["userId","body","fields.HOSTNAME"]
}
The actual result still has 626 hits; the added filter condition appears to have no effect.
"hits": {
"total": 626,
"max_score": 1.4777467,
"hits": [
{
"_index": "lofter-exception-2019.04.30",
"_type": "doc",
"_id": "t5_cbGoBnt41zOK2Ltpc",
"_score": 1.4777467,
"_source": {
"body": """{"errorInfo":"","msg":"Add TimelineEntry success","success":true,"type":"ShareTimelineImpl","userId":526952388}""",
"fields": {
"HOSTNAME": "timeline11.server.163.org"
},
"userId": 526952388
}
},
An analysis of this problem found online:
[https://stackoverflow.com/questions/23150670/elasticsearch-match-vs-term-query]
After modifying the request body:
GET /lofter-exception-2019.04.30/_search
{"query": {
"bool": {
"must": [
{
"term": {
"fields.HOSTNAME.keyword": "timeline111.server.163.org"
}
},
{"term": {
"userId": 526952388
}},
{
"term": {
"success": true
}
}
]
}
}
, "_source": ["userId","body","fields.HOSTNAME"]
}
The result now matches expectations:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
If a filter condition seems to have no effect when searching with ES, try the following:
1) Check whether the condition field has a keyword sub-field; if it does, query xxx.keyword
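- match is an analyzed (full-text) query: it splits the query text into individual terms, and by default a document is a hit as soon as any one of those terms matches; Lucene's scoring system then computes a score for each hit. That is why "timeline111.server.163.org" still matched documents whose HOSTNAME is "timeline11.server.163.org": the two values most likely share analyzed terms such as server, 163, and org. term matches the exact value without analyzing it, with one condition: you need to know whether the field itself is analyzed. String fields are analyzed by default, so a term query should target the unanalyzed keyword sub-field. Choose match or term according to the actual need. See also: http://www.itdecent.cn/p/eb30eee13923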
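The difference between match and term can be illustrated without a cluster. The following is only a minimal sketch: `analyze` is a toy stand-in that lowercases and splits on non-alphanumeric characters, which is an assumption and not the real ES analyzer, and the function names `match_or`, `match_and`, and `term_keyword` are hypothetical helpers invented for this illustration.

```python
import re

def analyze(text):
    """Toy analyzer (assumption): lowercase, then split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]

def match_or(query, field_value):
    """Mimics match with the default OR operator: any shared token is a hit."""
    return bool(set(analyze(query)) & set(analyze(field_value)))

def match_and(query, field_value):
    """Mimics match with operator "and": every query token must be present."""
    return set(analyze(query)) <= set(analyze(field_value))

def term_keyword(query, field_value):
    """Mimics term on a keyword field: the whole unanalyzed string must be equal."""
    return query == field_value

stored = "timeline11.server.163.org"   # value in the indexed documents
wanted = "timeline111.server.163.org"  # value used in the search condition

print(analyze(wanted))               # ['timeline111', 'server', '163', 'org']
print(match_or(wanted, stored))      # True: server/163/org overlap
print(match_and(wanted, stored))     # False: timeline111 is missing
print(term_keyword(wanted, stored))  # False: strings differ
```

Under this simplified model, the default match behaves like `match_or`, which would explain the 626 unexpected hits, while a term query on the keyword sub-field behaves like `term_keyword`.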
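Note
ES has two kinds of search operations: query and filter. A query is the query search used above; by default it computes a score for every returned document and then sorts by score. A filter only selects the documents that match, computes no score, and its results can be cached. So, from a performance standpoint alone, a filter is faster than a query.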
The result set of a filter clause is a simple list of matching documents, which is quick to compute and easy to keep in memory, needing only 1 bit per document. Combining these cached filter result sets with later requests is very efficient.
A query clause has to find the matching documents and also compute the relevance of each one, so queries are generally more expensive than filters, and query results are not cacheable. For a detailed introduction, see:
https://doc.yonyoucloud.com/doc/mastering-elasticsearch/chapter-2/27_README.html
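
Since none of the three conditions above needs a relevance score, they could be placed in the filter clause of the bool query instead of must. The sketch below only builds the request body as a Python dict and prints it as JSON; `build_filter_query` is a hypothetical helper name, and the field names are taken from the requests shown earlier.

```python
import json

def build_filter_query(user_id, hostname, success):
    """Build a bool query whose clauses all run in filter context (no scoring)."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # Exact match on the unanalyzed keyword sub-field
                    {"term": {"fields.HOSTNAME.keyword": hostname}},
                    {"term": {"userId": user_id}},
                    {"term": {"success": success}},
                ]
            }
        },
        # Return only the fields of interest, as in the original requests
        "_source": ["userId", "body", "fields.HOSTNAME"],
    }

body = build_filter_query(526952388, "timeline111.server.163.org", True)
print(json.dumps(body, indent=2))
```

The printed JSON can be used as the body of the same `_search` request shown above; the only change from the corrected request is that must is replaced with filter.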