Analysis of Elasticsearch Search Conditions Not Taking Effect

Logstash reads data from a topic on a Kafka cluster, parses out the fields, and writes them to ES. The logstash.conf configuration is as follows:

input {
    kafka {
        bootstrap_servers => "ip1:9094,ip2:9094,ip3:9094"
        auto_offset_reset => "latest"
        group_id => "lo"
        id => "8.0.6"
        client_id => "logstash-1"
        check_crcs => "false"
        topics => ["exception_log"]
        codec => "json"
    }
}

filter {
    json {
        source => "body"
    }
}

output {
    elasticsearch {
        hosts => ["es1:9200","es2:9200","es3:9200"]
        index => "exception-%{+YYYY.MM.dd}"
    }
}
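
To make the filter step concrete, here is a rough Python approximation of what the json filter does when only source is set: it parses the JSON string in the body field and places the parsed keys at the top level of the event. This is a sketch, not Logstash's implementation; the function name apply_json_filter and the trimmed-down event are made up for illustration.

```python
import json

def apply_json_filter(event, source="body"):
    """Parse event[source] as JSON and copy its keys to the top level of the event."""
    parsed = json.loads(event[source])
    merged = dict(event)      # keep the original fields, including the raw body
    merged.update(parsed)     # parsed keys become top-level fields
    return merged

# Trimmed-down event; the body string is the one shown in the stored document.
event = {
    "body": '{"errorInfo":"","msg":"Add TimelineEntry success","success":true,'
            '"type":"ShareTimelineImpl","userId":521585010}',
    "priority": "INFO",
}

result = apply_json_filter(event)
print(result["userId"], result["success"])
```

This is why userId ends up as a top-level numeric field and success as a top-level boolean, alongside the untouched body string.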

After the data is written, querying the index returns documents like this:

 {
        "_index": "exception-2019.04.30",
        "_type": "doc",
        "_id": "ssvZbGoB9BdDCt57XEyk",
        "_score": 1,
        "_source": {
          "nanos": "0",
          "success": true,
          "msg": "Add TimelineEntry success",
          "@timestamp": "2019-04-30T06:05:44.661Z",
          "errorInfo": "",
          "timestamp": "1556604344651",
          "type": "ShareTimelineImpl",
          "priority": "INFO",
          "@version": "1",
          "userId": 521585010,
          "body": """{"errorInfo":"","msg":"Add TimelineEntry success","success":true,"type":"ShareTimelineImpl","userId":521585010}""",
          "fields": {
            "_ds_unique_id": "452:65025:3752073:21855ca9:244477118",
            "HOSTNAME": "timeline11.server.163.org"
          },
          "host": ""
        }
      },

Search for documents with userId=526952388:

GET /lofter-exception-2019.04.30/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"userId": 526952388}}
      ]
    }
  }
}

There are 626 hits, and every one has HOSTNAME timeline11.server.163.org:

 "hits": {
    "total": 626,
    "max_score": 1,
    "hits": [
      {
        "_index": "lofter-exception-2019.04.30",
        "_type": "doc",
        "_id": "t5_cbGoBnt41zOK2Ltpc",
        "_score": 1,
        "_source": {
          "nanos": "0",
          "success": true,
          "msg": "Add TimelineEntry success",
          "@timestamp": "2019-04-30T06:08:50.150Z",
          "errorInfo": "",
          "timestamp": "1556604530017",
          "type": "ShareTimelineImpl",
          "priority": "INFO",
          "@version": "1",
          "userId": 526952388,
          "body": """{"errorInfo":"","msg":"Add TimelineEntry success","success":true,"type":"ShareTimelineImpl","userId":526952388}""",
          "fields": {
            "_ds_unique_id": "452:65025:3752065:c732bf64:243554598",
            "HOSTNAME": "timeline11.server.163.org"
          },
          "host": ""
        }
      },

Next, I want to filter on userId, HOSTNAME, and success together.

The HOSTNAME condition is set to timeline111.server.163.org, a host that none of the documents has, so the expected result is empty.

GET /exception-2019.04.30/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"fields.HOSTNAME": "timeline111.server.163.org"}},
        {"match": {"userId": 526952388}},
        {"match": {"success": true}}
      ]
    }
  },
  "_source": ["userId", "body", "fields.HOSTNAME"]
}

The actual result still has 626 hits, so the added HOSTNAME condition appears to have no effect.

"hits": {
    "total": 626,
    "max_score": 1.4777467,
    "hits": [
      {
        "_index": "lofter-exception-2019.04.30",
        "_type": "doc",
        "_id": "t5_cbGoBnt41zOK2Ltpc",
        "_score": 1.4777467,
        "_source": {
          "body": """{"errorInfo":"","msg":"Add TimelineEntry success","success":true,"type":"ShareTimelineImpl","userId":526952388}""",
          "fields": {
            "HOSTNAME": "timeline11.server.163.org"
          },
          "userId": 526952388
        }
      },

An analysis of this problem found online:
https://stackoverflow.com/questions/23150670/elasticsearch-match-vs-term-query
After changing the request body to use term queries, with fields.HOSTNAME.keyword for the host name:

GET /lofter-exception-2019.04.30/_search
{
  "query": {
    "bool": {
      "must": [
        {"term": {"fields.HOSTNAME.keyword": "timeline111.server.163.org"}},
        {"term": {"userId": 526952388}},
        {"term": {"success": true}}
      ]
    }
  },
  "_source": ["userId", "body", "fields.HOSTNAME"]
}

The result now matches the expectation:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}
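
Why the match version returned all 626 documents while the term version returned none can be sketched in a few lines of Python. The tokenizer below is only a rough stand-in for an Elasticsearch analyzer (it lowercases and splits on non-alphanumeric characters), and the function names are hypothetical; the point is the difference between "any token in common" and "whole value identical".

```python
import re

def analyze(text):
    """Rough stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]

def match_any(indexed_value, query_text):
    """Like a match query with the default OR operator: one shared token is enough."""
    return bool(set(analyze(indexed_value)) & set(analyze(query_text)))

def term_exact(indexed_value, query_value):
    """Like a term query on a keyword field: the whole value must be identical."""
    return indexed_value == query_value

indexed = "timeline11.server.163.org"   # value stored in the documents
wanted = "timeline111.server.163.org"   # value used in the search condition

print(analyze(wanted))              # ['timeline111', 'server', '163', 'org']
print(match_any(indexed, wanted))   # True  (server, 163, org are shared)
print(term_exact(indexed, wanted))  # False (the strings differ)
```

Under this approximation the two host names share three of four tokens, so the match condition is satisfied by every document, while the exact comparison rejects them all.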

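If a filter condition seems to have no effect when searching in ES, check the following:
1) Check whether the condition field has a keyword sub-field. If it does, query xxx.keyword instead of xxx.

2) Understand the difference between match and term. match is an analyzed query: the condition text is split into multiple terms, a document hits as soon as any one of those terms matches (the default operator is OR), and the hit is then scored by Lucene's scoring model. In the example above, timeline111.server.163.org most likely shares tokens such as server, 163, and org with timeline11.server.163.org, which would explain why all 626 documents still matched. term matches the exact value without analyzing it, with one caveat: you must know whether the field itself is analyzed. String fields are analyzed by default, so a term query against the analyzed field is compared with the indexed tokens rather than with the original string. Choose match or term according to the actual situation. See also: http://www.itdecent.cn/p/eb30eee13923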
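
The first check can be done against the index mapping (for example the output of GET <index>/_mapping). The sketch below walks the properties section of a mapping and reports whether a dotted field path has a keyword sub-field. The helper name has_keyword_subfield is hypothetical, and the mapping fragment is an assumed example of what dynamic mapping typically produces for string fields, not a dump of the real index.

```python
def has_keyword_subfield(properties, dotted_path):
    """Return True if the field at dotted_path has a `keyword` sub-field."""
    node = {"properties": properties}
    for part in dotted_path.split("."):
        node = node.get("properties", {}).get(part)
        if node is None:
            return False
    # Multi-fields live under the mapping key "fields" (unrelated to the
    # document field that happens to be named "fields" in this article).
    return node.get("fields", {}).get("keyword", {}).get("type") == "keyword"

# Assumed, illustrative mapping fragment.
properties = {
    "userId": {"type": "long"},
    "success": {"type": "boolean"},
    "fields": {
        "properties": {
            "HOSTNAME": {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
            }
        }
    },
}

print(has_keyword_subfield(properties, "fields.HOSTNAME"))  # True
print(has_keyword_subfield(properties, "userId"))           # False
```

A True result means fields.HOSTNAME.keyword is available for exact-value term queries; numeric and boolean fields such as userId and success are not analyzed and can be queried with term directly.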

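Note
Search operations in ES come in two kinds: query and filter. A query, as used in the examples above, by default computes a score for every returned document and sorts by that score. A filter only selects the documents that match; it computes no score, and its results can be cached. From a performance standpoint alone, filters are therefore usually faster than queries.
The result set of a filter clause is a simple list of matching documents. It is quick to compute and easy to keep in memory, needing only 1 bit per document, and combining these cached result sets with later requests is very efficient.
A query clause has to find the matching documents and also compute the relevance of each one, so queries are generally more expensive than filters, and query results are not cacheable. For a detailed introduction, see:
https://doc.yonyoucloud.com/doc/mastering-elasticsearch/chapter-2/27_README.html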
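
As an illustration of the filter form, the same three exact-value conditions can be placed in the filter clause of a bool query instead of must, so that no scoring is done. The sketch below only builds and prints the request body in Python; the function name build_filter_query is hypothetical, and the body would be sent to the same _search endpoint used earlier.

```python
import json

def build_filter_query(hostname, user_id, success):
    """Build a search body whose conditions run in filter context (no scoring)."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"fields.HOSTNAME.keyword": hostname}},
                    {"term": {"userId": user_id}},
                    {"term": {"success": success}},
                ]
            }
        },
        "_source": ["userId", "body", "fields.HOSTNAME"],
    }

body = build_filter_query("timeline111.server.163.org", 526952388, True)
print(json.dumps(body, indent=2))
```

Since these conditions are yes/no checks and the relevance score is not used, filter context is a reasonable fit here.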
