Basic Elasticsearch Search Operations

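In early 2021 I signed up for the Elasticsearch "Hundred-Person Campaign" organized by Alibaba Cloud, whose goal was to co-write the ELK Operations Manual. I was fortunate to contribute to the basic capabilities part, on basic search operations, and have collected some of that material here for reference and study.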

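Business background

In the B2B industry, product search and display must follow certain business rules. For example, only buyers that have a partnership with a supplier can see the products in that supplier's shop; buyers without a partnership are not shown them. In addition, some products show one price to customer A and a different price to other customers, so prices can differ by member and by group. In one sentence: product sales in B2B are, to some degree, closed and special-cased. All the examples that follow build on this background, so that you can get familiar with Elasticsearch's document, index, and query operations in a realistic business scenario.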

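Defining the mapping

The product fields are as follows:

goodsName: product name
skuCode: product SKU code
brandName: product brand name
channelType: channel type
shopCode: shop code
publicPrice: selling price (the base price, visible to everyone)
closeUserCode: codes of the members in the closed group
groupPrice: group prices, stored as a nested type with two subfields:
  boxLevelPrice: price for the group
  level: group level

Define the product mapping (the ik_smart analyzer comes from the IK analysis plugin, which must be installed):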
PUT my_goods_20210423
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    }
  },
  "mappings": {
    "properties": {
      "goodsName": {
        "type": "text",
        "analyzer": "ik_smart"
      },
      "skuCode": {
        "type": "keyword"
      },
      "brandName": {
        "type": "keyword"
      },
      "channelType": {
        "type": "keyword"
      },
      "shopCode": {
        "type": "keyword"
      },
      "publicPrice": {
        "type": "float"
      },
      "closeUserCode": {
        "type": "text",
        "analyzer": "standard"
      },
      "boostValue": {
        "type": "keyword"
      },
      "groupPrice": {
        "type": "nested",
        "properties": {
          "boxLevelPrice": {
            "type": "float"
          },
          "level": {
            "type": "text"
          }
        }
      }
    }
  }
}

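Working with documents

Document operations cover four core functions: create, delete, update, and query.

1. Create

Documents can be created with the following requests: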

PUT /<target>/_doc/<_id>

POST /<target>/_doc/

PUT /<target>/_create/<_id>

POST /<target>/_create/<_id>

Taking POST /<target>/_create/<_id> as the example, the following request creates the product document with ID 1:

POST /my_goods_20210423/_create/1
{
    "goodsName":"蘋果 51英寸 4K超高清",
    "skuCode":"skuCode1",
    "brandName":"蘋果",
    "closeUserCode":[
        "0"
    ],
    "channelType":"cloudPlatform",
    "shopCode":"sc00001",
    "publicPrice":"8188.88",
    "groupPrice":null,
    "boxPrice":null,
    "boostValue":1.8
}

ES also supports batch inserts through the _bulk API:

POST my_goods_20210423/_bulk
{"index":{"_id":3}}
{"goodsName":"蘋果UA55RU7520JXXZ 53英寸 4K高清","skuCode":"skuCode3","brandName":"美國蘋果","closeUserCode":["0"],"channelType":"cloudPlatform","shopCode":"sc00001","publicPrice":"8388.88","groupPrice":null,"boxPrice":[{"boxType":"box1","boxUserCode":["htd003","uc004"],"boxPriceDetail":4388.88},{"boxType":"box2","boxUserCode":["uc005","uc0010"],"boxPriceDetail":5388.88}],"boostValue":1.2}
{"index":{"_id":4}}
{"goodsName":"山東蘋果UA55RU7520JXXZ 蘋果54英寸 5K超高清","skuCode":"skuCode4","brandName":"山東蘋果","closeUserCode":["uc001","uc002","uc003"],"channelType":"cloudPlatform","shopCode":"sc00001","publicPrice":"8488.88","groupPrice":[{"level":"level1","boxLevelPrice":"2488.88"},{"level":"level2","boxLevelPrice":"3488.88"}],"boxPrice":[{"boxType":"box1","boxUserCode":["uc004","uc005","uc006","uc001"],"boxPriceDetail":4488.88},{"boxType":"box2","boxUserCode":["htd007","htd008","htd009","uc0010"],"boxPriceDetail":5488.88}],"boostValue":1.2}
{"index":{"_id":5}}
{"goodsName":"蘋果UA55R蘋果U7蘋果520JXXZ 55英寸 5K超高清","skuCode":"skuCode5","brandName":"三星蘋果","closeUserCode":["uc001","uc002","uc003"],"channelType":"cloudPlatform","shopCode":"sc00001","publicPrice":"8488.88","groupPrice":[{"level":"level1","boxLevelPrice":"2500"},{"level":"level2","boxLevelPrice":"3500"}],"boxPrice":[{"boxType":"box1","boxUserCode":["uc004","uc005","uc006","uc001"],"boxPriceDetail":3588.88},{"boxType":"box2","boxUserCode":["htd007","htd008","htd009","uc0010"],"boxPriceDetail":5588.88}],"boostValue":1.2}
{"index":{"_id":6}}
{"goodsName":"三星UA55RU7520JXXZ 51英寸 4K超高清","skuCode":"skuCode1","brandName":"三星","closeUserCode":["0"],"channelType":"cmccPlatform","shopCode":"sc00001","publicPrice":"8188.88","groupPrice":null,"boxPrice":null,"boostValue":1.2}
{"index":{"_id":7}}
{"goodsName":"三星UA55RU7520JXXZ 52英寸 4K超高清","skuCode":"skuCode2","brandName":"三星","closeUserCode":["0"],"channelType":"cmccPlatform","shopCode":"sc00001","publicPrice":"8288.88","groupPrice":null,"boxPrice":[{"boxType":"box1","boxUserCode":["htd002"],"boxPriceDetail":4288.88}],"boostValue":1.2}
{"index":{"_id":8}}
{"goodsName":"三星UA55RU7520JXXZ 52英寸 4K超高清","skuCode":"skuCode2","brandName":"三星","closeUserCode":["uc0022"],"channelType":"cloudPlatform","shopCode":"sc00001","publicPrice":"8288.88","groupPrice":null,"boxPrice":[{"boxType":"box1","boxUserCode":["uc0022"],"boxPriceDetail":4288.88}],"boostValue":1.2}
{"index":{"_id":9}}
{"goodsName":"三星UA55RU7520JXXZ 52英寸 4K超高清","skuCode":"skuCode2","brandName":"三星","closeUserCode":["uc0022"],"channelType":"cloudPlatform","shopCode":"sc00001","publicPrice":"8288.88","groupPrice":null,"boxPrice":[{"boxType":"box1","boxUserCode":["uc0022"],"boxPriceDetail":4288.88}],"boostValue":1.2}
{"index":{"_id":10}}
{"goodsName":"三星UA55RU7520JXXZ 52英寸 4K超高清","skuCode":"skuCode2","brandName":"三星","closeUserCode":["uc0022"],"channelType":"cloudPlatform","shopCode":"sc00001","publicPrice":"8288.88","groupPrice":null,"boxPrice":[{"boxType":"box1","boxUserCode":["uc0022"],"boxPriceDetail":4288.88}],"boostValue":1.8}
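
The _bulk body above is newline-delimited JSON: one action line followed by one document line, repeated, with a final newline at the end. As a minimal sketch of how such a body could be assembled in client code, the helper below uses only the Python standard library. The function name build_bulk_body and the shortened sample documents are illustrative assumptions, not part of the original article.

```python
import json


def build_bulk_body(docs):
    """Build an NDJSON body for _bulk from (doc_id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        # Action line: index the following document under this _id.
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(source, ensure_ascii=False))
    # The _bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


# Shortened, hypothetical versions of documents 6 and 7 from the article.
docs = [
    (6, {"goodsName": "三星UA55RU7520JXXZ 51英寸 4K超高清", "skuCode": "skuCode1"}),
    (7, {"goodsName": "三星UA55RU7520JXXZ 52英寸 4K超高清", "skuCode": "skuCode2"}),
]
body = build_bulk_body(docs)
print(body, end="")
```

The resulting string can be sent as the request body of POST my_goods_20210423/_bulk by whatever HTTP client you use.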

2. Delete

Documents are deleted with the following request:

DELETE /<index>/_doc/<_id>

Delete the document with ID 2:

DELETE /my_goods_20210423/_doc/2

Deletion by condition is also supported, through _delete_by_query.
The following request deletes every product whose shop code is "sc00002":

POST /my_goods_20210423/_delete_by_query
{
  "query": {
    "match": {
      "shopCode": "sc00002"
    }
  }
}

3. Update

Documents are updated with the following request:

POST /<index>/_update/<_id>

The examples below update the document with ID 1.

Add a new field:

POST /my_goods_20210423/_update/1
{
  "doc": {
    "shopName": "小王店鋪"
  }
}

Change the shop name to "張三店鋪":

POST /my_goods_20210423/_update/1
{
  "doc": {
    "shopName": "張三店鋪"
  }
}

The document's _source now reads:

{
  "goodsName" : "蘋果 51英寸 4K超高清",
  "skuCode" : "skuCode1",
  "brandName" : "蘋果",
  "closeUserCode" : [
    "0"
  ],
  "channelType" : "cloudPlatform",
  "shopCode" : "sc00001",
  "publicPrice" : "8188.88",
  "groupPrice" : null,
  "boxPrice" : null,
  "boostValue" : 1.8,
  "shopName" : "張三店鋪"
}

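Updates can also go through the _update_by_query API. The example below sets "publicPrice" to 5888.00 for every product whose shop code is "sc00002".
First insert a product with document ID 2 for that shop: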

POST /my_goods_20210423/_create/2
{"goodsName":"蘋果 55英寸 3K超高清","skuCode":"skuCode2","brandName":"蘋果","closeUserCode":["0"],"channelType":"cloudPlatform","shopCode":"sc00002","publicPrice":"6188.88","groupPrice":null,"boxPrice":null,"boostValue":1.0}

Querying the document now returns:

{
  "goodsName" : "蘋果 55英寸 3K超高清",
  "skuCode" : "skuCode2",
  "brandName" : "蘋果",
  "closeUserCode" : [
    "0"
  ],
  "channelType" : "cloudPlatform",
  "shopCode" : "sc00002",
  "publicPrice" : "6188.88",
  "groupPrice" : null,
  "boxPrice" : null,
  "boostValue" : 1.0
}

Set "publicPrice" to 5888.00 where the shop code is "sc00002":

POST /my_goods_20210423/_update_by_query
{
  "script": {
    "source": "ctx._source.publicPrice=5888.00",
    "lang": "painless"
  },
  "query": {
    "term": {
      "shopCode": "sc00002"
    }
  }
}

Query the document again:

GET /my_goods_20210423/_source/2

Response:

{
  "shopCode" : "sc00002",
  "brandName" : "蘋果",
  "closeUserCode" : [
    "0"
  ],
  "groupPrice" : null,
  "boxPrice" : null,
  "channelType" : "cloudPlatform",
  "boostValue" : 1.0,
  "publicPrice" : 5888.0,
  "goodsName" : "蘋果 55英寸 3K超高清",
  "skuCode" : "skuCode2"
}

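When the business needs to rebuild an index, use the _reindex API.
The source must be an existing index, index alias, or data stream, and the destination must be different from the source. If the destination index does not exist yet, it is by default created automatically with dynamically inferred mappings, so create it with an explicit mapping first if the field types matter.
You can simply reindex index A into index B, or reindex only the documents that match a condition.
The following request reindexes the products with skuCode=skuCode2 into the index my_goods_20210423_new: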

POST _reindex
{
  "source": {
    "index": "my_goods_20210423",
    "query": {
      "match": {
        "skuCode": "skuCode2"
      }
    }
  },
  "dest": {
    "index": "my_goods_20210423_new"
  }
}

Query the data in the my_goods_20210423_new index:

GET my_goods_20210423_new/_search

Response:

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "my_goods_20210423_new",
        "_type" : "_doc",
        "_id" : "7",
        "_score" : 1.0,
        "_source" : {
          "goodsName" : "三星UA55RU7520JXXZ 52英寸 4K超高清",
          "skuCode" : "skuCode2",
          "brandName" : "三星",
          "closeUserCode" : [
            "0"
          ],
          "channelType" : "cmccPlatform",
          "shopCode" : "sc00001",
          "publicPrice" : "8288.88",
          "groupPrice" : null,
          "boxPrice" : [
            {
              "boxType" : "box1",
              "boxUserCode" : [
                "htd002"
              ],
              "boxPriceDetail" : 4288.88
            }
          ],
          "boostValue" : 1.2
        }
      },
      {
        "_index" : "my_goods_20210423_new",
        "_type" : "_doc",
        "_id" : "8",
        "_score" : 1.0,
        "_source" : {
          "goodsName" : "三星UA55RU7520JXXZ 52英寸 4K超高清",
          "skuCode" : "skuCode2",
          "brandName" : "三星",
          "closeUserCode" : [
            "uc0022"
          ],
          "channelType" : "cloudPlatform",
          "shopCode" : "sc00001",
          "publicPrice" : "8288.88",
          "groupPrice" : null,
          "boxPrice" : [
            {
              "boxType" : "box1",
              "boxUserCode" : [
                "uc0022"
              ],
              "boxPriceDetail" : 4288.88
            }
          ],
          "boostValue" : 1.2
        }
      },
      {
        "_index" : "my_goods_20210423_new",
        "_type" : "_doc",
        "_id" : "9",
        "_score" : 1.0,
        "_source" : {
          "goodsName" : "三星UA55RU7520JXXZ 52英寸 4K超高清",
          "skuCode" : "skuCode2",
          "brandName" : "三星",
          "closeUserCode" : [
            "uc0022"
          ],
          "channelType" : "cloudPlatform",
          "shopCode" : "sc00001",
          "publicPrice" : "8288.88",
          "groupPrice" : null,
          "boxPrice" : [
            {
              "boxType" : "box1",
              "boxUserCode" : [
                "uc0022"
              ],
              "boxPriceDetail" : 4288.88
            }
          ],
          "boostValue" : 1.2
        }
      },
      {
        "_index" : "my_goods_20210423_new",
        "_type" : "_doc",
        "_id" : "10",
        "_score" : 1.0,
        "_source" : {
          "goodsName" : "三星UA55RU7520JXXZ 52英寸 4K超高清",
          "skuCode" : "skuCode2",
          "brandName" : "三星",
          "closeUserCode" : [
            "uc0022"
          ],
          "channelType" : "cloudPlatform",
          "shopCode" : "sc00001",
          "publicPrice" : "8288.88",
          "groupPrice" : null,
          "boxPrice" : [
            {
              "boxType" : "box1",
              "boxUserCode" : [
                "uc0022"
              ],
              "boxPriceDetail" : 4288.88
            }
          ],
          "boostValue" : 1.8
        }
      }
    ]
  }
}

4. Query

Documents can be retrieved with the following requests:

GET <index>/_doc/<_id>

HEAD <index>/_doc/<_id>

GET <index>/_source/<_id>

HEAD <index>/_source/<_id>
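
Get the document with ID 1

GET /my_goods_20210423/_doc/1

Check whether the document with ID 1 exists

When you only need to know whether a document exists, HEAD returns no body, so it is cheaper than GET and fits that special case.

HEAD /my_goods_20210423/_doc/1

Response:

200 - OK

Return only the document source

Return just the _source of the document:

GET /my_goods_20210423/_source/1

Response: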

{
  "goodsName" : "蘋果 51英寸 4K超高清",
  "skuCode" : "skuCode1",
  "brandName" : "蘋果",
  "closeUserCode" : [
    "0"
  ],
  "channelType" : "cloudPlatform",
  "shopCode" : "sc00001",
  "publicPrice" : "8188.88",
  "groupPrice" : null,
  "boxPrice" : null,
  "boostValue" : 1.8
}

Customizing the returned fields

Fetch only some of the _source fields, like naming columns in a database query instead of using select * to return every field:

GET my_goods_20210423/_source/1?_source_includes=brandName,goodsName

Response:

{
  "brandName" : "蘋果",
  "goodsName" : "蘋果 51英寸 4K超高清"
}

Batch queries

ES also supports batch retrieval through the _mget API. The following request fetches the documents with IDs 1 and 2:

GET /my_goods_20210423/_mget
{
  "docs": [
    {
      "_id": "1"
    },
    {
      "_id": "2"
    }
  ]
}

Response:

{
  "docs" : [
    {
      "_index" : "my_goods_20210423",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 7,
      "_seq_no" : 8,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "goodsName" : "蘋果 51英寸 4K超高清",
        "skuCode" : "skuCode1",
        "brandName" : "蘋果",
        "closeUserCode" : [
          "0"
        ],
        "channelType" : "cloudPlatform",
        "shopCode" : "sc00001",
        "publicPrice" : "8188.88",
        "groupPrice" : null,
        "boxPrice" : null,
        "boostValue" : 1.8,
        "shopName" : "張三店鋪"
      }
    },
    {
      "_index" : "my_goods_20210423",
      "_type" : "_doc",
      "_id" : "2",
      "found" : false
    }
  ]
}

Query DSL

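Index queries include full-text queries, compound queries, structured queries, and more. Clauses run in one of two contexts, query or filter.
The two differ:

  1. Query context answers how well a document matches the clause, and returns a _score so that results can be ranked by relevance.
  2. Filter context answers only whether a document matches, with no score. Filters are especially useful alongside aggregations: skipping scoring lets ES work faster and improves overall response time. Frequently used filters can also be cached automatically, whereas scored query clauses are not cached this way.

When to use which: use query context for full-text search and anything that needs relevance scores; filter context is recommended everywhere else.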

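Compound queries

Boolean query

A bool query combines must, filter, should, and must_not clauses.
must clauses have to match and contribute to the score; filter clauses have to match but are not scored; should clauses work like OR in a database query; must_not excludes matching documents, like "not equal".
The query below finds products where shop code = sc00001 and channelType = cloudPlatform and publicPrice is not between 8288 and 8888. Because minimum_should_match is 1, the single should clause is required as well, so brandName must also be exactly 蘋果 (brandName is a keyword field).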

POST /my_goods_20210423/_search
{
  "query": {
    "bool": {
      "must": {
        "term":{
          "shopCode":"sc00001"
        }
      },
      "filter": {
        "term": {
          "channelType": "cloudPlatform"
        }
      },
      "must_not": [
        {
         "range": {
           "publicPrice": {
             "gte": 8288,
             "lte": 8888
           }
         }
        }
      ],
      "should": [
        {
          "term": {
            "brandName": {
              "value": "蘋果"
            }
          }
        }
      ],
      "minimum_should_match" : 1
    }
  }
}
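
To make the clause logic of the bool query above concrete, here is a small sketch that evaluates the same conditions on in-memory dictionaries. It only mirrors the matching rules, not scoring, and the three sample documents are trimmed copies of documents 1, 3, and 6 from the article. The helper name matches_bool is an illustrative assumption.

```python
def matches_bool(doc):
    """Mirror the bool query's clauses on one in-memory document."""
    must = doc["shopCode"] == "sc00001"               # must: term on shopCode
    filter_ = doc["channelType"] == "cloudPlatform"   # filter: term on channelType
    must_not = 8288 <= doc["publicPrice"] <= 8888     # must_not: range on publicPrice
    should_hits = sum([doc["brandName"] == "蘋果"])    # should: exact term on brandName
    minimum_should_match = 1
    return must and filter_ and not must_not and should_hits >= minimum_should_match


docs = {
    "1": {"shopCode": "sc00001", "channelType": "cloudPlatform",
          "publicPrice": 8188.88, "brandName": "蘋果"},
    "3": {"shopCode": "sc00001", "channelType": "cloudPlatform",
          "publicPrice": 8388.88, "brandName": "美國蘋果"},
    "6": {"shopCode": "sc00001", "channelType": "cmccPlatform",
          "publicPrice": 8188.88, "brandName": "三星"},
}
matching_ids = [doc_id for doc_id, doc in docs.items() if matches_bool(doc)]
print(matching_ids)
```

Document 3 is rejected twice over (its price falls in the excluded range and its brand is not exactly 蘋果), and document 6 fails the channelType filter, which leaves only document 1.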
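
Boosting query

A boosting query adjusts relevance: documents that match the positive query are returned, and those that also match the negative query have their score multiplied by negative_boost.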

POST /my_goods_20210423/_search
{
  "query": {
    "boosting": {
      "positive": {
        "term": {
          "skuCode": {
            "value": "skuCode1"
          }
        }
      },
      "negative": {
        "term": {
          "goodsName": {
            "value": "三星"
          }
        }
      }, 
      "negative_boost": 1
    }
  }
}

Here negative_boost is 1, which neither raises nor lowers the score. Response:

"hits" : [
      {
        "_index" : "my_goods_20210423",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.3862942,
        "_source" : {
          "goodsName" : "蘋果 51英寸 4K超高清",
          "skuCode" : "skuCode1",
          "brandName" : "蘋果",
          "closeUserCode" : [
            "0"
          ],
          "channelType" : "cloudPlatform",
          "shopCode" : "sc00001",
          "publicPrice" : "8188.88",
          "groupPrice" : null,
          "boxPrice" : null,
          "boostValue" : 1.8,
          "shopName" : "張三店鋪"
        }
      },
      {
        "_index" : "my_goods_20210423",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 1.3862942,
        "_source" : {
          "goodsName" : "三星UA55RU7520JXXZ 51英寸 4K超高清",
          "skuCode" : "skuCode1",
          "brandName" : "三星",
          "closeUserCode" : [
            "0"
          ],
          "channelType" : "cmccPlatform",
          "shopCode" : "sc00001",
          "publicPrice" : "8188.88",
          "groupPrice" : null,
          "boxPrice" : null,
          "boostValue" : 1.2
        }
      }
    ]

Both documents have the same score: "_score" : 1.3862942

After changing "negative_boost" to 0.2, the response becomes (some irrelevant fields omitted):

"hits" : [
      {
        "_index" : "my_goods_20210423",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.3862942,
        "_source" : {
          "goodsName" : "蘋果 51英寸 4K超高清",
          "skuCode" : "skuCode1",
          "brandName" : "蘋果",
          "closeUserCode" : [
            "0"
          ],
          "channelType" : "cloudPlatform",
          "shopCode" : "sc00001",
          "publicPrice" : "8188.88",
          "groupPrice" : null,
          "boxPrice" : null,
          "boostValue" : 1.8,
          "shopName" : "張三店鋪"
        }
      },
      {
        "_index" : "my_goods_20210423",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 0.27725884,
        "_source" : {
          "goodsName" : "三星UA55RU7520JXXZ 51英寸 4K超高清",
          "skuCode" : "skuCode1",
          "brandName" : "三星",
          "closeUserCode" : [
            "0"
          ],
          "channelType" : "cmccPlatform",
          "shopCode" : "sc00001",
          "publicPrice" : "8188.88",
          "groupPrice" : null,
          "boxPrice" : null,
          "boostValue" : 1.2
        }
      }
    ]

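The score of document ID=6 drops to "_score" : 0.27725884, that is 1.3862942 × 0.2, because it matches the negative query. The Elasticsearch documentation defines negative_boost as a number between 0 and 1.0 that lowers the score of documents matching the negative query; it is not meant for raising scores.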
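
The score change can be reproduced with a line of arithmetic. The sketch below is a simplified model of the boosting query, assuming the positive score is simply multiplied by negative_boost when the negative query matches. The function name boosting_score is an illustrative assumption; the numbers are taken from the responses above.

```python
import math


def boosting_score(positive_score, matches_negative, negative_boost):
    """Scale the positive query's score when the negative query also matches."""
    if matches_negative:
        return positive_score * negative_boost
    return positive_score


# Document 1 does not match the negative query; document 6 does.
score_doc1 = boosting_score(1.3862942, False, 0.2)
score_doc6 = boosting_score(1.3862942, True, 0.2)
print(score_doc1, round(score_doc6, 8))
```

With negative_boost set to 1 both documents keep the score 1.3862942, which matches the first response.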

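Constant score query

When a query should not take TF (term frequency) into account, use a constant_score query: it wraps a filter and gives every matching document the same score, equal to boost.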

POST /my_goods_20210423/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "goodsName": "蘋果"
        }
      },
      "boost": 1.2
    }
  }
}

Response (some irrelevant fields omitted):

{
        "_index" : "my_goods_20210423",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.2,
        "_source" : {
          "goodsName" : "蘋果UA55RU7520JXXZ 53英寸 4K高清"
        }
      },
      {
        "_index" : "my_goods_20210423",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.2,
        "_source" : {
          "goodsName" : "山東蘋果UA55RU7520JXXZ 蘋果54英寸 5K超高清"
        }
      }
}

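Documents ID=3 and ID=4 get the same score, even though ID=4 contains the term more often and would otherwise be the more relevant match. That is because constant_score ignores the effect of term frequency on scoring.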

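Disjunction max query

The disjunction max (dis_max) query returns every document that matches any of its sub-queries, but uses only the score of the best-matching sub-query as the document's score. For example, when searching for "蘋果" in both the product name and the brand name, if the brand clause scores higher than the product name clause, the brand score becomes the overall score (ignoring the tie_breaker adjustment).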

GET /my_goods_20210423/_search
{
  "query": {
    "dis_max": {
      "tie_breaker": 0.7,
      "boost": 1.2,
      "queries": [
        {
          "term": {
            "goodsName": {
              "value": "蘋果"
            }
          }
        },
        {
          "term": {
            "brandName": {
              "value": "蘋果"
            }
          }
        }
        ]
    }
  }
}

Response (irrelevant fields omitted):

"max_score" : 3.0150018,
    "hits" : [
      {
        "_index" : "my_goods_20210423",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 3.0150018,
        "_source" : {
          "goodsName" : "蘋果 51英寸 4K超高清",
          "brandName" : "蘋果"
        }
      },
      {
        "_index" : "my_goods_20210423",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 1.3465583,
        "_source" : {
          "goodsName" : "蘋果UA55R蘋果U7蘋果520JXXZ 55英寸 5K超高清",
          "brandName" : "三星蘋果"
        }
      },
      {
        "_index" : "my_goods_20210423",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.2337791,
        "_source" : {
          "goodsName" : "山東蘋果UA55RU7520JXXZ 蘋果54英寸 5K超高清",
          "brandName" : "山東蘋果"
        }
      },

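Analysis:

  1. In the record id=1, the brand is just the two characters "蘋果", which ES treats as the closest match, so this record ranks first.
  2. The records ID=5 and ID=4 have brands of the same length that both contain 蘋果, so the ranking comes down to how often 蘋果 appears in goodsName: three times for ID=5 but only twice for ID=4. ID=4 therefore scores lower than ID=5, as expected.
  3. What is tie_breaker for? It acts as a buffer (a value between 0 and 1). dis_max takes the score of the best-matching field as the score of the whole document, which leaves the other fields with no influence at all and is a bit extreme. tie_breaker gives the other fields some weight back. The final score is: best clause score + tie_breaker × (sum of the other matching clause scores). When brandName is the best match, that is brandName score + goodsName score × tie_breaker.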
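
The tie_breaker rule can be written down as a short function. This is a sketch of the scoring rule only, assuming boost multiplies the combined score; the per-clause scores used in the example (0.5 and 2.0) are made-up values for illustration, not scores from the index above, and the name dis_max_score is an assumption as well.

```python
import math


def dis_max_score(clause_scores, tie_breaker=0.0, boost=1.0):
    """Best clause score plus tie_breaker times the other matching clause scores."""
    matched = [s for s in clause_scores if s is not None]  # None = clause did not match
    if not matched:
        return None
    best = max(matched)
    others = sum(matched) - best
    return boost * (best + tie_breaker * others)


# Hypothetical clause scores: goodsName = 0.5, brandName = 2.0.
only_best = dis_max_score([0.5, 2.0])
with_tie_breaker = dis_max_score([0.5, 2.0], tie_breaker=0.7, boost=1.2)
print(only_best, round(with_tie_breaker, 2))
```

With tie_breaker at 0 the weaker clause is ignored entirely; at 1 the result is the plain sum of all matching clauses.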
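
Function score query

function_score lets you control how documents are scored and is the ultimate tool for shaping the scoring process. The most efficient usage applies different functions to subsets of the results selected by filters, which benefits from filter caching while still controlling the score.
Suppose we want 山東 apples to appear before 美國 apples: search for product names that contain "蘋果", apply a weight of 2 when the brand contains "美國", and a weight of 40 when it contains "山東". Note that brandName is mapped as keyword, so the match filters below would only match documents whose brand is exactly 美國 or 山東; for brands such as 山東蘋果, a prefix query or a text-mapped field would be needed for the weights to take effect.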

GET /my_goods_20210423/_search
{
  "query": {
    "function_score": {
      "query": {
        "term": {
          "goodsName": {
            "value": "蘋果"
          }
        }
      },
      "boost": 2, 
      "functions": [
        {
          "filter": {
            "match":{
              "brandName":"美國"
            }
          },
          "random_score": {
            
          },
          "weight": 2
        },
        {
          "filter": {
            "match":{
              "brandName":"山東"
            }
          },
          "weight": 40
        }
      ],
      "max_boost": 60,
      "score_mode": "max",
      "boost_mode": "multiply",
      "min_score": 2
    }
  }
}

Main part of the response:

    "max_score" : 2.2442641,
    "hits" : [
     {
        "_index" : "my_goods_20210423",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 2.0562985,
        "_source" : {
          "goodsName" : "山東蘋果UA55RU7520JXXZ 蘋果54英寸 5K超高清",
          "brandName" : "山東蘋果"
        }
      },
      {
        "_index" : "my_goods_20210423",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.7582327,
        "_source" : {
          "goodsName" : "蘋果UA55RU7520JXXZ 53英寸 4K高清",
          "brandName" : "美國蘋果"
        }
      }
    ]

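Notes on some of the parameters:

  1. score_mode
    multiply: the default, scores are multiplied
    sum: scores are added
    avg: scores are averaged
    first: the score of the first function whose filter matches
    max: the highest score is used
    min: the lowest score is used

avg example: if two functions return scores 1 and 2 and their weights are 3 and 4, the combined score is (1*3+2*4)/(3+4), which is 11/7, about 1.57.
For the rest, see the official documentation on function_score score functions.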
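
The score_mode options can be sketched as one small function. This is a simplified model, assuming each function's score is multiplied by its weight before the scores are combined and that avg is a weighted average, as in the example above. The function name combine_function_scores is an illustrative assumption.

```python
import math


def combine_function_scores(scores, weights, score_mode="multiply"):
    """Combine per-function scores according to score_mode (simplified)."""
    weighted = [s * w for s, w in zip(scores, weights)]
    if score_mode == "multiply":
        return math.prod(weighted)
    if score_mode == "sum":
        return sum(weighted)
    if score_mode == "avg":
        # Weighted average: sum of score * weight over the sum of weights.
        return sum(weighted) / sum(weights)
    if score_mode == "first":
        return weighted[0]
    if score_mode == "max":
        return max(weighted)
    if score_mode == "min":
        return min(weighted)
    raise ValueError(f"unknown score_mode: {score_mode}")


avg = combine_function_scores([1, 2], [3, 4], "avg")
print(round(avg, 4))
```

The combined function score is then merged with the query score according to boost_mode, which is multiply in the request above.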

Full-text search

Match query

The match query is the standard query for full-text search. For example:

GET /my_goods_20210423/_search
{
  "query": {
    "match": {
      "goodsName": "蘋果 高清 英寸"
    }
  }
}

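A match query is of type boolean: the analyzed terms are combined into a boolean query, and the "operator" parameter controls how the boolean clauses are combined. It can be and or or (the default is or).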

GET /my_goods_20210423/_search
{
  "query": {
    "match": {
      "goodsName": {
        "query": "蘋果 高清 英寸",
        "operator": "and"
      }
    }
  }
}

Response:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

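There are 0 hits: with operator and, every analyzed term of "蘋果 高清 英寸" must appear in the product name (in any order, not necessarily as a phrase), and no product name contains all of them.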

match boolean prefix query

First add two test products with English names, to make testing easier:

POST my_goods_20210423/_bulk
{"index":{"_id":11}}
{"goodsName":"apple goods test","skuCode":"skuCode3","brandName":"美國蘋果","closeUserCode":["0"],"channelType":"cloudPlatform","shopCode":"sc00001","publicPrice":"8388.88","groupPrice":null,"boxPrice":[{"boxType":"box1","boxUserCode":["htd003","uc004"],"boxPriceDetail":4388.88},{"boxType":"box2","boxUserCode":["uc005","uc0010"],"boxPriceDetail":5388.88}],"boostValue":1.2}
{"index":{"_id":12}}
{"goodsName":"apple goods online","skuCode":"skuCode3","brandName":"美國蘋果","closeUserCode":["0"],"channelType":"cloudPlatform","shopCode":"sc00001","publicPrice":"8388.88","groupPrice":null,"boxPrice":[{"boxType":"box1","boxUserCode":["htd003","uc004"],"boxPriceDetail":4388.88},{"boxType":"box2","boxUserCode":["uc005","uc0010"],"boxPriceDetail":5388.88}],"boostValue":1.2}
GET /my_goods_20210423/_search
{
  "query": {
    "match_bool_prefix": {
      "goodsName": "apple goods t"
    }
  }
}

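Both newly added products are returned. match_bool_prefix analyzes the input and builds a bool query from the terms: every term except the last becomes a term query, and the last one becomes a prefix query. The request above is roughly equivalent to: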

GET /my_goods_20210423/_search
{
  "query": {
    "bool" : {
      "should": [
        { "term": { "goodsName": "apple" }},
        { "term": { "goodsName": "goods" }},
        { "prefix": { "goodsName": "t"}}
      ]
    }
  }
}

match phrase query

Matches documents that contain the query terms as a phrase, that is, adjacent and in the same order:

GET /my_goods_20210423/_search
{
  "query": {
    "match_phrase": {
      "goodsName": "apple"
    }
  }
}

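Compare match_phrase with match: match_phrase also analyzes the query text, but treats the resulting terms as one unit that must appear together and in order (you can specify the analyzer as well), whereas match analyzes the query text and then matches documents that contain any of the terms.

# Returns nothing, because no document contains the phrase 'goods t'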
GET /my_goods_20210423/_search
{
  "query": {
    "match_phrase": {
      "goodsName": "goods t"
    }
  }
}
# Returns data, because the documents contain the term goods (match needs only one of goods or t)
GET /my_goods_20210423/_search
{
  "query": {
    "match": {
      "goodsName": "goods t"
    }
  }
}

match phrase prefix query

Returns documents that contain the given terms in the given order, with the last term treated as a prefix. For "apple goods t", a product named "apple goods test" is returned.
Add one more test document:

POST my_goods_20210423/_bulk
{"index":{"_id":13}}
{"goodsName":"apple and goods product ","skuCode":"skuCode3","brandName":"美國蘋果","closeUserCode":["0"],"channelType":"cloudPlatform","shopCode":"sc00001","publicPrice":"8388.88","groupPrice":null,"boxPrice":[{"boxType":"box1","boxUserCode":["htd003","uc004"],"boxPriceDetail":4388.88},{"boxType":"box2","boxUserCode":["uc005","uc0010"],"boxPriceDetail":5388.88}],"boostValue":1.2}
# Returns only the document whose goodsName is apple goods test
GET /my_goods_20210423/_search
{
  "query": {
    "match_phrase_prefix": {
      "goodsName": "apple goods t"
    }
  }
}

Summary comparing these four match queries

[Figure: match比較.png, comparison of the four match queries]
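
As a rough, simplified illustration of how the four match queries differ, the sketch below re-implements their matching rules on plain strings. It assumes whitespace tokenization and lowercasing in place of a real analyzer, ignores scoring, slop, and max_expansions, and expects a non-empty query. All function names are illustrative assumptions; the three product names are the English test documents added above.

```python
def tokens(text):
    """Stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match(doc, query, operator="or"):
    doc_terms = set(tokens(doc))
    hits = [term in doc_terms for term in tokens(query)]
    return all(hits) if operator == "and" else any(hits)


def match_bool_prefix(doc, query):
    # Every term but the last is a term clause; the last is a prefix clause.
    *terms, last = tokens(query)
    doc_terms = tokens(doc)
    return any(t in doc_terms for t in terms) or any(d.startswith(last) for d in doc_terms)


def _phrase(doc, query, prefix_last):
    d, q = tokens(doc), tokens(query)
    if not q:
        return False
    for start in range(len(d) - len(q) + 1):
        window = d[start:start + len(q)]
        head_ok = window[:-1] == q[:-1]
        last_ok = window[-1].startswith(q[-1]) if prefix_last else window[-1] == q[-1]
        if head_ok and last_ok:
            return True
    return False


def match_phrase(doc, query):
    return _phrase(doc, query, prefix_last=False)


def match_phrase_prefix(doc, query):
    return _phrase(doc, query, prefix_last=True)


names = ["apple goods test", "apple goods online", "apple and goods product"]
results = {
    "match": [n for n in names if match(n, "goods t")],
    "match_phrase": [n for n in names if match_phrase(n, "goods t")],
    "match_bool_prefix": [n for n in names if match_bool_prefix(n, "apple goods t")],
    "match_phrase_prefix": [n for n in names if match_phrase_prefix(n, "apple goods t")],
}
for name, hits in results.items():
    print(name, hits)
```

In this model match and match_bool_prefix return all three names, match_phrase returns none, and match_phrase_prefix returns only "apple goods test".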
Multi-match

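Multi-field matching: the same query is run against several fields, and the type parameter changes how the fields are matched and scored.

# Find documents whose product name or brand name contains 蘋果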
POST /my_goods_20210423/_search
{
  "query": {
    "multi_match": {
      "query": "蘋果",
      "type": "best_fields", 
      "fields": ["goodsName","brandName"],
      "tie_breaker": 0.3
    }
  }
}

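The type parameter in detail:

  • best_fields: the default; matches any of the fields and uses the score of the best-matching field as the score of the query
  • most_fields: matches any of the fields and combines the scores from each matching field
  • cross_fields: matches across fields, treating them as one combined field when checking for the query terms
  • phrase: runs a match_phrase query on each field and uses the best field's score as the overall score
  • phrase_prefix: runs a match_phrase_prefix query on each field and uses the best field's score as the overall score
  • bool_prefix: runs a match_bool_prefix query on each field and combines the scores from each field; see bool_prefix for details

A hands-on walkthrough using cross_fields:

# Create the index and insert test data
PUT my_shop
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "first_name":{
        "type":"text"
      },
      "last_name":{
        "type":"text"
      }
    }
  }
}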
POST my_shop/_bulk
{"index":{"_id":1}}
{"first_name":"Will","last_name":"Smith","age":25}
{"index":{"_id":2}}
{"first_name":"Smith","last_name":"hello","age":21}
{"index":{"_id":3}}
{"first_name":"Will","last_name":"hello","age":20}

# Find the person named Will Smith
GET /my_shop/_search
{
  "query": {
    "multi_match" : {
      "query":      "Will Smith",
      "type":       "cross_fields",
      "fields":     [ "first_name^2", "last_name" ],
      "operator":   "and"
    }
  }
}
# Response
"max_score" : 1.9208363,
    "hits" : [
      {
        "_index" : "my_shop",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.9208363,
        "_source" : {
          "first_name" : "Will",
          "last_name" : "Smith",
          "age" : 25
        }
      }
    ]

Note that first_name^2 doubles the weight of the first_name field; the default boost is 1.
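
Why only document 1 comes back can be seen from a simplified model of cross_fields with operator and: every query term must appear in at least one of the listed fields. The sketch below shows the matching rule only; it assumes whitespace tokenization and leaves out the field boosts and the blended term statistics that the real query uses for scoring. The function name cross_fields_and is an illustrative assumption.

```python
def cross_fields_and(doc, query, fields):
    """True when every query term appears in at least one of the fields."""
    field_terms = set()
    for field in fields:
        field_terms.update(str(doc.get(field, "")).lower().split())
    return all(term in field_terms for term in query.lower().split())


people = {
    1: {"first_name": "Will", "last_name": "Smith", "age": 25},
    2: {"first_name": "Smith", "last_name": "hello", "age": 21},
    3: {"first_name": "Will", "last_name": "hello", "age": 20},
}
hits = [doc_id for doc_id, doc in people.items()
        if cross_fields_and(doc, "Will Smith", ["first_name", "last_name"])]
print(hits)
```

Documents 2 and 3 each contain only one of the two terms, so the and operator rejects them.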

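Term-level queries

Term-level queries search structured data, such as date ranges, IP addresses, and prices. The examples below show how each one is used in the business scenario.

  • exists query
    Returns documents that contain an indexed value for the field
# Return the documents that have a goodsName field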
GET /my_goods_20210423/_search
{
  "query": {
    "exists": {
      "field": "goodsName"
    }
  }
}
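  • fuzzy query
    Returns documents that contain terms similar to the search term; useful for spelling correction in search
# Using the example from the official site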
POST /my_index/_bulk
{ "index": { "_id": 1 }}
{ "text": "Surprise me!"}
{ "index": { "_id": 2 }}
{ "text": "That was surprising."}
{ "index": { "_id": 3 }}
{ "text": "I wasn't surprised."}

GET /my_index/_search
{
  "query": {
    "fuzzy": {
      "text": {
        "value": "surprize",
        "prefix_length": 1
      }
    }
  }
}
# Response
"hits" : [
      {
        "_index" : "my_index",
        "_type" : "my_type",
        "_id" : "1",
        "_score" : 0.9559981,
        "_source" : {
          "text" : "Surprise me!"
        }
      },
      {
        "_index" : "my_index",
        "_type" : "my_type",
        "_id" : "3",
        "_score" : 0.69983494,
        "_source" : {
          "text" : "I wasn't surprised."
        }
      }

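If fuzziness is not set it defaults to AUTO, which allows an edit distance of up to 2 for terms longer than 5 characters; prefix_length defaults to 0

  1. surprising is more than 2 edits away from surprize, so it is not matched
  2. surprize is a misspelling (s->z): a single edit, which is within the allowed distance of 2
    To improve performance, set max_expansions, which limits how many variations of the term are generated.
    Also, fuzziness should not be set too high: it slows the query down, and tolerating too many errors returns results the user did not expect. A larger prefix_length, by contrast, reduces the number of candidate terms to examine.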
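
The edit distances behind this result can be checked with a plain Levenshtein implementation. This is a sketch for illustration: Elasticsearch counts transpositions as a single edit by default, which makes no difference for these particular words, and the helper names edit_distance and auto_fuzziness are assumptions. auto_fuzziness encodes the documented default AUTO thresholds (0 edits up to 2 characters, 1 edit for 3 to 5, 2 edits above 5).

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def auto_fuzziness(term):
    """Maximum edits allowed by the default AUTO setting for this term length."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


query = "surprize"
distances = {word: edit_distance(query, word)
             for word in ["surprise", "surprised", "surprising"]}
allowed = auto_fuzziness(query)
matched = [word for word, dist in distances.items() if dist <= allowed]
print(distances, matched)
```

The indexed terms surprise (1 edit) and surprised (2 edits) fall within the limit, while surprising (4 edits) does not, which agrees with the two hits in the response.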
  • ids query
    Returns the documents with the given IDs
GET /my_goods_20210423/_search
{
  "query": {
    "ids" : {
      "values" : ["1", "4", "5"]
    }
  }
}
  • prefix query
    Returns documents that contain a specific prefix in the given field
GET /my_shop_test/_search
{
  "query": {
    "prefix": {
      "shopName": {
        "value": "bo"
      }
    }
  }
}
# Response
"hits" : [
      {
        "_index" : "my_shop_test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "shopName" : "box",
          "shopCode" : "Smith"
        }
      },
      {
        "_index" : "my_shop_test",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "shopName" : "booex",
          "shopCode" : "act"
        }
      }
    ]
  • range query
    The range query is like the greater-than / less-than range conditions in a database
GET my_goods_20210423/_search
{
  "query": {
    "range": {
      "publicPrice": {
        "gte": 2000,
        "lte": 8488
      }
    }
  }
}
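  1. gt: greater than
  2. gte: greater than or equal to
  3. lt: less than
  4. lte: less than or equal to
  • regexp query
    Regular expression query. This one finds shop codes that start with 's', continue with any characters of any length, and end with '1'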
GET my_goods_20210423/_search
{
  "query": {
    "regexp": {
      "shopCode": {
        "value": "s.*1",
        "flags": "ALL",
        "case_insensitive": true,
        "max_determinized_states": 10000,
        "rewrite": "constant_score"
      }
    }
  }
}
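
One detail worth keeping in mind is that a regexp query has to match the whole term, not just part of it. The sketch below imitates that with Python's re.fullmatch. It is an approximation: Python's regular expression syntax is not identical to the Lucene syntax Elasticsearch uses, although this simple pattern reads the same in both. The helper name regexp_matches and the sample terms other than sc00001 and sc00002 are illustrative assumptions.

```python
import re


def regexp_matches(term, pattern, case_insensitive=False):
    """True when the pattern matches the entire term, as a regexp query requires."""
    flags = re.IGNORECASE if case_insensitive else 0
    return re.fullmatch(pattern, term, flags) is not None


shop_codes = ["sc00001", "sc00002", "xsc00001", "SC00001"]
strict = [code for code in shop_codes if regexp_matches(code, "s.*1")]
relaxed = [code for code in shop_codes if regexp_matches(code, "s.*1", case_insensitive=True)]
print(strict, relaxed)
```

The term xsc00001 contains a match but is not matched as a whole, so it is rejected in both runs; SC00001 is accepted only when case_insensitive is on.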
  • term query
# Returns documents that contain the exact term; avoid using term on text fields
GET my_goods_20210423/_search
{
  "query": {
    "term": {
      "brandName": {
        "value": "三星",
        "boost": 1.0
      }
    }
  }
}
  • terms query
    Returns documents that contain one or more of the exact terms
GET /my_goods_20210423/_search
{
  "query": {
    "terms": {
      "brandName": [ "美國", "三星" ],
      "boost": 1.0
    }
  }
}
  • terms_set query
    Returns documents that contain at least a minimum number of the exact terms. terms_set is like the terms query, except that it also defines how many terms must match
# Define a new product index
PUT /my_goods_info
{
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword"
      },
      "sale_property": {
        "type": "keyword"
      },
      "required_matches": {
        "type": "long"
      }
    }
  }
}

# Add 3 test products
# Sale properties: white, 64G, standard
PUT /my_goods_info/_doc/1?refresh
{
  "name": "apple",
  "sale_property": [ "white", "64","standard" ],
  "required_matches": 2
}
# Black, 32G, non-standard
PUT /my_goods_info/_doc/2?refresh
{
  "name": "apple",
  "sale_property": [ "black", "32","no standard" ],
  "required_matches": 2
}
# Black, 64G, non-standard
PUT /my_goods_info/_doc/3?refresh
{
  "name": "apple",
  "sale_property": [ "black", "64","no standard" ],
  "required_matches": 2
}
# Query
GET /my_goods_info/_search
{
  "query": {
    "terms_set": {
      "sale_property": {
        "terms": [ "white", "64"],
        "minimum_should_match_field": "required_matches"
      }
    }
  }
}
# Response
"hits" : [
      {
        "_index" : "my_goods_info",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.1149836,
        "_source" : {
          "name" : "apple",
          "sale_property" : [
            "white",
            "64",
            "standard"
          ],
          "required_matches" : 2
        }
      }
    ]
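
The terms_set rule is easy to check by hand: count how many of the query terms a document contains and compare that count with the number stored in the document's own required_matches field. The sketch below does exactly that on the three test products; the function name terms_set_matches is an illustrative assumption.

```python
def terms_set_matches(doc, field, terms, minimum_should_match_field):
    """True when enough query terms appear in the document's field."""
    values = set(doc[field])
    matched = sum(1 for term in terms if term in values)
    # The required number of matches is read from the document itself.
    return matched >= doc[minimum_should_match_field]


goods = {
    1: {"name": "apple", "sale_property": ["white", "64", "standard"], "required_matches": 2},
    2: {"name": "apple", "sale_property": ["black", "32", "no standard"], "required_matches": 2},
    3: {"name": "apple", "sale_property": ["black", "64", "no standard"], "required_matches": 2},
}
hits = [doc_id for doc_id, doc in goods.items()
        if terms_set_matches(doc, "sale_property", ["white", "64"], "required_matches")]
print(hits)
```

Document 3 matches only "64", one term short of its required_matches of 2, and document 2 matches neither term.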
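  • wildcard query
    Returns documents that contain terms matching the wildcard pattern
# Find shop codes that start with sc and end with 1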
GET /my_goods_20210423/_search
{
  "query": {
    "wildcard": {
      "shopCode": {
        "value": "sc*1",
        "boost": 1.0,
        "rewrite": "constant_score"
      }
    }
  }
}