
elasticsearch Search APIs

URL Search API

Syntax:

get <index_name>/_search

post <index_name>/_search
{

}

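Notes:

<index_name> can be omitted; without it, the search covers every index in the cluster.
<index_name> supports wildcards; for example, user* searches all indices whose names start with user.
<index_name> accepts several names separated by ASCII (half-width) commas; for example, user1,user2/_search searches the two indices user1 and user2.
GET requests can carry request parameters in the URL, using Query String Syntax.
POST/GET requests can carry a Request Body, using the Query Domain Specific Language (DSL).
For details, see the Search API reference.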

Query String Syntax

demo:

# Get movies from 2012
get movies/_search?q=2012&df=year&sort=year:desc&from=0&size=10&timeout=1s

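q specifies the query, written in Query String Syntax.
df specifies the default field to search when q names no field.
sort specifies the sort order.
from and size control pagination.
q supports field-specific, exact, and fuzzy queries.
- Exact query on a single field: q=k:v, for example q=year:2012
- Generic query against _all, that is, every field: q=v, for example get movies/_search?q=2012
- Term query: Beautiful Mind is equivalent to Beautiful OR Mind
- Phrase query: "Beautiful Mind" is equivalent to Beautiful AND Mind, and the terms must also appear in the same order
- Combining conditions:
  - Prefix operators: q=+k1:v1 -k2:v2 k3:v3. A + prefix means the condition must match; a - prefix means it must not match; conditions without + or - are optional, and the more of them match, the more relevant the document. Example: get movies/_search?q=+year:2012 -title:"Bullet to the Head"
  - Boolean operators: AND / OR / NOT, or && / || / !. Note: AND, OR, and NOT must be uppercase.
- Range queries:
  - Interval notation: [] marks a closed (inclusive) bound, {} marks an open (exclusive) bound
    - year:{2012 TO 2018]
    - year:[* TO 2018]
  - Comparison notation:
    - year:>2012
    - year:(>2012 && <=2018)
    - year:(+>2010 +<=2018)
- Wildcard queries (inefficient and memory-hungry, not recommended, especially with a leading wildcard)
  - ? matches 1 character, * matches 0 or more characters, for example GET /movies/_search?q=title:b*
- Regular expression queries (inefficient, not recommended); the pattern is wrapped in forward slashes: GET /movies/_search?q=title:/[bt]oy/
- Fuzzy and proximity queries:
  - ~ after a word tolerates one or two wrong letters and returns results by similarity; the maximum edit distance is 2. After a quoted phrase, ~ sets how far apart the words may be.
    - GET /movies/_search?q=title:beautifl~1
    - GET /movies/_search?q=title:"Lord Rings"~2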
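
When the q parameter is sent from a program rather than typed into a console, operators such as + and the quotes need URL encoding, because a raw + in a query string is decoded as a space. A minimal sketch in Python using only the standard library; the helper name build_search_url is hypothetical, and the index and field names are taken from the examples above:

```python
from urllib.parse import urlencode


def build_search_url(index, q, **params):
    """Return a URI-search path with every parameter percent-encoded."""
    query = {"q": q, **params}
    return f"/{index}/_search?{urlencode(query)}"


# "+" becomes %2B and ":" becomes %3A, so the operators survive decoding.
url = build_search_url("movies", '+year:2012 -title:"Bullet to the Head"', size=10)
print(url)
```

The path still has to be appended to the address of a cluster, which is left out here on purpose.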

Query Domain Specific Language(DSL)

Example:

# Find movies released in 2005
get movies/_search?q=year:2005

post movies/_search
{
  "query":{
    "match": {"year": 2005}
  }
}

Pagination:
- ```
{
    "from": 10,
    "size": 20,
    "query": {
        "match_all": {}
    }
}
```
- from starts at 0 and 10 results are returned by default; fetching pages deep in the result set is expensive.
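
The relation between a page number and from/size can be sketched as follows. This is an illustration only: page_request is a hypothetical helper, and the limit of 10000 is the default of the index.max_result_window setting, which a given index may override.

```python
MAX_RESULT_WINDOW = 10000  # default value of index.max_result_window


def page_request(page, size=10):
    """Build a from/size request body for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    start = (page - 1) * size
    if start + size > MAX_RESULT_WINDOW:
        # Deep pages are rejected by default; search_after or scroll fit better.
        raise ValueError("from + size exceeds the result window")
    return {"from": start, "size": size, "query": {"match_all": {}}}


print(page_request(page=2, size=20))
```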
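Sorting
- Sort on numeric or date fields where possible
- When sorting on a multi-valued or analyzed field, the system picks one of the values, and you cannot tell which one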
- ```
{
    "sort": [{"order_date": "desc"}]
}
```
_source filtering
- If _source is not stored, only the metadata of the matching documents is returned
- _source supports wildcards: "_source": ["name*", "desc*"]
- ```
{
    "_source": ["order_date", "category_keyword"]
}
```

Script fields

{
    "script_fields": {
        "new_field": {
            "script":{
                "lang": "painless",
                "source": "doc['order_date'].value+'hello'"
            }
        }
    }
}

Use case: orders carry different exchange rates, and the order prices need to be sorted after applying the exchange rate.

Term-Level Queries

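A term is the smallest unit that carries meaning. Both search and natural language processing with statistical language models operate on terms.
Term Level Query: Term Query / Range Query / Exists Query / Prefix Query / Wildcard Query
In Elasticsearch, a Term Query does not analyze its input. The input is treated as a single unit and looked up as an exact term in the inverted index, and every document containing that term receives a relevance score from the scoring formula.
Constant Score can turn the query into a filter, which skips scoring and can use the cache, improving performance.

Example:

Create a products index and insert 3 documents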

DELETE products
PUT products
{
  "settings": {
    "number_of_shards": 1
  }
}


POST /products/_bulk
{ "index": { "_id": 1 }}
{ "productID" : "XHDK-A-1293-#fJ3","desc":"iPhone" }
{ "index": { "_id": 2 }}
{ "productID" : "KDKE-B-9947-#kL5","desc":"iPad" }
{ "index": { "_id": 3 }}
{ "productID" : "JODL-X-1937-#pV7","desc":"MBP" }

Term Query

Use a Term Query to find documents whose desc is iPhone

POST /products/_search
{
  "query": {
    "term": {
      "desc": {
        "value":"iPhone"
      }
    }
  }
}

Result:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

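Question: a document clearly has desc set to iPhone, so why does the query return nothing?

Answer:

When a document is indexed, its text fields are analyzed. The default Standard Analyzer lowercases the tokens, so desc is indexed as iphone. A Term Query does not analyze its input, so the uppercase P is never converted to a lowercase p and iPhone finds no match. Querying for iphone returns the document: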

POST /products/_search
{
  "query": {
    "term": {
      "desc": {
        "value":"iphone"
      }
    }
  }
}

Use a Term Query to search by productID

POST /products/_search
{
  "query": {
    "term": {
      "productID": {
        "value": "XHDK-A-1293-#fJ3"
      }
    }
  }
}

Result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

Question: why does the query return nothing?

Answer:

Run the analyzer on the text XHDK-A-1293-#fJ3 to see how it is tokenized:

post _analyze
{
  "analyzer": "standard",
  "text": "XHDK-A-1293-#fJ3"
}

Result:

{
  "tokens" : [
    {
      "token" : "xhdk",
      "start_offset" : 0,
      "end_offset" : 4,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "a",
      "start_offset" : 5,
      "end_offset" : 6,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "1293",
      "start_offset" : 7,
      "end_offset" : 11,
      "type" : "<NUM>",
      "position" : 2
    },
    {
      "token" : "fj3",
      "start_offset" : 13,
      "end_offset" : 16,
      "type" : "<ALPHANUM>",
      "position" : 3
    }
  ]
}

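Again, the Term Query does not analyze its input, so the whole string XHDK-A-1293-#fJ3 matches none of the indexed tokens and the result is not what we expected.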
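
The mismatch between analyzed text and an unanalyzed term lookup can be reproduced in a few lines of Python. This is a rough sketch, not Elasticsearch code: analyze is a simplified stand-in for the standard analyzer (split on non-alphanumeric characters, then lowercase), and the function names are hypothetical.

```python
import re


def analyze(text):
    """Simplified stand-in for the standard analyzer."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def build_inverted_index(docs, field):
    """Map every analyzed token of `field` to the ids of documents containing it."""
    index = {}
    for doc_id, doc in docs.items():
        for token in analyze(doc[field]):
            index.setdefault(token, set()).add(doc_id)
    return index


def term_query(index, value):
    """A term query looks the input up as-is, without analyzing it."""
    return sorted(index.get(value, set()))


docs = {
    1: {"productID": "XHDK-A-1293-#fJ3", "desc": "iPhone"},
    2: {"productID": "KDKE-B-9947-#kL5", "desc": "iPad"},
    3: {"productID": "JODL-X-1937-#pV7", "desc": "MBP"},
}
desc_index = build_inverted_index(docs, "desc")
product_index = build_inverted_index(docs, "productID")

print(term_query(desc_index, "iPhone"))               # no match: only "iphone" is indexed
print(term_query(desc_index, "iphone"))               # matches document 1
print(term_query(product_index, "XHDK-A-1293-#fJ3"))  # no match: the value was split into tokens
print(term_query(product_index, "xhdk"))              # matches document 1
```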

If you run the following request instead:

POST /products/_search
{
  "query": {
    "term": {
      "productID": {
        "value": "xhdk"
      }
    }
  }
}

the matching document is returned.

To match the whole value exactly, run the following request:

POST /products/_search
{
  "query": {
    "term": {
      "productID.keyword": {
        "value": "XHDK-A-1293-#fJ3"
      }
    }
  }
}

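Why does adding keyword make the exact match work?

It comes from the index mapping: dynamic mapping gives each text field a keyword sub-field that stores the original value without analysis.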

GET /products/_mapping

Result:

{
  "products" : {
    "mappings" : {
      "properties" : {
        "desc" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "productID" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}

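A Term Query also computes a score, which costs performance; the scoring step can be skipped:
- Convert the Query into a Filter to skip the TF-IDF computation and avoid the cost of relevance scoring
- Filters can make effective use of the cache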
```
POST /products/_search
{
  "explain": true,
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "productID.keyword": "XHDK-A-1293-#fJ3"
        }
      }
    }
  }
}
```

Structured Search

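Searching structured data
- Dates, booleans, and numbers are all structured
Text can be structured too
- Colored pens come in a discrete set of colors: red, green, blue
- A blog post may carry tags: distributed, search
- Products on an e-commerce site have UPCs (Universal Product Codes) or other unique identifiers, which must follow a strict, structured format.
Structured data such as booleans, times, dates, and numbers has a precise format, and we can run logical operations against that format.
Structured text supports exact matching or partial matching
- Term Query / Prefix Query
A structured search result has only two values: yes or no
- Depending on the scenario, you decide whether a structured search needs scoring.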

Boolean

Data preparation:

DELETE products
POST /products/_bulk
{ "index": { "_id": 1 }}
{ "price" : 10,"avaliable":true,"date":"2018-01-01", "productID" : "XHDK-A-1293-#fJ3" }
{ "index": { "_id": 2 }}
{ "price" : 20,"avaliable":true,"date":"2019-01-01", "productID" : "KDKE-B-9947-#kL5" }
{ "index": { "_id": 3 }}
{ "price" : 30,"avaliable":true, "productID" : "JODL-X-1937-#pV7" }
{ "index": { "_id": 4 }}
{ "price" : 30,"avaliable":false, "productID" : "QQPX-R-3956-#aD8" }

GET products/_mapping

Example:

# term query on a boolean value, with scoring
POST products/_search
{
  "profile": "true",
  "explain": true,
  "query": {
    "term": {
      "avaliable": true
    }
  }
}

# boolean value wrapped in constant_score to turn it into a filter, no scoring
POST products/_search
{
  "profile": "true",
  "explain": true,
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "avaliable": true
        }
      }
    }
  }
}

Numeric Range

gt greater than
lt less than
gte greater than or equal to
lte less than or equal to

# term query on a numeric field
POST products/_search
{
  "profile": "true",
  "explain": true,
  "query": {
    "term": {
      "price": 30
    }
  }
}

# terms query on a numeric field
POST products/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "terms": {
          "price": [
            "20",
            "30"
          ]
        }
      }
    }
  }
}

# range query on a numeric field
GET products/_search
{
    "query" : {
        "constant_score" : {
            "filter" : {
                "range" : {
                    "price" : {
                        "gte" : 20,
                        "lte"  : 30
                    }
                }
            }
        }
    }
}

Date Range

Expression	Description
`y`	Years
`M`	Months
`w`	Weeks
`d`	Days
`h`	Hours
`H`	Hours
`m`	Minutes
`s`	Seconds

Assume now is 2021-07-04 12:00:00

Expression	Result
`now+1h`	`2021-07-04 13:00:00`
`now-1h`	`2021-07-04 11:00:00`
`2021.07.04||+1M/d`	`2021-08-04 00:00:00`
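
How such an expression is evaluated can be sketched in Python. This is a deliberately small subset written for illustration, not the Elasticsearch parser: it only accepts an anchor of now, the fixed-length units s, m, h, H, d, and w, and an optional trailing /d; months, years, and explicit anchor dates such as 2021.07.04|| are left out. The name eval_date_math is hypothetical.

```python
import re
from datetime import datetime, timedelta

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours",
          "H": "hours", "d": "days", "w": "weeks"}


def eval_date_math(expr, now):
    """Evaluate `now`, followed by +N<unit> / -N<unit> steps and an optional `/d`."""
    match = re.fullmatch(r"now((?:[+-]\d+[smhHdw])*)(/d)?", expr)
    if match is None:
        raise ValueError(f"unsupported expression: {expr}")
    value = now
    for sign, amount, unit in re.findall(r"([+-])(\d+)([smhHdw])", match.group(1)):
        delta = timedelta(**{_UNITS[unit]: int(amount)})
        value = value + delta if sign == "+" else value - delta
    if match.group(2):
        # "/d" rounds down to the start of the day
        value = value.replace(hour=0, minute=0, second=0, microsecond=0)
    return value


now = datetime(2021, 7, 4, 12, 0, 0)
print(eval_date_math("now+1h", now))
print(eval_date_math("now-1h", now))
print(eval_date_math("now-1d/d", now))
```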

Example:

POST products/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "date": {
            "gte": "now-5y"
          }
        }
      }
    }
  }
}

Exists

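An exists query does not match a document when the field has no indexed value:

The field is missing, or its value is null or []
The field is considered to exist, and the document matches, in the following cases:
- Empty strings such as "" or "-"
- Arrays that contain null together with another value, such as [null, "foo"]
- A custom null_value defined in the index mapping

POST products/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "exists": {
          "field": "date"
        }
      }
    }
  }
}

POST products/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must_not": {
            "exists": {
              "field": "date"
            }
          }
        }
      }
    }
  }
}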

Terms

Finds documents that contain any of several exact values. Note that the match means contains, not equals.

PUT my-index-000001
{
  "mappings": {
    "properties": {
      "color": { "type": "keyword" }
    }
  }
}

PUT my-index-000001/_bulk
{"index": {"_id": 1}}
{"color": ["blue", "green"]}
{"index": {"_id": 2}}
{"color": "blue"}

GET my-index-000001/_search?pretty
{
  "query": {
    "terms": {
      "color": {
        "index": "my-index-000001",
        "id": "2",
        "path": "color"
      }
    }
  }
}

POST movies/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "genre.keyword": "Comedy"
        }
      }
    }
  }
}

POST products/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "terms": {
          "productID.keyword": [
            "QQPX-R-3956-#aD8",
            "JODL-X-1937-#pV7"
          ]
        }
      }
    }
  }
}

Full Text Query

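Types of Full Text Query
- Match Query
- Match Phrase Query
- Query String Query
- Multi Match Query
- Simple Query String Query
Characteristics
- Text is analyzed at both index time and search time: the query string is first passed to a suitable analyzer, which produces a list of terms to search for.
- At query time, the input is analyzed first, each term is then looked up individually at the low level, and the results are merged, producing a score for every document.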

Query String Query

Similar to the URL Search API described above.

GET /movies/_search
{
  "profile": true,
  "query": {
    "query_string": {
      "default_field": "title",
      "query": "Beautiful AND Mind"
    }
  }
}

GET /movies/_search
{
  "profile": true,
  "query": {
    "query_string": {
      "fields": [
        "title",
        "year"
      ],
      "query": "2012"
    }
  }
}

Simple Query String Query

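Similar to Query String, but it ignores invalid syntax.
Only part of the query syntax is supported
AND, OR, and NOT are not supported and are treated as plain strings
The default relation between terms is OR; an operator can be specified
Some logic is supported
- + replaces AND
- | replaces OR
- - replaces NOT

GET /movies/_search
{
  "profile": true,
  "query": {
    "simple_query_string": {
      "query": "Beautiful +mind",
      "fields": ["title"]
    }
  }
}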

Match Query

# Find movies whose title contains Beautiful OR Mind
POST movies/_search
{
  "query": {
    "match": {
      "title": {
        "query": "Beautiful Mind"
      }
    }
  }
}

# Find movies whose title contains Beautiful AND Mind
POST movies/_search
{
  "query": {
    "match": {
      "title": {
        "query": "Beautiful Mind",
        "operator": "AND"
      }
    }
  }
}

Match Phrase Query

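Unlike a Match Query, the analyzed terms are not searched independently: the query text is still analyzed, but the resulting terms must all appear in the field in the same order, as one complete phrase.

POST movies/_search
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "one I love"
      }
    }
  }
}

POST movies/_search
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "one love",
        "slop": 1
      }
    }
  }
}

Such exact matching is too strict in most cases; sometimes we want a document containing "I like swimming and riding!" to match "I like riding" as well. The "slop" parameter controls how flexible the phrase match is.

The slop parameter tells match_phrase how far apart the terms may be while the document still counts as a match. What does "how far apart" mean? It means: how many times would you have to move a term to make the query and the document line up?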
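
The idea of moving terms can be made concrete with a small calculation. The sketch below is a simplified model, not Lucene's sloppy phrase matcher: it assumes whitespace tokenization, lowercase input, and no repeated terms in the phrase, and it measures how far the terms drift from their expected relative positions. The name min_slop is hypothetical.

```python
from itertools import product


def min_slop(doc_tokens, phrase_tokens):
    """Return the smallest slop under which the phrase matches, or None."""
    positions = []
    for term in phrase_tokens:
        found = [i for i, token in enumerate(doc_tokens) if token == term]
        if not found:
            return None  # every phrase term must occur in the document
        positions.append(found)
    best = None
    for combo in product(*positions):
        # Subtract each term's index in the phrase; an exact phrase gives equal offsets.
        offsets = [pos - idx for idx, pos in enumerate(combo)]
        spread = max(offsets) - min(offsets)
        if best is None or spread < best:
            best = spread
    return best


print(min_slop("one i love".split(), "one love".split()))
print(min_slop("i like swimming and riding".split(), "i like riding".split()))
```

Under this model, "one love" needs a slop of 1 to match "one I love", which agrees with the second request above.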

Multi Match Query

The multi_match query builds on the match query; what matters is that it can query several fields at once.

Type	Description	Notes
`Best Fields`	Finds documents that match any field, but uses the _score of the best-matching field	For fields that compete with each other yet are related. The score comes from the best-matching field.
`Most Fields`	For cases where several fields contain the same text; the scores of all fields are combined	A common technique for English content: the main field uses the English Analyzer to extract stems and match more documents, while a sub-field holding the same text uses the Standard Analyzer for more precise matching. The other fields act as signals that raise the relevance of matching documents; the more fields match, the better. operator cannot be used. copy_to can solve this, at the cost of extra storage.
`Cross Fields`	Analyzes the query string into a list of terms first, then searches for each term in all fields; a hit in any field counts as a match.	For entities such as person names, addresses, or book information, where the information is spread over several fields and a single field is only part of the whole. The goal is to find as many terms as possible in any of the listed fields. operator is supported. Compared with copy_to, it can boost individual fields at search time.
`phrase`	Same as match_phrase + best_fields
`phrase_prefix`	Same as match_phrase_prefix + best_fields
`bool_prefix`	Same as match_bool_prefix + most_fields

POST blogs/_search
{
  "query": {
    "dis_max": {
      "queries": [
        { "match": { "title": "Quick pets" }},
        { "match": { "body":  "Quick pets" }}
      ],
      "tie_breaker": 0.2
    }
  }
}

POST blogs/_search
{
  "query": {
    "multi_match": {
      "type": "best_fields",
      "query": "Quick pets",
      "fields": ["title", "body"],
      "tie_breaker": 0.2,
      "minimum_should_match": "20%"
    }
  }
}

POST books/_search
{
  "query": {
    "multi_match": {
      "query":  "Quick brown fox",
      "fields": "*_title"
    }
  }
}

POST books/_search
{
  "query": {
    "multi_match": {
      "query":  "Quick brown fox",
      "fields": [ "*_title", "chapter_title^2" ]
    }
  }
}

DELETE /titles
PUT /titles
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "english",
        "fields": {"std": {"type": "text", "analyzer": "standard"}}
      }
    }
  }
}

POST titles/_bulk
{ "index": { "_id": 1 }}
{ "title": "My dog barks" }
{ "index": { "_id": 2 }}
{ "title": "I see a lot of barking dogs on the road " }

GET /titles/_search
{
  "query": {
    "multi_match": {
      "query":  "barking dogs",
      "type":   "most_fields",
      "fields": [ "title", "title.std" ]
    }
  }
}

GET /titles/_search
{
  "query": {
    "multi_match": {
      "query":  "barking dogs",
      "type":   "most_fields",
      "fields": [ "title^10", "title.std" ]
    }
  }
}

Compound queries

Query Context & Filter Context

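Advanced search features: several text inputs, searching across several fields.
Search engines usually also offer filtering by conditions such as time or price
Elasticsearch has two different contexts, Query and Filter
- Query Context: computes relevance scores
- Filter Context: no scoring, can use the cache, and performs better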

Boolean Query

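Example:

Suppose we want to search for a movie that meets the following conditions
- The reviews mention Guitar, the user rating is above 3, and the release date is between 1993 and 2000
This search contains 3 pieces of logic
- The review field must contain Guitar
- The user rating field must be above 3
- The release date field must fall within the given range

Characteristics:

A boolean query is a combination of one or more query clauses
- There are 4 clause types in total; 2 of them affect scoring and 2 do not
  - must: must match, contributes to the score
  - should: optional match, contributes to the score
  - must_not: Filter Context clause, must not match, does not contribute to the score
  - filter: Filter Context, must match, but does not contribute to the score
Relevance is not exclusive to full-text search; it also applies to yes|no clauses: the more clauses match, the higher the relevance score. When several query clauses are merged into one compound query such as a boolean query, the score computed for each clause is combined into the overall relevance score.
Competing fields at the same level carry the same weight
Nesting boolean queries changes their influence on the score
Nesting a must_not sub-query inside should implements "should not" logic

Syntax:

Sub-queries can appear in any order
Sub-queries can be nested
If there is no must clause, at least one of the should clauses must match; should takes an array

POST /products/_search
{
  "query": {
    "bool": {
      "must": {
        "term": { "price": "30" }
      },
      "filter": {
        "term": { "avaliable": "true" }
      },
      "must_not": {
        "range": {
          "price": { "lte": 10 }
        }
      },
      "should": [
        { "term": { "productID.keyword": "JODL-X-1937-#pV7" } },
        { "term": { "productID.keyword": "XHDK-A-1293-#fJ3" } }
      ],
      "minimum_should_match": 1
    }
  }
}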
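
The matching rules of the four clause types can be sketched in Python, leaving scoring aside. This is an illustration under assumptions, not Elasticsearch code: clauses are modeled as plain predicates, bool_match is a hypothetical name, and the default for minimum_should_match follows the rule stated above (1 when there is no must or filter clause, otherwise 0).

```python
def bool_match(doc, must=(), filters=(), must_not=(), should=(),
               minimum_should_match=None):
    """Decide whether a document matches a bool query (scores are ignored)."""
    if not all(clause(doc) for clause in must):
        return False
    if not all(clause(doc) for clause in filters):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    if minimum_should_match is None:
        required_only = bool(must) or bool(filters)
        minimum_should_match = 0 if required_only or not should else 1
    return sum(1 for clause in should if clause(doc)) >= minimum_should_match


products = [
    {"id": 1, "price": 10, "avaliable": True, "productID": "XHDK-A-1293-#fJ3"},
    {"id": 2, "price": 20, "avaliable": True, "productID": "KDKE-B-9947-#kL5"},
    {"id": 3, "price": 30, "avaliable": True, "productID": "JODL-X-1937-#pV7"},
    {"id": 4, "price": 30, "avaliable": False, "productID": "QQPX-R-3956-#aD8"},
]

# The same clauses as the request above, written as predicates.
clauses = dict(
    must=[lambda d: d["price"] == 30],
    filters=[lambda d: d["avaliable"] is True],
    must_not=[lambda d: d["price"] <= 10],
    should=[lambda d: d["productID"] == "JODL-X-1937-#pV7",
            lambda d: d["productID"] == "XHDK-A-1293-#fJ3"],
)

one = [d["id"] for d in products if bool_match(d, minimum_should_match=1, **clauses)]
two = [d["id"] for d in products if bool_match(d, minimum_should_match=2, **clauses)]
print(one, two)
```

With minimum_should_match set to 2, no product can satisfy both should clauses, so the result is empty.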

How to solve the problem left over from the Terms Query: it matches on contains rather than equals.

Add a count field and solve it with a boolean query

# Change the data model by adding a field, so that an array can be matched exactly rather than by containment
POST /newmovies/_bulk
{ "index": { "_id": 1 }}
{ "title" : "Father of the Bridge Part II","year":1995, "genre":"Comedy","genre_count":1 }
{ "index": { "_id": 2 }}
{ "title" : "Dave","year":1993,"genre":["Comedy","Romance"],"genre_count":2 }

# must, scored
POST /newmovies/_search
{
  "query": {
    "bool": {
      "must": [
        {"term": {"genre.keyword": {"value": "Comedy"}}},
        {"term": {"genre_count": {"value": 1}}}
      ]
    }
  }
}

# filter, not scored; the score of the results is 0
POST /newmovies/_search
{
  "query": {
    "bool": {
      "filter": [
        {"term": {"genre.keyword": {"value": "Comedy"}}},
        {"term": {"genre_count": {"value": 1}}}
      ]
    }
  }
}

# Query Context
POST /products/_search
{
  "query": {
    "bool": {
      "should": [
        {"term": {"productID.keyword": {"value": "JODL-X-1937-#pV7"}}},
        {"term": {"avaliable": {"value": true}}}
      ]
    }
  }
}

# Nesting, implements "should not" logic
POST /products/_search
{
  "query": {
    "bool": {
      "must": {
        "term": {
          "price": "30"
        }
      },
      "should": [
        {
          "bool": {
            "must_not": {
              "term": {
                "avaliable": "false"
              }
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}

# Control the precision
POST _search
{
  "query": {
    "bool": {
      "must": {
        "term": { "price": "30" }
      },
      "filter": {
        "term": { "avaliable": "true" }
      },
      "must_not": {
        "range": {
          "price": { "lte": 10 }
        }
      },
      "should": [
        { "term": { "productID.keyword": "JODL-X-1937-#pV7" } },
        { "term": { "productID.keyword": "XHDK-A-1293-#fJ3" } }
      ],
      "minimum_should_match": 2
    }
  }
}

Boosting Query

Boosting is a way to control relevance
- It can be applied to an index, a field, or a query clause
Meaning of the boost parameter
- When boost > 1, the relative weight of the score increases
- When 0 < boost < 1, the relative weight of the score decreases
- When boost < 0, it contributes a negative score (accepted only by older versions; Elasticsearch 7.0 and later reject negative boosts)
Use it when results containing certain content should not disappear, but rank lower.

Example:

DELETE blogs
POST /blogs/_bulk
{ "index": { "_id": 1 }}
{"title":"Apple iPad", "content":"Apple iPad,Apple iPad" }
{ "index": { "_id": 2 }}
{"title":"Apple iPad,Apple iPad", "content":"Apple iPad" }

POST blogs/_search
{
  "query": {
    "bool": {
      "should": [
        {"match": {"title": {"query": "apple,ipad", "boost": 1.1}}},
        {"match": {"content": {"query": "apple,ipad", "boost": 2}}}
      ]
    }
  }
}

DELETE news
POST /news/_bulk
{ "index": { "_id": 1 }}
{ "content":"Apple Mac" }
{ "index": { "_id": 2 }}
{ "content":"Apple iPad" }
{ "index": { "_id": 3 }}
{ "content":"Apple employee like Apple Pie and Apple Juice" }

POST news/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {"content": "apple"}
      }
    }
  }
}

POST news/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {"content": "apple"}
      },
      "must_not": {
        "match": {"content": "pie"}
      }
    }
  }
}

POST news/_search
{
  "query": {
    "boosting": {
      "positive": {
        "match": {
          "content": "apple"
        }
      },
      "negative": {
        "match": {
          "content": "pie"
        }
      },
      "negative_boost": 0.5
    }
  }
}

positive: required, a query object; the query clause to run. Every returned result satisfies this clause.
negative: required, a query object; the clause used to lower the relevance score of the documents it matches.
negative_boost: required, a float between 0 and 1.0; the factor used to lower the relevance score of documents that match negative.
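
The effect of negative_boost on ranking can be sketched as below. The positive scores are made-up numbers chosen only for illustration, and boosting_score is a hypothetical helper; the one rule it encodes is that a document matching the negative query has its positive score multiplied by negative_boost.

```python
def boosting_score(positive_score, matches_negative, negative_boost):
    """Score of a boosting query for one document that matched `positive`."""
    if not 0 <= negative_boost <= 1:
        raise ValueError("negative_boost must be between 0 and 1.0")
    if matches_negative:
        return positive_score * negative_boost
    return positive_score


# Hypothetical positive scores for the three news documents above.
news = [
    {"id": 1, "content": "Apple Mac", "positive_score": 0.2},
    {"id": 2, "content": "Apple iPad", "positive_score": 0.2},
    {"id": 3, "content": "Apple employee like Apple Pie and Apple Juice", "positive_score": 0.3},
]

ranked = sorted(
    news,
    key=lambda d: boosting_score(d["positive_score"], "pie" in d["content"].lower(), 0.5),
    reverse=True,
)
print([d["id"] for d in ranked])
```

Document 3 stays in the result but drops behind the other two, which is the behaviour must_not cannot provide.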

Constant Score Query

Disjunction Max Query

Example of a single-string query

PUT /blogs/_doc/1
{
  "title": "Quick brown rabbits",
  "body":  "Brown rabbits are commonly seen."
}

PUT /blogs/_doc/2
{
  "title": "Keeping pets healthy",
  "body":  "My quick brown fox eats rabbits on a regular basis."
}

POST /blogs/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "title": "Brown fox" }},
        { "match": { "body":  "Brown fox" }}
      ]
    }
  }
}

Expectation:

title: Brown appears in document 1

body: Brown appears in document 1, and Brown fox appears in document 2 in the same order as the query, so at a glance document 2 should get the highest relevance score.

Result:

Document 1 scores higher than document 2.

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.90425634,
    "hits" : [
      {
        "_index" : "blogs",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.90425634,
        "_source" : {
          "title" : "Quick brown rabbits",
          "body" : "Brown rabbits are commonly seen."
        }
      },
      {
        "_index" : "blogs",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.77041256,
        "_source" : {
          "title" : "Keeping pets healthy",
          "body" : "My quick brown fox eats rabbits on a regular basis."
        }
      }
    ]
  }
}

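Scoring process:

Run both queries in the should clause
Add the scores of the two queries
Multiply by the number of matching clauses
Divide by the total number of clauses

Use explain to inspect the query result and its analysis

title and body compete with each other, so their scores should not simply be added up; instead, the score of the single best-matching field should be used. A Disjunction Max Query returns every document that matches any of its queries, and uses the score of the best-matching field as the final score.

POST blogs/_search
{
  "query": {
    "dis_max": {
      "queries": [
        { "match": { "title": "Brown fox" }},
        { "match": { "body":  "Brown fox" }}
      ]
    }
  }
}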

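The results now match the expectation.

The tie_breaker parameter:

Take the _score of the best-matching clause.
Multiply the score of every other matching clause by tie_breaker.
Sum the scores above and normalize.
tie_breaker is a float between 0 and 1: 0 means only the best match is used, 1 means all clauses are equally important.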
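
The steps above can be written as a short function. It is a sketch of the arithmetic only: the per-clause scores are made-up inputs, dis_max_score is a hypothetical name, and the normalization step is left out.

```python
def dis_max_score(clause_scores, tie_breaker=0.0):
    """Best clause score plus tie_breaker times the scores of the other matching clauses."""
    if not 0 <= tie_breaker <= 1:
        raise ValueError("tie_breaker must be between 0 and 1")
    if not clause_scores:
        return 0.0
    best = max(clause_scores)
    others = sum(clause_scores) - best
    return best + tie_breaker * others


scores = [0.5, 0.9]  # hypothetical scores of the title clause and the body clause
print(dis_max_score(scores))                   # only the best field counts
print(dis_max_score(scores, tie_breaker=0.2))  # the other field adds a little
print(dis_max_score(scores, tie_breaker=1.0))  # same as adding both scores
```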

Function Score Query

Scoring and sorting

By default, Elasticsearch sorts results by relevance score
You can sort on one or more fields
Sorting by relevance score alone cannot satisfy some specific requirements
- There is no way to gain finer control over the ordering based on relevance

Function Score Query

After the query runs, every matching document can be re-scored through a series of functions and re-sorted by the new score.
Functions
- weight: gives every document a simple weight that is not normalized
- Field Value Factor: uses a field value to modify _score, for example using "popularity" and "likes" as scoring factors
- Random Score: gives each user a different, random ranking
- Decay functions: take a field value as the reference; the closer to a given value, the higher the score
- Script Score: a custom script that fully controls the logic
Boost Mode
- Multiply: the product of the score and the function value
- Sum: the sum of the score and the function value
- Min/Max: the minimum or maximum of the score and the function value
- Replace: the function value replaces the score
Max Boost caps the score at a maximum value
Consistent random function:
- Use case: ads on a website need more exposure
- Requirement: every user sees a different random ordering, while the same user sees the same relative order on every visit

DELETE blogs
PUT /blogs/_doc/1
{
  "title":   "About popularity",
  "content": "In this post we will talk about...",
  "votes":   0
}

PUT /blogs/_doc/2
{
  "title":   "About popularity",
  "content": "In this post we will talk about...",
  "votes":   100
}

PUT /blogs/_doc/3
{
  "title":   "About popularity",
  "content": "In this post we will talk about...",
  "votes":   1000000
}

POST /blogs/_search
{
  "query": {
    "function_score": {
      "query": {
        "multi_match": {
          "query":  "popularity",
          "fields": [ "title", "content" ]
        }
      },
      "field_value_factor": {
        "field": "votes"
      }
    }
  }
}

POST /blogs/_search
{
  "query": {
    "function_score": {
      "query": {
        "multi_match": {
          "query":  "popularity",
          "fields": [ "title", "content" ]
        }
      },
      "field_value_factor": {
        "field": "votes",
        "modifier": "log1p"
      }
    }
  }
}

POST /blogs/_search
{
  "query": {
    "function_score": {
      "query": {
        "multi_match": {
          "query":  "popularity",
          "fields": [ "title", "content" ]
        }
      },
      "field_value_factor": {
        "field": "votes",
        "modifier": "log1p",
        "factor": 0.1
      }
    }
  }
}

POST /blogs/_search
{
  "query": {
    "function_score": {
      "query": {
        "multi_match": {
          "query":  "popularity",
          "fields": [ "title", "content" ]
        }
      },
      "field_value_factor": {
        "field": "votes",
        "modifier": "log1p",
        "factor": 0.1
      },
      "boost_mode": "sum",
      "max_boost": 3
    }
  }
}

POST /blogs/_search
{
  "query": {
    "function_score": {
      "random_score": {
        "seed": 911119
      }
    }
  }
}
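
The arithmetic behind field_value_factor, boost_mode, and max_boost can be sketched as follows. This is a simplified model, not Elasticsearch code: only the none and log1p modifiers are covered, log1p is taken to mean the base-10 logarithm of 1 plus the scaled value, max_boost is assumed to cap the function value before it is combined with the query score, and both function names are hypothetical.

```python
import math


def field_value_factor(value, factor=1.0, modifier="none"):
    """Scale a field value by `factor`, then apply the modifier."""
    scaled = factor * value
    if modifier == "none":
        return scaled
    if modifier == "log1p":
        return math.log10(1 + scaled)
    raise ValueError(f"unsupported modifier: {modifier}")


def function_score(query_score, function_value, boost_mode="multiply",
                   max_boost=float("inf")):
    """Combine the query score with the (capped) function value."""
    capped = min(function_value, max_boost)
    combine = {
        "multiply": lambda: query_score * capped,
        "sum": lambda: query_score + capped,
        "min": lambda: min(query_score, capped),
        "max": lambda: max(query_score, capped),
        "replace": lambda: capped,
    }
    return combine[boost_mode]()


for votes in (0, 100, 1000000):
    value = field_value_factor(votes, factor=0.1, modifier="log1p")
    print(votes, function_score(1.0, value, boost_mode="sum", max_boost=3))
```

With the default multiply mode and no modifier, a document with 0 votes ends up with a score of 0, which is why the later requests add log1p and switch to sum.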

Search Template

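Query statements in Elasticsearch
- They are critical to both relevance scoring and query performance
Early in development, the query parameters may be known, but the exact structure of the query DSL often cannot be finalized yet
- A Search Template defines a contract
Each role does its own job, decoupled from the others
- Developers, search engineers, performance engineers

GET _search/template
{
  "source": {
    "query": { "match": { "{{my_field}}": "{{my_value}}" } },
    "size": "{{my_size}}"
  },
  "params": {
    "my_field": "message",
    "my_value": "foo",
    "my_size": 5
  }
}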
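
What the template does with params can be imitated with plain string substitution. The sketch below is not a Mustache implementation: it only replaces {{name}} placeholders, with no sections or escaping, and render is a hypothetical helper.

```python
import json
import re


def render(template, params):
    """Replace every {{name}} placeholder with the matching parameter value."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(params[m.group(1)]), template)


source = '{"query": {"match": {"{{my_field}}": "{{my_value}}"}}, "size": "{{my_size}}"}'
params = {"my_field": "message", "my_value": "foo", "my_size": 5}

body = json.loads(render(source, params))
print(body)
```

The developer only supplies params; the search engineer can change the structure of source without touching the calling code.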

Suggester API

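What are search suggestions
- Modern search engines usually offer a suggest-as-you-type feature
- It helps users with autocompletion or spelling correction while they type, guiding them toward more precise keywords and improving how well documents match in the subsequent search
- On Google, the query is autocompleted at first; once the input reaches a certain length and cannot be completed, for example because of a misspelling, similar words or sentences are suggested instead
API
- Elasticsearch implements this kind of feature through the Suggester API
- How it works: the input text is split into tokens, and similar terms are looked up in the index dictionary and returned.
- Term Suggester (correction: suggests the correct word when the input is misspelled)
- Phrase Suggester (phrase completion: completes a whole phrase from one word)
- Completion Suggester (word completion: completes the whole word from its first part)
- Context Suggester (context-aware completion)
Suggestion Mode
- Missing: no suggestion if the term already exists in the index
- Popular: suggest terms that occur more frequently
- Always: suggest whether or not the term exists
Comparing precision, recall, and performance
- Precision
  - completion > phrase > term
- Recall
  - term > phrase > completion
- Performance
  - completion > phrase > term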

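Term Suggester && Phrase Suggester

The Term Suggester first analyzes the search text into terms, then compares each term with the data in the specified index, computes the edit distance, and returns suggestions.

Edit distance: the algorithm used here is the Levenshtein edit distance. The core idea is how many changes it takes to turn one word into another. For example, getting from elasticseach to elasticsearch requires adding 1 letter, r, which is 1 change, so the edit distance between the two words is 1.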
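
The edit distance described above can be computed with the classic dynamic-programming algorithm. This is a textbook sketch for illustration; it says nothing about how Elasticsearch implements the computation internally.

```python
def levenshtein(source, target):
    """Minimum number of single-character inserts, deletes, and substitutions."""
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # substitute, or keep when equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("elasticseach", "elasticsearch"))
print(levenshtein("lucen", "lucene"))
```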

The Phrase Suggester adds some logic on top of the Term Suggester

Common Phrase Suggester parameters: max_errors is the maximum number of terms that may be misspelled; confidence limits which suggestions are returned and defaults to 1

DELETE articles

POST articles/_bulk
{ "index" : { } }
{ "body": "lucene is very cool"}
{ "index" : { } }
{ "body": "Elasticsearch builds on top of lucene"}
{ "index" : { } }
{ "body": "Elasticsearch rocks"}
{ "index" : { } }
{ "body": "elastic is the company behind ELK stack"}
{ "index" : { } }
{ "body": "Elk stack rocks"}
{ "index" : {} }
{ "body": "elasticsearch is rock solid"}

POST _analyze
{
  "analyzer": "standard",
  "text": ["Elk stack  rocks rock"]
}

POST /articles/_search
{
  "size": 1,
  "query": {
    "match": {
      "body": "lucen rock"
    }
  },
  "suggest": {
    "term-suggestion": {
      "text": "lucen rock",
      "term": {
        "suggest_mode": "missing",
        "field": "body"
      }
    }
  }
}

POST /articles/_search
{
  "suggest": {
    "term-suggestion": {
      "text": "lucen rock",
      "term": {
        "suggest_mode": "popular",
        "field": "body"
      }
    }
  }
}

POST /articles/_search
{
  "suggest": {
    "term-suggestion": {
      "text": "lucen rock",
      "term": {
        "suggest_mode": "always",
        "field": "body"
      }
    }
  }
}

POST /articles/_search
{
  "suggest": {
    "term-suggestion": {
      "text": "lucen hocks",
      "term": {
        "suggest_mode": "always",
        "field": "body",
        "prefix_length": 0,
        "sort": "frequency"
      }
    }
  }
}

POST /articles/_search
{
  "suggest": {
    "my-suggestion": {
      "text": "lucne and elasticsear rock hello world ",
      "phrase": {
        "field": "body",
        "max_errors": 2,
        "confidence": 0,
        "direct_generator": [{
          "field": "body",
          "suggest_mode": "always"
        }],
        "highlight": {
          "pre_tag": "<em>",
          "post_tag": "</em>"
        }
      }
    }
  }
}

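Completion Suggester

The Completion Suggester provides auto-complete. Every character the user types sends a query to the backend to look up matches.
Performance requirements are strict, so Elasticsearch uses a different data structure: instead of the inverted index, the analyzed data is encoded into an FST that is stored alongside the index. Elasticsearch loads the whole FST into memory, which makes lookups very fast
An FST only supports prefix lookups
Define the mapping with the completion type
Index the data
Run the suggest query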

Context Suggester

Extends the Completion Suggester
More context can be added to the search, for example when the input is "star"
- Coffee related: suggest "starbucks"
- Movie related: suggest "star wars"
Two types of context can be defined
- Category: an arbitrary string
- Geo: geographic information
Define the mapping
- type
- name
Index the data and add context information to every document
Run the suggestion query together with the context

DELETE articles
PUT articles
{
  "mappings": {
    "properties": {
      "title_completion": {
        "type": "completion"
      }
    }
  }
}

POST articles/_bulk
{ "index" : { } }
{ "title_completion": "lucene is very cool"}
{ "index" : { } }
{ "title_completion": "Elasticsearch builds on top of lucene"}
{ "index" : { } }
{ "title_completion": "Elasticsearch rocks"}
{ "index" : { } }
{ "title_completion": "elastic is the company behind ELK stack"}
{ "index" : { } }
{ "title_completion": "Elk stack rocks"}

POST articles/_search?pretty
{
  "size": 0,
  "suggest": {
    "article-suggester": {
      "prefix": "elk ",
      "completion": {
        "field": "title_completion"
      }
    }
  }
}

DELETE comments
PUT comments
PUT comments/_mapping
{
  "properties": {
    "comment_autocomplete": {
      "type": "completion",
      "contexts": [{
        "type": "category",
        "name": "comment_category"
      }]
    }
  }
}

POST comments/_doc
{
  "comment": "I love the star war movies",
  "comment_autocomplete": {
    "input": ["star wars"],
    "contexts": {
      "comment_category": "movies"
    }
  }
}

POST comments/_doc
{
  "comment": "Where can I find a Starbucks",
  "comment_autocomplete": {
    "input": ["starbucks"],
    "contexts": {
      "comment_category": "coffee"
    }
  }
}

POST comments/_search
{
  "suggest": {
    "MY_SUGGESTION": {
      "prefix": "sta",
      "completion": {
        "field": "comment_autocomplete",
        "contexts": {
          "comment_category": "coffee"
        }
      }
    }
  }
}

Cross Cluster Search

Pain points of horizontal scaling:

Single cluster:
- When scaling horizontally, the number of nodes cannot grow without limit
- When the cluster meta information (nodes, indices, cluster state) grows too large, updates become heavier, the single active master becomes a performance bottleneck, and the whole cluster can stop working properly
Early versions used a tribe node to access multiple clusters, but it had problems
- A tribe node joins every cluster as a client node, and task changes on a cluster's master node must wait for the tribe node's response before continuing
- A tribe node does not persist cluster state, so initialization is slow after a cluster restart
- When indices in several clusters share a name, only a single prefer rule can be set

Cross Cluster Search

The early tribe node approach had problems, so it was deprecated
Elasticsearch 5.3 introduced cross cluster search
- Any node can act as a federated node and proxy search requests in a lightweight way
- It does not need to join other clusters as a client node

Example:

// Start 3 clusters
bin/elasticsearch -E node.name=cluster0node -E cluster.name=cluster0 -E path.data=cluster0_data -E discovery.type=single-node -E http.port=9200 -E transport.port=9300
bin/elasticsearch -E node.name=cluster1node -E cluster.name=cluster1 -E path.data=cluster1_data -E discovery.type=single-node -E http.port=9201 -E transport.port=9301
bin/elasticsearch -E node.name=cluster2node -E cluster.name=cluster2 -E path.data=cluster2_data -E discovery.type=single-node -E http.port=9202 -E transport.port=9302

// Apply the dynamic settings on each cluster
PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "cluster0": {
          "seeds": [
            "127.0.0.1:9300"
          ],
          "transport.ping_schedule": "30s"
        },
        "cluster1": {
          "seeds": [
            "127.0.0.1:9301"
          ],
          "transport.compress": true,
          "skip_unavailable": true
        },
        "cluster2": {
          "seeds": [
            "127.0.0.1:9302"
          ]
        }
      }
    }
  }
}

# cURL
curl -XPUT "http://localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'{"persistent":{"cluster":{"remote":{"cluster0":{"seeds":["127.0.0.1:9300"],"transport.ping_schedule":"30s"},"cluster1":{"seeds":["127.0.0.1:9301"],"transport.compress":true,"skip_unavailable":true},"cluster2":{"seeds":["127.0.0.1:9302"]}}}}}'
curl -XPUT "http://localhost:9201/_cluster/settings" -H 'Content-Type: application/json' -d'{"persistent":{"cluster":{"remote":{"cluster0":{"seeds":["127.0.0.1:9300"],"transport.ping_schedule":"30s"},"cluster1":{"seeds":["127.0.0.1:9301"],"transport.compress":true,"skip_unavailable":true},"cluster2":{"seeds":["127.0.0.1:9302"]}}}}}'
curl -XPUT "http://localhost:9202/_cluster/settings" -H 'Content-Type: application/json' -d'{"persistent":{"cluster":{"remote":{"cluster0":{"seeds":["127.0.0.1:9300"],"transport.ping_schedule":"30s"},"cluster1":{"seeds":["127.0.0.1:9301"],"transport.compress":true,"skip_unavailable":true},"cluster2":{"seeds":["127.0.0.1:9302"]}}}}}'

# Create test data
curl -XPOST "http://localhost:9200/users/_doc" -H 'Content-Type: application/json' -d'{"name":"user1","age":10}'
curl -XPOST "http://localhost:9201/users/_doc" -H 'Content-Type: application/json' -d'{"name":"user2","age":20}'
curl -XPOST "http://localhost:9202/users/_doc" -H 'Content-Type: application/json' -d'{"name":"user3","age":30}'

# Query
GET /users,cluster1:users,cluster2:users/_search
{
  "query": {
    "range": {
      "age": {
        "gte": 20,
        "lte": 40
      }
    }
  }
}

resources

REST APIs

Search APIs

Query DSL
