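1. References
- Geek Time course "Elasticsearch Core Technology and Practice" by Ruan Yiming
- Elasticsearch Reference guide (mapping parameter enabled)
- [Translation] Important Elasticsearch articles, part 5: preloading fielddata
- Learning Elasticsearch: an illustrated guide to the _source, _all, store, and index properties
- Questions about store and the inverted index in Elasticsearch
- How Elasticsearch handles store fields
- An analysis of the Elasticsearch search process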
2. Getting started
Dynamic
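- dynamic controls whether indexed documents may contain fields that the mapping does not define. The default is true.
| | true | false | strict |
|---|---|---|---|
| Document can be indexed | yes | yes | no |
| New field is indexed | yes | no | no |
| _mapping is updated | yes | no | no |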
False
- Let's try it: create an index with dynamic set to false.
PUT /my_movies
{
"mappings": {
"dynamic": false,
"properties": {
"name": {
"type": "keyword"
},
"content": {
"type": "text"
}
}
}
}
- Index a document that contains a field, age, that the mapping does not define.
PUT /my_movies/_doc/1
{
"name": "caiser",
"content": "Hello Hello",
"age": 99
}
- After the document has been indexed successfully, look at the mapping again (GET /my_movies/_mapping):
{
"my_movies" : {
"mappings" : {
"dynamic" : "false",
"properties" : {
"content" : {
"type" : "text"
},
"name" : {
"type" : "keyword"
}
}
}
}
}
- The result shows that with dynamic set to false the document is indexed, but the mapping is not updated.
- Now search on age to see whether the field itself was indexed.
GET /my_movies/_search?q=age:99
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
- The result is empty, which shows that with dynamic set to false the new field is not indexed.
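One way to confirm that only indexing was skipped, not storage, is to fetch the document by ID. This is a minimal sketch that reuses the index and document from above. The returned _source is expected to still contain age, because _source holds the original JSON body regardless of the mapping.

```console
# Fetch the document indexed above; unmapped fields are still part of _source
GET /my_movies/_doc/1
```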
Strict
- Next, recreate the index with dynamic set to strict.
DELETE /my_movies
PUT /my_movies
{
"mappings": {
"dynamic": "strict",
"properties": {
"name": {
"type": "keyword"
},
"content": {
"type": "text"
}
}
}
}
- Try to index a document that contains a field the mapping does not define.
PUT /my_movies/_doc/1
{
"name": "caiser",
"content": "Hello Hello",
"age": 99
}
- The request fails immediately:
{
"error": {
"root_cause": [
{
"type": "strict_dynamic_mapping_exception",
"reason": "mapping set to strict, dynamic introduction of [age] within [_doc] is not allowed"
}
],
"type": "strict_dynamic_mapping_exception",
"reason": "mapping set to strict, dynamic introduction of [age] within [_doc] is not allowed"
},
"status": 400
}
- So when dynamic is strict, indexing a document that contains a field missing from the mapping is rejected with an error.
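Under strict, a new field has to be declared before documents can use it. The following is a minimal sketch, assuming age should be an integer (the type is an assumption; the original example never maps this field): add the field through the update mapping API, then retry the document.

```console
# Declare the new field explicitly; integer is an assumed type for age
PUT /my_movies/_mapping
{
  "properties": {
    "age": {
      "type": "integer"
    }
  }
}

# The same document that was rejected above should now be accepted
PUT /my_movies/_doc/1
{
  "name": "caiser",
  "content": "Hello Hello",
  "age": 99
}
```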
null_value
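- Use null_value when you need to search for null values.
- text fields do not support null_value, whereas keyword fields do. Other non-text types such as numeric, date, boolean, and ip fields also accept it, as long as the null_value has the same data type as the field.
Example
- Let's verify this by setting null_value on a text field.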
DELETE /my_movies
PUT /my_movies
{
"mappings": {
"properties": {
"name": {
"type": "keyword"
},
"content": {
"type": "text",
"null_value": "null"
}
}
}
}
- Setting null_value on a text field produces the following error:
{
"error": {
"root_cause": [
{
"type": "mapper_parsing_exception",
"reason": "Mapping definition for [content] has unsupported parameters: [null_value : null]"
}
],
"type": "mapper_parsing_exception",
"reason": "Failed to parse mapping [_doc]: Mapping definition for [content] has unsupported parameters: [null_value : null]",
"caused_by": {
"type": "mapper_parsing_exception",
"reason": "Mapping definition for [content] has unsupported parameters: [null_value : null]"
}
},
"status": 400
}
- Setting it on a keyword field succeeds:
DELETE /my_movies
PUT /my_movies
{
"mappings": {
"properties": {
"name": {
"type": "keyword",
"null_value": "null"
},
"content": {
"type": "text"
}
}
}
}
- Now add some data and run a query:
# Index two documents
PUT /my_movies/_doc/1
{
"content": "123",
"name": null
}
PUT /my_movies/_doc/2
{
"content": "123456"
}
# Search for the null_value placeholder
GET /my_movies/_search
{
"query": {
"term": {
"name": {
"value": "null"
}
}
}
}
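- Result: only document 1, which sets name to an explicit null, matches. Document 2 omits the field entirely, so null_value does not apply to it.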
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.2876821,
"hits" : [
{
"_index" : "my_movies",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.2876821,
"_source" : {
"content" : "123",
"name" : null
}
}
]
}
}
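The same idea carries over to non-string fields, where the placeholder has to be of the field's own type. This is a minimal sketch, not part of the original walkthrough: the index name my_movie_ratings, the field rating, and the placeholder -1 are all made up for illustration.

```console
# Hypothetical index: an integer field whose explicit nulls are indexed as -1
PUT /my_movie_ratings
{
  "mappings": {
    "properties": {
      "rating": {
        "type": "integer",
        "null_value": -1
      }
    }
  }
}

# A document with an explicit null
PUT /my_movie_ratings/_doc/1
{
  "rating": null
}

# Searching for the placeholder finds documents whose rating was null
GET /my_movie_ratings/_search
{
  "query": {
    "term": {
      "rating": {
        "value": -1
      }
    }
  }
}
```

The placeholder should be a value that can never occur as real data, otherwise genuine values and nulls become indistinguishable in searches.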
Copy To
- copy_to copies a field's value into a target field.
- The copy_to target field does not appear in _source.
DELETE /my_users
# Create the index
PUT /my_users
{
"mappings": {
"properties": {
"fristName": {
"type": "text",
"copy_to": "fullName"
},
"lastName": {
"type": "text",
"copy_to": "fullName"
}
}
}
}
# Index a document
PUT /my_users/_doc/1
{
"fristName": "sun",
"lastName": "ruikai"
}
# Search the combined field
GET /my_users/_search
{
"query": {
"match": {
"fullName": {
"query": "sun ruikai",
"operator": "and"
}
}
}
}
- Result: the document matches on fullName, although _source only contains the two original fields.
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.2876821,
"hits" : [
{
"_index" : "my_users",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.2876821,
"_source" : {
"fristName" : "sun",
"lastName" : "ruikai"
}
}
]
}
}
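In the example above the target field fullName is never declared, so it is created by dynamic mapping the first time a value is copied into it. If you would rather control its type or analyzer, the target can be declared explicitly. This is a sketch under that assumption; the index name my_users_explicit is hypothetical, and the field names are kept exactly as spelled in the example above.

```console
# Hypothetical variant: the copy_to target is mapped explicitly
PUT /my_users_explicit
{
  "mappings": {
    "properties": {
      "fristName": {
        "type": "text",
        "copy_to": "fullName"
      },
      "lastName": {
        "type": "text",
        "copy_to": "fullName"
      },
      "fullName": {
        "type": "text"
      }
    }
  }
}
```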
doc_values & fielddata
| | doc_values | fielddata |
|---|---|---|
| When it is built | At index time, together with the inverted index | Dynamically at search time |
| Where it lives | Files on disk | JVM heap |
| Advantages | Avoids heavy memory usage | Faster indexing, no extra disk space |
| Disadvantages | Slower indexing, extra disk space | With many documents, building it on the fly is expensive and consumes a lot of JVM heap |
| Default | true | false |
- If a keyword field needs neither sorting nor aggregations, you can set doc_values: false to speed up indexing and reduce disk usage. Turning it back on later requires reindexing.
- If a text field needs sorting or aggregations, set fielddata: true.
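The two bullets above translate into mapping settings roughly as follows. This is a minimal sketch; the index name my_movies_dv is hypothetical, and which field gets which setting is chosen only for illustration.

```console
# Hypothetical index: name is never sorted or aggregated, content is aggregated
PUT /my_movies_dv
{
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword",
        "doc_values": false
      },
      "content": {
        "type": "text",
        "fielddata": true
      }
    }
  }
}
```

Because fielddata is built in the JVM heap at search time, enabling it on a large text field is a trade-off worth measuring before relying on it.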
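enabled
If a field does not need to be searched, sorted, or aggregated, set enabled to false.
Note that enabled can only be set on the top-level mapping and on fields of type object.
Both of the following settings are valid: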
DELETE my_movies
PUT /my_movies
{
"mappings": {
"properties": {
"name": {
"type": "keyword"
},
"content": {
"type": "text"
},
"url": {
"enabled": false,
"type": "object"
}
}
}
}
DELETE my_movies
PUT /my_movies
{
"mappings": {
"enabled": false,
"properties": {
"name": {
"type": "keyword"
},
"content": {
"type": "text"
},
"url": {
"type": "object"
}
}
}
}
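With enabled set to false, Elasticsearch does not parse the content at all, so it accepts arbitrary JSON there and still returns it from _source. This is a small sketch, assuming one of the two mappings above is in place; the contents of url are made up for illustration.

```console
# Any JSON is accepted under url, because the object is not parsed or indexed
PUT /my_movies/_doc/1
{
  "name": "caiser",
  "url": {
    "home": "https://example.com",
    "mirrors": ["a", "b"],
    "meta": { "checked": true }
  }
}

# The data comes back unchanged in _source, but it cannot be searched
GET /my_movies/_doc/1
```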
eager_global_ordinals
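Eager loading: global ordinals are built at refresh time instead of on the first aggregation that needs them.
For keyword fields that are updated frequently and used heavily in aggregations, setting this option to true is recommended.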
DELETE my_movies
PUT /my_movies
{
"mappings": {
"properties": {
"name": {
"type": "keyword",
"eager_global_ordinals": true
},
"content": {
"type": "text"
},
"url": {
"type": "object"
}
}
}
}
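For context, a terms aggregation on a keyword field is the typical request that relies on global ordinals, and the setting can be changed later on an existing index through the update mapping API. This is a sketch using the index from above; the aggregation name names is made up.

```console
# A terms aggregation on name is the kind of request that uses global ordinals
GET /my_movies/_search
{
  "size": 0,
  "aggs": {
    "names": {
      "terms": { "field": "name" }
    }
  }
}

# The setting can be switched off again without recreating the index
PUT /my_movies/_mapping
{
  "properties": {
    "name": {
      "type": "keyword",
      "eager_global_ordinals": false
    }
  }
}
```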
_source, index, and store: diagram


_source
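Translated from the official documentation:
The _source field contains the original JSON document body that was passed at index time. The _source field itself is not indexed (and thus is not searchable), but it is stored so that it can be returned when executing fetch requests such as get or search.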
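Disabling _source saves disk space and suits metrics-style data. In general, try a higher compression ratio (index.codec) first, because once _source is disabled the following are no longer supported:
- update, update_by_query, reindex
- highlighting
- retrieving the original field values from _source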
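The compression alternative mentioned above is an index setting chosen when the index is created. This is a minimal sketch; the index name metrics_demo is hypothetical.

```console
# Hypothetical index that trades some CPU for smaller stored fields
PUT /metrics_demo
{
  "settings": {
    "index": {
      "codec": "best_compression"
    }
  }
}
```

This keeps _source available, so update, reindex, and highlighting keep working while disk usage still drops.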
We can disable _source entirely, or specify which fields it includes and which it excludes.
For example:
# Disable _source entirely
PUT /song_of_ice_and_fire
{
"mappings": {
"_source": {
"enabled": false
},
"properties": {
"title": {
"type": "keyword"
},
"content": {
"type": "text"
}
}
}
}
# Include title, exclude content (delete the index first if it already exists)
PUT /song_of_ice_and_fire
{
"mappings": {
"_source": {
"includes": ["title"],
"excludes": ["content"]
},
"properties": {
"title": {
"type": "keyword"
},
"content": {
"type": "text"
}
}
}
}
store
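Translated from the official documentation:
By default, field values are indexed to make them searchable, but they are not stored. This means that the field can be queried, but the original field value cannot be retrieved.
Usually this doesn't matter. The field value is already part of the _source field, which is stored by default. If you only want to retrieve the value of a single field or of a few fields, instead of the whole _source, then this can be achieved with source filtering.
In certain situations it can make sense to store a field. For instance, if you have a document with a title, a date, and a very large content field, you may want to retrieve just the title and the date without having to extract those fields from a large _source field.
The store property specifies whether the original field value is stored separately. It is generally used for fields that are not kept in _source.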
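Source filtering, as mentioned in the passage above, is requested per search. This is a minimal sketch, assuming a hypothetical index named articles with title, date, and content fields and with _source enabled.

```console
# Return only title and date from _source; content is left out of the response
GET /articles/_search
{
  "_source": ["title", "date"],
  "query": {
    "match_all": {}
  }
}
```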
PUT /song_of_ice_and_fire
{
"mappings": {
"_source": {
"includes": ["title"],
"excludes": ["content"]
},
"properties": {
"title": {
"type": "keyword"
},
"content": {
"type": "text",
"store": true
}
}
}
}
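A stored field is requested with stored_fields at search time and comes back under the fields key of each hit rather than in _source. This is a sketch against the mapping above; the search term winter is made up for illustration.

```console
# content is excluded from _source but stored, so it is requested explicitly
GET /song_of_ice_and_fire/_search
{
  "stored_fields": ["content"],
  "query": {
    "match": {
      "content": "winter"
    }
  }
}
```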
Index
- The index setting controls whether a field is indexed. The default is true.
| | true | false |
|---|---|---|
| Inverted index is built | yes | no |
| Field is searchable | yes | no |
- For example, set the index property of name to false.
DELETE /my_movies
PUT /my_movies
{
"mappings": {
"properties": {
"name": {
"type": "keyword",
"index": false
},
"content": {
"type": "text"
}
}
}
}
- Index a document:
PUT /my_movies/_doc/1
{
"name": "caiser",
"content": "Hello Hello",
"age": 99
}
- Search on name:
GET /my_movies/_search
{
"query": {
"term": {
"name": {
"value": "caiser"
}
}
}
}
- The query fails, and the error returned by Elasticsearch explains why: "Cannot search on field [name] since it is not indexed."
{
"error": {
"root_cause": [
{
"type": "query_shard_exception",
"reason": "failed to create query: {\n \"term\" : {\n \"name\" : {\n \"value\" : \"caiser\",\n \"boost\" : 1.0\n }\n }\n}",
"index_uuid": "uLkZEGRuRCKVWyik8Z8VCQ",
"index": "my_movies"
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": "my_movies",
"node": "M4LyTpueT--40-oJaXKvfA",
"reason": {
"type": "query_shard_exception",
"reason": "failed to create query: {\n \"term\" : {\n \"name\" : {\n \"value\" : \"caiser\",\n \"boost\" : 1.0\n }\n }\n}",
"index_uuid": "uLkZEGRuRCKVWyik8Z8VCQ",
"index": "my_movies",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Cannot search on field [name] since it is not indexed."
}
}
}
]
},
"status": 400
}
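Setting index to false only removes the inverted index. A keyword field keeps its doc_values unless they are disabled as well, so aggregating or sorting on name should still work. This is a sketch, assuming the my_movies index and document from above with doc_values left at its default; the aggregation name by_name is made up.

```console
# Aggregating on a non-indexed keyword field relies on doc_values, not the inverted index
GET /my_movies/_search
{
  "size": 0,
  "aggs": {
    "by_name": {
      "terms": { "field": "name" }
    }
  }
}
```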
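Index Options
- index_options controls how much information is recorded in the inverted index.
| No. | Level | Description |
|---|---|---|
| 1 | docs | Records doc IDs |
| 2 | freqs | Records doc IDs and term frequencies |
| 3 | positions | Records doc IDs, term frequencies, and term positions |
| 4 | offsets | Records doc IDs, term frequencies, term positions, and character offsets |
- text fields default to positions; other field types default to docs.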
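The level is set per field in the mapping. This is a minimal sketch; the index name my_movies_offsets is hypothetical, and offsets is chosen only to show the most detailed level.

```console
# Hypothetical index: content records character offsets in addition to positions
PUT /my_movies_offsets
{
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword"
      },
      "content": {
        "type": "text",
        "index_options": "offsets"
      }
    }
  }
}
```

Higher levels record more per term and therefore take more disk space, so it is worth picking the lowest level that the queries on a field actually need.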