The bulk API makes it possible to perform many index/delete operations in a single API call. This can greatly increase the indexing speed.
Bulk example:
POST _bulk
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_type" : "type1", "_id" : "2" } }
{ "create" : { "_index" : "test", "_type" : "type1", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_type" : "type1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }
The endpoints are /_bulk, /{index}/_bulk, and /{index}/{type}/_bulk. When the index or the index/type are provided in the URL, they are used as defaults for bulk items that do not provide them explicitly.
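The defaulting rule can be pictured with a small client-side sketch. The helper names `bulk_path` and `apply_defaults` are hypothetical and only mimic the behaviour described above; the real defaulting happens on the server.

```python
def bulk_path(index=None, doc_type=None):
    """Build one of the three endpoint paths: /_bulk, /{index}/_bulk, /{index}/{type}/_bulk."""
    if doc_type is not None and index is None:
        raise ValueError("a type can only be given together with an index")
    parts = [p for p in (index, doc_type) if p is not None]
    return "/" + "/".join(parts + ["_bulk"])


def apply_defaults(action_line, index=None, doc_type=None):
    """Return a copy of an action line with URL-level defaults filled in.

    Values already present in the action line win over the defaults.
    """
    (op, meta), = action_line.items()  # an action line has exactly one key
    merged = dict(meta)
    if index is not None:
        merged.setdefault("_index", index)
    if doc_type is not None:
        merged.setdefault("_type", doc_type)
    return {op: merged}
```

With this sketch, an item such as `{"delete": {"_id": "2"}}` sent to /test/type1/_bulk would be treated as if it carried `"_index": "test"` and `"_type": "type1"`, while an item that names its own index keeps it.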
A note on the format: the goal is to make processing as fast as possible. Because some of the actions will be redirected to other shards on other nodes, only the action_meta_data line is parsed on the receiving node.
Client libraries using this protocol should do something similar on the client side and reduce buffering as much as possible.
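To make the line-oriented format concrete, here is a minimal sketch using only the Python standard library. `encode_bulk` and `scan_actions` are hypothetical helpers, not part of any client library. The scanner parses only the action lines and passes the source lines through as raw strings, which is one way to read the "parse only action_meta_data" idea; it assumes that index, create, and update are followed by a source line and delete is not, as in the example request above.

```python
import json

# Operations that are followed by a source/payload line (delete is not).
OPS_WITH_SOURCE = {"index", "create", "update"}


def encode_bulk(pairs):
    """Serialize (action, source_or_None) pairs as newline-delimited JSON.

    The body ends with a newline, which the bulk endpoint expects.
    """
    lines = []
    for action, source in pairs:
        lines.append(json.dumps(action))
        if source is not None:
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"


def scan_actions(body):
    """Yield (op, meta, raw_source), JSON-parsing only the action lines."""
    lines = iter(body.splitlines())
    for line in lines:
        if not line.strip():
            continue
        (op, meta), = json.loads(line).items()
        # The source line is forwarded untouched, never parsed here.
        raw_source = next(lines) if op in OPS_WITH_SOURCE else None
        yield op, meta, raw_source
```

Keeping the source lines as opaque strings avoids re-serializing documents that only need to be routed elsewhere.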
The response to a bulk action is a large JSON structure with the individual results of each action that was performed. The failure of a single action does not affect the remaining actions.
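Because one failed action does not abort the rest, callers usually have to inspect each item in the response. The sketch below assumes the commonly seen response shape: a top-level `items` list with one single-key object per action, containing a `status` and, on failure, an `error` object. `SAMPLE_RESPONSE` is made-up illustrative data and `failed_items` is a hypothetical helper.

```python
# Hypothetical response, trimmed to the fields this sketch relies on.
SAMPLE_RESPONSE = {
    "took": 30,
    "errors": True,
    "items": [
        {"index": {"_index": "test", "_type": "type1", "_id": "1", "status": 201}},
        {"create": {"_index": "test", "_type": "type1", "_id": "3", "status": 409,
                    "error": {"type": "version_conflict_engine_exception",
                              "reason": "document already exists"}}},
    ],
}


def failed_items(response):
    """Return (position, op, _id, status) for every action that reported an error."""
    failures = []
    for position, item in enumerate(response.get("items", [])):
        (op, result), = item.items()  # each item has exactly one key: the operation
        if "error" in result:
            failures.append((position, op, result.get("_id"), result.get("status")))
    return failures
```

The position in `items` matches the order of the actions in the request, so failed actions can be retried or logged individually.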
There is no "correct" number of actions to perform in a single bulk call. You should experiment with different settings to find the optimum size for your particular workload.
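One simple way to experiment with different sizes is to split the stream of actions into fixed-size batches and time each setting. The generator below is a minimal sketch; `batches` is a hypothetical helper and the batch size is something to tune, not a recommendation.

```python
def batches(actions, size):
    """Yield lists of at most `size` actions, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch = []
    for action in actions:
        batch.append(action)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller batch
        yield batch
```

Each yielded batch would then be serialized into one bulk request, so changing `size` is the only knob needed to compare workloads.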
If using the HTTP API, make sure that the client does not send HTTP chunks, as this will slow things down.
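As one way to avoid chunked transfer from Python, the body can be fully encoded to bytes up front so that its length is known and sent as Content-Length. This sketch only builds the request object with the standard library and does not send it; `build_bulk_request` is a hypothetical helper, and the base URL and the `application/x-ndjson` content type are assumptions about the target cluster.

```python
import urllib.request


def build_bulk_request(base_url, body_text):
    """Build a POST request to /_bulk with an explicit Content-Length.

    Passing a bytes body (rather than a file object or iterator) lets the
    client send a fixed-length request instead of HTTP chunks.
    """
    body = body_text.encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/_bulk",
        data=body,
        headers={
            "Content-Type": "application/x-ndjson",  # assumed content type
            "Content-Length": str(len(body)),
        },
        method="POST",
    )
```

The resulting object could be passed to `urllib.request.urlopen` against a reachable cluster; other HTTP clients have their own ways of disabling chunked encoding.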
Update
When using the update action, _retry_on_conflict can be set as a field in the action line itself (not in the extra payload line) to specify how many times an update should be retried in the case of a version conflict.
The update action payload supports the following options: doc (partial document), upsert, doc_as_upsert, script, params (for script), lang (for script), and _source. See the update documentation for details on the options. Example with update actions:
POST _bulk
{ "update" : {"_id" : "1", "_type" : "type1", "_index" : "index1", "_retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"} }
{ "update" : { "_id" : "0", "_type" : "type1", "_index" : "index1", "_retry_on_conflict" : 3} }
{ "script" : { "source": "ctx._source.counter += params.param1", "lang" : "painless", "params" : {"param1" : 1}}, "upsert" : {"counter" : 1}}
{ "update" : {"_id" : "2", "_type" : "type1", "_index" : "index1", "_retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"}, "doc_as_upsert" : true }
{ "update" : {"_id" : "3", "_type" : "type1", "_index" : "index1", "_source" : true} }
{ "doc" : {"field" : "value"} }
{ "update" : {"_id" : "4", "_type" : "type1", "_index" : "index1"} }
{ "doc" : {"field" : "value"}, "_source": true}
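The split between the action line and the payload line in the requests above can be sketched in Python. `update_action` is a hypothetical helper: it puts `_retry_on_conflict` into the action line and leaves options such as `doc`, `script`, `upsert`, and `doc_as_upsert` in the payload, which is passed through unchanged.

```python
import json


def update_action(index, doc_type, doc_id, payload, retry_on_conflict=None):
    """Return the two NDJSON lines for one bulk update action."""
    meta = {"_index": index, "_type": doc_type, "_id": doc_id}
    if retry_on_conflict is not None:
        # Goes in the action line, not in the payload line.
        meta["_retry_on_conflict"] = retry_on_conflict
    return json.dumps({"update": meta}) + "\n" + json.dumps(payload) + "\n"


# Mirrors the doc_as_upsert request shown above.
EXAMPLE = update_action(
    "index1", "type1", "2",
    {"doc": {"field": "value"}, "doc_as_upsert": True},
    retry_on_conflict=3,
)
```

Several such strings can be concatenated to form the body of a single bulk request.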