How to Monitor Elasticsearch (Part 1)
How to Monitor Elasticsearch (Part 2)

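Elasticsearch provides detailed APIs that let you see its running state in real time. With these APIs you can promptly spot problems such as lost nodes, out-of-memory errors (OOM), and long GC pauses, and then fix them. Elasticsearch monitoring falls into the following main areas: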

Search and indexing performance
Memory and GC
Host-level system and network metrics
Cluster health and node availability
Resource saturation and errors

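(1) Search performance metrics

Search is one of the two main things Elasticsearch does; the other is indexing. The two are analogous to reads and writes in a traditional database. Internally a search runs in two phases, query and fetch, and the API exposes metrics for each phase. (The metrics fall into two types: throughput and performance.)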

Metric description	Name	Metric type
Total number of queries	`indices.search.query_total`	Work: Throughput
Total time spent on queries	`indices.search.query_time_in_millis`	Work: Performance
Number of queries currently in progress	`indices.search.query_current`	Work: Throughput
Total number of fetches	`indices.search.fetch_total`	Work: Throughput
Total time spent on fetches	`indices.search.fetch_time_in_millis`	Work: Performance
Number of fetches currently in progress	`indices.search.fetch_current`	Work: Throughput

Search performance metrics to watch

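Query load: Monitoring the number of queries currently in progress gives a rough idea of how many requests the cluster is handling at any given moment. A sudden spike or drop usually signals a problem and is worth alerting on. Monitoring the search thread pool queue size is covered later in this article.
Query latency: The Elasticsearch API does not expose this metric directly, but average query latency can be computed by periodically sampling the total number of queries and the total time spent on them. If latency exceeds a threshold, look for resource bottlenecks or check whether the queries need optimizing.
Fetch latency: The fetch phase is the second phase of a search and usually takes much less time than the query phase. If this metric keeps rising, it may point to slow disks, enrichment of result documents (such as highlighting matched text in search results), or requests that ask for too many results.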
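
The sampling approach described for query latency can be sketched as follows. This is a minimal illustration, assuming each sample is a dict that already holds the two counters from `indices.search`; the function name and the sample values are made up for the example.

```python
def average_latency_ms(prev, curr,
                       count_key="query_total",
                       time_key="query_time_in_millis"):
    """Average latency per operation between two counter samples.

    Both counters are cumulative, so the average over the interval is
    (delta of total time) / (delta of total count).
    """
    delta_count = curr[count_key] - prev[count_key]
    delta_time = curr[time_key] - prev[time_key]
    if delta_count <= 0:
        # No queries in the interval, or the counters were reset
        # (for example after a node restart): no meaningful average.
        return None
    return delta_time / delta_count


# Hypothetical samples taken one polling interval apart.
earlier = {"query_total": 100, "query_time_in_millis": 2000}
later = {"query_total": 150, "query_time_in_millis": 3500}
print(average_latency_ms(earlier, later))  # 30.0
```

Passing `count_key="fetch_total"` and `time_key="fetch_time_in_millis"` gives the fetch latency in the same way.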

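(2) Indexing performance metrics

Indexing requests are analogous to write requests in a traditional database. If your Elasticsearch workload is write-heavy, it is important to monitor and analyze indexing performance. It helps to first understand how Elasticsearch updates an index: when documents are added, updated, or deleted, each shard of the index goes through two processes, refresh and flush. The API provides the related metrics: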

Metric description	Name	Metric type
Total number of documents indexed	`indices.indexing.index_total`	Work: Throughput
Total time spent indexing documents	`indices.indexing.index_time_in_millis`	Work: Performance
Number of documents currently being indexed	`indices.indexing.index_current`	Work: Throughput
Total number of index refreshes	`indices.refresh.total`	Work: Throughput
Total time spent refreshing indices	`indices.refresh.total_time_in_millis`	Work: Performance
Total number of index flushes to disk	`indices.flush.total`	Work: Throughput
Total time spent on flushing indices to disk	`indices.flush.total_time_in_millis`	Work: Performance

Indexing performance metrics to watch

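Indexing latency: The Elasticsearch API does not expose this metric directly, but average indexing latency can be computed from index_total and index_time_in_millis. If latency is rising, you may be indexing too much data at once (the Elasticsearch documentation suggests starting bulk indexing at 5 MB to 15 MB per batch and increasing slowly until you find a good value).
Flush latency: Data is not persisted to disk until a flush completes successfully, so this metric is well worth monitoring; if performance degrades sharply, take action. If you see this metric rising steadily, it may indicate slow disks; the problem can escalate until data can no longer be written. You can try lowering index.translog.flush_threshold_size in the index's flush settings. This setting is the threshold that triggers a flush, that is, how large the translog may grow before a flush starts. If you are a write-heavy Elasticsearch user, though, you should also watch disk I/O with a tool such as iostat, and consider upgrading your disks if necessary.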
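
As a rough sketch of what lowering `index.translog.flush_threshold_size` looks like, the snippet below only builds the request for the index settings endpoint and does not send it. The index name `my-index` and the value `256mb` are placeholders chosen for illustration, not recommendations.

```python
import json


def flush_threshold_request(index_name, size):
    """Build (method, path, body) for updating the translog flush threshold.

    The request targets the index settings endpoint; sending it is left
    to whichever HTTP client you already use.
    """
    path = f"/{index_name}/_settings"
    body = json.dumps(
        {"index": {"translog": {"flush_threshold_size": size}}}
    )
    return "PUT", path, body


# Hypothetical index name and threshold.
method, path, body = flush_threshold_request("my-index", "256mb")
print(method, path)
print(body)
```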

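(3) Memory usage and garbage collection

Memory is one of the key resources to watch closely when running Elasticsearch. Elasticsearch and Lucene use RAM in two ways: the JVM heap and the file system cache. Elasticsearch runs in the Java Virtual Machine (JVM), so the duration and frequency of JVM garbage collection is another important area to monitor.
JVM heap
Elasticsearch stresses the importance of a JVM heap size that is "just right": neither too large nor too small, for reasons explained below. As a rule of thumb, allocate close to 50% of available memory to the JVM heap, and never more than 32 GB.
The less heap memory you give Elasticsearch, the more RAM Lucene can use (Lucene relies heavily on the file system cache to serve requests quickly). But if the heap is too small, the JVM will garbage-collect frequently, with constant short pauses, and may even run out of memory (OOM).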
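
The heap sizing rule of thumb above (about half of RAM, capped at 32 GB) can be written down as a small helper. This is only a sketch of that rule; the function name and the example machine sizes are assumptions, and the result is a starting point for tuning rather than a prescribed value.

```python
GIB = 1024 ** 3


def suggested_heap_bytes(total_ram_bytes, fraction=0.5, cap_bytes=32 * GIB):
    """Apply the rule of thumb: ~50% of RAM for the heap, never above 32 GB.

    Whatever is not given to the heap stays available to the file system
    cache that Lucene depends on.
    """
    return min(int(total_ram_bytes * fraction), cap_bytes)


# Hypothetical machines with 16 GiB and 128 GiB of RAM.
print(suggested_heap_bytes(16 * GIB) // GIB)   # 8
print(suggested_heap_bytes(128 * GIB) // GIB)  # 32
```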
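Garbage collection
Elasticsearch relies on garbage collection to free heap memory. GC can leave the process unable to respond to external requests, so keep an eye on its frequency and duration to see whether the heap size needs adjusting. Too large a heap can cause long garbage collections, and long pauses are dangerous because the cluster may mistakenly conclude that the node has left the cluster.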

Metric description	Name	Metric type
Total count of young-generation garbage collections	`jvm.gc.collectors.young.collection_count`	Other
Total time spent on young-generation garbage collections	`jvm.gc.collectors.young.collection_time_in_millis`	Other
Total count of old-generation garbage collections	`jvm.gc.collectors.old.collection_count`	Other
Total time spent on old-generation garbage collections	`jvm.gc.collectors.old.collection_time_in_millis`	Other
Percent of JVM heap currently in use	`jvm.mem.heap_used_percent`	Resource: Utilization
Amount of JVM heap committed	`jvm.mem.heap_committed_in_bytes`	Resource: Utilization

JVM metrics to watch

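JVM heap in use: By default Elasticsearch is configured to start garbage collection when JVM heap usage reaches 75%. If usage stays very high, for example at 85%, GC has been unable to reclaim memory fast enough for a long time, which is dangerous. You may then need to add memory or add nodes.
JVM heap used vs. JVM heap committed: the ratio of used heap to committed heap. If the ratio trends upward over time, garbage collection is not keeping up with object creation, which can slow garbage collection and eventually lead to OutOfMemoryErrors.
Garbage collection duration and frequency: Both young and old collections have a "stop the world" phase, in which the JVM halts program execution to reclaim unused object instances. During that time the node cannot do any work. Because the master node checks the status of every other node every 30 seconds, a garbage collection on any node that lasts longer than 30 seconds will lead the master to believe that the node has been lost.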
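
The JVM checks above can be sketched as one function over two samples. It assumes each sample is the `jvm` section of one node from the node stats response, already parsed into a dict; the function name and the sample numbers are invented for illustration.

```python
def gc_summary(prev_jvm, curr_jvm, collector="old"):
    """Summarize GC activity and heap pressure between two JVM samples."""
    prev_gc = prev_jvm["gc"]["collectors"][collector]
    curr_gc = curr_jvm["gc"]["collectors"][collector]

    # The GC counters are cumulative, so work on their deltas.
    runs = curr_gc["collection_count"] - prev_gc["collection_count"]
    millis = (curr_gc["collection_time_in_millis"]
              - prev_gc["collection_time_in_millis"])
    avg_ms = millis / runs if runs > 0 else 0.0

    mem = curr_jvm["mem"]
    used_ratio = mem["heap_used_in_bytes"] / mem["heap_committed_in_bytes"]

    return {
        "collections": runs,
        "avg_collection_ms": avg_ms,
        "heap_used_ratio": used_ratio,
    }


# Hypothetical samples from the same node, one interval apart.
before = {
    "gc": {"collectors": {"old": {"collection_count": 10,
                                  "collection_time_in_millis": 1000}}},
    "mem": {"heap_used_in_bytes": 600, "heap_committed_in_bytes": 1000},
}
after = {
    "gc": {"collectors": {"old": {"collection_count": 12,
                                  "collection_time_in_millis": 1500}}},
    "mem": {"heap_used_in_bytes": 750, "heap_committed_in_bytes": 1000},
}
print(gc_summary(before, after))
```

A `heap_used_ratio` that keeps climbing from one sample to the next, or an `avg_collection_ms` that approaches the 30-second check interval mentioned above, are the two conditions worth alerting on.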

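(4) Host-level network and system metrics

Beyond application-level metrics, you also need to monitor the performance of the hosts the nodes run on.

Disk space: Disk space on data nodes is critical; without enough of it, no new data can be written. When space runs low, delete unneeded indices, add disks, or add nodes.
I/O utilization: Elasticsearch reads and writes the disk heavily when segments are created, queried, and merged. Cluster performance depends substantially on disk I/O, and using SSDs where possible can improve it significantly.
CPU utilization: Rising CPU usage is usually caused by a heavy search or indexing workload. If CPU usage keeps climbing, you may need to add nodes to spread the load.
Network bytes sent/received: Communication between nodes is key to a balanced cluster, so monitor the network to make sure it is healthy. Elasticsearch exposes transport metrics for cluster communication, but you can also look directly at the rate of bytes sent and received by the host to understand your network traffic.
Open file descriptors: File descriptors are used for file operations and network connections. The operating system caps the number available; once the cap is reached, no new connections or files can be opened. Elasticsearch asks for a high limit because Lucene keeps many files open at once.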
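
For the disk space item, a host-side check can be as small as the sketch below, which uses Python's standard `shutil.disk_usage`. The 85% threshold and the idea of pointing it at the Elasticsearch data path are assumptions made for the example; pick values that match your own setup.

```python
import shutil


def disk_used_percent(path):
    """Percentage of the file system holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def is_disk_nearly_full(path, threshold_percent=85.0):
    """True when usage is at or above the (illustrative) threshold."""
    return disk_used_percent(path) >= threshold_percent


# "." stands in for the node's data directory in this example.
print(f"{disk_used_percent('.'):.1f}% used")
print(is_disk_nearly_full("."))
```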

HTTP connections

Metric description	Name	Metric type
Number of HTTP connections currently open	`http.current_open`	Resource: Utilization
Total number of HTTP connections opened over time	`http.total_opened`	Resource: Utilization

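Every client other than the Java client uses HTTP. If the number of HTTP connections keeps growing, some client is probably misconfigured in how it connects to Elasticsearch. Constantly re-establishing connections wastes resources on both the server and the client, so keep this in mind when writing client programs.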

(5) Cluster health and node availability

Metric description	Name	Metric type
Cluster status (green, yellow, red)	`cluster.health.status`	Other
Number of nodes	`cluster.health.number_of_nodes`	Resource: Availability
Number of initializing shards	`cluster.health.initializing_shards`	Resource: Availability
Number of unassigned shards	`cluster.health.unassigned_shards`	Resource: Availability

**Cluster status:** If the status is yellow, at least one replica shard is missing; search results are still complete in this state. If the status is red, at least one primary shard is missing, and search results will be missing some data.
**Initializing and unassigned shards:** When a new index is created or a node restarts, the index's shards first enter the "initializing" state while the master node assigns them to nodes in the cluster. The shards then move to the "started" or "unassigned" state.
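
The status rules above translate directly into a small interpreter for a cluster health response. The sketch assumes the response has already been fetched and parsed into a dict with the fields listed in the table; the function name and the sample response are hypothetical.

```python
STATUS_MEANING = {
    "green": "all primary and replica shards are allocated",
    "yellow": "at least one replica shard is missing; results still complete",
    "red": "at least one primary shard is missing; results may be incomplete",
}


def summarize_health(health):
    """Turn a parsed cluster health response into a short summary."""
    status = health["status"]
    unassigned = health.get("unassigned_shards", 0)
    return {
        "status": status,
        "meaning": STATUS_MEANING.get(status, "unknown status"),
        "nodes": health.get("number_of_nodes"),
        # Anything other than green, or any unassigned shard, deserves a look.
        "needs_attention": status != "green" or unassigned > 0,
    }


# Hypothetical response: one replica could not be assigned.
sample = {
    "status": "yellow",
    "number_of_nodes": 3,
    "initializing_shards": 0,
    "unassigned_shards": 1,
}
print(summarize_health(sample))
```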

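(6) Resource saturation and errors

Elasticsearch uses thread pools to manage threads and, through them, its memory and CPU consumption. Thread pools are configured automatically from the number of CPU cores and rarely need tuning. Still, it is best to monitor thread pool queue lengths and rejection counts so you notice promptly when the cluster can no longer keep up with demand; in that case, add nodes to handle the concurrency. Fielddata and filter cache usage is another important area to monitor, since it may reveal inefficient queries or memory pressure.

1. Thread pool queues and rejections
Each node manages several thread pools; the most important to monitor are search, index, merge, and bulk, which handle the corresponding request types.
The size of a thread pool's queue shows how many requests are waiting to be served on that node. The node eventually serves the queued requests rather than discarding them. Once the queue is full, further requests are rejected.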

Metric description	Name	Metric type
Number of queued threads in a thread pool	`thread_pool.bulk.queue` `thread_pool.index.queue` `thread_pool.search.queue` `thread_pool.merge.queue`	Resource: Saturation
Number of rejected threads in a thread pool	`thread_pool.bulk.rejected` `thread_pool.index.rejected` `thread_pool.search.rejected` `thread_pool.merge.rejected`	Resource: Error

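**Thread pool queues:** Simply enlarging the queue is not a good fix: it can exhaust system resources and hurt performance elsewhere, and an oversized queue actually increases the risk of losing data. If the queued and rejected counts are growing steadily, reduce the request rate if possible, add CPUs to the nodes, or add nodes.
**Bulk rejections and bulk queues:** A bulk operation performs many actions in one request instead of many single requests. Bulk rejections usually mean that a single batch tried to index too many documents. When they occur, back off linearly or exponentially.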
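
One common way to back off after bulk rejections is to retry with exponentially growing delays. The sketch below shows that pattern only; `send_batch` is a hypothetical callable standing in for your own bulk call, assumed to return `True` when the batch is accepted and `False` when it is rejected.

```python
import time


def bulk_with_backoff(send_batch, batch, max_retries=5, base_delay=1.0,
                      sleep=time.sleep):
    """Retry a rejected bulk request, doubling the wait after each rejection.

    Returns the number of retries that were needed. Raises RuntimeError
    if the batch is still rejected after `max_retries` retries.
    """
    for attempt in range(max_retries + 1):
        if send_batch(batch):
            return attempt
        if attempt < max_retries:
            # Waits of base_delay * 1, 2, 4, 8, ... seconds.
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("bulk request still rejected after retries")


# Demo with a fake sender that rejects the first two attempts.
# The delays are recorded instead of actually sleeping.
outcomes = iter([False, False, True])
waits = []
retries = bulk_with_backoff(lambda b: next(outcomes), ["doc-1", "doc-2"],
                            sleep=waits.append)
print(retries, waits)  # 2 [1.0, 2.0]
```

Backing off only spaces the requests out; if rejections persist, shrinking the batch itself is the other lever the text points to.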

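2. Cache usage metrics
Each query request is sent to every shard of the index and then hits every segment of each shard. Elasticsearch caches queries per segment to speed up response times. On the other hand, if the caches take up too much memory, they may slow things down instead of speeding them up!
Elasticsearch uses two main types of cache to serve search requests faster: the fielddata cache and the filter cache.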

Metric description	Name	Metric type
Size of the fielddata cache (bytes)	`indices.fielddata.memory_size_in_bytes`	Resource: Utilization
Number of evictions from the fielddata cache	`indices.fielddata.evictions`	Resource: Saturation
Size of the filter cache (bytes)	`indices.filter_cache.memory_size_in_bytes`	Resource: Utilization
Number of evictions from the filter cache	`indices.filter_cache.evictions`	Resource: Saturation

**Fielddata cache evictions:** Elasticsearch evicts less frequently used cache entries according to certain rules in order to make better use of memory.
**Filter cache evictions:** Similar to fielddata cache evictions.

3. Pending tasks

Metric description	Name	Metric type
Number of pending tasks	`pending_task_total`	Resource: Saturation
Number of urgent pending tasks	`pending_tasks_priority_urgent`	Resource: Saturation
Number of high-priority pending tasks	`pending_tasks_priority_high`	Resource: Saturation

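Pending tasks can only be handled by the master node; they include creating indices and assigning shards to nodes. If the master is very busy and the number of pending tasks does not go down, the cluster may become unstable.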
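
The three counts in the table can be derived from a pending tasks listing. The sketch assumes a response already parsed into a dict with a `tasks` list whose entries carry a `priority` field, as the cluster pending tasks API returns; the function name and sample tasks are made up.

```python
from collections import Counter


def pending_by_priority(response):
    """Count pending cluster tasks in total and by urgent/high priority."""
    counts = Counter(
        task["priority"].lower() for task in response.get("tasks", [])
    )
    return {
        "total": sum(counts.values()),
        "urgent": counts["urgent"],  # Counter returns 0 for missing keys
        "high": counts["high"],
    }


# Hypothetical response with three queued tasks.
sample = {
    "tasks": [
        {"priority": "URGENT", "source": "create-index"},
        {"priority": "HIGH", "source": "shard-started"},
        {"priority": "HIGH", "source": "shard-started"},
    ]
}
print(pending_by_priority(sample))  # {'total': 3, 'urgent': 1, 'high': 2}
```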

4. Unsuccessful GET requests

Metric description	Name	Metric type
Total number of GET requests where the document was missing	`indices.get.missing_total`	Work: Error
Total time spent on GET requests where the document was missing	`indices.get.missing_time_in_millis`	Work: Error

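A GET request is much more direct than a search request: it fetches a document by its ID. GET requests usually cause no trouble, but it is wise to stay alert to GETs that fail because the document is missing.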

https://www.datadoghq.com/blog/monitor-elasticsearch-performance-metrics/
https://www.elastic.co/guide/en/elasticsearch/guide/current/_monitoring_individual_nodes.html
