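Problem
Given: Elasticsearch holds an index of online-user data, log_online.
It stores per-server online user counts, sampled from multiple servers once per minute.
Goal: produce the data for a chart of online users across the whole system.
Rules:
- The finest granularity for online users is 1 minute. Online users = the sum of the online counts of all servers at a given moment.
- When reporting per 10 minutes, use the highest online count within each 10-minute window.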
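To make the two rules concrete before turning to Elasticsearch, here is a minimal in-memory sketch in Python. The function name `peak_online`, the tuple layout, and the sample values are all made up for illustration; it also assumes the window length divides 60 evenly.

```python
from collections import defaultdict
from datetime import datetime


def peak_online(samples, window_minutes=10):
    """samples: iterable of (timestamp, server, online) tuples (hypothetical layout)."""
    # Rule 1: online users at a given minute = sum over all servers.
    per_minute = defaultdict(int)
    for ts, _server, online in samples:
        per_minute[ts.replace(second=0, microsecond=0)] += online

    # Rule 2: for a coarser window, keep the highest per-minute total.
    peaks = {}
    for minute, total in per_minute.items():
        start = minute.replace(minute=minute.minute - minute.minute % window_minutes)
        peaks[start] = max(peaks.get(start, 0), total)
    return dict(sorted(peaks.items()))


# Illustrative data only: two servers, three sampled minutes.
samples = [
    (datetime(2024, 1, 1, 12, 0), "s1", 10),
    (datetime(2024, 1, 1, 12, 0), "s2", 5),
    (datetime(2024, 1, 1, 12, 3), "s1", 20),
    (datetime(2024, 1, 1, 12, 3), "s2", 1),
    (datetime(2024, 1, 1, 12, 10), "s1", 7),
]
result = peak_online(samples)
```

Here the 12:00 window peaks at 21 (the 12:03 total), not 15, which is the behaviour the second rule asks for.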
Approach
- First compute the total online count (sum) for each minute.
First aggregation: group_by_min, bucketed by time, one bucket per minute
Metric computed within each bucket: sum
"aggs": {
  "group_by_min": {
    "date_histogram": {
      "field": "log_time",
      "fixed_interval": "1m",
      "min_doc_count": 0
    },
    "aggs": {
      "sum_online": {
        "sum": {
          "field": "online"
        }
      }
    }
  }
}
- Building on the result of step 1, aggregate again at the reporting granularity, for example per hour. Use max_bucket to find the highest per-minute sum within each bucket.
"aggs": {
  "group_by_hour": {
    "date_histogram": {
      "field": "log_time",
      "fixed_interval": "1h",
      "format": "yyyy-MM-dd HH",
      "time_zone": "+08:00",
      "min_doc_count": 0
    },
    "aggs": {
      "group_by_min": {
        "date_histogram": {
          "field": "log_time",
          "fixed_interval": "1m",
          "format": "yyyy-MM-dd HH:mm:ss",
          "time_zone": "+08:00",
          "min_doc_count": 0
        },
        "aggs": {
          "sum_online": {
            "sum": {
              "field": "online"
            }
          }
        }
      },
      "max_aggs": {
        "max_bucket": {
          "buckets_path": "group_by_min>sum_online"
        }
      }
    }
  }
}
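The same nested aggregation can be generated for any reporting granularity, such as the 10-minute case from the rules. The Python sketch below only builds the request body as a plain dict; `build_peak_query` and its parameters are hypothetical names, and `"size": 0` is an added assumption (it suppresses the hit list, which the chart does not need).

```python
import json


def build_peak_query(interval="10m", time_zone="+08:00", name="group_by_hour"):
    """Build the search body; `name` is the outer aggregation's name (hypothetical helper)."""
    # Inner level: one bucket per minute, summing `online` across servers.
    per_minute = {
        "date_histogram": {
            "field": "log_time",
            "fixed_interval": "1m",
            "time_zone": time_zone,
            "min_doc_count": 0,
        },
        "aggs": {"sum_online": {"sum": {"field": "online"}}},
    }
    # Outer level: one bucket per reporting window, plus a sibling
    # max_bucket that picks the highest per-minute sum in that window.
    return {
        "size": 0,  # assumption: only aggregations are needed, not hits
        "aggs": {
            name: {
                "date_histogram": {
                    "field": "log_time",
                    "fixed_interval": interval,
                    "time_zone": time_zone,
                    "min_doc_count": 0,
                },
                "aggs": {
                    "group_by_min": per_minute,
                    "max_aggs": {
                        "max_bucket": {"buckets_path": "group_by_min>sum_online"}
                    },
                },
            }
        },
    }


body = build_peak_query("10m")
print(json.dumps(body, indent=2))
```

Keeping the outer name configurable matters because any filter_path applied to the response has to reference that same name.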
- 進(jìn)過(guò)兩次聚合,數(shù)據(jù)已經(jīng)非常多了,而實(shí)際要返回的數(shù)據(jù)其實(shí)只是每個(gè)時(shí)段的最高在線人數(shù),和時(shí)段的值,這個(gè)需要通過(guò)filter_path
GET /log_online/_search?filter_path=aggregations.group_by_hour.buckets.max_aggs.value,aggregations.group_by_hour.buckets.key,aggregations.group_by_hour.buckets.key_as_string
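With that filter_path, each remaining bucket should carry only key, key_as_string, and max_aggs.value. A small Python sketch for turning such a response into chart points follows; `extract_peaks` is a hypothetical helper, and the sample response is invented to show the expected shape, not real output.

```python
def extract_peaks(response, agg_name="group_by_hour"):
    """Return (label, epoch_millis, peak) tuples from a filtered search response."""
    buckets = response.get("aggregations", {}).get(agg_name, {}).get("buckets", [])
    points = []
    for bucket in buckets:
        # max_aggs may be missing or hold a null value, so read it defensively.
        peak = bucket.get("max_aggs", {}).get("value")
        points.append((bucket.get("key_as_string"), bucket.get("key"), peak))
    return points


# Illustrative response shape only; the numbers are made up.
sample_response = {
    "aggregations": {
        "group_by_hour": {
            "buckets": [
                {"key_as_string": "2024-01-01 12", "key": 1704081600000,
                 "max_aggs": {"value": 21.0}},
                {"key_as_string": "2024-01-01 13", "key": 1704085200000,
                 "max_aggs": {"value": 7.0}},
            ]
        }
    }
}
points = extract_peaks(sample_response)
```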