ElasticSearch in Practice: Metric Aggregations (Part 5)

I. Introduction to Aggregation Analysis

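Aggregation analysis is a core database feature: it computes aggregate values over the data set returned by a query, such as the maximum, minimum, sum, or average of a field (or of a computed expression). As both a search engine and a data store, ES provides strong aggregation capabilities of its own.
Aggregations that compute metrics such as max, min, sum, and average over a data set are called metric aggregations in ES.
Relational databases offer more than aggregate functions: query results can be grouped with GROUP BY and metrics computed per group. In ES the equivalent of GROUP BY is called bucketing, and these are bucket aggregations.
ES also provides matrix aggregations and pipeline aggregations, but these are still being refined.
Value sources for aggregations:
The values an aggregation works on can be taken from a field or be the result of a script.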
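To make the two value sources concrete, here is a minimal sketch that builds one request body containing a field-based and a script-based `max` aggregation. The aggregation names and the 0.9 discount factor are hypothetical, and the script form assumes an ES version that accepts Painless scripts under the `source` key.

```python
import json

# Metric aggregation that reads its values directly from a field.
field_agg = {"max_price": {"max": {"field": "sell_price"}}}

# The same metric, but the values come from a script.
# The 0.9 discount factor is a hypothetical example, not from the article.
script_agg = {
    "max_discounted_price": {
        "max": {
            "script": {
                "lang": "painless",
                "source": "doc['sell_price'].value * 0.9",
            }
        }
    }
}

# "size": 0 suppresses the hits so only aggregation results are returned.
body = {"size": 0, "aggs": {**field_agg, **script_agg}}

print("GET /goods_index/goods_type/_search")
print(json.dumps(body, indent=2))
```

The printed body can be pasted into the console as the payload of the `_search` request.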

II. Metric Aggregations

  1. Find the highest product price
GET /goods_index/goods_type/_search
{
  "size": 0,
  "aggs": {
    "masssbalance": {
      "max": {
        "field": "sell_price"
      }
    }
  }
}
  2. Find the lowest product price
GET /goods_index/goods_type/_search
{
  "size": 0,
  "aggs": {
    "masssbalance": {
      "min": {
        "field": "sell_price"
      }
    }
  }
}
  3. Sum the prices of all products
GET /goods_index/goods_type/_search
{
  "size": 0,
  "aggs": {
    "masssbalance": {
      "sum": {
        "field": "sell_price"
      }
    }
  }
}

  4. Compute the average product price
GET /goods_index/goods_type/_search
{
  "size": 0,
  "aggs": {
    "masssbalance": {
      "avg": {
        "field": "sell_price"
      }
    }
  }
}
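The four requests above differ only in the metric name, so they can also be sent as a single request with several named aggregations. The following is a minimal sketch; the helper `metric_aggs` and the generated aggregation names are hypothetical.

```python
import json

def metric_aggs(field, metrics=("max", "min", "sum", "avg")):
    """Build an aggs section with one named aggregation per metric."""
    return {f"{metric}_{field}": {metric: {"field": field}} for metric in metrics}

# One request body that asks for all four metrics at once.
body = {"size": 0, "aggs": metric_aggs("sell_price")}

print("GET /goods_index/goods_type/_search")
print(json.dumps(body, indent=2))
```

Each metric comes back under its own name in the `aggregations` section of the response.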

  5. Document count (count)
    Count the documents whose sell_price is greater than or equal to 10
GET /goods_index/goods_type/_count
{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "sell_price": {
            "gte": 10
          }
        }
      }
    }
  }
}
  6. Value count: count the values of a field (for a single-valued field, this is the number of documents that have a value)
GET /goods_index/goods_type/_search?size=0
{
  "aggs": {
    "sell_count": {
      "value_count": {
        "field": "sell_price"
      }
    }
  }
}

Result:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 7,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "sell_count": {
      "value": 7
    }
  }
}
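The result above equals the hit count because every document has exactly one sell_price. The sketch below imitates the counting rule locally on hypothetical documents to show where the two numbers can diverge: a missing field contributes nothing, and a multi-valued field contributes one count per value. It is a local illustration, not the ES implementation.

```python
# Hypothetical documents; None stands for a document without the field.
docs = [
    {"sell_price": 398},
    {"sell_price": [500, 520]},  # multi-valued field: contributes two values
    {"sell_price": None},        # no value: contributes nothing
]

def value_count(documents, field):
    """Count extracted values, mirroring what value_count reports."""
    total = 0
    for doc in documents:
        value = doc.get(field)
        if value is None:
            continue
        total += len(value) if isinstance(value, list) else 1
    return total

print(len(docs), value_count(docs, "sell_price"))
```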
  7. cardinality: count of distinct values
GET /goods_index/goods_type/_search?size=0
{
  "aggs": {
    "sell_count": {
      "cardinality": {
        "field": "sell_price"
      }
    }
  }
}

Result:

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 7,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "sell_count": {
      "value": 6
    }
  }
}
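A result of 6 for 7 documents means two products share a price. The sketch below reproduces that situation with hypothetical prices and an exact distinct count. Note that the ES cardinality aggregation is an approximation based on HyperLogLog++; for small value sets like this one it is expected to agree with the exact count.

```python
# Hypothetical prices: seven documents, two of them with the same price.
prices = [100, 200, 200, 300, 400, 500, 600]

def exact_cardinality(values):
    """Exact number of distinct values (ES approximates this figure)."""
    return len(set(values))

print(len(prices), exact_cardinality(prices))
```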

  8. stats: returns count, max, min, avg, and sum in one request

GET /goods_index/goods_type/_search?size=0
{
  "aggs": {
    "sell_stats": {
      "stats": {
        "field": "sell_price"
      }
    }
  }
}

Result:

{
  "took": 17,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 7,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "sell_stats": {
      "count": 7,
      "min": 398,
      "max": 980,
      "avg": 692.1428571428571,
      "sum": 4845
    }
  }
}
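The five values are not independent: avg is simply sum divided by count. A quick check using the numbers from the response above (nothing here talks to ES):

```python
import math

# Values copied from the stats response above.
stats = {"count": 7, "min": 398, "max": 980, "avg": 692.1428571428571, "sum": 4845}

# avg is derived from the other two fields: sum / count.
derived_avg = stats["sum"] / stats["count"]

print(derived_avg, math.isclose(derived_avg, stats["avg"]))
```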
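  9. Extended stats: adds sum of squares, variance, standard deviation, and standard deviation bounds to the stats output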
GET /goods_index/goods_type/_search?size=0
{
  "aggs": {
    "sell_stats": {
      "extended_stats": {
        "field": "sell_price"
      }
    }
  }
}

Result:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 7,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "sell_stats": {
      "count": 7,
      "min": 398,
      "max": 980,
      "avg": 692.1428571428571,
      "sum": 4845,
      "sum_of_squares": 3565461,
      "variance": 30289.836734693898,
      "std_deviation": 174.03975619005533,
      "std_deviation_bounds": {
        "upper": 1040.2223695229677,
        "lower": 344.06334476274645
      }
    }
  }
}
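The extra fields can be re-derived from count, sum, and sum_of_squares. The sketch below assumes the population variance formula and bounds at two standard deviations around the mean (the default `sigma` of 2), which is what the reported numbers are consistent with.

```python
import math

# Inputs copied from the extended_stats response above.
count, total, sum_of_squares = 7, 4845, 3565461

mean = total / count
# Population variance: E[x^2] - (E[x])^2
variance = sum_of_squares / count - mean ** 2
std_deviation = math.sqrt(variance)

# Bounds are mean +/- sigma * std_deviation, assuming sigma = 2.
sigma = 2
upper = mean + sigma * std_deviation
lower = mean - sigma * std_deviation

print(variance, std_deviation, upper, lower)
```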
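  10. Percentiles: the values found at given percentile positions
    The values of the specified field (or script) are sorted in ascending order and the cumulative share of documents (as a percentage of all matching documents) is tracked for each value; the aggregation returns the value at each requested percentage. By default it returns the values at the [ 1, 5, 25, 50, 75, 95, 99 ] percentiles. The middle entry of the result below can be read as: 50% of the documents have sell_price <= 696, or, the other way round, documents with sell_price <= 696 make up 50% of all matching documents.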
GET /goods_index/goods_type/_search?size=0
{
   "aggs": {
    "age_percents": {
      "percentiles": {
        "field": "sell_price"
      }
    }
  }
}

Result:

{
  "took": 34,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 7,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "age_percents": {
      "values": {
        "1.0": 398,
        "5.0": 398,
        "25.0": 603.75,
        "50.0": 696,
        "75.0": 819,
        "95.0": 980,
        "99.0": 980
      }
    }
  }
}
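To make the reading of a percentile concrete, here is a small local sketch using the simple nearest-rank definition on hypothetical prices. ES itself computes percentiles approximately (t-digest by default) and interpolates, which is why a value such as 603.75 can appear in the result even if no document has that price; the sketch only illustrates the interpretation.

```python
import math

def nearest_rank_percentile(values, percent):
    """Smallest value v such that at least `percent`% of the values are <= v."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percent / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical prices, not the article's data.
prices = [100, 200, 300, 400, 500, 600, 700]

for p in (1, 50, 99):
    print(p, nearest_rank_percentile(prices, p))
```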

Specifying the percentiles to return

GET /goods_index/goods_type/_search?size=0
{
  "aggs": {
    "age_percents": {
      "percentiles": {
        "field": "sell_price",
        "percents" : [95, 99, 99.9] 
      }
    }
  }
}

Result:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 7,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "age_percents": {
      "values": {
        "95.0": 980,
        "99.0": 980,
        "99.9": 980
      }
    }
  }
}
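  11. Percentile ranks: the percentage of documents whose value is less than or equal to a given value
    Compute the share of documents whose sell_price is at most 500 and at most 800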
GET /goods_index/goods_type/_search?size=0
{
  "aggs": {
    "gge_perc_rank": {
      "percentile_ranks": {
        "field": "sell_price",
        "values": [
          500,
          800
        ]
      }
    }
  }
}

Result:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 7,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "gge_perc_rank": {
      "values": {
        "500.0": 14.417379855167873,
        "800.0": 73.64185110663985
      }
    }
  }
}
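A percentile rank is the inverse question of a percentile: given a value, what share of documents is at or below it? The sketch below computes the exact share on hypothetical prices. The ES figures above are interpolated estimates, so they do not have to be exact multiples of 1/7.

```python
def percentile_rank(values, threshold):
    """Percentage of values that are less than or equal to `threshold`."""
    at_or_below = sum(1 for v in values if v <= threshold)
    return 100.0 * at_or_below / len(values)

# Hypothetical prices, not the article's data.
prices = [100, 200, 300, 400, 500, 600, 700]

for threshold in (500, 800):
    print(threshold, percentile_rank(prices, threshold))
```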

I. Count the products under each tag
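The original request for this step is not preserved in the text, so the following is only a sketch of how such a count is typically expressed: a `terms` bucket aggregation on the `tags` field. The index and type `ecommerce/product` and the field name are taken from the sample document later in this article; the aggregation name `group_by_tags` is hypothetical.

```python
import json

def terms_by(field, name):
    """Build a terms bucket aggregation: one bucket per distinct value."""
    return {name: {"terms": {"field": field}}}

# Each bucket in the response carries a doc_count for its tag.
body = {"size": 0, "aggs": terms_by("tags", "group_by_tags")}

print("GET /ecommerce/product/_search")
print(json.dumps(body, indent=2))
```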


Aggregating on a text field requires its fielddata property to be set to true
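The original mapping request is not preserved either. A sketch of what the mapping update might look like, assuming the typed mapping endpoint of ES 5.x/6.x and the `tags` field of the `ecommerce/product` documents; enabling fielddata loads the field's terms into heap memory, so aggregating on a keyword field is often preferred where one exists.

```python
import json

# Mapping update that enables fielddata on the text field "tags".
mapping = {
    "properties": {
        "tags": {
            "type": "text",
            "fielddata": True,  # serialized as JSON true
        }
    }
}

print("PUT /ecommerce/_mapping/product")
print(json.dumps(mapping, indent=2))
```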



II. For products whose name contains yagao, count the products under each tag
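Since the original request is not preserved, here is a hedged sketch of one way to express it: a `match` query on `name` narrows the document set, and the `terms` aggregation then runs only over the matching documents. The aggregation name is hypothetical.

```python
import json

body = {
    "size": 0,
    # The query narrows the documents that the aggregation sees.
    "query": {"match": {"name": "yagao"}},
    # One bucket per tag among the matching products.
    "aggs": {"group_by_tags": {"terms": {"field": "tags"}}},
}

print("GET /ecommerce/product/_search")
print(json.dumps(body, indent=2))
```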


III. Group first, then average each group: compute the average price of the products under each tag
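A sketch of how this can be written, given that the original request is not preserved: a metric aggregation nested inside a bucket aggregation is computed once per bucket. The aggregation names `group_by_tags` and `avg_price` are hypothetical; `tags` and `price` are fields of the sample document.

```python
import json

body = {
    "size": 0,
    "aggs": {
        "group_by_tags": {
            "terms": {"field": "tags"},
            # Sub-aggregation: evaluated separately inside every tag bucket.
            "aggs": {"avg_price": {"avg": {"field": "price"}}},
        }
    },
}

print("GET /ecommerce/product/_search")
print(json.dumps(body, indent=2))
```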


IV. Compute the average price of the products under each tag, sorted by average price in descending order
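The sorting can be sketched as follows (the original request is not preserved): the `order` option of the `terms` aggregation refers to a sub-aggregation by name, so the buckets come back sorted by that metric. The names `group_by_tags` and `avg_price` are hypothetical.

```python
import json

body = {
    "size": 0,
    "aggs": {
        "group_by_tags": {
            "terms": {
                "field": "tags",
                # Sort buckets by the value of the sub-aggregation below.
                "order": {"avg_price": "desc"},
            },
            "aggs": {"avg_price": {"avg": {"field": "price"}}},
        }
    },
}

print("GET /ecommerce/product/_search")
print(json.dumps(body, indent=2))
```

The key under `order` must match the name of the sub-aggregation exactly.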


V. Group by specified price ranges, then group by tag within each range, and finally compute the average price of each group

Add one more product to make the analysis easier

PUT ecommerce/product/4
{
    "name": "shiwang yagao",
    "desc": "gaoxiao meibai fangzhu",
    "price": 30,
    "producer": "shiwang producer",
    "tags": [
     "meibai",
     "fangzhu"
    ]
}
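The original request for this step is not preserved, so the following sketch shows one possible nesting: a `range` aggregation on `price`, a `terms` aggregation on `tags` inside each range, and an `avg` inside each tag bucket. The range boundaries and aggregation names are hypothetical; in a range aggregation `from` is inclusive and `to` is exclusive.

```python
import json

# Hypothetical price ranges; "from" is inclusive, "to" is exclusive.
ranges = [{"from": 0, "to": 20}, {"from": 20, "to": 40}, {"from": 40, "to": 50}]

body = {
    "size": 0,
    "aggs": {
        "group_by_price": {
            "range": {"field": "price", "ranges": ranges},
            "aggs": {
                "group_by_tags": {
                    "terms": {"field": "tags"},
                    "aggs": {"avg_price": {"avg": {"field": "price"}}},
                }
            },
        }
    },
}

print("GET /ecommerce/product/_search")
print(json.dumps(body, indent=2))
```

With these boundaries, the product added above (price 30) would fall into the 20 to 40 bucket.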
