窗口函數(shù)也叫分析函數(shù)
窗口函數(shù)的基本用法如下:
其中,over是關(guān)鍵字,用來指定函數(shù)執(zhí)行的窗口范圍,包含三個(gè)分析子句:分組(partition by)子
句,排序(order by)子句,窗口(rows)子句,如果后面括號中什么都不寫,則意味著窗口包含滿
足where條件的所有行,窗口函數(shù)基于所有行進(jìn)行計(jì)算;如果不為空,則支持以下語法來設(shè)置窗口:
知識點(diǎn)總結(jié)
sum(...A...) over(partition by ...B... order by ...C... rows between ...D1... and ...D2...)
avg(...A...) over(partition by ...B... order by ...C... rows between ...D1... and ...D2...)
SELECT DATABASES() as 當(dāng)前數(shù)據(jù)庫名;
SELECT USER() as 用戶名;
函數(shù)名([expr]) over子句
函數(shù)() over()
函數(shù)名([expr]) over(partition by <要分列的組> order by <要排序的列> rows
between <數(shù)據(jù)范圍>)
SELECT VERSION() as 數(shù)據(jù)庫版本;A: 需要被加工的字段名稱
B: 分組的字段名稱
C: 排序的字段名稱
D: 計(jì)算的行數(shù)范圍
1.2 窗口函數(shù)應(yīng)用
一般,我們可以把窗口函數(shù)分為兩種:
專有窗口函數(shù):
rank()
dense_rank()
row_number()
聚合類窗口函數(shù):
普通場景下,聚合函數(shù)往往和group by一起使用,但是窗口環(huán)境下,聚合函數(shù)也可以應(yīng)
用進(jìn)來,那么此時(shí)它們就被稱之為聚合類窗口函數(shù),屬于窗口函數(shù)的一種
sum()
count()
avg()
max()
min()
窗口函數(shù)(專有窗口函數(shù)+聚合類窗口函數(shù))和普通場景下的聚合函數(shù)也很容易混淆,二者區(qū)別
如下:
普通場景下的聚合函數(shù)是將多條記錄聚合為一條(多到一);窗口函數(shù)是每條記錄都會(huì)執(zhí)
行,有幾條記錄執(zhí)行完還是幾條(多到多)。
分組(partition by):記錄按照字段進(jìn)行分組,窗口函數(shù)在不同的分組上分別執(zhí)行。
rows between 2 preceding and current row # 取當(dāng)前行和前面兩行
rows between unbounded preceding and current row # 包括本行和之前所有的行
rows between current row and unbounded following # 包括本行和之后所有的行
rows between 3 preceding and current row # 包括本行和前面三行
rows between 3 preceding and 1 following # 從前面三行和下面一行,總共五行
# 當(dāng)order by后面缺少窗口從句條件,窗口規(guī)范默認(rèn)是rows between unbounded
preceding and current row.
# 當(dāng)order by和窗口從句都缺失, 窗口規(guī)范默認(rèn)是 rows between unbounded preceding
and unbounded following
我們可以把窗口函數(shù)分為兩種:
專有窗口函數(shù):
rank()
dense_rank()
row_number()
聚合類窗口函數(shù):
普通場景下,聚合函數(shù)往往和group by一起使用,但是窗口環(huán)境下,聚合函數(shù)也可以應(yīng)
用進(jìn)來,那么此時(shí)它們就被稱之為聚合類窗口函數(shù),屬于窗口函數(shù)的一種
sum()
count()
avg()
max()
min()
窗口函數(shù)(專有窗口函數(shù)+聚合類窗口函數(shù))和普通場景下的聚合函數(shù)也很容易混淆,二者區(qū)別
如下:
普通場景下的聚合函數(shù)是將多條記錄聚合為一條(多到一);窗口函數(shù)是每條記錄都會(huì)執(zhí)
行,有幾條記錄執(zhí)行完還是幾條(多到多)。
分組(partition by):記錄按照字段進(jìn)行分組,窗口函數(shù)在不同的分組上分別執(zhí)行。
rows between 2 preceding and current row # 取當(dāng)前行和前面兩行
rows between unbounded preceding and current row # 包括本行和之前所有的行
rows between current row and unbounded following # 包括本行和之后所有的行
rows between 3 preceding and current row # 包括本行和前面三行
rows between 3 preceding and 1 following # 從前面三行和下面一行,總共五行
# 當(dāng)order by后面缺少窗口從句條件,窗口規(guī)范默認(rèn)是rows between unbounded
preceding and current row.
# 當(dāng)order by和窗口從句都缺失, 窗口規(guī)范默認(rèn)是 rows between unbounded preceding
and unbounded following列名 釋義
user_name 用戶名
piece 購買數(shù)量
price 價(jià)格
pay_amount 支付金額
goods_category 商品品類
pay_time 支付日期
排序(order by):按照哪些字段進(jìn)行排序,窗口函數(shù)將按照排序后的記錄順序進(jìn)行編號,可
以和partition子句配合使用,也可以單獨(dú)使用。如果沒有partition子句,數(shù)據(jù)范圍則是整
個(gè)表的數(shù)據(jù)行。
窗口(rows):就是進(jìn)行函數(shù)分析時(shí)要處理的數(shù)據(jù)范圍,屬于當(dāng)前分區(qū)的一個(gè)子集,通常用來
作為滑動(dòng)窗口使用。比如要根據(jù)每個(gè)訂單動(dòng)態(tài)計(jì)算包括本訂單和按時(shí)間順序前后兩個(gè)訂單
的移動(dòng)平均支付金額,則可以設(shè)置rows子句來創(chuàng)建滑動(dòng)窗口(rows)。
舉例:
先創(chuàng)建數(shù)據(jù)庫:
CREATE DATABASE ZXR_2 CHARACTER SET utf8;
先創(chuàng)建表格:
CREATE TABLE user_trade (
user_name VARCHAR ( 20 ),
piece INT,
price DOUBLE,
pay_amount DOUBLE,
goods_category VARCHAR ( 20 ),
pay_time date
);
-- 需求1: 查詢出2019年每月的支付總額和當(dāng)年累積支付總額
-- step1 過濾出2019年數(shù)據(jù)
SELECT
*
FROM
user_trade
WHERE
YEAR ( pay_time )= 2019;
-- step2 在1的基礎(chǔ)上,按照月份進(jìn)行g(shù)roup by 分組,統(tǒng)計(jì)每個(gè)月份的支付總額
SELECT
MONTH(pay_time ),SUM(pay_amount)
FROM
user_trade
WHERE
YEAR ( pay_time )= 2019
GROUP BY MONTH(pay_time );
-- step3 在2的基礎(chǔ)上應(yīng)用窗口函數(shù)實(shí)現(xiàn)需求
select
a.month '月份',a.pay_amount '金額',sum(a.pay_amount) over (ORDER BY a.MONTH) '累計(jì)金額'
FROM
(SELECT
MONTH(pay_time ) month ,SUM(pay_amount) pay_amount
FROM
user_trade
WHERE
YEAR ( pay_time )= 2019
GROUP BY MONTH(pay_time ))a;
-- 需求2:查詢出2018-2019年每月的支付總額和當(dāng)年累積支付總額
SELECT
b.YEAR,
b.MONTH,
b.pay_amount,
sum( b.pay_amount ) over (
PARTITION BY
b.YEAR
ORDER BY
b.MONTH )
FROM? ? ------子句
(
SELECT YEAR
( pay_time ) YEAR,
MONTH ( pay_time ) MONTH,
SUM( pay_amount ) pay_amount
FROM
user_trade
WHERE
YEAR ( pay_time ) IN ( 2018, 2019 )
GROUP BY
YEAR ( pay_time ),
MONTH ( pay_time ))b;
-- 需求3: 查詢出2019年每個(gè)月的近三月移動(dòng)平均支付金額
SELECT
b.MONTH,
b.pay_amount,
AVG( b.pay_amount ) over (
ORDER BY
b.MONTH rows BETWEEN 2 preceding and current row)
FROM
(
SELECT
MONTH ( pay_time ) MONTH,
SUM( pay_amount ) pay_amount
FROM
user_trade
WHERE
YEAR ( pay_time )= 2019
GROUP BY
MONTH ( pay_time ))b;
-- 需求4: 查詢出每四個(gè)月的最大月總支付金額
SELECT
b.MONTH,
b.pay_amount,
max( b.pay_amount ) over (
ORDER BY
b.MONTH rows BETWEEN 3 preceding and current row)
FROM
(
SELECT
substr(pay_time,1,7) month,
SUM( pay_amount ) pay_amount
FROM
user_trade
WHERE
YEAR ( pay_time ) in (2018,2019)
GROUP BY
substr(pay_time,1,7))b;