MYSQL經(jīng)典SQL之時(shí)長統(tǒng)計(jì)

概述

假設(shè)有一張表,記錄了用戶的登入、登出信息,我們需要根據(jù)這些信息統(tǒng)計(jì)用戶的在線時(shí)長,相應(yīng)的設(shè)置用戶等級(jí),就像掛QQ升級(jí)

數(shù)據(jù)如下:

mysql> select * from user_sessions;
+----+----------+-------+----------+----------+
| id | platform | usr   | start    | end      |
+----+----------+-------+----------+----------+
|  1 | web      | user1 | 08:30:00 | 10:30:00 |
|  2 | web      | user2 | 08:30:00 | 08:45:00 |
|  3 | web      | user1 | 09:00:00 | 09:30:00 |
|  4 | web      | user2 | 09:15:00 | 10:30:00 |
|  5 | web      | user1 | 09:15:00 | 09:30:00 |
|  6 | web      | user2 | 10:30:00 | 14:30:00 |
|  7 | web      | user1 | 10:45:00 | 11:30:00 |
|  8 | web      | user2 | 11:00:00 | 12:30:00 |
|  9 | app      | user1 | 08:30:00 | 08:45:00 |
| 10 | app      | user2 | 09:00:00 | 09:30:00 |
| 11 | app      | user1 | 11:45:00 | 12:00:00 |
| 12 | app      | user2 | 12:30:00 | 14:00:00 |
| 13 | app      | user1 | 12:45:00 | 13:30:00 |
| 14 | app      | user2 | 13:00:00 | 14:00:00 |
| 15 | app      | user1 | 14:00:00 | 16:30:00 |
| 16 | app      | user2 | 15:30:00 | 17:00:00 |
+----+----------+-------+----------+----------+

我們的站點(diǎn)有兩個(gè)入口:web端和app端

很明顯,這些數(shù)據(jù)時(shí)間上是有重疊的,例如user1在08:30~10:30這個(gè)時(shí)間段內(nèi)用三個(gè)web端和一個(gè)app端登錄,但他的時(shí)長應(yīng)該只能算作2個(gè)小時(shí)

要正確統(tǒng)計(jì),我們需要三步:

  1. 獲取用戶的重疊時(shí)間段中最早的起始時(shí)間
  2. 獲取用戶的重疊時(shí)間段中最晚的結(jié)束時(shí)間
  3. 前兩步的起始時(shí)間與結(jié)束時(shí)間合并,得到綜合的時(shí)間段,再進(jìn)行統(tǒng)計(jì)

第一步:

查詢用戶名、起始時(shí)間
其中起始時(shí)間需要滿足的條件是:不包含于其它時(shí)間段,即不存在一條記錄來滿足條件“s>start且s<=end”
得到的結(jié)果可能存在重復(fù),所以用distinct去重
這樣找到的s就是用戶重疊時(shí)間段的最早時(shí)間。如下:

mysql> select distinct usr,start from user_sessions as a where not exists(select * from user_sessions as b where a.usr=b.usr and a.start>b.start and a.start<=b.end);
+-------+----------+
| usr   | start    |
+-------+----------+
| user1 | 08:30:00 |
| user2 | 08:30:00 |
| user1 | 10:45:00 |
| user2 | 09:00:00 |
| user1 | 11:45:00 |
| user1 | 12:45:00 |
| user1 | 14:00:00 |
| user2 | 15:30:00 |
+-------+----------+

第二步:

查詢用戶名、結(jié)束時(shí)間
這步于第一步一樣道理,結(jié)束時(shí)間滿足的條件是:不包含于其它時(shí)間段,即不存在一條記錄滿足“e>=start且e<end”
結(jié)果如下:

mysql> select distinct usr,end from user_sessions as a where not exists(select * from user_sessions as b where a.usr=b.usr and a.end>=b.start and a.end<b.end);
+-------+----------+
| usr   | end      |
+-------+----------+
| user1 | 10:30:00 |
| user2 | 08:45:00 |
| user2 | 14:30:00 |
| user1 | 11:30:00 |
| user1 | 12:00:00 |
| user1 | 13:30:00 |
| user1 | 16:30:00 |
| user2 | 17:00:00 |
+-------+----------+

第三步:

合并前兩步的結(jié)果
為了方便查詢和敘述,我們將前兩步建立對(duì)應(yīng)的視圖:

mysql> create view v_s as select distinct usr,start from user_sessions as a where not exists(select * from user_sessions as b where a.usr=b.usr and a.start>b.start and a.start<=b.end);

mysql> create view v_e as select distinct usr,end from user_sessions as a where not exists(select * from user_sessions as b where a.usr=b.usr and a.end>=b.start and a.end<b.end);

查看v_s和v_e兩個(gè)視圖:

mysql> select * from v_s;
+-------+----------+
| usr   | start    |
+-------+----------+
| user1 | 08:30:00 |
| user2 | 08:30:00 |
| user1 | 10:45:00 |
| user2 | 09:00:00 |
| user1 | 11:45:00 |
| user1 | 12:45:00 |
| user1 | 14:00:00 |
| user2 | 15:30:00 |
+-------+----------+

mysql> select * from v_e;
+-------+----------+
| usr   | end      |
+-------+----------+
| user1 | 10:30:00 |
| user2 | 08:45:00 |
| user2 | 14:30:00 |
| user1 | 11:30:00 |
| user1 | 12:00:00 |
| user1 | 13:30:00 |
| user1 | 16:30:00 |
| user2 | 17:00:00 |
+-------+----------+

可以看到,兩個(gè)視圖總行數(shù)是相等的,但并不是按行對(duì)應(yīng)的,所以并不是單純的將兩個(gè)結(jié)果橫向合并!

方案一:

對(duì)于v_s中的起始時(shí)間,我們應(yīng)該在v_e中找“usr相同且end>start”的最小值
例如,對(duì)于v_s中user1起始時(shí)間為08:30:00,我們應(yīng)該找到v_e中end>08:30:00且usr=user1的記錄,然后取min得到結(jié)果為10:30:00作為與之對(duì)應(yīng)的結(jié)束時(shí)間
SQL及結(jié)果如下:

mysql> select distinct v_s.usr,v_s.start,(select min(end) from v_e where v_e.end>v_s.start and v_e.usr=v_s.usr) as end from v_s,v_e where v_s.usr=v_e.usr;
+-------+----------+----------+
| usr   | start    | end      |
+-------+----------+----------+
| user1 | 08:30:00 | 10:30:00 |
| user2 | 08:30:00 | 08:45:00 |
| user1 | 10:45:00 | 11:30:00 |
| user2 | 09:00:00 | 14:30:00 |
| user1 | 11:45:00 | 12:00:00 |
| user1 | 12:45:00 | 13:30:00 |
| user1 | 14:00:00 | 16:30:00 |
| user2 | 15:30:00 | 17:00:00 |
+-------+----------+----------+

在此基礎(chǔ)上,按usr進(jìn)行分組,統(tǒng)計(jì)用戶的總時(shí)長,得到最終結(jié)果:

mysql> select usr,sec_to_time(sum(timestampdiff(second,start,end))) as time from (select distinct v_s.usr,v_s.start,(select min(end) from v_e where v_e.end>v_s.start and v_e.usr=v_s.usr) as end from v_s,v_e where v_s.usr=v_e.usr) as c group by usr;
+-------+----------+
| usr   | time     |
+-------+----------+
| user1 | 06:15:00 |
| user2 | 07:15:00 |
+-------+----------+

其中,timestampdiff函數(shù)是計(jì)算兩個(gè)時(shí)間的差值,第一個(gè)參數(shù)用來指定其單位,這里指定為秒;
sec_to_time函數(shù)是將求和后的秒數(shù)再轉(zhuǎn)為時(shí)間格式來顯示

方案二

既然v_s與v_e的數(shù)據(jù)不是按行對(duì)應(yīng)的,那么我們可以對(duì)其排序,使其按行對(duì)應(yīng)后,就可以直接按行合并
為了按行合并,我們同時(shí)加上行號(hào),讓兩個(gè)子查詢行號(hào)匹配即可:

mysql> set @sno=0;
mysql> set @eno=0;
mysql> select s.usr,s.start,e.end from (select @sno:=@sno+1 as sno,usr,start from v_s order by usr,start) as s, (select @eno:=@eno+1 as eno,usr,end from v_e order by usr,end) as e where s.sno=e.eno;
+-------+----------+----------+
| usr   | start    | end      |
+-------+----------+----------+
| user1 | 08:30:00 | 10:30:00 |
| user1 | 10:45:00 | 11:30:00 |
| user1 | 11:45:00 | 12:00:00 |
| user1 | 12:45:00 | 13:30:00 |
| user1 | 14:00:00 | 16:30:00 |
| user2 | 08:30:00 | 08:45:00 |
| user2 | 09:00:00 | 14:30:00 |
| user2 | 15:30:00 | 17:00:00 |
+-------+----------+----------+

之后的分組統(tǒng)計(jì)與方案一相同

擴(kuò)展

實(shí)際場(chǎng)景中,我們經(jīng)常將不同的方式的在線時(shí)長加上獎(jiǎng)勵(lì)倍數(shù),來鼓勵(lì)用戶更多的使用某種入口。
例如,我們?yōu)榱俗層脩舾嗟氖褂胊pp登錄,規(guī)定app的在線時(shí)長獎(jiǎng)勵(lì)2倍,即app在線一小時(shí)相當(dāng)于web在線兩小時(shí)。
這時(shí),我們就需要將不同的登錄方式區(qū)分開來統(tǒng)計(jì)。要區(qū)分開也很簡(jiǎn)單,只需要添加SQL條件讓platform相同即可,例如第一步修改為:

mysql> select distinct usr,platform,start from user_sessions as a where not exists(select * from user_sessions as b where a.usr=b.usr and a.platform=b.platform and a.start>b.start and a.start<=b.end);
+-------+----------+----------+
| usr   | platform | start    |
+-------+----------+----------+
| user1 | web      | 08:30:00 |
| user2 | web      | 08:30:00 |
| user2 | web      | 09:15:00 |
| user1 | web      | 10:45:00 |
| user1 | app      | 08:30:00 |
| user2 | app      | 09:00:00 |
| user1 | app      | 11:45:00 |
| user2 | app      | 12:30:00 |
| user1 | app      | 12:45:00 |
| user1 | app      | 14:00:00 |
| user2 | app      | 15:30:00 |
+-------+----------+----------+

之后的步驟類似,在最終統(tǒng)計(jì)時(shí)分別統(tǒng)計(jì)web和app的時(shí)長,并將app的時(shí)長乘以2再與web時(shí)長相加即可,不再贅述。

最后編輯于
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者
【社區(qū)內(nèi)容提示】社區(qū)部分內(nèi)容疑似由AI輔助生成,瀏覽時(shí)請(qǐng)結(jié)合常識(shí)與多方信息審慎甄別。
平臺(tái)聲明:文章內(nèi)容(如有圖片或視頻亦包括在內(nèi))由作者上傳并發(fā)布,文章內(nèi)容僅代表作者本人觀點(diǎn),簡(jiǎn)書系信息發(fā)布平臺(tái),僅提供信息存儲(chǔ)服務(wù)。

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容