表結構:

方式一:
HIVESQL:
1.首先按照學生和上課時間排序(升序),并新增一列row值當作參考列
SELECT uid,`datetime`,teacher,id,ROW_NUMBER( ) OVER(order by `uid`,`datetime`) as rank from class_table結果如下:

2.利用lag(a,b,c)開窗函數(shù),共三個參數(shù),a為選中的滯后列,b為滯后的行數(shù),c為空時默認值(可以不指定)
暫定連續(xù)上4節(jié)課
SELECT uid, `datetime`, teacher, `rank`
, lag(`rank`, 3) OVER (PARTITION BY uid, teacher ORDER BY `datetime`) AS `rank2`
FROM (
SELECT uid, `datetime`, teacher, ROW_NUMBER() OVER (ORDER BY `uid`, `datetime`) AS rank
FROM class_table
) tmp
結果如下:

3.通過判斷列rank和rank2的關系即可找出相同老師學生連續(xù)上多節(jié)課的數(shù)據(jù)
SELECT *
FROM (
SELECT uid, `datetime`, teacher, `rank`
, lag(`rank`, 3) OVER (PARTITION BY uid, teacher ORDER BY `datetime`) AS `rank2`
FROM (
SELECT uid, `datetime`, teacher, ROW_NUMBER() OVER (ORDER BY `uid`, `datetime`) AS rank
FROM class_table
) tmp
) tmp2
WHERE `rank` = `rank2` + 3
調節(jié)滯后量即可查詢其他連續(xù)數(shù)量
方式二:

原理都是一樣的,pandas需要額外加一列自增列
data.sort_values(by=['uid','datetime'],inplace=True)
data['row'] = range(len(data))
data['rank'] = data.sort_values('datetime').groupby(['uid','teacher'])['row'].shift(3)
data[data['row']==data['rank']+3]
希望能幫助到大家
以上