When working with data in Pandas, you sometimes need to offset rows within each group before operating on them.
A common requirement in data analysis is to compare or combine, within each group, row n of column a with row n+1 of column b. Below is how I solve this with pandas. You could do it by manipulating the index by hand, but Pandas already provides a method for exactly this case: the shift function. Its definition is as follows:
pandas.DataFrame.shift
DataFrame.shift(self, periods=1, freq=None, axis=0, fill_value=None)
Shift index by desired number of periods with an optional time freq.
When freq is not passed, shift the index without realigning the data. If freq is passed (in this case, the index must be date or datetime, or it will raise a NotImplementedError), the index will be increased using the periods and the freq.
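To make the definition above concrete, here is a minimal sketch of how periods, fill_value, and freq behave. The frames and their values are invented purely for illustration.

```python
import pandas as pd

# Toy frame, invented for illustration only
df = pd.DataFrame({"x": [10, 20, 30, 40]})

# periods=1 moves values down one row; the first row has no predecessor and becomes NaN
down = df.shift(1)

# periods=-1 moves values up one row; the last row becomes NaN
up = df.shift(-1)

# fill_value replaces the missing value introduced by the shift
filled = df.shift(1, fill_value=0)

# With freq, the data stay in place and the datetime index itself is moved
ts = pd.DataFrame(
    {"x": [1, 2, 3]},
    index=pd.date_range("2020-01-01", periods=3, freq="D"),
)
moved = ts.shift(1, freq="D")

print(down, up, filled, moved, sep="\n\n")
```

Without freq, the index is unchanged and only the values move, which is why a plain shift leaves NaN at the edge.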
For example, suppose we are analyzing a vehicle's driving records and need to compare each position with the point before it and the point after it. The following code does that:
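# shift works on row order, so df1 should already be sorted in travel order within each CARID
# shift(1) takes the previous row's value in the same group; shift(-1) takes the next row's
df1['x_pre'] = df1.groupby('CARID')['x'].shift(1)
df1['x_next'] = df1.groupby('CARID')['x'].shift(-1)
df1['y_pre'] = df1.groupby('CARID')['y'].shift(1)
df1['y_next'] = df1.groupby('CARID')['y'].shift(-1)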
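As a self-contained sketch of the same idea, the example below builds a small df1 and uses the shifted columns to compute the distance from each point to the next one for the same car. The sample values, the time column t, and the dist_to_next column are assumptions added for illustration. Only CARID, x, and y come from the snippet above.

```python
import numpy as np
import pandas as pd

# Hypothetical driving records: two cars, a few points each
df1 = pd.DataFrame({
    "CARID": ["A", "A", "A", "B", "B"],
    "t": [1, 2, 3, 1, 2],            # assumed timestamp column used for ordering
    "x": [0.0, 3.0, 6.0, 10.0, 10.0],
    "y": [0.0, 4.0, 8.0, 0.0, 5.0],
})

# shift is positional, so order the rows by time within each car first
df1 = df1.sort_values(["CARID", "t"]).reset_index(drop=True)

# Next point of the same car; the last point of each car gets NaN
df1["x_next"] = df1.groupby("CARID")["x"].shift(-1)
df1["y_next"] = df1.groupby("CARID")["y"].shift(-1)

# Straight-line distance from each point to the following one
df1["dist_to_next"] = np.hypot(df1["x_next"] - df1["x"], df1["y_next"] - df1["y"])

print(df1)
```

Because the shift is applied per group, the last point of car A is not paired with the first point of car B. Its x_next and y_next are NaN instead.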