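Python data analysis: SQL-like usage in Python
pandas offers a rich and efficient set of functions for data processing. Once the data has been cleaned and tidied, what we have is still just a raw DataFrame. Python can merge data the way a SQL join or Excel's VLOOKUP does, pivot and aggregate data like an Excel pivot table or SQL GROUP BY, and filter data like Excel's Find feature or a SQL WHERE clause.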

SQL WHERE-like filtering in Python (or Excel Find)

Python equivalents of the WHERE conditions col = a, col <> a, col = a and col2 = b, col = a or col = b, col in (a, b, c), col not in (a, b, c):

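Syntax	Description
df['col']=='Female'	Boolean mask that is True where column col equals 'Female' (SQL `=`)
df['col']!=11	Boolean mask that is True where column col is not equal to 11 (SQL `<>`)
df[df['col']=='Female']	Rows of df where col equals 'Female', with all columns returned
df[(df['col']=='Female')&(df['col2']>0)]	Rows where col equals 'Female' and col2 is greater than 0, with all columns returned (SQL `and`)
df['col'].between(a,b)	Boolean mask for values between a and b inclusive; with a=2, b=8 it marks values from 2 to 8
df[(df['col']>=10)|(df['col2']<50)]	Rows where col is greater than or equal to 10 or col2 is less than 50, with all columns returned (SQL `or`)
df[df['col'].isin([21.01, 23.68, 24.59])]	Rows where col matches any of the listed values, with all columns returned (SQL `in`)
df[~df['col'].isin([11,63])]	Rows where col matches none of the listed values, with all columns returned (SQL `not in`)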

Code example

import pandas as pd
data={'a':[1,2,3,4,3,2,6],
      'b':[43,23,52,23,11,63,83],
      'c':['true','fales','true','true','fales','fales','true']}
data = pd.DataFrame(data)  # create a DataFrame
data
Out[33]: 
   a   b      c
0  1  43   true
1  2  23  fales
2  3  52   true
3  4  23   true
4  3  11  fales
5  2  63  fales
6  6  83   true

# Rows where column b is greater than or equal to 30 (all columns returned)
data[data['b']>=30] 
Out[34]: 
   a   b      c
0  1  43   true
2  3  52   true
5  2  63  fales
6  6  83   true
# Rows where column b is greater than or equal to 30 and column a is less than 5
data[(data['b']>=30)&(data['a']<5)]
Out[35]: 
   a   b      c
0  1  43   true
2  3  52   true
5  2  63  fales
# Rows where column b is neither 11 nor 63; without the `~` the filter keeps the rows where b is 11 or 63
data[~data['b'].isin([11, 63])]
Out[36]: 
   a   b      c
0  1  43   true
1  2  23  fales
2  3  52   true
3  4  23   true
6  6  83   true
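
The between, `or` and `in` rows of the table are not exercised by the example above. As a minimal sketch, assuming the same sample table, they could look like this; the threshold values (20, 50, 4 and so on) are arbitrary picks for illustration.

```python
import pandas as pd

# Same sample table as in the example above
data = pd.DataFrame({
    'a': [1, 2, 3, 4, 3, 2, 6],
    'b': [43, 23, 52, 23, 11, 63, 83],
    'c': ['true', 'fales', 'true', 'true', 'fales', 'fales', 'true'],
})

# WHERE b BETWEEN 20 AND 50 (both ends are inclusive by default)
in_range = data[data['b'].between(20, 50)]       # rows 0, 1, 3

# WHERE a >= 4 OR b < 20 (each condition needs its own parentheses)
either = data[(data['a'] >= 4) | (data['b'] < 20)]  # rows 3, 4, 6

# WHERE b IN (23, 83)
listed = data[data['b'].isin([23, 83])]          # rows 1, 3, 6

print(in_range)
print(either)
print(listed)
```

Note that `&`, `|` and `~` are used instead of the Python keywords `and`, `or` and `not`, because the comparison has to be applied element by element.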

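SQL GROUP BY-like grouping in Python

GROUP BY is usually paired with aggregate functions such as count and avg. In pandas, groupby provides the common aggregates (sum, mean, max, min and so on), and SQL's COUNT is covered by two methods: size counts every row in a group, while count counts only the non-null values.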

Syntax	Description
df.groupby('sex').size()	Group by the single column sex and show only the number of rows per group
df.groupby('sex')['tip'].count()	Group by the single column sex and count the non-null values of tip
df.groupby('sex').count()	Group by the single column sex and show the non-null count of every other column
df.groupby('sex').agg({'tip':'max','total_bill':'sum'})	Group by sex, taking the maximum of tip and the sum of total_bill
df.groupby('tip').agg({'sex': pd.Series.nunique})	Group by tip and count the distinct sex values within each group
pd.pivot_table(df,index=col1,values=[col2,col3], aggfunc='max')	Create a pivot table grouped by column col1 that computes the maximum of col2 and col3
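
To make the difference between size and count concrete, here is a small sketch. The DataFrame is a hypothetical miniature of a tips-style table with made-up values, including one missing tip; only the column names sex, tip and total_bill come from the table above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the values are invented for illustration
df = pd.DataFrame({
    'sex': ['Female', 'Male', 'Male', 'Female', 'Male'],
    'tip': [1.0, 2.5, np.nan, 3.0, 2.5],
    'total_bill': [10.0, 20.0, 15.0, 30.0, 25.0],
})

# COUNT(*): size counts every row, including the one with a missing tip
rows_per_sex = df.groupby('sex').size()           # Female 2, Male 3

# COUNT(tip): count skips NaN
tips_per_sex = df.groupby('sex')['tip'].count()   # Female 2, Male 2

# SELECT sex, MAX(tip), SUM(total_bill) FROM df GROUP BY sex
summary = df.groupby('sex').agg({'tip': 'max', 'total_bill': 'sum'})

# COUNT(DISTINCT tip) per sex
distinct_tips = df.groupby('sex')['tip'].nunique()  # Female 2, Male 1

# The same maxima expressed as a pivot table
pivot = pd.pivot_table(df, index='sex', values=['tip', 'total_bill'], aggfunc='max')

print(summary)
print(pivot)
```

Passing the aggregate names as strings ('max', 'sum') is a choice made for this sketch; NumPy functions are also accepted by agg, though recent pandas versions may warn about them.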

A groupby object in Python also supports iteration, which is commonly used to loop over the groups of a whole df and process each one in turn.

Syntax	Description
for x in df.groupby('col'):	Loops over df grouped by col; each x is a (group key, sub-DataFrame) tuple, so x[1] selects the group's DataFrame
for x in df.groupby(['col','col2']):	Loops over df grouped by col and col2; each x is a (group key, sub-DataFrame) tuple, so x[1] selects the group's DataFrame

Code example

Direct groupby aggregation

# Group by column c and sum columns a and b
data.groupby('c').sum() 

Out[37]: 
        a    b
c             
fales   7   97
true   14  201

# Group by column c and sum only column a
data.groupby('c')['a'].sum()
Out[38]: 
c
fales     7
true     14

Iterating over a groupby with a for loop

# Split data by column c into two separate DataFrames
for x in data.groupby('c'):
    print(x[1])

Out[40]:
   a   b      c
1  2  23  fales
4  3  11  fales
5  2  63  fales
   a   b     c
0  1  43  true
2  3  52  true
3  4  23  true
6  6  83  true
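
Since each item of the iteration is a (key, DataFrame) tuple, the loop can also unpack it directly instead of indexing with x[1]. The following is a sketch under that assumption; the dictionary name groups is hypothetical and only serves to keep the pieces around after the loop.

```python
import pandas as pd

# Same sample table as in the example above
data = pd.DataFrame({
    'a': [1, 2, 3, 4, 3, 2, 6],
    'b': [43, 23, 52, 23, 11, 63, 83],
    'c': ['true', 'fales', 'true', 'true', 'fales', 'fales', 'true'],
})

# Collect one DataFrame per distinct value of column c
groups = {}
for key, group in data.groupby('c'):
    # key is the group label, group is the matching slice of data
    groups[key] = group.reset_index(drop=True)

print(groups['fales'])
print(groups['true'])
```

Resetting the index is optional; it is done here only so that each piece starts again at row 0.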

SQL JOIN-like merging in Python

Syntax	Description
pd.merge(a,b,how='left',left_on='sex',right_on='sex')	Join on the columns named by left_on and right_on; how selects a left, right, inner or outer join
pd.merge(a,b,how='left',on=['a1','b1','c1'])	on= lists several columns shared by both tables; rows are joined only when all of them match
pd.merge(a,b,left_index=True,right_index=True)	Join on the index of both tables; left_index and right_index are boolean flags

Code example

data1={'d':[7,44,1,44,31,42,3],
      'b':[43,23,52,23,11,63,83],
      'c':['true','fales','true','true','fales','fales','true']}
data1 = pd.DataFrame(data1)  # a second table named data1; data is the table created earlier
data1

Out[52]: 
    d   b      c
0   7  43   true
1  44  23  fales
2   1  52   true
3  44  23   true
4  31  11  fales
5  42  63  fales
6   3  83   true

# Inner join: keep only the rows where column a of data equals column d of data1
pd.merge(data, data1, how='inner', left_on='a', right_on='d')
Out[55]: 
   a  b_x    c_x  d  b_y   c_y
0  1   43   true  1   52  true
1  3   52   true  3   83  true
2  3   11  fales  3   83  true
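
The other rows of the join table can be tried on the same two tables. The sketch below assumes data and data1 as defined above and shows a left join, a join on two shared columns, and a join on the index; the result variable names are hypothetical.

```python
import pandas as pd

data = pd.DataFrame({
    'a': [1, 2, 3, 4, 3, 2, 6],
    'b': [43, 23, 52, 23, 11, 63, 83],
    'c': ['true', 'fales', 'true', 'true', 'fales', 'fales', 'true'],
})
data1 = pd.DataFrame({
    'd': [7, 44, 1, 44, 31, 42, 3],
    'b': [43, 23, 52, 23, 11, 63, 83],
    'c': ['true', 'fales', 'true', 'true', 'fales', 'fales', 'true'],
})

# LEFT JOIN: every row of data is kept; unmatched data1 columns become NaN
left = pd.merge(data, data1, how='left', left_on='a', right_on='d')

# Join on two columns at once: rows match only when both b and c are equal
both = pd.merge(data, data1, how='inner', on=['b', 'c'])

# Join on the row index of both tables
by_index = pd.merge(data, data1, left_index=True, right_index=True)

print(left)
print(both)
print(by_index)
```

Columns that exist in both tables but are not join keys get the suffixes _x and _y, as in the inner join output above.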

SQL ORDER BY-like sorting in Python

Syntax	Description
df.sort_values(['col'], ascending=False)	Sort by column col; ascending=False means `descending`
df.sort_values(['col'], ascending=True)	Sort by column col; ascending=True means `ascending`
df.sort_index(ascending=False)	Sort by the index; ascending=False means `descending`

Code example

data.sort_values(['a'], ascending=[True])  # sort by column a in ascending order

Out[56]: 
   a   b      c
0  1  43   true
1  2  23  fales
5  2  63  fales
2  3  52   true
4  3  11  fales
3  4  23   true
6  6  83   true
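
For completeness, a short sketch of the descending and index-based variants from the table, again on the sample table. The two-column sort is an extra illustration, not something the table lists; it corresponds to ORDER BY c ASC, b DESC.

```python
import pandas as pd

data = pd.DataFrame({
    'a': [1, 2, 3, 4, 3, 2, 6],
    'b': [43, 23, 52, 23, 11, 63, 83],
    'c': ['true', 'fales', 'true', 'true', 'fales', 'fales', 'true'],
})

# ORDER BY a DESC
by_a_desc = data.sort_values(['a'], ascending=False)

# ORDER BY c ASC, b DESC: one ascending flag per sort column
by_c_then_b = data.sort_values(['c', 'b'], ascending=[True, False])

# Sort by the row index, highest label first
by_index_desc = data.sort_index(ascending=False)

print(by_a_desc)
print(by_c_then_b)
print(by_index_desc)
```

None of these calls modifies data; each returns a new, sorted DataFrame.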

SQL DISTINCT-like deduplication in Python

Syntax	Description
df.drop_duplicates(subset=['col'], keep='first', inplace=True)	Remove duplicate rows from the DataFrame based on a given column

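Parameters

Parameter	Description
subset	The columns to apply distinct to; defaults to all columns
keep	One of {'first', 'last', False}: keep the first duplicate, keep the last duplicate, or drop all of them
inplace	Defaults to False, which returns a new DataFrame; if True, the original DataFrame is modified in place and the call returns None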

Code example

# Remove rows that repeat a value in column a, replacing the original table
data.drop_duplicates(subset=['a'], keep='first', inplace=True)
data
Out[59]: 
   a   b      c
0  1  43   true
1  2  23  fales
2  3  52   true
3  4  23   true
6  6  83   true
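
The three keep options can be compared side by side. This is a minimal sketch on a fresh copy of the sample table, without inplace, so the original stays untouched; the variable names are hypothetical.

```python
import pandas as pd

data = pd.DataFrame({
    'a': [1, 2, 3, 4, 3, 2, 6],
    'b': [43, 23, 52, 23, 11, 63, 83],
    'c': ['true', 'fales', 'true', 'true', 'fales', 'fales', 'true'],
})

# Keep the first row for each value of a
first = data.drop_duplicates(subset=['a'], keep='first')   # rows 0, 1, 2, 3, 6

# Keep the last row for each value of a
last = data.drop_duplicates(subset=['a'], keep='last')     # rows 0, 3, 4, 5, 6

# Drop every row whose value of a occurs more than once
unique_only = data.drop_duplicates(subset=['a'], keep=False)  # rows 0, 3, 6

# SELECT DISTINCT c
distinct_c = data['c'].drop_duplicates().tolist()          # ['true', 'fales']

print(first)
print(last)
print(unique_only)
print(distinct_c)
```

Because inplace is left at its default of False, data still has all seven rows after these calls.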
