Pandas Usage Summary

1. Introduction to Pandas

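pandas is a tool built on NumPy, created to solve data analysis tasks. It incorporates a large number of libraries and some standard data models, and provides the tools needed to work efficiently with large datasets. pandas offers many functions and methods for processing data quickly and conveniently. You will soon find that it is one of the main reasons Python is a powerful and efficient environment for data analysis.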

(Source: Baidu Baike)

Put simply, pandas is a toolkit for working quickly and conveniently with tables and other structured data. It makes it easy to read and process data from Excel and similar sources.

It can greatly simplify data analysis work.

2. Installation command

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple pandas

3. Importing the package

import pandas as pd

4. Creating and reading a DataFrame

Creating a table from an array

array = [1, 2, 3, 4, 5]
pd.DataFrame(array, columns=['data'])

Creating a table from a dictionary

pd.DataFrame({'data': array})
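The two constructors above can be compared directly. The following is a minimal sketch with an illustrative column name (`data`); it only shows that a list plus `columns` and a dictionary of lists produce the same single-column table.

```python
import pandas as pd

array = [1, 2, 3, 4, 5]

# Build the same single-column table two ways.
df_from_list = pd.DataFrame(array, columns=['data'])
df_from_dict = pd.DataFrame({'data': array})

# Both frames have 5 rows, 1 column, and the default 0..4 integer index.
print(df_from_list.shape)                 # (5, 1)
print(df_from_list.equals(df_from_dict))  # True
```

The dictionary form tends to scale better to several columns, since each key names a column directly.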

Reading an Excel file into a table
```
pd.read_excel('study.xlsx')
```
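read_excel parameters:
- io: the data source to read; if it is a string, it is the path to the file
- sheet_name: the worksheet name (spelled sheetname in old pandas versions such as the one documented at the link below)
- header: the row to use as column labels; defaults to 0, the first row
- For other parameters, see: https://pandas.pydata.org/pandas-docs/version/0.14.0/generated/pandas.read_excel.html

Reading a CSV file into a table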
```
pd.read_csv('study.csv')
```
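read_csv parameters:
- filepath_or_buffer: the first argument, a file path or file-like object (the counterpart of io above)
- sep: the delimiter string used to split fields; defaults to ','
- quoting: how quotes are handled, given as one of the csv.QUOTE_* constants; defaults to csv.QUOTE_MINIMAL (0)
- For other parameters, see: https://pandas.pydata.org/pandas-docs/version/0.14.0/generated/pandas.read_csv.html#pandas.read_csv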
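To make `sep` and `quoting` concrete, here is a small sketch that reads semicolon-separated text from memory instead of a file. The sample content and the names `raw` and `df_csv` are made up for illustration; `io.StringIO` simply stands in for a file on disk.

```python
import csv
import io

import pandas as pd

# In-memory stand-in for a CSV file; the content is invented for this example.
raw = 'name;note\n"Ann";"likes; semicolons"\n"Bob";"plain"\n'

# sep=';' splits fields on semicolons. With the default quoting
# (csv.QUOTE_MINIMAL), double quotes wrap a field, so the ';' inside
# the quoted note is kept as part of the value.
df_csv = pd.read_csv(io.StringIO(raw), sep=';', quoting=csv.QUOTE_MINIMAL)

print(df_csv.shape)            # (2, 2)
print(df_csv.loc[0, 'note'])   # likes; semicolons
```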

5. Saving data

df.to_excel('study.xlsx')
df.to_csv('study.csv')
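One detail worth knowing when saving: `to_csv` writes the index as an extra first column unless `index=False` is passed. The sketch below writes to an in-memory buffer rather than `study.csv` so it leaves no file behind; the frame contents are illustrative.

```python
import io

import pandas as pd

df = pd.DataFrame({'data': [1, 2, 3]})

# index=False leaves the row index out of the output.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# Reading the text back gives a frame equal to the original.
df_back = pd.read_csv(io.StringIO(csv_text))
print(df_back.equals(df))  # True
```

Without `index=False`, the file read back would contain an additional unnamed column holding the old index values.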

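6. Data processing: adding, deleting, updating, and querying

Adding data

Adding rows

# Appends a row labeled len(df) with every column set to 2. Mind how the index is set up:
# this relies on the default integer index that pandas generates when no index column is set.
df.loc[len(df)] = 2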
df_new = pd.concat([df3,df4],ignore_index=True)
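The two row-adding approaches can be tried together. This is a sketch under the assumption that both frames share the same columns and use the default integer index; `df3` and `df4` are given illustrative contents here.

```python
import pandas as pd

df3 = pd.DataFrame({'data': [1, 2]})
df4 = pd.DataFrame({'data': [3, 4]})

# Without ignore_index=True the result would keep the labels 0, 1, 0, 1.
df_new = pd.concat([df3, df4], ignore_index=True)
print(list(df_new.index))  # [0, 1, 2, 3]

# With a default integer index, len(df_new) is the next unused label,
# so this assignment appends one row; the scalar fills every column.
df_new.loc[len(df_new)] = 5
print(df_new.shape)        # (5, 1)
```

If the index is not 0..n-1 (for example after deleting rows), `len(df)` may already be in use as a label, and the assignment would overwrite that row instead of appending.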

Adding columns
```
df['second column'] = [1, 2, 3, 4, 5]
```
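A column can be assigned from a list, which needs one value per row, or computed from existing columns. The column names in this sketch are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'data': [1, 2, 3, 4, 5]})

# A list must supply exactly one value per row.
df['second column'] = [10, 20, 30, 40, 50]

# A column can also be derived from existing ones.
df['doubled'] = df['data'] * 2

print(list(df.columns))  # ['data', 'second column', 'doubled']
```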

Deleting data

df.drop(index=0)                # delete the row labeled 0
df.drop(columns='some_column')  # delete a column by name

If inplace=True is passed to drop, the change is made on the original data instead of returning a new DataFrame.
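The difference between the default behavior and `inplace=True` is easy to miss, so here is a short sketch with illustrative columns `a` and `b`.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# By default drop returns a new frame and leaves df unchanged.
without_row = df.drop(index=0)
without_col = df.drop(columns='b')
print(df.shape, without_row.shape, without_col.shape)  # (3, 2) (2, 2) (3, 1)

# With inplace=True the call returns None and modifies df directly.
result = df.drop(index=0, inplace=True)
print(result, df.shape)  # None (2, 2)
```

Because the in-place call returns `None`, writing `df = df.drop(..., inplace=True)` would discard the frame.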

In addition:

Handling missing values:

df.dropna() removes the rows or columns that contain NaN
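Whether rows or columns are removed depends on the `axis` argument. A minimal sketch with made-up data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1.0, np.nan, 3.0], 'b': [4.0, 5.0, 6.0]})

rows_kept = df.dropna()        # axis=0 (default): drop rows containing NaN
cols_kept = df.dropna(axis=1)  # axis=1: drop columns containing NaN

print(list(rows_kept.index))    # [0, 2]
print(list(cols_kept.columns))  # ['b']
```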

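Querying and modifying data

df['some_column'][1]            # the value at row label 1 of one column
df.loc[:, 'some_column']        # the whole column, selected by label
df.iloc[1, 1]                   # the value at row position 1, column position 1
df.loc[df['some_column'] == 1]  # the rows where the column equals 1

df.loc[(df['some_column'] == 1) & (...)]: this form quickly filters the rows you need, where (...) stands for a second condition. Conditions can be combined with & (and) and | (or); wrap each one in parentheses, because & and | bind more tightly than comparison operators such as ==.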
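Here is a sketch of combined conditions, plus a modification made through the same kind of mask. The columns `x` and `y` and their values are invented for the example.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 1, 2, 3], 'y': [10, 20, 30, 40]})

# Each condition is parenthesized before combining with & or |.
both = df.loc[(df['x'] == 1) & (df['y'] > 10)]
either = df.loc[(df['x'] == 3) | (df['y'] < 15)]

print(list(both.index))    # [1]
print(list(either.index))  # [0, 3]

# To modify matching rows, pass the mask and the column to one .loc call.
df.loc[df['x'] == 1, 'y'] = 0
print(list(df['y']))       # [0, 0, 30, 40]
```

Assigning through a single `.loc[mask, column]` call avoids the chained-indexing form `df[column][mask] = ...`, which may write to a temporary copy.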

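Advanced features

df.info() # shows the structure of the table (columns, dtypes, non-null counts)
df.describe() # shows summary statistics for each numeric column
df.shift(-1) # shifts the data up by one row; a positive value shifts it down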
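A common use of `shift` is comparing each row with the next one. The sketch below uses an illustrative `data` column; the names `next` and `change` are assumptions for this example.

```python
import pandas as pd

df = pd.DataFrame({'data': [1, 2, 3, 4]})

# shift(-1) moves values up one row; the last row has nothing to pull
# from, so it becomes NaN and the column is converted to float.
df['next'] = df['data'].shift(-1)
df['change'] = df['next'] - df['data']

print(df['change'].iloc[:3].tolist())  # [1.0, 1.0, 1.0]
print(df['data'].describe()['mean'])   # 2.5
```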

Grouping

import numpy as np
df.groupby(by='column_name').apply(lambda x: print(np.sum(x)))  # prints the sums for each group
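For plain per-group totals, the built-in aggregation is usually simpler than `apply` with `print`. This is a sketch with illustrative column names (`group`, `value`), not a replacement for the general `apply` pattern above.

```python
import pandas as pd

df = pd.DataFrame({'group': ['a', 'b', 'a', 'b'], 'value': [1, 2, 3, 4]})

# Select the column to aggregate, then sum within each group.
totals = df.groupby(by='group')['value'].sum()

print(totals['a'], totals['b'])  # 4 6
```

The result is a Series indexed by group key, so it can be used in later steps instead of only being printed.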

More updates to follow.
