Pandas 學(xué)習(xí)(1)

Pandas 學(xué)習(xí)

  • 基本命令

  1. 導(dǎo)包
import pandas
  1. 數(shù)據(jù)的導(dǎo)入文件
pd.read_csv(filename) # 導(dǎo)入csv格式文件中的數(shù)據(jù)
pd.read_table(filename) # 導(dǎo)入有分隔符的文本 (如TSV) 中的數(shù)據(jù)
pd.read_excel(filename) # 導(dǎo)入Excel格式文件中的數(shù)據(jù)
pd.read_sql(query, connection_object) # 導(dǎo)入SQL數(shù)據(jù)表/數(shù)據(jù)庫中的數(shù)據(jù)
pd.read_json(json_string) # 導(dǎo)入JSON格式的字符,URL地址或者文件中的數(shù)據(jù)
pd.read_html(url) # 導(dǎo)入經(jīng)過解析的URL地址中包含的數(shù)據(jù)框 (DataFrame) 數(shù)據(jù)
pd.read_clipboard() # 導(dǎo)入系統(tǒng)粘貼板里面的數(shù)據(jù)
pd.DataFrame(dict)  # 導(dǎo)入Python字典 (dict) 里面的數(shù)據(jù),其中key是數(shù)據(jù)框的表頭,value是數(shù)據(jù)框的內(nèi)容。
  1. 數(shù)據(jù)的檢查和查看

    查看數(shù)據(jù)框的前幾行

    chipo.head(n) # n為查看的大小,默認(rèn)為5
    

    查看數(shù)據(jù)框的后幾行

    chipo.tail(n)
    

    查看數(shù)據(jù)的索引,數(shù)據(jù)類型及內(nèi)存信息

    chipo.info()
    

    查看數(shù)據(jù)的行列大小

    chipo.shape[1] # shape返回的是一個(gè)元祖,第0位為行大小,第1位為列大小
    

    查看所有列名 返回值為Index(['order_id', 'quantity', 'item_name', 'choice_description',
    'item_price'],
    dtype='object')

    chipo.columns
    

    查看索引 返回值為RangeIndex(start=0, stop=4622, step=1)

    chipo.index
    

    查詢每個(gè)獨(dú)特?cái)?shù)據(jù)值出現(xiàn)次數(shù)統(tǒng)計(jì)

    chipo.item_name.value_counts() # 每個(gè)商品出現(xiàn)的次數(shù)
    
  2. 數(shù)據(jù)的分類

    • groupby
    drinks.groupby('continent').beer_servings.mean() # 將酒銷售量按照大洲分類,求每個(gè)大洲銷售的平均值
    
  3. 合并

    • concat()
    all_data_col = pd.concat([data1, data2], axis = 1) # axis = 1 則是按照列的維度合并,默認(rèn)按照行的維度合并
    
    • merge()
    pd.merge(all_data, data3, on='subject_id') 
    

    常用參數(shù)參考自官網(wǎng)

    how : {‘left’, ‘right’, ‘outer’, ‘inner’}, default ‘inner’
    left: use only keys from left frame, similar to a SQL left outer join; preserve key order
    right: use only keys from right frame, similar to a SQL right outer join; preserve key order
    outer: use union of keys from both frames, similar to a SQL full outer join; sort keys lexicographically
    inner: use intersection of keys from both frames, similar to a SQL inner join; preserve the order of the left keys
    on : (label or list)
    Field names to join on. Must be found in both DataFrames. If on is None and not merging on indexes, then it merges on the intersection of the columns by default.
    left_on : (label or list, or array-like)
    Field names to join on in left DataFrame. Can be a vector or list of vectors of the length of the DataFrame to use a particular vector as the join key instead of columns
    right_on : (label or list, or array-like)
    Field names to join on in right DataFrame or vector/list of vectors per left_on docs
    left_index : (boolean, default False)
    Use the index from the left DataFrame as the join key(s). If it is a MultiIndex, the number of keys in the other DataFrame (either the index or a number of columns) must match the number of levels
    right_index : (boolean, default False)
    Use the index from the right DataFrame as the join key. Same caveats as left_index
    sort : (boolean, default False)
    Sort the join keys lexicographically in the result DataFrame. If False, the order of the join keys depends on the join type (how keyword)

  4. Apply 聚合函數(shù)

    apply 是 pandas 庫的一個(gè)很重要的函數(shù),多和 groupby 函數(shù)一起用,也可以直接用于 DataFrame 和 Series 對(duì)象。主要用于數(shù)據(jù)聚合運(yùn)算,可以很方便的對(duì)分組進(jìn)行現(xiàn)有的運(yùn)算和自定義的運(yùn)算。

    def fix_century(x):
        year = x.year - 100 if x.year > 1989 else x.year
        return datetime.date(year, x.month, x.day)
    
    data['Yr_Mo_Dy'] = data['Yr_Mo_Dy'].apply(fix_century)  # 數(shù)據(jù)中Yr_Mo_Dy 按照fix_century進(jìn)行替換
    
  5. resample 重新采樣函數(shù)
    Pandas中的resample,重新采樣,是對(duì)原樣本重新處理的一個(gè)方法,是一個(gè)對(duì)常規(guī)時(shí)間序列數(shù)據(jù)重新采樣和頻率轉(zhuǎn)換的便捷的方法。
    常用參數(shù)

    rule : 偏移量(隔多長時(shí)間取一次值)
    例如:'30s','3T','BM'

最后編輯于
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者
【社區(qū)內(nèi)容提示】社區(qū)部分內(nèi)容疑似由AI輔助生成,瀏覽時(shí)請(qǐng)結(jié)合常識(shí)與多方信息審慎甄別。
平臺(tái)聲明:文章內(nèi)容(如有圖片或視頻亦包括在內(nèi))由作者上傳并發(fā)布,文章內(nèi)容僅代表作者本人觀點(diǎn),簡書系信息發(fā)布平臺(tái),僅提供信息存儲(chǔ)服務(wù)。

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容