An In-Depth Summary of Resetting the Index in Pandas

Today we'll look at the `reset_index()` method in Pandas: why we need to reset the index of a DataFrame, and how to apply the method.

In this article we use a sample of the Animal Shelter Analytics dataset from Kaggle as our test data.

## What Is `reset_index()` in Pandas?

If we read a csv file with the Pandas `read_csv()` method without specifying an index, the resulting DataFrame gets the default integer-based index, which starts at 0 for the first row and increases by 1 for each subsequent row:

```Python
import pandas as pd

df = pd.read_csv('Austin_Animal_Center_Intakes.csv').head()
df
```

Output:

```Text
   Animal ID  Name     DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
0  A786884    *Brock   01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
1  A706918    Belle    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
2  A724273    Runster  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
3  A665644    NaN      10/21/2013 07:59:00 AM  10/21/2013 07:59:00 AM  Austin (TX)                          Stray        Sick              Cat          Intact Female    4 weeks          Domestic Shorthair Mix                 Calico
4  A682524    Rio      06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

In some cases we may want more meaningful row labels, so we pick one of the DataFrame's columns as its index. We can do this directly with the `index_col` parameter of `read_csv()`:

```Python
df = pd.read_csv('Austin_Animal_Center_Intakes.csv', index_col='Animal ID').head()
df
```

Output:

```Text
Animal ID  Name     DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
A786884    *Brock   01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
A706918    Belle    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
A724273    Runster  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
A665644    NaN      10/21/2013 07:59:00 AM  10/21/2013 07:59:00 AM  Austin (TX)                          Stray        Sick              Cat          Intact Female    4 weeks          Domestic Shorthair Mix                 Calico
A682524    Rio      06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

Alternatively, we can use the `set_index()` method to turn any column of the DataFrame into its index:

```Python
df = pd.read_csv('Austin_Animal_Center_Intakes.csv').head()
df.set_index('Animal ID', inplace=True)
df
```

Output:

```Text
Animal ID  Name     DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
A786884    *Brock   01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
A706918    Belle    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
A724273    Runster  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
A665644    NaN      10/21/2013 07:59:00 AM  10/21/2013 07:59:00 AM  Austin (TX)                          Stray        Sick              Cat          Intact Female    4 weeks          Domestic Shorthair Mix                 Calico
A682524    Rio      06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

What if at some point we need to restore the default numeric index? That is where the `reset_index()` method comes in:

```Python
df.reset_index()
```

Output:

```Text
   Animal ID  Name     DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
0  A786884    *Brock   01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
1  A706918    Belle    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
2  A724273    Runster  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
3  A665644    NaN      10/21/2013 07:59:00 AM  10/21/2013 07:59:00 AM  Austin (TX)                          Stray        Sick              Cat          Intact Female    4 weeks          Domestic Shorthair Mix                 Calico
4  A682524    Rio      06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

By default, the method replaces the existing DataFrame index with the default integer-based one and converts the old index into a new column that carries the old index's name (or the name `index` if the old index had no name). Also by default, `reset_index()` removes all levels of a MultiIndex, and it does not modify the original DataFrame but returns a new one.

## When to Use the `reset_index()` Method

The `reset_index()` method resets the DataFrame index to the default numeric index and is particularly useful in the following cases:

- During data wrangling, especially preprocessing steps such as filtering data or dropping missing values, which leave a smaller DataFrame whose numeric index is no longer continuous
- When the index should be treated as an ordinary DataFrame column
- When the index labels carry no valuable information about the data

## How to Adjust the `reset_index()` Method

So far we have seen how `reset_index()` works when we pass it no arguments. If needed, we can change this default behavior by adjusting the method's parameters.

Let's look at the three most useful ones: `level`, `drop`, and `inplace`.

### level

This parameter accepts an integer, a string, a tuple, or a list, and applies only to DataFrames with a MultiIndex, such as this one:

```Python
df_multiindex = pd.read_csv('Austin_Animal_Center_Intakes.csv', index_col=['Animal ID', 'Name']).head()
df_multiindex
```

Output:

```Text
Animal ID  Name     DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
A786884    *Brock   01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
A706918    Belle    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
A724273    Runster  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
A665644    NaN      10/21/2013 07:59:00 AM  10/21/2013 07:59:00 AM  Austin (TX)                          Stray        Sick              Cat          Intact Female    4 weeks          Domestic Shorthair Mix                 Calico
A682524    Rio      06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

If we now inspect the index of this DataFrame, we find that it is not an ordinary DataFrame index but a MultiIndex object:

```Python
df_multiindex.index
```

Output:

```Text
MultiIndex([('A786884',  '*Brock'),
            ('A706918',   'Belle'),
            ('A724273', 'Runster'),
            ('A665644',       nan),
            ('A682524',     'Rio')],
           names=['Animal ID', 'Name'])
```

With the default value of the `level` parameter (`level=None`), `reset_index()` removes all levels of the MultiIndex:

```Python
df_multiindex.reset_index()
```

Output:

```Text
   Animal ID  Name     DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
0  A786884    *Brock   01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
1  A706918    Belle    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
2  A724273    Runster  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
3  A665644    NaN      10/21/2013 07:59:00 AM  10/21/2013 07:59:00 AM  Austin (TX)                          Stray        Sick              Cat          Intact Female    4 weeks          Domestic Shorthair Mix                 Calico
4  A682524    Rio      06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

Both index levels of the DataFrame were converted into ordinary DataFrame columns, and the index was reset to the default integer-based one.

If we instead pass a value for `level` explicitly, the method removes only the selected levels from the DataFrame index and returns them as ordinary DataFrame columns (unless we use the `drop` parameter to discard this information from the DataFrame entirely). Compare the following operations:

```Python
df_multiindex.reset_index(level='Animal ID')
```

Output:

```Text
Name     Animal ID  DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
*Brock   A786884    01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
Belle    A706918    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
Runster  A724273    04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
NaN      A665644    10/21/2013 07:59:00 AM  10/21/2013 07:59:00 AM  Austin (TX)                          Stray        Sick              Cat          Intact Female    4 weeks          Domestic Shorthair Mix                 Calico
Rio      A682524    06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

Animal ID was originally one of the DataFrame's index levels. After we set the `level` parameter, it was removed from the index and inserted into the DataFrame as an ordinary column called Animal ID.

```Python
df_multiindex.reset_index(level='Name')
```

Output:

```Text
Animal ID  Name     DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
A786884    *Brock   01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
A706918    Belle    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
A724273    Runster  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
A665644    NaN      10/21/2013 07:59:00 AM  10/21/2013 07:59:00 AM  Austin (TX)                          Stray        Sick              Cat          Intact Female    4 weeks          Domestic Shorthair Mix                 Calico
A682524    Rio      06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

Here, Name was originally one of the DataFrame's index levels. After we set the `level` parameter, it became an ordinary column called Name, and Animal ID remained the only index.

### drop

This parameter determines whether the old index is kept as an ordinary DataFrame column after the reset or removed from the DataFrame entirely. By default (`drop=False`) it is kept, as we saw in all the previous examples. If we do not want to keep the old index as a column, we can discard it completely during the reset (`drop=True`):

```Python
df
```

Output:

```Text
Animal ID  Name     DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
A786884    *Brock   01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
A706918    Belle    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
A724273    Runster  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
A665644    NaN      10/21/2013 07:59:00 AM  10/21/2013 07:59:00 AM  Austin (TX)                          Stray        Sick              Cat          Intact Female    4 weeks          Domestic Shorthair Mix                 Calico
A682524    Rio      06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

Now with the `drop` parameter:

```Python
df.reset_index(drop=True)
```

Output:

```Text
   Name     DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
0  *Brock   01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
1  Belle    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
2  Runster  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
3  NaN      10/21/2013 07:59:00 AM  10/21/2013 07:59:00 AM  Austin (TX)                          Stray        Sick              Cat          Intact Female    4 weeks          Domestic Shorthair Mix                 Calico
4  Rio      06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

In the DataFrame above, the information held in the old index has been removed from the DataFrame entirely.

The `drop` parameter also works on DataFrames with a MultiIndex, like the one we created earlier:

```Python
df_multiindex
```

Output:

```Text
Animal ID  Name     DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
A786884    *Brock   01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
A706918    Belle    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
A724273    Runster  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
A665644    NaN      10/21/2013 07:59:00 AM  10/21/2013 07:59:00 AM  Austin (TX)                          Stray        Sick              Cat          Intact Female    4 weeks          Domestic Shorthair Mix                 Calico
A682524    Rio      06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

Now with the `drop` parameter:

```Python
df_multiindex.reset_index(drop=True)
```

Output:

```Text
   DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
0  01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
1  07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
2  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
3  10/21/2013 07:59:00 AM  10/21/2013 07:59:00 AM  Austin (TX)                          Stray        Sick              Cat          Intact Female    4 weeks          Domestic Shorthair Mix                 Calico
4  06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

Both old index levels have been removed from the DataFrame entirely, and the index has been reset to the default.

We can also combine the `drop` and `level` parameters to specify which old index levels to remove from the DataFrame entirely:

```Python
df_multiindex.reset_index(level='Animal ID', drop=True)
```

Output:

```Text
Name     DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
*Brock   01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
Belle    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
Runster  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
NaN      10/21/2013 07:59:00 AM  10/21/2013 07:59:00 AM  Austin (TX)                          Stray        Sick              Cat          Intact Female    4 weeks          Domestic Shorthair Mix                 Calico
Rio      06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

The old index level Animal ID has been removed both from the index and from the DataFrame itself, while the other level, Name, is kept as the DataFrame's current index.

### inplace

This parameter determines whether the original DataFrame is modified directly or a new DataFrame object is created. By default (`inplace=False`), the method creates a new DataFrame with the new index and leaves the original DataFrame unchanged. Let's run `reset_index()` with the default parameters again and compare the result with the original DataFrame:

```Python
df.reset_index()
```

Output:

```Text
   Animal ID  Name     DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
0  A786884    *Brock   01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
1  A706918    Belle    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
2  A724273    Runster  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
3  A665644    NaN      10/21/2013 07:59:00 AM  10/21/2013 07:59:00 AM  Austin (TX)                          Stray        Sick              Cat          Intact Female    4 weeks          Domestic Shorthair Mix                 Calico
4  A682524    Rio      06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

```Python
df
```

Output:

```Text
Animal ID  Name     DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
A786884    *Brock   01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
A706918    Belle    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
A724273    Runster  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
A665644    NaN      10/21/2013 07:59:00 AM  10/21/2013 07:59:00 AM  Austin (TX)                          Stray        Sick              Cat          Intact Female    4 weeks          Domestic Shorthair Mix                 Calico
A682524    Rio      06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

Even though the first snippet reset the index to the default numeric one, the original DataFrame remains unchanged.

If we want the original variable to hold the result of `reset_index()`, we can either reassign it directly (`df = df.reset_index()`) or pass `inplace=True` to the method:

```Python
df.reset_index(inplace=True)
df
```

Output:

```Text
   Animal ID  Name     DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
0  A786884    *Brock   01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
1  A706918    Belle    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
2  A724273    Runster  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
3  A665644    NaN      10/21/2013 07:59:00 AM  10/21/2013 07:59:00 AM  Austin (TX)                          Stray        Sick              Cat          Intact Female    4 weeks          Domestic Shorthair Mix                 Calico
4  A682524    Rio      06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

This time the change has been applied directly to the original DataFrame.

## Worked Example: Resetting the Index After Dropping Missing Values

Let's put everything discussed so far into practice and see how resetting the index helps after we drop missing values from a DataFrame.

First, let's restore the very first DataFrame we created, the one with the default numeric index:

```Python
df = pd.read_csv('Austin_Animal_Center_Intakes.csv').head()
df
```

Output:

```Text
   Animal ID  Name     DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
0  A786884    *Brock   01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
1  A706918    Belle    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
2  A724273    Runster  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
3  A665644    NaN      10/21/2013 07:59:00 AM  10/21/2013 07:59:00 AM  Austin (TX)                          Stray        Sick              Cat          Intact Female    4 weeks          Domestic Shorthair Mix                 Calico
4  A682524    Rio      06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

The DataFrame contains one missing value. Let's use the `dropna()` method to remove the entire row that contains it:

```Python
df.dropna(inplace=True)
df
```

Output:

```Text
   Animal ID  Name     DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
0  A786884    *Brock   01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
1  A706918    Belle    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
2  A724273    Runster  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
4  A682524    Rio      06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

The row has been removed from the DataFrame, but the index is no longer continuous: 0, 1, 2, 4. Let's reset it:

```Python
df.reset_index()
```

Output:

```Text
   index  Animal ID  Name     DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
0  0      A786884    *Brock   01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
1  1      A706918    Belle    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
2  2      A724273    Runster  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
3  4      A682524    Rio      06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

The index is continuous now, but because we did not pass the `drop` parameter explicitly, the old index was converted into a column with the default name `index`. Let's remove the old index from the DataFrame entirely:

```Python
df.reset_index(drop=True)
```

Output:

```Text
   Animal ID  Name     DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
0  A786884    *Brock   01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
1  A706918    Belle    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
2  A724273    Runster  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
3  A682524    Rio      06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

Now the meaningless old index is gone and the current index is continuous. The last step is to save these changes to our original DataFrame with the `inplace` parameter:

```Python
df.reset_index(drop=True, inplace=True)
df
```

Output:

```Text
   Animal ID  Name     DateTime                MonthYear               Found Location                       Intake Type  Intake Condition  Animal Type  Sex upon Intake  Age upon Intake  Breed                                  Color
0  A786884    *Brock   01/03/2019 04:19:00 PM  01/03/2019 04:19:00 PM  2501 Magin Meadow Dr in Austin (TX)  Stray        Normal            Dog          Neutered Male    2 years          Beagle Mix                             Tricolor
1  A706918    Belle    07/05/2015 12:59:00 PM  07/05/2015 12:59:00 PM  9409 Bluegrass Dr in Austin (TX)     Stray        Normal            Dog          Spayed Female    8 years          English Springer Spaniel               White/Liver
2  A724273    Runster  04/14/2016 06:43:00 PM  04/14/2016 06:43:00 PM  2818 Palomino Trail in Austin (TX)   Stray        Normal            Dog          Intact Male      11 months        Basenji Mix                            Sable/White
3  A682524    Rio      06/29/2014 10:38:00 AM  06/29/2014 10:38:00 AM  800 Grove Blvd in Austin (TX)        Stray        Normal            Dog          Neutered Male    4 years          Doberman Pinsch/Australian Cattle Dog  Tan/Gray
```

## Summary

Today we covered the `reset_index()` method from several angles:

- The default behavior of `reset_index()`
- How to restore the default numeric index of a DataFrame
- When to use `reset_index()`
- The method's most important parameters
- How to work with a MultiIndex
- How to remove the old index from a DataFrame entirely
- How to save the changes directly to the original DataFrame

Finally, we worked through a complete practical example of resetting a DataFrame's index after dropping missing values.

That's all for today. If you enjoyed it, give it a like!

This article was published to multiple platforms via [mdnice](https://mdnice.com/?platform=6).
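For readers who want to try the steps without downloading the dataset, here is a minimal sketch of the same workflow. It assumes a small hand-built frame: `animals` and the other variable names are hypothetical, and only three of the original columns are included, so this is a stand-in for the CSV rather than the real data.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of the intake data (not the real CSV).
animals = pd.DataFrame(
    {
        "Animal ID": ["A786884", "A706918", "A665644", "A682524"],
        "Name": ["*Brock", "Belle", None, "Rio"],
        "Animal Type": ["Dog", "Dog", "Cat", "Dog"],
    }
)

# Dropping the row with a missing name leaves a gap in the integer index.
cleaned = animals.dropna()
gappy_labels = cleaned.index.tolist()

# drop=True discards the old labels instead of keeping them as an "index" column.
renumbered = cleaned.reset_index(drop=True)

# With a MultiIndex, `level` moves only the chosen level back into the columns.
by_id_and_name = animals.set_index(["Animal ID", "Name"])
name_only_index = by_id_and_name.reset_index(level="Animal ID")

print(gappy_labels)                      # [0, 1, 3]
print(renumbered.index.tolist())         # [0, 1, 2]
print(name_only_index.index.name)        # Name
print(name_only_index.columns.tolist())  # ['Animal ID', 'Animal Type']
```

The sketch reassigns results to new variables instead of using `inplace=True`, which keeps `animals` untouched so each step can be inspected on its own.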