
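Numpy is the core package for data analysis and scientific computing in Python. This is part 2 of the numpy tutorial. In this part, I cover in detail the advanced numpy features that matter most for data analysis and manipulation.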


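1. Getting the index positions that satisfy a given condition with np.where

In the previous article you saw how to extract the items of an array that satisfy a given condition. Boolean indexing, remember?
But sometimes we want the index positions of the items that satisfy a condition, so that we can do whatever we like with them. np.where locates the positions in the array where a given condition holds true.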

# Create an array
import numpy as np
arr_rand = np.array([8, 8, 3, 7, 7, 0, 4, 2, 5, 2])
print("Array: ", arr_rand)

# Positions where value > 5
index_gt5 = np.where(arr_rand > 5)
print("Positions where value > 5: ", index_gt5)


#> Array:  [8 8 3 7 7 0 4 2 5 2]
#> Positions where value > 5:  (array([0, 1, 3, 4]),)

Once you have the positions, you can extract the items with the array's take method.

# Take items at given index
arr_rand.take(index_gt5)

#> array([[8, 8, 7, 7]])
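A small aside on the shape of that result: with only a condition, np.where returns a tuple holding one index array per dimension, which is why take produced a nested array above. The sketch below illustrates this; the 2D array arr2 is made up for the example and is not part of the original tutorial.

```python
import numpy as np

arr_rand = np.array([8, 8, 3, 7, 7, 0, 4, 2, 5, 2])

# Indexing with the tuple itself gives a flat 1D result
flat = arr_rand[np.where(arr_rand > 5)]      # [8 8 7 7]

# Hypothetical 2D array: np.where returns (row_indices, col_indices)
arr2 = np.array([[1, 9, 3],
                 [7, 2, 8]])
rows, cols = np.where(arr2 > 5)
values = arr2[rows, cols]                    # items at those positions

# np.argwhere returns the same positions as (row, col) pairs
coords = np.argwhere(arr2 > 5)
print(flat, rows, cols, values, coords.tolist())
```

Indexing with the tuple (or with index_gt5[0]) is often more convenient than take when a 1D result is wanted.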

Thankfully, np.where also accepts two more optional arguments, x and y. Wherever the condition is true, 'x' is yielded, otherwise 'y'.
Below, I create an array that holds the string 'gt5' wherever the condition is true and 'le5' otherwise.

# If value > 5, then yield 'gt5' else 'le5'
np.where(arr_rand > 5, 'gt5', 'le5')
#> array(['gt5', 'gt5', 'le5', 'gt5', 'gt5', 'le5', 'le5', 'le5', 'le5', 'le5'],dtype='<U3')
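The x and y arguments do not have to be constants; they can be arrays that are broadcast against the condition. As a minimal sketch (the variable name capped is only illustrative), the following keeps values of 5 or less and replaces everything larger with 5.

```python
import numpy as np

arr_rand = np.array([8, 8, 3, 7, 7, 0, 4, 2, 5, 2])

# Where the condition holds use 5, elsewhere keep the original value
capped = np.where(arr_rand > 5, 5, arr_rand)
print(capped)
```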

We can also find the locations of the maximum and minimum values.

# Location of the max
print('Position of max value: ', np.argmax(arr_rand))  

# Location of the min
print('Position of min value: ', np.argmin(arr_rand))  

#> Position of max value:  0
#> Position of min value:  5

Great!
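Two details are easy to miss here: np.argmax reports only the first occurrence when the maximum is repeated, and on a multi-dimensional array it returns a position in the flattened array. The sketch below shows both; the 2D array m is a made-up example.

```python
import numpy as np

arr_rand = np.array([8, 8, 3, 7, 7, 0, 4, 2, 5, 2])
first_max = np.argmax(arr_rand)                      # first of the two 8s
all_max = np.where(arr_rand == arr_rand.max())[0]    # every position of the max

# Hypothetical 2D array
m = np.array([[3, 7, 1],
              [9, 2, 9]])
flat_pos = np.argmax(m)                          # index into the flattened array
row_col = np.unravel_index(flat_pos, m.shape)    # convert to (row, col)
per_row = np.argmax(m, axis=1)                   # position of the max in each row
print(first_max, all_max, flat_pos, row_col, per_row)
```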

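2. Importing and exporting data as csv files

The standard way to import datasets is the np.genfromtxt function. It can import datasets from web URLs and handle missing values, multiple delimiters, an irregular number of columns, and so on.
A less commonly used alternative is np.loadtxt, which assumes the dataset has no missing values.
As an example, let's read a .csv file from the URL below. Since all elements in a numpy array should have the same data type, the last column, which contains text, is imported as 'nan' by default.
By setting the filling_values argument you can replace the missing values with something else.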

# Turn off scientific notation
np.set_printoptions(suppress=True)  

# Import data from csv file url
path = 'https://raw.githubusercontent.com/selva86/datasets/master/Auto.csv'
data = np.genfromtxt(path, delimiter=',', skip_header=1, filling_values=-999, dtype='float')
data[:3]  # see first 3 rows


#> array([[   18. ,     8. ,   307. ,   130. ,  3504. ,    12. ,    70. ,
#>             1. ,  -999. ],
#>        [   15. ,     8. ,   350. ,   165. ,  3693. ,    11.5,    70. ,
#>             1. ,  -999. ],
#>        [   18. ,     8. ,   318. ,   150. ,  3436. ,    11. ,    70. ,
#>             1. ,  -999. ]])

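That looks neat. But did you notice that every value in the last column is the same, '-999'?
That happened because, as mentioned above, I set dtype='float'. The last column in the file contains text values, and since all values in a numpy array must share the same dtype, np.genfromtxt did not know how to convert them to floats.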
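The same behaviour can be reproduced without a network connection. This is a minimal sketch using a small inline CSV (csv_text is invented for illustration) read through io.StringIO; it assumes the default, lenient parsing of np.genfromtxt, where both empty fields and text that cannot be parsed as a float fall back to the filling value.

```python
import io
import numpy as np

# Hypothetical data: one missing number and a text column
csv_text = "a,b,name\n1.5,2,foo\n,4,bar\n"

data = np.genfromtxt(io.StringIO(csv_text), delimiter=',', skip_header=1,
                     filling_values=-999, dtype='float')
print(data)

# Alternatively, read only the numeric columns and keep NaN for gaps
numeric_only = np.genfromtxt(io.StringIO(csv_text), delimiter=',',
                             skip_header=1, usecols=(0, 1))
print(numeric_only)
```

Selecting columns with usecols avoids filling a whole text column with a placeholder when that column is not needed.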

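2.1 Handling datasets that have both numeric and text columns

In such a case, if you must keep the text column as it is instead of replacing it with a placeholder, you can set dtype to object or None.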

# data2 = np.genfromtxt(path, delimiter=',', skip_header=1, dtype='object')
data2 = np.genfromtxt(path, delimiter=',', skip_header=1, dtype=None)
data2[:3]  # see first 3 rows


#> array([( 18., 8,  307., 130, 3504,  12. , 70, 1, b'"chevrolet chevelle malibu"'),
#>        ( 15., 8,  350., 165, 3693,  11.5, 70, 1, b'"buick skylark 320"'),
#>        ( 18., 8,  318., 150, 3436,  11. , 70, 1, b'"plymouth satellite"')],
#> dtype=[('f0', '<f8'), ('f1', '<i8'), ('f2', '<f8'), ('f3', '<i8'), ('f4', '<i8'), ('f5', '<f8'), ('f6', '<i8'), ('f7', '<i8'), ('f8', 'S38')])

Great!
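With dtype=None the result is a structured array, so columns are accessed by field name rather than by position. The sketch below is one way to work with it, assuming a small inline CSV (csv_text is made up) and passing names=True so that field names come from the header row; encoding='utf-8' is set explicitly so that text comes back as str rather than bytes.

```python
import io
import numpy as np

# Hypothetical three-column dataset with a header row
csv_text = "mpg,cylinders,name\n18.0,8,chevrolet\n15.0,8,buick\n"

rec = np.genfromtxt(io.StringIO(csv_text), delimiter=',', names=True,
                    dtype=None, encoding='utf-8')

print(rec.dtype.names)     # field names taken from the header
print(rec['mpg'])          # one column, as a regular float array
print(rec['name'][0])      # text values are preserved
```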
Finally, np.savetxt lets you export an array as a csv file.

# Save the array as a csv file
np.savetxt("out.csv", data, delimiter=",")
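As a rough illustration of a full round trip, the sketch below writes an array with a header line and a fixed number format, then reads it back with np.loadtxt. The array, the column names x and y, and the temporary file location are all assumptions made for the example.

```python
import os
import tempfile
import numpy as np

arr = np.array([[1.5, 2.0],
                [3.25, 4.0]])
path = os.path.join(tempfile.mkdtemp(), 'out.csv')

# comments='' stops savetxt from prefixing the header with '# '
np.savetxt(path, arr, delimiter=',', fmt='%.2f', header='x,y', comments='')

# Skip the header row when loading the numbers back
loaded = np.loadtxt(path, delimiter=',', skiprows=1)
print(loaded)
```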

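3. Saving and loading numpy objects

At some point we want to save large transformed numpy arrays to disk and load them straight back into the console without re-running the data transformation code.
Numpy provides the .npy and .npz file types for this purpose.
To store a single ndarray object, save it as a .npy file with np.save. It can be loaded back with np.load.
To store more than one ndarray object in a single file, save them as a .npz file with np.savez.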

# arr2d, arr2d_f and arr2d_b are assumed to be arrays defined earlier (e.g. in part 1)
# Save a single numpy array object as a .npy file
np.save('myarray.npy', arr2d)  

# Save multiple numpy arrays as a .npz file
np.savez('array.npz', arr2d_f, arr2d_b)

Load the .npy file.

# Load a .npy file
a = np.load('myarray.npy')
print(a)

#> [[0 1 2]
#>  [3 4 5]
#>  [6 7 8]]

Load the .npz file.

# Load a .npz file
b = np.load('array.npz')
print(b.files)
b['arr_0']

#> ['arr_0', 'arr_1']

#> array([[ 0.,  1.,  2.],
#>        [ 3.,  4.,  5.],
#>        [ 6.,  7.,  8.]])
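The automatic names arr_0 and arr_1 come from passing the arrays positionally. A minimal sketch, assuming two small made-up arrays and a temporary file, shows that passing them as keyword arguments to np.savez stores them under names of your choice.

```python
import os
import tempfile
import numpy as np

# Hypothetical arrays to store
ints = np.arange(9).reshape(3, 3)
floats = ints.astype('float')

path = os.path.join(tempfile.mkdtemp(), 'named.npz')
np.savez(path, ints=ints, floats=floats)   # keyword names become the keys

# The loaded object can be used as a context manager to close the file
with np.load(path) as npz:
    names = sorted(npz.files)
    restored = npz['floats']
print(names)
print(restored)
```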

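4. Concatenating two numpy arrays column-wise and row-wise

There are 3 different ways of concatenating two or more numpy arrays.

Method 1: np.concatenate, setting the axis parameter to 0 or 1
Method 2: np.vstack and np.hstack
Method 3: np.r_ and np.c_

All three methods give the same output. One key difference to notice is that, unlike the other two methods, np.r_ and np.c_ use square brackets to stack arrays. But first, let me create the arrays to be concatenated.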

a = np.zeros([4, 4])
b = np.ones([4, 4])
print(a)
print(b)

#> [[ 0.  0.  0.  0.]
#>  [ 0.  0.  0.  0.]
#>  [ 0.  0.  0.  0.]
#>  [ 0.  0.  0.  0.]]

#> [[ 1.  1.  1.  1.]
#>  [ 1.  1.  1.  1.]
#>  [ 1.  1.  1.  1.]
#>  [ 1.  1.  1.  1.]]

Let's stack the arrays vertically.

# Vertical Stack Equivalents (Row wise)
np.concatenate([a, b], axis=0)
np.vstack([a,b])
np.r_[a,b]

#> array([[ 0.,  0.,  0.,  0.],
#>        [ 0.,  0.,  0.,  0.],
#>        [ 0.,  0.,  0.,  0.],
#>        [ 0.,  0.,  0.,  0.],
#>        [ 1.,  1.,  1.,  1.],
#>        [ 1.,  1.,  1.,  1.],
#>        [ 1.,  1.,  1.,  1.],
#>        [ 1.,  1.,  1.,  1.]])

That is what we wanted. Now let's stack them horizontally (column-wise).

# Horizontal Stack Equivalents (Column wise)
np.concatenate([a, b], axis=1) 
np.hstack([a,b])  
np.c_[a,b]

#> array([[ 0.,  0.,  0.,  0.,  1.,  1.,  1.,  1.],
#>        [ 0.,  0.,  0.,  0.,  1.,  1.,  1.,  1.],
#>        [ 0.,  0.,  0.,  0.,  1.,  1.,  1.,  1.],
#>        [ 0.,  0.,  0.,  0.,  1.,  1.,  1.,  1.]])
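The three approaches agree for 2D inputs like a and b above, but they behave differently on 1D inputs, which is worth checking before relying on them interchangeably. The sketch below compares the resulting shapes for two made-up 1D arrays, a1 and b1.

```python
import numpy as np

a1 = np.array([1, 2, 3])
b1 = np.array([4, 5, 6])

shapes = {
    'concatenate': np.concatenate([a1, b1]).shape,  # joins end to end
    'r_': np.r_[a1, b1].shape,                      # joins end to end
    'hstack': np.hstack([a1, b1]).shape,            # joins end to end
    'vstack': np.vstack([a1, b1]).shape,            # each input becomes a row
    'c_': np.c_[a1, b1].shape,                      # each input becomes a column
}
for name, shape in shapes.items():
    print(name, shape)
```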

Besides, you can use np.r_ to create more complex number sequences in a 1d array.

np.r_[[1,2,3], 0, 0, [4,5,6]]

#> array([1, 2, 3, 0, 0, 4, 5, 6])
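np.r_ also understands slice notation inside the brackets, which can be handy for building such sequences. A short sketch: a plain slice behaves like np.arange, and a slice with an imaginary step behaves like np.linspace with that many points.

```python
import numpy as np

# start:stop works like np.arange; scalars are placed in between
seq = np.r_[0:5, 10, 20:23]

# An imaginary "step" means "number of points", endpoints included
grid = np.r_[0:1:5j]
print(seq)
print(grid)
```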

5. Sorting a numpy array based on one or more columns

Let's try to sort a 2d array based on its first column.

arr = np.random.randint(1,6, size=[8, 4])

arr
#> array([[3, 3, 2, 1],
#>        [1, 5, 4, 5],
#>        [3, 1, 4, 2],
#>        [3, 4, 5, 5],
#>        [2, 4, 5, 5],
#>        [4, 4, 4, 2],
#>        [2, 4, 1, 3],
#>        [2, 2, 4, 3]])

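We have a random array with 8 rows and 4 columns.
If you use np.sort with axis=0, every column is sorted in ascending order independently of the others, which destroys the integrity of the rows. In other words, the values in each row get mixed up with values from other rows.

# Sort each column of arr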
np.sort(arr, axis=0)

#> array([[1, 1, 1, 1],
#>        [2, 2, 2, 2],
#>        [2, 3, 4, 2],
#>        [2, 4, 4, 3],
#>        [3, 4, 4, 3],
#>        [3, 4, 4, 5],
#>        [3, 4, 5, 5],
#>        [4, 5, 5, 5]])

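Since I don't want the contents of the rows to be disturbed, I sort indirectly using np.argsort.

5.1 Sorting a numpy array based on 1 column using argsort

Let's first understand what np.argsort does. np.argsort returns the index positions that would sort a given 1d array.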

# Get the index positions that would sort the array
x = np.array([1, 10, 5, 2, 8, 9])
sort_index = np.argsort(x)
print(sort_index)

#> [0 3 2 4 5 1]

How to interpret this? In array x, item 0 is the smallest, item 3 is the second smallest, and so on.

x[sort_index]
#> array([ 1,  2,  5,  8,  9, 10])

Now, to sort the arr array from above, I will argsort the first column and use the resulting index positions to sort arr. See the code.

# Argsort the first column
sorted_index_1stcol = arr[:, 0].argsort()

# Sort 'arr' by first column without disturbing the integrity of rows
arr[sorted_index_1stcol]
#> array([[1, 5, 4, 5],
#>        [2, 4, 5, 5],
#>        [2, 4, 1, 3],
#>        [2, 2, 4, 3],
#>        [3, 3, 2, 1],
#>        [3, 1, 4, 2],
#>        [3, 4, 5, 5],
#>        [4, 4, 4, 2]])

To sort it in descending order, simply reverse the argsorted index.

# Descending sort
arr[sorted_index_1stcol[::-1]]
#> array([[4, 4, 4, 2],
#>        [3, 4, 5, 5],
#>        [3, 1, 4, 2],
#>        [3, 3, 2, 1],
#>        [2, 2, 4, 3],
#>        [2, 4, 1, 3],
#>        [2, 4, 5, 5],
#>        [1, 5, 4, 5]])
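One caveat with reversing the index: rows that tie on the sort column also come out in reversed order. If the original order among ties matters, one possible alternative is a stable argsort on the negated column. The sketch below uses a small made-up column and assumes signed numeric data, since negation does not work for unsigned integers or strings.

```python
import numpy as np

col = np.array([3, 1, 3, 2])   # hypothetical sort column with a tie

asc = np.argsort(col, kind='stable')
reversed_desc = asc[::-1]                      # ties appear in reverse order
stable_desc = np.argsort(-col, kind='stable')  # ties keep their original order
print(reversed_desc, stable_desc)
```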

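5.2 How to sort a numpy array based on 2 or more columns?

You can do this with np.lexsort by passing a tuple of the columns the array should be sorted by. Just remember to place the column to be sorted first at the rightmost position inside the tuple.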

# Sort by column 0, then by column 1
lexsorted_index = np.lexsort((arr[:, 1], arr[:, 0])) 
arr[lexsorted_index]

#> array([[1, 5, 4, 5],
#>        [2, 2, 4, 3],
#>        [2, 4, 5, 5],
#>        [2, 4, 1, 3],
#>        [3, 1, 4, 2],
#>        [3, 3, 2, 1],
#>        [3, 4, 5, 5],
#>        [4, 4, 4, 2]])
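np.lexsort always sorts in ascending order, so mixing directions needs a small workaround. As a sketch under the assumption that the columns are signed numbers, negating a key reverses its direction; the array small below is made up for the example.

```python
import numpy as np

small = np.array([[2, 4],
                  [1, 5],
                  [2, 2],
                  [1, 9]])

# Column 0 ascending (primary key, rightmost), column 1 descending
idx = np.lexsort((-small[:, 1], small[:, 0]))
result = small[idx]
print(result)
```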

6. Working with dates

Numpy implements dates through the np.datetime64 object, which supports a precision down to nanoseconds. You can create one from a date string in the standard YYYY-MM-DD format.

# Create a datetime64 object
date64 = np.datetime64('2018-02-04 23:10:10')
date64
#> numpy.datetime64('2018-02-04T23:10:10')

Of course, you can pass hours, minutes and seconds, down to nanoseconds, as well. Let's remove the time component from date64.

# Drop the time part from the datetime64 object
dt64 = np.datetime64(date64, 'D')
dt64
#> numpy.datetime64('2018-02-04')

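By default, adding a plain number increases the number of days, because dt64 has day precision. If you need to add any other time unit (such as months, hours or seconds), the timedelta object is much more convenient.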

# Create the timedeltas (individual units of time)
tenminutes = np.timedelta64(10, 'm')  # 10 minutes
tenseconds = np.timedelta64(10, 's')  # 10 seconds
tennanoseconds = np.timedelta64(10, 'ns')  # 10 nanoseconds

print('Add 10 days: ', dt64 + 10)
print('Add 10 minutes: ', dt64 + tenminutes)
print('Add 10 seconds: ', dt64 + tenseconds)
print('Add 10 nanoseconds: ', dt64 + tennanoseconds)
#> Add 10 days:  2018-02-14
#> Add 10 minutes:  2018-02-04T00:10
#> Add 10 seconds:  2018-02-04T00:00:10
#> Add 10 nanoseconds:  2018-02-04T00:00:00.000000010
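Timedeltas also appear when subtracting one date from another. The following sketch, with two made-up dates, shows how dividing by a unit timedelta turns the difference into an ordinary number in that unit.

```python
import numpy as np

start = np.datetime64('2018-02-04')
end = np.datetime64('2018-03-01')

delta = end - start                      # a timedelta64 in days
days = delta / np.timedelta64(1, 'D')    # plain float number of days
hours = delta / np.timedelta64(1, 'h')   # the same span expressed in hours
print(delta, days, hours)
```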

Convert dt64 back to a string.

# Convert np.datetime64 back to a string
np.datetime_as_string(dt64)
#> '2018-02-04'
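np.datetime_as_string also takes a unit argument that controls how much of the timestamp is written out, which is another way to drop the time part. A brief sketch, using the same timestamp as the earlier date64 example:

```python
import numpy as np

ts = np.datetime64('2018-02-04T23:10:10')

full = np.datetime_as_string(ts)                  # full precision of ts
date_only = np.datetime_as_string(ts, unit='D')   # truncated to the day
to_minute = np.datetime_as_string(ts, unit='m')   # truncated to the minute
print(full, date_only, to_minute)
```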

When working with dates, you often need to filter the business days out of the data. You can use np.is_busday() to tell whether a given date is a business day.

print('Date: ', dt64)
print("Is it a business day?: ", np.is_busday(dt64))  
print("Add 2 business days, rolling forward to nearest biz day: ", np.busday_offset(dt64, 2, roll='forward'))  
print("Add 2 business days, rolling backward to nearest biz day: ", np.busday_offset(dt64, 2, roll='backward'))  
#> Date:  2018-02-04
#> Is it a business day?:  False
#> Add 2 business days, rolling forward to nearest biz day:  2018-02-07
#> Add 2 business days, rolling backward to nearest biz day:  2018-02-06
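The business-day functions treat Monday to Friday as working days unless told otherwise. The sketch below uses np.busday_count to count business days in a range and shows the holidays and weekmask arguments; the holiday date and the Monday-to-Saturday week are assumptions chosen purely for illustration.

```python
import numpy as np

begin = np.datetime64('2018-02-01')
end = np.datetime64('2018-02-10')   # the end date itself is excluded

default_count = np.busday_count(begin, end)

# Treat 2018-02-05 as a (hypothetical) holiday
with_holiday = np.busday_count(begin, end, holidays=['2018-02-05'])

# Hypothetical six-day working week: Monday to Saturday
six_day_week = np.busday_count(begin, end, weekmask='1111110')
print(default_count, with_holiday, six_day_week)
```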

6.1 Creating a sequence of dates

This can be done simply with np.arange itself.

#Create date sequence
dates = np.arange(np.datetime64('2018-02-01'), np.datetime64('2018-02-10'))
print(dates)

# Check whether each date is a business day
np.is_busday(dates)
#> ['2018-02-01' '2018-02-02' '2018-02-03' '2018-02-04' '2018-02-05'
#>  '2018-02-06' '2018-02-07' '2018-02-08' '2018-02-09']

#> array([ True,  True, False, False,  True,  True,  True,  True,  True], dtype=bool)
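Since np.is_busday returns a boolean array, it can be used directly as a mask to keep only the business days, and np.arange accepts a timedelta step for coarser sequences. A minimal sketch; the weekly range is an extra example that is not in the original text.

```python
import numpy as np

dates = np.arange(np.datetime64('2018-02-01'), np.datetime64('2018-02-10'))

# Keep only the business days using the boolean mask
biz_days = dates[np.is_busday(dates)]
print(biz_days)

# One date per week: pass a timedelta64 as the step
weekly = np.arange(np.datetime64('2018-02-01'), np.datetime64('2018-03-01'),
                   np.timedelta64(7, 'D'))
print(weekly)
```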

6.2 Converting `numpy.datetime64` to a `datetime.datetime` object

# Convert np.datetime64 to datetime.datetime
import datetime
dt = dt64.tolist()
dt
#> datetime.date(2018, 2, 4)

Once it is converted to a datetime.date object, you have many more facilities for extracting the day of month, the month of year, and so on.

print('Year: ', dt.year)  
print('Day of month: ', dt.day)
print('Month of year: ', dt.month)  
print('Day of Week: ', dt.weekday())  # Sunday
#> Year:  2018
#> Day of month:  4
#> Month of year:  2
#> Day of Week:  6
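The conversion result depends on the precision of the datetime64 value: day precision gives a datetime.date as above, second precision gives a datetime.datetime, and nanosecond precision falls back to a plain integer. The sketch below shows this, along with one possible way of extracting the year, month and day without leaving numpy by casting to coarser units; the variable names are illustrative.

```python
import datetime
import numpy as np

ts = np.datetime64('2018-02-04T23:10:10')   # second precision
py_dt = ts.item()                            # a datetime.datetime

ns_item = np.datetime64('2018-02-04T23:10:10', 'ns').item()  # an int

# Extract components by casting to coarser units (counted from 1970-01)
year = int(ts.astype('datetime64[Y]').astype(int)) + 1970
month = int(ts.astype('datetime64[M]').astype(int)) % 12 + 1
first_of_month = ts.astype('datetime64[M]').astype('datetime64[D]')
day = int((ts.astype('datetime64[D]') - first_of_month).astype(int)) + 1
print(py_dt, type(ns_item).__name__, year, month, day)
```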

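7.0 Conclusion

So far we have covered a good number of techniques for data manipulation with numpy. But there are many things you cannot do directly with numpy...
A sensible next step is to set the harder numpy functions aside for now, look at the commonly used basic operations, and get familiar with them. The "numpy 100 exercises" collection is meant for that kind of practice; you can search for it online and work through it.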
