NumPy: Miscellaneous

Understanding axis

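The dimensions of a NumPy array are called axes, and the number of axes is called the rank: a one-dimensional array has rank 1 and a two-dimensional array has rank 2. NumPy exposes this number as ndarray.ndim.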
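A minimal sketch of how axes and rank show up in practice. The array a and the variable names are arbitrary examples, not taken from the original post.

```python
import numpy as np

# A 2-D array: 2 rows (axis 0) and 3 columns (axis 1).
a = np.array([[1, 2, 3],
              [4, 5, 6]])

rank = a.ndim              # number of axes
shape = a.shape            # length of each axis

# Reducing along an axis collapses that axis.
col_sums = a.sum(axis=0)   # collapses the rows: one sum per column
row_sums = a.sum(axis=1)   # collapses the columns: one sum per row

print(rank, shape)         # 2 (2, 3)
print(col_sums)            # [5 7 9]
print(row_sums)            # [ 6 15]
```

Reading axis=0 as "the axis that disappears" tends to be less confusing than reading it as "by rows" or "by columns".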

Stack Overflow series (1): The ambiguity of the axis parameter in Python Pandas and NumPy

Selecting array elements

x = np.array([2,  4,  0,  3,  5])
# Everything except the last element
x[:-1]

[2,4,0,3]

x=np.array([[1,2,3],[4,5,6],[7,8,9]])
# 2-D array: the indices before and after the comma select rows and columns. ":" selects everything; 0:2 selects columns 0 and 1, excluding column 2.
print(x[:,0:2])
[[1 2]
 [4 5]
 [7 8]]

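If you select a single column with a plain integer index, the result becomes a one-dimensional array. Wrap the index in an extra [] to keep the original two-dimensional form.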

print(x[:,-1])
[3 6 9]
print(x[:,[-1]])
[[3]
 [6]
 [9]]
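The difference is easiest to see in the shapes. The sketch below reuses the same x. The length-1 slice -1: is an additional option not mentioned in the original, and the variable names are illustrative.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

flat = x[:, -1]       # integer index drops the column axis
kept = x[:, [-1]]     # list index keeps the column axis (returns a copy)
sliced = x[:, -1:]    # a length-1 slice also keeps the axis (returns a view)

print(flat.shape, kept.shape, sliced.shape)   # (3,) (3, 1) (3, 1)
```

The slice form shares memory with x, while the list form copies the data, which matters if you later modify the result in place.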

Sorting

NumPy tutorial: sorting, searching, and counting

Sorting is ascending by default.

import numpy

list1 = [[1,3,2], [3,5,4]]
array = numpy.array(list1)
array = numpy.sort(array, axis=1)   # sort ascending along axis 1 (within each row)
#array = numpy.sort(array, axis=0)   # sort along axis 0 (within each column)
print(array)
[[1 2 3]
 [3 4 5]]

Descending sort:

array = -numpy.sort(-array, axis=1)   # descending: negate, sort ascending, negate back
[[3 2 1]
 [5 4 3]]
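The negation trick relies on the values being signed numbers. It does not apply to strings, and it can misorder unsigned integers because negating 0 leaves it at 0. A minimal alternative sketch, not from the original post, is to sort ascending and then reverse the sorted axis with a slice:

```python
import numpy as np

array = np.array([[1, 3, 2], [3, 5, 4]])

# Sort ascending along axis 1, then reverse that axis with a step of -1.
descending = np.sort(array, axis=1)[:, ::-1]

print(descending)
# [[3 2 1]
#  [5 4 3]]
```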

References

[1] ndarray methods and attributes in NumPy

Operations, indexing, slicing

http://blog.csdn.net/liangzuojiayi/article/details/51534164

Kinds of matrix multiplication

dot product

a \cdot b = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n

import time
import numpy as np

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]
### VECTORIZED DOT PRODUCT OF VECTORS ###
tic = time.process_time()
dot = np.dot(x1,x2)
toc = time.process_time()
print("dot = " + str(dot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

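Whether np.dot computes a dot product or a matrix multiplication depends on the number of dimensions of its inputs, not on whether they are plain lists or np.array objects: for two 1-D inputs it returns the inner product, and for 2-D inputs it performs matrix multiplication. So with two 2-D arrays of shape (1, 3),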

x1 = np.array([[1,2,3]])
x2 = np.array([[1,2,3]])
np.dot(x1,x2)

an error is raised:

ValueError: shapes (1,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)
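The error comes from the shapes of the inputs. A short sketch of the two usual ways around it follows; the variable names are illustrative. Either pass 1-D arrays to get the scalar inner product, or transpose the second 2-D array so the inner dimensions line up.

```python
import numpy as np

# 1-D arrays: np.dot returns the scalar inner product.
v1 = np.array([1, 2, 3])
v2 = np.array([1, 2, 3])
inner = np.dot(v1, v2)            # 1*1 + 2*2 + 3*3 = 14

# 2-D arrays: np.dot is matrix multiplication, so the shapes must align.
r1 = np.array([[1, 2, 3]])        # shape (1, 3)
r2 = np.array([[1, 2, 3]])        # shape (1, 3)
as_matrix = np.dot(r1, r2.T)      # (1, 3) x (3, 1) -> shape (1, 1)

print(inner)                      # 14
print(as_matrix)                  # [[14]]
```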

outer product

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]
### VECTORIZED OUTER PRODUCT ###
outer = np.outer(x1,x2)

element-wise multiplication

mul = np.multiply(x1,x2)

general dot product (matrix multiplication)

W = np.random.rand(3,len(x1))
dot = np.dot(W,x1)

As this shows, dot can compute a vector dot product and can also perform matrix multiplication.
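To compare the four products side by side, here is a small sketch on short vectors whose results can be checked by hand. The inputs a, b, and W are made up for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

inner = np.dot(a, b)               # scalar: 1*4 + 2*5 + 3*6 = 32
outer = np.outer(a, b)             # shape (3, 3): outer[i, j] = a[i] * b[j]
elementwise = np.multiply(a, b)    # shape (3,): same as a * b

W = np.arange(6).reshape(2, 3)     # [[0, 1, 2], [3, 4, 5]]
matvec = np.dot(W, a)              # shape (2,): same as W @ a

print(inner)         # 32
print(elementwise)   # [ 4 10 18]
print(matvec)        # [ 8 26]
```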

Broadcasting

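Broadcasting describes how NumPy treats arrays with different shapes during arithmetic. The smaller array is "broadcast" across the larger one so that their shapes are compatible. Broadcasting provides a way to vectorize array operations so that the looping happens in C rather than in Python. It does not require needless copies of the data and is usually very efficient. Sometimes, however, broadcasting is a bad idea, because it can use memory inefficiently and slow the computation down.

NumPy operations are usually done on pairs of arrays, element by element. In the simplest case the two arrays have exactly the same shape, for example: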

>>> a = np.array([1.0, 2.0, 3.0])
>>> b = np.array([2.0, 2.0, 2.0])
>>> a * b
array([ 2.,  4.,  6.])

NumPy's broadcasting rule relaxes this constraint on array shapes. The simplest case is multiplying an array by a scalar:

>>> a = np.array([1.0, 2.0, 3.0])
>>> b = 2.0
>>> a * b
array([ 2.,  4.,  6.])

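The two results are the same. We can think of the scalar b as being stretched during the computation into an array with the same shape as a, where every element of the stretched b is a copy of the original scalar. The stretching analogy is only conceptual: NumPy does not actually make those copies, so broadcast operations are as memory- and compute-efficient as possible.

The second example is more efficient than the first because broadcasting moves less memory around during the multiplication.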
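Broadcasting also applies between arrays of different dimensionality. The sketch below uses illustrative values not taken from the original. It adds a 1-D row to a 2-D array, combines a column with a row, and uses np.broadcast_to to show that the stretched array is a view with a zero stride, not a copy.

```python
import numpy as np

# A (3, 3) array plus a (3,) array: the 1-D array is applied to every row.
m = np.array([[0, 0, 0],
              [10, 10, 10],
              [20, 20, 20]])
row = np.array([1, 2, 3])
result = m + row

# A (3, 1) column plus a (3,) row broadcasts to a (3, 3) table.
col = np.array([[0], [10], [20]])
table = col + row

# broadcast_to returns a read-only view; the stride of 0 along the new axis
# means every "row" points at the same memory.
view = np.broadcast_to(row, (3, 3))

print(result)
# [[ 1  2  3]
#  [11 12 13]
#  [21 22 23]]
print(view.strides[0])   # 0
```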

exp

A broadcast operation: exp is applied to every element of the array.

x = np.array([1,2,3])
np.exp(x)

sum

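A broadcast operation. With keepdims=True the row sums keep shape (n, 1), so the division below broadcasts across each row.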

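def softmax(x):
    # x is a 2-D array; softmax is computed independently for each row
    x_exp = np.exp(x)
    x_sum = np.sum(x_exp, axis=1, keepdims=True)   # shape (n, 1)
    s = x_exp/x_sum                                # (n, m) / (n, 1) broadcasts
    return s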
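A self-contained sketch of the same keepdims-based broadcast, with one addition that is not in the original: subtracting the row maximum before exp, a common way to avoid overflow for large inputs. The name softmax_rows is hypothetical.

```python
import numpy as np

def softmax_rows(x):
    """Row-wise softmax of a 2-D array (hypothetical helper name)."""
    # Subtracting the row maximum leaves the result unchanged
    # but keeps exp() from overflowing.
    shifted = x - np.max(x, axis=1, keepdims=True)
    x_exp = np.exp(shifted)
    # keepdims=True keeps shape (n, 1), so the division broadcasts per row.
    x_sum = np.sum(x_exp, axis=1, keepdims=True)
    return x_exp / x_sum

probs = softmax_rows(np.array([[1.0, 2.0, 3.0],
                               [1000.0, 1000.0, 1000.0]]))
print(probs.sum(axis=1))   # each row sums to 1
```

Without keepdims=True the sum would have shape (n,), and the division would either fail or broadcast along the wrong axis when the array happens to be square.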

matrix

Converting an array to a matrix

s = np.array([5,5,0,0,0,5])
np.matrix(s)

Loading data

loadtxt

numpy.loadtxt(fname, dtype=float, comments='#', delimiter=None, converters=None, skiprows=0, usecols=None, unpack=False, ndmin=0)

Parameters

fname : file, str, or pathlib.Path

File, filename, or generator to read. If the filename extension is .gz or .bz2, the file is first decompressed. Note that generators should return byte strings for Python 3k.

dtype : data-type, optional

Data-type of the resulting array; default: float. If this is a structured data-type, the resulting array will be 1-dimensional, and each row will be interpreted as an element of the array. In this case, the number of columns used must match the number of fields in the data-type.

comments : str or sequence, optional

The characters or list of characters used to indicate the start of a comment; default: ‘#’.

delimiter : str, optional

The string used to separate values. By default, this is any whitespace.

converters : dict, optional

A dictionary mapping column number to a function that will convert that column to a float. E.g., if column 0 is a date string: converters = {0: datestr2num}. Converters can also be used to provide a default value for missing data (but see also genfromtxt): converters = {3: lambda s: float(s.strip() or 0)}. Default: None.

skiprows : int, optional

Skip the first skiprows lines; default: 0.

usecols : int or sequence, optional

Which columns to read, with 0 being the first. For example, usecols = (1,4,5) will extract the 2nd, 5th and 6th columns. The default, None, results in all columns being read.

New in version 1.11.0.

Also, when a single column has to be read it is possible to use an integer instead of a tuple. E.g., usecols = 3 reads the fourth column the same way as usecols = (3,) would.

unpack : bool, optional

If True, the returned array is transposed, so that arguments may be unpacked using x, y, z = loadtxt(...). When used with a structured data-type, arrays are returned for each field. Default is False.

ndmin : int, optional

The returned array will have at least ndmin dimensions. Otherwise mono-dimensional axes will be squeezed. Legal values: 0 (default), 1 or 2.

New in version 1.6.0.

Returns

out : ndarray

Data read from the text file.
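A small sketch of how several of these parameters combine. It reads from an in-memory io.StringIO object in place of a real file, and the contents are made up for illustration.

```python
import io
import numpy as np

# In-memory stand-in for a text file (contents invented for this example).
text = io.StringIO(
    "# x,y,z\n"
    "1.0,2.0,3.0\n"
    "4.0,5.0,6.0\n"
)

# The line starting with '#' is skipped as a comment. Keep only columns
# 0 and 2, and unpack them into one array per column.
x, z = np.loadtxt(text, delimiter=",", usecols=(0, 2), unpack=True)

print(x)   # [1. 4.]
print(z)   # [3. 6.]
```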

genfromtxt

import numpy as np
nfl = np.genfromtxt("data.csv", delimiter=",")
# dtype='U75' reads every value as a Unicode string of up to 75 characters
world_alcohol = np.genfromtxt('world_alcohol.csv', dtype='U75', skip_header=1, delimiter=',')
data = np.genfromtxt('/Users/david/david/code/00project/carthage/scripts/adult.data', delimiter=', ', dtype=str)
# Take the column at index 14 (the 15th column)
labels = data[:,14]
# Take every column except the last one
data = data[:,:-1]
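The choice of dtype decides what happens to text fields. The sketch below uses invented CSV contents held in an io.StringIO object, and reads the same data once as strings and once with the default float dtype, where fields that cannot be parsed as numbers become nan.

```python
import io
import numpy as np

# Made-up CSV contents with a header row and one missing value.
text = io.StringIO(
    "year,country,litres\n"
    "1986,Vietnam,0.0\n"
    "1987,Uruguay,\n"
)

# As strings: every field is kept as text and the header row is skipped.
as_text = np.genfromtxt(text, dtype="U75", skip_header=1, delimiter=",")

text.seek(0)
# As floats (the default): the country names and the empty field become nan.
as_float = np.genfromtxt(text, skip_header=1, delimiter=",")

print(as_text[0, 1])     # Vietnam
print(as_float[:, 0])    # [1986. 1987.]
```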

Converting a matrix to an array

# Indices that order y_score from largest to smallest (mergesort argsort, reversed)
np.argsort(y_score, kind="mergesort")[::-1]

Matrix of random numbers

import numpy as np
numpy_matrix = np.random.randint(10, size=[5,2])

'''
array([[1, 0],
       [8, 4],
       [0, 5],
       [2, 9],
       [9, 9]])
'''

Getting the indices that would sort the data

import numpy as np
dd = np.matrix([4,5,1])   # np.mat was removed in NumPy 2.0; np.matrix behaves the same here
dd1 = dd.argsort()
print(dd)
print(dd1)       # [[2 0 1]]
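The same idea on a plain 1-D array, as a sketch with illustrative names: the index array returned by argsort can be used to reorder the data, and sorting the negated values gives the indices for descending order when the data is numeric.

```python
import numpy as np

values = np.array([4, 5, 1])

order = np.argsort(values)               # indices that sort ascending
ascending = values[order]                # apply them with fancy indexing
descending_order = np.argsort(-values)   # indices for descending order

print(order)              # [2 0 1]
print(ascending)          # [1 4 5]
print(descending_order)   # [1 0 2]
```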

squeeze

Removes single-dimensional entries from the shape of an array, that is, drops every axis whose length is 1.

x = np.array([[[0], [1], [2]]])  
np.squeeze(x)
array([0, 1, 2])

If the input is a (1, 1) matrix, the result is a scalar (a 0-dimensional array).

cost = np.array([[1]])
cost = np.squeeze(cost)

The result is 1, and the shape of cost becomes ().
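squeeze also accepts an axis argument when only some length-1 axes should be removed. A brief sketch with an arbitrary example array, also showing .item() to turn the 0-dimensional result into a plain Python number:

```python
import numpy as np

x = np.zeros((1, 3, 1))

all_removed = np.squeeze(x)          # shape (3,)
first_only = np.squeeze(x, axis=0)   # shape (3, 1): only axis 0 is dropped

cost = np.squeeze(np.array([[1]]))   # 0-d array, shape ()
as_python_int = cost.item()          # plain Python int

print(all_removed.shape, first_only.shape, cost.shape)   # (3,) (3, 1) ()
```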

Selecting rows and columns that match a condition

Given data such as

1,1,1,0,0,0
0,1,1,1,1,0
1,0,0,1,1,0
0,0,0,1,1,0

The first column is used as y_train and the remaining columns as x_train. We want the rows of x_train for which y_train equals 1.

pos_rows = (y_train == 1)
x_train[pos_rows,:]
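Put together as a runnable sketch using the four rows listed above. The way the data is split into y_train and x_train follows the description, and the name positives is illustrative.

```python
import numpy as np

data = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 1, 1, 1, 1, 0],
                 [1, 0, 0, 1, 1, 0],
                 [0, 0, 0, 1, 1, 0]])

y_train = data[:, 0]     # first column as labels
x_train = data[:, 1:]    # remaining columns as features

pos_rows = (y_train == 1)          # boolean mask, one entry per row
positives = x_train[pos_rows, :]   # keeps rows 0 and 2 of x_train

print(pos_rows)    # [ True False  True False]
print(positives)
# [[1 1 0 0 0]
#  [0 0 1 1 0]]
```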

Another example

vector = numpy.array([5, 10, 15, 20])
vector == 10

[False, True, False, False]

matrix = numpy.array([
    [5, 10, 15],
    [20, 25, 30],
    [35, 40, 45]
])
matrix == 25

[
    [False, False, False], 
    [False, True,  False],
    [False, False, False]
]

For example, to find the row whose second column equals 25:

matrix = np.array([
    [5, 10, 15],
    [20, 25, 30],
    [35, 40, 45]
])
second_column_25 = (matrix[:,1] == 25)
# Equivalent to print(matrix[second_column_25])
print(matrix[second_column_25, :])

[
    [20, 25, 30]
]

Comparisons with multiple conditions

vector = numpy.array([5, 10, 15, 20])
equal_to_ten_and_five = (vector == 10) & (vector == 5)

[False, False, False, False]

vector = numpy.array([5, 10, 15, 20])
equal_to_ten_or_five = (vector == 10) | (vector == 5)

[True, True, False, False]

The result of a comparison can also be used to change values.

vector = numpy.array([5, 10, 15, 20])
equal_to_ten_or_five = (vector == 10) | (vector == 5)
vector[equal_to_ten_or_five] = 50
print(vector)

Every element where the mask is True becomes 50:

[50, 50, 15, 20]
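Two related tools, shown as a sketch not taken from the original: np.where builds a new array without modifying the input, and np.isin expresses "equal to any of these values" without chaining |. Note that the parentheses around each comparison are required, because & and | bind more tightly than ==.

```python
import numpy as np

vector = np.array([5, 10, 15, 20])
mask = (vector == 10) | (vector == 5)

# np.where returns a new array and leaves `vector` unchanged.
replaced = np.where(mask, 50, vector)

# np.isin produces the same mask as the chained comparison above.
same_mask = np.isin(vector, [5, 10])

print(replaced)    # [50 50 15 20]
print(vector)      # [ 5 10 15 20]
```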
