1. Official documentation
ndarray.size
Number of elements in the array.
>>> import pandas as pd
>>> s = pd.Series({'a': 1, 'b': 2, 'c': 3})
>>> s.size
3
>>> df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
>>> df.size
4
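As a minimal sketch of what "number of elements" means for pandas objects, the snippet below checks that `size` equals the length of a Series and rows times columns for a DataFrame. The variable names are illustrative only and are not part of the official documentation.

```python
import pandas as pd

s = pd.Series({"a": 1, "b": 2, "c": 3})
df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})

# For a Series, size is simply its length.
series_size = s.size

# For a DataFrame, size is rows * columns, the same as the
# size of the underlying NumPy array.
n_rows, n_cols = df.shape
frame_size = df.size
array_size = df.to_numpy().size

print(series_size, frame_size, n_rows * n_cols, array_size)
```

Note that `size` is an attribute, not a method, so it is read without parentheses on a Series or DataFrame.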
2. size counts NaN values; count does not:
In [46]:
df = pd.DataFrame({'a':[0,0,1,2,2,2], 'b':[1,2,3,4,np.nan,4], 'c':np.random.randn(6)})
df
Out[46]:
   a    b         c
0  0    1  1.067627
1  0    2  0.554691
2  1    3  0.458084
3  2    4  0.426635
4  2  NaN -2.238091
5  2    4  1.256943
In [48]:
print(df.groupby(['a'])['b'].count())
print(df.groupby(['a'])['b'].size())
a
0    2
1    1
2    2
Name: b, dtype: int64
a
0    2
1    1
2    3
dtype: int64
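The comparison above can be reproduced with a small deterministic sketch. It reuses the same `a` and `b` columns and leaves out the random column `c`. Subtracting the two results to get the number of missing values per group is an added illustration, not something the original session does.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [0, 0, 1, 2, 2, 2],
    "b": [1, 2, 3, 4, np.nan, 4],
})

grouped_b = df.groupby("a")["b"]

counts = grouped_b.count()  # non-null values of b in each group
sizes = grouped_b.size()    # rows in each group, NaN included

# The difference is the number of NaN entries of b per group.
missing_per_group = sizes - counts

print(counts.tolist())
print(sizes.tolist())
print(missing_per_group.tolist())
```

Only group `a == 2` differs, because that is the group containing the NaN in column `b`.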
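3. Even when the data has no NA values, the result of count() is more verbose: it returns one count per column, whereas size() returns a single count per group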
In [114]:
grouped = fec_mrbo.groupby(['cand_nm',labels])
grouped.size().unstack(0)
Out[114]:
cand_nm              Obama, Barack  Romney, Mitt
contb_receipt_amt
(0, 1]                       493.0          77.0
(1, 10]                    40070.0        3681.0
(10, 100]                 372280.0       31853.0
(100, 1000]               153991.0       43357.0
(1000, 10000]              22284.0       26186.0
(10000, 100000]                2.0           1.0
(100000, 1000000]              3.0           NaN
(1000000, 10000000]            4.0           NaN
In [115]:
grouped = fec_mrbo.groupby(['cand_nm',labels])
grouped.count().unstack(0)
Out[115]:
cmte_id cand_id contbr_nm contbr_city contbr_st ... memo_cd memo_text form_tp file_num parties
cand_nm Obama, Barack Romney, Mitt Obama, Barack Romney, Mitt Obama, Barack Romney, Mitt Obama, Barack Romney, Mitt Obama, Barack Romney, Mitt ... Obama, Barack Romney, Mitt Obama, Barack Romney, Mitt Obama, Barack Romney, Mitt Obama, Barack Romney, Mitt Obama, Barack Romney, Mitt
contb_receipt_amt
(0, 1] 493.0 77.0 493.0 77.0 493.0 77.0 493.0 77.0 493.0 77.0 ... 31.0 1.0 138.0 10.0 493.0 77.0 493.0 77.0 493.0 77.0
(1, 10] 40070.0 3681.0 40070.0 3681.0 40070.0 3681.0 40070.0 3681.0 40070.0 3681.0 ... 4645.0 14.0 4781.0 53.0 40070.0 3681.0 40070.0 3681.0 40070.0 3681.0
(10, 100] 372280.0 31853.0 372280.0 31853.0 372280.0 31853.0 372276.0 31853.0 372280.0 31853.0 ... 33331.0 74.0 33789.0 236.0 372280.0 31853.0 372280.0 31853.0 372280.0 31853.0
(100, 1000] 153991.0 43357.0 153991.0 43357.0 153991.0 43357.0 153991.0 43355.0 153987.0 43357.0 ... 31674.0 347.0 31897.0 849.0 153991.0 43357.0 153991.0 43357.0 153991.0 43357.0
(1000, 10000] 22284.0 26186.0 22284.0 26186.0 22284.0 26186.0 22284.0 26185.0 22284.0 26186.0 ... 16622.0 640.0 16693.0 2217.0 22284.0 26186.0 22284.0 26186.0 22284.0 26186.0
(10000, 100000] 2.0 1.0 2.0 1.0 2.0 1.0 2.0 1.0 2.0 1.0 ... 0.0 1.0 1.0 1.0 2.0 1.0 2.0 1.0 2.0 1.0
(100000, 1000000] 3.0 NaN 3.0 NaN 3.0 NaN 3.0 NaN 3.0 NaN ... 3.0 NaN 3.0 NaN 3.0 NaN 3.0 NaN 3.0 NaN
(1000000, 10000000] 4.0 NaN 4.0 NaN 4.0 NaN 4.0 NaN 4.0 NaN ... 4.0 NaN 4.0 NaN 4.0 NaN 4.0 NaN 4.0 NaN
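The difference in output shape can be seen on a much smaller table. The sketch below uses a hypothetical four-row frame named `toy` as a stand-in for `fec_mrbo`, and groups on a plain `state` column instead of the binned `labels`. All column names and values are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for fec_mrbo; the values are made up.
toy = pd.DataFrame({
    "cand_nm": ["Obama", "Obama", "Romney", "Romney"],
    "state": ["X", "Y", "Z", "Z"],
    "amount": [5, 50, 5, 500],
    "city": ["A", None, "C", "D"],
})

grouped = toy.groupby(["cand_nm", "state"])

# size() returns a Series: one row count per group.
size_result = grouped.size()

# count() returns a DataFrame: one non-null count per remaining column.
count_result = grouped.count()

# After unstacking the candidate level, count() yields one block of
# candidate columns for every original column, so it is much wider.
size_wide = size_result.unstack(0)
count_wide = count_result.unstack(0)

print(size_wide.shape, count_wide.shape)
```

When a single count per group is enough, `size()` keeps the output compact. Selecting one column before counting, for example `grouped["amount"].count()`, also returns a Series, but it still ignores NaN values in that column.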