25 Practical Python Snippets That Even Lei Jun Would Approve Of

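At an earlier Xiaomi launch event, Lei Jun sat down for a conversation with brand ambassador Wang Yuan. He answered quite a few questions during the talk, and one of them drew wide attention.

When asked whether he writes poetry, Lei Jun replied: "I have never written poetry, but some people say the code I write is as elegant as poetry." Everyone knows Lei Jun as an outstanding entrepreneur, but few people know that he is also a very capable programmer.

The Python snippets below would earn a nod even from Lei Jun, the man whose code "reads like poetry". They are packed with practical material, so start learning!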

1. Scatter Plot

A scatter plot is the classic, fundamental chart for studying the relationship between two variables. If the data contains several groups, you may want to draw each group in a different color. In Matplotlib you can do this conveniently with plt.scatter().

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Import dataset
midwest = pd.read_csv("https://raw.githubusercontent.com/selva86/datasets/master/midwest_filter.csv")

# Prepare data: create as many colors as there are unique midwest['category'] values
categories = np.unique(midwest['category'])
colors = [plt.cm.tab10(i / float(len(categories) - 1)) for i in range(len(categories))]

# Draw one scatter layer per category
plt.figure(figsize=(16, 10), dpi=80, facecolor='w', edgecolor='k')
for i, category in enumerate(categories):
    plt.scatter('area', 'poptotal',
                data=midwest.loc[midwest.category == category, :],
                s=20, color=colors[i], label=str(category))

# Decorations
plt.gca().set(xlim=(0.0, 0.1), ylim=(0, 90000), xlabel='Area', ylabel='Population')
plt.xticks(fontsize=12)
plt.yticks(fontsize=12)
plt.title("Scatterplot of Midwest Area vs Population", fontsize=22)
plt.legend(fontsize=12)
plt.show()

2. Bubble Plot with Encircling

Sometimes you want to show a group of points inside a boundary to emphasize their importance. In this example, you select the records that should be encircled from the dataframe and pass them to the encircle() function defined in the code below.

import warnings
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.spatial import ConvexHull

warnings.simplefilter('ignore')
sns.set_style("white")

# Step 1: Prepare data
midwest = pd.read_csv("https://raw.githubusercontent.com/selva86/datasets/master/midwest_filter.csv")

# As many colors as there are unique midwest['category'] values
categories = np.unique(midwest['category'])
colors = [plt.cm.tab10(i / float(len(categories) - 1)) for i in range(len(categories))]

# Step 2: Draw scatterplot with a unique color for each category
# s='dot_size' takes the marker sizes from the 'dot_size' column of the data
fig = plt.figure(figsize=(16, 10), dpi=80, facecolor='w', edgecolor='k')
for i, category in enumerate(categories):
    plt.scatter('area', 'poptotal',
                data=midwest.loc[midwest.category == category, :],
                s='dot_size', color=colors[i], label=str(category),
                edgecolors='black', linewidths=.5)

# Step 3: Encircling
# https://stackoverflow.com/questions/44575681/how-do-i-encircle-different-data-sets-in-scatter-plot
def encircle(x, y, ax=None, **kw):
    """Draw a polygon around the convex hull of the points (x, y)."""
    if not ax:
        ax = plt.gca()
    p = np.c_[x, y]
    hull = ConvexHull(p)
    poly = plt.Polygon(p[hull.vertices, :], **kw)
    ax.add_patch(poly)

# Select data to be encircled
midwest_encircle_data = midwest.loc[midwest.state == 'IN', :]

# Draw the polygon twice: a faint gold fill, then a firebrick outline
encircle(midwest_encircle_data.area, midwest_encircle_data.poptotal, ec="k", fc="gold", alpha=0.1)
encircle(midwest_encircle_data.area, midwest_encircle_data.poptotal, ec="firebrick", fc="none", linewidth=1.5)

# Step 4: Decorations
plt.gca().set(xlim=(0.0, 0.1), ylim=(0, 90000), xlabel='Area', ylabel='Population')
plt.xticks(fontsize=12)
plt.yticks(fontsize=12)
plt.title("Bubble Plot with Encircling", fontsize=22)
plt.legend(fontsize=12)
plt.show()

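3. Scatter Plot with Linear Regression Line of Best Fit

If you want to understand how two variables change with respect to each other, the line of best fit is the way to go. The plot below shows how the line of best fit differs between the groups in the data. To disable the grouping and draw a single line of best fit for the whole dataset, remove the hue="cyl" argument from the sns.lmplot() call below.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Import data
df = pd.read_csv("https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv")
df_select = df.loc[df.cyl.isin([4, 8]), :]

# Plot (robust=True requires the statsmodels package)
sns.set_style("white")
gridobj = sns.lmplot(x="displ", y="hwy", hue="cyl", data=df_select,
                     height=7, aspect=1.6, robust=True, palette='tab10',
                     scatter_kws=dict(s=60, linewidths=.7, edgecolors='black'))

# Decorations
gridobj.set(xlim=(0.5, 7.5), ylim=(0, 50))
plt.title("Scatterplot with line of best fit grouped by number of cylinders", fontsize=20)
plt.show()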

Each regression line in its own column

Alternatively, you can show the line of best fit for each group in its own column. You do this by setting the col="cyl" argument inside sns.lmplot().

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Import data
df = pd.read_csv("https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv")
df_select = df.loc[df.cyl.isin([4, 8]), :]

# Each line in its own column
sns.set_style("white")
gridobj = sns.lmplot(x="displ", y="hwy", data=df_select,
                     height=7, robust=True, palette='Set1', col="cyl",
                     scatter_kws=dict(s=60, linewidths=.7, edgecolors='black'))

# Decorations
gridobj.set(xlim=(0.5, 7.5), ylim=(0, 50))
plt.show()

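4. Jittering with Stripplot

Often several data points have exactly the same X and Y values. As a result, the points are drawn on top of each other and hide one another. To avoid this, jitter the points slightly so that you can see each of them. This is conveniently done with seaborn's stripplot().

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Import data
df = pd.read_csv("https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv")

# Draw stripplot (x and y are passed by keyword, as recent seaborn versions require)
fig, ax = plt.subplots(figsize=(16, 10), dpi=80)
sns.stripplot(x=df.cty, y=df.hwy, jitter=0.25, size=8, ax=ax, linewidth=.5)

# Decorations
plt.title('Use jittered plots to avoid overlapping of points', fontsize=22)
plt.show()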
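To make the idea of jittering concrete, here is a minimal sketch in plain NumPy rather than seaborn. The helper name jitter, the noise width, and the sample values are illustrative assumptions; seaborn's own implementation may differ in its details.

```python
import numpy as np

def jitter(values, width=0.25, seed=0):
    """Return a copy of `values` with uniform noise in [-width, width] added."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    return values + rng.uniform(-width, width, size=values.shape)

# Five hypothetical observations that share only two distinct x positions.
x = np.array([18, 18, 18, 21, 21])
x_jittered = jitter(x)

# Before jittering the points stack on two positions; afterwards every
# point has its own position, still within `width` of its original value.
n_before = len(np.unique(x))
n_after = len(np.unique(x_jittered))
max_shift = float(np.max(np.abs(x_jittered - x)))
print(n_before, n_after, max_shift)
```

The noise only affects where a point is drawn, so it should stay small relative to the spacing between genuine values.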

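5. Counts Plot

Another way to avoid overlapping points is to make the size of each dot depend on how many points lie at that position. The larger the dot, the greater the concentration of points around it.

import pandas as pd
import matplotlib.pyplot as plt

# Import data
df = pd.read_csv("https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv")
df_counts = df.groupby(['hwy', 'cty']).size().reset_index(name='counts')

# Draw the plot: marker area grows with the number of overlapping points.
# Plain Matplotlib scatter is used here because seaborn documents the
# stripplot size argument as a single number, not one value per point.
fig, ax = plt.subplots(figsize=(16, 10), dpi=80)
ax.scatter(df_counts.cty, df_counts.hwy, s=df_counts.counts * 20)

# Decorations
ax.set(xlabel='cty', ylabel='hwy')
plt.title('Counts Plot - Size of circle is bigger as more points overlap', fontsize=22)
plt.show()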
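The counting step is what drives the marker sizes, so it is worth looking at on its own. The sketch below runs the same groupby on a tiny made-up frame (the values are hypothetical, not taken from the mpg dataset).

```python
import pandas as pd

# Tiny hypothetical sample; three rows share the same (cty, hwy) pair.
df = pd.DataFrame({"cty": [18, 18, 18, 21, 21, 16],
                   "hwy": [29, 29, 29, 30, 30, 25]})

# One row per distinct (hwy, cty) pair, with the number of duplicates.
df_counts = df.groupby(["hwy", "cty"]).size().reset_index(name="counts")

# Collect the result as {(cty, hwy): count} for easy inspection.
counts_by_pair = {(int(r.cty), int(r.hwy)): int(r.counts)
                  for r in df_counts.itertuples()}
print(counts_by_pair)
```

Each count can then be scaled into a marker size, so a pair that occurs three times is drawn three times as large as a pair that occurs once.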

6. Marginal Histogram

A marginal histogram has histograms along the X and Y axes. It is used to visualize the relationship between X and Y together with the univariate distribution of each variable on its own. This chart is often used in exploratory data analysis (EDA).

import pandas as pd
import matplotlib.pyplot as plt

# Import data
df = pd.read_csv("https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv")

# Create figure and gridspec
fig = plt.figure(figsize=(16, 10), dpi=80)
grid = plt.GridSpec(4, 4, hspace=0.5, wspace=0.2)

# Define the axes
ax_main = fig.add_subplot(grid[:-1, :-1])
ax_right = fig.add_subplot(grid[:-1, -1], xticklabels=[], yticklabels=[])
ax_bottom = fig.add_subplot(grid[-1, 0:-1], xticklabels=[], yticklabels=[])

# Scatterplot on the main axes
ax_main.scatter('displ', 'hwy', s=df.cty * 4,
                c=df.manufacturer.astype('category').cat.codes,
                alpha=.9, data=df, cmap="tab10", edgecolors='gray', linewidths=.5)

# Histogram of displ at the bottom (inverted so it hangs below the scatterplot)
ax_bottom.hist(df.displ, 40, histtype='stepfilled', orientation='vertical', color='deeppink')
ax_bottom.invert_yaxis()

# Histogram of hwy on the right
ax_right.hist(df.hwy, 40, histtype='stepfilled', orientation='horizontal', color='deeppink')

# Decorations
ax_main.set(title='Scatterplot with Histograms\ndispl vs hwy', xlabel='displ', ylabel='hwy')
ax_main.title.set_fontsize(20)
for item in ([ax_main.xaxis.label, ax_main.yaxis.label]
             + ax_main.get_xticklabels() + ax_main.get_yticklabels()):
    item.set_fontsize(14)
plt.show()

7. Marginal Boxplot

A marginal boxplot serves a similar purpose to the marginal histogram. However, the boxplot helps to pinpoint the median and the 25th and 75th percentiles of X and Y.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Import data
df = pd.read_csv("https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv")

# Create figure and gridspec
fig = plt.figure(figsize=(16, 10), dpi=80)
grid = plt.GridSpec(4, 4, hspace=0.5, wspace=0.2)

# Define the axes
ax_main = fig.add_subplot(grid[:-1, :-1])
ax_right = fig.add_subplot(grid[:-1, -1], xticklabels=[], yticklabels=[])
ax_bottom = fig.add_subplot(grid[-1, 0:-1], xticklabels=[], yticklabels=[])

# Scatterplot on the main axes
ax_main.scatter('displ', 'hwy', s=df.cty * 5,
                c=df.manufacturer.astype('category').cat.codes,
                alpha=.9, data=df, cmap="Set1", edgecolors='black', linewidths=.5)

# Add a boxplot in each margin: vertical for hwy, horizontal for displ
sns.boxplot(y=df.hwy, ax=ax_right)
sns.boxplot(x=df.displ, ax=ax_bottom)

# Decorations
# Remove the axis names from the boxplots
ax_bottom.set(xlabel='')
ax_right.set(ylabel='')

# Main title, xlabel and ylabel
ax_main.set(title='Scatterplot with Boxplots\ndispl vs hwy', xlabel='displ', ylabel='hwy')

# Set font size of the different components
ax_main.title.set_fontsize(20)
for item in ([ax_main.xaxis.label, ax_main.yaxis.label]
             + ax_main.get_xticklabels() + ax_main.get_yticklabels()):
    item.set_fontsize(14)
plt.show()
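The three numbers a box plot is built around can also be computed directly. The following is a minimal sketch with made-up mileage values, using NumPy's default linear interpolation; plotting libraries may use a slightly different quantile rule.

```python
import numpy as np

# Hypothetical highway mileage values, already sorted for readability.
hwy = np.array([20, 22, 24, 26, 28, 30, 32, 34, 36])

# 25th percentile (bottom of the box), median (line inside the box),
# and 75th percentile (top of the box).
q1, median, q3 = np.percentile(hwy, [25, 50, 75])

# The interquartile range is the height of the box.
iqr = q3 - q1
print(q1, median, q3, iqr)
```

Reading these values off the margins is what the marginal boxplot adds over the marginal histogram.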

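8. Correlogram

A correlogram is used to visually inspect the correlation metric between all possible pairs of numeric variables in a given dataframe (or 2D array).

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Import dataset
df = pd.read_csv("https://github.com/selva86/datasets/raw/master/mtcars.csv")

# Correlation matrix of the numeric columns only (text columns such as the car name are skipped)
corr = df.select_dtypes(include='number').corr()

# Plot
plt.figure(figsize=(12, 10), dpi=80)
sns.heatmap(corr, xticklabels=corr.columns, yticklabels=corr.columns,
            cmap='RdYlGn', center=0, annot=True)

# Decorations
plt.title('Correlogram of mtcars', fontsize=22)
plt.xticks(fontsize=12)
plt.yticks(fontsize=12)
plt.show()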
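The heatmap is only a picture of a correlation matrix, so it helps to see that matrix on a small example. This sketch uses a made-up frame (the column names and values are hypothetical) and restricts the calculation to numeric columns with select_dtypes.

```python
import pandas as pd

# Hypothetical data: y rises with x, z falls with x, name is not numeric.
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],
    "z": [5, 4, 3, 2, 1],
    "name": list("abcde"),
})

# Pearson correlation between every pair of numeric columns.
corr = df.select_dtypes(include="number").corr()
print(corr)
```

Values near +1 and -1 land at the two ends of a diverging colormap, while values near 0 sit in the middle.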

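9. Pairwise Plot

The pairwise plot is a favorite in exploratory analysis for understanding the relationship between all possible pairs of numeric variables. It is a must-have tool for bivariate analysis.

import matplotlib.pyplot as plt
import seaborn as sns

# Load dataset
df = sns.load_dataset('iris')

# Plot (pairplot creates its own figure)
sns.pairplot(df, kind="scatter", hue="species",
             plot_kws=dict(s=80, edgecolor="white", linewidth=2.5))
plt.show()

import matplotlib.pyplot as plt
import seaborn as sns

# Load dataset
df = sns.load_dataset('iris')

# Plot with a regression line in every panel
sns.pairplot(df, kind="reg", hue="species")
plt.show()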

Deviation

10. Diverging Bars

If you want to see how items vary based on a single metric, and to visualize the order and amount of this variance, diverging bars are a great tool. They help you quickly tell apart the performance of groups in your data, and they are intuitive enough to convey the point immediately.

import pandas as pd
import matplotlib.pyplot as plt

# Prepare data: standardize mpg and color the bars by the sign of the z-score
df = pd.read_csv("https://github.com/selva86/datasets/raw/master/mtcars.csv")
x = df['mpg']
df['mpg_z'] = (x - x.mean()) / x.std()
df['colors'] = ['red' if z < 0 else 'green' for z in df['mpg_z']]
df.sort_values('mpg_z', inplace=True)
df.reset_index(inplace=True)

# Draw plot
plt.figure(figsize=(14, 10), dpi=80)
plt.hlines(y=df.index, xmin=0, xmax=df.mpg_z, color=df.colors, alpha=0.4, linewidth=5)

# Decorations
plt.gca().set(ylabel='$Model$', xlabel='$Mileage$')
plt.yticks(df.index, df.cars, fontsize=12)
plt.title('Diverging Bars of Car Mileage', fontdict={'size': 20})
plt.grid(linestyle='--', alpha=0.5)
plt.show()
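All of the diverging charts in this part rest on the same standardization step. As a minimal sketch, here it is on three made-up mileage values (the car names and numbers are hypothetical).

```python
import pandas as pd

# Hypothetical mileage values for three cars.
mpg = pd.Series([10.0, 20.0, 30.0], index=["car_a", "car_b", "car_c"])

# z-score: distance from the mean, measured in sample standard deviations.
mpg_z = (mpg - mpg.mean()) / mpg.std()

# Below-average items diverge to the left (red), the rest to the right (green).
colors = ["red" if z < 0 else "green" for z in mpg_z]
print(mpg_z.round(2).tolist(), colors)
```

Because the z-scores are centered on zero, bars, text labels, or dots drawn from 0 to each value naturally split into a negative and a positive side.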

11. Diverging Texts

Diverging texts are similar to diverging bars, and they are the better choice if you want to show the value of each item within the chart in a neat and presentable way.

import pandas as pd
import matplotlib.pyplot as plt

# Prepare data
df = pd.read_csv("https://github.com/selva86/datasets/raw/master/mtcars.csv")
x = df['mpg']
df['mpg_z'] = (x - x.mean()) / x.std()
df['colors'] = ['red' if z < 0 else 'green' for z in df['mpg_z']]
df.sort_values('mpg_z', inplace=True)
df.reset_index(inplace=True)

# Draw plot
plt.figure(figsize=(14, 14), dpi=80)
plt.hlines(y=df.index, xmin=0, xmax=df.mpg_z)
for x, y, tex in zip(df.mpg_z, df.index, df.mpg_z):
    t = plt.text(x, y, round(tex, 2),
                 horizontalalignment='right' if x < 0 else 'left',
                 verticalalignment='center',
                 fontdict={'color': 'red' if x < 0 else 'green', 'size': 14})

# Decorations
plt.yticks(df.index, df.cars, fontsize=12)
plt.title('Diverging Text Bars of Car Mileage', fontdict={'size': 20})
plt.grid(linestyle='--', alpha=0.5)
plt.xlim(-2.5, 2.5)
plt.show()

12. Diverging Dot Plot

The diverging dot plot is also similar to diverging bars. However, compared to diverging bars, the absence of bars reduces the contrast and disparity between the groups.

import pandas as pd
import matplotlib.pyplot as plt

# Prepare data
df = pd.read_csv("https://github.com/selva86/datasets/raw/master/mtcars.csv")
x = df['mpg']
df['mpg_z'] = (x - x.mean()) / x.std()
df['colors'] = ['red' if z < 0 else 'darkgreen' for z in df['mpg_z']]
df.sort_values('mpg_z', inplace=True)
df.reset_index(inplace=True)

# Draw plot
plt.figure(figsize=(14, 16), dpi=80)
plt.scatter(df.mpg_z, df.index, s=450, alpha=.6, color=df.colors)
for x, y, tex in zip(df.mpg_z, df.index, df.mpg_z):
    t = plt.text(x, y, round(tex, 1), horizontalalignment='center',
                 verticalalignment='center', fontdict={'color': 'white'})

# Decorations
# Lighten borders
plt.gca().spines["top"].set_alpha(.3)
plt.gca().spines["bottom"].set_alpha(.3)
plt.gca().spines["right"].set_alpha(.3)
plt.gca().spines["left"].set_alpha(.3)

plt.yticks(df.index, df.cars)
plt.title('Diverging Dotplot of Car Mileage', fontdict={'size': 20})
plt.xlabel('$Mileage$')
plt.grid(linestyle='--', alpha=0.5)
plt.xlim(-2.5, 2.5)
plt.show()

13. Diverging Lollipop Chart with Markers

A lollipop chart with markers is a flexible way of visualizing divergence: it lets you emphasize any significant data points you want to draw attention to and annotate the chart with the reasoning behind them.

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.patches as patches

# Prepare data
df = pd.read_csv("https://github.com/selva86/datasets/raw/master/mtcars.csv")
x = df['mpg']
df['mpg_z'] = (x - x.mean()) / x.std()
df['colors'] = 'black'

# Color the Fiat differently
df.loc[df.cars == 'Fiat X1-9', 'colors'] = 'darkorange'
df.sort_values('mpg_z', inplace=True)
df.reset_index(inplace=True)

# Draw plot
plt.figure(figsize=(14, 16), dpi=80)
plt.hlines(y=df.index, xmin=0, xmax=df.mpg_z, color=df.colors, alpha=0.4, linewidth=1)
plt.scatter(df.mpg_z, df.index, color=df.colors,
            s=[600 if x == 'Fiat X1-9' else 300 for x in df.cars], alpha=0.6)
plt.yticks(df.index, df.cars)
plt.xticks(fontsize=12)

# Annotate
plt.annotate('Mercedes Models', xy=(0.0, 11.0), xytext=(1.0, 11), xycoords='data',
             fontsize=15, ha='center', va='center',
             bbox=dict(boxstyle='square', fc='firebrick'),
             arrowprops=dict(arrowstyle='-[,widthB=2.0,lengthB=1.5', lw=2.0, color='steelblue'),
             color='white')

# Add patches
p1 = patches.Rectangle((-2.0, -1), width=.3, height=3, alpha=.2, facecolor='red')
p2 = patches.Rectangle((1.5, 27), width=.8, height=5, alpha=.2, facecolor='green')
plt.gca().add_patch(p1)
plt.gca().add_patch(p2)

# Decorate
plt.title('Diverging Lollipop of Car Mileage', fontdict={'size': 20})
plt.grid(linestyle='--', alpha=0.5)
plt.show()

14. Area Chart

By coloring the area between the axis and the line, the area chart emphasizes not only the peaks and troughs but also how long the highs and lows last. The longer a high lasts, the larger the area under the line.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Prepare data: month-over-month percentage change of the savings rate
df = pd.read_csv("https://github.com/selva86/datasets/raw/master/economics.csv", parse_dates=['date']).head(100)
x = np.arange(df.shape[0])
y_returns = (df.psavert.diff().fillna(0) / df.psavert.shift(1)).fillna(0) * 100

# Plot: green where the change is positive, red where it is negative
plt.figure(figsize=(16, 10), dpi=80)
plt.fill_between(x[1:], y_returns[1:], 0, where=y_returns[1:] >= 0,
                 facecolor='green', interpolate=True, alpha=0.7)
plt.fill_between(x[1:], y_returns[1:], 0, where=y_returns[1:] <= 0,
                 facecolor='red', interpolate=True, alpha=0.7)

# Annotate
plt.annotate('Peak\n1975', xy=(94.0, 21.0), xytext=(88.0, 28),
             bbox=dict(boxstyle='square', fc='firebrick'),
             arrowprops=dict(facecolor='steelblue', shrink=0.05),
             fontsize=15, color='white')

# Decorations
xtickvals = [str(m)[:3].upper() + "-" + str(y) for y, m in zip(df.date.dt.year, df.date.dt.month_name())]
plt.gca().set_xticks(x[::6])
plt.gca().set_xticklabels(xtickvals[::6], rotation=90,
                          fontdict={'horizontalalignment': 'center', 'verticalalignment': 'center_baseline'})
plt.ylim(-35, 35)
plt.xlim(1, 100)
plt.title("Month Economics Return %", fontsize=22)
plt.ylabel('Monthly returns %')
plt.grid(alpha=0.5)
plt.show()
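The quantity being shaded is a month-over-month percentage change. The sketch below applies the same diff/shift expression to four made-up savings rates and checks it against pandas' built-in pct_change(); the numbers are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly savings rates.
psavert = pd.Series([10.0, 12.0, 9.0, 9.0])

# Change relative to the previous month, in percent. The first month has
# no predecessor, so its change is filled with 0.
returns = (psavert.diff().fillna(0) / psavert.shift(1)).fillna(0) * 100

# The same quantity via the built-in helper.
same = psavert.pct_change().fillna(0) * 100
matches = bool(np.allclose(returns, same))
print(returns.round(2).tolist(), matches)
```

Positive values end up in the green region of the area chart and negative values in the red region.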

15. Ordered Bar Chart

The ordered bar chart conveys the rank order of the items effectively. In addition, by writing the value of the metric above each bar, you let the reader take the precise numbers from the chart itself.

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.patches as patches

# Prepare data: mean city mileage per manufacturer, sorted ascending
df_raw = pd.read_csv("https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv")
df = df_raw.groupby('manufacturer', as_index=False)['cty'].mean()
df.sort_values('cty', inplace=True)
df.reset_index(drop=True, inplace=True)

# Draw plot
fig, ax = plt.subplots(figsize=(16, 10), facecolor='white', dpi=80)
ax.vlines(x=df.index, ymin=0, ymax=df.cty, color='firebrick', alpha=0.7, linewidth=20)

# Annotate text
for i, cty in enumerate(df.cty):
    ax.text(i, cty + 0.5, round(cty, 1), horizontalalignment='center')

# Title, label, ticks and ylim
ax.set_title('Bar Chart for City Mileage', fontdict={'size': 22})
ax.set(ylabel='Miles Per Gallon', ylim=(0, 30))
plt.xticks(df.index, df.manufacturer.str.upper(), rotation=60, horizontalalignment='right', fontsize=12)

# Add patches to color the X axis labels
p1 = patches.Rectangle((.57, -0.005), width=.33, height=.13, alpha=.1, facecolor='green', transform=fig.transFigure)
p2 = patches.Rectangle((.124, -0.005), width=.446, height=.13, alpha=.1, facecolor='red', transform=fig.transFigure)
fig.add_artist(p1)
fig.add_artist(p2)
plt.show()
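The ranking behind an ordered chart comes from one aggregate-and-sort step. Here is a minimal sketch of that step on a made-up frame; the manufacturers and mileage values are hypothetical.

```python
import pandas as pd

# Hypothetical city mileage for a handful of cars.
df_raw = pd.DataFrame({
    "manufacturer": ["audi", "audi", "honda", "honda", "jeep"],
    "cty": [18, 20, 24, 28, 14],
})

# Mean mileage per manufacturer, sorted from lowest to highest, with a
# fresh 0..n-1 index that can serve as the position on the axis.
df = (df_raw.groupby("manufacturer", as_index=False)["cty"].mean()
            .sort_values("cty")
            .reset_index(drop=True))

ranking = df["manufacturer"].tolist()
print(ranking, df["cty"].tolist())
```

Selecting the numeric column before calling mean() keeps the text column out of the aggregation, which recent pandas versions would otherwise reject.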

16. Lollipop Chart

The lollipop chart serves a similar purpose to the ordered bar chart, in a visually pleasing way.

import pandas as pd
import matplotlib.pyplot as plt

# Prepare data: mean city mileage per manufacturer, sorted ascending
df_raw = pd.read_csv("https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv")
df = df_raw.groupby('manufacturer', as_index=False)['cty'].mean()
df.sort_values('cty', inplace=True)
df.reset_index(drop=True, inplace=True)

# Draw plot
fig, ax = plt.subplots(figsize=(16, 10), dpi=80)
ax.vlines(x=df.index, ymin=0, ymax=df.cty, color='firebrick', alpha=0.7, linewidth=2)
ax.scatter(x=df.index, y=df.cty, s=75, color='firebrick', alpha=0.7)

# Title, label, ticks and ylim
ax.set_title('Lollipop Chart for City Mileage', fontdict={'size': 22})
ax.set_ylabel('Miles Per Gallon')
ax.set_xticks(df.index)
ax.set_xticklabels(df.manufacturer.str.upper(), rotation=60,
                   fontdict={'horizontalalignment': 'right', 'size': 12})
ax.set_ylim(0, 30)

# Annotate
for row in df.itertuples():
    ax.text(row.Index, row.cty + .5, s=str(round(row.cty, 2)),
            horizontalalignment='center', verticalalignment='bottom', fontsize=14)
plt.show()

17. Dot Plot

The dot plot conveys the rank order of the items. Because the dots are aligned along the horizontal axis, it is easier to see how far apart they are from each other.

import pandas as pd
import matplotlib.pyplot as plt

# Prepare data: mean city mileage per manufacturer, sorted ascending
df_raw = pd.read_csv("https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv")
df = df_raw.groupby('manufacturer', as_index=False)['cty'].mean()
df.sort_values('cty', inplace=True)
df.reset_index(drop=True, inplace=True)

# Draw plot
fig, ax = plt.subplots(figsize=(16, 10), dpi=80)
ax.hlines(y=df.index, xmin=11, xmax=26, color='gray', alpha=0.7, linewidth=1, linestyles='dashdot')
ax.scatter(y=df.index, x=df.cty, s=75, color='firebrick', alpha=0.7)

# Title, label, ticks and xlim
ax.set_title('Dot Plot for City Mileage', fontdict={'size': 22})
ax.set_xlabel('Miles Per Gallon')
ax.set_yticks(df.index)
ax.set_yticklabels(df.manufacturer.str.title(), fontdict={'horizontalalignment': 'right'})
ax.set_xlim(10, 27)
plt.show()

18. Slope Chart

The slope chart is best suited to comparing the "before" and "after" positions of a given person or item.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.lines as mlines

# Import data
df = pd.read_csv("https://raw.githubusercontent.com/selva86/datasets/master/gdppercap.csv")

# Draw line
# https://stackoverflow.com/questions/36470343/how-to-draw-a-line-with-matplotlib/36479941
def newline(p1, p2):
    """Draw a segment from p1 to p2: red if the value fell, green otherwise."""
    ax = plt.gca()
    l = mlines.Line2D([p1[0], p2[0]], [p1[1], p2[1]],
                      color='red' if p1[1] - p2[1] > 0 else 'green',
                      marker='o', markersize=6)
    ax.add_line(l)
    return l

fig, ax = plt.subplots(1, 1, figsize=(14, 14), dpi=80)

# Vertical lines
ax.vlines(x=1, ymin=500, ymax=13000, color='black', alpha=0.7, linewidth=1, linestyles='dotted')
ax.vlines(x=3, ymin=500, ymax=13000, color='black', alpha=0.7, linewidth=1, linestyles='dotted')

# Points
ax.scatter(y=df['1952'], x=np.repeat(1, df.shape[0]), s=10, color='black', alpha=0.7)
ax.scatter(y=df['1957'], x=np.repeat(3, df.shape[0]), s=10, color='black', alpha=0.7)

# Line segments and annotation
for p1, p2, c in zip(df['1952'], df['1957'], df['continent']):
    newline([1, p1], [3, p2])
    ax.text(1 - 0.05, p1, c + ', ' + str(round(p1)), horizontalalignment='right',
            verticalalignment='center', fontdict={'size': 14})
    ax.text(3 + 0.05, p2, c + ', ' + str(round(p2)), horizontalalignment='left',
            verticalalignment='center', fontdict={'size': 14})

# 'Before' and 'After' annotations
ax.text(1 - 0.05, 13000, 'BEFORE', horizontalalignment='right',
        verticalalignment='center', fontdict={'size': 18, 'weight': 700})
ax.text(3 + 0.05, 13000, 'AFTER', horizontalalignment='left',
        verticalalignment='center', fontdict={'size': 18, 'weight': 700})

# Decoration
ax.set_title("Slopechart: Comparing GDP Per Capita between 1952 vs 1957", fontdict={'size': 22})
ax.set(xlim=(0, 4), ylim=(0, 14000), ylabel='Mean GDP Per Capita')
ax.set_xticks([1, 3])
ax.set_xticklabels(["1952", "1957"])
plt.yticks(np.arange(500, 13000, 2000), fontsize=12)

# Hide borders
plt.gca().spines["top"].set_alpha(.0)
plt.gca().spines["bottom"].set_alpha(.0)
plt.gca().spines["right"].set_alpha(.0)
plt.gca().spines["left"].set_alpha(.0)
plt.show()

19. Dumbbell Plot

The dumbbell plot conveys the "before" and "after" positions of various items along with their rank ordering. It is very useful if you want to visualize the effect of a particular project or initiative on different objects.

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.lines as mlines

# Import data
df = pd.read_csv("https://raw.githubusercontent.com/selva86/datasets/master/health.csv")
df.sort_values('pct_2014', inplace=True)
df.reset_index(inplace=True)

# Function to draw a line segment
def newline(p1, p2):
    ax = plt.gca()
    l = mlines.Line2D([p1[0], p2[0]], [p1[1], p2[1]], color='skyblue')
    ax.add_line(l)
    return l

# Figure and axes
fig, ax = plt.subplots(1, 1, figsize=(14, 14), facecolor='#f7f7f7', dpi=80)

# Vertical guide lines
for xpos in (.05, .10, .15, .20):
    ax.vlines(x=xpos, ymin=0, ymax=26, color='black', alpha=1, linewidth=1, linestyles='dotted')

# Points
ax.scatter(y=df['index'], x=df['pct_2013'], s=50, color='#0e668b', alpha=0.7)
ax.scatter(y=df['index'], x=df['pct_2014'], s=50, color='#a3c4dc', alpha=0.7)

# Line segments
for i, p1, p2 in zip(df['index'], df['pct_2013'], df['pct_2014']):
    newline([p1, i], [p2, i])

# Decoration
ax.set_facecolor('#f7f7f7')
ax.set_title("Dumbbell Chart: Pct Change - 2013 vs 2014", fontdict={'size': 22})
ax.set(xlim=(0, .25), ylim=(-1, 27))
ax.set_xticks([.05, .1, .15, .20])
# Labels match the tick positions set above
ax.set_xticklabels(['5%', '10%', '15%', '20%'])
plt.show()

20. Histogram for Continuous Variable

A histogram shows the frequency distribution of a given variable. The version below groups the frequency bars by a categorical variable, which gives a better idea of the continuous variable and the categorical variable together.

import pandas as pd
import matplotlib.pyplot as plt

# Import data
df = pd.read_csv("https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv")

# Prepare data: one list of x values per group
x_var = 'displ'
groupby_var = 'class'
df_agg = df.loc[:, [x_var, groupby_var]].groupby(groupby_var)
groups = [name for name, _ in df_agg]
vals = [grp[x_var].values.tolist() for _, grp in df_agg]

# Draw
plt.figure(figsize=(16, 9), dpi=80)
colors = [plt.cm.Spectral(i / float(len(vals) - 1)) for i in range(len(vals))]
n, bins, patches = plt.hist(vals, 30, stacked=True, density=False, color=colors, label=groups)

# Decoration
plt.legend()
plt.title(f"Stacked Histogram of ${x_var}$ colored by ${groupby_var}$", fontsize=22)
plt.xlabel(x_var)
plt.ylabel("Frequency")
plt.ylim(0, 25)
plt.xticks(ticks=bins[::3], labels=[round(b, 1) for b in bins[::3]])
plt.show()
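A stacked histogram is consistent only if every group is counted against the same bin edges. The following sketch shows this with NumPy on two made-up groups; the group names and values are hypothetical.

```python
import numpy as np

# Hypothetical engine displacements for two vehicle classes.
groups = {
    "compact": [1.8, 2.0, 2.0, 2.8],
    "suv": [4.0, 4.7, 5.3],
}

# Bin edges are derived once from all values together.
all_values = np.concatenate(list(groups.values()))
edges = np.histogram_bin_edges(all_values, bins=4)

# Count each group against the shared edges.
per_group = {name: np.histogram(vals, bins=edges)[0] for name, vals in groups.items()}

# Stacking the per-group counts reproduces the histogram of the pooled data.
stacked_total = sum(per_group.values())
overall = np.histogram(all_values, bins=edges)[0]
print(per_group, overall)
```

This is why the height of each stacked bar can still be read as the total frequency for that bin.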

21. Histogram for Categorical Variable

The histogram of a categorical variable shows the frequency distribution of that variable. By coloring the bars, you can relate the distribution to another categorical variable represented by the colors.

import pandas as pd
import matplotlib.pyplot as plt

# Import data
df = pd.read_csv("https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv")

# Prepare data: count rows for every (manufacturer, class) combination
x_var = 'manufacturer'
groupby_var = 'class'
counts = pd.crosstab(df[x_var], df[groupby_var])

# Draw the counts as stacked bars, one color per class.
# A stacked bar chart of counts is used so that every bar is labeled
# with its own category.
colors = [plt.cm.Spectral(i / float(counts.shape[1] - 1)) for i in range(counts.shape[1])]
fig, ax = plt.subplots(figsize=(16, 9), dpi=80)
counts.plot(kind='bar', stacked=True, color=colors, width=0.9, ax=ax)

# Decoration
ax.set_title(f"Stacked Histogram of ${x_var}$ colored by ${groupby_var}$", fontsize=22)
ax.set_xlabel(x_var)
ax.set_ylabel("Frequency")
ax.set_ylim(0, 40)
plt.xticks(rotation=90)
plt.show()

22. Density Plot

The density plot is a commonly used tool for visualizing the distribution of a continuous variable. By grouping the curves by a "response" variable, you can inspect the relationship between X and Y. The case below is for illustration only: it shows how the distribution of city mileage varies with the number of cylinders.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Import data
df = pd.read_csv("https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv")

# Draw plot (fill=True replaces the older shade=True argument)
plt.figure(figsize=(16, 10), dpi=80)
sns.kdeplot(df.loc[df['cyl'] == 4, "cty"], fill=True, color="g", label="Cyl=4", alpha=.7)
sns.kdeplot(df.loc[df['cyl'] == 5, "cty"], fill=True, color="deeppink", label="Cyl=5", alpha=.7)
sns.kdeplot(df.loc[df['cyl'] == 6, "cty"], fill=True, color="dodgerblue", label="Cyl=6", alpha=.7)
sns.kdeplot(df.loc[df['cyl'] == 8, "cty"], fill=True, color="orange", label="Cyl=8", alpha=.7)

# Decoration
plt.title('Density Plot of City Mileage by n_Cylinders', fontsize=22)
plt.legend()
plt.show()
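For readers curious about what a density curve actually is, here is a minimal hand-written Gaussian kernel density estimate in NumPy. The function name gaussian_kde_1d, the fixed bandwidth, and the sample values are assumptions made for illustration; seaborn chooses its bandwidth automatically and is not implemented this way line for line.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Average one Gaussian bump per sample, evaluated at each grid point."""
    samples = np.asarray(samples, dtype=float)
    # Standardized distance from every grid point to every sample.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    return kernels.mean(axis=1) / bandwidth

# Hypothetical city mileage values.
cty = [15, 16, 18, 18, 21]
grid = np.linspace(0, 40, 2001)
density = gaussian_kde_1d(cty, grid, bandwidth=1.5)

# A density integrates to 1; approximate the integral with a Riemann sum.
area = float(np.sum(density) * (grid[1] - grid[0]))
peak_location = float(grid[np.argmax(density)])
print(round(area, 4), peak_location)
```

A larger bandwidth smooths the curve, while a smaller one follows the individual observations more closely.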

23. Density Curves with Histogram

A density curve drawn over a histogram brings together the information conveyed by the two charts, so you can have both in a single figure instead of two.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Import data
df = pd.read_csv("https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv")

# Draw plot: histplot with kde=True is used in place of the deprecated distplot
plt.figure(figsize=(13, 10), dpi=80)
for cls, color, label in [('compact', 'dodgerblue', 'Compact'),
                          ('suv', 'orange', 'SUV'),
                          ('minivan', 'g', 'minivan')]:
    sns.histplot(df.loc[df['class'] == cls, "cty"], kde=True, stat="density",
                 color=color, label=label, alpha=.7, line_kws={'linewidth': 3})
plt.ylim(0, 0.35)

# Decoration
plt.title('Density Plot of City Mileage by Vehicle Type', fontsize=22)
plt.legend()
plt.show()

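24. Joy Plot

The joy plot lets the density curves of different groups overlap, which is a great way to visualize the distributions of a large number of groups relative to one another. It looks pleasing to the eye and conveys just the right information clearly. It is easy to build with the joypy package, which is based on matplotlib.

# !pip install joypy
import pandas as pd
import matplotlib.pyplot as plt
import joypy

# Import data
mpg = pd.read_csv("https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv")

# Draw plot (joyplot creates its own figure)
fig, axes = joypy.joyplot(mpg, column=['hwy', 'cty'], by="class", ylim='own', figsize=(14, 10))

# Decoration
plt.title('Joy Plot of City and Highway Mileage by Class', fontsize=22)
plt.show()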

25. Distributed Dot Plot

The distributed dot plot shows the univariate distribution of points segmented by group. The darker the points, the higher the concentration of data points in that region. Coloring the median differently makes the real position of each group apparent at once.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Prepare data
df_raw = pd.read_csv("https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv")

# Mean and median city mileage by make
df = df_raw.groupby('manufacturer', as_index=False)['cty'].mean()
df.sort_values('cty', ascending=False, inplace=True)
df.reset_index(drop=True, inplace=True)
df_median = df_raw.groupby('manufacturer')[['cty']].median()

# Draw horizontal lines
fig, ax = plt.subplots(figsize=(16, 10), dpi=80)
ax.hlines(y=df.index, xmin=0, xmax=40, color='gray', alpha=0.5, linewidth=.5, linestyles='dashdot')

# Draw the dots: all cars of a make in white, the median of the make in red
for i, make in enumerate(df.manufacturer):
    df_make = df_raw.loc[df_raw.manufacturer == make, :]
    ax.scatter(y=np.repeat(i, df_make.shape[0]), x='cty', data=df_make,
               s=75, edgecolors='gray', c='w', alpha=0.5)
    ax.scatter(y=i, x='cty', data=df_median.loc[df_median.index == make, :],
               s=75, c='firebrick')

# Annotate
ax.text(33, 13, r"$red \; dots \; are \; the \: median$", fontdict={'size': 12}, color='firebrick')

# Decorations
red_patch = plt.plot([], [], marker="o", ms=10, ls="", mec=None, color='firebrick', label="Median")
plt.legend(handles=red_patch)
ax.set_title('Distribution of City Mileage by Make', fontdict={'size': 22})
ax.set_xlabel('Miles Per Gallon (City)', alpha=0.7)
ax.set_yticks(df.index)
ax.set_yticklabels(df.manufacturer.str.title(), fontdict={'horizontalalignment': 'right'}, alpha=0.7)
ax.set_xlim(1, 40)
plt.xticks(alpha=0.7)
plt.gca().spines["top"].set_visible(False)
plt.gca().spines["bottom"].set_visible(False)
plt.gca().spines["right"].set_visible(False)
plt.gca().spines["left"].set_visible(False)
plt.grid(axis='both', alpha=.4, linewidth=.1)
plt.show()

