亚洲本道AV九区,午夜福利啪啪

數(shù)據(jù)分析的時(shí)候, 我們有時(shí)候會(huì)遇到這樣的需求.

就比如當(dāng)一個(gè)GO號(hào)對(duì)應(yīng)多個(gè)Gene ID的時(shí)候，如下:

    GO_ids      Gene_ids
0   GO:666666   AT1G12310,AT1G12320,AT1G23330
1   GO:888888   Gene1,Gene2,Gene3

我們想把它變成GO ID和Gene ID一一對(duì)應(yīng)的關(guān)系，這樣做的目的是為了為基因添加表達(dá)量信息或者其它注釋信息. 目標(biāo)表格如下:

    GO_ids      Gene_ids
0   GO:666666   AT1G12310
0   GO:666666   AT1G12320
0   GO:666666   AT1G23330
1   GO:888888   Gene1
1   GO:888888   Gene2
1   GO:888888   Gene3

以前的我也干過(guò)這樣的事情, 居然是硬寫(xiě)代碼, 后來(lái)有一次聽(tīng)數(shù)據(jù)分析的會(huì)議, 聽(tīng)到有個(gè)人提到Hive里面的爆炸函數(shù), 覺(jué)得挺有趣, 想著python數(shù)據(jù)分析生態(tài)系統(tǒng)里不可能沒(méi)有這樣的輪子. 于是搜索了一下, 還真的有.

學(xué)習(xí)一個(gè)工具最好的工具是查文檔, 查文檔, 查文檔.

pandas explode的文檔鏈接如下: pandas explode函數(shù).

This routine will explode list-likes including lists, tuples, sets, Series, and np.ndarray. The result dtype of the subset rows will be object. Scalars will be returned unchanged, and empty list-likes will result in a np.nan for that row. In addition, the ordering of rows in the output will be non-deterministic when exploding sets.

文檔里面最重要的一句話是,能夠?qū)?strong>lists-like的元素"爆炸成"新的行. 下面我們通過(guò)一個(gè)實(shí)例來(lái)演示一下, explode函數(shù)如何工作.

Jupyter notebook中測(cè)試時(shí)所用代碼塊:

#構(gòu)建測(cè)試數(shù)據(jù)集
import pandas as pd
import numpy as np
df = pd.DataFrame.from_dict({"GO_ids":["GO:666666", "GO:888888"], "Gene_ids":["AT1G12310,AT1G12320,AT1G23330", "Gene1,Gene2,Gene3"]})
df

#將想被explode的列里的元素, 變?yōu)閘ist like
df["Gene_ids"] = df["Gene_ids"].apply(lambda x: x.split(","))
#df["Gene_ids"] = df["Gene_ids"].str.split(",")
df

#explode 
df.explode("Gene_ids")

Jupyter notebook中實(shí)踐.

Step1: 構(gòu)建測(cè)試數(shù)據(jù)集

Screenshot_2021-06-23_10-30-10.png

Step2: 將想被explode的列里的元素, 變?yōu)閘ist like

Screenshot_2021-06-23_10-30-26.png

Step3: explode.

Screenshot_2021-06-23_10-30-38.png

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av

pandas包所謂的爆炸函數(shù)explode

pandas包所謂的爆炸函數(shù)explode

Jupyter notebook中測(cè)試時(shí)所用代碼塊:

Jupyter notebook中實(shí)踐.

Step1: 構(gòu)建測(cè)試數(shù)據(jù)集

Step2: 將想被explode的列里的元素, 變?yōu)閘ist like

Step3: explode.

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九 欧美,1769亚洲,黄色成人av

pandas包所謂的爆炸函數(shù)explode

Jupyter notebook中測(cè)試時(shí)所用代碼塊:

Jupyter notebook中實(shí)踐.

Step1: 構(gòu)建測(cè)試數(shù)據(jù)集

Step2: 將想被explode的列里的元素, 變?yōu)閘ist like

Step3: explode.

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av