
Many single-cell researchers have asked whether there is a function that can annotate cells directly, instead of tediously reading the literature, searching for markers, and annotating cells by hand.
Can single-cell annotation really be automated? I think the answer is yes. The accuracy of many methods is still open to question, but they are a good choice as auxiliary tools for annotation.
Here we walk through how to use the SingleR annotation tool, and its drawbacks.
Once we have SingleR's annotation results, we can combine them with the Seurat clustering results and the specific tissue type to complete the cell annotation.
Official tutorial, Using SingleR to annotate single-cell RNA-seq data: https://www.bioconductor.org/packages/release/bioc/vignettes/SingleR/inst/doc/SingleR.html

Annotation with a built-in reference (the simplest approach)
The simplest way to use SingleR is to annotate cells against a built-in reference. The celldex package provides 7 reference datasets (mostly derived from bulk RNA-seq or microarray data) through dedicated retrieval functions.
These 7 reference datasets need an internet connection to download. 5 are human and 2 are mouse:
BlueprintEncodeData, Blueprint (Martens and Stunnenberg 2013) and Encode (The ENCODE Project Consortium 2012) (human)
DatabaseImmuneCellExpressionData, the Database for Immune Cell Expression(/eQTLs/Epigenomics) (Schmiedel et al. 2018) (human)
HumanPrimaryCellAtlasData, the Human Primary Cell Atlas (Mabbott et al. 2013) (human)
MonacoImmuneData, Monaco Immune Cell Data - GSE107011 (Monaco et al. 2019) (human)
NovershternHematopoieticData, Novershtern Hematopoietic Cell Data - GSE24759 (human)
ImmGenData, the murine ImmGen (Heng et al. 2008) (mouse)
MouseRNAseqData, a collection of mouse data sets downloaded from GEO (Benayoun et al. 2019) (mouse)

Installing the required packages

conda install -c conda-forge r-seurat ## Seurat is a CRAN package, so it is packaged as r-seurat on conda-forge
conda install -c bioconda bioconductor-singler ## devtools::install_github('dviraran/SingleR') failed, so it was installed with conda instead
conda install -c bioconda bioconductor-celldex ## this package provides the reference datasets

Load the packages and download the reference datasets

library(Seurat)
library(SingleR)
library(celldex) ## provides the reference dataset functions used below
library(ggplot2)
library(reshape2)
hpca.se=HumanPrimaryCellAtlasData() ## the first call downloads the dataset and may be slow; later calls do not need to download it again
Blue.se=BlueprintEncodeData()
Immune.se=DatabaseImmuneCellExpressionData()
Nover.se=NovershternHematopoieticData()
MonacoIm.se=MonacoImmuneData()
ImmGen.se=ImmGenData() # (mouse)
Mouse.se=MouseRNAseqData() # (mouse)

Here we again use the pbmc3k dataset from earlier posts, which also makes it easy to combine SingleR with the Seurat analysis.
For downloading the pbmc dataset and the Seurat clustering, see the earlier Jianshu post: http://www.itdecent.cn/p/adda4536b2cb

setwd("/home/wucheng/jianshu/function/data")
pbmc <- readRDS("pbmc.rds") ## load the pbmc data already normalized and clustered with Seurat
> pbmc
An object of class Seurat 
13714 features across 2638 samples within 1 assay 
Active assay: RNA (13714 features, 2000 variable features)
 2 dimensional reductions calculated: pca, umap

meta=pbmc@meta.data # pbmc metadata, which contains the Seurat clustering result
pbmc_for_SingleR <- GetAssayData(pbmc, slot="data") ## get the normalized expression matrix
pbmc.hesc <- SingleR(test = pbmc_for_SingleR, ref = hpca.se, labels = hpca.se$label.main) # uses HumanPrimaryCellAtlasData as the reference with the main (broad) labels; the fine labels can also be used, but their accuracy is harder to judge

table(pbmc.hesc$labels,meta$seurat_clusters) ## check how the new labels overlap with the Seurat clusters
> table(pbmc.hesc$labels,meta$seurat_clusters)
                 
                    0   1   2   3   4   5   6   7   8
  B cells           0   0   3 342   0   0   0   0   1
  CD4+ T cells    488 404   0   1   5   0   0   0   0
  CD8+ T cells    159  51   0   1  96   0   3   0   0
  Dendritic cells   0   0  14   0   0   0   0  31   0
  Monocytes         0   0 463   0   0 162   0   1   2
  NK cells          0   0   0   0  15   0 148   0   0
  Progenitors       4   0   0   0   0   0   0   0  11
  T cells          46  28   0   0 155   0   4   0   0

We can see that some clusters are classified quite clearly. Next we use a few plotting functions to look at the annotation.

pdf("plotScoreHeatmap.pdf")
print(plotScoreHeatmap(pbmc.hesc)) # heatmap of per-cell scores against each reference label
dev.off()
pbmc@meta.data$labels <- pbmc.hesc$labels # store the SingleR labels in the Seurat metadata
pdf("Umap.pdf",height=5,width=10)
print(DimPlot(pbmc, group.by = c("seurat_clusters", "labels"),reduction = "umap"))
dev.off()

plotScoreHeatmap

We can see that most of the cell types in the reference dataset are not present here.

Umap

The UMAP suggests that the cell labels assigned by SingleR are reasonably accurate. Keep in mind, though, that this is the pbmc dataset; single-cell data from some tissues may look far messier, so be prepared for that.

aa=table(pbmc.hesc$labels,meta$seurat_clusters)
aa= apply(aa,2,function(x) x/sum(x)) # convert counts to the fraction of each cluster per label
df=as.data.frame(melt(aa))
df$Var2=as.factor(df$Var2)
g <- ggplot(df, aes(Var2, Var1)) + geom_point(aes(size = value), colour = "green") + theme_bw()
pdf("singleR_match_seurat.pdf",height=5,width=10)
print(g)
dev.off()
library(pheatmap)
pdf("heatmap.pdf",height=5,width=10)
pheatmap(log2(aa+10), color=colorRampPalette(c("white", "blue"))(101))
dev.off()

singleR_match_seurat

heatmap

The two plots convey much the same information and can serve as a guide when deciding the cell type of each cluster.
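As a complement to the plots, the same proportions can be reduced to one call per cluster. The following is a minimal base-R sketch, not part of SingleR or Seurat: it re-enters the cross-tabulation printed earlier by hand, and the names `counts`, `cluster_call`, and the 70% cutoff are illustrative assumptions.

```r
# Cross-tabulation copied from the table printed earlier
# (rows: SingleR labels, columns: Seurat clusters 0-8).
counts <- matrix(
  c(  0,   0,   3, 342,   0,   0,   0,   0,   1,
    488, 404,   0,   1,   5,   0,   0,   0,   0,
    159,  51,   0,   1,  96,   0,   3,   0,   0,
      0,   0,  14,   0,   0,   0,   0,  31,   0,
      0,   0, 463,   0,   0, 162,   0,   1,   2,
      0,   0,   0,   0,  15,   0, 148,   0,   0,
      4,   0,   0,   0,   0,   0,   0,   0,  11,
     46,  28,   0,   0, 155,   0,   4,   0,   0),
  nrow = 8, byrow = TRUE,
  dimnames = list(
    label = c("B cells", "CD4+ T cells", "CD8+ T cells", "Dendritic cells",
              "Monocytes", "NK cells", "Progenitors", "T cells"),
    cluster = as.character(0:8)))

# Fraction of each cluster assigned to each label (every column sums to 1).
frac <- prop.table(counts, margin = 2)

# For every cluster, keep the most frequent label and its share of the cluster.
top_idx <- apply(frac, 2, which.max)
cluster_call <- data.frame(
  cluster  = colnames(frac),
  label    = rownames(frac)[top_idx],
  fraction = frac[cbind(top_idx, seq_len(ncol(frac)))],
  stringsAsFactors = FALSE)

# Flag clusters whose top label covers less than 70% of the cells
# (the cutoff is an arbitrary choice for illustration).
cluster_call$ambiguous <- cluster_call$fraction < 0.7
print(cluster_call)
```

Clusters flagged as ambiguous are the ones worth checking by hand against known markers before settling on a label.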

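Another approach skips the built-in reference datasets and instead uses a dataset that is already labeled as the reference for annotating other datasets (Seurat can do this too).
Here we take the first 500 cells of the pbmc dataset as a new dataset, pbmc1, and use the labels assigned to pbmc above to relabel pbmc1.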

pbmc1 <- pbmc[,1:500] # first 500 cells of pbmc
test <- GetAssayData(pbmc1, slot="data")
library(scran)
pbmc1.hesc <- SingleR(test=test, ref=pbmc_for_SingleR, labels=pbmc$labels, de.method="wilcox")
pbmc1@meta.data$labels1 <- pbmc1.hesc$labels # new labels go into labels1; labels still holds the earlier ones
pdf("Umap1.pdf",height=5,width=10)
print(DimPlot(pbmc1, group.by = c("seurat_clusters", "labels1"),reduction = "umap"))
dev.off()

Umap1

Because pbmc1 was taken from pbmc, the new labels agree with the earlier ones.
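Agreement between two sets of labels can also be checked numerically rather than by eye. The sketch below uses two short toy vectors standing in for the earlier and the new label columns; the vectors and the variable names are made up for illustration and are not results from the pbmc data.

```r
# Toy label vectors (hypothetical values, one entry per cell).
old_labels <- c("B cells", "T cells", "NK cells", "T cells", "Monocytes")
new_labels <- c("B cells", "T cells", "NK cells", "CD8+ T cells", "Monocytes")

# Fraction of cells whose label is identical in both annotations.
agreement <- mean(old_labels == new_labels)

# Positions of the cells where the two annotations disagree.
mismatch <- which(old_labels != new_labels)

# Cross-tabulation showing which labels were exchanged for which.
print(table(old = old_labels, new = new_labels))
print(agreement)
print(mismatch)
```

With real data, the two vectors would be the corresponding metadata columns of the Seurat object.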

Labeling single-cell data with multiple reference datasets
Sometimes we want to annotate single-cell data with several reference datasets. This avoids missing labels that a single reference does not cover and gives a more comprehensive set of cell type labels, especially when the references differ in resolution. Multiple references are supported by simply passing several objects to the ref= and labels= arguments of SingleR().

pbmc.hesc <- SingleR(test = pbmc_for_SingleR, ref = list(BP=Blue.se, HPCA=hpca.se), labels = list(Blue.se$label.main, hpca.se$label.main))
table(pbmc.hesc$labels,meta$seurat_clusters)
> table(pbmc.hesc$labels,meta$seurat_clusters)
                  
                     0   1   2   3   4   5   6   7   8
  B_cell             0   0   0   4   0   0   0   0   0
  B-cells            0   0   0 334   0   0   0   1   1
  CD4+ T-cells     310 140   0   1   1   0   0   0   0
  CD8+ T-cells     366 318   0   4 240   0   5   1   0
  HSC                4   0   0   0   0   0   0   1   0
  Monocyte           0   0 292   0   0 130   0  21   0
  Monocytes          0   0 175   0   0  30   0   8   1
  NK cells           0   0   0   0  30   0 150   0   0
  Platelets          0   0   0   0   0   0   0   0  12
  Pre-B_cell_CD34-   0   0  13   0   0   1   0   0   0
  T_cells           17  25   0   1   0   1   0   0   0

Different reference datasets use different names, and some of them are in fact the same cell type.
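One way to deal with the differing names is to map them onto a shared vocabulary before tabulating. This is a minimal base-R sketch: the lookup table `unify` is a hypothetical, hand-written mapping (which names to merge is a judgment call, not something SingleR provides), and `labels` is a short example vector using names from the table above.

```r
# Example labels using the spellings seen in the combined table.
labels <- c("B_cell", "B-cells", "Monocyte", "Monocytes", "T_cells",
            "CD4+ T-cells", "NK cells")

# Hypothetical lookup: reference-specific name -> unified name.
unify <- c("B_cell"    = "B cells",
           "B-cells"   = "B cells",
           "Monocyte"  = "Monocytes",
           "Monocytes" = "Monocytes",
           "T_cells"   = "T cells")

# Replace labels found in the lookup and leave all others unchanged.
harmonized <- unname(ifelse(labels %in% names(unify), unify[labels], labels))
print(table(harmonized))
```

Labels that are absent from the lookup pass through untouched, so the mapping can be extended gradually as new names turn up.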

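In short, the reference libraries were built by their authors from published transcriptome data of pure, single cell types, so when that pure transcriptome data is incomplete, the annotation suffers.
That is why SingleR is a good choice as an auxiliary tool for single-cell annotation.
We will cover other auxiliary annotation tools in later posts. Thanks!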

Please follow and like. Thank you!
