ArchR官網(wǎng)教程學(xué)習(xí)筆記11:鑒定Marker峰

系列回顧:
ArchR官網(wǎng)教程學(xué)習(xí)筆記1:Getting Started with ArchR
ArchR官網(wǎng)教程學(xué)習(xí)筆記2:基于ArchR推測(cè)Doublet
ArchR官網(wǎng)教程學(xué)習(xí)筆記3:創(chuàng)建ArchRProject
ArchR官網(wǎng)教程學(xué)習(xí)筆記4:ArchR的降維
ArchR官網(wǎng)教程學(xué)習(xí)筆記5:ArchR的聚類
ArchR官網(wǎng)教程學(xué)習(xí)筆記6:單細(xì)胞嵌入(Single-cell Embeddings)
ArchR官網(wǎng)教程學(xué)習(xí)筆記7:ArchR的基因評(píng)分和Marker基因
ArchR官網(wǎng)教程學(xué)習(xí)筆記8:定義與scRNA-seq一致的聚類
ArchR官網(wǎng)教程學(xué)習(xí)筆記9:ArchR的偽批量重復(fù)
ArchR官網(wǎng)教程學(xué)習(xí)筆記10:ArchR的call peak

在前面討論基因評(píng)分的章節(jié)中,我們已經(jīng)介紹了標(biāo)記特征的識(shí)別??梢允褂孟嗤暮瘮?shù)(getMarkerFeatures())來標(biāo)識(shí)ArchRProject中存儲(chǔ)的任何矩陣中的標(biāo)記特性。標(biāo)記特征是特定細(xì)胞群所特有的特征。這些對(duì)于理解特定cluster或細(xì)胞類型非常有用。在本章中,我們將討論如何使用這個(gè)功能來識(shí)別Marker峰。

(一)鑒定Marker峰

通常,我們感興趣的是想知道哪些峰是單個(gè)cluster或一小群cluster所特有的。在ArchR中,我們可以使用addMarkerFeatures()函數(shù)和useMatrix = "PeakMatrix"以無監(jiān)督的方式完成此任務(wù)。

首先,讓我們回顧一下我們?cè)趐rojHeme5中研究的細(xì)胞類型和它們的相對(duì)比例。

#Our scRNA labels
> table(projHeme5$Clusters2)

         B      CD4.M      CD4.N        CLP  Erythroid        GMP       Mono 
       432        639       1279        384        732       1201       2579 
        NK        pDC       PreB Progenitor 
       874        300        358       1473

現(xiàn)在,我們準(zhǔn)備使用useMatrix = "PeakMatrix"調(diào)用addMarkerFeatures()函數(shù)來識(shí)別Marker峰。此外,我們通過設(shè)置bias參數(shù)來考慮TSS富集和每個(gè)細(xì)胞唯一片段的數(shù)量,讓ArchR考慮細(xì)胞群之間數(shù)據(jù)質(zhì)量的差異。

> markersPeaks <- getMarkerFeatures(
  ArchRProj = projHeme5, 
  useMatrix = "PeakMatrix", 
  groupBy = "Clusters2",
  bias = c("TSSEnrichment", "log10(nFrags)"),
  testMethod = "wilcoxon"
)

這個(gè)對(duì)象會(huì)通過getMarkerFeatures()返回一個(gè)SummarizedExperiment,其中包含一些不同的assays:

> markersPeaks
class: SummarizedExperiment 
dim: 140865 11 
metadata(2): MatchInfo Params
assays(7): Log2FC Mean ... AUC MeanBGD
rownames(140865): 1 2 ... 140864 140865
rowData names(4): seqnames idx start end
colnames(11): B CD4.M ... PreB Progenitor
colData names(0):

我們可以使用getMarkers()函數(shù)檢索這個(gè)SummarizedExperiment中我們感興趣的特定片段。此函數(shù)的默認(rèn)行為是返回一個(gè)DataFrame對(duì)象列表,每個(gè)對(duì)象是一個(gè)細(xì)胞群。

> markerList <- getMarkers(markersPeaks, cutOff = "FDR <= 0.01 & Log2FC >= 1")
> markerList
List of length 11
names(11): B CD4.M CD4.N CLP Erythroid GMP Mono NK pDC PreB Progenitor

如果你對(duì)某一個(gè)細(xì)胞群的Marker峰感興趣,可以單獨(dú)查看:

> markerList$Erythroid
DataFrame with 2656 rows and 7 columns
       seqnames     idx     start       end           Log2FC                  FDR          MeanDiff
          <Rle> <array>   <array>   <array>        <numeric>            <numeric>         <numeric>
87111     chr22    1219  30129813  30130313 4.28186227428957 1.83893337171705e-15  1.07953356543409
6670       chr1    6670 110407019 110407519 8.67005424991154 2.96546706376336e-13 0.820868679383739
58340     chr17    6183  73631581  73632081 8.61339322841304 3.53910479048555e-13 0.789176607419672
15006     chr10    1557  30440804  30441304 5.96401972746272 1.43825817219516e-12 0.758436365804101
2590       chr1    2590  27869127  27869627 6.24899843746968 3.09707434920801e-12 0.892089356271252
...         ...     ...       ...       ...              ...                  ...               ...
104472     chr5     770  32714018  32714518 1.82287547506236   0.0094547934235557 0.126001924074063
125339     chr7    6487 150724392 150724892 2.78589698640991   0.0094547934235557 0.111227644844517
51924     chr16    5054  89045463  89045963 1.75678742468666  0.00948266858646735  0.20433103914127
120570     chr7    1718  30185784  30186284 1.58859209325531  0.00955616601235496 0.326414734260411
83256     chr20    3287  50157615  50158115 1.35032538225079  0.00960141941952872 0.462686019918627

我們可以使用getMarkers()來代替DataFrame對(duì)象列表,通過設(shè)置returnGR = TRUE來返回GRangesList對(duì)象:

> markerList <- getMarkers(markersPeaks, cutOff = "FDR <= 0.01 & Log2FC >= 1", returnGR = TRUE)
> markerList
List of length 11
names(11): B CD4.M CD4.N CLP Erythroid GMP Mono NK pDC PreB Progenitor

這個(gè)GRangesList對(duì)象也可以查看某一個(gè)細(xì)胞群的marker峰:

> markerList$Erythroid
GRanges object with 2656 ranges and 3 metadata columns:
         seqnames              ranges strand |           Log2FC                  FDR          MeanDiff
            <Rle>           <IRanges>  <Rle> |        <numeric>            <numeric>         <numeric>
     [1]    chr22   30129813-30130313      * | 4.28186227428957 1.83893337171705e-15  1.07953356543409
     [2]     chr1 110407019-110407519      * | 8.67005424991154 2.96546706376336e-13 0.820868679383739
     [3]    chr17   73631581-73632081      * | 8.61339322841304 3.53910479048555e-13 0.789176607419672
     [4]    chr10   30440804-30441304      * | 5.96401972746272 1.43825817219516e-12 0.758436365804101
     [5]     chr1   27869127-27869627      * | 6.24899843746968 3.09707434920801e-12 0.892089356271252
     ...      ...                 ...    ... .              ...                  ...               ...
  [2652]     chr5   32714018-32714518      * | 1.82287547506236   0.0094547934235557 0.126001924074063
  [2653]     chr7 150724392-150724892      * | 2.78589698640991   0.0094547934235557 0.111227644844517
  [2654]    chr16   89045463-89045963      * | 1.75678742468666  0.00948266858646735  0.20433103914127
  [2655]     chr7   30185784-30186284      * | 1.58859209325531  0.00955616601235496 0.326414734260411
  [2656]    chr20   50157615-50158115      * | 1.35032538225079  0.00960141941952872 0.462686019918627
  -------
  seqinfo: 23 sequences from an unspecified genome; no seqlengths

(二)繪制Marker峰

(1)熱圖

我們可以使用markerHeatmap()函數(shù)將這些Marker峰(或getMarkerFeatures()輸出的任何特征)可視化為一個(gè)熱圖:

> heatmapPeaks <- markerHeatmap(
  seMarker = markersPeaks, 
  cutOff = "FDR <= 0.1 & Log2FC >= 0.5",
  transpose = TRUE
)
> reorder_name = c("Mono","Erythroid","GMP","Progenitor","NK","CD4.M","CD4.N","B","PreB","CLP","pDC")
> draw(heatmapPeaks, row_order= reorder_name,heatmap_legend_side = "bot", annotation_legend_side = "bot")
(2)MA Plot和火山圖

除了繪制熱圖,我們還可以為任何單個(gè)細(xì)胞群繪制一個(gè)MA plot或火山圖。我們使用markerPlot()函數(shù)。對(duì)于MA圖,我們指定plotAs =“MA”。這里,我們通過name參數(shù)指定“Erythroid”細(xì)胞群。

> pma <- markerPlot(seMarker = markersPeaks, name = "Erythroid", cutOff = "FDR <= 0.1 & Log2FC >= 1", plotAs = "MA")
> pma
> pv <- markerPlot(seMarker = markersPeaks, name = "Erythroid", cutOff = "FDR <= 0.1 & Log2FC >= 1", plotAs = "Volcano")
> pv
(3)在Browser Tracks展示Marker峰

此外,通過將相關(guān)的峰區(qū)域傳遞給plotBrowserTrack()函數(shù)中的特性參數(shù),我們可以看到這些峰區(qū)域在browser tracks上的覆蓋。這將增加一個(gè)額外的BED形式的marker峰在ArchR軌跡圖的底部。這里我們通過geneSymbol參數(shù)繪制GATA1基因:

> p <- plotBrowserTrack(
    ArchRProj = projHeme5, 
    groupBy = "Clusters2", 
    geneSymbol = c("GATA1"),
    features =  getMarkers(markersPeaks, cutOff = "FDR <= 0.1 & Log2FC >= 1", returnGR = TRUE)["Erythroid"],
    upstream = 50000,
    downstream = 50000
)
> grid::grid.newpage()
> grid::grid.draw(p$GATA1)
> plotPDF(p, name = "Plot-Tracks-With-Features", width = 5, height = 5, ArchRProj = projHeme5, addDOC = FALSE)

(三)比較兩個(gè)細(xì)胞群

Marker特征識(shí)別是一種非常特殊的差異測(cè)試。不過,ArchR還使用相同的getMarkerFeatures()函數(shù)支持標(biāo)準(zhǔn)差異測(cè)試。這里的小竅門是將useGroups設(shè)置為兩個(gè)細(xì)胞組中的一個(gè),而將bgdGroups設(shè)置為另一個(gè)細(xì)胞組。這將在提供的兩個(gè)組之間執(zhí)行差異測(cè)試。在所有這些差異測(cè)試中,傳遞給useGroups的組中較高的峰將具有正的fold change值,而傳遞給bgdGroups的組中較高的峰值將具有負(fù)的fold change值。

在這里,我們?cè)凇癊rythroid”細(xì)胞組和“Progenitor”細(xì)胞組之間進(jìn)行了配對(duì)試驗(yàn):

> markerTest <- getMarkerFeatures(
  ArchRProj = projHeme5, 
  useMatrix = "PeakMatrix",
  groupBy = "Clusters2",
  testMethod = "wilcoxon",
  bias = c("TSSEnrichment", "log10(nFrags)"),
  useGroups = "Erythroid",
  bgdGroups = "Progenitor"
)

畫MA plot:

> pma <- markerPlot(seMarker = markerTest, name = "Erythroid", cutOff = "FDR <= 0.1 & abs(Log2FC) >= 1", plotAs = "MA")
> pma

畫火山圖:

> pv <- markerPlot(seMarker = markerTest, name = "Erythroid", cutOff = "FDR <= 0.1 & abs(Log2FC) >= 1", plotAs = "Volcano")
> pv

下一章我們將繼續(xù)這個(gè)差異分析,來尋找差異可接近peaks的motif富集。

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者
【社區(qū)內(nèi)容提示】社區(qū)部分內(nèi)容疑似由AI輔助生成,瀏覽時(shí)請(qǐng)結(jié)合常識(shí)與多方信息審慎甄別。
平臺(tái)聲明:文章內(nèi)容(如有圖片或視頻亦包括在內(nèi))由作者上傳并發(fā)布,文章內(nèi)容僅代表作者本人觀點(diǎn),簡書系信息發(fā)布平臺(tái),僅提供信息存儲(chǔ)服務(wù)。
禁止轉(zhuǎn)載,如需轉(zhuǎn)載請(qǐng)通過簡信或評(píng)論聯(lián)系作者。

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容