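RNAseq Analysis (5): Installing and Testing GDCRNATools, a Differential Analysis Tool

Introduction

GDCRNATools is an R/Bioconductor package for downloading, organizing, and integratively analyzing lncRNA, mRNA, and miRNA data from GDC. Its main functions include differential gene expression analysis, survival analysis, functional enrichment analysis, competing endogenous RNA (ceRNA) analysis, lncRNA analysis, and pseudogene analysis. It can also visualize results, for example with volcano plots, bar plots, scatter plots, enrichment bubble plots, and survival curves. For detailed usage, see the package documentation.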

Fig1.png

Installation and Usage

Requirements: R (>= 3.5.0)

1. Installing GDCRNATools, method 1

The simplest way to install (requires an internet connection):

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("GDCRNATools", version = "3.8")

After a successful installation, test it by loading the package:

> library(GDCRNATools)

##############################################################################
Pathview is an open source software package distributed under GNU General
Public License version 3 (GPLv3). Details of GPLv3 is available at
http://www.gnu.org/licenses/gpl-3.0.html. Particullary, users are required to
formally cite the original Pathview paper (not just mention it) in publications
or products. For details, do citation("pathview") within R.

The pathview downloads and uses KEGG data. Non-academic uses may require a KEGG
license agreement (details at http://www.kegg.jp/kegg/legal.html).
##############################################################################
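
Besides loading the package interactively, the environment can be checked from a script. The following is a minimal sketch using only base R; the variable names `r_ok` and `pkg_ok` are illustrative and not part of GDCRNATools.

```r
# Check the R version requirement stated above (R >= 3.5.0).
r_ok <- getRversion() >= "3.5.0"

# requireNamespace() returns FALSE (instead of raising an error)
# when the package is not installed.
pkg_ok <- requireNamespace("GDCRNATools", quietly = TRUE)

message("R >= 3.5.0: ", r_ok)
message("GDCRNATools installed: ", pkg_ok)

# Report the installed version only if the package is available.
if (pkg_ok) {
  message("GDCRNATools version: ", as.character(packageVersion("GDCRNATools")))
}
```

If `pkg_ok` is `FALSE`, the installation did not complete and the install step should be repeated.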

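2. Installing GDCRNATools, method 2

When no internet connection is available, the only option is an offline installation from a local directory of source packages: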

install.packages("GDCRNATools",contriburl=paste("file:","/work/software/R/contrib",sep=''), type="source")

If no error is reported, the installation should be fine.
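
For an offline install of this kind, `install.packages()` looks for a `PACKAGES` index file in the directory given by `contriburl`, and the dependencies of GDCRNATools must also be present there as source tarballs. The sketch below shows one way to prepare such a directory, assuming the tarballs have already been copied to the path used above; the helper `local_contrib_url()` is a hypothetical name introduced here for illustration.

```r
# Local directory holding the source tarballs (*.tar.gz);
# same path as in the command above. Adjust to your own setup.
contrib_dir <- "/work/software/R/contrib"

# Hypothetical helper: build the file: URL passed to contriburl.
local_contrib_url <- function(dir) paste0("file:", dir)

if (dir.exists(contrib_dir)) {
  # (Re)generate the PACKAGES index from the tarballs in the directory.
  tools::write_PACKAGES(contrib_dir, type = "source")

  # Install from the local directory only.
  install.packages("GDCRNATools",
                   contriburl = local_contrib_url(contrib_dir),
                   type = "source")
} else {
  message("Directory not found: ", contrib_dir)
}
```

Regenerating the index after adding new tarballs keeps the directory consistent with what `install.packages()` can see.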

3. What if an error occurs?

Occasionally you may see an error such as "libudunits2.so not found!". This means the udunits library is not installed correctly and needs to be installed first:

$ wget -c ftp://ftp.unidata.ucar.edu/pub/udunits/udunits-2.2.26.tar.gz
$ tar zxf udunits-2.2.26.tar.gz
$ cd udunits-2.2.26
$ ./configure
$ make
$ make install
$ make install-info install-html install-pdf
$ make clean

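Once the udunits library is installed, install GDCRNATools again.

Usage Example

After installing GDCRNATools, I recently ran a simple test following the tutorial on the official site. The code and results are shown below:

1) Data download and preparation: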

library(GDCRNATools)
library(DT)

project <- 'TCGA-CHOL'
rnadir <- paste(project, 'RNAseq', sep='/')

#1) load RNA counts data

data(rnaCounts)  
rnaExpr <- gdcVoomNormalization(counts = rnaCounts, filter = FALSE)   ### Normalization of RNAseq data

#2) Parse metadata
metaMatrix.RNA <- gdcParseMetadata(project.id = 'TCGA-CHOL',
                                   data.type  = 'RNAseq', 
                                   write.meta = TRUE)

metaMatrix.RNA <- gdcFilterDuplicate(metaMatrix.RNA)
metaMatrix.RNA <- gdcFilterSampleType(metaMatrix.RNA)
datatable(as.data.frame(metaMatrix.RNA[1:5,]), extensions = 'Scroller',
          options = list(scrollX = TRUE, deferRender = TRUE, scroller = TRUE))


#3) Merge RNAseq data 
rnaCounts <- gdcRNAMerge(metadata  = metaMatrix.RNA, 
                         path      = rnadir,   # the folder in which the data stored
                         organized = TRUE,     # if the data are in separate folders
                         data.type = 'RNAseq')

Fig3.png

2) RNAseq differential expression analysis:

#4) Differential gene expression analysis

data(DEGAll)

DEGAll <- gdcDEAnalysis(counts     = rnaCounts, 
                        group      = metaMatrix.RNA$sample_type, 
                        comparison = 'PrimaryTumor-SolidTissueNormal', 
                        method     = 'limma')


### All DEGs
deALL <- gdcDEReport(deg = DEGAll, gene.type = 'all')

### DE long-noncoding
deLNC <- gdcDEReport(deg = DEGAll, gene.type = 'long_non_coding')

### DE protein coding genes
dePC <- gdcDEReport(deg = DEGAll, gene.type = 'protein_coding')
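
To make the idea of selecting significant genes from a differential expression table concrete, here is a minimal base-R sketch on a toy table. It does not call GDCRNATools. The `logFC` column matches the one used later in this post, while the `FDR` column name, the cutoffs, and all values are assumptions made up for illustration.

```r
# Toy DE table; the values are invented for illustration only.
toy_deg <- data.frame(
  logFC = c(2.5, -1.8, 0.3, -3.1, 1.2),
  FDR   = c(0.001, 0.020, 0.500, 0.0001, 0.008),
  row.names = c("geneA", "geneB", "geneC", "geneD", "geneE")
)

fc_cutoff  <- 2      # hypothetical fold-change cutoff (|log2FC| >= 1)
fdr_cutoff <- 0.01   # hypothetical FDR cutoff

# Keep genes that pass both the fold-change and the FDR threshold.
keep <- abs(toy_deg$logFC) >= log2(fc_cutoff) & toy_deg$FDR < fdr_cutoff
toy_sig <- toy_deg[keep, ]

# Label the direction of change relative to the reference group.
toy_sig$direction <- ifelse(toy_sig$logFC > 0, "UP", "DOWN")

print(toy_sig)
```

In this toy table, geneB fails the FDR cutoff and geneC fails both cutoffs, so three genes remain.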

3) Visualization of results:


#5) DEG visualization

## Volcano plot
gdcVolcanoPlot(DEGAll)

### Barplot
gdcBarPlot(deg = DEGAll, angle = 45, data.type = 'RNAseq')

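### Heatmap of all DEGs
degName <- rownames(deALL)
gdcHeatmap(deg.id = degName, metadata = metaMatrix.RNA, rna.expr = rnaExpr)


### Bar plot of enrichment results (GO terms), using the package's example data
data(enrichOutput)
gdcEnrichPlot(enrichOutput, type = 'bar', category = 'GO', num.terms = 10)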


### Bubble plot
gdcEnrichPlot(enrichOutput, type='bubble', category='GO', num.terms = 10)

Fig4.png
Fig5.png
Fig6.png
Fig7.png
Fig8.png

4) Pathway visualization:

### View pathway maps on a local webpage

library(pathview)

deg <- deALL$logFC
names(deg) <- rownames(deALL)

pathways <- as.character(enrichOutput$Terms[enrichOutput$Category=='KEGG'])

shinyPathview(deg, pathways = pathways, directory = 'pathview')

Fig9.png

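Conclusion

After this simple test, GDCRNATools does appear to be quite powerful, but mastering it will take closer study, and I will add more in later posts. If you have questions, leave a comment with your email address so that we can discuss them.

References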

Bioconductor : GDCRNATools

GDCRNATools: an R/Bioconductor package for integrative analysis of lncRNA, miRNA and mRNA data in GDC
