R語言:clusterProfiler進行GO富集分析和Gene_ID轉換

一、讀取文件,ID轉換

1.讀取文件

library(clusterProfiler)
library(org.Hs.eg.db)
#讀取文件,原始文件中使用空格分割的
go_ythdf2 <- read.table("./goList/RIP YTHDF2 RNAseq.txt",sep=" ")
go_ythdf2 <- t(go_ythdf2)

2.ID轉換,ENTREZID是進行GO分析最好的ID類型(針對clusterProfiler)

ID轉換用到的是bitr()函數,bitr()的使用方法:

bitr(geneID, fromType, toType, OrgDb, drop = TRUE)

geneID:一個含有gene_name的矢量
orgDb:人類的注釋包是 org.Hs.eg.db
fromType:輸入的gene_name的類型
toType:需要轉換成的gene_name的類型,可以是多種類型,用大寫character類型向量表示

3.gene_name的類型查看

org.Hs.eg.db包含有多種gene_name的類型

#查看org.Hs.eg.db中可以被選擇/使用的類型
keytypes(org.Hs.eg.db) 

[1] "ACCNUM"       "ALIAS"        "ENSEMBL"      "ENSEMBLPROT"  "ENSEMBLTRANS" "ENTREZID"    
[7] "ENZYME"       "EVIDENCE"     "EVIDENCEALL"  "GENENAME"     "GO"           "GOALL"       
[13] "IPI"          "MAP"          "OMIM"         "ONTOLOGY"     "ONTOLOGYALL"  "PATH"        
[19] "PFAM"         "PMID"         "PROSITE"      "REFSEQ"       "SYMBOL"       "UCSCKG"      
[25] "UNIGENE"      "UNIPROT"     

#ID轉換,把gene_symbol轉換成ENTREZID。
go_ythdf2_id_trance <- bitr(go_ythdf2,fromType = "SYMBOL",toType = "ENTREZID",OrgDb = "org.Hs.eg.db",drop = T)

4.org.Hs.eg.db的相關函數

keytypes():keytypes(x),查看注釋包中可以使用的類型
columns():類似于keytypes(),針對org.Hs.eg.db兩個函數返回值一致
select():select(x, keys, columns, keytype, ...) eg.

#類似于bitr()的注釋方式
select(org.Hs.eg.db, keys=geneids, columns=c("SYMBOL", "GENENAME","GO"), keytype="ENSEMBL")

二、GO注釋

函數enrichGO()進行GO富集分析,enrichGO()的使用方法:

enrichGO(gene, OrgDb, keyType = "ENTREZID", ont = "MF", pvalueCutoff = 0.05, 
           pAdjustMethod = "BH", universe, qvalueCutoff = 0.2, minGSSize = 10, 
           maxGSSize = 500, readable = FALSE, pool = FALSE)
           
Arguments:
  gene                 a vector of entrez gene id.
  OrgDb                OrgDb
  keyType              keytype of input gene
  ont                  One of "MF", "BP", and "CC" subontologies or 'ALL'.
  pvalueCutoff         Cutoff value of pvalue.
  pAdjustMethod        one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"
  universe             background genes
  qvalueCutoff         qvalue cutoff
  minGSSize            minimal size of genes annotated by Ontology term for testing.
  maxGSSize            maximal size of genes annotated for testing
  readable             whether mapping gene ID to gene Name
  pool                 If ont=’ALL’, whether pool 3 GO sub-ontologies

舉例:

result_go_ythdf2 <- enrichGO(go_ythdf2_id_trance$ENTREZID,OrgDb = "org.Hs.eg.db",
                             keyType = "ENTREZID",ont = "ALL",readable = T)

結果處理

畫圖

dotplot(result_go_ythdf2,showCategory=50)
barplot(result_go_ythdf2,showCategory=20)

保存

df_go_ythdf2 <- as.data.frame(result_go_ythdf2)
write.table(df_go_ythdf2,"df_go_ythdf2.xls",quote = F,sep = "\t")

三、整體代碼

library("clusterProfiler")
library("org.Hs.eg.db")

#讀取數據
go_ythdf2 <- read.table("./goList/RIP YTHDF2 RNAseq.txt",sep=" ")
go_ythdf2 <- t(go_ythdf2)

#ID轉換
go_ythdf2_id_trance <- bitr(go_ythdf2,fromType = "SYMBOL",toType = "ENTREZID",
                            OrgDb = "org.Hs.eg.db",drop = T)

#GO分析
result_go_ythdf2 <- enrichGO(go_ythdf2_id_trance$ENTREZID,OrgDb = "org.Hs.eg.db",
                             keyType = "ENTREZID",ont = "ALL",readable = T)

#繪圖
dotplot(result_go_ythdf2,showCategory=50)
barplot(result_go_ythdf2,showCategory=20)

#保存
df_go_ythdf2 <- as.data.frame(result_go_ythdf2)
write.table(df_go_ythdf2,"df_go_ythdf2.xls",quote = F,sep = "\t")

我想建立并管理一個高質量的生信&統(tǒng)計相關的微信討論群,如果你想參與討論,可以添加微信:veryqun 。我會拉你進群,當然有問題也可以微信咨詢我。

最后編輯于
?著作權歸作者所有,轉載或內容合作請聯系作者
【社區(qū)內容提示】社區(qū)部分內容疑似由AI輔助生成,瀏覽時請結合常識與多方信息審慎甄別。
平臺聲明:文章內容(如有圖片或視頻亦包括在內)由作者上傳并發(fā)布,文章內容僅代表作者本人觀點,簡書系信息發(fā)布平臺,僅提供信息存儲服務。

友情鏈接更多精彩內容