殺殺
生信分析中,我們?cè)谌蚪M測(cè)序后會(huì)獲得分子突變譜信息,如何直觀地將各種疾病分子亞型下的基因突變特征,以及分子共突變/互斥突變的特征展示出來(lái)呢?

好了讓我們來(lái)學(xué)習(xí)它
首先安裝和加載包
library(ComplexHeatmap)
library(circlize)
library(RColorBrewer) #這是一個(gè)顏色搭配包
我們需要導(dǎo)入兩部分?jǐn)?shù)據(jù),一部分是分子突變信息數(shù)據(jù),另一部分是臨床特征數(shù)據(jù)

這個(gè)表格的制作比較基礎(chǔ),就不多介紹,從TCGA數(shù)據(jù)庫(kù)(膠質(zhì)瘤的CGGA數(shù)據(jù)庫(kù)直接是整理好的表格)中下下來(lái)的原始數(shù)據(jù)中都有突變類型注釋,直接轉(zhuǎn)換成表格即可。
制作需要展現(xiàn)的基因列表,有些基因基本上就沒突變,或不關(guān)心,就不需要展示,也可以直接從excel表格導(dǎo)入
genelist <- c("IDH1","IDH2","TP53","ATRX","PTEN","EGFR","NF1","PIK3CA")
interestgene_matrix <- TCGA_mut_matrix_2016[match(genelist,rownames(TCGA_mut_matrix_2016)),]
將我們感興趣的基因的突變譜取出準(zhǔn)備畫圖
接下來(lái)要為熱圖準(zhǔn)備一些注釋信息,首先是每個(gè)突變的顏色,所有的突變類型都可以設(shè)置,我比較習(xí)慣取自己喜歡的顏色,也可以用顏色包
col <- c( "missense_variant" = "#0072b4","splice_acceptor_variant" = "#e29e3e",
"stop_gained" = "#008f8c", "frameshift_variant" = "#e29e3e",
"inframe_deletion" = "#e8c31c","synonymous_variant" = "#9ec66d")
接下來(lái)這一步很重要,給背景以及每個(gè)突變類型設(shè)置形狀和顏色,如果不運(yùn)行這一步,接下來(lái)的主函數(shù)將不會(huì)自動(dòng)給你分配顏色和大小,會(huì)報(bào)錯(cuò)
另外,這邊的h*0.8會(huì)在圖中每行的基因突變條間留一些空隙,大家可以乘不同的數(shù)值試試看
alter_fun <- list(
background = function(x, y, w, h) {
grid.rect(x, y, w-unit(0.5, "mm"), h*0.8,
gp = gpar(fill = "#e8e7e3", col = NA))
},
missense_variant = function(x, y, w, h) {
grid.rect(x, y, w-unit(0.5, "mm"), h*0.8,
gp = gpar(fill = col["missense_variant"], col = NA))
},
splice_acceptor_variant = function(x, y, w, h) {
grid.rect(x, y, w-unit(0.5, "mm"), h*0.8,
gp = gpar(fill = col["splice_acceptor_variant"], col = NA))
},
stop_gained = function(x, y, w, h) {
grid.rect(x, y, w-unit(0.5, "mm"), h*0.8,
gp = gpar(fill = col["stop_gained"], col = NA))
},
frameshift_variant = function(x, y, w, h) {
grid.rect(x, y, w-unit(0.5, "mm"), h*0.8,
gp = gpar(fill = col["frameshift_variant"], col = NA))
},
inframe_deletion = function(x, y, w, h) {
grid.rect(x, y, w-unit(0.5, "mm"), h*0.8,
gp = gpar(fill = col["inframe_deletion"], col = NA))
},
synonymous_variant = function(x, y, w, h) {
grid.rect(x, y, w-unit(0.5, "mm"), h*0.8,
gp = gpar(fill = col["synonymous_variant"], col = NA))
}
)
接下來(lái)設(shè)置圖例
heatmap_legend <- list(title = "Alternations",
at = c("missense_variant", "splice_acceptor_variant",
"stop_gained", "frameshift_variant", "inframe_deletion", "synonymous_variant",
"start_lost", "splice_region_variant", "splice_donor_variant",
"intron_variant", "stop_lost", "coding_sequence_variant","Mutant","Methylated","Codel"),
labels = c("missense_variant", "splice_acceptor_variant",
"stop_gained", "frameshift_variant", "inframe_deletion", "synonymous_variant",
"start_lost", "splice_region_variant", "splice_donor_variant",
"intron_variant", "stop_lost", "coding_sequence_variant","Mutant","Methylated","Codel"))
接下來(lái)就可以畫突變圖啦:使用的就是下面這個(gè)oncoPrint函數(shù)
oncoPrint(interestgene_matrix , #數(shù)據(jù)
alter_fun = alter_fun, col = col,
column_title = column_title,
#這兩行是設(shè)置要不要去掉空行和空列 remove_empty_columns = FALSE,
#去掉空行 remove_empty_rows = FALSE,
row_names_side = "left", #我們可以把基因信息放左邊
pct_side = "right",
#這個(gè)設(shè)不設(shè)置好像影響不大 alter_fun_is_vectorized = FALSE,
heatmap_legend_param = heatmap_legend, #設(shè)置圖例
)
接下來(lái)我們可以導(dǎo)入對(duì)應(yīng)的臨床信息,比如腫瘤等級(jí),分型,生存,性別信息等等。

接下來(lái)簡(jiǎn)單設(shè)置一下臨床信息函數(shù):$后面加上想要展示的列名
clini <- HeatmapAnnotation(Age=TCGA_clin_2016$AGE,
Gender=TCGA_clin_2016$SEX,
GRADE = TCGA_clin_2016$GRADE ,
HISTOLOGICAL_DIAGNOSIS = TCGA_clin_2016$HISTOLOGICAL_DIAGNOSIS ,
OS_MONTHS = TCGA_clin_2016$OS_MONTHS,
show_annotation_name = TRUE,
annotation_name_gp = gpar(fontsize = 7))
然后在剛剛的主函數(shù)中加上你的臨床信息即可
oncoplot_tcga2016 <- oncoPrint(interestgene_matrix ,
alter_fun = alter_fun, col = col,
column_title = column_title,
row_names_side = "left",
pct_side = "right",
alter_fun_is_vectorized = FALSE,
heatmap_legend_param = heatmap_legend,
bottom_annotation = clini
)
oncoplot_tcga2016
需要注意的是,如果你不為臨床信息設(shè)置顏色的話,每一次運(yùn)行出來(lái)的顏色可能是不一樣的,所以你可以設(shè)置一下顏色,以防隨機(jī)的顏色影響美觀:
col_os = colorRamp2(c(0, 4000), c("white", "blue"))
clini <- HeatmapAnnotation(TCGA_clin_2016$AGE,
Gender=TCGA_clin_2016$SEX,
GRADE = TCGA_clin_2016$GRADE ,
HISTOLOGICAL_DIAGNOSIS = TCGA_clin_2016$HISTOLOGICAL_DIAGNOSIS ,
OS_MONTHS = TCGA_clin_2016$OS_MONTHS,
#設(shè)置顏色
col = list(GRADE = c("II" = "orange","III" = "green","IV" = "skyblue" ),os = col_os),
show_annotation_name = TRUE,
annotation_name_gp = gpar(fontsize = 7))
另外,有時(shí)我們需要根據(jù)需求優(yōu)先對(duì)一些信息進(jìn)行排序,這可以直接通過(guò)對(duì)數(shù)據(jù)排序?qū)崿F(xiàn)。
當(dāng)畫圖完成之后,可以將oncoPrint的結(jié)果賦值,并且使用draw函數(shù),其中annotation_legend_side 參數(shù)可以將圖例放在下面,讓圖片整體美觀一些。
pdf("tcga2016result1.pdf") ##導(dǎo)出pdf文件
draw(oncoplot_tcga2016,annotation_legend_side = "bottom")
dev.off()
CGGA的這個(gè)網(wǎng)站里提供了一些他們的數(shù)據(jù)和代碼,以及生成的圖片,大家可以取數(shù)據(jù)下來(lái)嘗試一下
http://www.cgga.org.cn/analyse/WEseq-data-oncoprint-result.jsp
參考網(wǎng)址:
https://jokergoo.github.io/ComplexHeatmap-reference/book/oncoprint.html
https://mp.weixin.qq.com/s/8kz2oKvUQrCR2_HWYXQT4g