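R Data Visualization: Clustered Heatmaps with ComplexHeatmap (5) oncoPrint

Introduction

An oncoPrint is a heatmap-style plot for visualizing multiple genomic alteration events. The ComplexHeatmap package provides the oncoPrint() function for drawing this kind of plot.

The default style follows the cBioPortal format, but the graphics can be customized as needed.

General settings

1. Input data format

The input can take one of two forms: a matrix or a list of matrices.

1.1 Matrix

For matrix input, rows represent genes, columns represent samples, and each value gives the alteration type(s) that occur for that gene in that sample. For example: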

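library(ComplexHeatmap)  # also attaches grid, which provides grid.rect(), gpar(), ...

mat <- read.table(
  textConnection(
    "s1,s2,s3
    g1,snv;indel,snv,indel
    g2,,snv;indel,snv
    g3,snv,,indel;snv"
  ),
  row.names = 1,
  header = TRUE,
  sep = ",",
  strip.white = TRUE,  # drop the indentation in front of g1, g2, g3
  stringsAsFactors = FALSE
)
mat <- as.matrix(mat)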

For a character matrix like this, we also need a function that extracts the alteration types from each cell, for example:

> get_type_fun <- function(x) unlist(strsplit(x, ";"))
> get_type_fun(mat[1,1])
[1] "snv"   "indel"

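If the alteration types are encoded as snv|indel, define the function to split on | instead. Because strsplit() treats its pattern as a regular expression and | is a metacharacter, the split must be marked as fixed:

get_type_fun <- function(x) unlist(strsplit(x, "|", fixed = TRUE))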

Then pass this function to the get_type argument of oncoPrint().

For the common separators ; : , and |, oncoPrint() parses the values automatically, so no parsing function needs to be specified.
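The effect of the separator can be checked in base R before writing a custom get_type function. The pattern passed to strsplit() is a regular expression by default, so a bare "|" does not split on the pipe character. The helper split_alterations below is a hypothetical name (it is not part of ComplexHeatmap); it is only a sketch of a parser that also drops blank pieces.

```r
x <- "snv|indel"

# As a regex, "|" matches the empty string, so the string is split
# into single characters.
split_regex <- unlist(strsplit(x, "|"))

# fixed = TRUE treats "|" as a literal character.
split_fixed <- unlist(strsplit(x, "|", fixed = TRUE))

# Hypothetical helper: split on any of ; : , | and drop blank pieces.
split_alterations <- function(x) {
  parts <- unlist(strsplit(x, "[;:,|]"))
  parts <- trimws(parts)
  parts[nzchar(parts)]
}

print(split_regex)
print(split_fixed)
print(split_alterations("MUT; AMP"))
print(split_alterations("  "))
```

A function of this shape could be passed as get_type when the data uses a separator or padding that needs special handling.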

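The alter_fun argument customizes how each alteration type is drawn inside a heatmap cell. Each drawing function takes 4 arguments: x and y give the position of the cell, and w and h give its width and height. The col argument is a named vector that assigns a color to each alteration type.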

col = c(snv = "#fb8072", indel = "#80b1d3")
oncoPrint(
  mat, alter_fun = list(
    snv = function(x, y, w, h) 
      grid.rect(
        x, y, w*0.9, h*0.9,
        gp = gpar(fill = col["snv"],col = NA)),
    indel = function(x, y, w, h) 
      grid.rect(
        x, y, w*0.9, h*0.4,
        gp = gpar(fill = col["indel"], col = NA))
    ), 
  col = col
)

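Note: when alter_fun is a list, the order of its elements determines the drawing order. Elements defined first are drawn first.

1.2 List of matrices

The second format is a list of matrices, one matrix per alteration type. Each matrix contains only 0/1 values indicating whether the gene carries that type of alteration in the sample, and the names of the list correspond to the alteration types.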

> mat_list <- list(
+   snv = matrix(c(1, 0, 1, 1, 1, 0, 0, 1, 1), nrow = 3),
+   indel = matrix(c(1, 0, 0, 0, 1, 0, 1, 0, 0), nrow = 3))
> 
> rownames(mat_list$snv) <- rownames(mat_list$indel) <- c("g1", "g2", "g3")
> colnames(mat_list$snv) <- colnames(mat_list$indel) <- c("s1", "s2", "s3")
> mat_list
$snv
   s1 s2 s3
g1  1  1  0
g2  0  1  1
g3  1  0  1

$indel
   s1 s2 s3
g1  1  0  1
g2  0  1  0
g3  0  0  0

All matrices must have the same row names and column names.

col = c(snv = "#fb8072", indel = "#80b1d3")
oncoPrint(
  mat_list, alter_fun = list(
    snv = function(x, y, w, h) 
      grid.rect(
        x, y, w*0.9, h*0.9,
        gp = gpar(fill = col["snv"],col = NA)),
    indel = function(x, y, w, h) 
      grid.rect(
        x, y, w*0.9, h*0.4,
        gp = gpar(fill = col["indel"], col = NA))
    ), 
  col = col
)
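The two input formats carry the same information, so one can be derived from the other. The sketch below builds the list-of-matrices form from the character matrix of section 1.1 using base R only. The helper to_mat_list and the variable names mat_chr and mat_list2 are hypothetical, and the separator is assumed to be ";".

```r
# Character matrix from section 1.1 (filled column by column).
mat_chr <- matrix(
  c("snv;indel", "", "snv",          # sample s1: g1, g2, g3
    "snv", "snv;indel", "",          # sample s2
    "indel", "snv", "indel;snv"),    # sample s3
  nrow = 3,
  dimnames = list(c("g1", "g2", "g3"), c("s1", "s2", "s3"))
)

# Hypothetical helper: one 0/1 matrix per alteration type.
to_mat_list <- function(mat, types, sep = ";") {
  pieces <- strsplit(as.vector(mat), sep, fixed = TRUE)
  out <- lapply(types, function(type) {
    hit <- vapply(pieces, function(p) type %in% p, logical(1))
    matrix(as.numeric(hit), nrow = nrow(mat), dimnames = dimnames(mat))
  })
  names(out) <- types
  out
}

mat_list2 <- to_mat_list(mat_chr, c("snv", "indel"))
print(mat_list2)
```

The indel entry for g3/s3 is 1 here because the character matrix records indel;snv in that cell, whereas the hand-written mat_list above has 0 there. Since every matrix is built from the same dimnames, the requirement of identical row and column names is met by construction.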

2. Defining alter_fun

Besides a list of functions, alter_fun also accepts a single function. That function takes one extra argument, a logical vector indicating which alteration types occur for the current gene in the current sample.

oncoPrint(
  mat, alter_fun = function(x, y, w, h, v) {
    if (v["snv"])
      grid.rect(x, y, w * 0.9, h * 0.9, 
                gp = gpar(fill = col["snv"], col = NA))
    if (v["indel"])
      grid.rect(x, y, w * 0.9, h * 0.4,
                gp = gpar(fill = col["indel"], col = NA))
  }, 
  col = col
)

Using a single function allows more flexible customization:

oncoPrint(
  mat, alter_fun = function(x, y, w, h, v) {
    n = sum(v) # number of alteration types present in this cell
    h = h * 0.9
    if (n)
      grid.rect(x,
                y - h * 0.5 + 1:n / n * h,
                w * 0.9,
                1 / n * h,
                gp = gpar(fill = col[names(which(v))], col = NA),
                just = "top")
  }, col = col)
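The expression y - h * 0.5 + 1:n / n * h places the top edge of each of the n stacked rectangles. The arithmetic can be checked with plain numbers instead of grid units. The helper stack_slots is hypothetical and exists only for this illustration; the shrink factor 0.9 mirrors h = h * 0.9 in the code above.

```r
# Hypothetical helper: vertical slots for n rectangles stacked in one cell.
# y is the cell center, h the cell height (plain numbers, not grid units).
stack_slots <- function(y, h, n, shrink = 0.9) {
  h <- h * shrink
  top <- y - h * 0.5 + seq_len(n) / n * h   # top edge of each rectangle
  data.frame(top = top, bottom = top - h / n, height = h / n)
}

slots <- stack_slots(y = 0.5, h = 1, n = 2)
print(slots)
```

With two alteration types, the first rectangle spans 0.05 to 0.5 and the second spans 0.5 to 0.95, so together they fill 90% of the cell height without overlapping. The argument just = "top" in grid.rect() is what makes the computed value act as the top edge.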

Filling the cells with triangles:

oncoPrint(
  mat,
  alter_fun = list(
    # Draws the cell background; usually placed first in the list
    background = function(x, y, w, h) {
      grid.polygon(
        unit.c(x - 0.5 * w, x - 0.5 * w, x + 0.5 * w),
        unit.c(y - 0.5 * h, y + 0.5 * h, y - 0.5 * h),
        gp = gpar(fill = "grey", col = "white")
      )
      grid.polygon(
        unit.c(x + 0.5 * w, x + 0.5 * w, x - 0.5 * w),
        unit.c(y + 0.5 * h, y - 0.5 * h, y + 0.5 * h),
        gp = gpar(fill = "grey", col = "white")
      )
    },
    snv = function(x, y, w, h) {
      grid.polygon(
        unit.c(x - 0.5 * w, x - 0.5 * w, x + 0.5 * w),
        unit.c(y - 0.5 * h, y + 0.5 * h, y - 0.5 * h),
        gp = gpar(fill = col["snv"], col = "white")
      )
    },
    indel = function(x, y, w, h) {
      grid.polygon(
        unit.c(x + 0.5 * w, x + 0.5 * w, x - 0.5 * w),
        unit.c(y + 0.5 * h, y - 0.5 * h, y + 0.5 * h),
        gp = gpar(fill = col["indel"], col = "white")
      )
    }
  ),
  col = col
)

In the example above we added a background element. The background should be the first element of the list. To remove the background, set:

background = function(...) NULL

When there are many alteration types, the test_alter_fun() function can be used to check that alter_fun is set up correctly. For example:

alter_fun <- list(
  mut1 = function(x, y, w, h) 
    grid.rect(x, y, w, h, gp = gpar(fill = "red", col = NA)),
  mut2 = function(x, y, w, h) 
    grid.rect(x, y, w, h, gp = gpar(fill = "blue", col = NA)),
  mut3 = function(x, y, w, h) 
    grid.rect(x, y, w, h, gp = gpar(fill = "yellow", col = NA)),
  mut4 = function(x, y, w, h) 
    grid.rect(x, y, w, h, gp = gpar(fill = "purple", col = NA)),
  mut5 = function(x, y, w, h) 
    grid.rect(x, y, w, h, gp = gpar(fill = NA, lwd = 2)),
  mut6 = function(x, y, w, h) 
    grid.points(x, y, pch = 16),
  mut7 = function(x, y, w, h) 
    grid.segments(x - w*0.5, y - h*0.5, x + w*0.5, y + h*0.5, gp = gpar(lwd = 2))
)
test_alter_fun(alter_fun)
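A complementary check that needs only base R is to compare the alteration types present in the data with the names of alter_fun. The helper missing_alter_types and the toy objects below are hypothetical; the sketch assumes ";" as the separator.

```r
# Hypothetical helper: alteration types in `mat` without a drawing function.
missing_alter_types <- function(mat, alter_fun, sep = ";") {
  types <- unique(unlist(strsplit(as.vector(mat), sep, fixed = TRUE)))
  types <- types[nzchar(trimws(types))]
  setdiff(types, names(alter_fun))
}

toy <- matrix(c("mut1;mut2", "", "mut3", "mut1"), nrow = 2)

# Placeholder functions; only the names matter for this check.
toy_alter_fun <- list(
  background = function(x, y, w, h) NULL,
  mut1 = function(x, y, w, h) NULL,
  mut2 = function(x, y, w, h) NULL
)

missing <- missing_alter_types(toy, toy_alter_fun)
print(missing)
```

Here mut3 appears in the data but has no entry in toy_alter_fun, so it is reported as missing.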

3. Simplifying alter_fun

If only simple graphics such as rectangles or points are needed, the alter_graphic() function can be used:

oncoPrint(
  mat,
  alter_fun = list(
    snv = alter_graphic(
      "rect",
      width = 0.9,
      height = 0.9,
      fill = col["snv"]
    ),
    indel = alter_graphic(
      "rect",
      width = 0.9,
      height = 0.4,
      fill = col["indel"]
    )
  ),
  col = col
)

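4. Complex alteration types

In practice there are often many alteration types to display rather than just one or two, and telling them apart by color alone becomes difficult.

In addition, some alteration types are of primary interest while others are secondary, so there is a hierarchy among them.

For example, snv and indel alterations can each be intronic or exonic, giving intronic snv, exonic snv, intronic indel and exonic indel. The primary classes are snv and indel, and the secondary classes are intronic and exonic.

We can therefore draw the primary classes with the same kind of graphic, distinguished by color, and draw the secondary classes with different symbols.

For the following data: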

type <- c("snv;intronic", "snv;exonic", "indel;intronic", "indel;exonic", "")
m <- matrix(
  sample(type, size = 100, replace = TRUE),
  nrow = 10, ncol = 10,
  dimnames = list(paste0("g", 1:10), paste0("s", 1:10))
  )

Define alter_fun:

alter_fun <- list(
  # Background
  background = function(x, y, w, h) 
    grid.rect(x, y, w*0.9, h*0.9, gp = gpar(fill = "#CCCCCC", col = NA)),
  # Color for SNV
  snv = function(x, y, w, h) 
    grid.rect(x, y, w*0.9, h*0.9, gp = gpar(fill = "#fb8072", col = NA)),
  # Color for indel
  indel = function(x, y, w, h) 
    grid.rect(x, y, w*0.9, h*0.9, gp = gpar(fill = "#80b1d3", col = NA)),
  # Intronic alterations are drawn as a point
  intronic = function(x, y, w, h) 
    grid.points(x, y, pch = 16),
  # Exonic alterations are drawn as an X
  exonic = function(x, y, w, h) {
    grid.segments(x - w*0.4, y - h*0.4, x + w*0.4, y + h*0.4, gp = gpar(lwd = 2))
    grid.segments(x + w*0.4, y - h*0.4, x - w*0.4, y + h*0.4, gp = gpar(lwd = 2))
  }
)

Draw the plot:

oncoPrint(m, alter_fun = alter_fun, col = c(snv = "#fb8072", indel = "#80b1d3"))
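The layering in this example can be reasoned about without drawing anything. The sketch below lists which alter_fun elements apply to a given cell, in list order. The helper layers_for_cell is hypothetical, and it assumes that the background element applies to every cell and that the remaining elements are drawn in the order in which they appear in the list.

```r
# Hypothetical helper: names of the alter_fun elements that apply to one cell,
# in list order. Assumes "background" applies to every cell.
layers_for_cell <- function(cell, alter_fun_names, sep = ";") {
  types <- unlist(strsplit(cell, sep, fixed = TRUE))
  keep <- alter_fun_names == "background" | alter_fun_names %in% types
  alter_fun_names[keep]
}

fun_names <- c("background", "snv", "indel", "intronic", "exonic")

layers_indel_exonic <- layers_for_cell("indel;exonic", fun_names)
layers_empty <- layers_for_cell("", fun_names)

print(layers_indel_exonic)
print(layers_empty)
```

Under these assumptions, a cell containing indel;exonic gets the grey background, then the indel rectangle, then the X symbol on top, which is why the symbol functions are placed after the color functions in the list.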

5. Other settings

An oncoPrint is essentially a heatmap, so many heatmap arguments can be used as well, for example to show the column names:

alter_fun <- list(
  snv = function(x, y, w, h)
    grid.rect(x, y, w * 0.9, h * 0.9,
              gp = gpar(fill = col["snv"], col = NA)),
  indel = function(x, y, w, h)
    grid.rect(x, y, w * 0.9, h * 0.4,
              gp = gpar(fill = col["indel"], col = NA))
)

oncoPrint(
  mat, alter_fun = alter_fun, 
  col = col, show_column_names = TRUE
)

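Whether the row names and the percentage text are shown is controlled by show_row_names and show_pct. Their positions are set with row_names_side and pct_side, and the number of digits of the percentages is set with pct_digits: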

oncoPrint(
  mat,
  alter_fun = alter_fun,
  col = col,
  row_names_side = "left",
  pct_side = "right",
  pct_digits = 2
)
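To get a feel for the percentage labels, the values can be reproduced by hand. The sketch below assumes that the percentage of a gene is the share of samples carrying at least one alteration, rounded to pct_digits digits; the name mat_chr is hypothetical and holds the same values as the small example matrix.

```r
# Same values as the small example matrix (filled column by column).
mat_chr <- matrix(
  c("snv;indel", "", "snv",
    "snv", "snv;indel", "",
    "indel", "snv", "indel;snv"),
  nrow = 3,
  dimnames = list(c("g1", "g2", "g3"), c("s1", "s2", "s3"))
)

altered <- mat_chr != ""                      # TRUE where any alteration occurs
pct <- round(100 * rowMeans(altered), digits = 2)
print(pct)
```

With two digits, g1 comes out at 100 and g2 and g3 at 66.67, since each of the latter two is altered in two of the three samples.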

The barplots are controlled with the anno_oncoprint_barplot() annotation function:

oncoPrint(
  mat,
  alter_fun = alter_fun,
  col = col,
  top_annotation = HeatmapAnnotation(cbar = anno_oncoprint_barplot(height = unit(1, "cm"))),
  right_annotation = rowAnnotation(rbar = anno_oncoprint_barplot(
    width = unit(4, "cm"),
    axis_param = list(
      at = c(0, 2, 4),
      labels = c("zero", "two", "four"),
      side = "top",
      labels_rot = 0
    )
  ))
)

Alternatively, move the barplot from the right side to the left:

oncoPrint(
  mat,
  alter_fun = alter_fun,
  col = col,
  left_annotation =  rowAnnotation(rbar = anno_oncoprint_barplot(axis_param = list(direction = "reverse"))),
  right_annotation = NULL
)

A worked example

We use the data shipped with the ComplexHeatmap package, which comes from the cBioPortal database:

mat <- read.table(
   system.file(
     "extdata",
     package = "ComplexHeatmap",
     "tcga_lung_adenocarcinoma_provisional_ras_raf_mek_jnk_signalling.txt"
   ),
   header = TRUE,
   stringsAsFactors = FALSE,
   sep = "\t"
 )
mat[is.na(mat)] <- ""
rownames(mat) <- mat[, 1]
mat <- mat[,-1]
mat <-  mat[,-ncol(mat)]
mat <- t(as.matrix(mat))

The data contains mutation and CNV information for 26 genes of the Ras-Raf-MEK-Erk/JNK signaling pathway in 172 lung adenocarcinoma samples:

> mat[1:3,1:3]
     TCGA-05-4384-01 TCGA-05-4390-01 TCGA-05-4425-01
KRAS "  "            "MUT;"          "  "           
HRAS "  "            "  "            "  "           
BRAF "  "            "  "            "  " 

The data contains 3 alteration types: MUT, AMP and HOMDEL. We now define a graphic for each of them:

col <- c("HOMDEL" = "#ff7f00", "AMP" = "#984ea3", "MUT" = "#4daf4a")
alter_fun = list(
  background = alter_graphic("rect", fill = "#CCCCCC"),
  HOMDEL = alter_graphic("rect", fill = col["HOMDEL"]),
  AMP = alter_graphic("rect", fill = col["AMP"]),
  MUT = alter_graphic("rect", height = 0.33, fill = col["MUT"])
)

Since we only set the fill colors of the cells, alter_graphic() is sufficient here.

Set the column title and the legend:

column_title <- "OncoPrint for TCGA Lung Adenocarcinoma, genes in Ras Raf MEK JNK signalling"
heatmap_legend_param <-
  list(
    title = "Alterations",
    at = c("HOMDEL", "AMP", "MUT"),
    labels = c("Deep deletion", "Amplification", "Mutation")
  )

Draw the plot:

oncoPrint(
  mat, alter_fun = alter_fun, col = col,
  column_title = column_title,
  heatmap_legend_param = heatmap_legend_param
)

The plot contains many empty rows and columns. Remove them:

oncoPrint(
  mat, alter_fun = alter_fun, col = col,
  remove_empty_columns = TRUE, 
  remove_empty_rows = TRUE,
  column_title = column_title,
  heatmap_legend_param = heatmap_legend_param
)
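What counts as an empty row or column can be illustrated in base R. The sketch below assumes that a cell is empty when it contains nothing but whitespace, as in the "  " entries of this data set; the toy matrix and its sample names are made up for the illustration.

```r
# Toy matrix in the style of the TCGA data (filled column by column).
toy <- matrix(
  c("MUT;", "  ", "  ",     # sample p1: KRAS, HRAS, BRAF
    "  ",   "  ", "  ",     # sample p2: no alterations at all
    "AMP;", "  ", "MUT;"),  # sample p3
  nrow = 3,
  dimnames = list(c("KRAS", "HRAS", "BRAF"), c("p1", "p2", "p3"))
)

# TRUE where the cell contains at least one non-whitespace character.
altered <- matrix(grepl("[^[:space:]]", toy),
                  nrow = nrow(toy), dimnames = dimnames(toy))

keep_rows <- rowSums(altered) > 0
keep_cols <- colSums(altered) > 0
trimmed <- toy[keep_rows, keep_cols, drop = FALSE]
print(trimmed)
```

In this toy case HRAS and p2 are dropped, leaving a 2 x 2 matrix. Computing keep_rows and keep_cols like this is also a convenient way to find out how many rows and columns remain when additional annotations have to be sized by hand.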

The row_order and column_order arguments set the order of rows and columns:

oncoPrint(
  mat, alter_fun = alter_fun, col = col,
  column_title = column_title,
  row_order = 1:nrow(mat),
  remove_empty_columns = TRUE, 
  remove_empty_rows = TRUE,
  heatmap_legend_param = heatmap_legend_param
)
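The vector passed to row_order is simply a set of row indices, so any ordering rule can be computed beforehand. As one possible rule, the sketch below sorts genes by the number of altered samples. The toy matrix is made up, and the variable name custom_row_order is hypothetical.

```r
# Toy character matrix: 3 genes x 4 samples (filled column by column).
toy <- matrix(
  c("MUT", "",    "AMP",
    "MUT", "",    "AMP",
    "",    "",    "",
    "MUT", "AMP", ""),
  nrow = 3,
  dimnames = list(c("gA", "gB", "gC"), paste0("s", 1:4))
)

# Number of altered samples per gene, then indices from most to least altered.
freq <- rowSums(toy != "")
custom_row_order <- order(freq, decreasing = TRUE)

print(freq)
print(custom_row_order)
```

Here gA is altered in 3 samples, gC in 2 and gB in 1, so the resulting index vector is 1, 3, 2 and could be supplied as row_order = custom_row_order.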

We can use anno_oncoprint_barplot() to modify the barplot annotations. By default the barplots show the number of alterations; set show_fraction = TRUE to show fractions instead:

oncoPrint(
  mat,
  alter_fun = alter_fun,
  col = col,
  # Top barplot shows only the fraction of MUT
  top_annotation = HeatmapAnnotation(
    column_barplot = anno_oncoprint_barplot(
      "MUT", border = TRUE,
      show_fraction = TRUE,
      height = unit(4, "cm")
  )),
  # Right barplot shows AMP and HOMDEL
  right_annotation = rowAnnotation(
    row_barplot = anno_oncoprint_barplot(
      c("AMP", "HOMDEL"),
      border = TRUE,
      width = unit(4, "cm"),
      axis_param = list(side = "bottom", labels_rot = 90)
  )),
  remove_empty_columns = TRUE,
  remove_empty_rows = TRUE,
  column_title = column_title,
  heatmap_legend_param = heatmap_legend_param
)

As with ordinary heatmaps, HeatmapAnnotation() and rowAnnotation() can be used to add column and row annotations:

oncoPrint(
  mat,
  alter_fun = alter_fun,
  col = col,
  remove_empty_columns = TRUE,
  remove_empty_rows = TRUE,
  top_annotation = HeatmapAnnotation(
    cbar = anno_oncoprint_barplot(),
    foo1 = 1:172,
    bar1 = anno_points(1:172)
  ),
  left_annotation = rowAnnotation(foo2 = 1:26),
  right_annotation = rowAnnotation(bar2 = anno_barplot(1:26)),
  column_title = column_title,
  heatmap_legend_param = heatmap_legend_param
)

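In fact, oncoPrint() returns a Heatmap object, so further heatmaps or annotations can be added to it horizontally or vertically:

library(circlize)  # provides colorRamp2()
ht_list <- oncoPrint(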
  mat,
  alter_fun = alter_fun,
  col = col,
  column_title = column_title,
  heatmap_legend_param = heatmap_legend_param
) +
  Heatmap(
    matrix(rnorm(nrow(mat) * 10), ncol = 10),
    name = "expr",
    col = colorRamp2(c(-3, 0, 3), c("#8c510a", "white", "#01665e")),
    width = unit(4, "cm")
  )
draw(ht_list)