国产精品啪视频,91高跟黑色丝袜呻吟,97人妻九色

變異數(shù)據(jù)處理的總體思路如下：

思維導(dǎo)圖

1.數(shù)據(jù)下載

TCGA的突變數(shù)據(jù)有4個(gè)軟件得到的不同版本：

這個(gè)可以在gdc的官網(wǎng)上找到，case選擇KIRC，文件類型選擇maf即可獲得。

選擇mutect，就一個(gè)文件，直接點(diǎn)進(jìn)去，download就行，下載下來(lái)只有一個(gè)tar.gz文件，解壓放在工作目錄下。

tar -xzvf file.tar.gz 解壓tar.gz，即可得到一個(gè)maf.gz文件。

同樣的篩選條件，參考http://www.itdecent.cn/p/559d9604fcdf下載臨床信息數(shù)據(jù)并整理。

2.數(shù)據(jù)讀取

使用R包maftools讀取。

rm(list=ls())
options(stringsAsFactors = F) 
require(maftools) 
require(dplyr)
project='TCGA_KIRC'
laml = read.maf(maf = 'TCGA.KIRC.mutect.2a8f2c83-8b5e-4987-8dbf-01f7ee24dc26.DR-10.0.somatic.maf.gz')
laml 
maf_df = laml@data
save(laml,maf_df,file = "maf.Rdata")
length(unique(maf_df$Tumor_Sample_Barcode))
length(unique(maf_df$Hugo_Symbol))

因此，有336個(gè)病人,9444個(gè)突變基因信息。了解maf還可以用下面的幾個(gè)函數(shù)：

getSampleSummary(laml) 
getGeneSummary(laml) 
getFields(laml)

3.突變數(shù)據(jù)的可視化

3.1 plotmafSummary

maftools 自帶可視化函數(shù)plotmafSummary，可以比較直觀的統(tǒng)計(jì)maf文件的數(shù)據(jù)。

#if (as.numeric(dev.cur()) != 1) graphics.off()
plotmafSummary(maf = laml, rmOutlier = TRUE,
               #showBarcodes = FALSE,
               addStat = 'median', dashboard = TRUE, titvRaw = FALSE)

就是將maf_df 數(shù)據(jù)框做了統(tǒng)計(jì)，用barplot和boxplot做了可視化。

image.png

3.2 突變頻譜圖

代碼其實(shí)就一句！

oncoplot(maf = laml, top = 30, fontSize = 1)

TOP30的基因

#上面這幅圖是TOP30的基因，如果想要挑出幾個(gè)特定的基因也是可以的
oncoplot(maf = laml, genes = unique(maf_df$Hugo_Symbol[50:55]), fontSize = 1)

挑出了5個(gè)基因

我們可以看一下oncoplot這個(gè)繪圖包的幫助文檔里面的示例代碼

image.png

運(yùn)行示例代碼出圖

可以看到最后加了臨床信息，配色也有所改變

可以加上一些臨床信息

# pd 是臨床信息數(shù)據(jù)框，第一列是Tumor_Sample_Barcode，后面幾列是各種分組，如gender，age等
load("clinical.Rdata")
pd = cl_df[,c(12,5,6)]  #選列
#colnames(pd)[1] = "Tumor_Sample_Barcode"
#調(diào)整pd的病人id順序和laml里的樣本id順序一致
sam_id = unique(laml@variants.per.sample$Tumor_Sample_Barcode)
library(stringr)
pd = pd[match(str_sub(sam_id,1,12),pd$bcr_patient_barcode),]
pd$Tumor_Sample_Barcode = sam_id
pd = pd[,c(4,2,3)]
pd = na.omit(pd)
laml = read.maf(maf = 'TCGA.KIRC.mutect.2a8f2c83-8b5e-4987-8dbf-01f7ee24dc26.DR-10.0.somatic.maf.gz',clinicalData = pd)

oncoplot(maf = laml,
         sortByAnnotation = TRUE,
         clinicalFeatures = c('vital_status',
                              'gender')
          )

image.png

下面展開(kāi)一下這個(gè)圖的解讀

主體熱圖

一行是一個(gè)基因，總共是9444個(gè)基因，從中截取了top30；一列是一個(gè)樣本，總共是336個(gè)樣本。不同顏色代表不同類型的突變。

右側(cè)條形圖

右側(cè)的條形圖是每個(gè)基因的突變樣本數(shù)、突變類型和比例
驗(yàn)證一下突變樣本數(shù)

count(maf_df,Hugo_Symbol,sort = T)  #count出自dplyr包，表示統(tǒng)計(jì)數(shù)據(jù)框中某一列的重復(fù)值

結(jié)果顯示VHL在169樣本中突變，樣本總數(shù)336，所以是49%，以此類推
條形圖的顏色是突變類型，以VHL基因?yàn)槔?，他的突變類型分別是：

maf_df %>% filter(Hugo_Symbol=="VHL") %>%
  count(Variant_Classification,sort = T)

頂部條形圖

顯示每個(gè)樣本里突變的基因個(gè)數(shù)，可以看到最高的是那個(gè)一枝獨(dú)秀的1600多。

laml@variants.per.sample %>% head()

image.png

可以看到是Dead

#####################################分割線###############################

這個(gè)圖怎么做出來(lái)？附上小潔老師的代碼

image.png

0.R包

rm(list=ls())
options(stringsAsFactors = F) 
project='TCGA_KIRC'
library(maftools) 
library(dplyr)
library(pheatmap)
library(ComplexHeatmap)
library(stringr)
library(deconstructSigs)
library(BSgenome)
library(BSgenome.Hsapiens.UCSC.hg38)

1.讀取突變數(shù)據(jù)maf

是和上一篇一樣的數(shù)據(jù)，從gdc下載的maf文件和臨床信息，見(jiàn)：
http://www.itdecent.cn/p/2d6cb5bd8771

laml = read.maf(maf = "TCGA.KIRC.mutect.2a8f2c83-8b5e-4987-8dbf-01f7ee24dc26.DR-10.0.somatic.maf.gz")
laml 

mut=laml@data
mut[1:4,1:2]
mut=mut[mut$Variant_Type=='SNP',]  #把SNP的數(shù)據(jù)挑出來(lái)
#每行記錄了一個(gè)突變，所以統(tǒng)計(jì)樣本列的行數(shù)即為樣本的突變數(shù)量
plot(table(mut[,16]),las = 2)

2.制作signatures的輸入數(shù)據(jù)（96頻譜）

關(guān)于什么是mutation

signature：http://www.bio-info-trainee.com/1619.html

關(guān)于96頻譜：http://www.biotrainee.com/thread-832-1-1.html

a = mut[,c(16,5,6,12,13)]
colnames(a)=c( "Sample","chr", "pos","ref",  "alt")
a$Sample=as.character(a$Sample)

sigs.input <- mut.to.sigs.input(mut.ref = a, 
                                sample.id = "Sample", 
                                chr = "chr", 
                                pos = "pos", 
                                ref = "ref", 
                                alt = "alt",
                                bsg = BSgenome.Hsapiens.UCSC.hg38)
class(sigs.input)
#第一個(gè)樣本的突變繪圖
barplot(as.numeric(sigs.input[1,])~colnames(sigs.input))
sigs.input[1:4,1:4]

一頓操作猛如虎，生成signature與樣本對(duì)應(yīng)關(guān)系的矩陣

if(!file.exists(paste0(project,"w.Rdata"))){
  w=lapply(unique(a$Sample), function(i){
    ## signatures.cosmic signatures.nature2013
    sample_1 = whichSignatures(tumor.ref = sigs.input[,], 
                               signatures.ref = signatures.cosmic, 
                               sample.id =  i, 
                               contexts.needed = TRUE,
                               tri.counts.method = 'default')
    print(c(i,which(unique(a$Sample)==i)))
    return(sample_1$weights)
  })
  w=do.call(rbind,w)
  save(w,file = paste0(project,"w.Rdata"))
}
load(paste0(project,"w.Rdata"))
mat = t(w)

3.可視化

矩陣的可視化--熱圖

3.1 簡(jiǎn)單版本的熱圖

pheatmap(mat,
         cluster_rows = F,
         cluster_cols = T,
         show_colnames = F)

3.2 好看一點(diǎn)的熱圖

看到了這個(gè)圖，氮素不知道用什么包畫(huà)的，反正閑的沒(méi)事干，不如自己模仿一波。如果你知道用的什么包，可以告訴我~感謝

熱圖上面放直方圖，直方圖內(nèi)容是每個(gè)樣本的突變數(shù)量。想到了ComplexHeatmap，可以將直方圖作為注釋添加上去。

難點(diǎn)是調(diào)整順序，R語(yǔ)言神技能加持。

捋一捋：

直方圖橫縱坐標(biāo)是樣本和突變數(shù)量
熱圖輸入數(shù)據(jù)mat橫縱坐標(biāo)是樣本和signature
注釋欄來(lái)源于clinical數(shù)據(jù)，與樣本一一對(duì)應(yīng)。
所以需要調(diào)整樣本順序，三者統(tǒng)一。
注釋欄簡(jiǎn)化一下，用生死和性別；先排一下序，再按照排好的順序給mat和直方圖輸入數(shù)據(jù)排序。

load("clinical.Rdata")
table( str_sub(colnames(mat),1,12)%in% cl_df$bcr_patient_barcode)
#一一對(duì)應(yīng)，只要mat列名對(duì)應(yīng)的臨床數(shù)據(jù)。
cl_df = cl_df[cl_df$bcr_patient_barcode %in% str_sub(colnames(mat),1,12),]
#按照性別和生死排序
cl_df = arrange(cl_df,gender,vital_status)
#mat排序
x = match(cl_df$bcr_patient_barcode,str_sub(colnames(mat),1,12))
mat = mat[,x]
#直方圖輸入數(shù)據(jù)排序
for_bar = laml@variants.per.sample$Variants[match(cl_df$bcr_patient_barcode,str_sub(laml@variants.per.sample$Tumor_Sample_Barcode,1,12))]
#確認(rèn)順序是否正確
cl_df$bcr_patient_barcode[1] == str_sub(colnames(mat),1,12)[1]
cl_df$bcr_patient_barcode[1] == for_bar[1]

annotation = HeatmapAnnotation(mut = anno_barplot(for_bar),
  df = data.frame(gender = cl_df$gender,
                  vital = cl_df$vital_status)
                               )

# top_annotation參數(shù)在頂部添加注釋信息
ht_list = Heatmap(mat, name = "mat", 
        #top_annotation = column_ha, 
        #right_annotation = row_ha,
        cluster_rows = F,
        cluster_columns = F,
        #show_row_names = F,
        show_column_names = F,
        top_annotation = annotation)

draw(ht_list, heatmap_legend_side = "left", annotation_legend_side = "bottom")

箱線圖

image.png

0.輸入數(shù)據(jù)

rm(list=ls())
load("for_boxplot.Rdata")

這里面的三個(gè)數(shù)據(jù)：

exp和meta是miRNA的表達(dá)矩陣和臨床信息，由GDC下載整理得到。

mut是突變信息，讀取maf得到的數(shù)據(jù)框。

1.比較任意miRNA在tumor和normal樣本中的表達(dá)量

這個(gè)只需要表達(dá)矩陣，以hsa-mir-143為例畫(huà)圖，可替換為其他任意miRNA。

exp[1:4,1:4]
group_list=ifelse(as.numeric(substr(colnames(exp),14,15)) < 10,'tumor','normal')
table(group_list)

library(ggstatsplot)
dat = data.frame(gene = exp["hsa-mir-143",],
                 group = group_list)
ggbetweenstats(data = dat, x = group,  y = gene,title = "hsa-mir-143")

這樣就畫(huà)出來(lái)了上面那個(gè)圖
還有相關(guān)性的圖：

image.png

按照stage分組

按照是否突變來(lái)分組

分面

還是分面

拼圖

呼。~~~
代碼就不貼了，反正還沒(méi)有搞懂。。。慢慢來(lái)吧！??！
文末要特別感謝數(shù)據(jù)挖掘班的小潔老師，以上所有代碼都是來(lái)自小潔老師?

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av

變異數(shù)據(jù)

變異數(shù)據(jù)

1.數(shù)據(jù)下載

2.數(shù)據(jù)讀取

3.突變數(shù)據(jù)的可視化

3.1 plotmafSummary

3.2 突變頻譜圖

可以看到最后加了臨床信息，配色也有所改變

主體熱圖

右側(cè)條形圖

頂部條形圖

這個(gè)圖怎么做出來(lái)？附上小潔老師的代碼

0.R包

1.讀取突變數(shù)據(jù)maf

2.制作signatures的輸入數(shù)據(jù)（96頻譜）

3.可視化

3.1 簡(jiǎn)單版本的熱圖

3.2 好看一點(diǎn)的熱圖

捋一捋：

箱線圖

0.輸入數(shù)據(jù)

1.比較任意miRNA在tumor和normal樣本中的表達(dá)量

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九 欧美,1769亚洲,黄色成人av

變異數(shù)據(jù)

1.數(shù)據(jù)下載

2.數(shù)據(jù)讀取

3.突變數(shù)據(jù)的可視化

3.1 plotmafSummary

3.2 突變頻譜圖

可以看到最后加了臨床信息，配色也有所改變

主體熱圖

右側(cè)條形圖

頂部條形圖

這個(gè)圖怎么做出來(lái)？附上小潔老師的代碼

0.R包

1.讀取突變數(shù)據(jù)maf

2.制作signatures的輸入數(shù)據(jù)（96頻譜）

3.可視化

3.1 簡(jiǎn)單版本的熱圖

3.2 好看一點(diǎn)的熱圖

捋一捋：

箱線圖

0.輸入數(shù)據(jù)

1.比較任意miRNA在tumor和normal樣本中的表達(dá)量

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av

這個(gè)圖怎么做出來(lái)？附上小潔老師的代碼