[Grunt work] Computing HRD (first try)

HRD score = LOH + TAI + LST
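
Here LOH is loss of heterozygosity, TAI is telomeric allelic imbalance, and LST is large-scale state transitions. As a toy illustration of the formula, the sum can be written in R. The three component values below are made-up numbers, not results from any real sample; in practice they would come from a tool such as scarHRD.

```r
# Hypothetical component scores for one sample (illustrative values only).
scores <- c(LOH = 12, TAI = 18, LST = 21)

# HRD score = LOH + TAI + LST
hrd_score <- sum(scores)
print(hrd_score)
```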

Reference: Sztupinszki et al., Migrating the SNP array-based homologous recombination deficiency measures to next generation sequencing data of breast cancer, npj Breast Cancer, https://www.nature.com/articles/s41523-018-0066-6.

R package: scarHRD
https://github.com/sztup/scarHRD#introduction

workflow

Step 1 is the most critical: obtaining the input file.

I. Trying Sequenza

According to the Sequenza user guide, it needs BAM files, which are hard to obtain. It also requires Python, which I don't know.


TCGA data level

Useful reference pages:

  1. Sequenza User Guide
    https://rdrr.io/cran/sequenza/f/vignettes/sequenza.Rmd
  2. TCGA RNAseq BAM File
    http://seqanswers.com/forums/showthread.php?t=65176
  3. TCGA_bam_splicer
    https://freesoft.dev/program/131953985
  4. The BAM file format
    https://blog.csdn.net/qq_36608036/article/details/104630366

II. Trying ASCAT

Reference: ASCAT (Van Loo et al. 2010)
https://github.com/VanLoo-lab/ascat
First, run the ExampleData that ships with the package.

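library(ASCAT)
# Load the tumor and germline Log R / BAF tracks (ExampleData file names)
ascat.bc = ascat.loadData("Tumor_LogR.txt","Tumor_BAF.txt","Germline_LogR.txt","Germline_BAF.txt")
ascat.plotRawData(ascat.bc)
# Segment the data with ASPCF, then plot the segmented tracks
ascat.bc = ascat.aspcf(ascat.bc)
ascat.plotSegmentedData(ascat.bc)
# Fit ploidy and aberrant cell fraction, then derive allele-specific copy numbers
ascat.output = ascat.runAscat(ascat.bc)

# Copy number of the A and B alleles
ascat.output$nA
ascat.output$nB
# Per-sample tumor ploidy and aberrant cell fraction
# (recent ASCAT releases may name the latter field "purity")
ascat.output$ploidy
ascat.output$aberrantcellfraction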
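
Since the whole point is to get an input file for scarHRD, here is a minimal sketch of reshaping ASCAT-style segments into the allele-specific segment table that the scarHRD README describes. Everything here is an assumption to check against your installed versions: I assume ascat.output$segments has the columns sample, chr, startpos, endpos, nMajor and nMinor, and that scarHRD expects the columns SampleID, Chromosome, Start_position, End_position, total_cn, A_cnv, B_cnv and ploidy. The helper name to_scarhrd_input, the "chr" prefix, the choice of which allele goes into A_cnv, and the toy data are all my own.

```r
# Toy stand-in for ascat.output$segments (values are invented).
segments <- data.frame(
  sample   = "S1",
  chr      = c("1", "1", "2"),
  startpos = c(1e6, 5e7, 1e6),
  endpos   = c(4.9e7, 2.4e8, 2.4e8),
  nMajor   = c(1, 2, 1),
  nMinor   = c(1, 0, 1),
  stringsAsFactors = FALSE
)
# Toy stand-in for ascat.output$ploidy (named by sample).
ploidy <- c(S1 = 2.1)

# Hypothetical helper: rename columns and add total copy number and ploidy.
to_scarhrd_input <- function(segments, ploidy) {
  data.frame(
    SampleID       = segments$sample,
    Chromosome     = paste0("chr", segments$chr),
    Start_position = segments$startpos,
    End_position   = segments$endpos,
    total_cn       = segments$nMajor + segments$nMinor,
    A_cnv          = segments$nMajor,
    B_cnv          = segments$nMinor,
    ploidy         = unname(ploidy[segments$sample]),
    stringsAsFactors = FALSE
  )
}

seg_table <- to_scarhrd_input(segments, ploidy)
print(seg_table)
```

The resulting table could then be written out with write.table(seg_table, "S1_segments.txt", sep = "\t", quote = FALSE, row.names = FALSE) and handed to scarHRD; the file name is just a placeholder.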

Goal: produce the data shown in the figure below.


ASCAT output

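Unfortunately, the README on GitHub is not very detailed and manual.pdf is gone, so the only option is to read the original ASCAT paper (Van Loo et al. 2010) to work out what the parameters mean.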

ASCAT profiles

ASCAT profiles: genome-wide allele-specific copy number profiles
Left panel: ASCAT estimates the tumor ploidy and the fraction of aberrant cells by evaluating the goodness of fit over a grid of possible values for both parameters (blue = good solution) and selecting the best solution, marked by the green cross. For example, on the left of panel A the green cross corresponds to ploidy = 1.77 and fraction of aberrant cells = 80%.
Top-right panel: the x axis is genomic location and the y axis is copy number (green: the allele with the lowest copy number; red: the allele with the highest copy number).
Bottom-right panel: an aberration reliability score.

  • What does "fit" mean here?
Frequency of LOH and copy number-neutral events

(A) Frequency of LOH across the genome. Probes are shown in genomic order along the x axis, from chromosome 1 to chromosome X, where different chromosomes are delimited by gray lines.
(B) Frequency of copy number neutral events across the genome. For diploid tumors, copy number-neutral events correspond to a subset of LOH (copy number-neutral LOH), but for, for example, tetraploid tumors, a copy number neutral event can also be three copies of A and one copy of B.

  • What is LOH?
  • What is a copy number neutral event?

LOH: The loss of heterozygosity (LOH) score was defined as the number of chromosomal LOH regions shorter than a whole chromosome and longer than 15 Mb.
Copy number neutral event: the total copy number is normal, but there is allelic bias.
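
To make both definitions concrete, here is a rough sketch that counts them on a toy segment table. This is my own simplification, not how scarHRD or ASCAT implement it: I treat a segment as LOH when the minor allele count is 0, compare each segment on its own against a made-up chromosome length instead of merging neighbouring segments, and call a segment copy number neutral when its total copy number equals an assumed baseline of 2 while the two alleles differ. All column names and values are invented.

```r
# Toy allele-specific segments (invented values; positions in bp).
seg <- data.frame(
  chr    = c("1", "1", "2", "3"),
  start  = c(1, 60e6, 1, 1),
  end    = c(40e6, 70e6, 242e6, 100e6),
  nMajor = c(2, 1, 1, 1),
  nMinor = c(0, 0, 0, 1),
  stringsAsFactors = FALSE
)
# Hypothetical chromosome lengths, used only to exclude whole-chromosome LOH.
chr_len <- c("1" = 248e6, "2" = 242e6, "3" = 198e6)
baseline_cn <- 2  # assumed "normal" total copy number (diploid tumor)

seg$length    <- seg$end - seg$start + 1
seg$is_loh    <- seg$nMinor == 0 & seg$nMajor > 0
seg$whole_chr <- unname(seg$length >= chr_len[seg$chr])

# LOH score: LOH regions longer than 15 Mb but shorter than a whole chromosome.
hrd_loh <- sum(seg$is_loh & seg$length > 15e6 & !seg$whole_chr)

# Copy number neutral events: normal total copy number, unequal alleles.
seg$cn_neutral <- (seg$nMajor + seg$nMinor) == baseline_cn &
  seg$nMajor != seg$nMinor

print(hrd_loh)
print(seg[, c("chr", "length", "is_loh", "whole_chr", "cn_neutral")])
```

In the toy data only the first segment counts toward the LOH score: the second is too short, the third spans the whole chromosome, and the fourth is not LOH at all. The first segment (two copies of one allele, none of the other) is also the only copy number neutral event.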

Illumina SNP arrays deliver two output tracks: Log R, a measure of total signal intensity, and B allele frequency (BAF), a measure of allelic contrast.
The Log R track is similar to the output given by common array-CGH platforms and quantifies the (total) copy number of each genomic locus.
The BAF track shows the relative presence of each of the two alternative nucleotides (called “A” and “B”) at each SNP locus profiled.
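
To see how the two tracks relate to copy number, here is a small sketch of the model as I read it in Van Loo et al. 2010: given allele-specific copy numbers nA and nB, the aberrant cell fraction rho and the tumor ploidy psi_t, it returns the expected Log R and BAF. The function name expected_tracks and the argument names are my own, and gamma = 0.55 is assumed as the platform constant for Illumina arrays; please check the equations against the paper before relying on them.

```r
# Expected Log R and BAF for a locus, under the ASCAT-style mixture model.
# nA, nB: copies of the A and B allele in tumor cells
# rho:    fraction of aberrant (tumor) cells
# psi_t:  tumor ploidy
# gamma:  platform-dependent constant (assumed 0.55 here)
expected_tracks <- function(nA, nB, rho, psi_t, gamma = 0.55) {
  psi   <- 2 * (1 - rho) + rho * psi_t        # average ploidy of the mixed sample
  total <- 2 * (1 - rho) + rho * (nA + nB)    # average total copies at this locus
  list(
    logR = gamma * log2(total / psi),
    BAF  = (1 - rho + rho * nB) / total
  )
}

# Heterozygous locus with normal copy number vs. loss of the B allele.
normal <- expected_tracks(nA = 1, nB = 1, rho = 0.8, psi_t = 2)
loss   <- expected_tracks(nA = 1, nB = 0, rho = 0.8, psi_t = 2)
print(normal)
print(loss)
```

With a normal 1+1 locus the expected Log R is 0 and the BAF is 0.5; losing the B allele pushes Log R below 0 and pulls BAF away from 0.5, but not all the way to 0 because 20% of the cells are still normal.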

PennCNV
  • To get LRR and BAF, is there really no way around processing the CEL files?

-end-
