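I have recently been wrestling with structural variant (SV) identification. There are many SV callers, but two recent genome papers mainly used smartie-sv and bayesTyper. smartie-sv is mainly used for comparisons between genome assemblies and can also be applied to long-read (third-generation) data; bayesTyper is mainly used to identify SVs from short-read (second-generation) data, suits population-scale work, and is fairly fast.

References: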
[1] Genome assembly of a tropical maize inbred line provides insights into structural variation and crop improvement
[2] Eight high-quality genomes reveal pan-genome architecture and ecotype differentiation of Brassica napus

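Installing smartie-sv

Details on smartie-sv are in its GitHub repository (https://github.com/zeeev/smartie-sv).

1. Installing smartie-sv, which depends on htslib and blasr

$ git clone --recursive https://github.com/zeeev/smartie-sv.git # fetch smartie-sv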
$ cd smartie-sv && make

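2. Installing htslib

$ git clone --recurse-submodules https://github.com/samtools/htslib.git
$ cd htslib
$ autoreconf -i                 # generate the configure script
$ ./configure --prefix=$(pwd)   # install into htslib/ so the binaries land in htslib/bin, as the links in step 4 expect
$ make
$ make install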

3. Installing blasr

$ wget https://github.com/PacificBiosciences/blasr/archive/master.zip -O blasr.zip
$ unzip blasr.zip
$ mv blasr-master/ blasr
$ cd blasr
$ make -j 8

4. Configuring smartie-sv

smartie-sv uses bgzip, htsfile and tabix from htslib, plus blasr and sawriter from blasr, so we link these five executables into smartie-sv's bin directory so that smartie-sv can call them.

$ cd smartie-sv/bin
$ ln -s ../../htslib/bin/bgzip ./
$ ln -s ../../htslib/bin/htsfile ./
$ ln -s ../../htslib/bin/tabix ./
$ ln -s ../../blasr/alignment/bin/blasr ./
$ ln -s ../../blasr/alignment/bin/sawriter ./
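
Before running the pipeline it can help to confirm that all five links resolve. The following is only a minimal sketch: the function name `missing_tools` and the default directory `smartie-sv/bin` are illustrative assumptions, not part of smartie-sv.

```python
import os
from pathlib import Path

# The five executables named in this step.
REQUIRED_TOOLS = ["bgzip", "htsfile", "tabix", "blasr", "sawriter"]

def missing_tools(bin_dir):
    """Return the required tools that are absent or not executable in bin_dir."""
    bin_path = Path(bin_dir)
    missing = []
    for name in REQUIRED_TOOLS:
        tool = bin_path / name
        # is_file() and os.access() both follow symlinks,
        # so a dangling link is reported as missing.
        if not (tool.is_file() and os.access(tool, os.X_OK)):
            missing.append(name)
    return missing

# Hypothetical location; adjust to where smartie-sv was cloned.
problems = missing_tools("smartie-sv/bin")
print("all tools found" if not problems else "missing: " + ", ".join(problems))
```

An empty result means every tool is present and executable; any name printed points at a link whose target path needs correcting.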

bin directory.png

5. Using smartie-sv

Usage examples are given in the official README.md; the pipeline is run with snakemake.

$ bin/sawriter target.fasta # index the genome with sawriter
# when running locally
$ snakemake -s Snakefile -w 50 -p -k -j 20

6. Results

# Contents of the Snakefile, located at "smartie-sv/pipeline/Snakefile" in the install directory; it defines the SV-calling rules used by the pipeline
shell.prefix("source config.sh; set -eo pipefail ; ")

configfile: "config.json"

def _get_target_files(wildcards):
    return config["targets"][wildcards.target]

def _get_query_files(wildcards):
    return config["queries"][wildcards.query]

rule dummy:
     input: expand("variants/{target}-{query}.svs.bed", target=config["targets"], query=config["queries"])

rule callSVs:
     message: "Calling SVs"
     input  : SAM="mappings/{target}-{query}-aligned.sam", TARGET=_get_target_files, PG=config["install"] + "/bin/printgaps"
     output : "variants/{target}-{query}.svs.bed"
     shell  : """
            cat {input.SAM} | {input.PG} {input.TARGET} variants/{wildcards.target}-{wildcards.query}
     """

rule runBlasr:
     message: "Aligning query to target"
     input:   BL=config["install"] + "/bin/blasr", TARGET=_get_target_files, QUERY=_get_query_files
     output:  "mappings/{target}-{query}-aligned.sam", "unmappings/{target}-{query}-unaligned.fasta"
     shell:   """
              {input.BL} -clipping hard -alignContigs -sam -minMapQV 30 -nproc 6 -minPctIdentity 50 -unaligned {output[1]} {input.QUERY} {input.TARGET} -out {output[0]}
     """

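The only file we need to edit is "config.json", which holds the smartie-sv directory, the reference (target) genome file and the query genome file, in JSON format. An absolute path is best for "install". Note that JSON allows neither comments nor a trailing comma after the last entry.

{
"install":"smartie-sv",
"targets":{"zs11":"Darmor.fa"},
"queries":{"Darmor":"Darmor.fa"}
}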
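
Since JSON accepts neither comments nor trailing commas, one way to avoid hand-editing mistakes is to generate the file. This is a sketch under assumptions: `build_config` is a hypothetical helper, and the three keys are simply the ones the Snakefile above reads (`install`, `targets`, `queries`).

```python
import json
from pathlib import Path

def build_config(install_dir, targets, queries):
    """Build the dict the Snakefile reads through config[...]."""
    return {
        # Resolve to an absolute path, as recommended for "install".
        "install": str(Path(install_dir).resolve()),
        "targets": dict(targets),
        "queries": dict(queries),
    }

# Example values taken from the config shown in this post.
config = build_config("smartie-sv", {"zs11": "Darmor.fa"}, {"Darmor": "Darmor.fa"})
text = json.dumps(config, indent=2)
json.loads(text)  # round-trip to confirm the text is valid JSON
print(text)
# To save it: Path("config.json").write_text(text + "\n")
```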

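The run produces three directories, mappings, unmappings and variants; the SV information is mainly in variants.

result files.png

The file zs11-Darmor.svs.bed contains the SV information in BED format.

zs11-Darmor.svs.bed.png

Installing bayesTyper

Details on bayesTyper: https://github.com/bioinformatics-centre/BayesTyper
Details on Platypus: https://github.com/andyrimmer/Platypus
Details on Manta: https://github.com/Illumina/manta

The official bayesTyper documentation recommends calling variants with three tools (GATK, Platypus and Manta), merging the resulting variant files with bayesTyperTools, clustering them with bayesTyper cluster, and finally genotyping with bayesTyper genotype.

1. Installing bayesTyper

Installing bayesTyper is very simple; once the build finishes, two executables, bayesTyper and bayesTyperTools, appear in the bin directory.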

$ git clone https://github.com/bioinformatics-centre/BayesTyper.git
$ cd BayesTyper && make -j 4

2. Installing Platypus

$ git clone https://github.com/andyrimmer/Platypus.git
$ cd Platypus && make -j 4
# to use Platypus, just run Platypus.py in the bin directory
$ python bin/Platypus.py --bamFiles=BAM.bam --refFile=REF.fa --output=variants.vcf

3. Installing Manta:

Manta is built with cmake, so a separate build directory is needed. Manta also requires Python 2.7 and the Cython module. After a successful install, the bin directory contains three files: configManta.py, configManta.py.ini and runMantaWorkflowDemo.py; configManta.py is the one we mainly use.

$ git clone https://github.com/Illumina/manta.git
$ mkdir manta_build && cd manta_build
$ ../manta/configure --prefix=`pwd`   # configure script from the cloned source tree
$ make -j4                            # build inside manta_build
$ make -j4 install

4. Using bayesTyper

In the official documentation, the variant-calling workflow has two main parts:

4.1: Generation of variant candidates

Starting from aligned BAM files (bwa mem is recommended), the steps are:

4.1.1 Call candidate sites with GATK HaplotypeCaller.
4.1.2 Call small and medium-sized variants with Platypus.
4.1.3 Call large structural variants with Manta.
4.1.4 Merge the results of the three callers with bayesTyperTools combine:

$ bayesTyperTools combine -v GATK:<gatk_sample1>.vcf,GATK:<gatk_sample2>.vcf,PLATYPUS:<platypus_sample1>.vcf,PLATYPUS:<platypus_sample2>.vcf,MANTA:<manta_sample1>.vcf,...,prior:<prior>.vcf -o <candiate_variants_prefix> -z

Note that -v takes a single string made of entries in the form GATK:sample.vcf, separated by commas; -z writes gzip-compressed output.
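
With many samples, the -v string is easier to assemble programmatically than by hand. A minimal sketch, assuming a hypothetical helper `combine_variant_arg` and made-up file names; only the `SOURCE:file.vcf` comma-separated layout comes from the command above.

```python
def combine_variant_arg(callsets):
    """Join (source, vcf_path) pairs into the string passed to `combine -v`."""
    return ",".join(f"{source}:{path}" for source, path in callsets)

# Hypothetical file names for two samples and three callers.
callsets = [
    ("GATK", "gatk_sample1.vcf"),
    ("GATK", "gatk_sample2.vcf"),
    ("PLATYPUS", "platypus_sample1.vcf"),
    ("MANTA", "manta_sample1.vcf"),
]
print(combine_variant_arg(callsets))
```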
bayesTyperTools combine also needs a contig file, contigs.txt, listing every contig in the genome, one per line, in the form ##contig=<ID=8,length=146364022>.
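
The contig lines can be derived from the reference FASTA. The sketch below assumes plain (uncompressed) FASTA text and takes the contig ID to be the first whitespace-delimited word of each header; the function name `contig_header_lines` is illustrative.

```python
def contig_header_lines(fasta_lines):
    """Yield one '##contig=<ID=...,length=...>' line per FASTA record."""
    name, length = None, 0
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield f"##contig=<ID={name},length={length}>"
            # Contig ID = first word of the header (assumption).
            name, length = line[1:].split()[0], 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield f"##contig=<ID={name},length={length}>"

# Tiny in-memory example; for a real genome pass an open file handle instead.
demo = [">8 chromosome 8", "ACGTAC", "GT", ">9", "ACG"]
for header in contig_header_lines(demo):
    print(header)
```

For a real reference, something like `with open("REF.fa") as fh: lines = list(contig_header_lines(fh))` would produce the lines to write into contigs.txt.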

4.2 Genotyping based on variant candidates

4.2.1 Counting k-mers in the sequencing data

KMC3 is used here to count k-mers from the aligned BAM files, with the parameters -k55 -ci1 -fbam.
After counting, run bayesTyperTools makeBloom -k <kmc_output_prefix> -p <num_threads> to generate the input files that bayesTyper requires.
KMC produces two files, <sample_id>.kmc_pre and <sample_id>.kmc_suf; bayesTyperTools makeBloom produces <sample_id>.bloomMeta and <sample_id>.bloomData. All four must sit in the same directory and share the same prefix.
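
The same-directory, same-prefix requirement is easy to check before moving on. A small sketch; `missing_kmer_files` is a hypothetical helper and the prefix `kmc/sample1` is made up, while the four suffixes are the ones listed above.

```python
from pathlib import Path

KMC_SUFFIXES = (".kmc_pre", ".kmc_suf")
BLOOM_SUFFIXES = (".bloomMeta", ".bloomData")

def missing_kmer_files(prefix):
    """Return the expected KMC and bloom files that do not exist for this prefix."""
    prefix = str(prefix)
    expected = [prefix + suffix for suffix in KMC_SUFFIXES + BLOOM_SUFFIXES]
    return [path for path in expected if not Path(path).is_file()]

# Hypothetical prefix; use the same value given to KMC and to makeBloom -k.
for path in missing_kmer_files("kmc/sample1"):
    print("missing:", path)
```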

4.2.2 Identifying variant clusters

Command:

$ bayesTyper cluster -v <candiate_variants_prefix>.vcf.gz -s <samples>.tsv -g <ref_build>_canon.fa -d <ref_build>_decoy.fa -p <num_threads>

The results are split by cluster into many units, each stored separately.
The file <samples>.tsv contains <sample_id>, <sex> and <kmc_output_prefix>.
The cluster output is written to the bayestyper_cluster_data directory.
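
The samples file can be written with a few lines of code. This sketch assumes one tab-separated row per sample, no header line, and sex coded as F or M; those details are assumptions to verify against the bayesTyper documentation, and `samples_tsv` and the sample names are made up.

```python
import csv
import io

def samples_tsv(samples):
    """Format (sample_id, sex, kmc_output_prefix) tuples as tab-separated text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for sample_id, sex, kmc_prefix in samples:
        writer.writerow([sample_id, sex, kmc_prefix])
    return buf.getvalue()

# Hypothetical samples and KMC prefixes.
text = samples_tsv([
    ("sample1", "F", "kmc/sample1"),
    ("sample2", "M", "kmc/sample2"),
])
print(text, end="")
```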

4.2.3 Genotyping the clusters

$ bayesTyper genotype -v bayestyper_unit_<unit_id>/variant_clusters.bin -c bayestyper_cluster_data -s <samples>.tsv -g <ref_build>_canon.fa -d <ref_build>_decoy.fa -o bayestyper_unit_<unit_id>/bayestyper -z -p <num_threads>

4.2.4 Merging the results with bcftools

$ bcftools concat -O z -o <output_prefix>.vcf.gz bayestyper_unit_1/bayestyper.vcf.gz bayestyper_unit_2/bayestyper.vcf.gz ...
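
When there are many units, the list of per-unit VCFs can be generated instead of typed. A minimal sketch that only builds and prints the command; `concat_command`, the unit IDs and the prefix `cohort` are illustrative, and the path layout is the one used in the command above.

```python
import shlex

def concat_command(unit_ids, output_prefix):
    """Build the bcftools concat argument list for the given unit IDs."""
    inputs = [f"bayestyper_unit_{uid}/bayestyper.vcf.gz" for uid in unit_ids]
    return ["bcftools", "concat", "-O", "z", "-o", f"{output_prefix}.vcf.gz", *inputs]

# Hypothetical: three units, output prefix "cohort".
cmd = concat_command([1, 2, 3], "cohort")
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```

Keeping the command as a list rather than a single string avoids shell-quoting problems if it is later handed to `subprocess.run`.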

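This post records my own trial and error: searching Baidu turned up nothing about these two tools, which was a real headache. I hope it is useful as a reference.
Those are all the steps. I have not yet gotten the last two to run, and will share more once they do.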
