亚洲欧美另类在线清纯,国产成人女主播视频,青青草精品www

在之前學習snakemake的基礎上，這里重新又對之前編寫的pipline進行了優(yōu)化，只需要在config.yaml中輸入相應的samples，以及參考基因組路徑，即可以得到測序結果比對的最終count數據。
參考的官方文檔
 snakemake+ R/Python script

主要分為以下幾個階段：

1.比對生成sam文件：保存在bwa/文件夾下

2.sam轉為bam：保存在bam/文件夾下

3. 對bam文件進行sort：保存在sort/文件夾下

4. 對bam文件進行count：保存在count/文件夾下

其中還有兩個必需的文件夾：

samples文件夾：保存待分析的測序原始數據

reference文件夾：保存參考基因組

這個是分析的流程圖

流程圖

配置文件config.yaml

samples:
 control_R1: samples/control_R1.fq.gz
 control_R2: samples/control_R2.fq.gz
 treated_R1: samples/treated_R1.fq.gz
 treated_R2: samples/treated_R2.fq.gz
reference:
 bwaRef: reference/chr19
annotation:
 gtf: annotation/mouse.chr.gtf

待分析數據、參考基因組

tree samples/ reference/
samples/
├── control_R1.fq.gz
├── control_R2.fq.gz
├── nohup.out
├── treated_R1.fq.gz
├── treated_R2.fq.gz
├── wget-log
└── wget-log.1
reference/
├── chr19.1.bt2
├── chr19.2.bt2
├── chr19.3.bt2
├── chr19.4.bt2
├── chr19.fa
├── chr19.fa_bowtie2-build.log
├── chr19.fa.fai
├── chr19.rev.1.bt2
├── chr19.rev.2.bt2
├── GRCm38.83.chr19.gtf
└── nohup.out

1.fastqc 質控

rule quality_check:
 input:
  lambda wildcards: config["samples"][wildcards.sample]
 output:
  "qc/"
 shell:
  "fastqc -o {output} {input}"

在這次分析中沒成功，單獨分析也沒成功，可能是我服務器的原因，所以這一步雖然在流程中，但是分析時被跳過了

2. bowtie2對比

rule bwa:
 input:
  lambda wildcards: config["samples"][wildcards.sample]
 output:
  "bwa/{sample}.sam"
 params:
  INDEX= config["reference"]["bwaRef"] #參考基因組路徑
 shell:
  "bowtie2 -x {params.INDEX} -q {input} -S {output}"

3. samtools轉化sam文件

 input:
  "bwa/{sample}.sam"
 output:
  "bam/{sample}.bam"
 shell:
  "samtools view -Sb {input} > {output}"

4. sort bam文件

rule sort:
 input:
  "bam/{sample}.bam"
 output:
  "sort/{sample}_sort.bam"
 shell:
  "samtools sort {input} -o {output}"

5. 建立bam文件的index

rule index:
 input:
  "sort/{sample}_sort.bam"
 output:
  "sort/{sample}.bam.bi"
 shell:
  "samtools index {input} {output}"

6. htseq-count對bam文件進行count

rule count:
 input:
  sample="sort/{sample}_sort.bam",
  annotation=config["annotation"]["gtf"]
 output:
  "count/{sample}.count"
 shell:
  "htseq-count -f bam -r name -s no {input.sample}  {input.annotation} > {output}"

7.采用python對各樣本的count結果進行合并，然而這個卻是在整個snakemake 文件的最開始

rule all:
 input:
  expand("count/{sample}.count",sample=config["samples"]) #進行遍歷，告訴它所有sample的count文件
 output:
  "count/merge.count" #輸出一個merge之后的文件
 run:
  import os
  import pandas as pd
  dm=pd.DataFrame() #建立一個空dataframe
  for filename in input: #讀取各個文件的全路徑
        df=pd.read_csv(filename,sep="\t",names=["gene",os.path.basename(filename)])
        if(any(dm)):#進行合并
                dm=pd.merge(dm,df,on="gene")
        else:
                dm=df
  dm.to_csv(output[0],sep='\t')
#將合并之后文件輸出到output路徑，要使用output[0]，或者output.index

8.最終整個snakefile文件如下

cat snakefile1
configfile: "config.yaml"

rule all:
 input:
  expand("count/{sample}.count",sample=config["samples"])
 output:
  "count/merge.count"
 run:
  import os
  import pandas as pd
  dm=pd.DataFrame()
  for filename in input:
        df=pd.read_csv(filename,sep="\t",names=["gene",os.path.basename(filename)])
        if(any(dm)):
                dm=pd.merge(dm,df,on="gene")
        else:
                dm=df
  dm.to_csv(output[0],sep='\t')
#rule all:
# input:
#  "count/merge.count"
#  expand("count/{sample}.count",sample=config["samples"])

rule quality_check:
 input:
  lambda wildcards: config["samples"][wildcards.sample]
 output:
  "qc/"
 shell:
  "fastqc -o {output} {input}"

rule bwa:
 input:
  lambda wildcards: config["samples"][wildcards.sample]
 output:
  "bwa/{sample}.sam"
 params:
  INDEX= config["reference"]["bwaRef"]
 shell:
  "bowtie2 -x {params.INDEX} -q {input} -S {output}"

rule bam:
 input:
  "bwa/{sample}.sam"
 output:
  "bam/{sample}.bam"
 shell:
  "samtools view -Sb {input} > {output}"

rule sort:
 input:
  "bam/{sample}.bam"
 output:
  "sort/{sample}_sort.bam"
 shell:
  "samtools sort {input} -o {output}"

rule index:
 input:
  "sort/{sample}_sort.bam"
 output:
  "sort/{sample}.bam.bi"
 shell:
  "samtools index {input} {output}"

rule count:
 input:
  sample="sort/{sample}_sort.bam",
  annotation=config["annotation"]["gtf"]
 output:
  "count/{sample}.count"
 shell:
  "htseq-count -f bam -r name -s no {input.sample}  {input.annotation} > {output}"

生成的結果是這個樣子：

bam/:
total 181M
-rw-r--r-- 1 root root 49M Jul  2 11:20 control_R1.bam
-rw-r--r-- 1 root root 49M Jul  2 11:21 control_R2.bam
-rw-r--r-- 1 root root 42M Jul  2 11:21 treated_R1.bam
-rw-r--r-- 1 root root 43M Jul  2 11:21 treated_R2.bam

bwa/:
total 640M
-rw-r--r-- 1 root root 173M Jul  2 11:16 control_R1.sam
-rw-r--r-- 1 root root 172M Jul  2 11:15 control_R2.sam
-rw-r--r-- 1 root root 148M Jul  2 11:18 treated_R1.sam
-rw-r--r-- 1 root root 148M Jul  2 11:17 treated_R2.sam

count/:
total 5.3M
-rw-r--r-- 1 root root 965K Jul  2 11:30 control_R1.count
-rw-r--r-- 1 root root 965K Jul  2 11:28 control_R2.count
-rw-r--r-- 1 root root 1.5M Jul  3 20:27 merge.count
-rw-r--r-- 1 root root 965K Jul  2 11:36 treated_R1.count
-rw-r--r-- 1 root root 965K Jul  2 11:33 treated_R2.count

sort/:
total 131M
-rw-r--r-- 1 root root 35M Jul  2 11:25 control_R1_sort.bam
-rw-r--r-- 1 root root 35M Jul  2 11:25 control_R2_sort.bam
-rw-r--r-- 1 root root 31M Jul  2 11:25 treated_R1_sort.bam
-rw-r--r-- 1 root root 31M Jul  2 11:25 treated_R2_sort.bam

嗯，還算比較滿意，接下來會繼續(xù)完善這個流程。

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av

RNAseq分析流程：從測序Rawdata到Count輸出

RNAseq分析流程：從測序Rawdata到Count輸出

1.比對生成sam文件：保存在bwa/文件夾下

2.sam轉為bam：保存在bam/文件夾下

3. 對bam文件進行sort：保存在sort/文件夾下

4. 對bam文件進行count：保存在count/文件夾下

samples文件夾：保存待分析的測序原始數據

reference文件夾：保存參考基因組

配置文件config.yaml

待分析數據、參考基因組

1.fastqc 質控

2. bowtie2對比

3. samtools轉化sam文件

4. sort bam文件

5. 建立bam文件的index

6. htseq-count對bam文件進行count

7.采用python對各樣本的count結果進行合并，然而這個卻是在整個snakemake 文件的最開始

8.最終整個snakefile文件如下

生成的結果是這個樣子：

相關閱讀更多精彩內容

友情鏈接更多精彩內容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九 欧美,1769亚洲,黄色成人av

RNAseq分析流程：從測序Rawdata到Count輸出

1.比對生成sam文件：保存在bwa/文件夾下

2.sam轉為bam： 保存在bam/文件夾下

3. 對bam文件進行sort： 保存在sort/文件夾下

4. 對bam文件進行count： 保存在count/文件夾下

samples文件夾：保存待分析的測序原始數據

reference文件夾：保存參考基因組

配置文件config.yaml

待分析數據、參考基因組

1.fastqc 質控

2. bowtie2對比

3. samtools轉化sam文件

4. sort bam文件

5. 建立bam文件的index

6. htseq-count對bam文件進行count

7.采用python對各樣本的count結果進行合并，然而這個卻是在整個snakemake 文件的最開始

8.最終整個snakefile文件如下

生成的結果是這個樣子：

相關閱讀更多精彩內容

友情鏈接更多精彩內容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av

2.sam轉為bam：保存在bam/文件夾下

3. 對bam文件進行sort：保存在sort/文件夾下

4. 對bam文件進行count：保存在count/文件夾下

待分析數據、參考基因組

7.采用python對各樣本的count結果進行合并，然而這個卻是在整個snakemake 文件的最開始