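Author: 202031107010173 何思成
Task: record, in markdown or formatted text, the process of installing and running MultiQC. Find two or more fastq files and use MultiQC to assess their data quality together.
Requirements:
1) Record the procedure step by step;
2) Each section includes the source code and a screenshot of the result;
3) Upload the notes to Jianshu or Tencent Docs and paste the link into the answer field. Alternatively, write them in Word [mind the formatting] and upload the Word file as an attachment in the answer field.
Install conda and create a Python 2 environment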
(base) 202031107010173@xiaoming-HP:~$ cd ~/Biosofts
(base) 202031107010173@xiaoming-HP:~/Biosofts$ conda create --name python2 python=2.7 -c https://mirrors.ustc.edu.cn/anaconda/cloud/bioconda/ -y
Activate the python2 environment
(base) 202031107010173@xiaoming-HP:~/Biosofts$ conda activate python2
(python2) 202031107010173@xiaoming-HP:~/Biosofts$
Install MultiQC
(python2) 202031107010173@xiaoming-HP:~/Biosofts$ conda install multiqc -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
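The run log further down shows MultiQC v1.0.dev0 on Python 2.7 together with YAML deprecation warnings. As a hedged alternative, a fresh environment could pull MultiQC from the conda-forge and bioconda channels. This is only a sketch: the function name make_multiqc_env, the CONDA_BIN override and the default environment name are assumptions for illustration, not part of the steps recorded here.

```shell
# Sketch: create a conda environment that contains MultiQC.
# make_multiqc_env and CONDA_BIN are hypothetical names used for illustration.
# Point CONDA_BIN at "echo" to print the command instead of running it.
make_multiqc_env() {
    env_name="${1:-multiqc}"
    # conda-forge is listed before bioconda, the order bioconda recommends.
    "${CONDA_BIN:-conda}" create --yes --name "$env_name" -c conda-forge -c bioconda multiqc
}
```

A dry run such as CONDA_BIN=echo make_multiqc_env demo prints the conda command without touching any environment.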
Run MultiQC
(python2) 202031107010173@xiaoming-HP:~/Biosofts$ multiqc .
/disk1/202031107010173/anaconda3/envs/python2/lib/python2.7/site-packages/multiqc-1.0.dev0-py2.7.egg/multiqc/utils/config.py:44: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
configs = yaml.load(f)
/disk1/202031107010173/anaconda3/envs/python2/lib/python2.7/site-packages/multiqc-1.0.dev0-py2.7.egg/multiqc/utils/config.py:50: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
sp = yaml.load(f)
[INFO ] multiqc : This is MultiQC v1.0.dev0
[INFO ] multiqc : Template : default
[INFO ] multiqc : Searching '.'
[WARNING] multiqc : No analysis results found. Cleaning up..
[INFO ] multiqc : MultiQC complete
The run did not produce a report, so list the files in the directory
(python2) 202031107010173@xiaoming-HP:~/Biosofts$ ll
total 632892
drwxrwxr-x 6 202031107010173 202031107010173 4096 10月 8 13:44 ./
drwxr-xr-x 18 202031107010173 202031107010173 4096 10月 8 13:18 ../
-rw-rw-r-- 1 202031107010173 202031107010173 570853747 5月 14 2021 Anaconda3-2021.05-Linux-x86_64.sh
drwxrwxr-x 2 202031107010173 202031107010173 4096 9月 24 15:27 fastqc/
drwxrwxr-x 8 202031107010173 202031107010173 4096 1月 10 2018 FastQC/
-rw-rw-r-- 1 202031107010173 202031107010173 10254666 1月 16 2020 fastqc_v0.11.7.zip
-rw-rw-r-- 1 202031107010173 202031107010173 2530112 9月 24 11:10 GCF_946151055.1_Q3570_genomic.gff
-rw-rw-r-- 1 202031107010173 202031107010173 1157510 9月 24 04:27 GCF_946183555.1_B129_S48_genomic.gff
-rwxrwxr-x 1 202031107010173 202031107010173 63248534 6月 12 2021 ibm-aspera-connect_4.0.2.38_linux.sh*
drwxrwxr-x 4 202031107010173 202031107010173 4096 10月 4 10:41 SPAdes-3.12.0-Linux/
drwxrwxr-x 5 202031107010173 202031107010173 4096 8月 17 2021 sratoolkit.2.11.1-ubuntu64/
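Because the current directory contains no FastQC result files, MultiQC cannot find any analysis results.
So we first need to download some sequence files from NCBI. (The two files fetched below are genome assemblies in FASTA format, .fna.gz, not fastq reads.)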
(python2) 202031107010173@xiaoming-HP:~/Biosofts$ wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/946/151/055/GCF_946151055.1_Q3570/GCF_946151055.1_Q3570_genomic.fna.gz
--2022-10-08 13:59:07-- https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/946/151/055/GCF_946151055.1_Q3570/GCF_946151055.1_Q3570_genomic.fna.gz
Resolving ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)... 165.112.9.230, 165.112.9.229, 2607:f220:41f:250::229, ...
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|165.112.9.230|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1369696 (1.3M) [application/x-gzip]
Saving to: ‘GCF_946151055.1_Q3570_genomic.fna.gz’
GCF_946151055.1_Q3570_genomic.fna.gz 100%[==============================================================================>] 1.31M 270KB/s in 5.0s
2022-10-08 13:59:14 (270 KB/s) - ‘GCF_946151055.1_Q3570_genomic.fna.gz’ saved [1369696/1369696]
(python2) 202031107010173@xiaoming-HP:~/Biosofts$ wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/946/151/065/GCF_946151065.1_Q6965/GCF_946151065.1_Q6965_genomic.fna.gz
--2022-10-08 14:02:23-- https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/946/151/065/GCF_946151065.1_Q6965/GCF_946151065.1_Q6965_genomic.fna.gz
Resolving ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)... 130.14.250.10, 165.112.9.230, 2607:f220:41f:250::230, ...
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|130.14.250.10|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 794477 (776K) [application/x-gzip]
Saving to: ‘GCF_946151065.1_Q6965_genomic.fna.gz’
GCF_946151065.1_Q6965_genomic.fna.gz 100%[==============================================================================>] 775.86K 314KB/s in 2.5s
2022-10-08 14:02:26 (314 KB/s) - ‘GCF_946151065.1_Q6965_genomic.fna.gz’ saved [794477/794477]
After downloading, decompress the two archives
(python2) 202031107010173@xiaoming-HP:~/Biosofts$ gunzip GCF_946151055.1_Q3570_genomic.fna.gz
(python2) 202031107010173@xiaoming-HP:~/Biosofts$ gunzip GCF_946151065.1_Q6965_genomic.fna.gz
(python2) 202031107010173@xiaoming-HP:~/Biosofts$ ll
total 636652
drwxrwxr-x 6 202031107010173 202031107010173 4096 10月 8 14:07 ./
drwxr-xr-x 18 202031107010173 202031107010173 4096 10月 8 13:18 ../
-rw-rw-r-- 1 202031107010173 202031107010173 570853747 5月 14 2021 Anaconda3-2021.05-Linux-x86_64.sh
drwxrwxr-x 2 202031107010173 202031107010173 4096 9月 24 15:27 fastqc/
drwxrwxr-x 8 202031107010173 202031107010173 4096 1月 10 2018 FastQC/
-rw-rw-r-- 1 202031107010173 202031107010173 10254666 1月 16 2020 fastqc_v0.11.7.zip
-rw-rw-r-- 1 202031107010173 202031107010173 4650900 9月 24 11:10 GCF_946151055.1_Q3570_genomic.fna
-rw-rw-r-- 1 202031107010173 202031107010173 2885939 9月 24 11:10 GCF_946151065.1_Q6965_genomic.fna
-rwxrwxr-x 1 202031107010173 202031107010173 63248534 6月 12 2021 ibm-aspera-connect_4.0.2.38_linux.sh*
drwxrwxr-x 4 202031107010173 202031107010173 4096 10月 4 10:41 SPAdes-3.12.0-Linux/
drwxrwxr-x 5 202031107010173 202031107010173 4096 8月 17 2021 sratoolkit.2.11.1-ubuntu64/
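Since the task asks for fastq files, it can help to check what format a decompressed file really is. The sketch below guesses the format from the first character of the file; guess_seq_format is a hypothetical helper name, and it assumes an uncompressed text file.

```shell
# Sketch: guess whether an uncompressed sequence file is FASTA or FASTQ
# from its first character. guess_seq_format is a hypothetical helper name.
guess_seq_format() {
    first_char=$(head -c 1 "$1")
    case "$first_char" in
        ">") echo "fasta" ;;    # FASTA records start with a '>' header line
        "@") echo "fastq" ;;    # FASTQ records start with an '@' header line
        *)   echo "unknown" ;;
    esac
}
```

For example, guess_seq_format GCF_946151055.1_Q3570_genomic.fna should report fasta for a standard NCBI .fna file, which would mean the file is not the kind of input FastQC is normally run on.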
Running multiqc . again still finds no analysis data
(python2) 202031107010173@xiaoming-HP:~/Biosofts$ multiqc .
/disk1/202031107010173/anaconda3/envs/python2/lib/python2.7/site-packages/multiqc-1.0.dev0-py2.7.egg/multiqc/utils/config.py:44: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
configs = yaml.load(f)
/disk1/202031107010173/anaconda3/envs/python2/lib/python2.7/site-packages/multiqc-1.0.dev0-py2.7.egg/multiqc/utils/config.py:50: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
sp = yaml.load(f)
[INFO ] multiqc : This is MultiQC v1.0.dev0
[INFO ] multiqc : Template : default
[INFO ] multiqc : Searching '.'
[WARNING] multiqc : No analysis results found. Cleaning up..
[INFO ] multiqc : MultiQC complete
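Before rerunning, one quick check is whether the directory holds anything that looks like FastQC output, since MultiQC only aggregates results that other tools have already written. A minimal sketch, assuming the usual FastQC file names; count_fastqc_reports is a hypothetical helper name.

```shell
# Sketch: count files that look like FastQC output under a directory.
# count_fastqc_reports is a hypothetical helper name.
count_fastqc_reports() {
    dir="${1:-.}"
    # FastQC writes <sample>_fastqc.zip (and fastqc_data.txt once unpacked).
    find "$dir" -type f \( -name '*_fastqc.zip' -o -name 'fastqc_data.txt' \) | wc -l | tr -d ' '
}
```

A count of 0 for ~/Biosofts would be consistent with the "No analysis results found" warning above.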
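To find the cause, first check the MultiQC installation. Something is wrong: the command is not found. Note that these checks were run from the (base) environment, whereas MultiQC had been installed into python2.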
(base) 202031107010173@xiaoming-HP:~/Biosofts$ multiqc --version
multiqc: command not found
(base) 202031107010173@xiaoming-HP:~/Biosofts$ multiqc -h
multiqc: command not found
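A "command not found" message can simply mean the tool lives in a different conda environment than the active one. The sketch below reports whether a command is on PATH and which environment is active; check_tool is a hypothetical helper name, and CONDA_DEFAULT_ENV is the variable that conda activate sets.

```shell
# Sketch: report whether a command is visible in the currently active conda environment.
# check_tool is a hypothetical helper name; CONDA_DEFAULT_ENV is set by "conda activate".
check_tool() {
    if tool_path=$(command -v "$1" 2>/dev/null); then
        echo "found $1 at $tool_path (env: ${CONDA_DEFAULT_ENV:-none})"
    else
        echo "$1 not found on PATH (env: ${CONDA_DEFAULT_ENV:-none})"
        return 1
    fi
}
```

Running check_tool multiqc once in (base) and once after conda activate python2 would show whether the installation itself is missing or only the environment is wrong.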
So try reinstalling
conda install multiqc -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
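Coming back to the task a few days later, I found ready-made fastq files under /disk1/shares/Seqs/, and the files I had downloaded earlier were not suitable. This time, copy the fastq files into ~/Biosofts/ and test MultiQC again.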
(base) 202031107010173@xiaoming-HP:~/Biosofts$ conda activate python2
(python2) 202031107010173@xiaoming-HP:~/Biosofts$ multiqc .
/disk1/202031107010173/anaconda3/envs/python2/lib/python2.7/site-packages/multiqc-1.0.dev0-py2.7.egg/multiqc/utils/config.py:44: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
configs = yaml.load(f)
/disk1/202031107010173/anaconda3/envs/python2/lib/python2.7/site-packages/multiqc-1.0.dev0-py2.7.egg/multiqc/utils/config.py:50: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
sp = yaml.load(f)
[INFO ] multiqc : This is MultiQC v1.0.dev0
[INFO ] multiqc : Template : default
[INFO ] multiqc : Searching '.'
[ERROR ] multiqc : Oops! The 'prokka' MultiQC module broke...
Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues
(if possible, include a log file that triggers the error)
============================================================
Module prokka raised an exception: Traceback (most recent call last):
File "/disk1/202031107010173/anaconda3/envs/python2/lib/python2.7/site-packages/multiqc-1.0.dev0-py2.7.egg/EGG-INFO/scripts/multiqc", line 346, in multiqc
output = mod()
File "/disk1/202031107010173/anaconda3/envs/python2/lib/python2.7/site-packages/multiqc-1.0.dev0-py2.7.egg/multiqc/modules/prokka/prokka.py", line 31, in __init__
self.parse_prokka(f)
File "/disk1/202031107010173/anaconda3/envs/python2/lib/python2.7/site-packages/multiqc-1.0.dev0-py2.7.egg/multiqc/modules/prokka/prokka.py", line 87, in parse_prokka
first_line = f['f'].readline()
File "/disk1/202031107010173/anaconda3/envs/python2/lib/python2.7/codecs.py", line 314, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xd4 in position 3116: invalid continuation byte
============================================================
[WARNING] multiqc : No analysis results found. Cleaning up..
[INFO ] multiqc : MultiQC complete
Failed again o(╥﹏╥)o
Let's look for the cause again
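The traceback above ends in a UnicodeDecodeError raised while the prokka module was reading a file as UTF-8. One way to narrow this down is to list files that are not valid UTF-8. This is a sketch only: list_non_utf8 is a hypothetical helper, and the default '*.txt' pattern is an assumption, since the log does not say which file triggered the error.

```shell
# Sketch: list files under a directory that are not valid UTF-8.
# list_non_utf8 is a hypothetical helper; the '*.txt' default pattern is an assumption.
list_non_utf8() {
    dir="${1:-.}"
    pattern="${2:-*.txt}"
    find "$dir" -type f -name "$pattern" | while IFS= read -r f; do
        # iconv exits non-zero when the input contains an invalid UTF-8 sequence.
        if ! iconv -f UTF-8 -t UTF-8 "$f" >/dev/null 2>&1; then
            echo "$f"
        fi
    done
}
```

Any file it prints could be moved out of the search directory before rerunning MultiQC, to see whether the prokka error goes away.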
First, run FastQC on the files for quality control
(base) 202031107010173@xiaoming-HP:~/Biosofts$ fastqc Akle_TTAGGC_L004_R1_001.fastq
Started analysis of Akle_TTAGGC_L004_R1_001.fastq
Approx 5% complete for Akle_TTAGGC_L004_R1_001.fastq
Approx 10% complete for Akle_TTAGGC_L004_R1_001.fastq
Approx 15% complete for Akle_TTAGGC_L004_R1_001.fastq
Approx 20% complete for Akle_TTAGGC_L004_R1_001.fastq
Approx 25% complete for Akle_TTAGGC_L004_R1_001.fastq
Approx 30% complete for Akle_TTAGGC_L004_R1_001.fastq
Approx 35% complete for Akle_TTAGGC_L004_R1_001.fastq
Approx 40% complete for Akle_TTAGGC_L004_R1_001.fastq
Approx 45% complete for Akle_TTAGGC_L004_R1_001.fastq
Approx 50% complete for Akle_TTAGGC_L004_R1_001.fastq
Approx 55% complete for Akle_TTAGGC_L004_R1_001.fastq
Approx 60% complete for Akle_TTAGGC_L004_R1_001.fastq
Approx 65% complete for Akle_TTAGGC_L004_R1_001.fastq
Approx 70% complete for Akle_TTAGGC_L004_R1_001.fastq
Approx 75% complete for Akle_TTAGGC_L004_R1_001.fastq
Approx 80% complete for Akle_TTAGGC_L004_R1_001.fastq
Approx 85% complete for Akle_TTAGGC_L004_R1_001.fastq
Approx 90% complete for Akle_TTAGGC_L004_R1_001.fastq
Approx 95% complete for Akle_TTAGGC_L004_R1_001.fastq
Approx 100% complete for Akle_TTAGGC_L004_R1_001.fastq
Analysis complete for Akle_TTAGGC_L004_R1_001.fastq
(base) 202031107010173@xiaoming-HP:~/Biosofts$ fastqc Akle_TTAGGC_L004_R2_001.fastq
Started analysis of Akle_TTAGGC_L004_R2_001.fastq
Approx 5% complete for Akle_TTAGGC_L004_R2_001.fastq
Approx 10% complete for Akle_TTAGGC_L004_R2_001.fastq
Approx 15% complete for Akle_TTAGGC_L004_R2_001.fastq
Approx 20% complete for Akle_TTAGGC_L004_R2_001.fastq
Approx 25% complete for Akle_TTAGGC_L004_R2_001.fastq
Approx 30% complete for Akle_TTAGGC_L004_R2_001.fastq
Approx 35% complete for Akle_TTAGGC_L004_R2_001.fastq
Approx 40% complete for Akle_TTAGGC_L004_R2_001.fastq
Approx 45% complete for Akle_TTAGGC_L004_R2_001.fastq
Approx 50% complete for Akle_TTAGGC_L004_R2_001.fastq
Approx 55% complete for Akle_TTAGGC_L004_R2_001.fastq
Approx 60% complete for Akle_TTAGGC_L004_R2_001.fastq
Approx 65% complete for Akle_TTAGGC_L004_R2_001.fastq
Approx 70% complete for Akle_TTAGGC_L004_R2_001.fastq
Approx 75% complete for Akle_TTAGGC_L004_R2_001.fastq
Approx 80% complete for Akle_TTAGGC_L004_R2_001.fastq
Approx 85% complete for Akle_TTAGGC_L004_R2_001.fastq
Approx 90% complete for Akle_TTAGGC_L004_R2_001.fastq
Approx 95% complete for Akle_TTAGGC_L004_R2_001.fastq
Approx 100% complete for Akle_TTAGGC_L004_R2_001.fastq
Analysis complete for Akle_TTAGGC_L004_R2_001.fastq
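The two FastQC calls above can also be written as a loop over every .fastq file in a directory. A minimal sketch: fastqc_all and the FASTQC_BIN override are hypothetical names introduced here so the loop can be dry-run with echo.

```shell
# Sketch: run FastQC on every .fastq file in a directory.
# fastqc_all and FASTQC_BIN are hypothetical names; set FASTQC_BIN=echo for a dry run.
fastqc_all() {
    dir="${1:-.}"
    for f in "$dir"/*.fastq; do
        [ -e "$f" ] || continue    # skip the literal pattern when nothing matches
        "${FASTQC_BIN:-fastqc}" "$f"
    done
}
```

With FASTQC_BIN=echo fastqc_all . the function only prints the files it would process, which is a cheap way to confirm the glob matches the intended reads.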
(base) 202031107010173@xiaoming-HP:~/Biosofts$ conda activate python2
(python2) 202031107010173@xiaoming-HP:~/Biosofts$ multiqc .
/disk1/202031107010173/anaconda3/envs/python2/lib/python2.7/site-packages/multiqc-1.0.dev0-py2.7.egg/multiqc/utils/config.py:44: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
configs = yaml.load(f)
/disk1/202031107010173/anaconda3/envs/python2/lib/python2.7/site-packages/multiqc-1.0.dev0-py2.7.egg/multiqc/utils/config.py:50: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
sp = yaml.load(f)
[INFO ] multiqc : This is MultiQC v1.0.dev0
[INFO ] multiqc : Template : default
[INFO ] multiqc : Searching '.'
[ERROR ] multiqc : Oops! The 'prokka' MultiQC module broke...
Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues
(if possible, include a log file that triggers the error)
============================================================
Module prokka raised an exception: Traceback (most recent call last):
File "/disk1/202031107010173/anaconda3/envs/python2/lib/python2.7/site-packages/multiqc-1.0.dev0-py2.7.egg/EGG-INFO/scripts/multiqc", line 346, in multiqc
output = mod()
File "/disk1/202031107010173/anaconda3/envs/python2/lib/python2.7/site-packages/multiqc-1.0.dev0-py2.7.egg/multiqc/modules/prokka/prokka.py", line 31, in __init__
self.parse_prokka(f)
File "/disk1/202031107010173/anaconda3/envs/python2/lib/python2.7/site-packages/multiqc-1.0.dev0-py2.7.egg/multiqc/modules/prokka/prokka.py", line 87, in parse_prokka
first_line = f['f'].readline()
File "/disk1/202031107010173/anaconda3/envs/python2/lib/python2.7/codecs.py", line 314, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xd4 in position 3116: invalid continuation byte
============================================================
[INFO ] fastqc : Found 2 reports
[INFO ] multiqc : Report : multiqc_report.html
[INFO ] multiqc : Data : multiqc_data
[INFO ] multiqc : MultiQC complete
Checking the Biosofts folder again, two new entries have appeared: multiqc_data/ and multiqc_report.html
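Because the log still shows the prokka error even on the successful run, it may be worth confirming that both outputs really exist before copying anything. A small sketch; report_ready is a hypothetical helper name, and the two file names are the ones MultiQC printed in the log.

```shell
# Sketch: confirm that a MultiQC run left its two outputs behind.
# report_ready is a hypothetical helper name.
report_ready() {
    dir="${1:-.}"
    [ -f "$dir/multiqc_report.html" ] && [ -d "$dir/multiqc_data" ]
}
```

For example, report_ready ~/Biosofts && echo "report is ready" prints the message only when both the HTML report and the data folder are present.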
Then download multiqc_report.html to the desktop and open it in a browser to view the combined results
Link to the report:
[MultiQC Report](file:///C:/Users/hesicheng/Desktop/multiqc_report.html)
Screenshot:

image.png
Done