Sequence Read Archive (SRA)

2021.8.13 HuPY

1. Overview

Prepare the following information for your submission and be ready to:

  1. Provide a project name and description

  2. Choose the type or ‘package’ of your samples

  3. Provide sample metadata that is unique by sample

  4. Provide sequence metadata

  5. Upload your files

2. Project

For a new project, prepare the information that creates a BioProject.

Required information:

  1. Title

  2. Description

Optional information:

  1. Participants

  2. Grants: Required if your project was funded by a National Institutes of Health (NIH) grant

Important: The required BioProject and BioSample can be created during submission. You can also link an existing BioProject and BioSample with accession numbers.

3. Sample

For new samples, prepare the details that will serve as BioSamples' metadata for individual biological specimens (collection date, location, etc.).

  1. Select the ‘package’ that best fits your biological samples. Each package has a distinct set of required attributes, which you can preview on the NCBI BioSample packages page.

    ‘Metagenome’ is one example of a package.

  2. Each sample must have a unique set of attributes. Provide all required fields and any optional fields that apply to your samples.

  3. Add custom attributes to fully describe your samples and facilitate searching. You should submit at least one unique data file for each sample you create.
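The uniqueness rule in step 2 can be checked locally before submitting. The sketch below assumes a hypothetical tab-separated sample sheet with a header row, the sample name in column 1, and the attributes in the remaining columns. The function name `find_duplicate_samples` and the sheet layout are illustrative and not part of any NCBI tool.

```bash
#!/usr/bin/env bash
# Sketch: report samples whose attribute columns match an earlier sample.
# Assumed (hypothetical) layout: TSV, header row, sample name in column 1.
find_duplicate_samples() {
    awk -F'\t' '
        NR == 1 { next }                          # skip the header row
        {
            # Build a key from every column except the sample name.
            key = ""
            for (i = 2; i <= NF; i++) key = key FS $i
            if (key in first_seen)
                print $1 " duplicates " first_seen[key]
            else
                first_seen[key] = $1
        }
    ' "$1"
}
```

Running `find_duplicate_samples samples.tsv` (where `samples.tsv` is your own sheet) prints one line per sample that cannot be told apart from an earlier one, and prints nothing when every sample is distinct.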

4. Library

Prepare the following 'Library' information:

  1. Which BioSample should be linked to which file(s)

  2. Your library construction protocol

  3. Other metadata like unique library names, sequencing platform, and filetype
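The sample-to-file links and the unique library names can be sanity-checked before upload. This is a minimal sketch that assumes a hypothetical three-column tab-separated sheet (`library_ID`, `sample_name`, `filename`) with a header row. The real SRA metadata template has more columns, so treat the layout and the function name `check_library_sheet` as placeholders.

```bash
#!/usr/bin/env bash
# Sketch: check a (hypothetical) library sheet for duplicate library IDs
# and for listed files that are missing from the data directory.
check_library_sheet() {
    local sheet="$1" data_dir="$2" status=0
    local tab dups header library_id sample_name filename
    tab=$(printf '\t')

    # Library IDs that occur more than once (header row skipped).
    dups=$(tail -n +2 "$sheet" | cut -f1 | sort | uniq -d)
    if [ -n "$dups" ]; then
        printf '%s\n' "$dups" | sed 's/^/duplicate library_ID: /'
        status=1
    fi

    # Files named in the sheet that do not exist in data_dir.
    {
        read -r header
        while IFS="$tab" read -r library_id sample_name filename || [ -n "$library_id" ]; do
            if [ ! -f "$data_dir/$filename" ]; then
                printf 'missing file for %s: %s\n' "$sample_name" "$filename"
                status=1
            fi
        done
    } < "$sheet"

    return "$status"
}
```

A call such as `check_library_sheet libraries.tsv ./fastq` returns 0 when every library ID is unique and every listed file is present, and returns 1 after printing the problems otherwise.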

5. Summary

  1. A BioSample is a descriptive record of an individual biological specimen.

    One BioProject links to its BioSamples and SRA records:

    --- N BioSamples

        --- BioSamples: SRSxxx

    --- N SRA records

        --- Experiments: SRXxxx

        --- Runs: SRRxxx

  2. Batch-download from SRA

    1. Find the BioProject ID in the paper's Data Availability section, then open the SRA Run Selector website and enter the BioProject ID in the "Accession" search box to list every SRR run associated with the study.

    2. In the "Found xx items" panel, tick all runs. Then, in the "Select" panel above, use the "Selected" row to download Metadata (saved as SraRunTable) or Accession List (saved as SRR_Acc_List). Transfer both files to the cluster with FileZilla (path: ~/paper_script/pnasHeterosis).

    3. Then batch-download the data on the cluster:

    <pre class="md-fences md-end-block ty-contain-cm modeLoaded" spellcheck="false" lang="bash" cid="n306" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); font-size: 0.9em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(248, 248, 248); position: relative !important; border: 1px solid rgb(231, 234, 237); border-radius: 3px; padding: 8px 4px 6px; margin-bottom: 15px; margin-top: 15px; width: inherit; color: rgb(51, 51, 51); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;"># Load the environment variables
    modules

    # Load sratoolkit
    module load sratoolkit/2.9.6

    # Check that sratoolkit works
    prefetch --help

    # Go to the working directory "~/paper_script/pnasHeterosis" and batch-download with sratoolkit
    cd ~/paper_script/pnasHeterosis

    # Alias that prints a bsub command template:
    # alias BSUB="echo 'bsub -J blast -n 2 -R span[hosts=1] -o %J.out -e %J.err -q normal 'command''"

    BSUB

    # Downloaded data is saved to "~/ncbi/public/sra" by default

    </pre>

    Command line

    <pre mdtype="fences" cid="n319" lang="bash" spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); font-size: 0.9em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-position: inherit; background-size: inherit; background-repeat: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(248, 248, 248); position: relative !important; border: 1px solid rgb(231, 234, 237); border-radius: 3px; padding: 8px 4px 6px; margin-bottom: 15px; margin-top: 15px; width: inherit; color: rgb(51, 51, 51); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">#!/bin/sh
    # Download every run in the accession list; -K makes bsub wait until the job finishes.
    bsub -K -J prefetch -n 2 -R 'span[hosts=1]' -o %J.prefetch.out -e %J.prefetch.err -q normal 'prefetch --option-file ~/paper_script/pnasHeterosis/SRR_Acc_List.txt'
    # Convert the downloaded .sra files to FASTQ, writing each read of a spot to its own file.
    bsub -K -J fastqdump -n 2 -R 'span[hosts=1]' -o %J.fastqdump.out -e %J.fastqdump.err -q normal 'fastq-dump -I --split-files ~/ncbi/public/sra/*.sra'</pre>
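Two small checks can bracket the download step above: cleaning the accession list before `prefetch`, and listing runs that did not arrive afterwards. The sketch below makes two assumptions. First, a list that passed through a Windows desktop on its way to the cluster may carry carriage returns. Second, each run is stored as `<accession>.sra` directly in the download directory, as the default path noted above suggests for sratoolkit 2.9.6 (other releases may lay files out differently). The function names `clean_acc_list` and `missing_runs` are hypothetical helpers, not sratoolkit commands.

```bash
#!/usr/bin/env bash
# Sketch: keep only well-formed run accessions (SRR/ERR/DRR + digits).
# Carriage returns and blank lines are dropped; anything else is reported
# on stderr so it can be inspected.
clean_acc_list() {
    tr -d '\r' < "$1" | awk '
        /^[SED]RR[0-9]+$/ { print; next }                    # valid run accession
        NF                { print "skipped: " $0 > "/dev/stderr" }
    '
}

# Sketch: print every accession in the list that has no <accession>.sra
# file in the given directory (assumed flat layout).
missing_runs() {
    local acc_list="$1" sra_dir="$2" acc
    while read -r acc || [ -n "$acc" ]; do
        [ -z "$acc" ] && continue                            # ignore blank lines
        [ -f "$sra_dir/$acc.sra" ] || printf '%s\n' "$acc"
    done < "$acc_list"
}
```

For example, `clean_acc_list SRR_Acc_List.txt > SRR_Acc_List.clean.txt` produces a list that is safe to hand to `prefetch --option-file`, and `missing_runs SRR_Acc_List.clean.txt ~/ncbi/public/sra` prints the runs that still need to be fetched, so an empty result means the batch is complete.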
