Metabolic interaction models recapitulate leaf microbiota ecology

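The assembly of microbial communities is closely linked to plant health and ecosystem functioning, yet the mechanisms that determine species abundance in a given environment remain unclear. Because resources in the plant phyllosphere are limited, knowing how microbes are distributed and what they can metabolize makes it possible to estimate how much interspecies interactions contribute to the assembly of the plant microbiota. That knowledge informs several questions: competition between microbes for plant-derived resources, the phylogenetic signal of microbial traits, the importance of community assembly, the metabolic mechanisms behind specific interactions, carbon utilization, niche partitioning, cross-feeding, interactions between organisms, and the effect of resource allocation on species interactions and community structure. It also provides a basis for understanding how communities affect ecosystems and for designing targeted microbial consortia that raise plant yield.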

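Plants make up the largest share of Earth's biomass and are colonized by diverse microbes that influence plant health and growth. Plant microbiota assembly is deterministic, which indicates that community assembly is shaped by underlying drivers. However, the metabolic mechanisms behind these interactions are unknown.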

A research paper addressing this has been published: "Metabolic interaction models recapitulate leaf microbiota ecology".

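The study isolated 224 strains from the Arabidopsis phyllosphere microbiota and grew them on 45 different carbon sources to find out what each strain can use as an energy source. From these carbon-use profiles the authors inferred the niche overlap between strains, which allows estimating which strains dominate and what share each strain holds in the leaf community. To validate this, pairs of strains that rely on the same energy sources were competed against each other on Arabidopsis, and the resulting relative abundances were measured.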

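The authors built a genome-scale metabolic model for each of the 224 strains and used a computational scoring tool to assess strain growth. They then computed the metabolic niche overlap between strains, derived a carbon utilization profile from each strain's model (these profiles show a strong phylogenetic signal), and predicted the outcome of competition between pairs of strains on the Arabidopsis host. Those predictions were tested experimentally to validate the models.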

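The paper emphasizes the importance of carbon metabolism for community assembly. The authors modeled more than 17,000 pairwise strain interactions and co-cultured strains on plate media. Strain abundance did not increase in co-culture, which points to strong competition for carbon. Some strains could still persist because they obtain energy from the uptake of amino acids and organic acids.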

Highlight: genome-scale models of different organisms + niche prediction == information on species interactions

Notes on the genome-scale metabolic model algorithm:

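The researchers showed that carbon source availability and metabolic interactions play an important role in plant-associated communities, and that the conservation of the observed traits may contribute to the deterministic assembly of the plant microbiota. For a plant, the bacterial taxa present on the leaf can be identified by metagenomic sequencing, but why do these microbes differ in relative abundance? To answer this, the authors examined which metabolites mediate the exchange between the plant and its microbes. Genome-scale models make it possible to predict how each strain grows.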

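Example: the authors found that microbes compete for carbon. Strains could nevertheless grow to the observed level when no carbon source was added, because this competition can be offset by the uptake of amino acids and organic acids. Growth was assessed by manual inspection and image processing. Carbon utilization capability was related to phylogeny, and the degree of niche overlap between strains was derived from their carbon utilization profiles and quantified with the niche overlap index (NOI). Across all strains, the Rhizobia strains generally showed high NOI values.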
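The notes above do not give the formula for the NOI, so the following is only a minimal sketch under an assumed definition: the NOI of strain A with respect to strain B is taken as the fraction of carbon sources used by A that B can also use. The function name `niche_overlap_index` and the growth profiles are hypothetical and are not taken from the paper.

```python
def niche_overlap_index(profile_a, profile_b):
    """Assumed definition: share of A's usable carbon sources that B also uses."""
    used_a = {source for source, grows in profile_a.items() if grows}
    used_b = {source for source, grows in profile_b.items() if grows}
    if not used_a:
        return 0.0  # A uses nothing, so there is nothing to overlap with
    return len(used_a & used_b) / len(used_a)


# Hypothetical growth/no-growth profiles on four carbon sources
generalist = {"glucose": True, "fructose": True, "succinate": True, "alanine": True}
specialist = {"glucose": True, "fructose": True, "succinate": False, "alanine": False}

# The index is asymmetric: the specialist's niche is fully covered by the generalist
noi_specialist = niche_overlap_index(specialist, generalist)
noi_generalist = niche_overlap_index(generalist, specialist)
print(noi_specialist, noi_generalist)
```

Under this assumed definition a strain that uses few carbon sources tends to have a high overlap with broader strains, which is one way to read the observation that low-versatility strains show high niche overlap.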

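Why is it called a genome-scale metabolic model? Each model represents a strain's metabolic capabilities and physiological traits with roughly 5,000 reactions, and model size correlates moderately with the genome size of the corresponding strain.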

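Model application:
Strains with low versatility all showed high niche overlap with one another, whereas Leaf202 and Leaf145 showed low NOI with the other strains.

The genome-scale models predicted almost exclusively negative interaction outcomes for all low-versatility strains. Leaf202 was predicted to have weakly positive outcomes with every strain except Leaf145, and Leaf145 to have more strongly positive outcomes with most other strains.

In addition, when paired with other low-versatility strains, Leaf202 and Leaf145 each showed two instances of weakly positive outcomes, which the authors' genome-scale modelling predictions captured. The experiments confirmed the validity of the computational predictions (Tables S4 and S5) and highlight the strong contribution of resource competition to strain-specific interaction outcomes in situ.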

PART 1: Model generation

*At*-LSPHERE genome-scale metabolic model generation pipeline
========================

This collection of scripts will output a set of curated metabolic models based on organism genomes and experimental information. It is divided into four subsections: 
(1) generation of draft models using CarveMe (Machado *et al.*, 2018), 
(2) initial gapfilling of the draft models using NICEgame (Vayena *et al.*, 2022), 
(3) additional gapfilling of the models to resolve false positive and negative reactions, and 
(4) final model formatting and annotation, followed by verification using MEMOTE (Lieven *et al.*, 2020). This guide is based on a recommended folder structure for storing models and reports.

# Local quickstart

Software requirements:
  * [MATLAB](https://www.mathworks.com/products/matlab.html) R2021a or higher
  * [CarveMe](https://carveme.readthedocs.io/en/latest/installation.html)
  * [Python](https://www.python.org) 3.6 or 3.7
  * [COBRA Toolbox](https://opencobra.github.io/cobratoolbox/stable/) v2.24.3 or higher
  * [IBM CPLEX Solver](https://www.ibm.com/products/ilog-cplex-optimization-studio/cplex-optimizer) v12.10
  * NICEgame (from this repository)
  * [MEMOTE](https://memote.readthedocs.io/en/latest/)

## Generate draft metabolic reconstructions using CarveMe:

**Procedure:**
1. Download all desired genomes (in this repo, these are in 'Models/Genomes/'):

2. Using a command line interface, navigate to the CarveMe installation directory and initialize the software:
```bash
$ python3 /Applications/carveme-master/carveme/__init__.py
```

3. To generate models for all genomes in a directory, navigate to the directory in which the genomes are stored (i.e., 'Models/Genomes/') and run:
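```bash
# Build one draft model per genome archive in the current directory
for infile in *.faa.zip; do
    # Strip everything after the first dot to get the genome name
    outfile="${infile%%.*}"
    carve "$infile" -o "../CarveMe/sbml_noGF/${outfile}.xml"
done
```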

This will create one SBML draft model corresponding to each genome, and will store them in the 'sbml_noGF' directory.

Alternatively, to generate a model for an individual genome, navigate to the desired directory and run:
```bash
# Download the genome by its RefSeq accession and build the draft model
carve --refseq GCF_XXXXXXXXX.1 -o ../CarveMe/sbml_noGF/GCF_XXXXXXXXX.xml
```


**Key outputs:**
  * One draft genome-scale model (in SBML format) for each input genome

## Generate gapfilled models using NICEgame:

**Main script:**
* Gapfilling/NICEgame/gapFillModelTFA.m

**Key inputs:**
  * Draft models (in 'FBA/Models/CarveMe/sbml_noGF/')
  * Carbon source screen data ('Medium/CSourceScreen_Jul2022.xlsx')

**Procedure:**
1. Unpack the matTFA toolbox located in NICEgame/matTFA-master/matTFA.zip

2. Open MATLAB and the 'gapFillModelTFA.m' script. This script generates genome-scale metabolic models from previously-generated CarveMe reconstructions and experimental data using the matTFA (Thermodynamic Flux Analysis, Salvy *et al.*, 2019) and NICEgame (Vayena *et al.*, 2022) pipelines.

     This script takes a CarveMe draft metabolic model of an organism and its corresponding experimental data (in .xlsx format representing growth/no growth on carbon sources) as its main inputs. It performs gapfilling using NICEgame and matTFA, which merge the corresponding draft model with a universal metabolite/reaction database and constrains reactions using thermodynamic information. NICEgame then finds candidate reactions that need to be added to the reconstructions to enable growth on each carbon source.

     The script then selects the best combination of gapfilled reactions to use by predicting the growth/no growth phenotype of each model on combinations of solutions. It then saves COBRA model files for downstream curation.

**Key outputs:**
  * List of candidate reactions for gapfilling (in 'FBA/Models/NICEgame/GapfillingResults/')
  * Gapfilled models (one per organism, in 'FBA/Models/NICEgame/Gapfilled/')

## Perform additional model curation to resolve false negative and positive growth:

**Main scripts:**
  * Gapfilling/getModelAccuracy.m
  * Gapfilling/troubleshootFalsePosNeg.m

**Key inputs:**
  * Gapfilled models (one per organism, in 'FBA/Models/NICEgame/Gapfilled/')
  * Carbon source screen data ('Medium/CSourceScreen_Jul2022.xlsx')

**Procedure:**
1. Run the 'getModelAccuracy.m' script, which will output a .mat file containing accuracy statistics of all models in the relevant directory.

2. Run the 'troubleshootFalsePosNeg.m' script, which will reference other models within the collection to correct for false negative and positive growth predictions. Here, the threshold for false positives and the method of correction can be adjusted.

**Key outputs:**
  * FP/FN-corrected gapfilled models (one per organism, in 'FBA/Models/NICEgame/Gapfilled/FPFNCorrected/')
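The accuracy statistics mentioned in this step can be pictured as a comparison of predicted and observed growth on each carbon source. The sketch below is not the logic of 'getModelAccuracy.m', which is not shown in this guide; it is a minimal illustration that assumes binary growth/no-growth calls. The function names and the example data are hypothetical.

```python
def growth_confusion(predicted, observed):
    """Count true/false positives and negatives over the observed carbon sources."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for source, grew in observed.items():
        model_grew = predicted[source]
        if model_grew and grew:
            counts["TP"] += 1
        elif model_grew and not grew:
            counts["FP"] += 1  # model grows, experiment does not
        elif not model_grew and grew:
            counts["FN"] += 1  # experiment grows, model does not
        else:
            counts["TN"] += 1
    return counts


def accuracy(counts):
    """Fraction of carbon sources on which prediction and experiment agree."""
    total = sum(counts.values())
    return (counts["TP"] + counts["TN"]) / total if total else 0.0


# Hypothetical screen results and model predictions for one strain
observed = {"glucose": True, "sucrose": True, "citrate": False, "glycine": False}
predicted = {"glucose": True, "sucrose": False, "citrate": True, "glycine": False}

counts = growth_confusion(predicted, observed)
print(counts, accuracy(counts))
```

False negatives are the cases that gapfilling tries to resolve by adding reactions, while false positives call for removing or constraining reactions.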

## Perform final model formatting:

**Main scripts:**
  * Final/finalModelFormatting.m

**Key inputs:**
  * FP/FN-corrected gapfilled models (one per organism, in 'FBA/Models/NICEgame/Gapfilled/FPFNCorrected/')
  * Annotation databases (in 'FBA/Scripts/ModelGeneration/Final/databases/')

**Procedure:**
1. Run the 'finalModelFormatting.m' script, which will attempt to annotate all model metabolites, genes, reactions, and subsystems. It will output a .mat file containing the formatted model in COBRA format, as well as an SBML model in .xml.

**Key outputs:**
  * Annotated models in .mat format (one per organism, in 'FBA/Models/Final/')
  * Annotated models in SBML format (one per organism, in 'FBA/Models/Final/sbml')

## Verify models using MEMOTE:

**Key inputs:**
  * Annotated models in SBML format (in 'FBA/Models/Final/sbml')

**Procedure:**

1. Navigate to the directory containing the annotated models in SBML format ('FBA/Models/Final/sbml') and run MEMOTE via a command line interface to verify the models:

```bash
# Write one HTML snapshot report per model and stop at the first failure
for i in *.xml; do
  memote report snapshot --filename "../../Reports/${i%.*}.html" "$i" || break
done
```

**Key outputs:**
  * MEMOTE quality scores for each model (in 'FBA/Models/Reports/')

PART 2: Model simulation

*At*-LSPHERE genome-scale metabolic model simulation scripts 
========================

These scripts will simulate competitive outcomes between previously-generated genome-scale models, and will compare these outcomes to experimental data. This guide is based on a recommended folder structure for storing models, but can be modified in each script.

# Local quickstart

Software requirements:
  * [MATLAB](https://www.mathworks.com/products/matlab.html) R2021a or higher
  * [COBRA Toolbox](https://opencobra.github.io/cobratoolbox/stable/) v2.24.3 or higher
  * [IBM CPLEX Solver](https://www.ibm.com/products/ilog-cplex-optimization-studio/cplex-optimizer) v12.10

## Compute competitive outcomes and compare to experimental data:

**Main script:**
* competitiveOutcomesPairs.m

**Key inputs:**
  * Curated models (in 'Models/Final/')
  * Medium composition ('Medium/minMedCSourceScreen.mat')

**Procedure:**
1. Open MATLAB and the 'competitiveOutcomesPairs.m' script. This script computes competitive outcomes between strain pairs and community compositions, and compares them to experimental outcomes if desired.

**Key outputs:**
  * Pairwise and community competitive outcomes and associated metabolic flux information
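How a pairwise outcome is labeled as negative or positive is not spelled out in this guide. One simple way to think about it, shown here as an assumption rather than as the logic of 'competitiveOutcomesPairs.m', is to compare a strain's growth in the pair with its growth alone. The function name, the relative-change criterion, and the 5% tolerance are all hypothetical.

```python
def classify_outcome(mono_growth, pair_growth, tolerance=0.05):
    """Label the effect of a partner strain by the relative change in growth.

    Assumption: growth values are positive numbers on the same scale,
    for example predicted biomass flux or measured abundance.
    """
    if mono_growth <= 0:
        raise ValueError("monoculture growth must be positive")
    change = (pair_growth - mono_growth) / mono_growth
    if change > tolerance:
        return "positive"
    if change < -tolerance:
        return "negative"
    return "neutral"


# Hypothetical growth values for one strain with three different partners
print(classify_outcome(1.0, 0.4))   # growth drops sharply with this partner
print(classify_outcome(1.0, 1.2))   # growth improves with this partner
print(classify_outcome(1.0, 1.02))  # change is within the tolerance
```

The tolerance keeps small numerical differences from being reported as interactions; a suitable value depends on the noise in the data being compared.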

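Model testing

The tests focus on whether a model adheres to the basic principles of constraint-based modelling: mass, charge, and stoichiometric balance, as well as the presence of annotations.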
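To make the mass and charge balance idea concrete, here is a minimal sketch of such a check for a single reaction. It is not MEMOTE's implementation. The helper names are hypothetical, the formula parser handles only flat formulas such as C6H12O6, and the formulas and charges in the example are illustrative values for the hexokinase reaction.

```python
import re


def parse_formula(formula):
    """Count atoms in a flat chemical formula such as 'C6H12O6' (no brackets)."""
    atoms = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        atoms[element] = atoms.get(element, 0) + (int(number) if number else 1)
    return atoms


def imbalance(stoichiometry, formulas, charges):
    """Return leftover atoms and net charge; both are empty/zero when balanced."""
    net_atoms, net_charge = {}, 0
    for metabolite, coefficient in stoichiometry.items():
        for element, count in parse_formula(formulas[metabolite]).items():
            net_atoms[element] = net_atoms.get(element, 0) + coefficient * count
        net_charge += coefficient * charges[metabolite]
    leftover = {element: n for element, n in net_atoms.items() if n != 0}
    return leftover, net_charge


# Illustrative data: glucose + ATP -> glucose 6-phosphate + ADP + H+
formulas = {"glc": "C6H12O6", "atp": "C10H12N5O13P3", "g6p": "C6H11O9P",
            "adp": "C10H12N5O10P2", "h": "H"}
charges = {"glc": 0, "atp": -4, "g6p": -2, "adp": -3, "h": 1}

balanced = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1, "h": 1}
missing_proton = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1}

print(imbalance(balanced, formulas, charges))
print(imbalance(missing_proton, formulas, charges))
```

Reactants carry negative coefficients and products positive ones, so a balanced reaction sums to zero for every element and for charge. Leaving out the proton shows up as one missing hydrogen and one missing unit of charge.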

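"Reconstructions" versus "models"
Some authors publish parameterized metabolic networks that are ready for flux balance analysis (FBA); these are simply called "models". Others publish unconstrained metabolic knowledge bases (called "reconstructions"), from which several models can be derived by applying different constraints. Both can be encoded in SBML. With an independent test section, we attempt to make "models" and "reconstructions" comparable, although users should be aware that this difference exists and is a matter of some discussion.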

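"Lumped" and "split" biomass reactions
There are two basic ways of specifying biomass composition. The most common is a single lumped reaction that contains all biomass precursors. Alternatively, the biomass equation can be split into several reactions, each focusing on a different macromolecular component, for example: a (1 gDW ash) + b (1 gDW phospholipids) + c (free fatty acids) + d (1 gDW carbohydrates) + e (1 gDW protein) + f (1 gDW RNA) + g (1 gDW DNA) + h (vitamins/cofactors) + xATP + xH2O -> 1 gDCW biomass + xADP + xH + xPi. The benefits of either approach depend largely on the use case.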

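"Averaged" and "unique" metabolites
Metabolites that consist of a fixed core and variable branches, such as membrane lipids, are sometimes implemented by averaging over the distribution of individual lipid species. The resulting pseudo-metabolite is assigned an averaged chemical formula, which requires scaling the stoichiometry of the associated reactions to avoid floating-point numbers in the formula. The alternative is to implement each species as a distinct metabolite in the model, which increases the total number of reactions. Memote cannot yet distinguish between these paradigms, which means that results in sections that depend on the total number of reactions or on the scaling of stoichiometric parameters may be biased.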
