10X single-cell and spatial joint analysis, part 5: spatialDWLS

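Today we reach part five of the single-cell and spatial joint analysis series. Some readers may be wondering why we share and study so many methods when one would seem to be enough. If that is your question, you still need a broader view of the field. With that said, let's get to today's topic, a method for joint single-cell and spatial analysis: spatialDWLS.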

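The paper is "SpatialDWLS: accurate deconvolution of spatial transcriptomic data", a preprint at the time of writing, by Chinese authors. The method is again deconvolution, so it is worth studying side by side with SPOTlight, which we covered earlier. Here we focus on the key points.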

Background

1. Why can't deconvolution methods built for bulk-seq data be applied directly to spatial transcriptomics data?
(1) the number of cells within each spot is typically small. For example, each spot in the 10X Genomics Visium platform has a diameter of 55 μm, corresponding to a spatial resolution of 5-10 cells. The application of a bulk RNAseq deconvolution method to such a small sample size would result in noise from unrelated cell types. (noise)
(2) as spatial expression datasets usually contain thousands of spots, it would be time and memory consuming if deconvolution methods designed for bulk RNA-seq were applied on spatial expression datasets. (This second reason is secondary; the first is the main one.)
2. How spatialDWLS works
(1) it identifies cell types that are likely to be present at each location by using a recently developed cell-type enrichment analysis method (note that an enrichment method is used here; we look at it in the methods discussion further down).
(2) the cell type composition at each location is inferred by extending the dampened weighted least squares (DWLS) method, which was originally developed for deconvolving bulk RNAseq data (for now, just remember this simple two-step outline).

[Figure]

3. Evaluating spatialDWLS
the Root Mean Square Error (RMSE) associated with oligodendrocytes is only 0.03 with the predicted values approximately centered around the ground truth.
RMSE is the root mean square error: the square root of the mean squared difference between predicted and true values. Lower is better.
[Figure]
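
To make the RMSE figure concrete, here is a minimal sketch of how it could be computed for one cell type across spots. The function name and the proportions are hypothetical illustration values, not data from the paper.

```python
import math


def rmse(predicted, truth):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(truth) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predicted, truth)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical proportions of a single cell type in four spots.
truth = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.37]

error = rmse(predicted, truth)
print(round(error, 4))
```

An RMSE of 0.03, as quoted for oligodendrocytes, would mean the predicted proportion is typically off by about three percentage points per spot.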

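As you can see, the paper compares its method against SPOTlight, which we introduced earlier.

[Figure]

A word of caution: every paper claims its own method is the best, so we need to judge the evidence for ourselves.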

4. Application: one of the paper's examples
During embryonic development, the spatial-temporal distribution of cell types changes dramatically. Therefore, it is of interest to test whether spatialDWLS could aid the discovery of such dynamic changes. Recently, Asp and colleagues studied the development of human heart in early embryos (4.5–5, 6.5, and 9 post-conception weeks) by using the Spatial Transcriptomics (ST) technology. Since the data does not have single-cell resolution, they were not able to identify cell-type distribution directly from the ST data. In order to apply spatialDWLS, we utilized the single-cell RNAseq derived gene signatures from this study as reference. All the cell types were mapped to expected locations.

[Figure]

In order to quantitatively compare the change of spatial-temporal organization of cell type composition during embryonic heart development, we first examined the overall abundance of different cell types.

[Figure]

Some cell types increase and some decrease over time (judging from the joint analysis). In short, the authors report good results and encourage readers to try the method (their view, not mine).

Now let's take a closer look at the paper's methods.

Cell type selection of spatial expression data by enrichment analysis

We use an enrichment based weighted least squares approach for deconvolution of spatial expression datasets.
(1) enrichment analysis using the Parametric Analysis of Gene Set Enrichment (PAGE) method (ref. 22 in the paper) is applied on the spatial expression dataset as previously reported. Note that PAGE is a parametric alternative to GSEA rather than GSEA itself. The marker genes can be identified via differential expression gene analysis of Giotto based on the single cell RNA-seq data provided by users (markers taken from the single-cell data, which feels a bit crude to me). Alternatively, users can also provide marker gene expression for each cell type for deconvolution (or bring your own markers, which seems even more arbitrary).
Let m be the number of marker genes for a cell type. For each gene, the fold change is computed as the ratio of the expression value at each spot to the mean expression across all spots. The mean and standard deviation of the fold change values are defined as μ and δ, respectively. In addition, we calculate the mean fold change of the m marker genes, which is defined as Sm. The enrichment score (ES) is defined as follows:


ES = (Sm - μ) × √m / δ

Then, we binarize the enrichment matrix with the cutoff value of ES = 2 to select cell types that are likely to be present at each point.
Frankly, this enrichment method strikes me as rather loose.
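
The enrichment step can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the standard PAGE z-score form, ES = (Sm - μ) × √m / δ, takes μ and δ per spot over all genes using the population standard deviation, and treats "cutoff value of ES = 2" as ES > 2. The function name, gene names, and expression values are hypothetical, and every gene is assumed to have a non-zero mean.

```python
import math
import statistics


def enrichment_scores(expr, markers):
    """Return one PAGE-style enrichment score per spot.

    expr: dict mapping gene name -> list of expression values (one per spot).
    markers: marker gene names for a single cell type.
    """
    genes = list(expr)
    n_spots = len(expr[genes[0]])
    # Fold change: value at a spot divided by that gene's mean over all spots.
    fold = {}
    for gene in genes:
        gene_mean = statistics.fmean(expr[gene])
        fold[gene] = [value / gene_mean for value in expr[gene]]
    m = len(markers)
    scores = []
    for spot in range(n_spots):
        all_fc = [fold[gene][spot] for gene in genes]
        mu = statistics.fmean(all_fc)        # mean fold change at this spot
        delta = statistics.pstdev(all_fc)    # SD of fold changes at this spot
        s_m = statistics.fmean([fold[gene][spot] for gene in markers])
        scores.append((s_m - mu) * math.sqrt(m) / delta)
    return scores


# Toy data: two marker genes that are high in spot 0, plus eight flat genes.
expr = {"markerA": [9.0, 1.0, 2.0], "markerB": [6.0, 1.0, 2.0]}
expr.update({f"bg{i}": [float(i + 1)] * 3 for i in range(8)})

scores = enrichment_scores(expr, ["markerA", "markerB"])
present = [score > 2 for score in scores]  # binarize at the ES cutoff
print([round(score, 2) for score in scores], present)
```

Only spot 0 passes the cutoff here, so only that spot would carry this cell type forward into the deconvolution step.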

Estimating cell type composition by using a weighted least squares approach

In previous work, we developed dampened weighted least squares (DWLS) for deconvolution of bulk RNAseq data (worth looking up if you have not seen it). This method is extended here to deconvolve spatial transcriptomic data using the signature gene identification step described above. In short, DWLS uses a weighted least squares approach to infer cell-type composition, where the weight is selected to minimize the overall relative error rate. In addition, a damping constant d is used to enhance numerical stability, whose value is determined by using a cross-validation procedure. Here, we use the same sets of weights and damping constant across spots within the same clusters to reduce technical variation. Finally, since the number of cells present at each spot is generally small, we perform another round of deconvolution after removing those cell types that are predicted to be present at a low frequency, by imposing an additional threshold (min frequency = 0.02 by default). (This part does get into the algorithm itself; dig deeper if you are interested.)
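
The two ideas in that paragraph, dampened relative-error weights and a second pass without low-frequency cell types, can be sketched as follows. This is a simplified illustration, not the authors' implementation: it solves the non-negative weighted least squares problem by projected gradient descent instead of quadratic programming, fixes the damping constant instead of choosing it by cross-validation, fits one spot at a time instead of sharing weights within clusters, and assumes weights of the form 1 / prediction² capped at `damping` times the smallest weight. All names and numbers are hypothetical.

```python
def nnls_weighted(S, y, w, iters=3000):
    """Minimize sum_i w[i] * (y[i] - (S x)[i])**2 subject to x >= 0."""
    n_genes, n_types = len(S), len(S[0])
    # Normal-equation terms A = S^T W S and b = S^T W y.
    A = [[sum(w[i] * S[i][j] * S[i][k] for i in range(n_genes))
          for k in range(n_types)] for j in range(n_types)]
    b = [sum(w[i] * S[i][j] * y[i] for i in range(n_genes))
         for j in range(n_types)]
    # The trace of A bounds its largest eigenvalue, so this step size is safe.
    step = 1.0 / sum(A[j][j] for j in range(n_types))
    x = [0.0] * n_types
    for _ in range(iters):
        grad = [sum(A[j][k] * x[k] for k in range(n_types)) - b[j]
                for j in range(n_types)]
        x = [max(0.0, x[j] - step * grad[j]) for j in range(n_types)]
    return x


def dampened_fit(S, y, damping, rounds=3):
    """Refit a few times with capped relative-error weights."""
    w = [1.0] * len(S)
    for _ in range(rounds):
        x = nnls_weighted(S, y, w)
        pred = [max(sum(s * v for s, v in zip(row, x)), 1e-12) for row in S]
        raw = [1.0 / p ** 2 for p in pred]          # relative-error weights
        cap = damping * min(raw)                    # damping limits the spread
        w = [min(r, cap) for r in raw]
    return nnls_weighted(S, y, w)


def deconvolve_spot(S, y, damping=4.0, min_frequency=0.02):
    """Two-pass deconvolution of one spot; returns proportions per cell type."""
    n_types = len(S[0])
    x = dampened_fit(S, y, damping)
    first = [v / sum(x) for v in x]
    # Second pass: drop cell types predicted below the minimum frequency.
    keep = [j for j, p in enumerate(first) if p >= min_frequency]
    sub = [[row[j] for j in keep] for row in S]
    x2 = dampened_fit(sub, y, damping)
    result = [0.0] * n_types
    for j, v in zip(keep, x2):
        result[j] = v / sum(x2)
    return result


# Hypothetical signature matrix: 6 genes (rows) x 3 cell types (columns).
S = [[10, 1, 1], [8, 1, 1], [1, 9, 1], [1, 7, 1], [1, 1, 12], [1, 1, 6]]
# Spot made of roughly 70% type 0 and 30% type 1, plus a little noise.
y = [7.5, 5.7, 3.5, 2.7, 1.1, 0.9]

props = deconvolve_spot(S, y)
print([round(p, 3) for p in props])
```

In this toy spot the third cell type gets a tiny non-zero estimate in the first pass purely from noise, and the 0.02 threshold removes it before the final fit.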

Finally, a figure showing the results:

[Figure]

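The method is available as spatialDWLS, and the code is simple: the only function you need to care about is runDWLSDeconv. The algorithm is where the real substance lies.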

Life is good, and even better with you.
