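Author: Evil Genius
The day before yesterday was Qingming Festival. I went back to my hometown with my parents to pay respects to our ancestors, and the trip also let me calm down completely. I have to say that home is a place that brings peace; even my parents' deep grief eased a great deal. My maternal grandmother is still alive, and the moment I saw my mother meet her, I could feel that my mother is also a daughter.
To everyone living far from home, especially those who grew up in the countryside: if you get the chance, go home and visit more often.
Now, the paper I am sharing today is "Spatial transcriptomics reveals niche-specific enrichment and vulnerabilities of radial glial stem-like cells in malignant gliomas", published in Nature Communications (NC) in February 2023. I read it closely, and it contains many good methods worth recommending.
The paper itself deserves a careful read; here I focus on the analysis methods.
First, contamination in spatial transcriptomics. This contamination comes from transcripts diffusing in from neighboring spots. The decontamination method used is SpotClean, which I have shared before in the post "SpotClean for decontamination analysis of 10X spatial transcriptomics".
Second, the method for spatially integrated clustering. Unlike single-cell data, spatial data carry morphological information, so the clustering can be supervised to some extent. Let's look at what the authors did.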
Spots from different samples are horizontally integrated in the transcriptional space by Harmony. To integrate both transcriptional space and Cartesian space for spatially informed spot clustering, we tested several recently developed spatially aware tools such as Seurat, BayesSpace, SpatialPCA, Spruce, SpatialDE, and BANKSY. Since the DMG1 sample contains a significant portion of normal cerebellum tissue with clearly demarcated anatomic domains, we used DMG1 as a benchmark to compare the clustering results, and found that the clusters generated by BANKSY best correlate with anatomical domains in DMG1.
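The integration method is chosen against what we already know about the morphology: normal regions should cluster on their own. This also shows that the most-cited method is not necessarily the best one; the method that fits the characteristics of the data is the best.
Third, judging the malignancy of tumor spots. Pay attention to the choice of reference here.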
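The excerpt does not say how "best correlate with anatomical domains" was measured. As a minimal sketch, assuming every benchmark spot has a manual anatomical label, one option is to score each tool by the adjusted Rand index (ARI) between its clusters and the annotation. The function name and the toy labels are my own illustration, not taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """ARI between two labelings of the same spots (needs at least 2 spots)."""
    n = len(labels_a)
    # Pairs of spots that share a label in both labelings.
    sum_ij = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs of spots that share a label within each labeling separately.
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. every spot in one cluster in both
    return (sum_ij - expected) / (max_index - expected)


# Hypothetical anatomical annotation and two hypothetical clustering results.
anatomy = ["granular", "granular", "molecular", "molecular", "white", "white"]
tool_a = [0, 0, 1, 1, 2, 2]  # recovers the three domains
tool_b = [0, 0, 0, 1, 1, 1]  # splits the tissue in half

for name, clusters in (("tool_a", tool_a), ("tool_b", tool_b)):
    print(name, round(adjusted_rand_index(anatomy, clusters), 3))
```

The ARI ignores the label names themselves, so cluster IDs from different tools can be compared directly against the same annotation.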
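To identify malignant spots with relatively high tumor cell content, we performed inferCNV analysis using histologically normal peritumor tissue as a reference.
As I covered in the previous post, if you suspect that the tumor has affected the normal tissue you collected, run inferCNV or CopyKAT on it. If there are no obvious CNV events, you have one more piece of evidence that the sample comes from normal tissue. Here, of course, many CNV events are detected, and these events can then go into deeper downstream analysis.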
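To make the role of the reference concrete, here is a rough sketch of the idea behind reference-based CNV inference: center each gene on the mean of the normal reference, then smooth along genomic order so that only broad gains or losses remain. This is not inferCNV's actual implementation; the function names, the window size, and the toy values are assumptions for illustration.

```python
from statistics import mean


def cnv_profile(spot_expr, reference_exprs, window=3):
    """Reference-centered, smoothed expression for one spot.

    spot_expr: log-expression values, genes ordered by genomic position.
    reference_exprs: list of such vectors from normal reference spots.
    """
    n_genes = len(spot_expr)
    ref_mean = [mean(ref[g] for ref in reference_exprs) for g in range(n_genes)]
    centered = [spot_expr[g] - ref_mean[g] for g in range(n_genes)]
    half = window // 2
    smoothed = []
    for g in range(n_genes):
        # Moving average over neighboring genes (truncated at the ends).
        lo, hi = max(0, g - half), min(n_genes, g + half + 1)
        smoothed.append(mean(centered[lo:hi]))
    return smoothed


def cnv_score(profile):
    """Mean squared deviation from the reference; near 0 means no CNV signal."""
    return mean(v * v for v in profile)


# Hypothetical data: two normal reference spots, one normal-like spot and
# one spot with elevated expression across a block of adjacent genes.
reference = [[1.0] * 6, [1.0] * 6]
normal_like = [1.0] * 6
tumor_like = [1.0, 1.0, 2.0, 2.0, 2.0, 1.0]

print(cnv_score(cnv_profile(normal_like, reference)))
print(cnv_score(cnv_profile(tumor_like, reference)))
```

If the reference itself contained tumor cells, its mean would absorb part of the gain and the tumor spots would look flatter than they are, which is why the choice of reference matters.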



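This also shows that clustering results often differ a lot between samples, but as long as the clusters can be matched to approximate morphological locations, the clustering result can be considered reasonable.
Fourth, identifying tumor transcriptional programs. This is also a common analysis in papers. We usually look for programs with WGCNA or NMF, but that is the lazy way; the authors did it very carefully.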
We first analyzed patient samples individually to identify spatially informed marker gene sets. For each sample, we filtered out malignant spots, performed BANKSY to group them into spatially informed clusters, and identified marker genes for each cluster using the Seurat package (v4.0.4) (FindAllMarkers function, only.pos = T, p_val_adj < 0.05), while excluding marker genes that are shared by different clusters. For each cluster, we retained the top 50 marker genes based on log2FC. Clusters with fewer than 50 significant genes (log2FC > 0.25 and P.adj < 0.05) were removed. As a result, 48 spatially informed marker gene sets were identified across 10 tumor samples.
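The filtering rules in this paragraph can be sketched on a plain marker table. This assumes the table has already been produced by something like FindAllMarkers, and that the "fewer than 50" rule is applied after shared genes are removed; the excerpt does not state the order, so treat that as my assumption. The function name and the toy table are hypothetical.

```python
from collections import Counter


def build_marker_sets(markers, top_n=50, min_log2fc=0.25, max_p_adj=0.05):
    """markers: dict mapping cluster -> list of (gene, log2fc, p_adj)."""
    # Keep significant positive markers per cluster.
    significant = {
        cluster: [(g, fc) for g, fc, p in rows if fc > min_log2fc and p < max_p_adj]
        for cluster, rows in markers.items()
    }
    # Count in how many clusters each gene is a significant marker.
    counts = Counter()
    for rows in significant.values():
        counts.update({g for g, _ in rows})
    gene_sets = {}
    for cluster, rows in significant.items():
        unique = [(g, fc) for g, fc in rows if counts[g] == 1]  # drop shared genes
        if len(unique) < top_n:
            continue  # cluster has too few specific markers, remove it
        unique.sort(key=lambda r: r[1], reverse=True)
        gene_sets[cluster] = [g for g, _ in unique[:top_n]]
    return gene_sets


# Hypothetical toy table; top_n is lowered to 2 so the example stays small.
toy = {
    "c1": [("A", 2.0, 0.01), ("B", 1.0, 0.01), ("C", 0.5, 0.01), ("S", 3.0, 0.01)],
    "c2": [("S", 1.5, 0.01), ("D", 0.9, 0.01), ("E", 0.1, 0.01), ("F", 0.8, 0.2)],
}
print(build_marker_sets(toy, top_n=2))
```

In the toy table, gene S is a marker of both clusters and is excluded, and c2 is dropped because only one specific marker survives.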
To horizontally integrate these gene sets into transcriptional modules, we tested three methods as follows and got consistent results.
(1) In the transcriptional space, we calculated the relative gene set expression score in each spot using the Seurat’s (v4.0.4) AddModuleScore function with default parameters. The gene set expression matrix was then used as input for Pearson correlation analysis. The resultant correlation coefficient matrix was subjected to hierarchical clustering using corrplot package-based hclust method, integrating the 48 spatially informed marker gene sets into four cluster modules.
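Method (1) boils down to two steps: correlate the per-spot score vectors of the gene sets, then cluster the gene sets on that correlation. Below is a minimal sketch in plain Python, assuming the scores are already computed. It uses complete-linkage agglomerative clustering on the distance 1 - r, which is my choice for the illustration; the excerpt does not state the linkage. All names and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_gene_sets(scores, n_modules):
    """scores: dict mapping gene set name -> list of per-spot scores."""
    names = list(scores)
    dist = {(a, b): 1 - pearson(scores[a], scores[b]) for a in names for b in names}
    clusters = [[name] for name in names]
    while len(clusters) > n_modules:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: distance between the farthest members.
                d = max(dist[a, b] for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical scores of four gene sets across four spots.
scores = {
    "set1": [1, 2, 3, 4],
    "set2": [2, 4, 6, 8.5],
    "set3": [4, 3, 2, 1],
    "set4": [8, 6, 4.5, 2],
}
print(cluster_gene_sets(scores, n_modules=2))
```

Gene sets whose scores rise and fall together across spots end up in the same module, which is the sense in which 48 sample-level gene sets collapse into four modules.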
(2) In the Cartesian space, while each spot is not spatially independent, spatially informed clusters obtained by Banksy can be considered independent to each other. Thus, we integrated spots from the same cluster in each sample into pseudobulks using Seurat’s (v4.0.4) AverageExpression function. For each pseudobulk, we calculated the relative expression of the aforementioned 48 marker gene sets using Seurat’s (v4.0.4) AddModuleScore function with the default parameters. The gene set expression matrix was then used as input for Pearson correlation analysis. The correlation coefficient matrix was subjected to hierarchical clustering using corrplot (v0.92) package-based hclust method, resulting in four modules highly similar to method 1 (Jaccard-Index 0.746).
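The excerpt reports a Jaccard index between the module assignments of two methods but does not say how it was computed. One common way to compare two partitions, shown here purely as an assumption, is to take the Jaccard index over pairs of gene sets that are placed in the same module. The function names and toy assignments are hypothetical.

```python
from itertools import combinations


def co_clustered_pairs(assignment):
    """Pairs of items that share a module. assignment: dict item -> module."""
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(assignment), 2)
        if assignment[a] == assignment[b]
    }


def jaccard_index(assign1, assign2):
    """Pair-counting Jaccard index between two partitions of the same items."""
    p1, p2 = co_clustered_pairs(assign1), co_clustered_pairs(assign2)
    union = p1 | p2
    return len(p1 & p2) / len(union) if union else 1.0


# Hypothetical module assignments of four gene sets from two methods.
method1 = {"set1": "M1", "set2": "M1", "set3": "M2", "set4": "M2"}
method2 = {"set1": "x", "set2": "x", "set3": "x", "set4": "y"}
print(jaccard_index(method1, method2))
```

Because only co-membership is compared, the module names produced by the two methods do not need to match.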
(3) In the Cartesian space, since adjacent spots are not independent, we used Geographically Weighted Regression (GWR) for correlation analysis. We first calculated all 48 marker gene set scores for individual spots in each sample. Then we calculated the spatially weighted correlation coefficient between any two gene sets using the GWmodel (v2.2) and gwrr (v0.2-2) packages, individually for each sample. The resulting correlation array was reduced by mean to generate a single cross-sample correlation coefficient for any two gene sets. Finally, the correlation coefficient matrix was hierarchical clustered using the corrplot package-based hclust method, resulting in four modules similar to method 1 (Jaccard-Index 0.53). The mean values of the correlation coefficients were visualized by ComplexHeatmap R package (v2.0.0).
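For intuition about method (3), a spatially weighted correlation can be sketched as a weighted Pearson correlation computed around each spot, with weights that decay with distance. This is only an illustration with a Gaussian kernel and a fixed bandwidth, both of which are my assumptions; it is not the GWmodel or gwrr implementation, and the names and coordinates are hypothetical.

```python
from math import exp, sqrt


def weighted_pearson(x, y, w):
    """Pearson correlation of x and y with non-negative weights w."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y)) / sw
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)) / sw
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y)) / sw
    return cov / sqrt(vx * vy)


def gw_correlation(coords, x, y, bandwidth):
    """Local weighted correlation of two score vectors at every spot."""
    local = []
    for cx, cy in coords:
        # Gaussian kernel: nearby spots get weights close to 1, distant ones near 0.
        w = [
            exp(-((px - cx) ** 2 + (py - cy) ** 2) / (2 * bandwidth ** 2))
            for px, py in coords
        ]
        local.append(weighted_pearson(x, y, w))
    return local


# Hypothetical spots on a line with scores for two gene sets.
coords = [(float(i), 0.0) for i in range(6)]
set_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
set_b = [6.0, 5.5, 4.0, 3.5, 2.0, 1.0]

local_r = gw_correlation(coords, set_a, set_b, bandwidth=2.0)
print(sum(local_r) / len(local_r))  # one summary value for this sample
```

Averaging the local values gives one coefficient per pair of gene sets per sample; averaging those again across samples mirrors the "reduced by mean" step in the excerpt.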

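Of course, the final modules match the morphology, which is exactly the information spatial transcriptomics is meant to give us. The association between module distribution and CNV events naturally becomes a focus of the analysis as well. Here, the distribution of the modules represents the distribution of the niches.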

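Fifth, deconvolution. This is also a reminder that if you have no matched single-cell data, you can use single-cell data from public databases. When analyzing the niches, look at how the distribution of cell types differs between them. This provides the basis for analyzing communication within the niche.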


Sixth, spatial trajectory genes: the authors "identified genes specifically upregulated in each region based on their dynamic expression patterns." Among the genes that change along the spatial trajectory, the highly region-specific ones play a crucial role in regulating the niche programs.

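These region-specific genes are what we focus on. The paper also includes some third-generation full-length sequencing analyses, which I still need to study further. Let's take a look at the key analysis methods.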

Life is good, and even better with you.