Recently I needed Python for an analysis, but my single-cell data was stored as a Seurat object, so I had to convert it to h5ad. We summarized the ways to convert Seurat to h5ad in the PAGA post; if you have not read it yet, look in the packaged PAGA code. This time I reused that earlier code and converted with sceasy, which I consider the most concise approach. But this time the object was Seurat V5, and it threw an error! First, install sceasy. It relies on a Python environment, so create the environment in a terminal, then call it from R and install sceasy. loompy is also needed.
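# Create the environment (terminal)
conda create -y -n sceasy python=3.7
conda activate sceasy
conda install anndata -c bioconda
# loompy is imported from R below, so install it into the same environment
conda install loompy -c bioconda
# In R
# Install the R package; sceasy converts between several single-cell data formats: https://github.com/cellgeni/sceasy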
devtools::install_github("cellgeni/sceasy")
library(sceasy)
library(reticulate)
use_condaenv('sceasy')
loompy <- reticulate::import('loompy')
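Before converting, it can help to confirm that the Python interpreter inside the `sceasy` environment can actually find the modules used here. The following is a minimal sketch in Python, meant to be run with that environment's interpreter; the helper name `missing_modules` is made up for illustration and is not part of sceasy.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# anndata and loompy are the modules this walkthrough relies on.
print(missing_modules(["anndata", "loompy"]))
```

An empty list means both modules can be imported; anything listed needs to be installed into the environment before `reticulate::import` will succeed.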
Converting with sceasy takes a single line of code, but it failed, and the error clearly comes from V5!
objV5 <- sce_cca
sceasy::convertFormat(objV5, from="seurat", to="anndata", outFile='objV5.h5ad')
# Error in the run log
Error in ncol(df) :
no slot of name "meta.features" for this object of class "Assay5"
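One option is to wait until the relevant packages are updated to support V5, but that is far too passive. The practical approach is to downgrade the assay. The same applies when someone hands you a V5 object that you want to work with but are not familiar with: converting it to V4 may be easier to handle. V4 and V5 are not worlds apart, unlike the break between monocle2 and monocle3. The operation is simple: one line with as(). After the conversion, the sceasy conversion runs without problems!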
objV5[["RNA"]] <- as(objV5[["RNA"]], "Assay")
sceasy::convertFormat(objV5, from="seurat", to="anndata", outFile='objV5.h5ad')
# Run log
# ... storing 'orig.ident' as categorical
# ... storing 'celltype' as categorical
# ... storing 'metacell_group' as categorical
# ... storing 'var.features' as categorical
# AnnData object with n_obs × n_vars = 7490 × 23700
# obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'percent.mt', 'percent.hb', 'percent.rb', 'RNA_snn_res.0.1', 'RNA_snn_res.0.2', 'RNA_snn_res.0.3', 'RNA_snn_res.0.4', 'RNA_snn_res.0.5', 'RNA_snn_res.0.6', 'RNA_snn_res.0.7', 'RNA_snn_res.0.8', 'RNA_snn_res.0.9', 'RNA_snn_res.1', 'seurat_clusters', 'celltype', 'metacell_group'
# var: 'vf_vst_counts.WT_mean', 'vf_vst_counts.WT_variance', 'vf_vst_counts.WT_variance.expected', 'vf_vst_counts.WT_variance.standardized', 'vf_vst_counts.WT_variable', 'vf_vst_counts.WT_rank', 'vf_vst_counts.GO_mean', 'vf_vst_counts.GO_variance', 'vf_vst_counts.GO_variance.expected', 'vf_vst_counts.GO_variance.standardized', 'vf_vst_counts.GO_variable', 'vf_vst_counts.GO_rank', 'var.features', 'var.features.rank'
# obsm: 'X_pca', 'X_integrated.cca', 'X_umap'
Let's read the converted data into Python and check that it works!
import anndata
import scanpy as sc

sce = anndata.read_h5ad('objV5.h5ad')
# Plot
sc.pl.umap(sce, color="celltype")
sc.pl.DotPlot(sce, ["Pparg", "Myh11", "Mrc1", "Flt1", "Col11a1", "Mymk", "Pax7", "Pdgfra","Ttn","Sox2"],
log = True, groupby='celltype').style(cmap='PRGn',dot_edge_color='black', dot_edge_lw=1).swap_axes(False).show(True)
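If a plot fails, the first thing to check is whether the metadata column and the embedding it needs actually made it through the conversion. The sketch below is one way to do that; `missing_fields` is a hypothetical helper, and it only assumes the object exposes `.obs.columns` and `.obsm.keys()` the way an AnnData object does. The stand-in object in the demo is there only so the snippet runs without anndata installed.

```python
from types import SimpleNamespace


def missing_fields(adata, obs_columns, obsm_keys):
    """List expected obs columns and obsm keys that are absent from an AnnData-like object."""
    present_obs = set(adata.obs.columns)
    present_obsm = set(adata.obsm.keys())
    return {
        "obs": [c for c in obs_columns if c not in present_obs],
        "obsm": [k for k in obsm_keys if k not in present_obsm],
    }


# Stand-in with the same attribute layout as AnnData, for demonstration only.
stand_in = SimpleNamespace(
    obs=SimpleNamespace(columns=["orig.ident", "celltype"]),
    obsm={"X_pca": None, "X_umap": None},
)
print(missing_fields(stand_in, ["celltype"], ["X_umap", "X_tsne"]))
# {'obs': [], 'obsm': ['X_tsne']}
```

With the real object, `missing_fields(sce, ["celltype"], ["X_umap"])` returning two empty lists means `sc.pl.umap(sce, color="celltype")` has what it needs.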

Differential expression analysis also runs without problems:
sc.pp.log1p(sce)
sc.pp.scale(sce)
sc.tl.rank_genes_groups(sce, 'orig.ident', method='wilcoxon')
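Whether `sc.pp.log1p` is appropriate depends on what the exported matrix holds: applying it to values that are already log-normalized would log-transform them twice. The following rough heuristic is not part of the original workflow, and `looks_like_counts` is a hypothetical helper. It checks whether a sample of values are non-negative whole numbers, as raw counts would be.

```python
def looks_like_counts(values):
    """Heuristic: True if every sampled value is a non-negative whole number."""
    values = list(values)
    if not values:
        return False
    return all(v >= 0 and float(v).is_integer() for v in values)


print(looks_like_counts([0, 1, 3, 12]))        # True
print(looks_like_counts([0.0, 0.693, 1.386]))  # False
```

Assuming `sce.X` is a SciPy sparse matrix, a sample such as `sce.X.data[:1000]` can be passed in; for a dense array, something like `sce.X[:50].ravel()` would serve. If the result is `False`, the matrix is probably already normalized and the `log1p` step is worth reconsidering.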
Finally, let's check whether downstream analysis works after converting Seurat V5 to V4. I tested with scMetabolism, which, as we know, does not work with V5 objects. After the conversion, the scMetabolism workflow runs without problems!
library(scMetabolism)
V5_metabolism<-sc.metabolism.Seurat(obj = sce_cca,
method = "AUCell",
imputation = F,
ncores = 2,
metabolism.type = "KEGG")
# Run log
Error in sc.metabolism.Seurat(obj = sce_cca, method = "AUCell", imputation = F, :
no slot of name "counts" for this object of class "Assay5"
objV5_trans <- objV5  # the copy whose RNA assay was converted to a V4 "Assay" above
V4_metabolism<-sc.metabolism.Seurat(obj = objV5_trans,
method = "AUCell",
imputation = F,
ncores = 2,
metabolism.type = "KEGG")
# Run log
# Your choice is: KEGG
# Start quantify the metabolism activity...
# Genes in the gene sets NOT available in the dataset:
# Glycolysis / Gluconeogenesis: 8 (12% of 68)
# Citrate cycle (TCA cycle): 1 (3% of 30)
# Pentose phosphate pathway: 4 (13% of 30)
# Pentose and glucuronate interconversions: 13 (38% of 34)
# Fructose and mannose metabolism: 2 (6% of 33)
# Galactose metabolism: 3 (10% of 31)
# Ascorbate and aldarate metabolism: 13 (48% of 27)
# Starch and sucrose metabolism: 8 (22% of 36)
# Amino sugar and nucleotide sugar metabolism: 2 (4% of 48)
# Pyruvate metabolism: 2 (5% of 39)
# Glyoxylate and dicarboxylate metabolism: 3 (11% of 28)
# Propanoate metabolism: 2 (6% of 32)
# Butanoate metabolism: 5 (18% of 28)
# Inositol phosphate metabolism: 2 (3% of 73)
# Oxidative phosphorylation: 35 (26% of 133)
# Nitrogen metabolism: 1 (6% of 17)
# Fatty acid elongation: 1 (3% of 30)
# Fatty acid degradation: 4 (9% of 44)
# Synthesis and degradation of ketone bodies: 1 (10% of 10)
# Steroid biosynthesis: 2 (11% of 19)
# Primary bile acid biosynthesis: 3 (18% of 17)
# Steroid hormone biosynthesis: 23 (39% of 59)
# Glycerolipid metabolism: 6 (10% of 61)
# Glycerophospholipid metabolism: 8 (8% of 97)
# Ether lipid metabolism: 7 (15% of 47)
# Sphingolipid metabolism: 2 (4% of 47)
# Arachidonic acid metabolism: 16 (26% of 62)
# Linoleic acid metabolism: 11 (38% of 29)
# alpha-Linolenic acid metabolism: 7 (28% of 25)
# Biosynthesis of unsaturated fatty acids: 1 (4% of 23)
# Purine metabolism: 12 (7% of 174)
# Pyrimidine metabolism: 7 (7% of 101)
# Alanine, aspartate and glutamate metabolism: 2 (6% of 35)
# Glycine, serine and threonine metabolism: 3 (8% of 40)
# Cysteine and methionine metabolism: 4 (9% of 45)
# Valine, leucine and isoleucine degradation: 2 (4% of 48)
# Lysine degradation: 1 (2% of 59)
# Arginine biosynthesis: 3 (14% of 21)
# Arginine and proline metabolism: 3 (6% of 50)
# Histidine metabolism: 1 (4% of 23)
# Tyrosine metabolism: 6 (17% of 36)
# Phenylalanine metabolism: 2 (12% of 17)
# Tryptophan metabolism: 4 (10% of 40)
# Phenylalanine, tyrosine and tryptophan biosynthesis: 1 (20% of 5)
# Taurine and hypotaurine metabolism: 1 (9% of 11)
# Selenocompound metabolism: 1 (6% of 17)
# D-Glutamine and D-glutamate metabolism: 1 (20% of 5)
# Glutathione metabolism: 10 (18% of 56)
# N-Glycan biosynthesis: 3 (6% of 49)
# Mucin type O-glycan biosynthesis: 4 (13% of 31)
# Mannose type O-glycan biosynthesis: 2 (9% of 23)
# Glycosaminoglycan biosynthesis - keratan sulfate: 1 (7% of 14)
# Glycosphingolipid biosynthesis - lacto and neolacto series: 5 (19% of 27)
# Other glycan degradation: 1 (6% of 18)
# Thiamine metabolism: 4 (25% of 16)
# Nicotinate and nicotinamide metabolism: 1 (3% of 30)
# Pantothenate and CoA biosynthesis: 1 (5% of 19)
# Folate biosynthesis: 5 (19% of 26)
# One carbon pool by folate: 2 (10% of 20)
# Retinol metabolism: 25 (38% of 66)
# Porphyrin and chlorophyll metabolism: 11 (26% of 42)
# Ubiquinone and other terpenoid-quinone biosynthesis: 1 (9% of 11)
# Caffeine metabolism: 2 (40% of 5)
# Metabolism of xenobiotics by cytochrome P450: 30 (40% of 75)
# Drug metabolism - cytochrome P450: 27 (38% of 71)
# Drug metabolism - other enzymes: 21 (27% of 79)
# Warning messages:
# 1: In .AUCell_buildRankings(exprMat = exprMat, featureType = featureType, :
# nCores is no longer used. It will be deprecated in the next AUCell version.
# 2: useNames = NA is deprecated. Instead, specify either useNames = TRUE or useNames = TRUE.
input.pathway<-c("Glycolysis / Gluconeogenesis", "Oxidative phosphorylation", "Citrate cycle (TCA cycle)")
DotPlot.metabolism(obj = V4_metabolism, pathway = input.pathway, phenotype = "celltype", norm = "y")

Finally, thanks to everyone on GitHub for sharing, discussing, and digging into this. If you found this post useful, give it a like before you go!