更換數(shù)據(jù)集跑Seurat流程(smartseq2)
幾個函數(shù)
1. DimPlot():
Graphs the output of a dimensional reduction technique on a 2D scatter plot where each point is a cell and it's positioned based on the cell embeddings determined by the reduction technique. By default, cells are colored by their identity class (can be changed with the group.by parameter). 畫一個2D降維圖,點代表細胞,可以根據(jù)其分組用不同的顏色表示。
Arguments:
DimPlot(object, dims = c(1, 2), cells = NULL, cols = NULL, pt.size = NULL, reduction = NULL, group.by = NULL, split.by = NULL, shape.by = NULL, order = NULL, label = FALSE, label.size = 4, repel = FALSE, cells.highlight = NULL, cols.highlight = "#DE2D26", sizes.highlight = 1, na.value = "grey50", ncol = NULL, combine = TRUE)
dims: Dimensions to plot, must be a two-length numeric vector specifying x- and y-dimensions. PCA圖的維度,必須是兩維向量
reduction:Which dimensionality reduction to use. If not specified, first searches for umap, then tsne, then pca.采用的降維方法,首選umap->tsne->pca
DimPlot(brc, dims = c(1, 3), reduction = "pca")PC1及PC3

group.by:Name of one or more metadata columns to group (color) cells by (for example, orig.ident); pass 'ident' to group by identity class.應用metadata中的某列數(shù)據(jù)作為分組標準,按不同顏色顯示
DimPlot(brc, dims = c(1, 3), reduction = "pca", group.by = "plate")按照批次分組繪圖
split.by: Name of a metadata column to split plot by,按照metadata中的某個屬性(可以是批次信息,生物學狀態(tài),不同組織病理類型等等)將plot圖分為幾列
DimPlot(brc, dims = c(1, 3), reduction = "pca", split.by = "plate")按照批次繪圖,PC1及PC3

DimPlot(brc, dims = c(1, 3), reduction = "pca", split.by = "g")按照分組繪圖,PC1及PC3

2. DimHeatmap()
DimHeatmap(brc, dims = 1, cells = 500, balanced = TRUE)object: brc
dims: 顯示哪個維度的基因及細胞的熱圖(一般可根據(jù)下述步驟確定多少個維度后,將重要的維度都繪制出來)
cells: 繪圖所用的細胞,500表示前500個細胞,其他參數(shù)一般不重要
代碼運行
rm(list = ls())
packageVersion("Seurat")
library(Seurat)
library(dplyr)
library(patchwork)
library(ggsci)###改變一下配色
load(file='../input.Rdata')
#?CreateSeuratObject()
brc.data = a
brc <- CreateSeuratObject(counts = brc.data, project = "brc7", min.cells = 3, min.features = 200, meta.data = df)
brc
brc.data[c("Jak1", "Jak2", "Jak3"), 1:5]
brc[["percent.mt"]] <- PercentageFeatureSet(brc, pattern = "^MT-")
brc[["percent.ercc"]] <- PercentageFeatureSet(brc, pattern = "^ERCC-")
head(brc@meta.data)
#展示數(shù)據(jù)中每個細胞UMI,基因數(shù),線粒體及ERCC比例
VlnPlot(brc, features = c("nFeature_RNA", "nCount_RNA", "percent.ercc","percent.mt"), ncol = 4)
#展示相關(guān)特征之間的關(guān)系
plot1 <- FeatureScatter(brc, feature1 = "nCount_RNA", feature2 = "percent.ercc")
plot2 <- FeatureScatter(brc, feature1 = "nCount_RNA", feature2 = "nFeature_RNA")
plot3 <- FeatureScatter(brc, feature1 = "percent.ercc", feature2 = "nFeature_RNA")
plot1 + plot2 + plot3
#將unique基因count數(shù)超過7000,或者小于1000的細胞過濾掉,把ERCCcount數(shù)占50%以上的細胞過濾掉
brc <- subset(brc, subset = nFeature_RNA > 1000 & nFeature_RNA < 7000 & percent.ercc < 50)
#標準化:表達量*10000后取log,結(jié)果存儲在pbmc[["RNA"]]@data
brc <- NormalizeData(brc, normalization.method = "LogNormalize", scale.factor = 10000)
#選取高變異基因,只選取前2000個
brc <- FindVariableFeatures(brc, selection.method = "vst", nfeatures = 2000)
#?FindVariableFeatures
top10 <- head(VariableFeatures(brc), 10)
plot1 <- VariableFeaturePlot(brc)
plot2 <- LabelPoints(plot = plot1, points = top10, repel = TRUE)
plot1 + plot2
#scale線性轉(zhuǎn)換
all.genes <- rownames(brc)
brc <- ScaleData(brc, features = all.genes)
brc <- ScaleData(brc, vars.to.regress = "percent.ercc")
#降維 scale后PCA
brc <- RunPCA(brc, features = VariableFeatures(object = brc))
VizDimLoadings(brc, dims = 1:2, reduction = "pca")
DimPlot(brc, reduction = "pca")
#?DimPlot metadata中分組列名一定要加引號,參數(shù)可調(diào)整
DimPlot(brc, dims = c(1, 2), reduction = "pca", split.by = "g")
DimHeatmap(brc, dims = 1, cells = 500, balanced = TRUE)
?DimHeatmap
#合適維度的判定
brc <- JackStraw(brc, num.replicate = 100)
?JackStraw()
brc <- ScoreJackStraw(brc, dims = 1:20)
JackStrawPlot(brc, dims = 1:15)
brc <- FindNeighbors(brc, dims = 1:10)
brc <- FindClusters(brc, resolution = 0.5)
head(Idents(brc))
#降維可視化
brc <- RunUMAP(brc, dims = 1:10)
DimPlot(brc, reduction = "umap")
#尋找差異基因
cluster1.markers <- FindMarkers(brc, ident.1 = 1, min.pct = 0.25)
head(cluster1.markers, n = 5)
brc.markers <- FindAllMarkers(brc, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25)
table(brc.markers$cluster == 2)
cluster1.markers <- FindMarkers(brc, ident.1 = 1, logfc.threshold = 0.25, test.use = "roc", only.pos = TRUE)
VlnPlot(brc, features = c("Fbln1", "Mfap5"))
top10 = brc.markers %>% group_by(cluster) %>% top_n(n = 10, wt = avg_logFC)
DoHeatmap(brc, features = top10$gene) + NoLegend()





