
Overview of Harmony algorithm
Fast, sensitive and accurate integration of single-cell data with Harmony
#System requirements
- Linux, OS X, and Windows are all supported.
- R version 3.4 or later is required.
- Python users: see harmonypy.
#Installation
library(devtools)
install_github("immunogenomics/harmony")
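The installation step can be wrapped so that it checks the R version requirement first and skips the download when the package is already present. This is only a sketch: `meets_r_requirement()` and `install_harmony_if_missing()` are hypothetical helper names, not part of harmony or devtools.

```r
# Hypothetical helper: TRUE when the running R meets the minimum version
# listed in the system requirements (3.4 or later).
meets_r_requirement <- function(min_version = "3.4.0") {
  getRversion() >= min_version
}

# Hypothetical helper: installs harmony from GitHub only when it is missing.
# Nothing is downloaded until this function is called.
install_harmony_if_missing <- function() {
  if (!meets_r_requirement()) {
    stop("Harmony requires R 3.4 or later")
  }
  if (!requireNamespace("harmony", quietly = TRUE)) {
    if (!requireNamespace("devtools", quietly = TRUE)) {
      install.packages("devtools")
    }
    devtools::install_github("immunogenomics/harmony")
  }
  # Report whether the package can now be loaded
  invisible(requireNamespace("harmony", quietly = TRUE))
}
```

Calling `install_harmony_if_missing()` once in a fresh session is enough; later calls return without reinstalling.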
#Examples
##PCA matrix
Harmony can iteratively correct PCA embeddings. To pass precomputed PCA embeddings, set do_pca=FALSE.
library(harmony)
data(cell_lines_small)
pca_matrix <- cell_lines_small$scaled_pcs
meta_data <- cell_lines_small$meta_data
harmony_embeddings <- HarmonyMatrix(pca_matrix, meta_data, 'dataset',
                                    do_pca=FALSE)
## Output is a matrix of corrected PC embeddings
dim(harmony_embeddings)
harmony_embeddings[seq_len(5), seq_len(5)]
## Finally, we can return an object with all the underlying data structures
harmony_object <- HarmonyMatrix(pca_matrix, meta_data, 'dataset',
                                do_pca=FALSE, return_object=TRUE)
dim(harmony_object$Y) ## cluster centroids
dim(harmony_object$R) ## soft cluster assignment
dim(harmony_object$Z_corr) ## corrected PCA embeddings
head(harmony_object$O) ## batch by cluster co-occurrence matrix
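The relationship between the soft assignments `R` and the co-occurrence matrix `O` can be illustrated with a toy example in base R. This is a sketch under stated assumptions, not Harmony's own code: it assumes `R` stores clusters in rows and cells in columns, and that each entry of `O` is the sum of one cluster's soft assignments over the cells of one batch. The orientation of the real `harmony_object$O` may differ. The names `R_toy`, `Phi_toy`, and `O_toy` and all values are made up for illustration.

```r
# Toy soft cluster assignments: 2 clusters (rows) x 4 cells (columns).
# Each column sums to 1; the values are invented.
R_toy <- matrix(c(0.9, 0.1,
                  0.8, 0.2,
                  0.3, 0.7,
                  0.1, 0.9), nrow = 2)

# Batch label of each cell
batch <- factor(c("a", "a", "b", "b"))

# One-hot batch membership: 2 batches (rows) x 4 cells (columns)
Phi_toy <- t(model.matrix(~ 0 + batch))

# Assumed definition: O[k, b] = sum of R[k, i] over cells i in batch b
O_toy <- R_toy %*% t(Phi_toy)
print(O_toy)
```

Because every cell's assignments sum to 1, each column of `O_toy` sums to the number of cells in that batch. A cluster whose row is dominated by one batch is poorly mixed, and that imbalance is what the integration tries to reduce.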
##Normalized gene matrix
Harmony expects the input to be a normalized gene expression matrix. It then scales the data, reduces dimensionality with PCA, and finally integrates the data.
library(harmony)
my_harmony_embeddings <- HarmonyMatrix(normalized_counts, meta_data, "dataset")
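The text above does not say which normalization to use. One common choice for single-cell counts is library-size scaling followed by a log transform, sketched below in base R. Treat it as an assumption rather than Harmony's required preprocessing: the toy counts, the scale factor of 10,000, and the genes-in-rows, cells-in-columns layout are all illustrative.

```r
# Toy raw counts: 3 genes (rows) x 4 cells (columns); values are invented.
raw_counts <- matrix(c(5, 0, 5,
                       2, 8, 10,
                       1, 1, 2,
                       0, 3, 7),
                     nrow = 3,
                     dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4)))

# Total counts per cell (library size)
lib_size <- colSums(raw_counts)

# Scale each cell to 10,000 total counts, then apply log(1 + x)
normalized_counts <- log1p(sweep(raw_counts, 2, lib_size, "/") * 1e4)

# One metadata row per cell, with the batch column used for integration
meta_data <- data.frame(dataset = c("A", "A", "B", "B"),
                        row.names = colnames(raw_counts))
```

The resulting `normalized_counts` and `meta_data` have the shapes the call above takes as input. A real data set needs far more genes and cells for the PCA step to be meaningful.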
##Seurat
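Harmony can be used within a Seurat workflow (Seurat V2 and Seurat V3). Run RunHarmony() after PCA, then use the Harmony embeddings in place of the PCA embeddings in downstream steps such as RunUMAP().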
seuratObj <- RunHarmony(seuratObj, "dataset")
seuratObj <- RunUMAP(seuratObj, reduction = "harmony")
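Placed in a fuller workflow, these two calls might look like the sketch below. It assumes Seurat V3 or later function names and an object that holds raw counts plus a metadata column named by `batch_var`. The wrapper `run_harmony_workflow()` is hypothetical, and the number of dimensions and the clustering resolution are arbitrary illustrative values.

```r
# Hypothetical wrapper (not part of Seurat or harmony): standard preprocessing,
# Harmony correction of the PCA embeddings, then downstream steps that read
# the "harmony" reduction instead of "pca".
run_harmony_workflow <- function(seuratObj, batch_var = "dataset", n_dims = 20) {
  if (!requireNamespace("Seurat", quietly = TRUE) ||
      !requireNamespace("harmony", quietly = TRUE)) {
    stop("This sketch needs the Seurat and harmony packages")
  }
  seuratObj <- Seurat::NormalizeData(seuratObj, verbose = FALSE)
  seuratObj <- Seurat::FindVariableFeatures(seuratObj, verbose = FALSE)
  seuratObj <- Seurat::ScaleData(seuratObj, verbose = FALSE)
  # PCA must exist before Harmony can correct it
  seuratObj <- Seurat::RunPCA(seuratObj, npcs = n_dims, verbose = FALSE)
  seuratObj <- harmony::RunHarmony(seuratObj, batch_var)
  # Downstream steps use the corrected embeddings
  seuratObj <- Seurat::RunUMAP(seuratObj, reduction = "harmony",
                               dims = seq_len(n_dims))
  seuratObj <- Seurat::FindNeighbors(seuratObj, reduction = "harmony",
                                     dims = seq_len(n_dims))
  Seurat::FindClusters(seuratObj, resolution = 0.5)
}
```

The only changes from a workflow without integration are the RunHarmony() call and the `reduction = "harmony"` arguments.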
##Harmony with two or more covariates
Harmony can integrate over multiple covariates. Specify them as a character vector of metadata column names.
my_harmony_embeddings <- HarmonyMatrix(
  my_pca_embeddings, meta_data, c("dataset", "donor", "batch_id"),
  do_pca = FALSE
)
In a Seurat workflow:
seuratObject <- RunHarmony(seuratObject, c("dataset", "donor", "batch_id"))
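Before integrating over several covariates, it can help to confirm that each one is a column of the metadata and that the metadata has one row per cell. The base R sketch below does this with a hypothetical helper, `check_covariates()`, which is not part of harmony. The toy metadata values are invented.

```r
# Hypothetical helper: stops if a covariate is missing or if the metadata
# does not have one row per cell; otherwise returns the number of distinct
# levels of each covariate.
check_covariates <- function(meta_data, vars_use, n_cells) {
  missing_vars <- setdiff(vars_use, colnames(meta_data))
  if (length(missing_vars) > 0) {
    stop("Missing covariates: ", paste(missing_vars, collapse = ", "))
  }
  if (nrow(meta_data) != n_cells) {
    stop("meta_data must have one row per cell")
  }
  vapply(meta_data[vars_use], function(x) length(unique(x)), integer(1))
}

# Toy metadata for 4 cells
meta_toy <- data.frame(dataset  = c("A", "A", "B", "B"),
                       donor    = c("d1", "d2", "d1", "d2"),
                       batch_id = c("b1", "b1", "b1", "b2"))

levels_per_covariate <- check_covariates(
  meta_toy, c("dataset", "donor", "batch_id"), n_cells = 4
)
print(levels_per_covariate)
```

A covariate with a single level carries no batch information, so the per-covariate level counts are a quick sanity check.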
For detailed usage, see the advanced tutorial.
Code to reproduce the analyses in "Fast, sensitive and accurate integration of single-cell data with Harmony" is available in harmony2019.
#References
Harmony
Fast, sensitive and accurate integration of single-cell data with Harmony