對(duì)應(yīng)原版教程第10章http://bioconductor.org/books/release/OSCA/overview.html
“物以類(lèi)聚，人以群分” 分群步驟即將基因表達(dá)（降維結(jié)果）相似的細(xì)胞歸為同一個(gè)群體，往往對(duì)應(yīng)一種特定的細(xì)胞類(lèi)型或者細(xì)胞軌跡狀態(tài)。從一步開(kāi)始，就可以開(kāi)始敘述我們的生物學(xué)故事了~

源網(wǎng)，侵刪~

筆記要點(diǎn)

1、clustering是一個(gè)顯微鏡
2、基于圖聚類(lèi)的分群
3、其它分群算法（k均值與層次聚類(lèi)）
4、分群結(jié)果評(píng)價(jià)

一、clustering是一個(gè)顯微鏡

細(xì)胞分群結(jié)果是通過(guò)基因表達(dá)相似度的計(jì)算過(guò)程，人為貼上的標(biāo)簽；即本質(zhì)是一個(gè)數(shù)學(xué)計(jì)算問(wèn)題。
如果把分群比作一個(gè)顯微鏡，那么我們可以根據(jù)不同的放大倍數(shù)(resolution分辨率)，得到不同的結(jié)果。脫離于生物學(xué)背景知識(shí)，來(lái)談?wù)撃膫€(gè)分群結(jié)果是“最佳”的問(wèn)題，是沒(méi)有意義。（例如是想觀察細(xì)胞的primary cell type還是會(huì)specific cell sub-type。）
分群是我們分析scRNA-seq的一個(gè)工具，是真正開(kāi)始結(jié)合生物學(xué)背景知識(shí)的開(kāi)始。我們可以采用靈活采用不同的算法、分辨率獲得我們“滿意”的分群結(jié)果。

二、基于圖聚類(lèi)的分群

2.1 算法簡(jiǎn)介

可簡(jiǎn)單分為3步

（1）計(jì)算所有細(xì)胞兩兩間的距離(歐幾里得距離)，確定每個(gè)細(xì)胞的Top K nearest neighbors(KNN)；
（2）根據(jù)上述關(guān)系，計(jì)算細(xì)胞(節(jié)點(diǎn))兩兩間的相關(guān)性(邊)(節(jié)點(diǎn)之間的邊的權(quán)重)；
（3）根據(jù)保留的KNN網(wǎng)絡(luò)，劃分出內(nèi)部互相連接關(guān)系遠(yuǎn)高于內(nèi)外部互相連接關(guān)系的cluster。
如上分別對(duì)應(yīng)3個(gè)問(wèn)題：（1）選擇多少個(gè)最近鄰居；（2）如何度量細(xì)胞相關(guān)性；（3）采用什么劃分cluster的算法。

2.2 scran包分群實(shí)操

示例數(shù)據(jù)

sce.pbmc   #來(lái)源參考原教程

參數(shù)設(shè)置
為每個(gè)細(xì)胞確定10個(gè)最鄰近細(xì)胞；基于highest average rank of the shared neighbors，計(jì)算兩兩細(xì)胞間的關(guān)聯(lián)性；使用igraph包提供的Walktrap算法進(jìn)行cluster的劃分

library(scran)
g <- buildSNNGraph(sce.pbmc, k=10, use.dimred = 'PCA')
clust <- igraph::cluster_walktrap(g)$membership
table(clust)
#clust
# 1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16 
#205 508 541  56 374 125  46 432 302 867  47 155 166  61  84  16

結(jié)合t-SNE可視化cluster分布

library(scater)
colLabels(sce.pbmc) <- factor(clust)
plotReducedDim(sce.pbmc, "TSNE", colour_by="label")

值得注意的是：t-SNE二維轉(zhuǎn)換與cluster計(jì)算是基于分別PCA降維結(jié)果的獨(dú)立計(jì)算的過(guò)程。經(jīng)過(guò)上圖的可視化呈現(xiàn)，可以起到相互驗(yàn)證的作用。
但利用igrap包的系列可視化方式就是基于KNN產(chǎn)生的細(xì)胞間關(guān)系，進(jìn)行排布的，如下代碼。

set.seed(11000)
reducedDim(sce.pbmc, "force") <- igraph::layout_with_fr(g)
plotReducedDim(sce.pbmc, colour_by="label", dimred="force")

2.3 參數(shù)調(diào)整對(duì)cluster結(jié)果的影響

（1） Top n nearest neighbour的選擇，即`buildSNNGraph()`的`k=`參數(shù)設(shè)置（*）

一般來(lái)說(shuō)，k越大，簇內(nèi)互相關(guān)聯(lián)程度越緊密，即cluster越大，分得的cluster數(shù)越少；k越小，分得的cluster數(shù)就越多。

g.5 <- buildSNNGraph(sce.pbmc, k=5, use.dimred = 'PCA')
clust.5 <- igraph::cluster_walktrap(g.5)$membership
table(clust.5)
#  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22 
#523 302 125  45 172 573 249 439 293  95 772 142  38  18  62  38  30  16  15   9  16  13

g.50 <- buildSNNGraph(sce.pbmc, k=50, use.dimred = 'PCA')
clust.50 <- igraph::cluster_walktrap(g.50)$membership
table(clust.50)
#  1   2   3   4   5   6   7   8   9  10 
#869 514 194 478 539 944 138 175  89  45

（2）其它評(píng)價(jià)細(xì)胞間關(guān)聯(lián)性（權(quán)重weight）的方法

Based on the number of nearest neighbors that are shared between two cells

g.num <- buildSNNGraph(sce.pbmc, use.dimred="PCA", type="number")

Based on the Jaccard index of the two sets of neighbors.

g.jaccard <- buildSNNGraph(sce.pbmc, use.dimred="PCA", type="jaccard")

（3）其它劃分cluster的算法

igraph包提供的各種方法

clust.louvain <- igraph::cluster_louvain(g)$membership
clust.infomap <- igraph::cluster_infomap(g)$membership
clust.fast <- igraph::cluster_fast_greedy(g)$membership
clust.labprop <- igraph::cluster_label_prop(g)$membership
clust.eigen <- igraph::cluster_leading_eigen(g)$membership

Pipelines involving scran default to rank-based weights followed by Walktrap clustering. In contrast, Seurat uses Jaccard-based weights followed by Louvain clustering.

2.4 評(píng)價(jià)cluter結(jié)果

主要目的即是為了不同cluster間的分離度是否足夠明顯；
在筆記最后，會(huì)介紹一些通用的評(píng)價(jià)方法，這里介紹一種專(zhuān)門(mén)適用KNN圖聚類(lèi)的評(píng)價(jià)方法；
在之前已經(jīng)明白了計(jì)算細(xì)胞間關(guān)聯(lián)性（weight）是圖聚類(lèi)算法的重要步驟（第二步）。設(shè)A=某一cluster間任意兩兩細(xì)胞的實(shí)際關(guān)系（weight）總和；B=某一cluster間兩兩細(xì)胞的預(yù)期關(guān)系（來(lái)自于所有cluster兩兩細(xì)胞間的關(guān)系隨機(jī)分布）。若A-B的差值越大，則說(shuō)明這個(gè)cluster的內(nèi)聯(lián)度越顯著。
為了全面評(píng)價(jià)同一cluster間的內(nèi)聯(lián)度與不同cluster兩兩間的分離度，使用bluster包的pairwiseModularity()函數(shù)進(jìn)行如上的計(jì)算。唯一的區(qū)別是用比值ratio代替了差值。

ratio <- pairwiseModularity(g, clust, as.ratio=TRUE)
dim(ratio)

library(pheatmap)
pheatmap(log2(ratio+1), cluster_rows=FALSE, cluster_cols=FALSE,
    color=colorRampPalette(c("white", "blue"))(100))

如下圖，理想情況是對(duì)角線的結(jié)果最明顯，上三角的結(jié)果越小越好。

可以看到部分cluster間的關(guān)聯(lián)程度還是存在的，我們可以通過(guò)igraph進(jìn)行可視化，進(jìn)一步查看。

cluster.gr <- igraph::graph_from_adjacency_matrix(log2(ratio+1), 
                                                  mode="upper", weighted=TRUE, diag=FALSE)
# Increasing the weight to increase the visibility of the lines.
set.seed(11001010)
plot(cluster.gr, edge.width=igraph::E(cluster.gr)$weight*5,
     layout=igraph::layout_with_lgl)

三、其它分群算法

3.1 k均值法

（1）算法簡(jiǎn)介

首先確定想要分成k個(gè)cluster；然后在所有細(xì)胞中隨機(jī)抽取k個(gè)細(xì)胞作為cluster中心點(diǎn)；將中心點(diǎn)以外所有細(xì)胞分配到相距最近的中心點(diǎn)，從而形成第一次分群；計(jì)算上一次分群的新的中心點(diǎn)，將新的中心點(diǎn)以外所有細(xì)胞重新分配到相距最近的中心點(diǎn)；如此往復(fù)，直至分群結(jié)果穩(wěn)定下來(lái)。
根據(jù)上述定義，雖然k-means計(jì)算速度很快，但存在一些不適合scRNA-seq的地方。例如需要提前指定k的數(shù)目，需要多次嘗試出比較合適的值；傾向于形成的cluster呈圍繞中心點(diǎn)的圓形分布，可能不符合細(xì)胞類(lèi)型的真實(shí)分布

（2）`kmeans()`函數(shù)

set.seed(100)
clust.kmeans <- kmeans(reducedDim(sce.pbmc, "PCA"), centers=10)
table(clust.kmeans$cluster)
#  1   2   3   4   5   6   7   8   9  10 
#548  46 408 270 539 199 148 783 163 881

#結(jié)合t-SNE進(jìn)行可視化
colLabels(sce.pbmc) <- factor(clust.kmeans$cluster)
plotReducedDim(sce.pbmc, "TSNE", colour_by="label")

如上圖 k均值的分群結(jié)果在t-SNE分布還是較為理想的。

（3）評(píng)價(jià)k-means分群結(jié)果

首先計(jì)算每個(gè)cluster內(nèi)所有點(diǎn)到中心點(diǎn)的距離平方和(WCSS, within-cluster sum of squars)
然后計(jì)算每個(gè)cluster內(nèi)細(xì)胞據(jù)中心點(diǎn)的平均距離，最后開(kāi)根號(hào)的結(jié)果（RMSD，root-mean-squared-deciation）作為該cluster的評(píng)價(jià)指標(biāo)。
若cluster的RMSD越小，表明該群的分布越緊湊，即效果越好

ncells <- tabulate(clust.kmeans$cluster) #統(tǒng)計(jì)1-10群細(xì)胞數(shù)量
tab <- data.frame(wcss=clust.kmeans$withinss, ncells=ncells)
tab$rms <- sqrt(tab$wcss/tab$ncells)
tab
#       wcss      ncells   rms
# 1  20626.970    548  6.135182
# 2   6530.806     46 11.915286
# 3   9899.650    408  4.925835
# 4   6429.640    270  4.879906
# 5  12413.267    539  4.798977
# 6  13467.078    199  8.226406
# 7   4718.144    148  5.646180
# 8  27010.471    783  5.873341
# 9   7703.558    163  6.874670
# 10  9469.524    881  3.278507

可以順便可視化下基于cluster中心點(diǎn)的層次聚類(lèi)關(guān)系（關(guān)于層析聚類(lèi)算法，之后會(huì)說(shuō)一下）

cent.tree <- hclust(dist(clust.kmeans$centers), "ward.D2")
plot(cent.tree)

image.png

（4）關(guān)于中心點(diǎn)(centroid)數(shù)量k的選擇

A：關(guān)于k的最佳取值，我覺(jué)得針對(duì)scRNA-seq還是要多嘗試不同的分群結(jié)果。教程也介紹了一種可以用來(lái)選擇最佳k值的思路：計(jì)算每次分群結(jié)果的gap統(tǒng)計(jì)量，一般越高越好?？墒褂?code>cluster包的相關(guān)函數(shù)，相關(guān)代碼如下，具體原理可參考原教程。

library(cluster)
set.seed(110010101)
gaps <- clusGap(reducedDim(sce.pbmc, "PCA"), kmeans, K.max=20) #耗時(shí)
best.k <- maxSE(gaps$Tab[,"gap"], gaps$Tab[,"SE.sim"])
best.k
# 8

B：需要提前k值的k均值法可以設(shè)定一個(gè)較大的值，達(dá)到圖聚類(lèi)法難以達(dá)到的分辨率。雖然進(jìn)行生物水平的可解釋性不高，但可實(shí)現(xiàn)從所有細(xì)胞中，抽取k個(gè)有代表性表達(dá)情況的細(xì)胞的目的，用于某些特定的分析場(chǎng)景。
例如 clusterRows {bluster}提供一種聯(lián)合圖聚類(lèi)與k-均值聚類(lèi)的方法，可明顯的優(yōu)勢(shì)是相對(duì)于單純圖聚類(lèi)大大提高了分析速度。
簡(jiǎn)單舉例來(lái)說(shuō)：首先使用k均值法，獲得所有細(xì)胞的k個(gè)代表性細(xì)胞（一般取較大的值，如1000等）。然后使用圖聚類(lèi)法對(duì)這1000個(gè)中心點(diǎn)細(xì)胞矩形聚類(lèi)。最后把歸于相應(yīng)中心點(diǎn)的其余細(xì)胞分到中心點(diǎn)所在的圖聚類(lèi)結(jié)果里。
相關(guān)代碼如下

set.seed(0101010)
kgraph.clusters <- clusterRows(reducedDim(sce.pbmc, "PCA"),
    TwoStepParam(
        first=KmeansParam(centers=1000),  #1000個(gè)中心點(diǎn)
        second=NNGraphParam(k=5)  #5個(gè)最近鄰居
    )
)
table(kgraph.clusters)
#  1   2   3   4   5   6   7   8   9  10  11  12 
#191 854 506 541 541 892  46 120  29 132  47  86

3.2 層次聚類(lèi)法

（1）算法簡(jiǎn)介

第一步將每個(gè)細(xì)胞(n)看做一個(gè)單獨(dú)的cluster，然后計(jì)算所有cluter兩兩間的“距離”，將相距最近的兩個(gè)cluster歸為一個(gè)cluster；第二步再次計(jì)算（n-1）個(gè)cluster兩兩間的“距離”，將相距最近的兩個(gè)cluster歸為一個(gè)cluster；如此往復(fù)循壞，直至歸為最后一個(gè)最終的cluster。不同分群的結(jié)果即取決于距離閾值的切分（如上圖所示，閾值=4）
該過(guò)程中最關(guān)鍵步驟是如何度量cluster間的距離，尤其是對(duì)于從第二步開(kāi)始，不同cluster的細(xì)胞數(shù)是不一致的。相關(guān)算法也有很多，例如

complete linkage aims to merge clusters with the smallest maximum distance between their elements, while Ward’s method aims to minimize the increase in within-cluster variance.

層次聚類(lèi)最直接的結(jié)果就是得到如上圖所示的系統(tǒng)發(fā)生樹(shù)(dendrogram)，從某種角度代表了細(xì)胞從上至上或者從上至下的關(guān)系。但另一方面，層次聚類(lèi)法往往不適于動(dòng)輒成千上萬(wàn)的細(xì)胞的計(jì)算，相對(duì)來(lái)說(shuō)適合一些小的數(shù)據(jù)集；或者某一特定細(xì)胞群；或者結(jié)合k-均值的結(jié)果。

（2）`hclust()`函數(shù)實(shí)操

sce.416b
#含有185個(gè)細(xì)胞的sce子集。獲取方式見(jiàn)原教程

dist.416b <- dist(reducedDim(sce.416b, "PCA"))
tree.416b <- hclust(dist.416b, "ward.D2") #Ward’s method

# Making a prettier dendrogram.
library(dendextend)
tree.416b$labels <- seq_along(tree.416b$labels)
dend <- as.dendrogram(tree.416b, hang=0.1)

combined.fac <- paste0(sce.416b$block, ".", 
                       sub(" .*", "", sce.416b$phenotype))
labels_colors(dend) <- c(
  `20160113.wild`="blue",
  `20160113.induced`="red",
  `20160325.wild`="dodgerblue",
  `20160325.induced`="salmon"
)[combined.fac][order.dendrogram(dend)]

plot(dend)

“砍樹(shù)”分群：cutree()函數(shù)，有兩個(gè)參數(shù)，可分別設(shè)置。k=表示想分成多少個(gè)cluster；h=表示基于什么距離閾值(上圖的縱坐標(biāo))分隔；

cutree(dend,k=4)
cutree(dend,h=200)

“智能砍樹(shù)”：dynamicTreeCut包的cutreeDynamic()函數(shù)，可自動(dòng)尋找一個(gè)最佳的閾值，進(jìn)行分群

library(dynamicTreeCut)

# minClusterSize needs to be turned down for small datasets.
# deepSplit controls the resolution of the partitioning.
clust.416b <- cutreeDynamic(tree.416b, distM=as.matrix(dist.416b),
                            minClusterSize=10, deepSplit=1)
table(clust.416b)
# 1  2  3  4 
#78 69 24 14
labels_colors(dend) <- clust.416b[order.dendrogram(dend)]
plot(dend)

4 分群結(jié)果評(píng)價(jià)

4.1 cluter間的分離度

(1) 輪廓系數(shù) silhouette width

方法簡(jiǎn)介
對(duì)于某一個(gè)cluster的某一個(gè)細(xì)胞來(lái)說(shuō)，設(shè)A=該細(xì)胞到所在cluster所有細(xì)胞的平均距離；B=該細(xì)胞到其它所有cluster里的所有細(xì)胞的平均值的最小值(即與該細(xì)胞相距最近的其它c(diǎn)luster)。所以對(duì)于該細(xì)胞的輪廓系數(shù)=(B-A)/max{A,B}。該值越接近1，表明分群越合理；負(fù)值則表示該細(xì)胞可能是個(gè)“叛徒”
silhouette()函數(shù)計(jì)算

sil <- silhouette(clust.416b, dist = dist.416b)
plot(sil)

針對(duì)大的scRNA-seq數(shù)據(jù)集，推薦使用approxSilhouette()函數(shù)采用近似的方法計(jì)算所有細(xì)胞的輪廓系數(shù)。

# Performing the calculations on the PC coordinates, like before.
sil.approx <- approxSilhouette(reducedDim(sce.pbmc, "PCA"), clusters=clust)

sil.data <- as.data.frame(sil.approx)
sil.data$closest <- factor(ifelse(sil.data$width > 0, clust, sil.data$other))
sil.data$cluster <- factor(clust)

ggplot(sil.data, aes(x=cluster, y=width, colour=closest)) + 
  ggbeeswarm::geom_quasirandom(method="smiley")

如下圖，橫坐標(biāo)為既定的分群結(jié)果，每個(gè)點(diǎn)代表一個(gè)細(xì)胞，點(diǎn)的顏色即根據(jù)所有cluster所屬細(xì)胞的輪廓系數(shù)標(biāo)注的分群評(píng)價(jià)。如果輪廓系數(shù)為負(fù)值，即歸為相距最近的其它類(lèi)。

(2) clutering purity

簡(jiǎn)單來(lái)說(shuō)就是：在既定的分群結(jié)果里，對(duì)于一個(gè)特定cluster的每個(gè)細(xì)胞來(lái)說(shuō)，其近鄰細(xì)胞都為該cluster的比例；比例越接近1越好。
若一個(gè)cluster里的所有細(xì)胞的purity的中位數(shù)大于0.9，那么表明分群結(jié)果理想；
neighborPurity函數(shù)

pure.pbmc <- neighborPurity(reducedDim(sce.pbmc, "PCA"), clusters=clust)

pure.data <- as.data.frame(pure.pbmc)
pure.data$maximum <- factor(pure.data$maximum)
pure.data$cluster <- factor(clust)

ggplot(pure.data, aes(x=cluster, y=purity, colour=maximum)) +
    ggbeeswarm::geom_quasirandom(method="smiley")

如下圖，橫坐標(biāo)為既定的分群結(jié)果，每個(gè)點(diǎn)代表一個(gè)細(xì)胞，點(diǎn)的顏色即根據(jù)每個(gè)細(xì)胞的近鄰細(xì)胞中所屬cluster占比最多的所決定。

4.2 不同clustering 結(jié)果的比較

pheatmap熱圖形式

tab <- table(Walktrap=clust, Louvain=clust.louvain)   #2.3
#行為Walktrap， 列為L(zhǎng)ouvain
tab <- tab/rowSums(tab) #Louvain結(jié)果相對(duì)于Walktrap的分布比例(按行看)
pheatmap(tab, color=viridis::viridis(100), cluster_cols=FALSE, cluster_rows=FALSE)

clustree樹(shù)分布形式

library(clustree)
combined <- cbind(k.50=clust.50, k.10=clust, k.5=clust.5)
clustree(combined, prefix="k.", edge_arrow=FALSE)

Rand index指標(biāo)
這個(gè)指標(biāo)常用于檢測(cè)分類(lèi)模型的預(yù)測(cè)結(jié)果。例如我有2個(gè)蘋(píng)果，2個(gè)香蕉，2個(gè)芒果；根據(jù)模型對(duì)這6個(gè)水果的分類(lèi)，使用Rand index指標(biāo)表示預(yù)測(cè)結(jié)果與真實(shí)結(jié)果的相似性；
簡(jiǎn)單來(lái)說(shuō)，首先A=6個(gè)水果所有兩兩組合的可能性，即(6*5)/(2*1)=15：而B(niǎo)=所有組合的真實(shí)分布與預(yù)測(cè)分布要么均歸為1類(lèi)（蘋(píng)果1與蘋(píng)果2預(yù)測(cè)歸為1類(lèi)），要么均歸為不同類(lèi)(蘋(píng)果1與香蕉1預(yù)測(cè)歸為不同類(lèi))的次數(shù)。則Rand index=B/A，越接近1，越好。

pairwiseRand(clust, clust.5, mode="index")
# 0.7796

最后關(guān)于subclustering，即在clustering結(jié)果基礎(chǔ)上，對(duì)某一個(gè)cluster進(jìn)一步分群，是提高細(xì)胞亞群分辨率的一種方法。步驟算法基本同上，但可能也會(huì)遇到一些坑，詳見(jiàn)原教程最后，這里就不展開(kāi)介紹了。

以上學(xué)習(xí)了scRNA-seq分群涉及到的一些知識(shí)點(diǎn)。下一步就是找出每個(gè)cluster的marker gene了~

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av

OSCA單細(xì)胞數(shù)據(jù)分析筆記-9、Clustering

OSCA單細(xì)胞數(shù)據(jù)分析筆記-9、Clustering

筆記要點(diǎn)

一、clustering是一個(gè)顯微鏡

二、基于圖聚類(lèi)的分群

2.1 算法簡(jiǎn)介

2.2 scran包分群實(shí)操

2.3 參數(shù)調(diào)整對(duì)cluster結(jié)果的影響

（1） Top n nearest neighbour的選擇，即`buildSNNGraph()`的`k=`參數(shù)設(shè)置（*）

（2）其它評(píng)價(jià)細(xì)胞間關(guān)聯(lián)性（權(quán)重weight）的方法

（3）其它劃分cluster的算法

2.4 評(píng)價(jià)cluter結(jié)果

三、其它分群算法

3.1 k均值法

（1）算法簡(jiǎn)介

（2）`kmeans()`函數(shù)

（3）評(píng)價(jià)k-means分群結(jié)果

（4）關(guān)于中心點(diǎn)(centroid)數(shù)量k的選擇

3.2 層次聚類(lèi)法

（1）算法簡(jiǎn)介

（2）`hclust()`函數(shù)實(shí)操

4 分群結(jié)果評(píng)價(jià)

4.1 cluter間的分離度

(1) 輪廓系數(shù) silhouette width

(2) clutering purity

4.2 不同clustering 結(jié)果的比較

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九 欧美,1769亚洲,黄色成人av

OSCA單細(xì)胞數(shù)據(jù)分析筆記-9、Clustering

筆記要點(diǎn)

一、clustering是一個(gè)顯微鏡

二、基于圖聚類(lèi)的分群

2.1 算法簡(jiǎn)介

2.2 scran包分群實(shí)操

2.3 參數(shù)調(diào)整對(duì)cluster結(jié)果的影響

（1） Top n nearest neighbour的選擇，即buildSNNGraph()的k=參數(shù)設(shè)置（*）

（2）其它評(píng)價(jià)細(xì)胞間關(guān)聯(lián)性（權(quán)重weight）的方法

（3）其它劃分cluster的算法

2.4 評(píng)價(jià)cluter結(jié)果

三、其它分群算法

3.1 k均值法

（1）算法簡(jiǎn)介

（2）kmeans()函數(shù)

（3）評(píng)價(jià)k-means分群結(jié)果

（4）關(guān)于中心點(diǎn)(centroid)數(shù)量k的選擇

3.2 層次聚類(lèi)法

（1）算法簡(jiǎn)介

（2）hclust()函數(shù)實(shí)操

4 分群結(jié)果評(píng)價(jià)

4.1 cluter間的分離度

(1) 輪廓系數(shù) silhouette width

(2) clutering purity

4.2 不同clustering 結(jié)果的比較

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av

OSCA單細(xì)胞數(shù)據(jù)分析筆記-9、Clustering

（1） Top n nearest neighbour的選擇，即`buildSNNGraph()`的`k=`參數(shù)設(shè)置（*）

（2）`kmeans()`函數(shù)

（2）`hclust()`函數(shù)實(shí)操