R Programming/Clustering

Basic clustering

K-means clustering is available through the kmeans() function.

First, create some data:

> dat <- matrix(rnorm(1000), nrow=100, ncol=10) # 100 observations (rows), 10 variables (columns)

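To apply kmeans(), you need to specify the number of clusters. The starting centers are chosen at random, so the cluster sizes differ from run to run: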

> cl <- kmeans(dat, 3) # here 3 is the number of clusters
> table(cl$cluster)
 1  2  3 
38 44 18
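
Because kmeans() starts from randomly chosen centers, the counts above are not reproducible as written. The following is a minimal sketch of how the run could be made repeatable and how the fitted object can be inspected. The seed (42) and the number of restarts (nstart = 20) are illustrative assumptions, not values from the original example.

```r
# Illustrative values: the seed and nstart are assumptions for this sketch.
set.seed(42)
dat <- matrix(rnorm(1000), nrow = 100, ncol = 10)

# nstart runs k-means from several random starts and keeps the best fit
cl <- kmeans(dat, centers = 3, nstart = 20)

cl$size          # number of observations in each cluster
dim(cl$centers)  # one row per cluster, one column per variable
cl$tot.withinss  # total within-cluster sum of squares (lower is tighter)
```

Using several restarts usually matters more than the particular seed, since a single start can settle in a poor local optimum.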

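The basic function for hierarchical clustering is hclust(). It works on a dissimilarity structure such as the one produced by the dist() function: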

> hc <- hclust(dist(dat)) # data matrix from the example above
> plot(hc)
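
By default, dist() computes Euclidean distances and hclust() uses complete linkage. As a sketch, other choices can be passed through the method arguments of both functions. The combination shown here (Manhattan distance with average linkage) and the seed are only illustrations, not recommendations from the original text.

```r
set.seed(1)  # illustrative seed
dat <- matrix(rnorm(1000), nrow = 100, ncol = 10)

d_euc <- dist(dat)                        # Euclidean distance (default)
d_man <- dist(dat, method = "manhattan")  # city-block distance

hc_complete <- hclust(d_euc)                      # complete linkage (default)
hc_average  <- hclust(d_man, method = "average")  # average linkage

# The linkage actually used is stored in the fitted object
hc_complete$method
hc_average$method
```

Either object can be passed to plot() to compare the resulting dendrograms.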

The resulting tree can be cut with the cutree() function.

Cut it at a given height:

> cl <- cutree(hc, h=5.1)
> table(cl)
cl
 1  2  3  4  5 
23 33 29  4 11

Cut it to obtain a given number of clusters:

> cl <- cutree(hc, k=5)
> table(cl)
cl
 1  2  3  4  5 
23 33 29  4 11
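
The two tables above are identical, presumably because a cut at height 5.1 happened to leave five clusters in that tree. The sketch below shows the general relationship between the two arguments: for a given k, any height between two consecutive merge heights gives the same partition. The seed and the helper name h_mid are assumptions made for illustration.

```r
set.seed(1)  # illustrative seed
dat <- matrix(rnorm(1000), nrow = 100, ncol = 10)
hc <- hclust(dist(dat))

n <- nrow(dat)
k <- 5

# hc$height holds the n - 1 merge heights in increasing order.
# A cut between merge n - k and merge n - k + 1 leaves k clusters.
h_mid <- mean(hc$height[c(n - k, n - k + 1)])

by_k <- cutree(hc, k = k)
by_h <- cutree(hc, h = h_mid)

# Cross-tabulate the two assignments; they should agree on every observation
table(by_k, by_h)
all(by_k == by_h)
```

Cutting by k is usually more convenient when the number of groups is fixed in advance, while cutting by h is useful when a dissimilarity threshold has a meaning in the application.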