Compute centroid for each group using dplyr
For each cluster in temp3, compute its centroid. Ultimately, I want to plot the cluster number at its centroid coordinates.
Data:
> head(temp3)
X Y Transcripts Genes Timepoint Run Cluster
6B_0_GACCGCGATATT -102.1425877 13.944831 134028 11269 Day 0 6B 2
6B_0_ATTGCGGAGACA -38.6617527 0.600154 106849 10947 Day 0 6B 3
6B_0_ATGGTCACCACT -23.3275424 34.178312 105817 10495 Day 0 6B 4
6B_0_ATATTGCTAATC -0.6069128 52.449397 79920 9650 Day 0 6B 4
6B_0_ATCTAATCTACC -0.4738788 54.756711 72912 9294 Day 0 6B 4
6B_0_CGCAGTGTGCCC 108.5333675 76.637930 70132 9291 Day 0 6B 6
Code:
library(dplyr)
temp3 %>% group_by(Cluster) %>% mutate(., Centroid=rowMeans(cbind(.$X, .$Y), na.rm = TRUE))
which returns:
Error: incompatible size (13792), expecting 198 (the group size) or 1
Edit:
Another approach:
library(cluster)
temp3 %>% group_by(Cluster) %>% mutate(., Centroid=pam(cbind(.$X, .$Y), 1)$medoids)
returns:
Error: incompatible size (2), expecting 198 (the group size) or 1
How about
temp3 %>% group_by(Cluster) %>% mutate(meanX=mean(X), meanY=mean(Y))
if you want a result with the same dimensions as the input.
Or, if you want just one row per cluster (which seems more likely):
temp3 %>% group_by(Cluster) %>% summarise(meanX=mean(X), meanY=mean(Y))
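The original attempts most likely fail because `.` inside the pipe refers to the whole data frame rather than the current group, so `.$X` has 13792 elements where `mutate` expects 198 (or 1). Referring to the columns by bare name, as above, keeps the computation per group. A minimal sketch of the full workflow follows, assuming a made-up toy data frame (`toy`, a hypothetical stand-in for temp3) and base graphics for the labels; the plotting calls are one possible choice, not something the question specifies.

```r
library(dplyr)

# Hypothetical toy data standing in for temp3 (values are made up)
toy <- data.frame(
  X = c(0, 2, 4, 10, 12),
  Y = c(1, 3, 5, 20, 22),
  Cluster = c(1, 1, 1, 2, 2)
)

# One row per cluster: the centroid is the per-cluster mean of X and Y
centroids <- toy %>%
  group_by(Cluster) %>%
  summarise(meanX = mean(X), meanY = mean(Y))

# Points coloured by cluster, with the cluster number drawn at each centroid
plot(toy$X, toy$Y, col = toy$Cluster, pch = 19)
text(centroids$meanX, centroids$meanY, labels = centroids$Cluster, cex = 2)
```

With the real data, replacing `toy` with temp3 should give one labelled centroid per value of Cluster.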