How to dynamically create variables and combine them into a data frame in R?
I am running kmeans for several numbers of clusters and then trying to combine the cluster results into the original data frame.
Following the post https://stats.stackexchange.com/questions/10838/produce-a-list-of-variable-name-in-a-for-loop-then-assign-values-to-the I am using the code mentioned there to dynamically create variables, modified for my needs.
Original code from the post above:
x <- as.list(rnorm(10000))
names(x) <- paste("a", 1:length(x), sep = "")
list2env(x , envir = .GlobalEnv)
Now applying it to the iris data:
library(tidyverse)
library(ggthemes)
library(factoextra)
This works fine when creating the lists for 3 clusters:
# running for 1 to 3 clusters
lapply(1:3,
       function(cluster_num){
         cluster_res_list <- as.list(kmeans(iris %>% select(-Species), cluster_num, nstart = 25))
         names(cluster_res_list) <- paste("iris_clus", 1:length(cluster_res_list), sep="_")
         list2env(cluster_res_list, envir = .GlobalEnv)
         # iris_df <- cbind(iris, cluster_res_list)
       } )
Issue: when I try to combine them with the original dataset, I get an error: Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) : cannot coerce class ‘"kmeans"’ to a data.frame
lapply(1:3,
       function(cluster_num){
         cluster_res_list <- as.list(kmeans(iris %>% select(-Species), cluster_num, nstart = 25))
         names(cluster_res_list) <- paste("iris_clus", 1:length(cluster_res_list), sep="_")
         list2env(cluster_res_list, envir = .GlobalEnv)
         # to combine each cluster result to original df
         iris_df <- cbind(iris, cluster_res_list)
       } )
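You can use the fitted function to view the output of kmeans as a matrix. The row names of that matrix identify the cluster each observation was assigned to. If you want to add a column to the original data frame that identifies the cluster assignment, something like the following will work.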
Taking 3 clusters as an example:
cluster_num <- 3
iris %>%
  select(-Species) %>%
  kmeans(centers = cluster_num, nstart = 25) %>%
  fitted() %>%
  row.names() %>%
  tibble(iris_clus = .) %>%
  cbind(iris) %>%
  tail()
iris_clus Sepal.Length Sepal.Width Petal.Length Petal.Width Species
145 2 6.7 3.3 5.7 2.5 virginica
146 2 6.7 3.0 5.2 2.3 virginica
147 1 6.3 2.5 5.0 1.9 virginica
148 2 6.5 3.0 5.2 2.0 virginica
149 2 6.2 3.4 5.4 2.3 virginica
150 1 5.9 3.0 5.1 1.8 virginica
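As a possible alternative to going through fitted, the object returned by kmeans also stores the per-row assignments in its cluster component, an integer vector with one entry per observation, which can be bound to iris directly. The sketch below is only an illustration: the set.seed call and the names num_cols, km, same_labels and iris_with_clus are assumptions added here, not part of the original answer, and the cluster numbering can differ from the output shown above.

```r
# Assumption: set.seed() is added only so that repeated runs give the same labels.
set.seed(42)

# Keep the four numeric columns; Species is dropped by name (base R only).
num_cols <- iris[, setdiff(names(iris), "Species")]
km <- kmeans(num_cols, centers = 3, nstart = 25)

# km$cluster holds one integer cluster label per row of iris.
iris_with_clus <- cbind(iris_clus = km$cluster, iris)

# The row names of fitted(km) carry the same labels, stored as character.
same_labels <- identical(as.character(km$cluster), row.names(fitted(km)))

tail(iris_with_clus)
```

One difference worth noting: with this approach iris_clus is an integer column, whereas the row names taken from fitted are character strings.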
Plugging that into the lapply from your example:
lapply(1:3, function(cluster_num) {
  iris %>%
    select(-Species) %>%
    kmeans(centers = cluster_num, nstart = 25) %>%
    fitted() %>%
    row.names() %>%
    tibble(iris_clus = .) %>%
    cbind(iris)
})
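The lapply call above returns an unnamed list of three data frames. If the goal of creating variables dynamically is just to reach each result by name, one option is to name the list elements instead of writing objects into the global environment. This is a rough sketch under that assumption; the names ks, num_cols and res, the use of km$cluster instead of fitted, and the set.seed call are all choices made here for illustration.

```r
# Assumption: set.seed() is added only for reproducibility.
set.seed(42)

num_cols <- iris[, setdiff(names(iris), "Species")]
ks <- 1:3

# One data frame per number of clusters, each with its own iris_clus column.
res <- lapply(ks, function(k) {
  km <- kmeans(num_cols, centers = k, nstart = 25)
  cbind(iris_clus = km$cluster, iris)
})

# Name the list elements iris_clus_1, iris_clus_2, iris_clus_3.
names(res) <- paste("iris_clus", ks, sep = "_")

# Access a single result by name rather than through a global variable.
head(res[["iris_clus_3"]])
```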
Here is one way to combine everything into a single dataset, with one column per model:
clusters <- Reduce(cbind, lapply(1:3, function(cluster_num) {
  result <- iris %>%
    select(-Species) %>%
    kmeans(centers = cluster_num, nstart = 25) %>%
    fitted() %>%
    row.names() %>%
    tibble(iris_clus = .)
  names(result) <- paste("iris_clus", cluster_num, sep = "_")
  return(result)
}))
cbind(iris, clusters)
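The same one-column-per-model layout can probably also be reached in base R with sapply, which collects the cluster vectors into a matrix with one column per value of k. This is a minimal sketch, assuming integer cluster labels are acceptable; ks, num_cols and iris_df are names chosen here, and set.seed is added only for reproducibility.

```r
# Assumption: set.seed() is added only for reproducibility.
set.seed(42)

num_cols <- iris[, setdiff(names(iris), "Species")]
ks <- 1:3

# Each call returns 150 integer labels; sapply() binds them into a 150 x 3 matrix.
clusters <- sapply(ks, function(k) {
  kmeans(num_cols, centers = k, nstart = 25)$cluster
})

# Build the column names dynamically from the number of clusters.
colnames(clusters) <- paste("iris_clus", ks, sep = "_")

# cbind() on a data frame and a matrix adds one column per matrix column.
iris_df <- cbind(iris, clusters)
str(iris_df)
```

Because the column names are built from ks, changing ks to another range such as 2:6 should add the matching columns without further edits.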