Tidyr's "nest" function in R doesn't recognize a variable and prints: "Warning message: Unknown or uninitialised column"
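I am working with a dataset that has a column of country codes named "ccode".
To create another column with the country names, called "country", I used the function "countrycode" from the countrycode package (installed from CRAN), as follows:
votes_processed <- votes %>%
  filter(vote <= 3) %>%
  mutate(year = session + 1945,
         country = countrycode(ccode, "cown", "country.name"))
This produced the following warning message: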
Warning message:
In countrycode(ccode, "cown", "country.name") :
Some values were not matched unambiguously: 260, 816
Since no country name could be assigned to these country codes, I filtered them out of the data frame:
> table(is.na(votes_processed$country))
FALSE TRUE
350844 2703
> votes_processed <- filter(votes_processed,!is.na(country))
> table(is.na(votes_processed$country))
FALSE
350844
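After that, I ran the following command to create another tibble that gives me, grouped by year and country, the total number of votes and the proportion of "yes" votes (1 = yes):
# Group by year and country: by_year_country
by_year_country <- votes_processed %>%
  group_by(year, country) %>%
  summarize(total = n(),
            percent_yes = mean(vote == 1))
Then I ran the following command to nest the data by country. The console printed the warning below and my country column was dropped: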
> nested <- by_year_country %>%
+ nest(-country)
Warning message:
Unknown or uninitialised column: 'country'.
> nested$country
NULL
Warning messages:
1: Unknown or uninitialised column: 'country'.
2: Unknown or uninitialised column: 'country'.
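Can someone explain what happened to this "country" column, why R doesn't recognize it, and what I can do about it?
I am a beginner on this platform. I received a comment asking for a data sample, so I am pasting it here: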
rcid<-c(5168,4317,3598,2314,1220,5024,3151,2042,2513,238,4171,3748,2595,
5160,4476,308,3621,874,2025,3793,3595,1191,987,1207,2255,211,
2585,2319,3590,189)
session<- c(66,56,46,36,26,64,42,34,38,4,54,48,38,66,58,6,46,18,34,
48,46,26,22,26,36,4,38,36,46,4)
vote<- c(1,8,1,8,9,1,3,2,2,9,2,1,3,1,1,1,1,1,1,1,1,1,9,2,1,9,1,1,1,2)
ccode<-as.integer(c(816,816,816,816,816,816,260,260,260,260,2,42,2,20,
31,41,20,42,41,31,70,95,80,93,58,51,53,90,55,90))
sample_data_votes<-data.frame("rcid"=rcid,"session"=session, "vote"= vote,
"ccode"=ccode)
Thank you very much for your time and advice.
by_year_country is still grouped, so you need to ungroup it first and then nest:
library(tidyverse)
by_year_country %>%
  ungroup() %>%
  nest(-country) %>%
  head(n = 2)
# A tibble: 2 x 2
country data
<chr> <list>
1 Guatemala <tibble [2 x 3]>
2 Haiti <tibble [2 x 3]>
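The same ungroup-then-nest pattern can be sketched on a small made-up tibble. Here `toy` and its values are hypothetical stand-ins for the asker's data, and the named form `nest(data = -country)` assumes tidyr 1.0.0 or later, where the unnamed `nest(-country)` form may produce a deprecation warning.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for votes_processed
toy <- tibble(
  year    = c(1947, 1947, 1949, 1949, 1949),
  country = c("Haiti", "Guatemala", "Haiti", "Guatemala", "Guatemala"),
  vote    = c(1, 2, 1, 1, 3)
)

by_year_country <- toy %>%
  group_by(year, country) %>%
  summarize(total = n(), percent_yes = mean(vote == 1))

# Remove the leftover grouping, then nest every column except country
# (assumes tidyr >= 1.0.0 for the named `data =` argument)
nested <- by_year_country %>%
  ungroup() %>%
  nest(data = -country)

nested_cols <- names(nested)
```

After this, `nested$country` should return the country names rather than NULL, because country is kept as an ordinary column next to the list-column `data`.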
It looks like you need to remove the -country part from the call to nest.
library(dplyr)
library(tidyr)
library(countrycode)
rcid<-c(5168,4317,3598,2314,1220,5024,3151,2042,2513,238,4171,3748,2595,
5160,4476,308,3621,874,2025,3793,3595,1191,987,1207,2255,211,
2585,2319,3590,189)
session<- c(66,56,46,36,26,64,42,34,38,4,54,48,38,66,58,6,46,18,34,
48,46,26,22,26,36,4,38,36,46,4)
vote<- c(1,8,1,8,9,1,3,2,2,9,2,1,3,1,1,1,1,1,1,1,1,1,9,2,1,9,1,1,1,2)
ccode<-as.integer(c(816,816,816,816,816,816,260,260,260,260,2,42,2,20,
31,41,20,42,41,31,70,95,80,93,58,51,53,90,55,90))
votes<-data.frame("rcid"=rcid,"session"=session, "vote"= vote,
"ccode"=ccode)
votes_processed <- votes %>%
  filter(vote <= 3) %>%
  mutate(year = session + 1945,
         country = countrycode(ccode, "cown", "country.name")) %>%
  filter(!is.na(country))
by_year_country <- votes_processed %>%
  group_by(year, country) %>%
  summarize(total = n(),
            percent_yes = mean(vote == 1))
nested <- by_year_country %>%
  nest()
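Including -country tells nest to nest every column except country. By default, nest uses all columns except the grouping columns. by_year_country is a tibble grouped by year: the summarize call removed one level of grouping, so it is no longer grouped by country but is still grouped by year. If you want to remove the grouping entirely, use ungroup().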
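The grouping behaviour described above can be checked directly with group_vars(). This is a minimal sketch on made-up data (`toy` is hypothetical). The `.groups = "drop"` argument is an assumption about the installed version: it is available in dplyr 1.0.0 and later, and is an alternative to calling ungroup() afterwards.

```r
library(dplyr)

# Hypothetical toy data, not the asker's votes
toy <- tibble(
  year    = c(1947, 1947, 1949),
  country = c("Haiti", "Guatemala", "Haiti"),
  vote    = c(1, 2, 1)
)

grouped <- toy %>% group_by(year, country)

# Default: summarize() peels off only the last grouping variable (country)
peeled <- grouped %>% summarize(total = n())
groups_after_default <- group_vars(peeled)

# Assumes dplyr >= 1.0.0: .groups = "drop" returns an ungrouped tibble
dropped <- grouped %>% summarize(total = n(), .groups = "drop")
groups_after_drop <- group_vars(dropped)
```

Here `groups_after_default` still contains "year", which is the leftover grouping that interferes with nesting by country, while `groups_after_drop` is empty.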