Error in dplyr when using nested group_by
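I am trying to perform several operations with dplyr, but I am stuck and cannot see what I am doing wrong.
The first fifteen rows of my data frame (60760 observations) look like this: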
df <- structure(list(
  dem_sect = structure(c(4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 1L),
                       .Label = c("AB", "EP", "FE", "MF", "PA"), class = "factor"),
  area = c(1181.16, 1181.16, 2190.28, 2190.28, 956.08, 2190.28, 1181.16, 2190.28, 956.08, 2190.28, 2190.28, 1181.16, 2190.28, 956.08, 921.47),
  peso_kg = c(0.184, 0.674, 0.1, 0.152, 0.21, 0.104, 4.31, 0.048, 0.242, 0.724, 0.126, 1.13, 0.13, 0.048, 0.075),
  sector = c("MFa", "MFa", "MFb", "MFb", "MFc", "MFb", "MFa", "MFb", "MFc", "MFb", "MFb", "MFa", "MFb", "MFc", "ABb")),
  row.names = c(NA, 15L), class = "data.frame")
This is the code I am using:
test <- df %>%
  group_by(sector) %>%
  summarise(md_area = mean(area),
            md_peso = mean(peso_kg),
            se_loc = sqrt(var(peso_kg)) / sqrt(length(peso_kg)),
            cv_loc = sd(peso_kg) / mean(peso_kg) * 100) %>%
  group_by(dem_sect) %>%
  mutate(sum_sect = sum(SEloc * CVloc))
I also tried mutate(sum_sect = sum(mean(area) * mean(peso_kg))) instead of mutate(sum_sect = sum(SEloc * CVloc)), but in both cases I get the following error:
Error in `group_by()`:
! Must group by variables found in `.data`.
x Column `dem_sect` is not found.
I have tried several approaches without success, and I cannot see what I am doing wrong.
Any hints would be very welcome. Thanks in advance.
If you want one observation for each unique value of dem_sect, then:
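library(dplyr)

test <- df %>%
  group_by(sector) %>%
  # mutate() keeps every original column, including dem_sect
  mutate(
    md_area = mean(area),
    md_peso = mean(peso_kg),
    se_loc = sqrt(var(peso_kg)) / sqrt(length(peso_kg)),
    cv_loc = sd(peso_kg) / mean(peso_kg) * 100
  ) %>%
  ungroup() %>%
  group_by(dem_sect) %>%
  # use the column names created above (se_loc, cv_loc)
  summarize(sum_sect = sum(se_loc * cv_loc))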
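If one row per sector is wanted instead, a possible alternative is to group by both columns in the first step, so that summarise() carries dem_sect along. This is only a sketch: it assumes each sector belongs to exactly one dem_sect, and the small data frame below is made up for illustration rather than taken from the full data.

```r
library(dplyr)

# Toy data, invented for this example (not the asker's 60760 rows)
df <- data.frame(
  dem_sect = factor(c("MF", "MF", "MF", "MF", "AB", "AB")),
  sector   = c("MFa", "MFa", "MFb", "MFb", "ABb", "ABb"),
  area     = c(1181.16, 1181.16, 2190.28, 2190.28, 921.47, 921.47),
  peso_kg  = c(0.184, 0.674, 0.1, 0.152, 0.075, 0.2)
)

# Grouping by dem_sect as well keeps that column in the summary
per_sector <- df %>%
  group_by(dem_sect, sector) %>%
  summarise(
    md_area = mean(area),
    md_peso = mean(peso_kg),
    se_loc  = sd(peso_kg) / sqrt(n()),
    cv_loc  = sd(peso_kg) / mean(peso_kg) * 100,
    .groups = "drop"
  )

# dem_sect still exists here, so the second grouping works
test <- per_sector %>%
  group_by(dem_sect) %>%
  mutate(sum_sect = sum(se_loc * cv_loc)) %>%
  ungroup()
```

Here sum_sect is summed over the per-sector rows, so each sector contributes once, whereas the mutate() version above sums over every original observation. Which of the two is wanted depends on the analysis.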
I think SEloc and CVloc do not exist; shouldn't they be se_loc and cv_loc?
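The reported error itself has a separate cause: summarise() returns only the grouping column plus the newly computed summaries, so dem_sect is gone by the time the second group_by() runs. A quick way to check this, sketched with a small made-up data frame:

```r
library(dplyr)

# Toy data, invented for this example
df <- data.frame(
  dem_sect = factor(c("MF", "MF", "AB")),
  sector   = c("MFa", "MFa", "ABb"),
  peso_kg  = c(0.184, 0.674, 0.075)
)

by_sector <- df %>%
  group_by(sector) %>%
  summarise(md_peso = mean(peso_kg))

names(by_sector)                  # "sector" "md_peso"
"dem_sect" %in% names(by_sector)  # FALSE
```

Inspecting names() after each step of a pipeline is a simple way to spot both dropped columns and misspelled ones.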