Some issues with pivot_wider and pivot_longer in R
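Below is sample data and one manipulation I have tried. I have done something similar before, and code like this did the job then, but it does not now. First question: do I need pivot_longer at all? Second: why am I getting NAs?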
areaname<-c("Clark County","Clark County","Clark County","Clark County","Clark County","Clark County","Clark County","Clark County","Clark County","Clark County","Someplace","Someplace","Someplace","Someplace","Someplace","Someplace","Someplace","Someplace","Someplace","Someplace")
periodyear<-c(2011,2012,2013,2014,2015,2016,2017,2018,2019,2020,2011,2012,2013,2014,2015,2016,2017,2018,2019,2020)
annualavg<-c(17.56,18.66,19.25,20.35,21.45,22.33,22.44,32.15,33.14,47.555,17.59,18.99,19.33,2.35,88.45,2.33,29.44,36.15,39.14,47.51)
table<-data.frame(areaname,periodyear,annualavg)
table$annualavgr <- round(table$annualavg,digits = 0)
library(dplyr)
library(tidyr)
library(gt)
chart17 <- table %>%
  dplyr::select("areaname", "periodyear", "annualavg", "annualavgr") %>%
  ungroup() %>%
  pivot_longer(cols = annualavgr, names_to = "measure", values_to = "value") %>%
  group_by(areaname, measure) %>%
  pivot_wider(names_from = periodyear, values_from = value) %>%
  gt()
Desired end result (or something close to it)
2011 2012 2013 2014 and so on....
Clark County 18 19 19 20
2011 2012 2013 2014
Someplace 18 19 19 2
We need to use both columns in pivot_longer
library(dplyr)
library(tidyr)
table %>%
dplyr::select("areaname","periodyear","annualavg","annualavgr")%>%
ungroup() %>%
pivot_longer(cols = c(annualavg, annualavgr),
names_to = "measure", values_to = "value") %>%
pivot_wider(names_from = periodyear, values_from = value)
-output
# A tibble: 4 x 12
areaname measure `2011` `2012` `2013` `2014` `2015` `2016` `2017` `2018` `2019` `2020`
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Clark County annualavg 17.6 18.7 19.2 20.4 21.4 22.3 22.4 32.2 33.1 47.6
2 Clark County annualavgr 18 19 19 20 21 22 22 32 33 48
3 Someplace annualavg 17.6 19.0 19.3 2.35 88.4 2.33 29.4 36.2 39.1 47.5
4 Someplace annualavgr 18 19 19 2 88 2 29 36 39 48
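As for why the original attempt returned NAs, one likely explanation is that 'annualavg' was left out of pivot_longer, so pivot_wider treated it as an identifier column. Every 'annualavg' value in the sample data is distinct, so each row stays on its own line and only one year column per row gets filled. Here is a minimal sketch of that behaviour; the name `dat` and the use of rep() to build the data are my own shorthand, not from the question, and the group_by and gt() steps are left out because they do not affect the shape.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question, rebuilt under the hypothetical name `dat`
dat <- data.frame(
  areaname   = rep(c("Clark County", "Someplace"), each = 10),
  periodyear = rep(2011:2020, times = 2),
  annualavg  = c(17.56, 18.66, 19.25, 20.35, 21.45, 22.33, 22.44, 32.15, 33.14, 47.555,
                 17.59, 18.99, 19.33, 2.35, 88.45, 2.33, 29.44, 36.15, 39.14, 47.51)
)
dat$annualavgr <- round(dat$annualavg, digits = 0)

# Only annualavgr is lengthened; annualavg stays behind as an extra id column
wide_na <- dat %>%
  pivot_longer(cols = annualavgr, names_to = "measure", values_to = "value") %>%
  pivot_wider(names_from = periodyear, values_from = value)

nrow(wide_na)        # one row per original observation, not one per area
sum(is.na(wide_na))  # every row has a value in a single year column only
```

Either lengthening both columns, as above, or dropping 'annualavg' before widening removes the stray identifier.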
If we need the mean of the two 'annual' columns, then
table %>%
dplyr::select("areaname","periodyear","annualavg","annualavgr")%>%
ungroup() %>%
pivot_longer(cols = c(annualavg, annualavgr),
names_to = "measure", values_to = "value") %>%
pivot_wider(names_from = periodyear, values_from = value) %>%
group_by(areaname) %>%
summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)))
-output
# A tibble: 2 x 11
areaname `2011` `2012` `2013` `2014` `2015` `2016` `2017` `2018` `2019` `2020`
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Clark County 17.8 18.8 19.1 20.2 21.2 22.2 22.2 32.1 33.1 47.8
2 Someplace 17.8 19.0 19.2 2.17 88.2 2.16 29.2 36.1 39.1 47.8
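Since each area and year has exactly one 'annualavg' and one 'annualavgr' here, the same means could probably also be computed row-wise before widening, which skips pivot_longer. This is only a sketch under that assumption; `dat`, `annual` and `annual_mean` are names I made up for illustration.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question, rebuilt under the hypothetical name `dat`
dat <- data.frame(
  areaname   = rep(c("Clark County", "Someplace"), each = 10),
  periodyear = rep(2011:2020, times = 2),
  annualavg  = c(17.56, 18.66, 19.25, 20.35, 21.45, 22.33, 22.44, 32.15, 33.14, 47.555,
                 17.59, 18.99, 19.33, 2.35, 88.45, 2.33, 29.44, 36.15, 39.14, 47.51)
)
dat$annualavgr <- round(dat$annualavg, digits = 0)

# Average the raw and rounded value per row, then spread the years into columns.
# id_cols keeps only areaname as the row identifier.
annual_mean <- dat %>%
  mutate(annual = (annualavg + annualavgr) / 2) %>%
  pivot_wider(id_cols = areaname, names_from = periodyear, values_from = annual)

annual_mean
```

If an area could have several rows per year, this shortcut would no longer match the summarise approach, so the pivot_longer version is the safer general form.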
If we need only the single column 'annualavgr', then pivot_longer is not needed; just use select to leave out 'annualavg' before pivot_wider
table %>%
dplyr::select("areaname","periodyear","annualavgr")%>%
ungroup() %>%
pivot_wider(names_from = periodyear, values_from = annualavgr)
# A tibble: 2 x 11
areaname `2011` `2012` `2013` `2014` `2015` `2016` `2017` `2018` `2019` `2020`
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Clark County 18 19 19 20 21 22 22 32 33 48
2 Someplace 18 19 19 2 88 2 29 36 39 48
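A variation on the same idea, in case dropping columns with select is not convenient: pivot_wider has an id_cols argument that names the identifier columns explicitly, so any column not listed there (and not used in names_from or values_from) is discarded. The sketch below assumes the question's data rebuilt under the hypothetical name `dat`.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question, rebuilt under the hypothetical name `dat`
dat <- data.frame(
  areaname   = rep(c("Clark County", "Someplace"), each = 10),
  periodyear = rep(2011:2020, times = 2),
  annualavg  = c(17.56, 18.66, 19.25, 20.35, 21.45, 22.33, 22.44, 32.15, 33.14, 47.555,
                 17.59, 18.99, 19.33, 2.35, 88.45, 2.33, 29.44, 36.15, 39.14, 47.51)
)
dat$annualavgr <- round(dat$annualavg, digits = 0)

# areaname is the only row identifier, so annualavg cannot split the rows
wide_r <- dat %>%
  pivot_wider(id_cols = areaname, names_from = periodyear, values_from = annualavgr)

wide_r
```

The result should be the same one-row-per-area table as above and could then be piped on to gt() as in the question.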