Subtract rows and create a new row name
I want to subtract Bay County from Florida in this data frame and create a new row named "Florida (-Bay County)".
Maybe this is possible with group_modify and add_row (dplyr)?
year <- c(2005,2006,2007,2005,2006,2007,2005,2006,2007,2005,2006,2007)
county <- c("Alachua County","Alachua County","Alachua County","Baker County","Baker County","Baker County","Bay County","Bay County","Bay County","Florida","Florida","Florida")
pop <- c(3,6,8,9,8,4,5,8,10,17,22,22)
gdp <- c(3,6,8,9,8,4,5,8,10,17,22,22)
area <- c(3,6,8,9,8,4,5,8,10,17,22,22)
density<-c(3,6,8,9,8,4,5,8,10,17,22,22)
df <- data.frame(year, county,pop,gdp,area,density, stringsAsFactors = FALSE)
year | county | pop | gdp | area | density |
---|---|---|---|---|---|
2005 | Alachua County | 3 | 3 | 3 | 3 |
2005 | Baker County | 9 | 9 | 9 | 9 |
2005 | Bay County | 5 | 5 | 5 | 5 |
2005 | Florida | 17 | 17 | 17 | 17 |
2005 | Florida (-Bay County) | 12 | 12 | 12 | 12 |
2006 | Alachua County | 6 | 6 | 6 | 6 |
2006 | Baker County | 8 | 8 | 8 | 8 |
2006 | Bay County | 8 | 8 | 8 | 8 |
2006 | Florida | 22 | 22 | 22 | 22 |
2006 | Florida (-Bay County) | 14 | 14 | 14 | 14 |
2007 | Alachua County | 8 | 8 | 8 | 8 |
2007 | Baker County | 4 | 4 | 4 | 4 |
2007 | Bay County | 10 | 10 | 10 | 10 |
2007 | Florida | 22 | 22 | 22 | 22 |
2007 | Florida (-Bay County) | 12 | 12 | 12 | 12 |
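You can do it like this:
library(dplyr)

df %>%
  # drop the state total and the county being excluded
  filter(county != 'Florida' & county != 'Bay County') %>%
  group_by(year) %>%
  # append one summary row per year holding the sum of the remaining counties
  bind_rows(summarise(., county = 'Florida (-Bay County)',
                      across(where(is.numeric), sum))) %>%
  arrange(year)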
#> # A tibble: 9 x 6
#> # Groups:   year [3]
#>    year county                  pop   gdp  area density
#>   <dbl> <chr>                 <dbl> <dbl> <dbl>   <dbl>
#> 1  2005 Alachua County            3     3     3       3
#> 2  2005 Baker County              9     9     9       9
#> 3  2005 Florida (-Bay County)    12    12    12      12
#> 4  2006 Alachua County            6     6     6       6
#> 5  2006 Baker County              8     8     8       8
#> 6  2006 Florida (-Bay County)    14    14    14      14
#> 7  2007 Alachua County            8     8     8       8
#> 8  2007 Baker County              4     4     4       4
#> 9  2007 Florida (-Bay County)    12    12    12      12
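Note that the pipeline above drops the "Florida" and "Bay County" rows and sums the remaining counties. That matches the requested numbers only because, in this toy data, the Florida total equals the sum of the three listed counties. A minimal sketch of a literal subtraction that also keeps every original row might look like the following; the names `vals`, `florida_minus_bay` and `result` are hypothetical, and it assumes each year has exactly one "Florida" row and one "Bay County" row.

```r
library(dplyr)

# Rebuild the question's example data so the sketch is self-contained.
vals <- c(3, 6, 8, 9, 8, 4, 5, 8, 10, 17, 22, 22)
df <- data.frame(
  year    = rep(c(2005, 2006, 2007), times = 4),
  county  = rep(c("Alachua County", "Baker County", "Bay County", "Florida"),
                each = 3),
  pop     = vals,
  gdp     = vals,
  area    = vals,
  density = vals,
  stringsAsFactors = FALSE
)

# One row per year: Florida minus Bay County for every numeric column.
# across() comes first so that `county` inside the lambda still refers to
# the original column, not the new label assigned afterwards.
florida_minus_bay <- df %>%
  filter(county %in% c("Florida", "Bay County")) %>%
  group_by(year) %>%
  summarise(
    across(where(is.numeric),
           ~ .x[county == "Florida"] - .x[county == "Bay County"]),
    county = "Florida (-Bay County)",
    .groups = "drop"
  )

# Append the new rows to the untouched original data and sort by year.
result <- bind_rows(df, florida_minus_bay) %>%
  arrange(year, county)
```

Because the difference is computed from the "Florida" row itself, this version does not depend on every county being present in the data.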
If you want to try using group_modify and add_row, you could consider something like this. Here, when calling add_row, map is used to sum the data within each group, excluding "Florida" and "Bay County".
library(tidyverse)

df %>%
  group_by(year) %>%
  group_modify(
    ~ .x %>%
      add_row(
        county = "Florida (-Bay County)",
        !!! map(.x %>%
                  filter(!county %in% c("Florida", "Bay County")) %>%
                  select(-county),
                sum)
      )
  )
Output
    year county                  pop   gdp  area density
   <dbl> <chr>                 <dbl> <dbl> <dbl>   <dbl>
 1  2005 Alachua County            3     3     3       3
 2  2005 Baker County              9     9     9       9
 3  2005 Bay County                5     5     5       5
 4  2005 Florida                  17    17    17      17
 5  2005 Florida (-Bay County)    12    12    12      12
 6  2006 Alachua County            6     6     6       6
 7  2006 Baker County              8     8     8       8
 8  2006 Bay County                8     8     8       8
 9  2006 Florida                  22    22    22      22
10  2006 Florida (-Bay County)    14    14    14      14
11  2007 Alachua County            8     8     8       8
12  2007 Baker County              4     4     4       4
13  2007 Bay County               10    10    10      10
14  2007 Florida                  22    22    22      22
15  2007 Florida (-Bay County)    12    12    12      12
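To see what the `!!! map(...)` step contributes, it may help to run it on a single year's data outside of group_modify. The sketch below uses a hypothetical two-column data frame `one_year` (a cut-down version of the 2005 group) and checks that splicing the named list returned by map is equivalent to writing the column values out by hand.

```r
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical single-group slice, shaped like what group_modify passes as .x
# (the grouping column `year` is not included in .x).
one_year <- data.frame(
  county = c("Alachua County", "Baker County", "Bay County", "Florida"),
  pop    = c(3, 9, 5, 17),
  gdp    = c(3, 9, 5, 17),
  stringsAsFactors = FALSE
)

# map() over a data frame iterates over its columns, so this returns a
# named list with one sum per remaining numeric column.
sums <- one_year %>%
  filter(!county %in% c("Florida", "Bay County")) %>%
  select(-county) %>%
  map(sum)

# `!!!` splices that list into add_row() as separate named arguments ...
spliced <- add_row(one_year, county = "Florida (-Bay County)", !!!sums)

# ... which should be equivalent to spelling the arguments out explicitly.
explicit <- add_row(one_year, county = "Florida (-Bay County)",
                    pop = 12, gdp = 12)
```

The benefit of splicing is that the call does not have to name the numeric columns, so the same code keeps working if columns are added or removed.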