R: Combine every two rows into one for multiple columns in a data frame
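I have a dataset that I am trying to tidy using different approaches. As a first step, I want to combine every two rows in each column into a single row, as shown in the desired output.
How can I do this in R in a tidy way?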
Sample data
Date = c("SB", "1/4/2021", "HC/SB", "1/5/2021",
         "NC", "1/6/2021", "HC", "1/13/2021")
Date_Approved = c(" ", "1/4/2021", " ", "1/8/2021",
                  " ", "1/12/2021", " ", "1/15/2021")
SR = c(" ", "1A", " ", "1B", " ", "1C", " ", "1D")
Permit = c(" ", "AAA", " ", "BBB", " ", "CCC", " ", "DDD")
# 8 entries each, so that every column has the same length for data.frame()
Owner_Agent = c("Joe", "Joey", "Ross", "Chandler",
                "Monica", "Rachel", "Ed", "Eddy")
Address = c("1111 W. Broward Boulevard", "Plantation, 33333",
            "2222 N 23 Avenue", "Hollywood, FL 33322",
            "3333 Taylor Street", "Hollywood, 33311",
            "44444 NW 19th St", "Pembroke Pines, 33300")
The raw data looks like this:
Desired output
Date            Date_Approved SR Permit Owner_Agent
SB 1/4/2021     1/4/2021      1A AAA    Joe, Joey
HC/SB 1/5/2021  1/8/2021      1B BBB    Ross, Chandler
NC 1/6/2021     1/12/2021     1C CCC    Monica, Rachel
HC 1/13/2021    1/15/2021     1D DDD    Ed, Eddy
Address
1111 W. Broward Boulevard Plantation, 33333
2222 N 23 Avenue Hollywood, FL 33322
3333 Taylor Street Hollywood, 33311
44444 NW 19th St Pembroke Pines, 33300
I have looked at related posts, but my attempts with group_by messed up the df.
Code
library(tidyverse)
df = data.frame(Date,
                Date_Approved,
                SR,
                Permit,
                Owner_Agent,
                Address)
# Tidy up the df
df = df %>%
You can create a row identifier, group by that ID, and then use summarize(across()), like this:
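df %>%
  # rows 1-2 get id 1, rows 3-4 get id 2, and so on
  mutate(id = rep(1:(n() / 2), each = 2)) %>%
  group_by(id) %>%
  # paste each pair together, then trim the padding left by the blank cells
  summarize(across(Date:Address, ~ trimws(paste0(.x, collapse = " "))))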
Output:
# A tibble: 4 × 7
     id Date            Date_Approved SR    Permit Owner_Agent   Address
  <int> <chr>           <chr>         <chr> <chr>  <chr>         <chr>
1     1 SB 1/4/2021     1/4/2021      1A    AAA    Joe Joey      1111 W. Broward Boulevard Plantation, 33333
2     2 HC/SB 1/5/2021  1/8/2021      1B    BBB    Ross Chandler 2222 N 23 Avenue Hollywood, FL 33322
3     3 NC 1/6/2021     1/12/2021     1C    CCC    Monica Rachel 3333 Taylor Street Hollywood, 33311
4     4 HC 1/13/2021    1/15/2021     1D    DDD    Ed Eddy       44444 NW 19th St Pembroke Pines, 33300
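To see why the grouping works, it may help to look at the id vector on its own. The sketch below uses base R only and assumes 8 rows, as in the example data. The names n, id_rep and id_ceil are made up for illustration. The ceiling() form is an alternative I am suggesting, not part of the answer above; it gives the same ids and is also defined when the row count is odd, whereas rep(1:(n / 2), each = 2) then returns a vector shorter than n.

```r
n <- 8                                # number of rows in the example data

# id as built in the answer: each value repeated twice
id_rep <- rep(1:(n / 2), each = 2)

# alternative (an assumption, not from the answer): pair rows by position
id_ceil <- ceiling(seq_len(n) / 2)

print(id_rep)
print(id_ceil)

# both give the same pairing for an even number of rows
stopifnot(all(id_rep == id_ceil))
```

Inside mutate(), the second form would be written as ceiling(row_number() / 2).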
Input:
structure(list(Date = c("SB", "1/4/2021", "HC/SB", "1/5/2021",
"NC", "1/6/2021", "HC", "1/13/2021"), Date_Approved = c(" ",
"1/4/2021", " ", "1/8/2021", " ", "1/12/2021", " ", "1/15/2021"
), SR = c(" ", "1A", " ", "1B", " ", "1C", " ", "1D"), Permit = c(" ",
"AAA", " ", "BBB", " ", "CCC", " ", "DDD"), Owner_Agent = c("Joe",
"Joey", "Ross", "Chandler", "Monica", "Rachel", "Ed", "Eddy"),
Address = c("1111 W. Broward Boulevard", "Plantation, 33333",
"2222 N 23 Avenue", "Hollywood, FL 33322", "3333 Taylor Street",
"Hollywood, 33311", "44444 NW 19th St", "Pembroke Pines, 33300"
)), class = "data.frame", row.names = c(NA, -8L))
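The output above joins the two names in Owner_Agent with a space, while the desired output in the question uses a comma. Here is a possible variant, offered as a sketch rather than as the definitive solution: it summarizes Owner_Agent separately with ", " as the separator and handles the remaining columns with across(). It assumes dplyr 1.0 or later and rebuilds the input shown above. The name out and the ceiling(row_number() / 2) id are my own choices.

```r
library(dplyr)

# Same data as the input above, written as a plain data.frame() call
df <- data.frame(
  Date          = c("SB", "1/4/2021", "HC/SB", "1/5/2021",
                    "NC", "1/6/2021", "HC", "1/13/2021"),
  Date_Approved = c(" ", "1/4/2021", " ", "1/8/2021",
                    " ", "1/12/2021", " ", "1/15/2021"),
  SR            = c(" ", "1A", " ", "1B", " ", "1C", " ", "1D"),
  Permit        = c(" ", "AAA", " ", "BBB", " ", "CCC", " ", "DDD"),
  Owner_Agent   = c("Joe", "Joey", "Ross", "Chandler",
                    "Monica", "Rachel", "Ed", "Eddy"),
  Address       = c("1111 W. Broward Boulevard", "Plantation, 33333",
                    "2222 N 23 Avenue", "Hollywood, FL 33322",
                    "3333 Taylor Street", "Hollywood, 33311",
                    "44444 NW 19th St", "Pembroke Pines, 33300")
)

out <- df %>%
  mutate(id = ceiling(row_number() / 2)) %>%   # pair rows 1-2, 3-4, ...
  group_by(id) %>%
  summarize(
    # space-separated columns; trimws() drops the padding from blank cells
    across(c(Date:Permit, Address), ~ trimws(paste(.x, collapse = " "))),
    # comma-separated names, as in the desired output
    Owner_Agent = paste(Owner_Agent, collapse = ", ")
  ) %>%
  select(Date:Permit, Owner_Agent, Address)    # drop id, restore column order

print(out)
```

Summarizing Owner_Agent outside across() keeps the separator choice explicit; if more columns needed their own separator, a named lookup vector might scale better.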