Remove unnecessary rows in four columns and make a new column from one of them in R
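I have a CSV file with multiple columns. Each ID has two rows in the Date column: the first row holds an acronym and the second holds a date. I would like to remove every other row, i.e. the rows that hold acronyms in the Date column, and make another column out of those rows. In addition, I would like to remove the blank rows in the columns Date_Approved, SR, and Permit.
How can I do this in R using a tidy approach?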
Sample data
Date = c("SB",
"1/4/2021",
"HC/SB",
"1/5/2021",
"NC",
"1/6/2021",
"HC",
"1/13/2021")
Date_Approved = c(" ",
"1/4/2021",
" ",
"1/8/2021",
" ",
"1/12/2021",
" ",
"1/15/2021")
SR = c(" ",
"1A",
" ",
"1B",
" ",
"1C",
" ",
"1D")
Permit = c(" ",
"AAA",
" ",
"BBB",
" ",
"CCC",
" ",
"DDD")
Owner_Agent = c("Joe",
"Joey",
"Ross",
"Chandler",
"Monica",
"Rachel",
"Ed",
"Edd")
Address = c("1111 W. Broward Boulevard",
"Plantation, 33317",
"2222 N 23 Avenue",
"Hollywood, FL 33020",
"3333 Taylor Street",
"Hollywood, 33021",
"44444 NW 19th St",
"5555 Oak St")
df = data.frame(Date,
Date_Approved,
SR,
Permit,
Owner_Agent,
Address)
Code
library(tidyverse)
df = data.frame(Date,
Date_Approved,
SR,
Permit,
Owner_Agent,
Address)
# Remove even rows with Acronyms and make a column out of them
# Stuck
This is fairly simple with the lag() function, assuming exactly one row sits between each "correct" row and the next.
See below:
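library(tidyverse)
data.frame(Date, Date_Approved, SR, Permit, Owner_Agent, Address) %>%
  # lag() shifts Date down one row, so each date row picks up the acronym above it
  mutate(Acronym = lag(Date)) %>%
  # keep only the rows whose Date_Approved is not the blank placeholder " "
  filter(Date_Approved != " ")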
This also assumes that the empty values in the other columns are " " and not NA. If they are NA, replace the last line with filter(!is.na(Date_Approved)).
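If it is not known in advance whether the blanks are " ", "" or NA, one option is to normalise them first and then filter on all three columns named in the question. The following is only a sketch under assumptions: df_small is a hypothetical miniature of the sample data with the blanks deliberately mixed, and if_all() requires dplyr 1.0.4 or later.

```r
library(dplyr)

# Hypothetical miniature of the sample data; the blanks in Date_Approved
# are written three different ways (" ", NA, "") to exercise each case.
df_small <- data.frame(
  Date          = c("SB", "1/4/2021", "HC/SB", "1/5/2021", "NC", "1/6/2021"),
  Date_Approved = c(" ", "1/4/2021", NA, "1/8/2021", "", "1/12/2021"),
  SR            = c(" ", "1A", " ", "1B", " ", "1C"),
  Permit        = c(" ", "AAA", " ", "BBB", " ", "CCC")
)

cleaned <- df_small %>%
  # carry each acronym down onto the date row below it
  mutate(Acronym = dplyr::lag(Date)) %>%
  # turn empty or whitespace-only strings into NA in the three columns
  mutate(across(c(Date_Approved, SR, Permit), ~ na_if(trimws(.x), ""))) %>%
  # keep only rows where none of the three columns is missing
  filter(if_all(c(Date_Approved, SR, Permit), ~ !is.na(.x)))

print(cleaned)
```

With this input, three rows should remain (the date rows), each carrying its acronym in the new Acronym column. Filtering on all three columns rather than only Date_Approved is a design choice that follows the question's wording; on the sample data both give the same rows.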
The data frame produced by the lag() pipeline looks like this:
  Date      Date_Approved Acronym .....
1 1/4/2021  1/4/2021      SB
2 1/5/2021  1/8/2021      HC/SB
3 1/6/2021  1/12/2021     NC
4 1/13/2021 1/15/2021     HC
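To see why the lag() step pairs each date with its acronym, it may help to look at what it does to the Date column on its own. This is a small illustration using the first four values of the sample data; the names shifted and paired are made up for the example.

```r
library(dplyr)

# First four values of the sample Date column: acronym, date, acronym, date
Date <- c("SB", "1/4/2021", "HC/SB", "1/5/2021")

# dplyr::lag() shifts the vector down by one position and pads the front with NA
shifted <- dplyr::lag(Date)
print(shifted)

# Side by side: every date row now has the acronym from the row above it
paired <- data.frame(Date, Acronym = shifted)
print(paired)
```

The acronym rows themselves end up with a meaningless Acronym value (NA in row 1, a date in row 3), which is why the filter step that drops them is still needed. It also shows why the approach depends on the strict acronym, date, acronym, date alternation: two acronym rows in a row would shift the pairing.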