Remove unnecessary rows in four columns and make a new column from one of them in R

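I have a CSV file with multiple columns. Each ID has two rows in the Date column: the first row holds an acronym and the second holds a date. I want to remove every other row (the rows that have acronyms in the Date column) and make another column out of those acronyms. I also want to remove the blank rows in the columns Date_Approved, SR, and Permit.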

How can I do this in R using a tidy approach?

Sample data

Date = c("SB",
         "1/4/2021",
         "HC/SB",
         "1/5/2021",
         "NC",
         "1/6/2021",
         "HC",
         "1/13/2021")

Date_Approved = c(" ",
                  "1/4/2021",
                  " ",
                  "1/8/2021",
                  " ",
                  "1/12/2021",
                  " ",
                  "1/15/2021")

SR = c(" ",
       "1A",
       " ",
       "1B",
       " ",
       "1C",
       " ",
       "1D")

Permit = c(" ",
       "AAA",
       " ",
       "BBB",
       " ",
       "CCC",
       " ",
       "DDD") 

Owner_Agent = c("Joe",
                "Joey",
                "Ross",
                "Chandler",
                "Monica",
                "Rachel",
                "Ed",
                "Edd")

Address = c("1111 W. Broward Boulevard",
            "Plantation, 33317",
            "2222 N 23 Avenue",
            "Hollywood, FL 33020",
            "3333 Taylor Street",
            "Hollywood, 33021",
            "44444 NW 19th St",
            "5555 Oak St")     

df = data.frame(Date,
                Date_Approved,
                SR,
                Permit,
                Owner_Agent,
                Address)

  

Code

library(tidyverse)

df = data.frame(Date,
                Date_Approved,
                SR,
                Permit,
                Owner_Agent,
                Address)

# Remove even rows with Acronyms and make a column out of them   
# Stuck 

           

This is straightforward with the lag() function, assuming there is exactly one acronym row before each "proper" row.

See below:

library(tidyverse)

data.frame(Date, Date_Approved, SR, Permit, Owner_Agent, Address) %>% 
  mutate(Acronym = lag(Date)) %>% 
  filter(Date_Approved != " ")
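The lag() approach relies on the acronym and date rows strictly alternating. A quick way to check that before filtering might look like the sketch below. The helper name looks_like_date and the m/d/yyyy pattern are assumptions made for illustration, based on the sample data in the question.

```r
Date <- c("SB", "1/4/2021", "HC/SB", "1/5/2021",
          "NC", "1/6/2021", "HC", "1/13/2021")

# Hypothetical helper: TRUE where the value looks like an m/d/yyyy date
looks_like_date <- function(x) {
  grepl("^[0-9]{1,2}/[0-9]{1,2}/[0-9]{4}$", x)
}

is_date <- looks_like_date(Date)

# Expected pattern: acronym row (FALSE), then date row (TRUE), repeated
expected <- rep(c(FALSE, TRUE), length.out = length(Date))

# TRUE only if every acronym row is followed by exactly one date row
alternates <- identical(is_date, expected)
print(alternates)
```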

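The lag() solution also assumes that the empty values in the other columns are " " rather than NA.

If they are NA, replace the last line with filter(!is.na(Date_Approved)).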

The resulting data frame looks like this:

       Date Date_Approved Acronym .....
1  1/4/2021      1/4/2021      SB
2  1/5/2021      1/8/2021   HC/SB
3  1/6/2021     1/12/2021      NC
4 1/13/2021     1/15/2021      HC
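If it is not known whether the blanks in a real CSV are " ", "" or NA, one option is to normalize them before filtering. The following is only a sketch: the small data frame is a made-up stand-in with mixed blanks, and the use of trimws() and na_if() is an addition to the approach above rather than part of it.

```r
library(dplyr)

# Stand-in data with a mix of " ", "" and NA blanks (assumed for illustration)
df <- data.frame(
  Date          = c("SB", "1/4/2021", "HC/SB", "1/5/2021", "NC", "1/6/2021"),
  Date_Approved = c(" ", "1/4/2021", "", "1/8/2021", NA, "1/12/2021"),
  stringsAsFactors = FALSE
)

result <- df %>%
  mutate(
    Acronym = lag(Date),                               # value from the previous row
    Date_Approved = na_if(trimws(Date_Approved), "")   # " " and "" both become NA
  ) %>%
  filter(!is.na(Date_Approved))                        # keep only the date rows

print(result)
```

Normalizing first means a single filter(!is.na(...)) covers every kind of blank, so the choice between the two filter variants no longer matters.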