Formatting multi-row data into a single row in R
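I have an oddly formatted Excel or CSV file that I want to import into R as a data frame. The problem is that some records span multiple rows. For example, the data below has three columns and two records, but the Tools column has several rows per record. Is there a way to format the data so that each record is a single row with multiple tool columns (Tools1, Tools2, etc.)?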
Task             Location  Tools
Raising ticket   Alabama   sharepoint
                           word
                           oracle
Changing ticket  Seattle   word
                           oracle
Expected final output:
Task             Location  Tools1      Tools2  Tools3
Raising ticket   Alabama   sharepoint  word    oracle
Changing ticket  Seattle   word        oracle
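You can do this with dplyr and tidyr. First, fill your data frame so that Task and Location are present in every row. Then group_by Task and use mutate to add an id column that numbers the tools within each task (Tools1, Tools2, and so on). Finally, use spread to turn the newly created id values into separate columns.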
library(dplyr)
library(tidyr)
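# Example data: blank cells mark continuation rows of the record above
df <- data.frame(
  Task     = c("Raising ticket", "", "", "Changing ticket", ""),
  Location = c("Alabama", "", "", "Seattle", ""),
  Tools    = c("sharepoint", "word", "oracle", "word", "oracle")
)
# Turn empty strings into NA so fill() can carry values down
df[df == ""] <- NA
df %>%
  fill(Task, Location) %>%                        # copy Task/Location into continuation rows
  group_by(Task) %>%
  mutate(id = paste0("Tools", row_number())) %>%  # Tools1, Tools2, ... within each task
  spread(id, Tools)                               # one column per id value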
# A tibble: 2 x 5
# Groups: Task [2]
# Task Location Tools1 Tools2 Tools3
# <fct> <fct> <fct> <fct> <fct>
# 1 Changing ticket Seattle word oracle <NA>
# 2 Raising ticket Alabama sharepoint word oracle
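As of tidyr 1.0.0, spread() is superseded by pivot_wider(), so on a recent tidyr the same steps could be sketched as below. This is only a rough equivalent of the answer above, not a different method: the `wide` name and the `stringsAsFactors = FALSE` argument are my own additions, and it assumes that blank Task/Location cells always mean "same record as the row above".

```r
library(dplyr)
library(tidyr)

# Same example data as above; stringsAsFactors = FALSE is an assumption
# added here so the columns are plain character on any R version
df <- data.frame(
  Task     = c("Raising ticket", "", "", "Changing ticket", ""),
  Location = c("Alabama", "", "", "Seattle", ""),
  Tools    = c("sharepoint", "word", "oracle", "word", "oracle"),
  stringsAsFactors = FALSE
)

# Empty strings become NA so that fill() has something to fill
df[df == ""] <- NA

wide <- df %>%
  fill(Task, Location) %>%                        # carry Task/Location down
  group_by(Task) %>%
  mutate(id = paste0("Tools", row_number())) %>%  # number the tools per task
  ungroup() %>%
  pivot_wider(names_from = id, values_from = Tools)

wide
```

Tasks with fewer tools than the longest one get NA in the remaining Tools columns, just as with spread().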