How to transform a data frame's rows into columns in R?

Question

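I have a data frame that I need to transform. I need to turn the rows into unique columns based on the value of a column.

For example:

Input data frame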

| column_1 | column_2 |
-----------------------
|   A      |     B    |
|   A      |     C    |
|   B      |     E    |
|   B      |     C    |
|   C      |     F    |
|   C      |     G    |

Output data frame

| column_1 | column_2 | column_3 |
----------------------------------
|   A      |     B    |     C    |
|   B      |     E    |     C    |
|   C      |     F    |     G    |

The final data frame should contain every unique value of column_1, and the values from column_2 of the input data frame should be added to it as new columns, i.e. column_2 and column_3.

I tried using reshape and melt in R, but the data frame I got back was wrong.

Answer 1

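This assumes that every value in column_1 always has exactly 2 rows.

Extract the first row for each column_1 value into one data.table, extract the last row into a second data.table, and finally merge the two into a new data.table.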

library(data.table)

df <- data.frame(column_1=c("A","A","B","B","C","C"),column_2=c("B","C","E","C","F","G"))
df <- as.data.table(df)
# key on column_1 so the lookups below join on it
setkey(df,column_1)
# first and last row of each column_1 group
first_part <- df[J(unique(column_1)), mult = "first"]
second_part <- df[J(unique(column_1)), mult = "last"]
# the last-row values become the new column_3
setnames(second_part,"column_2","column_3")

new_df <- merge(first_part,second_part, by="column_1")

   column_1 column_2 column_3
1:        A        B        C
2:        B        E        C
3:        C        F        G
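
If the two-rows-per-group assumption does not hold, one possible variation is to number the rows within each group and cast to wide format with dcast from data.table. This is only a sketch and not part of the answer above; the helper column name col and the result name wide are made up for illustration.

```r
library(data.table)

dt <- data.table(column_1 = c("A", "A", "B", "B", "C", "C"),
                 column_2 = c("B", "C", "E", "C", "F", "G"))

# Label each row within its column_1 group: column_2, column_3, ...
# (col is a hypothetical helper column, added by reference)
dt[, col := paste0("column_", seq_len(.N) + 1L), by = column_1]

# One row per column_1 value, one column per label
wide <- dcast(dt, column_1 ~ col, value.var = "column_2")
print(wide)
```

Groups with fewer rows than the largest group are filled with NA. Note that dcast sorts the new columns by name, so with ten or more values per group column_10 would be placed before column_2.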

Answer 2

We can use dplyr together with the cSplit function from splitstackshape. This also works when a group has more than two values.

library(dplyr)
library(splitstackshape)
dt2 <- dt %>%
  group_by(column_1) %>%
  summarise(column_2 = toString(column_2)) %>%
  cSplit("column_2") %>%
  setNames(paste0("column_", 1:ncol(.)))

dt2
   column_1 column_2 column_3
1:        A        B        C
2:        B        E        C
3:        C        F        G

Data

dt <- data.frame(column_1 = c("A", "A", "B", "B", "C", "C"),
                 column_2 = c("B", "C", "E", "C", "F", "G"),
                 stringsAsFactors = FALSE)
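
For comparison, the same idea (collect the column_2 values of each group, then spread them over columns) can be sketched in base R without extra packages. This is an illustration under the assumption that short groups should be padded with NA; the names parts, n, mat and wide are made up for the example.

```r
dt <- data.frame(column_1 = c("A", "A", "B", "B", "C", "C"),
                 column_2 = c("B", "C", "E", "C", "F", "G"),
                 stringsAsFactors = FALSE)

# List with one character vector of column_2 values per column_1 group
parts <- split(dt$column_2, dt$column_1)

# Pad every group to the length of the largest one with NA
n <- max(lengths(parts))
mat <- do.call(rbind, lapply(parts, function(v) c(v, rep(NA, n - length(v)))))

# Combine the group labels with the padded values and rename the columns
wide <- data.frame(column_1 = names(parts), mat,
                   row.names = NULL, stringsAsFactors = FALSE)
names(wide) <- paste0("column_", seq_len(ncol(wide)))
print(wide)
```

Because split orders the groups by the sorted unique values of column_1, the rows of the result follow that order rather than the order of first appearance.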

Answer 3

Here is a short solution with dplyr and tidyr:

library(dplyr)
library(tidyr)
df %>% mutate(col = c("column_2","column_3")[duplicated(column_1)+1]) %>%
  spread(col,column_2)

#   column_1 column_2 column_3
# 1        A        B        C
# 2        B        E        C
# 3        C        F        G

And a general solution:

df <- data.frame(column_1 = c("A", "A", "B", "B", "C", "C","A","B","C"),
                 column_2 = c("B", "C", "E", "C", "F", "G","X","Y","Z"),
                 stringsAsFactors = FALSE)

df %>% group_by(column_1) %>%
  mutate(col=paste0("column_",row_number()+1)) %>%
  spread(col,column_2) %>% ungroup

# # A tibble: 3 x 4
#   column_1 column_2 column_3 column_4
# *    <chr>    <chr>    <chr>    <chr>
# 1        A        B        C        X
# 2        B        E        C        Y
# 3        C        F        G        Z
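
In tidyr 1.0.0 and later, spread is superseded by pivot_wider. As a sketch, assuming that newer tidyr version is installed, the general solution might be written as follows; the result name wide is made up for the example.

```r
library(dplyr)
library(tidyr)

df <- data.frame(column_1 = c("A", "A", "B", "B", "C", "C", "A", "B", "C"),
                 column_2 = c("B", "C", "E", "C", "F", "G", "X", "Y", "Z"),
                 stringsAsFactors = FALSE)

wide <- df %>%
  group_by(column_1) %>%
  # number the rows within each group: column_2, column_3, ...
  mutate(col = paste0("column_", row_number() + 1)) %>%
  ungroup() %>%
  # one new column per label, filled with the column_2 values
  pivot_wider(names_from = col, values_from = column_2)

print(wide)
```

Groups with fewer values than the largest group get NA in the trailing columns.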
