Transforming a dataframe by "multiplying" a column's elements by the names of the other columns

Here is an example. How can I transform the data frame df, whose first column is names, into the form of df.transformed below?

> df <- data.frame("names" = c("y1", "y2"), "x1" = 1:2, "x2" = 4:5)
> df
  names x1 x2
1    y1  1  4
2    y2  2  5


> df.transformed <- data.frame("y1x1" = 1, "y1x2" =4, "y2x1" = 2, "y2x2" = 5)
> df.transformed
  y1x1 y1x2 y2x1 y2x2
1    1    4    2    5

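You can do this in base R, and it should work for a data frame of any size. The idea is to combine Reduce with outer to build the column names of the new data frame.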

df <- data.frame("names" = c("y1", "y2"), "x1" = 1:2, "x2" = 4:5)

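# Paste every row label with every value-column name (a 2 x 2 character matrix here)
df_names <- outer(df[,1], names(df[,-1]), paste0)
# Start from an empty data frame with one column per (row label, value column) pair
df.transformed <- as.data.frame(matrix(,ncol = nrow(df)*ncol(df[,-1]), nrow = 0))
# Transpose first so that names and values are both flattened row by row
names(df.transformed) <- Reduce(`c`,t(df_names))
df.transformed[1,] <- Reduce(`c`,t(df[-1]))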

Output

#  y1x1 y1x2 y2x1 y2x2
#    1    4    2    5
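
To make the role of outer more concrete, here is a minimal sketch of the same idea with the intermediate steps spelled out. The names name_grid, new_names, new_values and df_wide are hypothetical and only used for illustration; the sketch swaps Reduce for c(t(...)), which flattens a matrix row by row.

```r
df <- data.frame(names = c("y1", "y2"), x1 = 1:2, x2 = 4:5)

# outer() pastes every row label with every value-column name,
# giving one row per label and one column per value column.
name_grid <- outer(df$names, names(df)[-1], paste0)

# Transposing before flattening walks the grid row by row,
# which matches the row-by-row order used for the values below.
new_names  <- c(t(name_grid))
new_values <- c(t(as.matrix(df[-1])))

# A named vector transposed into a 1-row matrix keeps the names as column names.
df_wide <- as.data.frame(t(setNames(new_values, new_names)))
df_wide
```

Flattening the names and the values in the same order is what keeps each value under the right label, so both go through the same transpose.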

Code

require(data.table); setDT(df)

dt = melt(df, id.vars = 'names')[, col := paste0(variable, names)]
out = dt$value; names(out) = dt$col

Result

> data.frame(t(out))

x1y1 x1y2 x2y1 x2y2 
   1    2    4    5 

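You can do this in one line with the new tidyr::pivot_wider. Supplying several columns to values_from means that each of their names is pasted together with the values of the names_from column to form the new column names.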

library(tidyr)

pivot_wider(df, names_from = names, values_from = c(x1, x2), names_sep = "")
#> # A tibble: 1 x 4
#>    x1y1  x1y2  x2y1  x2y2
#>   <int> <int> <int> <int>
#> 1     1     2     4     5

However, the column names ("x1", "x2") come first. If you need to swap the "x" and "y" parts of the names, you can do a regular-expression substitution with dplyr::rename_all.

df %>%
  pivot_wider(names_from = names, values_from = c(x1, x2), names_sep = "") %>%
  dplyr::rename_all(gsub, pattern = "(x\\d+)(y\\d+)", replacement = "\\2\\1")
#> # A tibble: 1 x 4
#>    y1x1  y2x1  y1x2  y2x2
#>   <int> <int> <int> <int>
#> 1     1     2     4     5
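
The substitution itself can be tried in isolation with base R's sub. This is only a sketch on a hand-written character vector (old_names and swapped are hypothetical names), showing how the two capture groups are swapped. Note that backslashes must be doubled inside R string literals.

```r
# Column names as produced by pivot_wider in the example above
old_names <- c("x1y1", "x1y2", "x2y1", "x2y2")

# Group 1 captures the "x" part, group 2 the "y" part;
# the replacement emits them in the opposite order.
swapped <- sub("(x\\d+)(y\\d+)", "\\2\\1", old_names)
swapped
#> [1] "y1x1" "y2x1" "y1x2" "y2x2"
```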