How can I transpose my data so that it has only one row per group in R?

I have a data frame in R with a variable (Permno) that is a unique company ID. For each company I have estimated Intercept, r2_12, sue and car3, as shown below.

    Permno   Term      Estimate
1   10001 Intercept    0.020
2   10001     r2_12   -0.010
3   10001       sue    0.007
4   10001      car3    0.140
5   10025 Intercept    0.007
6   10025     r2_12   -0.004
7   10025       sue    0.001
8   10025      car3    0.020
9   10026 Intercept    0.020
10  10026     r2_12   -0.010
11  10026       sue    0.002
12  10026      car3    0.030

Now I want to convert the rows into columns so that I have only one row per Permno. That is, Intercept, r2_12, sue and car3 become 4 new columns, like this:

 Permno Intercept  r2_12   sue car3
1  10001     0.020 -0.010 0.007 0.14
2  10025     0.007 -0.004 0.001 0.02
3  10026     0.020 -0.010 0.002 0.03

Does anyone know how I can do this in R?

You can do this with pivot_wider from the tidyr package:

library(tidyr)

df %>% 
  tidyr::pivot_wider(id_cols = Permno,
                     names_from = Term,
                     values_from = Estimate)

 Permno Intercept r2_12   sue  car3
   <dbl>     <dbl> <dbl> <dbl> <dbl>
1   1001      0.02 -0.01 0.007  0.14


Data

df <- data.frame("Permno" = rep(1001, 4),
                 "Term" = c("Intercept", "r2_12", "sue", "car3"),
                 "Estimate" = c(0.020, -0.010, 0.007, 0.140))

Here are some base R possibilities:

xtabs(Estimate ~ ., DF)
##        Term
## Permno    car3 Intercept  r2_12    sue
##  10001  0.140     0.020 -0.010  0.007
##  10025  0.020     0.007 -0.004  0.001
##  10026  0.030     0.020 -0.010  0.002

with(DF, tapply(DF[[3]], DF[-3], c))
##        Term
## Permno  car3 Intercept  r2_12   sue
##  10001 0.14     0.020 -0.010 0.007
##  10025 0.02     0.007 -0.004 0.001
##  10026 0.03     0.020 -0.010 0.002

reshape(DF, dir = "wide", idvar = "Permno", timevar = "Term")
##   Permno Estimate.Intercept Estimate.r2_12 Estimate.sue Estimate.car3
## 1  10001              0.020         -0.010        0.007          0.14
## 5  10025              0.007         -0.004        0.001          0.02
## 9  10026              0.020         -0.010        0.002          0.03
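
The base R results above are not quite in the layout the question asks for: reshape prefixes the new columns with "Estimate.", and xtabs and tapply return a table or matrix rather than a data frame. A minimal sketch of one way to tidy that up follows, assuming the same DF as in this answer (rebuilt inline so the snippet runs on its own). The names wide, wide2, m and terms are illustrative only.

```r
# Same long-format data as DF in this answer, rebuilt so the snippet is self-contained
DF <- data.frame(
  Permno   = rep(c(10001, 10025, 10026), each = 4),
  Term     = rep(c("Intercept", "r2_12", "sue", "car3"), times = 3),
  Estimate = c(0.020, -0.010, 0.007, 0.140,
               0.007, -0.004, 0.001, 0.020,
               0.020, -0.010, 0.002, 0.030)
)
terms <- c("Intercept", "r2_12", "sue", "car3")

# reshape(): strip the "Estimate." prefix and reset the leftover row names (1, 5, 9)
wide <- reshape(DF, direction = "wide", idvar = "Permno", timevar = "Term")
names(wide) <- sub("^Estimate\\.", "", names(wide))
rownames(wide) <- NULL

# tapply(): the result is a Permno-by-Term matrix; turn it into a data frame,
# move the row names into a Permno column and restore the original Term order
m <- tapply(DF$Estimate, DF[c("Permno", "Term")], sum)
wide2 <- data.frame(Permno = as.numeric(rownames(m)), m[, terms],
                    row.names = NULL)

wide
wide2
```

Both wide and wide2 should then have the columns Permno, Intercept, r2_12, sue and car3, with one row per Permno. Using sum in tapply is harmless here because each Permno/Term pair occurs exactly once.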

Note

The input in reproducible form:

Lines <- "
    Permno   Term      Estimate
1   10001 Intercept    0.020
2   10001     r2_12   -0.010
3   10001       sue    0.007
4   10001      car3    0.140
5   10025 Intercept    0.007
6   10025     r2_12   -0.004
7   10025       sue    0.001
8   10025      car3    0.020
9   10026 Intercept    0.020
10  10026     r2_12   -0.010
11  10026       sue    0.002
12  10026      car3    0.030"
DF <- read.table(text = Lines, header = TRUE)

You can use pivot_wider:

library(tidyr)
library(dplyr)

# your data
df <- tribble(
  ~Permno, ~Term, ~Estimate,
  10001, "Intercept", 0.020,
  10001, "r2_12", -0.010,
  10001, "sue", 0.007,
  10001, "car3", 0.140,
  10025, "Intercept", 0.007,
  10025, "r2_12", -0.004,
  10025, "sue", 0.001,
  10025, "car3", 0.020,
  10026, "Intercept", 0.020,
  10026, "r2_12", -0.010,
  10026, "sue", 0.002,
  10026, "car3", 0.030)

df <- df %>% 
  pivot_wider(names_from = Term, values_from = Estimate)
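
Leaving out id_cols works here because Permno is the only column left once Term and Estimate are used up. By default, pivot_wider treats every other column as an identifier. The sketch below illustrates what that means if the data carries an extra per-row column. StdError is a made-up column added purely for illustration and is not part of the question's data; long, wide_default and wide_by_firm are likewise hypothetical names.

```r
library(tidyr)

# Data from the question plus a hypothetical StdError column (not in the original)
long <- data.frame(
  Permno   = rep(c(10001, 10025, 10026), each = 4),
  Term     = rep(c("Intercept", "r2_12", "sue", "car3"), times = 3),
  Estimate = c(0.020, -0.010, 0.007, 0.140,
               0.007, -0.004, 0.001, 0.020,
               0.020, -0.010, 0.002, 0.030),
  StdError = (1:12) / 1000
)

# Default id_cols: StdError counts as an identifier, so the rows do not collapse
wide_default <- pivot_wider(long, names_from = Term, values_from = Estimate)

# Explicit id_cols: StdError is dropped and each Permno ends up on one row
wide_by_firm <- pivot_wider(long, id_cols = Permno,
                            names_from = Term, values_from = Estimate)

nrow(wide_default)
nrow(wide_by_firm)
```

With this made-up column, wide_default keeps all 12 rows (filled with NA where a Term does not apply), while wide_by_firm has the 3 rows the question asks for. If your real data has extra columns, passing id_cols = Permno as in the first answer is probably the safer form.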