Make a new table with rearranged pieces in R

Question

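I am converting some SAS code to R and am stuck on the part that reshapes the final data output into a new format. I have a data frame, df, that looks like this: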

 State  AREA Year species ncount_ip   est.ip     se.ip  est.tib    se.tib
1    CT 12593 2015    ABDU        56 1349.250  943.2464  4497.50 2871.4829
2    CT 12593 2015    GADW        56  224.875  224.3744  6746.25 6290.3472
3    CT 12593 2015    COME        56    0.000    0.0000     0.00    0.0000
4    VT 12593 2015    ABDU        56 8545.250 1756.8546 19114.38 5443.0618
5    VT 12593 2015    COME        56  674.625  498.0543  1349.25  996.1086
6    VT 12593 2015    GADW        56  224.875  224.3744   449.75  448.7489
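
For anyone who wants to reproduce this, the data frame above can be rebuilt from the printout. This is a minimal sketch that assumes the printed (rounded) values are close enough to the real ones; the row numbers are dropped because read.table generates them again.

```r
# Rebuild the example data frame from the printed values in the question.
# check.names keeps "est.ip" etc. unchanged because they are valid R names.
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
State  AREA Year species ncount_ip   est.ip     se.ip  est.tib    se.tib
CT    12593 2015    ABDU        56 1349.250  943.2464  4497.50 2871.4829
CT    12593 2015    GADW        56  224.875  224.3744  6746.25 6290.3472
CT    12593 2015    COME        56    0.000    0.0000     0.00    0.0000
VT    12593 2015    ABDU        56 8545.250 1756.8546 19114.38 5443.0618
VT    12593 2015    COME        56  674.625  498.0543  1349.25  996.1086
VT    12593 2015    GADW        56  224.875  224.3744   449.75  448.7489
")

str(df)  # 6 observations of 9 variables
```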

I would like (ultimately) to have it in a format like this:

Species Type    Year    VTest     VTse      CTest       CTse
GADW    Pop     2015    449.75    448.7489  6746.25     6290.3472
GADW    Pairs   2015    224.875   224.3744  224.875     224.3744
ABDU    Pop     2015    19114.38  5443.0618 4497.50     2871.4829
ABDU    Pairs   2015    8545.250  1756.8546 1349.250    943.2464
COME    Pop     2015    1349.25   996.1086  0.00        0.00
COME    Pairs   2015    674.625   498.0543  0.00        0.00

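Basically, I need to take from df the estimate (est) and standard error (se) for both population (.tib) and pairs (.ip) for each state (only 2 are given in the example, but there are about 10 in the real dataset), forming 2 rows per species with 2 columns per state to create the final result.

I started by trying the reshape package and melt, but did not get exactly what I needed. I think renaming by state after melting might work, but I could not code it properly. Thank you for your time and help.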

Answer 1

Starting with your df, and using the reshape2 package, first melt the data frame, keeping the first 5 columns as id variables:

> library(reshape2)
> melted <- melt(df, id.vars=1:5)
> head(melted)
  State  AREA Year species ncount_ip variable    value
1    CT 12593 2015    ABDU        56   est.ip 1349.250
2    CT 12593 2015    GADW        56   est.ip  224.875
3    CT 12593 2015    COME        56   est.ip    0.000
4    VT 12593 2015    ABDU        56   est.ip 8545.250
5    VT 12593 2015    COME        56   est.ip  674.625
6    VT 12593 2015    GADW        56   est.ip  224.875

Use colsplit to extract the statistic and type identifiers from the new variable column (the old headers), and add these columns to the data frame:

> melted <- cbind(melted,
+ colsplit(melted$variable, "\\.", c('stat','type')))
> head(melted)
  State  AREA Year species ncount_ip variable    value stat type
1    CT 12593 2015    ABDU        56   est.ip 1349.250  est   ip
2    CT 12593 2015    GADW        56   est.ip  224.875  est   ip
3    CT 12593 2015    COME        56   est.ip    0.000  est   ip
4    VT 12593 2015    ABDU        56   est.ip 8545.250  est   ip
5    VT 12593 2015    COME        56   est.ip  674.625  est   ip
6    VT 12593 2015    GADW        56   est.ip  224.875  est   ip

Combine State and stat to build the column labels you want, and replace the strings in type:

> melted$state_stat <- paste(melted$State, melted$stat, sep ="_")
> melted$type <- gsub("tib", "Pop", melted$type)
> melted$type <- gsub("ip", "Pairs", melted$type)
> head(melted)
  State  AREA Year species ncount_ip variable    value stat  type state_stat
1    CT 12593 2015    ABDU        56   est.ip 1349.250  est Pairs     CT_est
2    CT 12593 2015    GADW        56   est.ip  224.875  est Pairs     CT_est
3    CT 12593 2015    COME        56   est.ip    0.000  est Pairs     CT_est
4    VT 12593 2015    ABDU        56   est.ip 8545.250  est Pairs     VT_est
5    VT 12593 2015    COME        56   est.ip  674.625  est Pairs     VT_est
6    VT 12593 2015    GADW        56   est.ip  224.875  est Pairs     VT_est

Recast the data frame, using the new state_stat column to define the columns:

> final <- dcast(melted, Year+species+type~state_stat, value.var="value")
> final
  Year species  type   CT_est     CT_se    VT_est     VT_se
1 2015    ABDU Pairs 1349.250  943.2464  8545.250 1756.8546
2 2015    ABDU   Pop 4497.500 2871.4829 19114.380 5443.0618
3 2015    COME Pairs    0.000    0.0000   674.625  498.0543
4 2015    COME   Pop    0.000    0.0000  1349.250  996.1086
5 2015    GADW Pairs  224.875  224.3744   224.875  224.3744
6 2015    GADW   Pop 6746.250 6290.3472   449.750  448.7489
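
As an alternative sketch (not part of the steps above), the same reshaping can be done with the tidyr package, assuming it is installed. The names long and wide are illustrative, and df is rebuilt from the values in the question so that the block runs on its own. Pasting State and stat together with an empty separator gives the VTest / VTse / CTest / CTse headers asked for in the question.

```r
library(tidyr)

# Example data, typed in from the printout in the question.
df <- data.frame(
  State     = c("CT", "CT", "CT", "VT", "VT", "VT"),
  AREA      = 12593,
  Year      = 2015,
  species   = c("ABDU", "GADW", "COME", "ABDU", "COME", "GADW"),
  ncount_ip = 56,
  est.ip    = c(1349.250, 224.875, 0, 8545.250, 674.625, 224.875),
  se.ip     = c(943.2464, 224.3744, 0, 1756.8546, 498.0543, 224.3744),
  est.tib   = c(4497.50, 6746.25, 0, 19114.38, 1349.25, 449.75),
  se.tib    = c(2871.4829, 6290.3472, 0, 5443.0618, 996.1086, 448.7489)
)

# Long form: split each header such as "est.ip" into stat ("est") and type ("ip").
long <- pivot_longer(
  df,
  cols      = c(est.ip, se.ip, est.tib, se.tib),
  names_to  = c("stat", "type"),
  names_sep = "\\.",
  values_to = "value"
)

# Relabel the type codes as in the requested output.
long$type <- ifelse(long$type == "tib", "Pop", "Pairs")

# Wide form: one column per State/stat pair, e.g. "VTest" and "VTse".
wide <- pivot_wider(
  long,
  id_cols     = c(species, type, Year),
  names_from  = c(State, stat),
  names_sep   = "",
  values_from = value
)

# Put the columns in the order shown in the question.
wide <- wide[, c("species", "type", "Year", "VTest", "VTse", "CTest", "CTse")]
wide
```

With about 10 states in the real data, the final column selection would need to list every state, or it can simply be left out if the default column order is acceptable.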

r

reshape

dataframe