How to divide one column into multiple columns by other variables in R

I have a dataset:

data
    Choice    Length Gender
 1       I subadults      M
 2       F subadults      M
 3       F subadults      M
 4       F subadults      M
 5       I subadults      M
 6       F subadults      M
 7       I subadults      M
 8       F subadults      M
 9       I subadults      M
 10      I subadults      M
 11      I subadults      M
 12      O subadults      M
 13      O subadults      M
 14      I subadults      M
 15      F subadults      M
 16      F subadults      M
 17      I subadults      M
 18      O subadults      M
 19      F subadults      M
 20      O subadults      M
 21      F subadults      M
 22      F    adults      M
 23      I    adults      M
 24      F    adults      M
 25      I    adults      M
 26      F    adults      M
 27      F    adults      M
 28      F    adults      M
 29      F    adults      M
 30      F    adults      M
 31      O    adults      M
 32      O    adults      M
 33      F    adults      F
 34      F    adults      F
 35      F    adults      F
 36      F    adults      F
 37      O    adults      F
 38      F    adults      F
 39      F    adults      F
 40      I subadults      F
 41      I subadults      F
 42      I subadults      F
 43      O subadults      F
 44      I subadults      F
 45      I subadults      F
 46      I subadults      F
 47      F subadults      F
 48      I subadults      F
 49      O subadults      F
 50      I subadults      F
 51      I    adults      F
 52      F    adults      F
 53      F    adults      F
 54      F    adults      F
 55      F    adults      F

Now I want to split the Choice column into three columns, so that the dataset looks like this:

   F   I   O  Length     Gender
   1   0  20  subadults  F
   0  10   0  adults     F
  12   0  11  subadults  M
   0  10   0  adults     M

where F, I, and O are the totals for each combination of Length and Gender.

I can't find an R command that does this. Can anyone help me? Thanks a lot! Yan

You could try:

 reshape(as.data.frame(table(df)),
         idvar=c("Length","Gender"),
         timevar="Choice",direction="wide")
 #      Length Gender Freq.F Freq.I Freq.O
 #1     adults      F     10      1      1
 #4  subadults      F      1      8      2
 #7     adults      M      7      2      2
 #10 subadults      M      9      8      4

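The function table gives the number of occurrences of each Choice/Gender/Length combination as a multi-dimensional array. You then coerce it to a data.frame with 4 columns (the three columns above plus a column named Freq, which holds the number of occurrences of each case), and finally reshape the result into the wide layout you need.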
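The three steps can be inspected one at a time. The following is only a minimal sketch in base R; the toy data frame and the intermediate names counts, long and wide are made up for illustration and are not part of the question's data.

```r
# Hypothetical toy data with the same three columns as in the question
df <- data.frame(
  Choice = c("I", "F", "F", "O", "F", "I"),
  Length = c("subadults", "subadults", "adults", "adults", "adults", "subadults"),
  Gender = c("M", "M", "M", "F", "F", "F")
)

# Step 1: three-dimensional array of counts (Choice x Length x Gender)
counts <- table(df)

# Step 2: long data.frame with the columns Choice, Length, Gender and Freq
long <- as.data.frame(counts)

# Step 3: one row per Length/Gender pair, one Freq.* column per Choice value
wide <- reshape(long,
                idvar = c("Length", "Gender"),
                timevar = "Choice",
                direction = "wide")
print(wide)
```

Looking at long before reshaping makes it easier to see where the Freq.F, Freq.I and Freq.O column names come from.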

Edit

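I now realize that I don't understand the values in your expected output. Here I counted the number of occurrences of each case. Are your values correct? If so, how did you arrive at them?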

Try:

library(reshape2)

# Small example data set
data <- data.frame(choice = c('I', 'F', 'I', 'O', 'F', 'O'),
                   length = c('subadults', 'subadults', 'subadults', 'adults', 'adults', 'adults'),
                   gender = c('M', 'M', 'F', 'F', 'M', 'F'))

# Long format: one row per observation, with the choice stored in "value"
melt_data <- melt(data, value.name = "value", id.vars = c("length", "gender"))

# Count the occurrences of each choice per gender/length combination
dcast(melt_data, gender + length ~ value, fun.aggregate = length)

#   gender    length F I O
# 1      F    adults 0 0 2
# 2      F subadults 0 1 0
# 3      M    adults 1 0 0
# 4      M subadults 1 1 0

In base R, two approaches to consider are ftable and aggregate.

Here's ftable:

> ftable(mydf, col.vars = "Choice")
                 Choice  F  I  O
Length    Gender                
adults    F             10  1  1
          M              7  2  2
subadults F              1  8  2
          M              9  8  4

And here's aggregate:

> aggregate(Choice ~ Length + Gender, mydf, table)
     Length Gender Choice.F Choice.I Choice.O
1    adults      F       10        1        1
2 subadults      F        1        8        2
3    adults      M        7        2        2
4 subadults      M        9        8        4
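One thing to be aware of with the aggregate approach: the printed Choice.F, Choice.I and Choice.O are columns of a single matrix stored in one data.frame column named Choice, so the result has only three top-level columns. The sketch below uses a made-up toy data frame (toy, res and flat are illustrative names) and shows one common way to flatten the result. It assumes Choice is a factor, so that every group reports all three levels.

```r
# Hypothetical toy data; Choice is made a factor so that table() returns
# a count for all three levels in every group
toy <- data.frame(
  Choice = factor(c("I", "F", "F", "O", "F", "I")),
  Length = c("subadults", "subadults", "adults", "adults", "adults", "subadults"),
  Gender = c("M", "M", "M", "F", "F", "F")
)

res <- aggregate(Choice ~ Length + Gender, toy, table)
ncol(res)        # 3: Length, Gender and the matrix column Choice
is.matrix(res$Choice)

# Flatten the matrix column into ordinary columns
flat <- do.call(data.frame, res)
names(flat)
print(flat)
```

After flattening, columns such as flat$Choice.F can be used like any other column.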

With "data.table", you can also try the following:

as.data.table(mydf)[, as.list(table(Choice)), by = list(Length, Gender)]
#       Length Gender  F I O
# 1: subadults      M  9 8 4
# 2:    adults      M  7 2 2
# 3:    adults      F 10 1 1
# 4: subadults      F  1 8 2

However, dcast.data.table would be the more common approach:

dcast.data.table(as.data.table(mydf), Length + Gender ~ Choice, value.var = "Choice")

With "dplyr" and "tidyr", you could try:

library(dplyr)
library(tidyr)

mydf %>%
  group_by(Length, Gender, Choice) %>%
  summarise(Count = n()) %>%
  spread(Choice, Count)
# Source: local data frame [4 x 5]
# 
#      Length Gender  F I O
# 1    adults      F 10 1 1
# 2    adults      M  7 2 2
# 3 subadults      F  1 8 2
# 4 subadults      M  9 8 4
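In tidyr 1.0.0 and later, spread is superseded by pivot_wider. A rough equivalent of the pipeline above, assuming a recent dplyr and tidyr are installed, might look like the sketch below; the toy data frame and the name wide are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data with the same three columns as in the question
toy <- data.frame(
  Choice = c("I", "F", "F", "O", "F", "I"),
  Length = c("subadults", "subadults", "adults", "adults", "adults", "subadults"),
  Gender = c("M", "M", "M", "F", "F", "F")
)

wide <- toy %>%
  count(Length, Gender, Choice) %>%            # one row per combination, count in column n
  pivot_wider(names_from = Choice,             # one new column per Choice value
              values_from = n,
              values_fill = list(n = 0L))      # combinations that never occur become 0
print(wide)
```

The values_fill argument replaces the NA that would otherwise appear for combinations that never occur in the data.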