Dividing values in a column of a data frame by values from a different data frame when row values match

Question

I have a data.frame x in the following format:

     species      site  count
1:         A       1.1     25
2:         A       1.2    152
3:         A       2.1     26
4:         A       3.5      1
5:         A       3.7     98
---                         
101:       B       1.2      6
102:       B       1.3     10
103:       B       2.1      8
104:       B       2.2      8
105:       B       2.3      5

I also have another data.frame, area, in the following format:

      species    area
1:          A    59.7
2:          B    34.4
3:          C    37.7
4:          D    22.8

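I want to divide the count column of data.frame x by the value in the area column of data.frame area wherever the values in the species column of the two data.frames match.

I have been trying to get this to work with the ddply function:

density = ddply(x, "species", mutate, density = x$count/area[,2])

But I can't figure out the correct indexing syntax for the area[] call to select only the rows that match the values found in x$species. I am also very new to the plyr package (and to the apply* family of functions in general), so this may be entirely the wrong approach.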

I would like to return a data.frame in the following format:

     species      site  count   density
1:         A       1.1     25     0.419
2:         A       1.2    152     2.546
3:         A       2.1     26     0.436
4:         A       3.5      1     0.017
5:         A       3.7     98     1.641
---                         
101:       B       1.2      6     0.174
102:       B       1.3     10     0.291
103:       B       2.1      8     0.233
104:       B       2.2      8     0.233
105:       B       2.3      5     0.145

Answer 1

This is easy with data.table:

library(data.table)
#converting your data to the native type for the package (by reference)
setDT(x); setDT(area) 
x[area, density:=count/i.area, on="species"]

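:= is the natural way to add columns in data.table (by reference; see this vignette, particularly point b, for more on this and why it matters), so x := y adds a column named x to your data.table and assigns it the value y.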
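To make the by-reference point concrete, here is a minimal sketch on made-up data. The column name doubled and the second name alias are purely illustrative and are not part of the question.

```r
library(data.table)

# Made-up toy data, not the asker's full table
dt <- data.table(species = c("A", "A", "B"), count = c(25, 152, 6))

# A second name for the same object; no copy is made here
alias <- dt

# := adds the column in place, with no `dt <- ...` assignment
dt[, doubled := count * 2]

# The alias sees the new column too, because the table was modified by reference
names(alias)
```

If an independent copy is wanted instead, copy(dt) makes one explicitly.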

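When merging in the form X[Y, ], we can think of Y as selecting the rows of X to operate on. Moreover, when Y is a data.table, all objects in both X and Y are available in j (i.e., what comes after the comma), so we could have said density := count/area. When we want to be sure we are referring to one of Y's columns, we prefix its name with i. so that it is clear we mean a column in i, i.e., what comes before the comma.

There should be a vignette on merging forthcoming.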
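Here is a small self-contained sketch of that update join, using only a few rows shaped like the ones in the question (the four rows of x are a subset chosen for illustration, not the full data):

```r
library(data.table)

# A handful of rows in the same shape as the question's data
x <- data.table(species = c("A", "A", "B", "B"),
                site    = c(1.1, 1.2, 1.2, 1.3),
                count   = c(25, 152, 6, 10))
area <- data.table(species = c("A", "B", "C", "D"),
                   area    = c(59.7, 34.4, 37.7, 22.8))

# For each row of `area` (the i argument), find the matching rows of x
# and set density there; i.area is the area column taken from i
x[area, density := count / i.area, on = "species"]

print(x[, .(species, site, count, density = round(density, 3))])
```

Species C and D have no rows in x, so they simply do not touch anything; x keeps its four rows and gains the density column.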

In general, whenever you think "match across different data sets", your instinct should be to merge. For more on data.table, see here.
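The same "merge first, then compute" idea also works without any packages. This is a rough base R sketch on a few illustrative rows, not something from the original answer; note that merge() may reorder rows by the key column.

```r
# A handful of rows in the same shape as the question's data
x <- data.frame(species = c("A", "A", "B", "B"),
                site    = c(1.1, 1.2, 1.2, 1.3),
                count   = c(25, 152, 6, 10))
area <- data.frame(species = c("A", "B", "C", "D"),
                   area    = c(59.7, 34.4, 37.7, 22.8))

# Left merge: keep every row of x and attach the matching area
merged <- merge(x, area, by = "species", all.x = TRUE)

# Compute the density, then drop the helper column
merged$density <- merged$count / merged$area
merged$area <- NULL

print(merged)
```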

Answer 2

I would use a merge (left_join) and then add the new column with mutate:

library(dplyr)

x %>% left_join(area, by="species") %>%
      mutate(density = count/area)
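For completeness, a self-contained sketch of the same pipeline on a few made-up rows. Species "E" is a hypothetical value added only to show what happens when a species has no entry in area, and the final select() step is an optional extra that drops the joined area column.

```r
library(dplyr)

# Illustrative rows; species "E" is hypothetical and has no match in `area`
x <- data.frame(species = c("A", "B", "E"),
                site    = c(1.1, 1.2, 2.1),
                count   = c(25, 6, 8))
area <- data.frame(species = c("A", "B", "C", "D"),
                   area    = c(59.7, 34.4, 37.7, 22.8))

result <- x %>%
  left_join(area, by = "species") %>%  # keeps every row of x
  mutate(density = count / area) %>%   # NA where no area was found
  select(-area)                        # drop the helper column

print(result)
```

Because left_join keeps unmatched rows, the row for species "E" stays in the result with an NA density rather than being dropped.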
