Randomly retrieve elements from another table B with data.table

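I have a table A of type data.table that contains a set of elements. I want to add a new mass column to this table, with each value drawn at random from the rows of another table, table B, that have the same element. I want to do this with the data.table package, but I don't know how to optimize it.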

Table A

element
----------
silver
chlorine
silver
chlorine
chlorine
chlorine
silver

Table B

mass    element
----------------
0.3     silver
0.5     silver
1.6     silver
1.2     chlorine
5.3     chlorine
0.1     chlorine

Code to build the tables:

library(data.table)

tableA <- data.table(
  element = c("silver","chlorine","silver", "chlorine","chlorine","chlorine","silver")
)

tableB <- data.table(
  mass = c(0.3,0.5,1.6,1.2,5.3,0.1),
  element = c("silver","silver","silver", "chlorine","chlorine","chlorine")
)

The data volume is very large, so I want to stay within the data.table package.

Expected result:

    element     mass
-----------------------
    silver      1.6
    chlorine    5.3
    silver      0.3
    chlorine    1.2
    chlorine    1.2
    chlorine    0.1
    silver      1.6

The code below returns an error that I cannot fix. Even so, do you think this approach is sound and well optimized?

tableA[, mass := tableB[sample(mass), on = .(element)]$mass]

Here is one approach:

set.seed(42)
tableA[tableB[, list(mass = list(mass)), by = element], mass := sapply(i.mass, sample, size = 1), on = .(element)]
tableA
#     element  mass
#      <char> <num>
# 1:   silver   0.3
# 2: chlorine   1.2
# 3:   silver   0.3
# 4: chlorine   5.3
# 5: chlorine   5.3
# 6: chlorine   5.3
# 7:   silver   0.3
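
One caveat when calling `sample` on each element's masses: if `x` is a single number of at least 1, `sample(x, 1)` draws from `1:x` instead of returning `x`, so an element with only one mass in table B could receive a value that is not in table B. A minimal sketch of a guard follows. The helper `sample_one`, the tables `tableA2` and `tableB2`, and the extra element `gold` are assumptions made up for illustration and are not part of the question or of data.table.

```r
library(data.table)

# Hypothetical helper: pick one value by position, so a length-1 vector
# is returned as-is instead of being expanded to 1:x by sample().
sample_one <- function(x) x[sample.int(length(x), 1L)]

# Illustrative data (assumption): "gold" has a single mass in tableB2.
tableA2 <- data.table(element = c("silver", "gold", "gold"))
tableB2 <- data.table(
  mass    = c(0.3, 0.5, 1.6, 7.0),
  element = c("silver", "silver", "silver", "gold")
)

set.seed(42)
# Same update join as above, with sample_one in place of sample(size = 1).
tableA2[tableB2[, list(mass = list(mass)), by = element],
        mass := sapply(i.mass, sample_one),
        on = .(element)]
tableA2
```

With this guard every `gold` row should receive 7.0, whereas `sample(7.0, 1)` would return an integer between 1 and 7.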

If you are interested, here is another data.table approach:

tableA[, mass := sample(tableB[element == .BY$element, mass], .N, replace = TRUE), by = element]

Output:

    element mass
1:   silver  0.3
2: chlorine  1.2
3:   silver  1.6
4: chlorine  5.3
5: chlorine  5.3
6: chlorine  5.3
7:   silver  0.5
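
Because the result is random, it cannot be compared against a fixed expected table. One way to sanity-check it is to confirm that every (element, mass) pair in table A also occurs in table B. The sketch below does this with an anti-join. The seed value and the name `bad` are arbitrary choices for illustration.

```r
library(data.table)

tableA <- data.table(
  element = c("silver","chlorine","silver","chlorine","chlorine","chlorine","silver")
)
tableB <- data.table(
  mass    = c(0.3,0.5,1.6,1.2,5.3,0.1),
  element = c("silver","silver","silver","chlorine","chlorine","chlorine")
)

set.seed(1)  # arbitrary seed, only for reproducibility
tableA[, mass := sample(tableB[element == .BY$element, mass], .N, replace = TRUE),
       by = element]

# Anti-join: rows of tableA whose (element, mass) pair does not occur in tableB.
bad <- tableA[!tableB, on = .(element, mass)]

# Should be 0 if every mass was drawn from the matching element.
nrow(bad)
```

The same check can be run after either approach above, since it only looks at the final `element` and `mass` columns.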