Randomly retrieve elements from another table B with data.table
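I have a table A of type data.table that contains a set of elements, and I want to add a new mass column to it, with values drawn at random from another table, table B.
I would like to do this with the data.table package, but I don't know how to optimize it.
Table A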
element
----------
silver
chlorine
silver
chlorine
chlorine
chlorine
silver
Table B
mass element
------
0.3 silver
0.5 silver
1.6 silver
1.2 chlorine
5.3 chlorine
0.1 chlorine
Code to build the tables:
library(data.table)
tableA <- data.table(
element = c("silver","chlorine","silver", "chlorine","chlorine","chlorine","silver")
)
tableB <- data.table(
mass = c(0.3,0.5,1.6,1.2,5.3,0.1),
element = c("silver","silver","silver", "chlorine","chlorine","chlorine")
)
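The volume of data is very large, so I want to do this with the data.table package.
Expected result (one possible outcome, since the masses are drawn at random):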
element mass
-----------------------
silver 1.6
chlorine 5.3
silver 0.3
chlorine 1.2
chlorine 1.2
chlorine 0.1
silver 1.6
The code below returns an error that I can't fix. Apart from the error, do you think this approach is sound and efficient?
tableA[, mass := tableB[sample(mass), on = .(element)]$mass]
Here is one approach:
set.seed(42)
tableA[tableB[, list(mass = list(mass)), by = element], mass := sapply(i.mass, sample, size = 1), on = .(element)]
tableA
# element mass
# <char> <num>
# 1: silver 0.3
# 2: chlorine 1.2
# 3: silver 0.3
# 4: chlorine 5.3
# 5: chlorine 5.3
# 6: chlorine 5.3
# 7: silver 0.3
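To make the one-liner above easier to follow, here is the same idea split into two steps. This is only a sketch: `pool` is a name I made up for the intermediate table, and the rest comes straight from the question and the call above.

```r
library(data.table)

tableA <- data.table(
  element = c("silver", "chlorine", "silver", "chlorine", "chlorine", "chlorine", "silver")
)
tableB <- data.table(
  mass = c(0.3, 0.5, 1.6, 1.2, 5.3, 0.1),
  element = c("silver", "silver", "silver", "chlorine", "chlorine", "chlorine")
)

# Step 1: collapse tableB to one row per element, keeping all candidate
# masses for that element in a list column.
pool <- tableB[, list(mass = list(mass)), by = element]

# Step 2: update join. Every row of tableA is matched to the pool of its
# element (i.mass), and sapply() draws one value from that pool per row.
set.seed(42)
tableA[pool, mass := sapply(i.mass, sample, size = 1), on = .(element)]
tableA
```

Because the draw happens once per row of tableA, two rows with the same element can receive different masses.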
If you're interested, here is another data.table approach:
tableA[, mass := sample(tableB[element == .BY$element, mass], .N, replace = TRUE), by = element]
Output:
element mass
1: silver 0.3
2: chlorine 1.2
3: silver 1.6
4: chlorine 5.3
5: chlorine 5.3
6: chlorine 5.3
7: silver 0.5
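One thing to watch with `sample()`: if an element has only a single mass in tableB and that mass is a number >= 1, `sample(x, ...)` treats it as "sample from 1:x" instead of returning the value itself. A possible guard, as a sketch, is to sample positions with `sample.int()` and index into the vector. The helper name `draw` and the extra "gold" rows are made up for illustration and are not part of the question's data.

```r
library(data.table)

# Hypothetical data: "gold" has a single candidate mass in tableB.
tableA <- data.table(element = c("silver", "chlorine", "silver", "gold", "gold"))
tableB <- data.table(
  mass = c(0.3, 0.5, 1.2, 5.3, 19.3),
  element = c("silver", "silver", "chlorine", "chlorine", "gold")
)

# Hypothetical helper: draw n values from x with replacement by sampling
# positions, so a length-one x is returned as-is rather than expanded to 1:x.
draw <- function(x, n) x[sample.int(length(x), n, replace = TRUE)]

# Same grouped assignment as above, with draw() in place of sample().
tableA[, mass := draw(tableB[element == .BY$element, mass], .N), by = element]
tableA
```

With this version the two gold rows always get 19.3, whereas a plain `sample(19.3, 2, replace = TRUE)` would return integers between 1 and 19.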