How to "populate" a range of values in different rows, in R?
I have a data frame with thousands of rows, similar to:
set.seed(1)
df = data.frame(ID=c(1:10),
from=c(sort(runif(10, min=1000, max=10000))),
to=c(sort(runif(10, min=5000, max=20000))))
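In this data frame I have a range (from, to) for each ID, and I need some intermediate values within it, for instance one number every 1000 units. The desired output is a data frame with the ID column and one row for each value in the range, for example: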
ID number
1 1556.076
1 2556.076
1 3556.076
1 4556.076
1 5556.076
1 6556.076
1 7556.076
1 7648.351 # From column "to"
2 2815.137
2 3815.137
2 4815.137
2 5815.137
2 6815.137
2 7815.137
2 8089.619 # From column "to"
# And so on…
Can anyone help? I have been trying several different approaches, but could not find one that works.
Thanks in advance!
We create a list column by looping over the corresponding 'from' and 'to' columns with map2, applying seq to each pair, and then unnest the list column:
library(dplyr)
library(purrr)
library(tidyr)
df %>%
transmute(ID, number = map2(from, to, ~ seq(.x, .y, by = 1000))) %>%
unnest(c(number))
Output:
# A tibble: 80 x 2
# ID number
# <int> <dbl>
# 1 1 1556.
# 2 1 2556.
# 3 1 3556.
# 4 1 4556.
# 5 1 5556.
# 6 1 6556.
# 7 1 7556.
# 8 2 2815.
# 9 2 3815.
#10 2 4815.
# … with 70 more rows
Or using base R with Map:
lst1 <- Map(seq, MoreArgs = list(by = 1000), df$from, df$to)
data.frame(ID = rep(df$ID, lengths(lst1)), number = unlist(lst1))
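Note that seq(from, to, by = 1000) stops at the last step that does not exceed 'to', so the rows marked # From column "to" in the desired output are not produced by the calls above. A minimal sketch of one way to append them, assuming every 'from' is less than or equal to its 'to' (seq raises an error otherwise); the helper name seq_with_end is hypothetical and not part of any package:

```r
# Same sample data as in the question
set.seed(1)
df <- data.frame(ID = 1:10,
                 from = sort(runif(10, min = 1000, max = 10000)),
                 to = sort(runif(10, min = 5000, max = 20000)))

# Hypothetical helper: steps of `by` starting at `from`, followed by `to` itself.
# unique() drops the duplicate when a step lands exactly on `to`.
seq_with_end <- function(from, to, by = 1000) {
  unique(c(seq(from, to, by = by), to))
}

# One numeric vector per row, then flatten and repeat each ID to match
lst2 <- Map(seq_with_end, df$from, df$to)
res <- data.frame(ID = rep(df$ID, lengths(lst2)), number = unlist(lst2))
head(res, 8)
```

The same helper could be dropped into the map2 call in place of the seq formula if the tidyverse version is preferred.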
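Here is a data.table option:
library(data.table)
# .SD holds the 'from' and 'to' columns for each ID; their names match seq()'s arguments
setDT(df)[, .(number = do.call(seq, c(.SD, by = 1e3))), ID]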
ID number
1: 1 1556.076
2: 1 2556.076
3: 1 3556.076
4: 1 4556.076
5: 1 5556.076
6: 1 6556.076
7: 1 7556.076
8: 2 2815.137
9: 2 3815.137
10: 2 4815.137
11: 2 5815.137
12: 2 6815.137
13: 2 7815.137
14: 3 3389.578
15: 3 4389.578
16: 3 5389.578
17: 3 6389.578
18: 3 7389.578
19: 3 8389.578
20: 3 9389.578
21: 3 10389.578
22: 4 4349.115
23: 4 5349.115
24: 4 6349.115
25: 4 7349.115
26: 4 8349.115
27: 4 9349.115
28: 4 10349.115
29: 5 6155.680
30: 5 7155.680
31: 5 8155.680
32: 5 9155.680
33: 5 10155.680
34: 5 11155.680
35: 5 12155.680
36: 6 6662.026
37: 6 7662.026
38: 6 8662.026
39: 6 9662.026
40: 6 10662.026
41: 6 11662.026
42: 6 12662.026
43: 6 13662.026
44: 6 14662.026
45: 7 6947.180
46: 7 7947.180
47: 7 8947.180
48: 7 9947.180
49: 7 10947.180
50: 7 11947.180
51: 7 12947.180
52: 7 13947.180
53: 7 14947.180
54: 8 9085.507
55: 8 10085.507
56: 8 11085.507
57: 8 12085.507
58: 8 13085.507
59: 8 14085.507
60: 8 15085.507
61: 8 16085.507
62: 9 9173.870
63: 9 10173.870
64: 9 11173.870
65: 9 12173.870
66: 9 13173.870
67: 9 14173.870
68: 9 15173.870
69: 9 16173.870
70: 10 9502.077
71: 10 10502.077
72: 10 11502.077
73: 10 12502.077
74: 10 13502.077
75: 10 14502.077
76: 10 15502.077
77: 10 16502.077
78: 10 17502.077
79: 10 18502.077
80: 10 19502.077
ID number