R:按日期将行插入数据框
R: Insert rows into dataframe by date
(R)
我有一个数据框,其中包含仅具有可用分数的日期行。
我想在没有分数的日子里插入分数为 0 的行,在死亡日期后的日子里插入分数为 20 的行(NA 表示研究对象在第 90 天还活着)。第 1 天被定义为具有可用分数的第一个日期,我正在尝试为每个 id(第 1 到 90 天)创建一个正好包含 90 行的数据框。有谁知道如何解决这个问题?提前致谢。
id date death score
1 01-01-2001 04-02-2001 14
1 02-01-2001 04-02-2001 16
1 31-01-2001 04-02-2001 15
1 01-02-2001 04-02-2001 16
1 02-02-2001 04-02-2001 12
2 01-04-2001 NA 5
2 03-04-2001 NA 6
2 05-05-2001 NA 9
2 06-05-2001 NA 2
2 07-05-2001 NA 10
您可以尝试以下方法 - 如果有帮助请告诉我。
首先,确保你的日期是 Date
格式(如果还没有的话):
df$date <- as.Date(df$date, format = "%d-%m-%Y")
df$death <- as.Date(df$death, format = "%d-%m-%Y")
您可以使用 tidyverse
和 lubridate
来实现所需的大部分功能。另请注意,padr
可用于填充日期。
在这种情况下,第一个 group_by
id。您可以使用 complete
来填写缺失的日期。使用的日期顺序将从第一个日期到 89 天后(总共 90 天)。对于添加的这些额外日期,fill
将使 score
为零。
另一个 fill
可用于为每个后续添加的行包含死亡日期(因为它似乎是已经使用的结构)。
最后,您可以 mutate
score
- 如果 death
不是 NA
并且 date
排在 [=24= 之后],然后将值设为20(否则只保留当前的score
)。
library(tidyverse)
library(lubridate)
df %>%
group_by(id) %>%
complete(date = seq(date[1], date[1] + days(89), by = "1 day"), fill = list(score = 0)) %>%
fill(death, .direction = "down") %>%
mutate(score = ifelse(!is.na(death) & date > death, 20, score)) %>%
print(n=180)
输出
# A tibble: 180 x 4
# Groups: id [2]
id date death score
<int> <date> <date> <dbl>
1 1 2001-01-01 2001-02-04 14
2 1 2001-01-02 2001-02-04 16
3 1 2001-01-03 2001-02-04 0
4 1 2001-01-04 2001-02-04 0
5 1 2001-01-05 2001-02-04 0
6 1 2001-01-06 2001-02-04 0
7 1 2001-01-07 2001-02-04 0
8 1 2001-01-08 2001-02-04 0
9 1 2001-01-09 2001-02-04 0
10 1 2001-01-10 2001-02-04 0
11 1 2001-01-11 2001-02-04 0
12 1 2001-01-12 2001-02-04 0
13 1 2001-01-13 2001-02-04 0
14 1 2001-01-14 2001-02-04 0
15 1 2001-01-15 2001-02-04 0
16 1 2001-01-16 2001-02-04 0
17 1 2001-01-17 2001-02-04 0
18 1 2001-01-18 2001-02-04 0
19 1 2001-01-19 2001-02-04 0
20 1 2001-01-20 2001-02-04 0
21 1 2001-01-21 2001-02-04 0
22 1 2001-01-22 2001-02-04 0
23 1 2001-01-23 2001-02-04 0
24 1 2001-01-24 2001-02-04 0
25 1 2001-01-25 2001-02-04 0
26 1 2001-01-26 2001-02-04 0
27 1 2001-01-27 2001-02-04 0
28 1 2001-01-28 2001-02-04 0
29 1 2001-01-29 2001-02-04 0
30 1 2001-01-30 2001-02-04 0
31 1 2001-01-31 2001-02-04 15
32 1 2001-02-01 2001-02-04 16
33 1 2001-02-02 2001-02-04 12
34 1 2001-02-03 2001-02-04 0
35 1 2001-02-04 2001-02-04 0
36 1 2001-02-05 2001-02-04 20
37 1 2001-02-06 2001-02-04 20
38 1 2001-02-07 2001-02-04 20
39 1 2001-02-08 2001-02-04 20
40 1 2001-02-09 2001-02-04 20
41 1 2001-02-10 2001-02-04 20
42 1 2001-02-11 2001-02-04 20
43 1 2001-02-12 2001-02-04 20
44 1 2001-02-13 2001-02-04 20
45 1 2001-02-14 2001-02-04 20
46 1 2001-02-15 2001-02-04 20
47 1 2001-02-16 2001-02-04 20
48 1 2001-02-17 2001-02-04 20
49 1 2001-02-18 2001-02-04 20
50 1 2001-02-19 2001-02-04 20
51 1 2001-02-20 2001-02-04 20
52 1 2001-02-21 2001-02-04 20
53 1 2001-02-22 2001-02-04 20
54 1 2001-02-23 2001-02-04 20
55 1 2001-02-24 2001-02-04 20
56 1 2001-02-25 2001-02-04 20
57 1 2001-02-26 2001-02-04 20
58 1 2001-02-27 2001-02-04 20
59 1 2001-02-28 2001-02-04 20
60 1 2001-03-01 2001-02-04 20
61 1 2001-03-02 2001-02-04 20
62 1 2001-03-03 2001-02-04 20
63 1 2001-03-04 2001-02-04 20
64 1 2001-03-05 2001-02-04 20
65 1 2001-03-06 2001-02-04 20
66 1 2001-03-07 2001-02-04 20
67 1 2001-03-08 2001-02-04 20
68 1 2001-03-09 2001-02-04 20
69 1 2001-03-10 2001-02-04 20
70 1 2001-03-11 2001-02-04 20
71 1 2001-03-12 2001-02-04 20
72 1 2001-03-13 2001-02-04 20
73 1 2001-03-14 2001-02-04 20
74 1 2001-03-15 2001-02-04 20
75 1 2001-03-16 2001-02-04 20
76 1 2001-03-17 2001-02-04 20
77 1 2001-03-18 2001-02-04 20
78 1 2001-03-19 2001-02-04 20
79 1 2001-03-20 2001-02-04 20
80 1 2001-03-21 2001-02-04 20
81 1 2001-03-22 2001-02-04 20
82 1 2001-03-23 2001-02-04 20
83 1 2001-03-24 2001-02-04 20
84 1 2001-03-25 2001-02-04 20
85 1 2001-03-26 2001-02-04 20
86 1 2001-03-27 2001-02-04 20
87 1 2001-03-28 2001-02-04 20
88 1 2001-03-29 2001-02-04 20
89 1 2001-03-30 2001-02-04 20
90 1 2001-03-31 2001-02-04 20
91 2 2001-04-01 NA 5
92 2 2001-04-02 NA 0
93 2 2001-04-03 NA 6
94 2 2001-04-04 NA 0
95 2 2001-04-05 NA 0
96 2 2001-04-06 NA 0
97 2 2001-04-07 NA 0
98 2 2001-04-08 NA 0
99 2 2001-04-09 NA 0
100 2 2001-04-10 NA 0
101 2 2001-04-11 NA 0
102 2 2001-04-12 NA 0
103 2 2001-04-13 NA 0
104 2 2001-04-14 NA 0
105 2 2001-04-15 NA 0
106 2 2001-04-16 NA 0
107 2 2001-04-17 NA 0
108 2 2001-04-18 NA 0
109 2 2001-04-19 NA 0
110 2 2001-04-20 NA 0
111 2 2001-04-21 NA 0
112 2 2001-04-22 NA 0
113 2 2001-04-23 NA 0
114 2 2001-04-24 NA 0
115 2 2001-04-25 NA 0
116 2 2001-04-26 NA 0
117 2 2001-04-27 NA 0
118 2 2001-04-28 NA 0
119 2 2001-04-29 NA 0
120 2 2001-04-30 NA 0
121 2 2001-05-01 NA 0
122 2 2001-05-02 NA 0
123 2 2001-05-03 NA 0
124 2 2001-05-04 NA 0
125 2 2001-05-05 NA 9
126 2 2001-05-06 NA 2
127 2 2001-05-07 NA 10
128 2 2001-05-08 NA 0
129 2 2001-05-09 NA 0
130 2 2001-05-10 NA 0
131 2 2001-05-11 NA 0
132 2 2001-05-12 NA 0
133 2 2001-05-13 NA 0
134 2 2001-05-14 NA 0
135 2 2001-05-15 NA 0
136 2 2001-05-16 NA 0
137 2 2001-05-17 NA 0
138 2 2001-05-18 NA 0
139 2 2001-05-19 NA 0
140 2 2001-05-20 NA 0
141 2 2001-05-21 NA 0
142 2 2001-05-22 NA 0
143 2 2001-05-23 NA 0
144 2 2001-05-24 NA 0
145 2 2001-05-25 NA 0
146 2 2001-05-26 NA 0
147 2 2001-05-27 NA 0
148 2 2001-05-28 NA 0
149 2 2001-05-29 NA 0
150 2 2001-05-30 NA 0
151 2 2001-05-31 NA 0
152 2 2001-06-01 NA 0
153 2 2001-06-02 NA 0
154 2 2001-06-03 NA 0
155 2 2001-06-04 NA 0
156 2 2001-06-05 NA 0
157 2 2001-06-06 NA 0
158 2 2001-06-07 NA 0
159 2 2001-06-08 NA 0
160 2 2001-06-09 NA 0
161 2 2001-06-10 NA 0
162 2 2001-06-11 NA 0
163 2 2001-06-12 NA 0
164 2 2001-06-13 NA 0
165 2 2001-06-14 NA 0
166 2 2001-06-15 NA 0
167 2 2001-06-16 NA 0
168 2 2001-06-17 NA 0
169 2 2001-06-18 NA 0
170 2 2001-06-19 NA 0
171 2 2001-06-20 NA 0
172 2 2001-06-21 NA 0
173 2 2001-06-22 NA 0
174 2 2001-06-23 NA 0
175 2 2001-06-24 NA 0
176 2 2001-06-25 NA 0
177 2 2001-06-26 NA 0
178 2 2001-06-27 NA 0
179 2 2001-06-28 NA 0
180 2 2001-06-29 NA 0
(R) 我有一个数据框,其中包含仅具有可用分数的日期行。 我想在没有分数的日子里插入分数为 0 的行,在死亡日期后的日子里插入分数为 20 的行(NA 表示研究对象在第 90 天还活着)。第 1 天被定义为具有可用分数的第一个日期,我正在尝试为每个 id(第 1 到 90 天)创建一个正好包含 90 行的数据框。有谁知道如何解决这个问题?提前致谢。
id date death score
1 01-01-2001 04-02-2001 14
1 02-01-2001 04-02-2001 16
1 31-01-2001 04-02-2001 15
1 01-02-2001 04-02-2001 16
1 02-02-2001 04-02-2001 12
2 01-04-2001 NA 5
2 03-04-2001 NA 6
2 05-05-2001 NA 9
2 06-05-2001 NA 2
2 07-05-2001 NA 10
您可以尝试以下方法 - 如果有帮助请告诉我。
首先,确保你的日期是 Date
格式(如果还没有的话):
df$date <- as.Date(df$date, format = "%d-%m-%Y")
df$death <- as.Date(df$death, format = "%d-%m-%Y")
您可以使用 tidyverse
和 lubridate
来实现所需的大部分功能。另请注意,padr
可用于填充日期。
在这种情况下,第一个 group_by
id。您可以使用 complete
来填写缺失的日期。使用的日期顺序将从第一个日期到 89 天后(总共 90 天)。对于添加的这些额外日期,fill
将使 score
为零。
另一个 fill
可用于为每个后续添加的行包含死亡日期(因为它似乎是已经使用的结构)。
最后,您可以 mutate
score
- 如果 death
不是 NA
并且 date
排在 [=24= 之后],然后将值设为20(否则只保留当前的score
)。
library(tidyverse)
library(lubridate)
df %>%
group_by(id) %>%
complete(date = seq(date[1], date[1] + days(89), by = "1 day"), fill = list(score = 0)) %>%
fill(death, .direction = "down") %>%
mutate(score = ifelse(!is.na(death) & date > death, 20, score)) %>%
print(n=180)
输出
# A tibble: 180 x 4
# Groups: id [2]
id date death score
<int> <date> <date> <dbl>
1 1 2001-01-01 2001-02-04 14
2 1 2001-01-02 2001-02-04 16
3 1 2001-01-03 2001-02-04 0
4 1 2001-01-04 2001-02-04 0
5 1 2001-01-05 2001-02-04 0
6 1 2001-01-06 2001-02-04 0
7 1 2001-01-07 2001-02-04 0
8 1 2001-01-08 2001-02-04 0
9 1 2001-01-09 2001-02-04 0
10 1 2001-01-10 2001-02-04 0
11 1 2001-01-11 2001-02-04 0
12 1 2001-01-12 2001-02-04 0
13 1 2001-01-13 2001-02-04 0
14 1 2001-01-14 2001-02-04 0
15 1 2001-01-15 2001-02-04 0
16 1 2001-01-16 2001-02-04 0
17 1 2001-01-17 2001-02-04 0
18 1 2001-01-18 2001-02-04 0
19 1 2001-01-19 2001-02-04 0
20 1 2001-01-20 2001-02-04 0
21 1 2001-01-21 2001-02-04 0
22 1 2001-01-22 2001-02-04 0
23 1 2001-01-23 2001-02-04 0
24 1 2001-01-24 2001-02-04 0
25 1 2001-01-25 2001-02-04 0
26 1 2001-01-26 2001-02-04 0
27 1 2001-01-27 2001-02-04 0
28 1 2001-01-28 2001-02-04 0
29 1 2001-01-29 2001-02-04 0
30 1 2001-01-30 2001-02-04 0
31 1 2001-01-31 2001-02-04 15
32 1 2001-02-01 2001-02-04 16
33 1 2001-02-02 2001-02-04 12
34 1 2001-02-03 2001-02-04 0
35 1 2001-02-04 2001-02-04 0
36 1 2001-02-05 2001-02-04 20
37 1 2001-02-06 2001-02-04 20
38 1 2001-02-07 2001-02-04 20
39 1 2001-02-08 2001-02-04 20
40 1 2001-02-09 2001-02-04 20
41 1 2001-02-10 2001-02-04 20
42 1 2001-02-11 2001-02-04 20
43 1 2001-02-12 2001-02-04 20
44 1 2001-02-13 2001-02-04 20
45 1 2001-02-14 2001-02-04 20
46 1 2001-02-15 2001-02-04 20
47 1 2001-02-16 2001-02-04 20
48 1 2001-02-17 2001-02-04 20
49 1 2001-02-18 2001-02-04 20
50 1 2001-02-19 2001-02-04 20
51 1 2001-02-20 2001-02-04 20
52 1 2001-02-21 2001-02-04 20
53 1 2001-02-22 2001-02-04 20
54 1 2001-02-23 2001-02-04 20
55 1 2001-02-24 2001-02-04 20
56 1 2001-02-25 2001-02-04 20
57 1 2001-02-26 2001-02-04 20
58 1 2001-02-27 2001-02-04 20
59 1 2001-02-28 2001-02-04 20
60 1 2001-03-01 2001-02-04 20
61 1 2001-03-02 2001-02-04 20
62 1 2001-03-03 2001-02-04 20
63 1 2001-03-04 2001-02-04 20
64 1 2001-03-05 2001-02-04 20
65 1 2001-03-06 2001-02-04 20
66 1 2001-03-07 2001-02-04 20
67 1 2001-03-08 2001-02-04 20
68 1 2001-03-09 2001-02-04 20
69 1 2001-03-10 2001-02-04 20
70 1 2001-03-11 2001-02-04 20
71 1 2001-03-12 2001-02-04 20
72 1 2001-03-13 2001-02-04 20
73 1 2001-03-14 2001-02-04 20
74 1 2001-03-15 2001-02-04 20
75 1 2001-03-16 2001-02-04 20
76 1 2001-03-17 2001-02-04 20
77 1 2001-03-18 2001-02-04 20
78 1 2001-03-19 2001-02-04 20
79 1 2001-03-20 2001-02-04 20
80 1 2001-03-21 2001-02-04 20
81 1 2001-03-22 2001-02-04 20
82 1 2001-03-23 2001-02-04 20
83 1 2001-03-24 2001-02-04 20
84 1 2001-03-25 2001-02-04 20
85 1 2001-03-26 2001-02-04 20
86 1 2001-03-27 2001-02-04 20
87 1 2001-03-28 2001-02-04 20
88 1 2001-03-29 2001-02-04 20
89 1 2001-03-30 2001-02-04 20
90 1 2001-03-31 2001-02-04 20
91 2 2001-04-01 NA 5
92 2 2001-04-02 NA 0
93 2 2001-04-03 NA 6
94 2 2001-04-04 NA 0
95 2 2001-04-05 NA 0
96 2 2001-04-06 NA 0
97 2 2001-04-07 NA 0
98 2 2001-04-08 NA 0
99 2 2001-04-09 NA 0
100 2 2001-04-10 NA 0
101 2 2001-04-11 NA 0
102 2 2001-04-12 NA 0
103 2 2001-04-13 NA 0
104 2 2001-04-14 NA 0
105 2 2001-04-15 NA 0
106 2 2001-04-16 NA 0
107 2 2001-04-17 NA 0
108 2 2001-04-18 NA 0
109 2 2001-04-19 NA 0
110 2 2001-04-20 NA 0
111 2 2001-04-21 NA 0
112 2 2001-04-22 NA 0
113 2 2001-04-23 NA 0
114 2 2001-04-24 NA 0
115 2 2001-04-25 NA 0
116 2 2001-04-26 NA 0
117 2 2001-04-27 NA 0
118 2 2001-04-28 NA 0
119 2 2001-04-29 NA 0
120 2 2001-04-30 NA 0
121 2 2001-05-01 NA 0
122 2 2001-05-02 NA 0
123 2 2001-05-03 NA 0
124 2 2001-05-04 NA 0
125 2 2001-05-05 NA 9
126 2 2001-05-06 NA 2
127 2 2001-05-07 NA 10
128 2 2001-05-08 NA 0
129 2 2001-05-09 NA 0
130 2 2001-05-10 NA 0
131 2 2001-05-11 NA 0
132 2 2001-05-12 NA 0
133 2 2001-05-13 NA 0
134 2 2001-05-14 NA 0
135 2 2001-05-15 NA 0
136 2 2001-05-16 NA 0
137 2 2001-05-17 NA 0
138 2 2001-05-18 NA 0
139 2 2001-05-19 NA 0
140 2 2001-05-20 NA 0
141 2 2001-05-21 NA 0
142 2 2001-05-22 NA 0
143 2 2001-05-23 NA 0
144 2 2001-05-24 NA 0
145 2 2001-05-25 NA 0
146 2 2001-05-26 NA 0
147 2 2001-05-27 NA 0
148 2 2001-05-28 NA 0
149 2 2001-05-29 NA 0
150 2 2001-05-30 NA 0
151 2 2001-05-31 NA 0
152 2 2001-06-01 NA 0
153 2 2001-06-02 NA 0
154 2 2001-06-03 NA 0
155 2 2001-06-04 NA 0
156 2 2001-06-05 NA 0
157 2 2001-06-06 NA 0
158 2 2001-06-07 NA 0
159 2 2001-06-08 NA 0
160 2 2001-06-09 NA 0
161 2 2001-06-10 NA 0
162 2 2001-06-11 NA 0
163 2 2001-06-12 NA 0
164 2 2001-06-13 NA 0
165 2 2001-06-14 NA 0
166 2 2001-06-15 NA 0
167 2 2001-06-16 NA 0
168 2 2001-06-17 NA 0
169 2 2001-06-18 NA 0
170 2 2001-06-19 NA 0
171 2 2001-06-20 NA 0
172 2 2001-06-21 NA 0
173 2 2001-06-22 NA 0
174 2 2001-06-23 NA 0
175 2 2001-06-24 NA 0
176 2 2001-06-25 NA 0
177 2 2001-06-26 NA 0
178 2 2001-06-27 NA 0
179 2 2001-06-28 NA 0
180 2 2001-06-29 NA 0