Spread a column in a data frame based on a factor column
I have a data frame with 3 columns:
> str(lagdf)
'data.frame': 2208 obs. of 3 variables:
$ time: POSIXct, format: "2015-10-27 00:00:00" "2015-10-27 00:15:00" "2015-10-27 00:30:00" "2015-10-27 00:45:00" ...
$ site: Factor w/ 23 levels "2001","2002",..: 1 1 1 1 1 1 1 1 1 1 ...
$ lag : int 8 8 8 8 8 8 8 8 7 8 ...
The lag column holds the lag for a particular site at a particular time:
> head(lagdf,14)
time site lag
1 2015-10-27 00:00:00 2001 8
2 2015-10-27 00:15:00 2001 8
3 2015-10-27 00:30:00 2001 8
4 2015-10-27 00:45:00 2001 8
5 2015-10-27 01:00:00 2001 8
6 2015-10-27 01:15:00 2001 8
7 2015-10-27 01:30:00 2001 8
8 2015-10-27 01:45:00 2001 8
9 2015-10-27 02:00:00 2001 7
10 2015-10-27 02:15:00 2001 8
11 2015-10-27 02:30:00 2001 9
12 2015-10-27 02:45:00 2001 9
13 2015-10-27 03:00:00 2001 9
14 2015-10-27 03:15:00 2001 8
I would like to spread lag so that the lag values for each site become columns:
site lag1 lag2 lag3
2001 8 8 8
The time column does not need to be kept.
Using tidyr did not help.
You can indeed use tidyr in combination with dplyr:
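library(tidyr)
library(dplyr)
lagdf %>% group_by(site) %>%
  select(-time) %>%                              # drop time; site is kept as the grouping column
  mutate(row = paste0("lag", row_number())) %>%  # label each row within its site: lag1, lag2, ...
  spread(row, lag)                               # one column per label, one row per site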
Source: local data frame [1 x 15]
site lag1 lag10 lag11 lag12 lag13 lag14 lag2 lag3 lag4 lag5 lag6 lag7 lag8 lag9
(int) (int) (int) (int) (int) (int) (int) (int) (int) (int) (int) (int) (int) (int) (int)
1 2001 8 8 9 9 9 8 8 8 8 8 8 8 8 7
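Note that the columns in the output above come out in alphabetical order (lag1, lag10, lag11, ...) because the keys are character strings. If the row order matters, one option is pivot_wider(), which by default orders the new columns by first appearance. The following is only a minimal sketch, assuming tidyr 1.0.0 or later; `toy` is a made-up stand-in for lagdf with two sites and invented values, not the original data.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for lagdf: 2 sites, 12 readings each (values invented)
toy <- data.frame(
  time = as.POSIXct("2015-10-27 00:00:00", tz = "UTC") + 900 * rep(0:11, times = 2),
  site = factor(rep(c("2001", "2002"), each = 12)),
  lag  = c(8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 7L, 8L, 9L, 9L,
           5L, 5L, 6L, 6L, 5L, 5L, 6L, 6L, 5L, 5L, 6L, 6L)
)

wide <- toy %>%
  group_by(site) %>%
  mutate(row = paste0("lag", row_number())) %>%  # lag1, lag2, ... within each site
  ungroup() %>%
  select(-time) %>%                              # time would otherwise act as an id column
  pivot_wider(names_from = row, values_from = lag)

names(wide)
```

Here the columns should follow the order in which the labels first appear in the data (lag1, lag2, ..., lag12), with one row per site.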