将数据框重新格式化为长格式,保持分组列

Reformat data frame to long format keeping grouped columns

考虑以下数据框:

set.seed(42)
ID <- c(1:6) 
OB <- c(rep("A",4),rep("B",2))
lat_start <- rnorm(6,42,2)
lon_start <- rnorm(6,12,2)
lat_stopp <- rnorm(6,42,2)
lon_stopp <- rnorm(6,12,2)
df <- data.frame(ID,OB,lat_start,lon_start,lat_stopp,lon_stopp)

我想将 df 重新格式化为长格式,其中每个 ID 都有一行作为开始和停止坐标。一个简单的 gather() 解决方案,例如df_wrong <- gather(df,coords,val,lat_start:lon_stopp) 显然不起作用,因为我需要 lat/lon 列保持分组。我希望长数据框看起来像这样:

   ID OB    SS      lat       lon
1   1  A start 44.74192 15.023040
2   1  A  stop 39.22228  7.119066
3   2  A start 40.87060 11.810680
4   2  A  stop 41.44242 14.640227
5   3  A start 42.72626 16.036850
6   3  A  stop 41.73336 11.386723
7   4  A start 43.26573 11.874570
8   4  A  stop 43.27190  8.437383
9   5  B start 42.80854 14.609740
10  5  B  stop 41.43149 11.656165
11  6  B start 41.78775 16.573290
12  6  B  stop 36.68709 14.429349

SS 栏目当然可以稍后添加。任何建议将不胜感激!

一个 tidyverse 可能是:

df %>%
 gather(var, val, -c(ID, OB)) %>%
 separate(var, c("var1", "SS")) %>%
 spread(var1, val)

   ID OB    SS      lat       lon
1   1  A start 44.74192 15.023044
2   1  A stopp 39.22228  7.119066
3   2  A start 40.87060 11.810682
4   2  A stopp 41.44242 14.640227
5   3  A start 42.72626 16.036847
6   3  A stopp 41.73336 11.386723
7   4  A start 43.26573 11.874572
8   4  A stopp 43.27190  8.437383
9   5  B start 42.80854 14.609739
10  5  B stopp 41.43149 11.656165
11  6  B start 41.78775 16.573291
12  6  B stopp 36.68709 14.429349

您可以unlist并创建一个重新排列的矩阵。

d <- matrix(unlist(t(df[3:6])), ncol=2, byrow=TRUE)
setNames(data.frame(do.call(cbind, lapply(list(df$ID, as.character(df$OB)), 
                                 function(x) rep(x, each=2))), 
                    c("start", "stop"),
                    d), c(names(df)[1:3], "lat", "lon"))
#    ID OB lat_start      lat       lon
# 1   1  A     start 44.74192 15.023044
# 2   1  A      stop 39.22228  7.119066
# 3   2  A     start 40.87060 11.810682
# 4   2  A      stop 41.44242 14.640227
# 5   3  A     start 42.72626 16.036847
# 6   3  A      stop 41.73336 11.386723
# 7   4  A     start 43.26573 11.874572
# 8   4  A      stop 43.27190  8.437383
# 9   5  B     start 42.80854 14.609739
# 10  5  B      stop 41.43149 11.656165
# 11  6  B     start 41.78775 16.573291
# 12  6  B      stop 36.68709 14.429349