R Converting Wide Data to Long
How can I convert my data from this:
example <- data.frame(RTD_1_LOC = c('A', 'B'), RTD_2_LOC = c('C', 'D'),
                      RTD_3_LOC = c('E', 'F'), RTD_4_LOC = c('G', 'H'),
                      RTD_5_LOC = c('I', 'J'), RTD_1_OFF = c('1', '2'),
                      RTD_2_OFF = c('3', '4'), RTD_3_OFF = c('5', '6'),
                      RTD_4_OFF = c('7', '8'), RTD_5_OFF = c('9', '10'))
to this:
example2 <- data.frame(RTD = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
                       LOC = c('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'),
                       OFF = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
I have been using gather from the tidyverse, but I end up with about 50 columns:
ex <- gather(example,RTD, Location, RTD_1_LOC:RTD_5_LOC)
ex$RTD <- sub('_LOC',"",ex$RTD)
ex3 <- gather(ex,RTD, Offset, RTD_1_OFF:RTD_5_OFF)
ex3$RTD <- sub('_OFF', "", ex3$RTD)
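We can use pivot_longer from tidyr and specify names_pattern to capture groups from the column names. Since the 'RTD' column should be kept as is, pass names_to a vector of 'RTD' and the special '.value' entry. The 'RTD' column then receives the captured digits ((\\d+)), while the captured word ((\\w+)), here 'LOC' and 'OFF', becomes the names of new columns that hold the corresponding cell values.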
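Before running the reshape, it can help to check what the pattern captures from a column name. The following is only a small base R sketch; the vector `cols` and the variables `digits` and `suffix` are illustrative names that do not appear in the question.

```r
# Two sample column names in the same shape as the question's data
cols <- c("RTD_1_LOC", "RTD_3_OFF")

# Same pattern that is passed to names_pattern (backslashes doubled in R strings)
pattern <- "\\w+_(\\d+)_(\\w+)"

# First capture group: the digits that end up in the RTD column
digits <- sub(pattern, "\\1", cols)

# Second capture group: the suffix that becomes a new column name via .value
suffix <- sub(pattern, "\\2", cols)

print(digits)
print(suffix)
```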
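library(dplyr)   # provides the %>% pipe
library(tidyr)   # provides pivot_longer

example %>%
  pivot_longer(cols = everything(),
               # first capture -> RTD column, second capture -> new column names
               names_to = c("RTD", ".value"),
               # backslashes must be doubled inside an R string
               names_pattern = "\\w+_(\\d+)_(\\w+)")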
Output:
# A tibble: 10 x 3
   RTD   LOC   OFF
   <chr> <chr> <chr>
 1 1     A     1
 2 2     C     3
 3 3     E     5
 4 4     G     7
 5 5     I     9
 6 1     B     2
 7 2     D     4
 8 3     F     6
 9 4     H     8
10 5     J     10
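The result above has character columns and is ordered by original row, whereas the desired example2 has numeric RTD and OFF and is grouped by RTD. One possible follow-up, sketched here under the assumption that integer columns are acceptable for RTD and OFF, converts the types with mutate and sorts with arrange. The name `result` is illustrative only.

```r
library(dplyr)
library(tidyr)

# Same input as in the question
example <- data.frame(RTD_1_LOC = c('A', 'B'), RTD_2_LOC = c('C', 'D'),
                      RTD_3_LOC = c('E', 'F'), RTD_4_LOC = c('G', 'H'),
                      RTD_5_LOC = c('I', 'J'), RTD_1_OFF = c('1', '2'),
                      RTD_2_OFF = c('3', '4'), RTD_3_OFF = c('5', '6'),
                      RTD_4_OFF = c('7', '8'), RTD_5_OFF = c('9', '10'))

result <- example %>%
  pivot_longer(cols = everything(),
               names_to = c("RTD", ".value"),
               names_pattern = "\\w+_(\\d+)_(\\w+)") %>%
  # pivot_longer returns character columns here, so convert explicitly
  mutate(RTD = as.integer(RTD), OFF = as.integer(OFF)) %>%
  # sort by RTD, then LOC, to match the row order of example2
  arrange(RTD, LOC)

print(result)
```

If double rather than integer columns are needed, as.numeric can be used in place of as.integer.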