In R, calculate a new column based on difference between rows
Consider the following subset of my data frame:
  Pack  side row col     v1      v2
1   P1  Left   1   1 0.4094 -3.8700
2   P1 Right   1   1 0.4110 -3.5245
3   P1  Left   1   2 0.4118 -3.4876
4   P1 Right   1   2 0.4108 -3.7268
5   P1  Left   1   3 0.4119 -3.5322
6   P1 Right   1   3 0.4110 -3.6101
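I am interested in the difference between Left and Right for v1 and v2: specifically, the percentage difference for v1 and the plain difference for v2.
My desired output is a new data frame like this: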
  Pack row col          dv1     dv2
1   P1   1   1  0.389294404  0.3455
2   P1   1   2 -0.243427459 -0.2392
3   P1   1   3 -0.218978102 -0.0779
where dv1 is calculated as (Right-Left)/Left*100 on v1, and dv2 as Right-Left on v2.
Here is the df data:
df <- structure(list(Pack = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("P1",
"P2", "P3", "P4"), class = "factor"), side = structure(c(1L,
2L, 1L, 2L, 1L, 2L), .Label = c("Left", "Right"), class = "factor"),
row = c(1L, 1L, 1L, 1L, 1L, 1L), col = c(1L, 1L, 2L, 2L,
3L, 3L), v1 = c(0.4094, 0.411, 0.4118, 0.4108, 0.4119, 0.411
), v2 = c(-3.87, -3.5245, -3.4876, -3.7268, -3.5322, -3.6101
)), .Names = c("Pack", "side", "row", "col", "v1", "v2"), row.names = c(NA,
6L), class = "data.frame")
Thanks!
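We can first sort the rows by side, so that within each group Left comes first and Right second. This gives
library(tidyverse)
# After sorting, element 1 of each group is Left and element 2 is Right
df %>% arrange(side) %>% group_by(Pack, row, col) %>%
  summarise(dv1 = (v1[2] - v1[1]) / v1[1] * 100, dv2 = v2[2] - v2[1])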
# A tibble: 3 x 5
# Groups: Pack, row [?]
# Pack row col dv1 dv2
# <fct> <int> <int> <dbl> <dbl>
# 1 P1 1 1 0.391 0.345
# 2 P1 1 2 -0.243 -0.239
# 3 P1 1 3 -0.218 -0.0779
Or simply
df %>% arrange(side) %>% group_by(Pack, row, col) %>%
  summarise(dv1 = diff(v1) / v1[1] * 100, dv2 = diff(v2))
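Both versions above pair the values by position, so they depend on every (Pack, row, col) group holding exactly one Left and one Right row. A minimal sketch that pairs them by name instead, assuming tidyr 1.0.0 or later for pivot_wider (the small df below just re-creates the question's six rows so the block runs on its own, and res is an illustrative name):

```r
library(dplyr)
library(tidyr)

# Re-creation of the question's six rows (assumption: same values as its df)
df <- data.frame(
  Pack = "P1",
  side = rep(c("Left", "Right"), times = 3),
  row  = 1L,
  col  = rep(1:3, each = 2),
  v1   = c(0.4094, 0.4110, 0.4118, 0.4108, 0.4119, 0.4110),
  v2   = c(-3.8700, -3.5245, -3.4876, -3.7268, -3.5322, -3.6101)
)

res <- df %>%
  # One row per (Pack, row, col); columns v1_Left, v1_Right, v2_Left, v2_Right
  pivot_wider(names_from = side, values_from = c(v1, v2)) %>%
  # Keep the keys and compute the two differences from the named columns
  transmute(Pack, row, col,
            dv1 = (v1_Right - v1_Left) / v1_Left * 100,
            dv2 = v2_Right - v2_Left)

print(res)
```

If a group is missing one side, the corresponding wide column is NA and so are dv1 and dv2, rather than a difference taken between the wrong rows.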
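Another dplyr approach, using lead and mutate:
library(tidyverse)
# Assumes the rows strictly alternate Left, Right for each pair
df2 <- df %>%
  mutate(lead_v1 = lead(v1), lead_v2 = lead(v2),
         dv1 = (lead_v1 - v1) / v1 * 100, dv2 = lead_v2 - v2) %>%
  select(Pack, row, col, dv1, dv2) %>%   # same columns as positions 1, 3, 4, 9, 10
  filter(row_number() %% 2 != 0)         # keep the odd (Left) rows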
> df2
  Pack row col        dv1     dv2
1   P1   1   1  0.3908158  0.3455
2   P1   1   2 -0.2428363 -0.2392
3   P1   1   3 -0.2184996 -0.0779
Edit: changed the filter to drop the even rows.
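For completeness, the same pairing can be sketched in base R without any packages, by splitting on side and merging the two halves back on the key columns. The names left, right, m and res, and the .L/.R suffixes, are only illustrative, and the df below re-creates the question's six rows so the block is self-contained:

```r
# Re-creation of the question's six rows (assumption: same values as its df)
df <- data.frame(
  Pack = "P1",
  side = rep(c("Left", "Right"), times = 3),
  row  = 1L,
  col  = rep(1:3, each = 2),
  v1   = c(0.4094, 0.4110, 0.4118, 0.4108, 0.4119, 0.4110),
  v2   = c(-3.8700, -3.5245, -3.4876, -3.7268, -3.5322, -3.6101)
)

# Split into the two sides
left  <- df[df$side == "Left", ]
right <- df[df$side == "Right", ]

# Join each Left row to its Right partner; non-key columns get .L / .R suffixes
m <- merge(left, right, by = c("Pack", "row", "col"), suffixes = c(".L", ".R"))

res <- data.frame(
  m[c("Pack", "row", "col")],
  dv1 = (m$v1.R - m$v1.L) / m$v1.L * 100,  # percentage difference relative to Left
  dv2 = m$v2.R - m$v2.L                    # plain difference
)

print(res)
```

With the formula as stated in the question, the first pair gives (0.4110 - 0.4094) / 0.4094 * 100 ≈ 0.3908, which is what all the approaches here return. The 0.3893 in the question's desired table is what you get when dividing by Right instead of Left.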