Aggregating using function diff with non-sequential rows
I am new to R and am teaching myself how to use it; I hope I can explain my problem well.
I have 4 columns in my data:
1. Code=Location of a plot
2. Event= Pre or Post. Refers to whether the year of sampling was before or after a disturbance
3. Season= The season the sampling was done in
4. Total= Number of individuals found in plot
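I would like to aggregate the data so that there is one row for each location and season, containing the change in Total between pre and post fire.
I want the change to always be calculated as Pre - Post, and the rows in my data are not always in that order.
I have: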
Code Event Season Total
A Post AUTUMN 2
A Pre AUTUMN 5
A Pre SUMMER 15
A Post SUMMER 40
B Pre AUTUMN 5
B Post AUTUMN 8
What I want:
Code Season Change
A AUTUMN 3
A SUMMER -25
B AUTUMN -3
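We can use diff on 'Total' after grouping by 'Code' and 'Season'. Because diff subtracts in row order, sort by 'Event' first so that "Post" comes before "Pre" in every group and the result is always Pre - Post:
aggregate(cbind(Change = Total) ~ Code + Season, df1[order(df1$Event), ], diff)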
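diff() subtracts each element from the one that follows it, so the sign of the result depends on the row order within each group. A minimal sketch of that effect, using a hypothetical copy of the data named `demo` (the names `unsorted` and `sorted` are also just for illustration):

```r
# Hypothetical copy of the example data from the question
demo <- data.frame(
  Code   = c("A", "A", "A", "A", "B", "B"),
  Event  = c("Post", "Pre", "Pre", "Post", "Pre", "Post"),
  Season = c("AUTUMN", "AUTUMN", "SUMMER", "SUMMER", "AUTUMN", "AUTUMN"),
  Total  = c(2L, 5L, 15L, 40L, 5L, 8L)
)

# Rows as given: some groups have Pre before Post, so diff() returns
# Post - Pre for those groups and Pre - Post for the others
unsorted <- aggregate(cbind(Change = Total) ~ Code + Season, demo, diff)

# Rows ordered by Event ("Post" sorts before "Pre"), so diff() returns
# Pre - Post in every group
sorted <- aggregate(cbind(Change = Total) ~ Code + Season,
                    demo[order(demo$Event), ], diff)
```

On this data `unsorted$Change` is 3, 3, 25 while `sorted$Change` is 3, -3, -25, which matches the requested output. This assumes exactly one Pre and one Post row per group.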
Or with dplyr:
library(dplyr)
df1 %>%
group_by(Code, Season) %>%
summarise(Change = Total[Event == "Pre"] - Total[Event == "Post"])
# A tibble: 3 x 3
# Groups: Code [2]
# Code Season Change
# <chr> <chr> <int>
#1 A AUTUMN 3
#2 A SUMMER -25
#3 B AUTUMN -3
Or using data.table:
library(data.table)
setDT(df1)[, .(Change = Total[Event == 'Pre'] - Total[Event == 'Post']), .(Code, Season)]
Data
df1 <- structure(list(Code = c("A", "A", "A", "A", "B", "B"), Event = c("Post",
"Pre", "Pre", "Post", "Pre", "Post"), Season = c("AUTUMN", "AUTUMN",
"SUMMER", "SUMMER", "AUTUMN", "AUTUMN"), Total = c(2L, 5L, 15L,
40L, 5L, 8L)), class = "data.frame", row.names = c(NA, -6L))
Here is a base R option:
dfout <- aggregate(Change~Code + Season,
transform(df,Change = Total*ifelse(Event=="Post",-1,1)),
sum)
This gives
> dfout
Code Season Change
1 A AUTUMN 3
2 B AUTUMN -3
3 A SUMMER -25
Data
df <- structure(list(Code = c("A", "A", "A", "A", "B", "B"), Event = c("Post",
"Pre", "Pre", "Post", "Pre", "Post"), Season = c("AUTUMN", "AUTUMN",
"SUMMER", "SUMMER", "AUTUMN", "AUTUMN"), Total = c(2L, 5L, 15L,
40L, 5L, 8L)), class = "data.frame", row.names = c(NA, -6L))
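To see why summing works regardless of row order, it may help to look at the intermediate signed column. A minimal sketch, rebuilding the same data under the hypothetical name `demo` (the names `signed`, `res` and `res_rev` are likewise only for illustration):

```r
# Hypothetical copy of the example data
demo <- data.frame(
  Code   = c("A", "A", "A", "A", "B", "B"),
  Event  = c("Post", "Pre", "Pre", "Post", "Pre", "Post"),
  Season = c("AUTUMN", "AUTUMN", "SUMMER", "SUMMER", "AUTUMN", "AUTUMN"),
  Total  = c(2L, 5L, 15L, 40L, 5L, 8L)
)

# Post counts become negative, Pre counts stay positive
signed <- transform(demo, Change = Total * ifelse(Event == "Post", -1, 1))

# Summing within each Code/Season group therefore yields Pre - Post
res <- aggregate(Change ~ Code + Season, signed, sum)

# Reversing the row order does not change the sums
res_rev <- aggregate(Change ~ Code + Season, signed[nrow(signed):1, ], sum)
```

Because addition is commutative, `res` and `res_rev` hold the same Change values. Unlike the diff approach, this also tolerates groups with several Pre or Post rows, in which case it returns the total of Pre minus the total of Post.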