Keep messing up percentage change depending on multiple columns
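Great Stack Overflow community, I'm turning to you for help once again. I've posted this question more than once, but somehow I can't solve this simple problem... Honestly, I'm a bit frustrated here, so I'm eternally grateful to everyone who helps me out.
Several similar answers on Stack Overflow work on my small reproducible data frame, but when I use the same strategy on my original data frame, it doesn't work. So first, some translation of the variables (they're in Dutch):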
Gemeente == municipality
jaar == year
Beleidscode ~ crime category
aantal_misdrijven == number of offences
For this question we don't need Kennisnamedatum (== date) and weekdag (== weekday).
My question:
I want to calculate the change in 2017 relative to 2015, grouped by Gemeente and Beleidscode.
library(tidyverse)
# This will download my original data frame with ease:
df <- read_csv("https://github.com/thomasdebeus/colourful-facts/raw/master/projects/crime_dataset.csv")
# The following tries to first add a column with totals per
# year, municipality and crime category. Then calculate percentage change.
df %>%
  group_by(Gemeente, jaar, Beleidscode) %>%
  arrange(Gemeente, jaar, Beleidscode) %>%
  summarise(per_jaar_Gem_misdrijf = sum(aantal_misdrijven)) %>%
  mutate(perct_change = (per_jaar_Gem_misdrijf - lag(per_jaar_Gem_misdrijf, order_by = jaar)) / lag(per_jaar_Gem_misdrijf, order_by = jaar)) %>%
  ungroup()
So yeah, as you might expect, this didn't produce the right numbers...
Hope someone can help.
It looks like you are trying to calculate the percentage change compared with the previous year. One way to do that is:
library(tidyverse)
# This will download my original data frame with ease:
df <- read_csv("https://github.com/thomasdebeus/colourful-facts/raw/master/projects/crime_dataset.csv")
# Create a data frame with the summary count by year
dfSumByYear <-
  df %>%
  group_by(Gemeente, jaar, Beleidscode) %>%
  summarise(per_jaar_Gem_misdrijf = sum(aantal_misdrijven)) %>%
  ungroup()
# Add the previous year's counts as an additional column
dfSumByYearWithPrev <-
  dfSumByYear %>%
  left_join(dfSumByYear %>%
              mutate(JaarJoin = jaar + 1) %>%
              rename(per_PrevJaar_Gem_misdrijf = per_jaar_Gem_misdrijf) %>%
              select(-jaar),
            by = c("Gemeente", "jaar" = "JaarJoin", "Beleidscode")) %>%
  # Calculate the percentage change
  mutate(perct_change = (coalesce(per_jaar_Gem_misdrijf, 0L) - coalesce(per_PrevJaar_Gem_misdrijf, 0L)) / coalesce(per_PrevJaar_Gem_misdrijf, 0L))
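A possible reason the attempt in the question gives odd numbers: by default, summarise() drops the last grouping variable, so after grouping by Gemeente, jaar, Beleidscode the result is still grouped by Gemeente and jaar, and lag() then compares crime categories within one year rather than years within one category. The sketch below shows a lag()-based alternative to the self-join. The toy data and the names toy, yearly and total are made up for illustration, and the `.groups` argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Toy data (hypothetical values) with the same column names as the question
toy <- tibble::tribble(
  ~Gemeente, ~jaar, ~Beleidscode, ~aantal_misdrijven,
  "A",       2015,  "theft",      10,
  "A",       2015,  "theft",      10,
  "A",       2016,  "theft",      30,
  "A",       2017,  "theft",      15,
  "B",       2015,  "theft",       4,
  "B",       2016,  "theft",       5
)

yearly <- toy %>%
  # Totals per municipality, crime category and year
  group_by(Gemeente, Beleidscode, jaar) %>%
  summarise(total = sum(aantal_misdrijven), .groups = "drop") %>%
  # Regroup WITHOUT jaar so lag() steps through the years of one series
  group_by(Gemeente, Beleidscode) %>%
  arrange(jaar, .by_group = TRUE) %>%
  # Fractional change versus the previous row; the first year of each series is NA
  mutate(perct_change = (total - lag(total)) / lag(total)) %>%
  ungroup()

print(yearly)
```

Note that lag() compares against the previous row, not the previous calendar year, so if a series skips a year the self-join above is the safer choice. Multiply perct_change by 100 if you want an actual percentage.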
If you specifically want to calculate the change from 2015 to 2017, one way to do that is:
library(tidyverse)
# This will download my original data frame with ease:
df <- read_csv("https://github.com/thomasdebeus/colourful-facts/raw/master/projects/crime_dataset.csv")
# Calculate the percentage change from 2015 to 2017
df %>%
  group_by(Gemeente, jaar, Beleidscode) %>%
  summarise(per_jaar_Gem_misdrijf = sum(aantal_misdrijven)) %>%
  ungroup() %>%
  spread(jaar, per_jaar_Gem_misdrijf, fill = 0L) %>%
  mutate(perct_change = (`2017` - `2015`) / `2015`)
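The same 2015-to-2017 comparison can also be sketched with pivot_wider(), which tidyr now recommends over spread(). This is only an illustration on made-up toy data: the names toy, change_15_17 and total are hypothetical, and returning NA when the 2015 count is zero is my own assumption about how to handle division by zero, not something the question specifies.

```r
library(dplyr)
library(tidyr)

# Toy data (hypothetical values) with the same column names as the question
toy <- tibble::tribble(
  ~Gemeente, ~jaar, ~Beleidscode, ~aantal_misdrijven,
  "A",       2015,  "theft",      20,
  "A",       2017,  "theft",      15,
  "B",       2015,  "theft",       4,
  "B",       2017,  "theft",       5,
  "C",       2017,  "theft",       7   # no 2015 records for this combination
)

change_15_17 <- toy %>%
  # Keep only the two years being compared
  filter(jaar %in% c(2015, 2017)) %>%
  group_by(Gemeente, Beleidscode, jaar) %>%
  summarise(total = sum(aantal_misdrijven), .groups = "drop") %>%
  # One column per year; combinations missing in a year get a count of 0
  pivot_wider(names_from = jaar, values_from = total,
              values_fill = list(total = 0)) %>%
  # Assumption: report NA instead of Inf/NaN when the 2015 count is 0
  mutate(perct_change = if_else(`2015` == 0, NA_real_,
                                (`2017` - `2015`) / `2015`))

print(change_15_17)
```

Without the if_else() guard, a combination with no offences in 2015 would yield Inf (or NaN when both years are zero), which is what the spread() version above produces for such rows.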