How to count transitions of values in a long-format data frame?

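For my master's thesis I am analysing a food security model, and the next element I need is the number of crisis transitions that occurred during the study period. A crisis transition is when, over a standard forecast period (initially 3 months, later 4 months, but that is beside the point), the food security IPC value goes from 1 or 2 to 3, 4 or 5. So I want to count the number of times an area goes from 1 or 2 to 3, 4 or 5. I have a long data frame with columns for the period, the area (livelihood zone) and the IPC value. I have linked two CSV files for you to download and inspect.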

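What do you think is the best way to get the count for each type of area? If you need any other information, please let me know. I hope you can help; it would mean a lot!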

The dput output of the first 48 rows, which is two periods times all of the areas:

structure(list(`Livelihood zone` = c("Central Highlands, High Potential Zone", 
"Marsabit Marginal Mixed Farming Zone", "Northwestern Agropastoral Zone", 
"Southeastern Marginal Mixed Farming Zone", "Turkwell Riverine Zone", 
"Western High Potential Zone", "Tana Riverine Zone", "Southeastern Medium Potential, Mixed Farming Zone", 
"Northern Pastoral Zone", "Western Medium Potential Zone", "Western Lakeshore Marginal Mixed Farming Zone", 
"Southern Pastoral Zone", "Northeastern Pastoral Zone", "Mandera Riverine Zone", 
"Eastern Pastoral Zone", "Northeastern Agropastoral Zone", "Lake Turkana Fishing", 
"Lake Victoria Fishing Zone", "Western Agropastoral Zone", "Coastal Medium Potential Farming Zone", 
"Coastal Marginal Agricultural Mixed Farming Zone", "Southeastern Pastoral  Zone", 
"Northwestern Pastoral Zone", "Southern Agropastoral Zone", "Central Highlands, High Potential Zone", 
"Marsabit Marginal Mixed Farming Zone", "Northwestern Agropastoral Zone", 
"Southeastern Marginal Mixed Farming Zone", "Turkwell Riverine Zone", 
"Western High Potential Zone", "Tana Riverine Zone", "Southeastern Medium Potential, Mixed Farming Zone", 
"Northern Pastoral Zone", "Western Medium Potential Zone", "Western Lakeshore Marginal Mixed Farming Zone", 
"Southern Pastoral Zone", "Northeastern Pastoral Zone", "Mandera Riverine Zone", 
"Eastern Pastoral Zone", "Northeastern Agropastoral Zone", "Lake Turkana Fishing", 
"Lake Victoria Fishing Zone", "Western Agropastoral Zone", "Coastal Medium Potential Farming Zone", 
"Coastal Marginal Agricultural Mixed Farming Zone", "Southeastern Pastoral  Zone", 
"Northwestern Pastoral Zone", "Southern Agropastoral Zone"), 
    `Period of measurement Kenya` = c("2011-01", "2011-01", "2011-01", 
    "2011-01", "2011-01", "2011-01", "2011-01", "2011-01", "2011-01", 
    "2011-01", "2011-01", "2011-01", "2011-01", "2011-01", "2011-01", 
    "2011-01", "2011-01", "2011-01", "2011-01", "2011-01", "2011-01", 
    "2011-01", "2011-01", "2011-01", "2011-04", "2011-04", "2011-04", 
    "2011-04", "2011-04", "2011-04", "2011-04", "2011-04", "2011-04", 
    "2011-04", "2011-04", "2011-04", "2011-04", "2011-04", "2011-04", 
    "2011-04", "2011-04", "2011-04", "2011-04", "2011-04", "2011-04", 
    "2011-04", "2011-04", "2011-04"), `IPC class` = c(1, 3, 2, 
    2, 2, 1, 2, 2, 3, 1, 1, 2, 3, 3, 2, 3, 2, 1, 2, 2, 2, 2, 
    2, 2, 1, 3, 2, 2, 2, 1, 2, 2, 3, 1, 1, 2, 3, 3, 2, 3, 2, 
    1, 2, 2, 2, 2, 2, 2)), row.names = c(NA, 48L), class = "data.frame")

As a result, I would like a data frame containing the count of crisis transitions for each livelihood zone. Thanks in advance!

I think this should work. If it does not, please share an example of a crisis transition that is counted incorrectly so that I can debug it.

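library(dplyr)
# df is assumed to be the long-format data frame from the question
df %>%
  # flag crisis periods: 1 if the IPC class is 3, 4 or 5, otherwise 0
  mutate(crisis = ifelse(`IPC class` %in% 3:5, 1, 0)) %>%
  arrange(`Livelihood zone`, `Period of measurement Kenya`) %>%
  group_by(`Livelihood zone`) %>%
  # a positive difference marks a step from non-crisis (0) to crisis (1)
  summarize(crisis_trans_count = sum(diff(crisis) > 0))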
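
In the 48 sample rows from the question, the IPC classes for 2011-04 are identical to those for 2011-01, so every zone would get a count of 0 there. To check the logic on something with actual transitions, here is a minimal sketch on made-up data. The data frame `toy` and the names "Zone A", "Zone B" and "Zone C" are hypothetical and not taken from the CSV files; the column names match the question. It also assumes the period strings are in "YYYY-MM" form, so that sorting them as text puts them in chronological order.

```r
library(dplyr)

# Hypothetical toy data (NOT from the question's CSV files):
# three zones observed over four periods each.
toy <- data.frame(
  `Livelihood zone` = rep(c("Zone A", "Zone B", "Zone C"), each = 4),
  `Period of measurement Kenya` = rep(c("2011-01", "2011-04", "2011-07", "2011-10"),
                                      times = 3),
  `IPC class` = c(1, 3, 2, 4,   # Zone A: enters crisis twice
                  3, 3, 2, 2,   # Zone B: starts in crisis, only leaves it
                  2, 2, 3, 5),  # Zone C: enters crisis once, then stays
  check.names = FALSE           # keep the column names with spaces as they are
)

result <- toy %>%
  mutate(crisis = ifelse(`IPC class` %in% 3:5, 1, 0)) %>%
  arrange(`Livelihood zone`, `Period of measurement Kenya`) %>%
  group_by(`Livelihood zone`) %>%
  summarize(crisis_trans_count = sum(diff(crisis) > 0))

print(result)
# Zone A: 2, Zone B: 0, Zone C: 1
```

A move from 3 to 5 (Zone C) is not counted, because the zone was already in crisis, and leaving a crisis (Zone B) is not counted either. A zone that is already in crisis in its first period does not contribute a transition for that period, since there is no earlier value to compare against.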