How to make subgroups by prefixes from ICD data?

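I have a large amount of ICD-10 data, and I want to create subgroups and compute a sum for each of them.

For example, I have 'JAL01', 'JAL20' and 'JAL21', and I need the sum over all codes that start with 'JAL'. How can I do that?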

Take the first 3 characters of each code with substr, then group by that prefix and sum:

# example data
df1 <- data.frame(icd = c("JAL01", "JAL20", "JAL21", "foo11", "foo22"),
                  x = 1:5)

# get 1st 3 letters
df1$grp <- substr(df1$icd, 1, 3)

# get sum per group
aggregate(x ~ grp, df1, sum)
#   grp x
# 1 foo 9
# 2 JAL 6
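
If only one prefix is needed, or the prefixes are not all 3 characters long, something like the following may work. This is only a sketch in base R: it reuses the example data from above, the names jal_total and prefix_totals are made up for illustration, and it assumes the prefix is everything before the first digit in the code, which may not hold for your real data.

```r
# same example data as above
df1 <- data.frame(icd = c("JAL01", "JAL20", "JAL21", "foo11", "foo22"),
                  x = 1:5)

# sum for a single known prefix
# (as.character guards against icd being stored as a factor)
jal_total <- sum(df1$x[startsWith(as.character(df1$icd), "JAL")])
jal_total
# [1] 6

# assumption: the prefix is everything before the first digit,
# so prefixes of different lengths are handled as well
df1$grp <- sub("[0-9].*$", "", df1$icd)

# named vector of sums, one entry per prefix
prefix_totals <- tapply(df1$x, df1$grp, sum)
```

startsWith needs R 3.3.0 or later. tapply returns a named vector rather than a data frame, so a single group can be looked up with prefix_totals[["JAL"]].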