Trouble mutating new column using case_when in dplyr
I want to mutate a new column based on the differing lengths of the values in the Código column of this data frame.
ods <- readODS::read_ods('http://www.arcotel.gob.ec/wp-content/uploads/2016/09/proyeccion_cantonal_total_2010-2020_seg%C3%BAn_INEC1.ods', skip = 2)
I tried using case_when inside mutate like this:
mutate(ods, Provincia = case_when(
length(ods$Código) == 3 ~ str_extract(ods$Código, '[[:digit:]]{1}'),
length(ods$Código) == 4 ~ str_extract(ods$Código, '[[:digit:]]{2}')
))
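When a value of Código is 3 characters long, this should create a new Provincia column from its first digit; otherwise it should extract the first two digits. When I run the code above, I only get NAs.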
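Use nchar, which counts the number of characters in each observation. length(ods$Código) returns the number of elements in the whole column, which is neither 3 nor 4 here, so no condition matches and case_when returns NA for every row: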
library(dplyr)
library(stringr)

# nchar() is vectorized: it returns one character count per row
ods <- mutate(ods, Provincia = case_when(
  nchar(Código) == 3 ~ str_extract(Código, '[[:digit:]]{1}'),
  nchar(Código) == 4 ~ str_extract(Código, '[[:digit:]]{2}')
))
Result:
> ods %>% pull(Provincia)
[1] "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "2" "2" "2" "2" "2" "2"
[22] "2" "3" "3" "3" "3" "3" "3" "3" "4" "4" "4" "4" "4" "4" "5" "5" "5" "5" "5" "5" "5"
[43] "6" "6" "6" "6" "6" "6" "6" "6" "6" "6" "7" "7" "7" "7" "7" "7" "7" "7" "7" "7" "7"
[64] "7" "7" "7" "8" "8" "8" "8" "8" "8" "8" "8" "9" "9" "9" "9" "9" "9" "9" "9" "9" "9"
[85] "9" "9" "9" "9" "9" "9" "9" "9" "9" "9" "9" "9" "9" "9" "9" "10" "10" "10" "10" "10" "10"
[106] "11" "11" "11" "11" "11" "11" "11" "11" "11" "11" "11" "11" "11" "11" "11" "11" "12" "12" "12" "12" "12"
[127] "12" "12" "12" "12" "12" "12" "12" "12" "13" "13" "13" "13" "13" "13" "13" "13" "13" "13" "13" "13" "13"
[148] "13" "13" "13" "13" "13" "13" "13" "13" "13" "14" "14" "14" "14" "14" "14" "14" "14" "14" "14" "14" "14"
[169] "15" "15" "15" "15" "15" "16" "16" "16" "16" "17" "17" "17" "17" "17" "17" "17" "17" "18" "18" "18" "18"
[190] "18" "18" "18" "18" "18" "19" "19" "19" "19" "19" "19" "19" "19" "19" "20" "20" "20" "21" "21" "21" "21"
[211] "21" "21" "21" "22" "22" "22" "22" "23" "24" "24" "24" "90" "90" "90"
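To see the difference between length and nchar without downloading the file, here is a minimal sketch on made-up data. The toy tibble and its ASCII column name Codigo are assumptions for illustration only, not part of the original data set.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for the real Código column (illustration only)
toy <- tibble(Codigo = c("101", "950", "1001", "2403", "9001"))

# length() describes the whole vector: a single number, the element count
n_elements <- length(toy$Codigo)

# nchar() is vectorized: one character count per element
n_chars <- nchar(toy$Codigo)

# Same approach as the answer, applied to the toy data
toy <- toy %>%
  mutate(Provincia = case_when(
    nchar(Codigo) == 3 ~ str_extract(Codigo, "[[:digit:]]{1}"),
    nchar(Codigo) == 4 ~ str_extract(Codigo, "[[:digit:]]{2}")
  ))

print(n_elements)
print(n_chars)
print(toy$Provincia)
```

Here n_elements is a single value while n_chars has one entry per row, which is why only the nchar version can select a different branch of case_when for each observation.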