What is the logic of approximate string matching?

Does anyone know why the following example behaves this way?

agrepl("cold", "cool")
#> [1] FALSE
agrepl("cool", "cold")
#> [1] TRUE

This follows from the default for max.distance, which the documentation describes as:

If cost is not given, all defaults to 10%, and the other transformation number bounds default to all. The component names can be abbreviated.

and:

Expressed either as integer, or as a fraction of the pattern length times the maximal transformation cost (will be replaced by the smallest integer not less than the corresponding fraction)

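For a pattern of length 4, the default maximum number of transformations is therefore 1 (10% of 4 is 0.4, rounded up to the next integer). The pattern "cool" matches the "col" at the start of "cold" using only 1 deletion. Changing "cold" to match "cool" takes at least two transformations (two substitutions, or one deletion and one insertion).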
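The reasoning above can be checked directly with a minimal sketch. It assumes that adist() with partial = TRUE measures the same substring-based edit distance that agrepl() uses by default, which is how the adist help page describes it. The helper name default_bound is made up for illustration and is not part of R.

```r
# Hypothetical helper (not part of base R): the default bound described above,
# i.e. 10% of the pattern length, rounded up to the next integer.
default_bound <- function(pattern) ceiling(0.1 * nchar(pattern))

default_bound("cool")  # 1
default_bound("cold")  # 1

# Edit distance from the pattern to its best-matching substring of the target.
# partial = TRUE asks for substring matching, as agrep/agrepl do by default.
d_cool_in_cold <- drop(adist("cool", "cold", partial = TRUE))
d_cold_in_cool <- drop(adist("cold", "cool", partial = TRUE))

d_cool_in_cold  # 1: "cool" -> "col" needs one deletion
d_cold_in_cool  # 2: no substring of "cool" is within one edit of "cold"

# A match is reported when the distance does not exceed the bound.
d_cool_in_cold <= default_bound("cool")  # TRUE,  same as agrepl("cool", "cold")
d_cold_in_cool <= default_bound("cold")  # FALSE, same as agrepl("cold", "cool")
```

The asymmetry comes from the fact that only the pattern is edited, and it only has to match some substring of the target, so swapping the two arguments changes the question being asked.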

These examples may explain it further:

agrepl("cold", "cool", max.distance = 1) # two changes necessary
#> [1] FALSE
agrepl("cold", "cool", max.distance = 2)
#> [1] TRUE
agrepl("cold", "coold") # just one addition necessary
#> [1] TRUE