Using substr to find nearby characters

Question

I'm trying to find the characters within a distance X of each character I iterate over. For example...

nearby("abcdefg", 2)

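should return a dictionary keyed by each character, whose value is the set of characters within a distance of 2 of it. It should look something like this...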

dictionary('a' -> set(a, b, c), 'b' -> set(a, b, c, d), 'c' -> set(a,b,c,d,e))

My code currently looks like this...

dictionary<char, set<char>> near(const std::string word, int dist) {
    dictionary<char, set<char>> map;
    for (int x = 0; x < word.size(); x++) {
        for (char letter : word.substr(std::max(0, x - dist), std::min(dist + 1, int(word.size()))))
            map[word[x]].insert(letter);
    }
    return map;
}

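Problem summary: it works for the most part, but because of how C++'s substr works, I can't say that I want all the characters from index 0 to index 4. Instead, it starts at index 0 and then includes whatever falls within a length of 4. That is a problem when I want to reach backwards to include the characters 4 letters before the current one as well as those after it.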

So far my code is correct, except that it comes up one character short at the end. So it looks like this...

nearby(abcdefg, 2)
dictionary('c' -> set(a,b,c))

It leaves out d.

Answer 1

All you need is:

        const auto start = std::max(0, x-dist);
        const auto end = std::min(x+dist+1, int(word.size()));
        const auto len = end - start;
        const auto substring = word.substr(start,len);
        auto &the_set = map[word[x]];
        for (const auto letter : substring)
            the_set.insert(letter);

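As noted in the comments, this will break if word.size() > INT_MAX. The solution is to do everything in size_t (you could do everything in std::string::size_type, but that is very verbose and doesn't really buy you anything).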

dictionary<char, set<char>> near(const std::string word, size_t dist) {
    dictionary<char, set<char>> map;
    for (size_t x = 0; x < word.size(); x++) {
        const auto start = (x > dist) ? x-dist : 0;  // Beware underflow
        const auto end = std::min(x+dist+1, word.size());
        const auto len = end - start;
        const auto substring = word.substr(start,len);
        auto &the_set = map[word[x]];
        for (const auto letter : substring)
            the_set.insert(letter);
    }
    return map;
}

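The advantage of this version is that gcc will compile it with -Werror -Wall (the previous version would complain about a signed/unsigned comparison), and there are no casts (always a good sign).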
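The "Beware underflow" comment in the size_t version is worth spelling out: with unsigned operands, x - dist does not go negative when dist > x; it wraps around to a very large value. A minimal sketch of the guard on its own, where clamped_start is a hypothetical helper name used only for illustration (it is not part of the code above):

```cpp
#include <cstddef>

// Hypothetical helper, named here only for illustration.
// Returns x - dist, clamped at 0, without ever computing a
// wrapped-around unsigned difference.
std::size_t clamped_start(std::size_t x, std::size_t dist) {
    // Compare first: if dist >= x the true difference would be <= 0,
    // so the window simply starts at the beginning of the string.
    return (x > dist) ? x - dist : 0;
}
```

For x = 1 and dist = 2 this returns 0, whereas the bare expression x - dist would wrap to a huge index and make substr throw std::out_of_range.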

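An even better version would have start and end be iterators into word, at which point you don't need to create a substring at all (just look at the characters in the original word).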
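A rough sketch of that iterator idea, under some assumptions: the question never shows how dictionary and set are defined, so std::map and std::set are used here as stand-ins, and near_iter is a made-up name to keep it distinct from the versions above.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Assumption: std::map / std::set stand in for the question's
// `dictionary` / `set`, whose definitions are not shown.
std::map<char, std::set<char>> near_iter(const std::string& word, std::size_t dist) {
    std::map<char, std::set<char>> result;
    for (std::size_t x = 0; x < word.size(); ++x) {
        // Clamp the window start at 0 without unsigned wraparound.
        const std::size_t start = (x > dist) ? x - dist : 0;
        // Compare the remaining length against dist so that
        // x + dist + 1 is only computed when it fits inside the word.
        const std::size_t end = (word.size() - x > dist) ? x + dist + 1 : word.size();
        // Insert the half-open range [start, end) straight from the
        // original string; no temporary substring is created.
        result[word[x]].insert(word.begin() + start, word.begin() + end);
    }
    return result;
}
```

With near_iter("abcdefg", 2), the entry for 'c' comes out as the set a, b, c, d, e, which includes the d that the original code dropped. Taking word by const reference also avoids copying the string on every call.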
