Is there a way to use the erase-remove idiom in concert with other looping conditions?

Question

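I have a function that, given a piece of text, should remove all punctuation, make all letters lowercase, and finally transform them according to a monoalphabetic cipher. The code below works: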

class Cipher { 
  public:

  string keyword; 

  string decipheredText;

  deque<string> encipheredAlphabet;

  static bool is_punctuation (char c) {
    return c == '.' || c == ',' || c == '!' || c == '\'' || c == '?'
        || c == ' ';
  }

  string encipher(string text) { 
    Alphabet a;
    encipheredAlphabet = a.cipherLetters(keyword);


    text.erase( remove_if(text.begin(), text.end(), is_punctuation), 
    text.end() );

    string::iterator it;
    for (it = text.begin(); it != text.end(); it++) { 
      *it = tolower(*it); 
      // encipher text according to shift
    }

    return text;
  }

};

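The problem is that it currently makes two passes over the string: one to remove the punctuation and one to do everything else. This seems inefficient, since it looks like all the transformations could somehow be done in a single pass. Is there a clean way to combine the erase-remove idiom with other loop conditions?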

Answer 1

You can do this with std::accumulate, using an iterator that inserts into the output std::string as the initial value:

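auto filter = [](auto pred) {
    return [=](auto map) {
        auto accumulator = [=](auto it, auto c) {
            if (pred(c)) {
                *it = map(c);
                // Advance only after a write, so the accumulator also
                // behaves with output iterators other than back_inserter.
                ++it;
            }
            return it;
        };
        return accumulator;
    };
};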

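// std::not_fn requires C++17 (<functional>).
auto accumulator = filter(std::not_fn(is_punctuation))
([](unsigned char c) {
    // std::tolower is only defined for values representable as
    // unsigned char (or EOF), hence the unsigned char parameter.
    return static_cast<char>(std::tolower(c));
});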

std::string in = "insIsjs.|s!js";
std::string out;
std::accumulate(std::begin(in), std::end(in), std::back_inserter(out), accumulator);

See the demo.
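
For reference, a self-contained variant of the same idea that avoids generic lambdas and std::not_fn might look like the sketch below. It assumes is_punctuation is available as a free function (in the question it is a static member of Cipher), and strip_and_lower is a hypothetical name chosen only for illustration.

```cpp
#include <cctype>
#include <iterator>
#include <numeric>
#include <string>

// Assumption: free-function copy of the predicate from the question.
bool is_punctuation(char c) {
    return c == '.' || c == ',' || c == '!' || c == '\'' || c == '?'
        || c == ' ';
}

// Hypothetical helper: filters and lowercases in one pass by folding the
// input into a back_insert_iterator with std::accumulate.
std::string strip_and_lower(const std::string& in) {
    std::string out;
    out.reserve(in.size());  // the result is never longer than the input
    std::accumulate(
        in.begin(), in.end(), std::back_inserter(out),
        [](std::back_insert_iterator<std::string> it, char c) {
            if (!is_punctuation(c)) {
                // Cast through unsigned char before calling std::tolower.
                *it = static_cast<char>(
                    std::tolower(static_cast<unsigned char>(c)));
                ++it;
            }
            return it;
        });
    return out;
}
```

With the input from the answer, strip_and_lower("insIsjs.|s!js") yields "insisjs|sjs": the '.' and '!' are dropped, while '|' stays because the predicate does not list it.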

Answer 2

If you don't want to loop twice because you have measured it and found it to be slower, write a custom algorithm:

template <typename Iter, typename OutIter>
OutIter lowercased_without_punctuation(Iter begin, Iter end, OutIter out) {
    while (begin != end) {
        // Ignoring things like std::move_iterator for brevity.
        if (!is_punctuation(*begin)) {
            *out = tolower(*begin);
            ++out;
        }

        // Use `++iter` rather than `iter++` when possible
        ++begin;
    }

    return out;
}

// ...

string encipher(string text) {
    Alphabet a;
    encipheredAlphabet = a.cipherLetters(keyword);

    text.erase(
        lowercased_without_punctuation(text.begin(), text.end(), text.begin()),
        text.end());

    return text;
}

If you think about it some more, lowercased_without_punctuation is actually a special case of a more general algorithm, which could be called transform_if (relevant Q&A):

template <typename Iter, typename OutIter, typename Pred, typename Transf>
OutIter transform_if(Iter begin, Iter end, OutIter out, Pred p, Transf t) {
    while (begin != end) {
        if (p(*begin)) {
            *out = t(*begin);
            ++out;
        }

        ++begin;
    }

    return out;
}

// ...

string encipher(string text) {
    Alphabet a;
    encipheredAlphabet = a.cipherLetters(keyword);

    text.erase(
        transform_if(text.begin(), text.end(), text.begin(),
            [](char c) { return !is_punctuation(c); },
            [](char c) { return tolower(c); }),
        text.end());

    return text;
}
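
As a rough, self-contained sketch of how transform_if could be exercised outside the Cipher class, the block below uses it both in place (followed by erase) and with std::back_inserter into a fresh string. The free-function is_punctuation and the names to_lower_char, normalized_in_place and normalized_copy are assumptions made for the example, not part of the original answer.

```cpp
#include <cctype>
#include <iterator>
#include <string>

// Assumption: free-function copy of the predicate from the question.
bool is_punctuation(char c) {
    return c == '.' || c == ',' || c == '!' || c == '\'' || c == '?'
        || c == ' ';
}

template <typename Iter, typename OutIter, typename Pred, typename Transf>
OutIter transform_if(Iter begin, Iter end, OutIter out, Pred p, Transf t) {
    for (; begin != end; ++begin) {
        if (p(*begin)) {
            *out = t(*begin);
            ++out;
        }
    }
    return out;
}

// Hypothetical helper: lowercases one char without passing a negative
// value to std::tolower.
char to_lower_char(char c) {
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// In place: the write position never overtakes the read position, so
// writing back into the same string is safe; erase trims the leftover tail.
std::string normalized_in_place(std::string text) {
    text.erase(
        transform_if(text.begin(), text.end(), text.begin(),
                     [](char c) { return !is_punctuation(c); },
                     to_lower_char),
        text.end());
    return text;
}

// Copying: the same algorithm, appending to a separate output string.
std::string normalized_copy(const std::string& text) {
    std::string out;
    transform_if(text.begin(), text.end(), std::back_inserter(out),
                 [](char c) { return !is_punctuation(c); },
                 to_lower_char);
    return out;
}
```

Both helpers give the same result for the same input; for example, "Hello, World!" becomes "helloworld".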

Answer 3

Copy and/or modify the characters, then truncate the string:

string encipher(string text)
{
    auto it = text.begin(),
         jt = it;
    for (; it != text.end(); it++)
    {
        if (!is_punctuation(*it))
        {
            *jt = tolower(*it);
            ++jt;
        }
    }
    text.erase(jt, it);
    return text;
}
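
The same read/write iterator pattern could also absorb the substitution step that the question leaves as a comment. The sketch below is only an illustration under stated assumptions: the cipher alphabet is passed as a hypothetical 26-character string whose i-th character replaces the i-th lowercase letter (the question's encipheredAlphabet is a deque<string> whose exact layout is not shown), and the character set is assumed to place 'a' to 'z' contiguously, as ASCII does.

```cpp
#include <cctype>
#include <string>

// Assumption: free-function copy of the predicate from the question.
bool is_punctuation(char c) {
    return c == '.' || c == ',' || c == '!' || c == '\'' || c == '?'
        || c == ' ';
}

// Hypothetical single-pass version: filter, lowercase and substitute.
// cipher_alphabet[i] is assumed to be the replacement for letter 'a' + i.
std::string encipher_single_pass(std::string text,
                                 const std::string& cipher_alphabet) {
    auto out = text.begin();  // write position, never ahead of `in`
    for (auto in = text.begin(); in != text.end(); ++in) {
        if (is_punctuation(*in)) {
            continue;  // dropped characters are simply not copied
        }
        char c = static_cast<char>(
            std::tolower(static_cast<unsigned char>(*in)));
        // Substitute only 'a'..'z'; anything else is copied unchanged.
        if (cipher_alphabet.size() == 26 && c >= 'a' && c <= 'z') {
            c = cipher_alphabet[c - 'a'];
        }
        *out = c;
        ++out;
    }
    text.erase(out, text.end());  // cut off the unused tail
    return text;
}
```

For instance, with the shift-by-three alphabet "defghijklmnopqrstuvwxyzabc", the input "Hello, World!" comes out as "khoorzruog".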

Answer 4

With range-v3, you can create a (lazy) view:

return text | ranges::view::filter([](char c){ return !is_punctuation(c); })
            | ranges::view::transform([](char c) -> char { return tolower(c); });

c++

erase-remove-idiom