Why is std::codecvt only used by file I/O streams?

I have been implementing a codecvt to handle indentation of output streams. It can be used like this and works fine:

std::cout << indenter::push << "I'm indented" << indenter::pop << "\n I'm not...";

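However, while I can imbue a std::codecvt into any std::ostream, I was very confused to find that my code works with std::cout and std::ofstream but not with, for example, std::ostringstream, even though all of these inherit from the base class std::ostream.

The facet is constructed normally, the code compiles, and no exception is thrown... it is just that none of the member functions of the std::codecvt are ever called.

This was very confusing to me, and I had to spend a lot of time to figure out that std::codecvt does nothing for non-file I/O streams.

Is there any reason why std::codecvt is not used by all classes that inherit from std::ostream?

Also, does anyone know which structures I could rely on to implement the indenter?

Edit: This is the passage I am referring to: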

All file I/O operations performed through std::basic_fstream use the std::codecvt<CharT, char, std::mbstate_t> facet of the locale imbued in the stream.

Source: https://en.cppreference.com/w/cpp/locale/codecvt


Update 1:

I have made a small example to illustrate my problem:

#include <iostream>
#include <locale>
#include <fstream>
#include <sstream>

static auto invocation_counter = 0u;

struct custom_facet : std::codecvt<char, char, std::mbstate_t>
{
  using parent_t = std::codecvt<char, char, std::mbstate_t>;

  custom_facet() : parent_t(std::size_t { 0u }) {}

  using parent_t::intern_type;
  using parent_t::extern_type;
  using parent_t::state_type;

  virtual std::codecvt_base::result do_out (state_type& state, const intern_type* from, const intern_type* from_end, const intern_type*& from_next,
                                                               extern_type* to, extern_type* to_end, extern_type*& to_next) const override
  {
    while (from < from_end && to < to_end)
    {
      *to = *from;

      to++;
      from++;
    }

    invocation_counter++;

    from_next = from;
    to_next = to;

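    // The characters were copied into the output buffer, so report ok
    // (or partial if the output buffer filled up first) rather than noconv.
    return from == from_end ? std::codecvt_base::ok : std::codecvt_base::partial;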
  }

  bool do_always_noconv() const noexcept override
  {
    return false;
  }
};

std::ostream& imbueFacet (std::ostream& ostream)
{
  ostream.imbue(std::locale { ostream.getloc(), new custom_facet{} });

  return ostream;
}

int main()
{
  std::ios::sync_with_stdio(false);

  std::cout << "invocation_counter = " << invocation_counter << "\n";

  {
    auto ofstream = std::ofstream { "testFile.txt" };

    ofstream << imbueFacet << "test\n";
  }

  std::cout << "invocation_counter = " << invocation_counter << "\n";

  {
     auto osstream = std::ostringstream {};

     osstream << imbueFacet << "test\n";
  }

  std::cout << "invocation_counter = " << invocation_counter << "\n";
}

I would expect invocation_counter to be incremented after streaming into the std::ostringstream, but it is not.


Update 2:

After more research, I found that I can use std::wbuffer_convert. Quoting https://en.cppreference.com/w/cpp/locale/wbuffer_convert:

std::wbuffer_convert is a wrapper over stream buffer of type std::basic_streambuf<char> which gives it the appearance of std::basic_streambuf<Elem>. All I/O performed through std::wbuffer_convert undergoes character conversion as defined by the facet Codecvt. [...]

This class template makes the implicit character conversion functionality of std::basic_filebuf available for any std::basic_streambuf.

This way I can apply the facet to a std::ostringstream:

auto osstream = std::ostringstream {};

osstream << "test\n";
  
auto facet = custom_facet{};
  
std::wstring_convert<custom_facet, char> conv;
  
auto str = conv.to_bytes(osstream.str());

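However, I cannot hook the facet into a chain of stream operators (<<).

This confuses me even more as to why std::codecvt is not used implicitly by all output streams. Every output stream writes through a std::basic_streambuf, whose interface is well suited to std::codecvt: the facet just consumes an input character sequence and produces an output one, and both are fully implemented in std::basic_streambuf.

So why is the handling of std::codecvt implemented in std::basic_filebuf rather than in std::basic_streambuf? std::basic_filebuf inherits from std::basic_streambuf after all...

Either I have some fundamental misunderstanding of how streams work in C++, or std::codecvt is not well integrated into the standard. Maybe that is why it is marked as deprecated?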

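The std::codecvt facet was originally intended to handle I/O conversion between the disk and memory representations of characters. Quoting paragraph 39.4.6 of Bjarne Stroustrup's The C++ Programming Language, 4th edition: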

Sometimes, the representation of characters stored in a file differs from the desired representation of those same characters in main memory. ... the codecvt facet provides a mechanism for converting characters from one representation to another as they are read or written.

The intended purpose is that std::codecvt is only used to adapt characters between file (disk) and memory, which partly answers your question:

Why is std::codecvt only used by file I/O streams?

From the docs we see:

All file I/O operations performed through std::basic_fstream<CharT> use the std::codecvt<CharT, char, std::mbstate_t> facet of the locale imbued in the stream.

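This then answers the question of why std::ofstream (which uses a file-based stream buffer) and std::cout (linked to the standard output FILE stream) invoke std::codecvt.

Now, to use the high-level std::ostream interface, you need to provide an underlying streambuf. std::ofstream provides a filebuf, and std::ostringstream provides a stringbuf (which makes no use of std::codecvt). See this post on streams, which also highlights the following: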

...in the case of ofstream, there are also a few extra functions which forward to additional functions in the filebuf interface

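However, to invoke the character conversion functionality of std::codecvt when you have a std::ostringstream, you can use a std::ostream whose underlying std::basic_streambuf is the wrapper you mentioned in your post, std::wbuffer_convert.

In your second update you only used std::wstring_convert, not std::wbuffer_convert.

When using std::wbuffer_convert, you can wrap the original std::ostringstream in a std::ostream as follows: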

// Create a std::ostringstream
auto osstream = std::ostringstream{};

// Create the wrapper for the ostringstream
std::wbuffer_convert<custom_facet, char> wrapper(osstream.rdbuf());

// Now create a std::ostream which uses the wrapper to send data to
// the original std::ostringstream
std::ostream normal_ostream(&wrapper);
normal_ostream << "test\n";

// Flush the stream to invoke the conversion
normal_ostream << std::flush;

// Check the invocation_counter
std::cout << "invocation_counter after wrapping std::ostringstream with "
                "std::wbuffer_convert = "
            << invocation_counter << "\n";

Together with the full example here, the output will be:

invocation_counter start of test1 = 0
invocation_counter after std::ofstream = 1
> test printed to std::cout
invocation_counter after std::cout = 2
invocation_counter after std::ostringstream (should not have changed)= 2
ic after test1 = 2
invocation_counter after std::ostringstream with std::wstring_convert = 3
ic after test2 = 3
invocation_counter after wrapping std::ostringstream with std::wbuffer_convert = 4
ic after test3 = 4
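For readers who cannot open the linked example, the wrapping step can be reproduced on its own. This is only a minimal sketch: counting_facet, do_out_calls and write_through_facet are hypothetical names made up for this illustration, and the facet is a plain pass-through that reports ok or partial from do_out. Note that std::wbuffer_convert is deprecated since C++17, so a compiler may emit a deprecation warning.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical counter for this sketch: how often do_out has been invoked.
static std::size_t do_out_calls = 0;

// Hypothetical pass-through facet: copies characters unchanged and counts calls.
struct counting_facet : std::codecvt<char, char, std::mbstate_t>
{
  result do_out(state_type&,
                const intern_type* from, const intern_type* from_end,
                const intern_type*& from_next,
                extern_type* to, extern_type* to_end,
                extern_type*& to_next) const override
  {
    ++do_out_calls;
    while (from < from_end && to < to_end)
      *to++ = *from++;

    from_next = from;
    to_next = to;
    // ok: all input consumed; partial: the output buffer filled up first.
    return from == from_end ? ok : partial;
  }

  // Must return false, otherwise the wrapper may skip do_out entirely.
  bool do_always_noconv() const noexcept override { return false; }
};

// Writes `text` into a std::ostringstream through the facet and returns
// what ended up in the string stream.
std::string write_through_facet(const std::string& text)
{
  std::ostringstream sink;
  {
    // The wrapper takes ownership of a default-constructed counting_facet.
    std::wbuffer_convert<counting_facet, char> wrapper(sink.rdbuf());
    std::ostream out(&wrapper);

    // The conversion only happens when the wrapper's buffer is flushed.
    out << text << std::flush;
    assert(out.good());
  }
  return sink.str();
}
```

Calling write_through_facet("test\n") should leave the text unchanged in the string stream while do_out_calls goes up, which is the same effect the counter in the output above shows for the std::wbuffer_convert case.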

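Conclusion

std::codecvt is intended for conversion between disk and memory representations. That is why the std::codecvt implementation is only invoked by streams that use an underlying filebuf, such as std::ofstream and std::cout. However, a stream that uses an underlying stringbuf can be wrapped into a std::ostream instance using std::wbuffer_convert, which then invokes the underlying std::codecvt.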
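As for which structure to rely on for the indenter itself: since every std::ostream writes through whatever streambuf it is given, one option that does not depend on std::codecvt at all is a small filtering stream buffer. The following is a minimal sketch under that assumption, not a drop-in replacement for your facet. indent_buf and the indenter::push / indenter::pop manipulators are hypothetical names modeled on the usage in the question; none of them exist in the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical filtering stream buffer: forwards every character to `dest_`
// and inserts the current indentation at the start of each non-empty line.
class indent_buf : public std::streambuf
{
public:
  explicit indent_buf(std::streambuf* dest, std::size_t width = 2)
    : dest_(dest), width_(width)
  {
    assert(dest_ != nullptr);
  }

  void push() { ++level_; }
  void pop()  { if (level_ > 0) --level_; }

protected:
  // No put area is set up, so every character written arrives here.
  int_type overflow(int_type ch) override
  {
    if (traits_type::eq_int_type(ch, traits_type::eof()))
      return traits_type::not_eof(ch);

    const char c = traits_type::to_char_type(ch);
    if (at_line_start_ && c != '\n')
    {
      const std::string pad(level_ * width_, ' ');
      const auto n = static_cast<std::streamsize>(pad.size());
      if (dest_->sputn(pad.data(), n) != n)
        return traits_type::eof();
    }
    at_line_start_ = (c == '\n');
    return dest_->sputc(c);
  }

  int sync() override { return dest_->pubsync(); }

private:
  std::streambuf* dest_;
  std::size_t width_;
  std::size_t level_ = 0;
  bool at_line_start_ = true;
};

namespace indenter
{
  // Manipulators: change the indentation level if the stream uses an indent_buf,
  // otherwise do nothing.
  inline std::ostream& push(std::ostream& os)
  {
    if (auto* buf = dynamic_cast<indent_buf*>(os.rdbuf())) buf->push();
    return os;
  }

  inline std::ostream& pop(std::ostream& os)
  {
    if (auto* buf = dynamic_cast<indent_buf*>(os.rdbuf())) buf->pop();
    return os;
  }
}

// Small demonstration: indents the middle line written to a std::ostringstream.
std::string indent_demo()
{
  std::ostringstream sink;
  indent_buf buf(sink.rdbuf());
  std::ostream out(&buf);
  out << "a\n" << indenter::push << "b\n" << indenter::pop << "c\n";
  return sink.str();
}
```

The same indent_buf can be constructed over std::cout.rdbuf() or a file stream's rdbuf(), because it only relies on the std::streambuf interface. Overriding overflow without a put area keeps the sketch short at the cost of one virtual call per character.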