std::codecvt do_out skipping characters after 1:N conversions
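I tried to write an automatic indenter, but it skips characters whenever it adds new characters to the stream. I stepped through it in a debugger and verified that from_next and to_next, as well as from and to, hold the expected values.
I am surely missing something in the specification, but here is my code; maybe you can help me: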
virtual result_t do_out(state_type& state, const intern_type* from, const intern_type* from_end, const intern_type*& from_next,
                        extern_type* to, extern_type* to_end, extern_type*& to_next) const override
{
    auto result = std::codecvt_base::noconv;
    while (from < from_end && to < to_end)
    {
        // Indentation is still owed and the next character is not a newline:
        // emit the pending spaces before copying anything else.
        if (getState(state).missingWhitespaces > 0u && *from != '\n')
        {
            while (getState(state).missingWhitespaces > 0u && to < to_end)
            {
                *to = ' ';
                to++;
                getState(state).missingWhitespaces--;
            }
            if (to < to_end)
            {
                result = std::codecvt_base::partial;
            }
            else
            {
                // Output buffer is full; stop and report a partial conversion.
                result = std::codecvt_base::partial;
                break;
            }
        }
        else
        {
            // Copy the character through; a newline schedules the next indentation.
            *to = *from;
            if (*from == '\n')
            {
                getState(state).missingWhitespaces = tabSize * indentLevel;
            }
            to++;
            from++;
        }
    }
    from_next = from;
    to_next = to;
    return result;
};
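The state object works correctly too. The problem only occurs between function calls.
Edit: changing the result after if (to < to_end) to std::codecvt_base::ok did not solve the problem either.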
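One way to check behaviour at call boundaries without involving a stream is to call the facet's public out() by hand with a deliberately tiny output buffer. The following is only a minimal sketch under stated assumptions: IndentFacet and convert_in_chunks are hypothetical names, and the pending-space counter is kept in a mutable member to keep the example short, whereas the code above keeps it in the state object.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <string>

// Hypothetical demo facet: inserts `indent` spaces after every '\n',
// but only once a non-newline character follows (blank lines stay empty).
class IndentFacet : public std::codecvt<char, char, std::mbstate_t> {
public:
    explicit IndentFacet(unsigned indent, std::size_t refs = 0)
        : std::codecvt<char, char, std::mbstate_t>(refs), indent_(indent) {}

protected:
    result do_out(std::mbstate_t&, const char* from, const char* from_end,
                  const char*& from_next, char* to, char* to_end,
                  char*& to_next) const override {
        while (from < from_end) {
            if (pending_ > 0 && *from != '\n') {
                // Emit as many owed spaces as the output buffer can take.
                while (pending_ > 0 && to < to_end) { *to++ = ' '; --pending_; }
                if (pending_ > 0) break;  // buffer full, spaces still owed
            }
            if (to == to_end) break;      // no room for the character itself
            *to++ = *from;
            if (*from == '\n') pending_ = indent_;
            ++from;
        }
        from_next = from;
        to_next = to;
        // ok only when every input character was consumed.
        return from == from_end ? ok : partial;
    }

    bool do_always_noconv() const noexcept override { return false; }

private:
    unsigned indent_;
    mutable unsigned pending_ = 0;  // assumption: demo only, not thread-safe
};

// Calls out() repeatedly with an output buffer of `chunk` characters and
// resumes from from_next each time, mimicking a caller with a nearly full buffer.
std::string convert_in_chunks(const IndentFacet& facet, const std::string& input,
                              std::size_t chunk) {
    assert(chunk > 0);
    std::mbstate_t state{};
    std::string output;
    std::string buffer(chunk, '\0');
    const char* from = input.data();
    const char* const from_end = from + input.size();
    while (from < from_end) {
        const char* from_next = from;
        char* to_next = &buffer[0];
        auto r = facet.out(state, from, from_end, from_next,
                           &buffer[0], &buffer[0] + chunk, to_next);
        assert(r != std::codecvt_base::error);
        output.append(&buffer[0], static_cast<std::size_t>(to_next - &buffer[0]));
        from = from_next;
    }
    return output;
}
```

If the output for a one-character buffer matches the output for a large buffer, the do_out logic itself is probably resumable, and the skipped characters are more likely to come from how the caller sizes its buffers.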
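After more digging, I found the solution to my problem. I got a detailed explanation of std::codecvt from this site: http://stdcxx.apache.org/doc/stdlibref/codecvt.html
It turns out I had forgotten to override these two methods: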
virtual int do_length(state_type& state, const extern_type *from, const extern_type *end, size_t max) const;
Determines and returns n, where n is the number of elements of extern_type in the source range [from,end) that can be converted to max or fewer characters of intern_type, as if by a call to in(state, from, from_end, from_next, to, to_end, to_next) where to_end == to + max.
Sets the value of state to correspond to the shift state of the sequence starting at from + n.
Function do_length must be called under the following preconditions:
state is either initialized to the beginning of a sequence or equal to the result of the previous conversion on the sequence.
from <= end is well-defined and true.
Note that this function does not behave similarly to the C Standard Library function mbsrtowcs(). See the mbsrtowcs.cpp example program for an implementation of this function using the codecvt facet.
virtual int do_max_length() const throw();
Returns the maximum value that do_length() can return for any valid combination of its first three arguments, with the fourth argument max set to 1.
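For comparison, it may help to see what the standard identity facet std::codecvt<char, char, std::mbstate_t> reports for these members, since a 1:N facet has to depart from that baseline. This is a small sketch; the helper names default_length, default_max_length and default_always_noconv are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>

using DefaultFacet = std::codecvt<char, char, std::mbstate_t>;

// length() for the identity facet: each external char maps to exactly one
// internal char, so the result is the smaller of max and the range size.
int default_length(const char* from, const char* end, std::size_t max) {
    const DefaultFacet& facet = std::use_facet<DefaultFacet>(std::locale::classic());
    std::mbstate_t state{};
    return facet.length(state, from, end, max);
}

// max_length() for the identity facet.
int default_max_length() {
    return std::use_facet<DefaultFacet>(std::locale::classic()).max_length();
}

// always_noconv() tells callers whether they may skip conversion entirely.
bool default_always_noconv() {
    return std::use_facet<DefaultFacet>(std::locale::classic()).always_noconv();
}
```

Because the identity facet reports always_noconv() as true and max_length() as 1, a derived facet that expands its output presumably cannot leave these members at their defaults, which fits the finding above.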
I implemented them like this, and it works:
virtual int do_length(state_type& state, const extern_type* from, const extern_type* end, size_t max) const override
{
    // Start from the full budget of max characters.
    auto numberOfCharsAbleToCopy = max;
    // Deduct the spaces still owed from the previous call.
    numberOfCharsAbleToCopy -= std::min(static_cast<unsigned int>(numberOfCharsAbleToCopy), getState(state).missingWhitespaces);
    bool newLineToAppend = false;
    for (auto c = from + getState(state).missingWhitespaces; c < end && numberOfCharsAbleToCopy > 0u; c++)
    {
        if (*c == '\n' && !newLineToAppend)
        {
            newLineToAppend = true;
        }
        else if (*c != '\n' && newLineToAppend)
        {
            // First non-newline character after a newline: deduct one indentation.
            numberOfCharsAbleToCopy -= std::min(tabSize * indentLevel, numberOfCharsAbleToCopy);
            if (numberOfCharsAbleToCopy == 0u)
            {
                break;
            }
            newLineToAppend = false;
        }
    }
    return numberOfCharsAbleToCopy;
}
virtual int do_max_length() const throw() override
{
    return tabSize * indentLevel;
}