How do I get a specific byte or bytes from a streambuf?

Question

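To receive a raw protocol that uses Ethernet frames with custom headers, I'm reading bytes from Ethernet into a streambuf buffer. The payload is, for the most part, copied successfully, but I need to check specific bytes of the frame header in the buffer so I can handle certain corner cases. I can't figure out how to get a specific byte, or how to turn it into an integer. Here's the code: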

boost::asio::streambuf read_buffer;

boost::asio::streambuf::mutable_buffers_type buf = read_buffer.prepare(bytesToGet);
bytesRead = d_socket10->receive(boost::asio::buffer(buf, bytesToGet));
read_buffer.commit(bytesRead);

const char *readData = boost::asio::buffer_cast<const char*>( read_buffer.data() + 32 );

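I need to get the length, which is a byte at address 20. I have tried things with stringstream, memcpy and casting, but I don't have a good grasp of this: either I get compile errors, or the code doesn't do what I thought it should do.

How can I get the byte at the offset I need, and how can I convert it to a byte or a short? The size is actually 2 bytes, but in this specific case one of those bytes should be zero, so getting either 1 byte or 2 bytes would be fine.

Thanks!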

Answer 1

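Welcome to parsing.

Welcome to binary data.

Welcome to portable network protocols.

Each of these three topics is its own thing to deal with.

The simplest thing would be to read into a buffer and use that. Use Boost Endian to remove portability concerns.

Here's the simplest thing I can think of using only standard library facilities (ignoring endianness):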

Live On Coliru

#include <boost/asio.hpp>
#include <istream>
#include <iostream>

namespace ba = boost::asio;

void fill_testdata(ba::streambuf&);

int main() {
    ba::streambuf sb;
    fill_testdata(sb);

    // parsing starts here
    char buf[1024];
    std::istream is(&sb);
    // read first including bytes 20..21:
    is.read(buf, 22);
    size_t actual = is.gcount();

    std::cout << "stream ok? " << std::boolalpha << is.good() << "\n";
    std::cout << "actual: " << actual << "\n";
    if (is && actual >= 22) { // stream ok, and not a short read
        uint16_t length = *reinterpret_cast<uint16_t const*>(buf + 20);
        std::cout << "length: " << length << "\n";

        std::string payload(length, '\0');
        is.read(&payload[0], length);
        actual = is.gcount();

        std::cout << "actual payload bytes: " << actual << "\n";
        std::cout << "stream ok? " << std::boolalpha << is.good() << "\n";
        payload.resize(actual);

        std::cout << "payload: '" << payload << "'\n";
    }
}

// some testdata
void fill_testdata(ba::streambuf& sb) 
{
    char data[] = { 
        '\x00', '\x00', '\x00', '\x00', '\x00', // 0..4
        '\x00', '\x00', '\x00', '\x00', '\x00', // 5..9
        '\x00', '\x00', '\x00', '\x00', '\x00', // 10..14
        '\x00', '\x00', '\x00', '\x00', '\x00', // 15..19
        '\x0b', '\x00', 'H'   , 'e'   , 'l'   , // 20..24
        'l'   , 'o'   , ' '   , 'w'   , 'o'   , // 25..29
        'r'   , 'l'   , 'd'   , '!'   ,         // 30..33
    };
    std::ostream(&sb).write(data, sizeof(data));
}

Prints

stream ok? true
actual: 22
length: 11
actual payload bytes: 11
stream ok? true
payload: 'Hello world'

Increasing '\x0b' to '\x0c' gives:

stream ok? true
actual: 22
length: 12
actual payload bytes: 12
stream ok? true
payload: 'Hello world!'

Increasing it beyond what is available in the buffer, e.g. to '\x0d', leads to a (partial) read failure:

stream ok? true
actual: 22
length: 13
actual payload bytes: 12
stream ok? false
payload: 'Hello world!'
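
The reinterpret_cast in the listing above decodes the length in whatever byte order the host uses. As a minimal sketch of a portable alternative that needs only the standard library, the two bytes can be combined explicitly as little-endian, which matches the test data (byte 20 is '\x0b', byte 21 is '\x00'). The name read_le16 and its signature are hypothetical; this is not a Boost or standard library function.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of Boost or the standard library:
// decode a little-endian 16-bit value stored at `offset` in a byte buffer.
// Returns false (leaving `out` untouched) if the two bytes do not fit
// inside the first `size` bytes of `data`.
bool read_le16(const char* data, std::size_t size, std::size_t offset,
               std::uint16_t& out) {
    if (size < 2 || offset > size - 2)
        return false;

    // Convert through unsigned char so bytes >= 0x80 are not sign-extended.
    const unsigned lo = static_cast<unsigned char>(data[offset]);
    const unsigned hi = static_cast<unsigned char>(data[offset + 1]);
    out = static_cast<std::uint16_t>(lo | (hi << 8));
    return true;
}
```

Applied to the 22 bytes read by is.read(buf, 22) in the listing, a call such as read_le16(buf, actual, 20, length) would take the place of the cast. If only the low byte matters, static_cast<unsigned char>(buf[20]) is enough.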

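Let's Go Pro

To go pro, I'd use a library such as Boost Spirit. It understands endianness, does validation, and really shines when you get branching in your parser, e.g.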

 record = compressed_record | uncompressed_record;

or

 exif_tags = .... >> custom_attrs;

 custom_attr  = attr_key >> attr_value;
 custom_attrs = repeat(_ca_count) [ custom_attr ];

 attr_key = bson_string(64);     // max 64, for security
 attr_value = bson_string(1024); // max 1024, for security

 bson_string %= omit[little_dword[_a=_1]] 
             >> eps(_a<=_r) // not exceeding maximum
             >> repeat(_a) [byte_];

But that's getting way ahead of ourselves. Let's do a simpler demo:

Live On Coliru ¹

#include <boost/asio.hpp>

#include <istream>
#include <iostream>

namespace ba = boost::asio;

void fill_testdata(ba::streambuf&);

struct FormatData {
    std::string signature, header; // e.g. 4 + 16 = 20 bytes - could be different, of course
    std::string payload;           // 16bit length prefixed
};

FormatData parse(std::istream& is);

int main() {
    ba::streambuf sb;
    fill_testdata(sb);

    try {
        std::istream is(&sb);
        FormatData data = parse(is);

        std::cout << "actual payload bytes: " << data.payload.length() << "\n";
        std::cout << "payload: '" << data.payload << "'\n";
    } catch(std::runtime_error const& e) {
        std::cout << "Error: " << e.what() << "\n";
    }
}

// some testdata
void fill_testdata(ba::streambuf& sb) 
{
    char data[] = { 
        'S'   , 'I'   , 'G'   , 'N'   , '\x00'   , // 0..4
        '\x00', '\x00', '\x00', '\x00', '\x00'   , // 5..9
        '\x00', '\x00', '\x00', '\x00', '\x00'   , // 10..14
        '\x00', '\x00', '\x00', '\x00', '\x00'   , // 15..19
        '\x0b', '\x00', 'H'   , 'e'   , 'l'      , // 20..24
        'l'   , 'o'   , ' '   , 'w'   , 'o'      , // 25..29
        'r'   , 'l'   , 'd'   , '!'   , // 30..33
    };
    std::ostream(&sb).write(data, sizeof(data));
}

//#define BOOST_SPIRIT_DEBUG
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
namespace qi = boost::spirit::qi;

BOOST_FUSION_ADAPT_STRUCT(FormatData, signature, header, payload)

template <typename It>
struct FileFormat : qi::grammar<It, FormatData()> {
    FileFormat() : FileFormat::base_type(start) {
        using namespace qi;

        signature  = string("SIGN");     // 4 byte signature, just for example
        header     = repeat(16) [byte_]; // 16 byte header, same

        payload   %= omit[little_word[_len=_1]] >> repeat(_len) [byte_];
        start      = signature >> header >> payload;

        //BOOST_SPIRIT_DEBUG_NODES((start)(signature)(header)(payload))
    }
  private:
    qi::rule<It, FormatData()> start;
    qi::rule<It, std::string()> signature, header;

    qi::_a_type _len;
    qi::rule<It, std::string(), qi::locals<uint16_t> > payload;
};

FormatData parse(std::istream& is) {
    using it = boost::spirit::istream_iterator;

    FormatData data;
    it f(is >> std::noskipws), l;
    bool ok = parse(f, l, FileFormat<it>{}, data);

    if (!ok)
        throw std::runtime_error("parse failure"); // main() already appends the newline

    return data;
}

Prints:

actual payload bytes: 11
payload: 'Hello world'

¹ What a time! Coliru is swamped and Wandbox is down at the same time! I had to remove Boost Asio from the online demo because IdeOne doesn't link Boost System.
