Read CSV from std::vector<unsigned char> using Apache Arrow
I am trying to use Apache Arrow to read CSV input. The example here mentions that the input should be an InputStream; however, in my case I just have a std::vector of unsigned chars. Is it possible to parse this using Apache Arrow? I checked the I/O interface to see whether there is an "in-memory" data structure, with no luck.
For convenience, I have copied the example code here, along with my input data:
#include "arrow/csv/api.h"

{
    // ...
    std::vector<unsigned char> data;
    arrow::io::IOContext io_context = arrow::io::default_io_context();

    // how can I fit the std::vector to the input stream?
    std::shared_ptr<arrow::io::InputStream> input = ...;

    auto read_options = arrow::csv::ReadOptions::Defaults();
    auto parse_options = arrow::csv::ParseOptions::Defaults();
    auto convert_options = arrow::csv::ConvertOptions::Defaults();

    // Instantiate TableReader from input stream and options
    auto maybe_reader =
        arrow::csv::TableReader::Make(io_context,
                                      input,
                                      read_options,
                                      parse_options,
                                      convert_options);
    if (!maybe_reader.ok()) {
        // Handle TableReader instantiation error...
    }
    std::shared_ptr<arrow::csv::TableReader> reader = *maybe_reader;

    // Read table from CSV file
    auto maybe_table = reader->Read();
    if (!maybe_table.ok()) {
        // Handle CSV read error
        // (for example a CSV syntax error or failed type conversion)
    }
    std::shared_ptr<arrow::Table> table = *maybe_table;
}
Any help would be appreciated!
The I/O interface documentation lists BufferReader, which works as an in-memory input stream. While not listed in the docs, it can be constructed from a pointer and a size, which should let you use your std::vector<unsigned char>.