Seek in libarchive: how to reset the header?
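Is it possible to read a decompressed file again?
Suppose I have called `archive_read_next_header(a, &entry)` and then read an unknown number of bytes with `archive_read_data(a, ptr_to_buffer, buffer_size)`. Now I want to reset it and start reading again from the beginning. I am trying to override `seekoff(std::streamoff off, std::ios_base::seekdir way, std::ios_base::openmode which)`. I know that, because of how compression algorithms work internally, it may be impossible to seek directly within the decompressed data, and that the data is not stored anywhere apart from a limited number of bytes in libarchive's internal buffers.
My idea is to reset everything and then read `off` bytes (a `std::streamoff`), which gives me a backward seek. Seeking forward is easy: just read `off` bytes. This is really inefficient, but hopefully seek will be used rarely.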
The whole `archive` struct is initialized like this:
archive_read_set_read_callback(a, read_callback);
archive_read_set_callback_data(a, container);
archive_read_set_seek_callback(a, seek_callback);
archive_read_set_skip_callback(a, skip_callback);
int r = archive_read_open1(a);
where `container` mostly wraps a `std::istream`, and the callbacks are functions that operate on that stream.
Template of what I want to implement:
```
std::streampos seek_beg(std::streamoff off) {
    if (off >= 0) {
        // read/skip 'off' bytes
    } else {
        // reset (a)
        // read/skip 'off' bytes
    }
    // return position
}
```
My `underflow()` method is implemented like this:
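```
int underflow() {
    // archive_read_data() returns the number of bytes read (la_ssize_t in
    // recent libarchive), 0 at the end of the entry, or a negative value on error.
    la_ssize_t r = archive_read_data(ar, ptr, BUFFER_SIZE);
    if (r < 0) {
        throw std::runtime_error("ERROR");
    } else if (r == 0) {
        return std::streambuf::traits_type::eof();
    } else {
        setg(ptr, ptr, ptr + r);
    }
    return std::streambuf::traits_type::to_int_type(*ptr);
}
```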
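To make the reset-and-skip idea concrete, here is a minimal sketch of a `std::streambuf` that implements a backward seek by starting over and discarding bytes up to the target offset. It does not call libarchive at all: `ForwardSource`, `RereadBuf`, `reopen()` and `reopen_count` are hypothetical names invented for illustration, and `ForwardSource` only stands in for a forward-only reader such as an open archive entry. With real libarchive, `reopen()` would presumably mean freeing the reader, rewinding the underlying `std::istream`, opening a new reader and calling `archive_read_next_header` until the wanted entry comes up again; that part is an assumption and is not shown.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <istream>
#include <stdexcept>
#include <streambuf>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an open archive entry: it can only be read
// forward or reopened from the start.
struct ForwardSource {
    std::string data;       // pretend decompressed contents of the entry
    std::size_t pos = 0;
    int reopen_count = 0;   // how many times we had to start over

    explicit ForwardSource(std::string d) : data(std::move(d)) {}

    // Plays the role of archive_read_data(): bytes read, or 0 at the end.
    long read(char* dst, std::size_t n) {
        std::size_t k = std::min(n, data.size() - pos);
        std::memcpy(dst, data.data() + pos, k);
        pos += k;
        return static_cast<long>(k);
    }

    // Plays the role of tearing the reader down and reopening the entry.
    void reopen() { pos = 0; ++reopen_count; }
};

class RereadBuf : public std::streambuf {
public:
    explicit RereadBuf(ForwardSource& src, std::size_t buffer_size = 8)
        : src_(src), buf_(buffer_size) {
        setg(buf_.data(), buf_.data(), buf_.data());  // empty get area
    }

protected:
    int_type underflow() override {
        if (gptr() < egptr()) return traits_type::to_int_type(*gptr());
        long r = src_.read(buf_.data(), buf_.size());
        if (r < 0) throw std::runtime_error("read failed");
        if (r == 0) return traits_type::eof();
        end_offset_ += r;  // stream offset that egptr() now corresponds to
        setg(buf_.data(), buf_.data(), buf_.data() + r);
        return traits_type::to_int_type(*gptr());
    }

    pos_type seekoff(off_type off, std::ios_base::seekdir way,
                     std::ios_base::openmode which) override {
        const pos_type failure(off_type(-1));
        if (!(which & std::ios_base::in) || way == std::ios_base::end)
            return failure;  // total size is unknown, so no seeking from the end
        off_type cur = end_offset_ - (egptr() - gptr());
        off_type target = (way == std::ios_base::beg) ? off : cur + off;
        if (target < 0) return failure;
        if (target < cur) {
            // Backward seek: start over, then fall through to the forward skip.
            src_.reopen();
            end_offset_ = 0;
            setg(buf_.data(), buf_.data(), buf_.data());
            cur = 0;
        }
        while (cur < target) {
            // Forward seek: read and discard.
            if (gptr() == egptr() &&
                traits_type::eq_int_type(underflow(), traits_type::eof()))
                return failure;  // target lies past the end of the entry
            off_type step = std::min<off_type>(target - cur, egptr() - gptr());
            gbump(static_cast<int>(step));
            cur += step;
        }
        return pos_type(cur);
    }

    pos_type seekpos(pos_type sp, std::ios_base::openmode which) override {
        return seekoff(off_type(sp), std::ios_base::beg, which);
    }

private:
    ForwardSource& src_;
    std::vector<char> buf_;
    off_type end_offset_ = 0;  // bytes pulled from the source so far
};
```

Wrapped in a `std::istream`, a `seekg` to a later offset only reads forward, while a `seekg` to an earlier offset calls `reopen()` once and then skips forward again. The cost of every backward seek is proportional to the target offset, which matches the "really inefficient" caveat above.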
The libarchive documentation, or more precisely the wishlist in the libarchive wiki on GitHub, says:
A few people have asked for the ability to efficiently "re-read"
particular archive entries. This is a tricky subject. For many
formats, the performance gains from this would be very modest. For
example, with a little performance work, the seeking Zip reader could
support very fast re-reading from the beginning since it only involves
re-parsing the central directory. The cases where there would be real
gains (e.g., tar.gz) are going to be very difficult to handle. The
most likely implementation would be some form of checkpointing so that
clients can explicitly ask for a checkpoint object and then restore
back to that checkpoint. The checkpoint object could be complex if you
have a series of stacked read filters plus state in the format handler
itself.
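As far as I can see, there is currently no way to seek within an archive using libarchive, so the solution to my problem is either to remember all the data I have read, but only when I suspect I will want to re-read it, or to push it back into the stream.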
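The "remember everything that was read" option could look roughly like the sketch below: a `std::streambuf` that keeps every byte it has pulled from a forward-only reader, so that backward seeks are served from memory. `CachingBuf`, `Reader` and the `chunk` parameter are hypothetical names, and the reader callback is an assumption standing in for a call such as `archive_read_data`; nothing here is part of libarchive.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <functional>
#include <istream>
#include <stdexcept>
#include <streambuf>
#include <string>
#include <utility>
#include <vector>

class CachingBuf : public std::streambuf {
public:
    // Hypothetical callback: fills dst with up to n bytes and returns the
    // count, 0 at the end of the data, or a negative value on error.
    using Reader = std::function<long(char*, std::size_t)>;

    explicit CachingBuf(Reader read, std::size_t chunk = 8)
        : read_(std::move(read)), chunk_(chunk) {}

protected:
    int_type underflow() override {
        if (gptr() < egptr()) return traits_type::to_int_type(*gptr());
        return fill() ? traits_type::to_int_type(*gptr()) : traits_type::eof();
    }

    pos_type seekoff(off_type off, std::ios_base::seekdir way,
                     std::ios_base::openmode which) override {
        const pos_type failure(off_type(-1));
        if (!(which & std::ios_base::in) || way == std::ios_base::end)
            return failure;  // total size is unknown until everything is read
        off_type cur = gptr() - eback();
        off_type target = (way == std::ios_base::beg) ? off : cur + off;
        if (target < 0) return failure;
        // Forward seek beyond the cache: keep reading until it is covered.
        while (target > static_cast<off_type>(cache_.size()))
            if (!fill()) return failure;  // target lies past the end
        // Backward (or in-cache) seek: just move the get pointer.
        setg(cache_.data(), cache_.data() + target,
             cache_.data() + cache_.size());
        return pos_type(target);
    }

    pos_type seekpos(pos_type sp, std::ios_base::openmode which) override {
        return seekoff(off_type(sp), std::ios_base::beg, which);
    }

private:
    // Appends one chunk to the cache; returns false at the end of the data.
    bool fill() {
        std::size_t offset = static_cast<std::size_t>(gptr() - eback());
        std::size_t old_size = cache_.size();
        cache_.resize(old_size + chunk_);
        long r = read_(cache_.data() + old_size, chunk_);
        cache_.resize(old_size + (r > 0 ? static_cast<std::size_t>(r) : 0));
        // The vector may have reallocated, so rebuild the get area.
        setg(cache_.data(), cache_.data() + offset,
             cache_.data() + cache_.size());
        if (r < 0) throw std::runtime_error("read failed");
        return r > 0;
    }

    Reader read_;
    std::size_t chunk_;
    std::vector<char> cache_;  // every byte read so far
};
```

The trade-off is memory: the cache grows to the size of everything read so far, which is why enabling it only for entries that are likely to be re-read seems sensible. Backward seeks never touch the underlying reader again.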