WinHTTP stops downloading after 8 KiB

I am using WinHTTP (over HTTPS) to download .jar files from libraries.minecraft.net and repo1.maven.org. Here is the function:

// Namespace alias, all stdfs mentions in the function refer to std::filesystem
namespace stdfs = std::filesystem;

bool downloadFile(DLElement element) {
    HINTERNET session = WinHttpOpen(
        // Identify as Firefox
        L"Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:10.0) Gecko/20100101 Firefox/10.0",
        // Don't care about proxies
        WINHTTP_ACCESS_TYPE_DEFAULT_PROXY, WINHTTP_NO_PROXY_NAME, WINHTTP_NO_PROXY_BYPASS, 0);
    if (session == nullptr) return false;
    HINTERNET connection;

    connection = WinHttpConnect(session, element.hostname.c_str(),
        element.https ? INTERNET_DEFAULT_HTTPS_PORT : INTERNET_DEFAULT_HTTP_PORT, 0);

    // Accept `.jar` files
    const wchar_t** acceptTypes = new const wchar_t* [2];
    acceptTypes[0] = L"application/java-archive";
    acceptTypes[1] = nullptr;
    HINTERNET request = WinHttpOpenRequest(connection, L"GET", element.object.c_str(),
        nullptr, WINHTTP_NO_REFERER, acceptTypes, element.https ? WINHTTP_FLAG_SECURE : 0);

    bool result = WinHttpSendRequest(request, WINHTTP_NO_ADDITIONAL_HEADERS, 0, WINHTTP_NO_REQUEST_DATA, 0, 0, 0);
    if (!result) {
        DWORD hr = GetLastError();
        checkHresult(hr, window, false);
    }
    if (!result) {
    failRet:
        WinHttpCloseHandle(request);
        WinHttpCloseHandle(connection);
        WinHttpCloseHandle(session);
        return false;
    }
    result = WinHttpReceiveResponse(request, nullptr);
    DWORD size = 0;
    DWORD downloaded = 0;
    std::vector<uint8_t> file;
    if (!result) goto failRet;
    do {
        if (!WinHttpQueryDataAvailable(request, &size)) {
            goto failRet;
        }
        uint8_t* buffer;
    alloc:
        try {
            buffer = new uint8_t[size];
        }
        catch (std::bad_alloc&) {
            MessageBox(window, lstr(VCLS_OUT_OF_MEMORY_DESC), lstr(VCLS_OUT_OF_MEMORY_TITLE), MB_ICONERROR);
            goto alloc;
        }
        // Dunno why, do I even need this memset?
        memset(buffer, 0, size);
        if (!WinHttpReadData(request, buffer, size, &downloaded)) {
            delete[] buffer;
            checkHresult(GetLastError(), window, false);
            goto failRet;
        }
        size_t vectSize = file.size();
        file.reserve(vectSize + size);
        for (size_t i = vectSize; i < size; i++) {
            // Might as well rewrite this and make this more efficient
            file.push_back(buffer[i]);
        }
        delete[] buffer;

    } while (size > 0);
    WinHttpCloseHandle(request);
    WinHttpCloseHandle(connection);
    WinHttpCloseHandle(session);

    element.path.make_preferred();
    stdfs::path dir = stdfs::path(element.path).remove_filename();
    std::wstring fdirStr = dir.wstring();
    fdirStr.pop_back();
    dir = fdirStr;

tryCreateDir:
    try {
        stdfs::create_directories(mcFolderPath/dir);
    }
    catch (const std::bad_alloc&) {
        MessageBox(window, lstr(VCLS_OUT_OF_MEMORY_DESC), lstr(VCLS_OUT_OF_MEMORY_TITLE), MB_ICONERROR);
        goto tryCreateDir;
    }
    catch (const stdfs::filesystem_error& e) {
        // Error handling removed for brevity

        return false;
    }

    std::basic_ofstream<uint8_t> ofs(mcFolderPath/element.path, std::ios::binary | std::ios::trunc);
    ofs.write(file.data(), file.size());
    ofs.close();

    return true;
}

DLElement is defined as follows:

struct DLElement {
    std::wstring hostname; std::wstring object; stdfs::path path; std::string sha1hex;
    bool hasSha = true; bool https = false;
};

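The problem is that, for some reason, this function downloads only exactly 8 KiB of the actual file, which happens to be WinHTTP's buffer size. The code is a modified version of the Microsoft WinHTTP example, so I assumed it was correct and could fetch more than 8 KiB of data. What am I doing wrong, and why does WinHttpQueryDataAvailable return 0 in size once 8 KiB has been reached?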


OS: Windows 10 Pro, update 1903
Architecture: x86-64 CPU, x86-64 OS
CPU: Intel Core i5-2300 @ 2.8 GHz
RAM: 8 GiB
Swap file: 16000 MB

buffer = new uint8_t[size];
size_t vectSize = file.size();
file.reserve(vectSize + size);
...
for (size_t i = vectSize; i < size; i++) {
    // Might as well rewrite this and make this more efficient
    file.push_back(buffer[i]);
}

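This code is supposed to append buffer to file, so the for loop should start from zero, not from vectSize. On the first pass vectSize is 0, so the whole first chunk is copied. On every later pass vectSize is already 8 KiB or more, so as long as the next chunk is no larger than that, the condition i < size is false from the start and nothing more is appended. That is why the file stops at exactly 8 KiB. Change it to: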

for (size_t i = 0; i < size; i++) 
    file.push_back(buffer[i]);
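
To see the stall without a network connection, the sketch below replays both loop variants against a simulated stream of equally sized chunks. No WinHTTP is involved: `appendBuggy`, `appendFixed`, and `simulate` are hypothetical names used only for illustration, and the 8192-byte chunk size is taken from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mirrors the loop from the question: the index starts at the current size of
// `file`, so nothing is copied once `file` already holds `size` or more bytes.
void appendBuggy(std::vector<std::uint8_t>& file, const std::uint8_t* buffer, std::size_t size) {
    std::size_t vectSize = file.size();
    file.reserve(vectSize + size);
    for (std::size_t i = vectSize; i < size; i++) {
        file.push_back(buffer[i]);
    }
}

// Corrected loop: always copies the whole chunk, starting at buffer[0].
void appendFixed(std::vector<std::uint8_t>& file, const std::uint8_t* buffer, std::size_t size) {
    for (std::size_t i = 0; i < size; i++) {
        file.push_back(buffer[i]);
    }
}

// Feeds `chunks` chunks of `chunkSize` bytes each through `append` and
// returns how many bytes ended up in the output vector.
template <typename Append>
std::size_t simulate(Append append, std::size_t chunks, std::size_t chunkSize) {
    assert(chunkSize > 0);
    std::vector<std::uint8_t> file;
    std::vector<std::uint8_t> chunk(chunkSize, 0xAB);  // dummy payload
    for (std::size_t c = 0; c < chunks; c++) {
        append(file, chunk.data(), chunk.size());
    }
    return file.size();
}
```

Calling `simulate(appendBuggy, 4, 8192)` keeps only the first chunk, while `simulate(appendFixed, 4, 8192)` keeps all four.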

It looks like you are using C++11 or later, so you can use std::vector instead of new. Consider replacing the do loop with the following:

while(true)
{
    if(!WinHttpQueryDataAvailable(request, &size))
        break;
    if(!size) 
        break;
    std::vector<uint8_t> buffer(size); 
    if(!WinHttpReadData(request, buffer.data(), size, &downloaded))
        break;
    if(!downloaded)
        break;
    buffer.resize(downloaded);
    file.insert(file.end(), buffer.begin(), buffer.end());
}

Or write directly into the main buffer:

while(true)
{
    if(!WinHttpQueryDataAvailable(request, &size))
        break;
    if(!size)
        break;

    size_t current_size = file.size();
    file.resize(current_size + size);
    downloaded = 0;
    result = WinHttpReadData(request, file.data() + current_size, size, &downloaded);
    file.resize(current_size + downloaded);

    if(!result || !downloaded)
        break;
}
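
The grow, read, shrink pattern above can also be exercised offline. The following is a minimal sketch under stated assumptions: `FakeSource` is a hypothetical stand-in for the request handle (its `available` and `read` members only imitate the roles of WinHttpQueryDataAvailable and WinHttpReadData), and it deliberately returns fewer bytes than requested so that the second `resize` actually matters.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a WinHTTP request; not part of any real API.
struct FakeSource {
    std::vector<std::uint8_t> data;  // bytes the "server" will send
    std::size_t pos = 0;             // current read position
    std::size_t maxChunk = 8192;     // most bytes reported as available at once
    std::size_t maxRead = 5000;      // most bytes delivered by a single read

    // Imitates the availability query: 0 means there is no more data.
    std::size_t available() const {
        return std::min(maxChunk, data.size() - pos);
    }

    // Imitates the read call: may deliver fewer bytes than `want`.
    std::size_t read(std::uint8_t* out, std::size_t want) {
        std::size_t n = std::min(std::min(want, maxRead), data.size() - pos);
        std::copy_n(data.data() + pos, n, out);
        pos += n;
        return n;
    }
};

// Reads everything from `src` straight into the result vector.
std::vector<std::uint8_t> readAll(FakeSource& src) {
    std::vector<std::uint8_t> file;
    while (true) {
        std::size_t size = src.available();
        if (size == 0) break;

        std::size_t current = file.size();
        file.resize(current + size);         // make room for the next chunk
        std::size_t downloaded = src.read(file.data() + current, size);
        file.resize(current + downloaded);   // drop the part that was not filled

        if (downloaded == 0) break;
    }
    return file;
}
```

Shrinking back to `current + downloaded` is what keeps unread, zero-filled bytes out of the result when a read comes up short.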

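Unrelated to the bug:

There is no need to zero-initialize the buffer. The Microsoft example does it, but that example works with a null-terminated string, and even there it would be enough to set just the last byte to zero.

Consider using WinINet instead of WinHTTP. If you don't need a server, WinINet is simpler.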

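The answer above solved the problem for me, although there is another way to fix it while still using the buffer. (The answer I mentioned is still faster and more elegant, though.) The one-liner below replaces the for loop: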

file.insert(file.end(), &buffer[0], &buffer[size]);

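Appending a range to a std::vector with insert is "the C++ way". Still, writing into the vector in the first place is obviously faster, provided you can specify where the data gets written.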
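
One caveat about the one-liner: WinHttpReadData reports the number of bytes it actually read through its last parameter, and that number can be smaller than the requested size. A slightly more defensive variant uses that count as the end of the range. The sketch below is only an illustration; `appendChunk` is a hypothetical helper name, and `downloaded` is assumed to be the value that WinHttpReadData filled in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends only the bytes that were actually read from `buffer`.
void appendChunk(std::vector<std::uint8_t>& file,
                 const std::uint8_t* buffer,
                 std::size_t downloaded) {
    assert(buffer != nullptr || downloaded == 0);
    // Pointers work as iterators, so a [begin, end) range can be inserted directly.
    file.insert(file.end(), buffer, buffer + downloaded);
}
```

In the question's loop this would be called as `appendChunk(file, buffer, downloaded);` right after a successful read.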