Editing a ZIP archive in place in Golang

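I'm writing an application that lets a user upload anonymized data to an S3 bucket, so that they can try out our product without giving us authentication data.

This is the struct that handles the ZIP archive, which has already been shown to work: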

type ZipWriter struct {
    buffer *bytes.Buffer
    writer *zip.Writer
}

func FromFile(file io.Reader) (*ZipWriter, error) {

    // First, read all the data from the file; if this fails then return an error
    data, err := ioutil.ReadAll(file)
    if err != nil {
        return nil, fmt.Errorf("Failed to read data from the ZIP archive, error: %v", err)
    }

    // Next, put all the data into a buffer and then create a ZIP writer
    // from the buffer and return that writer
    buffer := bytes.NewBuffer(data)
    return &ZipWriter{
        buffer: buffer,
        writer: zip.NewWriter(buffer),
    }, nil
}

// WriteToStream writes the contents of the ZIP archive to the provided stream
func (writer *ZipWriter) WriteToStream(file io.Writer) error {

    // First, attempt to close the ZIP archive writer so that we can avoid
    // double writes to the underlying buffer; if an error occurs then return it
    if err := writer.writer.Close(); err != nil {
        return fmt.Errorf("Failed to close ZIP archive, error: %v", err)
    }

    // Next, write the underlying buffer to the provided stream; if this fails
    // then return an error
    if _, err := writer.buffer.WriteTo(file); err != nil {
        return fmt.Errorf("Failed to write the ZIP data to the stream, error: %v", err)
    }

    return nil
}

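Using ZipWriter, I load the ZIP file with the FromFile function and then write it out to a byte buffer with the WriteToStream function. After that, I call the following function to upload the ZIP archive data to a presigned URL in S3: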

// DoRequest does an HTTP request against an endpoint with a given URL, method and access token
func DoRequest(client *http.Client, method string, url string, code string, reader io.Reader) ([]byte, error) {

    // First, create the request with the method, URL, body and access token
    // We don't expect this to fail so ignore the error
    request, _ := http.NewRequest(method, url, reader)
    if !util.IsEmpty(code) {
        request.Header.Set(headers.Accept, echo.MIMEApplicationJSON)
        request.Header.Set(headers.Authorization, fmt.Sprintf("Bearer %s", code))
    } else {
        request.Header.Set(headers.ContentType, "application/zip")
    }

    // Next, do the request; if this fails then return an error
    resp, err := client.Do(request)
    if err != nil {
        return nil, fmt.Errorf("Failed to run the %s request against %s, error: %v", method, url, err)
    }

    // Close the body on every return path, including the non-200 case below
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return nil, fmt.Errorf("Failed to run the %s request against %s, response: %v", method, url, resp)
    }

    // Now, read the body from the response; if this fails then return an error
    body, err := ioutil.ReadAll(resp.Body)
    if err != nil {
        return nil, fmt.Errorf("Failed to read the body associated with the response, error: %v", err)
    }

    // Finally, return the body from the response
    return body, nil
}

So, the whole operation looks like this:

file, err := os.Open(location)
if err != nil {
    log.Fatalf("Unable to open ZIP archive located in %s, error: %v", location, err)
}

writer, err := lutils.FromFile(file)
if err != nil {
    log.Fatalf("File located in %s could not be read as a ZIP archive, error: %v", location, err)
}

buffer := new(bytes.Buffer)
if err := writer.WriteToStream(buffer); err != nil {
    log.Fatalf("Failed to write data to the ZIP archive, error: %v", err)
}

if body, err := DoRequest(new(http.Client), http.MethodPut, url, "", buffer); err != nil {
    log.Fatalf("Failed to upload the data to S3, response: %s, error: %v", string(body), err)
}

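The problem I'm having is that, although the upload to S3 succeeds, no files are found when the ZIP archive is downloaded and extracted. While investigating this, I came up with a few possible points of failure:

  1. FromFile does not correctly create the ZIP archive from the file, resulting in a corrupted archive.
  2. WriteToStream corrupts the data when writing out the archive. This seems unlikely because I have already tested this function using a bytes.Buffer as the reader. Unless an os.File produces a corrupted ZIP archive where a bytes.Buffer does not, I think this function probably works as expected.
  3. The data is corrupted when DoRequest writes it to S3. This seems unlikely because I have used this code for other data without issue. So, unless the structure of a ZIP archive needs to be treated differently from other file types, I don't see a problem here either.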
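One way to test the first two hypotheses is to parse the bytes at each stage (straight from disk, then after WriteToStream) and compare the entry names. The following is a minimal sketch; `listEntries` and `sampleArchive` are hypothetical helpers that are not part of the original code, and the sample archive only stands in for the real file.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"log"
)

// listEntries parses data as a ZIP archive and returns its entry names.
// Hypothetical diagnostic helper, not part of the original code.
func listEntries(data []byte) ([]string, error) {
	r, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil, fmt.Errorf("not a readable ZIP archive: %w", err)
	}
	names := make([]string, 0, len(r.File))
	for _, f := range r.File {
		names = append(names, f.Name)
	}
	return names, nil
}

// sampleArchive builds a small in-memory archive with the given entry names.
// It is only a stand-in for the archive that would be read from disk.
func sampleArchive(names ...string) ([]byte, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	for _, name := range names {
		fw, err := w.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := fw.Write([]byte("contents of " + name)); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := sampleArchive("a.txt", "b.txt")
	if err != nil {
		log.Fatalf("Failed to build sample archive, error: %v", err)
	}
	names, err := listEntries(data)
	if err != nil {
		log.Fatalf("Failed to list entries, error: %v", err)
	}
	fmt.Println(len(names), names)
}
```

Calling `listEntries` on the raw file bytes and again on the buffer produced by WriteToStream would show at which step the entries disappear.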

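After looking into these possibilities more closely, I believe the issue may be in how I create the ZIP writer from the archive file, but I'm not sure what exactly is wrong.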

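The issue here was a bit of a red herring. As @CeriseLimón pointed out, calling NewWriter and then Close on an existing ZIP archive necessarily appends an empty archive to the end of the file. ZIP readers locate the directory through the end-of-central-directory record at the end of the file, so they find the appended empty directory and report no files. In my use case, the solution was to open the file and write it directly to the stream instead of trying to read it as a ZIP archive: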
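The effect described above can be reproduced entirely in memory. This is a minimal sketch: `buildArchive`, `countEntries` and `reopenAndClose` are hypothetical names, and `reopenAndClose` only mimics what FromFile followed by WriteToStream does to the bytes.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"log"
)

// buildArchive creates a valid in-memory archive holding a single entry.
func buildArchive() ([]byte, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	fw, err := w.Create("hello.txt")
	if err != nil {
		return nil, err
	}
	if _, err := fw.Write([]byte("hello")); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// countEntries reports how many entries a ZIP reader sees in data.
func countEntries(data []byte) (int, error) {
	r, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return 0, err
	}
	return len(r.File), nil
}

// reopenAndClose mimics FromFile + WriteToStream: it wraps the existing
// archive bytes in a buffer, creates a zip.Writer on it and closes it.
func reopenAndClose(data []byte) ([]byte, error) {
	// Copy first so the append cannot touch the caller's slice.
	buffer := bytes.NewBuffer(append([]byte(nil), data...))
	if err := zip.NewWriter(buffer).Close(); err != nil {
		return nil, err
	}
	return buffer.Bytes(), nil
}

func main() {
	before, err := buildArchive()
	if err != nil {
		log.Fatalf("Failed to build archive, error: %v", err)
	}
	after, err := reopenAndClose(before)
	if err != nil {
		log.Fatalf("Failed to reopen archive, error: %v", err)
	}
	nBefore, errBefore := countEntries(before)
	nAfter, errAfter := countEntries(after)
	fmt.Println("entries before:", nBefore, errBefore)
	fmt.Println("entries after: ", nAfter, errAfter)
	fmt.Println("bytes appended:", len(after)-len(before))
}
```

The original entry data is still present in the output; it is simply no longer referenced by the directory that readers consult, which is why extraction finds nothing.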

file, err := os.Open(location)
if err != nil {
    log.Fatalf("Unable to open ZIP archive located in %s, error: %v", location, err)
}

if body, err := DoRequest(new(http.Client), http.MethodPut, url, "", file); err != nil {
    log.Fatalf("Failed to upload the data to S3, response: %s, error: %v", string(body), err)
}
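If the archive does need to be modified before upload, which was not required in this use case, one possible approach is to rebuild it: read the existing entries with zip.NewReader, copy them into a fresh zip.Writer, and then add the new entries. The sketch below assumes Go 1.17 or later (for `Writer.Copy`); `addFile` and `readEntry` are hypothetical helpers, and the entry names are illustrative only.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
	"log"
)

// addFile returns a NEW archive holding every entry of src plus one extra
// entry. An empty src yields a fresh archive. Hypothetical helper.
func addFile(src []byte, name string, contents []byte) ([]byte, error) {
	var out bytes.Buffer
	w := zip.NewWriter(&out)

	if len(src) > 0 {
		r, err := zip.NewReader(bytes.NewReader(src), int64(len(src)))
		if err != nil {
			return nil, fmt.Errorf("failed to read source archive: %w", err)
		}
		for _, f := range r.File {
			// Copy transfers the raw, still-compressed entry (Go 1.17+).
			if err := w.Copy(f); err != nil {
				return nil, fmt.Errorf("failed to copy %s: %w", f.Name, err)
			}
		}
	}

	fw, err := w.Create(name)
	if err != nil {
		return nil, fmt.Errorf("failed to create %s: %w", name, err)
	}
	if _, err := fw.Write(contents); err != nil {
		return nil, fmt.Errorf("failed to write %s: %w", name, err)
	}
	// Close writes the single central directory covering all entries.
	if err := w.Close(); err != nil {
		return nil, fmt.Errorf("failed to close archive: %w", err)
	}
	return out.Bytes(), nil
}

// readEntry returns the decompressed contents of the named entry.
func readEntry(data []byte, name string) (string, error) {
	r, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return "", err
	}
	for _, f := range r.File {
		if f.Name != name {
			continue
		}
		rc, err := f.Open()
		if err != nil {
			return "", err
		}
		defer rc.Close()
		b, err := io.ReadAll(rc)
		return string(b), err
	}
	return "", fmt.Errorf("entry %s not found", name)
}

func main() {
	// Stand-in for the archive that would normally be read from disk.
	base, err := addFile(nil, "hello.txt", []byte("hello"))
	if err != nil {
		log.Fatalf("Failed to build base archive, error: %v", err)
	}
	out, err := addFile(base, "added.txt", []byte("new data"))
	if err != nil {
		log.Fatalf("Failed to add file, error: %v", err)
	}
	got, err := readEntry(out, "hello.txt")
	fmt.Println(got, err)
}
```

The resulting bytes could then be passed to DoRequest as `bytes.NewReader(out)` in place of the file.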