Download and Unzip XML file

Question

I want to unzip and parse the XML file located here.

This is my code:

HttpClientHandler handler = new HttpClientHandler()
{
    CookieContainer = new CookieContainer(),
    UseCookies = true,
    AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate,
    // | DecompressionMethods.None,
};

using (var http = new HttpClient(handler))
{

    var response =
         http.GetAsync(@"https://login.tradedoubler.com/report/published/aAffiliateEventBreakdownReportWithPLC_806880712_4446152766894956100.xml.zip").Result;

    Stream streamContent = response.Content.ReadAsStreamAsync().Result;

    using (var gZipStream = new GZipStream(streamContent, CompressionMode.Decompress))
    {
        var settings = new XmlReaderSettings()
        {
            DtdProcessing = DtdProcessing.Ignore
        };

        var reader = XmlReader.Create(gZipStream, settings);
        reader.MoveToContent();

        XElement root = XElement.ReadFrom(reader) as XElement;
    }
}

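I get an exception at XmlReader.Create(gZipStream, settings):

The magic number in GZip header is not correct. Make sure you are passing in a GZip stream.

To double-check that I am getting well-formed data from the web, I grabbed the stream and saved it to a file: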

byte[] byteContent = response.Content.ReadAsByteArrayAsync().Result;
File.WriteAllBytes(@"C:\temp\1111.zip", byteContent);

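When I inspect 1111.zip, it turns out to be a well-formed zip file containing the XML I need.

I was told here that I don't need GZipStream at all, but if I remove the decompression stream from the code entirely and pass streamContent directly to the XML reader, I get an exception: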

"Data at the root level is invalid. Line 1, position 1."

With or without decompression, I cannot parse this file. What am I doing wrong?

Answer 1

The file in question is encoded in PKZip format, not GZip.

You need a different library to decompress it, such as System.IO.Compression.ZipFile.
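A minimal sketch of that approach, assuming the archive contains at least one .xml entry. The names ZipXml and LoadFirstXml are made up for illustration. It uses ZipArchive from the same System.IO.Compression namespace to read the entry straight from a stream, so nothing is written to disk, and it keeps the DtdProcessing.Ignore setting from the question:

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

static class ZipXml   // hypothetical helper, not part of any library
{
    // Opens a PKZip stream, finds the first entry whose name ends in ".xml",
    // and parses that entry into an XElement.
    public static XElement LoadFirstXml(Stream zipStream)
    {
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read))
        {
            // First() throws InvalidOperationException if no .xml entry exists.
            ZipArchiveEntry entry = archive.Entries.First(
                e => e.FullName.EndsWith(".xml", StringComparison.OrdinalIgnoreCase));

            var settings = new XmlReaderSettings
            {
                DtdProcessing = DtdProcessing.Ignore   // same setting as in the question
            };

            using (Stream xml = entry.Open())
            using (XmlReader reader = XmlReader.Create(xml, settings))
            {
                // The whole element is loaded before the streams are disposed.
                return XElement.Load(reader);
            }
        }
    }
}
```

In the question's code this could be called as ZipXml.LoadFirstXml(new MemoryStream(byteContent)) in place of the GZipStream block. The AutomaticDecompression setting on the handler only deals with HTTP content encoding; it does not unpack a .zip payload.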

You can usually tell the encoding from the file extension: PKZip files usually use .zip, while GZip files usually use .gz.
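The extension is only a convention. The leading bytes of the data are a more direct check, and they are what the "magic number" in the exception message refers to: GZip data starts with the bytes 0x1F 0x8B, while PKZip data starts with the ASCII letters "PK" (0x50 0x4B). A rough sketch, assuming the downloaded bytes are already in memory (ArchiveSniffer and Detect are hypothetical names):

```csharp
using System;

static class ArchiveSniffer   // hypothetical helper for illustration
{
    // Returns "zip", "gzip" or "unknown" based on the first two bytes.
    public static string Detect(byte[] data)
    {
        if (data == null || data.Length < 2)
            return "unknown";

        // PKZip records begin with "PK"; a non-empty archive starts with 50 4B 03 04.
        if (data[0] == 0x50 && data[1] == 0x4B)
            return "zip";

        // GZip members begin with the fixed ID bytes 1F 8B.
        if (data[0] == 0x1F && data[1] == 0x8B)
            return "gzip";

        return "unknown";
    }
}
```

Calling ArchiveSniffer.Detect(byteContent) on the bytes saved in the question should report "zip", which is consistent with GZipStream rejecting the stream.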

See: Unzip files programmatically in .net

Answer 2

After saving the stream to a local folder, extract it with the ZipFile class, like this:

    byte[] byteContent = response.Content.ReadAsByteArrayAsync().Result;
    string filename = @"C:\temp\1111.zip";
    File.WriteAllBytes(filename, byteContent);

    string destinationDir = @"c:\temp";
    string xmlFilename = "report.xml";

    System.IO.Compression.ZipFile.ExtractToDirectory(filename, destinationDir);

    XmlDocument xmlDoc = new XmlDocument();
    xmlDoc.Load(Path.Combine(destinationDir, xmlFilename));

    //xml reading goes here...
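The snippet above assumes the entry inside the archive is called report.xml, and ExtractToDirectory throws if the extracted file already exists in the destination folder. A possible variation, sketched here under the assumption that any entry ending in .xml is the one wanted, opens the saved archive with ZipFile.OpenRead and loads the entry without extracting it (SavedZip and LoadXmlEntry are hypothetical names):

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Xml;

static class SavedZip   // hypothetical helper for illustration
{
    // Loads the first *.xml entry of a zip file on disk into an XmlDocument.
    public static XmlDocument LoadXmlEntry(string zipPath)
    {
        using (ZipArchive archive = ZipFile.OpenRead(zipPath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                if (!entry.FullName.EndsWith(".xml", StringComparison.OrdinalIgnoreCase))
                    continue;

                using (Stream entryStream = entry.Open())
                {
                    var xmlDoc = new XmlDocument();
                    xmlDoc.Load(entryStream);   // fully parsed before the stream closes
                    return xmlDoc;
                }
            }
        }

        throw new FileNotFoundException("No .xml entry found in archive.", zipPath);
    }
}
```

With the variables from the answer, this would be called as SavedZip.LoadXmlEntry(filename). On .NET Framework, ZipFile lives in the System.IO.Compression.FileSystem assembly, which has to be referenced in addition to System.IO.Compression.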

