Downloading a gz file in R directly from a URL gives a corrupt file

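When I download a gz file directly in R from http://prices.shufersal.co.il/ (any link in the leftmost column of the table will do), the file is read as corrupt, but when I download it manually from the site it reads fine.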

>shufersal.url='http://pricesprodpublic.blob.core.windows.net/price/Price7290027600007-001-201505220240.gz?sv=2014-02-14&sr=b&sig=VUHTJAzWEBqMJXO%2BwHE4WAh3DJNWkw4w03%2BLk8c6dUw%3D&se=2015-05-23T14%3A47%3A42Z&sp=r'
> temp <- tempfile()
> download.file(shufersal.url,temp,quiet = T)
> gzfile(temp)

                                                            class 
                                                         "gzfile" 
                                                             mode 
                                                             "rb" 
                                                             text 
                                                           "text" 
                                                           opened 
                                                         "closed" 
                                                         can read 
                                                            "yes" 
                                                        can write 
                                                            "yes" 

> readLines(gzfile(temp))
 character(0)
 Warning message:
 invalid or incomplete compressed data
> unlink(temp)
shufersal.url="https://github.com/yonicd/supermarketprices/raw/master/shufersal/Price7290027600007-001-201505220240.gz"
temp <- tempfile()
download.file(shufersal.url,temp,quiet = T)
readLines(gzfile(temp),encoding = "UTF-8")
unlink(temp)

> readLines(gzfile(temp),encoding = "UTF-8")
[1] "<?xml version=\"1.0\" encoding=\"utf-8\"?>"                                
[2] "<root>"                                                                   
[3] "  <ChainId>7290027600007</ChainId>"                                       
[4] "  <SubChainId>001</SubChainId>"                                           
[5] "  <StoreId>001</StoreId>"                                                 
[6] "  <BikoretNo>6</BikoretNo>"                                               
[7] "  <Items>"                                                                
[8] "    <Item>"                                                               
[9] "      <PriceUpdateDate>2015-05-21 11:11</PriceUpdateDate>"                
[10] "      <ItemCode>7290000000046</ItemCode>"
...
[77] "      <AllowDiscount>1</AllowDiscount>"                                   
[78] "      <ItemStatus>1</ItemStatus>"                                         
[79] "    </Item>"                                                              
[80] "  </Items>" 
# Set mode = "wb" as an argument in download.file so the file is written in binary mode
download.file(shufersal.url,temp,mode="wb",quiet = T)

See help(download.file) for details.
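
As background for the fix above: help(download.file) notes that the choice of binary transfer matters on Windows, where a text-mode transfer rewrites \n line endings as \r\n, and that is enough to damage compressed data. Below is a minimal sketch of one way to guard against this. The helper names is_gzip and download_gz are hypothetical (they are not part of base R or of the original post); the check relies only on the fact that gzip data starts with the two bytes 1f 8b.

```r
# Hypothetical helper: TRUE if the file starts with the gzip magic bytes 1f 8b.
is_gzip <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  magic <- readBin(con, what = "raw", n = 2L)
  identical(magic, as.raw(c(0x1f, 0x8b)))
}

# Hypothetical wrapper: always download in binary mode, then sanity-check.
download_gz <- function(url, dest = tempfile(fileext = ".gz")) {
  download.file(url, dest, mode = "wb", quiet = TRUE)
  if (!is_gzip(dest)) {
    warning("downloaded file does not look like gzip data")
  }
  dest
}

# Local check that needs no network: write a small gz file and read it back.
tmp <- tempfile(fileext = ".gz")
con <- gzfile(tmp, "w")
writeLines(c("<root>", "</root>"), con)
close(con)

gz_ok <- is_gzip(tmp)                  # header check on a real gzip file
lines_back <- readLines(gzfile(tmp))   # round trip through gzfile

# A plain text file should fail the header check.
plain <- tempfile(fileext = ".txt")
writeLines("not compressed", plain)
plain_ok <- is_gzip(plain)
```

With a real URL, the call would look like readLines(gzfile(download_gz(shufersal.url)), encoding = "UTF-8"). The header check only confirms that the file begins like gzip data; it does not prove the rest of the file is intact.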