Download of gz file in R directly from URL is corrupt
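When I download a gz file in R directly from http://prices.shufersal.co.il/ (any of the links in the leftmost column of the table will do), the file is read as corrupt, but when I download it manually from the site it reads fine.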
>shufersal.url='http://pricesprodpublic.blob.core.windows.net/price/Price7290027600007-001-201505220240.gz?sv=2014-02-14&sr=b&sig=VUHTJAzWEBqMJXO%2BwHE4WAh3DJNWkw4w03%2BLk8c6dUw%3D&se=2015-05-23T14%3A47%3A42Z&sp=r'
> temp <- tempfile()
> download.file(shufersal.url,temp,quiet = T)
> gzfile(temp)
   class     mode     text   opened can read can write
"gzfile"     "rb"   "text" "closed"    "yes"     "yes"
> readLines(gzfile(temp))
character(0)
Warning message:
invalid or incomplete compressed data
> unlink(temp)
shufersal.url="https://github.com/yonicd/supermarketprices/raw/master/shufersal/Price7290027600007-001-201505220240.gz"
temp <- tempfile()
download.file(shufersal.url,temp,quiet = T)
readLines(gzfile(temp),encoding = "UTF-8")
unlink(temp)
> readLines(gzfile(temp),encoding = "UTF-8")
[1] "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
[2] "<root>"
[3] " <ChainId>7290027600007</ChainId>"
[4] " <SubChainId>001</SubChainId>"
[5] " <StoreId>001</StoreId>"
[6] " <BikoretNo>6</BikoretNo>"
[7] " <Items>"
[8] " <Item>"
[9] " <PriceUpdateDate>2015-05-21 11:11</PriceUpdateDate>"
[10] " <ItemCode>7290000000046</ItemCode>"
...
[77] " <AllowDiscount>1</AllowDiscount>"
[78] " <ItemStatus>1</ItemStatus>"
[79] " </Item>"
[80] " </Items>"
# Set mode = "wb" in download.file so the file is written in binary mode
download.file(shufersal.url, temp, mode = "wb", quiet = TRUE)
See help(download.file) for details.
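As far as I recall, help(download.file) says that on Windows the default is a text-mode transfer unless the URL ends in a known binary extension such as .gz. That would explain why the GitHub link (which ends in .gz) reads fine while the blob link (which ends in a query string) does not. Below is a minimal sketch of the binary-mode round trip. So that it runs without network access, it builds a small gz file locally and downloads it through a file:// URL; the names src, src_url, dest and xml_lines are placeholders for illustration, not part of the original question.

```r
# Build a small gzip file locally (stand-in for the real price file).
src <- tempfile(fileext = ".gz")
con <- gzfile(src, open = "wb")
writeLines(c("<root>", "  <ChainId>7290027600007</ChainId>", "</root>"), con)
close(con)

# Hypothetical file:// URL used in place of the real download link.
src_url <- paste0("file://", normalizePath(src, winslash = "/"))

# mode = "wb" asks for a byte-for-byte (binary) copy of the compressed file.
dest <- tempfile(fileext = ".gz")
status <- download.file(src_url, dest, mode = "wb", quiet = TRUE)

# Read the decompressed text and close the connection explicitly.
con <- gzfile(dest, open = "rt")
xml_lines <- readLines(con, encoding = "UTF-8")
close(con)

unlink(c(src, dest))
```

With the real link, only src_url changes; the mode = "wb" argument is the part that matters.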