Create a DataFrame from a CSV inside a zip file
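I am trying to read the WGIData.csv file into a pandas DataFrame. WGIData.csv is inside a zip file that I download from this URL:
http://databank.worldbank.org/data/download/WGI_csv.zip
But when I try to read it, it throws the error BadZipFile: File is not a zip file
Here is my Python code: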
import pandas as pd
from urllib.request import urlopen
from zipfile import ZipFile

class Get_Data():
    def Return_csv_from_zip(self, url):
        self.zip = urlopen(url)
        self.myzip = ZipFile(self.zip)
        self.myzip = self.zip.extractall(self.myzip)
        self.file = pd.read_csv(self.myzip)
        self.zip.close()
        return self.file

url = 'http://databank.worldbank.org/data/download/WGI_csv.zip'
data = Get_Data()
df = data.Return_csv_from_zip(url)
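urlopen() doesn't return an object (HTTPResponse) that you can pass to ZipFile(), because ZipFile() needs a seekable file-like object and the response is not seekable. You can read() the response and wrap the bytes in io.BytesIO() to do what you need: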
In []:
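from io import BytesIO
from urllib.request import urlopen
from zipfile import ZipFile
import pandas as pd
z = urlopen('http://databank.worldbank.org/data/download/WGI_csv.zip')
# extract() writes WGIData.csv to the working directory and returns its path
myzip = ZipFile(BytesIO(z.read())).extract('WGIData.csv')
pd.read_csv(myzip)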
Out[]:
   Country Name  Country Code  Indicator Name                                     Indicator Code    1996  \
0  Anguilla      AIA           Control of Corruption: Estimate                    CC.EST            NaN
1  Anguilla      AIA           Control of Corruption: Number of Sources           CC.NO.SRC         NaN
2  Anguilla      AIA           Control of Corruption: Percentile Rank             CC.PER.RNK        NaN
3  Anguilla      AIA           Control of Corruption: Percentile Rank, Lower ...  CC.PER.RNK.LOWER  NaN
4  Anguilla      AIA           Control of Corruption: Percentile Rank, Upper ...  CC.PER.RNK.UPPER  NaN
5  Anguilla      AIA           Control of Corruption: Standard Error              CC.STD.ERR        NaN
...
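If you would rather not write WGIData.csv to disk, a variation on the same idea is to read the member straight out of the in-memory archive and hand that to pandas. This is only a sketch: the helper names open_zip_member and read_csv_from_zip_url are made up for illustration, and it assumes the archive is small enough to hold in memory.

```python
from io import BytesIO
from urllib.request import urlopen
from zipfile import ZipFile


def open_zip_member(zip_bytes, member):
    """Return the named archive member as an in-memory, seekable file object."""
    with ZipFile(BytesIO(zip_bytes)) as archive:
        # read() raises KeyError if the member is not in the archive
        return BytesIO(archive.read(member))


def read_csv_from_zip_url(url, member):
    """Download a zip archive and load one CSV member into a DataFrame."""
    import pandas as pd  # imported here so open_zip_member() works without pandas

    with urlopen(url) as response:
        payload = response.read()  # the whole archive is held in memory
    return pd.read_csv(open_zip_member(payload, member))
```

With the URL from the question this would be called as read_csv_from_zip_url(url, 'WGIData.csv'). Nothing is extracted to the working directory, which is the main difference from the extract() approach.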