Extracting data from a UCI dataset online using Python when the file is compressed (.zip)
I want to fetch the data from this file over the web:
https://archive.ics.uci.edu/ml/machine-learning-databases/00380/YouTube-Spam-Collection-v1.zip
How can I do this in Python using requests?
You can use this example to download the zip file with requests and open it in memory with the built-in zipfile module:
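import requests
from io import BytesIO
from zipfile import ZipFile

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/00380/YouTube-Spam-Collection-v1.zip"

# Download the archive and fail early on an HTTP error
response = requests.get(url)
response.raise_for_status()

# Wrap the downloaded bytes in a file-like object so ZipFile can read them
with ZipFile(BytesIO(response.content), "r") as myzip:
    # print the names of the files in the zip:
    # print(myzip.namelist())

    # print the content of one of the files:
    with myzip.open("Youtube01-Psy.csv", "r") as f_in:
        print(f_in.read())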
Prints:
b'COMMENT_ID,AUTHOR,DATE,CONTENT,CLASS\n
...
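Since `f_in.read()` returns raw bytes, a follow-up step might be to decode the file and parse it into rows. The sketch below shows one way to do that with the standard-library csv module. The helper name `read_csv_from_zip` and the UTF-8 encoding are assumptions for illustration, not part of the original answer.

```python
import csv
from io import BytesIO, TextIOWrapper
from zipfile import ZipFile


def read_csv_from_zip(zip_bytes, member_name, encoding="utf-8"):
    """Return the rows of one CSV file inside a zip archive as a list of dicts.

    Hypothetical helper; the encoding default is an assumption about the data.
    """
    with ZipFile(BytesIO(zip_bytes)) as archive:
        with archive.open(member_name) as raw:
            # Decode the byte stream as text; newline="" lets the csv module
            # handle line endings itself, including those inside quoted fields.
            text = TextIOWrapper(raw, encoding=encoding, newline="")
            # The first line of the file is used as the dict keys.
            return list(csv.DictReader(text))


# Possible usage with the download from the answer above (needs network access):
#   response = requests.get(url)
#   response.raise_for_status()
#   rows = read_csv_from_zip(response.content, "Youtube01-Psy.csv")
#   print(rows[0]["CONTENT"], rows[0]["CLASS"])
```

Every value comes back as a string, so a column such as CLASS would need an explicit `int(...)` conversion before being used as a label.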