Read an xls file from a URL in Python

Question

I am trying to read the data at the link below from Python: https://drive.google.com/file/d/16cp23cJxeyUfnBHMp-sNCuFNQxe8cqOV/view

I tried this:

import pandas as pd

path = pd.read_excel('https://drive.google.com/file/d/16cp23cJxeyUfnBHMp-sNCuFNQxe8cqOV/view')

This returned the following error:

XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'

Then I tried reading it as CSV:

path = pd.read_csv('https://drive.google.com/file/d/16cp23cJxeyUfnBHMp-sNCuFNQxe8cqOV/view')

That returned this:

ParserError: Error tokenizing data. C error: Expected 298 fields in line 133, saw 440

Finally, I tried this:

path = pd.read_csv("https://drive.google.com/file/d/16cp23cJxeyUfnBHMp-sNCuFNQxe8cqOV/view")

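This read the data, but the result is not what I expected after viewing the file at the link (283 rows, 7 columns). See the screenshot below.

[Screenshot: Error reading data]

Any ideas on how to read the data?

Thanks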

Answer 1

Use this example to download the Excel file from Google Drive (fileid is the ID that follows the /d/ part of the URL):

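import pandas as pd

# File ID: the part of the share URL that follows /d/
fileid = "16cp23cJxeyUfnBHMp-sNCuFNQxe8cqOV"

df = pd.read_excel(
    "https://drive.google.com/uc?export=download&id={fileid}".format(
        fileid=fileid
    ),
    skiprows=17,  # skip the first 17 rows of the sheet
)
print(df)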

Prints:

     Unnamed: 0                                         Unnamed: 1                                         Unnamed: 2 Petajoules Gigajoules           %
0           NaN                                        Afghanistan                                        Afghanistan        321         10   78.669280
1           NaN                                            Albania                                            Albania        102         35  100.000000
2           NaN                                            Algeria                                            Algeria       1959         51    0.551010
3           NaN                                     American Samoa                                     American Samoa        ...        ...    0.641026
4           NaN                                            Andorra                                            Andorra          9        121   88.695650
5           NaN                                             Angola                                             Angola        642         27   70.909090

...and so on.
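
If you would rather not copy the ID out by hand, the same step can be sketched as a small helper. This is only an illustration: `drive_download_url` is a hypothetical name, and it assumes the share link has the `/file/d/<id>/...` shape shown in the question.

```python
import re


def drive_download_url(share_url: str) -> str:
    """Turn a Google Drive share link into a direct-download link.

    Hypothetical helper: assumes the link contains /d/<file id>/,
    as in the question's URL.
    """
    # The file ID is the run of letters, digits, '-' and '_' after /d/
    match = re.search(r"/d/([A-Za-z0-9_-]+)", share_url)
    if match is None:
        raise ValueError(f"no file ID found in {share_url!r}")
    return f"https://drive.google.com/uc?export=download&id={match.group(1)}"


share_url = "https://drive.google.com/file/d/16cp23cJxeyUfnBHMp-sNCuFNQxe8cqOV/view"
print(drive_download_url(share_url))
# https://drive.google.com/uc?export=download&id=16cp23cJxeyUfnBHMp-sNCuFNQxe8cqOV
```

The returned string can then be passed to `pd.read_excel(..., skiprows=17)` exactly as in the snippet above.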
