xlrd throws TypeError: embedded NUL character when trying to open an `.xls` file from web in Python 3.4

I'm trying to open an Excel file from the web and extract one of its columns. However, I get an error when I try to open the file with xlrd. The code I'm using is:

from urllib.request import urlopen
import xlrd
DJIA_URL = 'http://www.djaverages.com/?go=export-components&symbol=DJI'
xlfile = urlopen(DJIA_URL).read()
xlbook = xlrd.open_workbook(xlfile)

However, I get a TypeError:

Traceback (most recent call last):
  File "C:\Code\development\Pynance\pynance\sources\indices.py", line 31, in <module>
    xlbook = xlrd.open_workbook(xlfile)
  File "C:\Python34\lib\site-packages\xlrd\__init__.py", line 394, in open_workbook
    f = open(filename, "rb")
TypeError: embedded NUL character
[Finished in 0.8s with exit code 1]

If I download the file manually and open it:

xlfile = 'DJIComponents.xls'
xlbook = xlrd.open_workbook(xlfile)

it works fine, but I'd rather skip the manual step. Is there an encoding setting or something else I'm missing?

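xlrd.open_workbook() expects the path of an Excel file as its first argument. The xlfile object created by xlfile = urlopen(DJIA_URL).read() is not a path but the raw contents of the file, so xlrd.open_workbook(xlfile) hands those bytes to open() as if they were a filename, and that fails.

The xlfile created this way is an object of class bytes. You can check this with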

print(type(xlfile))

which should print

<class 'bytes'>
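To see why such bytes trip up open(), here is a minimal sketch that uses only the standard library. The fake_xls_bytes value is a made-up stand-in for the downloaded data (the legacy .xls container signature followed by NUL padding), and try_open_as_path is a hypothetical helper, not part of xlrd. Python 3.4 reports the problem as a TypeError; newer versions are assumed to raise ValueError instead, so the sketch catches both.

```python
# Stand-in for downloaded .xls contents: the 8-byte OLE2 container
# signature followed by NUL padding (an assumption for illustration).
fake_xls_bytes = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1" + b"\x00" * 16


def try_open_as_path(data):
    """Try to use raw bytes as a filename, as open(filename, "rb") would.

    Returns True if the open succeeds, False if Python rejects the name.
    """
    try:
        with open(data, "rb"):
            return True
    except (TypeError, ValueError):
        # Python 3.4 raises TypeError here; later versions raise ValueError.
        return False


print(type(fake_xls_bytes))
print(try_open_as_path(fake_xls_bytes))
```

The second print shows False because a filename may not contain NUL bytes, which is the same rejection the traceback in the question reports.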

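So one way to fix this is to save the file to disk first and open it from there:

(1) Add the import:

import urllib.request

(2) Save the Excel file with:

urllib.request.urlretrieve(DJIA_URL, r'path\to\file\xxx.xls')

(3) Finally, open it with:

xlrd.open_workbook(r'path\to\file\xxx.xls')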

(Tested on Python 3.4, Eclipse PyDev, Windows 7 x64.)
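If the bytes have already been downloaded with urlopen(...).read(), as in the question, a variation on the same idea is to write them to a temporary file rather than a hard-coded path. The following is a rough sketch only; save_bytes_to_temp_xls is a hypothetical helper name, and the sample bytes are placeholders rather than a real workbook.

```python
import os
import tempfile


def save_bytes_to_temp_xls(data):
    """Write downloaded bytes to a temporary .xls file and return its path."""
    # mkstemp returns an open file descriptor plus the path of the new file.
    fd, path = tempfile.mkstemp(suffix=".xls")
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


# Placeholder bytes; in practice this would be urlopen(DJIA_URL).read().
sample_path = save_bytes_to_temp_xls(b"placeholder contents")
print(sample_path)
os.remove(sample_path)  # clean up once the workbook has been read
```

The returned path can then be passed to xlrd.open_workbook() like any other filename. xlrd's open_workbook also accepts a file_contents keyword argument, so xlrd.open_workbook(file_contents=xlfile) may avoid the intermediate file altogether; that call was not part of the test noted above.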