Invalid index to scalar variable when enumerating a spaCy array
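I am trying to run the following sentence compression project locally: https://github.com/zhaohengyang/Generate-Parallel-Data-for-Sentence-Compression
So I copied the files and installed all the dependencies with conda. I made a few minor modifications, such as reading the data from a URL instead of the local disk, and bundling his parallel_data_gen.py into my single .py file.
However, when I run it I get:
Spacy library couldn't parse sentence into a tree. Please ignore this sentence pair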
----------------59-------------------
reducing sentence: This year the Venezuelan government plans to continue its pace of land expropriations in order to move towards what it terms ``agrarian socialism''.
reducing headline: Venezuelan government to continue pace of land expropriations for ``agrarian socialism''
Traceback (most recent call last):
  File "/home/user/dev/projects/python-snippets/zhaohengyang/sentence-compression.py", line 701, in <module>
    reduce_sentence(sample)
  File "/home/user/dev/projects/python-snippets/zhaohengyang/sentence-compression.py", line 641, in reduce_sentence
    sentence_info = parse_info(sentence)
  File "/home/user/dev/projects/python-snippets/zhaohengyang/sentence-compression.py", line 616, in parse_info
    heads = [index + item[0] for index, item in enumerate(doc.to_array([HEAD]))]
IndexError: invalid index to scalar variable.
I am not sure how to fix this, as I am a novice Python user.
Here is the full code I run to reproduce the problem: https://gist.github.com/avidanyum/3edfbc96ea22807445ab5307830d41db
The inner snippet that fails:
def parse_info(sentence):
    doc = nlp(sentence)
    heads = [index + item[0] for index, item in enumerate(doc.to_array([HEAD]))]
And this is how I load nlp:
import spacy
print('if you didnt run: python -m spacy download en')
import spacy.lang.en
nlp = spacy.load('en')
More information about my environment:
/home/user/home/user/dev/anaconda3/envs/pymachine/bin/python --version
Python 2.7.15 :: Anaconda, Inc.
Note that I am running spaCy 2.0 on Python 3.6, and that I only ran a quick test on a sample sentence:
nlp = spacy.load('en_core_web_lg')
doc = nlp("Here is a test sentence for me to use.")
I got a couple of errors running your code, all of them on the line you pointed out:
heads = [(index, item) for index, item in enumerate(doc.to_array([HEAD]))]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'HEAD' is not defined
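This happens because the name HEAD is not defined in my session. The to_array call accepts a list of attribute names given as strings (the integer attribute IDs from spacy.attrs also work, once imported), so I passed the string instead. Corrected: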
# Note that HEAD is now a string, rather than a variable
heads = [(index, item) for index, item in enumerate(doc.to_array(['HEAD']))]
heads
[(0, 3), (1, 1), (2, 1), (3, 0), (4, 18446744073709551615), (5, 1), (6, 18446744073709551614), (7, 18446744073709551612)]
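Problem solved. You will also notice that each item yielded by enumerate is an int, that is, a scalar, so it cannot be indexed. Drop the [0] from item[0] and that should fix your problem.
Your function without the error: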
def parse_info(sentence):
    doc = nlp(sentence)
    heads = [index + item for index, item in enumerate(doc.to_array(['HEAD']))]
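One caveat about the output printed earlier: the very large values look like negative head offsets that wrapped around as unsigned 64-bit integers (18446744073709551615 is 2**64 - 1, which would be -1). Assuming that reading is correct, the sketch below converts the raw values to signed offsets before adding the token index. It is plain Python and only uses the numbers copied from the output above; to_signed_64 is a hypothetical helper name, not part of spaCy.

```python
def to_signed_64(value):
    """Reinterpret an unsigned 64-bit integer as a signed two's-complement value."""
    value = int(value)  # also accepts NumPy integer scalars
    return value - (1 << 64) if value >= (1 << 63) else value


# Raw HEAD values copied from the printed output (assumed to be relative offsets).
raw_offsets = [3, 1, 1, 0, 18446744073709551615, 1,
               18446744073709551614, 18446744073709551612]

# Signed offset of each token's head relative to the token itself.
offsets = [to_signed_64(value) for value in raw_offsets]

# Absolute index of each token's head within the sentence.
heads = [index + offset for index, offset in enumerate(offsets)]

print(offsets)  # [3, 1, 1, 0, -1, 1, -2, -4]
print(heads)    # [3, 2, 3, 3, 3, 6, 4, 3]
```

Inside parse_info the same conversion could be applied to each item before adding index, provided the values really are wrapped offsets in your spaCy version.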