nltk ngrams >> TypeError: '_io.TextIOWrapper' object is not callable

Question

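I've been reading for hours, and it seems that every time I fix one error, running the script just produces another.

I'm trying to use nltk to generate various ngrams (unigrams / bigrams / trigrams ...) from the words found in a csv (sample attached).

Sorry, this is probably really simple. That said, any help would be much appreciated!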

import re
import os
import csv
from collections import Counter
from nltk.util import ngrams
from nltk import word_tokenize
import nltk
nltk.download('punkt')


cwd = os.getcwd()

ngrams = open(os.path.join(cwd, "combined.csv"),
              "r", encoding="utf8")

with ngrams as f:
    reader = csv.DictReader(f, delimiter=',')
    keywords = [item['Keyword'] for item in reader]
    string = " ".join(keywords)
    # token = nltk.word_tokenize(string)
    unigrams = ngrams(string, 1)
    bigrams = ngrams(string, 2)
    trigrams = ngrams(string, 3)

    print(trigrams)

Error

  File "ngram.py", line 27, in <module>
    unigrams = ngrams(string, 1)
TypeError: '_io.TextIOWrapper' object is not callable

combined.csv >>

Keyword
'k cups',
'k cup coffee',
'keurig coffee pods',
'coffee pods',
'keurig not dispensing water',
'keurig not pumping water',
'how long do k cups last',
'keurig won t pump water',
'keurig troubleshooting',
'cheap k cups',
'folgers commercial',
'tea k cups',
'keurig water not coming out',

Answer 1

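Your mistake is that you have overloaded the name ngrams. (You are using it both for the file and for the nltk function.) Once ngrams is assigned the result of open(), the name no longer refers to the function imported from nltk.util, so calling ngrams(string, 1) tries to call the file object.

A possible fix: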

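with open(os.path.join(cwd, "combined.csv"),
          "r", encoding="utf8") as ngrams_file:
    reader = csv.DictReader(ngrams_file, delimiter=',')
    keywords = [item['Keyword'] for item in reader]
    string = " ".join(keywords)
    # token = nltk.word_tokenize(string)
    # Note: passing a str yields character n-grams; pass a list of
    # tokens instead to get word n-grams.
    unigrams = ngrams(string, 1)
    bigrams = ngrams(string, 2)
    trigrams = ngrams(string, 3)

# ngrams() returns a generator, so materialize it before printing
print(list(trigrams))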
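
One more thing to watch once the name clash is gone: nltk.util.ngrams works on any sequence, so handing it a plain string gives you n-grams of characters, not words. Below is a minimal sketch of word-level counting. It assumes simple whitespace tokenization, uses an inline stand-in for combined.csv with a single Keyword column, and processes each keyword separately so no n-gram spans two keywords. count_ngrams is a hypothetical helper, and the fallback ngrams function is only a hypothetical stand-in used when NLTK is not installed.

```python
import csv
import io
from collections import Counter

try:
    from nltk.util import ngrams  # keep this name free for the NLTK function
except ImportError:
    # Hypothetical stand-in so the sketch still runs without NLTK installed.
    def ngrams(sequence, n):
        tokens = list(sequence)
        return zip(*(tokens[i:] for i in range(n)))

# Inline stand-in for combined.csv (assumed layout: one "Keyword" column).
sample_csv = io.StringIO(
    "Keyword\n"
    "k cups\n"
    "k cup coffee\n"
    "keurig coffee pods\n"
    "coffee pods\n"
)

keywords = [row["Keyword"] for row in csv.DictReader(sample_csv)]


def count_ngrams(phrases, n):
    """Count word-level n-grams, never spanning two phrases."""
    counts = Counter()
    for phrase in phrases:
        tokens = phrase.lower().split()  # simple whitespace tokenization
        counts.update(ngrams(tokens, n))
    return counts


unigram_counts = count_ngrams(keywords, 1)
bigram_counts = count_ngrams(keywords, 2)
trigram_counts = count_ngrams(keywords, 3)

print(bigram_counts.most_common(3))
```

If you need proper tokenization (punctuation, contractions), swapping phrase.lower().split() for nltk.word_tokenize(phrase) should work the same way, since ngrams only needs a sequence of tokens.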

Answer 2

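You have a name clash between the NLTK function and the file descriptor (the file object returned by open). You need to either rename the descriptor or rewrite the with construct: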

with open(os.path.join(cwd, "combined.csv"), "r", encoding="utf8") as f:
    # your operations and ngrams method here
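
To see why the clash produces exactly this error, here is a small sketch that reproduces it without NLTK. The ngrams function below is a hypothetical stand-in with the same call shape as nltk.util.ngrams, and the temporary file is only there so that open() returns a real text-file object.

```python
import os
import tempfile


def ngrams(sequence, n):
    """Hypothetical stand-in for nltk.util.ngrams (same call shape)."""
    items = list(sequence)
    return zip(*(items[i:] for i in range(n)))


# Create a small throwaway file so open() returns a real text-file object.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, encoding="utf8"
) as tmp:
    tmp.write("Keyword\nk cups\n")
    path = tmp.name

original_ngrams = ngrams  # keep a reference to the function
ngrams = open(path, "r", encoding="utf8")  # the name now refers to a file

try:
    ngrams("k cups".split(), 2)  # this calls the file object, not the function
except TypeError as exc:
    error_message = str(exc)
finally:
    ngrams.close()
    os.remove(path)

ngrams = original_ngrams  # restore the function under its own name
print(error_message)  # '_io.TextIOWrapper' object is not callable
print(list(ngrams("k cups".split(), 2)))  # [('k', 'cups')]
```

A name in Python refers to whatever was assigned to it last, so the import on top of the script does not protect ngrams from being rebound further down.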

python

nltk

n-gram