NLTK and SCIPY are not executing my python script

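Hello, I have installed the scipy and NLTK packages to explore machine learning with Python.

I have installed all the necessary dependencies and followed the installation procedures given on the NLTK and scipy websites.

My script runs fine in the Python shell, but when I save it as a Python script and execute it in the terminal, I get the following output: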

puneet@puneet-HP-Pavilion-g6-Notebook-PC:~/Desktop$ python token.py 
Traceback (most recent call last):
  File "token.py", line 4, in <module>
    import nltk 
  File "/usr/local/lib/python2.7/dist-packages/nltk/__init__.py", line 103, in <module>
    from nltk.collocations import *
  File "/usr/local/lib/python2.7/dist-packages/nltk/collocations.py", line 38, in <module>
    from nltk.util import ngrams
  File "/usr/local/lib/python2.7/dist-packages/nltk/util.py", line 13, in <module>
    import pydoc
  File "/usr/lib/python2.7/pydoc.py", line 55, in <module>
    import sys, imp, os, re, types, inspect, __builtin__, pkgutil, warnings
  File "/usr/lib/python2.7/inspect.py", line 39, in <module>
    import tokenize
  File "/usr/lib/python2.7/tokenize.py", line 31, in <module>
    from token import *
  File "/home/puneet/Desktop/token.py", line 6, in <module>
    from sklearn.feature_extraction.text import CountVectorizer
ImportError: No module named sklearn.feature_extraction.text

Here is my script:

import nltk 
import scipy
from sklearn.feature_extraction.text import CountVectorizer
#vectorizer = CountVectorizer()

train_set = ("The sky is blue.", "The sun is bright.")
test_set = ("The sun in the sky is bright.",
    "We can see the shining sun, the bright sun.")

print vectorizer

cv = sklearn.feature_extraction.text.CountVectorizer(vocabulary=['sun', 'blue', 'bright'])
cv.fit_transform(['The sun in the sky is bright.', 'We can see the shining sun, the bright sun.', 'The sun is bright.', 'nine days old']).toarray()

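Can someone suggest a workaround so I can execute my script? I think this has something to do with the PYTHONPATH variable... I think! :D But I am unable to modify it! Please help!!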

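Rename your file to something else, e.g. something.py. The Python 2.7 standard library already contains a file named token.py (/usr/lib/python2.7/token.py), so there is a name conflict: as your traceback shows, when tokenize.py runs "from token import *" it picks up /home/puneet/Desktop/token.py instead of the standard-library module.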
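A quick way to check for this kind of clash before running a script is sketched below. The helper name shadows_stdlib is made up for illustration, and the fallback branch assumes the standard library lives in the same directory as os.py, which holds for a typical CPython install but is not guaranteed everywhere.

```python
import os
import sys


def shadows_stdlib(script_path):
    """Return True if the script's module name matches a standard-library module.

    Hypothetical helper, not part of any library.
    """
    name = os.path.splitext(os.path.basename(script_path))[0]

    # Python 3.10+ exposes the set of standard-library module names directly.
    stdlib_names = getattr(sys, "stdlib_module_names", None)
    if stdlib_names is not None:
        return name in stdlib_names

    # Fallback for older interpreters (assumption: the stdlib sits next to os.py).
    stdlib_dir = os.path.dirname(os.__file__)
    return (os.path.isfile(os.path.join(stdlib_dir, name + ".py"))
            or os.path.isdir(os.path.join(stdlib_dir, name)))


print(shadows_stdlib("token.py"))    # the clashing name from the question
print(shadows_stdlib("my_test.py"))  # a name that does not clash
```

The clash only bites when the script is run from a file because Python puts the script's own directory at the front of sys.path, ahead of the standard library.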

If you get the following error:

NameError: name 'N_TOKENS' is not defined

then log in as the root user and change line 30 of the file tokenize.py:
from token import * --> from token2 import *

You also need to rename the file (/usr/lib/python2.7/token.py):

token.py --> token2.py

From the traceback, it looks like you do not have sklearn installed.

You can install it with:

sudo apt-get install python-sklearn
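To confirm whether the interpreter you are running can actually see sklearn, and from where, something like the following sketch may help. The helper name module_location is hypothetical; it only reports the file a module was loaded from, or None when the import fails.

```python
import importlib


def module_location(name):
    """Return the file a module is loaded from, or None if it cannot be imported.

    Hypothetical helper for illustration. Built-in modules without a
    __file__ attribute also yield None.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__file__", None)


for name in ("nltk", "scipy", "sklearn"):
    print("%s -> %s" % (name, module_location(name)))
```

Run it with the same python command you use for your script, since a package installed for one interpreter is not visible to another.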

I just modified your script, adding and removing a few lines.

import nltk 
import scipy
import sklearn         #added this line
from sklearn.feature_extraction.text import CountVectorizer

train_set = ("The sky is blue.", "The sun is bright.")
test_set = ("The sun in the sky is bright.",
    "We can see the shining sun, the bright sun.")
#removed the line which was present
cv = sklearn.feature_extraction.text.CountVectorizer(vocabulary=['sun', 'blue', 'bright'])
output = cv.fit_transform(['The sun in the sky is bright.', 'We can see the shining sun, the bright sun.', 'The sun is bright.', 'nine days old']).toarray()
print output

It runs fine, without any errors.

Output:

manjunath@manjunath-virtual-michine:~/Desktop $ python my_test.py
[[1,0,1]
 [2,0,1]
 [1,0,1]
 [0,0,0]]
[[1,0,1]
 [2,0,1]
 [1,0,1]
 [0,0,0]]
manjunath@manjunath-virtual-michine:~/Desktop $
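To see where the numbers in that matrix come from, here is a rough pure-Python sketch of counting against a fixed vocabulary. It is not sklearn's implementation; it only approximates CountVectorizer's default behaviour (lowercasing, and treating runs of two or more word characters as tokens). The function name count_matrix is made up for illustration.

```python
import re

# Approximation of CountVectorizer's default tokenization:
# tokens are runs of two or more word characters.
TOKEN_RE = re.compile(r"\b\w\w+\b")


def count_matrix(docs, vocabulary):
    """Count how often each vocabulary term occurs in each document."""
    rows = []
    for doc in docs:
        tokens = TOKEN_RE.findall(doc.lower())
        rows.append([tokens.count(term) for term in vocabulary])
    return rows


docs = ['The sun in the sky is bright.',
        'We can see the shining sun, the bright sun.',
        'The sun is bright.',
        'nine days old']

for row in count_matrix(docs, ['sun', 'blue', 'bright']):
    print(row)
```

Each row corresponds to one document and each column to one vocabulary term, in the order sun, blue, bright. The last row is all zeros because 'nine days old' contains none of the three terms.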