Semantic Intelligence with Python

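I am trying to classify words into scores. For now the scoring is simple: I just want to classify each word as -1, 0, or 1 and sum the scores at the end. The classification is based on the emotional connotation of the word, so positive words like "great, awesome, excellent" get a score of +1, negative words like "bad, ill, not" get a score of -1, and neutral words get 0. For example, text = "I feel bad" would be pushed through a table, DB, or library in which the words are pre-classified, and summed as "I(0) + feel(0) + bad(-1) = -1".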
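The scoring described above can be sketched in a few lines. This is only a minimal illustration: the `LEXICON` dict and the `score_text` helper are hypothetical names, the word list is just the handful of examples from this post, and the regex tokenizer is an assumption about how words should be split.

```python
import re

# Hypothetical hand-made lexicon; a real one would hold far more words.
LEXICON = {
    "great": 1, "awesome": 1, "excellent": 1,
    "bad": -1, "ill": -1, "not": -1,
}

def score_text(text, lexicon=LEXICON):
    """Sum the per-word scores; words missing from the lexicon count as 0."""
    # Lowercase first so "Bad" and "bad" hit the same lexicon entry.
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)

print(score_text("I feel bad"))  # I(0) + feel(0) + bad(-1) = -1
```

A plain dict is probably enough for quick tests, since lookups are constant time and neutral words need no entry at all.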

I have made a start. As an example, I stripped the HTML markup from a website using the BeautifulSoup and urllib libraries (code below):

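from urllib.request import urlopen  # Python 3; in Python 2 this was urllib.urlopen
from bs4 import BeautifulSoup

url = "http://www.greenovergrey.com/living-walls/what-are-living-walls.php"
html = urlopen(url).read()
soup = BeautifulSoup(html, "html.parser")  # name the parser explicitly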

# kill all script and style elements
for script in soup(["script", "style"]):
    script.extract()    # rip it out

# get text
text = soup.get_text()

# break into lines and remove leading and trailing space on each
lines = (line.strip() for line in text.splitlines())
# break multi-headlines into a line each
chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
# drop blank lines
text = '\n'.join(chunk for chunk in chunks if chunk)

print(text)

Output:

What are Living Walls? Definition of Green Wall and Vertical Garden
GREEN OVER CREY
Overview
/
What are living walls
/
Our green wall system vs. modular boxes
What are living walls
L iving walls or green walls are self sufficient vertical gardens that are attached to the exterior or interior of a building. They differ from green façades (e.g. ivy walls) in that the plants root in a structural support which is fastened to the wall itself. The plants receive water and nutrients from within the vertical support instead of from the ground.
The Green over Grey™ living wall system is different than others on the market today. It closely mimics nature and allows plants to grow to their full potential, without limitations. It is also by far the lightest.
Diversity is the key and by utilizing hundreds of different types of plants we create striking patterns and unique designs. We achieve this by utilizing the multitude of colours, textures and sizes that nature provides. Our system accommodates flowering perennials, beautiful foliage plants, ground covers and even allows for bushes, shrubs, and small trees!
Living walls are also referred to as green walls, vertical gardens or in French, mur végétal. The French botanist and artist Patrick Blanc was a pioneer by creating
the first vertical garden over 30 years ago.
Our system
consists of a frame, waterproof panels, an automatic irrigation system, special materials, lights when needed and of course plants. The frame is built in front of a pre existing wall and attached at various points; there is no damage done to the building. Waterproof panels are mounted to the frame; these are rigid and provide structural support. There is a layer of air between the building and the panels which enables the building to breath. This adds beneficial insulating properties and acts like rain-screening to protect the building envelop.
Our green walls are low maintenance thanks to an automatic irrigation system

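My question is: what is the best way to run this string through a table or a pre-classified word list? Does anyone know of any existing sentiment-based pre-classified word lists? And how could I create a small table or database to test this really quickly?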

Thanks in advance, everyone. Rusty

I don't know how to flag this question as a duplicate, but a quick Google search turned up this result.

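The first answer there looks promising. I followed the link, and it only asks for some information before giving access to the file. I think its format is easy to parse.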
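I have not checked the actual layout of that file, so the following is only a sketch under an assumed format: one `word<TAB>score` entry per line, with blank lines and `#` comments ignored. The `load_lexicon` name and the sample data are hypothetical.

```python
import io

def load_lexicon(lines):
    """Build a {word: score} dict from lines shaped like 'word<TAB>score'."""
    lexicon = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        word, _, score = line.partition("\t")
        try:
            lexicon[word.lower()] = int(score)
        except ValueError:
            continue  # skip lines whose score is missing or not an integer
    return lexicon

# Stand-in for a real file; the tab-separated layout is an assumption.
sample = io.StringIO("great\t1\nbad\t-1\nwall\t0\n")
print(load_lexicon(sample))
```

With a real file, the same function should work on an open file handle, for example `with open(path, encoding="utf-8") as fh: lexicon = load_lexicon(fh)`. The parsing line would need adjusting to whatever format the download really uses.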

If you want such a table, you can find a list of lexicons of this kind here: http://mpqa.cs.pitt.edu/lexicons/effect_lexicon/

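You can load that list into a dictionary and run the algorithm you described. However, if you are looking for quick results, I recommend the textblob library. It is very easy to use and has a lot of features. It is a very good starting point for a project like the one you seem to be beginning.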
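As a rough idea of what the textblob route looks like: `TextBlob(text).sentiment.polarity` returns a float between -1.0 and 1.0 for the whole text. Note that this is not the same as summing per-word scores. The `polarity` and `to_label` helpers below are hypothetical names, and mapping the float onto -1 / 0 / +1 by sign is just one possible choice.

```python
def polarity(text):
    """Return TextBlob's polarity for text, a float in [-1.0, 1.0]."""
    # Imported here so the sketch loads even where textblob is not installed.
    from textblob import TextBlob  # third-party: pip install textblob
    return TextBlob(text).sentiment.polarity

def to_label(value):
    """Map a continuous polarity onto the -1 / 0 / +1 scheme from the question."""
    if value > 0:
        return 1
    if value < 0:
        return -1
    return 0
```

With textblob installed, `to_label(polarity("I feel bad"))` gives a label for the whole sentence. A threshold around zero (treating small values as neutral) may suit some texts better than the plain sign test.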