How to find similarity between an entity in the input and an entity in the database

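I am trying to create a chatbot with Rasa NLU to help search for hotels. I created a small SQLite database that contains the names and other details of a few restaurants. This is the structure of my database: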

Name            Cuisine    Price   ambience location    rating
Flower Drum     chinese    high    2        south       5
Little Italy    italian    high    2        south       2
Quattro         mexican    low     2        center      3
Domino's Pizza  fast food  mid     0        east        3
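
To reproduce this setup, the table could be built roughly as follows. This is a minimal sketch: the table name hotels and the file name hotels.sqlite come from the query code in this question, while the column types and the helper name create_hotels_db are assumptions made for illustration.

```python
import sqlite3

# Rows copied from the table above.
ROWS = [
    ("Flower Drum", "chinese", "high", 2, "south", 5),
    ("Little Italy", "italian", "high", 2, "south", 2),
    ("Quattro", "mexican", "low", 2, "center", 3),
    ("Domino's Pizza", "fast food", "mid", 0, "east", 3),
]


def create_hotels_db(path="hotels.sqlite"):
    """Create the hotels table at `path` and fill it with the sample rows.

    Hypothetical helper; the column types are guessed from the data.
    Note that any existing hotels table at `path` is dropped first.
    """
    conn = sqlite3.connect(path)
    conn.execute("DROP TABLE IF EXISTS hotels")
    conn.execute(
        "CREATE TABLE hotels ("
        "name TEXT, cuisine TEXT, price TEXT, "
        "ambience INTEGER, location TEXT, rating INTEGER)"
    )
    conn.executemany("INSERT INTO hotels VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    conn.commit()
    return conn
```

Passing ":memory:" as the path keeps the database in memory, which is convenient for quick experiments.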

I trained the interpreter with some custom intents like this:

## intent:hotel_search
- I'm looking for a [Mexican](cuisine) restaurant in the [North](location) of town
- Which is the [best](rating) restaurant in the city
- Which restaurant has the most [rating](rating) in the city
- I am looking for a [burger](dish) joint in the [south](location) of the city
- I am trying to find an [expensive](price) [Indian](cuisine) restaurant in the [east](location) of the city

This is the code that trains the interpreter:

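# Imports assume the legacy rasa_nlu package
from rasa_nlu import config
from rasa_nlu.model import Trainer
from rasa_nlu.training_data import load_data

def train(data, config_file, model_dir):
    # Load the training examples and build a trainer from the pipeline config
    training_data = load_data(data)
    trainer = Trainer(config.load(config_file))
    trainer.train(training_data)
    # Save the trained model under model_dir with the fixed name 'chat'
    model_directory = trainer.persist(model_dir, fixed_model_name = 'chat')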

This is the code that looks up hotels in the SQLite database:

import sqlite3

def find_hotels(params):
    # Create the base query
    query = 'SELECT * FROM hotels'
    # Add filter clauses for each of the parameters
    # (column names are interpolated directly, so the keys must be trusted)
    if len(params) > 0:
        filters = ["{}=?".format(k) for k in params]
        query += " WHERE " + " and ".join(filters)
    # Create the tuple of values
    t = tuple(params.values())

    # Open connection to DB
    conn = sqlite3.connect('hotels.sqlite')
    try:
        # Create a cursor
        c = conn.cursor()
        # Execute the query
        c.execute(query, t)
        # Return the results
        return c.fetchall()
    finally:
        # Close the connection even if the query fails
        conn.close()

This is the code that responds to an input message:

# Define respond()
def respond(message):
    # responses
    responses = ["I'm sorry :( I couldn't find anything like that",
                 '{} is a great hotel!',
                 '{} or {} would work!',
                 '{} is one option, but I know others too :)']

    # Extract the entities
    entities = interpreter.parse(message)["entities"]
    # Initialize an empty params dictionary
    params = {}
    # Fill the dictionary with entities
    for ent in entities:
        params[ent["entity"]] = str(ent["value"])

    print("\n\nparams: {}\n\n".format(params))
    # Find hotels that match the dictionary
    results = find_hotels(params)
    print("\n\nresults: {}\n\n".format(results))
    # Get the names of the hotels and index of the response
    names = [r[0] for r in results]
    n = min(len(results),3)
    # Select the nth element of the responses array
    return responses[n].format(*names)

But when I test the interpreter with this example:

I am looking for an expensive Chinese restaurant in the south of the city

This is the result I get:

params: {'price': 'expensive', 'cuisine': 'chinese', 'location': 'south'}

results: []

I'm sorry :( I couldn't find anything like that

If I remove the word expensive from the input question, I get a correct result, for example:

I am looking for a Chinese restaurant in the south of the city

params: {'cuisine': 'chinese', 'location': 'south'}

results: [('Flower Drum', 'chinese', 'high', 2, 'south', 5)]

Flower Drum is a great hotel!

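The bot recognizes all the entities, but it cannot select the correct data from the database, because no row has the value expensive in the price column. How do I train the bot to recognize the word expensive as high?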

You can add synonyms in nlu.md. Add this to your file, and 'expensive' will be mapped to high:

## synonym:high
- expensive
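
The synonym only takes effect after the model is retrained, and as far as I know it also needs the synonym mapper component (ner_synonyms in the legacy rasa_nlu pipelines) to be part of the pipeline. As a complement, the extracted values can also be normalized in Python before they reach find_hotels. The following is a minimal sketch of that idea, not part of Rasa: VALUE_SYNONYMS and normalize_params are hypothetical names, and only the expensive to high mapping comes from the question; the other entries are illustrative guesses.

```python
# Hypothetical lookup table: maps words a user might type to the values
# actually stored in the database, per column. Only expensive -> high is
# taken from the question; the remaining entries are illustrative.
VALUE_SYNONYMS = {
    "price": {
        "expensive": "high",
        "pricey": "high",
        "cheap": "low",
        "moderate": "mid",
    },
}


def normalize_params(params):
    """Return a copy of params with known synonyms replaced by database values.

    Values without a known synonym are passed through unchanged.
    """
    normalized = {}
    for column, value in params.items():
        synonyms = VALUE_SYNONYMS.get(column, {})
        # Look up case-insensitively, but keep the original value on a miss.
        normalized[column] = synonyms.get(str(value).lower(), value)
    return normalized
```

In respond(), this would be called as find_hotels(normalize_params(params)), so the failing example above would query for price high instead of expensive.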