Python Elasticsearch: using responses from search_exists

I'm trying to take a list of URLs from a text file and check whether each one is already stored in Elasticsearch. Here is the code:

import fileinput
import sys
import urllib2
import os
from urlparse import urlparse
from elasticsearch import Elasticsearch

es = Elasticsearch()

# Rewrite the file in place, keeping only lines longer than 4 characters
for line_number, line in enumerate(fileinput.input('bangersandmash_items.csv', inplace=1)):
    if len(line) > 4:
        sys.stdout.write(line)


#open file to load URLs

with open('bangersandmash_items.csv') as urls:
    for line in urls:

        #strip out http:// as this seems to cause elasticsearch to return no results

        url = line.rstrip()
        prefix = 'http://'
        if url.startswith(prefix):
            url = url[len(prefix):]

        #query elasticsearch to see if url already exists in library's 'link' field

        response = es.search_exists(index="websearch", doc_type="site", body={"query": {"match_phrase": {"link": url}}}, ignore=[400, 404])
        print url
        print response

        #Is url in library?

        if response == "{u'exists': true}":
            print url
            print "bingo!"
        else:
            print url
            print "nuthin."

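It prints out the URL as formatted by lines 19-22, but it doesn't seem to handle the error codes. Lines 25 and 26 (the print statements) print the URL and the Elasticsearch response. Lines 28-33 (the if/else block) don't seem to act on this information correctly. Any ideas on what I'm doing wrong here?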

Figured it out. I had to adjust the if/else statement so that the Elasticsearch response is read as a string taken from the dictionary:

state = str(response['exists'])
if state == 'True':
    print url
    print "bingo!"
# [etc].
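
For what it's worth, the string conversion can probably be skipped. The sketch below assumes the response is a plain dict like the one printed in the question, so the original comparison against the string "{u'exists': true}" could never match, while the boolean under 'exists' can be tested directly. The helper name url_exists and the three sample responses are made up for illustration (the error body in particular is only a guess at what ignore=[400, 404] might hand back), so no Elasticsearch server is needed to run it:

```python
def url_exists(response):
    """Return True only if the response dict reports a match."""
    # Assumption: with ignore=[400, 404] an error body may come back instead
    # of an exception, so the 'exists' key might be missing; treat that as
    # "not found" rather than raising KeyError.
    return isinstance(response, dict) and response.get('exists') is True


# Hypothetical responses standing in for what es.search_exists might return
found = {u'exists': True}
missing = {u'exists': False}
error_body = {u'error': u'IndexMissingException', u'status': 404}

# A dict never compares equal to a string, which is why the original test failed
string_comparison = (found == "{u'exists': true}")

for response in (found, missing, error_body):
    print("bingo!" if url_exists(response) else "nuthin.")
```

Using .get() rather than response['exists'] keeps the loop running if a response comes back without that key.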