Python xpath: try an xpath, except fill in a given value

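I am scraping reviews from a website. In the end I need several lists (e.g. usernames and dates), which are combined into one dictionary per review, so that the result looks like this: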

reviews = [{'username': 'Harry', 'date': 'april'},
           {'username': 'Rob', 'date': 'may'}]

The lists must all be the same length, because I combine them into dictionaries like this:

reviews = []

for i in range(len(username)):
    reviews.append({'username': username[i].strip(),
                    'date': date[i].strip()})

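However, when a review has no username, the XPath returns nothing for it and my list ends up too short (which raises the error "list index out of range"). How can I fill in a given value (e.g. "no name") when the XPath matches nothing? I have tried something like this (which I thought would work, but doesn't):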

try:
    names = tree.xpath..
except:
    "no name"

Edit: example HTML for the two review types (mobile vs. non-mobile). Mobile review:

<div class="rating reviewItemInline">
  <span class="ui_bubble_rating bubble_50"></span>
  <span class="ratingDate relativeDate">Reviewed 6 days ago</span>
  <a class="viaMobile">via mobile</a>
</div>

Non-mobile review:

<div class="rating reviewItemInline">
  <span class="ui_bubble_rating bubble_50"></span>
  <span class="ratingDate relativeDate">Reviewed 6 days ago</span>
</div>

You have to iterate over the review elements and, for each one, run the XPath for each field you need, for example:

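# tree_html is assumed to be the already-parsed page (e.g. from lxml.html.fromstring)
review_elems = tree_html.xpath('//div[@class="rating reviewItemInline"]')

reviews = []

for review_elem in review_elems:
    review = {}
    # the leading "." keeps the search inside this review element
    username = review_elem.xpath('.//a[@class="viaMobile"]')
    if username:
        review['username'] = username[0].text
    else:
        # xpath() returned an empty list, so use the default value
        review['username'] = 'no name'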

    # keep filling review with more fields
    reviews.append(review)

print(reviews)
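
To make the per-review loop above concrete, here is a minimal runnable sketch using the two sample reviews from the question. The wrapping `<div id="reviews">`, the `SAMPLE_HTML` name, the `first_text` helper, and the `'no date'` default are assumptions added for illustration; they are not part of the original page or of lxml.

```python
import lxml.html

# Sample markup taken from the question: one mobile and one non-mobile review.
# The outer <div id="reviews"> is an assumption so the snippet parses as one tree.
SAMPLE_HTML = """
<div id="reviews">
  <div class="rating reviewItemInline">
    <span class="ui_bubble_rating bubble_50"></span>
    <span class="ratingDate relativeDate">Reviewed 6 days ago</span>
    <a class="viaMobile">via mobile</a>
  </div>
  <div class="rating reviewItemInline">
    <span class="ui_bubble_rating bubble_50"></span>
    <span class="ratingDate relativeDate">Reviewed 6 days ago</span>
  </div>
</div>
"""


def first_text(elem, path, default):
    """Hypothetical helper: stripped text of the first XPath match, else `default`."""
    matches = elem.xpath(path)
    if matches and matches[0].text:
        return matches[0].text.strip()
    return default


tree_html = lxml.html.fromstring(SAMPLE_HTML)

reviews = []
for review_elem in tree_html.xpath('//div[@class="rating reviewItemInline"]'):
    # One dictionary per review, so the fields can never get out of step.
    reviews.append({
        'date': first_text(review_elem, './span[@class="ratingDate relativeDate"]', 'no date'),
        'via mobile': first_text(review_elem, './a[@class="viaMobile"]', 'no'),
    })

print(reviews)
```

Because every field is looked up relative to its own review element, a missing element only affects that one dictionary, and there are no parallel lists that could end up with different lengths.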

There is no need for try/except. Just build two lists that each contain one entry per review, like this:

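import lxml.html

html = lxml.html.fromstring("source code here")  # replace with the page's HTML
reviews = html.xpath('//div[@class="rating reviewItemInline"]')
# assumes every review has a date span, so indexing [0] is safe here
dates = [i.xpath('./span[@class="ratingDate relativeDate"]')[0].text for i in reviews]
# the "via mobile" link is optional, so fall back to "no" when it is missing
mobile = [i.xpath('./a')[0].text if i.xpath('./a') else "no" for i in reviews]
output = [{'date': i, 'via mobile': j} for i, j in zip(dates, mobile)]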

The output should look something like this:

[{'date': 'Reviewed 6 days ago', 'via mobile': 'via mobile'}, {'date': 'Reviewed 6 days ago', 'via mobile': 'no'}]
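
As for why the try/except in the question never fills in the default: when nothing matches, `xpath()` returns an empty list rather than raising, so the `except` branch is not reached, and a bare `"no name"` expression would not assign anything even if it were. The short sketch below shows this on the non-mobile sample; the variable names (`elem`, `matches`, `label`, `label_via_except`) are illustrative assumptions only.

```python
import lxml.html

# The non-mobile review from the question: it has no <a class="viaMobile"> child.
elem = lxml.html.fromstring(
    '<div class="rating reviewItemInline">'
    '<span class="ratingDate relativeDate">Reviewed 6 days ago</span>'
    '</div>'
)

# No match gives an empty list; no exception is raised at this point.
matches = elem.xpath('./a[@class="viaMobile"]')
print(matches)

# Option 1: test the list and fall back to a default.
label = matches[0].text if matches else "no name"

# Option 2: try/except does work if it wraps the indexing,
# catches IndexError specifically, and assigns the default.
try:
    label_via_except = elem.xpath('./a[@class="viaMobile"]')[0].text
except IndexError:
    label_via_except = "no name"

print(label, label_via_except)
```

Either option has to be applied per review element. Running one XPath over the whole page still skips the reviews without a match, which is what makes the lists differ in length.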