python3 xpath can't reach a child node (AttributeError: 'NoneType' object has no attribute 'text')
I need help with a problem I haven't been able to solve on my own.
I have an XML document like this:
<forecast xmlns="http://weather.yandex.ru/forecast" country_id="8996ba26eb0edf7ea5a055dc16c2ccbd" part="Лен Стокгольм" link="http://pogoda.yandex.ru/stockholm/" part_id="53f767b78d8f180c28d55ebda1d07e0c" lat="59.381981" slug="stockholm" city="Стокгольм" climate="1" country="Швеция" region="10519" lon="17.956846" zoom="12" id="2464" source="Station" exactname="Стокгольм" geoid="10519">
  <fact>...</fact>
  <yesterday id="435077826">...</yesterday>
  <informer>...</informer>
  <day date="2016-04-18">
    <sunrise>05:22</sunrise>
    <sunset>20:12</sunset>
    <moon_phase code="growing-moon">14</moon_phase>
    <moonrise>15:53</moonrise>
    <moonset>04:37</moonset>
    <biomet index="3" geomag="2" low_press="1" uv="1">...</biomet>
    <day_part typeid="1" type="morning">...</day_part>
    <day_part typeid="2" type="day">...</day_part>
    <day_part typeid="3" type="evening">...</day_part>
    <day_part typeid="4" type="night">...</day_part>
    <day_part typeid="5" type="day_short">
      <temperature>11</temperature>
    </day_part>
  </day>
</forecast>
(The full XML is available at https://export.yandex.ru/weather-ng/forecasts/2464.xml.) I need to get temperature.text (11), and I tried this code:
import urllib.request
import codecs
import lxml
from xml.etree import ElementTree as ET

def gen_ns(tag):
    if tag.startswith('{'):
        ns, tag = tag.split('}')
        return ns[1:]
    else:
        return ''

with codecs.open(fname, 'r', encoding = 'utf-8') as t:
    town_tree = ET.parse(t)

town_root = town_tree.getroot()
print (town_root)
namespaces = {'ns': gen_ns(town_root.tag)}
print (namespaces)
for day in town_root.iterfind('ns:day', namespaces):
    date = (day.get('date'))
    print (date)
    day_temp = day.find('.//*[@type="day_short"]/temperature')
    print (day_temp.text)
This gives:
Traceback (most recent call last):
  File "weather.py", line 154, in <module>
    print (day_temp.text)
AttributeError: 'NoneType' object has no attribute 'text'
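What is wrong with my XPath? I can read the attributes of the element matched by ('.//*[@type="day_short"]'), but I cannot get the text of its child (temperature).
Thanks, everyone!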
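The XML document declares a default namespace, and XPath has no concept of a default namespace. In XPath you need to either map that namespace to a prefix (as you did for day) or use another approach, such as local-name(), to test whether an element's tag name matches the one you want.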
.//*[@type="day_short"]/*[local-name()='temperature']
or
day_temp = day.find('.//*[@type="day_short"]/ns:temperature', namespaces)
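
To make the prefix-mapping approach concrete, here is a minimal, self-contained sketch. It assumes a trimmed-down copy of the forecast document (the extra night value of 3 and the helper name day_short_temps are made up for illustration) and uses only xml.etree.ElementTree, as in the question:

```python
from xml.etree import ElementTree as ET

# Trimmed-down stand-in for the forecast document in the question.
# The "night" temperature of 3 is invented purely for illustration.
XML = """<forecast xmlns="http://weather.yandex.ru/forecast" city="Stockholm">
  <day date="2016-04-18">
    <day_part typeid="4" type="night"><temperature>3</temperature></day_part>
    <day_part typeid="5" type="day_short"><temperature>11</temperature></day_part>
  </day>
</forecast>"""

# Map the document's default namespace to an arbitrary prefix.
NS = {"ns": "http://weather.yandex.ru/forecast"}


def day_short_temps(root):
    """Return {date: temperature text} for each <day>'s day_short part.

    Hypothetical helper; the name is not from the original post.
    """
    result = {}
    for day in root.iterfind("ns:day", NS):
        # The child step needs the prefix too: <temperature> inherits
        # the default namespace from <forecast>.
        node = day.find('.//*[@type="day_short"]/ns:temperature', NS)
        result[day.get("date")] = node.text if node is not None else None
    return result


root = ET.fromstring(XML)
temps = day_short_temps(root)
print(temps)

# The unprefixed path from the question matches nothing, hence the None.
missing = root.find('.//*[@type="day_short"]/temperature')
print(missing)
```

The `node is not None` check avoids the AttributeError from the question when a day has no day_short part. One caveat, as far as I know: the local-name() form relies on XPath functions, which ElementTree's find() does not support, so that variant is meant for a full XPath 1.0 engine such as lxml's xpath() method.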