Parsing forum posts using lxml/python
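When I use the code below, it splits a single div into fifteen items in the resulting list. The problem is that I want the post as a single item. This is probably caused by the <br> tags, but I'm not sure how to fix it.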
from lxml import html
import requests
page = requests.get('http://www.city-data.com/forum/economics/2056372-minimum-wage-vs-liveable-wage.html')
tree = html.fromstring(page.text)
details = tree.xpath('//div[contains(@id, "post_message_33583236")]/text()')
print(len(details))  # prints 15
Use XPath to locate the element itself (not its text nodes), then call the text_content() method on it:
details = tree.xpath('.//div[contains(@id, "post_message_33583236")]')[0]
print(details.text_content())
This prints:
With all the talk about raising the minimum wage, I think the real issue is that people are not getting a liveable wage anymore. This applies to many skilled people too in which their job tries to pay them -13hr for -30hr type of work.
Not everyone deserves a raise at walmart or other low paying jobs. I think everyone should atleast prove themselves for 6 months to year then start to gradually get a raise. You cant act a fool and get paid the same as people who work hard and try to move up in life. Even if walmart workers weren't making minimum wage and making hr, you cant really do much making 22k a year other than live in a cheap/borderline crime infested area
hr gets you about 50 a month after taxes and health coverage at most jobs and ill list just the basic necessities in life
...
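To see why the count differs without hitting the live site, here is a minimal sketch that runs against a small inline HTML string. The markup and the id `post_message_1` are made up for illustration and only assume that the real post separates its paragraphs with <br> tags, as the question suggests.

```python
from lxml import html

# Hypothetical stand-in for the forum markup: one post div whose
# lines are separated by <br> tags.
SAMPLE = (
    '<html><body>'
    '<div id="post_message_1">First line.<br>Second line.<br>Third line.</div>'
    '</body></html>'
)

tree = html.fromstring(SAMPLE)

# /text() selects every text node under the div separately,
# so each <br> starts a new list item.
pieces = tree.xpath('//div[@id="post_message_1"]/text()')
print(len(pieces))

# Selecting the element and calling text_content() returns one string.
post = tree.xpath('//div[@id="post_message_1"]')[0]
whole = post.text_content()
print(whole)

# text_content() inserts nothing where the <br> tags were; if the line
# breaks matter, joining the text nodes is one alternative.
joined = "\n".join(piece.strip() for piece in pieces)
print(joined)
```

Note that text_content() concatenates the text with no separator, so the last variant may be preferable when the line breaks of the post should be kept.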