Python 3.4: lxml: Parsing Tables

I want to parse an entire table from Yahoo Finance. As I understand it, lxml does not register the 'tbody' and 'thead' tags but treats them as additional TRs, so I changed my XPath from:

/html/body/div[4]/div[4]/table[2]/tbody/tr[2]/td/table[2]/tbody/tr/td/table/tbody

to the one used in the code below:

from lxml import html

url = 'http://finance.yahoo.com/q/is?s=MMM+Income+Statement&annual'
tree = html.parse(url)

tick_content = [td.text_content() for td in tree.xpath('/html/body/div[4]/div[4]/table[2]/tr[3]/td/table[2]/tr[1]/td/table/td[1]')]
print(tick_content)

All I get back is a blank screen. Is there a special way to parse a table, or...?

Rather than using the huge, long XPath generated by Chrome, search for the table by its yfnc_tabledata1 class; there is only one:

>>> tree.xpath("//table[@class='yfnc_tabledata1']")
[<Element table at 0x10445e788>]

From there, go to your <td>:

>>> tree.xpath("//table[@class='yfnc_tabledata1']//td[1]")[0].text_content()
'Period EndingDec 31, 2014Dec 31, 2013Dec 31, 2012\n                            \n                        Total Revenue\n                            \n                        \n                                \n                            31,821,000\xa0\xa0\n                                \n                            \n                                \n                            30,871,000\xa0\xa0\n                                \n                            \n                                \n                            29,904,000\xa0\xa0\n                                \n                            Cost of Revenue16,447,000\xa0\xa016,106,000\xa0\xa015,685,000\xa0\xa0\n                            \n                        Gross Profit\n                            \n                        \n                                \n                            15,374,000\xa0\xa0\n                                \n                            \n                                \n                            14,765,000\xa0\xa0\n                                \n                            \n                                \n                            14,219,000\xa0\xa0\n                                \n                            \n                    \n                Operating Expenses\n                    \n                Research Development1,770,000\xa0\xa01,715,000\xa0\xa01,634,000\xa0\xa0\n                    \n                Selling General and Administrative6,469,000\xa0\xa06,384,000\xa0\xa06,102,000\xa0\xa0\n                    \n                Non Recurring\n            -\n            \xa0\n            -\n            \xa0\n            -\n            \xa0\n                    \n                Others\n            -\n            \xa0\n            -\n            \xa0\n            -\n            \xa0\n                    \n                \n                    \n                Total Operating Expenses\n            -\n            \xa0\n            -\n           
 \xa0\n            -\n            \xa0\n                            \n                        Operating Income or Loss\n                            \n                        \n                                \n                            7,135,000\xa0\xa0\n                                \n                            \n                                \n                            6,666,000\xa0\xa0\n                                \n                            \n                                \n                            6,483,000\xa0\xa0\n                                \n                            \n                    \n                Income from Continuing Operations\n                    \n                Total Other Income/Expenses Net33,000\xa0\xa041,000\xa0\xa039,000\xa0\xa0\n                    \n                Earnings Before Interest And Taxes7,168,000\xa0\xa06,707,000\xa0\xa06,522,000\xa0\xa0\n                    \n                Interest Expense142,000\xa0\xa0145,000\xa0\xa0171,000\xa0\xa0\n                    \n                Income Before Tax7,026,000\xa0\xa06,562,000\xa0\xa06,351,000\xa0\xa0\n                    \n                Income Tax Expense2,028,000\xa0\xa01,841,000\xa0\xa01,840,000\xa0\xa0\n                    \n                Minority Interest(42,000)(62,000)(67,000)\n                    \n                \n                    \n                Net Income From Continuing Ops4,956,000\xa0\xa04,659,000\xa0\xa04,444,000\xa0\xa0\n                    \n                Non-recurring Events\n                    \n                Discontinued Operations\n            -\n            \xa0\n            -\n            \xa0\n            -\n            \xa0\n                    \n                Extraordinary Items\n            -\n            \xa0\n            -\n            \xa0\n            -\n            \xa0\n                    \n                Effect Of Accounting Changes\n            -\n            \xa0\n            -\n            \xa0\n    
        -\n            \xa0\n                    \n                Other Items\n            -\n            \xa0\n            -\n            \xa0\n            -\n            \xa0\n                            \n                        Net Income\n                            \n                        \n                                \n                            4,956,000\xa0\xa0\n                                \n                            \n                                \n                            4,659,000\xa0\xa0\n                                \n                            \n                                \n                            4,444,000\xa0\xa0\n                                \n                            Preferred Stock And Other Adjustments\n            -\n            \xa0\n            -\n            \xa0\n            -\n            \xa0\n                            \n                        Net Income Applicable To Common Shares\n                            \n                        \n                                \n                            4,956,000\xa0\xa0\n                                \n                            \n                                \n                            4,659,000\xa0\xa0\n                                \n                            \n                                \n                            4,444,000\xa0\xa0\n                                \n                            '
>>> print(tree.xpath("//table[@class='yfnc_tabledata1']//td[1]")[0].text_content())
Period EndingDec 31, 2014Dec 31, 2013Dec 31, 2012

                        Total Revenue



                            31,821,000  



                            30,871,000  



                            29,904,000  

                            Cost of Revenue16,447,000  16,106,000  15,685,000  

                        Gross Profit



                            15,374,000  



                            14,765,000  



                            14,219,000  



                Operating Expenses

                Research Development1,770,000  1,715,000  1,634,000  

                Selling General and Administrative6,469,000  6,384,000  6,102,000  

                Non Recurring
            -
             
            -
             
            -
             

                Others
            -
             
            -
             
            -
             



                Total Operating Expenses
            -
             
            -
             
            -
             

                        Operating Income or Loss



                            7,135,000  



                            6,666,000  



                            6,483,000  



                Income from Continuing Operations

                Total Other Income/Expenses Net33,000  41,000  39,000  

                Earnings Before Interest And Taxes7,168,000  6,707,000  6,522,000  

                Interest Expense142,000  145,000  171,000  

                Income Before Tax7,026,000  6,562,000  6,351,000  

                Income Tax Expense2,028,000  1,841,000  1,840,000  

                Minority Interest(42,000)(62,000)(67,000)



                Net Income From Continuing Ops4,956,000  4,659,000  4,444,000  

                Non-recurring Events

                Discontinued Operations
            -
             
            -
             
            -
             

                Extraordinary Items
            -
             
            -
             
            -
             

                Effect Of Accounting Changes
            -
             
            -
             
            -
             

                Other Items
            -
             
            -
             
            -
             

                        Net Income



                            4,956,000  



                            4,659,000  



                            4,444,000  

                            Preferred Stock And Other Adjustments
            -
             
            -
             
            -
             

                        Net Income Applicable To Common Shares



                            4,956,000  



                            4,659,000  



                            4,444,000
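
The text_content() output above runs every label and figure together, so splitting the table row by row is usually easier to work with. Below is a minimal sketch of that idea. It assumes the figures sit in a table nested inside the yfnc_tabledata1 table. The SAMPLE markup and the parse_rows helper are hypothetical stand-ins, not the real Yahoo page or an lxml API, and the live page may be laid out differently.

```python
from lxml import html

# Hypothetical, trimmed-down stand-in for the Yahoo Finance markup.
SAMPLE = """
<html><body>
<table class="yfnc_tabledata1"><tr><td>
  <table>
    <tr><td>Period Ending</td><th>Dec 31, 2014</th><th>Dec 31, 2013</th></tr>
    <tr><td>Total Revenue</td><td>31,821,000&nbsp;&nbsp;</td><td>30,871,000&nbsp;&nbsp;</td></tr>
    <tr><td>Gross Profit</td><td>15,374,000&nbsp;&nbsp;</td><td>14,765,000&nbsp;&nbsp;</td></tr>
  </table>
</td></tr></table>
</body></html>
"""


def parse_rows(tree):
    """Return one list of cleaned cell strings per innermost table row."""
    rows = []
    # tr[not(.//tr)] skips wrapper rows that only hold a nested table.
    for tr in tree.xpath("//table[@class='yfnc_tabledata1']//tr[not(.//tr)]"):
        # str.split() with no arguments also splits on the \xa0 padding.
        cells = [" ".join(cell.text_content().split())
                 for cell in tr.xpath("./td | ./th")]
        cells = [c for c in cells if c]  # drop empty spacer cells
        if cells:
            rows.append(cells)
    return rows


tree = html.fromstring(SAMPLE)
rows = parse_rows(tree)
for row in rows:
    print(row)
```

With the live page, the tree returned by html.parse(url) could be passed to parse_rows in the same way, provided the page still uses this class name and nesting.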