Parse part of an HTML page using jsoup

I need to parse the table with class table_stato_dati (http://as777.brt.it/vas/sped_det_show.hsm?referer=sped_numspe_par.htm&Nspediz=031000032043&RicercaNumeroSpedizione=Ricerca):

<table class="table_stato_dati">
                        <caption><label id="diz_386" title="Stati">Stati</label></caption>
                        <tbody>
                            <tr>
                                <th><label id="diz_85" title="Data">Data</label></th>
                                <th><label id="diz_273" title="Ora">Ora</label></th>
                                <th><label id="diz_305" title="Filiale">Filiale</label></th>
                                <th><label id="diz_387" title="Stato">Stato</label></th>
                            </tr>

                            <tr>
                                <td style="white-space: nowrap; width: 1%">26.01.2015</td>
                                <td style="white-space: nowrap; width: 1%">10.42</td>
                                <td style="text-align: left; width: 35%">PORDENONE (069)</td>
                                <td style="text-align: left;">CONSEGNATA</td>
                            </tr>

                            <tr class="riga_pari">
                                <td style="white-space: nowrap; width: 1%">26.01.2015</td>
                                <td style="white-space: nowrap; width: 1%"></td>
                                <td style="text-align: left; width: 35%">PORDENONE (069)</td>
                                <td style="text-align: left;">MESSA IN CONSEGNA</td>
                            </tr>

                            <tr>
                                <td style="white-space: nowrap; width: 1%">23.01.2015</td>
                                <td style="white-space: nowrap; width: 1%">11.29</td>
                                <td style="text-align: left; width: 35%">PORDENONE (069)</td>
                                <td style="text-align: left;">LASCIATO AVVISO</td>
                            </tr>

                            <tr class="riga_pari">
                                <td style="white-space: nowrap; width: 1%">23.01.2015</td>
                                <td style="white-space: nowrap; width: 1%"></td>
                                <td style="text-align: left; width: 35%">PORDENONE (069)</td>
                                <td style="text-align: left;">MESSA IN CONSEGNA</td>
                            </tr>

                            <tr>
                                <td style="white-space: nowrap; width: 1%">23.01.2015</td>
                                <td style="white-space: nowrap; width: 1%">08.36</td>
                                <td style="text-align: left; width: 35%">PORDENONE (069)</td>
                                <td style="text-align: left;">ARRIVATA IN FILIALE</td>
                            </tr>

                            <tr class="riga_pari">
                                <td style="white-space: nowrap; width: 1%">21.01.2015</td>
                                <td style="white-space: nowrap; width: 1%">21.00</td>
                                <td style="text-align: left; width: 35%">CATANIA (031)</td>
                                <td style="text-align: left;">PARTITA</td>
                            </tr>

                            <tr>
                                <td style="white-space: nowrap; width: 1%">21.01.2015</td>
                                <td style="white-space: nowrap; width: 1%">18.00</td>
                                <td style="text-align: left; width: 35%">CATANIA (031)</td>
                                <td style="text-align: left;">RITIRATA</td>
                            </tr>

                        </tbody>
                    </table>

My current code is:

public static void TestParse(String trackingCode) {
            Document doc;
            try {
                // The tracking code is passed in the Nspediz query parameter
                doc = Jsoup.connect("http://as777.brt.it/vas/sped_det_show.hsm?referer=sped_numspe_par.htm&Nspediz=" + trackingCode + "&RicercaNumeroSpedizione=Ricerca").get();

                // Select the status table by its CSS class
                Elements table = doc.select("table.table_stato_dati");
                System.out.print(table);

            } catch (IOException e) {
                e.printStackTrace();
            }
}

How can I do this? Thanks.


I only need the text about my shipment, in an ordered layout. I changed my code to:

Elements table = doc.select("table.table_stato_dati");
            String text = table.text();
            System.out.println(text);

But I can't control how the text is laid out! I want something like this:

17.12.2014  11.35   REGGIO CALABRIA (017)   CONSEGNATA
17.12.2014      REGGIO CALABRIA (017)   MESSA IN CONSEGNA
17.12.2014  07.54   REGGIO CALABRIA (017)   ARRIVATA IN FILIALE
15.12.2014  21.00   BRESCIA (093)   PARTITA
15.12.2014  18.00   BRESCIA (093)   RITIRATA

Not like this:

Stati Data Ora Filiale Stato 17.12.2014 11.35 REGGIO CALABRIA (017) CONSEGNATA 17.12.2014 REGGIO CALABRIA (017) MESSA IN CONSEGNA 17.12.2014 07.54 REGGIO CALABRIA (017) ARRIVATA IN FILIALE 15.12.2014 21.00 BRESCIA (093) PARTITA 15.12.2014 18.00 BRESCIA (093) RITIRATA
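
Select only the rows that contain td cells (this skips the header row, which has only th cells), then print the text of each cell, one row per line:

        Elements trs = doc.select("table.table_stato_dati").select("tr:has(td)");
        for (Element tr : trs) {
            // All data cells of the current row
            Elements tds = tr.select("td");
            for (Element td : tds) {
                System.out.print(td.text() + " ");
            }
            // End the line after the last cell of the row
            System.out.println();
        }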
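
The desired output in the question keeps a gap where the time cell is empty, which a single-space separator loses. A minimal sketch of one way to handle that, assuming tab-separated columns are acceptable and that the cell texts have already been collected (for example from td.text() in the loop over the rows). The class and method names (TrackingRowFormatter, formatRow, formatTable) are hypothetical, and the sample rows are copied from the table in the question:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of jsoup: formats already-extracted cell texts.
class TrackingRowFormatter {

    // Joins the cells of one row with tabs. Empty cells are kept,
    // so a missing time still occupies its own column.
    static String formatRow(List<String> cells) {
        return String.join("\t", cells);
    }

    // Formats every row and puts each one on its own line.
    static String formatTable(List<List<String>> rows) {
        List<String> lines = new ArrayList<>();
        for (List<String> row : rows) {
            lines.add(formatRow(row));
        }
        return String.join("\n", lines);
    }

    public static void main(String[] args) {
        // Sample data from the question's table; in the real program each
        // inner list would be filled with td.text() for the cells of one tr.
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("26.01.2015", "10.42", "PORDENONE (069)", "CONSEGNATA"),
                Arrays.asList("26.01.2015", "", "PORDENONE (069)", "MESSA IN CONSEGNA"));
        System.out.println(formatTable(rows));
    }
}
```

Collecting the cells into a list first, instead of printing them one by one, also avoids the trailing separator at the end of each line.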