How do I loop through divs using jsoup?

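Hi everyone, I am using jsoup in a Java web application in IntelliJ. I am trying to scrape port call event data from a ship tracking website and store the data in a MySQL database.

The event data is organised in divs with the class name table-group, and the values are in another div with the class name table-row.
My problem is that the row divs in every group all share the same class name, and I am trying to loop through each row and push the data to the database. So far I have managed to create a Java class that scrapes the first row.
How can I loop through each row and store those values in my database? Should I create an ArrayList to store the values?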



Here is my scraper class:

public class Scarper {

    private static Document doc;

    public static void main(String[] args) {

        final String url =
                "https://www.myshiptracking.com/ports-arrivals-departures/?mmsi=&pid=277&type=0&time=&pp=20";

        try {

            doc = Jsoup.connect(url).get();
        } catch (IOException e) {
            e.printStackTrace();
        }
        Events();
    }

    public static void Events() {
        Elements elm = doc.select("div.table-group:nth-of-type(2) > .table-row");

        List<String> arrayList = new ArrayList<>();

        for (Element ele : elm) {

            String event = ele.select("div.col:nth-of-type(2)").text();
            String time = ele.select("div.col:nth-of-type(3)").text();
            String port = ele.select("div.col:nth-of-type(4)").text();
            String vessel = ele.select(".td_vesseltype.col").text();
            Event ev = new Event();
            System.out.println(event);
            System.out.println(time);
            System.out.println(port);
            System.out.println(vessel);
        }
    }
}

A sample of the divs I want to scrape:

<div style="box-sizing: border-box;padding: 0px 10px 10px 10px;">
            <div class="cs-table">
                <div class="heading">
                    <div class="col" style="width: 10px"></div>
                    <div class="col" style="width: 110px">Event</div>
                    <div class="col" style="width: 120px">Time (<span class="tooltip" title="My Time: In your current TimeZone">MT</span>)</div>
                    <div class="col" style="width: 150px">Port</div>
                    <div class="col">Vessel</div>
                </div>
                                    <div class="table-group">
                    <div class="table-row">
                        <div class="col"><i class="fa fa-sign-out red"></i></div>
                        <div class="col">Departure</div>
                        <div class="col" style="text-align: center;">2022-02-14 <b>16:51</b></div>
                        <div class="col"><img class="flag_line tooltip" src="/icons/flags2/16/GB.png" title=" United Kingdom"/><a href="/ports/port-of-belfast-in-gb-united-kingdom-id-101">BELFAST</a></div>
                        <div class="col td_vesseltype"><img src="/icons/icon7_511.png"><span class="padding_18"><a href="/vessels/wilson-blyth-mmsi-314544000-imo-9124419">WILSON BLYTH</a> [GB]</span></div>
                    </div>
                </div>
                                    <div class="table-group">
                    <div class="table-row">
                        <div class="col"><i class="fa fa-flag-checkered green"></i></div>
                        <div class="col">Arrival</div>
                        <div class="col" style="text-align: center;">2022-02-14 <b>16:51</b></div>
                        <div class="col"><img class="flag_line tooltip" src="/icons/flags2/16/GB.png" title=" United Kingdom"/><a href="/ports/port-of-hunters-quay-in-gb-united-kingdom-id-218">HUNTERS QUAY</a></div>
                        <div class="col td_vesseltype"><img src="/icons/icon6_511.png"><span class="padding_18"><a href="/vessels/sound-of-soay-mmsi-235101063-imo-9665229">SOUND OF SOAY</a> [GB]</span></div>
                    </div>
                </div>
                                    <div class="table-group">
                    <div class="table-row">
                        <div class="col"><i class="fa fa-sign-out red"></i></div>
                        <div class="col">Departure</div>
                        <div class="col" style="text-align: center;">2022-02-14 <b>16:51</b></div>
                        <div class="col"><img class="flag_line tooltip" src="/icons/flags2/16/GB.png" title=" United Kingdom"/><a href="/ports/port-of-largs-in-gb-united-kingdom-id-1602">LARGS</a></div>
                        <div class="col td_vesseltype"><img src="/icons/icon6_511.png"><span class="padding_18"><a href="/vessels/loch-shira-mmsi-235053239-imo-9376919">LOCH SHIRA</a> [GB]</span></div>
                    </div>
                </div>
                                    <div class="table-group">
                    <div class="table-row">
                        <div class="col"><i class="fa fa-sign-out red"></i></div>
                        <div class="col">Departure</div>
                        <div class="col" style="text-align: center;">2022-02-14 <b>16:51</b></div>
                        <div class="col"><img class="flag_line tooltip" src="/icons/flags2/16/GB.png" title=" United Kingdom"/><a href="/ports/port-of-ryde-in-gb-united-kingdom-id-1629">RYDE</a></div>
                        <div class="col td_vesseltype"><img src="/icons/icon4_511.png"><span class="padding_18"><a href="/vessels/island-flyer-mmsi-235117772-imo-9737797">ISLAND FLYER</a> [GB]</span></div>
                    </div>
                </div>

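You can start by looping over the rows of the table. The selector for the table is .cs-table, so you can get the table with Element table = doc.select(".cs-table").first();. Next, you can get the rows of the table with the selector div.table-row, that is, Elements rows = doc.select("div.table-row");. Now you can loop over all the rows and extract the data from each one. The code should look like this: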

Element table = doc.select(".cs-table").first();
Elements rows = doc.select("div.table-row");
for (Element row : rows) {
    String event = row.select("div.col:nth-of-type(2)").text();
    String time = row.select("div.col:nth-of-type(3)").text();
    String port = row.select("div.col:nth-of-type(4)").text();
    String vessel = row.select(".td_vesseltype.col").text();
    System.out.println(event + "-" + time + " " + port + " " + vessel);
    System.out.println("---------------------------");
    // Do stuff with data here
}
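
To check the selectors without hitting the live site, a small self-contained sketch can parse a trimmed copy of the sample markup from the question. The names RowLoopDemo, SAMPLE_HTML, extractRows and countWithQuestionSelector are made up for illustration, and the markup is shortened (images dropped, links simplified). The sketch also counts what the question's selector matches: in the sample, .heading is the first div child of .cs-table, so div.table-group:nth-of-type(2) matches only the first group, which would explain why only one row was scraped.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class RowLoopDemo {

    // Two rows adapted from the sample markup in the question (shortened for illustration).
    static final String SAMPLE_HTML = String.join("\n",
        "<div class=\"cs-table\">",
        "  <div class=\"heading\">",
        "    <div class=\"col\"></div><div class=\"col\">Event</div>",
        "    <div class=\"col\">Time</div><div class=\"col\">Port</div>",
        "    <div class=\"col\">Vessel</div>",
        "  </div>",
        "  <div class=\"table-group\"><div class=\"table-row\">",
        "    <div class=\"col\"><i class=\"fa fa-sign-out red\"></i></div>",
        "    <div class=\"col\">Departure</div>",
        "    <div class=\"col\">2022-02-14 <b>16:51</b></div>",
        "    <div class=\"col\"><a href=\"#\">BELFAST</a></div>",
        "    <div class=\"col td_vesseltype\"><span><a href=\"#\">WILSON BLYTH</a> [GB]</span></div>",
        "  </div></div>",
        "  <div class=\"table-group\"><div class=\"table-row\">",
        "    <div class=\"col\"><i class=\"fa fa-flag-checkered green\"></i></div>",
        "    <div class=\"col\">Arrival</div>",
        "    <div class=\"col\">2022-02-14 <b>16:51</b></div>",
        "    <div class=\"col\"><a href=\"#\">HUNTERS QUAY</a></div>",
        "    <div class=\"col td_vesseltype\"><span><a href=\"#\">SOUND OF SOAY</a> [GB]</span></div>",
        "  </div></div>",
        "</div>");

    // Returns one String[] {event, time, port, vessel} per div.table-row in the table.
    static List<String[]> extractRows(String html) {
        Document doc = Jsoup.parse(html);
        List<String[]> result = new ArrayList<>();
        for (Element row : doc.select("div.cs-table div.table-row")) {
            result.add(new String[] {
                row.select("div.col:nth-of-type(2)").text(),
                row.select("div.col:nth-of-type(3)").text(),
                row.select("div.col:nth-of-type(4)").text(),
                row.select(".td_vesseltype.col").text()
            });
        }
        return result;
    }

    // Counts the rows matched by the selector used in the question.
    static int countWithQuestionSelector(String html) {
        return Jsoup.parse(html)
                .select("div.table-group:nth-of-type(2) > .table-row")
                .size();
    }

    public static void main(String[] args) {
        System.out.println("question selector rows: " + countWithQuestionSelector(SAMPLE_HTML));
        for (String[] r : extractRows(SAMPLE_HTML)) {
            System.out.println(String.join(" | ", r));
        }
    }
}
```

On this two-row sample the question's selector matches one row, while div.table-row matches both. Scoping the row selector to div.cs-table is optional here; it only guards against other table-row divs elsewhere on a page.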

It is now up to you whether to keep the data from the loop in an array or list for later use, or to insert it directly into the database.
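
Either option can be sketched with plain JDBC. The following is a minimal sketch, assuming a MySQL table named port_events with columns event, event_time, port and vessel. The table, the PortEvent record and the PortEventStore class are hypothetical names, not something from the question. Rows are first collected into a List and then written in one batch with a PreparedStatement. Records need Java 16 or later, and a MySQL JDBC driver must be on the classpath for the database part to run.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.List;

public class PortEventStore {

    // Hypothetical value type holding one scraped row.
    record PortEvent(String event, LocalDateTime time, String port, String vessel) {}

    // Matches time text such as "2022-02-14 16:51" from the sample markup.
    static final DateTimeFormatter TIME_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm");

    // Assumed table and column names; adjust to the real schema.
    static final String INSERT_SQL =
            "INSERT INTO port_events (event, event_time, port, vessel) VALUES (?, ?, ?, ?)";

    // Converts the four strings extracted from a row into a typed PortEvent.
    static PortEvent toEvent(String event, String time, String port, String vessel) {
        return new PortEvent(event, LocalDateTime.parse(time, TIME_FORMAT), port, vessel);
    }

    // Inserts all events in a single batch using a parameterised statement.
    static void saveAll(Connection conn, List<PortEvent> events) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (PortEvent e : events) {
                ps.setString(1, e.event());
                ps.setTimestamp(2, Timestamp.valueOf(e.time()));
                ps.setString(3, e.port());
                ps.setString(4, e.vessel());
                ps.addBatch();
            }
            ps.executeBatch();
        }
    }

    public static void main(String[] args) throws SQLException {
        // In the scraper these values would come from the row loop.
        List<PortEvent> events = List.of(
                toEvent("Departure", "2022-02-14 16:51", "BELFAST", "WILSON BLYTH [GB]"),
                toEvent("Arrival", "2022-02-14 16:51", "HUNTERS QUAY", "SOUND OF SOAY [GB]"));
        System.out.println(events.size() + " events collected");

        // Only touch the database when a JDBC URL, user and password are supplied.
        if (args.length == 3) {
            try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2])) {
                saveAll(conn, events);
            }
        }
    }
}
```

Collecting into a list first keeps the scraping and the database code separate and allows a single batched write; inserting inside the loop is simpler but issues one statement per row.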