Reading a website table from VB.NET

I'm building an application that gets data from a web page: http://files.minecraftforge.net/maven/net/minecraftforge/forge/index_1.7.10.html

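On that page, clicking the "Show all downloads" button reveals a table. I want to read the data from that table, specifically the "Version" column, and add it to a ListView. To do that, I'm using the following code: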

Private Sub files()
    Dim source As String = New Net.WebClient().DownloadString("http://files.minecraftforge.net/maven/net/minecraftforge/forge/index_1.7.10.html")
    Dim recentSource As String = GetTagContents(source, "<table class=""downloadsTable"" id=""downloadsTable"">", "</table>")(0)
    Dim lvi As New ListViewItem
    For Each title As String In GetTagContents(recentSource, "<li>", "</li>")
        If Not title.Contains("http:") Then
            lvi.Text = title
            ListView1.Items.Add(lvi)
        End If
    Next
End Sub

Private Function GetTagContents(ByVal Source As String, ByVal startTag As String, ByVal endTag As String) As List(Of String)
    Dim StringsFound As New List(Of String)
    Dim Index As Integer = Source.IndexOf(startTag) + startTag.Length
    While Index <> startTag.Length - 1
        StringsFound.Add(Source.Substring(Index, Source.IndexOf(endTag, Index) - Index))
        Index = Source.IndexOf(startTag, Index) + startTag.Length
    End While
    Return StringsFound
End Function

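The problem is that it only shows the first value in the table, "10.13.4.1492". The program does not continue to the following rows of the table; it just stops there.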

Look at this code:

Dim lvi As New ListViewItem
For Each title As String In GetTagContents(recentSource, "<li>", "</li>")
    If Not title.Contains("http:") Then
        lvi.Text = title
        ListView1.Items.Add(lvi)
    End If
Next

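It creates only one ListViewItem object, before the loop, so every iteration overwrites and re-adds that same object. Each iteration needs a new ListViewItem: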

For Each title As String In GetTagContents(recentSource, "<li>", "</li>")
    If Not title.Contains("http:") Then
        Dim lvi As New ListViewItem
        lvi.Text = title
        ListView1.Items.Add(lvi)
    End If
Next

Or even better:

ListView1.Items.AddRange(
    GetTagContents(recentSource, "<li>", "</li>").
        Where(Function(t) Not t.Contains("http:")).
        Select(Function(t) New ListViewItem(t)).
        ToArray())

Ideally, you would find an RSS feed for this data. RSS was made for this kind of scraping.
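
If such a feed exists, reading it could look roughly like the sketch below. This is only an illustration under assumptions: the module name RssSketch, the function GetItemTitles, and the inline sample feed are all made up here, and nothing above confirms that the site actually publishes an RSS feed. It assumes a plain RSS 2.0 layout where each <item> has a <title>.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Xml.Linq

Module RssSketch
    ' Hypothetical helper: returns the <title> text of every <item>
    ' in an RSS 2.0 document passed in as a string.
    Function GetItemTitles(ByVal rssXml As String) As List(Of String)
        Dim doc As XDocument = XDocument.Parse(rssXml)
        Return doc.Descendants("item").
            Select(Function(item) item.Element("title")).
            Where(Function(t) t IsNot Nothing).
            Select(Function(t) t.Value).
            ToList()
    End Function

    Sub Main()
        ' Inline placeholder feed so the sketch runs without network access.
        ' The second version string is an invented placeholder value.
        Dim sample As String =
            "<rss version=""2.0""><channel><title>Builds</title>" &
            "<item><title>10.13.4.1492</title></item>" &
            "<item><title>0.0.0.0</title></item>" &
            "</channel></rss>"
        For Each title As String In GetItemTitles(sample)
            Console.WriteLine(title)
        Next
    End Sub
End Module
```

In the real application the string would come from the feed itself, for example via New Net.WebClient().DownloadString(feedUrl), where feedUrl is whatever address the site provides (a hypothetical value here). Each returned title could then be wrapped in its own New ListViewItem(title), as in the loop above.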