VB.NET word cloud indexing

I am generating a word cloud from a list of words and their frequencies. I load the list from a text file and display it in a ListView.

Imports WordCloudGen = WordCloud.WordCloud
Imports System.IO
Public Class WordCloud
    Private Sub WordCloud_Load(sender As Object, e As EventArgs) Handles MyBase.Load

        Dim lines = File.ReadLines("C:\Users\Gebruiker\Downloads\Words.txt")
        Dim Words As New List(Of String) '({100})
        Dim Frequencies As New List(Of Integer) '({100})
        Dim textValue As String()
        Dim items As New List(Of ListViewItem)

        For Each line In lines
            ' Each line is expected to look like "word,frequency"
            textValue = line.Split(","c)
            Words.Add(textValue(0))
            Frequencies.Add(Integer.Parse(textValue(1)))
            items.Add(New ListViewItem(New String() {textValue(0), textValue(1)}, 0))
        Next
        ListView1.Items.AddRange(items.ToArray)

        Dim wc As WordCloudGen = New WordCloudGen(600, 400)
        Dim i As Image = wc.Draw(Words, Frequencies)
        ResultPictureBox.Image = i

    End Sub
End Class

When the text file is not sorted (highest frequencies first), the word cloud doesn't make the words with the highest counts the biggest. Is there a way to load the words with the highest frequencies first, without having to change the list?

I would recommend a new class to hold your data; you can then sort on whatever you need much more easily.

  • Create a new class: WordsFrequencies

    Public Class WordsFrequencies
    
     Public Property Word As String
     Public Property Frequency As Integer
    
    End Class
    
  • Change your WordCloud_Load routine as follows:

     Dim WordsFreqList As New List(Of WordsFrequencies)
     For Each line As String In File.ReadLines("C:\Users\Gebruiker\Downloads\Words.txt")
         Dim splitText As String() = line.Split(","c)
         If splitText IsNot Nothing AndAlso splitText.Length = 2 Then
             Dim wordFrq As New WordsFrequencies
             Dim freq As Integer
             wordFrq.Word = splitText(0)
             wordFrq.Frequency = If(Integer.TryParse(splitText(1), freq), freq, 0)
             WordsFreqList.Add(wordFrq)
         End If
     Next
     If WordsFreqList.Count > 0 Then
         ' Order the list based on the Frequency
         WordsFreqList = WordsFreqList.OrderByDescending(Function(w) w.Frequency).ToList
         ' Add the sorted items to the listview
         WordsFreqList.ForEach(Sub(wf)
                                   ListView1.Items.Add(New ListViewItem(New String() {wf.Word, wf.Frequency.ToString}, 0))
                               End Sub)
     End If
    

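Above, I suggest a simple For Each loop over File.ReadLines, so that if you are just grabbing the data you don't have to load the whole file into memory before parsing it. I'm using the OrderByDescending method, which is part of the System.Linq namespace.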
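The parse-and-sort step can be tried without the form or the file. The following is a minimal console sketch, assuming the "word,frequency" line format from the question. The module name, the ParseAndSort helper, and the sample data are made up for illustration. Unlike the routine above, which stores 0 when the frequency cannot be parsed, this version skips malformed lines.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module SortSketch
    Public Class WordsFrequencies
        Public Property Word As String
        Public Property Frequency As Integer
    End Class

    ' Hypothetical helper: parses "word,frequency" lines and returns
    ' the entries sorted with the highest frequency first.
    Function ParseAndSort(lines As IEnumerable(Of String)) As List(Of WordsFrequencies)
        Dim result As New List(Of WordsFrequencies)
        For Each line As String In lines
            Dim parts As String() = line.Split(","c)
            Dim freq As Integer
            ' Skip lines that do not have exactly two fields or a numeric frequency.
            If parts.Length = 2 AndAlso Integer.TryParse(parts(1).Trim(), freq) Then
                result.Add(New WordsFrequencies With {.Word = parts(0).Trim(), .Frequency = freq})
            End If
        Next
        Return result.OrderByDescending(Function(w) w.Frequency).ToList()
    End Function

    Sub Main()
        ' Made-up sample data standing in for Words.txt.
        Dim sample As String() = {"apple,3", "pear,12", "not a valid line", "plum,7"}
        For Each wf In ParseAndSort(sample)
            Console.WriteLine($"{wf.Word}: {wf.Frequency}")
        Next
    End Sub
End Module
```

With the sample data this prints pear, plum, apple in that order. Passing File.ReadLines(path) instead of the array should behave the same way, since ReadLines also returns an IEnumerable(Of String).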

As for Dim i As Image = wc.Draw(Words, Frequencies), you can then do the following:

 Dim i As Image = wc.Draw(WordsFreqList.Select(Function(wf) wf.Word), WordsFreqList.Select(Function(wf) wf.Frequency))

This projects Word into an IEnumerable(Of String) and Frequency into an IEnumerable(Of Integer).
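One caveat: the signature of wc.Draw is not shown here. The original call passed a List(Of String) and a List(Of Integer), so if Draw is declared with List parameters, the IEnumerable projections may not compile under Option Strict On. A sketch of the workaround, assuming that is the case, is to materialize each projection with ToList. The module name and sample data are made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module ProjectionSketch
    Public Class WordsFrequencies
        Public Property Word As String
        Public Property Frequency As Integer
    End Class

    Sub Main()
        ' Made-up entries, already sorted by descending frequency.
        Dim sorted As New List(Of WordsFrequencies) From {
            New WordsFrequencies With {.Word = "pear", .Frequency = 12},
            New WordsFrequencies With {.Word = "plum", .Frequency = 7}
        }

        ' Select preserves the order of the source, so both lists stay aligned index by index.
        Dim words As List(Of String) = sorted.Select(Function(wf) wf.Word).ToList()
        Dim frequencies As List(Of Integer) = sorted.Select(Function(wf) wf.Frequency).ToList()

        Console.WriteLine(String.Join(", ", words))
        Console.WriteLine(String.Join(", ", frequencies))
    End Sub
End Module
```

In the form, the two lists would then be passed as wc.Draw(words, frequencies), exactly like the original call.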