Remove Repeated Words from Text file
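I have a text file containing nearly 45,000 words, one word per line. Thousands of these words appear more than 10 times. I want to create a new file with no repeated words. I am using a StreamReader, but it only reads the file once. How can I get rid of the repeated words? Please help. Thanks.
My code looks like this: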
Try
    File.OpenText(TextBox1.Text)
Catch ex As Exception
    MsgBox(ex.Message)
    Exit Sub
End Try
Dim line As String = String.Empty
Dim OldLine As String = String.Empty
Dim sr = File.OpenText(TextBox1.Text)
line = sr.ReadLine
OldLine = line
Do While sr.Peek <> -1
    Application.DoEvents()
    line = sr.ReadLine
    If OldLine <> line Then
        My.Computer.FileSystem.WriteAllText(My.Computer.FileSystem.SpecialDirectories.Desktop & "\Splitted File without Repeats.txt", line & vbCrLf, True)
    End If
    OldLine = line
Loop
sr.Close()
System.Diagnostics.Process.Start(My.Computer.FileSystem.SpecialDirectories.Desktop & "\Splitted File without Repeats.txt")
MsgBox("Loop terminated. Stream Reader Closed." & vbCrLf)
You can use LINQ's Distinct() method for this.
This works for smaller files:
Dim lines As String() = File.ReadAllLines("yourfile.txt")
File.WriteAllLines("yourfile.txt", lines.Distinct().ToArray())
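For larger files, one option is to stream the lines instead of loading the whole file into an array. The following is only a minimal sketch, assuming .NET 4.0 or later (for File.ReadLines); the module name DedupeSketch, the helper WriteDistinctLines, and the output file name are hypothetical and not part of the original answer. It keeps the first occurrence of each word, wherever the repeats are in the file, by remembering the words already written in a HashSet.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module DedupeSketch
    ' Hypothetical helper: copies inputPath to outputPath, keeping only the
    ' first occurrence of each line. outputPath must differ from inputPath,
    ' because File.ReadLines keeps the input open while it is enumerated.
    Sub WriteDistinctLines(inputPath As String, outputPath As String)
        ' Case-sensitive comparison; swap in StringComparer.OrdinalIgnoreCase
        ' if "Word" and "word" should count as the same word.
        Dim seen As New HashSet(Of String)(StringComparer.Ordinal)

        Using writer As New StreamWriter(outputPath, False)
            ' File.ReadLines reads lazily, one line at a time.
            For Each line As String In File.ReadLines(inputPath)
                ' HashSet.Add returns False when the line was already seen.
                If seen.Add(line) Then
                    writer.WriteLine(line)
                End If
            Next
        End Using
    End Sub

    Sub Main()
        ' Placeholder file names, for illustration only.
        WriteDistinctLines("yourfile.txt", "yourfile.distinct.txt")
    End Sub
End Module
```

Unlike comparing each line only with the previous one, this does not require the input to be sorted. The set still holds every distinct word in memory, which should be small for a file of about 45,000 words.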