Select Random Lines From a Txt File and Save to Another Txt - VBScript
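I have a text file with more than 350,000 lines. I need to select 100 random lines from that file and save them to a separate text file.
- Can this be done with VBScript?
- The file is UTF-8; will that be a problem?
Later I may need to do something more complex, for example: select 100 random lines from each of several text files (each with 350k+ lines) and save them all to one text file. Is that possible too?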
Sub Randomise
    Randomize
    Set rs = CreateObject("ADODB.Recordset")
    With rs
        ' In-memory recordset: a random sort key (4 = adSingle)
        ' and the line text (201 = adLongVarChar)
        .Fields.Append "RandomNumber", 4
        .Fields.Append "Txt", 201, 5000
        .Open
        Do Until Inp.AtEndOfStream
            .AddNew
            .Fields("RandomNumber").value = Rnd() * 10000
            .Fields("Txt").value = Inp.readline
            .UpDate
        Loop
        ' Sorting on the random key shuffles the lines
        .Sort = "RandomNumber"
        Do While not .EOF
            Outp.writeline .Fields("Txt").Value
            .MoveNext
        Loop
    End With
End Sub
This randomises the lines in a file.
Sub Cut
    Set rs = CreateObject("ADODB.Recordset")
    With rs
        .Fields.Append "LineNumber", 4
        .Fields.Append "Txt", 201, 5000
        .Open
        ' Number every input line so a filter on LineNumber can cut by position
        LineCount = 0
        Do Until Inp.AtEndOfStream
            LineCount = LineCount + 1
            .AddNew
            .Fields("LineNumber").value = LineCount
            .Fields("Txt").value = Inp.readline
            .UpDate
        Loop
        .Sort = "LineNumber ASC"
        ' Arg(1): t = top, b = bottom. Arg(2): i = include, x = exclude. Arg(3): number of lines
        If LCase(Arg(1)) = "t" then
            If LCase(Arg(2)) = "i" then
                .filter = "LineNumber < " & LCase(Arg(3)) + 1
            ElseIf LCase(Arg(2)) = "x" then
                .filter = "LineNumber > " & LCase(Arg(3))
            End If
        ElseIf LCase(Arg(1)) = "b" then
            If LCase(Arg(2)) = "i" then
                .filter = "LineNumber > " & LineCount - LCase(Arg(3))
            ElseIf LCase(Arg(2)) = "x" then
                .filter = "LineNumber < " & LineCount - LCase(Arg(3)) + 1
            End If
        End If
        Do While not .EOF
            Outp.writeline .Fields("Txt").Value
            .MoveNext
        Loop
    End With
End Sub
This can cut the first 100 lines from a file.
Cut
filter cut {t|b} {i|x} NumOfLines
Cuts the number of lines from the top or bottom of file.
t - top of the file
b - bottom of the file
i - include n lines
x - exclude n lines
Example
filter cut t i 5 <"%systemroot%\win.ini"
---
Random
filter random
filter rand
Randomises lines of text in a file. Used to unsort a list.
Example
filter random < "%windir%\win.ini"
These are samples from another program. The common declarations are
Set Arg = WScript.Arguments
set WshShell = createObject("Wscript.Shell")
Set Inp = WScript.Stdin
Set Outp = Wscript.Stdout
General use is
Filter reads and writes only standard input and standard output. These are available only at a command prompt.
cscript filter <inputfile >outputfile
cscript filter <inputfile | other_command
other_command | cscript filter >outputfile
other_command | cscript filter | other_command
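Chaining the two filters (filter random, then filter cut t i 100) would give 100 random lines. As a rough standalone sketch of the same shuffle-then-cut idea, the function below shuffles only the first n positions of an in-memory array (a partial Fisher-Yates shuffle). The name PickRandomLines and its parameters are made up for illustration and are not part of filter.vbs; call Randomize once before using it.

```vbscript
Option Explicit

' Hypothetical helper, not part of filter.vbs.
' Returns an array of up to n lines chosen at random, without
' repeating a position, from the array "lines".
' ByVal makes a copy, so the caller's array is left untouched.
Function PickRandomLines(ByVal lines, ByVal n)
    Dim total, i, j, tmp, result()
    total = UBound(lines) + 1
    If n > total Then n = total
    If n < 1 Then
        PickRandomLines = Array()
        Exit Function
    End If
    ReDim result(n - 1)
    For i = 0 To n - 1
        ' Pick a random index from the part not yet fixed: i .. total - 1
        j = i + Int(Rnd() * (total - i))
        ' Swap it into position i and record it
        tmp = lines(i)
        lines(i) = lines(j)
        lines(j) = tmp
        result(i) = lines(i)
    Next
    PickRandomLines = result
End Function
```

With the declarations above, something like Split(Inp.ReadAll, vbNewLine) could supply the array, and each element of the result could be written with Outp.WriteLine.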
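This is the batch file I use to call the above functions. The batch file installs itself onto the path if needed. That lets the vbs script (filter.vbs) be used at the command prompt as though it were an exe file.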
@echo off
Rem Make sure filter.vbs exists
set filter=
set filterpath=
Call :FindFilter filter.vbs
Rem Add filter.bat to the path if not in there, setx fails if it's already there
setx path %~dp0;%path% 1>nul 2>nul
Rem Test for some command line parameters
If not "%1"=="" goto main
echo.
echo -------------------------------------------------------------------------------
echo.
echo Filter.bat
echo ==========
echo.
echo The Filter program is a vbs file for searching, replacing, extracting, and
echo trimming console output and text files.
echo.
echo Filter.bat makes Filter.vbs easily usable from the command line. It
echo controls unicode/ansi support and debugging.
echo.
echo Type Filter Help or Filter HTMLHelp for more information.
echo.
cscript //nologo "%filter%" menu
Goto :EOF
:Main
echo %date% %time% %~n0 %* >>"%~dp0\FilterHistory.txt"
rem echo Batch file ran
rem echo %*
Rem /ud Unicode and Debug
If %1==/ud FOR /F "tokens=1*" %%i IN ("%*") DO cscript "%filter%" //nologo //u //x %%j&Goto :EOF
Rem /u Unicode
If %1==/u FOR /F "tokens=1*" %%i IN ("%*") DO cscript "%filter%" //nologo //u %%j&Goto :EOF
Rem /d Ansi Debug
If %1==/d FOR /F "tokens=1*" %%i IN ("%*") DO cscript "%filter%" //nologo //x %%j&Goto :EOF
Rem -ud Unicode and Debug
If %1==-ud FOR /F "tokens=1*" %%i IN ("%*") DO cscript "%filter%" //nologo //u //x %%j&Goto :EOF
Rem -u Unicode
If %1==-u FOR /F "tokens=1*" %%i IN ("%*") DO cscript "%filter%" //nologo //u %%j&Goto :EOF
Rem -d Ansi Debug
If %1==-d FOR /F "tokens=1*" %%i IN ("%*") DO cscript "%filter%" //nologo //x %%j&Goto :EOF
Rem ANSI
cscript "%filter%" //nologo %*&Goto :EOF
Goto :EOF
:FindFilter
If Exist "%~dpn0.vbs" set filter=%~dpn0.vbs&set filterpath=%~dp0&goto :EOF
echo find filter 1
If Not "%~dpnx$PATH:1" == "" set filter=%~dpnx1&set filterpath=%~dp1&goto :EOF
echo find filter 2
If Exist "%temp%\filter.vbs" set filter=%temp%\filter.vbs&set filterpath=%temp%&goto :EOF
copy "%~dpnx0" "%~dpn0.bak"
if not errorlevel 1 (
echo creating "%~dpn0.vbs"
goto :EOF
)
copy "%~dpnx0" "%temp%\filter.bak"
echo Error %errorlevel%
if not errorlevel 1 (
echo creating "%temp%\filter.bak"
Goto :EOF
)
Goto :EOF
This is how I solved it:
Randomize
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set numberDic = CreateObject("Scripting.Dictionary")
'the last argument (0) opens the files as ANSI, not UTF-8
Set objInFile = objFSO.OpenTextFile("C:\Users\Mega\Desktop0kWords.txt", 1, True, 0)
Set objOutFile = objFSO.OpenTextFile("C:\Users\Mega\Desktop0WordsSelected.txt", 2, True, 0)
strLines = objInFile.ReadAll
arrLines = Split(strLines, vbNewLine)
intUpperLimit = UBound(arrLines)
numPicks = 100
'number of random picks must be less than or
'equal to the number of lines in the input file
'(the array is zero-based, so it holds intUpperLimit + 1 lines)
If intUpperLimit + 1 < numPicks Then
    numPicks = intUpperLimit + 1
End If
'note: this loop never ends if the file has fewer than
'numPicks distinct non-blank lines
Do Until numberDic.Count = numPicks
    'pick an index from 0 to intUpperLimit so the first line can be chosen too
    index = Int(Rnd() * (intUpperLimit + 1))
    strLine = Trim(arrLines(index))
    'if blank lines exist in text file, don't add them
    If strLine <> "" Then
        'if the line chosen is not in the dictionary object, add it
        If Not numberDic.Exists(strLine) Then
            numberDic.Add strLine, strLine
        End If
    End If
Loop
For Each item In numberDic
    objOutFile.WriteLine item
Next
objInFile.Close
objOutFile.Close
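On the UTF-8 and multiple-file parts of the question, one possible variation is sketched below. It swaps in two techniques that the code above does not use: ADODB.Stream with Charset set to "utf-8" for reading and writing, and reservoir sampling, which keeps only n lines in a VBScript array while scanning each file once. The names SampleLinesUtf8 and SampleFilesToOne are invented for this sketch. It assumes CRLF line endings, and it samples by line position, so a line whose text occurs twice in a file can be picked twice (unlike the dictionary approach above).

```vbscript
Option Explicit

' Hypothetical helper: returns up to n non-blank lines picked uniformly
' at random from a UTF-8 text file (reservoir sampling, one pass).
Function SampleLinesUtf8(path, n)
    Dim stream, reservoir(), seen, line, j
    If n < 1 Then
        SampleLinesUtf8 = Array()
        Exit Function
    End If
    ReDim reservoir(n - 1)
    seen = 0
    Set stream = CreateObject("ADODB.Stream")
    stream.Type = 2             ' adTypeText
    stream.Charset = "utf-8"
    stream.LineSeparator = -1   ' adCRLF; use 10 (adLF) for LF-only files
    stream.Open
    stream.LoadFromFile path    ' the stream object buffers the file contents
    Do Until stream.EOS
        line = Trim(stream.ReadText(-2))   ' -2 = adReadLine
        If line <> "" Then
            seen = seen + 1
            If seen <= n Then
                ' Fill the reservoir with the first n non-blank lines
                reservoir(seen - 1) = line
            Else
                ' Replace a random slot with probability n / seen
                j = Int(Rnd() * seen)
                If j < n Then reservoir(j) = line
            End If
        End If
    Loop
    stream.Close
    If seen = 0 Then
        SampleLinesUtf8 = Array()
        Exit Function
    End If
    If seen < n Then ReDim Preserve reservoir(seen - 1)
    SampleLinesUtf8 = reservoir
End Function

' Hypothetical helper: samples n lines from each file in "paths"
' and saves them all to one UTF-8 file.
Sub SampleFilesToOne(paths, n, outPath)
    Dim out, p, line
    Set out = CreateObject("ADODB.Stream")
    out.Type = 2
    out.Charset = "utf-8"
    out.Open
    For Each p In paths
        For Each line In SampleLinesUtf8(p, n)
            out.WriteText line, 1   ' 1 = adWriteLine, appends a line break
        Next
    Next
    out.SaveToFile outPath, 2       ' 2 = adSaveCreateOverWrite
    out.Close
End Sub
```

A call might look like SampleFilesToOne Array("a.txt", "b.txt"), 100, "picked.txt", where the file names are placeholders and Randomize has been called once beforehand. Note that ADODB.Stream writes a byte order mark at the start of a UTF-8 file it saves.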