Select Random Files From Txt File and Save to Another Txt - VBScript

I have a text file with more than 350,000 lines. I need to select 100 random lines from that file and save them to a separate text file.

  1. Is this possible with VBScript?
  2. The file is UTF-8. Will that be a problem?

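Later I may need to do something more complex, for example: select 100 random lines from each of several text files (each containing 350k+ lines) and save them all to one text file. Is that possible too?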

Sub Randomise
    Randomize                               ' seed Rnd from the system timer
    Set rs = CreateObject("ADODB.Recordset")
    With rs
        .Fields.Append "RandomNumber", 4    ' 4 = adSingle
        .Fields.Append "Txt", 201, 5000     ' 201 = adLongVarChar
        .Open
        ' Tag every input line with a random sort key
        Do Until Inp.AtEndOfStream
            .AddNew
            .Fields("RandomNumber").Value = Rnd() * 10000
            .Fields("Txt").Value = Inp.ReadLine
            .Update
        Loop
        ' Sorting on the random key shuffles the lines
        .Sort = "RandomNumber"
        Do While Not .EOF
            Outp.WriteLine .Fields("Txt").Value
            .MoveNext
        Loop
    End With
End Sub

This randomises the order of the lines in a file.
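Since the question only needs 100 lines, shuffling all 350,000 of them is more work than strictly required. One alternative, which is not what the Sub above does, is reservoir sampling: keep a fixed-size set of lines while reading and replace entries at random. The following is a minimal sketch under that assumption. The function name ReservoirSample is hypothetical, and k is assumed to be at least 1.

```vbscript
' Hypothetical helper, not part of the original filter.vbs.
' Returns up to k lines chosen uniformly at random from a text stream
' (any object with AtEndOfStream and ReadLine, such as WScript.StdIn).
' Assumes k >= 1.
Function ReservoirSample(stream, k)
    Dim reservoir(), seen, j, textLine
    ReDim reservoir(k - 1)
    seen = 0
    Do Until stream.AtEndOfStream
        textLine = stream.ReadLine
        seen = seen + 1
        If seen <= k Then
            reservoir(seen - 1) = textLine      ' fill the reservoir first
        Else
            j = Int(Rnd() * seen)               ' random position in 0 .. seen-1
            If j < k Then reservoir(j) = textLine
        End If
    Loop
    If seen = 0 Then
        ReservoirSample = Array()               ' empty input, empty result
    Else
        If seen < k Then ReDim Preserve reservoir(seen - 1)
        ReservoirSample = reservoir
    End If
End Function

' Usage: cscript //nologo sample.vbs < inputfile > outputfile
Dim picked, i
Randomize
picked = ReservoirSample(WScript.StdIn, 100)
For i = 0 To UBound(picked)
    WScript.StdOut.WriteLine picked(i)
Next
```

Only the 100 kept lines are held in memory at any time. The output is a random selection, but it is not written in random order.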

Sub Cut
    Set rs = CreateObject("ADODB.Recordset")
    With rs
        .Fields.Append "LineNumber", 4      ' 4 = adSingle
        .Fields.Append "Txt", 201, 5000     ' 201 = adLongVarChar
        .Open
        ' Load every input line together with its line number
        LineCount = 0
        Do Until Inp.AtEndOfStream
            LineCount = LineCount + 1
            .AddNew
            .Fields("LineNumber").Value = LineCount
            .Fields("Txt").Value = Inp.ReadLine
            .Update
        Loop

        .Sort = "LineNumber ASC"

        ' Arg(1): t = top, b = bottom. Arg(2): i = include, x = exclude. Arg(3): number of lines.
        If LCase(Arg(1)) = "t" Then
            If LCase(Arg(2)) = "i" Then
                .Filter = "LineNumber < " & (CLng(Arg(3)) + 1)
            ElseIf LCase(Arg(2)) = "x" Then
                .Filter = "LineNumber > " & CLng(Arg(3))
            End If
        ElseIf LCase(Arg(1)) = "b" Then
            If LCase(Arg(2)) = "i" Then
                .Filter = "LineNumber > " & (LineCount - CLng(Arg(3)))
            ElseIf LCase(Arg(2)) = "x" Then
                .Filter = "LineNumber < " & (LineCount - CLng(Arg(3)) + 1)
            End If
        End If

        ' Write out whatever rows the filter left visible
        Do While Not .EOF
            Outp.WriteLine .Fields("Txt").Value
            .MoveNext
        Loop
    End With
End Sub

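This can cut the first 100 lines from a file, so cutting the top 100 lines of the randomised output gives 100 random lines.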


Cut
filter cut {t|b} {i|x} NumOfLines

Cuts the number of lines from the top or bottom of file.

t - top of the file
b - bottom of the file
i - include n lines
x - exclude n lines

Example

filter cut t i 5 < "%systemroot%\win.ini"

---

Random
filter random
filter rand

Randomises lines of text in a file. Used to unsort a list.

Example

filter random < "%windir%\win.ini"

These are samples from another program. The common declarations are

Set Arg = WScript.Arguments
set WshShell = createObject("Wscript.Shell")
Set Inp = WScript.Stdin
Set Outp = Wscript.Stdout

General use is

Filter reads and writes standard input and standard output only. These are only available at a command prompt.

cscript filter <inputfile >outputfile
cscript filter <inputfile | other_command
other_command | cscript filter >outputfile
other_command | cscript filter | other_command
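To illustrate why the filter only works at a command prompt: WScript.StdIn and WScript.StdOut are usable when the script runs under cscript.exe, not under the windowed wscript.exe host. The following is a minimal sketch of a pass-through filter that checks the host first. The script name passthru.vbs and the helper IsConsoleHost are hypothetical and are not part of filter.vbs.

```vbscript
' Hypothetical minimal filter; passthru.vbs and IsConsoleHost are illustrative names only.

' True when the script host path ends in cscript.exe (the console host).
Function IsConsoleHost(hostPath)
    IsConsoleHost = (LCase(Right(hostPath, 11)) = "cscript.exe")
End Function

If Not IsConsoleHost(WScript.FullName) Then
    WScript.Echo "Run this from a command prompt: cscript //nologo passthru.vbs < inputfile"
    WScript.Quit 1
End If

Dim Inp, Outp
Set Inp = WScript.StdIn
Set Outp = WScript.StdOut

' Copy standard input to standard output one line at a time.
Do Until Inp.AtEndOfStream
    Outp.WriteLine Inp.ReadLine
Loop
```

A real filter would transform each line inside the loop instead of copying it unchanged.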

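This is the batch file I use to call the functions above. The batch file installs itself into the path if needed. This allows the vbs script (filter.vbs) to be used at the command prompt as if it were an exe file.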

@echo off


Rem Make sure filter.vbs exists
set filter=
set filterpath=
Call :FindFilter filter.vbs

Rem Add filter.bat to the path if not in there, setx fails if it's already there
setx path %~dp0;%path% 1>nul 2>nul




Rem Test for some command line parameters
If not "%1"=="" goto main

echo.
echo -------------------------------------------------------------------------------
echo.
echo   Filter.bat
echo   ==========
echo.
echo     The Filter program is a vbs file for searching, replacing, extracting, and 
echo     trimming console output and text files.
echo.
echo     Filter.bat makes Filter.vbs easily usable from the command line. It 
echo     controls unicode/ansi support and debugging.
echo.
echo           Type Filter Help or Filter HTMLHelp for more information.
echo.
cscript //nologo "%filter%" menu
Goto :EOF

:Main


echo %date% %time% %~n0 %* >>"%~dp0\FilterHistory.txt"



rem echo Batch file ran
rem echo %*
Rem /ud Unicode and Debug
If %1==/ud FOR /F "tokens=1*" %%i IN ("%*") DO cscript "%filter%" //nologo //u //x %%j&Goto :EOF

Rem /u Unicode
If %1==/u FOR /F "tokens=1*" %%i IN ("%*") DO cscript "%filter%" //nologo //u %%j&Goto :EOF

Rem /d Ansi Debug
If %1==/d FOR /F "tokens=1*" %%i IN ("%*") DO cscript "%filter%" //nologo //x %%j&Goto :EOF

Rem -ud Unicode and Debug
If %1==-ud FOR /F "tokens=1*" %%i IN ("%*") DO cscript "%filter%" //nologo //u //x %%j&Goto :EOF

Rem -u Unicode
If %1==-u FOR /F "tokens=1*" %%i IN ("%*") DO cscript "%filter%" //nologo //u %%j&Goto :EOF

Rem -d Ansi Debug
If %1==-d FOR /F "tokens=1*" %%i IN ("%*") DO cscript "%filter%" //nologo //x %%j&Goto :EOF

Rem ANSI
cscript "%filter%" //nologo %*&Goto :EOF

Goto :EOF

:FindFilter

If Exist "%~dpn0.vbs" set filter=%~dpn0.vbs&set filterpath=%~dp0&goto :EOF

echo find filter 1
If Not "%~dpnx$PATH:1" == "" set filter=%~dpnx1&set filterpath=%~dp1&goto :EOF

echo find filter 2
If Exist "%temp%\filter.vbs" set filter=%temp%\filter.vbs&set filterpath=%temp%&goto :EOF

copy "%~dpnx0" "%~dpn0.bak"
if not errorlevel 1 (
    echo creating "%~dpn0.vbs"
    goto :EOF
)

copy "%~dpnx0" "%temp%\filter.bak" 

echo Error %errorlevel%
if not errorlevel 1 (
    echo creating "%temp%\filter.bak"
    Goto :EOF
)
Goto :EOF

This is how I solved it:

Randomize
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set numberDic = CreateObject("Scripting.Dictionary")
Set objInFile = objFSO.OpenTextFile("C:\Users\Mega\Desktop0kWords.txt", 1, True, 0)
Set objOutFile = objFSO.OpenTextFile("C:\Users\Mega\Desktop0WordsSelected.txt", 2, True, 0)

' Read the whole file and split it into an array of lines
strLines = objInFile.ReadAll
arrLines = Split(strLines, vbNewLine)
intUpperLimit = UBound(arrLines)
numPicks = 100

' The number of random picks must be less than or
' equal to the number of lines in the input file
If intUpperLimit < numPicks Then
    numPicks = intUpperLimit
End If

' Note: this loop never ends if the file holds fewer than
' numPicks distinct non-blank lines
Do Until numberDic.Count = numPicks
    ' Index range is 0 .. intUpperLimit, so the first line can be picked too
    index = Int(Rnd() * (intUpperLimit + 1))
    strLine = Trim(arrLines(index))
    ' If blank lines exist in the text file, don't add them
    If strLine <> "" Then
        ' If the chosen line is not in the dictionary yet, add it
        If Not numberDic.Exists(strLine) Then
            numberDic.Add strLine, strLine
        End If
    End If
Loop

For Each item In numberDic
    objOutFile.WriteLine item
Next

objInFile.Close
objOutFile.Close
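On the UTF-8 part of the question: the script above opens both files with format 0, which FileSystemObject treats as ASCII/ANSI, so the text is not decoded as UTF-8. One option is ADODB.Stream with Charset set to "utf-8". The following is a minimal sketch under that assumption, which also covers the "100 lines from each of several files" case. All function names and the paths in the usage comment are hypothetical placeholders. Blank lines inside a file are not filtered out here.

```vbscript
' Hypothetical helpers; names and paths are placeholders, not from the original post.

' Read a UTF-8 text file and return its lines as an array.
Function ReadUtf8Lines(path)
    Dim stm, text
    Set stm = CreateObject("ADODB.Stream")
    stm.Type = 2                          ' 2 = adTypeText
    stm.Charset = "utf-8"
    stm.Open
    stm.LoadFromFile path
    text = stm.ReadText(-1)               ' -1 = adReadAll
    stm.Close
    text = Replace(text, vbCrLf, vbLf)    ' accept both CRLF and LF line endings
    If Right(text, 1) = vbLf Then text = Left(text, Len(text) - 1)
    ReadUtf8Lines = Split(text, vbLf)
End Function

' Pick up to "wanted" lines without repeats using a partial Fisher-Yates shuffle.
' The front of the "lines" array is shuffled in place.
Function PickRandom(lines, wanted)
    Dim n, count, i, j, tmp, picked()
    n = UBound(lines) + 1
    count = wanted
    If count > n Then count = n
    If count <= 0 Then
        PickRandom = Array()
        Exit Function
    End If
    ReDim picked(count - 1)
    For i = 0 To count - 1
        j = i + Int(Rnd() * (n - i))      ' random position in i .. n-1
        tmp = lines(i)
        lines(i) = lines(j)
        lines(j) = tmp
        picked(i) = lines(i)
    Next
    PickRandom = picked
End Function

' Take perFile random lines from every input file and save them all to one UTF-8 file.
Sub SampleFiles(inputPaths, outputPath, perFile)
    Dim outStm, p, picked, i
    Set outStm = CreateObject("ADODB.Stream")
    outStm.Type = 2
    outStm.Charset = "utf-8"
    outStm.Open
    For Each p In inputPaths
        picked = PickRandom(ReadUtf8Lines(p), perFile)
        For i = 0 To UBound(picked)
            outStm.WriteText picked(i), 1   ' 1 = adWriteLine
        Next
    Next
    outStm.SaveToFile outputPath, 2         ' 2 = adSaveCreateOverWrite
    outStm.Close
End Sub

' Example call (placeholder paths):
'   Randomize
'   SampleFiles Array("C:\data\words1.txt", "C:\data\words2.txt"), "C:\data\selected.txt", 100
```

Unlike the dictionary approach, the partial shuffle always finishes after at most perFile swaps per file, and it picks by position, so duplicate lines in the input can both be selected. ADODB.Stream writes a byte order mark at the start of UTF-8 output, which may matter if another tool reads the result.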