Find missing files from sequentially numbered files in a directory

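I have about 300,000 files in a directory. They are numbered sequentially: x000001, x000002, ..., x300000. Some of these files are missing, and I need to write an output text file containing the numbers of the missing files. The following code handles only up to 10,000 files: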

@echo off
setlocal enabledelayedexpansion
set "log=%cd%\logfile.txt"
for /f "delims=" %%a in ('dir /ad /b /s') do (
 pushd "%%a"
  for /L %%b in (10000,1,19999) do (
   set a=%%b
   set a=!a:~-4!
   if not exist "*!a!.csv" >>"%log%" echo "%%a - *!a!.csv"
  )
 popd
)

How can I extend this to 3 * 10^5 files?

Solution 1 - simple but slow

This batch code does the job if all 300,000 CSV files are in the current directory when the batch file is executed.

@echo off
set "log=%cd%\logfile.txt"
del "%log%" 2>nul
for /L %%N in (1,1,9) do if not exist *00000%%N.csv echo %%N - *00000%%N.csv>>"%log%"
for /L %%N in (10,1,99) do if not exist *0000%%N.csv echo %%N - *0000%%N.csv>>"%log%"
for /L %%N in (100,1,999) do if not exist *000%%N.csv echo %%N - *000%%N.csv>>"%log%"
for /L %%N in (1000,1,9999) do if not exist *00%%N.csv echo %%N - *00%%N.csv>>"%log%"
for /L %%N in (10000,1,99999) do if not exist *0%%N.csv echo %%N - *0%%N.csv>>"%log%"
for /L %%N in (100000,1,300000) do if not exist *%%N.csv echo %%N - *%%N.csv>>"%log%"
set "log="
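
The six loops above differ only in the number of leading zeros. As a minimal sketch, the padding could instead be done with a substring expansion in a single loop. This assumes every file name starts with the literal x from the question, followed by exactly six digits; the variable name Padded is made up for illustration. It is not expected to be any faster than solution 1, because it still tests each file name individually.

```bat
@echo off
setlocal EnableExtensions EnableDelayedExpansion
set "log=%cd%\logfile.txt"
del "%log%" 2>nul

rem Count from 1000001 to 1300000 and keep only the last six digits,
rem which yields 000001 to 300000 with the leading zeros in place.
for /L %%N in (1000001,1,1300000) do (
    set "Padded=%%N"
    set "Padded=!Padded:~-6!"
    rem Assumption: files are named x followed by six digits and .csv.
    if not exist "x!Padded!.csv" echo x!Padded!.csv>>"%log%"
)
endlocal
```

Because delayed expansion is enabled here, a current directory path containing an exclamation mark would probably need extra care.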

Solution 2 - faster but harder to understand

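This second solution is definitely much faster than the one above because it processes a single sorted list of the file names in the current directory, from the first file name to the last, instead of testing each number separately.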

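If the last file is not x300000.csv, the batch code below writes just one more line into the log file, stating from which number up to the expected end number 300000 the files are missing in the current directory.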

@echo off
setlocal EnableExtensions EnableDelayedExpansion

rem Delete log file before running file check.
set "log=%cd%\logfile.txt"
del "%log%" 2>nul

rem Define initial value for the number in the file names.
set "Number=0"

rem Define the file extension of the files.
set "Ext=.csv"

rem Define beginning of first file name with number 1.
set "Name=x00000"

rem Define position of dot separating name from extension.
set "DotPos=7"

rem Process the name-sorted list of files in the current directory that match
rem the fixed-length pattern, one line at a time. Each file name is compared
rem case-sensitively with the file name expected for the current number.
rem A subroutine is called if the current file name differs from the expected one.
for /F "delims=" %%F in ('dir /B /ON x??????%Ext% 2^>nul') do (
    set /A Number+=1
    if "!Name!!Number!%Ext%" NEQ "%%F" call :CheckDiff "%%F"
)

rem If the last file does not have the expected number 300000, log the
rem numbers of the remaining missing files with a single line.
if "%Number%" NEQ "300000" (
    set /A Number+=1
    echo All files from number !Number! to 300000 are also missing.>>"%log%"
)
endlocal

rem Exit this batch file by jumping to the predefined label EOF (End Of File).
goto :EOF

rem This subroutine is called from the main loop whenever the current file
rem name does not match the expected file name. With file names in the
rem expected format, there are two possible reasons:

rem 1. One leading zero must be removed from variable "Name" because the
rem    number has reached the next power of 10, i.e. from 1-9 to 10,
rem    from 10-99 to 100, etc.

rem 2. The next file name really has a different number than expected,
rem    which means one or more files are missing from the list.

rem The first reason is checked by testing whether the dot separating name
rem and extension is at the correct position. If it is not, one zero is
rem removed from the end of the string in variable "Name" and the new
rem expected file name is compared with the current file name.

rem If the (possibly updated) expected file name is still not equal to
rem the current file name, the expected file name is written into the
rem log file because this file is missing from the list.

rem More files can be missing up to the current file name. Therefore the
rem number is increased and the entire subroutine is executed again for as
rem long as the expected file name is not equal to the current file name.

rem The subroutine exits with goto :EOF once the expected file name is
rem equal to the current file name, so the main loop above continues
rem with the next file name from the directory listing.

:CheckDiff
set "Expected=%Name%%Number%%Ext%"
if "!Expected:~%DotPos%,1!" NEQ "." (
    set "Name=%Name:~0,-1%"
    set "Expected=!Name!%Number%%Ext%"
)
if "%Expected%" EQU %1 goto :EOF
echo %Expected%>>"%log%"
set /A Number+=1
goto CheckDiff
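
Before running the second solution on 300,000 files, it may help to try it on a small scratch directory. The following is only a sketch for setting one up: the folder name MissingFilesTest and the chosen file numbers are made up for illustration.

```bat
@echo off
setlocal EnableExtensions

rem Hypothetical scratch folder; any empty folder would do.
set "TestDir=%TEMP%\MissingFilesTest"
md "%TestDir%" 2>nul

rem Create empty files number 1, 2 and 5, leaving 3 and 4 missing.
for %%N in (1 2 5) do type nul >"%TestDir%\x00000%%N.csv"

rem List the test files in the same sorted order the main loop reads them.
dir /B /ON "%TestDir%\x??????.csv"
endlocal
```

With this folder as the current directory, the second solution should log x000003.csv and x000004.csv, followed by one line reporting that all files from number 6 to 300000 are also missing.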

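To understand the commands used in the two solutions and how they work, open a command prompt window, execute the following commands there, and carefully read all help pages displayed for each command.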

  • call /?
  • dir /?
  • echo /?
  • endlocal /?
  • for /?
  • if /?
  • goto /?
  • rem /?
  • set /?
  • setlocal /?

@echo off
setlocal EnableDelayedExpansion

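rem Store a carriage return in CR; copy /Z emits one in its progress output.
for /F %%a in ('copy /Z "%~F0" NUL') do set "CR=%%a"

rem Start at 1000000 so that !num:~1! is the file number padded to six digits.
set "num=1000000"
del logfile.txt 2> NUL
rem Walk the *.csv files in the order the file system returns them (sorted
rem by name on NTFS), show the current number on one line, and call the
rem subroutine whenever the expected name differs from the actual one.
< NUL (for %%a in (*.csv) do (
   set /A num+=1
   set /P "=!num:~1!!CR!"
   if "x!num:~1!" neq "%%~Na" call :missingFile "%%~Na"
))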
goto :EOF


rem Log expected file names until the expected name matches the actual one.
:missingFile file
echo x%num:~1%.csv>> logfile.txt
echo x%num:~1%.csv Missing
set /A num+=1
if "x%num:~1%" neq "%~1" goto missingFile
exit /B