How do I create files/folders with diacritics using a bat script?

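I have a bat file that reads lines from a file and then, depending on the given parameter, tries to create files or folders.

The problem is that it does not work when it gets characters such as ăâşîş.

Here is my code: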

IF "%1"=="" GOTO Final
IF "%1"=="file" GOTO File
IF "%1"=="folder" GOTO Folder

:File
    for /f %%i in (files.txt) do echo. > %%i.rtf
GOTO Final

:Folder
    for /f "tokens=*" %%a in (folders.txt) do (
    mkdir "%%a"
    )
GOTO Final

:Final

What I have tried so far, following this link: Manage paths with accented characters

  1. The bat script is saved as ANSI
  2. CHCP 1250 > NUL

How can I solve this?

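Put CHCP XXX in the batch file, where XXX is the code page that matches the encoding of your text files (files.txt and folders.txt). Note that you can use CHCP 65001, which corresponds to UTF-8 and should handle most diacritics without problems.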
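
Before picking XXX, it can help to check which code pages are able to represent the names at all. The following is a minimal sketch in Python rather than batch; the function name `code_pages_that_fit` and the candidate list are illustrative assumptions, not part of any Windows tool.

```python
def code_pages_that_fit(text, candidates=("cp437", "cp852", "cp1250", "cp1252", "utf-8")):
    """Return the candidate encodings that can represent every character of text."""
    fitting = []
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # at least one character has no byte in this code page
        fitting.append(name)
    return fitting


# The sample characters from the question, written as escapes so the exact
# code points are visible (assumption: s-cedilla U+015F, not s-comma U+0219).
sample = "\u0103\u00e2\u015f\u00ee\u015f"
print(code_pages_that_fit(sample))
```

For the sample string this should report cp852, cp1250 and utf-8, which suggests that code pages 437 and 1252 cannot hold these names whatever CHCP is set to.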

Avoid accented characters in file and folder names. Otherwise, mojibake is guaranteed in the Windows command line.

md files   2>NUL
pushd files
md unASCII 2>NUL
chcp 852 >nul
echo ěščřžýáíé-852>diacritic--852.txt
chcp 1250 >nul
echo ěščřžýáíé1250>diacritic-1250.txt

chcp 1250 >nul
findstr /R "^" "diacritic-*.txt"
for %G in (diacritic*.txt) do @for /F %g in (%G) do @echo %G:%g
for %G in (diacritic*.txt) do @for /F %g in (%G) do @echo(%~nG:%g>"unASCII\%gANSI.txt"
chcp 852 >nul
findstr /R "^" "diacritic-*.txt"
for %G in (diacritic*.txt) do @for /F %g in (%G) do @echo %G:%g
for %G in (diacritic*.txt) do @for /F %g in (%G) do @echo(%~nG:%g>"unASCII\%g-OEM.txt"
popd

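Note that the list of CLI commands above is not a .bat code snippet. However, copying and pasting it into a command-line window gives roughly the following output, which shows that the code page in effect when a file is created and the one in effect when it is used must match. Otherwise, plain mojibake appears; see e.g. findstr /R "^" "diacritic-*.txt":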

==>md files   2>NUL
==>pushd files
==>md unASCII 2>NUL

==>chcp 852 >nul
==>echo ěščřžýáíé-852>diacritic--852.txt

==>chcp 1250 >nul
==>echo ěščřžýáíé1250>diacritic-1250.txt

==>
==>chcp 1250 >nul

==>findstr /R "^" "diacritic-*.txt"
diacritic--852.txt:Řçźý§ě ˇ‚-852
diacritic-1250.txt:ěščřžýáíé1250

==>for %G in (diacritic*.txt) do @for /F %g in (%G) do @echo %G:%g
diacritic--852.txt:Řçźý§ě ˇ‚-852
diacritic-1250.txt:ěščřžýáíé1250

==>for %G in (diacritic*.txt) do @for /F %g in (%G) do @echo(%~nG:%g>"unASCII\%gANSI.txt"

==>chcp 852 >nul

==>findstr /R "^" "diacritic-*.txt"
diacritic--852.txt:ěščřžýáíé-852
diacritic-1250.txt:ýÜŔ°×řßÝÚ1250

==>for %G in (diacritic*.txt) do @for /F %g in (%G) do @echo %G:%g
diacritic--852.txt:ěščřžýáíé-852
diacritic-1250.txt:ýÜŔ°×řßÝÚ1250

==>for %G in (diacritic*.txt) do @for /F %g in (%G) do @echo(%~nG:%g>"unASCII\%g-OEM.txt"

==>popd

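We have written the ěščřžýáíé string (followed by the CHCP number) to the following files:

  • the ěščřžýáíé-852 string to the files\diacritic--852.txt file, and
  • the ěščřžýáíé1250 string to the files\diacritic-1250.txt file.

Then we used these strings to create files named by the pattern <String><Chcp><CPID>.txt, where

  • <String> = ěščřžýáíé: the string with diacritics, read from the diacritic-<Chcp>.txt file;
  • <Chcp> = -852 or 1250: the code page under which the diacritic-<Chcp>.txt file was written;
  • <CPID> = -OEM or ANSI: a text abbreviation naming the code page under which this file was written (852 and 1250, respectively).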
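
The four resulting names can be reproduced outside the console by encoding each string under the code page it was written with and decoding the bytes under the code page that was active when it was read back. This is a minimal sketch in Python, assuming its `cp852` and `cp1250` codecs match the Windows code pages; the names `SOURCES`, `READERS`, `reread` and `expected_names` are hypothetical.

```python
# Content of each diacritic-*.txt file and the code page it was written under.
SOURCES = [("ěščřžýáíé-852", "cp852"), ("ěščřžýáíé1250", "cp1250")]
# Code page active when the files were read back, with the suffix used in the name.
READERS = [("cp1250", "ANSI"), ("cp852", "-OEM")]


def reread(text, written_as, read_as):
    """Encode text under one code page and decode the same bytes under another."""
    return text.encode(written_as).decode(read_as)


def expected_names():
    """Build the <String><Chcp><CPID>.txt names for every writer/reader pair."""
    return [
        reread(text, written_as, read_as) + suffix + ".txt"
        for read_as, suffix in READERS
        for text, written_as in SOURCES
    ]


for name in expected_names():
    print(name)
```

Only the two names whose writer and reader code pages agree keep the original ěščřžýáíé prefix. One of the mismatched names contains a no-break space (U+00A0), which a console listing may show as an ordinary space.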

Let's try to use these last four files: copy and paste the following code snippet into the command-line window again:

chcp 437 >nul
dir /B /S "files\unASCII\*.txt"
for %G in (files\unASCII\ěščřžýáíé*.txt) do @echo %G
findstr /S /R "^" "files\unASCII\ěščřžýáíé*.txt"

chcp 1250 >nul
for %G in (files\unASCII\ěščřžýáíé*.txt) do type "%G"
chcp 852 >nul
for %G in (files\unASCII\ěščřžýáíé*.txt) do type "%G"

Output: we can see mojibake again and again.

==>chcp 437 >nul

==>dir /B /S "files\unASCII\*.txt"
d:\bat\files\unASCII\ýÜŔ°×řßÝÚ1250-OEM.txt
d:\bat\files\unASCII\ěščřžýáíé-852-OEM.txt
d:\bat\files\unASCII\ěščřžýáíé1250ANSI.txt
d:\bat\files\unASCII\Řçźý§ě ˇ‚-852ANSI.txt

==>for %G in (files\unASCII\ěščřžýáíé*.txt) do @echo %G
files\unASCII\ěščřžýáíé-852-OEM.txt
files\unASCII\ěščřžýáíé1250ANSI.txt

==>findstr /S /R "^" "files\unASCII\ěščřžýáíé*.txt"

==>
==>chcp 1250 >nul

==>for %G in (files\unASCII\ěščřžýáíé*.txt) do type "%G"

==>type "files\unASCII\ěščřžýáíé-852-OEM.txt"
diacritic--852:Řçźý§ě ˇ‚-852

==>type "files\unASCII\ěščřžýáíé1250ANSI.txt"
diacritic-1250:ěščřžýáíé1250

==>chcp 852 >nul

==>for %G in (files\unASCII\ěščřžýáíé*.txt) do type "%G"

==>type "files\unASCII\ěščřžýáíé-852-OEM.txt"
diacritic--852:ěščřžýáíé-852

==>type "files\unASCII\ěščřžýáíé1250ANSI.txt"
diacritic-1250:ýÜŔ°×řßÝÚ1250

Oops, why does findstr give no output? Let's use

chcp 1250 >nul
findstr /S /R "^" "files\unASCII\*.txt"
chcp 852 >nul
findstr /S /R "^" "files\unASCII\*.txt"

The output shows that findstr causes mojibake not only in file contents but in file names as well:

==>chcp 1250 >nul

==>findstr /S /R "^" "files\unASCII\*.txt"
FINDSTR: Cannot open files\unASCII\ŤsR›zr ˇ‚1250-OEM.txt
FINDSTR: Cannot open files\unASCII\escrzŤ˙­'-852-OEM.txt
FINDSTR: Cannot open files\unASCII\escrzŤ˙­'1250ANSI.txt
FINDSTR: Cannot open files\unASCII\RÎzŤäe?'-852ANSI.txt

==>chcp 852 >nul

==>findstr /S /R "^" "files\unASCII\*.txt"
FINDSTR: Cannot open files\unASCII\ŹsRŤzráíé1250-OEM.txt
FINDSTR: Cannot open files\unASCII\escrzŹ ş'-852-OEM.txt
FINDSTR: Cannot open files\unASCII\escrzŹ ş'1250ANSI.txt
FINDSTR: Cannot open files\unASCII\R╬zŹńeś?'-852ANSI.txt

FYI: CHCP 65001 (UTF-8) does not help either... According to MSDN: Naming Files, Paths, and Namespaces, NTFS object names in Windows appear to be encoded in UTF-16:

On newer file systems, such as NTFS, exFAT, UDFS, and FAT32, Windows stores the long file names on disk in Unicode ... the file system treats path and file names as an opaque sequence of WCHARs.

In addition:

The shell and the file system have different requirements. It is possible to create a path with the Windows API that the shell user interface is not able to interpret properly.
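
Since the names are stored as Unicode, one possible way around the console code page is to create the folders from a program that decodes the list file explicitly and passes Unicode strings to the file system. The sketch below uses Python instead of batch, which is a swapped-in technique and not something the thread itself proposes. The helper names `create_folders` and `demo` are hypothetical, and the cp1250 encoding of folders.txt is an assumption that must be replaced with the real encoding of the list file.

```python
import tempfile
from pathlib import Path


def create_folders(list_file, target_dir, encoding="utf-8"):
    """Create one folder per non-empty line of list_file inside target_dir."""
    target = Path(target_dir)
    created = []
    # Decoding with an explicit encoding avoids any dependence on CHCP.
    with open(list_file, encoding=encoding) as handle:
        for line in handle:
            name = line.strip()
            if not name:
                continue  # skip blank lines
            folder = target / name
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder.name)
    return created


def demo():
    """Write a sample cp1250 list file, create the folders, and list them back."""
    with tempfile.TemporaryDirectory() as tmp:
        list_file = Path(tmp) / "folders.txt"
        # Assumption: the list is stored as cp1250; the escapes spell out the
        # sample characters from the question (with s-cedilla U+015F).
        sample = "\u0103\u00e2\u015f\u00ee\u015f\n\u011b\u0161\u010d\u0159\u017e\u00fd\u00e1\u00ed\u00e9\n"
        list_file.write_text(sample, encoding="cp1250")
        out_dir = Path(tmp) / "out"
        made = create_folders(list_file, out_dir, encoding="cp1250")
        on_disk = sorted(p.name for p in out_dir.iterdir())
        return made, on_disk


if __name__ == "__main__":
    print(demo())
```

The names returned by `demo` are compared as strings, so they are checked without ever being printed through a console code page. Whether a given console window then displays them correctly is a separate question.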