Find and replace text within a csv using partial and wildcard up to the first delimiter

Question

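I have a csv generated from SQL output, and I am trying to replace part of a string in the csv with a generic string. I have tried FART (always makes me chuckle), FINDSTR and POWERSHELL, but I think my skills are lacking, and the caveats I have stipulated make this very hard to Google.

The txt file looks like this (sample data).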

course_id,user_id,role,status
2122-DAC00002,123456,sometxt,active
2122-DAC00002,13456,sometxt,active
2122-DAC00010/1,987654,sometxt,active
2122-DAC00010,55669988,sometxt,active
2122-DAC00010/2,112233,sometxt,active
2122-DAC00010,852349,sometxt,active

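The headers can be ignored. The first field is the part I need to change en masse: search for 2122-* up to the first , (the character length of 2122-* can vary slightly, but it always stops at the , delimiter), then replace every first occurrence of 2122-* with 2122-GCE.

So the final output would be: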

course_id,user_id,role,status
2122-GCE,123456,sometxt,active
2122-GCE,13456,sometxt,active
2122-GCE,987654,sometxt,active
2122-GCE,55669988,sometxt,active
2122-GCE,112233,sometxt,active
2122-GCE,852349,sometxt,active

I need to automate this, so a .bat file or a .ps1 file would be great.

Hope this makes sense?

[EDIT/]

Sorry, I missed out my code attempts.

My findstr attempt:

findstr /V /i /R '2122-.*' '2122-GCE' "E:\path to file\file1.csv" > "E:\path to file\output3.csv"

findstr output:

course_id,user_id,role,status
2122-GCENAC00025,123456,sometxt,active
2122-GCENAC00025,568974,sometxt,active
2122-GCENAC00025,223366,sometxt,active
2122-GCENAC00025,987654,sometxt,active

As you can see above, it is prefixing rather than replacing.

My FART attempt:

E:\path to\fart "E:\path to file\file1.csv" 2122-N* 2122-GCE
E:\path to\fart "E:\path to file\output3.csv" 2122-D? 2122-GCE

My PS1 attempt was made in the ISE, which I closed without saving.

Edit: I still have a PS window open:

((Get-Content -path E:\path to file\file1.csv -Raw) -replace '2122-*','2122-GCE') | Set-Content -Path E:\path to file\file2.csv

Some iterations of the replace command: -replace '[^2122]*'

type file1.csv | ForEach-Object { $_ -replace "2122-*", "2122-GCE" } | Set-Content file2.csv

Answer 1

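It looks like the first data value is always present and non-empty, never begins with ;, and must always be replaced by the same value. It also looks like the second data column contains a value on every data row, so there is never ,, after the first data value in a data row.

Under these conditions the commented batch file below can be used: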

@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "SourceFile=E:\path to file\file1.csv"
set "OutputFile=E:\path to file\file2.csv"
if not exist "%SourceFile%" goto :EOF

rem Read just the header row from source CSV file and
rem create the output CSV file with just this line.
for /F "usebackq delims=" %%I in ("%SourceFile%") do (
    >"%OutputFile%" echo(%%I
    goto DataRows
)

rem Process all data rows in the source file, skipping the header row.
rem Each line is split into two substrings using the comma as delimiter:
rem the first substring is assigned to loop variable I and not used
rem further, and everything after the first series of commas is assigned
rem to the next loop variable J according to the ASCII table. The command
rem ECHO outputs the fixed first data value, a comma, and the other data
rem values of each data row. This output is appended to the just created
rem output CSV file.

:DataRows
(for /F "usebackq skip=1 tokens=1* delims=," %%I in ("%SourceFile%") do echo 2122-GCE,%%J)>>"%OutputFile%"

endlocal

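To understand the commands used and how they work, open a command prompt window, execute the following commands there, and carefully read all help pages displayed for each command.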

echo /?
endlocal /?
for /?
goto /?
if /?
rem /?
set /?
setlocal /?
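
For the .ps1 route the question also asks about, the PowerShell attempts most likely failed because -replace takes a regular expression, not a wildcard: '2122-*' means "2122 followed by zero or more hyphens", so only the prefix is matched and the rest of the field is left in place. A minimal sketch of one possible fix follows, assuming every data row starts with 2122- and the file is read line by line (not with -Raw). The function name Convert-CourseId and its parameters are hypothetical, introduced only for illustration.

```powershell
# Hypothetical helper: replaces the first field of each data row.
function Convert-CourseId {
    param(
        [string[]]$Line,
        [string]$Replacement = '2122-GCE'
    )
    foreach ($l in $Line) {
        # ^2122-  anchors at the start of the line, so only the first field can match.
        # [^,]*   consumes everything up to, but not including, the first comma.
        # The header row does not start with 2122- and passes through unchanged.
        $l -replace '^2122-[^,]*', $Replacement
    }
}

# Paths taken from the question; adjust as needed.
$source = 'E:\path to file\file1.csv'
$output = 'E:\path to file\file2.csv'
if (Test-Path -LiteralPath $source) {
    Convert-CourseId -Line (Get-Content -LiteralPath $source) |
        Set-Content -LiteralPath $output
}
```

Unlike the batch file above, this sketch does not depend on the second column being non-empty, because the pattern stops at the first comma rather than skipping a series of commas.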


batch-file

findstr