In Windows batch: how do I parse a CSV file whose fields contain commas and double quotes?
I have an input CSV file, ttt.csv. It is comma-delimited, and each field may contain double quotes and commas.
Here is the content of ttt.csv:
"CN=Bar\,Alex,OU=Users,OU=Headquarters,DC=CORP",Bar,Alex,"Barziza,Alex",BARAAA,aaa@email.com
"CN=Boo\,Ryan,OU=Users,OU=Headquarters,DC=CORP",Boo,Ryan,"Boo,Ryan",BABBBB,bbb@email.com
I need to loop over this file and, for each row, extract each of the 6 values and build an SQL INSERT statement for my database.
For row 2, I need to get:
Value1= CN=Boo\,Ryan,OU=Users,OU=Headquarters,DC=CORP
Value2= Boo
Value3= Ryan
Value4= Boo,Ryan
Value5= BABBBB
Value6= bbb@email.com
I tried using delimiters that include the double quote, but it does not seem to work:
set str2="CN=Bar\,Alex,OU=Users,OU=Headquarters,DC=CORP",Bar,Alex,"Barziza,Alex",BARAAA,aaa@email.com
echo %str2%
for /f "tokens=1 delims=(,")" %%a in ("!str2!") do ( set newstr2=%%a )
echo !newstr2!
As I commented above, just use a plain for loop: no /f, no /r, no /d, no /l, just a simple for loop. It splits on the CSV delimiters while treating quoted content as a single token.
@echo off
setlocal enabledelayedexpansion
set str2="CN=Bar\,Alex,OU=Users,OU=Headquarters,DC=CORP",Bar,Alex,"Barziza,Alex",BARAAA,aaa@email.com
echo %str2%
set idx=0
for %%a in (%str2%) do (
set "newstr[!idx!]=%%~a"
set /a idx += 1
)
set newstr
Output:
C:\Users\me\Desktop>test.bat
"CN=Bar\,Alex,OU=Users,OU=Headquarters,DC=CORP",Bar,Alex,"Barziza,Alex",BARAAA,aaa@email.com
newstr[0]=CN=Bar\,Alex,OU=Users,OU=Headquarters,DC=CORP
newstr[1]=Bar
newstr[2]=Alex
newstr[3]=Barziza,Alex
newstr[4]=BARAAA
newstr[5]=aaa@email.com
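To apply the same idea to every row of ttt.csv, one possible sketch wraps the plain for loop in a for /f that reads the file line by line. The table name users and its column names are hypothetical placeholders, and the sketch assumes ttt.csv is in the current directory and that fields contain no unquoted spaces, no exclamation marks, and no single quotes. It only echoes each statement rather than running it.

```batch
@echo off
setlocal enabledelayedexpansion

rem Read ttt.csv one whole line at a time (delims= disables splitting).
for /f "usebackq delims=" %%L in ("ttt.csv") do (
    set idx=0
    rem The plain FOR splits on commas and keeps quoted fields together.
    for %%a in (%%L) do (
        set "val[!idx!]=%%~a"
        set /a idx += 1
    )
    rem users and its column names are hypothetical, for illustration only.
    set "sql=INSERT INTO users (dn, last_name, first_name, display_name, login, email) VALUES ('!val[0]!', '!val[1]!', '!val[2]!', '!val[3]!', '!val[4]!', '!val[5]!');"
    echo !sql!
)
```

Building the statement in a quoted set and echoing it with delayed expansion avoids having to escape the parentheses inside the loop body. Values that may contain single quotes would need escaping before they go into SQL.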
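If your CSV data contains unquoted spaces that should not be treated as token delimiters, you can temporarily convert spaces to underscores before splitting and convert them back afterwards, like this: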
@echo off
setlocal enabledelayedexpansion
set str2="CN=Ryan\,David Paul,OU=Users,OU=Singapore,DC=GLOBAL,DC=CORP",Ryan,David Paul,"Ryan, David Paul",RPAUL123,David@aaad.com
echo %str2%
set idx=0
for %%a in (%str2: =_%) do (
set "str=%%~a"
set "newstr[!idx!]=!str:_= !"
set /a idx += 1
)
set newstr
You can read more on substring substitution if you like. Output:
C:\Users\me\Desktop>test.bat
"CN=Ryan\,David Paul,OU=Users,OU=Singapore,DC=GLOBAL,DC=CORP",Ryan,David Paul,"Ryan, David Paul",RPAUL123,David@aaad.com
newstr[0]=CN=Ryan\,David Paul,OU=Users,OU=Singapore,DC=GLOBAL,DC=CORP
newstr[1]=Ryan
newstr[2]=David Paul
newstr[3]=Ryan, David Paul
newstr[4]=RPAUL123
newstr[5]=David@aaad.com
Of course, if your data already contains underscores, use a character it does not contain: a backtick, tilde, dollar sign, or something else.
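Spaces are not the only thing to watch for. A plain for loop also treats the semicolon and the equals sign as delimiters when they are not quoted, and a token containing * or ? is matched against file names instead of being passed through. The small sketch below uses a made-up sample string to show the delimiter behaviour:

```batch
@echo off
setlocal
rem Hypothetical sample: a=b and c;d are unquoted, the last field is quoted.
set str=a=b,c;d,"e=f;g"
for %%a in (%str%) do echo [%%~a]
```

On a typical cmd.exe this should print a, b, c and d as four separate tokens and e=f;g as a single one, so unquoted fields containing = or ; may need the same temporary-substitution treatment as spaces.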
@echo off
(
echo "CN=Bar\,Alex,OU=Users,OU=Headquarters,DC=CORP",Bar,Alex,"Barziza,Alex",BARAAA,aaa@email.com
echo "CN=Boo\,Ryan,OU=Users,OU=Headquarters,DC=CORP",Boo,Ryan,"Boo,Ryan",BABBBB,bbb@em
)>%tmp%\tmp.csv
for /f tokens^=^1*^ delims^=^" %%i in (%tmp%\tmp.csv) do (
echo value0= "%%i"
for /f tokens^=^1-6^ delims^=^=^,^" %%a in ("%%j") do (
echo value1= %%a&echo value2= %%b&echo value3= %%c,%%d
echo value4= %%e&echo value5= %%f&echo:
)
)
Output:
value0= "CN=Bar\,Alex,OU=Users,OU=Headquarters,DC=CORP"
value1= Bar
value2= Alex
value3= Barziza,Alex
value4= BARAAA
value5= aaa@email.com
value0= "CN=Boo\,Ryan,OU=Users,OU=Headquarters,DC=CORP"
value1= Boo
value2= Ryan
value3= Boo,Ryan
value4= BABBBB
value5= bbb@em