Azure / U-SQL - Regex substitution
I have some data that contains spaces ( ) and hyphens (-), which I would like to convert to underscore characters (_). In other languages (e.g. R), I could write:
var1 <- gsub("\\s+|\\-", "_", var1)
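This looks for several different characters and converts all of them to another character.
Is there a way to do this in U-SQL?
Edit:
I tried the following, which ran without error but did not change the data: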
@t2 = SELECT var1,
             var2,
             var3.Replace("\\s+|\\'|\\-", "_") AS var3
      FROM @t1;
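You are almost there, but you are calling System.String.Replace, which searches for the literal text you pass in, rather than a regular expression. So change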
@t2 = SELECT var1,
             var2,
             var3.Replace("\\s+|\\'|\\-", "_") AS var3
      FROM @t1;
to
@t2 = SELECT var1,
             var2,
             Regex.Replace(var3, "\\s+|\\'|\\-", "_") AS var3
      FROM @t1;
Edit: I am not a regex expert, so I have not verified the expression itself.
// Single-row sample containing both a space and a hyphen.
@someData =
    SELECT * FROM
        ( VALUES
            ("tic tac-toe")
        ) AS T(col1);

// In a regular string literal the backslash must be doubled.
DECLARE @pattern string = "\\s|-";

@result =
    SELECT col1 AS original,
           // Regex-based replacements.
           Regex.Replace(col1, "\\s", "_") AS regex_replaceSpace,
           Regex.Replace(col1, "-", "_") AS regex_replaceHypen,
           // \055 is the octal escape for the hyphen character.
           Regex.Replace(col1, "\\055", "_") AS regex_replaceHypenDecimal,
           Regex.Replace(col1, "\\s|-", "_") AS regex_replaceBoth,
           // A verbatim string (@"...") needs no doubled backslash.
           Regex.Replace(col1, @"\s|-", "_") AS regex_replaceBoth_verbatim,
           Regex.Replace(col1, @pattern, "_") AS regex_replaceBoth_pattern,
           // Literal replacements with String.Replace, one character at a time.
           col1.Replace(" ", "_") AS string_replaceSpace,
           col1.Replace("-", "_") AS string_replaceHypen,
           col1.Replace("-", "_").Replace(" ", "_") AS string_replaceBoth
    FROM @someData;

OUTPUT @result
TO "/Replace.csv"
USING Outputters.Csv(outputHeader: true);
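One more comparison, offered as a minimal sketch rather than a verified script. It is meant to show two things: why the String.Replace attempt in the question ran without error yet changed nothing (it looks for the literal text of the pattern, which never occurs in the data), and how \s differs from \s+ when the data contains runs of spaces. The sample value with two consecutive spaces, the column aliases, and the output path "/ReplaceCompare.csv" are all made up for illustration.

```
// Hypothetical sample row: note the two spaces between "tic" and "tac".
@t1 =
    SELECT * FROM
        ( VALUES
            ("tic  tac-toe")
        ) AS T(var3);

@t2 =
    SELECT var3 AS original,
           // String.Replace searches for the literal characters \s+|- and finds none,
           // so the value comes back unchanged.
           var3.Replace(@"\s+|-", "_") AS string_replace,
           // \s replaces each whitespace character separately.
           Regex.Replace(var3, @"\s|-", "_") AS regex_each_char,
           // \s+ replaces a whole run of whitespace with a single underscore.
           Regex.Replace(var3, @"\s+|-", "_") AS regex_collapse_runs
    FROM @t1;

// Hypothetical output location.
OUTPUT @t2
TO "/ReplaceCompare.csv"
USING Outputters.Csv(outputHeader: true);
```

Under these assumptions, string_replace should remain "tic  tac-toe", regex_each_char should become "tic__tac_toe" (two underscores), and regex_collapse_runs should become "tic_tac_toe". Which of the two regex variants is appropriate depends on whether repeated spaces in the data should collapse into one underscore.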