Azure / U-SQL - Regex substitution

I have some data containing spaces (" ") and hyphens ("-") that I want to convert to underscore characters ("_"). In other languages (e.g. R), I could write something like:

var1 <- gsub("\\s+|-", "_", var1)

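This matches several different characters and converts all of them to another character.

Is there a way to do this in U-SQL?

Edit:

I tried the following. It runs without errors, but it does not change the data: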

@t2 = SELECT var1,
           var2,
           var3.Replace("\\s+|\\'|\\-","_") AS var3          
    FROM @t1;

You are almost there, but you are using System.String.Replace, which replaces a literal substring, rather than a regular expression. So change

@t2 = SELECT var1,
       var2,
       var3.Replace("\\s+|\\'|\\-","_") AS var3          
FROM @t1;

to

@t2 = SELECT var1,
       var2,
       Regex.Replace(var3, "\\s+|\\'|\\-", "_") AS var3          
FROM @t1;

Edit: I am not a regex expert, so I have not verified the expression itself.
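
To make the difference concrete, here is a minimal sketch, not a verified production script. The sample value, the column aliases, and the output path /ReplaceSketch.csv are made up for illustration. It assumes Regex can be used unqualified, as in the snippets in this thread, and it uses verbatim strings (@"...") so the backslash in \s reaches the regex engine unchanged.

```usql
// Hypothetical sample row: repeated spaces plus a hyphen surrounded by spaces.
@sample =
    SELECT * FROM
        ( VALUES
        ("tic  tac - toe")
        ) AS T(col1);

// literal_replace:  String.Replace searches for the literal text \s+|- , which
//                   does not occur in the value, so the value comes back unchanged.
// regex_each_match: each run of whitespace and each hyphen becomes one underscore,
//                   giving tic_tac___toe for the sample value.
// regex_collapsed:  a mixed run of whitespace and hyphens becomes a single
//                   underscore, giving tic_tac_toe for the sample value.
@compared =
    SELECT col1 AS original,
           col1.Replace(@"\s+|-", "_") AS literal_replace,
           Regex.Replace(col1, @"\s+|-", "_") AS regex_each_match,
           Regex.Replace(col1, @"[\s-]+", "_") AS regex_collapsed
    FROM @sample;

OUTPUT @compared
TO "/ReplaceSketch.csv"
USING Outputters.Csv(outputHeader: true);
```

Whether you want one underscore per matched character group or one per mixed run depends on your data; the character class [\s-]+ is just one way to get the collapsed form.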

@someData =
SELECT * FROM
    ( VALUES
    ("tic tac-toe")
    ) AS T(col1);

DECLARE  @pattern string = @"\s|-";

@result =
SELECT  col1 AS original,
        Regex.Replace(col1,  "\\s",    "_") AS regex_replaceSpace,
        Regex.Replace(col1,  "-",    "_") AS regex_replaceHyphen,
        Regex.Replace(col1,  "\\055",    "_") AS regex_replaceHyphenOctal,
        Regex.Replace(col1,  "\\s|-",    "_") AS regex_replaceBoth,
        Regex.Replace(col1,  @"\s|-",    "_") AS regex_replaceBoth_verbatim,
        Regex.Replace(col1,  @pattern,    "_") AS regex_replaceBoth_pattern,

        col1.Replace(" ", "_") AS string_replaceSpace,
        col1.Replace("-", "_") AS string_replaceHyphen,
        col1.Replace("-", "_").Replace(" ", "_") AS string_replaceBoth
FROM @someData;

OUTPUT @result
TO "/Replace.csv"
USING Outputters.Csv(outputHeader: true);