Regex: Capture the Text Between Lines, Including Newline, Space, and Underscore Characters

Question

I extracted some text from a PDF file and read it into a string:

...

Fabric Business Of the Cloths 

4 Description of the property being purchased 
______________________________________________________________________________

...

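I want to extract the words on the line just before "4 Description of the property being purchased", without anything above that line and without the underscores below it.

I tried the regular expression /^[^4]*/, but it returns null.

What would be a suitable regular expression for this?

Thanks.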

Answer 1

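Your regular expression works; just remove the leading and trailing /. In .NET the pattern is passed as a plain string, so the slashes are treated as literal characters to match. A literal / can never come before the start-of-string anchor ^, which is why the match fails.

Example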

    using System;
    using System.Text.RegularExpressions;

    private void TestRegex()
    {
        string s = "...\n Fabric Business Of the Cloths\n                         4 Description of the property being purchased\n____________________________________________________________________________\n ...";
        // Pass the bare pattern: match everything from the start of the string up to the first '4'.
        Regex regex = new Regex("^[^4]*");
        // Not this: the slashes would be matched as literal characters.
        //Regex regex = new Regex("/^[^4]*/");
        Match match = regex.Match(s, 0);
        if (match.Success)
        {
            Console.WriteLine(match.Value);
        }
    }

Output

...
 Fabric Business Of the Cloths
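
Note that ^[^4]* captures everything from the start of the string up to the first 4, so the output still contains the earlier lines, and it would stop too early if a 4 appeared anywhere above the heading. If only the single line before the heading is wanted, one possible approach is to anchor on the heading text itself. The following is a minimal sketch, not part of the original answer: the class name LineBeforeHeading, the helper ExtractLineBefore, and the heading prefix "4 Description" are assumptions chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineBeforeHeading
{
    // Hypothetical helper: returns the last non-blank line that precedes the
    // line starting with `heading`, trimmed of surrounding spaces and tabs.
    // Returns an empty string when no such line is found.
    public static string ExtractLineBefore(string text, string heading)
    {
        string pattern =
            @"^[ \t]*(\S.*?)[ \t]*\r?\n" +       // group 1: the wanted line, without padding
            @"(?:[ \t]*\r?\n)*" +                // any blank lines in between
            @"[ \t]*" + Regex.Escape(heading);   // the heading line, possibly indented

        // Multiline makes ^ match at the start of every line, not only at the start of the string.
        Match match = Regex.Match(text, pattern, RegexOptions.Multiline);
        return match.Success ? match.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        string s = "...\n Fabric Business Of the Cloths \n\n   4 Description of the property being purchased\n______\n ...";
        Console.WriteLine(ExtractLineBefore(s, "4 Description")); // prints "Fabric Business Of the Cloths"
    }
}
```

Anchoring on the heading text rather than on the digit 4 keeps the match from depending on what appears earlier in the document, and Regex.Escape guards against heading text that contains regex metacharacters.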

.net

c#

vb.net

uipath