How to extract text between triple quotes using a regular expression in Python

Question

I have the following raw string:

s = "###Sample Input\r\n```\r\n3\r\n100 400 1000 1200\r\n100 450 1000 1350\r\n150 400 1200 1200\r\n```"

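I want to extract the text between the triple quotes (the ``` delimiters), i.e. '3\r\n100 400 1000 1200\r\n100 450 1000 1350\r\n150 400 1200 1200\r\n'

I first converted this raw string to a Python string and then applied the following pattern: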

pattern = r"Sample Input/s/s('''.*''')"
match = re.findall(pattern, s)
print(match)

But I only get an empty list as output. What is the correct regular expression to extract the text between the triple quotes in this case?

Answer 1

You can use:

yourstring.split("```")[1]

Note that this gives you the text between the first two occurrences of ```.
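As a minimal sketch of the split approach, the snippet below wraps it in a small helper that returns None when the string contains fewer than two delimiters. The helper name `between_first_fences` and the constant `FENCE` are hypothetical names introduced only for this illustration; the delimiter is built with `"`" * 3` purely so it stays easy to read.

```python
# FENCE is an illustrative name for the three-backtick delimiter.
FENCE = "`" * 3

s = ("###Sample Input\r\n" + FENCE +
     "\r\n3\r\n100 400 1000 1200\r\n100 450 1000 1350\r\n150 400 1200 1200\r\n" + FENCE)


def between_first_fences(text, fence=FENCE):
    """Return the text between the first two fences, or None if there are fewer than two."""
    parts = text.split(fence)
    # Two fences split the string into at least three parts.
    if len(parts) < 3:
        return None
    return parts[1]


block = between_first_fences(s)
print(repr(block))

# The result still starts with the line break that follows the opening fence;
# lstrip removes any leading \r or \n characters.
trimmed = block.lstrip("\r\n")
print(repr(trimmed))
```

The explicit length check avoids the IndexError that a bare `[1]` would raise on a string with no delimiter at all.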

Answer 2

Use

```([\w\W]*?)```

See the regex proof.

Explanation

--------------------------------------------------------------------------------
  ```                      '```'
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [\w\W]*?                 any character of: word characters (a-z,
                             A-Z, 0-9, _), non-word characters (all
                             but a-z, A-Z, 0-9, _) (0 or more times
                             (matching the least amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  ```                      '```'
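To illustrate two points from the explanation, the sketch below uses a made-up string with two delimited blocks (the variable `text` and its contents are assumptions, not part of the question). It shows that `[\w\W]*?` behaves like `.*?` combined with the `re.DOTALL` flag, and that dropping the `?` makes the match greedy, so it runs from the first delimiter to the last one.

```python
import re

# FENCE is an illustrative name for the three-backtick delimiter.
FENCE = "`" * 3

# Hypothetical input with two delimited blocks, the first spanning two lines.
text = FENCE + "a\nb" + FENCE + " middle " + FENCE + "c" + FENCE

# Lazy match with the [\w\W] character class: matches any character, newlines included.
lazy_class = re.findall(FENCE + r"([\w\W]*?)" + FENCE, text)

# Same idea with '.', which only matches newlines when re.DOTALL is set.
lazy_dotall = re.findall(FENCE + r"(.*?)" + FENCE, text, flags=re.DOTALL)

# Greedy variant: a single match from the first fence to the last.
greedy = re.findall(FENCE + r"([\w\W]*)" + FENCE, text)

print(lazy_class)   # ['a\nb', 'c']
print(lazy_dotall)  # ['a\nb', 'c']
print(greedy)       # ['a\nb``` middle ```c']
```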

Python code:

import re

s = "###Sample Input\r\n```\r\n3\r\n100 400 1000 1200\r\n100 450 1000 1350\r\n150 400 1200 1200\r\n```"
# Raw string so that \w and \W reach the regex engine unchanged
matches = [m.group(1) for m in re.finditer(r"```([\w\W]*?)```", s)]
print(matches)

Result: ['\r\n3\r\n100 400 1000 1200\r\n100 450 1000 1350\r\n150 400 1200 1200\r\n']
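The pattern in the question returns an empty list for three reasons: it writes `/s` where the whitespace class `\s` was intended, it looks for `'''` although the string is delimited by backticks, and `.` does not match line breaks unless `re.DOTALL` is set. The sketch below is one possible correction that keeps the `Sample Input` anchor and also consumes the line break after the opening delimiter, so the captured group starts at `3` as the question asks. `FENCE` is a hypothetical helper constant, and the sketch assumes the opening delimiter is always followed directly by a line break.

```python
import re

# FENCE is an illustrative name for the three-backtick delimiter.
FENCE = "`" * 3

s = ("###Sample Input\r\n" + FENCE +
     "\r\n3\r\n100 400 1000 1200\r\n100 450 1000 1350\r\n150 400 1200 1200\r\n" + FENCE)

# \s+   : whitespace between the heading and the opening fence
# \r?\n : the line break right after the opening fence (kept out of the group)
# (.*?) : lazily capture everything up to the closing fence
pattern = r"Sample Input\s+" + FENCE + r"\r?\n(.*?)" + FENCE

match = re.findall(pattern, s, flags=re.DOTALL)
print(match)
```

If the heading is not needed as an anchor, the `Sample Input\s+` prefix can simply be dropped from the pattern.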

python

regex

python-re