正则表达式 - 从降价字符串中提取所有 headers

Regex - extract all headers from markdown string

我正在使用 gray-matter 来将文件系统中的 .MD 文件解析为字符串。解析器生成的结果是这样的字符串:

\n# Clean-er ReactJS Code - Conditional Rendering\n\n## TL;DR\n\nMove render conditions into appropriately named variables. Abstract the condition logic into a function. This makes the render function code a lot easier to understand, refactor, reuse, test, and think about.\n\n## Introduction\n\nConditional rendering is when a logical operator determines what will be rendered. The following code is from the examples in the official ReactJS documentation. It is one of the simplest examples of conditional rendering that I can think of.\n\n

我现在正在尝试编写一个正则表达式来从字符串中提取所有标题文本。 Headers 在 markdown 中以 # 开头(可以是 1-6),在我的例子中总是以新行结尾。

我试过使用以下正则表达式,但在我的测试字符串上调用它 returns 没有:

const testString = "\n# Clean-er ReactJS Code - Conditional Rendering\n\n## TL;DR\n\nMove render conditions into appropriately named variables. Abstract the condition logic into a function. This makes the render function code a lot easier to understand, refactor, reuse, test, and think about.\n\n## Introduction\n\nConditional rendering is when a logical operator determines what will be rendered. The following code is from the examples in the official ReactJS documentation. It is one of the simplest examples of conditional rendering that I can think of.\n\n"
const HEADING_R = /(?<!#)#{1,6} (.*?)(\r(?:\n)?|\n)/gm;
const headings = HEADING_R.exec(content);

console.log('headings: ', headings);

此控制台将 headings 记录为 null(未找到匹配项)。我正在寻找的结果是:["# Clean-er ReactJS Code - Conditional Rendering", "## TL;DR", "## Introduction"].

我认为正则表达式是错误的,但不知道为什么。

问题是您正在为正则表达式使用文字,不应双重转义反斜杠,因此您可以将其写为 (?<!#)#{1,6} (.*?)(\r(?:\n)?|\n)

您可以缩短捕获所需内容的模式并匹配尾随的换行符,而不是使用后向断言。

(#{1,6} .*)\r?\n

正在检索所有捕获组 1 值:

const testString = "\n# Clean-er ReactJS Code - Conditional Rendering\n\n## TL;DR\n\nMove render conditions into appropriately named variables. Abstract the condition logic into a function. This makes the render function code a lot easier to understand, refactor, reuse, test, and think about.\n\n## Introduction\n\nConditional rendering is when a logical operator determines what will be rendered. The following code is from the examples in the official ReactJS documentation. It is one of the simplest examples of conditional rendering that I can think of.\n\n"
const HEADING_R = /(#{1,6} .*)\r?\n/g;
const headings = Array.from(testString.matchAll(HEADING_R), m => m[1]);
console.log('headings: ', headings);

/#{1,6}.+(?=\n)/g

  • #{1,6} ... 匹配字符 # 至少一次或作为最多 6 个相等字符的序列。

  • .+ 匹配任何字符(行终止符除外)至少一次并尽可能多次(贪婪)

  • 这样做直到 positive lookahead (?=\n) 匹配...

    • 这是... \n ...换行符/line-feed.
  • 使用 g 匹配所有内容的全局修饰符。

编辑

提到

"matches any character (except for line terminators)"

因此像... /#{1,6}.+/g ... 这样的正则表达式应该已经为 OP 的用例完成了工作(不需要积极的前瞻)...

"Headers in markdown start with a # (there can be from 1-6), and in my case always end with a new line."

The result that I am looking for would be: ["# Clean-er ReactJS Code - Conditional Rendering", "## TL;DR", "## Introduction"].

const testString = `\n# Clean-er ReactJS Code - Conditional Rendering\n\n## TL;DR\n\nMove render conditions into appropriately named variables. Abstract the condition logic into a function. This makes the render function code a lot easier to understand, refactor, reuse, test, and think about.\n\n## Introduction\n\nConditional rendering is when a logical operator determines what will be rendered. The following code is from the examples in the official ReactJS documentation. It is one of the simplest examples of conditional rendering that I can think of.\n\n`;

// see...[https://regex101.com/r/n6XQub/2]
const regXHeader = /#{1,6}.+/g

console.log(
  testString.match(regXHeader)
);
.as-console-wrapper { min-height: 100%!important; top: 0; }

奖金

将上述正则表达式重构为例如/(?<flag>#{1,6})\s+(?<content>.+)/g by utilizing named capturing groups alongside with matchAll and a mapping 任务,可以实现类似于下一个示例代码所计算的结果...

const testString = `\n# Clean-er ReactJS Code - Conditional Rendering\n\n## TL;DR\n\nMove render conditions into appropriately named variables. Abstract the condition logic into a function. This makes the render function code a lot easier to understand, refactor, reuse, test, and think about.\n\n## Introduction\n\nConditional rendering is when a logical operator determines what will be rendered. The following code is from the examples in the official ReactJS documentation. It is one of the simplest examples of conditional rendering that I can think of.\n\n`;

// see...[https://regex101.com/r/n6XQub/4]
const regXHeader = /(?<flag>#{1,6})\s+(?<content>.+)/g

console.log(
  Array
    .from(
      testString.matchAll(regXHeader)
    )
    .map(({ groups: { flag, content } }) => ({
      heading: `h${ flag.length }`,
      content,
    }))
);
.as-console-wrapper { min-height: 100%!important; top: 0; }