XRegExp.matchRecursive - add callback functionality to allow for multiple identical instances

Question

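I'm using XRegExp to parse a text file, specifically to find the content between two sets of predefined comment tags. I can't change these tags, so I need to find a way to make this work with the text as provided.

I'm finding a list of all the tags with the following regex (the example in the link also includes sample content): https://regex101.com/r/kCwyok/1/

I then use XRegExp's matchRecursive function to get all the content between the opening and closing tags, which works fine for the most part.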

// Map the list of component tags and extract data from them
return generateComponentList(data).map((component) => {
    console.log(chalk.blue('Processing', component[1], 'component.'))
    const contents = XRegExp.matchRecursive(data, '<!-- @\\[' + component[1] + '\\][.\\w-_+]* -->', '<!-- @\\[/' + component[1] + '\\] -->', 'g')
    let body = ''
    let classes = ''

    contents.map((content) => {
      const filteredContent = filterContent(content)
      body = filteredContent.value
      classes = cleanClasses(component[2])
      console.log(chalk.green(component[1], 'processing complete.'))
    })

    // Output the content as a JSON object
    return {
      componentName: component[1],
      classes,
      body
    }
  })

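The issue I'm having is that the CodeExample tag exists twice: the tags are identical but the content differs. Because matchRecursive doesn't seem to have a callback, it just runs the match across all instances of that component at once, so it doesn't matter whether there are 1 or 10 CodeExample instances; the content of every one of them is returned.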
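A reduced illustration of the problem, as a sketch only: it uses a native regular expression in place of XRegExp.matchRecursive, and the `sample` text and the `bodiesFor` helper are made up for this example. Every CodeExample body comes back in one array, with nothing tying a body to the tag instance that produced it.

```javascript
// Hypothetical sample input with two identically tagged CodeExample blocks.
const sample = [
  '<!-- @[CodeExample] -->',
  'first body',
  '<!-- @[/CodeExample] -->',
  '<!-- @[NoteBlock].warning -->',
  'note body',
  '<!-- @[/NoteBlock] -->',
  '<!-- @[CodeExample] -->',
  'second body',
  '<!-- @[/CodeExample] -->'
].join('\n')

// Hypothetical helper: collect the body of every block with the given name.
// Assumes blocks of the same name are never nested inside each other.
const bodiesFor = (text, name) => {
  const re = new RegExp(
    String.raw`<!-- @\[${name}\][.\w+-]* -->([\s\S]*?)<!-- @\[/${name}\] -->`,
    'g'
  )
  return Array.from(text.matchAll(re), match => match[1].trim())
}

// Both bodies come back together: ['first body', 'second body']
console.log(bodiesFor(sample, 'CodeExample'))
```

Calling this once per tag found in the file means the second CodeExample tag receives exactly the same array as the first.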

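Is there any way to actually add some kind of callback to matchRecursive? Failing that, is there a way to let JavaScript know which CodeExample instance is being looked at, so that I can reference the array position directly? I'm assuming XRegExp knows which numbered CodeExample tag it is looking at, so is there a way to capture that?

For clarity, here is the full code: https://pastebin.com/2MpdvdNA

My desired output is a JSON file with the following data: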

[
{
 componentName: "hero",
 classes: "",
 body: "# Creating new contexts"
},
{
 componentName: "CodeExample",
 classes: "",
 body: "## Usage example

    ```javascript
      Import { ICON_NAME } from 'Icons'
    ```"
},
{
 componentName: "ArticleSection",
 classes: "",
 body: // This section is massive and not relevant to question so skipping
},
{
 componentName: "NoteBlock",
 classes: ["warning"],
 body: "> #### Be Careful
> Eu laboris eiusmod ut exercitation minim laboris ipsum magna consectetur est [commodo](/nope)."
},
{
 componentName: "CodeExample",
 classes: "",
 body: "#### Code example
```javascript
  class ScrollingList extends React.Component {
      constructor(props) {
        super(props);
        this.listRef = React.createRef();
      }

      render() {
        return (
          &#60;div ref={this.listRef}&#62;{/* ...contents... */}&#60;/div&#62;
        );
      }
    }
```"
}
// Skipping the rest as not relevant to question
]

Apologies if I haven't explained this clearly; I've been looking at this for too long.

Answer 1

This is how I solved it in the end:

import XRegExp from 'xregexp'

const extractComponents = data => {
  const components = []
  // String.raw keeps the backslashes, so \[ and \w reach XRegExp as regex escapes
  const re = String.raw`<!-- @\[(\w+)\]([.\w-_+]+)* -->`

  XRegExp.forEach(data, XRegExp(re, 'g'), match => {
    const name = match[1]
    const classes = match[2]

    // Zero-based index of this occurrence among the components seen so far with the same name
    const instance = components.filter(item => item.name === name).length

    components.push({
      name,
      classes,
      instance
    })
  })

  return components
}

const cleanClasses = classes => {
  const filteredClasses = classes ? classes.split('.') : []
  filteredClasses.shift()

  return filteredClasses
}

const extractContent = (data, component) => {
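  // String.raw keeps the backslashes, so \[ and \w reach XRegExp as regex escapes
  const re = String.raw`<!-- @\[${component.name}\][.\w-_+]* -->`
  const re2 = String.raw`<!-- @\[/${component.name}\] -->`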

  return XRegExp.matchRecursive(
    data, 
    re, re2, 'g'
  )[component.instance]
}

const parseComponents = data => {
  return extractComponents(data).map(component => {
    return {
      componentName: component.name,
      classes: cleanClasses(component.classes),
      body: extractContent(data, component)
    }
  })
}

export default parseComponents
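
For comparison, the same per-instance pairing can be sketched in a single pass without XRegExp. This is not the approach used above but a swapped-in technique: a native regular expression with a backreference (`\1`) ties each opening tag to the closing tag of the same name, so every match already carries its own body. It assumes a component never contains another component of the same name, which is the case matchRecursive exists to handle. `parseComponentsFlat`, `TAG_PAIR` and the sample text are hypothetical names for illustration.

```javascript
// Single-pass sketch: \1 forces the closing tag to repeat the opening tag's name.
// Assumption: a component never contains another component of the same name.
const TAG_PAIR = /<!-- @\[(\w+)\]([.\w+-]*) -->([\s\S]*?)<!-- @\[\/\1\] -->/g

// Hypothetical helper mirroring the shape returned by parseComponents.
const parseComponentsFlat = text =>
  Array.from(text.matchAll(TAG_PAIR), ([, componentName, classes, body]) => ({
    componentName,
    // '.warning.big' -> ['warning', 'big']; an empty suffix gives []
    classes: classes.split('.').filter(Boolean),
    body: body.trim()
  }))

// Hypothetical input with two identically tagged CodeExample blocks.
const docText = [
  '<!-- @[CodeExample] -->',
  'first body',
  '<!-- @[/CodeExample] -->',
  '<!-- @[NoteBlock].warning -->',
  'note body',
  '<!-- @[/NoteBlock] -->',
  '<!-- @[CodeExample] -->',
  'second body',
  '<!-- @[/CodeExample] -->'
].join('\n')

console.log(parseComponentsFlat(docText))
```

One trade-off to be aware of: a component nested inside a differently named component is consumed as part of the outer body and is not reported as its own entry, so this sketch only suits flat documents.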


javascript

regex

xregexp