Creating nested objects from a flat array of strings based on a number of specific characters

Question

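I have a large text from which I extracted heading strings using a regular expression. Each heading starts with 1-6 hash characters (#). Here is an example of the input array: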

  const content = [
    "#1",
    "##1a",
    "###1a1",
    "###1a2",
    "##1b",
    "#2",
    "#3",
    "##3a",
    "##3b",
    "#4",
  ];

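The heading level (the number of hash characters at the start of the string) describes where a heading sits in the chapter hierarchy. I want to parse my input into an array of heading objects, each containing the heading text without the hash characters and the chapters nested under that heading. The desired output for the array above is: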

export interface Heading {
  chapters: Heading[];
  text: string;
}

const headings: Heading[] = [
  {
    text: "1",
    chapters: [
      {
        text: "1a",
        chapters: [
          { text: "1a1", chapters: [] },
          { text: "1a2", chapters: [] },
        ],
      },
      { text: "1b", chapters: [] },
    ],
  },
  { text: "2", chapters: [] },
  {
    text: "3",
    chapters: [
      { text: "3a", chapters: [] },
      { text: "3b", chapters: [] },
    ],
  },
  { text: "4", chapters: [] },
];

I tried writing a function that parses the strings, but I got stuck on how to know which heading in the output the current string belongs to:

export const getHeadings = (content: string[]): Heading[] => {
  let headingLevel = 2;
  let headingIndex = 0;
  const allHeadings = content.reduce((acc, currentHeading) => {
    const hashTagsCount = countHastags(currentHeading);

    const sanitizedHeading = currentHeading.replace(/#/g, "").trim();
    const heading = {
      chapters: [],
      text: sanitizedHeading,
    };

    if (hashTagsCount === headingLevel) {
      headingIndex = headingIndex + 1;
    } else {
      headingIndex = 0;
    }

    headingLevel = hashTagsCount;
    if (hashTagsCount === 2) {
      acc.push(heading);
    } else if (hashTagsCount === 3) {
      if (acc.length === 0) {
        return acc;
      }
      if (acc.length === 1) {
        acc[acc.length - 1]["chapters"].push(heading);
      }
    } else if (acc.length === 2) {
      acc[acc.length - 1]["chapters"][headingIndex]["chapters"].push(heading);
    } else if (acc.length === 3) {
      acc[acc.length - 1]["chapters"][headingIndex]["chapters"][headingIndex][
        "chapters"
      ].push(heading);
    }
    return acc;
  }, []);
  return allHeadings;
};

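While this works for very simple cases, it does not scale and it hard-codes the heading levels (via the if statements). How can I rewrite it so that the number of levels (hash characters) does not matter?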

Answer 1

With a reduce-based approach, one needs to keep track of the correct (nested) chapters array that each new chapter item has to be pushed into.

The accumulator can therefore be an object which, in addition to the result array, carries an index/map of the chapters arrays for each nesting level being tracked.

Each heading string being reduced is split into its '#' (hash) based flag and its text content. This is done with the following regex, which features named capturing groups: /^(?<flag>#+)\s*(?<content>.*?)\s*$/. The number of hashes (flag.length) gives the current nesting level.
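As a quick illustration of that split, here is a minimal sketch that applies the same pattern to a few strings. The helper name `splitHeading` and the `level`/`text` property names are made up for this example and are not part of the answer's code.

```javascript
// Same pattern as in the answer: leading hashes go to `flag`,
// the whitespace-trimmed remainder goes to `content`.
const HEADING_PATTERN = /^(?<flag>#+)\s*(?<content>.*?)\s*$/;

// Hypothetical helper, used only for this illustration.
function splitHeading(heading) {
  // A non-matching string yields `null`, so fall back to empty defaults.
  const { flag = '', content = '' } =
    HEADING_PATTERN.exec(heading)?.groups ?? {};
  return { level: flag.length, text: content };
}

console.log(splitHeading('##  fox jumps (1a) ')); // level 2, text 'fox jumps (1a)'
console.log(splitHeading('####  '));              // level 4, empty text
console.log(splitHeading('no hashes'));           // level 0, empty text
```

A level of 0 means the string did not start with a hash, which is the case the reducer below guards against with its `nestingLevel >= 1` check.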

function traceAndAggregateChapterHierarchy({ chaptersMap = {}, result }, heading) {
  const {
    flag = '',
    content = '',
  } = (/^(?<flag>#+)\s*(?<content>.*?)\s*$/)
    .exec(heading)
    ?.groups ?? {};

  const nestingLevel = flag.length;

  // ensure a valid `heading` format.
  if (nestingLevel >= 1) {

    let chapters;
    if (nestingLevel === 1) {

      // reset map.
      chaptersMap = {};
      // level-1 chapter items need to be pushed into `result`.
      chapters = result;
    } else {
      // create/access the deep nesting level specific `chapters` array.
      chapters = (chaptersMap[nestingLevel] ??= []);
    }
    // create a new chapter item.
    const chapterItem = {
      text: content || '$$ missing header content $$',
      chapters: [],
    };
    // create/reassign the next level's `chapters` array.
    chaptersMap[nestingLevel + 1] = chapterItem.chapters;

    // push new item into the correct `chapters` array.
    chapters.push(chapterItem);
  }
  return { chaptersMap, result };
}

const content = [
  "#  The quick brown (1) ",
  "## fox jumps (1a)",
  "###over (1a1)",
  "####  ",
  "###the (1a2)",
  "## lazy dog (1b)",
  "# Foo bar (2)",
  "# Baz biz (3)",
  "##buzz (3a)  ",
  "##booz (3b)  ",
  "# Lorem ipsum (4) ",
  "##",
];
const { result: headings } = content
  .reduce(traceAndAggregateChapterHierarchy, { result: [] });

console.log({ content, headings });
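A closely related variant, which is not the answer's own code, keeps a stack of the currently open ancestor headings instead of a per-level map. The sketch below assumes that a heading which skips a level (for example "###" directly under "#") should attach to the nearest shallower heading. The name `buildHeadingTree` is hypothetical.

```javascript
// Hypothetical helper: builds the tree with a stack of open ancestors.
function buildHeadingTree(lines) {
  // Virtual level-0 parent that collects all top-level headings.
  const root = { level: 0, chapters: [] };
  const stack = [root];

  for (const line of lines) {
    const match = /^(#+)\s*(.*?)\s*$/.exec(line);
    if (!match) continue; // skip strings that do not start with '#'

    const level = match[1].length;
    const node = { text: match[2], chapters: [] };

    // Close every open heading at the same or a deeper level.
    while (stack.length > 1 && stack[stack.length - 1].level >= level) {
      stack.pop();
    }
    // The top of the stack is now the nearest shallower heading (or the root).
    stack[stack.length - 1].chapters.push(node);
    stack.push({ level, chapters: node.chapters });
  }
  return root.chapters;
}

console.log(JSON.stringify(buildHeadingTree(['#1', '##1a', '#2'])));
// [{"text":"1","chapters":[{"text":"1a","chapters":[]}]},{"text":"2","chapters":[]}]
```

Because the stack is unwound on every shallower heading, no stale entry for a deeper level can survive, which is the one case where a per-level map needs extra care.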


Answer 2

A short solution without mutable state :) It works by recursively removing the first # and grouping the headings.

const content = [
  "#1",
  "##1a",
  "###1a1",
  "###1a2",
  "##1b",
  "#2",
  "#3",
  "##3a",
  "##3b",
  "#4",
];

const getNesting = (arr) =>
  arr
    .map((str) => str.slice(1)) // remove first #
    .reduce(
      (acc, cur) =>
      // group heading level
        cur.match(/^#/)
          ? [...acc.slice(0, -1), acc.at(-1).concat(cur)]
          : [...acc, [cur]],
      []
    )
    .map(([text, ...subh]) => ({
      // recursive call
      text,
      chapters: getNesting(subh), // an empty rest array simply yields []
    }));

console.log(JSON.stringify(getNesting(content)));
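To make the grouping step easier to follow, the sketch below isolates it from the recursion and prints the intermediate groups. The name `groupByTopLevel` is hypothetical. Like the answer's code, it assumes the first string in the array is a top-level heading; otherwise `acc.at(-1)` would be undefined.

```javascript
// Hypothetical helper isolating the grouping step: strip one leading '#',
// then start a new group at every string that no longer begins with '#'.
const groupByTopLevel = (arr) =>
  arr
    .map((str) => str.slice(1)) // remove first #
    .reduce(
      (acc, cur) =>
        cur.startsWith('#')
          ? [...acc.slice(0, -1), acc.at(-1).concat(cur)] // append to last group
          : [...acc, [cur]], // open a new group
      []
    );

console.log(JSON.stringify(groupByTopLevel(['#1', '##1a', '###1a1', '##1b', '#2'])));
// [["1","#1a","##1a1","#1b"],["2"]]
```

The first element of each group becomes the `text`, and the remaining elements, each now one `#` shorter, are what gets passed to the recursive call.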


javascript

arrays

mapping

algorithm

reduce