Docsearch on subpath '/docs' not scraping side navigation

The Docusaurus documentation site https://slovakia-atmo-plan.marvintest.vito.be/docs/ is rendered in docs-only mode.

The Algolia Docsearch scraper does not scrape the root-level pages and instead logs "Ignored: from start url". The problem seems to occur only when the Docusaurus build is nested under {baseUrl}/docs.

Why are these pages being ignored? Here is my docsearch config:

{
  "index_name": "atmoplan-documentation",
  "start_urls": ["https://slovakia-atmo-plan.marvintest.vito.be/docs"],
  "sitemap_urls": ["https://slovakia-atmo-plan.marvintest.vito.be/docs/sitemap.xml"],
  "sitemap_alternate_links": true,
  "stop_urls": ["/tests"],
  "selectors": {
    "lvl0": {
      "selector": "(//ul[contains(@class,'menu__list')]//a[contains(@class, 'menu__link menu__link--sublist menu__link--active')]/text() | //nav[contains(@class, 'navbar')]//a[contains(@class, 'navbar__link--active')]/text())[last()]",
      "type": "xpath",
      "global": true,
      "default_value": "Documentation"
    },
    "lvl1": "header h1",
    "lvl2": "article h2",
    "lvl3": "article h3",
    "lvl4": "article h4",
    "lvl5": "article h5, article td:first-child",
    "lvl6": "article h6",
    "text": "article p, article li, article td:last-child"
  },
  "strip_chars": " .,;:#",
  "custom_settings": {
    "separatorsToIndex": "_",
    "attributesForFaceting": ["language", "version", "type", "docusaurus_tag"],
    "attributesToRetrieve": ["hierarchy", "content", "anchor", "url", "url_without_anchor", "type"]
  },
  "conversation_id": ["833762294"],
  "nb_hits": 46250
}

In your docusaurus.config.js, you should set the url parameter to the actual site where your docs will be hosted. Something like:

module.exports = {
    // The actual site where the docs are hosted
    url: 'https://slovakia-atmo-plan.marvintest.vito.be/docs',
    // […]
};

Docusaurus will use it to generate sitemap.xml, and Algolia will use that sitemap to locate your pages.


Reference: https://docusaurus.io/docs/docusaurus.config.js/#url


Disclaimer

I noticed something odd in your sitemap.xml. For example, the first link is https://www.vito.be/docs/markdown-page, but the URL defined for Algolia is https://slovakia-atmo-plan.marvintest.vito.be/docs.
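
If it helps, a quick way to spot this kind of mismatch is to compare every <loc> entry in the sitemap against the start URL. Below is a minimal sketch in plain Node.js with no dependencies. The helper name findForeignLocs, the sample sitemap, and the /docs/intro path are made up for illustration; only the two hosts come from the question. It is not how the scraper itself decides what to ignore, just a sanity check on the sitemap.

```javascript
// Hypothetical helper (not part of Docusaurus or DocSearch): list the
// sitemap <loc> entries that do not live under the given start URL.
function findForeignLocs(sitemapXml, startUrl) {
  // Drop trailing slashes so '/docs' and '/docs/' compare the same way.
  const prefix = startUrl.replace(/\/+$/, '');
  // Collect the text of every <loc> element.
  const locs = [...sitemapXml.matchAll(/<loc>\s*([^<\s]+)\s*<\/loc>/g)].map(
    (match) => match[1]
  );
  // Keep only entries that are neither the start URL nor below it.
  return locs.filter((loc) => loc !== prefix && !loc.startsWith(prefix + '/'));
}

// Made-up sample that mimics the mismatch described above.
const sampleSitemap = `
<urlset>
  <url><loc>https://www.vito.be/docs/markdown-page</loc></url>
  <url><loc>https://slovakia-atmo-plan.marvintest.vito.be/docs/intro</loc></url>
</urlset>`;

const foreign = findForeignLocs(
  sampleSitemap,
  'https://slovakia-atmo-plan.marvintest.vito.be/docs'
);
console.log(foreign);
```

With this sample, the www.vito.be entry is the only one reported. Running the same check against the real sitemap.xml should show whether the generated links and the start_urls in the docsearch config point at the same host.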