How can I retrieve all the child pages in Confluence by a parent page title?

I know how to retrieve a page by title in Confluence:

res = requests.get(BASE_URL + "/confluence/rest/api/content", params={"title": "parent page title"} , auth=("username", "pass"))

https://developer.atlassian.com/confdev/confluence-rest-api/confluence-rest-api-examples

How can I retrieve all the child pages, given the parent page title?

As far as I know, you cannot search for child pages by the parent's title. You need to search using the parent's ID.

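If you only have the parent's title, first make a call to get the parent's ID from the title:

/rest/api/content/search?cql=title=<parentTitle>

Then make a second call with that ID to retrieve the child pages:

/rest/api/content/search?cql=parent=<parentId>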
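The two search calls can be chained in Python. The following is a minimal sketch, not a definitive implementation. The function names (`cql_quote`, `find_parent_id`, `find_child_pages`, `children_by_parent_title`) are hypothetical, `session` is assumed to be a `requests.Session` with `auth` already set, and `base_url` is assumed to include any context path such as `/confluence`. Quoting the title and backslash-escaping embedded quotes is an assumption about how CQL expects titles that contain spaces or quotes.

```python
def cql_quote(value):
    """Wrap a value in double quotes for CQL (escaping rule is an assumption)."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def find_parent_id(session, base_url, parent_title):
    """Step 1: resolve a page title to its content ID (None if nothing matches)."""
    response = session.get(
        f"{base_url}/rest/api/content/search",
        params={"cql": f"type=page and title={cql_quote(parent_title)}"},
    )
    response.raise_for_status()
    results = response.json().get("results", [])
    # Titles are only unique within a space, so this simply takes the first match.
    return results[0]["id"] if results else None


def find_child_pages(session, base_url, parent_id):
    """Step 2: list (id, title) pairs for the children of the given parent ID."""
    response = session.get(
        f"{base_url}/rest/api/content/search",
        params={"cql": f"parent={parent_id}"},
    )
    response.raise_for_status()
    return [(page["id"], page["title"]) for page in response.json().get("results", [])]


def children_by_parent_title(session, base_url, parent_title):
    """Chain both calls: title -> parent ID -> child pages."""
    parent_id = find_parent_id(session, base_url, parent_title)
    if parent_id is None:
        return []
    return find_child_pages(session, base_url, parent_id)
```

To use it, create a `requests.Session()`, set `session.auth = ("username", "pass")`, and call `children_by_parent_title(session, BASE_URL + "/confluence", "parent page title")`. Search results are paginated, so a parent with many children may need additional requests.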

The parent ID and children cannot be looked up through /confluence/rest/api/content, so this won't work:

res = requests.get(BASE_URL + "/confluence/rest/api/content", params={"parent": "<parentId>"} , auth=("username", "pass"))

res = requests.get(BASE_URL + "/confluence/rest/api/content", params={"title": "parents title"} , auth=("username", "pass"))
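One route that does take the parent ID directly is the child endpoint, /rest/api/content/<parentId>/child/page. The sketch below is an illustration under stated assumptions rather than part of the answer above: `get_child_pages` is a hypothetical helper, `session` is assumed to be an authenticated `requests.Session`, `base_url` is assumed to include any context path, and the paging relies on the `start` and `limit` query parameters.

```python
def get_child_pages(session, base_url, parent_id, page_size=25):
    """Collect all direct child pages of parent_id, one batch at a time."""
    children = []
    start = 0
    while True:
        response = session.get(
            f"{base_url}/rest/api/content/{parent_id}/child/page",
            params={"start": start, "limit": page_size},
        )
        response.raise_for_status()
        batch = response.json().get("results", [])
        if not batch:
            # An empty batch means there is nothing left to fetch.
            break
        children.extend(batch)
        start += len(batch)
    return children
```

Stopping on an empty batch costs one extra request, but it keeps working if the server returns fewer items per batch than `page_size` asks for.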

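I wrote a complete solution that uses recursion and works from a space key. Although the original question asked how to do this from the title alone, I figured I would show how it works in a compact script.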

Given a space key, this script traverses the entire space and prints the title of every page and child page.

import json
import requests


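# Note: this subclass shadows the built-in list for the rest of the script,
# so the later list() calls create instances of this class.
class list(list):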
    def __init__(self, *args):
        super().__init__(args)

    def print(self):
        for i in self:
            print(f"{i}")

    def append_unique(self, item):
        if item not in self:
            self.append(item)


class Requests:
    def __init__(self, requests_username, requests_secret_file_name, requests_url_root):
        self.session = requests.Session()
        self.session.auth = (requests_username, self.load_password(requests_secret_file_name))
        self.url_root = requests_url_root

    @staticmethod
    def load_password(file_name):
        # Strip surrounding whitespace so a trailing newline in the file
        # does not end up in the password.
        with open(file_name) as f:
            return f.read().strip()

    def get_top_level_space_content(self, space_key):
        # Note: this endpoint is paginated, so only the first batch of results is returned.
        url = f"{self.url_root}/rest/api/content?spaceKey={space_key}"
        response = self.session.get(url)
        return response.text


class Parser:
    def __init__(self, parser_requests):
        self.page_names = list()
        self.page_ids = list()
        self.parser_requests = parser_requests

    def extract_list_of_page_ids(self, content):
        as_json = json.loads(content)
        content_list = dict(as_json).get('results')
        if content_list is None:
            return
        for c_l in content_list:
            if c_l.get('type') == 'page':
                self.page_names.append_unique(c_l.get("title"))
                self.page_ids.append_unique(c_l.get("id"))
                # Recurse into this page's children, looked up by parent ID.
                self.extract_list_of_page_ids(
                    self.parser_requests.session.get(
                        f'{self.parser_requests.url_root}/rest/api/content/search?cql=parent='
                        f'{c_l.get("id")}').text)
        return


if __name__ == "__main__":
    # I wrote this with a .gitignore that ignores *.secret files
    # I recommend using an access token and not a password
    username, secret_file_name, url_root = "username", \
                                           "file_containing_password.secret", \
                                           "https://confluence-wiki-root.com"

    requests = Requests(username, secret_file_name, url_root)

    parser = Parser(requests)

    space_content = parser.parser_requests.get_top_level_space_content("SPACE-KEY")

    parser.extract_list_of_page_ids(space_content)

    parser.page_names.print()
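  • Note 1: works with on-prem Confluence 7.7.4.
  • Note 2: I'll be the first to admit this isn't the fastest thing I've ever written, but it's a first pass. Maybe I can optimize it later.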
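
One possible direction for the optimization mentioned in Note 2 is to replace the recursion with an explicit queue and track visited pages in a set, so the membership check no longer scans a list. This is a sketch only, not the author's code: `collect_page_titles` is a hypothetical function, `session` is assumed to be an authenticated `requests.Session`, and `root_pages` is assumed to be the `results` list from the top-level space call. Unlike the script above, it deduplicates by page ID rather than by title.

```python
from collections import deque


def collect_page_titles(session, base_url, root_pages):
    """Walk the page tree breadth-first and return each page title once."""
    seen_ids = set()
    titles = []
    queue = deque(root_pages)
    while queue:
        page = queue.popleft()
        # Skip non-page content and pages that were already visited.
        if page.get("type") != "page" or page["id"] in seen_ids:
            continue
        seen_ids.add(page["id"])
        titles.append(page["title"])
        response = session.get(
            f"{base_url}/rest/api/content/search",
            params={"cql": f"parent={page['id']}"},
        )
        response.raise_for_status()
        # Queue the children so they are visited after the current level.
        queue.extend(response.json().get("results", []))
    return titles
```

Each unique page triggers exactly one child lookup, which matters when the top-level call already returns pages that are also children of other pages.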