Google Drive API - get document outline

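In Google Docs you can view and navigate a document's outline. I am trying to access this outline through the Google Drive API, but I can't find any documentation for it. This is my code so far: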

    //authenticate
    $this->authenticate();

    $Service = new Google_Service_Drive($this->Client);
    $File = $Service->files->get($FileID);

    return $File;

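I get the document object back, but I can't find any function that returns the outline. I need the outline links so that I can reach specific parts of the document from my application. Any ideas on how to achieve this?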

files.get returns a file resource. A file resource is only the metadata for a file: information about the file stored on Google Drive.

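You would need to load the file in some document application to find any outline links. The metadata contains no information about the data stored inside the file.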

I finally solved this with DaImTo pointing me in the right direction. After getting a file resource, I used it to get the export link for the HTML version of my document, and then used that link to retrieve the document's HTML content with Google_Http_Request. (The Google documentation covers this part.)

public function retrive_file_outline($FileID) {
    //authenticate
    $this->authenticate();

    $Service = new Google_Service_Drive($this->Client);
    $File = $Service->files->get($FileID);

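    // getExportLinks() may be empty or lack an HTML entry (e.g. for non-Docs files).
    $ExportLinks = $File->getExportLinks();
    $DownloadUrl = isset($ExportLinks["text/html"]) ? $ExportLinks["text/html"] : null;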

    if ($DownloadUrl) {
        $Request = new Google_Http_Request($DownloadUrl, 'GET', null, null);
        $HttpRequest = $Service->getClient()->getAuth()->authenticatedRequest($Request);
        if ($HttpRequest->getResponseHttpCode() == 200) {
            return array($File, $HttpRequest->getResponseBody());
        } else {
            // An error occurred.
            return null;
        }
    } else {
        // The file doesn't have any content stored on Drive.
        return null;
    }
}

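After that I parsed the HTML content with DOMDocument. All the headers have id attributes, which are used as anchor links. I retrieved the IDs of all the headers (h1 to h6) and concatenated them with my document's edit URL. That gave me all the outline links. Here is the parsing and concatenating part: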

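public function test($FileID) {
    $File = $this->model_google->retrive_file_outline($FileID);
    if ($File === null) {
        // The export failed or the file has no HTML export link.
        return;
    }

    $DOM = new DOMDocument;
    $DOM->loadHTML($File[1]);

    $TagNames = ["h1", "h2", "h3", "h4", "h5", "h6"];
    foreach($TagNames as $TagName) {
        $Items = $DOM->getElementsByTagName($TagName);
        foreach($Items as $Item) {
            $ID = $Item->attributes->getNamedItem("id");
            if ($ID === null) {
                // Headings without an id have no anchor to link to.
                continue;
            }
            // Escape the heading text so it cannot break the generated markup.
            echo "<a target='_blank' href='" . $File[0]->alternateLink . "#heading=" . $ID->nodeValue . "'>" . htmlspecialchars($Item->nodeValue) . "</a><br />";
        }
    }
}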
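
One thing to watch in the loop above: iterating tag by tag yields every h1 first, then every h2, and so on, so the links do not come out in the order they appear in the document. Below is a minimal sketch of one way around that, using a single DOMXPath query over all heading levels. The function name `extract_headings_in_order`, the sample HTML, the example.com URL, and the extra "level" key are made up for illustration and are not part of the original code; the sample ids only imitate the general shape of heading ids.

```php
<?php
// Hypothetical helper (not from the original answer): collects headings that
// carry an id attribute, in document order, from an HTML string.
function extract_headings_in_order($Html, $EditUrl) {
    // Exported HTML is not guaranteed to be well-formed, so keep libxml
    // warnings internal instead of printing them.
    $Previous = libxml_use_internal_errors(true);
    $DOM = new DOMDocument;
    $DOM->loadHTML($Html);
    libxml_clear_errors();
    libxml_use_internal_errors($Previous);

    $XPath = new DOMXPath($DOM);
    // A single union query returns the matching nodes in document order.
    $Nodes = $XPath->query("//h1[@id] | //h2[@id] | //h3[@id] | //h4[@id] | //h5[@id] | //h6[@id]");

    $Headings = array();
    foreach ($Nodes as $Node) {
        $Headings[] = array(
            "link" => $EditUrl . "#heading=" . $Node->getAttribute("id"),
            "heading_id" => $Node->getAttribute("id"),
            // "h2" -> 2; an assumed extra field, handy for indenting an outline.
            "level" => (int) substr($Node->nodeName, 1),
            "title" => trim($Node->textContent)
        );
    }
    return $Headings;
}

// Made-up sample input standing in for the exported document.
$SampleHtml = '<html><body><h1 id="h.aaa">Intro</h1><h2 id="h.bbb">Details</h2>'
            . '<h1 id="h.ccc">Summary</h1><h2>No id</h2></body></html>';
$Result = extract_headings_in_order($SampleHtml, "https://example.com/doc/edit");

foreach ($Result as $Heading) {
    // Indent by heading level to mimic an outline.
    echo str_repeat("  ", $Heading["level"] - 1) . $Heading["title"] . " -> " . $Heading["link"] . "\n";
}
```

Headings without an id are simply not matched by the query, so no null check is needed inside the loop.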

Edit: I merged the functions retrive_file_outline and test into retrive_file_outline, and ended up with a function that returns an array of document headings with their links and IDs:

public function retrive_file_outline($FileID) {
    //authenticate
    $this->authenticate();

    $Service = new Google_Service_Drive($this->Client);
    $File = $Service->files->get($FileID);

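    // getExportLinks() may be empty or lack an HTML entry (e.g. for non-Docs files).
    $ExportLinks = $File->getExportLinks();
    $DownloadUrl = isset($ExportLinks["text/html"]) ? $ExportLinks["text/html"] : null;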

    if ($DownloadUrl) {
        $Request = new Google_Http_Request($DownloadUrl, 'GET', null, null);
        $HttpRequest = $Service->getClient()->getAuth()->authenticatedRequest($Request);
        if ($HttpRequest->getResponseHttpCode() == 200) {
            $DOM = new DOMDocument;
            $DOM->loadHTML($HttpRequest->getResponseBody());

            $TagNames = ["h1", "h2", "h3", "h4", "h5", "h6"];
            $Headings = array();
            foreach($TagNames as $TagName) {
                $Items = $DOM->getElementsByTagName($TagName);
                foreach($Items as $Item) {
                    $ID = $Item->attributes->getNamedItem("id");
                    if ($ID === null) {
                        // Headings without an id have no anchor to link to.
                        continue;
                    }
                    $Heading = array(
                        "link" => $File->alternateLink . "#heading=" . $ID->nodeValue,
                        "heading_id" => $ID->nodeValue,
                        "title" => $Item->nodeValue
                    );

                    array_push($Headings, $Heading);
                }
            }

            return $Headings;
        } else {
            // An error occurred.
            return null;
        }
    } else {
        // The file doesn't have any content stored on Drive.
        return null;
    }
}
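
As a usage note, the returned array can be turned into a list of links roughly like this. It is only a sketch: `render_outline_links` is a hypothetical helper name, and the sample entry with its example.com URL is invented to match the "link" / "heading_id" / "title" shape built above. Since the function above returns null on failure, a caller would want to check for that before rendering.

```php
<?php
// Hypothetical helper: renders the heading array as an HTML list of links.
function render_outline_links(array $Headings) {
    $Out = "<ul>\n";
    foreach ($Headings as $Heading) {
        // Escape both the URL and the title before placing them in markup.
        $Out .= '<li><a target="_blank" href="'
              . htmlspecialchars($Heading["link"], ENT_QUOTES, "UTF-8") . '">'
              . htmlspecialchars($Heading["title"], ENT_QUOTES, "UTF-8")
              . "</a></li>\n";
    }
    return $Out . "</ul>\n";
}

// Made-up sample in the shape returned by retrive_file_outline().
$Headings = array(
    array(
        "link" => "https://example.com/doc/edit#heading=h.aaa",
        "heading_id" => "h.aaa",
        "title" => "Intro & scope"
    ),
);

$Html = render_outline_links($Headings);
echo $Html;
```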