How can specific sequential elements be wrapped in a single container using DOMDocument?

Let's say I have the following markup:

<h3 class="handorgel__header">
    <button class="handorgel__header__button">
        Books, Literature and Languages
    </button>
</h3>

<div class="handorgel__content">
    <div class="handorgel__content__inner">
        <p>Hello</p>
    </div>
</div>

<h3 class="handorgel__header">
    <button class="handorgel__header__button">
        Business and Consumer Information
    </button>
</h3>

<div class="handorgel__content">
    <div class="handorgel__content__inner">
        <p>World</p>
    </div>
</div>

<p>Some text in between</p>

<h3 class="handorgel__header">
    <button class="handorgel__header__button">
        Fine Arts and Music
    </button>
</h3>

<div class="handorgel__content">
    <div class="handorgel__content__inner">
        <p>Hello</p>
    </div>
</div>

<h3 class="handorgel__header">
    <button class="handorgel__header__button">
        Genealogy
    </button>
</h3>

<div class="handorgel__content">
    <div class="handorgel__content__inner">
        <p>World</p>
    </div>
</div>

I want to wrap each group of .handorgel__* elements so that they are contained in a <div class="handorgel"> container:

<div class="handorgel">

    <h3 class="handorgel__header">
        <button class="handorgel__header__button">
            Books, Literature and Languages
        </button>
    </h3>

    <div class="handorgel__content">
        <div class="handorgel__content__inner">
            <p>Hello</p>
        </div>
    </div>

    <h3 class="handorgel__header">
        <button class="handorgel__header__button">
            Business and Consumer Information
        </button>
    </h3>

    <div class="handorgel__content">
        <div class="handorgel__content__inner">
            <p>World</p>
        </div>
    </div>

</div>

<p>Some text in between</p>

<div class="handorgel">

    <h3 class="handorgel__header">
        <button class="handorgel__header__button">
            Fine Arts and Music
        </button>
    </h3>

    <div class="handorgel__content">
        <div class="handorgel__content__inner">
            <p>Hello</p>
        </div>
    </div>

    <h3 class="handorgel__header">
        <button class="handorgel__header__button">
            Genealogy
        </button>
    </h3>

    <div class="handorgel__content">
        <div class="handorgel__content__inner">
            <p>World</p>
        </div>
    </div>

</div>

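Each group can contain any number of elements, and a page can contain any number of groups. How can I detect these groups and wrap them appropriately? I currently use DOMDocument for a lot of things in this project, so if possible I'd like to use it for this too, unless there is a clearly better approach.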

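After a good deal of trial and error, I managed to get this working myself. It was not as difficult as I expected, because DOMDocument handles some of the removal logic on its own: appending a node that is already in the document moves it, so nothing has to be removed from its old position by hand.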
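
To illustrate the "removal logic" mentioned above, here is a minimal sketch, separate from the actual filter, showing that appendChild() moves a node that is already attached to the document. The two-paragraph markup and the $wrapper variable are assumptions made up for this illustration.

```php
<?php
// Minimal sketch: appendChild() on a node that already lives in the tree moves it.
$doc = new DOMDocument();
$doc->loadHTML('<html><body><p>one</p><p>two</p></body></html>');

$body    = $doc->getElementsByTagName('body')->item(0);
$first   = $body->firstChild;           // the <p>one</p> element
$wrapper = $doc->createElement('div');  // hypothetical wrapper, for illustration only

// Put the wrapper where the first paragraph currently sits...
$body->insertBefore($wrapper, $first);

// ...then append the paragraph to it. No removeChild() call is needed:
// the paragraph is detached from <body> and re-attached inside the wrapper.
$wrapper->appendChild($first);

echo $doc->saveHTML($body);
```

After this runs, <body> still has two children: the new <div> (now holding the first paragraph) and the untouched second paragraph. The function below relies on the same behaviour when it fills each container.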


/**
 * Wrap handorgel groups in appropriate containers
 *
 * @param string $content
 *
 * @return string
 */
function gpl_wrap_handorgel_shortcodes(string $content): string {
    if (! is_admin() && $content) {
        $DOM = new DOMDocument();

        // disable errors to get around HTML5 warnings...
        libxml_use_internal_errors(true);

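        // load in content, wrapped in html/body tags (these are removed again before returning)
        // note: the "HTML-ENTITIES" target of mb_convert_encoding() is deprecated as of PHP 8.2
        $DOM->loadHTML(mb_convert_encoding("<html><body>{$content}</body></html>", "HTML-ENTITIES", "UTF-8"), LIBXML_HTML_NODEFDTD);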

        // reset errors to get around HTML5 warnings...
        libxml_clear_errors();

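        // the wrapper <body> added above; its direct children are the nodes to inspect
        $body = $DOM->getElementsByTagName("body")->item(0);

        $handorgels = [];

        $prev_class = "";

        foreach ($body->childNodes as $element) {
            /**
             * Ensure that only element nodes get checked/modified (text nodes and comments are skipped)
             */
            if ($element->nodeType === XML_ELEMENT_NODE) {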
                $current_class = $element->getAttribute("class");

                /**
                 * Find any handorgel elements
                 */
                if (preg_match("/handorgel__/", $current_class)) {
                    $group = array_key_last($handorgels);

                    /**
                     * If the previous class didn't include `handorgel__`, create a new handorgel object
                     */
                    if (! preg_match("/handorgel__/", $prev_class)) {
                        $handorgels[] = [
                            "container" => $DOM->createElement("div"),
                            "elements"  => [],
                        ];

                        /**
                         * Update `$group` to match the new container
                         */
                        $group = array_key_last($handorgels);
                    }

                    /**
                     * Append the current element to the group to be moved after all sequential handorgel
                     * elements are located for its group
                     */
                    $handorgels[$group]["elements"][] = $element;
                }

                /**
                 * Update `$prev_class` to track where handorgel groups should begin and end
                 */
                $prev_class = $current_class;
            }
        }

        /**
         * Construct the grouped handorgels
         */
        if ($handorgels) {
            foreach ($handorgels as $handorgel) {
                $handorgel["container"]->setAttribute("class", "handorgel");

                foreach ($handorgel["elements"] as $key => $element) {
                    /**
                     * Insert the container in the starting position for the group
                     */
                    if ($key === 0) {
                        $element->parentNode->insertBefore($handorgel["container"], $element);
                    }

                    $handorgel["container"]->appendChild($element);
                }
            }
        }

        /**
         * Remove unneeded tags (inserted for parsing reasons)
         */
        $content = remove_extra_tags($DOM); // custom function that removes html/body and outputs the DOMDocument as a string
    }

    return $content;
}
add_filter("the_content", "gpl_wrap_handorgel_shortcodes", 30, 1);
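
The remove_extra_tags() helper is not shown above. What follows is only a guess at what such a helper might look like, not the original implementation: it assumes the job is to serialize every child of <body> and concatenate the results, so the html/body wrappers never reach the output.

```php
<?php
/**
 * Hypothetical stand-in for the custom remove_extra_tags() helper.
 * Returns the inner HTML of <body>, i.e. the document without the
 * html/body wrappers that were added for parsing.
 *
 * @param DOMDocument $DOM
 *
 * @return string
 */
function remove_extra_tags(DOMDocument $DOM): string {
    $body = $DOM->getElementsByTagName("body")->item(0);

    // No <body> means there is nothing to unwrap
    if ($body === null) {
        return "";
    }

    $html = "";

    // saveHTML() accepts a node and serializes just that subtree
    foreach ($body->childNodes as $child) {
        $html .= $DOM->saveHTML($child);
    }

    return $html;
}

// Usage with made-up markup: only the two paragraphs are printed
$DOM = new DOMDocument();
$DOM->loadHTML("<html><body><p>Hello</p><p>World</p></body></html>");

echo remove_extra_tags($DOM);
```

Serializing child by child avoids string replacements on the output of a full saveHTML() call, which could also strip html or body tags that legitimately appear inside the content.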