PHP convert divs to custom tags

I'm trying to convert some HTML tags to custom tags using PHP. I've been trying DOMDocument, but I find it very cumbersome. Is there a simple way to do this in PHP / DOMDocument?

Input:

<div class="element_wrapper">
    <div class="element_header">My header</div>
    <div class="element">
        <div class="name">Element Name</div>
    </div>
</div>

Desired output:

<element_wrapper>
    <element_header>My header</element_header>
    <element>
        <name>Element Name</name>
    </element>
</element_wrapper>

My first approach (incomplete, added at AndrewL64's request):

<?php

$templates = Repository::fetchTemplates();

$classes = [
    'element_wrapper',
    'element',
    'name',
    'element_header',
];

foreach ($templates as $template) {
    $html = '<div>' . $template['html_body'] . '</div>';
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $finder = new DOMXPath($dom);
    foreach ($classes as $class) {
        $div_nodes = $finder->query("//div[@class='$class']");
        /** @var DOMNode $div_node */
        foreach ($div_nodes as $div_node) {

            /** @var DOMElement $custom_tag */
            $custom_tag = $dom->createElement($class, $div_node->nodeValue);
            if ($div_node->hasAttributes()) {
                foreach ($div_node->attributes as $attribute) {
                    if ($attribute->nodeValue === $class) {
                        continue;
                    }
                    $custom_tag->setAttributeNode($attribute);
                }
            }
            $div_node->parentNode->replaceChild($custom_tag, $div_node);
        }
    }
}

Thanks very much!

To use DOMDocument in PHP, you may find the php-xml module useful.

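I can't imagine why you would want custom tags, but you could process the divs as plain text. To close the tags correctly, use a stack data structure (there is a module that provides data structures, but you may want your own 'smaller' one).

<div> - pushes an element onto the stack

</div> - pops one off.

Then parse the whole input text this way.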
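The stack idea above can be sketched roughly as follows. This is only an illustration, not the answerer's code: the function name `divsToCustomTags` is hypothetical, and it assumes well-formed input with double-quoted `class` attributes and no self-closing or commented-out divs.

```php
<?php

/**
 * Hypothetical helper: rewrite <div class="x">...</div> as <x>...</x>
 * for every class listed in $classes, treating the HTML as plain text.
 */
function divsToCustomTags(string $html, array $classes): string
{
    $stack = []; // tag names of the divs that are currently open

    $result = preg_replace_callback(
        '@<(/?)div\b([^>]*)>@i',
        function (array $m) use (&$stack, $classes): string {
            if ($m[1] === '/') {
                // Closing tag: pop the name pushed by the matching opener.
                $name = array_pop($stack) ?? 'div';
                return "</$name>";
            }

            $name = 'div';
            $attrs = $m[2];
            if (preg_match('@(^|\s)class="([^"]*)"@', $attrs, $c)) {
                $tokens = preg_split('/\s+/', $c[2], -1, PREG_SPLIT_NO_EMPTY);
                foreach ($tokens as $i => $token) {
                    if (in_array($token, $classes, true)) {
                        $name = $token;   // this class becomes the tag name
                        unset($tokens[$i]);
                        break;
                    }
                }
                if ($name !== 'div') {
                    // Keep any remaining classes, drop the attribute if empty.
                    $rest = implode(' ', $tokens);
                    $replacement = $rest === '' ? '' : $c[1] . 'class="' . $rest . '"';
                    $attrs = str_replace($c[0], $replacement, $attrs);
                }
            }

            // Opening tag: remember what its closing tag must be called.
            $stack[] = $name;
            return "<$name$attrs>";
        },
        $html
    );

    return $result ?? $html; // preg_replace_callback returns null on error
}

$input = <<<HTML
<div class="element_wrapper">
    <div class="element_header">My header</div>
    <div class="element">
        <div class="name">Element Name</div>
    </div>
</div>
HTML;

echo divsToCustomTags($input, ['element_wrapper', 'element', 'name', 'element_header']), "\n";
```

Because only the tags are rewritten, whitespace and text pass through untouched. Divs without a listed class push `div` onto the stack, so their closing tags stay as `</div>`.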

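In the end I used preg_replace and multiple DOMDocument instances to transform the HTML. With pure DOMDocument there is a lot of recursion and rebuilding to do, which is hard to follow and feels very error-prone. My solution is below: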

<?php

$templates = TemplateRepository::fetchAll();

$classes = [
    'element_wrapper',
    'element',
    'name',
    'element_header',
];


foreach ($templates as $template) {
    // We need to guarantee a root element for DOMDocument to be happy. (strip later)
    $html = '<div>' . $template['html_body'] . '</div>';

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    $finder = new DOMXPath($dom);

    $class_found = false; // track if we found a class / will have changes.
    foreach ($classes as $class) {
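        // Match the class as a whole token, so 'element' does not also select 'element_wrapper' or 'element_header'.
        $div_nodes = $finder->query("//div[contains(concat(' ', normalize-space(@class), ' '), ' $class ')]");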
        /** @var DOMNode $div_node */
        foreach ($div_nodes as $div_node) {
            $class_found = true;

            $content = $dom->saveHTML($div_node);

            // I know that the class I want to turn into a custom tag will come after the div opener, so replace that with the class.
            // $1 carries over any remaining classes plus the closing quote of the attribute.
            $content = preg_replace('@^<div class="' . $class . '([^>]+)>@', '<' . $class . ' class="$1>', $content);

            // Clean up empty class attribute...just cuz.
            $content = preg_replace("@<$class class=\"\s*\"@", "<$class", $content);

            // Replace closing div with closing custom tag.  We can assume the end </div> is our target because DOMDocument did the heavy lifting.
            $content = preg_replace('@</div>$@', "</$class>", $content);

            // Create a new dom document from our new html string.  We need this to create a DOMNode that we can import into our original.
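            $dom_element = new DOMDocument();
            // libxml's HTML parser does not know the custom tags, so collect its warnings instead of emitting them.
            libxml_use_internal_errors(true);
            $dom_element->loadHTML($content);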

            // We only want the original html, so just grab the first child of the body.
            $node = $dom_element->getElementsByTagName('body')[0]->firstChild;

            // Import the new node into our original document so we can use it to replace our <div> version.
            $node = $dom->importNode($node, true);

            // Replace our original.
            $div_node->parentNode->replaceChild($node, $div_node);
        }
    }

    // Get the final updated html.
    $new_body = $dom->saveHTML($dom->getElementsByTagName('body')[0]->firstChild);

    // And finish by stripping off our wrapper div we added at the start.
    // The s modifier lets . match newlines; $1 keeps everything inside the wrapper.
    $new_body = preg_replace('@^<div>(.*)</div>\s*$@s', '$1', $new_body);
}
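
For comparison, a DOMDocument-only variant seems workable if the children of each div are moved into the new element rather than serialized and re-parsed. The sketch below is not the code above: `renameDiv` and `convertDivs` are hypothetical names, and it assumes ASCII or entity-encoded input, since loadHTML does not default to UTF-8.

```php
<?php

/** Hypothetical helper: replace $div with a <$class> element, keeping its subtree. */
function renameDiv(DOMElement $div, string $class): DOMElement
{
    $new = $div->ownerDocument->createElement($class);

    // Move (not copy) each child. Unlike nodeValue, this keeps nested markup.
    while ($div->firstChild !== null) {
        $new->appendChild($div->firstChild);
    }

    // Copy attributes, dropping the class token that became the tag name.
    foreach ($div->attributes as $attr) {
        $value = $attr->value;
        if ($attr->name === 'class') {
            $tokens = preg_split('/\s+/', $value, -1, PREG_SPLIT_NO_EMPTY);
            $value = implode(' ', array_diff($tokens, [$class]));
            if ($value === '') {
                continue;
            }
        }
        $new->setAttribute($attr->name, $value);
    }

    $div->parentNode->replaceChild($new, $div);
    return $new;
}

/** Hypothetical helper: convert every div carrying one of $classes. */
function convertDivs(string $html, array $classes): string
{
    $previous = libxml_use_internal_errors(true); // silence parser warnings
    $dom = new DOMDocument();
    $dom->loadHTML('<div>' . $html . '</div>');   // wrapper guarantees a single root
    $xpath = new DOMXPath($dom);

    foreach ($classes as $class) {
        $query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' $class ')]";
        // Snapshot the matches first, because the loop changes the tree.
        foreach (iterator_to_array($xpath->query($query)) as $div) {
            renameDiv($div, $class);
        }
    }

    // Serialize the wrapper's children, which leaves the wrapper itself out.
    $wrapper = $dom->getElementsByTagName('body')->item(0)->firstChild;
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }

    libxml_use_internal_errors($previous);
    return $out;
}

echo convertDivs(
    '<div class="element_wrapper"><div class="name">Element Name</div></div>',
    ['element_wrapper', 'element', 'name', 'element_header']
), "\n";
```

Moving nodes avoids the second DOMDocument and the regex surgery, at the cost of copying attributes by hand.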