HTML Purifier: How to prevent removal of the href attribute from anchor tags

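I'm stuck on configuring HTML Purifier so that it does not strip the href attribute from anchor tags.

Current output:

Expected output: (with the href attribute)

Below is my HTML Purifier function: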

function html_purify($content)
{
    if (hooks()->apply_filters('html_purify_content', true) === false) {
        return $content;
    }

    $CI = &get_instance();
    $CI->load->config('migration');

    $config = HTMLPurifier_HTML5Config::create(
        HTMLPurifier_HTML5Config::createDefault()
    );

    $config->set('HTML.DefinitionID', 'CustomHTML5');
    $config->set('HTML.DefinitionRev', $CI->config->item('migration_version'));

    // Disables cache
    // $config->set('Cache.DefinitionImpl', null);

    $config->set('HTML.SafeIframe', true);
    $config->set('Attr.AllowedFrameTargets', ['_blank']);
    $config->set('Core.EscapeNonASCIICharacters', true);
    $config->set('CSS.AllowTricky', true);

    // These config options disable the pixel checks and allow
    // specifying e.g. width="auto" or height="auto", for example on images
    $config->set('HTML.MaxImgLength', null);
    $config->set('CSS.MaxImgLength', null);

    //Customize - Allow image data
    $config->set('URI.AllowedSchemes', array('data' => true));

    //allow YouTube and Vimeo
    $regex = hooks()->apply_filters('html_purify_safe_iframe_regexp', '%^(https?:)?//(www\.youtube(?:-nocookie)?\.com/embed/|player\.vimeo\.com/video/)%');

    $config->set('URI.SafeIframeRegexp', $regex);
    hooks()->apply_filters('html_purifier_config', $config);

    $def = $config->maybeGetRawHTMLDefinition();

    if ($def) {
        $def->addAttribute('p', 'pagebreak', 'Text');
        $def->addAttribute('div', 'align', 'Enum#left,right,center');
        $def->addElement(
            'iframe',
            'Inline',
            'Flow',
            'Common',
            [
                'src'                   => 'URI#embedded',
                'width'                 => 'Length',
                'height'                => 'Length',
                'name'                  => 'ID',
                'scrolling'             => 'Enum#yes,no,auto',
                'frameborder'           => 'Enum#0,1',
                'allow'                 => 'Text',
                'allowfullscreen'       => 'Bool',
                'webkitallowfullscreen' => 'Bool',
                'mozallowfullscreen'    => 'Bool',
                'longdesc'              => 'URI',
                'marginheight'          => 'Pixels',
                'marginwidth'           => 'Pixels',
            ]
        );
    }

    $purifier = new HTMLPurifier($config);

    return $purifier->purify($content);
}

What is the correct configuration to add so that the href attribute is allowed on any anchor tag?

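URI.AllowedSchemes is a whitelist, so the setting you put in there allows only data URLs, to the exclusion of all other schemes. Since this marks the URL https://google.com as a disallowed value for href, the href ends up empty, and an empty href is stripped.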

If you want to extend the default whitelist, here it is for reference:

array (
  'http' => true,
  'https' => true,
  'mailto' => true,
  'ftp' => true,
  'nntp' => true,
  'news' => true,
  'tel' => true,
)
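
As a minimal sketch of one way to apply this, assuming you still want data URIs for inline images, the existing URI.AllowedSchemes line in html_purify() could be replaced with one that lists the default schemes plus data. Here $config is the config object already created in the function from the question; nothing else is changed.

```php
// Replaces: $config->set('URI.AllowedSchemes', array('data' => true));
// Keep the default schemes and add "data" on top of them, so that
// ordinary http/https links in href survive purification.
$config->set('URI.AllowedSchemes', [
    'http'   => true,
    'https'  => true,
    'mailto' => true,
    'ftp'    => true,
    'nntp'   => true,
    'news'   => true,
    'tel'    => true,
    'data'   => true, // added for inline image data
]);
```

Like the line it replaces, this call should stay before the call to maybeGetRawHTMLDefinition(), with the other $config->set() calls. If you only need some of the default schemes, you can drop the rest from the array.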