PHP - Modify absolute path/URL of images in a text string

I'm trying to migrate old blog posts (WordPress-based) to a new platform. One of the steps is defined as:

  1. Get the full_text of the post
  2. Search it for the full path/URL of old images (say https://whosebug.com/uploads/logo.png, or just uploads/logo.png)
  3. Extract/save the image and get the new image's guid()
  4. Switch the old path https://whosebug.com/uploads/logo.png to the new path (say https://quora.[=52= .png)

I tried using a regular expression to search for the old URLs: /(http:\/\/whosebug\.com\/uploads\/)+(.*?)[a-zA-Z0-9]+(\.jpg|\.png|\.gif)/

Then I tried:

$old = array();
$pattern = "/(https:|http:\/\/whosebug\.com\/uploads\/)+(.*?)[a-zA-Z0-9]+(\.jpg|\.png|\.gif)/";
$text = "orem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor <img src='https://whosebug.com/uploads/image1.png'/> rem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor <img src='https://whosebug.com/uploads/image2.png'/>";

// search for and collect the old URLs
preg_match_all($pattern, $text, $old);

But this is what I get:

array(4) {
  [0]=>
  array(2) {
    [0]=>
    string(39) "https://whosebug.com/uploads/image1.png"
    [1]=>
    string(39) "https://whosebug.com/uploads/image2.png"
  }
  [1]=>
  array(2) {
    [0]=>
    string(6) "https:"
    [1]=>
    string(6) "https:"
  }
  [2]=>
  array(2) {
    [0]=>
    string(23) "//whosebug.com/uploads/"
    [1]=>
    string(23) "//whosebug.com/uploads/"
  }
  [3]=>
  array(2) {
    [0]=>
    string(4) ".png"
    [1]=>
    string(4) ".png"
  }
}

I think this regex will do a bit better:

#\b((?:https?://whosebug\.com/)?uploads/(.*?\.(?:jpg|png|gif)))\b#

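I simplified parts of your pattern (for example, replacing https:|http: with https?:) and removed the [a-zA-Z0-9]+, which looked unnecessary. In your version the | splits the whole first group, so https: on its own is a complete alternative and the rest of the URL falls into (.*?). I also tightened the grouping and made some groups non-capturing.

New code (note that I added an extra image reference for testing):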

$old = array();
$pattern = "#\b((?:https?://whosebug\.com/)?uploads/(.*?\.(?:jpg|png|gif)))\b#";
$text = "orem uploads/xyx.gif ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor <img src='https://whosebug.com/uploads/image1.png'/> rem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor <img src='https://whosebug.com/uploads/image2.png'/>";

// search for and collect the old URLs
preg_match_all($pattern, $text, $old);
print_r($old);

Output:

Array
(
    [0] => Array
        (
            [0] => uploads/xyx.gif
            [1] => https://whosebug.com/uploads/image1.png
            [2] => https://whosebug.com/uploads/image2.png
        )

    [1] => Array
        (
            [0] => uploads/xyx.gif
            [1] => https://whosebug.com/uploads/image1.png
            [2] => https://whosebug.com/uploads/image2.png
        )

    [2] => Array
        (
            [0] => xyx.gif
            [1] => image1.png
            [2] => image2.png
        )

)
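To see why the pattern from the question put only https: in its first group, it may help to compare the two alternations side by side. This is a small sketch, not part of the original code: $loose and $tight are illustrative names, and the host is written in lower case so that it matches the sample text.

```php
<?php
// Sample input: one absolute image URL, same shape as in the question.
$text = "<img src='https://whosebug.com/uploads/image1.png'/>";

// Question's grouping: the | splits the whole group, so the two alternatives
// are "https:" and "http://whosebug.com/uploads/".
$loose = "/(https:|http:\/\/whosebug\.com\/uploads\/)+(.*?)[a-zA-Z0-9]+(\.jpg|\.png|\.gif)/";

// With s? only the "s" is optional, so scheme and host stay in one group.
$tight = "#(https?://whosebug\.com/uploads/)(.*?\.(?:jpg|png|gif))#";

preg_match($loose, $text, $a);
preg_match($tight, $text, $b);

echo $a[1], "\n"; // https:
echo $b[1], "\n"; // https://whosebug.com/uploads/
echo $b[2], "\n"; // image1.png
```

With the loose grouping, everything after https: is swallowed by the lazy (.*?), which is why the host and folder showed up in the second group of the question's dump.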

If you want to insist that image names contain only [a-zA-Z0-9], change .*? to [a-zA-Z0-9]+:

$pattern = "#\b((?:https?://whosebug\.com/)?uploads/([a-zA-Z0-9]+\.(?:jpg|png|gif)))\b#";
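
For the last step of the migration (swapping each old path for its new one), preg_replace_callback can reuse the same pattern. The following is only a sketch under assumptions: newImageUrl() is a hypothetical placeholder for whatever saves the image on the new platform and returns its address, and https://example.com/media/ is a made-up base URL. The host in the pattern is written in lower case to match the sample text.

```php
<?php
// Hypothetical helper: stands in for the real "save image, get its new URL" step.
function newImageUrl(string $fileName): string
{
    return "https://example.com/media/" . $fileName; // assumed new location
}

$pattern = "#\b((?:https?://whosebug\.com/)?uploads/([a-zA-Z0-9]+\.(?:jpg|png|gif)))\b#";
$text = "see uploads/xyx.gif and <img src='https://whosebug.com/uploads/image1.png'/>";

// $m[0] is the whole old path/URL, $m[2] is just the file name.
$newText = preg_replace_callback(
    $pattern,
    function (array $m): string {
        return newImageUrl($m[2]);
    },
    $text
);

echo $newText, "\n";
// see https://example.com/media/xyx.gif and <img src='https://example.com/media/image1.png'/>
```

Doing the replacement in a callback keeps the search and the swap in one pass, so both the relative form (uploads/...) and the absolute form of each old path are rewritten.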