Get all image sources from a website in PHP

Question

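What I want to do is let the user enter the URL of a page that contains images, such as https://www.flickr.com/search/?text=arushad%20ahmed, then collect every image source from the 'src' attributes and display them.

The following approach does not work: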

$file = fopen("https://www.flickr.com/search/?text=arushad%20ahmed", "r");
$doc = new DOMDocument();
$doc->loadHTML($file);
$image = $doc->getElementsByTagName('img');

foreach ($image as $img) {
    echo $img;
}

So how can I make this work the way I want?

Answer 1

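src is not a tag; it is an attribute of the img tag, so you have to select the img elements and then read the attribute from each one.
You said you are new to PHP, so this is an understandable mistake. No worries, use this code: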

$doc = new DOMDocument();
$doc->loadHTMLFile("https://www.flickr.com/search/?text=arushad%20ahmed");
$xpath = new DOMXpath($doc);
$imgs = $xpath->query("//img");
for ($i=0; $i < $imgs->length; $i++) {
    $img = $imgs->item($i);
    $src = $img->getAttribute("src");
    // do something with $src
}

Read more about PHP DOMDocument in the PHP manual.
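
To try the same idea without a network request, here is a minimal sketch that runs the extraction on an inline HTML string. The helper name extractImageSources and the sample markup are assumptions made for illustration; they are not part of the original answer. It also asks libxml to collect parse warnings internally, since real pages are rarely valid HTML.

```php
<?php
// Hypothetical helper (not from the original answer): collect every
// non-empty src attribute from the <img> elements in an HTML string.
function extractImageSources(string $html): array
{
    // DOMDocument::loadHTML() rejects an empty string, so bail out early.
    if ($html === '') {
        return [];
    }

    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src'); // '' when the attribute is absent
        if ($src !== '') {
            $sources[] = $src;
        }
    }
    return $sources;
}

// Inline sample markup, assumed for illustration only.
$sample = '<div><img src="a.jpg"><p>text<img src="/img/b.png" alt="b"><img alt="no source"></div>';
foreach (extractImageSources($sample) as $src) {
    echo $src, PHP_EOL;
}
```

For a live page, the string could come from file_get_contents($url), which needs allow_url_fopen enabled. Images that a page adds with JavaScript after loading will not be present in the downloaded HTML.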

Update

Based on your comment, it seems your PHP installation does not have DOMDocument support. You can install it with the following commands.

sudo yum --enablerepo=webtatic install php-xml
sudo /sbin/service httpd stop
sudo /sbin/service httpd start
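
After restarting the web server, it may help to confirm that the extensions are actually loaded before running the parser. The following is a small sketch; the helper name missingExtensions is hypothetical. Note that the Tidy functions used further down come from a separate extension, so it is included in the check.

```php
<?php
// Hypothetical helper: return the names from $required that are not loaded.
function missingExtensions(array $required): array
{
    return array_values(array_filter($required, function ($name) {
        return !extension_loaded($name);
    }));
}

// 'dom' provides DOMDocument and DOMXPath; 'tidy' provides tidy_parse_string().
$missing = missingExtensions(['dom', 'libxml', 'tidy']);
echo $missing
    ? 'Missing: ' . implode(', ', $missing)
    : 'All required extensions are loaded', PHP_EOL;
```

Run it through the web server rather than only from the command line, because the two can use different PHP configurations.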

Also, the page you are trying to parse does not contain valid HTML. Use HTML Tidy to repair it first, for example:

$html = file_get_contents('https://www.flickr.com/search/?text=arushad%20ahmed');
$config = array(
  'clean' => 'yes',
  'output-html' => 'yes',
);
$tidy = tidy_parse_string($html, $config, 'utf8');
$tidy->cleanRepair();
$doc = new DOMDocument();
$doc->loadHTML(tidy_get_output($tidy)); // pass the repaired markup as a string
//the rest of the code is the same
$xpath = new DOMXpath($doc);
$imgs = $xpath->query("//img");
for ($i=0; $i < $imgs->length; $i++) {
    $img = $imgs->item($i);
    $src = $img->getAttribute("src");
    // do something with $src
}
