Scraping specific tag property in PHP using DOMDocument
I am trying to extract content from 'meta' tags based on their 'property' attribute. For example:
`
<meta name="keywords" content="9gag,fun,funny,lol,meme,GIF,wtf,omg,fail,video,cosplay,geeky,forever alone" />
<meta name="twitter:image" content="http://images-cdn.9gag.com/images/thumbnail-facebook/14198244_1420182794.8999_AmeJun_n.jpg" />
<meta property="og:title" content="I finished the manga last week, so I wanted to make my on "What Naruto taught me"" />
<meta property="og:site_name" content="9GAG" />
<meta property="og:url" content="http://9gag.com/gag/aGVqbvz" />
...
`
So I only want the content of those tags that have 'og' in their property.
Using a cURL request, I am already able to get the property values:
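// curl() is not a PHP built-in; presumably a user-defined helper that returns the page HTML as a string
$ch = curl("http://9gag.com/gag/aGVqbvz?ref=fsidebar");
$dom = new DOMDocument();
@$dom->loadHTML($ch); // @ suppresses warnings about malformed HTML
//echo $ch;
$links = $dom->getElementsByTagName('meta');
// get the number of meta elements
echo $links->length;
echo '<pre>';
foreach ($links as $link) {
    echo $link->getAttribute("property");
    echo '<br>';
}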
How can I get the content for a specific property or name?
XPath is your friend. An expression like //meta[starts-with(@property, "og")]/@content will fetch the content attribute of every meta element whose property attribute value starts with "og".
Example:
$xpath = new DOMXPath($dom);
$query = '//meta[starts-with(@property, "og")]/@content';
foreach ($xpath->query($query) as $node) {
    echo $node->value, "\n";
}
Output:
I finished the manga last week, so I wanted to make my on "What Naruto taught me"
9GAG
http://9gag.com/gag/aGVqbvz
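
Since the question also asks about a specific property or name, here is a minimal self-contained sketch of that case. It assumes a small hard-coded HTML sample (hypothetical, standing in for the fetched page) and introduces the variable names $html, $og and $keywords purely for illustration. It collects the Open Graph tags into an array keyed by property, then looks up a single tag by its name attribute.

```php
<?php
// Hypothetical sample markup standing in for the fetched page.
$html = <<<'HTML'
<html><head>
<meta name="keywords" content="9gag,fun,funny" />
<meta property="og:title" content="What Naruto taught me" />
<meta property="og:site_name" content="9GAG" />
<meta property="og:url" content="http://9gag.com/gag/aGVqbvz" />
</head><body></body></html>
HTML;

$dom = new DOMDocument();
// Collect libxml parse warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Map each Open Graph property name to its content value.
$og = [];
foreach ($xpath->query('//meta[starts-with(@property, "og:")]') as $meta) {
    $og[$meta->getAttribute('property')] = $meta->getAttribute('content');
}

// Look up one tag by its name attribute; string() yields '' when nothing matches.
$keywords = $xpath->evaluate('string(//meta[@name="keywords"]/@content)');

print_r($og);
echo $keywords, "\n";
```

With the array in place, a single value can be read as $og['og:title']. Matching on "og:" (with the colon) is a slightly stricter choice than the "og" prefix used above; either should work for typical Open Graph markup.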