Get content of <script type="application/ld+json"> using PHP
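I can't find a Vine API for getting the title, description, and image of a page. The JSON is in the body of the page itself, in a script tag: <script type="application/ld+json">. How can I get the content of this script tag (the JSON) using PHP so that I can parse it?
Vine page:
https://vine.co/v/igO3EbIXDlI
From the page source: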
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "SocialMediaPosting",
"url": "https://vine.co/v/igO3EbIXDlI",
"datePublished": "2016-03-01T00:58:35",
"author": {
"@type": "Person",
"name": "MotorAddicts\u2122",
"image": "https://v.cdn.vine.co/r/avatars/39FEFED72B1242718633613316096_pic-r-1439261422661708f3e9755.jpg.jpg?versionId=LPjQUQ4KmTIPLu3iDbXw4FipgjEpC6fw",
"url": "https://vine.co/u/989736283540746240"
},
"articleBody": "Mmm... Black black blaaaaack!! \ud83d\ude0d ( Drift \u53d1 )",
"image": "https://v.cdn.vine.co/r/videos/98C3799A811316254965085667328_SW_WEBM_14567938452154dc600dbde.webm.jpg?versionId=wPuaQvDxnpwF7KjSGao21hoddooc3eCl",
"interactionCount": [{
"@type": "UserInteraction",
"userInteractionType": "http://schema.org/UserLikes",
"value": "1382"
}, {
"@type": "UserInteraction",
"userInteractionType": "http://schema.org/UserShares",
"value": "368"
}, {
"@type": "UserInteraction",
"userInteractionType": "http://schema.org/UserComments",
"value": "41"
}, {
"@type": "UserInteraction",
"userInteractionType": "http://schema.org/UserViews",
"value": "80575"
}],
"sharedContent": {
"@type": "VideoObject",
"name" : "Mmm... Black black blaaaaack!! \ud83d\ude0d ( Drift \u53d1 )",
"description" : "",
"thumbnailUrl" : "https://v.cdn.vine.co/r/videos/98C3799A811316254965085667328_SW_WEBM_14567938452154dc600dbde.webm.jpg?versionId=wPuaQvDxnpwF7KjSGao21hoddooc3eCl",
"uploadDate" : "2016-03-01T00:58:35",
"contentUrl" : "https://v.cdn.vine.co/r/videos_h264high/98C3799A811316254965085667328_SW_WEBM_14567938452154dc600dbde.mp4?versionId=w7ugLPYtj5LWeVUsXaH1bt2VuK8QE0qv",
"embedUrl" : "https://vine.co/v/igO3EbIXDlI/embed/simple",
"interactionCount" : "82366"
}
}
</script>
What do I do after this?
$html = 'https://vine.co/v/igO3EbIXDlI';
$dom = new DOMDocument;
$dom->loadHTML($html);
Update:
I found documentation for the Vine API here:
https://dev.twitter.com/web/vine/oembed
To query the Vine API for JSON, make a GET request to:
https://vine.co/oembed.json?url=https%3A%2F%2Fvine.co%2Fv%2F[videoid]
Example:
https://vine.co/oembed.json?url=https%3A%2F%2Fvine.co%2Fv%2FMl16lZVTTxe
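The request URL above is the fixed endpoint plus the percent-encoded page URL. A minimal sketch of building it in PHP follows; the function name vineOembedUrl is hypothetical, and the fetch step is left commented out because it assumes network access and a reachable endpoint.

```php
<?php
// Hypothetical helper: builds the oEmbed request URL for a given Vine page URL.
function vineOembedUrl(string $pageUrl): string
{
    // rawurlencode() percent-encodes ":" and "/" as %3A and %2F.
    return 'https://vine.co/oembed.json?url=' . rawurlencode($pageUrl);
}

echo vineOembedUrl('https://vine.co/v/Ml16lZVTTxe'), "\n";

// Fetching and decoding would then look roughly like this:
// $response = file_get_contents(vineOembedUrl('https://vine.co/v/Ml16lZVTTxe'));
// $info     = $response === false ? null : json_decode($response, true);
```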
$html_content = file_get_contents('https://vine.co/v/igO3EbIXDlI');
$dom_object = new DOMDocument;
libxml_use_internal_errors(true); // real-world HTML usually triggers parser warnings
$dom_object->loadHTML($html_content);
$xpath_object = new DOMXpath($dom_object);
// select <script> elements by tag name and type, not by a class named "script"
$elements = $xpath_object->query('//script[@type="application/ld+json"]');
$output = [];
foreach ($elements as $element)
{
    // nodeValue is the text inside the tag, without the <script> wrapper
    $output[] = $element->nodeValue;
}
# you now have a list of strings, each containing the contents of a
# matching script tag
You can use DOMDocument and DOMXpath for this:
$html = file_get_contents( $url );
$dom = new DOMDocument();
libxml_use_internal_errors( 1 );
$dom->loadHTML( $html );
$xpath = new DOMXpath( $dom );
$jsonScripts = $xpath->query( '//script[@type="application/ld+json"]' );
$json = trim( $jsonScripts->item(0)->nodeValue );
$data = json_decode( $json );
With this XPath pattern, you search for all <script> nodes whose type attribute is "application/ld+json":
//                              Following path, no matter where they are in the document
script                          <script> elements
[@type="application/ld+json"]   with attribute "type" equal to "application/ld+json"
Then you retrieve your JSON string by taking the ->nodeValue of the first <script> node returned.
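To see the whole path from markup to fields, here is a minimal sketch that runs the same query against an inline HTML string. The markup is a trimmed, hypothetical stand-in for the Vine page, and the title and thumbnail values are made up for illustration.

```php
<?php
// Hypothetical, trimmed-down page source used only for this illustration.
$html = <<<'EOT'
<html><head><title>Demo</title></head><body>
<script type="text/javascript">var unrelated = 1;</script>
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "SocialMediaPosting",
  "url": "https://vine.co/v/igO3EbIXDlI",
  "sharedContent": {
    "@type": "VideoObject",
    "name": "Example title",
    "description": "",
    "thumbnailUrl": "https://example.com/thumb.jpg"
  }
}
</script>
</body></html>
EOT;

libxml_use_internal_errors(true);   // keep HTML parser warnings out of the output
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Only the JSON-LD script matches; the text/javascript one is ignored.
$nodes = $xpath->query('//script[@type="application/ld+json"]');
$data  = json_decode(trim($nodes->item(0)->nodeValue));

// Keys such as "@type" are not valid bare property names, so use brace syntax.
$type      = $data->{'@type'};
$title     = $data->sharedContent->name;
$thumbnail = $data->sharedContent->thumbnailUrl;

echo $type, "\n", $title, "\n", $thumbnail, "\n";
```

If array access reads more naturally, json_decode( $json, true ) returns associative arrays instead, so the same values become $data['@type'] and $data['sharedContent']['name'].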
If you don't know in advance whether the node exists and/or where it is, use this:
$jsonScripts = $xpath->query( '//script[@type="application/ld+json"]' );
if( $jsonScripts->length < 1 )
{
    die( "Error: No script node found" );
}
else
{
    foreach( $jsonScripts as $node )
    {
        $json = json_decode( $node->nodeValue );
        // your stuff with JSON ...
    }
}
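The loop above assumes every matching node holds valid JSON. As a hedged sketch, the same idea can be wrapped in a function that skips blocks json_decode cannot parse; the name extractJsonLd and the sample markup are hypothetical, not part of any library.

```php
<?php
/**
 * Hypothetical helper: returns every JSON-LD block found in $html as an
 * associative array, skipping blocks that are not valid JSON.
 */
function extractJsonLd(string $html): array
{
    // Silence parser warnings for this call, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath  = new DOMXPath($dom);
    $blocks = [];
    foreach ($xpath->query('//script[@type="application/ld+json"]') as $node) {
        $decoded = json_decode(trim($node->nodeValue), true);
        // Keep only blocks that decoded cleanly into an array.
        if (json_last_error() === JSON_ERROR_NONE && is_array($decoded)) {
            $blocks[] = $decoded;
        }
    }
    return $blocks;
}

// Made-up markup with two valid blocks and one broken one.
$html = '<html><body>'
      . '<script type="application/ld+json">{"@type":"VideoObject","name":"A"}</script>'
      . '<script type="application/ld+json">{not valid json</script>'
      . '<script type="application/ld+json">{"@type":"Person","name":"B"}</script>'
      . '</body></html>';

foreach (extractJsonLd($html) as $block) {
    echo $block['@type'], ': ', $block['name'], "\n";
}
```

Returning an empty array instead of calling die() leaves the decision about a missing node to the caller.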