How to select specific text from a string generated by a PHP script?

Question

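I've been trying to grab HLS files from Twitch using several PHP scripts. The first one runs a cURL request against a Python script that returns the HLS URL, and outputs the resulting string as plain text. The second one (the one that doesn't work) is supposed to extract the M3U8 file and make it playable.

First script (extract.php)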

<?php
header('Content-Type: text/plain; charset=utf-8');
$url = "https://pwn.sh/tools/streamapi.py?url=twitch.tv/cgtn_live_russian&quality=1080p60";

$curl = curl_init($url);
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);

//for debug only!
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);

$resp = curl_exec($curl);
curl_close($curl);
var_dump($resp);
$undesirable = array("}");
$cleanurl = str_replace($undesirable,"");
echo substr($cleanurl, 39, 898);

?>

This script (let's call it extract.php) works, and it returns, as plain text, the same thing the Python script returns, namely:

string(904) "{"success": true, "urls": {"1080p60": "https://video-weaver.fra05.hls.ttvnw.net/v1/playlist/[token].m3u8"}}"

Second script (play.php)

<?php
$opts = array(
'http'=>array(
'method'=>"GET",
'header'=>"Referer:https://myserver.com/" .
  "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:51.0) Gecko/20100101 Firefox/51.0"
));

$html = file_get_contents("extract.php");

preg_match_all(
    '/(http.*?\.m3u8[^&">]+)/',

    $html,
    $posts, // will contain the article data
    PREG_SET_ORDER // formats data into an array of posts
);

foreach ($posts as $post) {
    $link = $post[0];

header("Location: $link");
}
?>

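The second script (let's call it play.php) should, in theory, return the M3U8 file (without the leading string(904) "{"success": true, "urls": {"1080p60": part) and make it playable in a media player such as VLC, but it doesn't return anything.

Can anyone tell me what's wrong? Did I make a syntax or regex mistake when writing these PHP files, or does the second file fail because of the other elements in the string?

Thanks in advance.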

Answer 1

I think you can rely on a regular expression to get the URL instead of trying to clean up the string manually. Another option is to use json_decode().
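As a minimal sketch of the json_decode() route, assuming the response body has the JSON shape shown in the question. The helper name extractStreamUrl is hypothetical, and "[token]" is the placeholder from the question, not a real token:

```php
<?php
// Hypothetical helper: pull one quality's URL out of the JSON response body.
function extractStreamUrl(string $json, string $quality): ?string
{
    $data = json_decode($json, true); // true => decode objects as associative arrays

    // Bail out on invalid JSON or on a response that does not report success
    if (!is_array($data) || empty($data['success'])) {
        return null;
    }

    // Null when the "urls" map or the requested quality is missing
    return $data['urls'][$quality] ?? null;
}

// Sample body copied from the question; "[token]" stands in for the real token
$resp = '{"success": true, "urls": {"1080p60": "https://video-weaver.fra05.hls.ttvnw.net/v1/playlist/[token].m3u8"}}';

$link = extractStreamUrl($resp, '1080p60');
```

With this approach there is no need for substr() offsets or str_replace() cleanup, since the URL is read by key rather than by position.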

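In any case, the idea is to define a variable in extract.php, in this case $resp. Echoing the value, as you do now, does not make it available in the parent script. Note also that file_get_contents("extract.php") reads the script's source from disk rather than running it.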

Once extract.php is included, you can reference that variable in play.php.
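A small sketch of the difference between reading a script with file_get_contents() and including it. The temporary file and its one-line content are stand-ins invented for illustration, not the real extract.php:

```php
<?php
// Write a tiny stand-in for extract.php to a temporary file (hypothetical content)
$path = tempnam(sys_get_temp_dir(), 'extract');
file_put_contents($path, '<?php $resp = "value set by the included script";');

// file_get_contents() on a local path returns the source text; nothing is executed
$source = file_get_contents($path);

// include executes the file in the current scope, so $resp is defined here afterwards
include $path;

// Clean up the temporary file
unlink($path);
```

After this runs, $source holds the PHP source as a string, while $resp holds the value assigned inside the included file.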

<?php
//extract.php
$resp = '';
$url = "https://pwn.sh/tools/streamapi.py?url=twitch.tv/cgtn_live_russian&quality=1080p60";

$curl = curl_init($url);
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);

//for debug only!
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);

$resp = curl_exec($curl);
curl_close($curl);


//play.php
include('./extract.php');

//$resp is set in extract.php
preg_match_all(
    '/(http.*?\.m3u8)/',
    $resp,
    $posts, // will contain the matched URLs
    PREG_SET_ORDER // groups the results into one array entry per match
);

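$link = null;
foreach ($posts as $post) {
    $link = $post[0];
}

// Stop here if the response contained no M3U8 URL
if ($link === null) {
    die('No M3U8 URL found.');
}

header("Location: $link");
die();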
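The pattern above drops the trailing character class used in the question. A short sketch of why that may matter, assuming the response body looks exactly like the sample in the question (with "[token]" as a placeholder): there, ".m3u8" is immediately followed by a closing double quote, which the question's [^&">]+ refuses to match.

```php
<?php
// Sample body from the question; "[token]" stands in for the real token
$resp = '{"success": true, "urls": {"1080p60": "https://video-weaver.fra05.hls.ttvnw.net/v1/playlist/[token].m3u8"}}';

// Pattern from the question: requires one or more extra characters after ".m3u8"
// that are not &, " or >
$strict = preg_match('/(http.*?\.m3u8[^&">]+)/', $resp, $m1);

// Pattern from the answer: the match ends right at ".m3u8"
$loose = preg_match('/(http.*?\.m3u8)/', $resp, $m2);
```

On this sample, $strict is 0 (no match) and $loose is 1, with the URL in $m2[0]. A real response may differ, for example if the URL carries a query string after ".m3u8".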


php

curl

web-scraping

http-live-streaming

m3u8