Regex: Extract Tweet Username and ID From URL
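I am trying to find a tweet URL in a message, if one is present, using this regex: #^https?://twitter\.com/(?:\#!/)?(\w+)/status(es)?/(\d+)$#is
But my regex does not seem to pick up the tweet URL correctly. Below is my full code: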
function gettweet($string)
{
    $regex = '#^https?://twitter\.com/(?:\#!/)?(\w+)/status(es)?/(\d+)$#is';
    $string = preg_replace_callback($regex, function($matches) {
        $user = $matches[2];
        $statusid = $matches[3];
        $url = "https://twitter.com/$user/status/$statusid";
        $urlen = urlencode($url);
        $getcon = file_get_contents("https://publish.twitter.com/oembed?url=$urlen");
        $con = json_decode($getcon, true);
        $tweet_html = $con["html"];
        return $tweet_html;
    }, $string);
    return $string;
}
$message = "This is absolutely trending can you also see it here https://twitter.com/itslifeme/status/765268556133064704 i like it";
$mes = gettweet($message);
echo $mes;
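The reason this does not work as you expect is that your regex includes anchors (^ and $), which require the pattern to match the entire string from start to end. Your message has text before and after the URL, so the anchored pattern never matches.
With the anchors removed, it matches: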
$regex = '#https?://twitter\.com/(?:\#!/)?(\w+)/status(es)?/(\d+)#is';
$string = "This is absolutely trending can you also see it here https://twitter.com/itslifeme/status/765268556133064704 i like it";
if (preg_match($regex, $string, $match)) {
    var_dump($match);
}
The code above gives us:
array(4) {
[0]=>
string(55) "https://twitter.com/itslifeme/status/765268556133064704"
[1]=>
string(9) "itslifeme"
[2]=>
string(0) ""
[3]=>
string(18) "765268556133064704"
}
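Applied back to the original function, the same unanchored pattern can be handed to preg_replace_callback. Note from the dump above that the username is capture group 1 and group 2 is the optional "es", so the callback in the question would also read the wrong index. What follows is a minimal sketch rather than a definitive implementation: the function name embedTweets and the $fetchEmbed parameter are hypothetical, introduced so the HTTP call can be swapped out or can fail gracefully, and the sketch makes the "es" group non-capturing so that the username and status ID become groups 1 and 2.

```php
<?php
// Sketch: replace every tweet URL inside a message with embed HTML.
// embedTweets() and $fetchEmbed are illustrative names, not part of any library.
function embedTweets(string $message, callable $fetchEmbed): string
{
    // No ^/$ anchors, so the URL may appear anywhere in the message.
    // (?:es)? is non-capturing: group 1 = username, group 2 = status ID.
    $regex = '#https?://twitter\.com/(?:\#!/)?(\w+)/status(?:es)?/(\d+)#i';

    $result = preg_replace_callback($regex, function (array $m) use ($fetchEmbed) {
        $url = "https://twitter.com/{$m[1]}/status/{$m[2]}";
        $html = $fetchEmbed($url);
        // Fall back to the matched URL text if no embed HTML is available.
        return is_string($html) ? $html : $m[0];
    }, $message);

    // preg_replace_callback returns null on a PCRE error; keep the input then.
    return $result ?? $message;
}

// Fetcher modeled on the question's code: asks the oEmbed endpoint for the HTML.
$oembedFetcher = function (string $url): ?string {
    $json = @file_get_contents('https://publish.twitter.com/oembed?url=' . urlencode($url));
    if ($json === false) {
        return null; // request failed
    }
    $data = json_decode($json, true);
    return (is_array($data) && is_string($data['html'] ?? null)) ? $data['html'] : null;
};
```

Calling embedTweets($message, $oembedFetcher) would then replace the URL in the message with whatever HTML the endpoint returns, and leave the URL untouched if the request fails.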
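Also, there is really no reason to include the dot-all pattern modifier (s) in your expression, since the pattern contains no unescaped dot. The PHP manual describes it as follows: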
s (PCRE_DOTALL)
If this modifier is set, a dot metacharacter in the pattern matches all characters, including newlines. Without it, newlines are excluded. This modifier is equivalent to Perl's /s modifier. A negative class such as [^a] always matches a newline character, independent of the setting of this modifier.
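To illustrate the point: the only dot in the pattern is escaped (\.), so it is a literal dot and the modifier has nothing to act on. The short sketch below compares the pattern with and without s and then shows a case where s does matter; the variable names and the multi-line sample text are illustrative only.

```php
<?php
// The only "." in this pattern is escaped, so the s (DOTALL) modifier has no effect.
$base = '#https?://twitter\.com/(?:\#!/)?(\w+)/status(es)?/(\d+)#';
$text = "first line\nhttps://twitter.com/itslifeme/status/765268556133064704\nlast line";

preg_match($base . 'is', $text, $withDotAll);
preg_match($base . 'i', $text, $withoutDotAll);

// Both calls capture exactly the same groups.
var_dump($withDotAll === $withoutDotAll); // bool(true)

// By contrast, s matters once an unescaped dot is present:
var_dump(preg_match('#a.b#', "a\nb"));  // int(0): dot does not match the newline
var_dump(preg_match('#a.b#s', "a\nb")); // int(1): dot matches the newline
```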