How to str_replace Google News RSS links for Facebook Share?
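Hello, I am using SimpleXML to display a news.google.com feed.
The entries it displays link to the original article through a Google redirect URL.
I need the entries to link like this instead:
http://WEBSITEWITHNEWS.COM/ARTICLEURLHERE
The reason is that Facebook Sharer cannot interpret the Google redirect link.
Facebook Sharer needs it to look like this:
https://www.facebook.com/sharer/sharer.php?u=http://WEBSITEWITHNEWS.COM/ARTICLEURLHERE
Is there a way to use a regex or string function (str_replace or preg_match) to remove the Google redirect URL so that social sharing sites can recognize the link?
The Google redirect URL is dynamic, so it is slightly different every time; I need something that handles every variation.
My working, functional code: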
<?php
$feed = file_get_contents("https://news.google.com/news/feeds?q=KEYWORD&output=rss");
$xml = new SimpleXmlElement($feed);
foreach ($xml->channel->item as $entry) {
    // Format the publication date; date() is used because strftime() is deprecated as of PHP 8.1
    $date = date("m/d/y h:i:sa", strtotime((string) $entry->pubDate));
    // The description is an HTML snippet that contains the link to the article
    $desc = (string) $entry->description;
    $desc = str_replace("and more »", "", $desc);
    $desc = str_replace("font-size:85%", "font-size:100%", $desc);
?>
<div class="item"></div>
<?php echo $desc; ?>
<div class="date">
<?php echo $date; ?></div>
<?php } ?>
<?php
// Plain-text variant of the same loop
foreach ($xml->channel->item as $entry) {
    $desc = (string) $entry->description;
    $date = date("l, m/d/Y, H:i:s", strtotime((string) $entry->pubDate));
    // The replacement has to run on $desc, not on a string literal
    $desc = str_replace("and more »", "x", $desc);
    echo $date;
    echo $desc;
}
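I use $desc rather than $link to display the link, but the article URL wrapped in the Google redirect URL is also available in $link, in case you would rather run str_replace or preg_match on $link instead of $desc.
Link to a working Google News feed:
https://news.google.com/news/feeds?q=KEYWORD&output=rss
If you know how to solve this, you are a hero. Thanks, Overflowers.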
The answer, following on from my first comment, is to use this regex.
<?php
date_default_timezone_set('America/New_York');
$feed = file_get_contents("https://news.google.com/news/feeds?q=KEYWORD&output=rss");
$xml = new SimpleXmlElement($feed);
foreach ($xml->channel->item as $entry) {
    $desc = (string) $entry->description;
    // Rewrite each Google redirect href into a Facebook Sharer link;
    // $1 is the article URL captured after &url=
    $desc = preg_replace('~href=".*?&url=(.*?)"~', 'href="https://www.facebook.com/sharer/sharer.php?u=$1"', $desc);
    // date() is used because strftime() is deprecated as of PHP 8.1
    $date = date("l, m/d/Y, H:i:s", strtotime((string) $entry->pubDate));
    echo $date . "\n" . $desc;
    die('1 pass'); // stop after the first item while testing
}
?>
Output (formatting changed for display):
<table border="0" cellpadding="2" cellspacing="7" style="vertical-align:top;">
<tr>
<td width="80" align="center" valign="top"><font style="font-size:85%;font-family:arial,sans-serif"></font></td>
<td valign="top" class="j"><font style="font-size:85%;font-family:arial,sans-serif"><br>
<div style="padding-top:0.8em;"><img alt="" height="1" width="1"></div>
<div class="lh"><a href="https://www.facebook.com/sharer/sharer.php?u=http://www.gamasutra.com/blogs/JonathanRaveh/20150506/242840/Death_of_the_app_keyword__whats_next.php"><b>Death of the app <b>keyword</b> – what's next?</b></a><br>
<font size="-1"><b><font color="#6f6f6f">Gamasutra (blog)</font></b></font><br>
<font size="-1">Yes, app <b>keywords</b> are dying. If you search the web you may find insightful stories about apps that gained massive recognition due to the clever use of <b>keywords</b>. Many companies and services (such as Sensor Tower) offer developers tools to help them ...</font><br>
<font size="-1" class="p"></font><br>
<font class="p" size="-1"><a class="p" href="http://news.google.com/news/more?ncl=d4b6j-gMxFN1VKM&authuser=0&ned=us"><nobr><b>and more »</b></nobr></a></font></div>
</font></td>
</tr>
</table>
1 pass
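The regex ".*?&url=(.*?)" looks between the first double quote of the href and the closing double quote, and captures everything after &url=. In the sample, every instance has the URL as the last parameter. If url is not the last parameter, this regex will not work as is; it would need to stop at either the closing double quote or an ampersand, which would be ("|&). That could truncate article URLs that carry additional GET parameters of their own, although I have not yet seen these URLs use GET parameters. Take out die('1 pass'); to try it on the whole feed, or keep the die if you want a small sample at first.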
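The ("|&) variant mentioned above can be sketched with preg_replace_callback, which also lets the captured URL be percent-encoded before it is appended to the sharer link. This is only a sketch under assumptions: the helper name toSharerLinks and the sample markup (including its sa and usg parameters) are made up for illustration, and the pattern assumes the article address sits in a url parameter, as in the feed above. As noted, an article URL with unencoded parameters of its own would still be cut at the first ampersand.

```php
<?php
// Hypothetical helper: rewrite Google redirect hrefs into Facebook Sharer links.
function toSharerLinks(string $html): string
{
    // Match an href whose query contains url=..., capturing the value
    // up to the next ampersand or the closing double quote.
    $pattern = '~href="[^"]*?(?:[?&]|&amp;)url=([^"&]*)[^"]*"~';

    $result = preg_replace_callback($pattern, function (array $m): string {
        // Decode first in case the value is already percent-encoded,
        // then encode exactly once for use as the u= parameter.
        $article = rawurldecode($m[1]);
        return 'href="https://www.facebook.com/sharer/sharer.php?u='
            . rawurlencode($article) . '"';
    }, $html);

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

// Made-up sample markup; sa and usg are placeholder parameters.
$sample = '<a href="http://news.google.com/news/url?sa=t&url=http://example.com/article&usg=abc">Title</a>';
echo toSharerLinks($sample), PHP_EOL;
```

Hrefs without a url parameter, such as the "and more" link in the output above, are left untouched because the pattern does not match them.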
You can use the built-in PHP functions parse_url (split a URL into its components) and parse_str (get parameter values from a query string) for this:
$feed = file_get_contents(
    "https://news.google.com/news/feeds?q=KEYWORD&output=rss"
);
$xml = new SimpleXmlElement($feed);
foreach ($xml->channel->item as $entry) {
    // Get query part of link (cast the SimpleXMLElement to a string first)
    $query = parse_url((string) $entry->link, PHP_URL_QUERY);
    // Parse query parameters into $params array
    parse_str((string) $query, $params);
    // Get URL from parameters; skip items that have no url parameter
    if (!isset($params['url'])) {
        continue;
    }
    $url = $params['url'];
    // Just output in this example
    echo "URL: $url", PHP_EOL;
    // ... Do some more stuff
}
Output:
URL: http://www.gamasutra.com/blogs/JonathanRaveh/20150506/242840/Death_of_the_app_keyword__whats_next.php
URL: http://www.business2community.com/online-marketing/8-keyword-optimization-tips-perfect-ppc-campaigns-01222200
URL: http://searchengineland.com/marry-keywords-compelling-content-218174
...
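To get from the extracted URL to the link Facebook Sharer expects, the same two functions can be wrapped in a small helper. This is a minimal sketch, not part of the original answer: the function name sharerUrlFromRedirect and the sample redirect link (including its sa parameter) are assumptions made for illustration. The article URL is passed through rawurlencode so that any characters in it cannot be mistaken for parameters of the sharer URL.

```php
<?php
// Hypothetical helper: turn a Google redirect link into a Facebook Sharer URL.
// Returns null when the link has no usable url parameter.
function sharerUrlFromRedirect(string $link): ?string
{
    // Extract the query string; parse_url yields null or false if there is none.
    $query = parse_url($link, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null;
    }

    // Split the query string into an associative array of parameters.
    parse_str($query, $params);
    if (empty($params['url']) || !is_string($params['url'])) {
        return null;
    }

    // Percent-encode the article URL before appending it as the u= parameter.
    return 'https://www.facebook.com/sharer/sharer.php?u=' . rawurlencode($params['url']);
}

// Made-up sample link; sa=t is a placeholder parameter.
$link = 'http://news.google.com/news/url?sa=t&url=http://example.com/article';
echo sharerUrlFromRedirect($link), PHP_EOL;
```

Inside the feed loop, the helper would be called with (string) $entry->link. Note that parse_str splits on every ampersand, so an article URL carrying its own unencoded parameters would lose everything after its first ampersand.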