Read the content of an HTML table with a specific class

Question

I have a news page that displays my news. I use a table to show the headlines.

<table class="news">
<tr>
    <th>#</th>
    <th></th>
</tr>
<tr>...</tr>
<tr>...</tr>
</table>

The page contains other tables as well, but I want to get only this table on another page. I searched around and found this:

$text = file_get_contents("http://www.example.com/news");
echo strip_tags($text, "<table><tr><th><td>");

The output contains every table on the news page. I only want the table with class "news".
How can I do that?

Answer 1

echo strip_tags($text, "<table class='news'>|<tr>|<th>|<td>");

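This is meant to strip all tags except the ones listed. Be aware that strip_tags() filters by tag name only, so the allow-list cannot select a table by its class attribute. By comparison: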

echo strip_tags($text, "<table><tr><th><td>");

This strips every tag except the ones named in the string:

<table><tr><th><td>
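
To illustrate why a tag allow-list on its own cannot isolate the news table, here is a minimal sketch. The $text markup is made-up sample data standing in for the real page. strip_tags() removes tags that are not named in the allow-list and leaves the allowed ones in place, whatever their attributes, so both tables survive.

```php
<?php
// Hypothetical sample markup standing in for the fetched news page.
$text = '<div><table class="news"><tr><td>News</td></tr></table>'
      . '<table class="other"><tr><td>Other</td></tr></table></div>';

// The allow-list names tags only; it has no way to express class="news".
$stripped = strip_tags($text, "<table><tr><th><td>");

// Count the tables that remain after stripping: both are still there,
// and only the <div> wrapper has been removed.
echo substr_count($stripped, "<table"), "\n";
```

Selecting by class therefore needs an HTML parser rather than strip_tags().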

Answer 2

I created sample code with two tables. You can see the result in the final output.

<?php
$html = <<<EOT
<table class="news" border='1'>
<tr>
<th>#</th>
<th></th>
</tr>
<tr><td>New 1 - first </td><td>New 1 - second </td></tr>
<tr><td>New 1 - fifth </td><td>New 1 - forth</td></tr>

</table>
<table class="another_news" border='1'>
<tr>
<th>#</th>
<th></th>
</tr>
<tr><td>Another New 1 - first </td><td>Another New 1 - first </td></tr>
<tr><td>AnotherNew 1 - first </td><td>Another New 1 - first </td></tr>

</table>
EOT;
echo $html;
echo "<hr>";
$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html); // parse the HTML string
$xpath = new DOMXPath($doc);
// select every table whose class attribute is exactly "news"
$tables = $xpath->query('//table[@class="news"]');
$requiredTable = ''; // will hold the HTML of the matching tables
foreach ($tables as $table) {
    $requiredTable .= $doc->saveHTML($table); // serialize the node as HTML rather than XML
}
echo $requiredTable;
?>

This should print the table stored in the $requiredTable variable.
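
One caveat: the test `@class="news"` matches only when the attribute is exactly `news`, so a table written as `class="news striped"` would be missed. A common XPath 1.0 idiom pads the normalized class list with spaces and looks for ` news ` as a whole word. The sketch below wraps that in a hypothetical helper, `extractTablesByClass`, which is not part of the answer above. The sample markup is made up for illustration.

```php
<?php
// Hypothetical helper (not from the answer above): returns the HTML of every
// <table> whose class list contains $class as a whole word.
function extractTablesByClass(string $html, string $class): string
{
    $doc = new DOMDocument();
    // Collect parser warnings from real-world markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Pad the normalized class attribute with spaces so "news" matches
    // class="news" and class="news striped", but not class="another_news".
    // Assumes $class is a plain class name without quote characters.
    $query = sprintf(
        '//table[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    $out = '';
    foreach ($xpath->query($query) as $table) {
        $out .= $doc->saveHTML($table);
    }
    return $out;
}

// Made-up sample input with one matching and one non-matching table.
$html = '<table class="news striped"><tr><td>A</td></tr></table>'
      . '<table class="another_news"><tr><td>B</td></tr></table>';
echo extractTablesByClass($html, 'news');
```

For the real page, the `$html` argument would come from `file_get_contents("http://www.example.com/news")`, as in the question.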

Tags: html, php, file-get-contents