PHP Simple HTML DOM parsing
I want to use a DOM parser to extract some information from a piece of HTML, but I'm stuck at one point.
<div id="posts">
    <div class="post">
        <div class="user">me:</div>
        <div class="post">I am an apple</div>
    </div>
    <div class="post">
        <div class="user">you:</div>
        <div class="post">I am a banana</div>
    </div>
    <div class="post">
        <div class="user">we:</div>
        <div class="post">We are fruits</div>
    </div>
</div>
This prints the users:
$users = $html->find('div[class=user]');
foreach ($users as $user) {
    echo $user->innertext;
}
This prints the posts:
$posts = $html->find('div[class=post]');
foreach ($posts as $post) {
    echo $post->innertext;
}
I want to print them together rather than separately, like this:
me:
I am an apple
you:
I am a banana
we:
We are fruits
How can I do this with the parser?
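Use the code below. Note that div[class=post] also matches the outer wrapper divs, so the posts are selected with a descendant selector here to keep both lists the same length:
$users = $html->find('div[class=user]');
// Only the inner div.post elements, i.e. those nested inside a wrapper div.post.
$posts = $html->find('div.post div.post');
foreach ($users as $i => $user) {
    echo $user->innertext . "<br>";
    echo $posts[$i]->innertext . "<br>";
}
Hope this helps.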
Assuming you are using Simple HTML DOM Parser, you can pass find() a comma-separated list of selectors. Try this:
$posts = $html->find('div.post');
foreach ($posts as $post) {
    // The inner div.post elements have no matching descendants, so they add no output.
    $children = $post->find('div.user,div.post');
    foreach ($children as $child) {
        echo $child->class . ' -- ';
        echo $child->innertext;
        echo '<br>';
    }
}
Output:
user -- me:
post -- I am an apple
user -- you:
post -- I am a banana
user -- we:
post -- We are fruits
With the markup you provided, you can simply take the children of the main div (div#posts) and loop over them. Then, for each child, grab just its first and second children:
foreach ($html->find('div#posts', 0)->children() as $wrapper) {
    $user = $wrapper->children(0)->innertext; // first child: div.user
    $post = $wrapper->children(1)->innertext; // second child: inner div.post
    echo $user . '<br/>' . $post . '<hr/>';
}
That said, I would really suggest using DOMDocument instead:
// $html_markup holds the HTML string shown in the question.
$dom = new DOMDocument;
$dom->loadHTML($html_markup);
$xpath = new DOMXPath($dom);
// Select only the outer wrappers: direct div.post children of #posts.
$elements = $xpath->query('//div[@id="posts"]/div[@class="post"]');
foreach ($elements as $wrapper) {
    $user = $xpath->evaluate('string(./div[@class="user"])', $wrapper);
    $post = $xpath->evaluate('string(./div[@class="post"])', $wrapper);
    echo $user . '<br/>' . $post . '<hr/>';
}
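For anyone who wants to try the DOMDocument approach without installing a third-party parser, here is a minimal self-contained sketch. The function name extractPosts, the trim() calls, and the plain-text output are my own additions for illustration, not part of the answer above. It assumes PHP 7.1 or later with the dom extension enabled.

```php
<?php
// Hypothetical helper (not from the original answer): returns a list of
// [user, post] text pairs found in markup shaped like the question's HTML.
function extractPosts(string $markup): array
{
    // Collect libxml parse warnings internally instead of emitting them.
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($markup);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $pairs = [];
    // Only the outer wrappers: div.post elements that are direct children of #posts.
    foreach ($xpath->query('//div[@id="posts"]/div[@class="post"]') as $wrapper) {
        $pairs[] = [
            trim($xpath->evaluate('string(./div[@class="user"])', $wrapper)),
            trim($xpath->evaluate('string(./div[@class="post"])', $wrapper)),
        ];
    }
    return $pairs;
}

$markup = <<<HTML
<div id="posts">
    <div class="post">
        <div class="user">me:</div>
        <div class="post">I am an apple</div>
    </div>
    <div class="post">
        <div class="user">you:</div>
        <div class="post">I am a banana</div>
    </div>
    <div class="post">
        <div class="user">we:</div>
        <div class="post">We are fruits</div>
    </div>
</div>
HTML;

// Print each user followed by that user's post, one per line.
foreach (extractPosts($markup) as [$user, $post]) {
    echo $user, "\n", $post, "\n";
}
```

Returning the pairs from a function rather than echoing inside the loop keeps the extraction separate from the presentation, so the same result can be rendered with <br/> tags, as in the answers above, or as plain text.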