How to scrape <div class="content" id="getSch"> using HtmlAgilityPack in C#
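I want to scrape data from a movie website, specifically the movie schedule and the movie titles, and I don't know how to write the query to scrape this HTML: <div class="content" id="getSh">.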
<div class="container">
<div class="content" id="getSh"><ul class="ctr"><li class="ctrl">Cinema 1</li>
<li class="ctrr">09, Mar</li><li class="cl"></li></ul>
<ul class="col_row"><li class="col"><a href="#">3:15 pm</a></li>
<li class="col cb"><a href="/movies/detail/299">The Second Best Exotic Marigold Hotel
<span class="blue">Digital 2D</span></a></li><li class="col cc"><a href="#">--</a>
</li><li class="cl"></li></ul> <ul class="col_row"><li class="col"><a href="#">6:15 pm</a></
li><li class="col cb"><a href="/movies/detail/307">Focus <span class="blue">Digital 2D
</span><span class="red">Adults Only</span></a></li><li class="col cc"><a href="#">--
</a></li><li class="cl"></li></ul> <ul class="col_row"><li class="col">
<a href="#">8:45 pm</a></li><li class="col cb"><a href="/movies/detail/266">
Kingsman: The Secret Service <span class="blue">Digital 2D</span><span class="red">
Adults Only</span></a></li><li class="col cc"><a href="#">--</a></li><li class="cl">
</li></ul><ul class="col_row col_m"><li class="col"><a href="#">11:45 pm</a></li>
<li class="col cb"><a href="/movies/detail/267">Badlapur <span class="blue">Digital 2D
</span></a></li><li class="col cc"><a href="#">--</a></li><li class="cl">
</li></ul><ul class="ctr"><li class="ctrl">Cinema 2</li><li class="ctrr">09, Mar</li>
<li class="cl"></li></ul> <ul class="col_row"><li class="col"><a href="#">3:30 pm</a>
</li><li class="col cb"><a href="/movies/detail/307">Focus <span class="blue">Digital
</span><span class="red">Adults Only</span></a></li><li class="col cc"><a href="#">--<
/a></li><li class="cl"></li></ul> <ul class="col_row"><li class="col"><a href="#">6:00
pm</a></li><li class="col cb"><a href="/movies/detail/266">Kingsman: The Secret Service
<span class="blue">Digital 2D</span><span class="red">Adults Only</span></a></li>
<li class="col cc"><a href="#">--</a></li><li class="cl"></li></ul> <ul class="col_row">
<li class="col"><a href="#">9:00 pm</a></li><li class="col cb"><a href="/movies/detail/307">
Focus <span class="blue">Digital 2D</span><span class="red">Adults Only</span></a></li>
<li class="col cc"><a href="#">--</a></li><li class="cl"></li></ul><ul class="col_row col_m">
<li class="col"><a href="#">11:30 pm</a></li><li class="col cb"><a href="/movies/detail/266">
Kingsman: The Secret Service <span class="blue">Digital 2D</span><span class="red">Adults Only
</span></a></li><li class="col cc"><a href="#">--</a></li><li class="cl"></li></ul><ul class="
ctr"><li class="ctrl">Cinema 3</li><li class="ctrr">09, Mar</li><li class="cl"></li></ul>
<ul class="col_row"><li class="col"><a href="#">3:45 pm</a></li><li class="col cb"><
a href="/movies/detail/321">Hey Bro <span class="blue">Digital 2D</span></a></li><
li class="col cc"><a href="#">--</a></li><li class="cl"></li></ul> <ul class="col_row"><
li class="col"><a href="#">6:30 pm</a></li><li class="col cb"><a href="/movies/detail/328">D
irty Politics <span class="blue">Digital 2D</span><span class="red">Adults Only</span>
</a></li><li class="col cc"><a href="#">--</a></li><li class="cl"></li></ul>
<ul class="col_row"><li class="col"><a href="#">9:30 pm</a></li><li class="col cb">
<a href="/movies/detail/321">Hey Bro <span class="blue">Digital 2D</span></a></li><
li class="col cc"><a href="#">--</a></li><li class="cl"></li></ul><ul class="col_row col_m">
<li class="col"><a href="#">12:15 am</a></li><li class="col cb"><a href="/movies/detail/328"
>Dirty Politics <span class="blue">Digital 2D</span><span class="red">Adults Only</span></a>
</li><li class="col cc"><a href="#">--</a></li><li class="cl"></li></ul><ul class="ctr">
<li class="ctrl">Cinema 4</li><li class="ctrr">09, Mar</li><li class="cl"></li></ul>
<ul class="col_row"><li class="col"><a href="#">3:00 pm</a></li><li class="col cb">
<a href="/movies/detail/295">The SpongeBob Movie: Sponge Out of Water <span class="blue">D
igital 3D</span></a></li><li class="col cc"><a href="#">--</a></li><li class="cl"></li>
</ul> <ul class="col_row"><li class="col"><a href="#">5:15 pm</a></li><li class="col cb">
<a href="/movies/detail/300">Paddington <span class="blue">Digital 2D</span></a></li>
<li class="col cc"><a href="#">--</a></li><li class="cl"></li></ul> <ul class="col_row"><
li class="col"><a href="#">7:30 pm</a></li><li class="col cb"><a href="/movies/detail/297">
Unbroken <span class="blue">Digital 2D</span></a></li><li class="col cc"><a href="#">--</a>
</li><li class="cl"></li></ul><ul class="col_row col_m"><li class="col"><a href="#">10:30 pm
</a></li><li class="col cb">
<a href="/movies/detail/299">The Second Best Exotic Marigold Hotel <span class="blue">Digital 2D<
/span></a></li><li class="col cc"><
a href="#">--</a></li><li class="cl"></li></ul><ul class="ctr">
<li class="ctrl">Royal Cinema</li><li class="ctrr">09, Mar</li>
<li class="cl"></li></ul> <ul class="col_row"><li class="col"><
a href="#">3:05 pm</a></li><li class="col cb"><a href="/movies/detail/328">Dirty Politics <
span class="blue">Digital 2D</span><span class="red">Adults Only</span></a></li><li class="col cc">
<a href="#">--</a></li><li class="cl"></li></ul> <ul class="col_row"><li class="col"><a href="#">
6:05 pm</a></li><li class="col cb"><a href="/movies/detail/307">Focus <span class="blue">Digital 2D
</span><span class="red">Adults Only</span></a></li><li class="col cc"><a href="#">--</a></li>
<li class="cl"></li></ul><ul class="col_row col_m"><li class="col"><a href="#">8:30 pm</a></li>
<li class="col cb"><a href="/movies/detail/299">The Second Best Exotic Marigold Hotel
<span class="blue">Digital 2D</span></a></li><li class="col cc"><a href="#">--</a></li>
<li class="cl"></li></ul></div>
</div>
I am using this C# code to extract the data, but it does not work properly:
HtmlNode htmlNode = document.DocumentNode.SelectSingleNode("//div[@id='customScrollBox']");
List<string> movieList = new List<string>();
foreach (HtmlNode heading in htmlNode.SelectNodes("//ul[@class='col_row']"))
{
movieList.Add(heading.InnerText);
}
I want this output:
Cinema hall = Cinema 1
Movie name = The Second Best Exotic Marigold Hotel
and also the showtime
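As I understand it, you want to get the movie name? If so, the code below should do that:
foreach (HtmlNode row in htmlNode.SelectNodes("//ul[@class='col_row']"))
{
    // The loop variable is named row so it does not clash with the local below.
    var heading = row.SelectSingleNode(".//li[@class='col cb']/a").InnerText;
    //I Presume you want other fields here?
}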
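To also get the cinema hall and the showtime, one possible approach is to walk the <ul> children of the container in order and remember the most recent "ctr" header. The sketch below is only an illustration under some assumptions: the container id is taken to be "getSh" as in the sample markup (the question's code looks up 'customScrollBox', which does not appear in the sample), and the names Showtime, ShowtimeScraper, Parse and containerId are hypothetical, not part of HtmlAgilityPack. It matches classes by token because an exact test such as @class='col_row' does not match rows marked class="col_row col_m", and it uses relative paths because an XPath starting with // searches from the document root even when called on a child node.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical container for one schedule row (not an HtmlAgilityPack type).
public sealed class Showtime
{
    public string Cinema { get; set; }
    public string Time { get; set; }
    public string Title { get; set; }
}

// Hypothetical helper; the default id "getSh" is an assumption based on the sample markup.
public static class ShowtimeScraper
{
    public static List<Showtime> Parse(string html, string containerId = "getSh")
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        var results = new List<Showtime>();
        HtmlNode container = document.DocumentNode
            .SelectSingleNode("//div[@id='" + containerId + "']");
        if (container == null)
        {
            return results; // SelectSingleNode returns null when nothing matches
        }

        string currentCinema = null;
        // Visit the direct <ul> children in document order.
        foreach (HtmlNode ul in container.Elements("ul"))
        {
            // Split the class attribute on whitespace so "col_row col_m" still counts as a row.
            string[] classes = ul.GetAttributeValue("class", "")
                .Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

            if (classes.Contains("ctr"))
            {
                // Header row: remember the cinema name for the rows that follow it.
                currentCinema = Clean(ul.SelectSingleNode("./li[@class='ctrl']")?.InnerText);
            }
            else if (classes.Contains("col_row"))
            {
                HtmlNode timeLink = ul.SelectSingleNode("./li[@class='col']/a");
                HtmlNode titleLink = ul.SelectSingleNode("./li[@class='col cb']/a");
                if (timeLink == null || titleLink == null)
                {
                    continue; // skip rows that do not have the expected cells
                }

                // Keep only the link's own text nodes so the <span> badges are left out.
                string title = string.Concat(titleLink.ChildNodes
                    .Where(n => n.NodeType == HtmlNodeType.Text)
                    .Select(n => n.InnerText));

                results.Add(new Showtime
                {
                    Cinema = currentCinema,
                    Time = Clean(timeLink.InnerText),
                    Title = Clean(title)
                });
            }
        }
        return results;
    }

    // Decodes HTML entities and collapses runs of whitespace, including line breaks.
    private static string Clean(string text)
    {
        string decoded = HtmlEntity.DeEntitize(text ?? string.Empty);
        return string.Join(" ", decoded.Split((char[])null, StringSplitOptions.RemoveEmptyEntries));
    }
}
```

Calling ShowtimeScraper.Parse(document.DocumentNode.OuterHtml), or passing the downloaded page source directly, returns one Showtime per schedule row, which can then be printed as "Cinema hall = ..., Movie name = ..., Time = ...". If the live page uses a different container id, pass it as the second argument.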