Foreach explode trim take last date simple dom 解析器
Foreach explode trim take last date simple dom parser
我只想从第三个 div 开始计算文本 div 的最后日期,下面使用 foreach 是我的代码
它只显示日期 01.01.1970 我无法获取日期并将其与今天的日期进行比较
--------------
我要抓取的部分页面
<div class="courses_list">
<article class="course" data-id="4376" data-datepairs="20150718-20150809" data-terms="8">
<div class="img">
<a href="http://ickosovo.com/?course=network-security-pentesting-2"><img src="http://ickosovo.com/wp-content/uploads/2015/07/network-a4-90x60.jpg"></a>
</div>
<div class="text">
<h3><a href="http://ickosovo.com/?course=network-security-pentesting-2">NETWORK SECURITY & PENTESTING</a></h3>
<div class="excerpt"><p>In the last decade, wireless networks gained a substantial momentum. One of the most beneficial features of wireless networks is […]</p>
<div class="applied date_applied">
18 July 2015
-
09 August 2015 </div>
<div class="applied date_applied">
ICT Courses </div>
</div>
</div>
</article>
<article class="course" data-id="4378" data-datepairs="20150727-20150826" data-terms="38">
<div class="img">
<a href="http://ickosovo.com/?course=autocad-autocad-lt-2015-fundamentals-7"><img src="http://ickosovo.com/wp-content/uploads/2015/07/AutoCAD-2015-poster-90x60.png"></a>
</div>
<div class="text">
<h3><a href="http://ickosovo.com/?course=autocad-autocad-lt-2015-fundamentals-7">AUTOCAD / AUTOCAD LT 2015 FUNDAMENTALS</a></h3>
<div class="excerpt"><p>The AutoCAD / AutoCAD LT 2015 Fundamentals Training course is designed for new users of AutoCAD and for delegates who would […]</p>
<div class="applied date_applied">
27 July 2015
-
26 August 2015 </div>
<div class="applied date_applied">
Special Focus </div>
</div>
</div>
</article>
<article class="course" data-id="4439" data-datepairs="20150727-20150918" data-terms="8">
<div class="img">
<a href="http://ickosovo.com/?course=web-design-6"><img src="http://ickosovo.com/wp-content/uploads/2015/07/web-design_poster_july-2015-90x60.png"></a>
</div>
<div class="text">
<h3><a href="http://ickosovo.com/?course=web-design-6">WEB DESIGN</a></h3>
<div class="excerpt"><p>Many other training companies claim that creating a website is easy and can be done by anyone. While this may […]</p>
<div class="applied date_applied">
27 July 2015
-
18 September 2015 </div>
<div class="applied date_applied">
ICT Courses </div>
</div>
</div>
</article>
<article class="course" data-id="4441" data-datepairs="20150728-20150919" data-terms="8">
<div class="img">
<a href="http://ickosovo.com/?course=php5-web-application-3"><img src="http://ickosovo.com/wp-content/uploads/2015/07/php-poster_july-2015-90x60.png"></a>
</div>
<div class="text">
<h3><a href="http://ickosovo.com/?course=php5-web-application-3">PHP5 Web Application</a></h3>
<div class="excerpt"><p>Many other training companies claim that creating a Web application is easy and can be done by anyone. While this […]</p>
<div class="applied date_applied">
28 July 2015
-
19 September 2015 </div>
<div class="applied date_applied">
ICT Courses </div>
</div>
</div>
</article>
</div>
include('simple_html_dom.php');
$html1 = file_get_html($page1);
$today = strtotime("today");
$events_old = array();
$events_today = array();
$events_future = array();
foreach($html1->find('div.text h3') as $e) {
$link = $e->getElementsByTagName('a',0)->href;
$date_array = explode("-", trim($e->next_sibling()->getElementsByTagName('applied.date_applied', 0)->plaintext));
$originalDate = trim($date_array[1]);
$dt = strtotime($originalDate);
$c = array('title' => $e->plaintext, 'date' => date('d.m.Y', $dt), 'timestamp' => $dt, 'from' => 'ict', 'link' => $link);
if($today == $dt) {
array_push($events_today, $c);
} elseif($today > $dt) {
array_push($events_old, $c);
} else {
array_push($events_future, $c);
}
}
经过一些编辑,我们找到了需要解析的 HTML 标记的真实相关结构:
<h3><a >NETWORK SECURITY & PENTESTING</a></h3>
<div class="excerpt"><p>In the last decade, wireless networks gained a substantial momentum. One of the most beneficial features of wireless networks is […]</p>
<div class="applied date_applied">
18 July 2015
-
09 August 2015
</div>
<div class="applied date_applied">
ICT Courses
</div>
</div>
那么包含两个日期的 applied date_applied
元素是 <h3>
第一个兄弟元素的子元素就变得更清楚了。然后,您可以使用 next_sibling()
和 children()
访问它,并使用数组索引 [1]
来引用正确的子节点。
foreach ($html1->find('div.text h3') as $e) {
// get the two dates as an array
// The second child node of the first sibling to the <h3>
$date_array = explode('-' , trim($e->next_sibling()->children()[1]->plaintext)); $originalDate = trim($date_array[1]);
// Trim and convert them to dates:
foreach ($date_array as &$d) {
$d = strtotime(trim($d));
}
}
检查您的结果(现在全部转换为时间戳):
print_r($date_array);
Array
(
[0] => 1437195600
[1] => 1439096400
)
Array
(
[0] => 1437973200
[1] => 1440565200
)
Array
(
[0] => 1437973200
[1] => 1442552400
)
Array
(
[0] => 1438059600
[1] => 1442638800
)
我只想从第三个 div 开始计算文本 div 的最后日期,下面使用 foreach 是我的代码
它只显示日期 01.01.1970 我无法获取日期并将其与今天的日期进行比较 -------------- 我要抓取的部分页面
<div class="courses_list">
<article class="course" data-id="4376" data-datepairs="20150718-20150809" data-terms="8">
<div class="img">
<a href="http://ickosovo.com/?course=network-security-pentesting-2"><img src="http://ickosovo.com/wp-content/uploads/2015/07/network-a4-90x60.jpg"></a>
</div>
<div class="text">
<h3><a href="http://ickosovo.com/?course=network-security-pentesting-2">NETWORK SECURITY & PENTESTING</a></h3>
<div class="excerpt"><p>In the last decade, wireless networks gained a substantial momentum. One of the most beneficial features of wireless networks is […]</p>
<div class="applied date_applied">
18 July 2015
-
09 August 2015 </div>
<div class="applied date_applied">
ICT Courses </div>
</div>
</div>
</article>
<article class="course" data-id="4378" data-datepairs="20150727-20150826" data-terms="38">
<div class="img">
<a href="http://ickosovo.com/?course=autocad-autocad-lt-2015-fundamentals-7"><img src="http://ickosovo.com/wp-content/uploads/2015/07/AutoCAD-2015-poster-90x60.png"></a>
</div>
<div class="text">
<h3><a href="http://ickosovo.com/?course=autocad-autocad-lt-2015-fundamentals-7">AUTOCAD / AUTOCAD LT 2015 FUNDAMENTALS</a></h3>
<div class="excerpt"><p>The AutoCAD / AutoCAD LT 2015 Fundamentals Training course is designed for new users of AutoCAD and for delegates who would […]</p>
<div class="applied date_applied">
27 July 2015
-
26 August 2015 </div>
<div class="applied date_applied">
Special Focus </div>
</div>
</div>
</article>
<article class="course" data-id="4439" data-datepairs="20150727-20150918" data-terms="8">
<div class="img">
<a href="http://ickosovo.com/?course=web-design-6"><img src="http://ickosovo.com/wp-content/uploads/2015/07/web-design_poster_july-2015-90x60.png"></a>
</div>
<div class="text">
<h3><a href="http://ickosovo.com/?course=web-design-6">WEB DESIGN</a></h3>
<div class="excerpt"><p>Many other training companies claim that creating a website is easy and can be done by anyone. While this may […]</p>
<div class="applied date_applied">
27 July 2015
-
18 September 2015 </div>
<div class="applied date_applied">
ICT Courses </div>
</div>
</div>
</article>
<article class="course" data-id="4441" data-datepairs="20150728-20150919" data-terms="8">
<div class="img">
<a href="http://ickosovo.com/?course=php5-web-application-3"><img src="http://ickosovo.com/wp-content/uploads/2015/07/php-poster_july-2015-90x60.png"></a>
</div>
<div class="text">
<h3><a href="http://ickosovo.com/?course=php5-web-application-3">PHP5 Web Application</a></h3>
<div class="excerpt"><p>Many other training companies claim that creating a Web application is easy and can be done by anyone. While this […]</p>
<div class="applied date_applied">
28 July 2015
-
19 September 2015 </div>
<div class="applied date_applied">
ICT Courses </div>
</div>
</div>
</article>
</div>
include('simple_html_dom.php');
$html1 = file_get_html($page1);
$today = strtotime("today");
$events_old = array();
$events_today = array();
$events_future = array();
foreach($html1->find('div.text h3') as $e) {
$link = $e->getElementsByTagName('a',0)->href;
$date_array = explode("-", trim($e->next_sibling()->getElementsByTagName('applied.date_applied', 0)->plaintext));
$originalDate = trim($date_array[1]);
$dt = strtotime($originalDate);
$c = array('title' => $e->plaintext, 'date' => date('d.m.Y', $dt), 'timestamp' => $dt, 'from' => 'ict', 'link' => $link);
if($today == $dt) {
array_push($events_today, $c);
} elseif($today > $dt) {
array_push($events_old, $c);
} else {
array_push($events_future, $c);
}
}
经过一些编辑,我们找到了需要解析的 HTML 标记的真实相关结构:
<h3><a >NETWORK SECURITY & PENTESTING</a></h3>
<div class="excerpt"><p>In the last decade, wireless networks gained a substantial momentum. One of the most beneficial features of wireless networks is […]</p>
<div class="applied date_applied">
18 July 2015
-
09 August 2015
</div>
<div class="applied date_applied">
ICT Courses
</div>
</div>
那么包含两个日期的 applied date_applied
元素是 <h3>
第一个兄弟元素的子元素就变得更清楚了。然后,您可以使用 next_sibling()
和 children()
访问它,并使用数组索引 [1]
来引用正确的子节点。
foreach ($html1->find('div.text h3') as $e) {
// get the two dates as an array
// The second child node of the first sibling to the <h3>
$date_array = explode('-' , trim($e->next_sibling()->children()[1]->plaintext)); $originalDate = trim($date_array[1]);
// Trim and convert them to dates:
foreach ($date_array as &$d) {
$d = strtotime(trim($d));
}
}
检查您的结果(现在全部转换为时间戳):
print_r($date_array);
Array
(
[0] => 1437195600
[1] => 1439096400
)
Array
(
[0] => 1437973200
[1] => 1440565200
)
Array
(
[0] => 1437973200
[1] => 1442552400
)
Array
(
[0] => 1438059600
[1] => 1442638800
)