Getting data from an updating website using perl
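I've been trying to write a perl program that reads the water level of a river from a regularly updated website (https://www.vizugy.hu/?mapModule=OpGrafikon&AllomasVOA=73F7E310-985C-11D4-BB62-00508BA24287&mapData=Idosor), but my program can't access the site and I'm completely stuck. I'm a beginner.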
#!/usr/bin/perl -w
$url = "https://www.vizugy.hu/?mapModule=OpGrafikon&AllomasVOA=73F7E310-985C-11D4-BB62-00508BA24287&mapData=Idosor";
use LWP::Simple;
$site = get($url) or die "The webpage won't load";
if($site =~ /<strong>(\d+)<\/strong>/ig){
    $waterLevel = $1;
}else{
    die "Can't find the water level (Vízállás (cm))";
}
if($site =~ /<strong>(\d+.\d+.\d+. \d+.:\d+)<\/strong>/){
    $date = $1;
}else{
    die "Can't find date (Időpont)";
}
print("The water level in Komarom is $waterLevel cm (Date: $date)\n");
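I'm doing this for a class and I have to use LWP. The website is in Hungarian and the variables were in Hungarian too, but I did my best to translate them.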
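Your code works as expected from my Linux command line. But I see exactly the same behaviour as you when I run it in this online IDE.
The problem with LWP::Simple is that it's hard to debug what goes wrong. So I've replaced the top of your code so that it uses LWP::UserAgent instead.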
#!/usr/bin/perl
# Always use these
use strict;
use warnings;
my $url = "https://www.vizugy.hu/?mapModule=OpGrafikon&AllomasVOA=73F7E310-985C-11D4-BB62-00508BA24287&mapData=Idosor";
use LWP::UserAgent;
print "Make a UA\n";
my $ua = LWP::UserAgent->new;
print "Request\n";
my $resp = $ua->get($url) or die "The webpage won't load";
print "Response\n";
# Show the status code and message so failures are visible
print $resp->code, ': ', $resp->message, "\n";
my $site = $resp->content;
my ($waterLevel, $date);
if ($site =~ /<strong>(\d+)<\/strong>/ig) {
    $waterLevel = $1;
}else{
    die "Can't find the water level (Vízállás (cm))";
}
if ($site =~ /<strong>(\d+.\d+.\d+. \d+.:\d+)<\/strong>/) {
    $date = $1;
}else{
    die "Can't find date (Időpont)";
}
print("The water level in Komarom is $waterLevel cm (Date: $date)\n");
The response I see is:
Make a UA
Request
Response
500: Can't connect to www.vizugy.hu:443 (Temporary failure in name resolution)
Can't find the water level (Vízállás (cm)) at main.pl line 21
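It looks like your online IDE isn't set up to make HTTP requests. You could contact the owner (the email address is on the front page of their site) or you could report the problem to your instructor.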
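To tell a sandbox or DNS failure apart from a genuine HTTP error, it can help to check where the error response came from. The following is a minimal sketch, assuming LWP's documented behaviour of tagging responses it generates itself with a `Client-Warning: Internal response` header. The `describe_response` helper is made up for illustration, and the response is built by hand (with an abbreviated message) so the snippet runs without network access.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: summarise a response and say whether the request
# ever reached the server.
sub describe_response {
    my ($resp) = @_;
    return 'OK: ' . $resp->status_line if $resp->is_success;

    # LWP marks responses it generates itself (DNS failure, refused
    # connection, timeout) with this header.
    my $warning = $resp->header('Client-Warning') // '';
    my $origin  = $warning eq 'Internal response'
        ? 'client-side failure (request never reached the server)'
        : 'error returned by the server';
    return 'FAILED: ' . $resp->status_line . " [$origin]";
}

# Hand-built stand-in for what $ua->get($url) returned in the online IDE.
my $resp = HTTP::Response->new(
    500, "Can't connect (Temporary failure in name resolution)"
);
$resp->header('Client-Warning' => 'Internal response');

print describe_response($resp), "\n";
```

In a real script you would pass the object returned by `$ua->get($url)` to the helper and stop before parsing unless `is_success` is true.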
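Please study the following demo code, which
- downloads the web page
- extracts data from the JavaScript block
- processes the obtained data
- forms the hash %data
- uses perlform to output the data as a table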
use strict;
use warnings;
use feature 'say';
use Data::Dumper;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
my $url = 'https://www.vizugy.hu/?mapModule=OpGrafikon&AllomasVOA=73F7E310-985C-11D4-BB62-00508BA24287&mapData=Idosor';
my $req = $ua->get($url);
if ($req->is_success) {
    say 'INFO: Success loading web page';
} else {
    die "Could not get($url): " . $req->status_line;
}

# Capture every "name = new Array(...)" assignment as a name => contents pair
my %data = $req->decoded_content =~ /(\w+) = new Array(.*?);/g;
$data{$_} =~ s/[()']//g for keys %data;

# Character class: deletes the characters < s u p > | / one by one,
# which strips the <sup> tags and also the trailing "/s"
$data{Vizhozam} =~ s/[<sup>|<\/sup>]//g;
# "2 620,00 m3" becomes "2620.00"
$data{Vizhozam} =~ s/(\d+) (\d{3}),(\d{2}) m3/$1$2.$3/g;
# Drop the degree sign, then turn "5,8C" into "5.8"
$data{Vizho} =~ s/ \x{b0}//g;
$data{Vizho} =~ s/(\d+),(\d+)C/$1.$2/g;

for (keys %data) {
    my @array = split(',', $data{$_});
    s/^ // for @array;    # trim one leading space from each element
    $data{$_} = \@array;
}
#say Dumper(\%data);

my $count = @{$data{Idopont}}-1;
my($date,$level,$flow,$temp);
$^ = "STDOUT_TOP";
$~ = "STDOUT";
for ( 0..$count ) {
    ($date,$level,$flow,$temp) = ($data{Idopont}[$_],$data{Vizallas}[$_],$data{Vizhozam}[$_],$data{Vizho}[$_]);
    write;
}
$~ = "STDOUT_BOTTOM";
write;

format STDOUT_TOP =
+-------------------+------------------+-------------------+----------------+
| Date              | Water level (cm) | Water flow (m3/s) | Water temp (C) |
+-------------------+------------------+-------------------+----------------+
.
format STDOUT =
| @<<<<<<<<<<<<<<<< |       @>>>       |   @>>>>>>>>>>>    |     @>>>>      |
$date, $level, $flow, $temp
.
format STDOUT_BOTTOM =
+-------------------+------------------+-------------------+----------------+
.
Output produced
INFO: Success loading web page
+-------------------+------------------+-------------------+----------------+
| Date              | Water level (cm) | Water flow (m3/s) | Water temp (C) |
+-------------------+------------------+-------------------+----------------+
| 2022.01.06. 07:00 |        331       |        2620.00    |       5.8      |
| 2022.01.07. 07:00 |        334       |        2650.00    |       5.3      |
| 2022.01.08. 07:00 |        331       |        2620.00    |       4.9      |
| 2022.01.09. 07:00 |        289       |        2240.00    |       4.5      |
| 2022.01.10. 07:00 |        272       |        2100.00    |       4.4      |
| 2022.01.11. 07:00 |        260       |        2010.00    |       4.1      |
| 2022.01.12. 07:00 |        243       |        1880.00    |       3.8      |
| 2022.01.13. 07:00 |        228       |        1770.00    |       3.4      |
| 2022.01.14. 07:00 |        213       |        1660.00    |       3.3      |
| 2022.01.15. 07:00 |        196       |        1550.00    |       3.4      |
| 2022.01.16. 07:00 |        195       |        1540.00    |       3.4      |
| 2022.01.17. 07:00 |        183       |        1470.00    |       3.5      |
| 2022.01.18. 07:00 |        185       |        1480.00    |       3.1      |
| 2022.01.19. 07:00 |        173       |        1410.00    |       2.8      |
| 2022.01.23. 07:00 |        164       |        1360.00    |       2.2      |
| 2022.01.24. 07:00 |        154       |        1300.00    |       2.0      |
| 2022.01.25. 07:00 |        161       |        1340.00    |       2.1      |
| 2022.01.26. 07:00 |        173       |        1410.00    |       2.4      |
+-------------------+------------------+-------------------+----------------+
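The extraction step can also be tried offline before pointing it at the live page. The sketch below is not the code above but a variant of it: instead of stripping quotes and splitting on commas, it captures each quoted element directly, so commas inside values need no special handling. The sample JavaScript is a hypothetical guess at the page's markup, and `parse_arrays` and `to_number` are illustrative names.

```perl
use strict;
use warnings;

# Hypothetical sample of the page's JavaScript; the real markup may differ.
my $js = <<'END_JS';
var Idopont = new Array('2022.01.06. 07:00', '2022.01.07. 07:00');
var Vizallas = new Array('331', '334');
var Vizhozam = new Array('2 620,00 m<sup>3</sup>/s', '2 650,00 m<sup>3</sup>/s');
END_JS

# Collect the quoted elements of each "name = new Array(...)" assignment.
sub parse_arrays {
    my ($text) = @_;
    my %data;
    while ($text =~ /(\w+) = new Array\((.*?)\);/g) {
        my ($name, $body) = ($1, $2);
        $data{$name} = [ $body =~ /'([^']*)'/g ];
    }
    return %data;
}

# Turn "2 620,00 m<sup>3</sup>/s" into "2620.00": keep the leading number only.
sub to_number {
    my ($value) = @_;
    my ($num) = $value =~ /^\s*(\d[\d ]*(?:,\d+)?)/ or return $value;
    $num =~ s/ //g;    # drop thousands separators
    $num =~ s/,/./;    # decimal comma to decimal point
    return $num;
}

my %data = parse_arrays($js);
$_ = to_number($_) for @{ $data{Vizhozam} };

printf "%s  %4s cm  %8s m3/s\n",
    $data{Idopont}[$_], $data{Vizallas}[$_], $data{Vizhozam}[$_]
    for 0 .. $#{ $data{Idopont} };
```

Once the regexes behave on the sample, replace `$js` with `$req->decoded_content` from the script above and adjust the patterns to whatever the page actually contains.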