Getting data from an updating website using perl

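I've been trying to write a perl program that tells me the water level of a river, read from a regularly updated website (https://www.vizugy.hu/?mapModule=OpGrafikon&AllomasVOA=73F7E310-985C-11D4-BB62-00508BA24287&mapData=Idosor), but my program can't access the site and I'm completely stuck. I'm a beginner.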

#!/usr/bin/perl -w

$url = "https://www.vizugy.hu/?mapModule=OpGrafikon&AllomasVOA=73F7E310-985C-11D4-BB62-00508BA24287&mapData=Idosor";

use LWP::Simple;

$site = get($url) or die "The webpage won't load";

if($site =~ /<strong>(\d+)<\/strong>/ig){
    $waterLevel = $1;
    }else{
    die "Can't find the water level (Vízállás (cm))";
}

if($site =~ /<strong>(\d+.\d+.\d+. \d+.:\d+)<\/strong>/){
    $date = $1;
    }else{
    die "Can't find date (Időpont)";
}


print("The water level in Komarom is $waterLevel cm (Date: $date)\n");

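I'm doing this for a class, and I have to use LWP. The website is in Hungarian and the variables were in Hungarian too, but I did my best to translate them.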

Your code works as expected from my Linux command line. But I see exactly the same behaviour as you when I run it in this online IDE.

The problem with LWP::Simple is that it's hard to debug what went wrong. So I've replaced the top of your code so that it uses LWP::UserAgent instead.

#!/usr/bin/perl

# Always use these
use strict;
use warnings;

my $url = "https://www.vizugy.hu/?mapModule=OpGrafikon&AllomasVOA=73F7E310-985C-11D4-BB62-00508BA24287&mapData=Idosor";

use LWP::UserAgent;

print "Make a UA\n";
my $ua = LWP::UserAgent->new;

print "Request\n";
my $resp = $ua->get($url) or die "The webpage won't load";

print "Response\n";
print $resp->code, ': ', $resp->message, "\n";

my $site = $resp->content;

my ($waterLevel, $date);

if ($site =~ /<strong>(\d+)<\/strong>/ig) {
    $waterLevel = $1;
}else{
    die "Can't find the water level (Vízállás (cm))";
}

if ($site =~ /<strong>(\d+.\d+.\d+. \d+.:\d+)<\/strong>/) {
    $date = $1;
}else{
    die "Can't find date (Időpont)";
}

print("The water level in Komarom is $waterLevel cm (Date: $date)\n");

The response I see is:

Make a UA
Request
Response
500: Can't connect to www.vizugy.hu:443 (Temporary failure in name resolution)
Can't find the water level (Vízállás (cm)) at main.pl line 21

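So it looks like your online IDE isn't set up to make HTTP requests: the host name can't even be resolved. You could contact the owner (the email address is on the front page of their website), or you could report the problem to your instructor.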
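As a rough sketch of how to tell this kind of local failure apart from a real server error: LWP labels the responses it generates itself (DNS, connection or TLS problems) with a `Client-Warning: Internal response` header. The helper name `diagnose` and the 10-second timeout are my own choices for illustration, not something from the code above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn a response object into a one-line diagnosis.
# It only relies on the is_success, status_line and header methods.
sub diagnose {
    my ($resp) = @_;
    return 'OK: ' . $resp->status_line if $resp->is_success;

    # LWP marks responses it made up itself (no server was involved)
    # with a "Client-Warning: Internal response" header.
    my $warning = $resp->header('Client-Warning') // '';
    return 'Local failure, the server was never reached: ' . $resp->status_line
        if $warning eq 'Internal response';

    return 'The server replied with an error: ' . $resp->status_line;
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    require LWP::UserAgent;
    my $ua   = LWP::UserAgent->new(timeout => 10);   # timeout is an arbitrary choice
    my $resp = $ua->get($ARGV[0]);
    print diagnose($resp), "\n";
}
```

Run it with the page URL as the only argument. If it reports a local failure inside the online IDE but works on your own machine, the problem is the environment, not your code.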

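Please study the following demo code, which

  • downloads the web page
  • extracts the data from the Javascript block
  • processes the extracted data
  • builds the hash %data
  • makes use of perlform
  • outputs the data as a table
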
use strict;
use warnings;
use feature 'say';

use Data::Dumper;
use LWP::UserAgent;

my $ua  = LWP::UserAgent->new;
my $url = 'https://www.vizugy.hu/?mapModule=OpGrafikon&AllomasVOA=73F7E310-985C-11D4-BB62-00508BA24287&mapData=Idosor';
my $req = $ua->get($url);

if ($req->is_success) {
    say 'INFO: Success loading web page';
} else {
    die "Could not get($url): " . $req->status_line;
}

my %data = $req->decoded_content =~ /(\w+) = new Array(.*?);/g;

$data{$_} =~ s/[()']//g for keys %data;

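# Character class: deletes every '<', 's', 'u', 'p', '>', '|' and '/' character
$data{Vizhozam} =~ s/[<sup>|<\/sup>]//g;
# "2 620,00 m3" becomes "2620.00"
$data{Vizhozam} =~ s/(\d+) (\d{3}),(\d{2}) m3/$1$2.$3/g;
# Drop the degree sign, then turn "5,8C" into "5.8"
$data{Vizho}    =~ s/ \x{b0}//g;
$data{Vizho}    =~ s/(\d+),(\d+)C/$1.$2/g;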

for (keys %data) {
    my @array = split(',', $data{$_});
    s/^ // for @array;    # strip the leading space left after each comma
    $data{$_} = \@array;
}

#say Dumper(\%data);

my $count = @{$data{Idopont}}-1;
my($date,$level,$flow,$temp);

$^ = "STDOUT_TOP";
$~ = "STDOUT";

for ( 0..$count ) {
    ($date,$level,$flow,$temp) = ($data{Idopont}[$_],$data{Vizallas}[$_],$data{Vizhozam}[$_],$data{Vizho}[$_]);
    write;
}

$~ = "STDOUT_BOTTOM";
write;

format STDOUT_TOP =
+-------------------+------------------+-------------------+----------------+
| Date              | Water level (cm) | Water flow (m3/s) | Water temp (C) |
+-------------------+------------------+-------------------+----------------+
.

format STDOUT =
| @<<<<<<<<<<<<<<<< |             @>>> |      @>>>>>>>>>>> |          @>>>> |
$date, $level, $flow, $temp
.

format STDOUT_BOTTOM =
+-------------------+------------------+-------------------+----------------+
.

Output produced

INFO: Success loading web page
+-------------------+------------------+-------------------+----------------+
| Date              | Water level (cm) | Water flow (m3/s) | Water temp (C) |
+-------------------+------------------+-------------------+----------------+
| 2022.01.06. 07:00 |              331 |           2620.00 |            5.8 |
| 2022.01.07. 07:00 |              334 |           2650.00 |            5.3 |
| 2022.01.08. 07:00 |              331 |           2620.00 |            4.9 |
| 2022.01.09. 07:00 |              289 |           2240.00 |            4.5 |
| 2022.01.10. 07:00 |              272 |           2100.00 |            4.4 |
| 2022.01.11. 07:00 |              260 |           2010.00 |            4.1 |
| 2022.01.12. 07:00 |              243 |           1880.00 |            3.8 |
| 2022.01.13. 07:00 |              228 |           1770.00 |            3.4 |
| 2022.01.14. 07:00 |              213 |           1660.00 |            3.3 |
| 2022.01.15. 07:00 |              196 |           1550.00 |            3.4 |
| 2022.01.16. 07:00 |              195 |           1540.00 |            3.4 |
| 2022.01.17. 07:00 |              183 |           1470.00 |            3.5 |
| 2022.01.18. 07:00 |              185 |           1480.00 |            3.1 |
| 2022.01.19. 07:00 |              173 |           1410.00 |            2.8 |
| 2022.01.23. 07:00 |              164 |           1360.00 |            2.2 |
| 2022.01.24. 07:00 |              154 |           1300.00 |            2.0 |
| 2022.01.25. 07:00 |              161 |           1340.00 |            2.1 |
| 2022.01.26. 07:00 |              173 |           1410.00 |            2.4 |
+-------------------+------------------+-------------------+----------------+
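The extraction step in the demo can also be tried offline on a small hard-coded string, which is handy when the network is not available. This is only a sketch: the helper name `parse_js_arrays` is hypothetical, the sample JavaScript is my assumption about the page format, and the values are borrowed from the table above. Pulling out each quoted item, rather than splitting on commas, means that commas inside a value do no harm.

```perl
use strict;
use warnings;

# Hypothetical helper: parse "Name = new Array('a', 'b');" declarations
# into a reference to a hash of array references.
sub parse_js_arrays {
    my ($js) = @_;
    my %data;
    while ($js =~ /(\w+)\s*=\s*new Array\((.*?)\);/gs) {
        my ($name, $body) = ($1, $2);
        # Collect every single-quoted item of this array.
        $data{$name} = [ $body =~ /'([^']*)'/g ];
    }
    return \%data;
}

# Assumed page format; the values come from the output table above.
my $sample = <<'JS';
var Idopont = new Array('2022.01.06. 07:00', '2022.01.07. 07:00');
var Vizallas = new Array('331', '334');
JS

my $data = parse_js_arrays($sample);
for my $i (0 .. $#{ $data->{Idopont} }) {
    printf "%s  %4d cm\n", $data->{Idopont}[$i], $data->{Vizallas}[$i];
}
```

The same sub could be given `$req->decoded_content` from the demo once the download works.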