从维基百科打印信息
Print information from Wikipedia
我正在尝试从维基百科打印出信息,结果确实如此:http://i.imgur.com/1vMj3df.jpg
这是我正在使用的代码:
use strict;
use warnings;
use WWW::Wikipedia;
use HTML::Restrict;
my $wiki = WWW::Wikipedia->new(clean_html => 1);
my $hr = HTML::Restrict->new;
my $entry = $wiki->search('Perl');
my $processed = $hr->process($entry->text_basic);
print $processed;
1;
我希望删除上面的内容 "perl is a family" 并只打印出段落。
打印出来的例子:
{{Infobox programming language
| name = Perl
| logo =
| paradigm = multi-paradigm: functional, imperative, object-oriented (class-based), reflective, procedural, event-driven, generic
| year =
| designer = Larry Wall
| developer = Larry Wall
| latest_release_version = 5.22.1
| latest_release_date =
| latest_preview_version = 5.23.7
| latest_preview_date =
| turing-complete = Yes
| typing = Dynamic
| influenced_by = AWK, Smalltalk 80, Lisp, C, C++, sed, Unix shell, Pascal
| influenced = Chapel, Coffeescript, ECMAScript/JavaScript, Falcon, Julia, LPC, Perl 6, PHP, Python, Qore, Ruby, Windows PowerShell
| programming_language = C
| operating_system = Cross-platform
| license = GNU General Public License or Artistic License
| website =
| file_ext = .pl .pm .t .pod
| wikibooks = Perl Programming
}}
'Perl' is a family of high-level, general-purpose, interpreted, dynamic programming languages. The languages in this family include Perl 5 and Perl 6.
Though Perl is not officially an acronym, there are various backronyms in use, the most well-known being "Practical Extraction and Reporting Language". Perl was originally developed by Larry Wall in 1987 as a general-purpose Unix scripting language to make report processing easier. Since then, it has undergone many changes and revisions. Perl 6, which began as a redesign of Perl 5 in 2000, eventually evolved into a separate language. Both languages continue to be developed independently by different development teams and liberally borrow ideas from one another.
The Perl languages borrow features from other programming languages including C, shell script (sh), AWK, and sed. They provide powerful text processing facilities without the arbitrary data-length limits of many contemporary Unix commandline tools, facilitating easy manipulation of text files. Perl 5 gained widespread popularity in the late 1990s as a CGI scripting language, in part due to its unsurpassed regular expression and string parsing abilities.
In addition to CGI, Perl 5 is used for graphics programming, system administration, network programming, finance, bioinformatics, and other applications. It has been nicknamed "the Swiss Army chainsaw of scripting languages" because of its flexibility and power, and possibly also because of its "ugliness". In 1998, it was also referred to as the "duct tape that holds the Internet together", in reference to both its ubiquitous use as a glue language and its perceived inelegance.
我要删除:
{{Infobox programming language
| name = Perl
| logo =
| paradigm = multi-paradigm: functional, imperative, object-oriented (class-based), reflective, procedural, event-driven, generic
| year =
| designer = Larry Wall
| developer = Larry Wall
| latest_release_version = 5.22.1
| latest_release_date =
| latest_preview_version = 5.23.7
| latest_preview_date =
| turing-complete = Yes
| typing = Dynamic
| influenced_by = AWK, Smalltalk 80, Lisp, C, C++, sed, Unix shell, Pascal
| influenced = Chapel, Coffeescript, ECMAScript/JavaScript, Falcon, Julia, LPC, Perl 6, PHP, Python, Qore, Ruby, Windows PowerShell
| programming_language = C
| operating_system = Cross-platform
| license = GNU General Public License or Artistic License
| website =
| file_ext = .pl .pm .t .pod
| wikibooks = Perl Programming
}}
我只想显示每个搜索结果中的段落。
将 wiki 返回的字符串与正则表达式进行匹配,以用空字符串替换 {{}} 和紧随其后的空新行之间的第一部分:
$processed =~ s/\{\{.*\}\}\R\R//s;
如您所见,我使用了 /s
选项来匹配多行,但我没有使用 g
选项,因此我们只替换了第一个贪婪匹配。
我正在尝试从维基百科打印出信息,结果确实如此:http://i.imgur.com/1vMj3df.jpg
这是我正在使用的代码:
use strict;
use warnings;
use WWW::Wikipedia;
use HTML::Restrict;
my $wiki = WWW::Wikipedia->new(clean_html => 1);
my $hr = HTML::Restrict->new;
my $entry = $wiki->search('Perl');
my $processed = $hr->process($entry->text_basic);
print $processed;
1;
我希望删除上面的内容 "perl is a family" 并只打印出段落。
打印出来的例子:
{{Infobox programming language
| name = Perl
| logo =
| paradigm = multi-paradigm: functional, imperative, object-oriented (class-based), reflective, procedural, event-driven, generic
| year =
| designer = Larry Wall
| developer = Larry Wall
| latest_release_version = 5.22.1
| latest_release_date =
| latest_preview_version = 5.23.7
| latest_preview_date =
| turing-complete = Yes
| typing = Dynamic
| influenced_by = AWK, Smalltalk 80, Lisp, C, C++, sed, Unix shell, Pascal
| influenced = Chapel, Coffeescript, ECMAScript/JavaScript, Falcon, Julia, LPC, Perl 6, PHP, Python, Qore, Ruby, Windows PowerShell
| programming_language = C
| operating_system = Cross-platform
| license = GNU General Public License or Artistic License
| website =
| file_ext = .pl .pm .t .pod
| wikibooks = Perl Programming
}}
'Perl' is a family of high-level, general-purpose, interpreted, dynamic programming languages. The languages in this family include Perl 5 and Perl 6.
Though Perl is not officially an acronym, there are various backronyms in use, the most well-known being "Practical Extraction and Reporting Language". Perl was originally developed by Larry Wall in 1987 as a general-purpose Unix scripting language to make report processing easier. Since then, it has undergone many changes and revisions. Perl 6, which began as a redesign of Perl 5 in 2000, eventually evolved into a separate language. Both languages continue to be developed independently by different development teams and liberally borrow ideas from one another.
The Perl languages borrow features from other programming languages including C, shell script (sh), AWK, and sed. They provide powerful text processing facilities without the arbitrary data-length limits of many contemporary Unix commandline tools, facilitating easy manipulation of text files. Perl 5 gained widespread popularity in the late 1990s as a CGI scripting language, in part due to its unsurpassed regular expression and string parsing abilities.
In addition to CGI, Perl 5 is used for graphics programming, system administration, network programming, finance, bioinformatics, and other applications. It has been nicknamed "the Swiss Army chainsaw of scripting languages" because of its flexibility and power, and possibly also because of its "ugliness". In 1998, it was also referred to as the "duct tape that holds the Internet together", in reference to both its ubiquitous use as a glue language and its perceived inelegance.
我要删除:
{{Infobox programming language
| name = Perl
| logo =
| paradigm = multi-paradigm: functional, imperative, object-oriented (class-based), reflective, procedural, event-driven, generic
| year =
| designer = Larry Wall
| developer = Larry Wall
| latest_release_version = 5.22.1
| latest_release_date =
| latest_preview_version = 5.23.7
| latest_preview_date =
| turing-complete = Yes
| typing = Dynamic
| influenced_by = AWK, Smalltalk 80, Lisp, C, C++, sed, Unix shell, Pascal
| influenced = Chapel, Coffeescript, ECMAScript/JavaScript, Falcon, Julia, LPC, Perl 6, PHP, Python, Qore, Ruby, Windows PowerShell
| programming_language = C
| operating_system = Cross-platform
| license = GNU General Public License or Artistic License
| website =
| file_ext = .pl .pm .t .pod
| wikibooks = Perl Programming
}}
我只想显示每个搜索结果中的段落。
将 wiki 返回的字符串与正则表达式进行匹配,以用空字符串替换 {{}} 和紧随其后的空新行之间的第一部分:
$processed =~ s/\{\{.*\}\}\R\R//s;
如您所见,我使用了 /s
选项来匹配多行,但我没有使用 g
选项,因此我们只替换了第一个贪婪匹配。