Issues when adding XML content with UTF-8 characters to an eXist-db collection using Perl
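I am trying to add dynamically generated XML content to an eXist-db collection using Perl (see the code of addFile.pl below). The problem is that whenever the content contains UTF-8 characters, I receive this error:
Failed to parse XML-RPC request: Byte "195" is not a member of the (7-bit) ASCII character set.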
#!/usr/bin/perl
use RPC::XML;
use RPC::XML::Client;
my ($sec, $min, $hour, $mday, $mon, $year) = localtime();
my $timestamp = sprintf("%04d%02d%02d%02d%02d%02d",$year+1900,$mon+1,$mday,$hour,$min,$sec);
print("Timestamp: $timestamp\n");
my $FILENAME = "$timestamp.xml";
my $COLLECTION = 'output';
my $record = <<END;
<document id="doc_20150419014112">
<text>ñáéíóú</text>
</document>
END
# XQuery variables are escaped (\$name) so that Perl does not interpolate them;
# $FILENAME, $COLLECTION and the unescaped $record are Perl variables.
$query = <<END;
xquery version "3.0";
import module namespace xmldb="http://exist-db.org/xquery/xmldb";
declare variable \$filename := '$FILENAME';
declare variable \$record := '';
let \$log-in := xmldb:login("/db", "admin", "admin")
(: let \$create-collection := xmldb:create-collection("/db", "$COLLECTION") :)
let \$record :=
$record
for \$target in ('/db/$COLLECTION')
return xmldb:store(\$target, \$filename, \$record)
END
print $query;
$URL = "http://admin:admin\@localhost:8080/exist/xmlrpc";
# connecting to $URL...
$client = new RPC::XML::Client $URL;
# Output options
$options = RPC::XML::struct->new(
'indent' => 'yes',
'encoding' => 'UTF-8',
'highlight-matches' => 'none');
$req = RPC::XML::request->new("query", $query, 20, 1, $options);
$response = $client->send_request($req);
if($response->is_fault) {
die "An error occurred: " . $response->string . "\n";
}
my $result = $response->value;
print $result;
When I run the XQuery script (see below) directly in eXide, it runs normally, but when I run it through the Perl script, I get the following:
$ perl addFile.pl
Timestamp: 20150428162016
xquery version "3.0";
import module namespace xmldb="http://exist-db.org/xquery/xmldb";
declare variable $filename := '20150428162016.xml';
declare variable $record := '';
let $log-in := xmldb:login("/db", "admin", "admin")
(: let $create-collection := xmldb:create-collection("/db", "output") :)
let $record :=
<document id="doc_20150419014112">
<text>ñáéíóú</text>
</document>
for $target in ('/db/output')
return xmldb:store($target, $filename, $record)
An error occurred: Failed to parse XML-RPC request: Byte "195" is not a member of the (7-bit) ASCII character set.
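I found the solution here; I am quoting the answer just in case:
The RPC::XML Perl module uses us-ascii as its XML encoding by default. If you deliver UTF-8 content from a database or another source, RPC::XML generates invalid XML with the default settings.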
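The number in the error message can be traced back to the sample record. The following is a minimal sketch using only the core Encode module; it is an illustration and not part of the quoted answer, and the variable names are made up. It encodes the "ñ" from the record as UTF-8 and lists the resulting byte values, the first of which is the 195 that the parser rejects as non-ASCII.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# U+00F1 is the "ñ" from the sample record, written as an escape so that
# this script itself stays pure ASCII.
my $char  = "\x{F1}";

# Encode the character as UTF-8 and unpack it into unsigned byte values.
my @bytes = unpack 'C*', encode('UTF-8', $char);

# Any byte above 127 falls outside 7-bit US-ASCII.
my @non_ascii = grep { $_ > 127 } @bytes;

printf "UTF-8 bytes: %s\n", join ' ', @bytes;    # prints "UTF-8 bytes: 195 177"
printf "Bytes outside US-ASCII: %d\n", scalar @non_ascii;
```

Every accented character in the record produces bytes like these, so a request that declares itself as us-ascii cannot legally carry them.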
The XML encoding used by RPC::XML can only be changed globally:
#!/usr/bin/perl
use RPC::XML;
use RPC::XML::Client;
$RPC::XML::ENCODING = 'utf-8';
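One way to check the effect of this setting without contacting eXist-db is to serialize a request locally and look at its XML declaration. This is only a sketch: xml_declaration is a hypothetical helper, the 'placeholder' argument is a dummy value, and it assumes that as_string on an RPC::XML::request begins with the XML declaration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use RPC::XML;

# Hypothetical helper: build a throwaway request, serialize it, and
# return only the leading XML declaration (assumed to come first).
sub xml_declaration {
    my $req = RPC::XML::request->new('query', 'placeholder');
    my ($decl) = $req->as_string =~ /\A(<\?xml[^>]*\?>)/;
    return $decl;
}

# Declaration produced with the module's default encoding.
my $before = xml_declaration();

# The global setting from the quoted answer.
$RPC::XML::ENCODING = 'utf-8';

# Declaration produced after the change.
my $after = xml_declaration();

print "before: $before\n";
print "after:  $after\n";
```

Because the variable is a package global, it affects every request serialized afterwards in the same process, so it should be set once, before the first request is built.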