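Read attributes of first and children elements with SAX XML Parser in PHP
After some research online, I realized that a SAX XML parser is the best option for me, since I'm looking for the fastest way to parse large (very large) XML files.
So I'm working with this code, which I found in a tutorial. It works well; I just can't figure out how to read the attributes of the first-level and second-level elements, along with the content of each element.
Here is the code: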
XML
<?xml version="1.0" encoding="iso-8859-1"?>
<items>
<item id="100" name="First Element 1" />
<item id="101" name="First Element 2" />
<item id="102" name="First Element 3" />
<item id="103" name="First Element 4">
<attribute name="Second Element 4" value="508" />
</item>
<item id="104" name="First Element 5" />
<item id="105" name="First Element 6">
<attribute name="Second Element 6" value="215" />
</item>
</items>
PHP
$items = array();
$elements = null;
$item_attributes = null; //I added that myself, not sure if it's correct
// Called to this function when tags are opened
function startElements($parser, $name, $attrs) {
global $items, $elements, $item_attributes; // <-- added it here aswell
if(!empty($name)) {
if ($name == 'ITEM') {
if (!empty($attrs['ID'])) {
$item_attributes []= array(); // <-- here aswell
}
// creating an array to store information
$items []= array();
}
$elements = $name;
}
}
// Called to this function when tags are closed
function endElements($parser, $name) {
global $elements;
if(!empty($name)) {
$elements = null;
}
}
// Called on the text between the start and end of the tags
function characterData($parser, $data) {
global $items, $elements;
if(!empty($data)) {
if ($elements == 'ATTRIBUTE') {
$items[count($items)-1][$elements] = trim($data);
}
}
}
// Creates a new XML parser and returns a resource handle referencing it to be used by the other XML functions.
$parser = xml_parser_create();
xml_set_element_handler($parser, "startElements", "endElements");
xml_set_character_data_handler($parser, "characterData");
// open xml file
if (!($handle = fopen('./pages/scripts/sax.xml', "r"))) {
die("could not open XML input");
}
while($data = fread($handle, 4096)) {
xml_parse($parser, $data); // start parsing an xml document
}
xml_parser_free($parser); // deletes the parser
$i = 1;
foreach($items as $course) {
echo $i.' -';
echo ' ITEM ID: '.$course['ID'].'(?),';
echo ' NAME: '.$course['NAME'].'(?)<br/>';
echo 'ATTRIBUTE NAME: ???,';
echo ' ATTRIBUTE VALUE: ???<hr/>'; // not sure how to pull those results
$i++;
}
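So I'm trying to get id and name from the item tag, and name and value from the attribute tag nested inside the first-level item element.
Any ideas?
UPDATE: Note that $course['ID'] and $course['NAME'] don't echo anything, but when I use $course['ITEM'] or $course['ATTRIBUTE'] it echoes whatever text is inside the item or attribute tags, for example: <item> this </item>, whereas what I want to get is: <item THIS="this" />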
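I would do it a little differently from your way, but it serves the same purpose.
I know this isn't a very practical approach, but I think it will suit you. If an item contains the attribute element more than once, you can also get each occurrence.
Note that in the start_element function you can edit which attributes of the item and attribute elements are collected, in the arrays assigned to the variables $item_attr and $field_attr.
I also suggest running the code below in full so you can see what it prints, and then editing it as you like.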
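Before the full listing, a minimal sketch of the core idea may help: the attributes of each tag are passed to the start-element handler as its third argument, so nothing needs to be read in the character-data handler. The handler names on_start and on_end, the $seen array and the inline sample XML are made up for this illustration and are not part of the original code.

```php
<?php
// Hypothetical sample data, modelled on the XML in the question.
$xml = '<items><item id="103" name="First Element 4">'
     . '<attribute name="Second Element 4" value="508" /></item></items>';

$seen = array(); // filled by the start-element handler: one [name, attributes] pair per tag

function on_start($parser, $name, $attrs) {
    global $seen;
    // With case folding on (the default), element names and attribute keys arrive upper-cased.
    $seen[] = array($name, $attrs);
}

function on_end($parser, $name) {
    // Nothing to do on closing tags in this sketch.
}

$parser = xml_parser_create();
xml_set_element_handler($parser, 'on_start', 'on_end');
xml_parse($parser, $xml, true);

print_r($seen);
```

This is why $course['ID'] stays empty in the question's code: the handler receives $attrs['ID'] and $attrs['NAME'], but never copies them into $items.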
XML:
<?xml version="1.0" encoding="iso-8859-1"?>
<items>
<item id="100" name="First Element 1" />
<item id="101" name="First Element 2" />
<item id="102" name="First Element 3" />
<item id="103" name="First Element 4">
<attribute name="Second Element 4" value="508" />
<attribute name="Third Element 4" value="509" />
</item>
<item id="104" name="First Element 5" />
<item id="105" name="First Element 6">
<attribute name="Second Element 6" value="215" />
</item>
</items>
PHP
<?php
$GLOBALS['currentIndex'] = 0; // index of the item element currently being parsed
$GLOBALS['currentAttrIndex'] = 0; // the same, but for the attribute elements
$GLOBALS['currentField'] = ''; // name of the element currently open inside an item
$GLOBALS['items'] = array(); // one entry per item element
$GLOBALS['attrs'] = array(); // item index => comma-separated indexes of its attribute elements
$GLOBALS['items_attr'] = array(); // id and name of each item element (must be an array, not a string)
$GLOBALS['fields_attr'] = array(); // name and value of each attribute element

$parser = xml_parser_create();
xml_set_element_handler($parser, 'start_element', 'end_element');
xml_set_character_data_handler($parser, 'cdata');
xml_parse($parser, file_get_contents('./pages/test/sax.xml'), true);
xml_parser_free($parser);

// display the results the way a debugger would
$items = $GLOBALS['items'];
$attrs = $GLOBALS['attrs'];
$items_attr = $GLOBALS['items_attr'];
$fields_attr = $GLOBALS['fields_attr'];
$i = 1;
if (count($items) > 0) {
    foreach ($items as $item) {
        echo 'START ITEM<br/>';
        echo ($items_attr[$i-1]['id'] ? 'ID: '.$items_attr[$i-1]['id'].'<br/>' : '');
        echo ($items_attr[$i-1]['name'] ? 'NAME: '.$items_attr[$i-1]['name'].'<br/>' : '');
        $a = 0;
        foreach ($attrs as $attr_id => $attr_name) {
            if ($attr_id == $i-1) {
                // drop the trailing comma, then split into attribute indexes
                $g_i_attr_bits = explode(",", substr($attr_name, 0, -1));
                foreach ($g_i_attr_bits as $g_i_at_b) {
                    $a++;
                    echo '&nbsp;&nbsp;START ATTRIBUTE<br/>';
                    echo ($fields_attr[$g_i_at_b]['name'] ? '&nbsp;&nbsp;NAME: '.$fields_attr[$g_i_at_b]['name'].'<br/>' : '');
                    echo ($fields_attr[$g_i_at_b]['value'] ? '&nbsp;&nbsp;VALUE: '.$fields_attr[$g_i_at_b]['value'].'<br/>' : '');
                    echo '&nbsp;&nbsp;END ATTRIBUTE<br/>';
                }
            }
        }
        if ($a > 0) {
            echo 'END ITEM<br/><br/>';
        }
        $i++;
    }
}

// Called when a tag is opened; $attributes holds the tag's attributes (keys are upper-cased)
function start_element($parser, $name, $attributes){
    switch ($name) {
        case 'ITEM':
            $GLOBALS['items'][] = array('attribute' => '');
            $GLOBALS['items_attr'][] = array(
                'id' => isset($attributes['ID']) ? $attributes['ID'] : '',
                'name' => isset($attributes['NAME']) ? $attributes['NAME'] : '',
            );
            break;
        case 'ATTRIBUTE':
            $GLOBALS['fields_attr'][] = array(
                'name' => isset($attributes['NAME']) ? $attributes['NAME'] : '',
                'value' => isset($attributes['VALUE']) ? $attributes['VALUE'] : '',
            );
            $GLOBALS['currentField'] = 'attribute';
            // record this attribute's index in the comma-separated list of the current item
            if (!isset($GLOBALS['attrs'][$GLOBALS['currentIndex']])) {
                $GLOBALS['attrs'][$GLOBALS['currentIndex']] = '';
            }
            $GLOBALS['attrs'][$GLOBALS['currentIndex']] .= $GLOBALS['currentAttrIndex'].',';
            break;
    }
}

// Called when a tag is closed
function end_element($parser, $name){
    switch ($name) {
        case 'ITEM':
            $GLOBALS['currentIndex']++;
            break;
        case 'ATTRIBUTE':
            $GLOBALS['currentAttrIndex']++;
            $GLOBALS['currentField'] = ''; // no element is open inside the item any more
            break;
    }
}

// Called on the text between the start and end of a tag
function cdata($parser, $data){
    $currentIndex = $GLOBALS['currentIndex'];
    $currentField = $GLOBALS['currentField'];
    // ignore whitespace between tags and any text outside a tracked element
    if (trim($data) != '' && $currentField != '' && isset($GLOBALS['items'][$currentIndex])) {
        $GLOBALS['items'][$currentIndex][$currentField] = $data;
    }
}
?>
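Since the question is about very large files, here is a rough sketch of an alternative, not the code above: it feeds the parser in 4096-byte chunks (as the question's fread loop does) instead of loading the whole file with file_get_contents, and it stores each item together with its nested attribute elements in one array. The names start_tag, end_tag and the php://memory stream holding sample XML are assumptions made so the example is self-contained; with a real file, $handle would come from fopen() on that file. It also assumes attribute elements only ever appear inside an item.

```php
<?php
// Hypothetical sample standing in for the real file on disk.
$xml = <<<XML
<?xml version="1.0" encoding="iso-8859-1"?>
<items>
<item id="100" name="First Element 1" />
<item id="103" name="First Element 4">
<attribute name="Second Element 4" value="508" />
<attribute name="Third Element 4" value="509" />
</item>
</items>
XML;
$handle = fopen('php://memory', 'w+');
fwrite($handle, $xml);
rewind($handle);

$items = array(); // each entry: id, name, and a list of nested attribute elements

function start_tag($parser, $name, $attrs) {
    global $items;
    if ($name === 'item') {
        $items[] = array(
            'id' => isset($attrs['id']) ? $attrs['id'] : '',
            'name' => isset($attrs['name']) ? $attrs['name'] : '',
            'attributes' => array(),
        );
    } elseif ($name === 'attribute' && count($items) > 0) {
        // Attach to the most recently opened item (assumes no attribute outside an item).
        $items[count($items) - 1]['attributes'][] = array(
            'name' => isset($attrs['name']) ? $attrs['name'] : '',
            'value' => isset($attrs['value']) ? $attrs['value'] : '',
        );
    }
}

function end_tag($parser, $name) {
    // Nothing to do on closing tags in this sketch.
}

$parser = xml_parser_create();
// Turn case folding off so names keep the case used in the XML.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler($parser, 'start_tag', 'end_tag');

// Feed the parser chunk by chunk; the third argument marks the final chunk.
while (!feof($handle)) {
    $chunk = fread($handle, 4096);
    if (!xml_parse($parser, $chunk, feof($handle))) {
        throw new RuntimeException(sprintf(
            'XML error: %s at line %d',
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)
        ));
    }
}
fclose($handle);

foreach ($items as $item) {
    echo 'ITEM ID: '.$item['id'].', NAME: '.$item['name'].'<br/>';
    foreach ($item['attributes'] as $attribute) {
        echo 'ATTRIBUTE NAME: '.$attribute['name'].', VALUE: '.$attribute['value'].'<br/>';
    }
}
```

Keeping the nested elements inside their parent entry avoids the separate index bookkeeping, though for a truly huge file you would probably process each item as it closes rather than collect everything in $items.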