



xml_set_element_handler

Set up start and end element handlers (PHP 4, PHP 5)
bool xml_set_element_handler ( resource parser, callback start_element_handler, callback end_element_handler )

Examples ( Source code ) » xml_set_element_handler

<?php
$file = "contact.xml";

// Called for every opening tag
function startElement($parser, $name, $attrs) {
    print "<B>$name =></B>  ";
}

// Called for every closing tag
function endElement($parser, $name) {
    print "\n";
}

// Called for the text between tags
function characterData($parser, $value) {
    print "$value<BR>";
}

$simpleparser = xml_parser_create();
xml_set_element_handler($simpleparser, "startElement", "endElement");
xml_set_character_data_handler($simpleparser, "characterData");

if (!($fp = fopen($file, "r"))) {
  die("could not open XML input");
}

while ($data = fread($fp, filesize($file))) {
  if (!xml_parse($simpleparser, $data, feof($fp))) {
     die(xml_error_string(xml_get_error_code($simpleparser)));
  }
}

xml_parser_free($simpleparser);
?>
<!--
<contact id="43956">
     <personal>
          <name>
               <first>J</first>
               <middle>J</middle>
               <last>J</last>
          </name>
          <title>Manager</title>
          <employer>National Company</employer>
          <dob>1951-02-02</dob>
     </personal>
</contact>

-->
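
A minimal sketch of the same idea that parses from a string instead of a file and records the handler calls in order. It assumes PHP 5.3 or later (for closures); the variable names and the shortened sample document are only for illustration. Note that element and attribute names arrive upper-cased, because XML_OPTION_CASE_FOLDING is on by default.

```php
<?php
// Shortened version of the contact document above (illustrative data).
$xml = '<contact id="43956"><personal><title>Manager</title></personal></contact>';

// Every handler call appends one entry here, so the call order is visible.
$events = array();

$parser = xml_parser_create();
xml_set_element_handler(
    $parser,
    // Start handler: receives the parser, the element name and its attributes.
    function ($parser, $name, $attrs) use (&$events) {
        $events[] = "start:$name";
        foreach ($attrs as $key => $value) {
            $events[] = "attr:$key=$value";
        }
    },
    // End handler: receives the parser and the element name.
    function ($parser, $name) use (&$events) {
        $events[] = "end:$name";
    }
);

// The third argument marks this as the final (and only) chunk of data.
if (!xml_parse($parser, $xml, true)) {
    die(xml_error_string(xml_get_error_code($parser)));
}

echo implode("\n", $events), "\n";
```

No character data handler is registered here, so the text "Manager" is simply skipped.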








Code Examples / Notes » xml_set_element_handler

11-oct-2001 12:09

You CAN use classes to parse XML. Just take a look at the following function:
xml_set_object


anonymous koward

This has been mentioned before, but I just spent several days trying to figure out what was going on. Folks, if your XML parser is completely in a class, look at the documentation for xml_set_object(). The documentation above does say you can use functions inside classes as callbacks for xml_set_element_handler, but it doesn't tell you that if your entire XML parser is inside a class, then this is the _WRONG_ way to do things.
You should instead call xml_set_object with your parser variable and $this, which will then fix strange errors that can otherwise crop up and stop you having to pass an array of ( $this, 'tagStart' ) to this function.
e.g.
<?php
class BadParser
{
function BadParser ()
{
$parser = xml_parser_create();
//This is the WRONG WAY to set the functions inside your class for parsing the XML.
xml_set_element_handler ( $parser, array ( $this, 'tagStart' ), array ( $this, 'tagEnd' ) );
xml_set_character_data_handler ( $parser, array ( $this, 'tagContent' ) );
xml_parse ( $parser, $this->XMLData );
}
function tagStart ( $parser, $tagName, $attributes = NULL )
{
$this->tag = $tagName;
}
function tagEnd ( $parser, $tagName )
{
$this->tag = NULL;
}
function tagContent ( $parser, $content )
{
//This WILL NOT work as you intended. $this->tag will do strange, mysterious things, but it won't be the tag name like you expected.
echo ( "{$this->tag}: $content" );
}
}
?>
Instead, change your constructor to initialize the XML parser like this:
<?php
class GoodParser
{
function GoodParser ()
{
$parser = xml_parser_create();
//This is the RIGHT WAY to set everything inside the object.
xml_set_object ( $parser, $this );
xml_set_element_handler ( $parser, 'tagStart', 'tagEnd' );
xml_set_character_data_handler ( $parser, 'tagContent' );
xml_parse ( $parser, $this->XMLData );
}
/* ... */
}
?>
I don't know if this problem exists in other versions of PHP; my version is 4.4.1. I hope this makes sense. If this note had been around earlier, it would have saved me a lot of headaches (maybe I'm not observant enough).
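
For comparison, here is a sketch of how the same parser might look on PHP 5 and later, where objects are passed by handle and the array( $this, 'method' ) form operates on the original object rather than a copy. The class name TagCollector and its properties are hypothetical; as far as I know, xml_set_object() is deprecated as of PHP 8.4, so on current versions the array form is the one to prefer.

```php
<?php
// Hypothetical class: collects "tag: content" strings while parsing.
class TagCollector
{
    public $tag = null;        // name of the element currently open
    public $pairs = array();   // collected "tag: content" strings

    public function parse($xml)
    {
        $parser = xml_parser_create();
        // Keep element names as written instead of upper-casing them.
        xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
        // On PHP 5+ these callbacks run against this very object.
        xml_set_element_handler($parser, array($this, 'tagStart'), array($this, 'tagEnd'));
        xml_set_character_data_handler($parser, array($this, 'tagContent'));
        // xml_parse() returns 1 on success and 0 on failure.
        return xml_parse($parser, $xml, true) === 1;
    }

    public function tagStart($parser, $tagName, $attributes)
    {
        $this->tag = $tagName;
    }

    public function tagEnd($parser, $tagName)
    {
        $this->tag = null;
    }

    public function tagContent($parser, $content)
    {
        // Ignore whitespace-only text and text outside any open leaf tag.
        if ($this->tag !== null && trim($content) !== '') {
            $this->pairs[] = "{$this->tag}: $content";
        }
    }
}

$collector = new TagCollector();
$ok = $collector->parse('<name><first>J</first><last>K</last></name>');
print_r($collector->pairs);
```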


darien

This documentation is somewhat awry. I know it's been said many times before, but it bears repeating...
If using PHP4, you may be required to use xml_set_object() instead of calling any of the xml_set_*_handler() functions with a two-item array. It will work fine on PHP5, but move the same code to PHP4 and it will create one copy of $this (even if you use &$this) for each handler you set!
<?php
// This code will fail mysteriously on PHP4.
$this->parser = xml_parser_create();
xml_set_element_handler(
    $this->parser,
    array(&$this,"start_tag"),
    array(&$this,"end_tag")
);
xml_set_character_data_handler(
    $this->parser,
    array(&$this,"tag_data")
);
?>
<?php
// This code will work on PHP4.
$this->parser = xml_parser_create();
xml_set_object($this->parser,&$this);
xml_set_element_handler(
    $this->parser,
    "start_tag",
    "end_tag"
);
xml_set_character_data_handler(
    $this->parser,
    "tag_data"
);
?>


turan dot yuksel

The method that 'ibjoel at hotmail dot com' has described requires libxml2 as the XML parser; it does not work with expat. For a brief explanation, see xml_get_current_byte_index.

rubentrancoso

My 25 cents. This example shows how to parse an XML file into an associative array tree.
<?php
$file = "flow/flow.xml";
$depth = 0;
$tree = array();
$tree['name'] = "root";
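// Start the stack with a reference to the root node
// (calling count() on an undefined $stack fails on PHP 8)
$stack = array(&$tree);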
function startElement($parser, $name, $attrs) {
  global $depth;
  global $stack;
  global $tree;
 
  $element = array();
  $element['name'] = $name;
  foreach ($attrs as $key => $value) {
      //echo $key."=".$value;
      $element[$key]=$value;
  }
  $last = &$stack[count($stack)-1];
  $last[count($last)-1] = &$element;
  $stack[count($stack)] = &$element;
  $depth++;
}
function endElement($parser, $name) {
  global $depth;
  global $stack;
  array_pop($stack);
  $depth--;
}
$xml_parser = xml_parser_create();
xml_set_element_handler($xml_parser, "startElement", "endElement");
if (!($fp = fopen($file, "r"))) {
  die("could not open XML input");
}
while ($data = fread($fp, 4096)) {
  if (!xml_parse($xml_parser, $data, feof($fp))) {
      die(sprintf("XML error: %s at line %d",
                  xml_error_string(xml_get_error_code($xml_parser)),
                  xml_get_current_line_number($xml_parser)));
  }
}
xml_parser_free($xml_parser);
$tree = $stack[0][0];
echo "<pre>";
print_r($tree);
echo "</pre>";
?>


tj

It seems that the tag handlers don't block on one another (the end handler is called whether or not the begin handler has finished). This can put you in a tight spot if you don't realize it while planning your app.

vladimir-leontiev

It seems that characterData() receives characters in chunks of 1024; therefore, if the string between your tags is longer than 1024 characters, characterData() will be called more than once for a single pair of tags. I don't know if this feature (bug?) is documented anywhere; I just wanted to warn everyone, because it tripped me up. I use PHP 4.3.10 on Linux.
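
One way to cope with this is to append each chunk to a buffer and only use the text once the end tag arrives. The following is a sketch under a few assumptions: PHP 5.3 or later (closures), illustrative variable names, and leaf elements only (the buffer is reset on every start tag, so text mixed with child elements would be lost).

```php
<?php
$texts  = array();  // complete text per element name
$buffer = '';       // text collected so far for the current element

$parser = xml_parser_create();
xml_set_element_handler(
    $parser,
    // Start tag: begin a fresh buffer.
    function ($parser, $name, $attrs) use (&$buffer) {
        $buffer = '';
    },
    // End tag: the buffer now holds the whole text, however many
    // times the character data handler was called.
    function ($parser, $name) use (&$buffer, &$texts) {
        if ($buffer !== '') {
            $texts[$name] = $buffer;
        }
        $buffer = '';
    }
);
xml_set_character_data_handler(
    $parser,
    // May be called several times for one element: always append.
    function ($parser, $data) use (&$buffer) {
        $buffer .= $data;
    }
);

// 5000 characters, long enough to be delivered in several chunks.
$long = str_repeat('x', 5000);
xml_parse($parser, "<doc><item>$long</item></doc>", true);

echo strlen($texts['ITEM']), "\n";
```

The result does not depend on the chunk size the underlying library happens to use, which is the point of buffering.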

aw

In response to landb at mail dot net...
As the notes mention, you can pass an array that contains the reference to an object and a method name when you need... so you can call methods in your own class as handlers like this:
xml_set_element_handler($parser, array($this,"_startElement"), array($this,"_endElement"));
Hope it helps...


13-mar-2005 09:34

In response to aw at avatartechnology dot com...
In response to landb at mail dot net...
When your functions are in an object:
Careful! Don't forget to add & (reference) to your parameters:
xml_set_element_handler($parser, array(&$this,"_startElement"), array(&$this,"_endElement"));
--> xmlparse will work on your object (good).
instead of:
xml_set_element_handler($parser, array($this,"_startElement"), array($this,"_endElement"));
---> xmlparse will work on a COPY of your object (often bad)
Vin-s
(sorry for my english)


jg

If you are using a class for XML parsing and want to check the return value of xml_set_element_handler in case it fails, you must do this outside of the class's constructor. Inside the constructor, PHP 4.0.5 will die.
Basically, put all your XML initialisation code in another function of the class, and keep it out of the constructor.
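
A sketch of that layout, written in PHP 5+ syntax rather than the PHP 4 style the note refers to. The class name FeedParser and its methods init(), parse(), open() and close() are hypothetical.

```php
<?php
// Hypothetical class: the constructor does no XML work at all.
class FeedParser
{
    private $parser = null;
    public $names = array();   // element names seen while parsing

    public function __construct()
    {
        // Intentionally empty: XML setup lives in init().
    }

    // All XML initialisation happens here, so the caller can check the result.
    public function init()
    {
        $this->parser = xml_parser_create();
        if (!xml_set_element_handler($this->parser, array($this, 'open'), array($this, 'close'))) {
            return false;
        }
        return true;
    }

    public function parse($xml)
    {
        // xml_parse() returns 1 on success and 0 on failure.
        return xml_parse($this->parser, $xml, true) === 1;
    }

    public function open($parser, $name, $attrs)
    {
        $this->names[] = $name;
    }

    public function close($parser, $name)
    {
        // Nothing to do for closing tags in this sketch.
    }
}

$feed = new FeedParser();
if (!$feed->init()) {
    die("could not set up the XML parser");
}
$parsed = $feed->parse('<a><b/></a>');
print_r($feed->names);
```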


hendra_g

I ran into the same problem as 'ibjoel at hotmail dot com' with regard to self-closing tags, and found that the script he/she wrote did not work as I expected.
I played around with some of PHP's functions and examples and put something together. It may not be the neatest solution, but it works for the data that 'ibjoel at hotmail dot com' provided.
The data needs to be read from a file, though, so that the file pointer can be used. It still uses the xml_get_current_byte_index(resource parser) trick, but this time I read the last 2 characters before the index and test whether they are "/>".
<?php
/* myxmltest.xml:
<normal_tag>
 <self_close_tag />
    data
 <normal_tag>data
    <self_close_tag attr="value" />
 </normal_tag>
    data
 <normal_tag></normal_tag>
</normal_tag>
*/
//## Global Variables ##//
$file = "myxmltest.xml";
$character_data_on = false;
$tag_complete = true;
function startElement($parser, $name, $attrs)
{
   global $character_data_on;
   global $tag_complete;
   
   echo "&lt;<font color=\"#0000cc\">$name</font>";
   //## Print the attributes ##//
   if (sizeof($attrs)) {
       // foreach replaces the original each() loop; each() was removed in PHP 8
       foreach ($attrs as $k => $v) {
           echo " <font color=\"#009900\">$k</font>=\"<font
                  color=\"#990000\">$v</font>\"";
       }
   }
   //## Tag is still incomplete,
   //## will be completed at either endElement or characterData ##//
   $tag_complete = false;
   $character_data_on = false;
}
function endElement($parser, $name)
{
   global $fp;
   global $character_data_on;
   global $tag_complete;
   
   //#### Test for self-closing tag ####//
   //## xml_get_current_byte_index(resource parser) when run in this
   //## function, gives the index at (indicated by *):
   //##   for self closing tag: <br />*
   //##   for individual closing tag: <div>character data*</div>
   //## So to test for self-closing tag, we can just test for the last 2
   //## characters from the index
   //###################################//
   
   if (!$character_data_on) {
       //## Record current fp position ##//
       $temp_fp = ftell($fp);
       
       //## Point fp to 2 bytes before the end element byte index ##//
       $end_element_byte_index = xml_get_current_byte_index($parser);
       fseek($fp,$end_element_byte_index-2);
       
       //## Gets the last 2 characters before the end element byte index ##//
       $validator = fgets($fp, 3);
       
       //## Restore fp position ##//
       fseek($fp,$temp_fp);
       
       //## If the last 2 characters are "/>" ##//
       if ($validator=="/>") {
           //// Complete the self-closing tag ////
           echo " /&gt;";
           //// Otherwise it is an individual closing tag ////
       } else echo "&gt;&lt;/<font color=\"#0000cc\">$name</font>&gt;";
       $tag_complete = true;
   } else echo "&lt;/<font color=\"#0000cc\">$name</font>&gt;";
   
   $character_data_on = false;
}
function characterData($parser, $data)
{
   global $character_data_on;
   global $tag_complete;
   
   if ((!$character_data_on)&&(!$tag_complete)) {
       echo "&gt;";
       $tag_complete = true;
   }
   echo "<b>$data</b>";
   $character_data_on = true;
}
$xml_parser = xml_parser_create();
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");
if (!($fp = fopen($file, "r"))) {
   die("could not open XML input");
}
echo "<pre>";
while ($file_content = fread($fp, 4096)) {
   if (!xml_parse($xml_parser, $file_content, feof($fp))) {
       die(sprintf("XML error: %s at line %d",
                   xml_error_string(xml_get_error_code($xml_parser)),
                   xml_get_current_line_number($xml_parser)));
   }
}
echo "</pre>";
xml_parser_free($xml_parser);
?>


ibjoel

I noticed that in the example below, and in all the examples I've seen on this site for viewing XML in HTML, the look of self-closing tags such as <br /> is not preserved. The parser cannot distinguish between <tag /> and <tag></tag>, and if your start and end element functions are like these examples, both instances will be output with both an individual start and end tag. I needed to preserve self-closing tags and it took me a while to figure out this workaround. Hope this helps someone...
 
The start tag is left open, and then completed by its first child, the next start tag or its end tag. The end tag will complete with " />" or </tag>, depending on the number of bytes between the start and end tags in the parsed data.
<?php
//$data=filepath or string
$data=<<<DATA
<normal_tag>
 <self_close_tag />
     data
 <normal_tag>data
    <self_close_tag attr="value" />
 </normal_tag>
     data
 <normal_tag></normal_tag>
</normal_tag>
DATA;
function startElement($parser, $name, $attrs)
{
       xml_set_character_data_handler($parser, "characterData");
       global $first_child, $start_byte;
       if($first_child)          //close start tag if necessary
               echo "><br />";
       $first_child=true;
       $start_byte=xml_get_current_byte_index ($parser);
       $attr_string='';          //initialise so .= below never hits an undefined variable
       if(count($attrs)>=1){
               foreach($attrs as $x=>$y){
                       $attr_string .= " $x=\"$y\"";
               }
       }
       echo htmlentities("<{$name}{$attr_string}"); //unclosed starttag
}
function endElement($parser, $name)
{
       global $first_child, $start_byte;
       $byte=xml_get_current_byte_index ($parser);
       if($byte-$start_byte>2){           //if end tag is more than 2 bytes from start tag
               if($first_child)          //close start tag if necessary
                       echo "><br />";
               echo htmlentities("</{$name}>")."<br />";  //individual end tag
       }else
               echo " /><br />";  // self closing tag
       $first_child=false;
}
function characterData($parser, $data)
{
       global $first_child;
       if($first_child)  //if $data is first child, close start tag
               echo "><br />";
       if($data=trim($data))
               echo "<font color='blue'>$data</font><br />";
       $first_child=false;
}
function ParseData($data)
{
       $xml_parser = xml_parser_create();
       xml_set_element_handler($xml_parser, "startElement", "endElement");
       xml_parser_set_option($xml_parser,XML_OPTION_CASE_FOLDING,0);
       if(is_file($data))
       {
               if (!($fp = fopen($data, "r"))) {   // $data holds the file path here
                       die("could not open XML input");
               }
               while ($data = fread($fp, 4096)) {
                       if (!xml_parse($xml_parser, $data, feof($fp))) {
                               $error=xml_error_string(xml_get_error_code($xml_parser));
                              $line=xml_get_current_line_number($xml_parser);
                               die(sprintf("XML error: %s at line %d",$error,$line));
                       }
               }
       }else{
               if (!xml_parse($xml_parser, $data, 1)) {
                               $error=xml_error_string(xml_get_error_code($xml_parser));
                               $line=xml_get_current_line_number($xml_parser);
                               die(sprintf("XML error: %s at line %d",$error,$line));
               }
       }
       
       xml_parser_free($xml_parser);
}
ParseData($data);
?>


lloeki

I modified the previous script, so that it is associative. I find it more useful that way. BTW I prefer strtolower() things, but that's not mandatory at all.
<?php
$file = "data.xml";
$depth = 0;
$tree = array();
$tree['name'] = "root";
$stack[] = &$tree;
function startElement($parser, $name, $attrs) {
  global $depth;
  global $stack;
  global $tree;
 
  $element = array();
  foreach ($attrs as $key => $value) {
      $element[strtolower($key)]=$value;
  }
  end($stack);
  $stack[key($stack)][strtolower($name)] = &$element;
  $stack[strtolower($name)] = &$element;
 
  $depth++;
}
function endElement($parser, $name) {
  global $depth;
  global $stack;
  array_pop($stack);
  $depth--;
}
$xml_parser = xml_parser_create();
xml_set_element_handler($xml_parser, "startElement", "endElement");
if (!($fp = fopen($file, "r"))) {
  die("could not open XML input");
}
while ($data = fread($fp, 4096)) {
  if (!xml_parse($xml_parser, $data, feof($fp))) {
      die(sprintf("XML error: %s at line %d",
                  xml_error_string(xml_get_error_code($xml_parser)),
                  xml_get_current_line_number($xml_parser)));
  }
}
xml_parser_free($xml_parser);
// end() takes its argument by reference, so use a temporary variable
$root = end($stack);
$tree = end($root);
echo "<pre>";
print_r($tree);
echo "</pre>";
?>
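
The two tree-building scripts above rely on global variables and references. As an alternative, here is a sketch that keeps everything inside one function and builds each node by value, attaching it to its parent when the end tag arrives. It assumes PHP 5.3 or later; the function name xmlToTree and the node layout (name, attrs, children) are my own choices, not part of the XML extension.

```php
<?php
// Hypothetical helper: returns a nested array, or null on a parse error.
function xmlToTree($xml)
{
    // The stack starts with an artificial root node.
    $stack = array(
        array('name' => 'root', 'attrs' => array(), 'children' => array()),
    );

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler(
        $parser,
        // Start tag: push a new, still childless node.
        function ($parser, $name, $attrs) use (&$stack) {
            $stack[] = array('name' => $name, 'attrs' => $attrs, 'children' => array());
        },
        // End tag: the node is complete, so pop it and attach it to its parent.
        function ($parser, $name) use (&$stack) {
            $node = array_pop($stack);
            $stack[count($stack) - 1]['children'][] = $node;
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        return null;
    }
    // Only the artificial root is left on the stack.
    return $stack[0];
}

$tree = xmlToTree('<a id="1"><b/><c x="y"/></a>');
print_r($tree);
```

Because every node is finished before it is copied into its parent, no references are needed, and repeated element names do not overwrite each other.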


redb

Example below (BadParser) works fine with some changes.
xml_set_element_handler ( $parser, array ( &$this, 'tagStart' ), array ( &$this, 'tagEnd' ) );
xml_set_character_data_handler ( $parser, array ( &$this, 'tagContent' ) );


youniforever

<html>
 <head>
   <title>SAX Demonstration</title>
  <META HTTP-EQUIV='Content-type' CONTENT='text/html; charset=euc-kr'>
 </head>
 <body>
   <h1>RSS Reader</h1>
   
     <?php
    $file = "data.xml";
     
     $currentTag = "";
     $currentAttribs = "";
     function startElement($parser, $name, $attribs)
     {
         global $currentTag, $currentAttribs;
         $currentTag = $name;
   
         $currentAttribs = $attribs;
         switch ($name) {
         
         default:
             echo("<b>&lt;$name&gt;</b>\n");
             break;
         }
     }
     function endElement($parser, $name)
     {
         global $currentTag, $currentAttribs;
         switch ($name) {
         default:
             echo("\n<b>&lt;/$name&gt;</b>\n");
             break;
         }
         $currentTag = "";
         $currentAttribs = "";
     }
     function characterData($parser, $data)
     {
         global $currentTag;
         switch ($currentTag) {
         case "link":
             echo("<a href=\"$data\">$data</a>\n");
             break;
         case "title":
             echo("title : $data");
             break;
         default:
             echo($data);
             break;
         }
     }
    $xmlParser = xml_parser_create();
   
     $caseFold = xml_parser_get_option($xmlParser,
                                       XML_OPTION_CASE_FOLDING);
   
     $targetEncoding = xml_parser_get_option($xmlParser,
                                             XML_OPTION_TARGET_ENCODING);
     if ($caseFold == 1) {
         xml_parser_set_option($xmlParser, XML_OPTION_CASE_FOLDING, false);
     }
     xml_set_element_handler($xmlParser, "startElement", "endElement");
     xml_set_character_data_handler($xmlParser, "characterData");
     if (!($fp = fopen($file, "r"))) {
         die("Cannot open XML data file: $file");
     }
    while ($data = fread($fp, 4096)) {
         if (!xml_parse($xmlParser, $data, feof($fp))) {
             die(sprintf("XML error: %s at line %d",
                         xml_error_string(xml_get_error_code($xmlParser)),
                         xml_get_current_line_number($xmlParser)));
             xml_parser_free($xmlParser);
         }
     }
     xml_parser_free($xmlParser);
     ?>
 </body>
</html>


utf8_decode
utf8_encode
xml_error_string
xml_get_current_byte_index
xml_get_current_column_number
xml_get_current_line_number
xml_get_error_code
xml_parse_into_struct
xml_parse
xml_parser_create_ns
xml_parser_create
xml_parser_free
xml_parser_get_option
xml_parser_set_option
xml_set_character_data_handler
xml_set_default_handler
xml_set_element_handler
xml_set_end_namespace_decl_handler
xml_set_external_entity_ref_handler
xml_set_notation_decl_handler
xml_set_object
xml_set_processing_instruction_handler
xml_set_start_namespace_decl_handler
xml_set_unparsed_entity_decl_handler