
DOM Functions

Introduction

The DOM extension allows you to operate on XML documents through the DOM API with PHP 5.

For PHP 4, use DOM XML.

Note:

The DOM extension uses UTF-8 encoding. Use utf8_encode() and utf8_decode() to work with text in ISO-8859-1 encoding, or iconv for other encodings.
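
As a minimal sketch of the conversion described in this note, the following assumes the input is ISO-8859-1 and that the iconv extension is available. The sample string and the element name `drink` are made up for illustration.

```php
<?php
// Assumption: $latin1 holds ISO-8859-1 bytes ("cafe" with an e-acute stored as 0xE9).
$latin1 = "caf\xE9";

// Convert to UTF-8 before handing the text to DOM.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);

$doc  = new DOMDocument('1.0', 'UTF-8');
$root = $doc->createElement('drink'); // hypothetical element name
$root->appendChild($doc->createTextNode($utf8));
$doc->appendChild($root);

// Values read back from DOM are UTF-8 again; convert when the caller needs ISO-8859-1.
$roundTrip = iconv('UTF-8', 'ISO-8859-1', $root->textContent);
```

The same pattern applies to attribute values and any other string passed into or read from the DOM classes.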

Installation

There is no installation needed to use these functions; they are part of the PHP core.

Predefined Classes

The API of the module follows the DOM Level 3 standard as closely as possible. Consequently, the API is fully object-oriented. It is a good idea to have the DOM standard available when using this module.

This module defines a number of classes, which are explained in the following tables. Classes with an equivalent in the DOM standard are named DOMxxx.

DOMAttr

Extends DOMNode. The DOMAttr interface represents an attribute in a DOMElement object.

Constructor

Methods

Properties

Table 52. 

Name Type Read-only Description
name string yes The name of the attribute
ownerElement DOMElement yes The element which contains the attribute
schemaTypeInfo bool yes Not implemented yet, always returns NULL
specified bool yes Not implemented yet, always returns NULL
value string no The value of the attribute
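
The properties in this table can be read as in the short sketch below. The one-element document is an assumption chosen to keep the example self-contained.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<book id="listing"/>');

// Fetch the attribute as a DOMAttr node rather than as a plain string.
$attr = $doc->documentElement->getAttributeNode('id');

$name  = $attr->name;                  // attribute name
$value = $attr->value;                 // attribute value
$owner = $attr->ownerElement->tagName; // element that carries the attribute

// value is writable; the change is visible through the owning element.
$attr->value = 'catalog';
$updated = $doc->documentElement->getAttribute('id');
```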


DOMCharacterData

Extends DOMNode.

Methods

Properties

Table 53. 

Name Type Read-only Description
data string no The contents of the node
length int yes The length of the contents


DOMComment

Extends DOMCharacterData.

Constructor

DOMDocument

Extends DOMNode.

Constructor

Methods

Properties

Table 54. 

Name Type Read-only Description
actualEncoding string yes  
config DOMConfiguration yes  
doctype DOMDocumentType yes The Document Type Declaration associated with this document.
documentElement DOMElement yes This is a convenience attribute that allows direct access to the child node that is the document element of the document.
documentURI string no The location of the document or NULL if undefined.
encoding string no  
formatOutput bool no  
implementation DOMImplementation yes The DOMImplementation object that handles this document.
preserveWhiteSpace bool no Do not remove redundant white space. Defaults to TRUE.
recover bool no  
resolveExternals bool no Set it to TRUE to load external entities from a doctype declaration. This is useful for including character entities in your XML document.
standalone bool no  
strictErrorChecking bool no Throws DOMException on errors. Defaults to TRUE.
substituteEntities bool no  
validateOnParse bool no Loads and validates against the DTD. Defaults to FALSE.
version string no  
xmlEncoding string yes An attribute specifying, as part of the XML declaration, the encoding of this document. This is NULL when unspecified or when it is not known, such as when the Document was created in memory.
xmlStandalone bool no An attribute specifying, as part of the XML declaration, whether this document is standalone. This is FALSE when unspecified.
xmlVersion string no An attribute specifying, as part of the XML declaration, the version number of this document. If there is no declaration and if this document supports the "XML" feature, the value is "1.0".
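
Several of these properties only take effect when set before loading or saving. The sketch below, using a made-up two-element document, shows the commonly used combination of preserveWhiteSpace and formatOutput to get indented output.

```php
<?php
$doc = new DOMDocument();

// Set before loading: drop whitespace-only text nodes while parsing.
$doc->preserveWhiteSpace = false;
// Set before saving: ask the serializer to indent nested elements.
$doc->formatOutput = true;

$doc->loadXML('<book><title>My lists</title></book>');

$rootName = $doc->documentElement->tagName; // documentElement gives direct access to the root
$xml      = $doc->saveXML();                // serialized with line breaks and indentation
```

Without formatOutput the same document would be serialized with the root element and its children on a single line.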


DOMDocumentFragment

Extends DOMNode.

Methods

DOMDocumentType

Extends DOMNode.

Each DOMDocument has a doctype attribute whose value is either NULL or a DOMDocumentType object.

Properties

Table 55. 

Name Type Read-only Description
publicId string yes The public identifier of the external subset.
systemId string yes The system identifier of the external subset. This may be an absolute URI or not.
name string yes The name of DTD; i.e., the name immediately following the DOCTYPE keyword.
entities DOMNamedNodeMap yes A DOMNamedNodeMap containing the general entities, both external and internal, declared in the DTD.
notations DOMNamedNodeMap yes A DOMNamedNodeMap containing the notations declared in the DTD.
internalSubset string yes The internal subset as a string, or NULL if there is none. This does not contain the delimiting square brackets.
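
A minimal sketch of reading these properties, assuming a document created in memory with DOMImplementation so that no DTD is fetched. The identifiers are the DocBook ones used in book.xml.

```php
<?php
$impl = new DOMImplementation();

// Build a doctype node; nothing is downloaded at this point.
$doctype = $impl->createDocumentType(
    'book',
    '-//OASIS//DTD DocBook XML V4.1.2//EN',
    'http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd'
);

// Create a document whose root element is <book> and attach the doctype.
$doc = $impl->createDocument(null, 'book', $doctype);

$name     = $doc->doctype->name;     // name following the DOCTYPE keyword
$publicId = $doc->doctype->publicId; // public identifier
$systemId = $doc->doctype->systemId; // system identifier (here a URL)
```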


DOMElement

Extends DOMNode.

Constructor

Methods

Properties

Table 56. 

Name Type Read-only Description
schemaTypeInfo bool yes Not implemented yet, always returns NULL
tagName string yes The element name


DOMEntity

Extends DOMNode.

This interface represents a known entity, either parsed or unparsed, in an XML document.

Properties

Table 57. 

Name Type Read-only Description
publicId string yes The public identifier associated with the entity if specified, and NULL otherwise.
systemId string yes The system identifier associated with the entity if specified, and NULL otherwise. This may be an absolute URI or not.
notationName string yes For unparsed entities, the name of the notation for the entity. For parsed entities, this is NULL.
actualEncoding string no An attribute specifying the encoding used for this entity at the time of parsing, when it is an external parsed entity. This is NULL if it is an entity from the internal subset or if it is not known.
encoding string yes An attribute specifying, as part of the text declaration, the encoding of this entity, when it is an external parsed entity. This is NULL otherwise.
version string yes An attribute specifying, as part of the text declaration, the version number of this entity, when it is an external parsed entity. This is NULL otherwise.


DOMEntityReference

Extends DOMNode.

Constructor

DOMException

DOM operations raise exceptions under particular circumstances, i.e., when an operation is impossible to perform for logical reasons.

See also Chapter 11, Exceptions.

Properties

Table 58. 

Name Type Read-only Description
code int yes An integer indicating the type of error generated


DOMImplementation

The DOMImplementation interface provides a number of methods for performing operations that are independent of any particular instance of the document object model.

Constructor

Methods

DOMNamedNodeMap

Methods

DOMNode

Methods

Properties

Table 59. 

Name Type Read-only Description
nodeName string yes Returns the most accurate name for the current node type
nodeValue string no The value of this node, depending on its type.
nodeType int yes Gets the type of the node. One of the predefined XML_xxx_NODE constants
parentNode DOMNode yes The parent of this node.
childNodes DOMNodeList yes A DOMNodeList that contains all children of this node. If there are no children, this is an empty DOMNodeList.
firstChild DOMNode yes The first child of this node. If there is no such node, this returns NULL.
lastChild DOMNode yes The last child of this node. If there is no such node, this returns NULL.
previousSibling DOMNode yes The node immediately preceding this node. If there is no such node, this returns NULL.
nextSibling DOMNode yes The node immediately following this node. If there is no such node, this returns NULL.
attributes DOMNamedNodeMap yes A DOMNamedNodeMap containing the attributes of this node (if it is a DOMElement) or NULL otherwise.
ownerDocument DOMDocument yes The DOMDocument object associated with this node.
namespaceURI string yes The namespace URI of this node, or NULL if it is unspecified.
prefix string no The namespace prefix of this node, or NULL if it is unspecified.
localName string yes Returns the local part of the qualified name of this node.
baseURI string yes The absolute base URI of this node or NULL if the implementation wasn't able to obtain an absolute URI.
textContent string no This attribute returns the text content of this node and its descendants.
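
The navigation properties above are usually combined with a nodeType check, because childNodes also contains comments and text nodes. A small sketch, using a made-up row in the style of book.xml:

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<row><entry>The Pearl</entry><!-- note --><entry>en</entry></row>');

$entries = array();
foreach ($doc->documentElement->childNodes as $child) {
    // Skip the comment node; keep element children only.
    if ($child->nodeType === XML_ELEMENT_NODE) {
        $entries[] = $child->nodeName . '=' . $child->textContent;
    }
}

// firstChild and lastChild are plain property reads, not method calls.
$first = $doc->documentElement->firstChild->textContent;
$last  = $doc->documentElement->lastChild->textContent;
```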


DOMNodeList

Methods

Properties

Table 60. 

Name Type Read-only Description
length int yes The number of nodes in the list. The range of valid child node indices is 0 to length - 1 inclusive.


DOMNotation

Extends DOMNode.

Properties

Table 61. 

Name Type Read-only Description
publicId string yes  
systemId string yes  


DOMProcessingInstruction

Extends DOMNode.

Constructor

Properties

Table 62. 

Name Type Read-only Description
target string yes  
data string no  


DOMText

Extends DOMCharacterData.

Constructor

Methods

Properties

Table 63. 

Name Type Read-only Description
wholeText string yes  


DOMXPath

Constructor

Methods

Properties

Table 64. 

Name Type Read-only Description
document DOMDocument    


Examples

Many examples in this reference require an XML file. We will use book.xml that contains the following:

Example 512. book.xml

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" [
]>
<book id="listing">
<title>My lists</title>
<chapter id="books">
 <title>My books</title>
 <para>
  <informaltable>
   <tgroup cols="4">
    <thead>
     <row>
      <entry>Title</entry>
      <entry>Author</entry>
      <entry>Language</entry>
      <entry>ISBN</entry>
     </row>
    </thead>
    <tbody>
     <row>
      <entry>The Grapes of Wrath</entry>
      <entry>John Steinbeck</entry>
      <entry>en</entry>
      <entry>0140186409</entry>
     </row>
     <row>
      <entry>The Pearl</entry>
      <entry>John Steinbeck</entry>
      <entry>en</entry>
      <entry>014017737X</entry>
     </row>
     <row>
      <entry>Samarcande</entry>
      <entry>Amine Maalouf</entry>
      <entry>fr</entry>
      <entry>2253051209</entry>
     </row>
     <!-- TODO: I have a lot of remaining books to add.. -->
    </tbody>
   </tgroup>
  </informaltable>
 </para>
</chapter>
</book>
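
As an illustration of how such a file might be queried, the sketch below uses a trimmed, inline copy of the rows from book.xml (an assumption made to keep it self-contained, without the DOCTYPE) and selects the titles of the English-language books with DOMXPath.

```php
<?php
$xml = <<<XML
<book id="listing">
 <chapter id="books">
  <tbody>
   <row><entry>The Grapes of Wrath</entry><entry>John Steinbeck</entry><entry>en</entry><entry>0140186409</entry></row>
   <row><entry>The Pearl</entry><entry>John Steinbeck</entry><entry>en</entry><entry>014017737X</entry></row>
   <row><entry>Samarcande</entry><entry>Amine Maalouf</entry><entry>fr</entry><entry>2253051209</entry></row>
  </tbody>
 </chapter>
</book>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

$xpath  = new DOMXPath($doc);
$titles = array();

// XPath positions are 1-based: entry[3] is the language column, entry[1] the title.
foreach ($xpath->query('//row[entry[3]="en"]/entry[1]') as $entry) {
    $titles[] = $entry->textContent;
}
```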


Predefined Constants

The constants below are defined by this extension, and will only be available when the extension has either been compiled into PHP or dynamically loaded at runtime.

Table 65. XML constants

Constant Value Description
XML_ELEMENT_NODE (integer) 1 Node is a DOMElement
XML_ATTRIBUTE_NODE (integer) 2 Node is a DOMAttr
XML_TEXT_NODE (integer) 3 Node is a DOMText
XML_CDATA_SECTION_NODE (integer) 4 Node is a DOMCharacterData
XML_ENTITY_REF_NODE (integer) 5 Node is a DOMEntityReference
XML_ENTITY_NODE (integer) 6 Node is a DOMEntity
XML_PI_NODE (integer) 7 Node is a DOMProcessingInstruction
XML_COMMENT_NODE (integer) 8 Node is a DOMComment
XML_DOCUMENT_NODE (integer) 9 Node is a DOMDocument
XML_DOCUMENT_TYPE_NODE (integer) 10 Node is a DOMDocumentType
XML_DOCUMENT_FRAG_NODE (integer) 11 Node is a DOMDocumentFragment
XML_NOTATION_NODE (integer) 12 Node is a DOMNotation
XML_HTML_DOCUMENT_NODE (integer) 13  
XML_DTD_NODE (integer) 14  
XML_ELEMENT_DECL_NODE (integer) 15  
XML_ATTRIBUTE_DECL_NODE (integer) 16  
XML_ENTITY_DECL_NODE (integer) 17  
XML_NAMESPACE_DECL_NODE (integer) 18  
XML_ATTRIBUTE_CDATA (integer) 1  
XML_ATTRIBUTE_ID (integer) 2  
XML_ATTRIBUTE_IDREF (integer) 3  
XML_ATTRIBUTE_IDREFS (integer) 4  
XML_ATTRIBUTE_ENTITY (integer) 5  
XML_ATTRIBUTE_NMTOKEN (integer) 7  
XML_ATTRIBUTE_NMTOKENS (integer) 8  
XML_ATTRIBUTE_ENUMERATION (integer) 9  
XML_ATTRIBUTE_NOTATION (integer) 10  


Table 66. DOMException constants

Constant Value Description
DOM_INDEX_SIZE_ERR (integer) 1 If index or size is negative, or greater than the allowed value.
DOMSTRING_SIZE_ERR (integer) 2 If the specified range of text does not fit into a DOMString.
DOM_HIERARCHY_REQUEST_ERR (integer) 3 If any node is inserted somewhere it doesn't belong.
DOM_WRONG_DOCUMENT_ERR (integer) 4 If a node is used in a different document than the one that created it.
DOM_INVALID_CHARACTER_ERR (integer) 5 If an invalid or illegal character is specified, such as in a name.
DOM_NO_DATA_ALLOWED_ERR (integer) 6 If data is specified for a node which does not support data.
DOM_NO_MODIFICATION_ALLOWED_ERR (integer) 7 If an attempt is made to modify an object where modifications are not allowed.
DOM_NOT_FOUND_ERR (integer) 8 If an attempt is made to reference a node in a context where it does not exist.
DOM_NOT_SUPPORTED_ERR (integer) 9 If the implementation does not support the requested type of object or operation.
DOM_INUSE_ATTRIBUTE_ERR (integer) 10 If an attempt is made to add an attribute that is already in use elsewhere.
DOM_INVALID_STATE_ERR (integer) 11 If an attempt is made to use an object that is not, or is no longer, usable.
DOM_SYNTAX_ERR (integer) 12 If an invalid or illegal string is specified.
DOM_INVALID_MODIFICATION_ERR (integer) 13 If an attempt is made to modify the type of the underlying object.
DOM_NAMESPACE_ERR (integer) 14 If an attempt is made to create or change an object in a way which is incorrect with regard to namespaces.
DOM_INVALID_ACCESS_ERR (integer) 15 If a parameter or an operation is not supported by the underlying object.
DOM_VALIDATION_ERR (integer) 16 If a call to a method such as insertBefore or removeChild would make the Node invalid with respect to "partial validity", this exception would be raised and the operation would not be done.
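
These constants are what the code of a caught DOMException can be compared against. A minimal sketch, assuming the default strict error checking, that triggers DOM_INVALID_CHARACTER_ERR with an illegal element name:

```php
<?php
$doc  = new DOMDocument();
$code = null;

try {
    // Element names may not contain spaces, so this call throws a DOMException.
    $doc->createElement('not a valid name');
} catch (DOMException $e) {
    // getCode() returns the same integer as the code property listed for DOMException.
    $code = $e->getCode();
}

$isInvalidCharacter = ($code === DOM_INVALID_CHARACTER_ERR);
```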


Table of Contents

DOMAttr->__construct() — Creates a new DOMAttr object
DOMAttr->isId() — Checks if attribute is a defined ID
DOMCharacterData->appendData() — Append the string to the end of the character data of the node
DOMCharacterData->deleteData() — Remove a range of characters from the node
DOMCharacterData->insertData() — Insert a string at the specified 16-bit unit offset
DOMCharacterData->replaceData() — Replace a substring within the DOMCharacterData node
DOMCharacterData->substringData() — Extracts a range of data from the node
DOMComment->__construct() — Creates a new DOMComment object
DOMDocument->__construct() — Creates a new DOMDocument object
DOMDocument->createAttribute() — Create new attribute
DOMDocument->createAttributeNS() — Create new attribute node with an associated namespace
DOMDocument->createCDATASection() — Create new cdata node
DOMDocument->createComment() — Create new comment node
DOMDocument->createDocumentFragment() — Create new document fragment
DOMDocument->createElement() — Create new element node
DOMDocument->createElementNS() — Create new element node with an associated namespace
DOMDocument->createEntityReference() — Create new entity reference node
DOMDocument->createProcessingInstruction() — Creates new PI node
DOMDocument->createTextNode() — Create new text node
DOMDocument->getElementById() — Searches for an element with a certain id
DOMDocument->getElementsByTagName() — Searches for all elements with given tag name
DOMDocument->getElementsByTagNameNS() — Searches for all elements with given tag name in specified namespace
DOMDocument->importNode() — Import node into current document
DOMDocument->load() — Load XML from a file
DOMDocument->loadHTML() — Load HTML from a string
DOMDocument->loadHTMLFile() — Load HTML from a file
DOMDocument->loadXML() — Load XML from a string
DOMDocument->normalizeDocument() — Normalizes the document
DOMDocument->registerNodeClass() — Register extended class used to create base node type
DOMDocument->relaxNGValidate() — Performs relaxNG validation on the document
DOMDocument->relaxNGValidateSource() — Performs relaxNG validation on the document
DOMDocument->save() — Dumps the internal XML tree back into a file
DOMDocument->saveHTML() — Dumps the internal document into a string using HTML formatting
DOMDocument->saveHTMLFile() — Dumps the internal document into a file using HTML formatting
DOMDocument->saveXML() — Dumps the internal XML tree back into a string
DOMDocument->schemaValidate() — Validates a document based on a schema
DOMDocument->schemaValidateSource() — Validates a document based on a schema
DOMDocument->validate() — Validates the document based on its DTD
DOMDocument->xinclude() — Substitutes XIncludes in a DOMDocument Object
DOMDocumentFragment->appendXML() — Append raw XML data
DOMElement->__construct() — Creates a new DOMElement object
DOMElement->getAttribute() — Returns value of attribute
DOMElement->getAttributeNode() — Returns attribute node
DOMElement->getAttributeNodeNS() — Returns attribute node
DOMElement->getAttributeNS() — Returns value of attribute
DOMElement->getElementsByTagName() — Gets elements by tagname
DOMElement->getElementsByTagNameNS() — Get elements by namespaceURI and localName
DOMElement->hasAttribute() — Checks to see if attribute exists
DOMElement->hasAttributeNS() — Checks to see if attribute exists
DOMElement->removeAttribute() — Removes attribute
DOMElement->removeAttributeNode() — Removes attribute
DOMElement->removeAttributeNS() — Removes attribute
DOMElement->setAttribute() — Adds new attribute
DOMElement->setAttributeNode() — Adds new attribute node to element
DOMElement->setAttributeNodeNS() — Adds new attribute node to element
DOMElement->setAttributeNS() — Adds new attribute
DOMElement->setIdAttribute() — Declares the attribute specified by name to be of type ID
DOMElement->setIdAttributeNode() — Declares the attribute specified by node to be of type ID
DOMElement->setIdAttributeNS() — Declares the attribute specified by local name and namespace URI to be of type ID
DOMEntityReference->__construct() — Creates a new DOMEntityReference object
DOMImplementation->__construct() — Creates a new DOMImplementation object
DOMImplementation->createDocument() — Creates a DOMDocument object of the specified type with its document element
DOMImplementation->createDocumentType() — Creates an empty DOMDocumentType object
DOMImplementation->hasFeature() — Test if the DOM implementation implements a specific feature
DOMNamedNodeMap->getNamedItem() — Retrieves a node specified by name
DOMNamedNodeMap->getNamedItemNS() — Retrieves a node specified by local name and namespace URI
DOMNamedNodeMap->item() — Retrieves a node specified by index
DOMNode->appendChild() — Adds new child at the end of the children
DOMNode->cloneNode() — Clones a node
DOMNode->hasAttributes() — Checks if node has attributes
DOMNode->hasChildNodes() — Checks if node has children
DOMNode->insertBefore() — Adds a new child before a reference node
DOMNode->isDefaultNamespace() — Checks if the specified namespaceURI is the default namespace or not
DOMNode->isSameNode() — Indicates if two nodes are the same node
DOMNode->isSupported() — Checks if feature is supported for specified version
DOMNode->lookupNamespaceURI() — Gets the namespace URI of the node based on the prefix
DOMNode->lookupPrefix() — Gets the namespace prefix of the node based on the namespace URI
DOMNode->normalize() — Normalizes the node
DOMNode->removeChild() — Removes child from list of children
DOMNode->replaceChild() — Replaces a child
DOMNodeList->item() — Retrieves a node specified by index
DOMProcessingInstruction->__construct() — Creates a new DOMProcessingInstruction object
DOMText->__construct() — Creates a new DOMText object
DOMText->isWhitespaceInElementContent() — Indicates whether this text node contains whitespace
DOMText->splitText() — Breaks this node into two nodes at the specified offset
DOMXPath->__construct() — Creates a new DOMXPath object
DOMXPath->evaluate() — Evaluates the given XPath expression and returns a typed result if possible.
DOMXPath->query() — Evaluates the given XPath expression
DOMXPath->registerNamespace() — Registers the namespace with the DOMXPath object
dom_import_simplexml — Gets a DOMElement object from a SimpleXMLElement object

Code Examples / Notes » ref.dom

php

[Editor's Note: If you're using entities, then you have no choice. XML Catalogs can speed DTD resolution.]
Never use
$dom->resolveExternals=true;
when parsing an XHTML document whose DOCTYPE declaration specifies a DTD URL.
Otherwise, parsing XHTML with a DOCTYPE like this one:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
will make PHP/DOM download the DTD file from the W3C site every time it parses your document. This adds extra delay to your script: in my experience, the total time of $dom->load() ranged from 1 to 16 seconds.
elixon
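
A sketch of the safer setup, assuming the document uses no named entities that would need the DTD. The LIBXML_NONET option is an addition not mentioned in the note; it tells libxml not to access the network while parsing.

```php
<?php
$xhtml = <<<XML
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>t</title></head><body><p>Hello</p></body></html>
XML;

$doc = new DOMDocument();
$doc->resolveExternals = false; // already the default; shown for emphasis

// Parse without fetching the DTD; LIBXML_NONET additionally forbids network access.
$loaded = $doc->loadXML($xhtml, LIBXML_NONET);

$publicId = $doc->doctype->publicId; // the DOCTYPE is still recorded, just not resolved
```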


pes_cz

When I tried to parse my XHTML Strict files with the DOM extension, it couldn't understand XHTML entities (like &copy;). I found a post about it here (14-Jul-2005 09:05) which advised setting resolveExternals = true, but that was very slow. There was a brief note about XML catalogs, but without any clue as to how to use them. Here it is:
An XML catalog works something like a cache. Download all the needed DTDs to /etc/xml, edit the file /etc/xml/catalog and add this line: <public publicId="-//W3C//DTD XHTML 1.0 Strict//EN" uri="file:///etc/xml/xhtml1-strict.dtd" />
That's all. Thanks to http://www.whump.com/moreLikeThis/link/03815


aidan

When dealing with validation or loading, the output errors can be quite annoying.
PHP 5.1 introduces libxml_get_errors().
http://php.net/libxml_get_errors
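
A minimal sketch of collecting parser errors instead of having them emitted as warnings. The malformed input string is made up for illustration.

```php
<?php
// Buffer libxml errors internally instead of raising PHP warnings.
libxml_use_internal_errors(true);

$doc    = new DOMDocument();
$loaded = $doc->loadXML('<book><title>My lists</book>'); // <title> is never closed

$messages = array();
foreach (libxml_get_errors() as $error) {
    // Each entry is a LibXMLError object with line, column, code and message.
    $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
}

// Empty the buffer so later parses start with a clean slate.
libxml_clear_errors();
```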


amir.laheratcomplinet.com

This particular W3C page provides invaluable documentation for the DOM classes implemented in php5 (via libxml2). It fills in plenty of php.net's gaps:
http://www.w3.org/TR/DOM-Level-2-Core/core.html
Some key examples:
* concise summary of the class hierarchy (1.1.1)
* clarification that DOM level 2 doesn't allow for population of internal DTDs
* explanation of DOMNode->normalize()
* explanation of the DOMImplementation class
The interfaces are described in OMG's Interface Definition Language


toby

This module is not included by default in the CentOS 4 "centosplus" repository either. For those using PHP 5 on CentOS 4, a simple "yum --enablerepo=centosplus install php-xml" will do the trick (this installs both the XML and DOM modules).

phpdeveloper

Yanik's dom2array() function (added on 14-Mar-2007 08:40) does not handle multiple nodes with the same name, e.g.:
<foo>
 <name>aa</name>
 <name>bb</name>
</foo>
Each node overwrites the previous one, so your array will contain just the last value ("bb").
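
One way around the overwrite is to collect the values for each node name in a list. The helper below is a hypothetical sketch written for this note (it is not Yanik's dom2array(), and the name elementToArray is made up); it ignores attributes and mixed content.

```php
<?php
// Hypothetical helper: convert the element children of a node into an array,
// keeping every value when a name occurs more than once.
function elementToArray(DOMElement $element)
{
    $result = array();
    foreach ($element->childNodes as $child) {
        if ($child->nodeType !== XML_ELEMENT_NODE) {
            continue; // skip text, comments, etc.
        }
        // Recurse only when the child has element children of its own.
        $hasElementChildren = false;
        foreach ($child->childNodes as $grandChild) {
            if ($grandChild->nodeType === XML_ELEMENT_NODE) {
                $hasElementChildren = true;
                break;
            }
        }
        $value = $hasElementChildren ? elementToArray($child) : $child->textContent;
        // Append instead of assigning, so repeated names do not overwrite each other.
        $result[$child->nodeName][] = $value;
    }
    return $result;
}

$doc = new DOMDocument();
$doc->loadXML('<foo><name>aa</name><name>bb</name></foo>');
$array = elementToArray($doc->documentElement);
```

The trade-off of this design is that every name maps to a list, even when it occurs only once.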


simlee

The project I'm currently working on uses XPaths to dynamically navigate through chunks of an XML file.  I couldn't find any PHP code on the net that would build the XPath to a node for me, so I wrote my own function.  Turns out it wasn't as hard as I thought it might be (yay recursion), though it does entail using some PHP shenanigans...  
Hopefully it'll save someone else the trouble of reinventing this wheel.
<?php
   function getNodeXPath( $node ) {
       // Remember that XPath positions are 1-based, not 0-based.

       $parentNode = $node->parentNode;
       if( $parentNode === null ) {
           // Hit the document node. The slash is added when building
           //  the XPath, so we return just an empty string.
           return( "" );
       }

       // Count the siblings that share this node's name, up to and
       //  including the node itself, to get its position in the XPath.
       // A plain counter replaces the original variable variables, which
       //  could collide with local names such as $node or $nodeIndex.
       $position = 0;
       foreach( $parentNode->childNodes as $testNode ) {
           if( $testNode->nodeName == $node->nodeName ) $position++;
           if( $node->isSameNode( $testNode ) ) {
               // Recursively get the XPath for the parent.
               return( getNodeXPath( $parentNode ) . "/{$node->nodeName}[{$position}]" );
           }
       }

       // Failsafe return value (the node was not found among its siblings).
       return( "/" );
   }
?>


mark

Note that these DOM functions expect (and presumably return) all their data in UTF-8 character encoding, regardless of what PHP's current encoding is. This means that text nodes, attribute values, etc. should be in UTF-8.
This applies even if you're generating an XML document which is not ultimately in UTF-8.
Mark


cormac

Most email clients ignore stylesheets in HTML-formatted emails. The best way to ensure your HTML is rendered correctly by a broad spectrum of email clients, including webmail implementations such as Gmail, is to use inline style attributes. The following function uses DOM to parse an inline stylesheet: it replaces element class and id attributes with inline style attributes, and adds inline style attributes for generic tag stylesheet rules. It removes the stylesheet and any used class and id attributes, as these are defunct for most email clients. It is a fairly lightweight function and does not support CSS inheritance, but it will work for simple stylesheets, e.g.:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>HTML EMAIL</title>
<style type="text/css">
body {
 margin: 10px 10px;
 font: 8pt arial;
 background: #fff;
 color: #000;
}
p {
 margin: 0 0 10px;
 line-height: 1.2em;
 text-align: justify;
}
p.centered {
 text-align: center;
}
p#right {
 text-align: right;
}
</style>
</head>
<body>


<p>Sample text justified</p>
<p class="centered">Centered text here</p>
<p id="right">Right-aligned text</p>
</body>
</html>
Here's the function:
<?php
function parseStyleSheetForEmail($html)
{
 $doc = new DOMDocument;
 $doc->loadHTML($html);
 // grab inline stylesheet as DOM object
 $oStyle = $doc->getElementsByTagName('style')->item(0);
 // grab rule identifiers and rules
 preg_match_all('/^([-#._a-z0-9]+) ?\{(.*?)\}/ims', $oStyle->nodeValue, $aMatches, PREG_SET_ORDER);
 foreach ($aMatches as $aRule) {
   $rule_id = $aRule[1];
   // clean up rules
   $rule = str_replace(array("\r", "\n", '  ', '; '), array('', '', ' ', ';'), $aRule[2]);
   $rule = preg_replace(array('/^ /', '/;$/'), '', $rule);
   // generic rules
   if (!strstr($rule_id, '.') && !strstr($rule_id, '#')) {
     $items = $doc->getElementsByTagName($rule_id);
     // set style attribute equal to rule from stylesheet
     foreach ($items as $item) {
       // if there is already inline style append it to end of stylesheet rule
       $current_style = $item->getAttribute('style');
       if (!empty($current_style)) {
         $item->setAttribute('style', $rule . ';' . $current_style);
       } else {
         $item->setAttribute('style', $rule);
       }
     }
   // classes
   } elseif (strstr($rule_id, '.')) {
     list($rule_tag, $rule_class) = explode('.', $rule_id);
     $items = $doc->getElementsByTagName($rule_tag);
     foreach ($items as $item) {
       $class = $item->getAttribute('class');
       if ($class == $rule_class) {
         // if there is already inline style, append the stylesheet rule to the end of it
         $current_style = $item->getAttribute('style');
         if (!empty($current_style)) {
           $item->setAttribute('style', $current_style . ';' . $rule);
         } else {
           $item->setAttribute('style', $rule);
         }
         // remove class as it won't be used now
         $item->removeAttribute('class');
       }
     }
   // ids
   } elseif (strstr($rule_id, '#')) {
     list($rule_tag, $id) = explode('#', $rule_id);
     $item = $doc->getElementById($id);
     // skip rules whose id does not occur in the document
     if ($item === null) {
       continue;
     }
     $current_style = $item->getAttribute('style');
     if (!empty($current_style)) {
       $item->setAttribute('style', $current_style . ';' . $rule);
     } else {
       $item->setAttribute('style', $rule);
     }
     // remove id as it won't be used now
     $item->removeAttribute('id');
   }
 }
 // remove inline stylesheet
 $oStyle->parentNode->removeChild($oStyle);
 return $doc->saveHTML();
}
?>


francois hill

In response to myself...
I have just realized that it is possible to pass a context node in an XPath query:
DOMNodeList query ( string $expression [, DOMNode $contextnode] )
Thus rendering my class 'XPathableNode' useless.
Makes sense really.
I expect you learn every day =0)
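
A short sketch of the context-node form of query() mentioned here, using a made-up document. A relative expression is evaluated against the node passed as the second argument.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML(
    '<lists><list id="a"><item>1</item><item>2</item></list>'
    . '<list id="b"><item>3</item></list></lists>'
);

$xpath = new DOMXPath($doc);

// Pick the node that will serve as the context for the second query.
$listB = $xpath->query('/lists/list[@id="b"]')->item(0);

$values = array();
// 'item' is relative, so only the children of $listB are matched.
foreach ($xpath->query('item', $listB) as $item) {
    $values[] = $item->textContent;
}
```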


francois hill

In response to lutfi at smartconsultant dot us :
(see my post on
http://fr2.php.net/manual/en/
function.dom-domdocument-getelementsbytagname.php
)
Use this class I wrote:
class XPathableNode extends DOMNode
{
    protected $Node;
    protected $DOMDocument_from_node;
    protected $DOMXpath_for_node;

    public function __construct(/* DOMNode */ $node)
    {
        $this->Node = $node;
        // Copy the node into a document of its own so XPath can run on it.
        $this->DOMDocument_from_node = new DOMDocument();
        $domNode = $this->DOMDocument_from_node->importNode($this->Node, true);
        $this->DOMDocument_from_node->appendChild($domNode);
        $this->DOMXpath_for_node = new DOMXPath($this->DOMDocument_from_node);
    }

    public function convertHTML()
    {
        return $this->DOMDocument_from_node->saveHTML();
    }

    public /* DOMNodeList */ function applyXpath($xpath)
    {
        return $this->DOMXpath_for_node->query($xpath);
    }
}
Then :
Make a new XPathableNode out of your parent node.
You may then retrieve a DOMNodeList from it by applying a
xpath (thus being able to specify the depth  and name of
elements you want).
Has got me around some (of the many) DOM awkwardnesses a few times.
;o)


freyjkell

To handle XHTML entities REALLY well with DOM, you can do the following:
1. Add this DOCTYPE to your documents
<!DOCTYPE xhtmlentities PUBLIC "-//W3C//ENTITIES XHTML Character Entities 1.0//EN" "/xhtml11.ent">
2. Copy http://freyjkell.ovh.org/xhtml11.ent into your document root.
3. In your PHP:
<?php
$dom=new DOMDocument();
$dom->load('file.xhtml',LIBXML_DTDLOAD); // NOT resolveExternals - it needs true doctype, and includes crap code
// some DOM operations
$impl=new DOMImplementation(); // the create* methods are not static, so an instance is needed
$doctype=$impl->createDocumentType("html","-//W3C//DTD XHTML 1.1//EN","http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"); // creating real doctype
$output=$impl->createDocument('','',$doctype);
$output->appendChild($output->importNode($dom->documentElement,true));
$output->encoding='utf-8';
$output=$output->saveXML();
$xhtml=preg_match(
'/application\/xhtml\+xml(?![+a-z])'.
'(;q=(0\.\d{1,3}|[01]))?/i',
$_SERVER['HTTP_ACCEPT'],$matches) &&
(isset($matches[2])?$matches[2]:1) > 0 ||
strpos($_SERVER["HTTP_USER_AGENT"],
"W3C_Validator")!==false ||
strpos($_SERVER["HTTP_USER_AGENT"],
"WebKit")!==false; // XHTML Content-Negotiation
header('Content-Type: '.($xhtml?'application/xhtml+xml':'text/html').'; charset=utf-8');
print $output;
?>


massimo dot scamarcia

If you're moving from PHP4 to PHP5, you can keep your scripts untouched using this:
http://alexandre.alapetite.net/doc-alex/domxml-php4-php5/index.en.html


cooper

If you are using the non-object-oriented functions and it would take too much time to change them all (or you plan to replace them later), then as a temporary solution you can use these modules:
For DOM XML:
http://alexandre.alapetite.net/doc-alex/domxml-php4-php5/
For XSLT:
http://alexandre.alapetite.net/doc-alex/xslt-php4-php5/


spammable69

I wrote a framework to implement the StyleSheet interfaces as specified on the W3C website.  The code is written in PHP, and is NOT a complete implementation.  Use it how ya like.  I was planning on adding the CSSStyleSheet interfaces as well.  Feel free to ask.
<?
class StyleSheetList {
public length;
private self;

function __construct ( ) {
$this->self = array();
}

function __get($property, $&ret) {
if($property == 'length')
$ret = count($this->self);
return true;
}

function __set($property, $val) {
if($property == 'length')
return true;
}

function item( $index ) {
return $this->self[$index];
}
}

class MediaList extends StyleSheetList {

function appendMedium ( $newMedium ) {
array_push($this->self, $newMedium);
}

function deleteMedium ( $oldMedium ) {
foreach($this->self as $item) {
if( $item == $oldMedium ) {
$item = $this->self[ $this->length-1 ];
array_pop($this->self);
break;
}
}
}
}

class DocumentStyle {
public styleSheets;

function __construct ( ) {
$this->styleSheets = new StyleSheetList();
}

function __set($property, $val) {
if($property == 'styleSheets')
return true;
}
}

class LinkStyle {
public sheet;

function __construct () {
$this->sheet = new StyleSheet();
}

function __set($property, $val) {
if($property == 'sheet')
return true;
}
}

class StyleSheet {
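public $type;
public $disabled;
public $ownerNode;
public $parentStyleSheet;
public $href;
public $title;
public $media;

// Defaults allow LinkStyle to create an empty StyleSheet with no arguments.
function __construct( $type = null, $disabled = false, $ownerNode = null, $parentStyleSheet = null, $href = null, $title = null){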
$this->type = $type;
$this->disabled = $disabled;
$this->media = new MediaList();
$this->ownerNode = $ownerNode;
$this->parentStyleSheet = $parentStyleSheet;
$this->href = $href;
$this->title = $title;
}
}
?>
Only contactable via http://murpsoft.com/contact.html


nevyn

I wrote a couple of functions to:
- create a DOMDocument from a file
- parse the namespaces in it
- create a XPath object with all the namespaces registered
- load the schemalocations
- validate the file on the main schema (the one without prefix)
It is useful to me; I hope it is useful to someone else too.
Giulio
function decodeNode($node)
{
   $out = $node->ownerDocument->saveXML($node);
   $re = "{^<((?:\\w*:)?\\w*)". //the tag name
   "[\\s\n\r]*((?:[\\s\n\r]*".
   "(?:\\w*:)?\\w+[\\s\n\r]*=[\\s\n\r]*". //possible attribute name
   "(?:\"[^\"]*\"|\'[^\']*\'))*)". //attribute value
   "[\\s\n\r]*>[\r\n]*".
   "((?:.*[\r\n]*)*)". //content
   "[\r\n]*</\\1>$}"; //closing tag
   preg_match($re, $out, $mat);
   return $mat;
}
function innerXml($node)
{
   $mat = decodeNode($node);
   return $mat[3];
}
function getnodeAttributes($node)
{
   $mat = decodeNode($node);
   $txt = $mat[2];
   $re = "{((?:\\w*:)?\\w+)[\\s\n\r]*=[\\s\n\r]*(\"[^\"]*\"|\'[^\']*\')}";
   preg_match_all($re, $txt, $mat);
   $att = array();
   for ($i=0; $i<count($mat[0]); $i++)
   {
       $value = $mat[2][$i];
       if ($value[0] == "'" || $value[0] == "\"")
       {
           $len = strlen($value);
           $value = substr($value, 1, strlen($value)-2);
       }
       $att[ $mat[1][$i] ] = $value;
   }
   return $att;
}
function loadXml($file)
{
   $doc = new DOMDocument();
   $doc->load($file);
   //look for the xmlns attributes
   $xsi = false;
   $doc->namespaces = array();
   $doc->xpath = new DOMXPath($doc);
   
   $attr = getnodeAttributes($doc->documentElement);
   foreach ($attr as $name => $value)
   {
       if (substr($name,0,5) == "xmlns")
       {
           $uri = $value;
           $pre = $doc->documentElement->lookupPrefix($uri);
           if ($uri == "http://www.w3.org/2001/XMLSchema-instance")
               $xsi = $pre;
           $doc->namespaces[$pre] = $uri;
           if ($pre == "")
               $pre = "noname";
           $doc->xpath->registerNamespace($pre, $uri);
       }
   }
   
   if ($xsi)
   {
       $doc->schemaLocations = array();
       $lst = $doc->xpath->query("//@$xsi:schemaLocation");
       foreach($lst as $el)
       {
           $re = "{[\\s\n\r]*([^\\s\n\r]+)[\\s\n\r]*([^\\s\n\r]+)}";
           preg_match_all($re, $el->nodeValue, $mat);
           for ($i=0; $i<count($mat[0]); $i++)
           {
               $value = $mat[2][$i];
               $doc->schemaLocations[ $mat[1][$i] ] = $value;
           }
       }
       $olddir = getcwd();
       chdir(dirname($file));
       $schema = $doc->schemaLocations[$doc->namespaces[""]];
       if (substr($schema,0,7) == "file://")
       {
           $schema = substr($schema,7);
       }
       if (!$doc->schemaValidate($schema))
           trigger_error("Invalid file", E_USER_WARNING);
       chdir($olddir);
   }
   
   return $doc;
}
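
As an alternative to parsing the serialized root tag with regular expressions, the namespace declarations can be read through the XPath namespace axis. This is only a minimal sketch: the helper name xpathWithNamespaces() and the example URIs are made up for illustration, and the "noname" prefix for the default namespace follows the convention used in loadXml() above.

```php
<?php
// Hypothetical helper (not part of the DOM extension): returns a DOMXPath
// with every namespace in scope on the root element registered.
function xpathWithNamespaces(DOMDocument $doc, $defaultPrefix = 'noname')
{
    $xpath = new DOMXPath($doc);
    // namespace::* selects the namespace nodes in scope on the context node.
    foreach ($xpath->query('namespace::*', $doc->documentElement) as $ns) {
        $prefix = (string) $ns->prefix;
        if ($prefix == 'xml') {
            continue; // the xml prefix is always bound implicitly
        }
        // The default namespace has no prefix, so register it under a made-up one.
        $xpath->registerNamespace($prefix == '' ? $defaultPrefix : $prefix, $ns->nodeValue);
    }
    return $xpath;
}

$doc = new DOMDocument();
$doc->loadXML('<root xmlns="urn:example:a" xmlns:b="urn:example:b"><child/><b:item/></root>');
$xpath = xpathWithNamespaces($doc);
$childCount = $xpath->query('//noname:child')->length;
$itemCount = $xpath->query('//b:item')->length;
```

Only declarations in scope on the root element are picked up; namespaces declared deeper in the tree would need the same loop run on those elements.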


lutfi

I have a problem parsing a recursive XML tree like this one:
<menu name='parent1' >
 <submenu name='file' >
    <submenu name='open' >Open file</submenu>
    <submenu name='close' >Close file</submenu>
 </submenu>
 <submenu name='edit' >
    <submenu name='cut' >Cut Clipboards</submenu>
    <submenu name='copy' >Copy Clipboards</submenu>
    <submenu name='paste' >Paste Clipboards</submenu>
 </submenu>
</menu>
With getElementsByTagName, all submenu elements are returned at the same level.
I want the result structured as a tree, without changing the 'submenu' tag name.
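
One possible approach, sketched under the assumption that each submenu has a unique name attribute among its siblings, is to walk childNodes recursively instead of calling getElementsByTagName, since childNodes contains only direct children. The function name submenuTree() is made up for this example.

```php
<?php
// Hypothetical helper: builds a nested array from <submenu> elements,
// looking only at direct children so that the nesting is preserved.
function submenuTree(DOMNode $parent)
{
    $tree = array();
    foreach ($parent->childNodes as $child) {
        // Skip whitespace text nodes and anything that is not a <submenu>.
        if ($child->nodeType != XML_ELEMENT_NODE || $child->nodeName != 'submenu') {
            continue;
        }
        $children = submenuTree($child);
        // Leaves keep their text; branches keep their nested submenus.
        $tree[$child->getAttribute('name')] = $children ? $children : trim($child->textContent);
    }
    return $tree;
}

$xml = <<<XML
<menu name='parent1'>
 <submenu name='file'>
    <submenu name='open'>Open file</submenu>
    <submenu name='close'>Close file</submenu>
 </submenu>
 <submenu name='edit'>
    <submenu name='cut'>Cut Clipboards</submenu>
    <submenu name='copy'>Copy Clipboards</submenu>
    <submenu name='paste'>Paste Clipboards</submenu>
 </submenu>
</menu>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$tree = submenuTree($doc->documentElement);
print_r($tree);
```

The result has 'file' and 'edit' as top-level keys, each holding an array of its own submenus keyed by name.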


yanik
I dislike the DOM model, so I wrote a simple, easy-to-use dom2array function. Note that sibling elements with the same name overwrite each other in the result:
function dom2array($node) {
 $res = array();
 if($node->nodeType == XML_TEXT_NODE){
     $res = $node->nodeValue;
 }
 else{
     if($node->hasAttributes()){
         $attributes = $node->attributes;
         if(!is_null($attributes)){
             $res['@attributes'] = array();
             foreach ($attributes as $index=>$attr) {
                 $res['@attributes'][$attr->name] = $attr->value;
             }
         }
     }
     if($node->hasChildNodes()){
         $children = $node->childNodes;
         for($i=0;$i<$children->length;$i++){
             $child = $children->item($i);
             $res[$child->nodeName] = dom2array($child);
         }
     }
 }
 return $res;
}


brian dot reynolds

I found the xml2array function below very useful, but there seems to be a bug in it: the $item variable was never set. I've expanded the function to be a bit more readable, and the corrected code is:
function xmlToArray($n)
{
   $return=array();
   foreach($n->childNodes as $nc){
       if( $nc->hasChildNodes() ){
           if( $n->firstChild->nodeName== $n->lastChild->nodeName&&$n->childNodes->length>1){
               // repeated sibling elements are collected into a list
               $return[$nc->nodeName][]=xmlToArray($nc);
           }
           else{
                $return[$nc->nodeName]=xmlToArray($nc);
           }
      }
      else{
          $return=$nc->nodeValue;
      }
   }
   return $return;
}


johanwthijs-at-hotmail-dot-com

Being an experienced ASP developer, I was wondering how to replace the textual content of a node (with msxml this is simply achieved by setting the 'text' property of a node). Out of frustration I started to play around with SimpleXml, but I could not get it to work in combination with XPath.
It took me a lot of time to find out, so I hope this helps others:
function replaceNodeText($objXml, $objNode, $strNewContent){
/*
This function replaces a node's string content with $strNewContent
*/
// Collect the text nodes first: removing nodes while iterating the live list skips siblings.
$arrTextNodes = array();
foreach ( $objNode->childNodes as $objNodeNested ){
if ($objNodeNested->nodeType == XML_TEXT_NODE) $arrTextNodes[] = $objNodeNested;
}
foreach ( $arrTextNodes as $objTextNode ){
$objNode->removeChild($objTextNode);
}

$objNode->appendChild($objXml->createTextNode($strNewContent));
}
$objXml= new DOMDocument();
$objXml->loadXML('<root><node id="1">bla</node></root>');
$objXpath = new DOMXPath($objXml);
$strXpath="/root/node[@id='1']";
$objNodeList = $objXpath->query($strXpath);
$strImportedValue = 'new text'; // example replacement value
foreach ($objNodeList as $objNode){
//objects are passed by handle, so no explicit reference is needed
replaceNodeText($objXml, $objNode, $strImportedValue);
}
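
For the simple case where the element holds nothing but text, a shorter route may be to assign to the nodeValue property, which replaces all of the element's children with a single text node. This is only a sketch with a made-up replacement string. Unlike replaceNodeText() above, it also discards any child elements, and it is shown here with text that contains no '&' characters.

```php
<?php
$objXml = new DOMDocument();
$objXml->loadXML('<root><node id="1">bla</node></root>');
$objXpath = new DOMXPath($objXml);

foreach ($objXpath->query("/root/node[@id='1']") as $objNode) {
    // Assigning nodeValue on an element replaces ALL of its children with one text node.
    $objNode->nodeValue = 'new text';
}

// Serialize only the document element to inspect the result.
$strResult = $objXml->saveXML($objXml->documentElement);
```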


lpetrov

Basically, there are a lot of problems with dynamic namespace registration and with other, perhaps not well documented, parts of DOM.
Here is an article covering some of the problems that our company's web developers found while developing a template engine for our new framework.
The link:
http://blog.axisvista.com/?p=35


aidan

As of PHP 5.1, libxml options may be set using constants rather than the proprietary DomDocument properties.
DomDocument->resolveExternals is equivalent to setting
LIBXML_DTDLOAD
LIBXML_DTDATTR
DomDocument->validateOnParse is equivalent to setting
LIBXML_DTDLOAD
LIBXML_DTDVALID
PHP 5.1 users are encouraged to use the new constants.
Example:
$dom = new DomDocument();
$dom->load($file, LIBXML_DTDLOAD|LIBXML_DTDATTR);
$dom->load($file, LIBXML_DTDLOAD|LIBXML_DTDVALID);


jim.filter

Array to DOM
Here is a recursive function to turn a multidimensional array into an XML document. It handles multiple elements with the same tag name, but only one repeated tag name per parent element. For example:
Can't generate:         Can generate:
<root>                  <root>
 <sub1>data1</sub1>      <subs1>
 <sub1>data2</sub1>         <value>data1</value>
 <sub2>data1</sub2>         <value>data2</value>
 <sub2>data2</sub2>      </subs1>
</root>                  <subs2>
                            <value>data1</value>
                            <value>data2</value>
                         </subs2>
                        </root>
Also, the function performs no type of error checking on your array and will throw a DOMException if a key value you used in your array contains invalid characters for a proper DOM tag. This function is untested for "deep" multidimensional arrays.
Complete code ready to run with example:
<?PHP
 function AtoX($array, $DOM=null, $root=null){
   
   if($DOM  == null){$DOM  = new DOMDocument('1.0', 'iso-8859-1');}
   if($root == null){$root = $DOM->appendChild($DOM->createElement('root'));}
   
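   // The marker key is optional; avoid an undefined-index notice when it is absent.
   $name = isset($array['#MULTIPLE_ELEMENT_NAME']) ? $array['#MULTIPLE_ELEMENT_NAME'] : null;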
     foreach($array as $key => $value){    
       if(is_int($key) && $name != null){
         if(is_array($value)){
           $subroot = $root->appendChild($DOM->createElement($name));
           AtoX($value, $DOM, $subroot);
         }
         else if(is_scalar($value)){
           $root->appendChild($DOM->createElement($name, $value));
         }
       }
       else if(is_string($key) && $key != '#MULTIPLE_ELEMENT_NAME'){
         if(is_array($value)){
           $subroot = $root->appendChild($DOM->createElement($key));
           AtoX($value, $DOM, $subroot);
         }
         else if(is_scalar($value)){
           $root->appendChild($DOM->createElement($key, $value));
         }
       }
     }
   return $DOM;  
 }
 
 $array = array(
                   '#MULTIPLE_ELEMENT_NAME' => 'GenericDatas',
                   'Date'      => 'November 03, 2007',
                   'Company'   => 'Facility One',
                   'Field'     => 'Facility Management Software',
                   'Employees' => array(
                                     '#MULTIPLE_ELEMENT_NAME' => 'Employee',
                                     'Cindy',
                                     'Sean',
                                     'Joe',
                                     'Owen',
                                     'Jim',
                                     'Dale',
                                     'Kelly',
                                     'Ryan',
                                     'Johnathan',
                                     'Robin',
                                     'William Marcus',
                                     'NewCoops' => array(
                                                         '#MULTIPLE_ELEMENT_NAME' => 'Coop',
                                                         'John',
                                                         'Tyler',
                                                         'Ray',
                                                         'Dawn'
                                                        )    
                                   ),
                   'Datas',
                   'DATAS',
                   'OtherDatas'
               );
 
 $DOM  = new DOMDocument('1.0', 'iso-8859-1');
 $root = $DOM->appendChild($DOM->createElement('CompanyData'));
 $DOM  = AtoX($array, $DOM, $root);
 $DOM->save('C:\test.xml');
?>


sanados

In reply to
brian dot reynolds at risaris dot com
20-Feb-2007 10:09
When the nodes at the start vary, your array conversion fails and loses the nodes beneath.
Here is a solution that counts the occurrences of each node name, though it costs some performance:
function xmlToArray($n)
{
   $xml_array = array();
   $occurance = array();
   foreach($n->childNodes as $nc)
   {
       // initialise the counter to avoid an undefined-index notice
       if( !isset($occurance[$nc->nodeName]) )
       {
           $occurance[$nc->nodeName] = 0;
       }
       $occurance[$nc->nodeName]++;
   }

   foreach($n->childNodes as $nc){
       if( $nc->hasChildNodes() )
       {
           if($occurance[$nc->nodeName] > 1)
           {
               $xml_array[$nc->nodeName][] = xmlToArray($nc);
           }
           else
           {
               $xml_array[$nc->nodeName] = xmlToArray($nc);
           }
       }
       else
       {
           return utf8_decode($nc->nodeValue);
       }
   }
   return $xml_array;
}


sean

$xmlDoc=<<<XML
<?xml version="1.0"?>
<methodCall>
  <methodName>examples.getStateName</methodName>
  <params>
     <param>
        <value><i4>41</i4></value>
        </param>
     </params>
  </methodCall>
XML;
$xml= new DOMDocument();
$xml->preserveWhiteSpace=false;
$xml->loadXML($xmlDoc);
print_r(xml2array($xml));
function xml2array($n)
{
$return=array();
foreach($n->childNodes as $nc)
($nc->hasChildNodes())
?($n->firstChild->nodeName== $n->lastChild->nodeName&&$n->childNodes->length>1)
?$return[$nc->nodeName][]=xml2array($item)
:$return[$nc->nodeName]=xml2array($nc)
:$return=$nc->nodeValue;
return $return;
}

