DOMDocument->saveXML()
Example 530. Saving a DOM tree into a string
<?php
$doc = new DOMDocument('1.0');
// we want a nice output
$doc->formatOutput = true;
$root = $doc->createElement('book');
$root = $doc->appendChild($root);
$title = $doc->createElement('title');
$title = $root->appendChild($title);
$text = $doc->createTextNode('This is the title');
$text = $title->appendChild($text);
echo "Saving all the document:\n";
echo $doc->saveXML() . "\n";
echo "Saving only the title part:\n";
echo $doc->saveXML($title);
?>
The above example will output:
Saving all the document:
<?xml version="1.0"?>
<book>
  <title>This is the title</title>
</book>
Saving only the title part:
<title>This is the title</title>
padys
When you save the whole document:
DOMDocument->saveXML() produces a string in the encoding defined by the DOMDocument->encoding property.
When you save only one node:
DOMDocument->saveXML(DOMNode) always produces a string in UTF-8.
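A minimal sketch of the difference described above, assuming a document declared as ISO-8859-1. The element name and text are made up for illustration, and the iconv() call is only one possible way to bring a node string back to the document encoding:

```php
<?php
// Hypothetical document: the element name and text are illustration only.
$doc = new DOMDocument('1.0', 'ISO-8859-1');
// DOM strings are passed in as UTF-8, whatever the document encoding is.
$note = $doc->appendChild($doc->createElement('note', "caf\u{00E9}"));

// Whole document: serialized in the encoding named by $doc->encoding.
$whole = $doc->saveXML();

// Single node: per the note above, expected to come back as UTF-8.
$single = $doc->saveXML($note);

// Convert the node string yourself if it has to match the document encoding
// (iconv() needs the iconv extension, which is usually enabled).
$singleLatin1 = iconv('UTF-8', $doc->encoding, $single);
```

Checking the XML declaration in $whole and the byte length of $single versus $singleLatin1 is a quick way to see which encoding each string is in.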
mswiercz
Quick tip to minimize memory when generating documents with DOM.
Rather than using
$xmlStr = $doc->saveXML();
echo $xmlStr;
to dump a large DOM to the output buffer (where $doc is a DOMDocument instance), write to a PHP output stream, as in
$doc->save('php://output');
This avoids building the whole serialized document as a PHP string first, so a lot of memory is saved when generating large DOMs.
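The tip above can be sketched as a small helper. The function name streamDocument() and the sample element names are assumptions made for this illustration, not part of the DOM extension:

```php
<?php
// Hypothetical helper: sends the document straight to the output stream
// and returns the number of bytes written, or false on failure.
function streamDocument(DOMDocument $doc)
{
    return $doc->save('php://output');
}

// Build a small sample document (element names are illustration only).
$doc = new DOMDocument('1.0', 'UTF-8');
$root = $doc->appendChild($doc->createElement('items'));
for ($i = 1; $i <= 3; $i++) {
    $root->appendChild($doc->createElement('item', (string) $i));
}
```

Calling streamDocument($doc) in a page script prints the XML directly. Because php://output goes through PHP's output layer, the result can still be captured with ob_start() and ob_get_clean() when needed.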
sander
Note that for large DOM trees (tens of thousands of elements nested at least a few levels deep), setting formatOutput to true drives up memory usage to rather insane levels when you call saveXML(). (Tested with PHP 5.2.1) Pretty output is not worth that cost.
devin
It took some searching to figure this one out. I didn't see much in the way of explaining this glitch in the manual thus far. (For PHP5 I believe)
Setting formatOutput = true; appears to have no effect when the DOM was loaded from a file via load(). Example:
<?php
$dom = new DOMDocument();
$dom->load ("test.xml");
$dom->formatOutput = true;
$new_tag = $dom->createElement ('testNode');
$new_tag->appendChild (
$dom->createElement ('test', 'this is a test'));
$dom->documentElement->appendChild ($new_tag);
printf ("<pre>%s</pre>", htmlentities ($dom->saveXML()));
?>
This will not indent the output; the newly added nodes are displayed all on one long line, which makes editing a config.xml a bit difficult after saving it to a file.
Setting preserveWhiteSpace = false; BEFORE the call to load() makes formatOutput work as expected. Example:
<?php
$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->load ("test.xml");
$dom->formatOutput = true;
$new_tag = $dom->createElement ('testNode');
$new_tag->appendChild (
$dom->createElement ('test', 'this is a test'));
$dom->documentElement->appendChild ($new_tag);
printf ("<pre>%s</pre>", htmlentities ($dom->saveXML()));
?>
CAUTION: If the root node of your loaded XML file (test.xml) is empty, has no children, and is written in full form (opening and closing tag) rather than shortened to <root/>, this will NOT work.
Example:
DOES NOT WORK:
<?xml version="1.0"?>
<root>
</root>
WORKS:
<?xml version="1.0"?>
<root/>
WORKS:
<?xml version="1.0"?>
<root>
<!-- comment -->
</root>
WORKS:
<?xml version="1.0"?>
<root>
<child/>
</root>
27-mar-2006 01:20
I used the function posted by "joe", but the following works for me to get the innerXML:
<?php
$itemLeido = $XMLRespuesta->getElementsByTagName("articulos");
foreach($itemLeido as $node) {
echo($node->ownerDocument->saveXML($node));
}
?>
jitr
Comment to `devin at SPAMISBAD dot tritarget dot com''s post:
Thanks for pointing out the pitfalls of `formatOutput' vs. `load*()'. This has certainly saved me from some possible surprises.
I think the seemingly strange behaviour can be explained. Warning: The following is mostly based on deductions and experiments, much less on studying the sources and specs (I'm not sure they would provide an answer anyway, at least not easily).
As you point out, `preserveWhiteSpace' must be set before loading the DOM from the source string (I'm working with `loadXML()' but I believe the situation should be the same with `load()' you used). This looks logical, as this property seems to control the parsing and DOM creation process during which text nodes containing the whitespace are either included or dropped. This can be proven by dumping the DOM structure and comparing the results based on the value of `preserveWhiteSpace'. With `preserveWhiteSpace' set to `FALSE', no text nodes containing whitespace will be present in the returned DOM. When this property is `TRUE', these nodes will be present.
Note: When speaking about the whitespace in the previous paragraph, we're most certainly speaking about so called `whitespace in element content' or `element content whitespace', if I'm not mistaken. See also my comment in the notes of `DOMText->isWhitespaceInElementContent()' method.
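The comparison described above can be reproduced with a short sketch. The helper countWhitespaceTextNodes() and the sample XML are assumptions made for illustration; only direct children of the given node are counted:

```php
<?php
// Hypothetical helper: counts whitespace-only text nodes directly under $parent.
function countWhitespaceTextNodes(DOMNode $parent): int
{
    $count = 0;
    foreach ($parent->childNodes as $child) {
        if ($child instanceof DOMText && trim($child->nodeValue) === '') {
            $count++;
        }
    }
    return $count;
}

$xml = "<root>\n  <child/>\n  <child/>\n</root>";

$kept = new DOMDocument();
$kept->preserveWhiteSpace = true;       // the default
$kept->loadXML($xml);

$stripped = new DOMDocument();
$stripped->preserveWhiteSpace = false;  // must be set before loadXML()
$stripped->loadXML($xml);

$keptCount = countWhitespaceTextNodes($kept->documentElement);
$strippedCount = countWhitespaceTextNodes($stripped->documentElement);
```

With this input, $keptCount should be 3 (one whitespace node before, between and after the two children) and $strippedCount should be 0.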
As for the mysterious effect on the output of `saveXML()', I think the explanation lies in the presence or absence of the above-mentioned whitespace text nodes. This was also confirmed by experiment: after adding such a node to a DOM which contained none (the DOM was created using `loadXML()' with `preserveWhiteSpace' set to `FALSE'), the output formatting was affected in such a way that the formatting was lost for the rest of the document after the added node. I think the presence of whitespace text nodes forces a rendering in which the content of these nodes is used to separate adjoining nodes, thus disabling the default formatting. Only when no such text nodes are present does the output formatting take effect (provided `formatOutput' is set to `TRUE', of course).
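A sketch of this kind of experiment, using a string source instead of a file. The helper appendAndSave(), the `added' element and the sample XML are made up for illustration; the exact output may vary between libxml versions:

```php
<?php
// Hypothetical helper: loads $xml, appends an empty <added/> element to the
// root and returns the serialized document with formatOutput enabled.
function appendAndSave(string $xml, bool $preserveWhiteSpace): string
{
    $dom = new DOMDocument();
    $dom->preserveWhiteSpace = $preserveWhiteSpace; // set before loading
    $dom->loadXML($xml);
    $dom->formatOutput = true;
    $dom->documentElement->appendChild($dom->createElement('added'));
    return $dom->saveXML();
}

$xml = "<root>\n  <child/>\n</root>";

// Whitespace text nodes kept: the new element is not put on its own line.
$withWhitespace = appendAndSave($xml, true);

// Whitespace text nodes dropped: the whole tree is re-indented.
$withoutWhitespace = appendAndSave($xml, false);
```

Printing both strings side by side shows the appended element glued to the closing root tag in the first case and indented on its own line in the second.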
Well, the thing I don't really understand is how you got single-line output with `formatOutput' set to `TRUE'. This has happened to me when no whitespace text nodes were present (i.e. when loading the XML with `preserveWhiteSpace' set to `FALSE') *and* with `formatOutput' set to *`FALSE'* (with the opposite value of `formatOutput', the formatting should do its work and you should not end up with just one line). But I haven't seen your source. Perhaps you had whitespace nodes containing no newlines in your DOM?
As for the CAUTION about the root element, I didn't see any problems with an empty root element in either shortened or full form. What did you have in mind when you said it `WORKS' or `DOES NOT WORK'?
nevyn
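A little function to get the full XML contents of an XML node.
function innerXml($node)
{
    // Serialize the node itself (outer XML), then strip the enclosing tags.
    $out = $node->ownerDocument->saveXML($node);
    // The "s" modifier lets "." match newlines in multi-line content.
    $re = "{^<(\\w*)(?:\\s*\\w+=(?:\"[^\"]*\"|\'[^\']*\'))*\\s*>(.*)</\\1>$}s";
    // Self-closing or otherwise unmatched elements have no inner XML.
    return preg_match($re, $out, $mat) ? $mat[2] : '';
}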
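An alternative sketch that avoids the regular expression altogether: serialize each child node on its own and join the results. The name innerXmlFromChildren() and the sample document are assumptions made for this illustration:

```php
<?php
// Hypothetical alternative to a regex-based innerXml(): serialize each
// child node separately and join the results.
function innerXmlFromChildren(DOMNode $node): string
{
    $out = '';
    foreach ($node->childNodes as $child) {
        $out .= $node->ownerDocument->saveXML($child);
    }
    return $out;
}

$doc = new DOMDocument();
$doc->loadXML('<a x="1">hi<b/>there</a>');
$inner = innerXmlFromChildren($doc->documentElement);
```

Here $inner holds hi<b/>there. Because no tag matching is involved, element names with prefixes or hyphens and empty elements need no special handling.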