$some_link = 'http://www.example.com';
$abc = 'killer';
$bcd = 'awsome';
$cde = 'qwerty';
$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
#$dom->loadHTMLFile($some_link);
$html = getTags( $dom, $abc, $bcd, $cde );
echo $html;
function getTags( $dom, $abc, $bcd, $cde ){
$html = '';
$domxpath = new DOMXPath($dom);
$newDom = new DOMDocument;
$newDom->formatOutput = true;
$defffff = $domxpath->query("//$abc" . '[#' . $bcd . "='$cde']");
// since above returns DomNodeList Object
// converting to string(html)
$i = 0;
while( $myItem = $defffff->item($i++) ){
$node = $newDom->importNode( $myItem, true ); // import node
$newDom->appendChild($node); // append node
}
$html = $newDom->saveHTML();
return $html;
}
?>
this is the whole code. it is returning multiple results in a row, now what I want is to have ONLY the result no.1 and no.5. How can I do it?
I am new to DOM, tried several things but no success. Thanks in Advance
Change this
$i = 0;
while( $myItem = $defffff->item($i++) ){
$node = $newDom->importNode( $myItem, true ); // import node
$newDom->appendChild($node); // append node
}
into this, in order to append only selected nodes
$i = 0;
while( $myItem = $defffff->item($i++) ){
if ($i==0 or $i==4){
$node = $newDom->importNode( $myItem, true ); // import node
$newDom->appendChild($node); // append node
}
}
or you if you know the indexes you want already, you can do this
$myIndexes = array (0,4);
foreach ($myIndexes as $i){
$myItem = $defffff->item($i++);
$node = $newDom->importNode( $myItem, true ); // import node
$newDom->appendChild($node); // append node
}
Related
Why does not display the attribute html via xpath php
<?php
$content = '<div class="keep-me">Keep this div</div><div class="remove-me" id="test">Remove this div</div>';
$badClasses = array('');
$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->loadHTML($content);
libxml_clear_errors();
$xPath = new DOMXpath($dom);
foreach($badClasses as $badClass){
$domNodeList = $xPath->query('//div[#class="remove-me"]/#id');
$domElemsToRemove = ''; // container of deleted elements
foreach ( $domNodeList as $domElement ) {
$domElemsToRemove .= $dom->saveHTML($domElement); // concat them
$domElement->parentNode->removeChild($domElement); // then remove
}
}
$content = $dom->saveHTML();
echo htmlentities($domElemsToRemove);
?>
Works - //div[#class="remove-me"] or //div[#class="remove-me"]/text()
Not working - //div[#class="remove-me"]/#id
Maybe there is a way easier
The XPath //div[#class="remove-me"]/#id is correct, but you need to just loop over the returned elements and add the nodeValue to a list of matching ID's...
$xPath = new DOMXpath($dom);
$domNodeList = $xPath->query('//div[#class="remove-me"]/#id');
$ids = []; // container of deleted elements
foreach ( $domNodeList as $domElement ) {
$ids[] = $domElement->nodeValue;
}
print_r($ids);
If the aim is to fetch the ID of any element with class "remove-me" as is how I interpret the question then perhaps you can try like this - untested btw...
.... other code before
$xp=new DOMXpath( $dom );
$col= $xp->query( '*[#class="remove-me"]' );
if( $col->length > 0 ){
foreach($col as $node){
$id=$node->hasAttribute('id') ? $node->getAttribute('id') : 'banana';
echo $id;
}
}
however looking at the code in the question suggests that you wish to delete nodes - in which case build an array of nodes ( nodelist ) and iterate through it from the end to the front - ie: backwards...
I would like to be able to create an XML file from some of the content of a html page. I have tried intensively but seem to miss something.
I have created two arrays, I have setup a DOMdocument and I have prepared to save an XML file on the server... I have tried to make tons of different foreach loops all over the place - but it won't work.
Here is my code:
<?php
$page = file_get_contents('http://www.halfmen.dk/!hmhb8/score.php');
$doc = new DOMDocument();
$doc->loadHTML($page);
$score = $doc->getElementsByTagName('div');
$keyarray = array();
$teamarray = array();
foreach ($score as $value) {
if ($value->getAttribute('class') == 'xml') {
$keyarray[] = $value->firstChild->nodeValue;
$teamarray[] = $value->firstChild->nextSibling->nodeValue;
}
}
print_r($keyarray);
print_r($teamarray);
$doc = new DOMDocument('1.0','utf-8');
$doc->formatOutput = true;
$droot = $doc->createElement('ROOT');
$droot = $doc->appendChild($droot);
$dsection = $doc->createElement('SECTION');
$dsection = $droot->appendChild($dsection);
$dkey = $doc->createElement('KEY');
$dkey = $dsection->appendChild($dkey);
$dteam = $doc->createElement('TEAM');
$dteam = $dsection->appendChild($dteam);
$dkeytext = $doc->createTextNode($keyarray);
$dkeytext = $dkey->appendChild($dkeytext);
$dteamtext = $doc->createTextNode($teamarray);
$dteamtext = $dteam->appendChild($dteamtext);
echo $doc->save('xml/test.xml');
?>
I really like simplicity, thank you.
You need to add each item in one at a time rather than as an array, which is why I build the XML for each div tag rather than as a second pass. I've had to assume that your XML is structured the way I've done it, but this may help you.
$page = file_get_contents('http://www.halfmen.dk/!hmhb8/score.php');
$doc = new DOMDocument();
$doc->loadHTML($page);
$score = $doc->getElementsByTagName('div');
$doc = new DOMDocument('1.0','utf-8');
$doc->formatOutput = true;
$droot = $doc->createElement('ROOT');
$droot = $doc->appendChild($droot);
foreach ($score as $value) {
if ($value->getAttribute('class') == 'xml') {
$dsection = $doc->createElement('SECTION');
$dsection = $droot->appendChild($dsection);
$dkey = $doc->createElement('KEY', $value->firstChild->nodeValue);
$dkey = $dsection->appendChild($dkey);
$dteam = $doc->createElement('TEAM', $value->firstChild->nextSibling->nodeValue);
$dteam = $dsection->appendChild($dteam);
}
}
I am trying to use the dom object to simplify the implementation of a glossary tooltip. What I need to do is to replace a text element in a paragraph, but NOT in an anchor tag that may be embedded in the paragraph.
$html = '<p>Replace this tag not this tag</p>';
$document = new DOMDocument();
$document->loadHTML($html);
$document->preserveWhiteSpace = false;
$document->validateOnParse = true;
$nodes = $document->getElementByTagName("p");
foreach ($nodes as $node) {
$node->nodeValue = str_replace("tag","element",$node->nodeValue);
}
echo $document->saveHTML();
I get:
'...<p>Replace this element not this element</p>...'
I want:
'...<p>Replace this element not this tag</p>...'
How do I implement this such that only the parent node text is changed and the child node (a tag) is not changed?
Try this:
$html = '<p>Replace this tag not this tag</p>';
$document = new DOMDocument();
$document->loadHTML($html);
$document->preserveWhiteSpace = false;
$document->validateOnParse = true;
$nodes = $document->getElementsByTagName("p");
foreach ($nodes as $node) {
while( $node->hasChildNodes() ) {
$node = $node->childNodes->item(0);
}
$node->nodeValue = str_replace("tag","element",$node->nodeValue);
}
echo $document->saveHTML();
Hope this helps.
UPDATE
To answer #paul's question in the comments below, you can create
$html = '<p>Replace this tag not this tag</p>';
$document = new DOMDocument();
$document->loadHTML($html);
$document->preserveWhiteSpace = false;
$document->validateOnParse = true;
$nodes = $document->getElementsByTagName("p");
//create the element which should replace the text in the original string
$elem = $document->createElement( 'dfn', 'tag' );
$attr = $document->createAttribute('title');
$attr->value = 'element';
$elem->appendChild( $attr );
foreach ($nodes as $node) {
while( $node->hasChildNodes() ) {
$node = $node->childNodes->item(0);
}
//dump the new string here, which replaces the source string
$node->nodeValue = str_replace("tag",$document->saveHTML($elem),$node->nodeValue);
}
echo $document->saveHTML();
Question
How can I remove empty xml tags in PHP?
Example:
$value1 = "2";
$value2 = "4";
$value3 = "";
xml = '<parentnode>
<tag1> ' .$value1. '</tag1>
<tag2> ' .$value2. '</tag2>
<tag3> ' .$value3. '</tag3>
</parentnode>';
XML Result:
<parentnode>
<tag1>2</tag1>
<tag2>4</tag2>
<tag3></tag3> // <- Empty tag
</parentnode>
What I want!
<parentnode>
<tag1>2</tag1>
<tag2>4</tag2>
</parentnode>
The XML without the empty tags like "tag3"
Thanks!
You can use XPath with the predicate not(node()) to select all elements that do not have child nodes.
<?php
$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
$doc->loadxml('<parentnode>
<tag1>2</tag1>
<tag2>4</tag2>
<tag3></tag3>
<tag2>4</tag2>
<tag3></tag3>
<tag2>4</tag2>
<tag3></tag3>
</parentnode>');
$xpath = new DOMXPath($doc);
foreach( $xpath->query('//*[not(node())]') as $node ) {
$node->parentNode->removeChild($node);
}
$doc->formatOutput = true;
echo $doc->savexml();
prints
<?xml version="1.0"?>
<parentnode>
<tag1>2</tag1>
<tag2>4</tag2>
<tag2>4</tag2>
<tag2>4</tag2>
</parentnode>
This works recursively and removes nodes that:
contain only spaces
do not have attributes
do not have child notes
// not(*) does not have children elements
// not(#*) does not have attributes
// text()[normalize-space()] nodes that include whitespace text
while (($node_list = $xpath->query('//*[not(*) and not(#*) and not(text()[normalize-space()])]')) && $node_list->length) {
foreach ($node_list as $node) {
$node->parentNode->removeChild($node);
}
}
$dom = new DOMDocument;
$dom->loadXML($xml);
$elements = $dom->getElementsByTagName('*');
foreach($elements as $element) {
if ( ! $element->hasChildNodes() OR $element->nodeValue == '') {
$element->parentNode->removeChild($element);
}
}
echo $dom->saveXML();
CodePad.
The solution that worked with my production PHP SimpleXMLElement object code, by using Xpath, was:
/*
* Remove empty (no children) and blank (no text) XML element nodes, but not an empty root element (/child::*).
* This does not work recursively; meaning after empty child elements are removed, parents are not reexamined.
*/
foreach( $this->xml->xpath('/child::*//*[not(*) and not(text()[normalize-space()])]') as $emptyElement ) {
unset( $emptyElement[0] );
}
Note that it is not required to use PHP DOM, DOMDocument, DOMXPath, or dom_import_simplexml().
//this is a recursively option
do {
$removed = false;
foreach( $this->xml->xpath('/child::*//*[not(*) and not(text()[normalize-space()])]') as $emptyElement ) {
unset( $emptyElement[0] );
$removed = true;
}
} while ($removed) ;
If you're going to be a lot of this, just do something like:
$value[] = "2";
$value[] = "4";
$value[] = "";
$xml = '<parentnode>';
for($i=1,$m=count($value); $i<$m+1; $i++)
$xml .= !empty($value[$i-1]) ? "<tag{$i}>{$value[$i-1]}</tag{$i}>" : null;
$xml .= '</parentnode>';
echo $xml;
Ideally though, you should probably use domdocument.
I have a question.
I have the xml
http://maps.googleapis.com/maps/api/geocode/xml?address=new+york&sensor=true
I want to read from example
GeocodeResponse/result/geometry/location/lat
and
GeocodeResponse/result/geometry/location/lng
maybe using XPATH and this is what i have so far ...
<?php
$Address = "new+york";
$Query = "http://maps.googleapis.com/maps/api/geocode/xml?address=".$Address."&sensor=true";
$XmlResponse = file_get_contents($Query);
$doc = new DOMDocument();
$doc->loadXML($XmlResponse);
$root = $doc->getElementsByTagName( "GeocodeResponse" );
foreach( $root as $val )
{
$hrefs = $val->getElementsByTagName( "status" );
$status = $hrefs->item(0)->nodeValue;
foreach( $hrefs as $val2 )
{
$hrefs2 = $val2->getElementsByTagName( "type" );
$type = $hrefs2->item(0)->nodeValue;
echo "Type is: $type <br>";
}
echo "Status is: $status <br>";
}
?>
Can i have some advices?
maybe i can use
$xpath = new DOMXPath($xml);
$hrefs = $xpath->evaluate("/page");
UPDATE!!
I have managed to get the result by this ...
$xpath = new DOMXPath($doc);
$res = $xpath->evaluate('//GeocodeResponse/result/geometry');
$root = $doc->getElementsByTagName( "location" );
foreach( $root as $val )
{
$hrefs = $val->getElementsByTagName( "lat" );
$status = $hrefs->item(0)->nodeValue;
echo "Status is: $status <br>";
}
but i want something without foreach like
$xpath = new DOMXPath($doc);
$res = $xpath->evaluate('//GeocodeResponse/result/geometry');
$hrefs = $val->getElementsByTagName( "lat" );
$status = $hrefs->item(0)->nodeValue;
echo "Status is: $status <br>";
Is this possible?
You may want to switch to JSON that is much easier to parse:
http://maps.googleapis.com/maps/api/geocode/json?address=new+york&sensor=true
$json = file_get_contents('http://maps.googleapis.com/maps/api/geocode/json?address=new+york&sensor=true');
$geocodeResponse = json_decode($json, true);
foreach($geocodeResponse['results'] as $result){
echo $result['geometry']['location']['lat'].', '.$result['geometry']['location']['lng'] .'<br>';
}