This is the first time I am using PHP DOMDocument and I don't know its methods.
I fetch HTML that has the following format:
<div class="row abc">...</div>
<div class="row xyz">...</div>
<div class="row qrs">...</div>
...
...
<div class="row">This is what i want to grab</div>
<div class="row show-more-result">Show More</div>
What I am trying to achieve is to first select the div with class show-more-result and then target the div one level above it, which is where my data is.
I have started exploring the PHP DOMDocument class, but I could not find any getElementByClass method.
public function scrapping()
{
    // Create a DOMDocument Object to fetch the search results
    $dom = new \DOMDocument;
    @$dom->loadHTML($this->_response);
    $dom->preserveWhiteSpace = false;
    $xpath = new \DomXpath($dom);
    $show_more_div = $xpath->query('//*[@class="show-more-result"]')->item(0);
    $stuff = $show_more_div->textContent;
    echo($stuff);
}
I tried to target the show more div, but I get "Trying to get property of non-object", as if $xpath->query() returns nothing.
Please help me in targeting the desired div.
Updated
var_dump($xpath->query('//*[@class="show-more-result"]')->item(0));
// NULL
You're doing a straight string equality:
$show_more_div = $xpath->query('//*[@class="show-more-result"]')->item(0);
                                    ^^^^^^^
But your target div's class is actually row show-more-result. You need to do a substring match instead:
//*[contains(@class, 'show-more-result')]
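For completeness, here is a minimal sketch of the whole task from the question: find the "show more" div, then step to the div just before it. The sample markup is made up to mirror the question, and the variable names are only illustrative. It also assumes you want to match the class as a whole token, because a plain contains() would also match a class such as show-more-result-old.

```php
<?php
// Sample markup modelled on the question (an assumption, not the real page).
$html = <<<HTML
<div class="row abc">first</div>
<div class="row">This is what i want to grab</div>
<div class="row show-more-result">Show More</div>
HTML;

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Match "show-more-result" as a whole class token, not as a bare substring.
$query = '//div[contains(concat(" ", normalize-space(@class), " "), " show-more-result ")]';
$showMore = $xpath->query($query)->item(0);

// The data div is the nearest div sibling before the "show more" div.
$data = $xpath->query('preceding-sibling::div[1]', $showMore)->item(0);

echo trim($data->textContent), PHP_EOL;
```

Passing $showMore as the second argument to query() makes the second expression relative to that node.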
Related
I am capturing the markup of a page using output buffering, and I am trying to replace every div on the page that has the class locked-content.
This is easy with jQuery, but it is a bit harder in PHP. Say, for example, I have this HTML:
<div class='class'> cool content </div>
<div class='class more-class life-is-hard locked-content'>
<div class='cool-div'></div>
<div class='anoter-cool-div'></div>
some more code here
</div>
<div class='class'> cool content </div>
This seems like a complex task. I think I need to detect how many divs are opened after the div with the class 'locked-content', count how many closing div tags follow, and work out where the wanted div is closed. Then I would replace that code with new code, looping in case the div exists more than once.
Does anyone have an idea on how to do something like this?
Thanks
You can do it via DOMXPath:
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//div[contains(@class, "locked-content")]');
foreach ($nodes as $node) {
    // Copy the live childNodes list first so replacing nodes does not disturb the loop.
    foreach (iterator_to_array($node->childNodes) as $cNode) {
        if ($cNode instanceof DOMElement && $cNode->tagName === 'div') {
            $cNode->replaceWith(/* Whatever */); // replaceWith() needs PHP 8.0 or later
        }
    }
}
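If the goal is to swap out each locked-content div as a whole (which is how I read the question), a rough sketch could look like the following. The replacement markup and the locked-notice class are invented for illustration, and replaceChild() is used so it should also run on PHP versions older than 8.

```php
<?php
// Sample markup based on the question.
$html = <<<HTML
<div class='class'> cool content </div>
<div class='class more-class life-is-hard locked-content'>
<div class='cool-div'></div>
some more code here
</div>
<div class='class'> cool content </div>
HTML;

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Match "locked-content" as a whole class token.
$locked = $xpath->query(
    '//div[contains(concat(" ", normalize-space(@class), " "), " locked-content ")]'
);

foreach (iterator_to_array($locked) as $div) {
    // Hypothetical replacement element; build whatever you need here.
    $placeholder = $dom->createElement('div', 'This content is locked.');
    $placeholder->setAttribute('class', 'locked-notice');
    // Replacing the node drops its nested divs too, so no tag counting is needed.
    $div->parentNode->replaceChild($placeholder, $div);
}

$result = $dom->saveHTML();
echo $result;
```

Note that saveHTML() returns a full document (doctype, html and body wrappers), so you may need to extract the part you want before printing it.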
I want to target the a tags with class genre inside the parent div with id test:
<div id="test">
<a class="genre">hello</a>
<a class="genre">hello2</a>
</div>
So far, I can get all the genre a tags:
$xpath = new DOMXPath($doc);
$elements = $xpath->query('//a[@class="genre"]');
... but I want to adjust //a[@class="genre"] so I only target the ones within the test div.
I don't understand why you did not write it yourself, because your expression already uses all the XPath elements you need. Or maybe I've misunderstood your question.
$elements = $xpath->query('//div[@id="test"]/a[@class="genre"]');
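One detail worth knowing: a single slash only selects direct children of the div, while a double slash selects descendants at any depth. A small sketch of the difference, using made-up markup with one extra nested link and one link outside the div:

```php
<?php
// Made-up markup: one direct child, one nested link, one link outside #test.
$html = '<div id="test"><a class="genre">hello</a>'
      . '<p><a class="genre">nested</a></p></div>'
      . '<a class="genre">outside</a>';

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// "/a": only a tags that are direct children of the test div.
$direct = $xpath->query('//div[@id="test"]/a[@class="genre"]');

// "//a": a tags anywhere below the test div.
$all = $xpath->query('//div[@id="test"]//a[@class="genre"]');

echo $direct->length, ' direct, ', $all->length, ' in total', PHP_EOL;
```

Neither query picks up the link outside the div.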
I know this topic has been posted everywhere, but those questions are not what I want. I want to insert some HTML before the page is output, without touching the original code of the page.
Suppose my body is rendered by a function called render_body():
function render_body() {
return "<body>
<div class='container'>
<div class='a'>A</div>
<div class='b'>B</div>
</div>
</body>";
}
Now I want to insert HTML using PHP without editing render_body(). I want a function that inserts some divs into the container div.
render_body();
<?php // Insert <div class="c"> inside the container div ?>
Just as an alternative using XPath - this should load in the output from render_body() to an XML (DOMDocument) object and create an XPath object to query your HTML so you can easily work out where you want to insert the new HTML.
This will probably only work if you're using XML well formed HTML though.
//read in the document
$xml = new DOMDocument();
$xml->loadHTML(render_body());
//create an XPath query object
$xpath = new DOMXpath($xml);
//create the HTML nodes you want to insert
// using $xml->createElement() ...
//find the node to which you want to attach the new content
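// note: "//div" rather than "/div", because div.a sits inside the container div, not directly under body
$xmlDivClassA = $xpath->query('//body//div[@class="a"]')->item(0);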
$xmlDivClassA->appendChild( /* the HTML nodes you've previously created */ );
//output
echo $xml->saveHTML();
Took a little while as I had to refer to the documentation ... too much JQuery lately it's ruining my ability to manipulate the DOM without looking things up :\
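A filled-in version of that outline might look like this. It is only a sketch: it assumes the new div should carry the class c and the text C, and that it belongs at the end of the container div, as the question's comment suggests.

```php
<?php
function render_body(): string {
    return "<body>
<div class='container'>
<div class='a'>A</div>
<div class='b'>B</div>
</div>
</body>";
}

$dom = new DOMDocument();
$dom->loadHTML(render_body());
$xpath = new DOMXPath($dom);

// Find the container div that should receive the new child.
$container = $xpath->query('//div[@class="container"]')->item(0);

// Build <div class="c">C</div> (class and text are assumptions for illustration).
$newDiv = $dom->createElement('div', 'C');
$newDiv->setAttribute('class', 'c');

// Attach it as the last child of the container.
$container->appendChild($newDiv);

$output = $dom->saveHTML();
echo $output;
```

render_body() itself is left untouched; only its returned string is parsed and modified.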
The only thing I can think of is to turn on output buffering, use the DOMDocument class to read in the entire buffer, and then make changes to it. It is worth reading the documentation (http://www.php.net/manual/en/book.dom.php) for the classes used in the script below.
For example:
<?php
function render_body() {
return "<body>
<div class='container'>
<div class='a'>A</div>
<div class='b'>B</div>
</div>
</body>";
}
$dom = new DOMDocument();
$dom->loadHTML(render_body());
// get body tag
$body = $dom->getElementsByTagName('body')->item(0);
// add a new element at the end of the body
$element = $dom->createElement('div', 'My new element at the end!');
$body->appendChild($element);
echo $dom->saveHTML(); // echo what is in the dom
?>
EDIT:
As per CD001's suggestions, I have tested this code and it works.
I need to find the closing tags in the code below:
<div class="emph"><div class="level"> Some testing </div></div>
Here I need to find the correct closing tag for the parent DIV. My goal is to add the class name as a comment before each closing DIV, like below:
<div class="emph"><div class="level"> Some testing <!--level--></div><!--emph--></div>
For that I need to find the exact closing tag of the parent DIV.
Is that possible to achieve in PHP?
You can use SimpleXML (or any other XML class): for each div element, read its class and append it as a comment at the end of the node's content. It's not exactly finding the closing tag, but it achieves your specified goal.
Sample code:
$dom = new DOMDocument;
$dom->loadXML($xml);
$divs = $dom->getElementsByTagName('div');
foreach ($divs as $div) {
    if ($div->getAttribute('class') != '') {
        // Append a real comment node. Assigning to nodeValue would
        // flatten the nested divs and escape the "<!--" as text.
        $div->appendChild($dom->createComment($div->getAttribute('class')));
    }
}
echo $dom->saveXML();
While printing the divs in PHP, keep an array: $div_array = array()
As soon as you open a div, do:
array_push($div_array, 'emph'); // or 'level' depending on the classname
As soon as you're ready to print the closing tag, ask for the value of the last div by:
array_pop($div_array);
// for example
echo '<!-- '.array_pop($div_array).' -->';
Popping the array also removes its last entry, which I presume is what you want.
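Put together, the stack idea could be sketched like this. The helper names open_div() and close_div() are hypothetical, invented here just to show the push and pop in one place.

```php
<?php
// Stack of class names for the divs that are currently open.
$openDivs = [];

// Hypothetical helper: remember the class name and return the opening tag.
function open_div(array &$stack, string $class): string {
    $stack[] = $class;
    return '<div class="' . htmlspecialchars($class) . '">';
}

// Hypothetical helper: pop the most recently opened class and label the closing tag.
function close_div(array &$stack): string {
    $class = array_pop($stack);
    return '<!--' . $class . '--></div>';
}

$out = open_div($openDivs, 'emph')
     . open_div($openDivs, 'level')
     . ' Some testing '
     . close_div($openDivs)
     . close_div($openDivs);

echo $out, PHP_EOL;
// prints: <div class="emph"><div class="level"> Some testing <!--level--></div><!--emph--></div>
```

Because the last div opened is the first one closed, a stack always yields the class name that belongs to the closing tag being printed.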
I have an issue where I have a div that doesn't have a class or id. Is it possible to select a div element when I know its inner text? For example:
<div class="thishere"></div>
<div>Search on a this text</div>
If not, the div before it has a class; how do I find its next sibling?
$selector = new Zend_Dom_Query($response->getBody());
$nodes = $selector->query('????');
Using JavaScript, you can loop through every element on the page and find the div with the special class. Then you'll know that the next element in the loop is the second div, and you can get its contents using element.innerHTML.
$text = <<<text
<div class="thishere"></div>
<div>Search on a this text</div>
text;
$selector = new Zend_Dom_Query ($text);
$nodes = $selector->queryXpath('//div[contains(text(),"Search on a this text")]');
foreach ($nodes as $node)
{
...
}
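For the second half of the question (finding the next sibling of the div that does have a class), the XPath following-sibling axis should do it. This sketch uses PHP's built-in DOMXPath rather than Zend_Dom_Query so that it runs without extra libraries. I would expect the same expression to work when passed to queryXpath(), but I have not tested that.

```php
<?php
$html = <<<HTML
<div class="thishere"></div>
<div>Search on a this text</div>
HTML;

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// First div sibling that follows the div with class "thishere".
$next = $xpath->query('//div[@class="thishere"]/following-sibling::div[1]')->item(0);

echo $next->textContent, PHP_EOL;
```

The [1] picks the nearest following div; without it, the expression returns every later div sibling.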