i got a page source from a file using php and its output is similar to
<div class="basic">
<div class="math">
<div class="winner">
<div class="under">
<div class="checker">
<strong>check</strong>
</div>
</div>
</div>
</div>
</div>
from this i need to got only a particular 'div' with whole div and contents inside like below when i give input as 'under'(class name) . anybody suggest me how to do this one using php
<div class="under">
<div class="checker">
<strong>check</strong>
</div>
</div>
Try this:
$html = <<<HTML
<div class="basic">
<div class="math">
<div class="winner">
<div class="under">
<div class="checker">
<strong>check</strong>
</div>
</div>
</div>
</div>
</div>;
HTML;
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$div = $xpath->query('//div[#class="under"]');
$div = $div->item(0);
echo $dom->saveXML($div);
This will output:
<div class="under">
<div class="checker">
<strong>check</strong>
</div>
</div>
Function to extract the contents from a specific div id from any webpage
The below function extracts the contents from the specified div and returns it. If no divs with the ID are found, it returns false.
function getHTMLByID($id, $html) {
$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->loadHTML($html);
$node = $dom->getElementById($id);
if ($node) {
return $dom->saveXML($node);
}
return FALSE;
}
$id is the ID of the <div> whose content you're trying to extract, $html is your HTML markup.
Usage example:
$html = file_get_contents('http://www.mysql.com/');
echo getHTMLByID('tagline', $html);
Output:
The world's most popular open source database
I'm not sure what you asking but this might be it
preg_match_all("<div class='under'>(.*?)</div>", $htmlsource, $output);
$output should now contain the inner content of that div
Related
I have a known Div Class name, and I can retrieve the inner html code all good, but how would I retrieve the next Div Class name (not inner from the known Div Class) using php, dom document and xpath?
For example with the code below, if I know the Div class "mobile-container mobile-filter-container", how would I return "mobile-container mobile-cart-content-container"?
<div class="mobile-container mobile-filter-container">
<div class="mobile-wrapper-header"></div>
<div class="mobile-filter-wrapper"></div>
</div>
<div class="mobile-container mobile-cart-content-container">
<div class="mobile-wrapper-header">
Thanks,
I believe this should get you close enough to what you need:
$data = <<<DATA
<html>
<div class="mobile-container mobile-filter-container">
<div class="mobile-wrapper-header"></div>
<div class="mobile-filter-wrapper"></div>
</div>
<div class="mobile-container mobile-cart-content-container">
<div class="mobile-wrapper-header"></div>
<div class="mobile-filter-wrapper"></div>
</div>
<div class="unwanted">
<div class="mobile-wrapper-header"></div>
<div class="mobile-filter-wrapper"></div>
</div>
</html>
DATA;
$doc = new DOMDocument();
$doc->loadHTML($data);
$xpath = new DOMXpath($doc);
$elements = $xpath->query('.//div[#class="mobile-container mobile-filter-container"]/following-sibling::div[1]/#class');
echo $elements[0]->nodeValue;
Output:
mobile-container mobile-cart-content-container
I have an external file with lots of informations e.g
http://domain.com/thefile.html
Each Data in the file is wrapped into a <div> element:
....
<div class="lineData">
<div class="lineLData">Playstation</div>
<div class="lineRData">awesome</div>
</div>
<div class="lineData">
<div class="lineLData">xbox one</div>
<div class="lineRData">not awesome</div>
</div>
<div class="lineData">
<div class="lineLData">wii u</div>
<div class="lineRData">mhhhh</div>
</div>
....
Now I want to search the whole file for the Keyword "Playstation" and echo the whole <div>:
<div class="lineData">
<div class="lineLData">Playstation</div>
<div class="lineRData">awesome</div>
</div>
Is this possible with PHP ?
If we assume the resource / URL is $url :
$result = array();
$dom = new DOMDocument;
$dom->loadHTML(file_get_contents($url));
find all <div>'s with the class lineData using DomXPath :
$xpath = new DomXPath($dom);
$lineDatas = $xpath->query('//div[contains(#class,"lineData")]');
add all lineData <div>'s containing "playstation" to the $result array :
foreach($lineDatas as $lineData) {
if (strpos(strtolower($lineData->nodeValue), 'playstation') !== false) {
$result[] = $lineData;
}
}
example of outputting the result
foreach($result as $lineData) {
echo $dom->saveHTML($lineData);
}
outputs
<div class="lineData">
<div class="lineLData">Playstation</div>
<div class="lineRData">awesome</div>
</div>
when tested on the example HTML in OP.
Use DOMDocument for this purpose.
$dom = new DOMDocument;
$dom->loadHTMLFile("file.html");
Now you can search for the div:
$xpath = new DOMXPath($dom);
$res = $xpath->query("//*[contains(#class, 'lineData')]");
Now you have the div as DOMElement. Saving should be possible with these few lines:
$html = $res->ownerDocument->saveHTML($res);
How to extract the full html content inside a div ? I tried this code,
$html= '<html>
<body>
<div id="test">
<div id="mydiv1">Hello</div>
<div id="mydiv2">How are you</div>
</div>
</body>
</html>';
$attr = "id";
$value = "test";
$tag_regex = '/<div[^>]*'.$attr.'="'.$value.'">(.*?)<\\/div>/si';
preg_match($tag_regex,$html,$matches);
echo $matches[0];
By running this code I get the result,
<div id="test">
<div id="mydiv1">Hello</div>
Expected result,
<div id="test">
<div id="mydiv1">Hello</div>
<div id="mydiv2">How are you</div>
</div>
In my code the regular expression execute till the first occurrence of </div> . How can I get the full code inside <div id="test"> ?
With DOMDocument:
$dom = new DOMDocument;
$dom->loadHTML($html);
$div = $dom->getElementById('test');
$result = $dom->saveHTML($div);
Lets say I have this comment block containing HTML:
<html>
<body>
<code class="hidden">
<!--
<div class="a">
<div class="b">
<div class="c">
Link Test 1
</div>
<div class="c">
Link Test 2
</div>
<div class="c">
Link Test 3
</div>
</div>
</div>
-->
</code>
<code>
<!-- test -->
</code>
</body>
</html>
Using DOMXPath for PHP, how do I get the links and text within the tag?
This is what I have so far:
$dom = new DOMDocument();
$dom->loadHTML("HTML STRING"); # not actually in code
$xpath = new DOMXPath($dom);
$query = '/html/body/code/comment()';
$divs = $dom->getElementsByTagName('div')->item(0);
$entries = $xpath->query($query, $divs);
foreach($entries as $entry) {
# shows entire text block
echo $entry->textContent;
}
How do I navigate so that I can get the "c" classes and then put the links into an array?
EDIT Please note that there are multiple <code> tags within the page, so I can't just get an element with the code attribute.
You already can target the comment containing the links, just follow thru that and make another query inside it. Example:
$sample_markup = '<html>
<body>
<code class="hidden">
<!--
<div class="a">
<div class="b">
<div class="c">
Link Test 1
</div>
<div class="c">
Link Test 2
</div>
<div class="c">
Link Test 3
</div>
</div>
</div>
-->
</code>
</body>
</html>';
$dom = new DOMDocument();
$dom->loadHTML($sample_markup); # not actually in code
$xpath = new DOMXPath($dom);
$query = '/html/body/code/comment()';
$entries = $xpath->query($query);
foreach ($entries as $key => $comment) {
$value = $comment->nodeValue;
$html_comment = new DOMDocument();
$html_comment->loadHTML($value);
$xpath_sub = new DOMXpath($html_comment);
$links = $xpath_sub->query('//div[#class="c"]/a'); // target the links!
// loop each link, do what you have to do
foreach($links as $link) {
echo $link->getAttribute('href') . '<br/>';
}
}
I want to get a specific tag from url, from example:
If I have this content:
<div id="hey">
<div id="bla"></div>
</div>
<div id="hey">
<div id="bla"></div>
</div>
And I want to get all divs with the id "hey", ( i think its with preg_match_all ), How can I do that?
The content inside the tag can be changed.
I recommend use DOMDocument class instead of regular expressions (is less resource consumer and more clear IMHO).
$content = '<div id="hey">
<div id="bla"></div>
</div>
<div id="hey">
<div id="bla"></div>
</div>';
$doc = new DOMDocument();
#$doc->loadHTML($content); // # for possible not standard HTML
$xpath = new DOMXPath($doc);
$elements = $xpath->query("//div[#id='hey']");
/*#var $elements DOMNodeList */
for ($i=0;$i<$elements->length;$i++) {
/*#var $curr_element DOMElement */
$curr_element = $elements->item($i);
// Here do what you want with the element
var_dump($curr_element);
}
If you want to get the content from an URL you can use this line instead to fill the variable $content:
$content = file_get_contents('http://yourserver/urls/page.php');