I want to replace strings inside specific classes in an HTML document.
The HTML contains other content that I don't want to change.
In the code below I want to change the content of classes one and three only; the content of class two should stay as it is.
I need to do this dynamically.
<div class="one"> I want to change this </div>
<div class="two"> I don't want to change this </div>
<div class="three"> I want to change this </div>
The DOM functions are helpful here; see the PHP manual.
//your html file content
$str = '...<div class="one"> I want to change this </div>
<div class="two"> I don\'t want to change this </div>
<div class="three"> I want to change this </div>... ';
$dom = new DOMDocument();
$dom->loadHtml($str);
$domXpath = new DOMXPath($dom);
//query the nodes matched
$list = $domXpath->query('//div[@class!="two"]');
if ($list->length > 0) {
foreach ($list as $node) {
//change node value
$node->nodeValue = 'Content changed!';
}
}
//get the result
$new_str = $dom->saveHTML();
var_dump($new_str);
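To make the list of classes dynamic, the XPath predicate can be built from an array of class names. The following is only a minimal sketch: the helper name replaceByClass and the wrapper div are assumptions made for illustration, and it assumes each target div carries exactly one class and that class names contain no double quotes.

```php
<?php
// Hypothetical helper (not part of the answer above): replace the text of
// every <div> whose class attribute equals one of the given class names.
function replaceByClass(string $html, array $classes, string $newText): string
{
    if (count($classes) === 0) {
        return $html; // nothing to replace
    }

    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    // Wrap the fragment so it can be serialized again without <html>/<body>
    $dom->loadHTML('<div id="wrapper">' . $html . '</div>');
    libxml_clear_errors();

    // Build a predicate such as: @class="one" or @class="three"
    $tests = [];
    foreach ($classes as $class) {
        $tests[] = '@class="' . $class . '"';
    }

    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//div[' . implode(' or ', $tests) . ']') as $node) {
        $node->nodeValue = $newText;
    }

    // Serialize only the children of the wrapper
    $wrapper = $xpath->query('//div[@id="wrapper"]')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

$str = '<div class="one"> I want to change this </div>
<div class="two"> I don\'t want to change this </div>
<div class="three"> I want to change this </div>';

$result = replaceByClass($str, ['one', 'three'], 'Content changed!');
echo $result;
```

Passing a different array, for example ['three'], changes only that class and leaves the rest of the markup alone.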
I am trying to find all child elements of a div that contains a specific string. For example, in the following HTML content, I need to find all child elements of the "Trees" div, including the <div>Trees pair. There are no classes or IDs associated with each div, so I can't search for IDs or classes.
I tried the following code, using an answer from https://stackoverflow.com/a/55989111/1466973 , but the expected content was not returned by the function.
<?php
$html_text = "
<html>
<div>Grass
<div>Good grass
<div>Grass 1</div>
<div>Grass 2</div>
<div>Grass 3</div>
</div>
<div>Weeds
<div>Weeds 2</div>
<div>Weeds 3</div>
<div>Weeds 4</div>
</div>
</div>
<div>Trees
<div>Good Trees
<div>Tree 1</div>
<div>Tree 2</div>
<div>Tree 3</div>
</div>
<div>Tall Trees
<div>Tree 11</div>
<div>Tree 12</div>
<div>Tree 13</div>
</div>
</div>
<div>Fruit
<div>Red
<div>Fruit 1</div>
<div>Fruit 2</div>
<div>Fruit 31</div>
</div>
</div>
</html> ";
echo find_trees($html_text); // this should be only the content of the div containing "Trees"
// tried this solution from https://stackoverflow.com/a/55989111/1466973 , didn't work
function find_trees($html_text = "") {
$dom = new DOMDocument();
$dom->loadHTML($html_text);
$xpath = new DOMXpath($dom);
$res = $xpath->document->documentElement->textContent;
$textNodes = explode(PHP_EOL, $res);
$trees_html = "";
foreach ($textNodes as $key => $text) {
if ($text == 'Trees') {
$trees_html .= $textNodes[$key + 1];
break;
}
}
// end of this function
return $trees_html;
}
Try it this way and see if it works.
Since you are using DOMDocument to parse the markup, you might as well use its XPath support to specify, succinctly, what you are looking for:
$target = $xpath->query("//div[contains(.,'Trees')]");
That's it. The rest is just a method to output to screen the string representation, in XML format, of what you have located:
$trees = $target[0]->ownerDocument->saveXML($target[0]);
echo $trees;
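Note that contains(., 'Trees') tests the text of a div together with all of its descendants, so it would also match any ancestor div that wraps the "Trees" block. A possible refinement, shown here only as a sketch on a shortened copy of the question's markup, is to test the div's own direct text nodes instead:

```php
<?php
// Shortened sample of the markup from the question
$html_text = '<html><body>
<div>Grass
<div>Grass 1</div>
</div>
<div>Trees
<div>Good Trees
<div>Tree 1</div>
</div>
</div>
</body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom->loadHTML($html_text);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// Match a <div> that has a direct text node equal to "Trees" once trimmed
$target = $xpath->query('//div[text()[normalize-space(.) = "Trees"]]');

$trees = '';
if ($target->length > 0) {
    // Serialize the matched div together with all of its children
    $trees = $dom->saveHTML($target->item(0));
}
echo $trees;
```

With this predicate the "Good Trees" div is not matched on its own, because its direct text is "Good Trees" rather than "Trees".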
I have an external file with lots of information, e.g.
http://domain.com/thefile.html
Each data entry in the file is wrapped in a <div> element:
....
<div class="lineData">
<div class="lineLData">Playstation</div>
<div class="lineRData">awesome</div>
</div>
<div class="lineData">
<div class="lineLData">xbox one</div>
<div class="lineRData">not awesome</div>
</div>
<div class="lineData">
<div class="lineLData">wii u</div>
<div class="lineRData">mhhhh</div>
</div>
....
Now I want to search the whole file for the Keyword "Playstation" and echo the whole <div>:
<div class="lineData">
<div class="lineLData">Playstation</div>
<div class="lineRData">awesome</div>
</div>
Is this possible with PHP?
If we assume the resource / URL is $url :
$result = array();
$dom = new DOMDocument;
$dom->loadHTML(file_get_contents($url));
Find all <div>s with the class lineData using DOMXPath:
$xpath = new DOMXPath($dom);
$lineDatas = $xpath->query('//div[contains(@class,"lineData")]');
Add all lineData <div>s containing "playstation" to the $result array:
foreach($lineDatas as $lineData) {
if (strpos(strtolower($lineData->nodeValue), 'playstation') !== false) {
$result[] = $lineData;
}
}
Example of outputting the result:
foreach($result as $lineData) {
echo $dom->saveHTML($lineData);
}
This outputs
<div class="lineData">
<div class="lineLData">Playstation</div>
<div class="lineRData">awesome</div>
</div>
when tested on the example HTML in the question.
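The steps above can be collected into one function. This is a sketch under assumptions: the name findLineData is made up for illustration, the class attribute is assumed to be exactly "lineData", and the sample markup is pasted inline instead of being fetched from the URL.

```php
<?php
// Hypothetical helper: return the outer HTML of every div.lineData
// whose text contains $keyword (case-insensitive).
function findLineData(string $html, string $keyword): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $matches = [];
    foreach ($xpath->query('//div[@class="lineData"]') as $lineData) {
        // textContent includes the text of the nested lineLData/lineRData divs
        if (stripos($lineData->textContent, $keyword) !== false) {
            $matches[] = $dom->saveHTML($lineData);
        }
    }
    return $matches;
}

$html = '<div class="lineData">
<div class="lineLData">Playstation</div>
<div class="lineRData">awesome</div>
</div>
<div class="lineData">
<div class="lineLData">xbox one</div>
<div class="lineRData">not awesome</div>
</div>';

$found = findLineData($html, 'playstation');
echo implode("\n", $found);
```

For the real file, the $html argument could come from file_get_contents($url) as in the answer above.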
Use DOMDocument for this purpose.
$dom = new DOMDocument;
$dom->loadHTMLFile("file.html");
Now you can search for the div:
$xpath = new DOMXPath($dom);
$res = $xpath->query("//*[contains(@class, 'lineData')]");
Now you have the matching divs as a DOMNodeList whose items are DOMElement objects. Saving the first one should be possible with this line:
$html = $dom->saveHTML($res->item(0));
I have a website where I have posted a few images inside a particular div:
<div class="posts">
<div class="separator">
<img src="http://www.example.com/image.jpg" />
<p>Be, where I am today, and i will be one where you will search me tomorrow</p>
</div>
<div class="separator">
<img src="http://www.example.com/imagesda.jpg" />
<p>Be, where I am today, and i will be one where you will search me tomorrow</p>
</div>
.... few more images
</div>
From my second website, I want to fetch all images in that particular div. I have the code below.
<?php
$htmlget = new DOMDocument();
@$htmlget->loadHtmlFile('http://www.example.com');
$xpath = new DOMXPath( $htmlget);
$nodelist = $xpath->query( "//img/#src" );
foreach ($nodelist as $images){
$value = $images->nodeValue;
echo "<img src='".$value."' /><br />";
}
?>
But this fetches all images from my website, not just those in the particular div. It also prints out my RSS image, social icon images, etc.
Can I specify a particular div in my PHP code, so that it only fetches images from the div with class posts?
First give an "id" to the outer div container. Then get the container by its id, and then get its child nodes.
An example:
// getElementById() returns a single element (or null), not a list
$container = $dom->getElementById('node_id');
// number of child nodes of the container
echo $container->childNodes->length;
// content of each child
foreach ($container->childNodes as $child)
{
echo $child->ownerDocument->saveHTML($child);
}
Maybe this link will help you. It has a good tutorial.
http://www.binarytides.com/php-tutorial-parsing-html-with-domdocument/
With PHP Simple HTML DOM Parser, this will be:
include('simple_html_dom.php');
$html = file_get_html("http://your_web_site.com");
foreach ($html->find('div.posts img') as $img_posts) {
echo $img_posts->src . '<br>'; // to show the source attribute
}
I am still reading about PHP Simple HTML DOM Parser, but so far it is faster to implement than a regex.
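The same restriction can also be expressed with DOMXPath alone, by scoping the img query to the div whose class list contains "posts". A minimal sketch, assuming inline sample markup (the header div and rss.png are made up here to stand in for the unwanted images):

```php
<?php
// Sample markup: one image outside div.posts, two inside it
$html = '<div id="header"><img src="http://www.example.com/rss.png" /></div>
<div class="posts">
<div class="separator"><img src="http://www.example.com/image.jpg" /></div>
<div class="separator"><img src="http://www.example.com/imagesda.jpg" /></div>
</div>';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// Match "posts" as a whole class token, then take the src of every img below it
$query = '//div[contains(concat(" ", normalize-space(@class), " "), " posts ")]//img/@src';

$sources = [];
foreach ($xpath->query($query) as $src) {
    $sources[] = $src->nodeValue;
}

foreach ($sources as $value) {
    echo "<img src='" . htmlspecialchars($value, ENT_QUOTES) . "' /><br />\n";
}
```

The concat/normalize-space form avoids matching other classes that merely contain the substring "posts".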
Here is another piece of code that may help. You are looking for
$doc->getElementsByTagName
which can help target a tag directly.
<?php
$myhtml = <<<EOF
<html>
<body>
<div class="posts">
<div class="separator">
<img src="http://www.example.com/image.jpg" />
<p>Be, where I am today, and i will be one where you will search me tomorrow</p>
</div>
<div class="separator">
<img src="http://www.example.com/imagesda.jpg" />
<p>Be, where I am today, and i will be one where you will search me tomorrow</p>
</div>
.... few more images
</div>
</body>
EOF;
$doc = new DOMDocument();
$doc->loadHTML($myhtml);
$divs = $doc->getElementsByTagName('img');
foreach ($divs as $div) {
foreach ($div->attributes as $attr) {
$name = $attr->nodeName;
$value = $attr->nodeValue;
echo "Attribute '$name' :: '$value'<br />";
}
}
?>
Demo here http://codepad.org/keZkC377
Also the answer here can provide further insights
Not finding elements using getElementsByTagName() using DomDocument
I got a page source from a file using PHP, and its output is similar to
<div class="basic">
<div class="math">
<div class="winner">
<div class="under">
<div class="checker">
<strong>check</strong>
</div>
</div>
</div>
</div>
</div>
From this I need to get only a particular div, with the whole div and its contents, as shown below, when I give 'under' (the class name) as input. Can anybody suggest how to do this using PHP?
<div class="under">
<div class="checker">
<strong>check</strong>
</div>
</div>
Try this:
$html = <<<HTML
<div class="basic">
<div class="math">
<div class="winner">
<div class="under">
<div class="checker">
<strong>check</strong>
</div>
</div>
</div>
</div>
</div>
HTML;
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$div = $xpath->query('//div[@class="under"]');
$div = $div->item(0);
echo $dom->saveXML($div);
This will output:
<div class="under">
<div class="checker">
<strong>check</strong>
</div>
</div>
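Since the question asks for the class name to be an input, the lookup can be wrapped in a function. This is a sketch only: getDivByClass is a made-up name, it assumes the class attribute holds exactly one class, and it assumes the class name contains no double quotes.

```php
<?php
// Hypothetical helper: outer HTML of the first <div> with the given class,
// or null when no such div exists.
function getDivByClass(string $html, string $class): ?string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query('//div[@class="' . $class . '"]');

    return $nodes->length > 0 ? $dom->saveHTML($nodes->item(0)) : null;
}

$html = '<div class="basic"><div class="math"><div class="winner">'
      . '<div class="under"><div class="checker"><strong>check</strong></div></div>'
      . '</div></div></div>';

$result = getDivByClass($html, 'under');
echo $result;
```

Calling it with a class that is not present returns null, which the caller can test for before echoing.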
Function to extract the contents from a specific div id from any webpage
The below function extracts the contents from the specified div and returns it. If no divs with the ID are found, it returns false.
function getHTMLByID($id, $html) {
$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->loadHTML($html);
$node = $dom->getElementById($id);
if ($node) {
return $dom->saveXML($node);
}
return FALSE;
}
$id is the ID of the <div> whose content you're trying to extract, $html is your HTML markup.
Usage example:
$html = file_get_contents('http://www.mysql.com/');
echo getHTMLByID('tagline', $html);
Output:
The world's most popular open source database
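I'm not sure what you are asking, but this might be it:
preg_match_all('~<div class="under">(.*?)</div>~s', $htmlsource, $output);
$output[1] should now contain the inner content of that div, up to the first closing </div>. Because the div in the question contains a nested div, the match is cut short there, so a DOM parser is the more reliable choice.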
I want to get a specific tag from a URL, for example:
If I have this content:
<div id="hey">
<div id="bla"></div>
</div>
<div id="hey">
<div id="bla"></div>
</div>
And I want to get all divs with the id "hey" (I think it's done with preg_match_all). How can I do that?
The content inside the tag can change.
I recommend using the DOMDocument class instead of regular expressions (it consumes fewer resources and is clearer, IMHO).
$content = '<div id="hey">
<div id="bla"></div>
</div>
<div id="hey">
<div id="bla"></div>
</div>';
$doc = new DOMDocument();
@$doc->loadHTML($content); // @ suppresses warnings for possibly non-standard HTML
$xpath = new DOMXPath($doc);
$elements = $xpath->query("//div[@id='hey']");
/* @var $elements DOMNodeList */
for ($i = 0; $i < $elements->length; $i++) {
/* @var $curr_element DOMElement */
$curr_element = $elements->item($i);
// Here do what you want with the element
var_dump($curr_element);
}
If you want to get the content from a URL, you can use this line instead to fill the variable $content:
$content = file_get_contents('http://yourserver/urls/page.php');
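As one possible way to "do what you want with the element", each match can be turned back into an HTML string with saveHTML(). A short sketch using the markup from the question; the $blocks array is an illustrative name, and libxml_use_internal_errors() is used here in place of the @ operator:

```php
<?php
$content = '<div id="hey">
<div id="bla"></div>
</div>
<div id="hey">
<div id="bla"></div>
</div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // duplicate ids would otherwise raise warnings
$doc->loadHTML($content);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Collect the outer HTML of every div whose id attribute is "hey"
$blocks = [];
foreach ($xpath->query("//div[@id='hey']") as $element) {
    $blocks[] = $doc->saveHTML($element);
}

print_r($blocks);
```

The XPath attribute test matches both divs even though the id is repeated, which is why it is used here rather than getElementById().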