Count Similar Div : Simple html dom - php

I have an HTML layout like this:
<div id="pageno">1</div>
<div id="pageno">2</div>
<div id="pageno">3</div>
<div id="pageno">4</div>
<div id="pageno">5</div>
Using an HTML DOM parser, how can I get the inner text of the last div?
Thanks in advance.

// Create a new DomDocument.
$dom = new DomDocument();
// Load your HTML into it.
$dom->loadHTML('
<div id="pageno">1</div>
<div id="pageno">2</div>
<div id="pageno">3</div>
<div id="pageno">4</div>
<div id="pageno">5</div>
');
// Obtain a list of the DIVs.
$divList = $dom->getElementsByTagName("div");
// Obtain the last element of the list.
$lastDiv = $divList->item($divList->length - 1);
// Output the inner text.
echo $lastDiv->nodeValue;
However, the HTML you have provided is not valid, as element IDs should be unique. loadHTML may emit a warning about the duplicate IDs, although it still builds the document.
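If the page also contains other div elements, taking the last item of getElementsByTagName("div") could pick the wrong one. A minimal sketch of an alternative, assuming the markup is available as a string (the variable names and the extra "other" div are illustrative, not from the question): an XPath query restricted to id="pageno" can both count the matching divs and return the last one.

```php
<?php
// Assumption: the markup is held in a string; the "other" div is added
// only to show that unrelated divs are ignored.
$html = '
<div id="pageno">1</div>
<div id="pageno">2</div>
<div id="pageno">3</div>
<div id="pageno">4</div>
<div id="pageno">5</div>
<div id="other">not a page number</div>
';

// Collect parser warnings (such as duplicate IDs) instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Count the divs whose id attribute is "pageno".
$count = (int) $xpath->evaluate('count(//div[@id="pageno"])');

// Parenthesise the node set so last() applies to all matches together.
$nodes = $xpath->query('(//div[@id="pageno"])[last()]');
$lastText = $nodes->length > 0 ? trim($nodes->item(0)->nodeValue) : null;

echo $count, "\n";    // number of matching divs
echo $lastText, "\n"; // inner text of the last matching div
```

Matching on the attribute with @id avoids relying on getElementById, which expects IDs to be unique.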

Related

PHP + Simple HTML DOM: Select a tag inside a div with class inside another div with class

With PHP Simple HTML DOM, how do I select all <strong> tags that are inside a div with class abc, which is itself inside a div with class 123?
<div class="123">
<div class="abc">
<strong>Text</strong>
</div>
</div>
You need to use a selector like div.123 div.abc strong and get the first element of the result. Here is a working example:
<?php
require 'simple_html_dom.php';
$html =<<<html
<div class="123">
<div class="abc">
<strong>Text</strong>
</div>
</div>
html;
$dom = str_get_html($html);
$el = $dom->find('div.123 div.abc strong', 0);
print $el;
print "\n";
print $el->innertext;
Result:
<strong>Text</strong>
Text
You can refer to the manual for a better understanding of how selectors work.

How to use Simple HTML DOM PHP to get span data-reactid value?

Neither of these work:
$html = file_get_html("https://www.example.com/page/");
print($html->find('[data-reactid=10]', 0)->plaintext);
print($html->find('[data-reactid=11]', 0)->plaintext);
where the html looks like this:
<div class="stuff" data-reactid="10">
<span data-reactid="11">Value I want</span>
</div>
What am I doing wrong?
FYI, this does work:
print($html->find('[data-reactid=5]', 0)->plaintext);
where:
<div class="stuff" data-reactid="5">
<!-- react-text: 6 -->
Value I want
<!-- /react-text: -->
</div>
So how do I get the value with the span?
I can get the value with the div.
This works.
$html_str = '
<div class="stuff" data-reactid="10">
<span data-reactid="11">Value I want</span>
</div>
';
// Create a DOM object
$html = new simple_html_dom();
// Load HTML from a string
$html->load($html_str);
// Get the value
echo $html->find('div[data-reactid=10]', 0)->find('span', 0)->{'data-reactid'};
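For comparison, here is a rough sketch of the same lookup with PHP's built-in DOMDocument and DOMXPath instead of Simple HTML DOM (a swapped-in technique, not the method used above). It reads both the span's text and its data-reactid attribute; the variable names are illustrative.

```php
<?php
// Same snippet as in the question, held in a string.
$htmlStr = '
<div class="stuff" data-reactid="10">
<span data-reactid="11">Value I want</span>
</div>
';

libxml_use_internal_errors(true); // keep parser warnings out of the output
$dom = new DOMDocument();
$dom->loadHTML($htmlStr);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Select the span with data-reactid="11" directly under the div with data-reactid="10".
$span = $xpath->query('//div[@data-reactid="10"]/span[@data-reactid="11"]')->item(0);

// item(0) returns null when nothing matched, so guard before reading.
$text = $span !== null ? trim($span->textContent) : null;
$attr = $span !== null ? $span->getAttribute('data-reactid') : null;

echo $text, "\n"; // the span's text
echo $attr, "\n"; // the attribute value
```

Note that this only sees the HTML as delivered by the server; content that React renders in the browser after page load will not be present in the fetched markup.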

DOMDocument: Remove a div and its content by identifier with PHP

Hi, I want to remove a line from an HTML file with PHP,
like this:
<div id="buttons">
<div id="buttonid_4">Button 4</div>
<div id="buttonid_3">Button 3</div>
<div id="buttonid_2">Button 2</div>
<div id="buttonid_1">Button 1</div>
</div>
So I want to remove buttonid_4 and its content,
so that it looks like this:
<div id="buttons">
<div id="buttonid_3">Button 3</div>
<div id="buttonid_2">Button 2</div>
<div id="buttonid_1">Button 1</div>
</div>
At first I thought it would be easy, but I can't find the answer :|
I tried:
"as simple"
$dom = new DOMDocument('1.0', 'utf-8');
$dom->loadHTMLFile($The_Path_For_File);
$element = $dom->getElementById('buttonid_'. $Button_Id);
$element->parentNode->removeChild($element);
$dom->saveHTMLFile($The_Path_For_File);
I got:
Call to a member function removeChild() on a non-object
I got this every time I tried getElementById, so I continued with XPath:
$xpath = new DOMXpath($dom);
$nodeList = $xpath->query('//div[@id="buttonid'.$Button_Id.'"]');
foreach($nodeList as $element){
$dom->$element->removeChild($element);
}
$dom->saveHTMLFile($The_Path_For_File);
I didn't get an error, and Notepad asked to reload the file, but nothing changed.
Does anyone know how to do this?
The use of getElementById requires a Document Type Declaration (DTD).
PHP Documentation
Note that your HTML fails validation with $dom->validate().
Just add <!DOCTYPE html> to your HTML and it will work.
For this function to work, you will need either to set some ID
attributes with DOMElement::setIdAttribute or a DTD which defines an
attribute to be of type ID. In the later case, you will need to
validate your document with DOMDocument::validate or
DOMDocument::$validateOnParse before using this function.
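An XPath-based removal can also work without getElementById. The following is a minimal sketch, assuming the markup is in a string rather than a file ($html and $buttonId are illustrative names): the attribute test uses @id with the full buttonid_ prefix, and each match is removed through its own parent node.

```php
<?php
// Assumption: the file contents have already been read into a string.
$html = '<!DOCTYPE html>
<html><body>
<div id="buttons">
<div id="buttonid_4">Button 4</div>
<div id="buttonid_3">Button 3</div>
<div id="buttonid_2">Button 2</div>
<div id="buttonid_1">Button 1</div>
</div>
</body></html>';
$buttonId = 4; // hypothetical value; the question builds the id from a variable

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Match on the id attribute (@id), including the underscore in the prefix.
$matches = $xpath->query('//div[@id="buttonid_' . $buttonId . '"]');

// Copy the matches to an array first so removal cannot disturb iteration.
foreach (iterator_to_array($matches) as $element) {
    // A node is removed by asking its parent to drop it.
    $element->parentNode->removeChild($element);
}

$result = $dom->saveHTML();
echo $result;
```

When working with a file, loadHTMLFile and saveHTMLFile can take the place of loadHTML and saveHTML.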

php - regex to get contents in DIV tags

Hello, and thanks for looking at my question.
I need to grab some data from an HTML snippet.
This source is a trusted, structured one, so I think it's OK to use regex on this HTML. DOM and other advanced features in PHP are overkill, I guess.
Here is the format of the HTML snippet.
<div id="d-container">
<div id="row-custom_1">
<div class="label">Type</div>
<div class="content">John Smith</div>
<div class="clear"></div>
</div>
</div>
In the above, please note that the first 2 DIV tags have IDs set. There could be several div tags like row-custom_1, so I will need to escape them.
I'm actually very poor at regex, so I'm hoping you can help me grab the John Smith from the above HTML snippet.
It could be something like
<div * id="row-custom_1" * > * <div * class="content" * >GRAB THIS </div>
but I don't know how to do it in regex.
The John Smith part won't contain any HTML for sure. It comes from a trusted source that strips all HTML and gives the data in the above format.
I understand that regex is never a good idea for processing HTML anyway.
Thank you very much for any assistance.
Edit just after 30 minutes:
Many of the awesome people here suggested using an HTML parser, so I did; it worked like a charm. So if anyone comes here with a similar question, as the author of this stupid question, I'd recommend using DOM for the job.
Here is a simple DOM based code to get your value from the given HTML:
$html = <<< EOF
<div id="d-container">
<div id="row-custom_1">
<div class="label">Type</div>
<div class="content">John Smith</div>
<div class="clear"></div>
</div>
</div>
EOF;
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($html); // loads your html
$xpath = new DOMXPath($doc);
$value = $xpath->evaluate("string(//div[@id='d-container']
/div[@id='row-custom_1']/div[@class='content']/text())");
echo "User Name: [$value]\n"; // prints your user name
OUTPUT:
User Name: [John Smith]
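The question mentions that there could be several row-custom_N divs. A possible sketch for collecting all of them, assuming each row keeps the same label/content structure (the second row and the variable names are made up for illustration): starts-with() matches every row ID, and a relative XPath is evaluated against each row.

```php
<?php
// Assumption: a second row is added here purely for illustration.
$html = '
<div id="d-container">
<div id="row-custom_1">
<div class="label">Type</div>
<div class="content">John Smith</div>
<div class="clear"></div>
</div>
<div id="row-custom_2">
<div class="label">City</div>
<div class="content">Springfield</div>
<div class="clear"></div>
</div>
</div>
';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$rows = [];

// Select every direct child div whose id begins with "row-custom_".
$query = '//div[@id="d-container"]/div[starts-with(@id, "row-custom_")]';
foreach ($xpath->query($query) as $row) {
    // Relative expressions are evaluated with the current row as context node.
    $label = trim($xpath->evaluate('string(div[@class="label"])', $row));
    $content = trim($xpath->evaluate('string(div[@class="content"])', $row));
    $rows[$label] = $content;
}

print_r($rows);
```

Keying the result by label is just one design choice; if labels can repeat, appending to a plain list would avoid overwriting entries.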

Parse HTML with PHP's HTML DOMDocument

I was trying to do it with "getElementsByTagName", but it wasn't working. I'm new to using DOMDocument to parse HTML; I used regex until yesterday, when some kind folks here told me that DOMDocument would be better for the job, so I'm giving it a try :)
I googled around for a while looking for explanations but didn't find anything that helped (not with the class, anyway).
So I want to capture "Capture this text 1" and "Capture this text 2" and so on.
It doesn't look too hard, but I can't figure it out :(
<div class="main">
<div class="text">
Capture this text 1
</div>
</div>
<div class="main">
<div class="text">
Capture this text 2
</div>
</div>
If you want to get :
The text
that's inside a <div> tag with class="text"
that's, itself, inside a <div> with class="main"
I would say the easiest way is not to use DOMDocument::getElementsByTagName -- which will return all tags that have a specific name (while you only want some of them).
Instead, I would use an XPath query on your document, using the DOMXpath class.
For example, something like this should load the HTML string into a DOM object and instantiate the DOMXPath class:
$html = <<<HTML
<div class="main">
<div class="text">
Capture this text 1
</div>
</div>
<div class="main">
<div class="text">
Capture this text 2
</div>
</div>
HTML;
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
And, then, you can use XPath queries, with the DOMXPath::query method, that returns the list of elements you were searching for :
$tags = $xpath->query('//div[@class="main"]/div[@class="text"]');
foreach ($tags as $tag) {
var_dump(trim($tag->nodeValue));
}
And executing this gives me the following output :
string 'Capture this text 1' (length=19)
string 'Capture this text 2' (length=19)
You can use http://simplehtmldom.sourceforge.net/
It is a simple, easy-to-use DOM parser written in PHP, with which you can easily fetch the content of a div tag.
Something like this:
// Find all <div> elements which have the attribute class=text
$ret = $html->find('div[class=text]');
See its documentation for more help.
