PHP: get a value from a URL using a class - php

The div below appears on several different domains, and I want to display the combined total on my homepage. I tried file_get_html, but it returns the whole div content. I only want to save the number inside <dd></dd> in a variable, add the numbers from each page together and display the sum on my page.
Here is the div code:
<div class="stats">
<dl class="statscount">
<dt>total:</dt>
<dd>5,299</dd>
</dl>
20000
</div>
And here is my current code:
<?php
include 'simple_html_dom.php';
$html = file_get_html('http://www.targetdomain.com');
$result = $html->find('dl[class=statscount]', 0); //Output: THESE
$result = str_replace(",", "", $result);
echo $result;
?>
There is one small problem: I don't need to fetch all the data in the class, I just need the content of the <dd></dd> tag within it. Can you please tell me how to achieve this? Basically, I want to fetch the number within <dd>5,299</dd>, add up the numbers from the different pages and display the total on my website. Thanks

I would use XPath for this. That way you won't need simple_html_dom, because DOM and XPath are part of the PHP 5 core:
$html = <<<EOF
<div class="stats">
<dl class="statscount">
<dt>total posts:</dt>
<dd>5,299</dd>
</dl>
20000
</div>
EOF;
$doc = new DOMDocument();
$doc->loadHTML($html);
$selector = new DOMXPath($doc);
$value = $selector
->query('//dl[@class="statscount"]/dd/text()')
->item(0)
->nodeValue;
var_dump($value); // Output: string(5) "5,299"
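To get from one extracted string to the total the question asks for, the value needs its thousands separator removed before it can be added up. The following is only a sketch under assumptions: extractTotal() is a hypothetical helper name, and the two strings in $pages are placeholder markup standing in for whatever file_get_contents() would return from each of your pages.

```php
<?php
// Hypothetical helper (not from the original answer): returns the number
// inside <dl class="statscount"><dd>...</dd></dl>, or 0 if it is missing.
function extractTotal(string $html): int
{
    $previous = libxml_use_internal_errors(true); // silence warnings on messy HTML
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//dl[@class="statscount"]/dd');
    if ($nodes === false || $nodes->length === 0) {
        return 0;
    }
    // "5,299" -> "5299" -> 5299
    return (int) str_replace(',', '', trim($nodes->item(0)->textContent));
}

// Placeholder markup; in practice each string would come from
// file_get_contents() on one of your pages.
$pages = [
    '<div class="stats"><dl class="statscount"><dt>total:</dt><dd>5,299</dd></dl></div>',
    '<div class="stats"><dl class="statscount"><dt>total:</dt><dd>1,201</dd></dl></div>',
];

$total = array_sum(array_map('extractTotal', $pages));
echo number_format($total), PHP_EOL; // prints 6,500
```

Returning 0 for a page without the element keeps the sum going if one site is down or changes its markup; whether that is the right behaviour for your homepage is a design choice.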

Maybe a regex:
preg_match('/<dd>([^<]*)<\/dd>/', $htmlcode, $matches);
$result = $matches[1]; // the text between <dd> and </dd>

Related

How to get data or value from any div in php

I have created a PHP page that uses many divs with different ids,
and I want to get the data or value from one of them.
Here is one div with its id;
I want to get the value from this div:
<div id="tablename">tablename</div>
I have tried this, but it's not working:
$doc = new DomDocument();
$thediv = $doc->getElementById('tablename');
echo $thediv->textContent;
So please tell me, how can I get this value from my div?
You need to pass the whole content of your page to the class; otherwise it can't select anything, since it thinks the document is empty:
$content = '<div id="tablename">tablename</div>';
$doc = new DomDocument();
$doc->loadHTML($content); // That's the addition
$thediv = $doc->getElementById('tablename');
echo $thediv->textContent;
More info:
loadHTML(): Load the HTML from a string.
loadHTMLFile(): Load the HTML from a file.
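Because getElementById() returns null when no element has the requested id, it is worth checking the result before reading textContent. A minimal sketch, assuming a hypothetical helper named textById() and made-up sample markup:

```php
<?php
// Hypothetical helper: text of the element with the given id,
// or null when the document contains no such element.
function textById(string $html, string $id): ?string
{
    $doc = new DOMDocument();
    $doc->loadHTML($html); // the document must be loaded before any lookup
    $element = $doc->getElementById($id);

    return $element === null ? null : trim($element->textContent);
}

// Sample markup, made up for this illustration.
$content = '<div id="tablename">tablename</div><div id="other">something else</div>';

var_dump(textById($content, 'tablename')); // string(9) "tablename"
var_dump(textById($content, 'missing'));   // NULL
```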
Download and include the PHP Simple HTML DOM Parser from https://sourceforge.net/projects/simplehtmldom/files/ and
try this:
include 'simple_html_dom.php';
$html = file_get_html("http://www.facebook.com");
$displaybody = $html->find('div[id=blueBarDOMInspector]', 0)->plaintext;
echo $displaybody;
exit;

How to scrape html contents of one div by id using php

The page on another of my domains which I'd like to scrape one div from contains:
<div id="thisone">
<p>Stuff</p>
</div>
<div id="notthisone">
<p>More stuff</p>
</div>
Using this php...
<?php
$page = file_get_contents('http://thisite.org/source.html');
$doc = new DOMDocument();
$doc->loadHTML($page);
foreach ($doc->getElementsByTagName('div') as $node) {
echo $doc->saveHtml($node), PHP_EOL;
}
?>
...gives me all divs on http://thisite.org/source.html, with html. However, I only want to pull through the div with an id of "thisone" but using:
foreach ($doc->getElementById('thisone') as $node) {
doesn't bring up anything.
$doc->getElementById('thisone'); // returns a single element with the id "thisone"
Try $node = $doc->getElementById('thisone'); and then print it with echo $doc->saveHTML($node);
On a side note, you can use phpQuery for a jQuery-like syntax: pq("#thisone")
$doc->getElementById('thisone') returns a single DOMElement, not an array, so you can't iterate through it
just do:
$node = $doc->getElementById('thisone');
echo $doc->saveHtml($node), PHP_EOL;
Look at the PHP manual: http://php.net/manual/en/domdocument.getelementbyid.php
getElementById returns an element or NULL, not an array, and therefore you can't iterate over it.
Instead, do this:
<?php
$page = file_get_contents('example.html');
$doc = new DOMDocument();
$doc->loadHTML($page);
$node = $doc->getElementById('thisone');
echo $doc->saveHtml($node), PHP_EOL;
?>
On running
php edit.php you get something like this
<div id="thisone">
<p>Stuff</p>
</div>
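If only the contents of the div are wanted, without the wrapping <div id="thisone"> tag itself, the child nodes can be serialized one by one. This is a sketch rather than part of the answers above: innerHtml() is a hypothetical helper name and the markup is a compact stand-in for the source page.

```php
<?php
// Hypothetical helper: serializes the children of a node, i.e. its "inner HTML".
function innerHtml(DOMNode $node): string
{
    $html = '';
    foreach ($node->childNodes as $child) {
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

// Compact stand-in for the page from the question.
$page = '<div id="thisone"><p>Stuff</p></div><div id="notthisone"><p>More stuff</p></div>';

$doc = new DOMDocument();
$doc->loadHTML($page);

$node = $doc->getElementById('thisone');
if ($node !== null) {
    echo innerHtml($node), PHP_EOL; // prints <p>Stuff</p>
}
```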

For each div tag, take its contents

I'm trying to loop through the code of an HTML page and reformat its contents. It has a few divs within divs, which I want to extract. I've tried various forms of explode, regex and DOM, but can't find exactly how to do this.
Example:
<div class="section1">
<div class="section2">number 1</div>
</div>
<div class="section1">
<div class="section2">number 2</div>
</div>
The result I'm looking for is basically, for each section 1, get contents from section 2, so the output would be:
number 1, number 2
Does anyone know how to do something like this?
Should be pretty easy with DOMXPath:
$doc = new DOMDocument;
$doc->loadHTML(/*...*/); // load the HTML here
$xpath = new DOMXPath($doc);
$result = $xpath->query("//div[@class='section1']/div[@class='section2']/text()");
foreach ($result as $item) {
echo "$item->wholeText\n";
}
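To produce exactly the comma-separated output the question asks for, the matched texts can be collected into an array and joined. A small sketch, using the example markup from the question collapsed onto one line:

```php
<?php
// Example markup from the question, collapsed onto one line.
$html = '<div class="section1"><div class="section2">number 1</div></div>'
      . '<div class="section1"><div class="section2">number 2</div></div>';

$doc = new DOMDocument;
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Collect the text of every section2 div that sits directly inside a section1 div.
$texts = [];
foreach ($xpath->query("//div[@class='section1']/div[@class='section2']") as $node) {
    $texts[] = trim($node->textContent);
}

echo implode(', ', $texts), PHP_EOL; // prints number 1, number 2
```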
This is a jQuery solution, not PHP:
$('.section1 .section2').each(function() {
console.log($(this).text()); // logs "number 1", then "number 2"
});

Add a span tag to Title without javascript

UPDATE:
Yes, I am using PHP in my pages.
Hello friends, I was wondering: is there a way to add a <span> tag to the title without using JavaScript?
Maybe using regex, PHP or some other method. I don't really know.
Let me explain.
My HTML is like this:
<h3 class="title">The Title Goes Here</h3>
What I want is to automatically add a span tag, so that the final HTML looks like this:
<h3 class="title"><span>The </span>Title Goes Here</h3>
I want to wrap only the first word of the title in a <span> tag.
I know this can easily be done using JavaScript, but I am looking for a non-JavaScript solution.
Please help!
You can do this with DOMDocument in PHP if you don't want to do it with the javascript DOM:
$html = '<h3 class="title">The Title Goes Here</h3>';
$doc = new DOMDocument();
$doc->loadHTML($html);
$xp = new DOMXPath($doc);
foreach($xp->query('//h3[@class="title"]') as $parent) {
$title = $parent->nodeValue;
list($first, $rest) = explode(' ', $title, 2);
$span = new DOMElement('span', $first. ' ');
$parent->nodeValue = $rest;
$parent->insertBefore($span, $parent->firstChild);
}
foreach($doc->getElementsByTagName('body')->item(0)->childNodes as $node)
{
echo $doc->saveHTML($node);
}
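Since the question also mentions regex, here is a regex-based sketch of the same transformation. It assumes the heading is written exactly as <h3 class="title"> and that the title starts with plain text rather than a nested tag; wrapFirstWord() is a hypothetical helper name. The DOM version above is the more robust choice for arbitrary markup.

```php
<?php
// Hypothetical helper: wraps the first word (and the whitespace after it)
// of every <h3 class="title"> heading in a <span>.
function wrapFirstWord(string $html): string
{
    return preg_replace(
        '~(<h3 class="title">)\s*([^\s<]+)(\s*)~',
        '$1<span>$2$3</span>',
        $html
    );
}

echo wrapFirstWord('<h3 class="title">The Title Goes Here</h3>'), PHP_EOL;
// prints <h3 class="title"><span>The </span>Title Goes Here</h3>
```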
My answer is that this cannot be done. You can't manipulate a page in the browser without JavaScript. This can only be achieved by editing the page on the server manually, or by dynamically generating it using PHP logic, or an equivalent solution, of which there are many.
If you are doing this for a corporate solution that is only used on a single corporate standard browser, you could look into building a plugin for the browser.

How to write a preg_match_all just for grabbing one specific element?

Until the website gives me access to its API, I need to display only 2 things from this website:
What i want to grab
// Example on a live page
Those 2 things are contained in a div :
<div style="float: right; margin: 10px;">
here what i want to display on my website
</div>
The problem is that I found an example on Stack Overflow, but I have never written a preg_match before. How do I do this with the data I want to grab? Thank you
<?php $html = file_get_contents($st_player_cv->getUrlEsl());
preg_match_all(
'What do i need to write here ?',
$html,
$posts, // will contain the data
PREG_SET_ORDER // formats data into an array of posts
);
foreach ($posts as $post) {
$premium = $post[1];
$level = $post[2];
// do something with data
}
The DOM way to do it would be
libxml_use_internal_errors(TRUE);
$dom = new DOMDocument;
$dom->loadHTMLFile('http://www.esl.eu/fr/player/5178309/');
libxml_clear_errors();
$xPath = new DOMXPath($dom);
$nodes = $xPath->query('//div[@style="float: right; margin: 10px;"]');
foreach($nodes as $node) {
echo $node->nodeValue, PHP_EOL;
}
but there is a whole slew of JavaScript in the page that modifies the DOM heavily after the page has loaded. Since a PHP script that fetches the page will not execute any JavaScript, the style we search for in the XPath does not exist yet and we won't get any results (the regex suggested by Hannes doesn't work for the same reason). The level numbers on the badge don't exist yet either.
As Wrikken pointed out in the comments, there also seems to be some mechanism to block certain requests. I got the message once, but I am not sure what triggers it, because I could also fetch the page on several occasions.
To cut a long story short: you cannot achieve what you are trying to do with this page.
If you want something more generic
preg_match('/<div[^>]+?>(.*?)<\/div>/s', $myhtml, $result);
echo $result[1] . "\n";
$myhtml contains the HTML code you have to analyze. $result is the array that holds the full match and the content of the () group after the regular expression was applied. $result[1] will give you what is between the <div ... > and </div>. The s modifier lets . match newlines, which is needed here because the div spans several lines.
This way, even if the <div differs (class name change or different attributes), it'll still work.
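For the preg_match_all call from the question, a generic pattern of this kind could be used roughly as follows. This is a sketch with made-up sample markup and a hypothetical helper name, divContents(); it assumes the divs are not nested, because a lazy (.*?) stops at the first </div> it meets.

```php
<?php
// Hypothetical helper: trimmed contents of every non-nested <div> in $html.
function divContents(string $html): array
{
    // s modifier: "." also matches newlines, so multi-line divs are captured.
    preg_match_all('/<div[^>]*>(.*?)<\/div>/s', $html, $posts, PREG_SET_ORDER);

    $contents = [];
    foreach ($posts as $post) {
        $contents[] = trim($post[1]); // $post[0] is the full match, $post[1] the group
    }
    return $contents;
}

// Sample markup, made up for this illustration.
$myhtml = "<div class=\"a\">\n one\n</div>\n"
        . "<div style=\"float: right; margin: 10px;\">two</div>";

print_r(divContents($myhtml));
```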
This regex '#<div style="float: right; margin: 10px;">(.*?)</div>#s' should do the trick (yeah), but I would advise you to use DOM & XPath.
edit:
Here is an Xpath / DOM Example:
$html = <<<HTML
<html>
<body>
<em>nonsense</em>
<div style="float: right; margin: 10px;"> here what i want to display on my website </div>
<div> even more nonsense </div>
</body>
</html>
HTML;
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXpath($doc);
$elements = $xpath->query('//div[@style="float: right; margin: 10px;"]');
echo $elements->item(0)->nodeValue;
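Matching the whole style attribute is brittle, because any change in order or spacing breaks the exact comparison. One possible variation is XPath's contains() function, which only requires a fragment of the attribute to be present. This is a sketch: divsByStyleFragment() is a hypothetical helper, the markup is an adapted version of the example above with the style declarations reordered, and the fragment is assumed not to contain double quotes.

```php
<?php
// Hypothetical helper: trimmed text of every div whose style attribute
// contains $fragment. Assumes $fragment contains no double quotes.
function divsByStyleFragment(string $html, string $fragment): array
{
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $xpath = new DOMXPath($doc);

    $texts = [];
    foreach ($xpath->query('//div[contains(@style, "' . $fragment . '")]') as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

// Same content as the example above, but with the style declarations reordered.
$html = '<html><body><em>nonsense</em>'
      . '<div style="margin: 10px; float: right;"> here what i want to display on my website </div>'
      . '<div> even more nonsense </div></body></html>';

print_r(divsByStyleFragment($html, 'float: right'));
```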
