Get Div Class data-files as php value - php

I have been trying to get the data-files attribute into a PHP variable, but have not been able to.
This is from the source code.
<div class="videoplayer" id="video1" data-files="files.mp4">
This is the code I'm having the most success with, but I don't get the data-files value.
<?php
$doc = new DOMDocument();
#$doc->loadHTML($url);
$doc->validateOnParse = true;
libxml_use_internal_errors(true);
$doc->loadHtml(file_get_contents($url));
libxml_use_internal_errors(false);
$classname="videoplayer";
$finder = new DomXPath($doc);
$result = $finder->query("//*[contains(@class, '$classname')]");
// There's actually something in the list
if($result->length > 0) {
$node = $result->item(0);
echo "{$node->nodeName} - {$node->nodeValue}";
}
else {
echo "Empty";
}
?>
Any ideas how to achieve this?

You get the value of attributes using DOMElement::getAttribute. So to get the data-files attribute, use:
$file = $node->getAttribute("data-files");
echo "$node->nodeName - $file";
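A minimal, self-contained sketch of the same idea, assuming the markup from the question is available as a string. The helper name firstDataFiles and the inline $html value are made up for illustration; a real script would fetch the page instead.

```php
<?php
// Hypothetical helper: return the data-files value of the first element
// whose class attribute contains $className, or null if nothing matches.
function firstDataFiles(string $html, string $className): ?string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query("//*[contains(@class, '$className')]");
    if ($nodes->length === 0) {
        return null;
    }

    $node = $nodes->item(0);
    // Attributes are read with getAttribute(), not through nodeValue
    return $node instanceof DOMElement ? $node->getAttribute('data-files') : null;
}

// Inline markup copied from the question (assumption: no network access needed)
$html = '<div class="videoplayer" id="video1" data-files="files.mp4"></div>';
echo firstDataFiles($html, 'videoplayer'), "\n"; // prints: files.mp4
```

Returning null when nothing matches keeps the "Empty" case from the question easy to detect in the caller.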

Related

Extracting information from <i> tag from HTML using PHP

I have some code that returns an HTTP 500 error, and I'm a bit confused. I need to extract the numeric weather values from a weather forecast site and show them on my website.
Here is the code:
orai_class.php
<?php
Class orai{
var $url;
function generate_orai($url){
$html = file_get_contents($url);
$classname = 'wi wi-1';
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$results = $xpath->query("//*[@class='" . $classname . "']");
$i=0;
foreach($results as $node)
{
if ($results->length > 0) {
$array[] = $results->item($i)->nodeValue;
}
$i++;
}
return $array;
}
}
?>
index.php
<?php
include("orai.class.php");
$orai = new orai();
print_r($orai->generate_orai('https://orai.15min.lt/prognoze/vilnius'));
?>
Thank You.

Echo HTML code, which is retrieved from a external page in php

We have this code
$page = file_get_contents('http://example.aspx?a=14&c=14213&med=0');
$doc = new DOMDocument();
$doc->loadHTML($page);
$divs = $doc->getElementsByTagName('table');
foreach($divs as $div) {
// Loop through the tables looking for one with an id of "Table2"
// Then echo out its contents
if ($div->getAttribute('id') === 'Table2') {
echo $div->childNodes;
}
}
As you can see, the code runs, but it outputs plain text because of how childNodes is echoed. We need to output the HTML code of "Table2" instead of plain text.
How can I do this?
Solved, with this code
$dom = new DOMDocument();
$data = file_get_contents('http://example.aspx?a=14&c=14213&med=0');
$dom->loadHTML($data); // $data is your html code, grab it using file_get_contents or cURL.
$xpath = new DOMXPath($dom);
$div = $xpath->query('//table[@id="Table2"]');
$div = $div->item(0);
echo $dom->saveXML($div);
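DOMDocument::saveHTML() also accepts a node argument, which serializes that element as HTML instead of XML. Here is a small sketch of that variant; the helper name outerHtmlById and the inline $page markup are assumptions standing in for the remote page.

```php
<?php
// Hypothetical helper: return the HTML markup of the element with the given id,
// or null when no such element exists.
function outerHtmlById(string $html, string $id): ?string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // tolerate imperfect real-world HTML
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $node = $xpath->query("//*[@id='$id']")->item(0);
    if ($node === null) {
        return null;
    }

    // Passing a node to saveHTML() serializes only that node and its children
    $out = $dom->saveHTML($node);
    return $out === false ? null : $out;
}

// Made-up markup for illustration only
$page = '<table id="Table1"><tr><td>skip</td></tr></table>'
      . '<table id="Table2"><tr><td>keep</td></tr></table>';
echo outerHtmlById($page, 'Table2'), "\n";
```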

get value of href inside of div from external site using PHP

Good day, Sir/Ma'am.
I want to look up a certain HTML attribute on an external website.
I want to get the a href value, but the problem is that the id, class, or name of the link is random.
<div class="static">
Dynamic
</div>
This code should display all the hrefs in http://example.com
In this case I use DOMDocument and XPath to select the elements you want to access because it's very flexible and easy to use.
<?php
$html = file_get_contents("http://example.com");
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DomXPath($doc);
$nodeList = $xpath->query("//a/@href");
print_r($nodeList);
// To access the values inside nodes
foreach($nodeList as $node){
echo "<p>" . $node->nodeValue . "</p>";
}
Use jQuery to get the value as follows:
var link = $(".static>a").attr("href");
You can use PHP DOMDocument:
<?php
$exampleurl = "http://YourDomain.com"; //set your url
$filterClass = "dynamicclass";
$dom = new DOMDocument('1.0');
@$dom->loadHTMLFile($exampleurl);
$anchors = $dom->getElementsByTagName('a');
foreach ($anchors as $element) {
$href = $element->getAttribute('href'); // all href
$class = $element->getAttribute('class');
if($class==$filterClass){
echo $href;
}
}
?>
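Since only the wrapping div has a stable class, another option is to select by the parent, much like the jQuery selector .static>a does. This is a sketch under that assumption; the helper name hrefsInStaticDiv and the sample markup are invented for illustration.

```php
<?php
// Hypothetical helper: collect href values of <a> elements that are direct
// children of a <div class="static">, whatever the anchor's own class is.
function hrefsInStaticDiv(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $hrefs = [];
    // Select the href attribute nodes themselves, then read their values
    foreach ($xpath->query("//div[@class='static']/a/@href") as $attr) {
        $hrefs[] = $attr->nodeValue;
    }
    return $hrefs;
}

// Made-up markup: the anchor's class is random, the wrapper's class is not
$html = '<div class="static"><a class="x9f2" href="/page-1">Dynamic</a></div>'
      . '<div class="other"><a href="/ignored">Other</a></div>';
print_r(hrefsInStaticDiv($html));
```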

DOM Parser grabbing href of <a> tag by class="Decision"

I'm working with a DOM parser and I'm having issues. I'm basically trying to grab the href of only those <a> tags that have the class 'thumbnail '. I've been trying to print the links on the screen and still get no results. Any help is appreciated. I also turned on error_reporting(E_ALL); and still nothing.
$html = file_get_contents('http://www.reddit.com/r/funny');
$dom = new DOMDocument();
@$dom->loadHTML($html);
$classId = "thumbnail ";
$div = $html->find('a#'.$classId);
echo $div;
I also tried this but still had the same result of NOTHING:
include('simple_html_dom.php');
$html = file_get_contents('http://www.reddit.com/r/funny');
$dom = new DOMDocument();
@$dom->loadHTML($html);
// grab all the on the page
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a");
$ret = $html->find('a[class=thumbnail]');
echo $ret;
You were almost there:
<?php
$dom = new DOMDocument();
@$dom->loadHTMLFile('http://www.reddit.com/r/funny');
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a[contains(concat(' ',normalize-space(@class),' '),' thumbnail ')]");
var_dump($hrefs);
Gives:
class DOMNodeList#28 (1) {
public $length =>
int(25)
}
25 matches, I'd call it success.
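The concat/normalize-space expression matches thumbnail as a whole class token, which differs from both a substring test and an exact attribute comparison. A small sketch comparing the three on made-up markup (the helper name countMatches and the sample anchors are assumptions):

```php
<?php
// Hypothetical helper: count the nodes an XPath expression selects in $html.
function countMatches(string $html, string $expr): int
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    return $xpath->query($expr)->length;
}

// Made-up anchors: one multi-class, one with a longer class name, one exact
$html = '<a class="thumbnail may-blank" href="/a">a</a>'
      . '<a class="thumbnail-large" href="/b">b</a>'
      . '<a class="thumbnail" href="/c">c</a>';

// Whole-token match: pads the class list with spaces and looks for " thumbnail "
$token = "//a[contains(concat(' ', normalize-space(@class), ' '), ' thumbnail ')]";
// Substring match: also catches "thumbnail-large"
$substring = "//a[contains(@class, 'thumbnail')]";
// Exact match: misses elements that carry additional classes
$exact = "//a[@class='thumbnail']";

echo countMatches($html, $token), "\n";     // prints: 2
echo countMatches($html, $substring), "\n"; // prints: 3
echo countMatches($html, $exact), "\n";     // prints: 1
```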
This code would probably work:
$html = file_get_contents('http://www.reddit.com/r/funny');
$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$hyperlinks = $xpath->query('//a[@class="thumbnail"]');
foreach($hyperlinks as $hyperlink) {
echo $hyperlink->getAttribute('href'), '<br>';
}
If you're using simple_html_dom, why are you doing all these superfluous things? It already wraps the resource in everything you need -- http://simplehtmldom.sourceforge.net/manual.htm
include('simple_html_dom.php');
// set up:
$html = new simple_html_dom();
// load from URL:
$html->load_file('http://www.reddit.com/r/funny');
// find those <a> elements:
$links = $html->find('a[class=thumbnail]');
// done.
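// find() returns an array of elements, so print each href in turn
foreach ($links as $link) {
echo $link->href, "\n";
}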
Tested it and made some changes - this works perfectly too.
<?php
// load the url and set up an array for the links
$dom = new DOMDocument();
@$dom->loadHTMLFile('http://www.reddit.com/r/funny');
$links = array();
// loop thru all the A elements found
foreach($dom->getElementsByTagName('a') as $link) {
$url = $link->getAttribute('href');
$class = $link->getAttribute('class');
// Check if the URL is not empty and if the class contains thumbnail
if(!empty($url) && strpos($class,'thumbnail') !== false) {
array_push($links, $url);
}
}
// Print results
print_r($links);
?>

How can I echo a scraped div in PHP?

How do I echo and scrape a div class? I tried this but it doesn't work. I am using cURL to establish the connection. How do I echo it? I want it just how it is on the actual page.
$document = new DOMDocument();
$document->loadHTML($html);
$selector = new DOMXPath($document);
$anchors = $selector->query("/html/body//div[@class='resultitem']");
//a URL you want to retrieve
foreach($anchors as $a) {
echo $a;
}
Neighbor,
I just made the snippet below, which uses your logic plus a few tweaks to display the specified class from the page fetched with file_get_contents.
Maybe you can plug in your values and try it?
(Note: I put the error checking in there to see a few bugs. It can be helpful to use that as you tweak. )
<?php
error_reporting(E_ALL);
ini_set('display_errors', '1');
$url = "http://www.tizag.com/cssT/cssid.php";
$class_to_scrape="display";
$html = file_get_contents($url);
$document = new DOMDocument();
$document->loadHTML($html);
$selector = new DOMXPath($document);
$anchors = $selector->query("/html/body//div[@class='". $class_to_scrape ."']");
echo "ok, no php syntax errors. <br>Let's see what we scraped.<br>";
foreach ($anchors as $node) {
$full_content = innerHTML($node);
echo "<br>".$full_content."<br>" ;
}
/* this function preserves the inner content of the scraped element.
** http://stackoverflow.com/questions/5349310/how-to-scrape-web-page-data-without-losing-tags
** So be sure to go and give that post an uptick too:)
**/
function innerHTML(DOMNode $node)
{
$doc = new DOMDocument();
foreach ($node->childNodes as $child) {
$doc->appendChild($doc->importNode($child, true));
}
return $doc->saveHTML();
}
?>
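A possible variation on the innerHTML helper above, sketched under the assumption that serializing each child with saveHTML() on the node's own document is acceptable; it avoids building a second DOMDocument. The name innerHtmlOf and the sample markup are made up for illustration.

```php
<?php
// Hypothetical variant: build the inner HTML by serializing each child node
// with the owning document, instead of importing nodes into a new document.
function innerHtmlOf(DOMNode $node): string
{
    $html = '';
    foreach ($node->childNodes as $child) {
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

// Made-up markup in place of the page fetched with cURL
$document = new DOMDocument();
libxml_use_internal_errors(true);
$document->loadHTML('<div class="resultitem"><b>Title</b> and <i>text</i></div>');
libxml_clear_errors();

$selector = new DOMXPath($document);
foreach ($selector->query("//div[@class='resultitem']") as $div) {
    // Echo the markup inside each matching div, tags included
    echo innerHtmlOf($div), "\n";
}
```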
