Not all divs are returned when parsing HTML with Simple HTML DOM - php

I am trying to parse all the divs on a page with Simple HTML DOM, but not all of the divs are returned.
My code is:
<?php
include('simplehtmldom/simple_html_dom.php');
// Create DOM from URL or file
$html = file_get_html('http://www.ebay.in');
foreach($html->find('div') as $element)
echo $element->class . '<br>';
?>

Related

Get td contain table from library simplehtmldom

simple_html_dom does not work in page "https://eldni.com/buscar-por-dni?dni=44626399"
<?php
include_once './simple_html_dom./HtmlWeb.php';
use simplehtmldom\HtmlWeb;
// get DOM from URL or file
$doc = new HtmlWeb();
$html = $doc->load('https://eldni.com/buscar-por-dni?dni=44626399');
foreach($html->find('td') as $e)
echo $e->plaintext . '<br>' . PHP_EOL;
?>
I want the plain text of each td in the table.
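For comparison, cell text can also be read with PHP's built-in DOM extension. This is only a sketch on a made-up table: it does not fetch the URL above, and if that page fills its table with JavaScript or only after a form submission, the cells will not be in the HTML that any parser receives. The function name tableRows and the sample markup are assumptions for illustration.

```php
<?php
// Hypothetical helper: returns the trimmed text of every <td>, grouped by table row.
function tableRows(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $rows = [];
    foreach ($doc->getElementsByTagName('tr') as $tr) {
        $cells = [];
        foreach ($tr->getElementsByTagName('td') as $td) {
            $cells[] = trim($td->textContent);
        }
        if ($cells !== []) { // skip rows that contain only <th> cells
            $rows[] = $cells;
        }
    }
    return $rows;
}

// Sample markup, assumed for the example.
$sample = '<table><tr><th>DNI</th><th>Name</th></tr>'
        . '<tr><td> 00000000 </td><td>Example Person</td></tr></table>';
print_r(tableRows($sample));
```

Grouping by row keeps related cells together, which is usually easier to work with than one flat list of td values.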

Question about using simple html dom parser to store HTML tags as objects

I am building a web scraper with the Simple HTML DOM parser, but I can't work out how to store the HTML elements of a page as objects. I would like to take an input URL, turn all of its HTML elements (tags, divs, fields, and so on) into objects, and print those objects on a page. My code runs when I type in a URL, but the output is not what I am after. The code I have so far is below.
I have tried finding all images and links as well as creating a DOM object. I can't figure out how to convert these elements into objects that I can use to learn more about a website and possibly store in a database.
<?php
require('simple_html_dom.php');
// Create DOM from URL or file
$url = $_POST["url"];
$html = file_get_html($url);
echo $html;
// Find all images
$element = new simple_html_dom();
foreach($html->find('img') as $element)
echo $element->src . '<br>';
// Find all links
$element = new simple_html_dom();
foreach($html->find('a') as $element)
echo $element->href . '<br>';
// Create a DOM object
$html = new simple_html_dom();
// Load HTML from a URL
$html->load_file($url);
echo $html;
?>
I expected objects as output, but instead the images and links from the web page are printed.
<?php
require('simple_html_dom.php');
// Create DOM from URL or file
// $url = $_POST["url"];
$url = 'Your-Url'; // Your url: 'www.example.com'
$html = file_get_html($url);
// Find all images
$images = []; // Create an empty images array
foreach($html->find('img') as $element){
    $images[] = $element->src; // Store each image's src in the images array
}
echo '<pre>Output $images: '; var_dump($images); echo '</pre>'; // Dump the images array
// Find all links
$links = []; // Create an empty links array
foreach($html->find('a') as $element){
    $links[] = $element->href; // Store each link's href in the links array
}
echo '<pre>Output $links: '; var_dump($links); echo '</pre>'; // Dump the links array
The var_dump calls display the two arrays, filled with the src values of the img tags and the href values of the a tags on your page.
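If the goal is data that can be inspected or stored rather than echoed strings, one option is to collect each element's attributes into an associative array. The sketch below uses PHP's built-in DOMDocument instead of simple_html_dom so that it runs without extra files. The function name collectElements, the record layout, and the sample HTML are assumptions for illustration.

```php
<?php
// Hypothetical helper: returns one associative array per element with the given tag name.
function collectElements(string $html, string $tag): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $records = [];
    foreach ($doc->getElementsByTagName($tag) as $node) {
        $attributes = [];
        foreach ($node->attributes as $attr) {
            $attributes[$attr->name] = $attr->value;
        }
        $records[] = [
            'tag'        => $node->nodeName,
            'attributes' => $attributes,
            'text'       => trim($node->textContent),
        ];
    }
    return $records;
}

// Sample markup, assumed for the example.
$sample = '<html><body><a href="/about">About us</a>'
        . '<img src="logo.png" alt="Logo"></body></html>';
$links  = collectElements($sample, 'a');
$images = collectElements($sample, 'img');
print_r($links);
print_r($images);
```

Each record is a plain array, so it can be passed to json_encode or bound to the columns of a database insert.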

How to access an HTML attribute and retrieve data from it in PHP?

I'm new to PHP and I would like to know how to retrieve the value of an HTML attribute, such as src, from an element.
It's very easy to do that in jQuery:
$('img').attr('src');
But I have no idea how to do it in PHP (if it is possible).
Here's an example I'm working on:
I loaded $result into SimpleXMLElement and stored it into $xml:
$xml = simplexml_load_string($result) or die("Error: Cannot create object");
Then used foreach to loop over all elements:
foreach($xml->links->link as $link){
echo 'Image: ' . $link->{'link-code-html'}[0] . '</br>';
// returns something similar to: <a href='....'><img src='....'></a>
}
Inside the foreach I'm trying to access the src of each img.
Is there a way to access the src of the img nested inside the a? The markup is visible when the value is output to the screen:
echo 'Image: ' . $link->{'link-code-html'}[0] . '</br>';
I would do this with the built-in DOMDocument and DOMXPath APIs, and then you can use the getAttribute method on any matching img node:
$doc = new DOMDocument();
// Load some example HTML. If you need to load from file, use ->loadHTMLFile
$doc->loadHTML("<a href='abc.com'><img src='ping1.png'></a>
<a href='def.com'><img src='ping2.png'></a>
<a href='ghi.com'>something else</a>");
$xpath = new DOMXPath($doc);
// Collect the images that are children of anchor elements
$imgs = $xpath->query("//a/img");
foreach($imgs as $img) {
echo "Image: " . $img->getAttribute("src") . "\n";
}
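In the question, each link-code-html value is itself a string of HTML, so the same approach can be applied to one fragment at a time. A minimal sketch, assuming the fragment looks like the one in the question's comment; firstImgSrc is a hypothetical helper name.

```php
<?php
// Hypothetical helper: returns the src of the first <img> nested in an <a>, or null if none.
function firstImgSrc(string $fragment): ?string
{
    if (trim($fragment) === '') {
        return null; // loadHTML rejects an empty string
    }
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($fragment);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $imgs  = $xpath->query('//a/img');
    if ($imgs === false || $imgs->length === 0) {
        return null;
    }
    return $imgs->item(0)->getAttribute('src');
}

// Inside the question's loop this would be called as:
// firstImgSrc((string) $link->{'link-code-html'}[0])
echo firstImgSrc("<a href='abc.com'><img src='ping1.png'></a>"), "\n";
```

The cast to string matters in the loop because SimpleXML returns an element object, not the text it contains.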

How can I get the img src from an HTML page using PHP?

I want to get the img src from an HTML page, but I get an error. Please help. My server shows this message:
Notice: Undefined offset: 0 in F:\xamppppp\htdocs\Arslan_Sir\img download from google.php on line 13
Notice: Array to string conversion in F:\xamppppp\htdocs\Arslan_Sir\img download from google.php on line 15
Array
My code is:
<?php
//this code can be pic image from a html page
$ctual_link="https://www.google.com/search?q=9780333993385&ie=utf-8&oe=utf-8&client=firefox-b-ab";
define('DIRECTORY', '/imgg/m/');
$text = file_get_contents($ctual_link);
preg_match_all('/<div class=\"image\">(.*?)<\/div>/s', $text, $out);
//preg_match('/~src="(.*)"itemprop="image" \/>/',$text,$out);
preg_match('~src="(.*)"\s*itemprop="image"[^>]*>~',$text,$out);
//$out = explode(' ',$out[1]);
$z=trim($out[0],'"');
echo $out;
//}
?>
I'm not quite sure, but PHP Simple HTML DOM Parser might help here.
This is the example from the library's landing page:
$html = file_get_html('http://www.google.com/');
// Find all images
foreach($html->find('img') as $element)
echo $element->src . '<br>';
This should solve your issue. You can use the following code:
<?php
$ctual_link="https://www.google.com/search?q=9780333993385&ie=utf-8&oe=utf-8&client=firefox-b-ab";
$html = file_get_contents($ctual_link);
//Create a new DOM document
$dom = new DOMDocument;
// Load the HTML; @ suppresses the warnings that malformed markup triggers
@$dom->loadHTML($html);
$links = $dom->getElementsByTagName('img');
foreach ($links as $link){
//Extract and show the "src" attribute of image.
echo $link->nodeValue;
echo $link->getAttribute('src'), '<br>';
}
?>
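Real-world HTML is often malformed, and loadHTML emits a warning for each problem it finds. One way to handle this without the error-control operator is to let libxml collect the errors and clear them afterwards. A sketch under that assumption; imageSources is a hypothetical helper name and the sample markup is made up.

```php
<?php
// Hypothetical helper: returns the src of every <img> that has one,
// plus the number of parse errors libxml recorded while loading.
function imageSources(string $html): array
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // collect errors instead of emitting warnings
    $dom->loadHTML($html);
    $errorCount = count(libxml_get_errors());
    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the earlier setting

    $sources = [];
    foreach ($dom->getElementsByTagName('img') as $img) {
        if ($img->hasAttribute('src')) {
            $sources[] = $img->getAttribute('src');
        }
    }
    return ['sources' => $sources, 'errors' => $errorCount];
}

// Sample markup, assumed for the example.
$result = imageSources('<div><img src="a.png"><img alt="no source"><img src="b.png"></div>');
print_r($result['sources']);
```

Checking hasAttribute first keeps empty strings out of the result, since getAttribute returns an empty string for a missing attribute.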

How to get alt attribute from a specific div image by php dom parser?

How can I get the alt attribute of an image inside a specific div with a PHP scraper? With the code below I can print the alt attribute of every img, but I would like to get only the alt attributes of images inside a div with a specific class. How can I do that? Here is my code sample:
<?php
error_reporting(E_ALL);
error_reporting(1);
set_time_limit(0);
require 'simple_html_dom.php';
$url = "mylink.com";
$html = file_get_html($url);
foreach($html->find('img') as $element)
echo $element->alt.'<br>';
?>
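Let's say the div's class is foo. Select the img elements inside that div, since alt is an attribute of the img and not of the div:
foreach($html->find('div.foo img') as $img) {
    echo $img->alt . '<br>';
}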
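The same selection can also be written with the built-in DOMDocument and DOMXPath classes. A sketch, assuming the class name foo used above; altsInClass is a hypothetical helper name and the sample markup is made up.

```php
<?php
// Hypothetical helper: returns the alt text of every <img> inside a <div> with the given class.
// $class is assumed to be a simple class name without quotes or spaces.
function altsInClass(string $html, string $class): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match the class as a whole word so that "foo" does not also match "foobar".
    $query = sprintf(
        "//div[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]//img",
        $class
    );

    $alts = [];
    foreach ($xpath->query($query) as $img) {
        $alts[] = $img->getAttribute('alt');
    }
    return $alts;
}

// Sample markup, assumed for the example.
$sample = '<div class="foo bar"><img src="1.png" alt="first"></div>'
        . '<div class="foobar"><img src="2.png" alt="second"></div>';
print_r(altsInClass($sample, 'foo'));
```

A plain contains(@class, 'foo') test would also match the second div, which is why the class value is padded with spaces before comparing.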
