I am trying to get the full description from this eBay URL: https://www.ebay.com/itm/Front-strut-spacers-30mm-for-Ford-Focus2-C-Max-Focus3-Kuga-Escape-Lift-Kit/112460641185?epid=19025000547&hash=item1a2f2d33a1:g:0IYAAOSw1m9atFcz.
The text I want is inside the div with id ds_div. However, when I debug it, the result has no value. Here is my code:
$description = $html->find("div[id=ds_div]", 0);
var_dump($description);
if($description != null){
$item['description'] = $description->plaintext;
}else{
$item['description'] = '';
}
Try this; it may help:
foreach ( $html->find('td div#ds_div') as $element ) {
echo $element->plaintext . '<br>';
}
There is actually no element with id ds_div on that page, which is why your query returns nothing. There is, however, an iframe on that page that contains the element you're looking for. Get the URL of that iframe (its src attribute), fetch and parse that source, and you should get your description.
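The first step, pulling the iframe URL out of the page, can be sketched as follows. This is a minimal sketch using PHP's built-in DOMDocument rather than simple_html_dom; the sample markup, the iframe id desc_ifr, the example.com URL, and the helper name findIframeSrc are all assumptions for illustration, so check the real page source for the actual id.

```php
<?php
// Stand-in for the downloaded item page; the iframe id and URL are made up.
$pageHtml = '<html><body><div id="desc_wrapper">'
    . '<iframe id="desc_ifr" src="https://example.com/item-description?item=123"></iframe>'
    . '</div></body></html>';

// Hypothetical helper: return the src of the iframe with the given id, or null.
function findIframeSrc(string $html, string $iframeId): ?string
{
    $dom = new DOMDocument();
    // Silence the warnings that real-world, non-valid HTML tends to trigger.
    @$dom->loadHTML($html);
    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query('//iframe[@id="' . $iframeId . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    $src = $nodes->item(0)->getAttribute('src');
    return $src !== '' ? $src : null;
}

$iframeUrl = findIframeSrc($pageHtml, 'desc_ifr');
echo $iframeUrl === null ? "No iframe found\n" : "Description URL: $iframeUrl\n";
// Next step (not run here): download $iframeUrl and search that document
// for the description element.
```

Once you have the iframe URL, you can load it the same way you load the item page and run your original query against that second document.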
I want to scrape a few web pages. I am using PHP and Simple HTML DOM Parser.
For instance, I am trying to scrape this site: https://www.autotrader.co.uk/motorhomes/motorhome-dealers/bc-motorhomes-ayr-dpp-10004733?channel=motorhomes&page=5
I use this to load the URL:
$html = new simple_html_dom();
$html->load_file($url);
This loads the correct page. Then I find the next page link, here it will be:
https://www.autotrader.co.uk/motorhomes/motorhome-dealers/bc-motorhomes-ayr-dpp-10004733?channel=motorhomes&page=6
Just the page value is changed from 5 to 6. The code snippet to get the next link is:
function getNextLink($_htmlTemp)
{
//Getting the next page links
$aNext = $_htmlTemp->find('a.next', 0);
$nextLink = $aNext->href;
return $nextLink;
}
The above method returns the correct link with page value being 6.
Now, when I try to load this next link, it fetches the default first page, as if the page parameter were absent from the URL.
//After loop we will have details of all the listing in this page -- so get next page link
$nxtLink = getNextLink($originalHtml); //Returns string url
if(!empty($nxtLink))
{
//Yay, we have the next link -- load the next link
print 'Next Url: '.$nxtLink.'<br>'; //$nxtLink has correct value
$originalHtml->load_file($nxtLink); //This line fetches default page
}
The whole flow is something like this:
$html->load_file($url);
//Whole thing in a do-while loop
$originalHtml = $html;
$shouldLoop = true;
//Main Array
$value = array();
do{
$listings = $originalHtml->find('div.searchResult');
foreach($listings as $item)
{
//Some logic here
}
//After loop we will have details of all the listing in this page -- so get next page link
$nxtLink = getNextLink($originalHtml); //Returns string url
if(!empty($nxtLink))
{
//Yay, we have the next link -- load the next link
print 'Next Url: '.$nxtLink.'<br>';
$originalHtml->load_file($nxtLink);
}
else
{
//No next link -- stop the loop as we have covered all the pages
$shouldLoop = false;
}
} while($shouldLoop);
I have tried encoding the whole URL and encoding only the query parameters, but the result is the same. I also tried creating new instances of simple_html_dom and then loading the file, with no luck. Please help.
You need to html_entity_decode those links. As far as I can see, simple-html-dom returns the href with HTML entities still encoded (for example & as &amp;), which mangles the query string.
$url = 'https://www.autotrader.co.uk/motorhomes/motorhome-dealers/bc-motorhomes-ayr-dpp-10004733?channel=motorhomes';
$html = str_get_html(file_get_contents($url));
while($a = $html->find('a.next', 0)){
$url = html_entity_decode($a->href);
echo $url . "\n";
$html = str_get_html(file_get_contents($url));
}
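To see why the decoding step matters, here is a small sketch that needs no network access. It assumes the href in the page source writes & as &amp; (common in HTML) and that the parser hands the attribute back as written; the variable names are illustrative only.

```php
<?php
// href as it might appear in the page source, with "&" written as "&amp;".
$rawHref = 'https://www.autotrader.co.uk/motorhomes/motorhome-dealers/bc-motorhomes-ayr-dpp-10004733?channel=motorhomes&amp;page=6';

// Turn HTML entities back into plain characters before using the URL.
$decoded = html_entity_decode($rawHref);

// Parse both query strings to compare what a server would receive.
parse_str((string) parse_url($rawHref, PHP_URL_QUERY), $rawParams);
parse_str((string) parse_url($decoded, PHP_URL_QUERY), $decodedParams);

$hasPageRaw = array_key_exists('page', $rawParams);
$hasPageDecoded = array_key_exists('page', $decodedParams);

echo $decoded, "\n";
echo 'page parameter found in raw href: ', var_export($hasPageRaw, true), "\n";
echo 'page parameter found in decoded href: ', var_export($hasPageDecoded, true), "\n";
```

With the entity left in place, the server does not see a parameter named page, which would explain why it falls back to the first page.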
I'm totally new to PHP, and I'm having a hard time changing the src attribute of img tags.
I have a website that pulls part of a page using PHP Simple HTML DOM. Here is the code:
<?php
include_once('simple_html_dom.php');
$html = file_get_html('http://www.tabuademares.com/br/bahia/morro-de-sao-paulo');
foreach($html ->find('img') as $item) {
$item->outertext = '';
}
$html->save();
$elem = $html->find('table[id=tabla_mareas]', 0);
echo $elem;
?>
This code correctly returns the part of the page I want. But when I do this, the img tags come with the src of the original page: /assets/svg/icon_name.svg
What I want to do is change the original src so that it looks like this: http://www.mywebsite.com/wp-content/themes/mytheme/assets/svg/icon_name.svg
In other words, I want to put the URL of my site in front of /assets/svg/icon_name.svg
I already tried some tutorials, but I could not make any of them work.
Could someone please help a PHP noob?
I got it working. If someone has the same question, here is how I managed it:
<?php
// Note: you must download simple_html_dom.php from
// https://sourceforge.net/projects/simplehtmldom/files/
// and then include it.
include_once('simple_html_dom.php');
// Target the website
$html = file_get_html('http://the_target_website.com');
// Loop through all images in the HTML DOM
foreach($html->find('img') as $item) {
    // Get an attribute (for a non-value attribute such as checked or selected, this returns true or false)
    $value = $item->src;
    // Set an attribute; ltrim avoids a double slash when the src starts with "/"
    $item->src = 'http://yourwebsite.com/'.ltrim($value, '/');
}
// Save the modified DOM
$html->save();
// Find the div whose content you want
$elem = $html->find('div[id=container]', 0);
// Output it using echo
echo $elem;
?>
That's it!
Did you read the documentation on reading and modifying attributes?
According to it:
// Get an attribute (for a non-value attribute such as checked or selected, this returns true or false)
$value = $e->href;
// Set an attribute
$e->href = 'yoursitename'.$value;
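When prefixing a site URL onto a src, it is easy to end up with a double slash or to rewrite a URL that was already absolute. The sketch below shows one way to guard against both; prefixSrc is a hypothetical helper name, not part of Simple HTML DOM, and the cdn.example.com URL is only sample input.

```php
<?php
// Hypothetical helper: join a base URL and a src path with exactly one slash.
function prefixSrc(string $base, string $src): string
{
    // Leave absolute URLs (http://, https://, //host) untouched.
    if (preg_match('#^(?:[a-z][a-z0-9+.-]*:)?//#i', $src) === 1) {
        return $src;
    }
    return rtrim($base, '/') . '/' . ltrim($src, '/');
}

$base = 'http://www.mywebsite.com/wp-content/themes/mytheme';

echo prefixSrc($base, '/assets/svg/icon_name.svg'), "\n";
echo prefixSrc($base, 'https://cdn.example.com/logo.svg'), "\n";
```

Inside an image loop this would be used as `$item->src = prefixSrc($base, $item->src);`.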
Say I have HTML code like this:
$html = "This is some stuff right here. OH MY GOSH";
I am trying to get the href value and also the text of the anchor, I mean the "check this out" text. I am able to get the href value with the following code:
foreach($displaybody->find('a') as $element) {
    echo $element;
}
Well, it works for me, but how do I get the "check this out" text? Could you help me out? I did search, but I was not able to find an answer. Thanks in advance.
My actual HTML looks like this:
» Download MP4 « - <b>144p (Video Only)</b> - <span> 19.1</span> MB<br />
That is what my link looks like. The code above returns "Download MP4", and I want it to return "Download MP4 144p (Video Only) 19.1 MB". How do I do that?
If you are using Simple HTML DOM, then ->innertext works fine on the anchor elements you have found:
include 'simple_html_dom.php';
$html = "This is some stuff right here. OH MY GOSH";
$displaybody = str_get_html($html);
foreach($displaybody->find('a ') as $element) {
echo $element->innertext . '<br/>';
}
If you were referring to PHP's DOMDocument, then find() is not the function you need. To target each anchor element, use ->getElementsByTagName(), and then read ->nodeValue on each selected element:
$html = "This is some stuff right here. OH MY GOSH";
$dom = new DOMDocument();
$dom->loadHTML($html);
foreach($dom->getElementsByTagName('a') as $element) {
echo $element->nodeValue . '<br/>';
}
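For completeness, here is a sketch that collects both the href and the link text with DOMDocument. The sample markup, including the href and the position of the anchor, is an assumption made for illustration.

```php
<?php
// Sample markup; the href and link text are placeholders, not from the question.
$html = '<p>This is some stuff right here. '
    . '<a href="https://example.com/page">check this out</a> OH MY GOSH</p>';

$dom = new DOMDocument();
$dom->loadHTML($html);

// Collect the href attribute and the visible text of every anchor.
$links = [];
foreach ($dom->getElementsByTagName('a') as $anchor) {
    $links[] = [
        'href' => $anchor->getAttribute('href'),
        'text' => trim($anchor->textContent),
    ];
}

print_r($links);
```

getAttribute('href') gives the link target, while textContent (equivalent to nodeValue for elements) gives the text between the opening and closing tags.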
I'm trying to make a PHP page that shows an album cover via the Last.FM API. However, the artist name and the title of the track are provided by an XML file that a piece of software updates via FTP.
Here is the code for the Last.FM API:
<?php
$img = simplexml_load_file('http://ws.audioscrobbler.com/2.0/?method=track.getInfo&api_key=<APIKEY>&artist=cher&track=believe');
echo '<img src="';
echo $img->track[0]->album[0]->image[3];
echo '">';
?>
Now the link of my XML file is: http://summerblast.pt/avaplayer/rds.xml
The info I need for the Last.FM API is in 'OnAir/CurMusic'.
What I am trying to do is replace "cher" and "believe" (the artist's name and the track title) in the URL passed to simplexml_load_file (in the PHP code) with the info that my XML file provides.
Can you please help me do this?
Thank you all in advance.
Sorry, I was confused about what you were asking before. I have corrected my answer to do what you wanted.
You can get the artist and the title from rds.xml and pass them into the URL as shown below (note that it is important to run urlencode on them, since they can contain spaces and other characters that would break the URL):
Updated to add error checking:
$xml = simplexml_load_file('http://summerblast.pt/avaplayer/rds.xml');
$artist = urlencode($xml->OnAir->CurMusic->Artist);
$track = urlencode($xml->OnAir->CurMusic->Title);
$url = 'http://ws.audioscrobbler.com/2.0/?method=track.getInfo&api_key=XXXXXXXXXXXXXXXXXXXXXXXXXXX&artist='.$artist.'&track='.$track;
$xml2 = @simplexml_load_file($url); // @ suppresses warnings so the failure can be handled below
if ($xml2 === false)
{
echo("Url failed"); // do whatever you want to do
}
else
{
if($xml2->track->album->image[3])
{
echo '<img src="';
echo((string) $xml2->track->album->image[3]);
echo '">';
}
else
{
echo("artist does not have an image"); // do whatever you want to do
}
}
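As an alternative to calling urlencode on each value by hand, the query string can be assembled with http_build_query, which encodes every value for you. This is only a sketch: buildTrackInfoUrl is a hypothetical helper, and the API key and artist/track values are placeholders.

```php
<?php
// Hypothetical helper; the API key and the artist/track values are placeholders.
function buildTrackInfoUrl(string $apiKey, string $artist, string $track): string
{
    // http_build_query URL-encodes each value (spaces become "+", "&" becomes "%26").
    $query = http_build_query([
        'method'  => 'track.getInfo',
        'api_key' => $apiKey,
        'artist'  => $artist,
        'track'   => $track,
    ], '', '&');

    return 'http://ws.audioscrobbler.com/2.0/?' . $query;
}

echo buildTrackInfoUrl('YOUR_API_KEY', 'Daft Punk', 'Get Lucky'), "\n";
echo buildTrackInfoUrl('YOUR_API_KEY', 'Simon & Garfunkel', 'Cecilia'), "\n";
```

When the values come from SimpleXML, cast them to strings first, for example `(string) $xml->OnAir->CurMusic->Artist`.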
I have a select menu in jQuery Mobile. When the user selects an option, I want to show a table from another website, and the table needs to change when the user selects a different option.
I am using Simple HTML DOM Parser, but I was wondering how to add the value of the selected option to the URL. For example, if the user selects an option with the value 32, it should add 32 to the URL, so that the URL used in the PHP code would be 'http://www.generalconvention.org/gc/deputations?diocese_id=32'. How do I do this using PHP?
<?php
include('simple_html_dom.php');
// get DOM from URL or file
$html = file_get_html('http://www.generalconvention.org/gc/deputations?diocese_id=');
// Find all tables
foreach($html->find('table') as $element)
echo $element;
?>
After you capture the value, concatenate it like this:
<?php
$value = "Your form value";
include('simple_html_dom.php');
// get DOM from URL or file
$html = file_get_html('http://www.generalconvention.org/gc/deputations?diocese_id=' . $value);
// Find all tables
foreach($html->find('table') as $element)
echo $element;
?>
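Since the value comes from user input, it may be worth validating it before it goes into the URL. The sketch below assumes the id is always a positive integer and that the form field is called diocese; both are assumptions, and buildDeputationsUrl is a hypothetical helper.

```php
<?php
// Hypothetical helper: accept only a positive integer id, otherwise return null.
function buildDeputationsUrl($rawId): ?string
{
    $id = filter_var($rawId, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
    if ($id === false) {
        return null;
    }
    return 'http://www.generalconvention.org/gc/deputations?diocese_id=' . $id;
}

// In a real page the value would come from the submitted form,
// e.g. $_GET['diocese'] (an assumed field name); '32' is a fallback for this demo.
$rawId = $_GET['diocese'] ?? '32';
$url = buildDeputationsUrl($rawId);

echo $url ?? 'Invalid diocese id', "\n";
```

If the helper returns a URL, pass it to file_get_html as in the code above; if it returns null, skip the request and show an error instead.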
You can also make an AJAX call:
$(".comboboxClass").change(function(){
$("#divHtml").load('http://www.generalconvention.org/gc/deputations?diocese_id='+
$(this).find('option:selected').val()+' table.deputies'
);
});
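This uses .load() to put the contents of the table.deputies element into #divHtml. Note that browsers only allow this when the request is same-origin or the remote site sends CORS headers; otherwise, route the request through your own PHP script.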