get the href value of a specific element and load it - php

I'm using jQuery to add rel="brochure" to a link with $('.imageOuter a').attr('rel', 'brochure'), and this works as expected.
However, I want to grab the link whose rel is brochure from PHP. I'm trying to do this with loadHTML, as below:
function getBrochureLink() {
$doc = new DOMDocument();
$doc->loadHTML($file);
$area = $doc->getElementsByTagName('body')->item(0);
$links = $area->getElementsByTagName("link");
foreach($links as $l) {
if($l->getAttribute("rel") == "brochure") {
$brochureLink = $l->getAttribute("href");
}
}
}
Sadly, $brochureLink is empty; the link is never found.

Your issue is that the attribute is set via JavaScript. loadHTML only parses the markup it is given and does not execute any JS, so the rel attribute is never there and no link can match. Note also that getElementsByTagName("link") matches <link> elements, not <a> anchors.
You'll have to either run the JS on the server side, put the attribute into the markup directly without JS, or find another architecture for whatever you're attempting to accomplish.
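As a rough sketch of the second option, assuming the rel="brochure" attribute is already present in the HTML that PHP receives: the helper name findBrochureLink and the sample markup below are made up for illustration. It queries a elements, since the jQuery selector targets anchors.

```php
<?php
// Hypothetical helper: returns the href of the first <a rel="brochure">, or null.
function findBrochureLink(string $html): ?string
{
    libxml_use_internal_errors(true); // tolerate imperfect real-world markup
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Anchors, not <link> elements: the jQuery selector targets '.imageOuter a'.
    $nodes = $xpath->query('//a[@rel="brochure"]');

    return $nodes->length > 0 ? $nodes->item(0)->getAttribute('href') : null;
}

// Sample markup (an assumption) with rel already set on the server side.
$html = '<div class="imageOuter"><a rel="brochure" href="/files/brochure.pdf">Brochure</a></div>';
echo findBrochureLink($html), "\n";
```

Returning the value from the function, rather than assigning a local variable as in the question, lets the caller actually use the link.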

Related

Change src attribute of img, using Simple HTML Dom php library

I'm totally new to php, and I'm having a hard time changing the src attribute of img tags.
I have a website that pulls a part of a page using Simple Html Dom php, here is the code:
<?php
include_once('simple_html_dom.php');
$html = file_get_html('http://www.tabuademares.com/br/bahia/morro-de-sao-paulo');
foreach($html ->find('img') as $item) {
$item->outertext = '';
}
$html->save();
$elem = $html->find('table[id=tabla_mareas]', 0);
echo $elem;
?>
This code correctly returns the part of the page I want. But the img tags come with the src of the original page: /assets/svg/icon_name.svg
What I want to do is change the original src so that it looks like this: http://www.mywebsite.com/wp-content/themes/mytheme/assets/svg/icon_name.svg
That is, I want to put the URL of my site in front of /assets/svg/icon_name.svg
I have already tried some tutorials, but I could not get any of them to work.
Could someone please help a PHP noob?
I got it to work. In case someone has the same question, here is how I managed to get the code working.
<?php
// Note: download simple_html_dom.php from
// https://sourceforge.net/projects/simplehtmldom/files/
// and then include it
include_once('simple_html_dom.php');
// Target the website
$html = file_get_html('http://the_target_website.com');
// Loop through all images in the HTML DOM
foreach ($html->find('img') as $item) {
// Get an attribute (for a non-value attribute such as checked or selected, this returns true or false)
$value = $item->src;
// Set an attribute
$item->src = 'http://yourwebsite.com/'.$value;
}
// Save the modified DOM
$html->save();
// Find the div whose content you want
$elem = $html->find('div[id=container]', 0);
// Output it using echo
echo $elem;
?>
That's it!
Did you read the documentation on reading and modifying attributes?
According to it:
// Get an attribute (for a non-value attribute such as checked or selected, this returns true or false)
$value = $e->href;
// Set an attribute
$e->href = 'yoursitename'.$value;
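One detail worth watching when prefixing a src this way: if the base URL ends with a slash and the original src starts with one (as in /assets/svg/icon_name.svg), plain concatenation produces a double slash. A minimal sketch of a join that avoids this, where prefixUrl is a hypothetical helper name and not part of Simple HTML DOM:

```php
<?php
// Hypothetical helper: joins a base URL and a path with exactly one slash between them.
function prefixUrl(string $base, string $path): string
{
    return rtrim($base, '/') . '/' . ltrim($path, '/');
}

echo prefixUrl('http://www.mywebsite.com/wp-content/themes/mytheme/', '/assets/svg/icon_name.svg'), "\n";
```

Inside the loop this would be used as $item->src = prefixUrl('http://yourwebsite.com', $item->src);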

scraping images from url using php

I am trying to make a page that lets me grab and save images from another page. Here's what I want on my page:
A text box (to enter the URL I want to get images from).
A save dialog box to specify the path to save the images to.
However, I only want to save the images found inside a specific element at that URL.
For example, my code should go to example.com and grab all images from inside the element with class="images".
Note: not all images on the page, just those inside that element.
Whether the element contains 3 images or 50 or 100 doesn't matter.
Here's what I tried using PHP, and it works:
<?php
$html = file_get_contents('http://www.tgo-tv.net');
preg_match_all( '|<img.*?src=[\'"](.*?)[\'"].*?>|i',$html, $matches );
echo $matches[ 1 ][ 0 ];
?>
This gets the image name and path, but what I'm trying to build is a save dialog box, and the code must save the image directly to that path instead of echoing it.
I hope that's clear.
Edit 2
It's OK not to have a save dialog box; I can specify the save path in the code.
If you want something generic, you can use:
<?php
$the_site = "http://somesite.com";
$the_tag = "div"; #
$the_class = "images";
$html = file_get_contents($the_site);
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//'.$the_tag.'[contains(@class,"'.$the_class.'")]/img') as $item) {
$img_src = $item->getAttribute('src');
print $img_src."\n";
}
Usage:
Change the site, the tag (which can be a div, span, a, etc.) and the class name.
For example, change the values to:
$the_site = "https://stackoverflow.com/questions/23674744/what-is-the-equivalent-of-python-any-and-all-functions-in-javascript";
$the_tag = "div"; #
$the_class = "gravatar-wrapper-32";
Output:
https://www.gravatar.com/avatar/67d8ca039ee1ffd5c6db0d29aeb4b168?s=32&d=identicon&r=PG
https://www.gravatar.com/avatar/24da669dda96b6f17a802bdb7f6d429f?s=32&d=identicon&r=PG
https://www.gravatar.com/avatar/24780fb6df85a943c7aea0402c843737?s=32&d=identicon&r=PG
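The snippet above only prints the URLs, while the question asks to save the files to a path set in the code. A possible sketch of that last step follows. The helper names extractImageSources and saveImages, the sample markup, and the /path/to/save directory are all assumptions for illustration. Note that it uses //img (any descendant) where the answer uses /img (direct children only), and that it assumes the src values are absolute URLs; relative ones would need to be resolved against the site first.

```php
<?php
// Hypothetical helper: collects the src of every <img> nested anywhere inside
// elements of the given tag whose class attribute contains $class.
function extractImageSources(string $html, string $tag, string $class): array
{
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $sources = [];
    foreach ($xpath->query('//' . $tag . '[contains(@class,"' . $class . '")]//img') as $img) {
        $sources[] = $img->getAttribute('src');
    }
    return $sources;
}

// Hypothetical helper: downloads each absolute URL into the directory $dir.
function saveImages(array $urls, string $dir): void
{
    foreach ($urls as $url) {
        $name = basename((string) parse_url($url, PHP_URL_PATH));
        $data = file_get_contents($url);
        if ($name !== '' && $data !== false) {
            file_put_contents($dir . '/' . $name, $data);
        }
    }
}

// Sample markup (an assumption): two images inside the target div, one outside.
$sample = '<div class="images"><img src="http://example.com/a.png"><p><img src="http://example.com/b.jpg"></p></div>'
        . '<img src="http://example.com/skip.gif">';
print_r(extractImageSources($sample, 'div', 'images'));

// To actually download (requires network access and a writable directory):
// saveImages(extractImageSources($sample, 'div', 'images'), '/path/to/save');
```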
Maybe you should try HTML DOM Parser for PHP. I found this tool recently and, to be honest, it works pretty well. It has jQuery-like selectors, as you can see on the site. I suggest you take a look and try something like:
<?php
require_once("./simple_html_dom.php");
// Load the page to search (placeholder URL, replace with your own)
$html = file_get_html("http://example.com");
// Start from the root and find the parent tag you want to search in (e.g. "div" to search in all divs)
foreach ($html->find("div") as $parent)
{
// Search for img tags inside each parent element found
foreach ($parent->find("img") as $img)
{
// Output the img's src attribute (for <img src="www.example.com/cat.png"> you get www.example.com/cat.png)
echo $img->src . "<br>";
}
}
?>
I hope this helps.

How to get data or value from any div in php

I have created a PHP page that uses many divs with different ids,
and I want to get the data or value from one of them.
Here is one div with its id;
I want to get the value from this div:
<div id="tablename">tablename</div>
I have tried this, but it's not working:
$doc = new DomDocument();
$thediv = $doc->getElementById('tablename');
echo $thediv->textContent;
So please tell me, how can I get this value from my div?
You need to pass the whole content of your page to the class; otherwise it can't select anything, since it thinks the document is empty:
$content = '<div id="tablename">tablename</div>';
$doc = new DomDocument();
$doc->loadHTML($content); // That's the addition
$thediv = $doc->getElementById('tablename');
echo $thediv->textContent;
More info:
loadHTML(): Load the HTML from a string.
loadHTMLFile(): Load the HTML from a file.
Download and include PHP Simple HTML DOM Parser from https://sourceforge.net/projects/simplehtmldom/files/, then
try this:
include 'simple_html_dom.php';
$html = file_get_html("http://www.facebook.com");
$displaybody = $html->find('div[id=blueBarDOMInspector]', 0)->plaintext;
echo $displaybody ;exit;

use selector search on html code(string) on PHP variable or ways alike

Currently I have a text area where the user can paste HTML code.
I want to get a certain element from that HTML.
In a regular HTML page, this can be done with a jQuery selector,
but I think it's a whole different thing when the HTML is in a PHP variable and treated as a string.
How can I locate a certain element in that case?
code is:
function searchHtml() {
$html = $_POST; // text area input contains html code
$selector = "#rso > div > div > div:nth-child(1) > div > h3 > a"; //example - the a element with hello world
$getValue = getValueBySelector($selector); //will return hello world
}
function getValueBySelector($selector) {
//what will i do here?
}
searchHtml();
You can look at SimpleHTMLDom Parser (manual at http://simplehtmldom.sourceforge.net/manual.htm). This is a powerful tool for parsing HTML code to find and extract various elements and their attributes.
For your particular case, you can use
// Create a DOM object from the input string
$htmlDom = str_get_html($html);
// Find the required element
$e = $htmlDom->find($selector);
Oh, and you've to pass the provided input value to the getValueBySelector() function :-)
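If pulling in an extra library is not an option, a similar lookup can be sketched with PHP's built-in DOMDocument and DOMXPath. This swaps the CSS selector for a hand-written XPath expression, so it is a different technique from the answer above. The helper name getValueByXPath and the sample markup are assumptions. The HTML string is passed in as an argument, in line with the last remark.

```php
<?php
// Hypothetical helper: returns the text of the first node matching $query, or null.
function getValueByXPath(string $html, string $query): ?string
{
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $nodes = (new DOMXPath($doc))->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $nodes->item(0)->textContent;
}

// Hand-translated from "#rso > div > div > div:nth-child(1) > div > h3 > a":
// "*[1][self::div]" means "the first child element, provided it is a div".
$query = '//*[@id="rso"]/div/div/*[1][self::div]/div/h3/a';

// Sample input (an assumption) standing in for the pasted text-area content.
$sample = '<div id="rso"><div><div><div><div><h3><a href="#">hello world</a></h3></div></div></div></div></div>';
echo getValueByXPath($sample, $query), "\n";
```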

PHP DOM: doesn't load CSS styles

I have this code, which gets the HTML of a page and then rewrites the href attribute of every <a> tag so that it points back to my site; my site then loads that page, rewrites its links again, and so on...
<?php
libxml_use_internal_errors(true); // hide the parsing errors
$dom = new DOMDocument; // init new DOMDocument
if($_GET){
$dom->loadHtmlFile($_GET['open']); // getting link to redirect to
}else{
$dom->loadHtmlFile('http://www.stackoverflow.com'); // getting default site
}
$dom->loadHtmlFile('http://www.stackoverflow.com'); // load HTML into it
$xpath = new DOMXPath($dom); // create a new XPath
$nodes = $xpath->query('//a[@href]'); // Find all A elements with a href attribute
foreach($nodes as $node) { // Iterate over found elements
$node->setAttribute('href', 'index.php?open=http://www.stackoverflow.com'.$node->getAttribute('href')); // Change href attribute
}
echo $dom->saveXml(); // output cleaned HTML
?>
The code runs fine; the only problem is that it somehow doesn't load the CSS files.
You're more than welcome to test this code and see what the problem is!
Here is an online version: http://browser.breet.co.il
Thank you in advance!
Use saveHTML() instead of saveXML().
saveXML() puts an XML declaration at the start of the printed output, so the browser doesn't parse the page correctly.
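The difference is easy to check on a small document. The following sketch uses made-up sample markup and simply tests whether each serialization starts with an XML declaration:

```php
<?php
libxml_use_internal_errors(true);
$dom = new DOMDocument();
// Sample markup (an assumption) standing in for the fetched page.
$dom->loadHTML('<html><head><title>t</title></head><body><a href="/x">x</a></body></html>');

$asXml  = $dom->saveXML();  // XML serialization of the whole document
$asHtml = $dom->saveHTML(); // HTML serialization of the whole document

// Does each output begin with an XML declaration?
var_dump(strpos($asXml, '<?xml') === 0);
var_dump(strpos($asHtml, '<?xml') === 0);
```

The XML serialization may also differ from the HTML one in other ways, such as how empty elements are written, which is another reason to prefer saveHTML() when the output is sent to a browser.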
