I want to fetch all phone numbers from this website (olx.com.pk).
I have found this code, but it only fetches a single phone number from a single link on this site (olx.com.pk):
<?php
error_reporting(0);
$ch = curl_init("http://olx.com.pk/item/samsung-galaxy-tab3-16gb-white-IDSUu7h.html#7aae8d1c9a");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$cl = curl_exec($ch);
$dom = new DOMDocument();
@$dom->loadHTML($cl);
//$links = $dom->getElementsByTagName('a');
$xpath = new DOMXpath($dom);
$number = $xpath->query("//strong[@class='xx-large']//text()");
echo "<h1>". $number->item(0)->nodeValue ."</h1>";
?>
I want to fetch all the phone numbers at once.
Is it possible to get all the numbers?
Here is a slightly simplified version of the same code:
// suppress DOM warnings
libxml_use_internal_errors(true);
$url = "http://olx.com.pk/item/samsung-galaxy-tab3-16gb-white-IDSUu7h.html#7aae8d1c9a";
$dom = new DOMDocument();
$dom->loadHTMLFile($url);
$xpath = new DOMXpath($dom);
$items = $xpath->query("//strong[@class='xx-large']");
// loop through items to retrieve node values
foreach ($items as $item) {
echo "<h1>". $item->nodeValue ."</h1>";
}
This code fetches the URL and selects all strong[@class='xx-large'] nodes. The value of each node is retrieved inside the foreach loop.
P.S.
There is only one phone number at the indicated URL, so the final result shows only one phone number.
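To collect numbers from several ads at once, the same query can be run once per page and the results merged. The following is only a minimal sketch: extractPhoneNumbers() is a hypothetical helper, and the two inline HTML strings are made-up stand-ins for pages that would normally be downloaded with file_get_contents() or cURL, one per item URL.

```php
<?php
// Hypothetical helper: returns the text of every <strong class="xx-large">
// found in one HTML string.
function extractPhoneNumbers(string $html): array
{
    $previous = libxml_use_internal_errors(true); // collect markup warnings silently
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $numbers = [];
    foreach ($xpath->query("//strong[@class='xx-large']") as $node) {
        $numbers[] = trim($node->nodeValue);
    }
    return $numbers;
}

// Made-up sample pages (assumption): in practice each string would be the
// downloaded HTML of one ad page.
$pages = [
    '<html><body><strong class="xx-large">0300 1111111</strong></body></html>',
    '<html><body><strong class="xx-large">0321 2222222</strong></body></html>',
];

$allNumbers = [];
foreach ($pages as $html) {
    $allNumbers = array_merge($allNumbers, extractPhoneNumbers($html));
}

print_r($allNumbers);
```

The list of item URLs itself would have to come from somewhere else, for example from the links on a listing page.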
Related
Can you echo the results of a document parser or do you have to first create an array to display the results? Anyway, when running the code, nothing appears (no output or errors), and I have tried both methods. Could possibly be a site issue but I have tried a few others and get the same result.
<?php
$ebayquery ='halo';
$ebayhtml = 'https://www.ebay.com/sch/i.html?_from=R40&_trksid=p2380057.m570.l1311.R6.TR12.TRC2.A0.H0.X.TRS0&_nkw=' . $ebayquery . '&_sacat=0';
$ebayresults = array();
$document = new \DOMDocument('1.0', 'UTF-8');
$internalErrors = libxml_use_internal_errors(true);
$document->loadHTML($ebayhtml);
libxml_use_internal_errors($internalErrors);
$xpath = new DOMXpath($document);
$links = $xpath->query('//h3[@id="lvtitle"]/a');
foreach($links as $a) {
echo $a->nodeValue;
}
?>
There are a couple of problems with the code. The first is that loadHTML() takes a string of HTML, not a filename or URI, so you first have to read the web page and pass its contents in (I've used file_get_contents() here).
Secondly, the XPath was looking for any <h3> tag with an id attribute of lvtitle, but the page only has elements whose class attribute is lvtitle. I've updated the XPath expression to use this instead.
$ebayquery ='halo';
$ebayhtml = 'https://www.ebay.com/sch/i.html?_from=R40&_trksid=p2380057.m570.l1311.R6.TR12.TRC2.A0.H0.X.TRS0&_nkw=' . $ebayquery . '&_sacat=0';
$ebayresults = array();
$document = new \DOMDocument('1.0', 'UTF-8');
$internalErrors = libxml_use_internal_errors(true);
$ebayhtml = file_get_contents($ebayhtml);
$document->loadHTML($ebayhtml);
libxml_use_internal_errors($internalErrors);
$xpath = new DOMXpath($document);
$links = $xpath->query('//h3[@class="lvtitle"]/a');
print_r($links);
foreach($links as $a) {
echo $a->nodeValue.PHP_EOL;
}
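To see the two fixes in isolation without hitting the network, here is a small sketch that feeds loadHTML() a string and runs both XPath expressions against it. The inline markup is a made-up sample, assumed to resemble the structure of the results page; it is not the real eBay HTML.

```php
<?php
// Made-up sample markup (assumption): <h3> elements that carry
// class="lvtitle" and no id attribute.
$html = '<html><body>'
      . '<h3 class="lvtitle"><a href="/itm/1">Halo 3</a></h3>'
      . '<h3 class="lvtitle"><a href="/itm/2">Halo Reach</a></h3>'
      . '</body></html>';

$previous = libxml_use_internal_errors(true);
$document = new DOMDocument();
$document->loadHTML($html); // loadHTML() expects the markup itself, not a URL
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($document);
$byId    = $xpath->query('//h3[@id="lvtitle"]/a');    // matches nothing: no id attribute here
$byClass = $xpath->query('//h3[@class="lvtitle"]/a'); // matches both links

$titles = [];
foreach ($byClass as $a) {
    $titles[] = $a->nodeValue;
}

echo 'by id: ', $byId->length, ', by class: ', $byClass->length, PHP_EOL;
print_r($titles);
```

An empty node list is not an error, which is why the original code printed nothing at all.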
Hi friends, I am using this method to get all the href links from the <a> tags of a site:
$DOM = new DOMDocument();
@$DOM->loadHTML($data);
@$links = $DOM->getElementsByTagName('a');
foreach($links as $link){
    $url = $link->getAttribute('href');
    echo $url;
}
Now I don't know how to get the value of the input named fb_dtsg. Here is the source code:
<input type="hidden" name="fb_dtsg" value="AQF0dSiG6Lyr:AQEnJP0PhWzy" autocomplete="off" />
I want to get its value with DOM. How can I do this? Thanks in advance.
$DOM = new DOMDocument();
@$DOM->loadHTML($data);
$inputs = $DOM->getElementsByTagName('input');
foreach($inputs as $input) {
if ($input->getAttribute('name') == 'fb_dtsg') {
echo 'found, do whatever';
break;
}
}
You can use DOMXPath's query() method to get elements by their name attribute.
$DOM = new DOMDocument();
@$DOM->loadHTML($data);
@$links = $DOM->getElementsByTagName('a');
$xpath = new DOMXpath($DOM);
$input = $xpath->query('//input[@name="fb_dtsg"]');
echo $input->item(0)->getAttribute('value');
This will print the value of the first input element with name 'fb_dtsg'.
Hope it helps :) Feel free to ask if you need to know anything more.
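If the field might be missing from the page, item(0) returns null and the getAttribute() call fails, so it can be worth checking the node list first. A minimal sketch follows; getInputValue() is a hypothetical helper name, and it assumes $name contains no double quotes because it is pasted straight into the XPath expression.

```php
<?php
// Hypothetical helper: value attribute of the first <input> with the given
// name, or null when no such input exists.
// Assumption: $name contains no double quotes.
function getInputValue(string $html, string $name): ?string
{
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query('//input[@name="' . $name . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return null; // nothing matched
    }
    return $nodes->item(0)->getAttribute('value');
}

// The input element quoted in the question, wrapped in a form for the demo.
$data = '<form><input type="hidden" name="fb_dtsg" '
      . 'value="AQF0dSiG6Lyr:AQEnJP0PhWzy" autocomplete="off" /></form>';

$token   = getInputValue($data, 'fb_dtsg');
$missing = getInputValue($data, 'no_such_field');

var_dump($token, $missing);
```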
Use XPath for that.
$DOM = new DOMDocument();
@$DOM->loadHTML($data);
$xpath = new DOMXpath($DOM);
$elementByName = $xpath->query("//input[@name='fb_dtsg']");
...
http://php.net/manual/ro/class.domxpath.php
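$DOM->getElementsByTagName('a'); // by tag name
// DOMDocument has no getElementsByName(); query the name attribute with XPath instead:
$xpath = new DOMXPath($DOM);
$xpath->query('//input[@name="fb_dtsg"]'); // by name attribute
document.getElementById('fb_dtsg_id').value // JavaScript, in the browser: value of the field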
We have this code
$page = file_get_contents('http://example.aspx?a=14&c=14213&med=0');
$doc = new DOMDocument();
$doc->loadHTML($page);
$divs = $doc->getElementsByTagName('table');
foreach($divs as $div) {
// Loop through the tables looking for one with an id of "Table2"
// Then echo out its contents
if ($div->getAttribute('id') === 'Table2') {
echo $div->childNodes;
}
}
As you can see, the code works but outputs plain text because of childNodes. We need to output the HTML markup of "Table2" instead of plain text.
How can I do this?
Solved, with this code
$dom = new DOMDocument();
$data = file_get_contents('http://example.aspx?a=14&c=14213&med=0');
$dom->loadHTML($data); // $data is your html code, grab it using file_get_contents or cURL.
$xpath = new DOMXPath($dom);
$div = $xpath->query('//table[@id="Table2"]');
$div = $div->item(0);
echo $dom->saveXML($div);
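The difference between the text of a node and its markup can be checked on a small inline document. This is only a sketch with a made-up two-table page; it uses saveHTML() with a node argument, which serializes as HTML rather than XML and is an alternative to the saveXML() call above.

```php
<?php
// Made-up sample page (assumption) with two tables; only Table2 is wanted.
$page = '<html><body>'
      . '<table id="Table1"><tr><td>skip</td></tr></table>'
      . '<table id="Table2"><tr><td>Price</td><td>42</td></tr></table>'
      . '</body></html>';

$previous = libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($page);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($dom);
$table = $xpath->query('//table[@id="Table2"]')->item(0);

$text   = $table->nodeValue;      // concatenated text only, tags are dropped
$markup = $dom->saveHTML($table); // the element serialized with its tags

echo $text, PHP_EOL;
echo $markup, PHP_EOL;
```

saveXML($node) also returns markup, but in XML syntax, so empty elements may come out self-closed.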
I'm working with a DOM parser and I'm having issues. I'm trying to grab the href of every <a> tag whose class is 'thumbnail '. I've been trying to print the links on the screen and still get no results. Any help is appreciated. I also turned on error_reporting(E_ALL); and still nothing.
$html = file_get_contents('http://www.reddit.com/r/funny');
$dom = new DOMDocument();
@$dom->loadHTML($html);
$classId = "thumbnail ";
$div = $html->find('a#'.$classId);
echo $div;
I also tried this but still had the same result of NOTHING:
include('simple_html_dom.php');
$html = file_get_contents('http://www.reddit.com/r/funny');
$dom = new DOMDocument();
@$dom->loadHTML($html);
// grab all the <a> elements on the page
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a");
$ret = $html->find('a[class=thumbnail]');
echo $ret;
You were almost there:
<?php
$dom = new DOMDocument();
@$dom->loadHTMLFile('http://www.reddit.com/r/funny');
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a[contains(concat(' ',normalize-space(@class),' '),' thumbnail ')]");
var_dump($hrefs);
Gives:
class DOMNodeList#28 (1) {
public $length =>
int(25)
}
25 matches, I'd call it success.
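The concat/normalize-space expression is there because a class attribute can hold several space-separated names. As a rough illustration, here are three ways of matching a class compared on a made-up snippet; the markup and the hrefsFor() helper are assumptions for the demo, not reddit's actual HTML.

```php
<?php
// Made-up sample links (assumption): one exact class, one with an extra
// class, and one whose class merely starts with "thumbnail".
$html = '<html><body>'
      . '<a class="thumbnail" href="/a">A</a>'
      . '<a class="thumbnail may-blank" href="/b">B</a>'
      . '<a class="thumbnail-large" href="/c">C</a>'
      . '</body></html>';

$previous = libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($dom);

// Hypothetical helper: href of every node matched by the expression.
function hrefsFor(DOMXPath $xpath, string $expression): array
{
    $hrefs = [];
    foreach ($xpath->query($expression) as $node) {
        $hrefs[] = $node->getAttribute('href');
    }
    return $hrefs;
}

// Exact attribute match: misses elements that carry additional classes.
$exact = hrefsFor($xpath, '//a[@class="thumbnail"]');

// Substring match: also catches unrelated names such as "thumbnail-large".
$substring = hrefsFor($xpath, '//a[contains(@class, "thumbnail")]');

// Token match: pads the class list with spaces and looks for " thumbnail ".
$token = hrefsFor(
    $xpath,
    "//a[contains(concat(' ', normalize-space(@class), ' '), ' thumbnail ')]"
);

print_r(['exact' => $exact, 'substring' => $substring, 'token' => $token]);
```

On this sample the exact match finds only /a, the substring match finds all three links, and the token match finds /a and /b.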
This code would probably work:
$html = file_get_contents('http://www.reddit.com/r/funny');
$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$hyperlinks = $xpath->query('//a[@class="thumbnail"]');
foreach($hyperlinks as $hyperlink) {
    echo $hyperlink->getAttribute('href'), '<br>';
}
If you're using simple_html_dom, why are you doing all these superfluous things? It already wraps the resource in everything you need -- http://simplehtmldom.sourceforge.net/manual.htm
include('simple_html_dom.php');
// set up:
$html = new simple_html_dom();
// load from URL:
$html->load_file('http://www.reddit.com/r/funny');
// find those <a> elements:
$links = $html->find('a[class=thumbnail]');
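// done; find() returns an array of elements, so print each href:
foreach ($links as $link) {
    echo $link->href, PHP_EOL;
}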
Tested it and made some changes. This works perfectly too.
<?php
// load the url and set up an array for the links
$dom = new DOMDocument();
@$dom->loadHTMLFile('http://www.reddit.com/r/funny');
$links = array();
// loop thru all the A elements found
foreach($dom->getElementsByTagName('a') as $link) {
$url = $link->getAttribute('href');
$class = $link->getAttribute('class');
// Check if the URL is not empty and if the class contains thumbnail
if(!empty($url) && strpos($class,'thumbnail') !== false) {
array_push($links, $url);
}
}
// Print results
print_r($links);
?>
I'm following a simplified version of the scraping tutorial by NetTuts here, which basically finds all divs with class=preview
http://net.tutsplus.com/tutorials/php/html-parsing-and-screen-scraping-with-the-simple-html-dom-library/comment-page-1/#comments
This is my code. The problem is that when I count $items I get only 1, so it's getting only the first div with class=preview, not all of them.
$articles = array();
$html = new simple_html_dom();
$html->load_file('http://net.tutsplus.com/page/76/');
$items = $html->find('div[class=preview]');
echo "count: " . count($items);
Try using DOMDocument and DOMXPath:
$file = file_get_contents('http://net.tutsplus.com/page/76/');
$dom = new DOMDocument();
@$dom->loadHTML($file);
$domx = new DOMXPath($dom);
$nodelist = $domx->evaluate("//div[@class='preview']");
foreach ($nodelist as $node) { print $node->nodeValue; }