I'm using simple_html_dom.php
<?php
include('simple_html_dom.php');
$songName = '再见青春';
$dom = file_get_html('http://www.google.com/cse?q='. $songName .'&cx=partner-pub-4291153493758949%3A9692445719&cof=FORID%3A10&ie=UTF-8&ad=w9&num=1');
$firstRow = $dom->find('#gs-visibleUrl-long')->plaintext;
echo $dom;
var_dump($firstRow);
?>
$dom is fine, but when I try to dig into the DOM it doesn't work: $firstRow is NULL. Am I doing this scraping wrong?
The DOM and the error are here: http://daysof.me/chrome_lyric/lyric.php
The following code, where I try to find divs by class, is not working for Google search results. I have also tried selecting by id.
include('simple_html_dom.php');
$dom = file_get_html("https://www.google.com/search?q=best+mug");
$all_divs = $dom->find("div[class='g']");
foreach ($all_divs as $div) {
echo $div->plaintext;
}
I think it's better to use XPath to do that, here is a sample of what your code could look like with XPath:
$dom = file_get_contents("https://www.google.com/search?q=best+mug");
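$doc = new DOMDocument();
// @ suppresses the warnings libxml raises on malformed HTML
@$doc->loadHTML($dom);
$xpath = new DOMXPath($doc);
$all_divs = $xpath->query("//div[@class='g']");
foreach ($all_divs as $div) {
// DOMElement has no plaintext property; textContent holds the node's text
echo $div->textContent;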
}
Try it out and let me know if it works.
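If the exact-match query returns nothing, one possible cause is that the div carries more than one class, since [@class='g'] only matches when the attribute value is exactly g. Below is a minimal sketch of a token-based match. It runs against a made-up HTML string rather than a live Google page, so the markup and the $sampleHtml name are assumptions for illustration only:

```php
<?php
// Hypothetical markup standing in for a fetched page.
$sampleHtml = '<html><body>'
    . '<div class="g">first result</div>'
    . '<div class="g extra">second result</div>'
    . '<div class="other">not a result</div>'
    . '</body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc->loadHTML($sampleHtml);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Match any div whose class list contains the token "g".
$query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' g ')]";

$texts = [];
foreach ($xpath->query($query) as $div) {
    $texts[] = trim($div->textContent);
}
echo implode("\n", $texts), "\n";
```

Whether this helps with real Google results depends on the markup Google actually serves to a script, which may differ from what a browser shows.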
I am trying to crawl a website. Is the code below an efficient way to get the values I listed?
<?php
include 'simple_html_dom.php';
$target_url = "http://www.phunwa.com/phone/0191/2604233";
$html = new simple_html_dom();
$html->load_file($target_url);
foreach($html->find('Name') as $link){
echo $link."<br />";
}
?>
Actually, I am trying to fetch the Name, Address and Location. Could anybody please give me an idea how to do this?
Thanks in advance.
By looking at the source code, try getting the contents of the div with class address-tags then looping through the tags and echoing the contents.
Try this to start with:
$dom = new DomDocument();
// $html must be the page's HTML as a string
$dom->loadHTML($html);
$xpath = new DomXpath($dom);
$div = $xpath->query('//*[@class="address-tags"]')->item(0);
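To continue from there, the child elements of that div can be looped over and echoed. This is a rough sketch using an invented HTML snippet, because the real page markup is not shown here. The span children, the example values, and the $sampleHtml name are all assumptions:

```php
<?php
// Hypothetical markup; the real page structure may differ.
$sampleHtml = '<div class="address-tags">'
    . '<span>Name: Example Name</span>'
    . '<span>Address: Example Street 1</span>'
    . '<span>Location: Example City</span>'
    . '</div>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // keep malformed-HTML warnings out of the output
$dom->loadHTML($sampleHtml);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$div = $xpath->query('//*[@class="address-tags"]')->item(0);

$values = [];
if ($div !== null) {
    foreach ($div->childNodes as $child) {
        // Only element children are kept; text nodes and comments are skipped.
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $values[] = trim($child->textContent);
        }
    }
}
echo implode("<br />\n", $values), "\n";
```

The null check matters because item(0) returns null when no element has that class.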
I'm trying to apply 2 classes to an element like this:
$div->setAttribute('class', 'txt found');
Unfortunately it doesn't work; I get the following markup:
<div found="" class="txt">
I've also tried $div->class = "txt found"; which had the same result.
Any ideas how to fix this?
Could you please try the following:
$div->className = "txt found";
Updated:
<?php
$divHtml = "<div></div>";
$dom = new DOMDocument();
$dom->loadHTML($divHtml);
$allElements = $dom->getElementsByTagName('div');
$divElement = $allElements->item(0);
$divElement->setAttribute("class", "txt found");
echo $dom->saveHTML();
?>
I tried to reproduce your case, and in the end it worked. You can test it. If you send more of your code, we can modify it so that it works.
I am trying to get the content of a specific tag, but I can't seem to do so with the following function:
<?PHP
include_once('simple_html_dom.php');
function read_page($url = 'http://google.com')
{
$doc = new DOMDocument();
$data = file_get_html($url);
$content = $data->find('div#footer');
print_r( $content);
}
read_page();
?>
Try $data->find('div[id="footer"]')
I'm attempting to make a script that echoes only the div that encloses the image on Google.
$url = "http://www.google.com/";
$page = file($url);
foreach($page as $theArray) {
echo $theArray;
}
The problem is this echos the whole page.
I want to echo only the part between the <div id="lga"> and the next closest </div>
Note: I tried using if statements, but they weren't working, so I deleted them.
Thanks
Use the built-in DOM methods:
<?php
$page = file_get_contents("http://www.google.com");
$domd = new DOMDocument();
libxml_use_internal_errors(true);
$domd->loadHTML($page);
libxml_use_internal_errors(false);
$domx = new DOMXPath($domd);
$lga = $domx->query("//*[@id='lga']")->item(0);
$domd2 = new DOMDocument();
$domd2->appendChild($domd2->importNode($lga, true));
echo $domd2->saveHTML();
In order to do this you need to parse the DOM and then get the ID you are looking for. Check out a parsing library like this http://simplehtmldom.sourceforge.net/manual.htm
After feeding your html document into the parser you could call something like:
$html = str_get_html($page);
// Passing index 0 returns the first match; without it, find() returns an array
$element = $html->find('div[id=lga]', 0);
echo $element->plaintext;
That, I think, would be your quickest and easiest solution.