Using new simple_html_dom
How can I get
the link
the text (name)
from this markup?
div class="stackoverflow"
href="http://stackoverflow.com">Stackoverflow
div
I think you use innertext and outertext, but I'm new to all of this, so I thought I'd ask the experts.
Thanks
EDIT: I removed the anchors, as they were being parsed and turned into actual links.
From the simple_html_dom documentation: http://simplehtmldom.sourceforge.net/manual.htm
$html = str_get_html('<div class="stackoverflow" href="http://www.stackoverflow.com">Stackoverflow Div</div>');
// Passing 0 as the second argument returns the first match instead of an array of matches.
$e = $html->find("div.stackoverflow", 0);
$link = $e->href;
$name = $e->innertext;
Obviously you can change the html input etc.
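If the markup really has an anchor inside the div (the question's EDIT suggests it did), the same two values can also be read with PHP's built-in DOM classes. This is only a minimal sketch: the sample markup below is an assumed reconstruction, not the asker's exact HTML.

```php
<?php
// Assumed sample markup: the <a> inside the div is a reconstruction of the stripped anchor.
$markup = '<div class="stackoverflow"><a href="http://stackoverflow.com">Stackoverflow</a></div>';

$doc = new DOMDocument();
$doc->loadHTML($markup);

$xpath = new DOMXPath($doc);
// First <a> directly inside a div whose class attribute is exactly "stackoverflow".
$anchor = $xpath->query('//div[@class="stackoverflow"]/a')->item(0);

$link = null;
$name = null;
if ($anchor instanceof DOMElement) {
    $link = $anchor->getAttribute('href'); // the link
    $name = trim($anchor->textContent);    // the text (name)
}

echo $link, "\n", $name, "\n";
```

The instanceof check keeps the script from failing when the div or the anchor is missing from the page.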
I'm trying to get Facebook's meta tags from my HTML.
I'm using simple html dom to get all html data from the site.
I've tried with preg_replace, but without luck.
I want for example to get the content of this fb meta tag:
<meta content="IMAGE URL" property="og:image" />
Hope someone can help! :-)
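I was going to suggest get_meta_tags(), but it doesn't seem to work (for me) :s, probably because it only picks up meta tags that have a name attribute, and og:image uses property instead.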
<?php
$tags = get_meta_tags('http://www.example.com/');
echo $tags['og:image'];
?>
But I would rather suggest using DOMDocument anyways:
<?php
$sites_html = file_get_contents('http://example.com');
$html = new DOMDocument();
@$html->loadHTML($sites_html);
$meta_og_img = null;
//Get all meta tags and loop through them.
foreach($html->getElementsByTagName('meta') as $meta) {
//If the property attribute of the meta tag is og:image
if($meta->getAttribute('property')=='og:image'){
//Assign the value from content attribute to $meta_og_img
$meta_og_img = $meta->getAttribute('content');
}
}
echo $meta_og_img;
?>
Hope it helps
This method gives you an array of key/value pairs for the Facebook Open Graph tags. Note that the pattern expects property to come before content inside each meta tag.
$url="http://fbcpictures.in";
$site_html= file_get_contents($url);
$matches=null;
preg_match_all('~<\s*meta\s+property="(og:[^"]+)"\s+content="([^"]*)~i', $site_html,$matches);
$ogtags=array();
for($i=0;$i<count($matches[1]);$i++)
{
$ogtags[$matches[1][$i]]=$matches[2][$i];
}
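Since the regex above depends on attribute order, here is a rough DOM-based sketch that collects every og: tag whichever attribute comes first. The helper name collectOgTags and the sample markup (including "Example title") are made up for illustration; only the og:image tag comes from the question.

```php
<?php
/**
 * Collect Open Graph meta tags from an HTML string as property => content pairs.
 * collectOgTags is a hypothetical helper name, not part of any library.
 */
function collectOgTags(string $html): array
{
    $doc = new DOMDocument();

    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $ogTags = [];
    foreach ($doc->getElementsByTagName('meta') as $meta) {
        $property = $meta->getAttribute('property');
        // Keep only tags whose property starts with "og:".
        if (strpos($property, 'og:') === 0) {
            $ogTags[$property] = $meta->getAttribute('content');
        }
    }
    return $ogTags;
}

// Assumed sample input; the second tag is invented to show both attribute orders.
$sample = '<html><head>'
    . '<meta content="IMAGE URL" property="og:image" />'
    . '<meta property="og:title" content="Example title" />'
    . '</head><body></body></html>';

print_r(collectOgTags($sample));
```

For a live page you would pass the result of file_get_contents($url) instead of $sample.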
I have created a PHP page that uses many divs with different ids.
I want to get the data or value from one of those divs.
Here is one div with its id:
<div id="tablename">tablename</div>
I have tried this, but it's not working:
$doc = new DomDocument();
$thediv = $doc->getElementById('tablename');
echo $thediv->textContent;
So please tell me, how can I get this value from my div?
You need to pass the whole content of your page to the class; otherwise it can't select anything, because it thinks the document is empty:
$content = '<div id="tablename">tablename</div>';
$doc = new DomDocument();
$doc->loadHTML($content); // That's the addition
$thediv = $doc->getElementById('tablename');
echo $thediv->textContent;
More info:
loadHTML(): Load the HTML from a string.
loadHTMLFile(): Load the HTML from a file.
Download and include PHP Simple HTML DOM Parser from https://sourceforge.net/projects/simplehtmldom/files/ and
try this:
include 'simple_html_dom.php';
$html = file_get_html("http://www.facebook.com");
$displaybody = $html->find('div[id=blueBarDOMInspector]', 0)->plaintext;
echo $displaybody;
exit;
UPDATE:
Yes, I am using PHP in my pages.
Hello friends, I was wondering: is there a way to add a <span> tag to the title without using JavaScript?
Maybe using regex or PHP or some other method. I don't really know.
Let me explain.
My HTML is like this:
<h3 class="title">The Title Goes Here</h3>
What I want is to automatically add a span tag, so that the final HTML looks like this:
<h3 class="title"><span>The </span>Title Goes Here</h3>
I want to wrap only the first word of the title in a <span> tag.
I know this can easily be done using JavaScript, but I am looking for a non-JavaScript solution.
Please help!
You can do this with DOMDocument in PHP if you don't want to do it with the javascript DOM:
$html = '<h3 class="title">The Title Goes Here</h3>';
$doc = new DOMDocument();
$doc->loadHTML($html);
$xp = new DOMXPath($doc);
foreach($xp->query('//h3[@class="title"]') as $parent) {
$title = $parent->nodeValue;
list($first, $rest) = explode(' ', $title, 2);
$span = new DOMElement('span', $first. ' ');
$parent->nodeValue = $rest;
$parent->insertBefore($span, $parent->firstChild);
}
foreach($doc->getElementsByTagName('body')->item(0)->childNodes as $node)
{
echo $doc->saveHTML($node);
}
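Since the question also mentions regex, here is a rough preg_replace sketch of the same idea. It assumes the opening tag is written exactly as <h3 class="title"> and that the title starts with plain text; for anything less regular, the DOM approach above is the safer choice.

```php
<?php
$html = '<h3 class="title">The Title Goes Here</h3>';

// Group 1: the opening tag, exactly as written in the question.
// Group 2: the first word (no whitespace, no "<") plus any whitespace after it.
$result = preg_replace(
    '~(<h3 class="title">)([^\s<]+\s*)~',
    '$1<span>$2</span>',
    $html
);

echo $result;
```

Excluding "<" from the first word keeps the pattern from swallowing the closing tag when the title is a single word.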
My answer is that this cannot be done. You can't manipulate a page in the browser without JavaScript. This can only be achieved by editing the page on the server manually, or by generating it dynamically using PHP logic or an equivalent solution, of which there are many.
If you are doing this for a corporate solution that is only used on a single corporate standard browser, you could look into building a plugin for the browser.
Folks,
I am using SIMPLEHTMLPARSER.
I am not able to parse the HTML. When I var_dump the HTML document, it just shows the DOM structure and no HTML content.
$produrl = 'http://wap.ebay.com/Pages/ViewItem.aspx?aid=160586179890&sv=160586179890/';
var_dump(file_get_html($produrl));
$html = file_get_html($produrl);
var_dump($html->find('div[id=Teaser_Item] img[src]', 0));
What I actually want to extract is the IMG SRC, which is:
http://wap.ebay.com/Pages/RbHttpHandler.ashx?width=51&height=240&fsize=999000&format=jpg&url=http%3A%2F%2Fi.ebayimg.com%2F00%2F%24%28KGrHqN%2C!jEE2n%28iTLozBNwBPG0bUg~~0_1.JPG%3Fset_id%3D8800005007
Can someone help me debug this, please?
Cheers
Natasha Thomas
<?php
require_once('simple_html_dom.php');
$produrl = 'http://wap.ebay.com/Pages/ViewItem.aspx?aid=160586179890&sv=160586179890/';
// Grab the document
$html = file_get_html($produrl);
// Find the img tag in the Teaser_Item div
$a = $html->find('div[id=Teaser_Item] img', 0);
// Display the src
echo($a->attr['src']);
?>