Php file_get_contents() issue - php

With PHP's file_get_contents() I want to get only the post and its image, but it fetches the whole page. (I know there are other ways to do this.)
Example:
$homepage = file_get_contents('http://www.bdnews24.com/details.php?cid=2&id=221107&hb=5',
true);
echo $homepage;
It shows the full page. Is there any way to show only the post identified by cid=2&id=221107&hb=5?
Thanks a lot.

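Use PHP's DOMDocument to parse the page. You can filter it further if you wish, but this is the general idea.
$url = 'http://www.bdnews24.com/details.php?cid=2&id=221107&hb=5';
// Suppress the warnings that malformed real-world HTML would otherwise raise
libxml_use_internal_errors(true);
// Create a new DOMDocument and load the remote page
$doc = new DOMDocument();
$doc->loadHTMLFile($url);
// Get the element that holds the post
$post = $doc->getElementById('opage_mid_left');
// Print only the post's HTML, if the element was found
if ($post !== null) {
    echo $doc->saveHTML($post);
}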
Update:
Unless the image is a requirement, I'd use the printer-friendly version: http://www.bdnews24.com/pdetails.php?id=221107, it's much cleaner.

You will need to parse the resulting HTML with a DOM parser to get the HTML of only the part you want. I like PHP Simple HTML DOM Parser, but as Paul pointed out, PHP also has its own.

You can extract the
<div id="page">
//POST AND IMAGE EXIST HERE
</div>
part from the fetched contents using a regex and output it on your page.
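A minimal sketch of that regex approach, assuming the target div has id "page" and contains no nested div elements. The sample markup and variable names are made up for illustration; in practice $html would come from file_get_contents().

```php
<?php
// Hypothetical page content standing in for the result of file_get_contents().
$html = '<html><body><div id="header">Menu</div>'
      . '<div id="page"><p>Post text</p><img src="photo.jpg"></div>'
      . '<div id="footer">Footer</div></body></html>';

// Non-greedy match from the opening <div id="page"> to the first closing </div>.
// Assumption: the target div holds no nested <div>, otherwise the match is cut short.
$post = '';
if (preg_match('/<div id="page"[^>]*>(.*?)<\/div>/is', $html, $m)) {
    $post = $m[1];
}

echo $post, PHP_EOL;
```

Because the match stops at the first closing div tag, a DOM parser is the safer choice as soon as the post contains nested markup.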

Related

How to get desire innertext from html tag in simple html dom

I have some text that contains HTML code. I want to get the text of the last link. Here is an example:
Some textBeezfeed.cu.ma<br>
another textGoogle.com<br>
I want to get the Google.com text from the above code. I have tried using Simple HTML DOM. Here is my code:
<?PHP
require_once('simple_html_dom.php');
$html = new simple_html_dom();
function tags($ddd){
$bbb=$ddd->find('a',1);
foreach($bbb as $bs){
echo $bs->innertext;
}
}
$html = str_get_html('Some textBeezfeed.cu.ma<br>
another textGoogle.com<br>');
echo tags($html);
?>
I want to get Google.com. How can I do that? Please help me.
I strongly recommend using an external library to parse HTML, whatever HTML you need to handle now or in the future.
Some very good tools are named in these Stack Overflow posts.
I have personally used simplehtmldom.sourceforge.net for ages with very good results.
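As a sketch of what such a parser gives you, here is one way to read the text of the last link with PHP's built-in DOMDocument. The href values in the sample string are hypothetical, since the question's example does not show them.

```php
<?php
// Assumed input: the two links from the question, with made-up href values.
$html = 'Some text<a href="http://beezfeed.cu.ma">Beezfeed.cu.ma</a><br>'
      . 'another text<a href="http://google.com">Google.com</a><br>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();

// Collect every <a> element and take the text of the last one.
$links = $doc->getElementsByTagName('a');
$lastText = '';
if ($links->length > 0) {
    $lastText = $links->item($links->length - 1)->textContent;
}

echo $lastText, PHP_EOL;
```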

Get all HTML comments on website with html simple dom

I've tried to grab all the comments from a website (The text between <!-- and -->), but without luck.
Here is my current code:
include('simple_html_dom.php');
$html = file_get_html('THE URL');
foreach($html->find('comment') as $element)
echo $element->plaintext;
Does anyone have any ideas on how to grab the comments? At the moment it only gives me a blank page.
I know regex is not supposed to be used to parse HTML, but you can use a regex similar to <!--(.*?)--> to find and fetch the comments. Add the s modifier so that the dot also matches newlines inside multi-line comments.
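If you would rather avoid a regex, a possible alternative is PHP's built-in DOM extension with the XPath comment() node test. The following is only a sketch: the sample markup is invented, and for a live page you would load the document with loadHTMLFile() instead.

```php
<?php
// Hypothetical markup containing two HTML comments.
$html = '<html><body><!-- first note --><p>Text</p><!-- second note --></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();

// The XPath node test comment() selects every comment node in the document.
$xpath = new DOMXPath($doc);
$comments = array();
foreach ($xpath->query('//comment()') as $node) {
    $comments[] = trim($node->nodeValue);
}

print_r($comments);
```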

Get contents of a div from a URL [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
How to implement a web scraper in PHP?
How to parse and process HTML with PHP?
I need to crawl through a page and get the contents of a particular div. I have php and javascript as my two main options. How can it be done?
There are many ways to get the contents of a URL:
First Method:
http://simplehtmldom.sourceforge.net/
Simple HTML DOM Parser
Second Method:
<?php
$contents = file_get_contents("http://www.url.com");
// Strip every tag except <div>
$contents = strip_tags($contents, "<div>");
// Capture the text of each <div> that contains no nested tags
preg_match_all("/<div[^>]*>([^<]*)<\/div>/is", $contents, $file_contents);
?>
Third Method:
You can use jQuery-like selectors:
http://api.jquery.com/category/selectors/
This is quite a basic way to do it in PHP, and it returns the content as plain text. However, you might need to revise the regex for your particular needs.
<?php
// Fetch the page and strip every tag except <div>
$link = file_get_contents("http://www.domain.com");
$file = strip_tags($link, "<div>");
// Match each <div> that contains only text; $content[1] holds the captured text
preg_match_all("/<div[^>]*>([^<]*)<\/div>/is", $file, $content);
print_r($content);
?>
You can use Simple HTML DOM Parser, as documented at http://simplehtmldom.sourceforge.net/manual.htm
It requires PHP 5+, but the nice thing is that you can find tags on an HTML page with selectors, just like in jQuery.
Specifically with jQuery, if you have a div like the following:
<div id="cool_div">Some content here</div>
You could use jQuery to get the contents of the div like this:
$('#cool_div').text(); // will return text version of contents...
$('#cool_div').html(); // will return HTML version of contents...
If you're using PHP to generate the content of the page, then you should be able to get a decent handle on the content and manipulate it even before it's returned to the screen and displayed. Hope this helps!
Using PHP, you can try the DOMDocument class and its getElementById() or getElementsByTagName() methods.
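A minimal sketch of the DOMDocument route, reusing the cool_div id from the jQuery example. The sample markup is invented; for a remote page you would call loadHTMLFile() with the URL instead of loadHTML().

```php
<?php
// Hypothetical page markup; the id "cool_div" is taken from the jQuery example.
$html = '<html><body><div id="cool_div">Some <b>content</b> here</div></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();

$text  = '';
$inner = '';
$div = $doc->getElementById('cool_div');
if ($div !== null) {
    // Text-only version of the contents, comparable to jQuery's .text()
    $text = $div->textContent;
    // HTML version of the contents, comparable to jQuery's .html()
    foreach ($div->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
}

echo $text, PHP_EOL;
echo $inner, PHP_EOL;
```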

get contents from another page using php?

I'm not sure whether this is possible.
I want a PHP script that, when executed, fetches a page on a different domain, reads its HTML, and extracts the href of each link inside it.
html code:
<div id="somediv">
<a href="http://yahoo.com">Yahoo</a>
<a href="http://google.com">Google</a>
<a href="http://facebook.com">Facebook</a>
</div>
The output (which PHP will echo) will be:
http://yahoo.com
http://google.com
http://facebook.com
I have heard that cURL in PHP can do something like this, but not exactly this. I'm a bit confused and hope someone can guide me.
Thanks.
Have a look at something like http://simplehtmldom.sourceforge.net/
Using DOM and XPath:
<?php
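$doc = new DOMDocument();
libxml_use_internal_errors(true); // suppress warnings from malformed HTML
$doc->loadHTMLFile("http://www.example.com/"); // or you could load from a string using loadHTML();
$xpath = new DOMXPath($doc);
// Select every <a> inside the div whose id attribute is "somediv"
$elements = $xpath->query("//div[@id='somediv']//a");
foreach ($elements as $elem) {
    // Print each link's href on its own line
    echo $elem->getAttribute('href'), PHP_EOL;
}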
BTW: you should read up on DOM and XPath.

How to write this crawler in php?

I need to create a php script.
The idea is very simple:
When I send the link of a blog post to this PHP script, the web page is crawled and the first image and the page title are saved on my server.
Which PHP functions do I have to use for this crawler?
Use PHP Simple HTML DOM Parser
// Load the Simple HTML DOM library, which provides file_get_html()
require_once 'simple_html_dom.php';
// Create DOM from URL
$html = file_get_html('http://www.example.com/');
// Find all images
$images = array();
foreach ($html->find('img') as $element) {
    $images[] = $element->src;
}
Now the $images array holds the image links of the given web page, and you can store the image you want in a database.
HTML Parser: HTMLSQL
Features: you can load an external HTML file from an HTTP or FTP link and parse its content.
Well, you'll have to use quite a few functions :)
But I'm going to assume that you're asking specifically about finding the image, and say that you should use a DOM parser like Simple HTML DOM Parser, then curl to grab the src of the first img element.
I would use file_get_contents() and a regular expression to extract the first image tag's src attribute.
cURL or an HTML parser seems like overkill in this case, but you are welcome to check them out.
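For comparison, here is a rough sketch of the parser-based route using PHP's built-in DOMDocument to pull out the page title and the first image's src. The sample markup and paths are invented; a real crawler would load the blog post with loadHTMLFile() and then download the image, for example with file_get_contents() and file_put_contents().

```php
<?php
// Hypothetical blog post markup standing in for the fetched page.
$html = '<html><head><title>My Blog Post</title></head>'
      . '<body><img src="/images/first.jpg"><img src="/images/second.jpg"></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();

// Page title: text of the first <title> element, if any.
$titleNodes = $doc->getElementsByTagName('title');
$title = $titleNodes->length > 0 ? trim($titleNodes->item(0)->textContent) : '';

// First image: src attribute of the first <img> element, if any.
$imgNodes = $doc->getElementsByTagName('img');
$firstImage = $imgNodes->length > 0 ? $imgNodes->item(0)->getAttribute('src') : '';

echo $title, PHP_EOL;
echo $firstImage, PHP_EOL;
```

Note that a src value may be relative, as in this sample, so it would need to be resolved against the page URL before downloading.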
