Getting HTML data from php page


I have a URL like https://facebook.com/5 and I want to get the HTML of that page, just like View Source.
I tried using file_get_contents, but it didn't return the correct content.
Am I missing something?
Is there any other function I can use?
If I can't get the HTML of that page, what did the developer do when coding the site to prevent this?

Warning: this may be off topic.
Does this task have to be done using PHP?
Since this sounds like web scraping, I think you would get more use out of casperjs.
With it, you can target precisely what you want to retrieve from the fb page rather than grabbing the whole content, which I assume, as of this writing, is generated by multiple requests and rendered through a virtual DOM.
Please note that I haven't tried retrieving content from Facebook, but I've done this with multiple other services.
Good luck!

You may want to use curl instead: http://php.net/manual/en/curl.examples.php
Edit:
Here is an example of mine:
$url = 'https://facebook.com/5';
$ssl = true;
$ch = curl_init();
$timeout = 3;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, $ssl);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$data = curl_exec($ch);
curl_close($ch);
Note that, depending on the website's vhost configuration, a slash at the end of the URL can make a difference.
Edit: Sorry for the undefined variable. I copied it out of a helper method I used. It should be all right now.
Yet another edit:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
By adding this option you will follow the redirects that are apparently happening in your example. Since you said it was an example, I didn't actually run it before. Now I did, and it works.
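The options discussed in this answer can be collected into one small helper. This is only a sketch: the function name fetch_html, the user-agent string, the redirect limit and the timeout values are assumptions made for illustration, not something the original answer prescribes. It also checks for transport errors and HTTP error statuses, which helps explain why a fetch "didn't return the correct content".

```php
<?php
/**
 * Fetch a page's HTML with cURL, following redirects.
 * Hypothetical helper: the name, defaults and user agent are illustrative.
 * Returns the response body, or throws on a transport error or HTTP status >= 400.
 */
function fetch_html(string $url, int $timeoutSeconds = 10): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,            // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,            // follow 3xx redirects
        CURLOPT_MAXREDIRS      => 5,               // assumed limit, avoids redirect loops
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
        // Some sites answer differently when no user agent is sent (assumed value).
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; ExampleFetcher/1.0)',
    ]);

    $body   = curl_exec($ch);
    $error  = curl_error($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // The handle is released when $ch goes out of scope.

    if ($body === false) {
        throw new RuntimeException("cURL error: $error");
    }
    if ($status >= 400) {
        throw new RuntimeException("HTTP status $status for $url");
    }
    return $body;
}
```

Calling fetch_html('https://facebook.com/5') inside a try/catch block would then either return the markup or report why the request failed. Pages that build their content with JavaScript will still only yield the initial HTML.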

Related

What am I doing wrong here (CURL), no matter what I try it returns empty/null

$url = "http://www.reddit.com/r/{mysubreddit}/new.json";
$fields = "sort=new";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $fields);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$data = curl_exec($ch);
curl_close($ch);
var_dump($data);
{mysubreddit} is whatever subreddit I want to check. Fetching that URL works fine in Postman, or even in the browser, but when I use PHP/cURL it returns empty. I've tried replacing the URL with a URL from another site, and that works fine, so the cURL part itself is working.
Is there something I have to set for reddit? Headers? Do I have to ask for JSON explicitly? Or what?
I thought it might have to do with POST, but I tried GET too, and it's still empty/null.
$url = "http://www.reddit.com/r/{mysubreddit}/new.json?sort=new";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$data = curl_exec($ch);
curl_close($ch);
That doesn't work either.

You just need to add:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

As others have mentioned, reddit is sending you a 302 redirect to https. You would be able to see that by examining the headers returned by curl_getinfo().
Enabling redirect following, as sorak describes, will work. However, it's not a good solution - you will make two HTTP requests on every single API call. This is a completely unnecessary waste of network and increases the execution time of your script. Instead, just change the url that you're requesting to be from https://www.reddit.com/ in the first place.
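A minimal sketch of the suggestion above: rewrite the scheme before making the request, so no redirect is needed. The helper name to_https and the subreddit "php" are placeholders chosen for illustration.

```php
<?php
/**
 * Return the URL with a leading "http://" replaced by "https://".
 * Hypothetical helper; URLs that already use another scheme are returned unchanged.
 */
function to_https(string $url): string
{
    // Only the scheme at the very start of the string is touched.
    return preg_replace('~^http://~i', 'https://', $url) ?? $url;
}

// Placeholder subreddit, used only for the example.
$url = to_https('http://www.reddit.com/r/php/new.json?sort=new');
echo $url, PHP_EOL;
```

The resulting $url can be passed to curl_init() exactly as in the question, and CURLOPT_FOLLOWLOCATION is then no longer required for this particular redirect.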

Execute a php function as background process through url using curl command

I have a function for converting files and I want to run it as a background process using curl.
$url = 'sendMessages.php';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
$curled=curl_exec($ch);
curl_close($ch);
I used the above code, but it didn't work. Please advise.
Do I have to use the full URL?

Yes, the URL is expected to be a full URL, which may look something like
$url = 'http://localhost/file/convertfiles.php';
The actual URL depends on your setup.
You will also probably run into timeout issues. These can be overcome, but I don't know enough about what you are trying to do or about your setup to tell you how.
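As a rough sketch of both points, the snippet below builds a full URL from a base and a script name, then fires the request with a short timeout so the caller does not wait for the conversion to finish. The function names, the base URL and the 500 ms timeout are assumptions. Whether the target script keeps running after the caller disconnects depends on the server setup; calling ignore_user_abort(true) at the top of the target script is one common way to ask PHP to continue.

```php
<?php
/**
 * Join a base URL and a script path into a full URL.
 * Hypothetical helper; it only normalises the slash between the two parts.
 */
function absolute_url(string $baseUrl, string $script): string
{
    return rtrim($baseUrl, '/') . '/' . ltrim($script, '/');
}

/**
 * Start a request and stop waiting after $timeoutMs milliseconds.
 * The response, if any, is ignored. Hypothetical helper.
 */
function trigger_in_background(string $url, int $timeoutMs = 500): void
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,       // do not print the response
        CURLOPT_NOSIGNAL       => true,       // needed for sub-second timeouts on some systems
        CURLOPT_TIMEOUT_MS     => $timeoutMs, // give up waiting after this long
    ]);
    curl_exec($ch); // a timeout here is expected and not treated as an error
}

// Assumed base URL, taken from the example in the answer above.
$url = absolute_url('http://localhost/file/', 'convertfiles.php');
echo $url, PHP_EOL;
```

Calling trigger_in_background($url) would then return after at most roughly half a second, regardless of how long the conversion takes.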

Set up a php proxy to access censored websites and bypass firewall

I'm currently using this plugin http://wordpress.org/extend/plugins/repress/ which basically makes my website a proxy so that users can access censored websites like this
www.mywebsite.com/proxy/www.cnn.com
The plugin works well enough, but it doesn't parse absolute links properly, so those links are still blocked. Development of the plugin has stopped, so I need to write my own script. I've been searching everywhere and reading the tutorials I can find, but none of them helps with this specifically.
I know how to use php curl to fetch a website and echo it on a blank page. What I don't know is how to set a proxy script to work like the above example where users can type
www.mywebsite.com
followed by
/proxy.php
then their target website
/www.cnn.com
Currently I have this set up:
<?php
$url = 'http://www.cnn.com';
$proxy_port = 80;
$proxy = '92.105.140.115';
$timeout = 0; // 0 means wait indefinitely for the connection
$referer = 'http://www.mydomain.com';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 1); // include the response headers in the output
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response so it can be echoed below
curl_setopt($ch, CURLOPT_PROXYPORT, $proxy_port);
curl_setopt($ch, CURLOPT_PROXY, $proxy);
curl_setopt($ch, CURLOPT_PROXYTYPE, CURLPROXY_HTTP); // expects a constant, not the string 'HTTP'
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 0);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_REFERER, $referer);
$data = curl_exec($ch);
curl_close($ch);
echo $data;
?>
This pulls the home page but no css or images are retrieved. Likewise all relative links are broken. I have no idea how to apply the proxy_port and proxy variables. I tried
92.105.140.115:80/www.cnn.com
but that doesn't work. I don't quite fully understand this code either since I found it on an example site.
Any answer or links for tutorials is greatly welcome.
Thank You!

To have a completely functioning proxy isn't that simple. There are many such projects already available. Give any a shot:
http://www.surrogafier.info/
https://github.com/Alexxz/Simple-php-proxy-script
http://www.glype.com/
Have fun!

You can't just echo the result of a cURL fetch into a page, because the browser will resolve the URIs against the wrong site. When the user clicks a link, they need to go to your proxy site, not to the original site. So instead of simply printing the page with echo, you need to rewrite every link in the fetched page before sending it to the user. I have a fully functional proxy written in PHP at p.listascuba.com; you can try it.
Contact me for more info.
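The link rewriting described above could look roughly like this. It is a simplified sketch, not a complete proxy: the PROXY_PREFIX value (a proxy.php script taking a url query parameter) and both function names are assumptions, "../" segments are not normalised, and URLs inside CSS or JavaScript are not touched.

```php
<?php
// Hypothetical proxy entry point; replace with your own script and parameter.
const PROXY_PREFIX = 'http://www.mywebsite.com/proxy.php?url=';

/** Turn a link found in a page into an absolute URL (simplified resolver). */
function resolve_url(string $base, string $link): string
{
    if (preg_match('~^https?://~i', $link)) {
        return $link;                               // already absolute
    }
    $parts  = parse_url($base);
    $scheme = $parts['scheme'] ?? 'http';
    if (strncmp($link, '//', 2) === 0) {
        return $scheme . ':' . $link;               // protocol-relative link
    }
    $origin = $scheme . '://' . ($parts['host'] ?? '');
    if (isset($parts['port'])) {
        $origin .= ':' . $parts['port'];
    }
    if (strncmp($link, '/', 1) === 0) {
        return $origin . $link;                     // root-relative link
    }
    // Relative to the directory of the base path.
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, (int) strrpos($path, '/') + 1);
    return $origin . $dir . $link;
}

/** Rewrite href/src attributes so that they point back at the proxy. */
function rewrite_links(string $html, string $baseUrl): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true);   // real-world HTML is rarely valid
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $targets = ['a' => 'href', 'link' => 'href', 'img' => 'src', 'script' => 'src'];
    foreach ($targets as $tag => $attr) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            $value = $node->getAttribute($attr);
            // Skip empty values, in-page anchors and non-HTTP schemes.
            if ($value === '' || preg_match('~^(#|javascript:|mailto:|data:)~i', $value)) {
                continue;
            }
            $absolute = resolve_url($baseUrl, $value);
            $node->setAttribute($attr, PROXY_PREFIX . rawurlencode($absolute));
        }
    }
    return $doc->saveHTML();
}
```

In the proxy script, the fetched page would be passed through rewrite_links($data, $url) before being echoed, and proxy.php would read the target from $_GET['url'].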

Make a curl request to a url having no file extension?

I have following URL
http://www.davesinclairstpeters.com/auto2_inventorylist?i=37647&c=12452&npg=1&ns=50&echo=2
I want to retrieve the content of this URL using cURL, but every time I make the request it shows me an error, as if the required parameters were not being passed.
Below is my code:
$ch = curl_init(); // start CURL
curl_setopt($ch, CURLOPT_URL, $json_url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPGET, true);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, true);
$response = curl_exec($ch);

That page doesn't give any indication that the parameters aren't being passed properly. In fact, it tells you that they have been received. By viewing the source, you can see:
<!--
javax.servlet.forward.request_uri = /auto2_inventorylist
...
javax.servlet.forward.servlet_path = /auto2_inventorylist
...
javax.servlet.forward.query_string = i=37647&c=12452&npg=1&ns=50&echo=2
-->
This tells you that the parameters have in fact been received.
Therefore, the problem is not with your code but with the website itself. You should make sure the URL you are using is valid, or contact the website to get more information.
With regard to your code itself: curl_setopt($ch, CURLOPT_HTTPGET, true); isn't necessary, as GET is already the default, and you can also pass the URL as an argument to curl_init. This doesn't affect performance, but it makes for neater code.
$ch = curl_init($json_url); // start CURL
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, true);
$response = curl_exec($ch);
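If there is any doubt about whether the query string is well formed, one way to rule that out is to build it with http_build_query instead of writing it by hand. This is a small sketch; the helper name build_url is an assumption, and the parameter values are simply the ones from the URL in the question.

```php
<?php
/**
 * Append an encoded query string to an endpoint.
 * Hypothetical helper; "&" is passed explicitly so the result does not
 * depend on the arg_separator.output ini setting.
 */
function build_url(string $endpoint, array $params): string
{
    return $endpoint . '?' . http_build_query($params, '', '&');
}

$json_url = build_url('http://www.davesinclairstpeters.com/auto2_inventorylist', [
    'i'    => 37647,
    'c'    => 12452,
    'npg'  => 1,
    'ns'   => 50,
    'echo' => 2,
]);
echo $json_url, PHP_EOL;
```

The resulting $json_url can be passed straight to curl_init($json_url) as in the code above.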

Your code is perfectly fine. If something wrong is returned, simply paste the URL into your web browser and check the result. In this case the website itself failed for some reason. There's nothing you can do about that, as the problem is NOT on your side.

The URL below yields a page of cars with links to more cars. It looks like the URL you're starting with is old, or has some sort of expiration factor that's not obvious.
Without knowing which filtering parameters you're aiming for, it's hard to say what else may be wrong, other than your starting URL being bad.
Working URL:
http://www.davesinclairlincolnstpeters.com/all-inventory/index.htm?listingConfigId=auto-new%2Cauto-used&compositeType=&year=&make=&start=0&sort=&facetbrowse=true&quick=true&preserveSelectsOnBack=true&searchLinkText=SEARCH&showInvTotals=false&showRadius=false&showReset=true&showSubmit=true&facetbrowseGridUnit=BLANK&showSelections=true&dependencies=model%3Amake%2Ccity%3Aprovince%2Ccity%3Astate&suppressAllConditions=false

Can't get this content spinning API to work with PHP

Can you help me get this content spinning API working? It was written to work with C#, but is there a way I can get it to work using PHP? I've been trying to post to the URL stated on that page using cURL, but all I'm getting back is a blank page. Here's the code I'm using:
$url = "http://api.spinnerchief.com/apikey=YourAPIKey&username=YourUsername&password=YourPassword";
// Some content to POST
$post_fields = "SpinnerChief is totally free. Not a 'lite' or 'trial' version, but a 100% full version and totally free. SpinnerChief may be free, but it is also one of the most powerful content creation software tools available. It's huge user-defined thesaurus means that its sysnonyms are words that YOU would normally use in real life, not some stuffy dictionary definition. And there's more...SpinnerChief can only get better, because we listen to our users, We take onboard your ideas and suggestions, and we update SpinnerChief so that it becomes the software YOU want!";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PORT, 9001);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($ch);
curl_close($ch);
echo $result;
Can anyone see something wrong I'm doing? Thanks a lot for the help.

The value for CURLOPT_POST should be 1, and the posted data should be set with CURLOPT_POSTFIELDS.
