Using PHP and cURL, I'd like to check whether I can log in to a website using the provided user credentials. To do that, I currently retrieve the entire page and then use a regex to look for keywords that might indicate the login didn't work.
The URL itself contains the string "errormessage" if a wrong username/password has been entered. Is it possible to use cURL to get only the final URL, without the contents, to speed things up?
Here's my curl PHP code:
function curl_get_request($referer, $submit_url, $ch)
{
    global $cookie_path;
    // sends a request via curl to the string specifics listed
    $agent = "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)";
    curl_setopt($ch, CURLOPT_URL, $submit_url);
    curl_setopt($ch, CURLOPT_USERAGENT, $agent);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_REFERER, $referer);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_path);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_path);
    return curl_exec($ch);
}
Also, if somebody has a better idea on how to handle a problem like this, please let me know!
What you should do is check the URL each time there is a redirect. Most redirects are going to be done with the proper HTTP headers. If that is the case, see this answer:
PHP: cURL and keep track of all redirections
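Basically, turn off automatic redirection following and check the HTTP status code for a redirect such as 301 or 302 (303, 307 and 308 are redirects as well). If you get one of those, you can read the target from the Location header and either continue to follow the redirection if needed, or exit from there.
If, instead, the redirection is happening client side (for example through a meta refresh tag or JavaScript), you will have to parse the page with a DOM parser.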
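A minimal sketch of that approach, assuming the login is a plain request that answers with a server-side redirect. The helper names (`find_location_header`, `login_failed`, `curl_get_redirect_target`) are made up for illustration and are not part of cURL. With CURLOPT_FOLLOWLOCATION switched off, only the redirect response is downloaded, not the page it points to.

```php
<?php
// Hypothetical helper: return the value of the Location header
// from a raw header block, or null if there is none.
function find_location_header(string $rawHeaders): ?string
{
    if (preg_match('/^Location:[ \t]*(\S.*?)[ \t\r]*$/mi', $rawHeaders, $m)) {
        return $m[1];
    }
    return null;
}

// Hypothetical helper: the question says a failed login ends up on a URL
// containing "errormessage".
function login_failed(string $url): bool
{
    return strpos($url, 'errormessage') !== false;
}

// Hypothetical helper: perform the request on an already configured handle
// without following redirects and return the redirect target, if any.
function curl_get_redirect_target($ch, string $submitUrl): ?string
{
    curl_setopt($ch, CURLOPT_URL, $submitUrl);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, true);           // include headers in the result
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);  // stop at the first response

    $response = curl_exec($ch);
    $status   = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    if ($response === false || !in_array($status, [301, 302, 303, 307, 308], true)) {
        return null; // no redirect happened (or the request failed)
    }
    return find_location_header($response);
}
```

The Location value may be relative, so it is only compared against the "errormessage" marker here, for example `login_failed($target)`, rather than requested again as-is.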
I'm opening a page as a logged-in user, and it kind of seems to work, except the website has some sort of protection system. If I do this in a browser, I get the page I want, but if I do it with cURL, I get 'Welcome back user (userid)' and a link to the page I requested. Once I click the link, I get where I want to be. I tried faking the referer and checking the data that gets sent to the page, but there's nothing special there. When I click the link, I simply get redirected to the page I wanted in the first place. My question is why this code doesn't get me there as well:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL , "http://www.site.com/sell/index");
curl_setopt($ch, CURLOPT_REFERER, 'http://www.site.com');
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.A.B.C Safari/525.13");
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_FAILONERROR, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, false);
curl_setopt($ch, CURLOPT_POST, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
$response = curl_exec($ch);
curl_close($ch);
echo $response;
Just before I do this, I perform the login procedure and grab the cookie. I do get to open the page as a logged-in user; I just can't seem to access it without clicking the link.
PS. The same thing happens if I log in, open the page I want, close the browser and open it again. So I'm thinking it has to do with the referer?
CURLOPT_COOKIEJAR only tells cURL where to save the cookies it receives in the response. That's why it is not working for you. Use CURLOPT_COOKIEFILE as well, so that cURL sends the stored cookies with the request:
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
Also, use an absolute path (for example /var/tmp/cookie.txt) instead of a relative path.
Now, Be Happy!
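As a rough sketch of how the two options fit together, assuming the login request and the later page request share one cookie file. The helper names `cookie_file_path` and `fetch_with_cookies` are invented for this example. CURLOPT_COOKIESESSION from the question is left out on purpose, since it asks libcurl to ignore session cookies loaded from the file.

```php
<?php
// Hypothetical helper: build an absolute path for the cookie file
// inside the system temp directory.
function cookie_file_path(string $name): string
{
    return rtrim(sys_get_temp_dir(), '/\\') . DIRECTORY_SEPARATOR . $name;
}

// Hypothetical helper: fetch a URL, sending stored cookies and saving new ones.
function fetch_with_cookies(string $url, string $cookiePath)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookiePath); // cookies to send
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookiePath);  // where received cookies are written
    // The jar is written when the handle is released at the end of this function.
    return curl_exec($ch);
}

$cookiePath = cookie_file_path('cookie.txt');
// After the login request has used the same $cookiePath:
// $response = fetch_with_cookies('http://www.site.com/sell/index', $cookiePath);
```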
There's a webpage that I need to log in to. I used cURL with POST to log in, but it's not enough. When you log in from the website, the POST also includes a string that is always changing. Is there a way to get around that?
I use this:
$ch = curl_init();
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; he-IL; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6");
curl_setopt ($ch, CURLOPT_REFERER, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_POST, TRUE);
$post = "username=$username&password=$password";
curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
$result = curl_exec($ch);
$info = curl_getinfo($ch);
curl_close($ch);
It's like I need the code to actually go to the webpage and fill the form regularly.
I looked everywhere but all I could find was using post data.
Thanks!
To get past that, you need to visit the page with the form, grab the value of that field, and then include it in the POST request when you submit the form.
I suggest you visit the form page not only for that, but also for the following reasons (some of which sites can use to detect automated requests):
You receive cookies
You don't fake the referrer, because you actually visited the page
You can check the form fields to see if any new ones have been added since you wrote the script. If the form setup changes and you don't adapt to it, your script might stop working one day
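The steps above can be sketched roughly as follows, assuming the changing string lives in a hidden input of the login form (the question does not say where it is). The helper names `extract_hidden_fields` and `build_login_post` are made up for this example; the `username` and `password` field names come from the question's POST string.

```php
<?php
// Hypothetical helper: collect name => value for every hidden input in the HTML.
function extract_hidden_fields(string $html): array
{
    $doc = new DOMDocument();
    // Real pages are rarely valid HTML, so keep parser warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fields = [];
    foreach ($doc->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '' && strtolower($input->getAttribute('type')) === 'hidden') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}

// Hypothetical helper: combine the hidden fields with the credentials
// into a URL-encoded POST body.
function build_login_post(string $formHtml, string $username, string $password): string
{
    $fields = extract_hidden_fields($formHtml);
    $fields['username'] = $username;
    $fields['password'] = $password;
    return http_build_query($fields);
}
```

In use, the form page would first be fetched with a GET request (with a cookie file set, so the session cookie is kept), and the string returned by `build_login_post()` would then be passed to CURLOPT_POSTFIELDS for the login request.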
I am trying to retrieve a web page from the following url:
http://www.medicare.gov/find-a-doctor/provider-results.aspx?searchtype=OHP&specgrpids=922&loc=43615&pref=No&gender=Unknown&dist=25&lat=41.65603&lng=-83.66676
It works when I paste it into a browser, but when I run it through cURL, I receive a page with the following error: "One or more query string parameters of requested url are invalid or has unexpected value, please correct and retry."
It doesn't seem to make a difference if I provide a different userAgent or referrer. There is a redirect, so I use CURLOPT_FOLLOWLOCATION.
Here is my code:
$ch = curl_init($page);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 5.1; rv:12.0) Gecko/20100101 Firefox/12.0');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
$html = curl_exec($ch);
curl_close($ch);
echo $html;
Any thoughts on why a request like this will work in the browser and not with cURL?
Your browser is sending cookies that cURL is not. Check the cookies you are sending to the site using your browser's developer tools or Fiddler; you'll need to pass the same ones.
The problem was with cookies. This particular site needed an ASP.NET_SessionId cookie set in order to respond. I added the following to my cURL request:
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookie.txt');
curl_setopt($ch, CURLOPT_COOKIE, 'ASP.NET_SessionId=ho1pqwa0nb3ys3441alenm45; path=/; domain=www.medicare.gov');
I don't know if any session ID will work, but I tried a couple of random ones and they all worked.
I am using PHP cURL to fetch XML output from a URL. Here is what my code looks like:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.mydomain.com?querystring');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
curl_setopt($ch, CURLOPT_USERPWD, "username:password");
$store = curl_exec($ch);
echo $store;
curl_close($ch);
But, instead of returning the XML it just shows my 404 error page. If I type the URL http://www.mydomain.com?querystring in the web browser I can see the XML in the browser.
What am I missing here? :(
Thanks.
Some website owners check for the existence of certain things to make sure the request comes from a web browser and not a bot (or cURL). You should try adding curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)'); and see if that fixes the problem. That will send a user-agent string. The site may also check for the existence of cookies or other things.
To output the XML in a web-page, you'll need to use htmlentities(). You might want to wrap it inside a HTML <pre> element as well.
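A small sketch of that last point, assuming the fetched XML is UTF-8 text. The function name `xml_for_display` is invented for this example.

```php
<?php
// Hypothetical helper: escape raw XML so the browser shows the markup
// instead of trying to interpret it, and wrap it in <pre> to keep the layout.
function xml_for_display(string $xml): string
{
    return '<pre>' . htmlentities($xml, ENT_QUOTES, 'UTF-8') . '</pre>';
}

// In the question's code this would replace "echo $store;".
echo xml_for_display('<root><name>Test</name></root>');
```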
I want to access https://graph.facebook.com/19165649929?fields=name (obviously it's also accessible with "http") with cURL to get the file's content. More specifically, I need the "name" (it's JSON).
Since allow_url_fopen is disabled on my web server, I can't use file_get_contents! So I tried it this way:
<?php
$page = 'http://graph.facebook.com/19165649929?fields=name';
$ch = curl_init();
//$useragent="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1";
//curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
curl_setopt($ch, CURLOPT_URL, $page);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_exec($ch);
curl_close($ch);
?>
With that code I get a blank page! When I use another page, like http://www.google.com it works like a charm (I get the page's content). I guess facebook is checking something I don't know... What can it be? How can I make the code work? Thanks!
Did you double-post this here?
php: Get html source code with cURL
However, in the thread above we found that your problem was being unable to resolve the host, and this was the solution:
//$url = "https://graph.facebook.com/19165649929?fields=name";
$url = "https://66.220.146.224/19165649929?fields=name";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Host: graph.facebook.com'));
$output = curl_exec($ch);
curl_close($ch);
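A variation on the same workaround, offered as a sketch rather than a drop-in fix: instead of putting the IP in the URL and switching certificate verification off, cURL's CURLOPT_RESOLVE option can pin the hostname to an address while the URL keeps the real hostname. The helper names `resolve_entry` and `fetch_with_pinned_ip` are made up, and the IP address is simply the one quoted above, which may no longer be correct.

```php
<?php
// Hypothetical helper: build one CURLOPT_RESOLVE entry in "host:port:address" form.
function resolve_entry(string $host, int $port, string $ip): string
{
    return $host . ':' . $port . ':' . $ip;
}

// Hypothetical helper: fetch an HTTPS URL while forcing the hostname
// to resolve to a fixed IP address.
function fetch_with_pinned_ip(string $url, string $host, string $ip)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // The hostname stays in the URL, so certificate checks still apply to it.
    curl_setopt($ch, CURLOPT_RESOLVE, [resolve_entry($host, 443, $ip)]);
    return curl_exec($ch);
}

// $output = fetch_with_pinned_ip(
//     'https://graph.facebook.com/19165649929?fields=name',
//     'graph.facebook.com',
//     '66.220.146.224' // address taken from the answer above; assumed, may be stale
// );
```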
Note that the Facebook Graph API requires authentication before you can view any of these pages.
You basically have two options for this: log in either as an application (which you have registered beforehand) or as a user. See the API documentation to find out how this works.
My recommendation for you is to use the official PHP-SDK. You'll find it here. It does all the session and cURL magic for you and is very easy to use. Take the examples which are included in the package and start to experiment.
Good luck.