I'm trying to write a script that runs through some sites I visit every day and pulls the most interesting info/statistics from them. I wanted to use curl for this because some of these sites require authentication. Everything was OK until I ran into rossnet.pl, which seems to be secured somehow, because I can't authenticate at all.
The form that I want to use can be found here:
https://www.rossnet.pl/rossnetlogin.aspx
On the left, under the text: "Mam konto w Rossnet.pl - Logowanie". It doesn't seem to have any hidden input fields, only two text fields for credentials, called:
- "dnn$ctr1203$ViewLogin$txtUserLogin"
- "dnn$ctr1203$ViewLogin$txtUserPass"
I'm using the code shown below, but the page returned by the server looks as if nothing happened at all (no error messages; it looks the same as when I don't send any POST data).
Does anyone have a clue about what may be wrong? In the code below I put in actual account credentials for you to be able to test the script if you wish to help me.
Here you can see how the script below works on my server:
http://kremuwa.netii.net/rossman/skrypt.php
<?php
$url = "https://www.rossnet.pl/rossnetlogin.aspx";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookie.txt');
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101206 Ubuntu/10.10 (maverick) Firefox/3.6.13');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_REFERER, $url);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, 'dnn$ctr1203$ViewLogin$txtUserLogin=warzywko3000&dnn$ctr1203$ViewLogin$txtUserPass=password123');
$output = curl_exec($ch);
curl_close($ch);
echo $output;
?>
Login forms are sometimes protected with challenges that prevent you from directly submitting the form without loading the page first. I've listed a few options that could stand in your way.
One option is a cookie challenge. It's also the easiest to deal with: load the page first (to fetch the cookie) and send the cookie along with the form submission.
Another option is a hidden field challenge; a hidden form field is populated with a challenge code and the submission expects that value to be sent as well.
The last option I can think of is an even more difficult approach involving JavaScript; the page would use JavaScript to load the challenge string, maybe obfuscate it a bit and then send it along (via hidden form field or ajax request).
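To make the hidden-field case concrete, here is a minimal sketch, not a tested solution for any particular site. It assumes the login page has already been fetched into a string (for example with a plain GET that uses the same cookie jar) and that the challenge lives in ordinary hidden input elements. extractHiddenFields and buildLoginBody are made-up helper names, and the HTML fragment is invented for illustration. ASP.NET pages often carry hidden fields such as __VIEWSTATE, but whether a given site does is something you would have to check in its page source.

```php
<?php
// Sketch: collect every hidden <input> from a login page so the values can
// be sent back together with the credentials. Helper names are hypothetical.
function extractHiddenFields(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so silence parser warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fields = [];
    foreach ($doc->getElementsByTagName('input') as $input) {
        if (strtolower($input->getAttribute('type')) === 'hidden'
            && $input->getAttribute('name') !== '') {
            $fields[$input->getAttribute('name')] = $input->getAttribute('value');
        }
    }
    return $fields;
}

function buildLoginBody(string $html, array $credentials): string
{
    // Hidden values first, then the visible fields a user would type in.
    return http_build_query(array_merge(extractHiddenFields($html), $credentials));
}

// Invented page fragment, for illustration only.
$html = '<form><input type="hidden" name="__VIEWSTATE" value="abc123">'
      . '<input type="text" name="user"></form>';
echo buildLoginBody($html, ['user' => 'alice', 'pass' => 's3cret']), "\n";
```

The resulting string could then be passed to CURLOPT_POSTFIELDS on the second request, reusing the cookie file from the first one.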
I'm trying to login to an external webpage using a php script with cURL. I'm new to cURL, so I feel like I'm missing a lot of pieces. I found a few examples and modified them to allow access to https pages. Ultimately, my goal is to be able to login to the page and download a .csv by following a specified link once logged in. So far, what I have is a script that tests logging in to the page; the script is shown below:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://www.websiteurl.com/login');
curl_setopt($ch, CURLOPT_POSTFIELDS,'Email='.urlencode($login_email).'&Password='.urlencode($login_pass).'&submit=1');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3");
curl_setopt($ch, CURLOPT_REFERER, "https://www.websiteurl.com/login");
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
$output = curl_exec($ch);
I have a few questions. First, is there a reason this does not redirect on its own? The only way for me to view the contents of the page is to
echo $output
even though CURLOPT_RETURNTRANSFER and CURLOPT_FOLLOWLOCATION are both set to True.
Second, the URL for the page stays at "localhost/folderName/test.php" instead of directing to the actual website. Can anyone explain why this happens? Because the script doesn't actually redirect to a logged in webpage, I can't seem to do anything that I need to do.
Does my issue have to do with cookies? My cookies.txt file is in the same folder that my .php script is. (I'm using wampServer btw). Should it be located elsewhere?
Once I'm able to fix these two issues, it seems that all I need to do is follow the link that starts the download of the .csv file.
Thanks for any help, much appreciated!
Answering part of your question:
From http://php.net/manual/en/function.curl-setopt.php :
CURLOPT_RETURNTRANSFER TRUE to return the transfer as a string of the
return value of curl_exec() instead of outputting it out directly.
In other words - doing exactly what you described. It's returning the response to a string and you echo it to see it. As requested...
----- EDIT-----
As for the second part of your question - when I change the last three lines of the script to
$output = curl_exec($ch);
header('Location:'.$website);
echo $output;
The address of the page as displayed changes to $website - which in my case is the variable I use to store my equivalent of your 'https://www.websiteurl.com/login'
I am not sure that is what you wanted to do - because I'm not sure I understand what your next steps are. If you were getting redirected by the login site, wouldn't the new address be part of the header that is returned? And wouldn't you need to extract that address in order to perform the next request (wget or whatever) in order to download the file you wanted to get?
To do so, you need to set CURLOPT_HEADER to TRUE.
You can get the URL where you ended up from
$last_url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
(see "cURL, get redirect url to a variable").
The same link also has a useful script for completely parsing the header information (returned when CURLOPT_HEADER is true). It's in the answer by nico limpica.
Bottom line: CURL gets the information that your browser would have received if you had pointed it to a particular site; that doesn't mean your browser behaves as though you pointed it to that site...
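As a rough illustration of pulling the redirect address out of the returned headers, here is a small sketch. It assumes the response was captured with CURLOPT_HEADER enabled and that the header length comes from curl_getinfo($ch, CURLINFO_HEADER_SIZE). splitResponse and findLocations are hypothetical helper names, and the sample response is hand-written rather than taken from a real server.

```php
<?php
// Split a response captured with CURLOPT_HEADER enabled into headers and body.
// $headerSize is the value curl_getinfo($ch, CURLINFO_HEADER_SIZE) reports.
function splitResponse(string $raw, int $headerSize): array
{
    return [substr($raw, 0, $headerSize), substr($raw, $headerSize)];
}

// Return every Location header value, in the order the redirects occurred.
function findLocations(string $headers): array
{
    preg_match_all('/^Location:\s*(.+?)\s*$/mi', $headers, $matches);
    return $matches[1];
}

// Hand-written sample standing in for a real curl_exec() result. With
// CURLOPT_FOLLOWLOCATION on, each hop contributes its own header block.
$headers = "HTTP/1.1 302 Found\r\nLocation: /account/home\r\n\r\n"
         . "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n";
$raw = $headers . "<html>welcome</html>";

[$head, $body] = splitResponse($raw, strlen($headers));
print_r(findLocations($head));
echo $body, "\n";
```

If only the final address matters, CURLINFO_EFFECTIVE_URL is simpler; parsing the headers is mainly useful when you want to see each intermediate hop.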
There's a webpage that I need to log in to. I used CURL with POST to log in, but it's not enough. When you log in from the website, the POST also includes a string that is always changing. Is there a way to get around that?
I use this:
$ch = curl_init();
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; he-IL; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6");
curl_setopt ($ch, CURLOPT_REFERER, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_POST, TRUE);
$post = "username=$username&password=$password";
curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
$result = curl_exec($ch);
$info = curl_getinfo($ch);
curl_close($ch);
It's like I need the code to actually go to the webpage and fill the form regularly.
I looked everywhere but all I could find was using post data.
Thanks!
To pass that you need to visit the page with the form, grab the field and then use it in POST request when you submit the form.
I suggest you visit the form page not only for that, but also for the following reasons (some of which can be used to detect people sending automated requests):
You receive cookies
You don't fake the referrer, you actually visited the page
You might want to check the form fields to see if any new ones have been added since you wrote the script. The form setup can change, and if you don't adapt to that, your script might stop working one day
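The last point, checking whether the form has gained fields since the script was written, could be sketched roughly like this. It assumes the form page has already been downloaded into a string; formFieldNames and unexpectedFields are hypothetical helpers, and the HTML and field names are invented.

```php
<?php
// Hypothetical helper: list the names of all form controls on a page.
function formFieldNames(string $html): array
{
    $doc = new DOMDocument();
    // Suppress warnings about sloppy markup.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $names = [];
    foreach (['input', 'select', 'textarea'] as $tag) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            $name = $node->getAttribute('name');
            if ($name !== '') {
                $names[] = $name;
            }
        }
    }
    return array_values(array_unique($names));
}

// Names present on the page that the script does not know about yet.
function unexpectedFields(string $html, array $knownNames): array
{
    return array_values(array_diff(formFieldNames($html), $knownNames));
}

// Invented form: the script was written for username/password only.
$html = '<form><input name="username"><input name="password">'
      . '<input type="hidden" name="csrf_token" value="r4nd0m"></form>';
print_r(unexpectedFields($html, ['username', 'password']));
```

A non-empty result is a hint that the form changed and the POST data probably needs updating before the login is attempted.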
I am trying to log in to 2shared with curl-php, but for some reason it just returns the login page and does not set the proper cookies in the cookie file. My code is below. Thanks for any help.
$user = "";
$pass = "";
$cookie = "cookie.txt";
$jsonp = 'jsonp'.time();
if (file_exists($cookie)) {
unlink($cookie);
}
$post = array(
"login" => $user,
"password" => $pass,
"callback" => $jsonp
);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.2shared.com/login?callback=".$jsonp);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('X-Requested-With: XMLHttpRequest'));
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
curl_setopt($ch, CURLOPT_REFERER, 'http://www.2shared.com/');
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; rv:12.0) Gecko/20100101 Firefox/12.0");
curl_setopt($ch, CURLOPT_ENCODING, "UTF-8" );
$return = curl_exec($ch);
curl_close($ch);
echo $return;
EDIT:
When I log in via a browser and watch the traffic with an HTTP analyzer, I notice that after hitting the login button it returns this data and redirects to the loginRedirect address. It also sets some cookies that do not appear when I make the php-curl request:
{
"ok":true,
"rejectReason":"",
"loginRedirect":"http://www.2shared.com/account/homeDoorway.jsp;jsessionid=3F253C7C641C7A8402D4AC9872C1CEAE.dc282?rand=0.8112776952920494",
"loggedIn":"myemail#email.com",
"needActivation":false
}
But when I try to log in with the curl-php code above, it returns this data:
jsonp1339804887({
"ok":true,
"rejectReason":"",
"loginRedirect":"http://www.2shared.com/login.jsp?sessionUnavailable=1",
"loggedIn":"",
"needActivation":false
})
As always when doing web scraping, the key is to compare with a session recorded manually in a browser (with LiveHTTPHeaders or a similar tool). Then make sure that your script sends a request as similar to the recorded one as possible.
If you had done that, you would've seen that...
The login form on 2shared doesn't seem to use a multipart formpost, so passing an array to CURLOPT_POSTFIELDS is wrong. It should simply be a string in the form of "login=$name&password=$secret". That said, this may not be the only flaw in your approach.
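For illustration, a short sketch of the difference. When CURLOPT_POSTFIELDS is given an array, PHP's cURL binding sends the data as multipart/form-data; a URL-encoded string is sent as application/x-www-form-urlencoded. The credentials below are placeholders, and the curl_setopt calls are left as comments so the snippet runs without a network connection.

```php
<?php
// Placeholder credentials, chosen to show how special characters are encoded.
$fields = ['login' => 'user@example.com', 'password' => 'p&ss word'];

// http_build_query() produces the URL-encoded string form.
$body = http_build_query($fields);
echo $body, "\n";

// curl_setopt($ch, CURLOPT_POSTFIELDS, $body);   // string: urlencoded form post
// curl_setopt($ch, CURLOPT_POSTFIELDS, $fields); // array: multipart/form-data
```

Building the string with http_build_query also takes care of escaping characters such as & and spaces, which a hand-concatenated string would not.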
This may be just a shot in the dark, but it appears to me that you should look at the redirect and follow it. The error message indicates that you're not within a functioning session on the server side, and the session identification is part of the address that you would have been redirected to but chose not to follow: ;jsessionid=3F253C7C641C7A8402D4AC9872C1CEAE.dc282. The latter part, ?rand=0.8112776952920494, appears (to me!) to be a random number the system also wants to have sent back. I take this to be a trivial token mechanism to make sure that the request is fresh and not something like a script that tries to get in :-)
Also, are you certain that the callback mechanism you use (with time) makes sense?
Have you tried to get to the login page innocently, watching for the redirect to pop up and then start your other code from there?
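To inspect the reply and find the redirect address programmatically, the JSONP wrapper has to be removed first. Below is a minimal sketch; decodeJsonp is a hypothetical helper, the sample reply is abbreviated from the one quoted in the question, and treating an empty "loggedIn" value as a failed login is an assumption based on that output.

```php
<?php
// Strip a JSONP wrapper such as callbackName({...}) and decode the payload.
// Returns null when the input is neither JSONP nor plain JSON.
function decodeJsonp(string $response): ?array
{
    $response = trim($response);
    if (preg_match('/^[A-Za-z_$][\w$.]*\s*\((.*)\)\s*;?$/s', $response, $m)) {
        $response = $m[1];
    }
    $data = json_decode($response, true);
    return is_array($data) ? $data : null;
}

// Abbreviated copy of the reply shown in the question.
$reply = 'jsonp1339804887({"ok":true,'
       . '"loginRedirect":"http://www.2shared.com/login.jsp?sessionUnavailable=1",'
       . '"loggedIn":""})';
$data = decodeJsonp($reply);

if ($data === null) {
    echo "could not parse reply\n";
} elseif (($data['loggedIn'] ?? '') !== '') {
    echo "logged in as {$data['loggedIn']}\n";
} else {
    // Assumption: an empty "loggedIn" means no session was established.
    echo "not logged in, redirect: {$data['loginRedirect']}\n";
}
```

The decoded loginRedirect value is what a follow-up request would go to, using the same cookie jar.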
I am trying to automate the login process on a captcha-protected page. I am using Death By Captcha to translate the image into text, and it seems to be working well. I am using curl to load the login page, retrieve the captcha image URL, send it to DBC, get the text back, and submit a POST request to the login page with the captcha text.
The problem that I'm having is that the captcha image changes when I submit the post request. Since I do not get the same behavior when reloading/or wrongly submitting the form through a browser (I get the same image over and over again), I am assuming that the problem has to do with the cookies or something else that I'm missing that relates to the session.
This is the code that I use to retrieve the data and submit the form:
$ch = curl_init();
// Not sure that I need it, just make sure that the session doesn't change...
curl_setopt($ch, CURLOPT_COOKIESESSION, false);
curl_setopt($ch, CURLOPT_URL, $loginUrl);
// It seems that PHPSESSID cookie parameter might be the parameter that keep the image the same, but it didn't work. I even read it dynamically from the cookie file but it still didn't work
//curl_setopt($ch, CURLOPT_COOKIE, "PHPSESSID=2bp3nhkp3bgftfrr1rjekg03o2");
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieName);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieName);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_REFERER, $loginUrl);
$result = curl_exec($ch);
// Resolve the captcha and append it to the post parameters
$captchaText = $this->resolveCaptcha($result);
$postData .= '&LoginForm%5BverifyCode%5D='.$captchaText;
// Resubmit the form with the updated form data
curl_setopt($ch, CURLOPT_REFERER, $loginUrl);
curl_setopt($ch, CURLOPT_URL, $loginUrl);
curl_setopt ($ch, CURLOPT_POST, 1); //FIXED
curl_setopt ($ch, CURLOPT_POSTFIELDS, $postData);
$result = curl_exec($ch);
When I print the end result, I can see that the captcha text was submitted successfully but that the image itself has changed...
I am also attaching a screenshot of the request params as captured with Tamper in a standard Firefox session (so someone might spot if I'm missing something).
The PHP/curl submit code is fully working for non-captcha based sites so the POST parameters submission seems to be working.
It could be that I'm missing something very basic here, any help will be much appreciated.
I also took a look at these posts but couldn't find the answer that I'm looking for.
How CURL Login with Captcha and Session
How to retrieve captcha and save session with PHP cURL?
https://stackoverflow.com/questions/8633282/curl-to-download-a-captcha-and-submit-it
You're using
curl_setopt ($ch, CURLOPT_POST, 0);
in the second curl_exec. Shouldn't it be
curl_setopt ($ch, CURLOPT_POST, 1);
?
THE BACKGROUND DETAILS:
I have a custom shopping cart that uses PayPal for payment processing. I have an intermediary page between the cart and PayPal that adds the order to a database and sends confirmation emails.
Until now, I had the intermediary page set up to include all the necessary data as hidden form fields and submit the form to PayPal onload.
Now I'm experimenting with using cURL in PHP to send the POST data to PayPal.
$ch=curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://www.paypal.com/cgi-bin/webscr');
//curl_setopt($ch, CURLOPT_URL, 'http://localhost/postecho.php');
// ^ this one is a simple page that echoes all POST data using print_r
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $poststring);
// Some options that didn't seem to help
//curl_setopt($ch, CURLOPT_HEADER, 1);
//curl_setopt($ch, CURLOPT_POST, 1);
//curl_setopt($ch, CURLOPT_PROTOCOLS, CURLPROTO_HTTPS);
// User agent spoofing which also didn't seem to help
//$agent = "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)";
//curl_setopt($ch, CURLOPT_USERAGENT, $agent);
$result=curl_exec($ch);
curl_close($ch);
$poststring contains all the POST data that I had previously been passing in param1=value&param2=value format. Running this through test page postecho.php reveals that POST data seems to be alright.
THE PROBLEM:
"Sorry — your last action could not be completed"
This is what PayPal tells me when I try to do things the cURL way. It doesn't really give me any helpful information concerning the resolution of this problem. I figure there's gotta be something in the headers or something that it doesn't like. How do I make PayPal and cURL work together?
Most likely you are missing cookie/session data. If I were you, I would capture the raw HTTP message that goes from your browser to paypal.com. Not all of its contents will be needed for the request to work, but it will at least contain everything you need. Then try to emulate it with curl.
Long answer short: first capture the raw HTTP message, then emulate it with curl.
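One way to go from a captured message to curl options is to split the raw request into its parts first. The following is only a sketch under assumptions: the capture is a plain-text HTTP/1.x request, parseRawRequest is a hypothetical helper, and the sample request is invented (it is not a real PayPal exchange). Host and Content-Length are dropped because cURL sets them itself.

```php
<?php
// Hypothetical helper: turn a captured raw HTTP request into the pieces
// cURL needs (method, path, header lines, body).
function parseRawRequest(string $raw): array
{
    // Normalise line endings, then split the head from the body.
    $raw = str_replace("\r\n", "\n", $raw);
    [$head, $body] = array_pad(explode("\n\n", $raw, 2), 2, '');

    $lines = explode("\n", trim($head));
    [$method, $path] = explode(' ', array_shift($lines), 3);

    // Let cURL compute these itself; copying them verbatim can break the request.
    $skip = ['content-length', 'host'];
    $headers = [];
    foreach ($lines as $line) {
        $name = strtolower(trim(strstr($line, ':', true) ?: ''));
        if ($name !== '' && !in_array($name, $skip, true)) {
            $headers[] = trim($line);
        }
    }
    return ['method' => $method, 'path' => $path, 'headers' => $headers, 'body' => $body];
}

// Invented capture, for illustration only.
$captured = "POST /submit HTTP/1.1\r\nHost: www.example.com\r\n"
          . "Cookie: session=abc\r\nContent-Length: 7\r\n\r\na=1&b=2";
$req = parseRawRequest($captured);
print_r($req);

// curl_setopt($ch, CURLOPT_HTTPHEADER, $req['headers']);
// curl_setopt($ch, CURLOPT_POSTFIELDS, $req['body']);
```

Comparing these parts with what the script currently sends usually shows quickly which cookie or header is missing.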
Have you checked the API docs for PHP?
https://cms.paypal.com/us/cgi-bin/?&cmd=_render-content&content_ID=developer/library_code