I am trying to log in to a remote site using curl (before doing some data scraping).
Using the following code I am producing a cookies.txt file that has the following:
# Netscape HTTP Cookie File
# https://curl.haxx.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.
#HttpOnly_www.xxx.com FALSE / TRUE 0 xxxv5 h_r4hXtn-gNAilZwhvHjYdE3Vr4HewhxtGrxja57LbW03-M9MLNqZSeiW7lQ2wRT9lZypNsAiX0gS0Ev1PrvNkGLmwL3B8ZmyOUMLYbTYbSW0y_aPGrIFlEp4skDzh0GJGIGtFHisCmQjEMlu0CJr0UEw2rCT9jbjzg0IyOnFYxNffaMPo229NZWV7HDfCK5M1_y6MPNvW_Kt-h4qTy8YmqGbfBwKxB-bulV78MSXU9ZWz_DVvdu6jXfPiHwCBDMV8FFBLaXm5rqYgNzvbsq8JLe1xkTPn1PNJhyizUa-hlwB6ev8HNwIwBpzs7406l6mL3VgyrDJpay6bHNoMtjh4fLwI7KapFANhFHfn57mg4
#HttpOnly_www.xxx.com FALSE / TRUE 0 ASP.NET_SessionId txakhdi15oeqxyfq53f44dts
When I log in to the website manually, the cookie names are the same. So I think my login request is reaching the server (otherwise the cookies would not be created), but when I output
echo 'HELLO html1 = '.$html1;
I see the page telling me I have entered the wrong username and password.
Code as follows:
ini_set('display_errors', 1);
ini_set('display_startup_errors', 1);
error_reporting(E_ALL);
$username = 'xxx';
$password = 'xxx';
// echo 'STARTING';
//login form action url
$url="https://www.xxxx.com/Login";
$postinfo = "username=".$username."&password=".$password;
$cookie_file_path = "cookie.txt";
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_NOBODY, false);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
//set the cookie the site has for certain features, this is optional
curl_setopt($ch, CURLOPT_COOKIE, "cookiename=0");
curl_setopt($ch, CURLOPT_USERAGENT,
"Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_REFERER, $_SERVER['REQUEST_URI']);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_MAXREDIRS,5); // return into a variable
// curl_setopt($ch, CURLOPT_UPLOAD, true);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST" );
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postinfo);
// set content length
$headers[] = 'Content-length: 0';
$headers[] = 'Transfer-Encoding: chunked';
curl_setopt($ch, CURLOPT_HTTPHEADER , $headers);
$html1 = curl_exec($ch);
echo 'HELLO html1 = '.$html1;
I cannot show the site for security reasons (which may be a killer).
Can anyone point me in the right direction?
first off, this won't work: ini_set('display_startup_errors', 1);
- the startup phase is already finished before the userland php code starts to run,
so this setting is set too late. it must be set in the php.ini config file. (not strictly true, but close enough, like on windows you can do crazy registry hacks to enable it, and you can set it with .user.ini files, etc, more info here http://php.net/manual/en/configuration.php )
second, obvious error here is that you don't urlencode $username and $password in $postinfo = "username=".$username."&password=".$password; -
if the username OR password contains any characters with special meanings in urlencoded format, you'll send the wrong credentials and won't get logged in (this includes &,=,#, spaces, and many other characters). fixed version would look like $postinfo = "username=".urlencode($username)."&password=".urlencode($password);
third, don't use CURLOPT_CUSTOMREQUEST for POST requests,
just use CURLOPT_POST.
fourth, your Content-length header is outright lying. the
correct length is actually 'Content-length: '.strlen($postinfo) - which with your code, is definitely not 0 -
but you shouldn't set this header at all, curl will do it for you
if you don't, and unlike you, curl won't mess up the code calculating
the size, so get rid of the entire line.
fifth, this code is also wrong:
$headers[] = 'Transfer-Encoding: chunked';
your curl code here is NOT using chunked transfers,
and if it were, curl would send that header automatically,
so get rid of it.
sixth, don't just call curl_setopt and ignore the result. if there's an
error setting any of your options, curl_setopt will return
bool(false). you should watch out for such errors,
use curl_error to extract the error message, and throw an exception
if such an error occurs, instead of what your code is doing right now:
silently ignoring any curl_setopt errors. use something like
function ecurl_setopt($ch,int $option, $value){if(!curl_setopt($ch,$option,$value)){throw new \RuntimeException('curl_setopt failed!: '.curl_error($ch));}}
if fixing all of these problems is not enough to log in, you're not giving us enough information to help you any further. what does the browsers http login request look like? or what is the login url?
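Putting the points above together, a minimal sketch could look like the following. The helper names build_login_body() and post_login() are made up for illustration, and the field names username and password are taken from the question; the real form may expect different ones.

```php
<?php
// Build the urlencoded login body; http_build_query() escapes &, =, # and spaces.
function build_login_body(string $username, string $password): string
{
    return http_build_query(['username' => $username, 'password' => $password]);
}

// curl_setopt() wrapper that throws instead of failing silently.
function ecurl_setopt($ch, int $option, $value): void
{
    if (!curl_setopt($ch, $option, $value)) {
        throw new RuntimeException('curl_setopt failed: ' . curl_error($ch));
    }
}

// Hypothetical helper: POST the login form and return the response body.
function post_login(string $url, string $username, string $password, string $cookieFile): string
{
    $ch = curl_init();
    ecurl_setopt($ch, CURLOPT_URL, $url);
    ecurl_setopt($ch, CURLOPT_POST, true); // CURLOPT_POST only, no CURLOPT_CUSTOMREQUEST
    ecurl_setopt($ch, CURLOPT_POSTFIELDS, build_login_body($username, $password));
    ecurl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);
    ecurl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);
    ecurl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    ecurl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // No Content-Length or Transfer-Encoding headers here: curl computes them itself.
    $html = curl_exec($ch);
    if ($html === false) {
        throw new RuntimeException('curl_exec failed: ' . curl_error($ch));
    }
    return $html;
}
```

For example, build_login_body('a&b', 'p=1 2') gives username=a%26b&password=p%3D1+2, so special characters in the credentials no longer break the request body.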
ini_set('display_errors', 1);
ini_set('display_startup_errors', 1);
error_reporting(E_ALL);
$username = 'xxx';
$password = 'xxx';
//login form action url
$url="https://www.xxxx.com/Login";
$postinfo = array("username"=>$username,"password"=>$password);
$cookie_file_path = "cookie.txt";
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch,CURLOPT_SSL_VERIFYHOST,false);
curl_setopt($ch,CURLOPT_SSL_VERIFYPEER,false);
curl_setopt($ch,CURLOPT_COOKIEFILE,$cookie_file_path);
curl_setopt($ch,CURLOPT_COOKIEJAR,$cookie_file_path);
curl_setopt($ch, CURLOPT_USERAGENT,
"Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_REFERER, $url);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postinfo);
$html = curl_exec($ch);
echo $html;
The above code should work.
If there is still an issue, check the permissions of the cookie.txt file.
Also, if hidden data needs to be sent along with the POST, you can find it using the Firefox Live HTTP Headers plugin.
It is not as simple as reading the HTML page using curl. You need to supply a POST value for the submit button. If there is any javascript that executes prior to the activation of ACTION script, then that has to be looked at as well.
Usually you get better results if you use Selenium. See http://www.seleniumhq.org/
EDIT1:
If the server is rejecting your post string try: curl_setopt($handle, CURLOPT_POSTFIELDS, http_build_query($data));
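The advice about hidden form data and the submit button value can be sketched roughly as follows. This is only an illustration: extract_hidden_fields() is a made-up helper, and the sample form with its token field is invented, so the real login page will have its own field names.

```php
<?php
// Collect every <input type="hidden"> name/value pair from a login page.
function extract_hidden_fields(string $html): array
{
    $doc = new DOMDocument();
    // Suppress warnings caused by real-world, non-well-formed HTML.
    @$doc->loadHTML($html);
    $fields = [];
    foreach ($doc->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '' && strtolower($input->getAttribute('type')) === 'hidden') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}

// Made-up form; the field names are illustrative only.
$page = '<form action="/Login" method="post">'
      . '<input type="hidden" name="token" value="abc123">'
      . '<input type="text" name="username">'
      . '<input type="submit" name="submit" value="Log in">'
      . '</form>';

// Hidden fields first, then the credentials and the submit button value.
$post = extract_hidden_fields($page) + [
    'username' => 'xxx',
    'password' => 'xxx',
    'submit'   => 'Log in',
];
echo http_build_query($post), "\n";
```

In a real script, $page would be the HTML returned by a first GET request to the login page, made with the same cookie file as the later POST.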
I would like to open all the page IDs of a website, starting with http://website.com/page.php?id=1 and ending with id=1000,
take the data via preg_match, and record it somewhere, either in a .txt file or in SQL.
Below is the curl function I'm using at the moment. Please advise on the full code that will get this job done.
function curl($url)
{
$POSTFIELDS = 'name=admin&password=guest&submit=save';
$reffer = "http://google.com/";
$agent = "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)";
$cookie_file_path = "C:/Inetpub/wwwroot/spiders/cookie/cook"; // Please set your Cookie File path. This file must have CHMOD 777 (Full Read / Write Option).
$ch = curl_init(); // Initialize a CURL session.
curl_setopt($ch,CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_URL, $url); // The URL to fetch. You can also set this when initializing a session with curl_init().
curl_setopt($ch, CURLOPT_USERAGENT, $agent); // The contents of the "User-Agent: " header to be used in a HTTP request.
curl_setopt($ch, CURLOPT_POST, 1); //TRUE to do a regular HTTP POST. This POST is the normal application/x-www-form-urlencoded kind, most commonly used by HTML forms.
curl_setopt($ch, CURLOPT_POSTFIELDS,$POSTFIELDS); //The full data to post in a HTTP "POST" operation.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // TRUE to return the transfer as a string of the return value of curl_exec() instead of outputting it out directly.
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // TRUE to follow any "Location: " header that the server sends as part of the HTTP header (note this is recursive, PHP will follow as many "Location: " headers that it is sent, unless CURLOPT_MAXREDIRS is set).
curl_setopt($ch, CURLOPT_REFERER, $reffer); //The contents of the "Referer: " header to be used in a HTTP request.
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path); // The name of the file containing the cookie data. The cookie file can be in Netscape format, or just plain HTTP-style headers dumped into a file.
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path); // The name of a file to save all internal cookies to when the connection closes.
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
You can try it with the function file_put_contents and a loop calling your function.
$file = "data.txt";
$website_url = "http://website.com/page.php?id=";
for($i = 1; $i <= 1000; $i++){
file_put_contents($file, curl($website_url.$i), FILE_APPEND);
}
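The question also asks for the preg_match step. A minimal sketch of that part is shown here, under the assumption that the wanted data can be captured with a single pattern; extract_field() and the title pattern are placeholders, not something the original poster specified.

```php
<?php
// Return the first capture group of $pattern in $html, or null when nothing matches.
function extract_field(string $html, string $pattern): ?string
{
    return preg_match($pattern, $html, $m) === 1 ? trim($m[1]) : null;
}

// Hypothetical pattern: grab the page title. Adjust it to the data you need.
$pattern = '~<title>(.*?)</title>~is';

// Stand-in for the HTML that curl($website_url . $i) would return.
$sample = '<html><head><title> Page 1 </title></head></html>';
$value  = extract_field($sample, $pattern);

if ($value !== null) {
    // One record per line: "id<TAB>value".
    file_put_contents(sys_get_temp_dir() . '/data.txt', "1\t" . $value . "\n", FILE_APPEND);
}
```

Inside the loop, the same call would become extract_field(curl($website_url . $i), $pattern), writing $i instead of the fixed 1.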
Maybe I just need a pair of fresh eyes....
I need to POST to a page behind .htaccess Basic Authentication. I successfully log in and get past the .htBA, then POST to the target page. I know that the script is getting to that page as I'm logging the access. However $_POST is empty -- evident from both checking the var as well as the target script not working the way it should. (I control all pages).
I've tried many combos of the various curl opts below to no avail. I'm not getting any errors from the second hit.
Thanks.
$post_array = array(
'username'=>$u,
'password'=>$p
);
// Login here
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://example.com/admin/login.php');
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:43.0) Gecko/20100101 Firefox/43.0');
curl_setopt($ch, CURLOPT_COOKIEJAR, realpath('temp/cookies.txt') );
curl_setopt($ch, CURLOPT_COOKIEFILE, realpath('temp/cookies.txt'));
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
curl_setopt($ch, CURLOPT_REFERER, 'http://example.com/index.php');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($post_array));
curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'method' => 'POST',
"Authorization: Basic ".base64_encode("$username:$password"),
));
$logInFirst = curl_exec ($ch);
/* Don't close handle as need the auth for next page
* load up a new page */
$post_array_2 = array(
'localfile'=>'my_data.csv',
'theater_mode'=>'normal'
);
//curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_COOKIEJAR, realpath('temp/cookies.txt') );
curl_setopt($ch, CURLOPT_COOKIEFILE, realpath('temp/cookies.txt'));
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_REFERER, 'http://example.com/admin/post_here.php');
curl_setopt($ch, CURLOPT_URL, 'http://example.com/admin/post_here.php');
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($post_array_2));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
//curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Content-Type: multipart/form-data;',
"Authorization: Basic ".base64_encode("$username:$password"),
));
$runAi = curl_exec($ch);
$run_error = curl_error($ch); echo '<hr>'.$run_error.'<hr>';
curl_close($ch);
Here's the code on the target page (post_here.php), which results in a zero count. So I know that the target script is being hit, and based on the output, there are no POSTs.
$pa = ' There are this many keys in POST: '.count($_POST);
foreach ($_POST as $key => $value) {
$pa .= ' '.$key.':'.$value.' ---- ';
}
The error is on the second request:
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($post_array_2));
// ...
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Content-Type: multipart/form-data;',
// ...
You send the header Content-Type: multipart/form-data but the data is encoded as application/x-www-form-urlencoded (by http_build_query()).
The data you want to post on the second request contains 'localfile'=>'my_data.csv'. If you want to upload a file on the second request then the content type is correct (but you don't need to set it manually). Don't use http_build_query() but pass an array to CURLOPT_POSTFIELDS, as is explained in the documentation.
Also, for file uploads you have to put a @ in front of the file name and make sure curl is able to find the file. The best way to do this is to use the complete file path:
$post_array_2 = array(
'localfile' => '@'.__DIR__.'/my_data.csv',
'theater_mode' => 'normal'
);
The code above assumes my_data.csv is located in the same directory as the PHP script (which is not recommended). You should use dirname() to navigate from the script's directory to the directory where the CSV file is stored, to compose the correct path.
As the documentation also states, since PHP 5.5 the @ prefix is deprecated and you should use the CURLFile class for file uploads:
$post_array_2 = array(
'localfile' => new CURLFile(__DIR__.'/my_data.csv'),
'theater_mode' => 'normal'
);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_array_2);
As a side note, when you call curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY); it means curl is allowed to negotiate the authentication method with the server. But you also send the header "Authorization: Basic ".base64_encode("$username:$password") and this removes any negotiation because it forces Authorization: Basic.
Also, in order to negotiate, curl needs to know the (user, password) combination. You should always use curl_setopt($ch, CURLOPT_USERPWD, "$username:$password") to tell it the user and password. Manually crafting the Authorization header is not recommended.
If you are sure Authorization: Basic is the method you need then you can
use curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC).
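As a rough sketch of the authentication advice above, the options can be collected in one place and applied with curl_setopt_array(). The helper name basic_auth_options() and the credentials 'user' and 'secret' are placeholders for illustration.

```php
<?php
// Options that let curl build the Authorization header itself.
function basic_auth_options(string $username, string $password): array
{
    return [
        CURLOPT_HTTPAUTH => CURLAUTH_BASIC, // or CURLAUTH_ANY to let curl negotiate
        CURLOPT_USERPWD  => $username . ':' . $password,
    ];
}

$ch = curl_init('http://example.com/admin/post_here.php');
curl_setopt_array($ch, basic_auth_options('user', 'secret'));
// No CURLOPT_HTTPHEADER entry with a hand-made "Authorization:" line is needed.
```

Keeping the credentials in CURLOPT_USERPWD also means the same options work unchanged if you later switch CURLAUTH_BASIC to CURLAUTH_ANY.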
You do not see anything in $_POST because you are sending 'Content-Type: multipart/form-data;' with a urlencoded body. Just remove that header and you should be fine.
If you want to upload a file (i.e. my_data.csv), then you need to do it this way:
## change the file name in your params as follows
'localfile'=> '@'.'./my_data.csv',
## after that remove http_build_query() from post
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_array_2);
This will automatically add the multipart header to your POST.
You can inspect the uploaded file through the $_FILES variable.
Finally, you can observe what curl is doing by enabling verbose mode.
curl_setopt($ch, CURLOPT_VERBOSE, true);
Tip: when using cookies, close the curl handle after each curl_exec(). Otherwise curl will probably not write the cookies to the file after every request, because the cookie jar is written when the handle is closed.
I've been trying to write a script that retrieves Google Trends results for a given keyword. Please note I'm not trying to do anything malicious; I just want to be able to automate this process and run it a few times every day.
After investigating the Google trends page I discovered that the information is available using the following URL:
http://www.google.com/trends/trendsReport?hl=en-GB&q=keyword&cmpt=q&content=1
You can request that information multiple times with no issues from a browser, but if you try with "privacy mode", after 4 or 5 requests the following is displayed:
An error has been detected You have reached your quota limit. Please
try again later.
This makes me think that cookies are required. So I have written my script as follows:
$cookiefile = $siteurl . '/wp-content/plugins/' . basename(dirname(__FILE__)) . '/cookies.txt';
$url = 'http://www.google.com/trends/trendsReport?hl=en-GB&q=keyword&cmpt=q&content=1';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookiefile);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookiefile);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.8) Gecko/2009032609 Firefox/3.0.8');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
$x='error';
while (trim($x) != '' ){
$html=curl_exec($ch);
$x=curl_error($ch);
}
echo "test cookiefile contents = ".file_get_contents($cookiefile)."<br />";
echo $html;
However I just can't get anything written to my cookies file. So I keep on getting the error message. Can anyone see where I'm going wrong with this?
I'm pretty sure your cookie file should exist before you can use it with curl.
Try:
$h = fopen($cookiefile, "x+");
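Note that fopen() with mode "x+" fails if the file already exists, so a slightly more forgiving variant is sketched below. It assumes the cookie file is addressed by a local filesystem path; if $siteurl in the question holds a URL rather than a directory on disk, curl cannot write there. The helper name ensure_cookie_file() and the temp-directory location are illustrative choices only.

```php
<?php
// Make sure $path exists and is writable; return its absolute path.
function ensure_cookie_file(string $path): string
{
    if (!file_exists($path) && !touch($path)) {
        throw new RuntimeException("Cannot create cookie file: $path");
    }
    if (!is_writable($path)) {
        throw new RuntimeException("Cookie file is not writable: $path");
    }
    return realpath($path);
}

// Assumed location for this sketch: the system temp directory.
$cookiefile = ensure_cookie_file(sys_get_temp_dir() . '/cookies.txt');
```

The returned path can then be passed to both CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE.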
In order to learn PHP my boss asked me to do some sort of project. I've done so far a To Do List & Reminder (www.frontpagewebdesign.com/newfolder) but what I'm trying to do right now is sending SMS notifications.
Because I cannot afford to buy a SMS gateway for such a small project, I decided to use my account on this website: www.sms-gratuite.ro. My trouble is the automatic Login and SMS sending with CURL.
I followed a tutorial and this is what I've done so far:
<?php
$form_vars = array();
//array for SMS sending form values
$username = '****#****.com';
$password = '********';
$loginUrl = 'http://sms-gratuite.ro/page/autentificare';
$postUrl='http://sms-gratuite.ro/page/acasa';
$form_vars['to'] = "076xxxxxxx";
//my own phone number
$form_vars['mesaj'] = "test";
//SMS text
$encoded_form_vars = http_build_query($form_vars);
$user_agent="Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)";
//init curl
$ch = curl_init();
//Set the URL to work with
curl_setopt($ch, CURLOPT_URL, $loginUrl);
// ENABLE HTTP POST
curl_setopt($ch, CURLOPT_POST, 1);
//Set the post parameters (mail and parola are the IDs of the form input fields)
curl_setopt($ch, CURLOPT_POSTFIELDS, 'mail='.$username.'&parola='.$password);
//Handle cookies for the login
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_USERAGENT, $user_agent);
//execute the request (the login)
$store = curl_exec($ch);
//check if the login was successful by finding a string on the resulting page
if(strpos($store, "Trimite mesaj")===FALSE)
echo "logged in";
else
echo "not logged";
//set the landing url
curl_setopt($ch, CURLOPT_URL, 'http://sms-gratuite.ro/page/autentificare');
$referer='';
curl_setopt($ch, CURLOPT_URL, $postUrl);
//curl_setopt($ch, CURLOPT_HTTPHEADER,array("Expect:"));
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_REFERER, $referer);
curl_setopt($ch, CURLOPT_POSTFIELDS, 'to='.$form_vars['to'].'&mesaj='.$form_vars['mesaj']);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
$result = curl_exec($ch);
if(strpos($result, "Mesajul a fost trimis")===FALSE)
echo "<br>sms sent";
else
echo "<br>sms not sent";
curl_close($ch);
?>
I don't have any errors but it surely doesn't work. First of all the login fails. Is this form a particular one and the curl cannot handle it?
You could save every curl request result into a different file, say page1.html, page2.html, etc. This way you can open them in a browser and see the exact page you got in return for each request.
You need to make exactly the same request as the browser would. There are browser add-ons like HttpFox (if you are using Firefox) that can show you all the fields that were sent, all cookies, and everything else related. You can compare those lists to what your curl request sends to find the missing pieces.
Try these steps and comment with any further errors you get, preferably in detail.
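The first suggestion, saving each response to its own file, could be sketched like this. save_response() is a hypothetical helper, and the strings passed to it stand in for $store and $result from the question's script.

```php
<?php
// Write one response per file (page1.html, page2.html, ...) and return the path.
function save_response(string $dir, int $step, string $html): string
{
    $path = rtrim($dir, '/') . '/page' . $step . '.html';
    if (file_put_contents($path, $html) === false) {
        throw new RuntimeException("Cannot write $path");
    }
    return $path;
}

// Stand-in strings; in the real script these would be $store and $result.
$dir = sys_get_temp_dir();
echo save_response($dir, 1, '<html>login response</html>'), "\n";
echo save_response($dir, 2, '<html>send response</html>'), "\n";
```

Opening the saved files in a browser shows which page the server actually returned at each step.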
Trying to send data to a server which accepts data in the following format
VERIFY_DATA=MER_ID=xxx|MER_TRNX_ID=xxx| MER_TRNX_AMT=xxx
Will the following lines do?
$datatopost="VERIFY_DATA=MER_ID=xxx|MER_TRNX_ID=xxx| MER_TRNX_AMT=xxx";
curl_setopt ($ch, CURLOPT_POSTFIELDS,$datatopost);
Any help will be appreciated. I am new to curl.
you can use this article to see how to do it properly.
I personally used this code to do it on a project of mine
$data="from=$from&to=$to&body=".urlencode($body)."&url=$url";
//$urlx contains the url where you want to post. $data contains the data you are posting
//$resp contains the response.
$process = curl_init($urlx);
curl_setopt($process, CURLOPT_HEADER, 0);
curl_setopt($process, CURLOPT_POSTFIELDS, $data);
curl_setopt($process, CURLOPT_POST, 1);
curl_setopt($process, CURLOPT_RETURNTRANSFER,1);
curl_setopt($process,CURLOPT_CONNECTTIMEOUT,1);
$resp = curl_exec($process);
curl_close($process);
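Applied to the VERIFY_DATA format from the question, the same idea might look like the sketch below. It assumes the server decodes a standard urlencoded form body, which is not confirmed by the question, and build_verify_data() is a made-up helper name.

```php
<?php
// Join the fields as KEY=value pairs separated by "|", as the server format shows.
function build_verify_data(array $fields): string
{
    $pairs = [];
    foreach ($fields as $key => $value) {
        $pairs[] = $key . '=' . $value;
    }
    return implode('|', $pairs);
}

$value = build_verify_data([
    'MER_ID'       => 'xxx',
    'MER_TRNX_ID'  => 'xxx',
    'MER_TRNX_AMT' => 'xxx',
]);

// http_build_query() percent-encodes the "=" and "|" inside the value.
$datatopost = http_build_query(['VERIFY_DATA' => $value]);
echo $datatopost, "\n";
```

Building the value with implode() also avoids the stray space after the second "|" in the original string. If the server turns out to expect the raw, unencoded string, pass "VERIFY_DATA=" . $value to CURLOPT_POSTFIELDS instead.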
Here is a working example in PHP. This asks for and returns an FX quote. My data request is in the URL; yours is in the POST fields, so you need to adjust. It looks as though you have spaces in the data you are passing ("x| ME"), and I suspect the server will not like that.
$ch = curl_init(); // initialise cURL
// set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, $curl_opt_string ); // this contains the URL and the request
curl_setopt($ch, CURLOPT_HEADER, false); // do not include response headers in the output
curl_setopt($ch, CURLOPT_INTERFACE, "93.129.141.79"); // outgoing network interface to send the request from
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"); // tell it what the agent is
curl_setopt($ch, CURLOPT_POST, 1); // send the request as an HTTP POST
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return as a string rather than to the screen
$output = curl_exec($ch); // variable to return it to
curl_close($ch); // close cURL resource, and free up system resources
$subject = $output; // get the data
I constructed my URL string as follows
$curl_opt_string = "http://msxml.rexefore.com/index.php?username=MiJoee4r65&password=L8e44Y&instrument=245.20." . $lhsrhs . "LITE&fields=D4,D6";