I have been searching around for a while without any luck.
I'm wondering whether it is possible to set the date of a server with curl()?
I currently have this code to log in and retrieve data:
$ch = curl_init();
curl_setopt ($ch, CURLOPT_URL, $loginURL);
curl_setopt ($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6");
curl_setopt ($ch, CURLOPT_TIMEOUT, 60);
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt ($ch, CURLOPT_COOKIEJAR, $cookieFile);
curl_setopt ($ch, CURLOPT_COOKIEFILE, $cookieFile);
curl_setopt ($ch, CURLOPT_REFERER, $loginURL);
curl_setopt ($ch, CURLOPT_POSTFIELDS, $postData);
curl_setopt ($ch, CURLOPT_POST, 1);
But sometimes I want the server I'm getting data from to think that it is another date.
I know I can make another curl "portion" to get data for a specific date via the URL, but I figure it is faster to call the remote server only once, so is it possible to set a header or something?
Specifically, I want to do this: trick the server that I call via cURL into thinking that it is two days in the future.
You should have a detailed look at the PHP.net curl_setopt manual.
With curl, you can set a full header with curl_setopt($handle, CURLOPT_HTTPHEADER, $header).
I don't entirely understand what you are attempting to do in your specific case, but hopefully this example might be useful:
$agent = 'Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0; en-US)';
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: ";
$ch = curl_init();
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_USERAGENT, $agent);
curl_setopt ($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, 0);
$fileContents = curl_exec($ch);
curl_close($ch);
Further, if you are working with multiple curl requests, you might consider a great library written by Josh Fraser called Rolling_Curl.
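On the original question of the date: HTTP does define a Date header, and CURLOPT_HTTPHEADER will send one, but whether the remote server pays any attention to it is an assumption you would have to test, since most servers work from their own clock. A minimal sketch, where future_date_header() is a hypothetical helper and not part of cURL:

```php
<?php
// Hypothetical helper: builds an HTTP "Date" header a given number of days
// ahead of $now (a Unix timestamp). Nothing guarantees the server honours it.
function future_date_header(int $daysAhead, ?int $now = null): string
{
    $now = $now ?? time();
    $timestamp = $now + $daysAhead * 86400;
    // HTTP dates are always expressed in GMT.
    return 'Date: ' . gmdate('D, d M Y H:i:s', $timestamp) . ' GMT';
}

$header = array(future_date_header(2));

// Attach it to an existing handle like any other custom header:
// curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
```

If the server ignores the header, the date would have to go wherever the site itself expects it, such as the URL parameter mentioned in the question.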
Related
I don't have much experience with cURL. I am creating a bot which logs in to a site automatically, and I have used cURL to do this. When I run my script it logs me in successfully (I get the response "OK"), but when I redirect to some other page I get logged out.
Here is my code:
<?php
$username="username";
$userpass="password";
$url="https://www.invertironline.com/User/DoLogin";
$cookie="cookie.txt";
$postdata = "username=".$username."&password=".$userpass."";
$ch = curl_init();
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36");
curl_setopt ($ch, CURLOPT_TIMEOUT, 60);
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_COOKIEJAR, realpath($cookie));
curl_setopt ($ch, CURLOPT_COOKIEFILE, realpath($cookie));
curl_setopt ($ch, CURLOPT_REFERER, $url);
curl_setopt ($ch, CURLOPT_POSTFIELDS, $postdata);
curl_setopt ($ch, CURLOPT_POST, 1);
$headers = array();
$headers[] = 'Accept: application/xhtml+voice+xml;version=1.2, application/x-xhtml+voice+xml;version=1.2, text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1';
$headers[] = 'Connection: Keep-Alive';
$headers[] = 'Content-type: application/x-www-form-urlencoded;charset=UTF-8';
$headers[] = 'X-Requested-With: XMLHttpRequest';
$headers[] = 'Referer: https://www.invertironline.com/mercado/cotizaciones';
$headers[] = 'Origin: https://www.invertironline.com';
$headers[] = 'Host: www.invertironline.com';
curl_setopt ($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt ($ch, CURLOPT_HEADER, 1);
$result = curl_exec ($ch);
curl_close($ch);
echo $result;
header("Location: https://www.invertironline.com/mercado/cotizaciones");
Here is some additional information:
If I create an HTML form with the username and password fields, with the action set, and submit it, it logs me in and gives back the following response:
{"result":"OK","redirect":"/MiCuenta/EstadoCuenta"}
And in case of wrong credentials I get this response:
{"result":"Error"}
So when I use cURL I get the same response, but when I try to redirect I am not logged in (probably the session is not saved).
I have spent 3 days trying to find a solution for this but still can't solve it. Any help will be highly appreciated.
Look in the cookie.txt file. It should show in detail what problem you have.
If the session cookie is there but you still get no response, please tell me so I know how to help you further. If the file is empty, then the problem is in the part of your code that stores the cookies. So please check the cookie.txt file and tell me what you find in it.
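To add to the above: the header("Location: ...") call at the end of the question's script sends the visitor's browser to the site, and the browser never received the cookies that cURL collected, so the site treats it as logged out. Also note that realpath() returns false for a file that does not exist yet. A minimal sketch of fetching the second page through cURL with the same cookie file instead; session_options() is a hypothetical helper, and the URLs and credentials are the placeholders from the question:

```php
<?php
// Hypothetical helper: options shared by every request in one session.
function session_options(string $url, string $cookieFile, ?string $postBody = null): array
{
    $options = array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are saved here when the handle is closed
        CURLOPT_COOKIEFILE     => $cookieFile, // cookies are loaded from here
    );
    if ($postBody !== null) {
        $options[CURLOPT_POST] = true;
        $options[CURLOPT_POSTFIELDS] = $postBody;
    } else {
        // Switch a reused handle back to GET after an earlier POST.
        $options[CURLOPT_HTTPGET] = true;
    }
    return $options;
}

// Absolute path built without realpath(), so it works before the file exists.
$cookieFile = __DIR__ . '/cookie.txt';
$base = 'https://www.invertironline.com';

$login = session_options($base . '/User/DoLogin', $cookieFile,
    http_build_query(array('username' => 'username', 'password' => 'password')));
$page = session_options($base . '/mercado/cotizaciones', $cookieFile);

// Usage on a single handle, so the session cookie is reused:
// $ch = curl_init();
// curl_setopt_array($ch, $login); $loginResponse = curl_exec($ch);
// curl_setopt_array($ch, $page);  $html = curl_exec($ch);
// curl_close($ch);
// echo $html;
```

Whether the site needs further headers or tokens beyond the session cookie is not something this sketch can tell you.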
I am trying to make a website scraper, but the website behaves differently than it does for a normal request from a browser.
How can I make a cURL request that the website will not filter and block?
Any help would be appreciated.
$curl_handle = curl_init ("***");
$header = array();
$header[] = "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0";
$header[] = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$header[] = "Accept-Language: cs,en-US;q=0.7,en;q=0.3";
$header[] = "Accept-Encoding: utf-8";
$header[] = "Connection: keep-alive";
$header[] = "Host: ****";
curl_setopt($curl_handle, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0');
curl_setopt($curl_handle, CURLOPT_HTTPHEADER, $header);
curl_setopt ($curl_handle, CURLOPT_COOKIEFILE, dirname(__FILE__) . '/cookie.txt');
curl_setopt ($curl_handle, CURLOPT_COOKIEJAR, dirname(__FILE__) . '/cookie.txt');
curl_setopt ($curl_handle, CURLOPT_RETURNTRANSFER, true);
curl_setopt ($curl_handle, CURLOPT_AUTOREFERER, true);
$output = curl_exec ($curl_handle);
This is what I have so far, but it is still getting blocked.
The following CURL options might help:
curl_setopt($ch, CURLOPT_REFERER, $_SERVER['REQUEST_URI']);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
I am using the OCR service of i2ocr.com to convert an image to text.
In my project I need to do this automatically, so I am using PHP to get the text of the image.
On the OCR website the POST data is sent as multipart/form-data, like this:
-----------------------------32642708628732\r\n
Content-Disposition: form-data; name="i2ocr_options"\r\n
\r\n
url\r\n
-----------------------------32642708628732\r\n
Content-Disposition: form-data; name="i2ocr_uploadedfile"\r\n
\r\n
\r\n
-----------------------------32642708629732\r\n
Content-Disposition: form-data; name="i2ocr_url"\r\n
\r\n
http://www.murraydata.co.uk/wp-content/uploads/2013/02/ocr-font-500x220.jpg\r\n
-----------------------------32642708628732\r\n
Content-Disposition: form-data; name="i2ocr_languages"\r\n
\r\n
gb,eng\r\n
-----------------------------32642708628732--\r\n
In PHP I am using
$ch = curl_init();
$dt = array();
$dt['i2ocr_options'] = 'url';
$dt['i2ocr_uploadedfile'] = '';
$dt['i2ocr_url'] = 'http://www.murraydata.co.uk/wp-content/uploads/2013/02/ocr-font-500x220.jpg';
$dt['i2ocr_languages'] = 'gb,eng';
curl_setopt($ch, CURLOPT_URL,"http://www.i2ocr.com/process_form");
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; rv:23.0) Gecko/20100101 Firefox/23.0");
curl_setopt($ch,CURLOPT_ENCODING,"gzip,deflate");
curl_setopt($ch, CURLOPT_HTTPHEADER, Array("Content-Type: multipart/form-data; boundary=---------------------------32642708628732"));
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_REFERER, "http://www.i2ocr.com/");
curl_setopt($ch, CURLOPT_AUTOREFERER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, "$dt");
$html=curl_exec($ch);
print_r($html);
This code does not generate any errors, but I do not get any output either.
I need help getting the output from this curl request.
Like this:
<?php
function get($url, $refer, $ch)
{
curl_setopt ($ch, CURLOPT_URL,$url);
curl_setopt ($ch, CURLOPT_POST, 0);
curl_setopt ($ch, CURLOPT_COOKIEJAR, realpath('cookie.txt')); // cookie.txt
curl_setopt ($ch, CURLOPT_COOKIEFILE, realpath('cookie.txt'));
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; U; Linux i586; de; rv:5.0) Gecko/20100101 Firefox/5.0');
curl_setopt ($ch, CURLOPT_REFERER, $refer);
$result= curl_exec($ch);
return $result;
}
function post($url, $refer, $parametros, $ch)
{
curl_setopt ($ch, CURLOPT_URL,$url);
curl_setopt ($ch, CURLOPT_POST, 1);
curl_setopt ($ch, CURLOPT_POSTFIELDS, $parametros);
curl_setopt ($ch, CURLOPT_COOKIEJAR, realpath('cookie.txt')); // cookie.txt
curl_setopt ($ch, CURLOPT_COOKIEFILE, realpath('cookie.txt'));
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; U; Linux i586; de; rv:5.0) Gecko/20100101 Firefox/5.0');
curl_setopt ($ch, CURLOPT_REFERER, $refer);
$result= curl_exec($ch);
return $result;
}
function hazlo() {
$ch = curl_init();
/* STEP 1. Visit the first page to collect its cookies */
get ("http://www.i2ocr.com/", "http://www.i2ocr.com/", $ch);
//STEP 2. Build an array with the POST data
$data = array(
'i2ocr_options' => 'url',
'i2ocr_uploadedfile' => '',
'i2ocr_url' => 'http://www.murraydata.co.uk/wp-content/uploads/2013/02/ocr-font-500x220.jpg',
'i2ocr_languages' => 'gb,eng'
);
$data2 = http_build_query($data);
//STEP 3. Send the array via POST
echo post ("http://www.i2ocr.com/process_form", "http://www.i2ocr.com/", $data2, $ch);
}
hazlo();
?>
Use view source to see the response HTML; you can see the text of the image there (sorry for my English). Works 100% :)
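As far as I can tell, the reason the code in the question returned nothing is that "$dt" converts the array to the literal string "Array", so none of the form fields were sent, and that body did not match the hand-written multipart Content-Type header either. A small sketch of what http_build_query() does with the same array (variable names follow the question):

```php
<?php
$dt = array(
    'i2ocr_options'      => 'url',
    'i2ocr_uploadedfile' => '',
    'i2ocr_url'          => 'http://www.murraydata.co.uk/wp-content/uploads/2013/02/ocr-font-500x220.jpg',
    'i2ocr_languages'    => 'gb,eng',
);

// URL-encoded body, suitable for CURLOPT_POSTFIELDS with the default
// application/x-www-form-urlencoded content type.
$body = http_build_query($dt);

// Decoding the body again gives back the original fields.
parse_str($body, $decoded);
var_dump($decoded === $dt);
```

With a string body like this there is no need to set a Content-Type header by hand.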
I use the following general code to log into other https sites and pull records using forms, but it doesn't seem to work for www.voip.ms. I've created a test account in case anyone wants to take a crack at it and tell me what I did wrong. (Warning: the site only gives your IP address 4 tries before it bans it.)
<?php
ini_set('max_execution_time', 300);
$username="meahmatt#aol.com";
$password="testaccount";
$url="https://www.voip.ms/m/login.php";
$cookie="cookie.txt";
$postdata = "col_email".$username."&col_password=".$password."&action=login&form1=";
$ch = curl_init();
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6");
curl_setopt ($ch, CURLOPT_TIMEOUT, 60);
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_COOKIEJAR, $cookie);
curl_setopt ($ch, CURLOPT_COOKIEFILE, $cookie);
curl_setopt ($ch, CURLOPT_REFERER, $url);
curl_setopt ($ch, CURLOPT_POSTFIELDS, $postdata);
curl_setopt ($ch, CURLOPT_POST, 1);
$result = curl_exec ($ch);
curl_close($ch);
echo $result;
?>
I've also tried setting CURLOPT_SSL_VERIFYPEER to TRUE, with no change.
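One detail worth checking in the script above: the POST string starts with "col_email" followed directly by the address, with no "=" in between, so the server would not see a col_email field at all. A sketch of building the same body with http_build_query(), which adds the separators and encodes the values; the field names are taken from the question, and whether the site needs anything else is an open assumption:

```php
<?php
$username = "meahmatt#aol.com"; // address exactly as written in the question
$password = "testaccount";

// Each key becomes "name=value", joined with "&", with values URL-encoded.
$postdata = http_build_query(array(
    'col_email'    => $username,
    'col_password' => $password,
    'action'       => 'login',
    'form1'        => '',
));

echo $postdata, "\n";
```

The resulting string can be passed to CURLOPT_POSTFIELDS in place of the hand-concatenated one.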
I had the same problem recently trying to use the Twitter API using curl_exec from GoDaddy.
The magic was to disable both peer and host verification in the options:
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // required as godaddy fails
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false); // required as godaddy fails
The error was a certificate verification problem. I have no problem using this exact script on non-godaddy servers.
CURLE_SSL_CACERT (60)
Peer certificate cannot be authenticated with known CA certificates.
The full request looks like this:
$url = "https://api.twitter.com/1.1/statuses/user_timeline.json?...";
$headers = array(
"Authorization: Bearer ".$bearer."",
);
$ch = curl_init(); // setup a curl
curl_setopt($ch, CURLOPT_URL, $url); // set url to send to
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers); // set custom headers
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return data rather than echo
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // required as godaddy fails
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false); // required as godaddy fails
$result = curl_exec($ch); // run the curl
// Check for errors and display the error message (must happen before curl_close)
if($errno = curl_errno($ch)) { echo "curlerror::$errno::"; }
// $info = curl_getinfo($ch); // debug info (only meaningful after curl_exec)
// var_dump($info); // dump curl info
curl_close($ch); // stop curling
Also notice that curl_getinfo and curl_errno were invaluable at finding the problem.
tl;dr, friends don't let friends use GoDaddy.
You could try this function if you like. It's helped me out a few times.
If you still have trouble, try Fiddler2 (fiddler2.com) to check all of the headers and attempt to replicate them in PHP.
ini_set('max_execution_time', 300);
$fields['col_email'] = "meahmatt#aol.com";
$fields['col_password'] = "testaccount";
$fields['action'] = "login";
$fields['form1'] = "";
$url = "https://www.voip.ms/m/login.php";
$html = get_html($url,$url,$fields);
function get_html($url,$ref='',$fields=array(),$cookie='cookie.txt'){
$proxyAddress = ''; // set to e.g. '127.0.0.1:8888' to route requests through a proxy
$ch = curl_init();
touch($cookie);
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: "; //browsers keep this blank.
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows;U;Windows NT 5.0;en-US;rv:1.4) Gecko/20030624 Netscape/7.1 (ax)');
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
if($proxyAddress != ''){
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 0);
curl_setopt($ch, CURLOPT_PROXYTYPE, CURLPROXY_HTTP);
curl_setopt($ch, CURLOPT_PROXY, $proxyAddress);
}
if(count($fields)>0){
curl_setopt($ch, CURLOPT_POSTFIELDS, $fields);
curl_setopt ($ch, CURLOPT_POST, 1);
}
curl_setopt($ch, CURLOPT_REFERER, $ref);
curl_setopt($ch, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_COOKIESESSION, TRUE);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie);
$result = curl_exec ($ch);
if($result === false){
echo "cURL error number:" .curl_errno($ch);
echo "cURL error:" . curl_error($ch);
exit;
}
curl_close ($ch);
return($result);
}
I am trying to obtain some information from a webpage that requires login. I am using PHP/cURL to post username and password to the login page on the target website. The website uses relative links to redirect authenticated users to a members only area.
I am getting 200 OK and I can see that I am being successfully authenticated. My issue is that I don't know how to make it go to the actual member area on the target website (targetwebsite.com/memberarea) as opposed to mywebsite.com/memberarea. Is there a way to specify the base domain of the target website in cURL? Could you also tell me if I am doing something not recommended in the following code?
Here is what I am doing...
<?php
// INIT CURL
$ch = curl_init();
// SET URL FOR THE POST FORM LOGIN
curl_setopt($ch, CURLOPT_URL, 'https://targetwebsite.com/login.php');
curl_setopt($ch, CURLOPT_HEADER, 1);
// ENABLE HTTP POST
curl_setopt ($ch, CURLOPT_POST, 1);
// SET POST PARAMETERS : FORM VALUES FOR EACH FIELD
curl_setopt ($ch, CURLOPT_POSTFIELDS, 'username=someuser&password=mypassword');
curl_setopt ($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12');
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: "; // browsers keep this blank.
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_REFERER, 'http://www.google.com');
curl_setopt($ch, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
// IMITATE CLASSIC BROWSER'S BEHAVIOUR : HANDLE COOKIES
curl_setopt ($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
# Setting CURLOPT_RETURNTRANSFER variable to 1 will force cURL
# not to print out the results of its query.
# Instead, it will return the results as a string return value
# from curl_exec() instead of the usual true/false.
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 0);
// EXECUTE 1st REQUEST (FORM LOGIN)
$store = curl_exec ($ch);
// SET FILE TO DOWNLOAD
curl_setopt ($ch, CURLOPT_COOKIEFILE, 'cookie.txt');
curl_setopt($ch, CURLOPT_URL, 'https://targetwebsite.com?memberarea+welcome+UserFirstName');
// EXECUTE 2nd REQUEST (FILE DOWNLOAD)
$content = curl_exec ($ch);
//echo $store;
// CLOSE CURL
curl_close ($ch);
?>
Are you receiving a Location: redirect? If so, use CURLOPT_FOLLOWLOCATION.
Also, Christian Sciberras recommends setting CURLOPT_MAXREDIRS to at least 3.
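A minimal sketch of that suggestion, using the placeholder host and credentials from the question. The helper names login_options() and fetch_after_login() are hypothetical, and the redirect cap of 5 is an arbitrary choice. cURL resolves a relative Location: header against the URL it requested, so the redirect stays on the target site:

```php
<?php
// Hypothetical helper: options for a login POST that follows redirects.
function login_options(string $url, string $postBody, string $cookieFile): array
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $postBody,
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow Location: redirects
        CURLOPT_MAXREDIRS      => 5,           // arbitrary cap to avoid redirect loops
        CURLOPT_COOKIEJAR      => $cookieFile, // keep the session cookie
        CURLOPT_COOKIEFILE     => $cookieFile,
    );
}

// Hypothetical helper: runs the request and reports where cURL ended up.
function fetch_after_login(array $options): array
{
    $ch = curl_init();
    curl_setopt_array($ch, $options);
    $body = curl_exec($ch); // string on success, false on failure
    $error = ($body === false) ? curl_error($ch) : '';
    $finalUrl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL); // URL after redirects
    curl_close($ch);
    return array('body' => $body, 'final_url' => $finalUrl, 'error' => $error);
}

$options = login_options('https://targetwebsite.com/login.php',
    'username=someuser&password=mypassword', 'cookie.txt');

// $result = fetch_after_login($options);
// echo $result['final_url'];
```

Checking final_url is a quick way to see whether the login actually landed in the member area.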