I have a script that sends POST data to several pages. However, I ran into difficulties sending requests to some servers. The cause is redirection. Here's the sequence:
I send a POST request to the server.
The server responds: 301 Moved Permanently.
Then curl_setopt ( $ch, CURLOPT_FOLLOWLOCATION, TRUE) kicks in and follows the redirect (but via a GET request).
To solve this I'm using curl_setopt ( $ch, CURLOPT_CUSTOMREQUEST, "POST"), and now it redirects as a POST, but without the POST body that I sent in the first request. How can I force curl to send the POST body when redirected? Thanks!
Here's the example:
<?php
function curlPost($url, $postData = "")
{
$ch = curl_init () or exit ( "curl error: Can't init curl" );
$url = trim ( $url );
curl_setopt ( $ch, CURLOPT_URL, $url );
//curl_setopt ( $ch, CURLOPT_POST, 1 );
curl_setopt ( $ch, CURLOPT_CUSTOMREQUEST, "POST");
curl_setopt ( $ch, CURLOPT_POSTFIELDS, $postData );
curl_setopt ( $ch, CURLOPT_RETURNTRANSFER, true );
curl_setopt ( $ch, CURLOPT_CONNECTTIMEOUT, 30 );
curl_setopt ( $ch, CURLOPT_TIMEOUT, 30 );
curl_setopt ( $ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.109 Safari/537.36");
curl_setopt ( $ch, CURLOPT_FOLLOWLOCATION, TRUE);
$response = curl_exec ( $ch );
if (! $response) {
echo "Curl errno: " . curl_errno ( $ch ) . " (" . $url . " postdata = $postData )\n";
echo "Curl error: " . curl_error ( $ch ) . " (" . $url . " postdata = $postData )\n";
$info = curl_getinfo($ch);
echo "HTTP code: ".$info["http_code"]."\n";
// exit();
}
curl_close ( $ch );
// echo $response;
return $response;
}
?>
curl is following what RFC 7231 suggests, which also is what browsers typically do for 301 responses:
Note: For historical reasons, a user agent MAY change the request
method from POST to GET for the subsequent request. If this
behavior is undesired, the 307 (Temporary Redirect) status code
can be used instead.
If that's undesirable, you can change it with the CURLOPT_POSTREDIR option, which is only sparsely documented in PHP, but the libcurl docs explain it. By setting the appropriate bitmask there, you make curl keep the request method when it follows the redirect.
If you control the server end, an easier fix is to make sure it returns a 307 response code instead of a 301.
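As a minimal sketch of the CURLOPT_POSTREDIR approach: the helper names postRedirOptions and curlPostKeepBody are hypothetical, and the sketch assumes the request is made with CURLOPT_POST rather than CURLOPT_CUSTOMREQUEST. The bitmask values 1, 2 and 4 stand for 301, 302 and 303 responses, as described in the PHP documentation for this option.

```php
<?php
// Hypothetical helper: options for a POST that should stay a POST
// (body included) when the server answers 301, 302 or 303.
function postRedirOptions(string $url, $postData): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_POST           => true,        // a regular POST, no CUSTOMREQUEST
        CURLOPT_POSTFIELDS     => $postData,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        // Bitmask: 1 = keep POST on 301, 2 = on 302, 4 = on 303
        CURLOPT_POSTREDIR      => 1 | 2 | 4,
    ];
}

// Hypothetical wrapper: returns the response body, or false on failure.
function curlPostKeepBody(string $url, $postData = '')
{
    $ch = curl_init();
    curl_setopt_array($ch, postRedirOptions($url, $postData));
    // The handle is released automatically when $ch goes out of scope.
    return curl_exec($ch);
}
```

If only 301 responses are a problem, a bitmask of 1 alone should be enough.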
I'm trying to send the following cURL request in PHP, but it's returning: "HTTP Error 411. The request must be chunked or have a content length."
PHP script containing the cURL request:
<?php
$numbers = 9999999999;
$message = "Test";
$message = urlencode($message);
$port = 80;
$url = "http://191.95.51.64/API/sendsms.aspx?loginID=myloginid&password=mypassword&mobile=".$numbers."&text=".$message."&senderid=ABCDEF&route_id=1&Unicode=1";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt ( $ch, CURLOPT_PORT, $port );
curl_setopt ( $ch, CURLOPT_POST, 1 );
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ( $ch, CURLOPT_TIMEOUT, 20 );
curl_setopt ( $ch, CURLOPT_CONNECTTIMEOUT, 20 );
$output = curl_exec($ch);
$err = curl_error($ch);
curl_close($ch);
if ($err) {
echo $err;
} else {
echo $output;
}
?>
Output:
Length Required
HTTP Error 411. The request must be chunked or have a content length.
Since I'm new to PHP cURL, I couldn't figure out what's wrong. If anyone can briefly guide me to a solution with an example, I'd really appreciate it. Thanks!
Since your data seems to be sent as URL parameters, try adding a Content-Length header of zero, like below:
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Length: 0'));
Alternatively, instead of touching the Content-Length header, simply try adding an empty POST body. The behaviour differs slightly depending on which HTTP server you are posting to (IIS, lighttpd, Apache).
An empty POST body looks like this:
curl_setopt($ch, CURLOPT_POSTFIELDS, array());
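A small sketch of the empty-body variant, using an empty string instead of an empty array. The helper name emptyBodyPostOptions is hypothetical, and the expectation that curl then sends a zero Content-Length by itself is an assumption worth confirming with CURLOPT_VERBOSE against your server.

```php
<?php
// Hypothetical helper (not from the original post): options for a POST
// whose parameters travel in the URL and whose body is deliberately empty.
function emptyBodyPostOptions(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => '',   // empty string body
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => 20,
        CURLOPT_TIMEOUT        => 20,
    ];
}

// Usage: curl_setopt_array($ch, emptyBodyPostOptions($url));
```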
You forgot to close the quote:
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Length: 0'));
OK, before saying this is a duplicate, just read a bit....
I have been trying for HOURS now to echo the contents of a URL from a server that has allow_url_fopen disabled, and I have tried every solution posted on Stack Overflow. Examples:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$url);
$result = curl_exec($ch);
curl_close($ch);
Doesn't WORK
function curl_get_contents($url)
{
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
Doesn't WORK
$url = "http://www.google.com";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$data = curl_exec($ch);
curl_close($ch);
echo $data;
Doesn't WORK
fopen("cookies.txt", "w");
$url="http://adfoc.us/1575051";
$ch = curl_init();
$header=array('GET /1575051 HTTP/1.1',
'Host: adfoc.us',
'Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language:en-US,en;q=0.8',
'Cache-Control:max-age=0',
'Connection:keep-alive',
'Host:adfoc.us',
'User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36',
);
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch,CURLOPT_CONNECTTIMEOUT,0);
curl_setopt( $ch, CURLOPT_COOKIESESSION, true );
curl_setopt($ch,CURLOPT_COOKIEFILE,'cookies.txt');
curl_setopt($ch,CURLOPT_COOKIEJAR,'cookies.txt');
curl_setopt($ch,CURLOPT_HTTPHEADER,$header);
$result=curl_exec($ch);
curl_close($ch);
Doesn't WORK
// create the Gateway object
$gateway = new Gateway();
// set our url
$gateway->init($url);
// get the raw response, ignore errors
$response = $gateway->exec();
Doesn't WORK
$file = "http://www.example.com/my_page.php";
if (function_exists('curl_version'))
{
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $file);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
$content = curl_exec($curl);
curl_close($curl);
}
else if (file_get_contents(__FILE__) && ini_get('allow_url_fopen'))
{
$content = file_get_contents($file);
}
else
{
echo 'You have neither cUrl installed nor allow_url_fopen activated. Please setup one of those!';
}
This doesn't work.
The page I am trying to use file_get_contents on is not on my website. I am trying to use file_get_contents so I can make a simple API for the site owner by reading a page and checking whether a certain word is present on the page.
But yeah, if anyone has any suggestions PLEASE post below :)
You can first check whether the site is available or not, for example with this sample code:
Code taken from here:
<?php
$cURL = curl_init('http://www.technofusions.com/');
curl_setopt($cURL, CURLOPT_RETURNTRANSFER, true);
// Follow any redirects the URL leads to
curl_setopt($cURL, CURLOPT_FOLLOWLOCATION, true);
$result = curl_exec($cURL);
// Get the HTTP response code (an integer; 0 if no response was received)
$answer = curl_getinfo($cURL, CURLINFO_HTTP_CODE);
// Read the error message before closing the handle
$error = curl_error($cURL);
curl_close($cURL);
if ($result === false) {
    echo 'The request failed: ' . $error;
} elseif ($answer === 404) {
    echo 'The site was not found (ERROR 404)!';
} else {
    echo 'It looks like everything is working fine ...';
}
?>
For a full answer you can go to this tutorial: Curl IN PHP
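When a snippet simply "doesn't work", it usually helps to see why. The following is a rough sketch of a fetch helper that returns curl's own error number and message alongside the HTTP status; the function name fetchWithDiagnostics and the timeout values are assumptions, not part of the answer above.

```php
<?php
// Hypothetical helper: fetch a URL and report what went wrong, if anything.
function fetchWithDiagnostics(string $url): array
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_CONNECTTIMEOUT => 10,    // assumed values, tune as needed
        CURLOPT_TIMEOUT        => 20,
    ]);
    $body = curl_exec($ch);

    // The handle is released automatically when $ch goes out of scope.
    return [
        'ok'        => $body !== false,
        'body'      => $body === false ? '' : $body,
        'errno'     => curl_errno($ch),   // 0 means no transport error
        'error'     => curl_error($ch),   // human-readable message, '' if none
        'http_code' => (int) curl_getinfo($ch, CURLINFO_HTTP_CODE),
    ];
}
```

A caller can then print $result['error'] when $result['ok'] is false, which typically points at DNS, SSL, or timeout problems rather than leaving an empty page.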
I started off using file_get_contents() and it returned string(9259) in which every character is a space (i.e. a whole lot of nothing). After some research I tried curl, and after a few struggles getting CURLOPT_SSL_VERIFYPEER and CURLOPT_USERAGENT to work, it brought me right back to where I was: a string(9259) of all blanks.
I am attempting to retrieve the tracking information on multiple packages automatically and the code for a single iteration is as follows:
function curl($url)
{
$ch = curl_init();
curl_setopt( $ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.52 Safari/537.17' );
curl_setopt( $ch, CURLOPT_SSL_VERIFYPEER, false );
curl_setopt( $ch, CURLOPT_URL, $url );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, 1 );
$data = curl_exec( $ch );
var_dump( curl_getinfo( $ch ) );
echo curl_errno( $ch ) . '<br/>';
echo curl_error( $ch ) . '<br/>';
curl_close( $ch );
return $data;
}
$url for one instance is https://www.fedex.com/fedextrack/?tracknumbers=055575670028673&cntry_code=us
My question is essentially why am I receiving the string(9259) of blank characters? I expected to receive an actual string representation of the website.
Maybe Fedex doesn't like it when you scrape pages, has detected that you are doing so, and is returning dummy data?
They do have APIs for this: http://www.fedex.com/us/developer/web-services/index.html
I would like to scrape the content of this Google search result page using curl.
I've tried setting different user agents and other options, but I just can't seem to get the content of that page; I often get redirected or get a "page moved" error.
I believe it has something to do with the fact that the query string gets encoded somewhere but I'm really not sure how to get around that.
//$url is the same as the link above
$ch = curl_init();
$user_agent='Mozilla/5.0 (Windows NT 6.1; rv:8.0) Gecko/20100101 Firefox/8.0';
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_USERAGENT, $user_agent);
curl_setopt ($ch, CURLOPT_HEADER, 0);
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch,CURLOPT_CONNECTTIMEOUT,120);
curl_setopt ($ch,CURLOPT_TIMEOUT,120);
curl_setopt ($ch,CURLOPT_MAXREDIRS,10);
curl_setopt ($ch,CURLOPT_COOKIEFILE,"cookie.txt");
curl_setopt ($ch,CURLOPT_COOKIEJAR,"cookie.txt");
echo curl_exec ($ch);
What do I need to do to get my php code to show the exact content of the page as I would see it on my browser? What am I missing? Can anyone point me to the right direction?
I've seen similar questions on SO, but none with an answer that could help me.
EDIT:
I tried to just open the link using the Selenium WebDriver, that gives the same results as cURL. I am still thinking that this has to do with the fact that there are special characters in the query string which are getting messed up somewhere in the process.
This is how:
/**
* Get a web file (HTML, XHTML, XML, image, etc.) from a URL. Return an
* array containing the HTTP server response header fields and content.
*/
function get_web_page( $url )
{
$user_agent='Mozilla/5.0 (Windows NT 6.1; rv:8.0) Gecko/20100101 Firefox/8.0';
$options = array(
CURLOPT_CUSTOMREQUEST =>"GET", //set request type post or get
CURLOPT_POST =>false, //set to GET
CURLOPT_USERAGENT => $user_agent, //set user agent
CURLOPT_COOKIEFILE =>"cookie.txt", //set cookie file
CURLOPT_COOKIEJAR =>"cookie.txt", //set cookie jar
CURLOPT_RETURNTRANSFER => true, // return web page
CURLOPT_HEADER => false, // don't return headers
CURLOPT_FOLLOWLOCATION => true, // follow redirects
CURLOPT_ENCODING => "", // handle all encodings
CURLOPT_AUTOREFERER => true, // set referer on redirect
CURLOPT_CONNECTTIMEOUT => 120, // timeout on connect
CURLOPT_TIMEOUT => 120, // timeout on response
CURLOPT_MAXREDIRS => 10, // stop after 10 redirects
);
$ch = curl_init( $url );
curl_setopt_array( $ch, $options );
$content = curl_exec( $ch );
$err = curl_errno( $ch );
$errmsg = curl_error( $ch );
$header = curl_getinfo( $ch );
curl_close( $ch );
$header['errno'] = $err;
$header['errmsg'] = $errmsg;
$header['content'] = $content;
return $header;
}
Example
// Read a web page and check for errors:
$result = get_web_page( $url );
if ( $result['errno'] != 0 ) {
    // error: bad url, timeout, redirect loop ...
    die( 'curl error: ' . $result['errmsg'] );
}
if ( $result['http_code'] != 200 ) {
    // error: no page, no permissions, no service ...
    die( 'HTTP error: ' . $result['http_code'] );
}
$page = $result['content'];
For a realistic approach that most closely emulates human behavior, you may want to add a referer to your curl options. You may also want to enable CURLOPT_FOLLOWLOCATION. Trust me, whoever said that cURLing Google results is impossible is a complete dolt and should throw his/her computer against the wall in hopes of never returning to the internetz again.
Everything that you can do "IRL" with your own browser can be emulated using PHP cURL or libcurl in Python. You just need to do more cURLs to get buff. Then you will see what I mean. :)
// Encode only the search term; encoding the whole URL would break it
$url = "http://www.google.com/search?q=".urlencode($strSearch)."&hl=en&start=0&sa=N";
$ch = curl_init();
curl_setopt($ch, CURLOPT_REFERER, 'http://www.example.com/1');
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_VERBOSE, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible;)");
curl_setopt($ch, CURLOPT_URL, $url);
$response = curl_exec($ch);
curl_close($ch);
Try This:
// Encode only the search term; encoding the whole URL would break it
$url = "http://www.google.com/search?q=".urlencode($strSearch)."&hl=en&start=0&sa=N";
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_VERBOSE, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible;)");
curl_setopt($ch, CURLOPT_URL, $url);
$response = curl_exec($ch);
curl_close($ch);
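Only the search term should be percent-encoded. Encoding the entire URL also encodes the ://, ? and & characters, so curl no longer sees a valid URL. One way to avoid the mistake is to let http_build_query assemble the query string; in this sketch the function name buildSearchUrl is hypothetical, while the parameter names are taken from the URL above.

```php
<?php
// Hypothetical helper: builds the search URL with each value encoded once.
function buildSearchUrl(string $query, int $start = 0): string
{
    $params = [
        'q'     => $query,   // raw, unencoded search term
        'hl'    => 'en',
        'start' => $start,
        'sa'    => 'N',
    ];
    // Explicit '&' separator so the result does not depend on php.ini settings
    return 'http://www.google.com/search?' . http_build_query($params, '', '&');
}

// Usage: curl_setopt($ch, CURLOPT_URL, buildSearchUrl($strSearch));
```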
I suppose you have noticed that your link is actually an HTTPS link....
It seems that your cURL parameters do not include any kind of SSL handling... maybe this could be your problem.
Why don't you try a non-HTTPS link to see what happens (e.g. Google Custom Search Engine)...?
Get content with cURL in PHP
This requires the server to support the cURL functions (the curl extension has to be enabled in the server's PHP configuration, typically php.ini).
function UrlOpener($url)
{
    global $output;
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    // Return the response as a string instead of printing it directly
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $output = curl_exec($ch);
    curl_close($ch);
    echo $output;
}
To get content from the Google cache with cURL, you can use this URL: http://webcache.googleusercontent.com/search?q=cache: followed by your URL
Sample: http://urlopener.mixaz.net/
I've been working on writing a script which automatically logs me into my school's network, checks if the classes I'm trying to get into are no longer completely full, and if a spot has opened up, registers the class for me. However, I've hit a big snag in just the logging-in process.
Basically, I've been looking at the headers that are sent when I log in and try to replicate them. The problem is I keep getting an error saying "HTTP/1.1 400 Bad Request Content-Type: text/html Date: Sat, 23 Oct 2010 18:42:20 GMT Connection: close Content-Length: 42
Bad Request (Invalid Header Name)".
I'm guessing it has something to do with Host parameter I'm setting being different from what it really is (I set it so it is elion.psu.edu, but when looking at the headers from my script it has changed back to grantbachman.com, where the script is hosted). I guess it'll be best just to show you.
The beginning of the header I'm trying to create:
https://elion.psu.edu/cgi-bin/elion-student.exe/submit
POST /cgi-bin/elion-student.exe/submit HTTP/1.1
Host: elion.psu.edu
The beginning of the header which shows up when I run my script:
http://myDomain.com/myScriptName.php
GET /elionScript.php HTTP/1.1
Host: myDomain.com
Basically, the first line is different, the Host name is different, and it says I'm sending my info with GET instead of POST (even though I set CURLOPT_POST to true). I'm looking for any help with altering this info so that the server accepts my script. I'm fresh out of ideas. Thanks.
Oh here's the code I'm using:
$data = array(
"$userIDName" => '********',
"$passName" => '********',
"$submitName" => 'Login+to+eLion',
'submitController' => '',
'forceUnicode' => '%D0%B4%D0%B0',
'sessionKey' => "$sessionValue",
'pageKey' => "$pageKeyValue",
'shopperID' => '');
$contentLength = strlen($userIDName . '=*********&' . $passName . '=********&' . $submitName .'=Login+to+eLion&submitController=&forceUnicode=%D0%B4%D0%B0&sessionKey=' . $sessionValue . '&pageKey=' . $pageKeyValue . '&shopperID=');
$ch = curl_init("https://elion.psu.edu/cgi-bin/elion-student.exe/submit");
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_COOKIE,'sessionKey="$sessionValue";pageKey="$pageKeyValue";BIGipServerelion_prod_pool="$prodPoolValue"');
curl_setopt($ch,CURLOPT_HTTPHEADER,array(
'Host: elion.psu.edu',
'User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.10) Gecko/20100914 Firefox/3.6.10',
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language: en-us,en;q=0.5','Accept-Encoding: gzip,deflate',
'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7',
'Keep-Alive: 115',
'Referer: https://elion.psu.edu/cgi-bin/elion-student.exe/launch/ELionMainGUI/Student',
"Cookie: sessionKey=$sessionValue; pageKey=$pageKeyValue; BIGipServerelion_prod_pool=$prodPoolValue", 'Content-Type: application/x-www-form-urlencoded',
"Content-Length: $contentLength"));
$contents2 = curl_exec ($ch);
It's also probably important to note that when I run the script, none of the information below the 'Keep-Alive: 115' line is displayed when I view the header.
It seems some code is missing from your question, but try this:
1- Save your certificate on your server
2- Try this code
$pg = curl_init();
// Set the form data for posting the login information
$postData = array();
$postData["username"] = $username;
$postData["password"] = $password;
// URL-encode each key and value so special characters survive the POST
$pairs = array();
foreach( $postData as $key => $value ) {
    $pairs[] = urlencode( $key ) . "=" . urlencode( $value );
}
$postText = implode( "&", $pairs );
curl_setopt( $pg, CURLOPT_URL, $YOUR_URL );
curl_setopt( $pg, CURLOPT_POST, true );
curl_setopt( $pg, CURLOPT_POSTFIELDS, $postText );
curl_setopt( $pg, CURLOPT_SSL_VERIFYPEER, true );
curl_setopt( $pg, CURLOPT_SSL_VERIFYHOST, 2 );
curl_setopt( $pg, CURLOPT_CAINFO, getcwd() . '/web' ); // web is the exported certificate
//curl_setopt( $pg, CURLOPT_VERBOSE, true ); // for debugging
//curl_setopt( $pg, CURLOPT_RETURNTRANSFER, true ); // if you want the output returned
curl_setopt( $pg, CURLOPT_COOKIEJAR, "cookies.txt" );
curl_setopt( $pg, CURLOPT_COOKIEFILE, "cookies.txt" );
curl_setopt( $pg, CURLOPT_USERAGENT, "Mozilla/5.0 (compatible; MSIE 5.01; Windows NT 5.0)" );
curl_setopt( $pg, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $pg, CURLOPT_COOKIE, session_name() . '=' . session_id() );
if( ( $response = curl_exec( $pg ) ) === false ) {
    echo 'Curl error: ' . curl_error( $pg ) . "\n";
}
curl_close( $pg );
$YOUR_URL: https://elion.psu.edu/cgi-bin/elion-student.exe/launch/ELionMainGUI/Student
The form uses dynamic field names, so it's not as simple as "username" and "password". Check the form on the website to find the right ones.
Do not forget to add the other hidden fields, as you did in your own POST data array, and update the cookie section too.
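One detail worth knowing here: when CURLOPT_POSTFIELDS is given an array, PHP's curl sends the body as multipart/form-data, while a string is sent as application/x-www-form-urlencoded. A hand-written Content-Type or Content-Length header can then disagree with what curl actually sends. The sketch below encodes the fields into a string and leaves both headers to curl; the function names encodeFormBody and postForm are hypothetical, and the field values are expected raw, not already percent-encoded.

```php
<?php
// Hypothetical helper: turn raw field values into a urlencoded form body.
function encodeFormBody(array $fields): string
{
    // Explicit '&' separator so the result does not depend on php.ini settings
    return http_build_query($fields, '', '&');
}

// Hypothetical wrapper: POST the fields and return the response body,
// or false on failure. curl computes Content-Length from the string itself.
function postForm(string $url, array $fields)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => encodeFormBody($fields),
        CURLOPT_RETURNTRANSFER => true,
    ]);
    // The handle is released automatically when $ch goes out of scope.
    return curl_exec($ch);
}
```

With this approach a value such as 'Login to eLion' is passed unencoded and comes out as Login+to+eLion in the body.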