Hello StackOverflow community, I've run into a problem when using the cURL functions in PHP. I tried this sample code:
$html_brand = "www.google.com/";
$ch = curl_init();
$options = array(
CURLOPT_URL => $html_brand,
CURLOPT_RETURNTRANSFER => true,
CURLOPT_HEADER => true,
CURLOPT_HTTPHEADER => array("User-Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36"),
CURLOPT_FOLLOWLOCATION => true,
CURLOPT_SSL_VERIFYPEER => false,
CURLOPT_SSL_VERIFYHOST => false,
CURLOPT_ENCODING => "",
CURLOPT_AUTOREFERER => true,
CURLOPT_CONNECTTIMEOUT => 120,
CURLOPT_TIMEOUT => 120,
CURLOPT_MAXREDIRS => 10,
);
curl_setopt_array( $ch, $options );
$response = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
if ( $httpCode != 200 ){
echo "Return code is {$httpCode} \n"
.curl_error($ch);
} else {
echo "<pre>".htmlspecialchars($response)."</pre>";
}
curl_close($ch);
And it always ends with an error, displaying it on screen:
Return code is 0 Recv failure: Connection was reset
Or this error, when trying to reach any site with https:
Return code is 0 Failed to connect to www.google.com port 443: Timed out
These are my settings:
Windows 7 Professional 32 bit
Apache 2.4.12
PHP 5.6.11
Is this an error in my code, or is there some server configuration I have not considered?
The HTTP_HOST value in Apache is localhost:8080. I'm not sure whether that has anything to do with my problem, but it may be worth noting.
Thanks to all of you in advance.
I've developed a custom function that works fine for GET, POST and AJAX requests.
Here it is:
function spider($header = array(), $referer = false, $url, $cookie = false,$post = false)
{
if (!$cookie)
{
$cookie = "cookie.txt";
}
$ch = curl_init();
curl_setopt($ch, CURLOPT_ENCODING, 'gzip,deflate,sdch');
if (isset($header) && !empty($header))
{
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
}
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 200);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 5);
curl_setopt($ch, CURLOPT_USERAGENT,
"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7");
// Note: realpath() returns false if the cookie file does not exist yet, so create it beforehand.
curl_setopt($ch, CURLOPT_COOKIEJAR, realpath($cookie));
curl_setopt($ch, CURLOPT_COOKIEFILE, realpath($cookie));
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
if (isset($referer) && $referer != false)
{
curl_setopt($ch, CURLOPT_REFERER, $referer);
} else
{
curl_setopt($ch, CURLOPT_REFERER, $url);
}
// if there is data to POST to the server
if (isset($post) && !empty($post) && $post)
{
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
} //endif
$data = curl_exec($ch);
$info = curl_getinfo($ch);
print_r($info);
curl_close($ch);
if($info['http_code'] == 200){
return ($data);
}else{
return false;
}
}
The parameters are:
$header => Headers to send to the server; this should be an array of "Name: value" strings, which is what CURLOPT_HTTPHEADER expects.
$referer => Referrer for the request (if any); when omitted, $url is used.
$url => URL that you want to fetch.
$cookie => Cookie file; it should be in the same directory as the calling script.
$post => Data to POST (if any).
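A note on the $header format: CURLOPT_HTTPHEADER takes a list of complete "Name: value" lines, not name => value pairs. If your headers live in an associative array, something like the sketch below could convert them first. format_headers is a hypothetical helper name of mine, not part of the function above.

```php
<?php
// Hypothetical helper (not part of the original function): turns
// array('Accept' => 'text/html') into array('Accept: text/html').
function format_headers(array $headers): array
{
    $lines = array();
    foreach ($headers as $name => $value) {
        // Entries with integer keys are assumed to be complete header lines already.
        $lines[] = is_int($name) ? $value : $name . ': ' . $value;
    }
    return $lines;
}

$header = format_headers(array(
    'Accept'          => 'text/html',
    'Accept-Language' => 'en-US,en;q=0.8',
));
```

The resulting $header can then be passed as the first argument of spider().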
I tested this function as follows.
echo spider(FALSE, FALSE, 'https://www.google.com/ncr');
Here is the curl_getinfo() output that it printed:
Array
(
[url] => https://www.google.com/
[content_type] => text/html; charset=UTF-8
[http_code] => 200
[header_size] => 1665
[request_size] => 701
[filetime] => -1
[ssl_verify_result] => 20
[redirect_count] => 1
[total_time] => 2.215
[namelookup_time] => 0
[connect_time] => 0
[pretransfer_time] => 0
[size_upload] => 0
[size_download] => 9024
[speed_download] => 4074
[speed_upload] => 0
[download_content_length] => 0
[upload_content_length] => 0
[starttransfer_time] => 0.437
[redirect_time] => 1.7
[certinfo] => Array
(
)
[redirect_url] =>
)
Hope this will help you...
IP URL with Port is not working in cURL PHP
$url = 'http://IP:PORT/API_URL';
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.16) Gecko/20110319 Firefox/3.6.16");
$curlData = curl_exec($curl);
if (curl_errno($curl)) {
echo 'Error: ' . curl_error($curl);
}
curl_close($curl);
echo $curlData;
The code above returns "Error: Failed to connect to [IP] port [PORT]: Connection refused" with the latest PHP, and times out with older PHP.
Second Case:
$url = 'http://IP/API_URL';
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($curl, CURLOPT_PORT, [PORT]);
curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.16) Gecko/20110319 Firefox/3.6.16");
$curlData = curl_exec($curl);
echo 'curl_getinfo: ';
echo "<pre>";
print_r(curl_getinfo($curl));
echo "</pre>";
if (curl_errno($curl)) {
echo 'Error: ' . curl_error($curl);
}
curl_close($curl);
echo $curlData;
And the output of the above case is:
curl_getinfo:
Array
(
[url] => http://IP:PORT/API_URL
[content_type] =>
[http_code] => 0
[header_size] => 0
[request_size] => 0
[filetime] => -1
[ssl_verify_result] => 0
[redirect_count] => 0
[total_time] => 0.201014
[namelookup_time] => 2.7E-5
[connect_time] => 0
[pretransfer_time] => 0
[size_upload] => 0
[size_download] => 0
[speed_download] => 0
[speed_upload] => 0
[download_content_length] => -1
[upload_content_length] => -1
[starttransfer_time] => 0
[redirect_time] => 0
[redirect_url] =>
[primary_ip] =>
[certinfo] => Array
(
)
[primary_port] => 0
[local_ip] =>
[local_port] => 0
)
Error: Failed to connect to [IP] port [PORT]: Connection refused
I would first test whether the port is open; you can use telnet IP PORT. There is a high chance the port is being blocked. Another possibility is that you are trying to connect to a port that requires SSL, while your code has certificate verification switched off. Try generating a valid certificate and try again.
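If telnet is not available on the machine, roughly the same check can be done from PHP itself. This is only a minimal sketch: port_is_open is a hypothetical helper name, and the 5 second timeout is an arbitrary choice of mine.

```php
<?php
// Hypothetical helper: returns true if a TCP connection to $host:$port
// can be opened within $timeout seconds, false otherwise.
function port_is_open(string $host, int $port, float $timeout = 5.0): bool
{
    $errno = 0;
    $errstr = '';
    // @ suppresses the warning fsockopen() raises on failure;
    // the result is reported through the return value instead.
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket === false) {
        return false;
    }
    fclose($socket);
    return true;
}
```

Call it with the same IP and PORT you use in the URL, from the same server that runs the cURL script. If it returns false there, the problem is in the network or firewall rather than in the cURL options.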
I wrote a script that sends some data to an external server. Everything works fine so far. The problem is that when I build the URL from a variable, which is necessary, the POST request does not succeed. When I echo both URLs, they look exactly the same. The $action variable comes from an array and has the value /index.php?param=value.
$regexp = '/<form(.*?)action="(.*?)"(.*?)>(.*?)<\/form>/';
preg_match_all($regexp, $body, $form);
$action = $form[2][0];
$url = 'https://www.homepage.com/index.php?param=value'; // success
$url = 'https://www.homepage.com'.$action; // no success
$post_data = array(
'param1' => 'value1',
'param2' => 'value2'
);
$data = array(
'url' => $url,
'post_data' => $post_data
);
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3 AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 5);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false );
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false );
curl_setopt($ch, CURLOPT_ENCODING, "" );
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt($ch, CURLOPT_COOKIEFILE, getcwd () . '/cookie.txt' );
curl_setopt($ch, CURLOPT_COOKIEJAR, getcwd () . '/cookie.txt' );
curl_setopt($ch, CURLOPT_URL, $url);
curl_exec($ch);
There is no curl_error($ch) and for curl_getinfo($ch) I get the following in both cases:
Array
(
[url] => https://www.homepage.com/index.php?param=value
[content_type] => text/html; charset=utf-8
[http_code] => 200
[header_size] => 437
[request_size] => 865
[filetime] => -1
[ssl_verify_result] => 0
[redirect_count] => 0
[total_time] => 0.12154
[namelookup_time] => 2.1E-5
[connect_time] => 0.003557
[pretransfer_time] => 0.014587
[size_upload] => 468
[size_download] => 1248
[speed_download] => 10268
[speed_upload] => 3850
[download_content_length] => -1
[upload_content_length] => 468
[starttransfer_time] => 0.057372
[redirect_time] => 0
[redirect_url] =>
[primary_ip] => 00.00.00.00 // changed by myself
[certinfo] => Array
(
)
[primary_port] => 443
[local_ip] => 000.000.000.000 // changed by myself
[local_port] => 62352
)
EDIT
When I run
$url_1 = 'https://www.homepage.com/index.php?param=value'; // success
$url_2 = 'https://www.homepage.com'.$action; // no success
if ($url_1 == $url_2) {
echo 'MATCH';
} else {
echo $url_1.'_';
echo $url_2.'_';
}
The else branch runs, so the two strings are not the same, even though they look identical. The _ in the else branch is there to reveal trailing whitespace.
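To find out where two look-alike strings actually differ, a byte-level comparison might help. The sketch below is only an illustration: first_difference is a hypothetical helper, and the trailing newline in $action is an assumption of mine, since the real cause is not known (an HTML entity such as &amp;amp; in the scraped action attribute would be another candidate).

```php
<?php
// Hypothetical helper: byte offset of the first difference between
// two strings, or null if they are identical.
function first_difference(string $a, string $b): ?int
{
    $len = min(strlen($a), strlen($b));
    for ($i = 0; $i < $len; $i++) {
        if ($a[$i] !== $b[$i]) {
            return $i;
        }
    }
    // Same prefix: they differ only if one string is longer.
    return strlen($a) === strlen($b) ? null : $len;
}

$url_1 = 'https://www.homepage.com/index.php?param=value';
// Assumed for illustration: the scraped action carries a trailing newline.
$action = "/index.php?param=value\n";
$url_2 = 'https://www.homepage.com' . $action;

$offset = first_difference($url_1, $url_2);
if ($offset !== null) {
    echo "Strings differ at byte {$offset}\n";
    // bin2hex() makes invisible bytes visible.
    echo bin2hex(substr($url_2, $offset)), "\n";
}

// One possible cleanup, assuming the difference is whitespace or HTML entities.
$url_clean = 'https://www.homepage.com' . html_entity_decode(trim($action));
```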
I'm trying to scrape a website, but cURL always reports "Empty reply from server".
Can anyone look at the code and tell me what I am doing wrong?
Here is the code:
function spider($url){
$header = array(
"Host" => "www.example.net",
//"Accept-Encoding:gzip,deflate,sdch",
"Accept-Language:en-US,en;q=0.8",
"Cache-Control:max-age=0",
"Connection:keep-alive","Content-Length:725","Content-Type:application/x-www-form-urlencoded",
'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'
,"X-Requested-With:XMLHttpRequest"
);
$cookie = "cookie.txt";
$ch = curl_init();
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0); // return headers 0 no 1 yes
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return page 1:yes
curl_setopt($ch, CURLOPT_TIMEOUT, 200); // http request time-out in seconds
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // Follow redirects, need this if the URL changes
curl_setopt($ch, CURLOPT_MAXREDIRS, 2); //if http server gives redirection response
curl_setopt($ch, CURLOPT_USERAGENT,
"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7");
curl_setopt($ch, CURLOPT_COOKIEJAR, realpath( $cookie)); // cookies storage / here the changes have been made
curl_setopt($ch, CURLOPT_COOKIEFILE, realpath( $cookie));
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // false for https
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS,"view=ViewDistrict&param=7&uniqueid=1397991494188&PHPSESSID=f134vrnv7glosgojvf4n1mp7o2&page=http%3A%2F%2Fwww.example.com%2Fxhr.php");
curl_setopt($ch, CURLOPT_REFERER, $url);
curl_setopt($ch, CURLOPT_REFERER, "http://www.example.com/");
$data = curl_exec($ch); // execute the http request
$info = curl_getinfo($ch);
curl_close($ch); // close the connection
return $data;
}
Here is the function call:
echo spider("http://www.example.net/");
Edit
Array ( [url] => http://www.example.net/ [content_type] => text/html [http_code] => 301 [header_size] => 196 [request_size] => 840 [filetime] => -1 [ssl_verify_result] => 0 [redirect_count] => 1 [total_time] => 61.359 [namelookup_time] => 0 [connect_time] => 0.281 [pretransfer_time] => 0.281 [size_upload] => 0 [size_download] => 0 [speed_download] => 0 [speed_upload] => 0 [download_content_length] => 178 [upload_content_length] => 0 [starttransfer_time] => 60.593 [redirect_time] => 0.766 [certinfo] => Array ( ) [redirect_url] => ) Empty reply from server
This is the curl_getinfo() output now.
I have also updated my POST data; it is now:
curl_setopt($ch, CURLOPT_POSTFIELDS,"view=ViewDistrict&param=7&uniqueid=".time(). rand(101,500)."&PHPSESSID=f134vrnv7glosgojvf4n1mp7o2&page=http%3A%2F%2Fexample.com%2Fxhr.php");
I have also removed "X-Requested-With:XMLHttpRequest" from the headers.
Have you tried removing this from the headers?
X-Requested-With:XMLHttpRequest
My guess is that your problem is in this line:
curl_setopt(
$ch,
CURLOPT_POSTFIELDS,
"view=ViewDistrict&param=7&uniqueid=1397991494188&PHPSESSID=f134vrnv7glosgojvf4n1mp7o2&page=http%3A%2F%2Fwww.example.com%2Fxhr.php"
);
Notice that you're passing a value for PHPSESSID. I'm guessing you copied & pasted a URL from a visit to the site, right? That session ID was probably valid when you visited the site, but the odds of it being valid now are pretty slim. And if the server doesn't like the session ID, chances are it's not going to give you any data.
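One way to avoid a stale session ID might be to let cURL collect a fresh one instead of hard-coding it. The sketch below assumes the server issues its session cookie on a plain GET of the same URL, which I have not verified for this site. build_post_body and fetch_with_fresh_session are hypothetical names, and the field values are just the ones from the question.

```php
<?php
// Hypothetical helper: builds the POST body without a hard-coded PHPSESSID.
// http_build_query() also takes care of URL-encoding the values.
function build_post_body(): string
{
    return http_build_query(array(
        'view'     => 'ViewDistrict',
        'param'    => 7,
        'uniqueid' => '1397991494188',
        'page'     => 'http://www.example.com/xhr.php',
    ), '', '&');
}

// Hypothetical flow: GET first so the cookie jar receives a fresh session
// cookie, then POST on the same handle so that cookie is sent back.
function fetch_with_fresh_session(string $url, string $cookieFile)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);

    // Step 1: plain GET (assumption: this is where the session cookie is set).
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_exec($ch);

    // Step 2: POST on the same handle; cURL reuses the cookies it received.
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, build_post_body());
    $data = curl_exec($ch);
    curl_close($ch);
    return $data; // response body, or false on failure
}
```

If the site hands out the session on a different page, the GET in step 1 would have to target that page instead.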
I'm trying to submit a simple form with cURL. After obtaining the login cookie and submitting the data I want, I get a random response and the form fails to submit. Here is what the request looks like when the browser submits the form:
formPost:cur_product_form_new_86286
curCheckbox:Y
C2cProductsListing[business_hr_from]:10:00:00
C2cProductsListing[business_hr_to]:09:30:00
C2cProductsListing[online_hr]:35
C2cProductsListing[offline_hr]:16
C2cProductsListing[non_business_hr]:17
C2cProductsListing[actual_quantity]:25000
C2cProductsListing[minimum_quantity]:25000
C2cProductsListing[products_base_currency]:USD
C2cProductsListing[products_price]:88
delivery[1]:1
C2cProductsListing[c2c_products_listing_id]:86286
Here is what it looks like when I submit the form:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.g2g.com/sell/manageListingInfo'); // open a protected page
curl_setopt($ch, CURLOPT_REFERER, 'http://www.g2g.com/sell/manageListing?game=2522&product_type=19248');
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.A.B.C Safari/525.13");
$data = array(
'formPost' => 'cur_product_form_new_86286',
'curCheckbox' => 'Y',
'C2cProductsListing' => array(
'business_hr_from' => '09:00:00',
'business_hr_to' => '08:30:00',
'online_hr' => '35',
'offline_hr' => '16',
'non_business_hr' => '17',
'actual_quantity' => '25000',
'minimum_quantity' => '25000',
'products_base_currency' => 'USD',
'products_price' => '25',
'c2c_products_listing_id' => '86286'
),
'delivery' => array(
'1' => '1'
)
);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($data));
curl_setopt($ch, CURLOPT_REFERER, 'http://www.g2g.com/sell/manageListing?game=2522&product_type=19248');
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_FAILONERROR, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
print_r(curl_getinfo($ch));
curl_close($ch);
And here is the completely random response I get:
Array ( [url] => http://www.g2g.com/sell/manageListingInfo [content_type] => [http_code] => 0 [header_size] => 0 [request_size] => 0 [filetime] => 0 [ssl_verify_result] => 0 [redirect_count] => 0 [total_time] => 0 [namelookup_time] => 0 [connect_time] => 0 [pretransfer_time] => 0 [size_upload] => 0 [size_download] => 0 [speed_download] => 0 [speed_upload] => 0 [download_content_length] => -1 [upload_content_length] => -1 [starttransfer_time] => 0 [redirect_time] => 0 [certinfo] => Array ( ) )
The array I just posted is result of print_r(curl_getinfo($ch)). How do I end up here? I emulated the browser request fully (aside from it being cURL and not the actual browser) using all the data necessary, yet the form doesn't seem to be submitted?
You're not calling curl_exec to actually execute the query. Call curl_getinfo afterwards. If you want the data returned use the return value of curl_exec instead of curl_getinfo. For example:
$data = curl_exec($ch);
echo $data;
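A slightly fuller version of the same idea, as a sketch: run curl_exec first, check whether it failed, and only then read curl_getinfo. execute_and_inspect is a hypothetical wrapper name, and the shape of the returned array is my own choice.

```php
<?php
// Hypothetical wrapper: executes the handle first, then collects the info array.
function execute_and_inspect($ch): array
{
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $body = curl_exec($ch); // the request is actually sent here
    if ($body === false) {
        // Transport-level failure: curl_error() describes what went wrong.
        return array('ok' => false, 'error' => curl_error($ch), 'info' => curl_getinfo($ch));
    }
    // curl_getinfo() is only meaningful after curl_exec() has run.
    return array('ok' => true, 'body' => $body, 'info' => curl_getinfo($ch));
}
```

In the script from the question you would call it right before curl_close($ch), in place of the bare print_r(curl_getinfo($ch)), and then look at the 'info' and 'body' entries of the result.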
Here is my URL: https://webservices-dev.compuscan.co.za:9443/PersonStatusService/user2/password2/8310240031083/XML
I am trying to get its content with PHP cURL, but without success. I hope someone can help.
Here is my code:
// you can add other curl options too
// see here - http://php.net/manual/en/function.curl-setopt.php
function get_data($url) {
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
//curl_setopt($ch, CURLOPT_POST, TRUE); // Use POST method
//curl_setopt($ch, CURLOPT_POSTFIELDS, "var1=1&var2=2&var3=3"); // Define POST values
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST,false);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER,false);
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$data = curl_exec($ch);
$output = curl_getinfo($ch);
print_r($output);
curl_close($ch);
return $data;
}
$url = "https://webservices-dev.compuscan.co.za:9443/PersonStatusService/user2/password2/8310240031083/XML" ;
$variablee = get_data($url);
echo $variable;
Well, I am getting the results as expected.
It should be:
$variablee = get_data($url);
echo $variablee; //<--- Must be $variablee (as you have defined it above)
OUTPUT:
Array ( [url] =>
https://webservices-dev.compuscan.co.za:9443/PersonStatusService/user2/password2/8310240031083/XML
[content_type] => text/plain; charset=utf-8 [http_code] => 200
[header_size] => 81 [request_size] => 192 [filetime] => -1
[ssl_verify_result] => 19 [redirect_count] => 0 [total_time] => 1.544
[namelookup_time] => 0 [connect_time] => 0.343 [pretransfer_time] =>
1.186 [size_upload] => 0 [size_download] => 656 [speed_download] => 424 [speed_upload] => 0 [download_content_length] => 656
[upload_content_length] => 0 [starttransfer_time] => 1.544
[redirect_time] => 0 [certinfo] => Array ( ) [primary_ip] =>
196.34.30.23 [primary_port] => 9443 [local_ip] => 192.168.72.37 [local_port] => 61090 [redirect_url] => )
8310240031083CLAIRECSAPLARN1983/10/24FY2012/03/242013/09/308 SKALIE
STWEST ACRES EXT 1312012013/08/130768333118
function get_data($url) {
$ch = curl_init($url);
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$data = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
if ($httpCode == 404) {
curl_close($ch);
return '404';
} else {
curl_close($ch);
return $data;
}
}
This function worked perfectly for me; maybe somebody else will find it as useful as I did. I'm not pretending to answer your question (I'm 9 months late), but to help people who run into a similar issue.