php - curl and source - php

Why do I get an empty source? When I uncomment
//curl_setopt($ch, CURLOPT_URL, 'www.onet.pl');
and comment out
'Accept-Encoding: gzip,deflate,sdch',
everything works fine for www.onet.pl. Why doesn't it work for www.ebok.pl?
<?php
$COOKIEFILE = $_SERVER['DOCUMENT_ROOT'] . '/../data/config/ebok.txt';
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_COOKIEJAR, $COOKIEFILE);
curl_setopt($ch, CURLOPT_COOKIEFILE, $COOKIEFILE);
curl_setopt($ch,CURLOPT_RETURNTRANSFER, 0);
curl_setopt($ch,CURLOPT_HTTPHEADER, array('Content-type: application/x-www-form-urlencoded',
'Accept: text/html,application/xhtml+xml,application/xml',
'Accept-Encoding: gzip,deflate,sdch',
'Accept-Language: pl-PL,pl;q=0.8,en-US;q=0.6,en;q=0.4',
'Connection: keep-alive',
'User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36')
);
//curl_setopt($ch, CURLOPT_URL, 'http://www.ebok.pl');
curl_setopt($ch, CURLOPT_URL, 'https://ssl.plusgsm.pl/ebok-web/');
//curl_setopt($ch, CURLOPT_URL, 'www.onet.pl');
echo curl_exec($ch);
?>
I need to log in to this page.

If something isn't working right, there's almost always a way to find out why. In curl's case, curl_error will tell you.
If I change your curl_exec line to instead say
$result = curl_exec($ch);
if ($result === false) die(curl_error($ch));
I get this:
[cHao#hydra-vm ~]$ php curl.php
error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac%
It looks like this has something to do with the server doing SSL/TLS handshaking incorrectly, but I don't know enough about curl or SSL to say for sure.
Reason #15 to always check the result from functions that can return errors.
Either way, if I force the SSL version to TLS v1.0, like so:
curl_setopt($ch, CURLOPT_SSLVERSION, 4); // 4 is the value of CURL_SSLVERSION_TLSv1_0
then it works for me. Generic TLSv1 fails with the same error as above, versions higher than 1.0 give me an error about "unsupported protocol", SSLv3 says "alert handshake failure", and SSLv2 simply isn't supported on my machine.
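The check above can be wrapped in a small helper so that a failed transfer never shows up as an empty page. This is only a sketch: fetch_or_fail() is a hypothetical name, and the 10 second connect timeout is an arbitrary default, not something from the question.

```php
<?php
// Sketch: run a cURL request and surface the transport error instead of
// silently returning an empty body. fetch_or_fail() is a hypothetical helper.
function fetch_or_fail(string $url, array $extraOptions = []): string
{
    $ch = curl_init();

    // Caller-supplied options win over the defaults ("+" keeps the left-hand keys).
    curl_setopt_array($ch, $extraOptions + [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_CONNECTTIMEOUT => 10,     // assumed default, adjust as needed
    ]);

    $body = curl_exec($ch);
    if ($body === false) {
        // curl_errno()/curl_error() describe why the transfer failed.
        throw new RuntimeException(
            sprintf('cURL error %d: %s', curl_errno($ch), curl_error($ch))
        );
    }

    return $body;
}
```

With this, the TLS workaround becomes fetch_or_fail('https://ssl.plusgsm.pl/ebok-web/', [CURLOPT_SSLVERSION => 4]), and any handshake failure arrives as an exception message rather than a blank response.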

Related

php curl is not working while debug proxy server like fiddler is working

I am trying to scrape a site which is behind Cloudflare.
My code is:
$ch = curl_init();
curl_setopt($ch,CURLOPT_URL, 'https://targetsite');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
curl_setopt($ch, CURLOPT_POSTFIELDS, '{"current_bid_status":true}');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_HTTPHEADER, [
'User-Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.130 Safari/537.36',
'Accept: application/json',
'Accept-Language: en-US,en;q=0.5',
'Content-Type: application/x-www-form-urlencoded',
'Content-Length: '.strlen($data)
]);
$result = curl_exec($ch);
$status = curl_getinfo ($ch);
The response status was 403, and the response body was "error code: 1020".
It looks like Cloudflare is blocking the request.
But when I add a Fiddler proxy:
curl_setopt($ch, CURLOPT_PROXY, '127.0.0.1:8888');
It works nicely!
What could be the reason here? Is it something related to the SSL certificate?
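One thing that stands out in the snippet, separately from the Fiddler question: $data is never defined, so the Content-Length header is computed from an empty value while the body is the JSON string, and the Content-Type says form-encoded although the payload is JSON. A minimal sketch of a self-consistent request follows. build_json_post_options() is a hypothetical helper, 'https://targetsite' is the placeholder from the question, and nothing here is claimed to be the cause of the 1020 response or to explain why the request succeeds through Fiddler.

```php
<?php
// Sketch: build a self-consistent JSON POST. build_json_post_options() is a
// hypothetical helper; the URL passed in below is the question's placeholder.
function build_json_post_options(string $url, array $payload): array
{
    $body = json_encode($payload);

    return [
        CURLOPT_URL            => $url,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $body,   // a string is sent as-is
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_HTTPHEADER     => [
            'Accept: application/json',
            'Content-Type: application/json',
            // No Content-Length here: libcurl derives it from CURLOPT_POSTFIELDS.
        ],
    ];
}

$options = build_json_post_options('https://targetsite', ['current_bid_status' => true]);
```

The array can then be applied with curl_setopt_array($ch, $options) before calling curl_exec($ch).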

GnuTLS recv error (-54): Error in the pull function

I have a PHP scraper that works perfectly on my local machine. But when I uploaded it to my VPS (Ubuntu 16.04), it was not able to get data from the website. Instead, it shows this error message:
"curl: (56) GnuTLS recv error (-54): Error in the pull function"
I updated OpenSSL, curl and GnuTLS, still no luck. I tried running curl from the command line and it showed the same error, so it must have something to do with curl or GnuTLS. I saw that some people had the same error message when using Git and fixed it, but that solution isn't working in my case. Is there any way to fix it?
Here is the PHP function I use to get data from website:
function get_html($url)
{
$agent= 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.92 Mobile Safari/537.36';
$ch = curl_init();
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
'Accept-Language: q=0.9,en-US;q=0.8,en;q=0.7',
'Connection: keep-alive'
));
curl_setopt($ch, CURLOPT_URL,$url);
$pageContent = curl_exec($ch);
curl_close($ch);
return $pageContent;
}
Thanks in advance.
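Since get_html() sets CURLOPT_VERBOSE but never looks at the error or the log, a first diagnostic step could be to capture both and to check which TLS library PHP's curl was built against. The sketch below assumes a hypothetical get_html_debug() function and an arbitrary 10 second connect timeout; it only collects information and is not a fix for the GnuTLS error.

```php
<?php
// Sketch: same kind of request as get_html(), but returning the error and the
// verbose log instead of discarding them. get_html_debug() is a hypothetical name.
function get_html_debug(string $url): array
{
    $log = fopen('php://temp', 'w+');      // in-memory buffer for the verbose log

    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_CONNECTTIMEOUT => 10,      // assumed value
        CURLOPT_VERBOSE        => true,
        CURLOPT_STDERR         => $log,    // send verbose output to the buffer
    ]);

    $body  = curl_exec($ch);
    $errno = curl_errno($ch);
    $error = curl_error($ch);

    // Read back everything libcurl wrote while the request ran.
    rewind($log);
    $verbose = stream_get_contents($log);

    return [
        'body'    => $body,                          // false on failure
        'errno'   => $errno,                         // 56 is CURLE_RECV_ERROR
        'error'   => $error,
        'verbose' => $verbose,
        'tls'     => curl_version()['ssl_version'],  // TLS backend and version string
    ];
}
```

Comparing the 'tls' value between the local machine and the VPS shows whether the two environments use different TLS backends, which is a reasonable thing to rule out when only one of them fails.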

PHP Curl disable cache

I have a PHP script that periodically performs an HTTP request to a remote API server.
I use curl to perform this task. My code is like the following example:
$url='http://apiserver.com/services/someservice.php?apikey='.$my_key;
$ch=curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Accept-Encoding: gzip, deflate',
'Accept: */*',
'Connection: keep-alive',
'User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:48.0) Gecko/20100101 Firefox/48.0'
));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
$data=curl_exec($ch);
curl_close($ch);
$json_data = json_decode($data,TRUE);
Recently I noticed that it no longer fetches new data; it is using a cached version instead. I tried adding the following curl options to the code:
curl_setopt($ch, CURLOPT_FRESH_CONNECT, 1);
curl_setopt($ch, CURLOPT_FORBID_REUSE, 1);
This did not solve the problem. I still receive the same cached response.
If I add an extra parameter to the API URL, like "&time=".time(), that fixes the problem, but I don't want to add extra parameters to the URL.
What can I do to solve the problem?
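libcurl does not keep a cache of response bodies, and CURLOPT_FRESH_CONNECT and CURLOPT_FORBID_REUSE only control connection reuse, so the stale response most likely comes from the server or from a cache in between. One option that leaves the URL untouched is to send request headers asking for revalidation. The sketch below uses a hypothetical build_no_cache_options() helper; whether the remote side honors these headers is an assumption that has to be checked against the actual API.

```php
<?php
// Sketch: ask caches along the way to revalidate, without touching the URL.
// build_no_cache_options() is a hypothetical helper.
function build_no_cache_options(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => 30,
        // '' lets libcurl advertise every encoding it supports and decode the
        // response itself, instead of a hand-written Accept-Encoding header.
        CURLOPT_ENCODING       => '',
        CURLOPT_HTTPHEADER     => [
            'Accept: */*',
            'Cache-Control: no-cache',  // HTTP/1.1 request directive
            'Pragma: no-cache',         // HTTP/1.0 equivalent for older caches
        ],
    ];
}
```

Usage would be curl_setopt_array($ch, build_no_cache_options($url)) followed by curl_exec($ch). Letting CURLOPT_ENCODING handle compression also means json_decode() receives the decoded body rather than gzip bytes.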

PHP cURL doesn't follow redirects even if the flag is set

Even though I have set curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true), cURL doesn't follow redirects; it only shows the "301 Moved" page. I tried it with multiple sites.
The strange thing is that it works on localhost, but when I upload it to my web space it refuses to work.
Is it possible that my web hosting provider made some tweaks that prevent it from working? I've never seen such a thing :(
Here's the code:
$ch=curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://google.com');
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_ENCODING, '');
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36');
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language: en-US,en;q=0.5',
'Accept-Encoding: gzip, deflate',
'Connection: keep-alive'
));
$result = curl_exec($ch);
curl_close($ch);
I had a similar issue, and it was due to cURL executing a GET immediately after receiving the redirect header. To fix this I specified CURLOPT_CUSTOMREQUEST.
Example:
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
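For the case where it works locally but not on shared hosting, another thing worth checking is the return value of the curl_setopt() call itself: on older PHP versions, enabling CURLOPT_FOLLOWLOCATION is refused with a warning when open_basedir or safe_mode is in effect. Whether that applies to this particular host is an assumption. If it does, redirects can be followed by hand, roughly as sketched below. extract_location() and fetch_following_redirects() are hypothetical helpers, and the loop assumes the Location header carries an absolute URL.

```php
<?php
// Sketch: follow redirects manually when CURLOPT_FOLLOWLOCATION is unavailable.
// Both functions are hypothetical helpers, not part of the cURL extension.

// Return the first Location header value from a raw header block, or null.
function extract_location(string $rawHeaders): ?string
{
    // Header names are case-insensitive; "m" makes ^ and $ match per line.
    if (preg_match('/^Location:[ \t]*(.+?)[ \t]*$/mi', $rawHeaders, $m)) {
        return trim($m[1]);   // trim() also drops the trailing "\r"
    }
    return null;
}

function fetch_following_redirects(string $url, int $maxRedirects = 5): array
{
    for ($hop = 0; $hop <= $maxRedirects; $hop++) {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_HEADER         => true,   // keep response headers in the result
            CURLOPT_CONNECTTIMEOUT => 10,
        ]);

        $raw = curl_exec($ch);
        if ($raw === false) {
            return ['url' => $url, 'status' => 0, 'body' => false, 'error' => curl_error($ch)];
        }

        // Split the raw response into header block and body.
        $status     = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
        $headers    = substr($raw, 0, $headerSize);
        $body       = substr($raw, $headerSize);

        $next = ($status >= 300 && $status < 400) ? extract_location($headers) : null;
        if ($next === null) {
            return ['url' => $url, 'status' => $status, 'body' => $body, 'error' => ''];
        }
        $url = $next;   // assumes an absolute URL in Location
    }

    return ['url' => $url, 'status' => 0, 'body' => false, 'error' => 'too many redirects'];
}
```

A call such as fetch_following_redirects('http://google.com') would then return the final URL, status and body, or an error string if the transfer failed or the redirect limit was hit.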

PHP Curl Randomly Hangs

I wrote a PHP script that uses curl to fetch a page's HTML. The page content comes back about 50% of the time; the rest of the time only part of the content is returned and the script fails to terminate. I'm not getting any errors...
$headers = array(
'Accept-Language: en-US,en;q=0.8',
'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143 Safari/537.36',
'Content-Type: application/x-www-form-urlencoded',
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'
);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.youtube.com/channel/UCkN6ktadXpZl_viwRCSEGUQ');
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
curl_setopt($ch, CURLOPT_FAILONERROR, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
$curl_info = curl_getinfo($ch);
$response = curl_exec($ch);
curl_close($ch);
print $response;
print_r($curl_info);
Run on CLI:
php script_name.php
If you run it 10 times or so, you will see that it fails to complete at least a few times, with no warnings or errors...
Ubuntu had performed a bunch of system updates. I was working with the code when the updates finished. After a system restart this issue went completely away. Go figure.
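Independently of what the system update changed, two details in the script make a hang harder to diagnose: CURLOPT_CONNECTTIMEOUT only limits the connection phase, so without CURLOPT_TIMEOUT a stalled transfer can wait indefinitely, and curl_getinfo() is called before curl_exec(), so it cannot report anything about the transfer. A defensive sketch, with a hypothetical fetch_with_timeout() helper and arbitrary default limits:

```php
<?php
// Sketch: bound the whole transfer and read curl_getinfo() after the request.
// fetch_with_timeout() and its default limits are hypothetical.
function fetch_with_timeout(string $url, int $connectTimeout = 5, int $totalTimeout = 30): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_FAILONERROR    => true,
        CURLOPT_CONNECTTIMEOUT => $connectTimeout, // limit for establishing the connection
        CURLOPT_TIMEOUT        => $totalTimeout,   // limit for the entire transfer
    ]);

    $response = curl_exec($ch);

    return [
        'response' => $response,          // false on failure, including timeouts
        'errno'    => curl_errno($ch),    // 28 is CURLE_OPERATION_TIMEDOUT
        'error'    => curl_error($ch),
        'info'     => curl_getinfo($ch),  // populated because the transfer has run
    ];
}
```

With the total timeout in place, a stalled request ends with errno 28 and an error message instead of leaving the CLI process hanging.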
