I couldn't find answer on this questions. Sometimes* while trying to retrieve data from http (NOT https) site I get 35 error - SSL connect error.
URL that I'm trying to reach is ie. http://www.aliexpress.com/item//32566080839.html. Then i get redirected to "full url": http://www.aliexpress.com/item/NEW-Sport-Headband-Bike-Halloween-Skull-face-mask-balaclava-Skull-Bandana-Paintball-Ski-Motorcycle-Helmet-Neck/32566080839.html
My cURL code:
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, 'http://aliexpress.com/item//'. $id .'.html');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_TIMEOUT, 3);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.2; WOW64; rv:17.0) Gecko/20100101 Firefox/17.0');
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
$data = curl_exec($curl);
I've been trying to add curl_setopt($curl, CURLOPT_SSLVERSION , 3); but it doesn't help.
Why http site gives a 35 error? Is it normal?
Is it possible that aliexpress i blocking my requests?
Sometimes I also get 28 error which is timeout reached - even with 10 seconds timeout.
*Sometimes - I mean it's working for a few hours then not working for about 10 minutes and then still working.
It looks like you are trying to spider on their site using the Id. And as a consequence the site blocks you. As you are referring to SSL error, it is very likely that during the blockade period they are redirecting you to an error page that starts with https://
For the debugging purpose you can enable the verbose mode and observe the header and you'll find what is inside the Location: response header.
curl_setopt ($curl, CURLOPT_VERBOSE, true);
Related
If we call the same URL in the browser then it auto-download the CSV file. So, I want the same feature using PHP curl to save the CSV file under the same folder. But it gives me an empty result every time. Can you please guide me on what's is wrong in the code below?
$url="https://www.centrano.com/catalog_download.php?email=info#sporttema.com&password=dHB3L1FpTEg1c2pLZ29SUkdnUWcwWTFqN2RIamQx";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
$agent= "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36";
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, false);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 100000); //time out of 15 seconds
$output = curl_exec($ch);
print_r($output);
curl_close($ch);
Add the following two CURL options to make it work:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_COOKIEJAR, '/tmp/cookies.txt');
The page redirects several times to the same host and requires for the session cookies to remain present (thus, storing them in a cookie jar.
Also: change your centrano.com account password IMMEDIATELY! Even though it helped solving this, it's generally not a good idea to make it public.
First at all, sorry for my bad English.
I'm trying to get the HTML code from https://www.uasd.edu.do/ but when I try to catch the code with the PHP function "file_get_contents()" or using cURL, it just simply doesn't work.
With "file_get_contents()" it returns with a 403 HTTP error. With cURL, it returns with a fictional captcha that just do not appear.
I tried sending Cookies with cURL, setting a user-agent, but I'm still on the same point. Also I tried to find the real IP address of the site, but with not success. Please help me! I'll really appreciate that.
The code:
$curl = curl_init();
if (!$curl) {
die("Is not working");
}
curl_setopt($curl, CURLOPT_URL, "https://uasd.edu.do/");
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0');
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_FAILONERROR, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($curl, CURLOPT_TIMEOUT, 50);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
$html = curl_exec($curl);
echo $html;
curl_close($curl);
The output:
Please enable cookies. One more step Please complete the security
check to access www.uasd.edu.do Why do I have to complete a CAPTCHA?
Completing the CAPTCHA proves you are a human and gives you temporary
access to the web property. What can I do to prevent this in the
future?
If you are on a personal connection, like at home, you can run an
anti-virus scan on your device to make sure it is not infected with
malware.
If you are at an office or shared network, you can ask the network
administrator to run a scan across the network looking for
misconfigured or infected devices.
Cloudflare Ray ID: 4fcbf50d18679f88 • Your IP: ... •
Performance & security by Cloudflare
Note: The "please enable cookies" appear using and not using cookies.
I am using PHP Simple HTML DOM Parser, here you can check more about it: http://simplehtmldom.sourceforge.net/
And also i am using a CURL because this web adress http://www.sportsdirect.com is not loading on the normal examples from the SimpleHTMLDom.
So here is the code i use:
<?php
include_once('../simple_html_dom.php');
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, 'http://www.sportsdirect.com/');
curl_setopt($curl, CURLOPT_HEADER, 0);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 10);
$str = curl_exec($curl);
curl_close($curl);
$html= str_get_html($str);
echo $html->plaintext;
?>
When i try to load the script it gives me: 500 Internal Server Error
Internal Server Error
The server encountered an internal error or misconfiguration and was unable to complete your request.
Please contact the server administrator, webmaster#superweb.bg and inform them of the time the error occurred, and anything you might have done that may have caused the error.
More information about this error may be available in the server error log.
Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.
This script is just not working for this web adress, because when i try to load other website like mandmdirectDOTcom it is woking OKEY!
Where is my mistake and how i can make this thing works?
Try this for the curl fetch. It works for me in this case. This is a standard set of curl options & settings I use that work well:
include_once('simple_html_dom.php');
$url = "http://www.sportsdirect.com";
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl, CURLOPT_SSLVERSION, 3);
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
$str = curl_exec($curl);
curl_close($curl);
$html = str_get_html($str);
echo $html->plaintext;
I believe the issue with your original curl settings was the missing user agent. Try the same script with the CURLOPT_USERAGENT line commented out to see what I mean.
Many servers have firewall settings that disallow curl requests from users making requests without a proper user agent setting. The user agent I have set here is a fairly generic Firefox user agent, so feel free to experiment with that to use something else.
Try setting a Host header in the request. It's possible that the target domain is on a shared server, and without a Host header, the server doesn't know what to do.
curl_setopt($curl, CURLOPT_HTTPHEADER, array('Host: www.sportsdirect.com'));
I'm getting the following error when attempting perform a file_get_contents on a specific URL: http://lolking.net/champions/. I've had no problems performing file_get_contents on any other web page I've tried.
The error is:
Warning: file_get_contents(http://lolking.net/champions) [function.file-get-contents]: failed to open stream: HTTP request failed!
I think that the page is purposefully blocking the request I'm trying to make. I've also tried using cURL and faking a user agent, but neither of these have worked.
Is there anything else I can do to try to grab information from the aforementioned URL?
This works for me, remember to have cookies.txt.
$cookie_file = "cookies.txt";
$url = 'http://www.lolking.net/champions';
$c = curl_init($url);
curl_setopt($c, CURLOPT_FRESH_CONNECT, 1);
curl_setopt($c, CURLOPT_COOKIEJAR, $cookie_file);
curl_setopt($c, CURLOPT_COOKIEFILE, $cookie_file);
curl_setopt($c, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20100101 Firefox/17.0");
curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($c, CURLOPT_FOLLOWLOCATION, 1);
$z = curl_getinfo($c);
$s = curl_exec($c);
curl_close($c);
echo $s;
Try
file_get_contents('http://www.lolking.net/champions/');
file_get_contents can't handle HTTP header redirects which this page does to make it go to www.lolking.net. Try it with the last slash on it.
Seem to be in a bit of a predicament. As far as I am aware, there have been no changed to PHP or Apache, however a code that has worked for almost 6 months just stoped working today at 2pm.
The code is:
function ls_record($prospectid,$campid){
$api_post = "method=NewProspect&prospect_id=".$prospectid."&campaign_id=".$campid;
$ch = curl_init();
curl_setopt($ch, CURLOPT_FRESH_CONNECT, TRUE);
curl_setopt($ch, CURLOPT_HEADER, FALSE);
curl_setopt($ch, CURLOPT_POST, TRUE);
curl_setopt($ch, CURLOPT_POSTFIELDS, $api_post);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_URL, "http://XXXXX/XXXXXX/store.php");
$x = print_r(curl_exec($ch), TRUE);
return $x;
}
It returns NULL, I tried usingfile_get_contents()which also returnsNULL`. I checking the Apache error logs and see nothing...I need some help on this one.
Do you have access to the command line of the server? It could be that the destination has blocked you somehow.
If you have command line access, try this
wget http://XXXXX/XXXXXX/store.php
That should at least return something (if not headers)
use curl_getinfo to check your curl execution status, it maybe that the server you try to extract content from need your curl to set user-agent, some site check user-agent to block unwanted curl access.
below are the user agent I used to disguise my curl as desktop chrome browser.
curl_setopt($ch,CURLOPT_USERAGENT,' Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36');
I face same problem on my server , because of the low internet speed. Internet speed is go down for some time and curl take so many time to execute , so it return a timeout error . After a few minute it is working fine without any changes on server.