Is this site blocking/ignoring my HTTP requests? - php

I can only fetch a particular site when I use cURL with a proxy. cURL without a proxy and file_get_contents() both return nothing (cURL reports HTTP code 0 and curl_error() says "Empty reply from server"). I'm able to fetch other sites just fine without a proxy.
Aside from being blocked, is there any other possible explanation of why I can only access this site via proxy?

Did you set a USER AGENT in cURL? Sometimes websites will block you if your USER AGENT isn't set or if your HTTP request looks suspicious.
To set your USER AGENT in PHP:
curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");
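To tell a block apart from other kinds of failure, it can help to look at the cURL error number rather than only the HTTP code. The sketch below maps a few well-known libcurl error numbers to a plain explanation. The function name explain_curl_errno is made up for illustration, and the list is far from complete.

```php
<?php
// Map a few common libcurl error numbers to a short explanation.
// The numbers are libcurl's own; the wording of the messages is ours.
function explain_curl_errno(int $errno): string
{
    $known = [
        0  => 'No transport error; check the HTTP status code instead.',
        6  => 'Could not resolve host (CURLE_COULDNT_RESOLVE_HOST).',
        7  => 'Could not connect to the server or proxy (CURLE_COULDNT_CONNECT).',
        28 => 'The operation timed out (CURLE_OPERATION_TIMEDOUT).',
        52 => 'Empty reply from server: the connection was accepted, then closed without any data (CURLE_GOT_NOTHING).',
    ];
    return $known[$errno] ?? "Unrecognised cURL error number $errno.";
}

// Typical use after a failed request ($curl is an existing handle):
// $body = curl_exec($curl);
// if ($body === false) {
//     echo explain_curl_errno(curl_errno($curl)), "\n";
// }
echo explain_curl_errno(52), "\n";
```

Error 52 means the TCP connection succeeded and the server then closed it without sending anything, which is consistent with (though not proof of) the server dropping requests from your IP.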

Is this from your workplace or something? Many hosts disable URL access for file_get_contents() (via allow_url_fopen) on shared PHP installs, as it's quite risky.
The site probably has user agent detection. You can fake that in your curl call; file_get_contents() only sends a User-Agent if you set one through the user_agent ini setting or a stream context. Another method sites use is to only display content once a cookie has been set, so site scrapers will never see the data.
Try this:
function curl_scrape($url, $data, $proxy, $proxystatus)
{
    // Create (or empty) the cookie file
    $fp = fopen("cookie.txt", "w");
    fclose($fp);

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
    curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
    // Use the visitor's user agent when there is one, otherwise a fixed string
    $agent = isset($_SERVER['HTTP_USER_AGENT'])
        ? $_SERVER['HTTP_USER_AGENT']
        : "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)";
    curl_setopt($ch, CURLOPT_USERAGENT, $agent);
    curl_setopt($ch, CURLOPT_TIMEOUT, 40);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE); // return the response instead of printing it
    if ($proxystatus == 'on')
    {
        curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
        curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, TRUE);
        curl_setopt($ch, CURLOPT_PROXY, $proxy);
    }
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, TRUE); // include the response headers in the result
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
    curl_setopt($ch, CURLOPT_POST, TRUE);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
    $result = curl_exec($ch); // execute the request; FALSE on failure
    curl_close($ch);
    return $result;
}
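For file_get_contents(), a stream context can carry a user agent and extra headers in much the same way. This is a minimal sketch: the helper name build_http_context, the user agent string and the header values are all illustrative assumptions, not what any particular site requires.

```php
<?php
// Build a stream context that sends browser-like headers with file_get_contents().
function build_http_context(string $userAgent, array $extraHeaders = [], int $timeout = 20)
{
    $options = [
        'http' => [
            'method'     => 'GET',
            'user_agent' => $userAgent,
            'header'     => implode("\r\n", $extraHeaders),
            'timeout'    => $timeout, // read timeout in seconds
        ],
    ];
    return stream_context_create($options);
}

$context = build_http_context(
    'Mozilla/5.0 (compatible; ExampleFetcher/1.0)',
    ['Accept: text/html', 'Accept-Language: en-US,en;q=0.8']
);

// Usage (needs allow_url_fopen=On; $url is a placeholder):
// $html = file_get_contents($url, false, $context);
```

Note that this does not add cookie handling; a site that requires cookies is still easier to deal with through cURL's cookie jar.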

I'm guessing I was truly blocked. Using proxy now and it works fine.

Related

Can't get url using JSON script parsed with file_get_contents

I have this link and I want to parse some information from it, or just save it to a file,
but I can't do it with this simple code:
Example:
<?php
$myFile = 'test.txt';
$get= file_get_contents("http://www.ticketmaster.com/json/resale?command=get_resale_listings&event_id=0C004B290BF2D95F");
file_put_contents($myFile, $get); ?>
The output is:
{"version":1.1,"error":{"invalid":{"cookies":true}},"command":"get_resale_listings"}
I tried many other things, like fopen or include, but they did not work either. I don't understand, because when I put the URL in the browser it shows exactly ALL the code (Google Chrome) or, even better, asks me to save it as a file (Explorer). It looks like a browser cookie or something that doesn't load on my localhost?
thanks for your tips.
You need to access that URL with cURL.
The server checks whether the client has cookies enabled. With file_get_contents() you do not send any information about the client (browser), and cookies are neither stored nor sent back.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.ticketmaster.com/json/resale?command=get_resale_listings&event_id=0C004B290BF2D95F');
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_COOKIEJAR, "my_cookies.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "my_cookies.txt");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3");
$get = curl_exec($ch); // CURLOPT_RETURNTRANSFER makes curl_exec() return the body
curl_close($ch);
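To check whether a response is still the cookie error rather than real data, the JSON can be decoded and inspected. The sketch below assumes the error always has the shape shown in the question's output; the function name response_needs_cookies is hypothetical.

```php
<?php
// Return true when the decoded response carries the "invalid cookies" error
// in the shape shown in the question's output.
function response_needs_cookies(string $json): bool
{
    $data = json_decode($json, true);
    if (!is_array($data)) {
        return false; // not JSON, or not a JSON object/array
    }
    return !empty($data['error']['invalid']['cookies']);
}

// The exact error body quoted in the question
$sample = '{"version":1.1,"error":{"invalid":{"cookies":true}},"command":"get_resale_listings"}';
var_dump(response_needs_cookies($sample));
```

If the check still returns true after switching to cURL, a first request to a normal page of the site (to collect cookies in the jar) before requesting the JSON URL may be worth trying.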

multiple actions with curl

I'm trying to do two actions with curl:
1. Login into admin page
2. Submit a form (add user)
The first one goes fine, but the second shows an error saying I am not logged in.
Here is my code:
$ch1 = curl_init();
$ch2 = curl_init();
curl_setopt($ch1, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1" );
curl_setopt($ch1, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch1, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch1, CURLOPT_URL, "http://admin.example.com/admin");
curl_setopt($ch1, CURLOPT_POST, 1);
curl_setopt($ch1, CURLOPT_POSTFIELDS, "user=admin&pass=password");
curl_setopt($ch1, CURLOPT_FOLLOWLOCATION, 1); // allow redirects
curl_setopt($ch1, CURLOPT_RETURNTRANSFER,1); // return into a variable
curl_setopt($ch2, CURLOPT_URL, "http://admin.example.com/admin/adduser");
curl_setopt($ch2, CURLOPT_POST, 1);
curl_setopt($ch2, CURLOPT_POSTFIELDS, "newu=demo&pass=password");
curl_setopt($ch2, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch2, CURLOPT_FOLLOWLOCATION, 1);
$mh = curl_multi_init();
curl_multi_add_handle($mh, $ch1);
curl_multi_add_handle($mh, $ch2);
// execute all queries simultaneously, and continue when all are complete
$running = null;
do {
curl_multi_exec($mh, $running);
} while ($running);
//close the handles
curl_multi_remove_handle($mh, $ch1);
curl_multi_remove_handle($mh, $ch2);
curl_multi_close($mh);
You can perform both of these requests using the same cURL handle. The problem in using curl_multi_exec in this case is that each curl handle has different options and $ch2 does not reference any cookies.
Also, curl_multi_exec performs the requests in parallel which means you may try to add the user before the login request is completed or even started.
Try this instead; it illustrates logging in using $ch, and then using the same handle again to add the user. If the server supports keep-alive, cURL can also re-use the same socket connection for the second request.
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1" );
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_URL, "http://admin.example.com/admin");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, "user=admin&pass=password");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // allow redirects
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1); // return into a variable
$res = curl_exec($ch);
// check $res here to see if login was successful
curl_setopt($ch, CURLOPT_URL, "http://admin.example.com/admin/adduser");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, "newu=demo&pass=password");
$res = curl_exec($ch);
// check $res to see that the user was successfully created
curl_close($ch);
Here are some other answers showing how to make multiple serial requests to the same site using cURL after logging in.
Login to Google with PHP and Curl, Cookie turned off?
Retrieve Android Market mylibrary with curl
PHP Curl - Cookies problem
The problem is that by using curl_multi, you are executing both requests at the same time. The form submitting request will be sent at the same time as the login request, so the login cookies will not be available yet.
Additionally, you're not passing the cookies to the form request at all. You forgot to do this:
curl_setopt($ch2, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch2, CURLOPT_COOKIEFILE, "cookie.txt");
You should do both requests separately to ensure that the proper login cookies are available for the form request.
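A sketch of running the two requests strictly in order on one handle, stopping if a step fails. The URLs and field names come from the question; the helper names session_options, post_step and run_steps are made up for illustration, and a real script would also inspect each response body to confirm that the login actually succeeded.

```php
<?php
// Options shared by every request in the session.
function session_options(string $cookieFile): array
{
    return [
        CURLOPT_COOKIEJAR      => $cookieFile,
        CURLOPT_COOKIEFILE     => $cookieFile,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_RETURNTRANSFER => true,
    ];
}

// Options for one POST step; $fields is encoded with http_build_query().
function post_step(string $url, array $fields): array
{
    return [
        CURLOPT_URL        => $url,
        CURLOPT_POST       => true,
        CURLOPT_POSTFIELDS => http_build_query($fields),
    ];
}

// Run the steps one after another on a single handle; stop at the first failure.
function run_steps(array $steps, string $cookieFile): array
{
    $ch = curl_init();
    curl_setopt_array($ch, session_options($cookieFile));
    $responses = [];
    foreach ($steps as $options) {
        curl_setopt_array($ch, $options);
        $body = curl_exec($ch);
        if ($body === false) {
            break; // transport error, so do not attempt the next step
        }
        $responses[] = $body;
    }
    curl_close($ch);
    return $responses;
}

$steps = [
    post_step('http://admin.example.com/admin', ['user' => 'admin', 'pass' => 'password']),
    post_step('http://admin.example.com/admin/adduser', ['newu' => 'demo', 'pass' => 'password']),
];
// $responses = run_steps($steps, 'cookie.txt');
```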

curl through proxy returns no content

I'm working on a PHP script right now which sends requests to our school's servers to get real-time information about class sizes for different courses. The script runs perfectly fine when I don't use a proxy, returning a string full of course numbers and available seats. However, I want to make this a service for the students, and I'm afraid that if I make too many requests my ip will get blocked. So I'm attempting to do this through a proxy, with no success. As soon as I add the CURLOPT_HTTPPROXYTUNNEL and CURLOPT_PROXY fields to my requests nothing gets returned. I'm not even sure how to troubleshoot it at this point since I'm not getting an error message of any kind. Does anyone know what's going on or how to at least troubleshoot it?
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$proxy = explode(':', $proxy);
curl_setopt($ch, CURLOPT_PROXY, $proxy[0]);
curl_setopt($ch, CURLOPT_PROXYPORT, $proxy[1]);
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 1);
curl_setopt($ch, CURLOPT_COOKIEFILE, 'tempcookie.txt');
curl_setopt($ch, CURLOPT_COOKIEJAR, 'tempcookie.txt');
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_REFERER, $ref);
$exec = curl_exec($ch);
echo curl_error($ch);
print_r(curl_getinfo($ch));
echo $exec;
Proxy used for tests: 75.147.173.215:8080
I use the following code if I have to use a proxy with curl:
$proxy = "127.0.0.1:8080"; // or something like that
if($proxy !== null){
// no need to specify PROXYPORT again
curl_setopt($ch, CURLOPT_PROXY, $proxy);
// to make the request go through as though proxy didn't exist
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 0);
}
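If the proxy arrives as a "host:port" string, it may be worth validating it before handing it to cURL. This is a minimal sketch; split_proxy and proxy_options are hypothetical helper names, and the pattern only covers plain host:port strings (no scheme, no credentials, no IPv6 literals).

```php
<?php
// Split "host:port" into its parts, or return null when the format is wrong.
function split_proxy(string $proxy): ?array
{
    if (!preg_match('/^([^:\s]+):(\d{1,5})$/', $proxy, $m)) {
        return null;
    }
    $port = (int) $m[2];
    if ($port < 1 || $port > 65535) {
        return null;
    }
    return ['host' => $m[1], 'port' => $port];
}

// Build the cURL options for a plain (non-tunnelling) HTTP proxy.
function proxy_options(string $proxy): array
{
    $parts = split_proxy($proxy);
    if ($parts === null) {
        throw new InvalidArgumentException("Bad proxy string: $proxy");
    }
    return [
        CURLOPT_PROXY           => $parts['host'] . ':' . $parts['port'],
        CURLOPT_HTTPPROXYTUNNEL => false,
    ];
}

var_dump(split_proxy('127.0.0.1:8080'));
// curl_setopt_array($ch, proxy_options('127.0.0.1:8080')); // $ch: an existing handle
```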
You can set the CURLOPT_STDERR and CURLOPT_VERBOSE cURL options to save the verbose output and errors to a file. You can also use the curl_error() function. By the way, with CURLOPT_VERBOSE enabled, cURL writes this output to STDERR by default.
For a general check, you can also configure the proxy in your browser's settings, open the service in the browser, and see whether the correct response is returned.
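A sketch of capturing the verbose trace in an in-memory stream rather than a file, so it can be printed next to the response. The helper name verbose_options is made up; CURLOPT_VERBOSE and CURLOPT_STDERR are the options mentioned above.

```php
<?php
// Options that send cURL's verbose trace to the given stream instead of STDERR.
function verbose_options($logStream): array
{
    return [
        CURLOPT_VERBOSE => true,
        CURLOPT_STDERR  => $logStream,
    ];
}

$log = fopen('php://temp', 'w+');
$options = verbose_options($log);

// Usage with a real handle ($ch is assumed to be configured already):
// curl_setopt_array($ch, $options);
// $body = curl_exec($ch);
// rewind($log);
// echo stream_get_contents($log); // connection, request and response header trace
```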
UPDATE:
CURLOPT_HTTPPROXYTUNNEL makes cURL issue an HTTP CONNECT request to the proxy server and tunnel the traffic through it. I tested the code without this option, and it worked.
Code I used:
$proxy = "75.147.173.215:8080";
$proxy = explode(':', $proxy);
$url = "http://google.com";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_PROXY, $proxy[0]);
curl_setopt($ch, CURLOPT_PROXYPORT, $proxy[1]);
curl_setopt($ch, CURLOPT_HEADER, 1);
$exec = curl_exec($ch);
echo curl_error($ch);
print_r(curl_getinfo($ch));
echo $exec;
Here is a well-tested function which I have used in my projects, with detailed comments.
Ports other than 80 are often blocked by the server's firewall, so the code can appear to work fine on localhost but not on the server. Try using proxies that listen on port 80.
function get_page($url){
    global $proxy;
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    //curl_setopt($ch, CURLOPT_PROXY, $proxy); // uncomment to send the request through $proxy
    curl_setopt($ch, CURLOPT_HEADER, 0); // include response headers in the result: 0 no, 1 yes
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the page instead of printing it
    curl_setopt($ch, CURLOPT_TIMEOUT, 200); // maximum time for the whole request, in seconds
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects; needed if the URL changes
    curl_setopt($ch, CURLOPT_MAXREDIRS, 2); // follow at most 2 redirects
    curl_setopt($ch, CURLOPT_USERAGENT,
        "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7");
    curl_setopt($ch, CURLOPT_COOKIEJAR, "cookies.txt"); // file where received cookies are saved
    curl_setopt($ch, CURLOPT_COOKIEFILE, "cookies.txt"); // file cookies are read from
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // skips certificate verification for https (insecure)
    curl_setopt($ch, CURLOPT_ENCODING, "gzip"); // accept (and decode) gzip-compressed responses
    $data = curl_exec($ch); // execute the http request
    curl_close($ch); // close the connection
    return $data;
}

How to store multiple cookies through PHP Curl

'SOUP.IO' does not provide an API, so I am trying to use PHP cURL to log in and submit data.
I am able to log in to the website successfully (through cURL), but when I try to submit data through cURL, it gives me an 'invalid user' error.
When I analysed the code and the website, I found that cURL is only getting the values of 1-2 cookies, whereas when I open the same page in Firefox, it shows me 6-7 cookies related to 'SOUP.IO'.
Can someone guide me on how to get all of these cookie values?
The following cookies are retrievable through cURL:
soup_session_id
The following cookies are shown in Firefox (but not through cURL):
__qca, __utma, __utmb, __utmc, __utmz
Here is my cURL code:
<?php
session_start();
$cookie_file_path = getcwd()."/cookie/cookie.txt";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.soup.io');
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch, CURLOPT_HEADER, TRUE);
curl_setopt($ch, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 (.NET CLR 3.5.30729) FirePHP/0.4');
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
$result = curl_exec($ch);
curl_close($ch);
print_r($result);
?>
Can someone guide me in this regard?
Thanks in advance.
These extra "underscore" cookies seem like Google Analytics or similar tracking cookies, most likely set via Javascript. That's the reason they don't show up when using cURL. I'd venture the guess that they're not the problem.
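To confirm which cookies the server itself sets (as opposed to ones created by JavaScript in the browser), the Set-Cookie headers in the raw response can be listed. This is a rough sketch that assumes CURLOPT_HEADER is enabled, as in the question, so the returned string starts with the header block(s). The function name and the sample response below are made up for illustration.

```php
<?php
// List the names of cookies set through Set-Cookie headers in a raw response.
// Scans the whole text, so a body line starting with "Set-Cookie:" would also match.
function cookie_names_from_response(string $raw): array
{
    preg_match_all('/^Set-Cookie:\s*([^=;\s]+)=/mi', $raw, $matches);
    return array_values(array_unique($matches[1]));
}

// Made-up response for illustration only
$sample = "HTTP/1.1 200 OK\r\n"
        . "Set-Cookie: soup_session_id=abc123; path=/\r\n"
        . "Content-Type: text/html\r\n"
        . "\r\n"
        . "<html></html>";

print_r(cookie_names_from_response($sample));
```

Cookies such as __utma never appear in this list because no HTTP response sets them; a script running in the browser does.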
A couple of things I noticed once I signed up and went into the user area: the domain where all my actions happen is "user.soup.io" and not "www.soup.io", which could be the reason behind your invalid user error. Try setting the URL to your own subdomain AFTER the login is complete and see how it goes. Also, what data exactly are you trying to post?
This may not be relevant, but soup.io doesn't seem to use HTTPS, so why use:
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
Here is the cURL code which I am trying to use to send data to soup.io after a successful login through cURL:
$storedata = array();
$storedata["post[title]"] = 'Phoonk 2 (16th April 2010)';
$storedata["post[body]"] = 'Ramgopal Varma\'s love for horror and supernatural continues. This time, in PHOONK 2, the team behind PHOONK promise more chills, more thrills and more screams. But what you get to hear at the end of the screening is a moan, since PHOONK 2 lacks the chills, thrills and screams that were the mainstay of its first part.';
$storedata["post[tags]"] = 'Bollywood Movie, Indian movie';
$storedata["commit"] = 'Save';
$storedata["post[id]"] = '';
$storedata["post[type]"] = 'PostRegular';
$storedata["post[parent_id]"] = '';
$storedata["post[original_id]"] = '';
$storedata["post[edited_after_repost]"] = '';
$store_post_str = '';
foreach($storedata as $key => $value){
$store_post_str .= $key.'='.urlencode($value).'&';
}
$store_post_str = substr($store_post_str, 0, -1);
$ch2 = curl_init();
curl_setopt($ch2, CURLOPT_URL, 'http://loekit.soup.io/save');
curl_setopt($ch2, CURLOPT_VERBOSE, 1);
curl_setopt($ch2, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch2, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch2, CURLOPT_HEADER, TRUE);
curl_setopt($ch2, CURLOPT_ENCODING, 'gzip,deflate');
//curl_setopt($ch2, CURLOPT_COOKIEJAR, $cookie_file_path);
//curl_setopt($ch2, CURLOPT_COOKIEFILE, $cookie_file_path);
curl_setopt($ch2, CURLOPT_REFERER, 'http://loekit.soup.io/');
curl_setopt($ch2, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 (.NET CLR 3.5.30729) FirePHP/0.4');
curl_setopt($ch2, CURLOPT_MAXREDIRS, 10);
curl_setopt($ch2, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch2, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch2, CURLOPT_POSTFIELDS, $store_post_str);
curl_setopt($ch2, CURLOPT_POST, TRUE);
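As an aside, the manual encoding loop in the code above could likely be replaced by http_build_query(), which encodes nested arrays into the same post[title]=... style of field names. A minimal sketch using a few of the fields from the question:

```php
<?php
// Nested arrays yield bracketed field names such as post[title].
$storedata = [
    'post' => [
        'title' => 'Phoonk 2 (16th April 2010)',
        'tags'  => 'Bollywood Movie, Indian movie',
        'type'  => 'PostRegular',
    ],
    'commit' => 'Save',
];

// Keys and values are both URL-encoded, and pairs are joined with "&".
$store_post_str = http_build_query($storedata);
echo $store_post_str, "\n";

// curl_setopt($ch2, CURLOPT_POSTFIELDS, $store_post_str); // $ch2 as in the code above
```

Unlike the manual loop, this also percent-encodes the brackets in the field names, which a standard form parser should decode back to the same names.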

Payment Gateway using cURL with SSL?

I am processing credit cards using a payment gateway. To POST the data to their servers, I am using cURL in PHP. I have an SSL certificate issued to my domain, to ensure all POST'ed data is encrypted. Because the SSL certificate is already installed, do I still need to use the SSL options for cURL? If so, which of the options do I need to set given my setup?
I have tried the following code unsuccessfully:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,"https://secure.paymentgateway.com/blah.php");
curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_CAINFO, getcwd().'/cert/ca.crt');
curl_setopt($ch, CURLOPT_SSLCERT, getcwd().'/cert/mycert.pem');
curl_setopt($ch, CURLOPT_SSLCERTPASSWD, 'password');
curl_setopt($ch, CURLOPT_POST, $count);
curl_setopt($ch,CURLOPT_POSTFIELDS,"variables...");
$output = curl_exec($ch);
echo $output;
curl_close($ch);
Well, you have already disabled verification with curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0); (and CURLOPT_SSL_VERIFYHOST, 0), which I don't recommend. This opens you up to man-in-the-middle attacks.
Here's a simple tutorial that might help you:
http://developer.paypal-portal.com/pdn/board/message?board.id=ipn&message.id=12754#M12754
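Note that the certificate installed on your own domain protects visitors' connections to your site; the outbound request to the gateway is protected by the gateway's certificate, which cURL should be allowed to verify. A minimal sketch of options that keep verification on, where verified_ssl_options is a hypothetical helper and the CA bundle path is a placeholder:

```php
<?php
// Options that keep certificate checks on for an outbound HTTPS request.
// $caBundle is a placeholder path to a CA bundle file; omit it if the
// system's default CA store is already configured for cURL.
function verified_ssl_options(?string $caBundle = null): array
{
    $options = [
        CURLOPT_SSL_VERIFYPEER => true, // verify the gateway's certificate chain
        CURLOPT_SSL_VERIFYHOST => 2,    // verify the certificate matches the host name
    ];
    if ($caBundle !== null) {
        $options[CURLOPT_CAINFO] = $caBundle;
    }
    return $options;
}

$options = verified_ssl_options('/path/to/cacert.pem');
// curl_setopt_array($ch, $options); // $ch: an existing cURL handle
```

CURLOPT_SSLCERT and CURLOPT_SSLCERTPASSWD are only needed if the gateway asks for a client certificate, which the question does not say.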
