PHP cURL is not capturing all ASP.NET Cookies - php

I'm trying to automate login to an ASP.NET site using PHP and cURL, but I'm running into a cookie problem.
When I check in the browser, the initial login page stores 5 cookies: ASP.NET_SessionId, __utma, __utmb, __utmc and __utmz.
When this page is accessed via cURL, the cookie file stores only one cookie: "ASP.NET_SessionId".
I referred to many posts and tried all kinds of cURL option combinations, all returning the same result.
I don't know how ASP.NET cookies work or how they differ from PHP's. Any help is appreciated.
Here is my PHP code:
$cookie_file_path = "tmp/cookie.txt";
$LOGINURL = "https://godaddy.com";
$agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.89 Safari/537.1";
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
curl_setopt($ch, CURLOPT_URL, $LOGINURL);
$content = curl_exec($ch);
curl_close($ch);
echo '<textarea style="width:1000px; height:300px">'.$content.'</textarea>';

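__utma, __utmb, __utmc and __utmz are all Google Analytics cookies. They are set by JavaScript running in the browser, so they are created client-side rather than sent by the server.
cURL does not execute JavaScript, so there is no way to capture them through cURL/PHP; only cookies delivered in Set-Cookie response headers, such as ASP.NET_SessionId, end up in the cookie file.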
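To check which cookies the server itself sets, one option is to look at the Set-Cookie lines in the raw response, which curl_exec() includes when CURLOPT_HEADER is on. The following is a minimal sketch, not a full header parser; the helper name serverCookieNames() and the sample response are made up for illustration.

```php
<?php
// Hypothetical helper: list the names of cookies set via Set-Cookie headers.
// Cookies created by JavaScript (such as __utma) never appear in these headers.
function serverCookieNames(string $rawResponse): array
{
    $names = [];
    // Match each "Set-Cookie: name=value" line, case-insensitively.
    if (preg_match_all('/^Set-Cookie:\s*([^=;\s]+)=/mi', $rawResponse, $matches)) {
        $names = $matches[1];
    }
    return $names;
}

// Made-up sample of what curl_exec() returns with CURLOPT_HEADER enabled.
$sample = "HTTP/1.1 200 OK\r\n"
        . "Content-Type: text/html\r\n"
        . "Set-Cookie: ASP.NET_SessionId=abc123; path=/; HttpOnly\r\n"
        . "\r\n"
        . "<html><script>/* analytics script sets __utma here */</script></html>";

print_r(serverCookieNames($sample));
```

With the sample above, only ASP.NET_SessionId is reported, which matches what ends up in the cookie file.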

Related

Cookie not set, using curl in php

I'm currently learning to scrape an asynchronous website. First, I need to get the cookie. I'm using the code below to save the cookie to a text file, but it does not save the cookie when I run it; when I open the file, it's empty. I don't know where my problem is, because I'm still new to this. I hope you can answer it. Thanks for your time.
$cookie_file_path = dirname(__FILE__) . "/cookie.txt";
$ch = curl_init();
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_URL, $url); // $url holds the address of the page to fetch
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36');
curl_setopt($ch, CURLOPT_TIMEOUT, 40);
curl_exec($ch);
curl_close($ch);

PHP Curl gets 403 error, but browser from same machine can request page?

I've got this script working with generally no problems. I say generally, because while it retrieves pages from CNN.com, allrecipes.com, reddit.com, etc - when I point it towards at least one URL (foxnews.com), I get a 403 error instead.
As you can see, I've set the user agent to the same as my machine's browser (that was necessitated by sending a request to Facebook's homepage, which returned a message that the browser wasn't supported).
So, basically wondering what step(s) I need to take to have as many sites as possible recognize the CURL request as coming from a real, actual browser, rather than 403'ing it.
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $this->url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_USERAGENT,'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/602.4.8 (KHTML, like Gecko) Version/10.0.3 Safari/602.4.8');
curl_setopt($ch, CURLOPT_FRESH_CONNECT, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
Fox News appears to be blocking access to their website from any request passing a USERAGENT. Simply removing the USERAGENT string works fine for me:
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $this->url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
Hope this helps! :)

Cookie.txt disabling

I'm having a problem logging in via cURL.
I would like to be able to log in without cookie.txt.
If I remove cookie.txt I can't log in; when cookie.txt is there, the login succeeds. I would like to log in without using cookies. I tried unlinking cookie.txt, but as I said, I can't log in then.
Part of the code:
$ret=false;
$useragent = "Mozilla/5.0 (iPad; U; CPU OS 3_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B334b Safari/531.21.10";
$data = setData($email,$pass);
$ch = curl_init('https://www.website.com/login.php');
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_ENCODING , "gzip,deflate");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_COOKIESESSION, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 40);
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, TRUE);
curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
curl_setopt($ch, CURLOPT_COOKIEFILE, dirname(__FILE__) . '/cookie.txt');
curl_setopt($ch, CURLOPT_COOKIEJAR, dirname(__FILE__) . '/cookie.txt');
$source=curl_exec($ch);
$info=curl_getinfo($ch);
if($info["redirect_count"]==1)
{
$ret=true;
}
You can't log in without using cookies, neither via cURL nor via a browser (unless the site you are logging in to implements a different mechanism to carry the session ID, for example as part of the URLs, but this is rarely the case and it doesn't depend on you). The reason is that without the cookie the server can't know that the request comes from you and not from someone else.
Facebook doesn't implement a login system that doesn't use cookies, so you can't.
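If the goal is only to avoid leaving a cookie.txt on disk, the cookies can still be kept in memory for the lifetime of the cURL handle. Passing an empty string as CURLOPT_COOKIEFILE turns on cookie handling without loading any file, and leaving out CURLOPT_COOKIEJAR means nothing is written out afterwards. A minimal sketch, assuming the PHP cURL extension is loaded; the helper name and the URL are hypothetical.

```php
<?php
// Hypothetical helper: options for a handle that keeps cookies in memory only.
function inMemoryCookieOptions(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        // Empty string: enable cookie handling without loading a file.
        CURLOPT_COOKIEFILE     => '',
        // No CURLOPT_COOKIEJAR: cookies are discarded with the handle.
    ];
}

$ch = curl_init();
curl_setopt_array($ch, inMemoryCookieOptions('https://www.example.com/login.php'));
// Reuse this same $ch for the login POST and for every later request, so the
// session cookie received at login is sent back automatically, e.g.:
// $source = curl_exec($ch);
unset($ch); // releases the handle together with its in-memory cookies
```

The session still depends on cookies; they simply never touch the filesystem, and they are lost as soon as the handle goes away.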

Accept cookies using cURL?

I've been trying to get the contents of a webpage using cURL, but have trouble getting cURL to accept cookies.
For example, on Target.com, when I cURL it, it still says that I have to enable cookies.
Here is my code:
$url = "http://www.target.com/p/Acer-Gateway-15-6-Laptop-PC-NV57H77u-with-320GB-Hard-Drive-4GB-Memory-Black/-/A-13996190#?lnk=sc_qi_detailbutton";
$ch = curl_init(); // initialize curl handle
curl_setopt($ch, CURLOPT_URL,$url); // set url to post to
curl_setopt($ch, CURLOPT_FAILONERROR, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);// allow redirects
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1); // return into a variable
curl_setopt($ch, CURLOPT_TIMEOUT, 10); // times out after 10s
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:11.0) Gecko/20100101 Firefox/11.0");
$cookie_file = "cookie1.txt";
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);
$result = curl_exec($ch); // run the whole process
curl_close($ch);
echo $result;
What else am I missing?
The cookie1.txt file is 077 permission, by the way.
077 is a problematic permission setting: it means the owner (probably the Apache user) has no access, while group and others have full access. Try setting it to 644 (owner has read/write, everyone else read-only), as it's only a file.
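To see why 077 locks the owner out, it can help to spell the octal mode out as the familiar rwx triplets (owner, group, others). The helper below is a hypothetical illustration, not part of any library.

```php
<?php
// Hypothetical helper: render the low nine permission bits as "rwxrwxrwx".
function describeMode(int $mode): string
{
    $flags = 'rwxrwxrwx';
    $out = '';
    for ($i = 0; $i < 9; $i++) {
        // Bit 8 is owner-read, bit 0 is others-execute.
        $out .= ($mode & (1 << (8 - $i))) ? $flags[$i] : '-';
    }
    return $out;
}

echo describeMode(0077), "\n"; // ---rwxrwx (owner has no access)
echo describeMode(0644), "\n"; // rw-r--r-- (owner can read and write)
```

On an existing file, chmod('cookie1.txt', 0644) applies the suggested mode; the leading 0 matters because PHP then reads the number as octal.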

Using cURL to download a site's HTML source, but getting different file than intended

I'm trying to use cURL and PHP to download the HTML source (as it appears in the browser) of here. But instead of the actual source code, this is returned instead (a meta refresh link set to 0).
<html>
<head><title>Object moved</title></head>
<body>
<h2>Object moved to here.
</h2>
</body>
</html>
I'm trying to spoof the referral header to be the site, but it seems I'm doing it wrong. Code is below. Any suggestions? Thanks
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.windowsphone.com/en-US/apps/ea39f002-ac30-e011-854c-00237de2db9e');
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.6 (KHTML, like Gecko) Chrome/16.0.897.0 Safari/535.6');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
curl_setopt($ch, CURLOPT_AUTOREFERER, false);
curl_setopt($ch, CURLOPT_REFERER, "http://www.windowsphone.com/en-US/apps/ea39f002-ac30-e011-854c-00237de2db9e");
$html = curl_exec($ch);
curl_close($ch);
Add the curl option to follow redirects:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
If it is a meta refresh and not an HTTP moved header, see:
PHP: Can CURL follow meta redirects
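cURL only follows HTTP redirects, so a meta refresh has to be detected in the returned HTML and requested manually. A rough sketch of the detection step is below; it uses a regular expression rather than an HTML parser, assumes the http-equiv attribute comes before content and that the URL inside content is unquoted, and the helper name metaRefreshUrl() is hypothetical.

```php
<?php
// Hypothetical helper: return the target URL of a <meta http-equiv="refresh"> tag,
// or null when the page contains none. A regex sketch, not a full HTML parser.
function metaRefreshUrl(string $html): ?string
{
    $pattern = '/<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*content=["\']?\s*\d+\s*;\s*url=([^"\'>\s]+)/i';
    return preg_match($pattern, $html, $m) ? html_entity_decode($m[1]) : null;
}

// Made-up page that redirects via meta refresh.
$page = '<html><head><meta http-equiv="refresh" content="0; url=http://www.example.com/next"></head></html>';

var_dump(metaRefreshUrl($page));                           // the URL to request next
var_dump(metaRefreshUrl('<html><body>ok</body></html>'));  // no refresh tag present
```

When the helper returns a URL, that URL can be set with CURLOPT_URL on the same handle and fetched with another curl_exec() call.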
As mentioned by flesk, you may also need to store the cookies.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.windowsphone.com/en-US/apps/ea39f002-ac30-e011-854c-00237de2db9e');
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.6 (KHTML, like Gecko) Chrome/16.0.897.0 Safari/535.6');
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_REFERER, "http://www.windowsphone.com");
$html = curl_exec($ch);
curl_close($ch);
echo $html;
The problem isn't the referrer but that you need to enable cookies for it to work.
Try something like this:
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
You have to query the page twice. First allow redirects to get the cookie from login.live.com, then query again with the cookie set.
