There is an add-on for Firefox called HttpRequester (https://addons.mozilla.org/en-US/firefox/addon/httprequester/).
When I use the add-on to send a GET request with a specific cookie, everything works fine.
Request header:
GET https://store.steampowered.com/account/
Cookie: steamLogin=*removed because of obvious reasons*
Response header:
200 OK
Server: Apache
... (continued, not important)
And then I am trying to do the same thing with cURL:
$ch = curl_init("https://store.steampowered.com/account/");
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Cookie: steamLogin=*removed because of obvious reasons*"));
curl_setopt($ch, CURLINFO_HEADER_OUT, 1);
$response = curl_exec($ch);
$request_header = curl_getinfo($ch, CURLINFO_HEADER_OUT);
echo "<pre>$request_header</pre>";
echo "<pre>$response</pre>";
Request header:
GET /account/ HTTP/1.1
Host: store.steampowered.com
Accept: */*
Cookie: steamLogin=*removed because of obvious reasons*
Response header:
HTTP/1.1 302 Moved Temporarily
Server: Apache
... (continued, not important)
I don't know if it has anything to do with my problem, but one thing I noticed is that the first lines of the request headers are different:
GET https://store.steampowered.com/account/
and
GET /account/ HTTP/1.1
Host: store.steampowered.com
My problem is that I get a 200 status code with the addon and a 302 with cURL, even though I'm sending (or trying to send) the same request.
The page is doing a redirect, so you must follow it:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
If I understand your problem correctly, the issue is that cURL is not following the redirect. It doesn't do that by default; you need to set an option:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
With this, cURL is able to follow the redirects.
To set the cookies on the request, use CURLOPT_COOKIE (and note that the user agent belongs in CURLOPT_USERAGENT, not in the cookie string):
curl_setopt($ch, CURLOPT_COOKIE, "steamLogin=*removed because of obvious reasons*");
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0");
I think your addon sends the browser's user-agent string by default. If you add a user-agent string to your cURL request, I believe your problem will be resolved:
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
"Cookie: steamLogin=*removed because of obvious reasons*",
"User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0"
));
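Putting those pieces together, a minimal sketch of the original handle with redirect following, the cookie, and a browser user agent (the cookie value is still a placeholder) might look like this:
$ch = curl_init("https://store.steampowered.com/account/");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);   // follow the 302 to the final page
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    "Cookie: steamLogin=*removed because of obvious reasons*"
));
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0");
$response = curl_exec($ch);
echo curl_getinfo($ch, CURLINFO_HTTP_CODE);       // should report 200 if the cookie is still valid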
Related
I am trying to crawl Twitter search using cURL. Last month it worked, but now I get a 302 HTTP response, while the browser and Postman return 200 OK.
This is my cURL code:
$param = "?f=tweets&q=+LAPOR1708&src=typd&max_position=".$scrollCursor;
$url = "https://twitter.com/i/search/timeline".$param;
$ch = curl_init();
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_setopt($ch, CURLOPT_URL,$url);
$result=curl_exec($ch);
curl_setopt($ch, CURLOPT_HTTPHEADER, ["Accept: text/html"]);
dd(curl_getinfo($ch));
curl_close($ch);
My curl_getinfo() output and the response in Postman are attached as screenshots.
A 302 response is a redirect.
Postman automatically follows redirects.
cURL does not.
This is normal. You should follow the redirect.
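In PHP cURL that means enabling CURLOPT_FOLLOWLOCATION on the handle before calling curl_exec(); a minimal sketch on top of the code above:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);   // follow the 302 to its target
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // return the body instead of printing it
$result = curl_exec($ch);
echo curl_getinfo($ch, CURLINFO_HTTP_CODE);       // final status code after following redirects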
Twitter’s Terms of Service prohibit crawling in this manner. You should use the official developer API to retrieve search results.
I'm trying to connect to an HTTPS url through a proxy with cURL. I disabled verification since security of the data is no concern, so the error has nothing to do with SSL verification.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://www.google.com");
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.73 Safari/537.36");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_PROXY, 'proxy.example.com:8080');//working fine when proxy is set to null.
curl_exec($ch);
print_r(curl_error($ch));
This handle works fine without a proxy, but returns 'received HTTP code (one of 0/403/400) from proxy after CONNECT' when a proxy is set.
Is this how cURL works, or is it caused by the proxy settings? I have tried a few proxies, and none of them has worked.
Thank you.
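That error means the proxy itself rejected the CONNECT request cURL uses to tunnel HTTPS. If the proxy expects authentication or a particular proxy type, those have to be configured on the handle as well; a sketch, assuming proxy.example.com is an authenticating HTTP proxy (the credentials here are hypothetical):
curl_setopt($ch, CURLOPT_PROXY, 'proxy.example.com:8080');
curl_setopt($ch, CURLOPT_PROXYTYPE, CURLPROXY_HTTP);       // or CURLPROXY_SOCKS5 if it is a SOCKS proxy
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, true);           // tunnel the HTTPS request through CONNECT
curl_setopt($ch, CURLOPT_PROXYUSERPWD, 'user:password');   // only needed if the proxy requires authentication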
I am using cURL on my server to fetch www.yelp.com, but when I run it from https://localhost the output does not include any CSS stylesheets. I have tried:
CURLOPT_SSL_VERIFYPEER, FALSE
But the issue is not fetching the page; it is that my browser (Chrome) does not seem to apply any CSS formatting. Any ideas?
For example, running the code below from http://localhost gives a well-formatted page, while running it from https://localhost gives a page without CSS.
<?php
$url="http://www.yelp.com/";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:17.0) Gecko/20100101 Firefox/17.0");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$cl = curl_exec($ch);
echo $cl;
exit;
I am using CURL and file_get_contents to find out the basic difference between a server request for a page and a browser request (organic).
I am requesting a phpinfo() page both ways and found that the output differs between the two cases.
For example, when I use a browser, the phpinfo() output shows this:
_SERVER["HTTP_CACHE_CONTROL"] no-cache
This info is missing when I am requesting the same page through PHP.
My CURL:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.example.com/phpinfo.php");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:32.0) Gecko/20100101 Firefox/32.0");
curl_setopt($ch, CURLOPT_INTERFACE, $testIP);
$output = curl_exec($ch);
curl_close($ch);
My file_get_contents (the request options have to sit under the 'http' wrapper key and the array is passed through stream_context_create()):
$opts = array(
    'socket' => array('bindto' => 'xxx.xx.xx.xx:0'),
    'http'   => array(
        'method'     => 'GET',
        // the key must be exactly 'user_agent' (no trailing space), otherwise it is silently ignored
        'user_agent' => "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:32.0) Gecko/20100101 Firefox/32.0",
        'header'     => "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
    )
);
$context = stream_context_create($opts);
$output  = file_get_contents("http://www.example.com/phpinfo.php", false, $context);
My goal:
To make a PHP request look identical to a browser request.
One possible way for the server to detect that you are PHP code and not a browser is to check your cookies. With PHP cURL, make one request to the server first, then inject the cookie you receive into your next request.
Check here:
http://docstore.mik.ua/orelly/webprog/pcook/ch11_04.htm
Another way the server can tell you are a robot (PHP code) is to check the Referer HTTP header.
You can learn more here:
http://en.wikipedia.org/wiki/HTTP_referer
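A minimal sketch of both ideas with PHP cURL, assuming the phpinfo URL from the question and a hypothetical cookie-jar path: the first request saves whatever cookies the server sets, and the second request replays them together with a Referer header.
$cookieJar = '/tmp/cookies.txt';                          // hypothetical path for the cookie jar

// First request: let the server set its cookies and save them to the jar.
$ch = curl_init("http://www.example.com/phpinfo.php");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieJar);          // write received cookies here when the handle closes
curl_exec($ch);
curl_close($ch);

// Second request: send the stored cookies back and add a Referer header.
$ch = curl_init("http://www.example.com/phpinfo.php");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieJar);         // read cookies from the jar
curl_setopt($ch, CURLOPT_REFERER, "http://www.example.com/");
$output = curl_exec($ch);
curl_close($ch);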
My URL "www.example.com" is working in browser but when I get response via curl of URL "www.example.com" I get 503 service unavailable response.
I used the following code:
$url = 'http://www.example.com';
$curl_handle = curl_init();
curl_setopt($curl_handle, CURLOPT_URL, $url);
curl_setopt($curl_handle, CURLOPT_CONNECTTIMEOUT, 0);
curl_setopt($curl_handle, CURLOPT_TIMEOUT, 0);
curl_setopt($curl_handle, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($curl_handle, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
curl_setopt($curl_handle, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, TRUE);
$JsonResponse = curl_exec($curl_handle);
$http_code = curl_getinfo($curl_handle);
print_r($http_code);die;
I'm pretty sure the remote server requires specific HTTP headers (cookies, for example), such as a session token or a language preference.
You have to analyze the HTTP traffic sent from your browser to the remote server and find the required HTTP headers yourself. I recommend a tool like Fiddler.
An example:
GET / HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Cookie: foo=bar
Connection: keep-alive
Assuming the remote server requires clients to send a cookie named foo, it will probably send you a 503 or 400 error message if you omit it. You have to send the cookie from cURL as well, acting like a regular client, in order to get a successful response.
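A minimal sketch of replaying such headers on the handle from the question, assuming the foo=bar cookie and the browser-like headers above are what the server actually checks:
curl_setopt($curl_handle, CURLOPT_COOKIE, "foo=bar");     // replay the cookie the browser sends
curl_setopt($curl_handle, CURLOPT_HTTPHEADER, array(
    "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language: en-US,en;q=0.5",
    "Connection: keep-alive"
));
curl_setopt($curl_handle, CURLOPT_ENCODING, "");          // let cURL send Accept-Encoding and decode gzip/deflate itself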