PHP (cURL, headers, COOKIES) and more - php

I will try to explain what I am trying to reach the best way I can.
Let's say there is a page that shows information and it has cookies ( I can see the cookies through Firecookie [Firefox add-on in Firebug]) I am able to print the cookies in my localhost through
$cookies = array();
foreach ($http_response_header as $hdr) {
if (preg_match('/^Set-Cookie:\s*([^;]+)/', $hdr, $matches)) {
parse_str($matches[1], $tmp);
$cookies += $tmp;
}
}
print_r($cookies);
but the original page has request headers, and what I am trying to do is get the request header and make a request to that same page. I guess I have two questions, do I get the request header through COOKIES or separately. And how do I get the request headers of a page and send a request to that page with those request headers? I tried lots of things and couldn't succeed. I don't have the codes I've tried since I constantly try new things therefore can't paste what I have, only the file I pasted.

If you're using PHP with apache you can get request headers using function apache_request_headers.
http://php.net/manual/en/function.apache-request-headers.php

Related

Send multiple http request at same time in php

I am trying to get page meta tags and description from given url .
I have url array that I have to loop through to send curl get request and get each page meta, this takes a lot of time to process .
Is there any way to process all urls simultaneuosly at same time?
I mean send request to all urls at same time and then receive
response as soon as request is completed respectively.
For this purpose I have used
curl_multi_init()
but its not working as expected. I have used this example
Simultaneuos HTTP requests in PHP with cURL
I have also used GuzzleHttp example
Concurrent HTTP requests without opening too many connections
my code
$urlData = [
'http://youtube.com',
'http://dailymotion.com',
'http://php.net'
];
foreach ($urlData as $url) {
$promises[] = $this->client->requestAsync('GET', $url);
}
Promise\all($promises)->then(function (array $responses) {
foreach ($responses as $response) {
$htmlData = $response->getBody();
dump($profile);
}
})->wait();
But I got this error
Call to undefined function GuzzleHttp\Promise\Promise\all()
I am using Guzzle 6 and Promises 1.3
I need a solution whether it is in curl or in guzzle to send simultaneous request to save time .
Check your use statements. You probably have a mistake there, because correct name is GuzzleHttp\Promise\all(). Maybe you forgot use GuzzleHttp\Promise as Promise.
Otherwise the code is correct and should work. Also check that you have cURL extension enabled in PHP, so Guzzle will use it as the backend. It's probably there already, but worth to check ;)

Google drive api file_get_contents and refferer

I am trying to list files from google drive folder.
If I use jquery I can successfully get my results:
var url = "https://www.googleapis.com/drive/v3/files?q='" + FOLDER_ID + "'+in+parents&key=" + API_KEY;
$.ajax({
url: url,
dataType: "jsonp"
}).done(function(response) {
//I get my results successfully
});
However I would like to get this results with php, but when I run this:
$url = 'https://www.googleapis.com/drive/v3/files?q='.$FOLDER_ID.'+in+parents&key='.$API_KEY;
$content = file_get_contents($url);
$response = json_decode($content, true);
echo json_encode($response);
exit;
I get an error:
file_get_contents(...): failed to open stream: HTTP request failed! HTTP/1.0 403 Forbidden
If I run this in browser:
https://www.googleapis.com/drive/v3/files?q={FOLDER_ID}+in+parents&key={API_KEY}
I get:
The request did not specify any referer. Please ensure that the client is sending referer or use the API Console to remove the referer restrictions.
I have set up referrers for my website and localhost in google developers console.
Can someone explain me what is the difference between jquery and php call and why does php call fails?
It's either the headers or the cookies.
When you conduct the request using jQuery, the user agent, IP and extra headers of the user are sent to Google, as well as the user's cookies (which allow the user to stay logged in). When you do it using PHP this data is missing because you, the server, becomes the one who sends the data, not the user, nor the user's browser.
It might be that Google blocks requests with invalid user-agents as a first line of defense, or that you need to be logged in.
Try conducting the same jQuery AJAX request while you're logged out. If it didn't work, you know your problem.
Otherwise, you need to alter the headers. Take a look at this: PHP file_get_contents() and setting request headers. Of course, you'll need to do some trial-and-error to figure out which missing header allows the request to go through.
Regarding the referrer, jQuery works because the referrer header is set as the page you're currently on. When you go to the page directly there's no referrer header. PHP requests made using file_get_contents have no referrer because it doesn't make much sense for them to have any.

Get set-cookie header from redirect php

Okay, I haven't been able to find a solution to this as of yet, and I need to start asking questions on SO so I can get my reputation up and hopefully help out others.
I am making a wordpress plugin that retrieves a json list of items from a remote site. Recently, the site added a redirecting check for a cookie.
Upon first request without the cookie, 302 headers are provided, pointing to a second page which also returns a 302 redirect pointing to the homepage. On this second page, however, the set-cookie headers are also provided, which prevents the homepage from redirecting yet again.
When I make a cURL request to a url on the site, however, it fails in a redirect loop.
Now, obviously the easiest solution would be to fix this on the remote server. It should not be implementing that redirect for api routes. But that at the moment is not an option for me.
I have found how to retrieve the set-cookie header value from a 2** code response, however I cannot seem to figure out how to access that value when 302 headers are provided, and cURL returns nothing but an error.
Is there a way to access the headers even when it reaches the maximum (20) redirects?
Is it possible to stop the execution after a set number of redirects?
How can I get this cookie's value so I can provide it in a final request?
If you use the cURL option CURLOPT_HEADER the data you get back from curl_exec will include the headers from each response, including the 302.
If you enable cookie handling in cURL, it should pick up the cookie set by the 302 response just fine unless you prefer to handle it manually.
I often do something like this when there could be multiple redirects:
$ch = curl_init($some_url_that_302_redirects);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_COOKIEFILE, ''); // enable curl cookie handling
$result = curl_exec($ch);
// $result contains the headers from each response, plus the body of the last response
$info = curl_getinfo($ch); // info will tell us how many redirects were followed
for ($i = 0; $i < intval($info['redirect_count']); ++$i) {
// get headers from each response
list($headers, $response) = explode("\r\n\r\n", $response, 2);
// DO SOMETHING WITH $headers HERE
// If there was a redirect, headers will be all headers from that response,
// including Set-Cookie headers
}
list($headers, $body) = explode("\r\n\r\n", $response, 2);
// Now $headers are the headers from the final response
// $body is the content from the final response
You already had problems before you started trying to add cookies into the mix. Doing a single redirect is bad for performance. Using a 302 response as a means of dissociating data presentation from data retrieval under HTTP/1,1 or later is bad (it works, but is a violation of the protocol - you should be using a 303 if you really must redirect).
Trying to set a cookie in a 3xx response will not work consistently across browsers. Setting a cookie in an Ajax response will not work consistently across browsers.
It should not be implementing that redirect for api routes
Maybe the people at the remote site are trying to prevent you leeching their content?
Fetch the homepage first in an iframe to populate the cookie and record a flag in your domain on the browser.
I actually found another SO question, of course after I posted, that lead me in the right direction to make this possible, HERE
I used the WebGet class to make the curl request. It has not been maintained for three years, but it still works fine.
It has a function that makes the curl request without following through on the redirect loop.
There are a lot of curl options set in that function, and curl is not returning an error in it, so I'm sure the exact solution could be simpler. HERE is a list of curl options for anyone who would like to delve deeper.
Here is how I handle each of the responses to get the final response
$w = new WebGet();
$cookie_file = 'cookie.txt';
if (!file_exists($cookie_file)) {
$cookie_file_inter = fopen($cookie_file, "w");
fclose($cookie_file_inter);
}
$w->cookieFile = $cookie_file; // must exist and be writable
$w->requestContent($url);
$headers = $w->responseHeaders;
if ($w->responseStatusCode == 302 && isset($headers['LOCATION'])) {
$w->requestContent($headers['LOCATION']);
}
if ($w->responseStatusCode == 302 && isset($headers['LOCATION'])) {
$w->requestContent($headers['LOCATION']);
}
$response = $w->cachedContent;
Of course, this is all extremely bad practice, and has severe performance implications, but there may be some rare use cases that find themselves needing to do this.

Detect : from curl or browser

I've got an almost empty page that I need for a curl.
The thing is that it is also accessible by a browser and looks weird.
Is it possible in PHP to detect if the request come from a browser or a curl ? So that way I can make a redirect if it comes from browser.
Thank you.
This will get you the client Agent, among other stuff. If your curl does not pretend to be something else, it should do.
foreach (getallheaders() as $name => $value) {
echo "$name: $value\n";
}
or simpler:
$_SERVER['HTTP_USER_AGENT']
gets you the user-agent (browser signatur) directly.
Browser sends User-Agent header in request, if you don't setup User-Agent in curl just check request for this header.

Cache curl response

I would like to cache curl responses, and I found out a couple of ways to do that, but all of them include saving a response to the file, and than retrieving it. The problem here is that my code needs to work with curl_getinfo() object, which is available only after the curl_exec call is finished. So, the ideal way would be if the curl itself would cache the response instead of making a new request. I tried that approach using Cache-Control request header with the value max-age=604800, however I don't see any changes. Any ideas how to accomplish this ?
If you have enough information about a request to compile a unique identifier/key you could use for example Memcached:
$key = $url.':'.$some_other_variable;
$cached = $memcached->get($key);
if ($cached)
{
return $cached;
}
// Perform cURL request
// ...
$memcached->set($key, $data_to_cache);

Categories