This is my code:
function get_remote_file_to_cache() {
    $sites_array = array("http://www.php.net", "http://www.engadget.com", "http://www.google.se", "http://arstechnica.com", "http://wired.com");
    $the_site = $sites_array[rand(0, 4)]; // pick a random site
    $curl = curl_init();
    $fp = fopen("rr.txt", "w");
    curl_setopt($curl, CURLOPT_URL, $the_site);
    curl_setopt($curl, CURLOPT_FILE, $fp); // write the response straight to the file
    curl_exec($curl);
    curl_close($curl);
    fclose($fp); // close the cache file handle
}
$cache_file = 'rr.txt';
$cache_life = 15; // caching time, in seconds
$filemtime = @filemtime($cache_file); // returns false if the file does not exist yet
if (!$filemtime or (time() - $filemtime >= $cache_life)) {
    ob_start();
    echo file_get_contents($cache_file);
    ob_get_flush();
    echo " <br><br><h1>Writing to cache</h1>";
    get_remote_file_to_cache();
} else {
    echo "<h1>Reading from cache file:</h1><br> ";
    ob_start();
    echo file_get_contents($cache_file);
    ob_get_flush();
}
Everything works as it should with no problems or surprises, and as you can see it's pretty simple code, but I am new to cURL and would just like to add one check to the code, and I don't know how:
Is there any way to check that the file fetched from the remote site is not a 404 (Not Found) page or similar, but came back with status code 200 (successful)?
So basically: only write to the cache file if the file was fetched with status code 200.
Thanks!
To get the status code from a cURL handle, use curl_getinfo after curl_exec:
$status = curl_getinfo($curl, CURLINFO_HTTP_CODE);
But the cached file will be overwritten as soon as
$fp = fopen("rr.txt", "w");
is called, regardless of the HTTP code. This means that to update the cache only when the status is 200, you need to read the contents into memory or write to a temporary file, and then write to the real file only once you know the status is 200.
It is also a good idea to
touch('rr.txt');
before executing cURL, so that a request arriving before the current operation finishes will not also try to load the remote page.
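Putting it together, a minimal sketch of the temp-file approach (the rename()-based promotion is a suggestion, not part of the original code):
function get_remote_file_to_cache() {
    $sites_array = array("http://www.php.net", "http://www.engadget.com", "http://www.google.se", "http://arstechnica.com", "http://wired.com");
    $the_site = $sites_array[rand(0, 4)];

    touch('rr.txt'); // mark the cache fresh so a parallel request doesn't also fetch

    $tmp = 'rr.txt.tmp'; // scratch file next to the real cache
    $fp = fopen($tmp, 'w');

    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $the_site);
    curl_setopt($curl, CURLOPT_FILE, $fp);
    curl_exec($curl);
    $status = curl_getinfo($curl, CURLINFO_HTTP_CODE);
    curl_close($curl);
    fclose($fp);

    if ($status == 200) {
        rename($tmp, 'rr.txt'); // promote to the real cache only on success
    } else {
        unlink($tmp); // discard the failed fetch
    }
}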
Try this after curl_exec:
$httpCode = curl_getinfo($curl, CURLINFO_HTTP_CODE);
I have spent a couple of hours reading up on this, but as of yet I have found no clear solution. I am using WAMP as my local server, and I have a successful API call set up to return data.
I would like to store that data locally, thus reducing the number of API calls being made.
For simplicity I have created a cache.json file in the same folder as my PHP scripts, and when I run the process I can see the file has been accessed, as the timestamp updates.
But the file remains empty.
Based on research I suspect the issue may come down to permissions; I have gone through the folders and files and unchecked read-only etc.
I'd appreciate it if someone could validate that my code is correct and, if it is, hopefully point me in the direction of a solution.
Many thanks
<?php
$url = 'https://restcountries.eu/rest/v2/name/' . $_REQUEST['country'];
$cache = __DIR__ . "/cache.json"; // make this file in same dir
$force_refresh = true; // dev
$refresh = 60; // once a min (set short time frame for testing)
// cache json results so as not to over-query (api restrictions)
if ($force_refresh || ((time() - filectime($cache)) > ($refresh) || 0 == filesize($cache))) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_URL, $url);
    $result = curl_exec($ch);
    curl_close($ch);
    $decode = json_decode($result, true);
    $handle = fopen($cache, 'w'); // or die('no fopen');
    $json_cache = $decode;
    fwrite($handle, $json_cache);
    fclose($handle);
} else {
    $json_cache = file_get_contents($cache); // locally
}
echo json_encode($json_cache, JSON_UNESCAPED_UNICODE);
?>
I managed to solve this by using file_put_contents(). Not being an expert, I do not understand why this works and the code above doesn't, but maybe this helps someone else.
Adjusted code:
<?php
$url = 'https://restcountries.eu/rest/v2/name/' . $_REQUEST['country'];
$cache = __DIR__ . "/cache.json"; // make this file in same dir
$force_refresh = false; // dev
$refresh = 60; // once a min (short time frame for testing)
// cache json results so as not to over-query (api restrictions)
if ($force_refresh || ((time() - filectime($cache)) > ($refresh) || 0 == filesize($cache))) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_URL, $url);
    $result = curl_exec($ch);
    curl_close($ch);
    $decode = json_decode($result, true);
    $json_cache = $result; // keep the raw JSON string, not the decoded array
    file_put_contents($cache, $json_cache);
} else {
    $json_cache = file_get_contents($cache); // locally
    $decode = json_decode($json_cache, true);
}
echo json_encode($decode, JSON_UNESCAPED_UNICODE);
?>
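The likely reason the first version produced an empty file: fwrite() expects a string, but it was handed the array returned by json_decode(), so PHP raised a warning and wrote nothing, while fopen($cache, 'w') still truncated the file and updated its timestamp. A small sketch of the difference (the JSON value is illustrative):
$result = '[{"name":"Sweden"}]';        // raw JSON string, as returned by curl_exec()
$decode = json_decode($result, true);   // now a PHP array, not a string

$handle = fopen($cache, 'w');           // truncates the file either way
fwrite($handle, $decode);               // warning: expects a string; nothing is written
fclose($handle);

file_put_contents($cache, $result);     // writes the raw JSON string as-is
// equivalently: file_put_contents($cache, json_encode($decode));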
This is a bit of a continuation of my previous thread (PHP CURL Chunked encoding a large file (700mb)), but I've now improvised something else.
Right now I'm trying to use fread and then send the file through cURL chunk by chunk (each chunk around 1MB). While the idea is good and it does work, it does time out the server, so I was wondering if there is any way to reduce how many chunks it sends per second, or some way to keep it from completely overloading my PHP process.
$length = (1024 * 1024) * 1; // 1MB per chunk
$chunk = 0;
$handle = fopen($getFile, "r");
while (($buffer = fread($handle, $length)) !== false) {
    if ($response = sendChunk($getServer, $buffer)) {
        $chunk++;
        print "Chunk " . $chunk . " Sent (Code: " . $response . ")! \n";
    }
}
The sendChunk function is:
function sendChunk($url, $chunk) {
    $POST_DATA = [
        'file' => base64_encode($chunk)
    ];
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_TIMEOUT, 2048);
    curl_setopt($curl, CURLOPT_POST, 1);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_POSTFIELDS, $POST_DATA);
    curl_exec($curl);
    $response = curl_getinfo($curl, CURLINFO_HTTP_CODE);
    curl_close($curl);
    return $response;
}
I tried making it read the file line by line, but that doesn't work, since a video file (mp4, wmv) is binary data rather than lines of text.
UPDATE: I have discovered the issue: the timing out was actually Cloudflare timing out when no HTTP response arrives in time. So I decided to run the script over SSH and it worked fine... except for one thing.
After the file is successfully sent over, it just keeps sending 0-byte chunks in an endless loop. I was told that was because feof() isn't always accurate at detecting the end of a file. So I tried the ($buffer = fread($handle, $length)) !== false trick and it still does the same thing (fread() returns an empty string, not false, at end of file, so that check never terminates). Any ideas?
After working on this for around 8 hours, I noticed that I wasn't using $buffer to send the chunk, so now I have done that.
while (!feof($handle) && ($buffer = fread($handle, $length)) !== false) {
    if ($response = sendChunk($getServer, $buffer)) {
        $chunk++;
        print "Chunk " . $chunk . " Sent (Code: " . $response . ")! \n";
    }
}
Everything works fine, and I did some other touch-ups, like checking for a response code of 200. But the core of it works.
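If the receiving server still struggles with the request rate, one simple option (not from the original post; the delay value is an arbitrary assumption) is to pause between chunks with usleep():
$delayMicros = 250000; // assumed 0.25s pause between chunks; tune as needed
while (!feof($handle) && ($buffer = fread($handle, $length)) !== false) {
    if ($response = sendChunk($getServer, $buffer)) {
        $chunk++;
        print "Chunk " . $chunk . " Sent (Code: " . $response . ")! \n";
    }
    usleep($delayMicros); // throttle so the remote PHP process isn't flooded
}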
A lesson for anyone using Cloudflare who wants to transfer a file (up to 2GB) to another server via cURL:
There are better ways than cURL for this in my opinion, but the client requested it be done this way, and it works.
Cloudflare has a maximum upload limit of 250MB for free users, and you cannot work around it with chunked uploading through cURL's supported stream feature, as Cloudflare still reads the total as > 250MB in the header.
When I managed to get this code to work, it would still time out on certain chunks, because Cloudflare needs an HTTP response header within 100 seconds or it times out. Thankfully my script is executed via cron, so it doesn't need to go through Cloudflare to work. However, if you are looking to execute the code in the browser, you may want to take a look at this: https://github.com/marcialpaulg/Fixing-Cloudflare-Error-524
I am making a website that will check if a website is working and live. I pass in the URL of the site I would like to check and the following code will check if the site is live and return the HTTP response code as well as true or false.
function urlExists($url = NULL)
{
    if ($url == NULL) return false;
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $data = curl_exec($ch);
    $httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    if ($httpcode == 0) {
        return array(false, $httpcode);
    } else if ($httpcode < 400) {
        return array(true, $httpcode);
    } else {
        return array(false, $httpcode);
    }
}
With one of the sites I am testing, though, I am getting an HTTP response code of 0, even though I know that the site is live and working.
The site is very slow, as it's a large site on a not very powerful server, so response times can vary between 7 - 25 seconds.
Any help would be greatly appreciated.
Thanks,
Sam
Based on these two links:
https://curl.haxx.se/libcurl/c/CURLOPT_TIMEOUT.html
and
https://curl.haxx.se/libcurl/c/CURLOPT_CONNECTTIMEOUT.html
The first one sets the maximum time the whole request is allowed to take; the second one is the timeout for the connect phase only.
As you said, the site URL you are hitting takes 7-25 seconds to respond, so your cURL request is terminated and closed by these two time settings before the response arrives.
Increase these two time settings in your code, as in the sketch below, and it will work for you.
Thanks.
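A sketch with more generous limits (the exact values here are assumptions; tune them to the site's 7-25 second response times):
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30); // allow up to 30s to establish the connection
curl_setopt($ch, CURLOPT_TIMEOUT, 60);        // allow up to 60s for the entire request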
I will offer two alternatives for you to compare; along with your curl() function, you will have three options to see which one is better/faster for you.
Option A (all PHP versions); requires allow_url_fopen to be enabled so fopen() can open URLs:
if (!$fp = fopen($url, 'r'))
{
    trigger_error("Unable to open URL ($url)", E_USER_ERROR);
}
$headers = stream_get_meta_data($fp);
fclose($fp);
$http_header_info = $headers['wrapper_data'][0]; // status line, e.g. "HTTP/1.1 200 OK"
$httpCode = (int)substr($http_header_info, 9, 3); // grab the 3-digit status code
Option B (PHP 5+):
$headers = get_headers($url, 1);
$http_header_info = $headers[0]; // status line, e.g. "HTTP/1.1 200 OK"
$httpCode = (int)substr($http_header_info, 9, 3);
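Note that get_headers() issues a full GET request by default; if you only want the headers, a default stream context can switch it to HEAD (a small sketch; note this changes the default for all subsequent HTTP stream calls):
// Switch get_headers() to HEAD so no response body is transferred
stream_context_set_default(array('http' => array('method' => 'HEAD')));
$headers = get_headers($url, 1);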
Also, if anyone has benchmarks of these three approaches, I am curious to see which is more appropriate (only for retrieving HTTP response headers, of course).
A code of 0 is often returned when the URL syntax is invalid or the host could not be resolved.
You can also call the curl_error($ch) function (http://php.net/manual/en/function.curl-error.php) to determine the error details.
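For example, a sketch that captures the error text (curl_error() must be read before the handle is closed):
$data = curl_exec($ch);
$httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
$error = curl_error($ch); // read the error text before closing the handle
curl_close($ch);
if ($httpcode == 0) {
    error_log("cURL failed for $url: $error");
}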
I'm trying to find a way to quickly access a file and then disconnect immediately.
So I've decided to use cURL, since it's the fastest option for me, but I can't figure out how I should "disconnect" cURL.
With the code below, Apache's access log says that the file I tried accessing was indeed accessed, but I'm feeling a little iffy about this, because when I just run the while loop without breaking out of it, it keeps looping. Shouldn't the loop stop when cURL has finished fetching the file? Or am I just being silly; is the loop just restarting constantly?
<?php
$Resource = curl_init();
curl_setopt($Resource, CURLOPT_URL, '...');
curl_setopt($Resource, CURLOPT_HEADER, 0);
curl_setopt($Resource, CURLOPT_USERAGENT, '...');
while (curl_exec($Resource)) {
    break;
}
curl_close($Resource);
?>
I tried setting the CURLOPT_CONNECTTIMEOUT_MS / CURLOPT_CONNECTTIMEOUT options to very small values, but it didn't help in this case.
Is there a more "proper" way of doing this?
This statement is superfluous:
while (curl_exec($Resource)) {
    break;
}
Instead just keep the return value for future reference:
$result = curl_exec($Resource);
The while loop does not help anything: each evaluation of the loop condition calls curl_exec() again and re-runs the whole transfer, which is why it keeps looping without the break. Now to your question: you can tell curl that it should take only some bytes from the body and then quit. That can be achieved by reducing CURLOPT_BUFFERSIZE to a small value and using a callback function to tell curl it should stop:
$withCallback = array(
    CURLOPT_BUFFERSIZE => 20, # ~ value of bytes you'd like to get
    CURLOPT_WRITEFUNCTION => function($handle, $data) {
        echo "WRITE: (", strlen($data), ") $data\n";
        return 0; # returning anything other than strlen($data) makes curl abort
    },
);
$handle = curl_init("http://stackoverflow.com/");
curl_setopt_array($handle, $withCallback);
curl_exec($handle);
curl_close($handle);
Output:
WRITE: (10) <!DOCTYPE
Another alternative is to make a HEAD request by using CURLOPT_NOBODY, which never fetches the body. But then it's not a GET request.
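A sketch of that variant (the URL is just a placeholder):
$handle = curl_init("http://stackoverflow.com/");
curl_setopt($handle, CURLOPT_NOBODY, true);         // send HEAD, never fetch the body
curl_setopt($handle, CURLOPT_RETURNTRANSFER, true); // keep curl from printing the (empty) output
curl_exec($handle);
$status = curl_getinfo($handle, CURLINFO_HTTP_CODE);
curl_close($handle);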
The connect timeout settings only cover how long the connect phase may take, that is, the time until the server has accepted the connection. They are not related to the phase in which curl fetches data from the server; that phase is governed by:
CURLOPT_TIMEOUT: The maximum number of seconds to allow cURL functions to execute.
You will find a long list of available options in the PHP manual for curl_setopt.
Perhaps this might be helpful?
$GLOBALS["dataread"] = 0;
define("MAX_DATA", 3000); // how many bytes should be read?

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.php.net/");
curl_setopt($ch, CURLOPT_WRITEFUNCTION, "handlewrite");
curl_exec($ch);
curl_close($ch);

function handlewrite($ch, $data)
{
    $GLOBALS["dataread"] += strlen($data);
    echo "READ " . strlen($data) . " bytes\n";
    if ($GLOBALS["dataread"] > MAX_DATA) {
        return 0; // tell curl to abort the transfer
    }
    return strlen($data); // tell curl all data was handled
}
I need a PHP web proxy that reads HTML, shows it to the user, and rewrites all the links so that when the user clicks a link, the proxy handles that request too. Just like this code, but it should additionally rewrite all the links.
<?php
// Set your return content type
header('Content-type: text/html');

// Website url to open
$daurl = 'http://www.yahoo.com';

// Get that website's content
$handle = fopen($daurl, "r");

// If there is something, read and return
if ($handle) {
    while (!feof($handle)) {
        $buffer = fgets($handle, 4096);
        echo $buffer;
    }
    fclose($handle);
}
?>
I hope I have explained this well; this question is about not reinventing the wheel.
One additional question: will this kind of proxy deal with content like Flash?
For an open source solution, check out PHProxy. I've used it in the past and it seemed to work quite well from what I can remember.
It will sort of work, but you need to rewrite every relative path to an absolute one, and I think cookies won't work in this case. Use cURL for these operations...
function curl($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $result = curl_exec($ch);
    curl_close($ch); // close before returning; code after a return statement never runs
    return $result;
}
$url = "http://www.yahoo.com";
echo curl($url);
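To illustrate the link-rewriting part, a rough sketch (proxy.php and its url parameter are hypothetical names, and a real proxy would also need to handle relative URLs, forms, and cookies):
// Rewrite absolute http(s) links in href/src attributes so they
// route back through the proxy script.
function rewrite_links($html, $proxyBase) {
    return preg_replace_callback(
        '/(href|src)="(https?:\/\/[^"]+)"/i',
        function ($m) use ($proxyBase) {
            return $m[1] . '="' . $proxyBase . urlencode($m[2]) . '"';
        },
        $html
    );
}

echo rewrite_links(curl("http://www.yahoo.com"), 'proxy.php?url=');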