Is there any appreciable difference, in terms of speed on a low-traffic website, between the following snippets of code?
$html = file_get_contents('cache/foo.html');
if ($html) {
echo $html;
exit;
}
Or this:
$file = 'cache/foo.html';
if (file_exists($file)) {
echo file_get_contents($file);
exit;
}
In the first snippet, there's a single call to file_get_contents() whereas in the second there's also a call to file_exists(). The page does involve database access - and this caching would avoid that entirely.
It will be unnoticeably slower on a low-traffic website. In any case, there is no reason to perform that check if you are going to fetch the contents anyway: file_get_contents() already performs it behind the scenes, returning false if the file does not exist.
You can even put the call to file_get_contents() directly inside the condition:
if ($html = file_get_contents('cache/foo.html')) {
echo $html;
exit;
}
The runtime differences between the two variants are so small that they do not matter in practice.
The first variant is slightly faster when the file exists; the second is faster when it does not.
Neither solution has the best possible performance, though, because the entire HTML is first loaded into memory before echo outputs it. Better:
$ok = @readfile('cache/foo.html');
With readfile the file is sent directly to the output, without a detour through memory. The @ operator suppresses the warning if the file does not exist.
$ok contains the number of bytes output if the output was successful, and false if the file does not exist.
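Applied to the caching example from the question, a minimal sketch (same cache/foo.html path as above):
// Serve the cached file if it exists; @ suppresses the warning
// readfile emits when the file is missing.
if (@readfile('cache/foo.html') !== false) {
    exit;
}
// ...fall through to database access and page generation...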
Related
In my cache system, when a new page is requested I want to check whether a file already exists; if it doesn't, a copy is stored on the server. If it does exist, it must not be overwritten.
The problem I have is that I may be using functions designed to be slow.
This is part of my current implementation to save files:
if (!file_exists($filename)) {
    $h = fopen($filename, "wb");
    if ($h) {
        fwrite($h, $c);
        fclose($h);
    }
}
This is part of my implementation to load files:
if (($m = @filemtime($file)) !== false) {
    if ($m >= filemtime("sitemodification.file")) {
        $outp = file_get_contents($file);
        header("Content-length: " . strlen($outp), true);
        echo $outp;
        flush();
        exit();
    }
}
What I want to do is replace this with a better-performing set of functions while still achieving the same functionality. All cache files, including sitemodification.file, reside on a ramdisk. I added a flush before exit in the hope that content will be output sooner.
I can't use direct memory addressing at this time because the file sizes to be stored are all different.
Is there a set of functions I can use that can execute the code I provided faster, by at least a few milliseconds, especially the file-loading code?
I'm trying to keep my time to first byte low.
First, prefer is_file to file_exists and use file_put_contents:
if (!is_file($filename)) {
    file_put_contents($filename, $c);
}
Then, use the proper function for this kind of work, readfile:
if (($m = @filemtime($file)) !== false && $m >= filemtime('sitemodification.file')) {
    header('Content-length: ' . filesize($file));
    readfile($file);
}
You should see a small improvement, but keep in mind that file accesses are slow, and you hit the file system three times before sending any content.
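Putting the two pieces together, a sketch of the whole serve-from-cache path (variable and file names are the ones from the question):
// Serve straight from cache when it is at least as new as the
// site-modification marker; otherwise fall through to regeneration.
if (($m = @filemtime($file)) !== false && $m >= filemtime('sitemodification.file')) {
    header('Content-length: ' . filesize($file));
    readfile($file);
    exit();
}
// ...regenerate the page into $c here, then store it:
if (!is_file($filename)) {
    file_put_contents($filename, $c);
}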
I have a script that re-writes a file every few hours. This file is inserted into end users html, via php include.
How can I check whether my script is, at this exact moment, working on (e.g. rewriting) the file while that file is being requested for display? Is it even an issue? In other words: what will happen if a user accesses the file at the same time, what are the odds of that, and will the user just have to wait until the script has finished its work?
Thanks in advance!
More on the subject...
Is this a way forward using file_put_contents and LOCK_EX?
when script saves its data every now and then
file_put_contents("text", $content, LOCK_EX);
and when user opens the page
if (file_exists("text")) {
function include_file() {
$file = fopen("text", "r");
if (flock($file, LOCK_EX)) {
include_file();
}
else {
echo file_get_contents("text");
}
}
} else {
echo 'no such file';
}
Could anyone advise me on the syntax: is this a proper way to call include_file() after the condition, and how can I limit the number of such calls?
I guess this solution might also be good, apart from the same recursive call to include_file(); would it even work?
function include_file() {
    $time = time();
    $file = filectime("text");
    if ($file + 1 < $time) {
        echo "good to read";
    } else {
        echo "have to wait";
        include_file();
    }
}
To check whether the file is currently being written, you can use the filectime() function to get the time the file was last changed.
You can store the current timestamp in a variable at the top of your script, and whenever you need to access the file, compare that timestamp with the filectime() of the file. If the file's change time is the more recent of the two, you have hit the scenario where you must wait for the file to be written, and you can log that to a database or another file.
To prevent this scenario from happening, you can change the script that writes the file so that it first creates a temporary file and, once done, replaces (moves or renames) the original file with the temporary one. This replacement takes far less time than writing the file, which makes the collision a very rare possibility.
Even if a read and a replace do occur simultaneously, the time the reading script has to wait will be very short.
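A sketch of that write-then-rename approach (file names hypothetical): rename() to a target on the same filesystem is atomic on POSIX systems, so a reader sees either the old file or the new one, never a half-written file.
// Write the new content to a temporary file first...
$tmp = $file . '.tmp.' . getmypid();
file_put_contents($tmp, $newContent);
// ...then atomically swap it into place.
rename($tmp, $file);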
Depending on the size of the file, this might be an issue of concurrency. But you might solve that quite easily: before starting to write the file, create a kind of "lock file", i.e. if your file is named "incfile.php", create an "incfile.php.lock". Once you're done with writing, remove this file.
On the include side, you can check for the existence of "incfile.php.lock" and wait until it has disappeared; that needs some looping and sleeping for the unlikely case of a concurrent access.
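A sketch of that lock-file idea (names as in the example above; the retry bound and sleep interval are arbitrary):
// Writer: signal "write in progress" with a lock file.
touch('incfile.php.lock');
file_put_contents('incfile.php', $content);
unlink('incfile.php.lock');

// Reader: wait briefly while the lock file exists, then include.
$tries = 0;
while (file_exists('incfile.php.lock') && $tries++ < 50) {
    usleep(100000);   // 100 ms between checks
    clearstatcache(); // file_exists() results are cached per request
}
include 'incfile.php';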
Basically, you should consider another solution: write the data that gets rendered into that file to a database (where locks etc. are available) and render it in a module which then gets included in your page. Solutions like yours are hard to maintain in the long run...
This question is old, but I add this answer because the other answers have no code.
function write_to_file(string $fp, string $string) : bool {
    $timestamp_before_fwrite = time();
    $stream = fopen($fp, "w");
    fwrite($stream, $string);
    fclose($stream);
    // PHP caches stat results, so clear them before re-reading the mtime
    clearstatcache(true, $fp);
    $file_last_changed = filemtime($fp);
    if ($file_last_changed < $timestamp_before_fwrite) {
        // File not changed: the write apparently did not take effect
        return false;
    }
    return true;
}
This is the function I use to write to a file: it first takes the current timestamp before making changes to the file, and then compares that timestamp to the time the file was last changed.
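Hypothetical usage of the function above:
// Returns false if the file's mtime is older than the write attempt.
if (!write_to_file('/tmp/example.txt', 'hello')) {
    error_log('file appears unchanged after write');
}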
I have a simple caching system as
if (file_exists($cache)) {
echo file_get_contents($cache);
// if $cache is deleted exactly here, there is nothing to display
}
else {
// PHP process
}
We regularly delete outdated cache files, e.g. deleting all caches after one hour. Although this process is very fast, I am thinking that a cache file could be deleted right between the if statement and the file_get_contents call.
I mean: when the if statement checks the existence of the cache file, it exists; but when file_get_contents tries to read it, it is no longer there (deleted by the simultaneous cache-deleting process).
file_get_contents locks the file, which protects against the deletion happening during the read itself. But the file can still be deleted after the if statement has sent the PHP process into the first branch (before file_get_contents starts).
Is there any approach to avoid this? Is the cache deleting system different?
NOTE: I have not faced any practical problem, as it is not very probable to hit this event; but logically it is possible, and it should happen under heavy load.
Luckily file_get_contents returns FALSE on error, so you could quick-bake it like:
if (FALSE !== ($buffer = file_get_contents($cache))) {
    echo $buffer;
    return;
}
// PHP process
or similar. It's a bit the quick-and-dirty way, considering you want to place the @ operator to hide any warnings about non-existent files:
if (FALSE !== ($buffer = @file_get_contents($cache))) {
The other alternative would be to lock the file; however, that might prevent your cache-deletion process from deleting the file while you have it locked.
What's left is to handle the staleness of the cache yourself. That means reading the file's creation time in PHP and checking it against the threshold the file-deletion process uses (5 minutes is exemplary): if it's older, you know the file is already stale and due to be replaced with fresh content, so re-create it then. Otherwise read the file in, which is probably better done with readfile instead of file_get_contents and echo.
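A sketch of that stale-check approach ($cache is the asker's cache path; the 5-minute threshold is the exemplary one above):
$maxAge = 300; // 5 minutes, matching the deletion threshold
$mtime  = @filemtime($cache);
if ($mtime !== false && (time() - $mtime) < $maxAge) {
    readfile($cache); // still fresh: stream it straight out
    exit;
}
// stale or missing: regenerate the content, then rewrite the cache file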
On failure, file_get_contents returns false, so what about this:
if (($output = file_get_contents($filename)) === false){
// Do the processing.
$output = 'Generated content';
// Save cache file
file_put_contents($filename, $output);
}
echo $output;
By the way, you may want to consider using fpassthru, which is more memory-efficient, especially for larger files. Using file_get_contents on large files (> 100 MB) will probably cause problems (depending on your configuration).
<?php
$fp = @fopen($filename, 'rb');
if ($fp === false) {
    // Generate output
} else {
    fpassthru($fp);
    fclose($fp);
}
I need a tool, if it exists or if one can be written in under 5 minutes (I don't want to waste anyone's time).
The tool in question would resolve the include, require, include_once and require_once statements in a PHP script and actually hardcode the contents of the included files, recursively.
This would be needed to ship PHP scripts in one big file that actually use code and resources from multiple included files.
I know that PHP is not the best tool for CLI scripts, but as it's the language I'm most proficient in, I use it to write some personal or semi-personal tools. I don't want unhelpful answers or comments that tell me to use something other than PHP or to learn something else.
The idea of that approach is to be able to have a single file that would represent everything needed to put it in my personal ~/.bin/ directory and let it live there as a completely functional and self-contained script. I know I could set include paths in the script to something that would honor the XDG data directories standards or anything else, but I wanted to try that approach.
Anyway, I ask here because I don't want to reinvent the wheel and all my searches turned up nothing; but if I get no insight here, I will continue the way I was going and actually write a tool that resolves the includes and requires.
Thanks for any help!
P.S.: I forgot to include examples and don't want to rephrase the message:
Those two files
mainfile.php
<?php
include('resource.php');
include_once('resource.php');
echo returnBeef();
?>
resource.php
<?php
function returnBeef() {
return "The beef!";
}
?>
Would be "compiled" as (comments added for clarity)
<?php
/* begin of include('resource.php'); */?><?php
function returnBeef() {
return "The beef!";
}
?><?php /* end of include('resource.php'); */
/*
NOT INCLUDED BECAUSE resource.php WAS PREVIOUSLY INCLUDED
include_once('resource.php');
*/
echo returnBeef();
?>
The script does not have to output explicit comments, but it could be nice if it did.
Thanks again for any help!
EDIT 1
I made a simple modification to the script. As I have begun writing the tool myself, I have seen a mistake I made in the original example: to do the least amount of work, the included file's contents have to be spliced in outside of the surrounding start and end tags (<?php ?>).
The resulting script example has been modified in consequence, but it has not been tested.
EDIT 2
The script does not actually need to do heavy-duty, runtime-accurate parsing of the PHP script. Only simple includes have to be handled (like include('file.php');).
I started working on my script and am reading the files in order to (unintelligently) parse them, treating includes only when they appear inside <?php ?> tags, not in comments or strings. A small additional goal is to detect dirname(__FILE__)."" in an include directive and actually honor it.
An interesting problem, but one that's not really solvable without detailed runtime knowledge. Conditional includes would be nearly impossible to determine, but if you make enough simple assumptions, perhaps something like this will suffice:
<?php
# import.php
#
# Usage:
# php import.php basefile.php
if (!isset($argv[1])) die("Invalid usage.\n");
$included_files = array();
echo import_file($argv[1])."\n";
function import_file($filename)
{
global $included_files;
# this could fail because the file doesn't exist, or
# if the include path contains a run time variable
# like include($foo);
$file = #file_get_contents($filename);
if ($file === false) die("Error: Unable to open $filename\n");
# trimming whitespace so that the str_replace() at the end of
# this routine works. however, this could cause minor problems if
# the whitespace is considered significant
$file = trim($file);
# look for require/include statements. Note that this looks
# everywhere, including non-PHP portions and comments!
if (!preg_match_all('!((require|include)(_once)?)\\s*\\(?\\s*(\'|")(.+)\\4\\s*\\)?\\s*;!U', $file, $matches, PREG_SET_ORDER | PREG_OFFSET_CAPTURE ))
{
# nothing found, so return file contents as-is
return $file;
}
$new_file = "";
$i = 0;
foreach ($matches as $match)
{
# append the plain PHP code up to the include statement
$new_file .= substr($file, $i, $match[0][1] - $i);
# make sure to honor "include once" files
if ($match[3][0] != "_once" || !isset($included_files[$match[5][0]]))
{
# include this file
$included_files[$match[5][0]] = true;
$new_file .= ' ?>'.import_file($match[5][0]).'<?php ';
}
# update the index pointer to where the next plain chunk starts
$i = $match[0][1] + strlen($match[0][0]);
}
# append the remainder of the source PHP code
$new_file .= substr($file, $i);
return str_replace('?><?php', '', $new_file);
}
?>
There are many caveats to the above code, some of which can be worked around. (I leave that as an exercise for somebody else.) To name a few:
It doesn't honor <?php ?> blocks, so it will match inside HTML
It doesn't know about any PHP rules, so it will match inside PHP comments
It cannot handle variable includes (e.g., include $foo;)
It may introduce scope errors (e.g., if (true) include('foo.php'); should become if (true) { include('foo.php'); })
It doesn't check for infinitely recursive includes
It doesn't know about include paths
etc...
But even in such a primitive state, it may still be useful.
You could use the built in function get_included_files which returns an array of, you guessed it, all the included files.
Here's an example, you'd drop this code at the END of mainfile.php and then run mainfile.php.
$includes = get_included_files();
$all = "";
foreach($includes as $filename) {
$all .= file_get_contents($filename);
}
file_put_contents('all.php',$all);
A few things to note:
any include which is not actually processed (i.e. an include inside a function that never runs) will not be dumped into the final file. Only includes which have actually run will be.
This will also leave the <?php ?> tags around each file, but you can have multiple blocks like that with no issues inside a single PHP file.
This WILL include anything included within another include.
Yes, get_included_files will list the script actually running as well.
If this HAD to be a stand-alone tool instead of a drop-in, you could read the initial file in, add this code in as text, then eval the entire thing (possibly dangerous).
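A rough sketch of that stand-alone variant (hypothetical file names; as noted, eval'ing arbitrary files is dangerous):
// Append the dumping code to the target script's source and eval it.
$source = file_get_contents('mainfile.php');
$source = preg_replace('/\?>\s*$/', '', $source); // drop a trailing close tag
$dumper = '
    $all = "";
    foreach (get_included_files() as $f) {
        $all .= file_get_contents($f);
    }
    file_put_contents("all.php", $all);
';
// The leading "?>" lets eval() re-enter PHP through the script's own
// opening <?php tag; the dumper statements run after all includes did.
eval('?>' . $source . "\n" . $dumper);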
So, I'm looking for something more efficient than this:
<?php
ob_start();
include 'test.php';
$content = ob_get_contents();
file_put_contents('test.html', $content);
echo $content;
?>
The problems with the above:
Client doesn't receive anything until the entire page is rendered
File might be enormous, so I'd rather not have the whole thing in memory
Any suggestions?
Interesting problem; don't think I've tried to solve this before.
I'm thinking you'll need to have a second request going from your front-facing PHP script to your server. This could be a simple call to http://localhost/test.php. If you use fopen-wrappers, you could use fread() to pull the output of test.php as it is rendered, and after each chunk is received, output it to the screen and append it to your test.html file.
Here's how that might look (untested!):
<?php
$remote_fp = fopen("http://localhost/test.php", "r");
$local_fp = fopen("test.html", "w");
while ($buf = fread($remote_fp, 1024)) {
echo $buf;
fwrite($local_fp, $buf);
}
fclose($remote_fp);
fclose($local_fp);
?>
A better way to do this is to use the first two parameters accepted by ob_start: output_callback and chunk_size. The former specifies a callback to handle output as it's buffered, and the latter specifies the size of the chunks of output to handle.
Here's an example:
$output_file = fopen('test.html', 'w');
if ($output_file === false) {
// Handle error
}
$write_ob_to_file = function($buffer) use ($output_file) {
fwrite($output_file, $buffer);
// Output string as-is
return false;
};
ob_start($write_ob_to_file, 4096);
include 'test.php';
ob_end_flush();
fclose($output_file);
In this example, the output buffer will be flushed (sent) for every 4096 bytes of output (and once more at the end by the ob_end_flush call). Each time the buffer is flushed, the callback $write_ob_to_file will be called and passed the latest chunk. This gets written to test.html. The callback then returns false, meaning "output this chunk as is". If you wanted to only write the output to file and not to PHP's output stream, you could return an empty string instead.
Pix0r's answer is what you want unless you actually need it "included" rather than just executed. For example, if you have login information before the test.php, it will not get passed into the file if you call it with fopen.
If you need it genuinely included, then what you have is the simplest method, but if you want constant output, you'll need to actually write test.php in a manner that outputs as well as stores the information as it goes. As far as I know there's no way to both collect the buffer and output it as you go.
Here you go: X-Sendfile. Use mod_xsendfile to send the file efficiently; really easy.
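A minimal sketch (assumes mod_xsendfile is installed and XSendFile On is enabled in the Apache config; the path is hypothetical):
// Apache streams the file itself; PHP never loads it into memory.
header('X-Sendfile: /var/www/cache/test.html');
header('Content-Type: text/html');
exit;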