PHP: The fastest way to check if the directory is empty? - php

On Stack Overflow there are several answers to the question of how to check if the directory is empty, but which is the fastest, which way is the most effective?
Answer 1: https://stackoverflow.com/a/7497848/4437206
function dir_is_empty($dir) {
$handle = opendir($dir);
while (false !== ($entry = readdir($handle))) {
if ($entry != "." && $entry != "..") {
closedir($handle); // <= I added this
return FALSE;
}
}
closedir($handle); // <= I added this
return TRUE;
}
Answer 2: https://stackoverflow.com/a/18856880/4437206
$isDirEmpty = !(new \FilesystemIterator($dir))->valid();
Answer 3: https://stackoverflow.com/a/19243116/4437206
$dir = 'directory'; // dir path assign here
echo (count(glob("$dir/*")) === 0) ? 'Empty' : 'Not empty';
Or is there a completely different way that is faster and more effective than the three above?
As for the Answer 1, please note that I added closedir($handle);, but I'm not sure if that's necessary (?).
EDIT: Initially I added closedir($dir); instead of closedir($handle);, but I corrected that as @duskwuff stated in his answer.

The opendir()/readdir() and FilesystemIterator approaches are both conceptually equivalent, and perform identical system calls (as tested on PHP 7.2 running under Linux). There's no fundamental reason why either one would be faster than the other, so I would recommend that you run benchmarks if you need to microoptimize.
The approach using glob() will perform worse. glob() returns an array of all the filenames in the directory; constructing that array can take some time. If there are many files in the directory, it will perform much worse, as it must iterate through the entire contents of the directory.
Using glob() will also give incorrect results in a number of situations:
If $dir is a directory name which contains certain special characters, including *, ?, and [/]
If $dir contains only dotfiles (i.e., filenames starting with .)
As for the Answer 1, please note that I added closedir($dir);, but I'm not sure if that's necessary (?).
It's a good idea, but you've implemented it incorrectly. The directory handle that needs to be closed is $handle, not $dir.
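As a rough illustration of the points above (not a benchmark), the sketch below wraps the FilesystemIterator check in a function and contrasts it with the glob() check on a directory that contains only a dotfile. The function name dir_is_empty_iter and the temporary directory name are made up for the demonstration.

```php
// Hypothetical helper: true when $dir has no entries besides "." and "..".
// FilesystemIterator skips the dot entries by default (SKIP_DOTS).
function dir_is_empty_iter(string $dir): bool
{
    return !(new FilesystemIterator($dir))->valid();
}

// Throwaway directory for the demonstration (the name is arbitrary).
$demoDir = sys_get_temp_dir() . '/dir_empty_demo_' . uniqid();
mkdir($demoDir);

$emptyBefore = dir_is_empty_iter($demoDir);        // nothing inside yet

file_put_contents($demoDir . '/.hidden', 'x');     // add a single dotfile

$emptyAfter = dir_is_empty_iter($demoDir);         // the iterator sees the dotfile

// The "*" pattern does not match names starting with a dot,
// so the glob() check still reports the directory as empty.
$globMatches   = glob($demoDir . '/*');
$globSaysEmpty = ($globMatches === false || count($globMatches) === 0);

var_dump($emptyBefore, $emptyAfter, $globSaysEmpty);

// Clean up.
unlink($demoDir . '/.hidden');
rmdir($demoDir);
```

For actual timings you would still need to measure on your own filesystem and PHP version.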

Related

Strict Standards trouble using arrays and end() function [duplicate]

// Other variables
$MAX_FILENAME_LENGTH = 260;
$file_name = $_FILES[$upload_name]['name'];
//echo "testing-".$file_name."<br>";
//$file_name = strtolower($file_name);
$file_extension = end(explode('.', $file_name)); //ERROR ON THIS LINE
$uploadErrors = array(
0=>'There is no error, the file uploaded with success',
1=>'The uploaded file exceeds the upload max filesize allowed.',
2=>'The uploaded file exceeds the MAX_FILE_SIZE directive that was specified in the HTML form',
3=>'The uploaded file was only partially uploaded',
4=>'No file was uploaded',
6=>'Missing a temporary folder'
);
Any ideas? After 2 days still stuck.
Assign the result of explode to a variable and pass that variable to end:
$tmp = explode('.', $file_name);
$file_extension = end($tmp);
The problem is that end() requires a reference, because it modifies the internal representation of the array (i.e. it makes the current element pointer point to the last element).
The result of explode('.', $file_name) cannot be turned into a reference. This is a restriction in the PHP language that probably exists for simplicity reasons.
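To make the "current element pointer" a bit more concrete, here is a small sketch (variable names are only for illustration). current() reports where the pointer is and end() moves it, which is why end() needs a real variable to write to.

```php
$parts = explode('.', 'long.file.name.jpg');

$before = current($parts);   // the pointer starts at the first element
$last   = end($parts);       // moves the pointer to the last element and returns it
$after  = current($parts);   // the pointer now sits on the last element

echo $before, ' ', $last, ' ', $after, "\n";   // long jpg jpg
```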
Everyone else has already given you the reason you're getting an error, but here's the best way to do what you want to do:
$file_extension = pathinfo($file_name, PATHINFO_EXTENSION);
PHP 7 compatible usage:
$fileName = 'long.file.name.jpg';
$tmp = explode('.', $fileName);
$fileExtension = end($tmp);
echo $fileExtension;
// jpg
Save the array from explode() to a variable, and then call end() on this variable:
$tmp = explode('.', $file_name);
$file_extension = end($tmp);
btw: I use this code to get the file extension:
$ext = substr( strrchr($file_name, '.'), 1);
where strrchr extracts the string after the last . and substr cuts off the .
The answer given elsewhere,
$tmp = explode('.', $fileName);
$file_extension = end($tmp);
is correct and valid. It accomplishes what you are trying to do.
Why?
The end() function does not do quite what you think it does. This is related to how the PHP array data structure works. You don't normally see it, but arrays in PHP contain a pointer to a current element, which is used for iteration (like with foreach).
In order to use end(), you must have an actual array, which has attached to it (normally, invisibly), the current element pointer. The end() function physically modifies that pointer.
The output of explode() is not stored in a variable. It is a temporary function result, so there is nothing for a reference to point to. Therefore, you cannot run end(explode()) because you violate language requirements.
Simply setting the output of explode() in a variable creates the array that you're looking for. That created array has a current element pointer. Now, all is once again right in the world.
So what about the parentheses?
This is not a bug. Once again, it's a language requirement.
In PHP 5, the extra parentheses (like end((explode()))) did more than just grouping. Wrapping the call in redundant parentheses could quiet the strict standards warning for an argument passed by reference. PHP 7.0 removed that special case: redundant parentheses around a function argument no longer affect behaviour, and the warning is always issued.
So on PHP 5 this was another working solution that took less space, and a good reviewer or maintainer would grok what you were trying to do when they saw the extra parentheses. On PHP 7 and later it no longer helps.
If you use a linter or SCA program like PHPCS, that may dislike the extra parentheses, depending on the linting profile you're using. It's your linter, tell it what you want it to do for you.
Some other answers also list things like the spread operator or array_key_last(), which are also reasonable solutions. They may be perfectly valid but they're more complicated to use and read.
I'll just use the @ prefix
This solution is valid, however incorrect. It is valid because it solves the problem. That's about the end of its merit.
Suppressing errors is always bad practice. There are many reasons why. One very large one is that you are trying to suppress one specific error condition (one that you have created), but the error suppression prefix suppresses all errors.
In this case, you will probably get away with this. However, engaging in bad programming habits is cheating and will likely lead you to cheat more and bigger in the future. You will be responsible for bad code. But I'm not the Code Police and it's your code. It's valid because it solves the problem.
Okay, so what's the best answer?
Do what @ryeguy suggests. Don't do string manipulation to solve a well-defined problem that the platform already solves for you. Use pathinfo().
This has the added benefit that it actually does what you want, which is finding the extension on a file name. There is a subtle difference.
What you are doing is getting the text following the final dot. This is different from finding the file extension. Consider the file name, .gitignore. PHP knows how to handle this. Does your code?
Once again, I'm not the Code Police. Do what suits you best.
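A small comparison may help show the difference between "text after the final dot" and "file extension". The helper name after_last_dot and the sample names are invented for this sketch.

```php
// Text after the final dot, via explode()/end().
function after_last_dot(string $name): string
{
    $parts = explode('.', $name);
    return end($parts);
}

$names = ['photo.jpg', 'archive.tar.gz', 'README', '/var/www.old/README'];

$rows = [];
foreach ($names as $name) {
    $rows[$name] = [
        'after_last_dot' => after_last_dot($name),
        'pathinfo'       => pathinfo($name, PATHINFO_EXTENSION),
    ];
}

// Print both results side by side; var_export() makes empty strings visible.
foreach ($rows as $name => $row) {
    printf(
        "%-22s explode/end: %-14s pathinfo: %s\n",
        $name,
        var_export($row['after_last_dot'], true),
        var_export($row['pathinfo'], true)
    );
}
```

For a name without a dot, or with a dot only in a directory name, the explode()/end() version returns the whole name or a path fragment, while pathinfo() returns an empty string.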
Since this has raised a flag for over 10 years, but works just fine and returns the expected value, a little stfu operator is the goodiest bad practice you are all looking for:
$file_extension = @end(explode('.', $file_name));
But be warned: don't use it in loops, due to the performance hit.
PHP 7.3 and later offer the functions array_key_last() and array_key_first().
https://www.php.net/manual/en/function.array-key-last.php
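A minimal sketch of that approach, assuming PHP 7.3 or later. array_key_last() takes the array by value, so nothing has to be passed by reference and the internal pointer is left alone. The sample file name is made up.

```php
$file_name = 'long.file.name.jpg';

$parts = explode('.', $file_name);

// array_key_last() returns the last key of the array.
// explode() with a non-empty separator always returns at least one element,
// so indexing with that key is safe here.
$file_extension = $parts[array_key_last($parts)];

echo $file_extension, "\n";   // jpg
```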
Try this:
$parts = explode('.', $file_name);
$file_extension = end($parts);
The reason is that the argument for end is passed by reference, since end modifies the array by advancing its internal pointer to the final element. If you're not passing a variable in, there's nothing for a reference to point to.
See end in the PHP manual for more info.
PHP complains because end() expects a reference to something that it wants to change (which can only be a variable). You, however, pass the result of explode() directly to end() without saving it to a variable first. At the moment when explode() returns your value, it exists only in memory and no variable points to it. You cannot create a reference to something (or to something unknown in memory) that does not exist.
Or in other words: PHP does not know whether the value you give it is the direct value or just a pointer to the value (a pointer is also a variable (integer), which stores the offset in memory where the actual value resides). So PHP always expects a pointer (reference) here.
But since this is still just a notice (not even deprecated) in PHP 7, you can safely ignore notices and use the ignore operator instead of completely deactivating error reporting for notices:
$file_extension = @end(explode('.', $file_name));
end(...[explode('.', $file_name)]) has worked since PHP 5.6. This is documented in the RFC although not in PHP docs themselves.
Just as you can't index the array immediately, you can't call end on it either. Assign it to a variable first, then call end.
$basenameAndExtension = explode('.', $file_name);
$ext = end($basenameAndExtension);
PHP official manual:
end()
Parameters
array
The array. This array is passed by reference because it is modified by the function. This means you must pass it a real variable and not a function returning an array because only actual variables may be passed by reference.
First, you will have to store the value in a variable like this
$value = explode("/", $string);
Then you can use the end function to get the last element of the array like this
echo end($value);
I hope it will work for you.

Strings, regexp and files

<?php
$iprange = array(
"^12\.34\.",
"^12\.35\.",
);
foreach($iprange as $var) {
if (preg_match($var, $_SERVER['REMOTE_ADDR'])) {
I'm looking to have a list that will constitute each of the values inside the array. Let's call it iprange.txt, from which I would extract the variable $iprange. I would also be updating the file with new ranges, but I also want to convert those strings to regexp if that's something that's needed in php, as it is in the above example.
If you could help me with the two following issues:
I understand that somehow I would be using an array include, but I'm not sure how to implement it.
I would like to run a cron that would update the text file and turn it into a regexp acceptable for use in the above example, if you think regexp is a good idea and there isn't another option. I know how to apply a cron in a directadmin gui, but I don't know what the cronned file would look like.
edit------------------------
Thanks Mamsaac, very helpful, right now I'm stuck on further issues that have risen that have to do with cases and ob_file_callback, and if I start talking about them here, I won't get anywhere, but they can be followed here: Problems with ob_file_callback and fwrite
As for this thread here, to keep it on topic, I wanted to ask you how would you go about including a whole file in the array you suggested?
I no longer need the cronjob you were thinking about if I don't have to convert strings to regular expression.
I will propose a different approach to the problem, if that's OK with you.
You could try using the ip2long() function for this, which makes comparisons much faster. The advantage of doing it this way is that you can be very specific with each range (and in a natural way, where a range means "between two numbers").
So, you can do it something like this:
$ranges = array('10.20.8.0-10.20.14.254', '192.168.0.2-192.168.0.254');
foreach ($ranges as $iprange) {
list($lowerip, $highip) = explode('-', $iprange);
$remoteip = ip2long($_SERVER['REMOTE_ADDR']);
if (ip2long($lowerip) <= $remoteip && $remoteip <= ip2long($highip)) {
//it is within this range! I don't know what you want to do with it.
}
}
You could also use netmasks, but I will leave that as an exercise for you. To do it, you will have to play a bit with bitwise operations: negate the mask, then use a bitwise AND operation... Not what you requested! I might update this after I go to sleep.
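For readers who want a starting point for the netmask variant, here is one possible sketch. It uses a plain mask-and-compare rather than any particular recipe from this answer. The function name ip_in_cidr and the "a.b.c.d/bits" notation are assumptions for the example, and it assumes a 64-bit PHP build where ip2long() returns a non-negative integer.

```php
// Hypothetical helper: is $ip inside the IPv4 network written as "a.b.c.d/bits"?
// Assumes a 64-bit PHP build, where ip2long() returns a non-negative integer.
function ip_in_cidr(string $ip, string $cidr): bool
{
    // Treat a missing "/bits" suffix as a single-address network (/32).
    [$network, $bits] = explode('/', $cidr, 2) + [1 => '32'];

    $ipLong      = ip2long($ip);
    $networkLong = ip2long($network);
    $bits        = (int) $bits;

    if ($ipLong === false || $networkLong === false || $bits < 0 || $bits > 32) {
        return false;
    }

    // Build the netmask: $bits leading one-bits followed by zero-bits.
    $mask = ($bits === 0) ? 0 : ((~0 << (32 - $bits)) & 0xFFFFFFFF);

    // Both addresses are in the same network if they agree on the masked bits.
    return ($ipLong & $mask) === ($networkLong & $mask);
}

var_dump(ip_in_cidr('12.34.5.6', '12.34.0.0/16'));   // bool(true)
var_dump(ip_in_cidr('12.35.0.1', '12.34.0.0/16'));   // bool(false)
```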
About the file and cronjob. I am absolutely unsure of why you want a cronjob for this. How are you deciding what new ranges you will be accepting?
You can always read the file. For example, you can use file_get_contents() and split the string on newlines:
$ranges = explode("\n", file_get_contents("filename"));
and then you would have your array ready (notice I even gave it the same name as in the code block above).
If the file ever gets REALLY big, avoid the approach above and use fopen() and fgets() instead:
$file = @fopen("filename", "r"); //suppressing error messages, probably don't want that
if (!$file) {
//for some reason the file didn't open. Do error reporting or checking
}
$ranges = array();
while (($line = fgets($file)) !== false) {
$ranges[] = rtrim($line, "\r\n"); //fgets() keeps the line ending, so strip it
}
fclose($file);
Seems like I'm only missing why you want to use a cronjob. Please elaborate on your criteria for deciding to add new IP ranges.
$ranges = array('12.34.1.0-12.34.14.254', '192.168.0.2-192.168.0.254');
foreach ($ranges as $iprange) {
list($lowerip, $highip) = explode('-', $iprange);
$remoteip = ip2long($_SERVER['REMOTE_ADDR']);
if (ip2long($lowerip) <= $remoteip && $remoteip <= ip2long($highip)) {
}
}
header("LOCATION: page1.php");
}
else
{
header("LOCATION: page2.php");
}
?>
I've isolated the if else and that works fine. I've also placed the two '}' at the end of your script both before and after my if else but no luck.
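One way to wire the range check to the redirect, as a sketch: move the loop into a function that returns true or false, and let a single if/else decide the target. The function name ip_in_ranges and the fixed sample address are assumptions for the example. In the real page the address would come from $_SERVER['REMOTE_ADDR'].

```php
// Hypothetical helper: true if $ip falls inside any "low-high" range.
function ip_in_ranges(string $ip, array $ranges): bool
{
    $remote = ip2long($ip);
    if ($remote === false) {
        return false;                       // not a valid IPv4 address
    }
    foreach ($ranges as $range) {
        [$low, $high] = explode('-', $range);
        if (ip2long($low) <= $remote && $remote <= ip2long($high)) {
            return true;                    // stop at the first matching range
        }
    }
    return false;
}

$ranges = array('12.34.1.0-12.34.14.254', '192.168.0.2-192.168.0.254');

// In a real request this would be $_SERVER['REMOTE_ADDR'];
// a fixed sample address keeps the sketch runnable on the command line.
$remoteAddr = '12.34.5.6';

$target = ip_in_ranges($remoteAddr, $ranges) ? 'page1.php' : 'page2.php';
echo $target, "\n";   // page1.php

// In the page itself, the redirect would then be:
// header('Location: ' . $target);
// exit;
```

Keeping the decision in one place avoids having to balance braces around a header() call inside the loop.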

Handle $_GET safely PHP

I have a code like this:
$myvar=$_GET['var'];
// a bunch of code without any connection to DB where $myvar is used like this:
$local_directory=dirname(__FILE__).'/images/'.$myvar;
if ($myvar && $handle = opendir($local_directory)) {
$i=0;
while (false !== ($entry = readdir($handle))) {
if(strstr($entry, 'sample_'.$language.'-'.$type)) {
$result[$i]=$entry;
$i++;
}
}
closedir($handle);
} else {
echo 'error';
}
I'm a little confused with a number of stripping and escaping functions, so the question is, what do i need to do with $myvar for this code to be safe? In my case i don't make any database connections.
You are trying to prevent directory traversal attacks, so you don't want the person putting in ./../../../ or something, hoping to read out files or filenames, depending on what you are doing.
I often use something like this:
$myvar = preg_replace("/[^a-zA-Z0-9-]/","",$_GET['var']);
This replaces anything that isn't a-zA-Z0-9- with a blank, so if the variable contains say, *, this code would delete that.
I then change the a-zA-Z0-9- to match which characters I want to be allowed in the string. I can then lock it down to only containing numbers or whatever I need.
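As a quick illustration of what that replacement does to a traversal attempt, here is a sketch with the pattern wrapped in a function. The name clean_dir_name and the sample inputs are made up for the example.

```php
// Hypothetical wrapper around the whitelist replacement shown above.
function clean_dir_name(string $input): string
{
    // Drop every character that is not a letter, digit or hyphen.
    return preg_replace('/[^a-zA-Z0-9-]/', '', $input);
}

echo clean_dir_name('../../etc'), "\n";      // etc
echo clean_dir_name('holiday-2012'), "\n";   // holiday-2012
```

An input made up only of disallowed characters comes back as an empty string, so the caller still has to reject empty results.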
It's really, really dangerous to do something like: opendir($local_directory) where $local_directory is a value which could come from the outside.
What if someone passes in something like ../../../../../../../../../etc ...or something like that? You risk compromising the security of your host.
You can take a glance here, to start:
http://php.net/manual/en/book.filter.php
IMHO, if you don't create anything on the fly, you should have something like:
$allowed_dirs = array('dir1','dir2', 'dir3');
if (!in_array($myvar, $allowed_dirs)) {
// throw an error and log what has happened
}
You can do this right after you receive your input from "outside". If that is impractical because the number of image dirs can vary over time and you're afraid of getting out of sync with your codebase, you could also populate the array of valid values by first scanning the subdirectories of your images folder.
So, at the end, you could have something like:
$allowed_dirs = array();
if ($handle = opendir(dirname(__FILE__) . '/images')) {
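while (false !== ($entry = readdir($handle))) {
// skip the special "." and ".." entries so they can never be allowed
if ($entry === '.' || $entry === '..') {
continue;
}
$allowed_dirs[] = $entry;
}
closedir($handle);
}
$myvar=$_GET['var'];
// you can deny access to dirs you want to protect like this
// ($allowed_dirs is a plain list, so remove by value rather than by key)
$allowed_dirs = array_diff($allowed_dirs, array('private_stuff'));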
// rest of code
$local_directory = dirname(__FILE__) . "/images/.$myvar";
if (in_array(".$myvar", $allowed_dirs) && $handle = opendir($local_directory)) {
$i=0;
while (false !== ($entry = readdir($handle))) {
if(strstr($entry, 'sample_'.$language.'-'.$type)) {
$result[$i]=$entry;
$i++;
}
}
closedir($handle);
} else {
echo 'error';
}
Code above is NOT optimized. But let's avoid premature optimization in this case (stating this to avoid another "nice" downvote); snippet is just to get you the idea of explicitly allowing values VS alternate approach of allowing everything unless matching a certain pattern. I think the former is more secure.
Let me just note for completeness that, if you can be sure your code will only be run on Unixish systems (such as Linux), the only things you need to ensure are that:
$myvar does not contain any slash ("/", U+002F) or null ("\0", U+0000) characters, and that
$myvar is not empty or equal to "." (or, equivalently, that ".$myvar" is not equal to "." or "..").
That's because, on a Unix filesystem, the only directory separator character (and one of the two characters not allowed in filenames, the other being the null character "\0") is the slash, and the only special directory entries pointing upwards in the directory tree are "." and "..".
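The two conditions above can be sketched as a single check. The function name is hypothetical, and the check only covers the Unix case described here.

```php
// Hypothetical check for a single path component on a Unix-like system.
function is_safe_unix_component(string $name): bool
{
    if ($name === '' || $name === '.' || $name === '..') {
        return false;   // empty, current directory, or parent directory
    }
    // Reject the directory separator and the null character.
    return strpos($name, '/') === false && strpos($name, "\0") === false;
}

var_dump(is_safe_unix_component('holiday'));   // bool(true)
var_dump(is_safe_unix_component('..'));        // bool(false)
var_dump(is_safe_unix_component('a/b'));       // bool(false)
```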
However, if your code might someday be run on Windows, then you'll need to disallow more characters (at least the backslash, "\\", and probably others too). I'm not familiar enough with Windows filesystem conventions to say exactly which characters you'd need to disallow there, but the safe approach is to do as Rich Bradshaw suggests and only allow characters that you know are safe.
As with all data that comes from an untrusted source: Validate it before use and encode it properly when passing it to another context.
As for the former, you first need to specify what properties the data must have to be considered valid. This primarily depends on the purpose of its use.
In your case, the value of $myvar should probably be at least a valid directory name but it could also be a valid relative path composed of directory names, depending on your requirements. At this point, you are supposed to specify these requirements.
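As one example of turning such a requirement into code, suppose the requirement is "an existing path below the images directory". The sketch below resolves the input with realpath() and accepts it only if the resolved path still lies under the base directory. The function name resolve_inside and the temporary directory names are assumptions for the demonstration, not part of the question's code.

```php
// Hypothetical helper: resolve $userInput below $baseDir, or return null
// if the result does not exist or escapes $baseDir.
function resolve_inside(string $baseDir, string $userInput): ?string
{
    $base   = realpath($baseDir);
    $target = realpath($baseDir . '/' . $userInput);
    if ($base === false || $target === false) {
        return null;   // the path does not exist
    }
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    // The resolved path must start with "<base>/" to count as inside.
    return strncmp($target, $prefix, strlen($prefix)) === 0 ? $target : null;
}

// Throwaway directory tree for the demonstration (names are arbitrary).
$base = sys_get_temp_dir() . '/images_demo_' . uniqid();
mkdir($base . '/cats', 0777, true);

$inside  = resolve_inside($base, 'cats');   // resolved path of the subdirectory
$outside = resolve_inside($base, '../');    // null: escapes the base directory
$missing = resolve_inside($base, 'dogs');   // null: does not exist

var_dump($inside, $outside, $missing);

// Clean up.
rmdir($base . '/cats');
rmdir($base);
```

Note that this check touches the filesystem, so it only suits requirements where the path must already exist.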

PHP concatenation of paths

Is there a PHP internal function for path concatenation ? What possibilities do I have to merge several paths (absolute and relative).
//Example:
$path1="/usr/home/username/www";
$path2="domainname";
$path3="images/thumbnails";
$domain="exampledomain.com";
//As example: Now I want to create a new path (domain + path3) on the fly.
$result = $domain.DIRECTORY_SEPARATOR.$path3;
OK, there is an easy solution for this example, but what if there are different directory separators or some paths are a little more complicated?
Is there an existing solution for cleaning up a path like this: /home/uploads/../uploads/tmp => /home/uploads/tmp ....
And what would a platform-independent version of a path-concat function look like?
Should a relative path start with "./" as a prefix, or is "home/path/img/" the common way?
I ran into this problem myself, primarily regarding the normalization of paths.
Normalization is:
One separator (I've chosen to accept, but never return, a backslash \\)
Resolving indirection: /../
Removing duplicate separators: /home/www/uploads//file.ext
Always remove trailing separator.
I've written a function that achieves this. I don't have access to that code right now, but it's also not that hard to write it yourself.
Whether a path is absolute or not doesn't really matter for the implementation of this normalization function, just watch out for the leading separator and you're good.
I'm not too worried about OS dependence. Both Windows and Linux PHP understand / so for the sake of simplicity I'm just always using that - but I guess it doesn't really matter what separator you use.
To answer your question: path concatenation can be very easy if you just always use / and assume that a directory has no trailing separator. 'no trailing separator' seems like a good assumption because functions like dirname remove the trailing separator.
Then it's always safe to do: $dir . "/" . $file.
And even if the result path is /home/uploads/../uploads//my_uploads/myfile.ext it's still going to work fine.
Normalization becomes useful when you need to store the path somewhere. And because you have this normalization function you can make these assumptions.
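A minimal sketch of such a concatenation helper, under the same assumptions (always use /, no trailing separator on the result). The name join_paths is hypothetical, and the arrow functions assume PHP 7.4 or later. It does not resolve .. or . segments, which is what normalization is for.

```php
// Hypothetical helper: join path segments with exactly one "/" between them.
function join_paths(string ...$segments): string
{
    // Ignore empty segments entirely.
    $segments = array_values(array_filter($segments, fn($s) => $s !== ''));
    if ($segments === []) {
        return '';
    }

    // Keep a leading "/" on the first segment, but drop its trailing one.
    $first = rtrim(array_shift($segments), '/');

    // Strip separators on both sides of every later segment.
    $rest = array_map(fn($s) => trim($s, '/'), $segments);
    $rest = array_filter($rest, fn($s) => $s !== '');

    return implode('/', array_merge([$first], $rest));
}

echo join_paths('/usr/home/username/www', 'domainname', 'images/thumbnails'), "\n";
echo join_paths('/home/uploads/', '/tmp/'), "\n";   // /home/uploads/tmp
```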
An additional useful function is a function to make relative paths.
/files/uploads
/files/uploads/my_uploads/myfile.ext
It can be useful to derive from those two paths, what the relative path to the file is.
realpath
I've found realpath to be extremely performance heavy. It's not so bad if you're calling it once but if you're doing it in a loop somewhere you get a pretty big hit. Keep in mind that each realpath call is a call to the filesystem as well. Also, it will simply return false if you pass in something silly, I'd rather have it throw an Exception.
To me the realpath function is a good example of a BAD function because it does two things: 1. it normalizes the path and 2. it checks if the path exists. Both of these functions are useful of course but they must be separated. It also doesn't distinguish between files and directories. For Windows this typically isn't a problem, but for Linux it can be.
And I think there is some quirkiness when using realpath("") on Windows. I think it will return \\ - which can be profoundly unacceptable.
/**
* This function is a proper replacement for realpath
* It will _only_ normalize the path and resolve indirections (.. and .)
* Normalization includes:
* - directory separator is always /
* - there is never a trailing directory separator
* @param $path
* @return String
*/
function normalize_path($path) {
$parts = preg_split(":[\\\/]:", $path); // split on known directory separators
// resolve relative paths
for ($i = 0; $i < count($parts); $i +=1) {
if ($parts[$i] === "..") { // resolve ..
if ($i === 0) {
throw new Exception("Cannot resolve path, path seems invalid: `" . $path . "`");
}
unset($parts[$i - 1]);
unset($parts[$i]);
$parts = array_values($parts);
$i -= 2;
} else if ($parts[$i] === ".") { // resolve .
unset($parts[$i]);
$parts = array_values($parts);
$i -= 1;
}
if ($i > 0 && $parts[$i] === "") { // remove empty parts
unset($parts[$i]);
$parts = array_values($parts);
$i -= 1; // re-check the element that shifted into this position
}
}
return implode("/", $parts);
}
/**
* Removes base path from longer path. The resulting path will never contain a leading directory separator
* Base path must occur in longer path
* Paths will be normalized
* @throws Exception
* @param $base_path
* @param $longer_path
* @return string normalized relative path
*/
function make_relative_path($base_path, $longer_path) {
$base_path = normalize_path($base_path);
$longer_path = normalize_path($longer_path);
if (0 !== strpos($longer_path, $base_path)) {
throw new Exception("Can not make relative path, base path does not occur at 0 in longer path: `" . $base_path . "`, `" . $longer_path . "`");
}
return substr($longer_path, strlen($base_path) + 1);
}
If the path actually exists, you can use realpath to expand it.
echo realpath("/home/../home/dogbert");
// /home/dogbert
Another problem with realpath is that it doesn't work with URLs, and yet the logic of concatenation is essentially the same (there should be exactly one slash between two joined components). Obviously the protocol portion at the beginning with two slashes is an exception to this rule. But still, a function that joined the pieces of a URL together would be really nice.
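A rough sketch of such a URL-joining function, assuming the first argument carries the scheme and host and is left untouched apart from its trailing slash. The name join_url is hypothetical, and the sketch does not handle query strings or fragments.

```php
// Hypothetical helper: append path pieces to a base URL with single slashes.
function join_url(string $base, string ...$parts): string
{
    // Only the trailing slash of the base is removed, so "https://" stays intact.
    $url = rtrim($base, '/');
    foreach ($parts as $part) {
        $part = trim($part, '/');
        if ($part !== '') {
            $url .= '/' . $part;
        }
    }
    return $url;
}

echo join_url('https://exampledomain.com/', '/images/', 'thumbnails'), "\n";
// https://exampledomain.com/images/thumbnails
```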
