Say I have a badly formatted path /public/var/www/html/images\uploads\
Are there any performance benefits between these two methods to "normalize" the slashes, or is it just a different way of doing things?
realpath($path) . DIRECTORY_SEPARATOR
str_replace('\\', '/', $path);
realpath() probably takes a bit more computation, but it does more than str_replace() would. Which one you use depends on the application. realpath() not only canonicalizes the path string, it also checks the filesystem: it resolves symbolic links and '.' and '..' segments, and returns false if the path does not exist. Note that on Linux a backslash is an ordinary filename character, so realpath() will not treat it as a separator there. Where it fits, realpath() will in most cases make your code more readable and easier to understand, because its name corresponds better to what it does here (depending, again, on the application).
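As a rough illustration of that difference, here is a minimal sketch using the path from the question. Whether the realpath() branch succeeds depends on whether that directory exists on the machine running it, so no output is claimed for that part.

```php
<?php
// The badly formatted path from the question.
$path = '/public/var/www/html/images\\uploads\\';

// Pure string operation: never touches the filesystem.
$normalized = str_replace('\\', '/', $path);
echo $normalized, PHP_EOL; // /public/var/www/html/images/uploads/

// Filesystem lookup: returns false when the path does not exist.
$resolved = realpath($path);
if ($resolved === false) {
    echo 'realpath() could not resolve the path', PHP_EOL;
} else {
    echo $resolved . DIRECTORY_SEPARATOR, PHP_EOL;
}
```

So the two calls are not interchangeable: one is string cleanup, the other is a filesystem query.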
require_once dirname(__FILE__).DIRECTORY_SEPARATOR . './../../../wp-config.php';
require_once dirname(__FILE__).DIRECTORY_SEPARATOR.'inc/options.php';
The above code is from a WordPress plugin.
I don't understand why half of it uses DIRECTORY_SEPARATOR, but the other half uses "/" ?
Because different operating systems use different directory separators. On Windows it's \, on Linux it's /. DIRECTORY_SEPARATOR is a constant holding the directory separator of the OS that PHP is running on. Use it every time in paths.
In your code snippet we clearly see bad practice. Just because a framework or CMS is widely used doesn't mean it follows best practices.
All of the PHP IO functions will internally convert slashes to the appropriate character, so it's not a huge deal which method you use. Below are some things to consider.
It can look ugly and confusing when you print out your file paths and there is a mix of \ and /. This won't ever happen if DIRECTORY_SEPARATOR is used.
Using something such as $generated_css = DIRECTORY_SEPARATOR.'minified.css'; will work all fine and dandy for file IO, but if a developer unknowingly references it in a URL such as echo "<link rel='stylesheet' href='https://example.com$generated_css'>";, a bug was just created. Did you catch it? While this will work on Linux and macOS, on Windows a backslash, instead of a forward slash, will be in $generated_css, resulting in the percent-encoded, non-existent URL https://example.com%5Cminified.css! When using DIRECTORY_SEPARATOR you have to take special care to make sure your filepath variables never end up in a URL.
And lastly, in the unlikely scenario your filepath is used by non-PHP code — for example, in a shell_exec call — you won't be able to mix slashes and will need to either construct the filepath with DIRECTORY_SEPARATOR or use realpath.
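One way to guard against the URL problem described above is to convert explicitly whenever a filesystem path crosses into a URL. The following is only a sketch: toUrlPath() is a hypothetical helper name, not a PHP built-in, and the backslash is hardcoded to simulate what DIRECTORY_SEPARATOR produces on Windows.

```php
<?php
// Hypothetical helper: turn a filesystem-style relative path into a URL path.
function toUrlPath(string $relativePath): string
{
    // URLs always use forward slashes, whatever the host OS uses.
    $urlPath = str_replace('\\', '/', $relativePath);

    // Make sure the result starts with exactly one slash.
    return '/' . ltrim($urlPath, '/');
}

// Simulates DIRECTORY_SEPARATOR . 'minified.css' on Windows.
$generated_css = '\\' . 'minified.css';

echo 'https://example.com' . toUrlPath($generated_css), PHP_EOL;
// https://example.com/minified.css
```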
I learned from distributing code that the best way for your application to run on both Linux and Windows is to never use DIRECTORY_SEPARATOR, or backslashes \\, and to ONLY use forward slashes /.
Why? Because a backslash directory separator ONLY works on Windows, while forward slashes work on ALL of them (Linux, Windows, and Mac).
Using the constant DIRECTORY_SEPARATOR or escaping your backslashes \\ quickly becomes messy. I mean look at it:
$file = 'path' . DIRECTORY_SEPARATOR . 'to' . DIRECTORY_SEPARATOR . 'file';
$file = str_replace('/', DIRECTORY_SEPARATOR, 'path/to/file');
$file = (strtoupper(substr(PHP_OS, 0, 3)) === 'WIN') ? 'path\\to\\file' : 'path/to/file';
When you can just do this:
$file = 'path/to/file';
The only downside is that on Windows, PHP will return backslashes for all file references from functions like realpath() and glob(), and from magic constants like __FILE__ and __DIR__. So you might need to str_replace() them into forward slashes to keep things consistent.
$dir = str_replace('\\', '/', realpath('../'));
I wish there was a php.ini setting to always return forward slashes.
Do not use your own folder separators. Always use DIRECTORY_SEPARATOR, because:
In some special cases you really need the correct path delimiter
The OS might handle it correctly, but many 3rd party applications can't and might fail!
Some operating systems do not use / or \ as separators but something different
Don't forget: use the constant only for filesystem paths on the machine running the code. Don't use it for URIs or anything else that you want to send to the client (unless you really need it, like a "remote browser").
Here are some things I have done in PHP:
$normalizedPath = rtrim($path, '/');
$fullPath = $path . '/' . $basename;
Is there a better class or function to do this where I do not need to hardcode / into my application? Hopefully this will work with unicode and CJK characters.
You should use the realpath() function.
Check the pathinfo() function too; it may be useful.
No, there are no built-in functions to assemble a path.
Since PHP supports both Windows-style and UNIX-style paths, the rtrim() statement will not work if $path is a Windows-style path ending in a backslash. You can use realpath() to work around this, but realpath() has two drawbacks: it returns an absolute path, which might not be desired, and it returns false for non-existing paths, which is a problem when you build a path for something that should be created but does not exist yet.
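Since nothing built in assembles paths, a small helper is the usual workaround. Below is a minimal sketch, not a definitive implementation: joinPaths() is a hypothetical name, it always emits forward slashes, and it never touches the filesystem, so the target does not have to exist. It assumes UTF-8 strings, where the bytes for / and \ never occur inside a multibyte character, so CJK names pass through unchanged.

```php
<?php
// Hypothetical helper: join path segments with single forward slashes.
function joinPaths(string ...$segments): string
{
    $parts = [];
    foreach (array_values($segments) as $i => $segment) {
        // Treat both separators alike so Windows-style input also works.
        $clean = str_replace('\\', '/', $segment);

        // Keep a leading slash on the first segment only (absolute paths).
        $trimmed = $i === 0 ? rtrim($clean, '/') : trim($clean, '/');

        if ($trimmed === '') {
            // A first segment consisting only of slashes means the root.
            if ($i === 0 && $clean !== '') {
                $parts[] = '';
            }
            continue;
        }
        $parts[] = $trimmed;
    }

    return implode('/', $parts);
}

echo joinPaths('/var/www/', 'uploads', 'ファイル.txt'), PHP_EOL;
// /var/www/uploads/ファイル.txt
echo joinPaths('C:\\data\\', 'images'), PHP_EOL;
// C:/data/images
```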
Whenever I work with PHP (often) I typically work on a Windows box, however I (try to) develop platform agnostic applications; one major point of issue being the use of directory separators.
As many know, doing any filesystem work in a Windows environment in PHP, you can use forward slashes in lieu of backwards, and PHP sorts it out under the hood. This is all fine when it comes to using string literals to pass a path to fopen() or whatever; but when retrieving paths, be it __FILE__ or expanding with realpath(), the paths retrieved are of course using the OS appropriate slashes. Also, I've noticed some inconsistencies in trailing slashes. Once or twice __DIR__ has appended one (a backslash) and realpath() too (I prefer the trailing slash, but not intermittently)
This is clearly a problem for string comparison, because instead of doing:
compare_somehow('path/to/file.php', __DIR__);
For the sake of reliability, I'm having to go:
compare_somehow('path/to/file.php', rtrim(strtr(__DIR__, '\\', '/'), '/') . '/');
This seems like a lot of work. I can drop it into a function, sure; but now I'm stuck with an arbitrary function dependency in all my OO code.
I understand that PHP isn't perfect, and accommodations need to be made, but surely there must exist some platform agnostic workaround to force filesystem hits to retrieve forward slashed paths, or at least a non-intrusive way to introduce a class-independent function for this purpose.
Summary question(s):
Is there some magical (though reliable) workaround, hack, or otherwise to force PHP to kick back forward slashed filesystem paths, regardless of the server OS?
I'm going to assume the answer is no to the above, so moving on; what provisions can I make to enforce forward slash (or whatever choice, really) as the directory separator? I'm assuming through the aforementioned filter function, but now where should it go?
Forward slash for everything. Even if the host OS separator is #*&#.
As I commented I can't really see why you would have to do this (I'd be interested in a quick description of the specific problem you are solving), but here is a possible solution using the output of __FILE__ as an example:-
$path = str_replace('\\', '/', __FILE__);
This will (should?) work regardless of the slashes returned by the OS (I think).
Unfortunately I'm not aware of "some magical (though reliable) workaround, hack, or otherwise to force PHP to kick back forward slashed filesystem paths, regardless of the server OS" other than this. I imagine it could be wrapped in a helper class, but that still gives you an arbitrary dependency in your code.
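For the comparison case in the question, the filter can live in a pair of plain functions rather than a class. This is only a sketch with hypothetical names (normalizeDir() and sameDir()). The comparison is case-sensitive, which may not match how Windows treats paths.

```php
<?php
// Hypothetical helper: forward slashes only, exactly one trailing slash.
function normalizeDir(string $path): string
{
    return rtrim(str_replace('\\', '/', $path), '/') . '/';
}

// Compare two directory paths regardless of separator style or trailing slash.
function sameDir(string $a, string $b): bool
{
    return normalizeDir($a) === normalizeDir($b);
}

var_dump(sameDir('C:\\path\\to', 'C:/path/to/'));          // bool(true)
var_dump(sameDir(__DIR__, __DIR__ . DIRECTORY_SEPARATOR)); // bool(true)
```

Functions like these could sit in one file loaded through Composer's "files" autoload entry, which keeps them independent of any class.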
I have a full path which I would like to remove certain levels of it. So for instance,
/home/john/smith/web/test/testing/nothing/
I would like to get rid of 4 levels, so I get
/test/testing/nothing/
What would be a good way of doing this?
Thanks
A simple solution is to slice the path up into parts, and then manipulate the array before sticking it back together again:
'/' . join('/', array_slice(explode('/', $path), 5)); // the leading '/' restores the slash dropped by the slice
Of course, if you wanted to remove that specific path, you could also use a regular expression:
preg_replace('~^/home/john/smith/web/~', '', $path);
One word of advice though. If your application is juggling around with paths a lot, it may be a good idea to create a class to represent paths, so as to encapsulate the logic, rather than have a lot of string manipulations all over the place. This is especially a good idea, if you mix absolute and relative paths.
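A path class along those lines might look like the sketch below. The class name Path and the method withoutLeading() are illustrative assumptions, not an existing library API. It requires PHP 7.4 or later for typed properties, and it drops the trailing slash when converted back to a string.

```php
<?php
// Hypothetical value object that keeps path logic in one place.
final class Path
{
    /** @var string[] */
    private array $parts;
    private bool $absolute;

    public function __construct(string $path)
    {
        $path = str_replace('\\', '/', $path);
        $this->absolute = $path !== '' && $path[0] === '/';

        // Drop empty segments from leading, trailing, or doubled slashes.
        // 'strlen' keeps a segment named "0", which a bare array_filter would lose.
        $this->parts = array_values(array_filter(explode('/', $path), 'strlen'));
    }

    // Return a new Path with the first $levels segments removed.
    public function withoutLeading(int $levels): self
    {
        $copy = clone $this;
        $copy->parts = array_slice($this->parts, $levels);
        return $copy;
    }

    public function __toString(): string
    {
        return ($this->absolute ? '/' : '') . implode('/', $this->parts);
    }
}

$path = new Path('/home/john/smith/web/test/testing/nothing/');
echo $path->withoutLeading(4), PHP_EOL; // /test/testing/nothing
```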
Why are you all using regular expressions for something that requires absolutely no matching; CPU cycles are valuable!
str_replace would be more efficient:
$s_path = '/home/john/smith/web/test/testing/nothing/';
$s_path = str_replace('/home/john/smith/web', '', $s_path); // '/test/testing/nothing/'
And use realpath() to resolve any '../../' paths.
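And remember that dirname(__FILE__) (or __DIR__) gives the directory of the current script file, which is not necessarily the current working directory (use getcwd() for that), and rtrim() is extremely useful for removing trailing slashes.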