Whenever I work with PHP (often), I typically work on a Windows box; however, I (try to) develop platform-agnostic applications, and one major sticking point is the use of directory separators.
As many know, when doing any filesystem work with PHP in a Windows environment, you can use forward slashes in lieu of backslashes, and PHP sorts it out under the hood. This is all fine when it comes to passing a string literal path to fopen() or whatever; but when retrieving paths, be it __FILE__ or expanding with realpath(), the paths retrieved of course use the OS-appropriate slashes. Also, I've noticed some inconsistencies in trailing slashes: once or twice __DIR__ has appended one (a backslash), and realpath() too. (I prefer the trailing slash, but not intermittently.)
This is clearly a problem for string comparison, because instead of doing:
compare_somehow('path/to/file.php', __DIR__);
For the sake of reliability, I'm having to go:
compare_somehow('path/to/file.php', rtrim(strtr(__DIR__, '\\', '/'), '/') . '/');
This seems like a lot of work. I can drop it into a function, sure; but now I'm stuck with an arbitrary function dependency in all my OO code.
I understand that PHP isn't perfect, and accommodations need to be made, but surely there must exist some platform agnostic workaround to force filesystem hits to retrieve forward slashed paths, or at least a non-intrusive way to introduce a class-independent function for this purpose.
Summary question(s):
Is there some magical (though reliable) workaround, hack, or otherwise to force PHP to kick back forward slashed filesystem paths, regardless of the server OS?
I'm going to assume the answer is no to the above, so moving on; what provisions can I make to enforce forward slash (or whatever choice, really) as the directory separator? I'm assuming through the aforementioned filter function, but now where should it go?
Forward slash for everything. Even if the host OS separator is #*&#.
As I commented I can't really see why you would have to do this (I'd be interested in a quick description of the specific problem you are solving), but here is a possible solution using the output of __FILE__ as an example:-
$path = str_replace('\\', '/', __FILE__);
This will (should?) work regardless of the *slashes returned by the OS (I think).
Unfortunately I'm not aware of "some magical (though reliable) workaround, hack, or otherwise to force PHP to kick back forward slashed filesystem paths, regardless of the server OS" other than this. I imagine it could be wrapped in a helper class, but that still gives you an arbitrary dependency in your code.
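For what it's worth, here is a minimal sketch of the helper-class idea, assuming a final class with a single static method so an autoloader can pick it up like any other class. The names PathUtil and normalize() are hypothetical, and the trailing-slash policy is just one possible choice. Note that on *nix a backslash is a legal filename character, so blindly converting it is itself an assumption.

```php
<?php
// Hypothetical utility class; the name and the trailing-slash policy are assumptions.
final class PathUtil
{
    /**
     * Convert backslashes to forward slashes and apply one trailing-slash policy.
     */
    public static function normalize(string $path, bool $trailingSlash = false): string
    {
        $path = str_replace('\\', '/', $path);
        if ($path === '') {
            return '';
        }

        $trimmed = rtrim($path, '/');
        if ($trimmed === '') {
            // The path consisted only of slashes, i.e. the root.
            return '/';
        }

        return $trailingSlash ? $trimmed . '/' : $trimmed;
    }
}
```

With something like this in place, the comparison in the question would read PathUtil::normalize(__DIR__, true) instead of the inline rtrim(strtr(...)) chain.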
Related
In PHP, the function urlencode() lets you encode a string as a URL fragment, encoding all characters that might have a special meaning in a URL. The encoding is reversible with urldecode().
I am looking for something similar for filesystem paths.
urlencode() might already work here, but it seems more aggressive than needed. E.g. there is no reason to escape "()" parentheses. On the other hand, a dot, especially double dot, can have special meaning in a filesystem path (one level up), so perhaps this would need to be encoded.
Is there anything available in PHP, or what would be the closest to do manually?
Is urlencode() safe for this purpose, or do we need to be careful indeed about the dot? Perhaps it is sufficient to exclude the dot as a first character, but allow it in the middle of a path.
My current use case would be Linux filesystem paths, but in the end I might want to put this in a software package that should be usable on different OSes.
Like urlencode(), I am looking for something that is reversible. So, not just sanitization, but actually encoding so it can be decoded.
Ideally, encoded paths should still be readable by humans. So, most characters that are allowed in filesystem paths should preferably remain unchanged.
Use case
The question above should already be complete without this use case, but some people prefer to see one.
I have a scenario where I want to export a huge amount of data to distinct php files, and interpret the file names / paths as array keys when reading / writing the data.
E.g.
foreach ($addresses as $name => $address) {
file_put_contents($dir . '/' . FILEPATH_ENCODE($name), '<?php return ' . var_export($address, TRUE) . ';');
}
The benefit of storing this in separate files is to track them easily using git, with minimal potential for git conflicts (compared to lumping them all in one file). Encoding the array key in the file name / path means I no longer need to store it in the file itself, thus reducing redundancy.
Extended example: Return closures instead of values.
foreach ($addresses as $name => $closureBodyPhp) {
file_put_contents($dir . '/' . FILEPATH_ENCODE($name), '<?php return function () {' . $closureBodyPhp . '};');
}
The data or closures I want to export are provided by an existing system, it is not up to me.
If you really want to know: I want to replace Drupal 7 features, that is, export existing features-generated php in a centralized directory instead of in separate feature modules.
But please, don't get too distracted with the use case.
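The reversible encoding asked about above could be sketched roughly as follows. This is only an illustration, not a built-in: filepath_encode() and filepath_decode() are hypothetical names, and the set of characters restored to readable form (just parentheses here) is an assumption. It builds on rawurlencode(), which already escapes "/" and "%", and additionally escapes a leading dot so that "." and ".." can never come out as file names. Windows-specific problems (reserved names such as CON, trailing dots) are not handled.

```php
<?php
// Hypothetical helpers; names and the "readable" character set are assumptions.

function filepath_encode(string $name): string
{
    // rawurlencode() escapes everything except letters, digits and "-_.~",
    // so "/" becomes %2F and a literal "%" becomes %25.
    $encoded = rawurlencode($name);

    // Restore parentheses for readability. This stays reversible because
    // every "%" left in the string still starts a real escape sequence.
    $encoded = strtr($encoded, ['%28' => '(', '%29' => ')']);

    // Escape a leading dot so "." and ".." cannot be produced.
    if ($encoded !== '' && $encoded[0] === '.') {
        $encoded = '%2E' . substr($encoded, 1);
    }

    return $encoded;
}

function filepath_decode(string $encoded): string
{
    // Every escape produced above is a plain %XX sequence.
    return rawurldecode($encoded);
}
```

Since decoding is plain rawurldecode(), escaping more characters later (or fewer) does not break files that were already written, as long as "%" itself is always escaped.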
Does PHP automatically handle path delimiters on Windows and *nix?
Ex.: converting \ to / ... or \ to \\?
Thanks.
No. But you can use the DIRECTORY_SEPARATOR constant.
Predefined Constants
Your question is not fully clear to me, but I'd answer "yes, but". "Yes", because your script can do e.g. include "foo/bar/smth.php"; and it will work the same on Windows and Linux/Unix PHP, so you do not need to bother (however, if you do include "foo\bar\smth.php"; then it may work on Windows (never checked) but will not work on Linux/Unix, so beware). So the filesystem access layer is aware of this and takes care of it. And "but", because if you are also talking about e.g. access over HTTP, then "no", as that has nothing to do with PHP. Also, I recall some MSIE versions converting backslashes to normal slashes, so crap like http:\\ works, but that's an example of an extremely wrong approach.
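To make the two answers above concrete, here is a small sketch. The path parts are made up for illustration. It builds the same relative path once with DIRECTORY_SEPARATOR and once with forward slashes, then converts the native form back so the two can be compared as strings.

```php
<?php
// Illustrative path parts; not taken from any real project.
$parts = ['foo', 'bar', 'smth.php'];

// Native form: uses "\" on Windows and "/" on Linux/Unix.
$nativePath = implode(DIRECTORY_SEPARATOR, $parts);

// Portable form: forward slashes are accepted by PHP's filesystem
// functions on Windows as well.
$portablePath = implode('/', $parts);

// Converting the native form gives the same string on either OS.
$normalized = str_replace(DIRECTORY_SEPARATOR, '/', $nativePath);

echo $portablePath, PHP_EOL;
```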
Is there a pre-existing function or class for URL normalization in PHP?
Specifically, following the semantic preserving normalization rules laid out in this wikipedia article on URL normalization, (or whatever 'standard' I should be following).
Converting the scheme and host to lower case
Capitalizing letters in escape sequences
Adding trailing / (to directories, not files)
Removing the default port
Removing dot-segments
Right now, I'm thinking that I'll just use parse_url(), and apply the rules individually, but I'd prefer to avoid reinventing the wheel.
The PEAR Net_URL2 library looks like it'll do at least part of what you want. It'll remove dot segments, fix capitalization and get rid of the default port:
include("Net/URL2.php");
$url = new Net_URL2('HTTP://example.com:80/a/../b/c');
print $url->getNormalizedURL();
emits:
http://example.com/b/c
I doubt there's a general purpose mechanism for adding trailing slashes to directories because you need a way to map urls to directories which is challenging to do in a generic way. But it's close.
References:
http://pear.php.net/package/Net_URL2
http://pear.php.net/package/Net_URL2/docs/latest/Net_URL2/Net_URL2.html
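For comparison, the parse_url() route mentioned in the question might look roughly like this. It is a sketch under assumptions, not a replacement for Net_URL2: the function names are hypothetical, only http and https default ports are known to it, user info is dropped, the path is assumed to be absolute, and the trailing-slash rule is left out for the reason given in the answer.

```php
<?php
// Hypothetical helpers illustrating the "apply the rules individually" idea.

function remove_dot_segments(string $path): string
{
    $segments = explode('/', $path);
    $out = [];
    foreach ($segments as $segment) {
        if ($segment === '.') {
            continue;
        }
        if ($segment === '..') {
            // Never pop the leading empty segment of an absolute path.
            if (count($out) > 1) {
                array_pop($out);
            }
            continue;
        }
        $out[] = $segment;
    }
    // A trailing "." or ".." refers to a directory, so keep a trailing slash.
    $last = end($segments);
    if ($last === '.' || $last === '..') {
        $out[] = '';
    }
    return implode('/', $out);
}

function normalize_url(string $url): string
{
    $p = parse_url($url);
    if ($p === false || !isset($p['scheme'], $p['host'])) {
        return $url; // leave anything we cannot parse untouched
    }

    $scheme = strtolower($p['scheme']);
    $host = strtolower($p['host']);

    // Drop the port when it is the scheme's default (assumed list).
    $defaultPorts = ['http' => 80, 'https' => 443];
    $port = isset($p['port']) ? $p['port'] : null;
    if ($port !== null && isset($defaultPorts[$scheme]) && $defaultPorts[$scheme] === $port) {
        $port = null;
    }

    $path = remove_dot_segments(isset($p['path']) ? $p['path'] : '/');

    // Capitalize the hex digits of percent-escapes.
    $path = preg_replace_callback('/%[0-9a-f]{2}/i', function ($m) {
        return strtoupper($m[0]);
    }, $path);

    $result = $scheme . '://' . $host;
    if ($port !== null) {
        $result .= ':' . $port;
    }
    $result .= $path;
    if (isset($p['query'])) {
        $result .= '?' . $p['query'];
    }
    if (isset($p['fragment'])) {
        $result .= '#' . $p['fragment'];
    }
    return $result;
}
```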
I am using the statement below to return the directory name of the running script:
print dirname(__FILE__);
it outputs something like this with back-slashes:
www\EZPHP\core\ezphp.php
Question:
Is a path with back-slashes acceptable across all major operating systems? If not, how should I construct the path, with slashes or with back-slashes, so that it is acceptable on all major operating systems, e.g. Windows, Linux, Ubuntu?
Thank You.
Forward slashes are a good route.
There is also a constant called DIRECTORY_SEPARATOR that will return the directory separator for the system the code is running on.
I use forward slashes when I write paths for all my apps, and I often use DIRECTORY_SEPARATOR when I am exploding the results of a call that returns a path so that I can ensure I always have the right one to break on.
HTH,
Jc
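As a sketch of the "exploding" step mentioned above: if the origin of a path is not known, one option is to split on either separator rather than on DIRECTORY_SEPARATOR alone. The function name path_segments() is hypothetical.

```php
<?php
// Hypothetical helper: split a path into its segments, accepting both
// "/" and "\" as separators and skipping empty segments.
function path_segments(string $path): array
{
    $segments = preg_split('~[\\\\/]+~', $path, -1, PREG_SPLIT_NO_EMPTY);

    return $segments === false ? [] : $segments;
}
```

Joining the result again with implode('/', ...) gives a forward-slash path, at the cost of losing any leading or trailing separator.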
I would normalize that to forward slashes. Windows accepts forward slashes, and they are the default on *nix systems.
print str_replace('\\','/',dirname(__FILE__));
In reality, it doesn't matter... this is because dirname() doesn't necessarily return backslashes: it returns whatever directory separator is used by the OS. That is to say, whatever dirname returns is the separator you should be using anyway.
Other than that, just use forward slashes: PHP will interpret it correctly in Windows and Linux.
It doesn't matter; dirname() always returns the path in the OS format.
dirname('c:/x'); // returns 'c:\'
dirname('c:/Temp/x'); // returns 'c:/Temp'
dirname('/x'); // returns '\'
I have a full path which I would like to remove certain levels of it. So for instance,
/home/john/smith/web/test/testing/nothing/
I would like to get rid of 4 levels, so I get
/test/testing/nothing/
What would be a good way of doing this?
Thanks
A simple solution is to slice the path up into parts, and then manipulate the array before sticking it back together again:
"/" . join("/", array_slice(explode("/", $path), 5));
Of course, if you wanted to remove that specific path, you could also use a regular expression:
preg_replace('~^/home/john/smith/web/~', '', $path);
One word of advice though. If your application is juggling around with paths a lot, it may be a good idea to create a class to represent paths, so as to encapsulate the logic, rather than have a lot of string manipulations all over the place. This is especially a good idea, if you mix absolute and relative paths.
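The slicing approach can also be wrapped in a small function so that the number of levels is a parameter. This is a minimal sketch assuming absolute, forward-slash paths; strip_leading_levels() is a hypothetical name.

```php
<?php
// Hypothetical helper: drop the first $levels directory levels of an
// absolute forward-slash path, keeping a trailing slash if there was one.
function strip_leading_levels(string $path, int $levels): string
{
    $segments = explode('/', trim($path, '/'));
    $remaining = array_slice($segments, $levels);

    if ($remaining === []) {
        return '/';
    }

    $trailing = substr($path, -1) === '/' ? '/' : '';

    return '/' . implode('/', $remaining) . $trailing;
}
```

Trimming the slashes before exploding avoids the empty first element, which is why the offset here is the number of levels itself rather than one more.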
Why are you all using regular expressions for something that requires absolutely no matching; CPU cycles are valuable!
str_replace would be more efficient:
$s_path = '/home/john/smith/web/test/testing/nothing/';
$s_path = str_replace('/home/john/smith/web', '', $s_path);
And use realpath() to resolve any '../../' paths.
And remember dirname(__FILE__) gets the directory of the current script (not the CWD; use getcwd() for that), and rtrim() is extremely useful for removing trailing slashes.