Encode string as file path fragment in PHP - php

In PHP, the function urlencode() lets you encode a string as a URL fragment, encoding all characters that might have a special meaning in a URL. The encoding is reversible with urldecode().
I am looking for something similar for filesystem paths.
urlencode() might already work here, but it seems more aggressive than needed. E.g. there is no reason to escape "()" parentheses. On the other hand, a dot, especially double dot, can have special meaning in a filesystem path (one level up), so perhaps this would need to be encoded.
Is there anything available in PHP, or what would be the closest to do manually?
Is urlencode() safe for this purpose, or do we need to be careful indeed about the dot? Perhaps it is sufficient to exclude the dot as a first character, but allow it in the middle of a path.
My current use case would be Linux filesystem paths, but in the end I might want to put this in a software package that should be usable on different OSes.
Like urlencode(), I am looking for something that is reversible. So, not just sanitization, but actually encoding so it can be decoded.
Ideally, encoded paths should still be readable by humans. So, most characters that are allowed in a filesystem path should preferably remain unchanged.
Use case
The question above should already be complete without this use case, but some people prefer to see one.
I have a scenario where I want to export a huge amount of data to distinct php files, and interpret the file names / paths as array keys when reading / writing the data.
E.g.
foreach ($addresses as $name => $address) {
    // FILEPATH_ENCODE() is a placeholder for the encoding function being asked about.
    file_put_contents($dir . '/' . FILEPATH_ENCODE($name), '<?php return ' . var_export($address, TRUE) . ';');
}
The benefit of storing this in separate files is to track them easily using git, with minimal potential for git conflicts (compared to lumping them all in one file). Encoding the array key in the file name / path means I no longer need to store it in the file itself, thus reducing redundancy.
Extended example: Return closures instead of values.
foreach ($addresses as $name => $closureBodyPhp) {
file_put_contents($dir . '/' . FILEPATH_ENCODE($name), '<?php return function () {' . $closureBodyPhp . '};');
}
The data or closures I want to export are provided by an existing system; their content is not up to me.
If you really want to know: I want to replace Drupal 7 features, that is, export existing features-generated php in a centralized directory instead of in separate feature modules.
But please, don't get too distracted with the use case.
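One possible direction, as a minimal sketch rather than an existing PHP API: percent-encode only the bytes outside a conservative whitelist, and additionally encode a leading dot so that "." and ".." never appear as a path fragment. The function names filepath_encode() and filepath_decode() and the choice of whitelist are assumptions made for illustration. Because "%" itself is always encoded, rawurldecode() can reverse the result.

```php
<?php
// Hypothetical helpers: these names are NOT PHP built-ins.

function filepath_encode(string $name): string
{
    // Percent-encode every byte that is not in the whitelist below.
    // "%" is not whitelisted, so it becomes "%25" and decoding stays unambiguous.
    // "/" and "\" are not whitelisted either, so the result is a single path fragment.
    $encoded = preg_replace_callback(
        '/[^A-Za-z0-9 _.,()@+=-]/',
        static function (array $m): string {
            return sprintf('%%%02X', ord($m[0]));
        },
        $name
    );

    // Encode a leading dot so that ".", ".." and hidden-file names never appear as-is.
    if ($encoded !== '' && $encoded[0] === '.') {
        $encoded = '%2E' . substr($encoded, 1);
    }

    return $encoded;
}

function filepath_decode(string $encoded): string
{
    // rawurldecode() reverses %XX sequences and leaves "+" untouched.
    return rawurldecode($encoded);
}
```

With this sketch, 'Main St. (north) 100%' becomes 'Main St. (north) 100%25' and '../etc/passwd' becomes '%2E.%2Fetc%2Fpasswd'. It does not handle Windows-specific restrictions such as reserved device names or trailing dots and spaces, and an empty string stays empty, so those cases would need extra rules.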

Related

laravel. Replace %20 in url

I have simple problem, I have to replace %20 and other crap from URL. At the moment it looks like this http://exmaple/profile/about/Eddies%20Plumbing. As you can see it's profile link.
Yes I could add str_replace values before every hyperlink, but I have like 10 of them and I think it's bad practice. Maybe there is better solution? What solution would you use? Thanks.
That is not crap; that is the valid percent-encoded representation of a space character. It's encoded because the space is one of the characters that are deemed unsafe by RFC 1738:
All unsafe characters must always be encoded within a URL. For
example, the character "#" must be encoded within URLs even in
systems that do not normally deal with fragment or anchor
identifiers, so that if the URL is copied into another system that
does use them, it will not be necessary to change the URL encoding.
So in order to have pretty URLs, you should avoid using reserved and unsafe characters which need encoding to be valid as part of a URL:
Reserved characters: $ & + , / : ; = ? @
Unsafe characters: Blank/empty space and < > # % { } | \ ^ ~ [ ] `
Instead replace spaces with dashes, which serve the same purpose visually while being a safe character, for example look at the Stack Overflow URL for this question. The URL below looks just fine and readable without spaces in it:
http://exmaple/profile/about/eddies-plumbing
You can use Laravel's str_slug helper function to do the hard work for you:
str_slug('Eddies Plumbing', '-'); // returns eddies-plumbing
The str_slug helper does more than replace spaces with dashes: it replaces multiple spaces with a single dash and also strips all non-alphanumeric characters, so there's no reliable way to decode it.
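To illustrate why a slug cannot be decoded, here is a simplified plain-PHP sketch of the kind of transformation described above. The function simple_slug() is hypothetical and is not Laravel's actual implementation; it only mimics the lowercase, strip and collapse steps for ASCII input.

```php
<?php
// Hypothetical helper, NOT Laravel's str_slug(): a simplified ASCII-only imitation.

function simple_slug(string $title): string
{
    $slug = strtolower($title);
    // Strip everything except ASCII letters, digits, whitespace and dashes.
    $slug = preg_replace('/[^a-z0-9\s-]+/', '', $slug);
    // Collapse each run of whitespace and dashes into a single dash.
    $slug = preg_replace('/[\s-]+/', '-', $slug);

    return trim($slug, '-');
}

// Two different titles end up with the same slug, so the original is lost.
$a = simple_slug('Eddies Plumbing');
$b = simple_slug("Eddie's   Plumbing!");
```

Both $a and $b are 'eddies-plumbing', which is why the slug should be stored or paired with an ID rather than decoded.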
That being said, I wouldn't use that approach in the first place. There are two main ways I generally use to identify a database entry:
1. Via an ID
The route path definition would look like this in your case:
/profiles/about/{id}/{slug?} // real path "/profiles/about/1/eddies-plumbing"
The code used to identify the user would look like this User::find($id) (the slug parameter is not needed, it's just there to make the URL more readable, that's why I used the ? to make it optional).
2. Via a slug
The route path definition would look like this in your case:
/profiles/about/{slug} // real path "/profiles/about/eddies-plumbing"
In this case I always store the slug as a column in the users table because it's a property relevant to that user. The retrieval is then very easy: User::where('slug', $slug). Of course, str_slug is used to generate a valid slug when saving the user to the database. I usually like this approach better because it has the added benefit of allowing the slug to be whatever you want (it does not need to be generated from the user name). This can also allow users to choose their custom URL, and can also help with search engine optimisation.
The links are urlencoded. Use urldecode($profileLink); to decode them.
I am parsing the URL that I get in this way:
$replacingTitle = str_replace('-',' ',$title);
<a href="example.com/category/{{ str_slug($article->title) }}/" />
In your view:
{{ $comm->title }}
and in the controller, parse the URL like this:
public function showBySlug($slug) {
$title = str_replace('-',' ',$slug);
$post = Community::where('title','=',$title)->first();
return view('show')->with(array(
'post' => $post,
));
}

PHP escaping \14 character in file paths

I know there're plenty of topics regarding escaping characters but I just can't find the solution for my problem.
It's very easy. This is string I have:
$path = "C:\Users\Me\Desktop\14409238.jpg";
However, no matter how many escaping techniques I use, I can't manage to display the correct path without destroying it. In all cases the \14 will be replaced, giving:
C:\Users\Me\Desktopd09238.jpg
How do I solve this?
Don't use backslashes in PHP for windows paths. It's smart enough to convert for you:
$path = "c:/users/me/desktop/...";
Using backslashes runs into the exact problem you have - backslashing certain characters turns them into metacharacters, not regular characters.
Try escaping the backslashes. The physical path to the image stored on the Desktop can be changed from
$path = "C:\Users\Me\Desktop\14409238.jpg";
to
$path = "C:\\Users\\Me\\Desktop\\14409238.jpg";
Avoid the situation entirely: PHP under Windows allows you to write paths with forward slashes
c:/Users/Me/Desktop/file.jpg
This also avoids interoperability headaches when a script must run on both *nix and Windows.
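The behavior discussed in these answers can be reproduced with a short sketch. In a double-quoted PHP string, a backslash followed by up to three octal digits is an escape sequence, so "\144" is read as the character "d" (octal 144 is decimal 100). The variable names below are only illustrative.

```php
<?php
// Double quotes: "\144" is an octal escape for "d", so the path is corrupted.
$broken = "C:\Users\Me\Desktop\14409238.jpg";

// Single quotes: only \\ and \' are treated as escapes, so the path stays intact.
$literal = 'C:\Users\Me\Desktop\14409238.jpg';

// Double quotes with every backslash escaped give the same string as $literal.
$escaped = "C:\\Users\\Me\\Desktop\\14409238.jpg";

// Forward slashes need no escaping at all.
$forward = 'C:/Users/Me/Desktop/14409238.jpg';

echo $broken, PHP_EOL;   // C:\Users\Me\Desktopd09238.jpg
echo $literal, PHP_EOL;  // C:\Users\Me\Desktop\14409238.jpg
```

Single quotes are the smallest change when the path is a literal in the source; forward slashes are the more portable choice.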

Finding the correct permutation of spaces and underscores in a string, in PHP

I have to parse a XFERLOG log file of all the files being written to disk, and process the said files with an external script. The issue with XFERLOG is that it replaces all spaces with underscores, while the filename on disk remains unchanged (as it should be).
If the original filename has a mix of spaces and underscores, this makes it difficult to determine the actual filename on disk, so one would have to loop through all the permutations of spaces and underscores and check each permutation against the filesystem to see if it exists.
So let's say the logfile reads this:
/path/to/file/OCD_Nightmare_-_[stuff_here_2].txt
The actual file on disk looks like this:
/path/to/file/OCD Nightmare - [stuff_here 2].txt
There are 2^5 permutations here, one for each way of turning the five underscores into spaces or leaving them alone. What would be the best course of action to find the "right" string?
Possibly use str_replace for this:
if(str_replace('_', ' ', $filename) == str_replace('_', ' ', $logfilename))
{
//Yay, a match!
}
Note: As mentioned in a comment below, if your filesystem has /path/to/file/OCD_Nightmare_-_[stuff_here_2].txt and /path/to/file/OCD_Nightmare -_[stuff here_2].txt, they will both match the log entry of /path/to/file/OCD Nightmare - [stuff_here 2].txt, possibly resulting in unwanted behavior. I believe this may be a very unlikely situation, but still worth noting.
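The comparison above can be wrapped in a small helper that avoids generating permutations: normalize both the logged name and each candidate name the same way, then compare. This is a sketch under assumptions; find_logged_files() is a hypothetical name, and the list of candidate names is passed in so that the caller decides how to read the directory.

```php
<?php
// Hypothetical helper: returns every on-disk name that the log entry could refer to.

function find_logged_files(string $loggedName, array $namesOnDisk): array
{
    // The log shows every space as an underscore, so treat both characters as equal.
    $normalizedLog = str_replace('_', ' ', $loggedName);

    $matches = [];
    foreach ($namesOnDisk as $name) {
        if (str_replace('_', ' ', $name) === $normalizedLog) {
            $matches[] = $name;
        }
    }

    // More than one match means the log entry is ambiguous.
    return $matches;
}

$candidates = ['OCD Nightmare - [stuff_here 2].txt', 'notes.txt'];
$found = find_logged_files('OCD_Nightmare_-_[stuff_here_2].txt', $candidates);
```

In practice the candidates could come from scandir() on the directory part of the logged path. Returning all matches instead of the first one makes the ambiguous case from the note visible to the caller.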

PHP directory separators, forcing forward slash; non-intrusive

Whenever I work with PHP (often) I typically work on a Windows box, however I (try to) develop platform agnostic applications; one major point of issue being the use of directory separators.
As many know, doing any filesystem work in a Windows environment in PHP, you can use forward slashes in lieu of backwards, and PHP sorts it out under the hood. This is all fine when it comes to using string literals to pass a path to fopen() or whatever; but when retrieving paths, be it __FILE__ or expanding with realpath(), the paths retrieved are of course using the OS appropriate slashes. Also, I've noticed some inconsistencies in trailing slashes. Once or twice __DIR__ has appended one (a backslash) and realpath() too (I prefer the trailing slash, but not intermittently)
This is clearly a problem for string comparison, because instead of doing:
compare_somehow('path/to/file.php', __DIR__);
For the sake of reliability, I'm having to go:
compare_somehow('path/to/file.php', rtrim(strtr(__DIR__, '\\', '/'), '/') . '/');
This seems like a lot of work. I can drop it into a function, sure; but now I'm stuck with an arbitrary function dependency in all my OO code.
I understand that PHP isn't perfect, and accommodations need to be made, but surely there must exist some platform agnostic workaround to force filesystem hits to retrieve forward slashed paths, or at least a non-intrusive way to introduce a class-independent function for this purpose.
Summary question(s):
Is there some magical (though reliable) workaround, hack, or otherwise to force PHP to kick back forward slashed filesystem paths, regardless of the server OS?
I'm going to assume the answer is no to the above, so moving on; what provisions can I make to enforce forward slash (or whatever choice, really) as the directory separator? I'm assuming through the aforementioned filter function, but now where should it go?
Forward slash for everything. Even if the host OS separator is #*&#.
As I commented I can't really see why you would have to do this (I'd be interested in a quick description of the specific problem you are solving), but here is a possible solution using the output of __FILE__ as an example:-
$path = str_replace('\\', '/', __FILE__);
This should work regardless of which slashes are returned by the OS (I think).
Unfortunately I'm not aware of "some magical (though reliable) workaround, hack, or otherwise to force PHP to kick back forward slashed filesystem paths, regardless of the server OS" other than this. I imagine it could be wrapped in a helper class, but that still gives you an arbitrary dependency in your code.
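As a sketch of the filter function discussed in the question, the replacement and the trailing-slash handling can be combined in one small helper. The name normalize_dir() is an assumption for illustration; where it lives (a namespaced function, a static method, a utility class) is a design decision this sketch does not settle.

```php
<?php
// Hypothetical helper: forward slashes everywhere, exactly one trailing slash.

function normalize_dir(string $path): string
{
    // Convert Windows separators, then drop any trailing slashes and add one back.
    return rtrim(str_replace('\\', '/', $path), '/') . '/';
}

$fromWindows = normalize_dir('C:\\www\\app\\');
$fromUnix    = normalize_dir('/var/www/app');
```

Here $fromWindows is 'C:/www/app/' and $fromUnix is '/var/www/app/', so two paths can be compared as plain strings regardless of which OS produced them.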

How do I apply URL normalization rules in PHP?

Is there a pre-existing function or class for URL normalization in PHP?
Specifically, one that follows the semantics-preserving normalization rules laid out in the Wikipedia article on URL normalization (or whatever 'standard' I should be following):
Converting the scheme and host to lower case
Capitalizing letters in escape sequences
Adding trailing / (to directories, not files)
Removing the default port
Removing dot-segments
Right now, I'm thinking that I'll just use parse_url(), and apply the rules individually, but I'd prefer to avoid reinventing the wheel.
The Pear Net_URL2 library looks like it'll do at least part of what you want. It'll remove dot segments, fix capitalization and get rid of the default port:
include("Net/URL2.php");
$url = new Net_URL2('HTTP://example.com:80/a/../b/c');
print $url->getNormalizedURL();
emits:
http://example.com/b/c
I doubt there's a general purpose mechanism for adding trailing slashes to directories because you need a way to map urls to directories which is challenging to do in a generic way. But it's close.
References:
http://pear.php.net/package/Net_URL2
http://pear.php.net/package/Net_URL2/docs/latest/Net_URL2/Net_URL2.html
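For comparison, the manual approach mentioned in the question (parse_url() plus the rules applied one by one) might look roughly like the sketch below. The function normalize_url() is hypothetical, the default-port table only covers http and https, and the trailing-slash rule is left out because it needs knowledge of which paths are directories.

```php
<?php
// Hypothetical helper: a rough, partial take on semantics-preserving URL normalization.

function normalize_url(string $url): string
{
    $p = parse_url($url);
    if ($p === false || !isset($p['scheme'], $p['host'])) {
        return $url; // leave anything that cannot be parsed untouched
    }

    // Rule: lower-case the scheme and host.
    $scheme = strtolower($p['scheme']);
    $host = strtolower($p['host']);

    // Rule: remove the default port (only http/https are known here).
    $defaultPorts = ['http' => 80, 'https' => 443];
    $port = '';
    if (isset($p['port']) && ($defaultPorts[$scheme] ?? null) !== $p['port']) {
        $port = ':' . $p['port'];
    }

    // Rule: capitalize letters in percent-escape sequences.
    $path = preg_replace_callback(
        '/%[0-9a-f]{2}/i',
        static function (array $m): string {
            return strtoupper($m[0]);
        },
        $p['path'] ?? '/'
    );

    // Rule: remove dot-segments ("." and "..").
    $segments = explode('/', $path);
    $last = $segments[count($segments) - 1];
    $out = [];
    foreach ($segments as $segment) {
        if ($segment === '.') {
            continue;
        }
        if ($segment === '..') {
            if (count($out) > 1) {
                array_pop($out); // never pop the empty segment before the leading "/"
            }
            continue;
        }
        $out[] = $segment;
    }
    if ($last === '.' || $last === '..') {
        $out[] = ''; // a trailing dot-segment leaves a trailing slash
    }
    $path = implode('/', $out);

    // Keep user info, query and fragment as they were.
    $auth = isset($p['user'])
        ? $p['user'] . (isset($p['pass']) ? ':' . $p['pass'] : '') . '@'
        : '';
    $query = isset($p['query']) ? '?' . $p['query'] : '';
    $fragment = isset($p['fragment']) ? '#' . $p['fragment'] : '';

    return $scheme . '://' . $auth . $host . $port . $path . $query . $fragment;
}

$normalized = normalize_url('HTTP://Example.COM:80/a/../b/%7euser/./c');
```

Here $normalized is 'http://example.com/b/%7Euser/c'. A maintained library such as the one named above is likely the safer choice for anything beyond simple cases.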
