How to perform file path operations in PHP? - php

Something like:
/directory/a/b - /directory/ = a/b
Is it possible to do this easily?

Since you're working with paths, platform sensitivity is important; Windows has a different path separator than most other platforms, and to write reusable code you can't snub a platform.
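PHP has a few functions to deal with paths. If you're handed a really strange path like /home/foo/bar//bitty/../index.php, use realpath to clean that up for you. Note that realpath does not expand a shell-style ~, and it returns false if the path does not exist.
$path = realpath("/home/foo/bar//bitty/../index.php");
/* result: /home/foo/bar/index.php (assuming that file exists) */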
Other functions will aid you -- for example, to get the path part of a filename by itself, use dirname:
print dirname($path);
/* output: /home/foo/bar */
Once you have that, split on the separators and do whatever work you want. The real trick is having PHP worry about all the weirdness in paths for you, and then just working with each part separately. Look into pathinfo and basename as well. I think this is what you were asking for, not how to do dumb string replacements.
Don't forget to guard your application against injection! Working with paths from Web input is dangerous. Never trust user input.
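The "split on the separators" step can be sketched roughly as follows. The helper name splitPath is made up for illustration, and treating a backslash as a separator is an assumption that suits Windows-style input; on Linux a backslash is a legal character inside a file name.

```php
<?php
// Hypothetical helper: break a path into its non-empty segments.
// Assumption: both "/" and "\" count as separators.
function splitPath(string $path): array
{
    $normalized = str_replace('\\', '/', $path);
    $segments = explode('/', $normalized);
    // Drop the empty strings produced by leading or doubled separators.
    return array_values(array_filter($segments, static function ($s) {
        return $s !== '';
    }));
}

// pathinfo() returns the directory, base name, extension and file name in one call.
$info = pathinfo('/home/foo/bar/index.php');
echo $info['dirname'], "\n";   // /home/foo/bar
echo $info['basename'], "\n";  // index.php
echo $info['extension'], "\n"; // php
echo implode(' | ', splitPath('/home/foo//bar/index.php')), "\n"; // home | foo | bar | index.php
```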

echo str_replace("/directory/","","/directory/a/b");
To use this on other strings, your full string goes in the third parameter, and whatever you're "subtracting" goes in the first parameter. Be aware that str_replace removes every occurrence of the search string, not just the one at the start.
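If only the leading directory should be "subtracted", a prefix check avoids touching later occurrences. This is a minimal sketch; stripPrefix is a hypothetical helper name, not a built-in.

```php
<?php
// Hypothetical helper: remove $prefix only when $path actually starts with it.
function stripPrefix(string $path, string $prefix): string
{
    if (strncmp($path, $prefix, strlen($prefix)) === 0) {
        return substr($path, strlen($prefix));
    }
    return $path; // prefix not present at the start, leave the path alone
}

echo stripPrefix('/directory/a/b', '/directory/'), "\n"; // a/b
echo stripPrefix('/other/directory/a', '/directory/'), "\n"; // /other/directory/a
```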

Using the dirname() function and some string functions, you can cut the original path up and get the pieces.
<?php
// from: http://php.net/manual/en/function.dirname.php
$path = "/dirname/a/b";
$dir = dirname(dirname($path));
echo "dir at front=$dir\n";
$len = strlen($dir);
$dirname = substr ( $path, 0, $len+1 );
echo "dirname=$dirname\n";
$last_2 = substr ( $path, $len+1 );
echo "last_2=$last_2\n";
?>
results in
$ php x.php
dir at front=/dirname
dirname=/dirname/
last_2=a/b

Related

Is it possible to escape this str_replace() for path traversal?

I'm testing one of our PHP web applications for security issues.
The system the code is running on runs with at least PHP7.2.
Now I found something like the following in the code (simplified for this question, but boils down to this):
$file = $_GET['file'];
$path = "/some/directory/" . $file;
$path = str_replace(['../', '..'], '', $path);
echo file_get_contents($path);
Is it possible to modify the file parameter in a way that we can escape /some/directory, so that after the str_replace() the file_get_contents()-call looks something like: file_get_contents(/some/directory/../../etc/passwd)?
Edit:
I can't change the order of code execution. I can only define the value of $_GET['file'] with my request.
Furthermore I know how to make this more secure but for my research I intend to break it.
Basically what needs to be done is somehow tricking out the str_replace() into leaving some ../ behind.
I tried for a few hours now, with various approaches, but - luckily for our application - couldn't get it working.
Do you have any ideas?
You can fiddle around with the code here: https://3v4l.org/3ehYA
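A small harness for experimenting with candidate payloads might look like the sketch below. The function name filterPath and the candidate list are made up for illustration. The relevant behaviour is that str_replace applies the search entries in order, each on the result of the previous one, so any "../" reassembled by the first pass still goes through the ".." pass. That second pass reduces every run of dots to at most one dot, which is why none of these candidates survive. This is an experiment on the filter alone, not a proof that the application as a whole is safe.

```php
<?php
// Mirrors the filtering logic from the question (name is hypothetical).
function filterPath(string $file): string
{
    $path = "/some/directory/" . $file;
    return str_replace(['../', '..'], '', $path);
}

$candidates = [
    '../../etc/passwd',
    '....//....//etc/passwd',
    '..././..././etc/passwd',
    '.../...//etc/passwd',
];

foreach ($candidates as $candidate) {
    $result = filterPath($candidate);
    $verdict = strpos($result, '..') === false ? 'no ".." left' : '".." survived';
    printf("%-26s => %s (%s)\n", $candidate, $result, $verdict);
}
```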

PHP: Recommended way to escape slashes in path (e.g. to prevent directory traversal attack)

I am looking for a PHP function to sanitize strings into safe and valid file names with no directory separators (slashes).
Ideally it should be reversible, and it should not scramble the name more than necessary.
Of course I want to prevent intentional directory traversal attacks. But I also want to prevent subfolders being created.
I figured that urlencode() would work, but I wonder if this is sufficient, and/or if there is something better or more popular.
Also if there is something that works equally well on Windows (backslash as directory separator) - so the solution would be portable.
Use case / scenario:
As part of a data import, I want to download files from remote urls into the local filesystem. The urls are from a csv file. Most of them are ok, but they may contain more slashes than expected.
E.g. most of them are like this:
https://files.example.com/pdf/12345.pdf
But then individual files might be like this:
https://files.example.com/pdf/1/2345.pdf
The files should all go into the same directory, e.g.
https://files.example.com/pdf/12345.pdf -> /destination/dir/12345.pdf
A file like 1/2345.pdf should not result in a subdirectory. Instead, the / should be escaped in some (reversible) way. E.g. with urlencode() this would be 1%2F2345.pdf.
You could create a set of replacements. For example, you could represent a / character that appears in a filename with something else, such as "(slash)". Then use str_replace to switch between the original name and the encoded filename. This is only one example.
This should help you.
Input: https://files.example.com/pdf/1/2345.pdf
Output: pdf_1_2345.pdf
$url = 'https://files.example.com/pdf/1/2345.pdf';
$parse = parse_url($url);
//get path, remove first slash
//$path: pdf/1/2345.pdf
$path = substr($parse['path'],1);
//result becomes: pdf_1_2345.pdf
$result = str_replace('/','_',$path);
EDIT: The best bet is to store the remote file URL in the database, hash its value (using md5 or similar), save the file locally under that hashed name, and store the hashed name in the database too.
This way you always know which remote file corresponds to which local file, and vice versa, and you don't have to deal with the original filenames locally; the local names can be whatever you want, as long as you keep them unique.
Database table:
+----+----------------------------+-----------------+
| id | remote_url                 | local_name      |
+----+----------------------------+-----------------+
| 1  | http://example/.../123.pdf | sdflkfd..dl.pdf |
+----+----------------------------+-----------------+
You get the idea.
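The hashing idea could be sketched like this. The function name localNameFor is hypothetical, and sha256 is used here as the "or similar" in place of md5; the extension is kept only if it looks harmless.

```php
<?php
// Hypothetical helper: derive a local file name from a remote URL by hashing it.
function localNameFor(string $remoteUrl): string
{
    $path = (string) parse_url($remoteUrl, PHP_URL_PATH);
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    // Keep the extension only when it is short and alphanumeric.
    $suffix = preg_match('/^[a-z0-9]{1,8}$/', $ext) ? '.' . $ext : '';
    return hash('sha256', $remoteUrl) . $suffix;
}

// 64 hex characters followed by ".pdf"; contains no directory separators.
echo localNameFor('https://files.example.com/pdf/1/2345.pdf'), "\n";
```

The resulting name would go into the local_name column next to the original URL, so the mapping stays recoverable even though the hash itself is not reversible.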
You can use this function, it replaces all directory separators with an underscore.
function secureFilePath($str)
{
    $str = str_replace('/', '_', $str);
    $str = str_replace('\\', '_', $str);
    $str = str_replace(DIRECTORY_SEPARATOR, '_', $str); // In case it does not equal the standard values
    return $str;
}
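Since the question asks for something reversible, percent-encoding is one option, and the sketch below builds on the urlencode() idea from the question. The helper names are hypothetical. rawurlencode() encodes both / and \ but leaves dots untouched, so the special names "." and ".." are handled separately here.

```php
<?php
// Hypothetical helpers: reversible encoding of an arbitrary string into a single file name.
function encodeFileName(string $name): string
{
    $encoded = rawurlencode($name); // "/" becomes %2F, "\" becomes %5C
    if ($encoded === '.' || $encoded === '..') {
        // Dots are not encoded by rawurlencode(), so escape these two special names by hand.
        $encoded = str_replace('.', '%2E', $encoded);
    }
    return $encoded;
}

function decodeFileName(string $encoded): string
{
    return rawurldecode($encoded);
}

echo encodeFileName('1/2345.pdf'), "\n";                 // 1%2F2345.pdf
echo decodeFileName(encodeFileName('1/2345.pdf')), "\n"; // 1/2345.pdf
```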

how to safely join strings to a path in php? [duplicate]

This question already has answers here:
Preventing Directory Traversal in PHP but allowing paths
(7 answers)
Closed 9 years ago.
I have a constant beginning of a string and a variable ending. How can I secure the string so that it doesn't step back (or step up) a directory when the string I inject contains
../
Here is a short sample code:
$dir = 'my/base/path/';
$file = $dir . $userSelectedFilename;
unlink($file);
If $userSelectedFilename were '../../myFileName', I assume my script would actually try to unlink a file two directory levels up (my/myFileName), which is clearly not something I want to allow. I want to stay under the base path my/base/path/ under all circumstances.
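I suggest the following, and only the following, method:
<?php
$dir = 'my/base/path/';
$file = $dir . $userSelectedFilename;
$realDir = realpath($dir);
$realFile = realpath($file); // false if the file does not exist
// Append a separator so that my/base/pathological does not match my/base/path
if ($realDir !== false && $realFile !== false
    && strpos($realFile, $realDir . DIRECTORY_SEPARATOR) === 0 // note the strict ===
    && is_file($realFile)) {
    unlink($realFile);
}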
Why?
It is safe to rely on realpath to find the real location of a file, which eliminates directory traversal, multibyte double-dot tricks, symlinks, and so on.
After that we check whether the realpath of the file really begins with the realpath of our expected directory (strpos).
Finally we check that the target is a regular file rather than a directory or something else.
I have seen character-eliminating solutions broken by multibyte strings and similar attacks.
So far this method withstands all of these.
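The same check can be wrapped in a reusable function. This is a sketch under the assumption that both the base directory and the target already exist, since realpath returns false otherwise; resolveInside is a hypothetical name.

```php
<?php
// Hypothetical helper: return the real path of $userPath if it is a regular file
// located inside $baseDir, or null otherwise.
function resolveInside(string $baseDir, string $userPath): ?string
{
    $base = realpath($baseDir);
    $target = realpath($baseDir . DIRECTORY_SEPARATOR . $userPath);
    if ($base === false || $target === false) {
        return null; // base or target does not exist
    }
    // Compare against "base + separator" so that /a/base2 is not accepted for /a/base.
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    if (strncmp($target, $prefix, strlen($prefix)) !== 0) {
        return null; // resolved path escapes the base directory
    }
    return is_file($target) ? $target : null;
}
```

A caller would then unlink only a non-null result, for example `if (($f = resolveInside($dir, $userSelectedFilename)) !== null) { unlink($f); }`.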
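You could filter out those characters by doing something like:
$file = preg_replace("/\.\.\//", "", $file);
which removes occurrences of the string ../ in a single pass. Note that a single pass can be defeated: an input such as ....// collapses back to ../ after the replacement.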
And just a side note, you should probably find a different way of allowing users to select files to delete rather than allowing them to input the path as a string, maybe by showing them a directory listing of files they can delete or something like that.
A path such as "my/base/path/../../dir/" is valid as written. If you want the "real" path, use this:
echo realpath($dir."../../dir/"); // absolute path ending in my/dir, or false if it does not exist
http://php.net/manual/en/function.realpath.php
Use a regex to check the string for a ../ or /../ segment and reject it on a match:
function validatePath($path) {
    // Reject any ".." path segment (at the start, in the middle, or at the end)
    if (preg_match('#(^|/)\.\.(/|$)#', $path)) {
        return false;
    }
    return true;
}

shell_exec() statement to pdftotext entire directory?

I'm at a loss as to how I could build a loop to run pdftotext on an entire directory through a shell_exec() statement.
Something like :
$pdfs = glob("*.pdf");
foreach($pdfs as $pdfs) {
shell_exec('pdftotext '.$pdfs.' '.$pdfs'.txt');
}
But I'm unsure how I can drop the .pdf extension the 2nd time I call $pdfs in my shell_exec() statement and replace that with .txt
Not really sure this loop is correct either....
Try
foreach(glob("*.pdf") as $src) {
// Manually remove file extension because glob() may return a dir path component
$parts = explode('.', $src);
$parts[count($parts) - 1] = 'txt';
$dest = implode('.', $parts);
// Escape shell arguments, just in case
shell_exec('pdftotext '.escapeshellarg($src).' '.escapeshellarg($dest));
}
Basically, loop over the PDF files in the directory and execute the command for each one, swapping the extension to build the output file name (so test.pdf becomes test.txt); see the edit below.
Using the result of glob() directly in foreach easily avoids the variable naming collision you had in the code above.
EDIT
I have changed the above code to manually remove the file extension when generating the output file name. This is because glob() may return a directory component in the path strings, as well as just a file name. Using pathinfo() or basename() would strip this off, and since we know that a . will be present in the file name (the pattern passed to glob() dictates this) we can safely replace everything after the last one. I have also added escapeshellarg() for good measure. It is highly unlikely (if not impossible) that a file name that already exists would fall foul of this, but it is best to be safe.
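An alternative way to build the output name while keeping any directory component is sketched below. txtNameFor is a hypothetical helper; it uses pathinfo() only to find the extension and then trims it off the original string.

```php
<?php
// Hypothetical helper: swap the extension of $pdf for ".txt", keeping its directory part.
function txtNameFor(string $pdf): string
{
    $ext = pathinfo($pdf, PATHINFO_EXTENSION);
    // Remove ".<ext>" from the end when there is an extension.
    $stem = $ext === '' ? $pdf : substr($pdf, 0, -(strlen($ext) + 1));
    return $stem . '.txt';
}

echo txtNameFor('test.pdf'), "\n";           // test.txt
echo txtNameFor('docs/report.v2.pdf'), "\n"; // docs/report.v2.txt
```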
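$pdfs = glob("*.pdf");
$fmt = '/path/to/pdftotext %s %s';
foreach ($pdfs as $thispdf) {
    // basename() with a suffix strips both the directory and the .pdf extension
    $txt = basename($thispdf, ".pdf") . ".txt";
    shell_exec(sprintf($fmt, escapeshellarg($thispdf), escapeshellarg($txt)));
}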

How to check an executable's path is correct in PHP?

I'm writing a setup/installer script for my application, basically just a nice front end to the configuration file. One of the configuration variables is the executable path for mysql. After the user has typed it in (for example: /path/to/mysql-5.0/bin/mysql or just mysql if it is in their system PATH), I want to verify that it is correct. My initial reaction would be to try running it with "--version" to see what comes back. However, I quickly realised this would lead to me writing this line of code:
shell_exec($somethingAUserHasEntered . " --version");
...which is obviously a Very Bad Thing. Now, this is a setup script which is designed for trusted users only, and ones which probably already have relatively high level access to the system, but still I don't think the above solution is something I want to write.
Is there a better way to verify the executable path? Perhaps one which doesn't expose a massive security hole?
Running arbitrary user commands is like running queries based on user input... Escaping is the key.
First, validate that it is an executable using is_executable().
For the escaping itself, PHP exposes two functions: escapeshellarg() and escapeshellcmd().
escapeshellarg() adds single quotes around a string and quotes/escapes any existing single quotes allowing you to pass a string directly to a shell function and having it be treated as a single safe argument.
escapeshellcmd() escapes any characters in a string that might be used to trick a shell command into executing arbitrary commands.
This should limit the amount of risk.
if(is_executable($somethingAUserHasEntered)) {
shell_exec(escapeshellarg($somethingAUserHasEntered) . " --version");
}
After all, doing rm --version isn't very harmful, and "rm -rf / &&" --version won't get you anywhere very fast, because the quoted string is treated as a single (nonexistent) command name.
EDIT: Since you mentioned PATH... Here is a quick function to validate if the file is an executable according to PATH rules:
function is_exec($file) {
    if (is_executable($file)) return true;
    if (realpath($file) == $file) return false; // Absolute path that is not executable
    // getenv() works even when $_ENV is not populated (that depends on variables_order)
    $paths = explode(PATH_SEPARATOR, (string) getenv('PATH'));
    foreach ($paths as $path) {
        // Make sure it has a trailing slash
        $path = rtrim($path, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
        if (is_executable($path . $file)) return true;
    }
    return false;
}
You could try a simple file_exists call to determine if something exists at that location, along with an is_executable to confirm that it's something you can run.
Have you looked at is_dir(), is_link(), is_file(), or is_readable()?
Hope these help.
system('which '.escapeshellarg($input)) will give you the absolute path to the executable, regardless if it's just the name or an absolute path.
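Another way to sidestep shell quoting altogether is to avoid the shell. The sketch below assumes PHP 7.4 or later, where proc_open() accepts the command as an array and starts the program directly without a shell. probeVersion is a hypothetical name, and the user-supplied value should still pass the is_executable() style checks above before it is run.

```php
<?php
// Hypothetical helper: run "<executable> --version" without going through a shell.
// Assumes PHP >= 7.4 (array form of proc_open).
function probeVersion(string $executable): ?string
{
    $spec = [
        1 => ['pipe', 'w'], // stdout
        2 => ['pipe', 'w'], // stderr
    ];
    $proc = proc_open([$executable, '--version'], $spec, $pipes);
    if (!is_resource($proc)) {
        return null; // the process could not be started
    }
    $stdout = stream_get_contents($pipes[1]);
    stream_get_contents($pipes[2]); // drain stderr so the child is not blocked on it
    fclose($pipes[1]);
    fclose($pipes[2]);
    $exitCode = proc_close($proc);
    return $exitCode === 0 ? trim((string) $stdout) : null;
}

// Example: probe the PHP binary that is running this script.
echo probeVersion(PHP_BINARY), "\n";
```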
