Can someone please help me with this preg_match
if (preg_match('~[^A-Za-z0-9_\./\]~', $filepath))
// Show Error message.
I need to match a possible filepath. So I need to check for double slashes, etc. Valid file path strings should look like this only:
mydir/aFile.php
or
mydir/another_dir/anyfile.js
So a slash at the beginning of this string should be checked also. Please help.
Thanks :)
EDIT:
Also, guys, this path is being read from within a text file. It is not a filepath on the system. So hopefully it should be able to support all systems in this case.
RE-EDIT:
Sorry, but the string can also look like this too:
myfile.php, or myfile.js, or myfile.anything
How do I allow strings like this as well?? I apologize for not being too specific on this before...
Please notice that there are many types of possible file paths.
For example:
"./"
"../"
"........" (yes this can be a file's name)
"file/file.txt"
"file/file"
"file.txt"
"file/.././/file/file/file"
"/file/.././/file/file/.file" (UNIX)
"C:\Windows\" (Windows)
"C:\Windows\asd/asd" (Windows, php accepts this)
"file/.././/file/file/file!##$"
"file/.././/file/file/file!##.php.php.php.pdf.php"
All these file paths are valid. I can't think of a simple regex that can make it perfect.
Let's assume it's just a UNIX path for now, this is what I think should work for most cases:
preg_match('/^[^*?"<>|:]*$/',$path)
It checks that the whole string contains none of *, ?, ", <, >, | and : (remove the : for Windows). The leading ^ inside the brackets negates the character class; it is not one of the characters being checked. These are the characters that Windows does not allow in a file name, along with / and \.
If it's Windows, you should replace the path's \ with / and then explode it and check if it's absolute. Here is one example that works on both Unix and Windows.
function is_filepath($path)
{
$path = trim($path);
if(preg_match('/^[^*?"<>|:]*$/',$path)) return true; // good to go
if(!defined('WINDOWS_SERVER'))
{
$tmp = dirname(__FILE__);
if (strpos($tmp, '/', 0)!==false) define('WINDOWS_SERVER', false);
else define('WINDOWS_SERVER', true);
}
/*first, we need to check if the system is windows*/
if(WINDOWS_SERVER)
{
if(strpos($path, ":") == 1 && preg_match('/[a-zA-Z]/', $path[0])) // check if it's something like C:\
{
$tmp = substr($path,2);
$bool = preg_match('/^[^*?"<>|:]*$/',$tmp);
return ($bool == 1); // so that it will return only true and false
}
return false;
}
//else // else is not needed
return false; // that t
}
You can do:
if(preg_match('#^(\w+/){1,2}\w+\.\w+$#',$path)) {
// valid path.
}else{
// invalid path
}
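Since the question's re-edit also allows a bare file name such as myfile.php, here is a minimal sketch that accepts zero or more directory segments instead of one or two. The function name is_relative_path is made up for illustration, and the allowed character set (letters, digits, underscore, dot) is simply taken from the question's own pattern; treat it as an assumption, not a general rule for file names.

```php
<?php
// Hypothetical helper: accepts "file.ext", "dir/file.ext", "dir/sub/file.ext", ...
// Rejects a leading slash, a trailing slash and empty segments ("//").
function is_relative_path(string $path): bool
{
    // One segment, then any number of "/segment" groups.
    // \A and \z anchor at the real start and end of the string.
    return preg_match('~\A[A-Za-z0-9_.]+(?:/[A-Za-z0-9_.]+)*\z~', $path) === 1;
}

var_dump(is_relative_path('mydir/another_dir/anyfile.js')); // bool(true)
var_dump(is_relative_path('myfile.php'));                   // bool(true)
var_dump(is_relative_path('/mydir/aFile.php'));             // bool(false)
var_dump(is_relative_path('mydir//aFile.php'));             // bool(false)
```

Because dots are allowed, this still accepts a segment such as "..", so it checks the shape of the string only and is not a protection against directory traversal.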
I have a question. I have a PHP script (PHP 5) which saves the URL parameter $_GET['file'] to the variable $file. Is there a way to check whether this variable is a valid filename (for example, hello.txt and not /../otherdir/secret.txt)? Without checking the $file variable, an attacker would be able to use /../ to reach my parent folder.
You may have a look at PHP's basename function; it returns the file name component of a path. See the example below:
$file = '../../abc.txt';
echo basename($file); //output: abc.txt
Note: basename returns the file name from the path string regardless of whether the file physically exists. The file_exists function can be used to verify that the file exists.
POSIX "Fully portable filenames" lists these: A-Z a-z 0-9 . _ -
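Use this code to validate the filename against POSIX rules using a regex. The character class is built from:
/ - forward slash (if you need to validate a path rather than a filename)
\w - equivalent of [0-9a-zA-Z_]
\-. and a space - dash, dot, space (the space is an addition; it is not in the POSIX list above)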
$filename = '../../test.jpg';
if (preg_match('/^[\/\w\-. ]+$/', $filename))
echo 'VALID FILENAME';
else
echo 'INVALID FILENAME';
If you want to ensure it has no path (no forwardslash) then change the regex to '/^[\w\-. ]+$/'.
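Note that the path pattern above reports the sample value '../../test.jpg' as valid, because dots and slashes are both in the allowed set. If the goal is to block directory traversal, one possible sketch is to reject empty, "." and ".." segments in addition to the character check. The name is_safe_relative_path is hypothetical.

```php
<?php
// Hypothetical helper: same character whitelist as above, plus a per-segment check.
function is_safe_relative_path(string $path): bool
{
    // Allowed characters only: slash, word characters, dash, dot, space.
    if (preg_match('/^[\/\w\-. ]+$/', $path) !== 1) {
        return false;
    }
    // An empty segment means a leading, trailing or doubled slash;
    // "." and ".." are the segments used for directory traversal.
    foreach (explode('/', $path) as $segment) {
        if ($segment === '' || $segment === '.' || $segment === '..') {
            return false;
        }
    }
    return true;
}

var_dump(is_safe_relative_path('../../test.jpg'));  // bool(false)
var_dump(is_safe_relative_path('images/test.jpg')); // bool(true)
```

A side effect of the empty-segment rule is that absolute paths such as /etc/passwd are rejected as well.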
Instead of checking for valid characters, why not look for the characters you don't want? Also, filenames are limited to 255 bytes on most file systems:
function valid_filename(string $filename)
{
if (strlen($filename) > 255) { // no mb_* since we check bytes
return false;
}
$invalidCharacters = '|\'\\?*&<";:>+[]=/';
if (false !== strpbrk($filename, $invalidCharacters)) {
return false;
}
return true;
}
valid_filename('hello'); // true
valid_filename('hello.php'); // true
valid_filename('foo:bar.php'); // false
valid_filename('foo/bar'); // false
Adapt $invalidCharacters according to your needs/OS.
Source: https://www.cyberciti.biz/faq/linuxunix-rules-for-naming-file-and-directory-names/
Would that work?
http://php.net/manual/en/function.file-exists.php
E.g, in your case,
if(file_exists(str_replace("../", "", $file))){
// valid
}
else{
// invalid
}
Files can be in subfolders but not in parent folders.
However, if you're just interested in the filename,
if(file_exists(pathinfo($file, PATHINFO_BASENAME))){
// valid
}
else{
// invalid
}
should work.
I would like to combine kamal pal's and Pancake_M0nster's answers into something simple:
if(file_exists(basename($file))){
// valid
}
else{
// invalid
}
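Two caveats about the snippets above: str_replace() makes a single pass, so an input like "....//secret.txt" turns into "../secret.txt" after the "../" in the middle is removed, and file_exists(basename($file)) looks in the current working directory only. A minimal alternative sketch is to resolve the path with realpath() and check that the result still lies inside an allowed base directory. The function name resolve_inside and the idea of a fixed base directory are assumptions for illustration.

```php
<?php
// Hypothetical helper: returns the resolved path if $userPath points to an
// existing file or directory inside $baseDir, otherwise null.
function resolve_inside(string $baseDir, string $userPath): ?string
{
    $base = realpath($baseDir);
    if ($base === false) {
        return null; // base directory does not exist
    }
    // realpath() resolves "..", "." and symlinks; it returns false if nothing exists there.
    $full = realpath($base . DIRECTORY_SEPARATOR . $userPath);
    if ($full === false) {
        return null;
    }
    // The resolved path must start with "<base>/" to count as inside the base.
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    return strncmp($full, $prefix, strlen($prefix)) === 0 ? $full : null;
}

// Why stripping "../" once is not enough:
echo str_replace('../', '', '....//secret.txt'), PHP_EOL; // ../secret.txt
```

Usage would look like `$path = resolve_inside(__DIR__ . '/files', $_GET['file']);` followed by a null check, where the files directory is a placeholder.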
So I have a URL which contains &title=blabla
I know how to extract the title and return it. But I've been searching my ass off for a way to get the full path to the file when I only have the filename.
So what I need is a way to search all directories for an HTML file called 'blabla' when the only thing I have is blabla. After finding it, it must return the full path.
Does anyone have a solution for me?
<?php
$file = $_GET['title'];
if ($title = '') {
echo "information.html";
} else {
//here it must search for the filepath and echo it.
echo "$filepath";
}
?>
You can use the solution provided here.
It allows you to recurse through a directory and list all files in the directory and sub-directories. You can then compare to see if it matches the files you are looking for.
$root = '/'; // directory from where to start search
$toSearch = 'file.blah'; // basename of the file you wish to search
$it = new RecursiveDirectoryIterator($root);
foreach(new RecursiveIteratorIterator($it) as $file){
if($file->getBasename() === $toSearch){
printf("Found it! It's %s", $file->getRealPath());
// stop at the first match
break;
}
}
Keep in mind that depending on the number of files you have, this can be slow as hell
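For a start, the following line is at fault: it uses = (assignment) where == (comparison) was intended, so the condition is always false, and the parameter was stored in $file, not $title.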
if ($title = '') {
See http://www.php.net/manual/en/reserved.variables.files.php
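Putting the two answers together, a minimal sketch of the lookup could look like this. The function name find_html_by_title is made up, the ".html" extension is taken from the question, and the search root is whatever directory you choose to pass in.

```php
<?php
// Hypothetical helper: returns the path of the first "<title>.html" found
// below $root, or null if there is none.
function find_html_by_title(string $root, string $title): ?string
{
    // Keep only the last path component so the title cannot contain directories.
    $title = basename($title);
    if ($title === '') {
        return null;
    }
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if ($file->isFile() && $file->getFilename() === $title . '.html') {
            return $file->getPathname(); // stop at the first match
        }
    }
    return null;
}

// Possible use in the script from the question (note the === comparison):
// $title = $_GET['title'] ?? '';
// if ($title === '') {
//     echo 'information.html';
// } else {
//     echo find_html_by_title(__DIR__, $title) ?? 'information.html';
// }
```

Restricting $root to the directory that actually holds the pages, rather than '/', keeps the scan short and avoids exposing unrelated files.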
I have a function that scans directories on the server, reads files, does something with them, and then deletes the (nested) directories.
The function is quite long, so I will post only the relevant part.
//many other things ...
$dir_to_delete[] = $filename['dirname']; // the array to hold all the dirs.
} // end for each
$dir_to_delete_clean = array_unique($dir_to_delete); //clean array - we might have duplicated dir names
foreach ($dir_to_delete_clean as $delete) {
o99_deleteDirectory($delete) ;
}
// rmdir( $filename['dirname'] );
return $attc_id;
}
this is the delete function for non-empty dirs:
function o99_deleteDirectory($dir) {
if (!file_exists($dir)) return true;
if (!is_dir($dir)) return unlink($dir);
foreach (scandir($dir) as $item) {
if ($item == '.' || $item == '..') continue;
if (!o99_deleteDirectory($dir.DIRECTORY_SEPARATOR.$item)) return false;
}
return rmdir($dir);
}
It works great.
The problem is that when I tested directory names with non-English characters (German, Chinese, Hebrew, Arabic, Cyrillic, or any other), the script fails and stops...
I then tried rename(), rmdir() etc. They all fail.
Is this a PHP bug?
How can I resolve the problem? I cannot even rename them to delete them later, because rename() fails as well...
Any ideas?
Edit I
I forgot to mention that it is for wordpress plugin - but I would assume that it makes no difference...
Edit II
I am posting here some languages if someone wants to try but do not have the right keyboard / language settings . I am not sure that cutting and pasting will give the right encoding, but can always try ...
עברית (hebrew)
中國的 (chinese traditional)
عربي (arabic)
кириллица (cyrillic)
ελληνικά (greek)
öäüìíáàóò´Ä´` (German-Italian-Spanish and other european)
Have you tried setting the locale before scanning or removing directories?
http://www.php.net/manual/en/function.setlocale.php
I have not tried this but you can give it a shot. It might help.
I want to amend a PHP script I'm using in wordPress (Auto Featured Image plugin).
The problem is that this script creates filenames for thumbnails based on the URLs of the image.
That sounds great until you get a filename with spaces and the thumbnail is something like this%20Thumbnail.jpg and when the browser goes to http://www.whatever.com/this%20Thumbnail.jpg it converts the %20 to a space and there is no filename on the server by that name (with spaces).
To fix this, I think I need to change the following line in such a way that $imageURL is filtered to convert %20 to spaces. Sound right?
Here is the code. Perhaps you can tell me if I'm barking up the wrong tree.
Thank you!
<?php
static function create_post_attachment_from_url($imageUrl = null)
{
if(is_null($imageUrl)) return null;
// get file name
$filename = substr($imageUrl, (strrpos($imageUrl, '/'))+1);
if (!(($uploads = wp_upload_dir(current_time('mysql')) ) && false === $uploads['error'])) {
return null;
}
// Generate unique file name
$filename = wp_unique_filename( $uploads['path'], $filename );
?>
Edited to a more appropriate and complete answer:
static function create_post_attachment_from_url($imageUrl = null)
{
if(is_null($imageUrl)) return null;
// get the original filename from the URL
$filename = substr($imageUrl, (strrpos($imageUrl, '/'))+1);
// this bit is not relevant to the question, but we'll leave it in
if (!(($uploads = wp_upload_dir(current_time('mysql')) ) && false === $uploads['error'])) {
return null;
}
// Sanitize the filename we extracted from the URL
// Replace any %-escaped character with a dash
$filename = preg_replace('/%[a-fA-F0-9]{2}/', '-', $filename);
// Let Wordpress further modify the filename if it may clash with
// an existing one in the same directory
$filename = wp_unique_filename( $uploads['path'], $filename );
// ...
}
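If you would rather decode the escapes than replace each one with a dash, here is a minimal sketch using only built-in PHP functions. The name filename_from_url is hypothetical, and turning whitespace into dashes afterwards is my own assumption about what makes a convenient file name.

```php
<?php
// Hypothetical helper: derives a file name from the path part of a URL.
function filename_from_url(string $imageUrl): string
{
    // Take only the path, so a query string or fragment never ends up in the name.
    $path = parse_url($imageUrl, PHP_URL_PATH);
    if (!is_string($path)) {
        return '';
    }
    // Decode first, then take the last component, so an encoded slash (%2F)
    // cannot smuggle a directory separator into the result.
    $name = basename(rawurldecode($path));
    // Replace each run of whitespace with a single dash.
    return preg_replace('/\s+/', '-', $name);
}

echo filename_from_url('http://www.whatever.com/this%20Thumbnail.jpg'), PHP_EOL; // this-Thumbnail.jpg
```

Inside the plugin, the result could then be passed to wp_unique_filename() exactly as in the code above.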
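It would be better to replace the spaces in the image name with underscores or hyphens using a regular expression. For example, to collapse each run of encoded spaces:
$string = "Google%20%20%20Search%20Amit%20Singhal";
$string = preg_replace('/(?:%20)+/', ' ', $string);
This regex replaces each run of one or more encoded spaces (%20) with a single space (' '); use '_' or '-' as the replacement instead if you prefer. Note that PHP's preg functions have no g modifier, since preg_replace() already replaces every match, and that the group is needed because %20+ would only repeat the 0.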
I have almost 10,000 images in a Folder with image name like
Abies_koreana_Blauer_Pfiff_05-06-10_1.jpg
Abies_koreana_Prostrate_Beauty_05-05-10_2.jpg
Chamaecyparis_obtusa_Limerick 06-10-10_3.jpg
Fagus_sylvatica_Dawyck_Gold_05-02-10_1.jpg
What I want to do is rename the images using PHP so that only the letters remain in the image name; I want to delete the numeric part. For example, the above images would become:
Abies_koreana_Blauer_Pfiff.jpg
Abies_koreana_Prostrate_Beauty.jpg
Chamaecyparis_obtusa_Limerick.jpg
Fagus_sylvatica_Dawyck_Gold.jpg
Is this possible? Or do I have to do it manually?
For each file name, do this:
$new_filename = preg_replace("/(\w\d{0,2}[\W]{1}.+\.)/",".",$current_file_name);
so final function may look like this
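function renameFiles($directory)
{
    $pattern = "/(\w\d{0,2}[\W]{1}.+\.)/";
    $handler = opendir($directory);
    if ($handler === false) {
        return; // the directory could not be opened
    }
    // compare against false so that a file named "0" does not end the loop
    while (false !== ($file = readdir($handler))) {
        if ($file != "." && $file != "..") {
            // only rename files whose name actually matches the pattern
            if (preg_match($pattern, $file)) {
                echo $file."<br/>";
                rename($directory."/".$file, $directory."/".preg_replace($pattern, ".", $file));
            }
        }
    }
    closedir($handler);
}
renameFiles("c:/wserver");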
Updated
You can do this with PHP (or bash).
Your friends are RecursiveDirectoryIterator to walk through directories, preg_replace() to modify the file names, rename() to reflect changed filename on disk.
What you're trying to do can be done in ~10 lines of code. Using the ingredients above, you should be able to write a little script to change filenames yourself.
Update
Throwing out the numeric parts (according to the examples given) can be done with a rather simple regular expression. Note that this removes any run of digits, dashes, underscores and spaces between the [a-z] part of the filename and the suffix (".jpg"). So you won't get "foo3.png" but "foo.png". If this is a problem, the regex can be adjusted to meet that criterion…
<?php
$files = array(
'Abies_koreana_Blauer_Pfiff_05-06-10_1.jpg',
'Abies_koreana_Prostrate_Beauty_05-05-10_2.jpg',
'Chamaecyparis_obtusa_Limerick 06-10-10_3.jpg',
'Fagus_sylvatica_Dawyck_Gold_05-02-10_1.jpg',
);
foreach ($files as $source) {
// strip all numeric (date, counts, whatever)
// characters before the file's suffix
// (?= …) is a non-capturing look-ahead assertion
// see http://php.net/manual/en/regexp.reference.assertions.php for more info
$destination = preg_replace('#[ _0-9-]+(?=\.[a-z]+$)#i', '', $source);
echo "'$source' to '$destination'\n";
}
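One thing to watch for with 10,000 images: names that differ only in their numeric part (for example a _1 and a _2 shot taken on the same date) map to the same destination, and rename() would then overwrite one file with the other. A minimal sketch of a dry-run planner that detects such collisions before anything is touched; the function name plan_renames and the returned array layout are assumptions for illustration.

```php
<?php
// Hypothetical helper: decides which files can be renamed safely.
// Returns ['rename' => [source => destination, ...], 'skipped' => [source, ...]].
function plan_renames(array $names): array
{
    // Names that already exist count as taken.
    $used = array_flip($names);
    $plan = ['rename' => [], 'skipped' => []];
    foreach ($names as $source) {
        // Same pattern as above: strip digits, dashes, underscores and spaces before the suffix.
        $target = preg_replace('#[ _0-9-]+(?=\.[a-z]+$)#i', '', $source);
        if ($target === $source) {
            continue; // nothing to strip
        }
        if (isset($used[$target])) {
            $plan['skipped'][] = $source; // destination already taken
            continue;
        }
        $used[$target] = true;
        $plan['rename'][$source] = $target;
    }
    return $plan;
}

// Possible use, with a placeholder directory:
// $dir  = '/path/to/images';
// $plan = plan_renames(array_map('basename', glob($dir . '/*.jpg')));
// foreach ($plan['rename'] as $from => $to) {
//     rename("$dir/$from", "$dir/$to");
// }
```

The skipped list can then be reviewed by hand, or handled with whatever naming rule suits the collection.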