Delete dir with non english characters in PHP - php

I have a function that scans directories on the server, reads the files, does something with them, and then deletes the (nested) directories.
The function is quite long, so I will post only the relevant part.
//many other things ...
$dir_to_delete[] = $filename['dirname']; // the array to hold all the dirs.
} // end for each
$dir_to_delete_clean = array_unique($dir_to_delete); //clean array - we might have duplicated dir names
foreach ($dir_to_delete_clean as $delete) {
o99_deleteDirectory($delete) ;
}
// rmdir( $filename['dirname'] );
return $attc_id;
}
This is the delete function for non-empty dirs:
function o99_deleteDirectory($dir) {
if (!file_exists($dir)) return true;
if (!is_dir($dir)) return unlink($dir);
foreach (scandir($dir) as $item) {
if ($item == '.' || $item == '..') continue;
if (!o99_deleteDirectory($dir.DIRECTORY_SEPARATOR.$item)) return false;
}
return rmdir($dir);
}
It works great.
The problem is that when I tested it with non-English characters (German, Chinese, Hebrew, Arabic, Cyrillic, or any other), the script fails and stops.
I then tried rename(), rmdir() etc. They all fail.
Is this a PHP bug?
How can I resolve the problem? I cannot even rename the directories in order to delete them later, because rename() fails as well.
Any ideas?
Edit I
I forgot to mention that this is for a WordPress plugin, but I would assume that it makes no difference.
Edit II
I am posting some languages here in case someone wants to try but does not have the right keyboard / language settings. I am not sure that cutting and pasting will give the right encoding, but you can always try.
עברית (hebrew)
中國的 (chinese traditional)
عربي (arabic)
кириллица (cyrillic)
ελληνικά (greek)
öäüìíáàóò´Ä´` (German-Italian-Spanish and other european)

Have you tried setting the locale before scanning or removing the directories?
http://www.php.net/manual/en/function.setlocale.php
I have not tried this myself, but you can give it a shot. It might help.
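A minimal sketch of that suggestion. The locale names and the o99_set_utf8_locale helper are assumptions for illustration, not part of the question, and the installed locales differ from server to server. setlocale() returns false when none of the requested locales is available, so the result is worth checking. Whether this cures the failing rmdir() depends on the platform; it mainly matters for locale-aware functions such as pathinfo() and basename(), which can mangle multibyte names under the wrong locale.

```php
<?php
// Hypothetical helper: try a few UTF-8 locales and return the one applied.
// The candidate names are assumptions; check what your server provides.
function o99_set_utf8_locale(array $candidates = ['en_US.UTF-8', 'C.UTF-8', 'en_US.utf8'])
{
    // setlocale() accepts an array of candidates and returns the first
    // locale it could set, or false if none of them is available.
    return setlocale(LC_CTYPE, $candidates);
}

$applied = o99_set_utf8_locale();
if ($applied === false) {
    echo "No UTF-8 locale available; multibyte paths may be split incorrectly.\n";
} else {
    echo "Locale set to {$applied}\n";
}

// With the locale in place, split a multibyte path (the path is made up).
$parts = pathinfo('/tmp/uploads/кириллица/file.txt');
echo $parts['dirname'], "\n";
```

Calling the helper once, before the loop that fills $dir_to_delete, would be the natural place to try it.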

Related

Excluding file extensions from my preg_match

I have a script I am using to filter some ROM and ISO files.
I have (with a lot of help) got a working script where files are filtered by filename. However, I am trying to add a section in which I can include extra ad-hoc filenames to be filtered by providing them in a local .txt file. This is working OK, but in my .txt file I am having to put the full filename (including the .zip extension). For example, my "manualregiondupes.txt" file looks like this:
Game One.zip
Game Two.zip
Game Three.zip
Whereas I want to just type them in my .txt file like so:
Game One
Game Two
Game Three
The regex I'm currently using tries to match the full filename it finds (including the .zip extension), whereas I want it to match only the part before the file extension. I have to be careful, however, as I don't want a game like:
"Summer Heat Beach Volleyball (USA)" being matched if "Beach Volleyball (USA)" is in the .txt.
The same goes for words on the other side, like
"Sensible Soccer (USA) (BETA)" being matched if "Sensible Soccer (USA)" is in the .txt.
Here is my script:
// Make sure what manualregiondupes.txt is doing
if (file_exists('manualregiondupes.txt'))
{
$manualRegionDupes = file('manualregiondupes.txt', FILE_IGNORE_NEW_LINES);
$manualRegionPattern = "/^(?:" . implode("|", array_map(function ($i)
{
return preg_quote(trim($i) , "/");
}
, $manualRegionDupes)) . ')$/';
echo "ManualRegionDupes.txt has been found, ";
if (trim(file_get_contents('manualregiondupes.txt')) == false)
{
echo "but is empty! Continuing without manual region dupes filter.\n";
}
else
{
echo "and is NOT empty! Applying the manual region dupes filter.\n";
}
}
else
{
echo "ManualRegionDupes.txt has NOT been found. Continuing without the manual region dupes filter.\n";
}
// Do this magic for every file
foreach ($gameArray as $thisGame) {
if (!$thisGame) continue;
// Probably already been removed
if (!file_exists($thisGame)) continue;
// Filenames in manualregiondupes.txt
if (file_exists('manualregiondupes.txt'))
{
if (trim(file_get_contents('manualregiondupes.txt')) == true)
{
if (preg_match($manualRegionPattern, $thisGame))
{
echo "{$thisGame} is on the manual region dupes remove list. Moved to Removed folder.\n";
shell_exec("mv \"{$thisGame}\" Removed/");
continue;
}
}
}
... SCRIPT CONTINUES HERE BUT ISN'T RELEVANT!
What's the easiest way of doing this? I think I've just asked a very long question when it's actually quite simple, but oh well. I am not very good with PHP (or any scripting, to be honest!), so apologies and thanks in advance! :D
You can use pathinfo() together with a regex, like this:
$withoutExt = preg_replace('/\.' . preg_quote(pathinfo($path, PATHINFO_EXTENSION), '/') . '$/', '', $path);
It gives you the file name without its extension, for example:
file.txt -> file
file.sometext.txt -> file.sometext
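Applied to the question, one option is to compare the name without its extension against the list entries. A minimal sketch, assuming every entry in the list is a complete name without extension; is_on_remove_list is a hypothetical helper and the sample data is taken from the question's examples. It uses an exact comparison instead of a regex, so "Summer Heat Beach Volleyball (USA)" does not match "Beach Volleyball (USA)".

```php
<?php
// Hypothetical helper: true when the file's name, minus its extension,
// exactly equals one of the entries read from manualregiondupes.txt.
function is_on_remove_list(string $file, array $entries): bool
{
    // PATHINFO_FILENAME drops the directory part and only the last extension.
    $name = pathinfo($file, PATHINFO_FILENAME);
    return in_array($name, array_map('trim', $entries), true);
}

// Stand-in for file('manualregiondupes.txt', FILE_IGNORE_NEW_LINES).
$entries = ['Beach Volleyball (USA)', 'Sensible Soccer (USA)'];

var_dump(is_on_remove_list('Beach Volleyball (USA).zip', $entries));
var_dump(is_on_remove_list('Summer Heat Beach Volleyball (USA).zip', $entries));
var_dump(is_on_remove_list('Sensible Soccer (USA) (BETA).zip', $entries));
```

The first call should report a match and the other two should not, which is the behaviour the question asks for.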

PHP hidden directories - Windows

I'm attempting to add a feature to our intranet, which will allow users to log onto the intranet, and access documents stored within a Windows network SAN.
At the moment, I've successfully retrieved all the file and folder names within a specified user's 'My Documents'.
I'm having difficulty removing hidden files and folders from the array.
At the moment, I can remove all folders and files starting with a '.'.
However, on Windows, hidden files are marked as 'hidden' in their properties. I've googled and found lots of resources about how to mark a file as hidden, and how to hide files that start with a ., but none on how to filter out hidden Windows files / folders. One post on Stack Overflow suggests using DirectoryIterator, but doesn't explain how to use it to check whether a file is marked as hidden.
We have over 1000 users, each with approximately 500MB - 1GB of documents in multiple layers of directories, so it needs to be relatively fast.
For clarification:
During a recursive iteration on a Windows system, how can I find out whether a directory is hidden or not, without relying on a prepended . symbol?
OK, so I worked it out with help from the exec() function, so use with care!
I'm using CodeIgniter, so I've modified the directory_helper.php function slightly. As it's installed on a Windows box, it will always need to check for the hidden files, but it should also work for non-CodeIgniter sites:
function directory_map($source_dir, $directory_depth = 0, $hidden = FALSE)
{
if ($fp = @opendir($source_dir))
{
if(!$hidden)
{
$exclude = array();
exec('dir "' . $source_dir . '" /ah /B', $exclude);
}
$filedata = array();
$new_depth = $directory_depth - 1;
$source_dir = rtrim($source_dir, DIRECTORY_SEPARATOR).DIRECTORY_SEPARATOR;
while (FALSE !== ($file = readdir($fp)))
{
// Remove '.', '..', and hidden files [optional]
if ( ! trim($file, '.') OR ($hidden == FALSE && $file[0] == '.') OR ($hidden === FALSE && in_array($file, $exclude)))
{
continue;
}
if (($directory_depth < 1 OR $new_depth > 0) && @is_dir($source_dir.$file))
{
$filedata[$file] = directory_map($source_dir.$file.DIRECTORY_SEPARATOR, $new_depth, $hidden);
}
else
{
$filedata[] = $file;
}
}
closedir($fp);
return $filedata;
}
return FALSE;
}
This scanned 2207 files and 446 folders in approx. 11 seconds (ages, I know, but the best I could do). I also tested it on 500 folders and 200 files, and it took around 3 seconds.
It's a recursive function which scans each non-hidden directory. The first thing it does is list all hidden files and folders in the current directory using exec('dir *directory* /ah /B').
It then stores the results in an array and makes sure that the current file/directory being read isn't in that array.
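Since in_array() walks the whole list for every entry, one possible small speed-up is to turn the `dir /ah /B` output into a keyed lookup once per directory. A minimal sketch: hidden_lookup and is_hidden_entry are hypothetical helper names, the sample lines stand in for real exec() output, and the case-insensitive comparison is an assumption based on Windows file names being case-insensitive.

```php
<?php
// Hypothetical helper: build a set (name => true) from the lines that
// exec('dir "<path>" /ah /B', $lines) would fill in.
function hidden_lookup(array $lines): array
{
    $set = [];
    foreach ($lines as $line) {
        $name = trim($line);
        if ($name !== '') {
            // Lower-case keys, assuming case-insensitive Windows file names.
            $set[strtolower($name)] = true;
        }
    }
    return $set;
}

// Hypothetical helper: keyed lookup instead of scanning with in_array().
function is_hidden_entry(string $file, array $set): bool
{
    return isset($set[strtolower($file)]);
}

// Sample lines standing in for real command output.
$set = hidden_lookup(['desktop.ini', 'Thumbs.db', '']);
var_dump(is_hidden_entry('Desktop.ini', $set));
var_dump(is_hidden_entry('report.docx', $set));
```

This only changes the lookup. The exec() call per directory is untouched and may well remain the larger cost.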

how to scan all usages of a custom function in all my php files?

I have created my own l($text) function in PHP for a multilingual website. I use it like this in my documents:
echo '<h1>' . l('Title of the page') . '</h1>';
echo '<p>' . l('Some text here...') . '</p>';
My question is: with a PHP script, how can I scan all my .php files to catch every usage of this function and store all the arguments used in a MySQL table?
The goal, of course, is to not forget any sentences in my translation files.
I didn't find anything on Google or here, so please tell me if you have any ideas or need more information.
Could you:
read all *.php files with glob(),
then use a regex to pull the strings out (preg_match()),
then store the strings with a simple MySQL insert?
Seems simple enough?
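Those steps could be sketched roughly as follows. This is an illustration only: extract_l_strings is a hypothetical helper, the pages/ folder name is an assumption, and the regex handles only single-quoted arguments without escaped quotes. It uses preg_match_all() rather than preg_match() so that several calls on one line are all caught. The MySQL insert is left out.

```php
<?php
// Hypothetical helper: collect the single-quoted arguments of l('...') calls.
function extract_l_strings(string $source): array
{
    // (?<![\w$>]) rejects names that merely end in "l", such as url( or html(.
    // [^']* stops at the first closing quote (escaped quotes are not handled).
    preg_match_all("/(?<![\\w\$>])l\\('([^']*)'\\)/", $source, $matches);
    return array_values(array_unique($matches[1]));
}

$sentences = [];
// 'pages/' is an assumed folder; "?: []" covers glob() returning false.
foreach (glob('pages/*.php') ?: [] as $file) {
    $sentences = array_merge($sentences, extract_l_strings(file_get_contents($file)));
}
$sentences = array_values(array_unique($sentences));

// Quick check on an inline sample instead of real files.
$sample = "echo '<h1>' . l('Title') . '</h1>'; \$x.html('bla'); url('image.jpg');";
print_r(extract_l_strings($sample));
```

On the inline sample only 'Title' should be extracted, since html( and url( are preceded by a letter.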
I just finished, your help was useful! :-)
Here is my ugly code for anyone who is interested. It's not beautifully coded, but it's not made to be loaded 10000 times per day, so...
<?php
// send plain text so the output is easy to read while testing
header('Content-Type: text/plain; charset=UTF-8');
$dossier = 'pages/'; // folder to scan
$array_exclude = array('.', '..', '.DS_Store'); // system files to exclude
$array_sentences_list = array();
if(is_dir($dossier)) // verify if is a folder
{
if($dh = opendir($dossier)) // open folder
{
while(($file = readdir($dh)) !== false) // scan all files in the folder
{
if(!in_array($file, $array_exclude)) // exclude system files previously listed in array
{
echo "\n".'######## ' . strtoupper($file) . ' ##########'."\n";
$file1 = file($dossier.$file); // read the current file into an array of lines
foreach($file1 AS $fileline)
{
// regex : not start with a to z characters or a (
// then catch sentences into l(' and ')
// and put results in a $matchs array
preg_match_all("#[^a-z\(]l\('(.+)'\)#U", $fileline, $matchs);
// fetch the associative array
foreach($matchs AS $match_this)
{
foreach($match_this AS $line)
{
// technique of "I do not want to break my head"
if(substr($line, 0, 3) != "l('" AND substr($line, 0, 4) != " l('" AND substr($line, 0, 4) != ".l('")
{
// check if the sentence is not already listed
if(!in_array($line, $array_sentences_list))
{
// if not, add it to the sentences list array and write it for fun !
$array_sentences_list[] = $line;
echo $line . "\n";
}
}
}
}
}
}
}
closedir($dh);
}
}
?>
A small clarification: I have to exclude various cases such as:
-> CSS : background: url('image.jpg');
and
-> jQuery : $(this).html('bla bla');
which is why the regex starts with [^a-z(] :-)
It works very well now! I just have to finish later by recording the entries in a MySQL table and making sure that I can rerun the script from time to time when there are changes on the site (keep the existing translations, overwrite the existing files, etc.). No problem with that.
Thanks again, this website is really helpful! :-)

Renaming Alphanumeric images with PHP

I have almost 10,000 images in a folder, with image names like
Abies_koreana_Blauer_Pfiff_05-06-10_1.jpg
Abies_koreana_Prostrate_Beauty_05-05-10_2.jpg
Chamaecyparis_obtusa_Limerick 06-10-10_3.jpg
Fagus_sylvatica_Dawyck_Gold_05-02-10_1.jpg
What I want to do is rename the images using PHP so that only the letters remain in the image name and the numeric part is deleted. For example, the above images would become
Abies_koreana_Blauer_Pfiff.jpg
Abies_koreana_Prostrate_Beauty.jpg
Chamaecyparis_obtusa_Limerick.jpg
Fagus_sylvatica_Dawyck_Gold.jpg
Is this possible? Or do I have to do it manually?
For each file name, do this:
$new_filename = preg_replace("/(\w\d{0,2}[\W]{1}.+\.)/",".",$current_file_name);
So the final function may look like this:
function renameFiles($directory)
{
$handler = opendir($directory);
while ($file = readdir($handler)) {
if ($file != "." && $file != "..") {
if(preg_match("/(\w\d{0,2}[\W]{1}.+\.)/",$file)) {
echo $file."<br/>";
}
rename($directory."/".$file,$directory."/".preg_replace("/(\w\d{0,2}[\W]{1}.+\.)/",".",$file));
}
}
closedir($handler);
}
renameFiles("c:/wserver");
Updated
You can do this with PHP (or bash).
Your friends are RecursiveDirectoryIterator to walk through directories, preg_replace() to modify the file names, rename() to reflect changed filename on disk.
What you're trying to do can be done in ~10 lines of code. Using the ingredients above, you should be able to write a little script to change filenames yourself.
Update
Throwing out the numeric parts (according to the examples given) can be done with a rather simple regular expression. Note that this will remove any run of digits, spaces, underscores and hyphens between the [a-z] filename and the suffix (".jpg"). So "foo3.png" becomes "foo.png". If this is a problem, the regex can be adjusted to meet that criterion…
<?php
$files = array(
'Abies_koreana_Blauer_Pfiff_05-06-10_1.jpg',
'Abies_koreana_Prostrate_Beauty_05-05-10_2.jpg',
'Chamaecyparis_obtusa_Limerick 06-10-10_3.jpg',
'Fagus_sylvatica_Dawyck_Gold_05-02-10_1.jpg',
);
foreach ($files as $source) {
// strip all numeric (date, counts, whatever)
// characters before the file's suffix
// (?= …) is a non-capturing look-ahead assertion
// see http://php.net/manual/en/regexp.reference.assertions.php for more info
$destination = preg_replace('#[ _0-9-]+(?=\.[a-z]+$)#i', '', $source);
echo "'$source' to '$destination'\n";
}
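To apply this to real files, the same expression can be combined with rename(). A minimal sketch, assuming all images sit in a single folder; strip_numeric_suffix and rename_images are hypothetical helper names. The collision check is an addition that the answers here do not cover: two sources ending in _1.jpg and _2.jpg can map to the same target, and this sketch simply skips the second one.

```php
<?php
// Hypothetical helper wrapping the regex from the answer above.
function strip_numeric_suffix(string $name): string
{
    return preg_replace('#[ _0-9-]+(?=\.[a-z]+$)#i', '', $name);
}

// Hypothetical helper: rename every file in $dir, skipping names that
// would not change or whose target already exists.
function rename_images(string $dir): array
{
    $renamed = [];
    // scandir() returns the full listing up front, so renaming while
    // looping does not disturb the iteration.
    foreach (scandir($dir) as $old) {
        $source = $dir . DIRECTORY_SEPARATOR . $old;
        if (!is_file($source)) {
            continue;
        }
        $new = strip_numeric_suffix($old);
        $target = $dir . DIRECTORY_SEPARATOR . $new;
        if ($new === $old || file_exists($target)) {
            continue;
        }
        if (rename($source, $target)) {
            $renamed[$old] = $new;
        }
    }
    return $renamed;
}

// Dry run on a name from the question; nothing on disk is touched here.
echo strip_numeric_suffix('Chamaecyparis_obtusa_Limerick 06-10-10_3.jpg'), "\n";
```

rename_images() returns an old-name to new-name map, which is handy for logging before running it on all 10,000 files.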

PHP regular expression to match a filepath

Can someone please help me with this preg_match
if (preg_match('~[^A-Za-z0-9_\./\]~', $filepath))
// Show Error message.
I need to match a possible filepath. So I need to check for double slashes, etc. Valid file path strings should look like this only:
mydir/aFile.php
or
mydir/another_dir/anyfile.js
So a slash at the beginning of this string should be checked also. Please help.
Thanks :)
EDIT:
Also, guys, this path is being read from within a text file. It is not a filepath on the system. So hopefully it should be able to support all systems in this case.
RE-EDIT:
Sorry, but the string can also look like this:
myfile.php, or myfile.js, or myfile.anything
How do I allow strings like this as well? I apologize for not being specific about this before.
Please note that there are many types of possible file paths.
For example:
"./"
"../"
"........" (yes this can be a file's name)
"file/file.txt"
"file/file"
"file.txt"
"file/.././/file/file/file"
"/file/.././/file/file/.file" (UNIX)
"C:\Windows\" (Windows)
"C:\Windows\asd/asd" (Windows, php accepts this)
"file/.././/file/file/file!##$"
"file/.././/file/file/file!##.php.php.php.pdf.php"
All these file paths are valid. I can't think of a simple regex that handles them all perfectly.
Let's assume it's just a UNIX path for now. This is what I think should work for most cases:
preg_match('/^[^*?"<>|:]*$/',$path)
It checks that the string contains none of *, ?, ", <, >, | and : (remove the colon from the class for Windows). These are all characters that Windows does not allow in a file name, along with / and \.
If it's Windows, you should replace the path's \ with /, then explode it and check whether it's absolute. Here is one example that works on both UNIX and Windows.
function is_filepath($path)
{
$path = trim($path);
if(preg_match('/^[^*?"<>|:]*$/',$path)) return true; // good to go
if(!defined('WINDOWS_SERVER'))
{
$tmp = dirname(__FILE__);
if (strpos($tmp, '/', 0)!==false) define('WINDOWS_SERVER', false);
else define('WINDOWS_SERVER', true);
}
/*first, we need to check if the system is windows*/
if(WINDOWS_SERVER)
{
if(strpos($path, ":") == 1 && preg_match('/[a-zA-Z]/', $path[0])) // check if it's something like C:\
{
$tmp = substr($path,2);
$bool = preg_match('/^[^*?"<>|:]*$/',$tmp);
return ($bool == 1); // so that it will return only true and false
}
return false;
}
//else // else is not needed
return false; // that t
}
You can do:
if(preg_match('#^(\w+/){1,2}\w+\.\w+$#',$path)) {
// valid path.
}else{
// invalid path
}
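The pattern above requires one or two directory levels, so a bare myfile.php from the re-edit would be rejected. A possible variant, assuming names consist only of letters, digits and underscores and that a single extension is always present; is_relative_filepath is a hypothetical helper name:

```php
<?php
// Hypothetical helper: accepts "file.ext" or "dir/.../file.ext" and
// rejects leading slashes and doubled slashes.
function is_relative_filepath(string $path): bool
{
    // (?:\w+/)* : zero or more directory segments, each followed by one slash
    // \w+\.\w+  : file name, a dot, and the extension
    // \z        : true end of string (unlike $, no trailing newline allowed)
    return preg_match('#^(?:\w+/)*\w+\.\w+\z#', $path) === 1;
}

foreach (['mydir/aFile.php', 'myfile.js', '/mydir/aFile.php', 'mydir//aFile.php'] as $p) {
    echo $p, ' => ', var_export(is_relative_filepath($p), true), "\n";
}
```

Names with several dots, such as jquery.min.js, are rejected by this variant; replacing \w+\.\w+ with \w+(?:\.\w+)+ would be one way to allow them.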
