In a directory I have filenames like 123X1.jpg, 23X1.jpg, 23X2.jpg, 4123X1.jpg.
I need the glob pattern to only get listed files starting with a required string.
For example:
'23X' -> 23X1.jpg, 23X2.jpg
'123X' -> 123X1.jpg
The last part of the pattern is always an X. The first part is a number.
It's trivial with glob():
print_r(glob('/path/to/23X*.jpg'));
print_r(glob('/path/to/123X*.jpg'));
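You can also try RegexIterator, building the pattern from the required prefix:
$fi = new FilesystemIterator(__DIR__, FilesystemIterator::SKIP_DOTS);
// Match the prefix right after the last directory separator: 23X1.jpg matches, 123X1.jpg does not
$regex = new RegexIterator($fi, '~[/\\\\]' . preg_quote('23X', '~') . '[a-z\d]*\.jpg$~i');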
foreach($regex as $file) {
echo (string) $file, PHP_EOL;
}
I have a path "../uploads/e2c_name_icon/" and I need to extract e2c_name_icon from the path.
What I tried is using str_replace function
$msg = str_replace("../uploads/","","../uploads/e2c_name_icon/");
This results in the output "e2c_name_icon/", so I strip the remaining slash:
$msg = str_replace("/", "", "e2c_name_icon/");
Is there a better way to do this? I am looking for an alternative, for example a regular expression.
Try this. Outputs: e2c_name_icon
<?php
$path = "../uploads/e2c_name_icon/";
// Outputs: 'e2c_name_icon'
echo explode('/', $path)[2];
However, this is technically the third component of the path, the ../ being the first. If you always need to get the third index, then this should work. Otherwise, you'll need to resolve the relative path first.
Use basename function provided by PHP.
$var = "../uploads/e2c_name_icon/";
echo basename( $var ); // prints e2c_name_icon
If you strictly want to get the part of the path that follows '../uploads/',
then you could use this:
$url = '../uploads/e2c_name_icon/';
$regex = '/\.\.\/uploads\/(\w+)/';
preg_match($regex, $url, $m);
print_r($m); // $m[1] holds the captured name if the pattern matched
You can trim after the str_replace.
echo $msg = trim(str_replace("../uploads/","","../uploads/e2c_name_icon/"), "/");
I don't think you need to use regex for this. Simple string functions are usually faster
You could also use strrpos to find the second last /, then trim off both /.
$path = "../uploads/e2c_name_icon/";
echo $msg = trim(substr($path, strrpos($path, "/",-2)),"/");
I added -2 in strrpos to skip the last /. That means it returns the position of the / after uploads.
So substr will return /e2c_name_icon/ and trim will remove both /.
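To see what each call contributes, the one-liner can be unpacked into separate steps. This is only an illustrative sketch; the variable names $pos and $tail are introduced here and are not part of the original answer.

```php
<?php
$path = "../uploads/e2c_name_icon/";

// A negative offset of -2 keeps strrpos() from looking at the final character,
// so the trailing "/" is ignored and the "/" after "uploads" is found instead.
$pos = strrpos($path, "/", -2);   // 10

// Everything from that slash to the end of the string.
$tail = substr($path, $pos);      // "/e2c_name_icon/"

// Strip the slash on both sides.
echo trim($tail, "/"), "\n";      // e2c_name_icon
```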
You'd be much better off using the native PHP path functions vs trying to parse it yourself.
For example:
$path = "../uploads/e2c_name_icon/";
$msg = basename(realpath($path)); // e2c_name_icon (note: realpath() returns false if the path does not exist)
I have a string with a wildcard at the end, but I don't know how many characters that string will be. How can I use GlobIterator and RegexIterator to match similar file names? The second match returns all the files from a directory, but I don't want that. I need a proper regular expression. I don't want to match the last set before the extension (ex. the files sized 250M, 500M, etc.)
$iterator = new GlobIterator($this->srcDir . $identifier . ".*");
MATCH ON
/var/www/import/2014047-0216/YukonGold.A2014047.1620.721.*
This returns the correct files.
/var/www/import/2014047-0216/YukonGold.A2014047.1620.721.250m.jpg
/var/www/import/2014047-0216/YukonGold.A2014047.1620.721.500m.jpg
MATCH ON
/var/www/import/2014047-0216/YukonGold.A2014047.1620.*
Returns the files:
/var/www/import/2014047-0216/YukonGold.A2014047.1620.250m.jpg
/var/www/import/2014047-0216/YukonGold.A2014047.1620.500m.jpg
/var/www/import/2014047-0216/YukonGold.A2014047.1620.721.250m.jpg
/var/www/import/2014047-0216/YukonGold.A2014047.1620.721.500m.jpg
EXPECTED OUTPUT
/var/www/import/2014047-0216/YukonGold.A2014047.1620.721.*
/var/www/import/2014047-0216/YukonGold.A2014047.1620.721.250m.jpg
/var/www/import/2014047-0216/YukonGold.A2014047.1620.721.500m.jpg
/var/www/import/2014047-0216/YukonGold.A2014047.1620.*
/var/www/import/2014047-0216/YukonGold.A2014047.1620.250m.jpg
/var/www/import/2014047-0216/YukonGold.A2014047.1620.500m.jpg
You should use it inside a RegexIterator:
// Notice that there is no expansion pattern used here
$path = '/var/www/import/2014047-0216/YukonGold.A2014047.1620.';
$re = '~\Q' . $path . '\E(?:[^.]+\.)?\w+$~';
$regexIterator = new RegexIterator(new GlobIterator("{$path}*"), $re);
foreach ($regexIterator as $filename) {
echo $filename . "\n";
}
If I have a directory where the list of files looks like this:
Something_2015020820.txt
something_5294032944.txt.a
something_2015324234.txt
Something_2014435353.txt.a
and I want to get the list of files sorted by date, oldest first (by file date, not by filename), and the result should only include names that match something_xxxxxxx.txt. So anything that ends with ".a" is not included. In this case it should return
Something_2015020820.txt
something_2015324234.txt
I did some searching and it seems like I can use glob
$listOfFiles = glob($this->directory . DIRECTORY_SEPARATOR . $this->pattern . "*");
but I'm not sure about the pattern.
It would be awesome if you could provide both a case-sensitive and a case-insensitive pattern. The names to match always have the form something_number.txt
It's a bit tricky using glob. It does support some limited pattern matching. For example you can do
glob("*.[tT][xX][tT]");
This will match file.txt and file.TXT, but it will also match file.tXt and file.TXt. With plain glob patterns there is no way to specify something like file.(txt or TXT); where the GLOB_BRACE flag is available (it is not supported on every platform), glob("*.{txt,TXT}", GLOB_BRACE) does exactly that. If neither option fits, use glob to narrow the results down first and then filter them further, for example with array_filter and a regular expression.
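The narrow-then-filter idea could look roughly like the sketch below. The helper name filterSomethingFiles is made up for illustration, and the pattern assumes names of the form something_<digits>.txt as described in the question. It works on a plain array of names, so it can be fed the result of glob() or scandir().

```php
<?php
// Hypothetical helper: keep only names of the form something_<digits>.txt.
// $caseInsensitive toggles the "i" modifier on the pattern.
function filterSomethingFiles(array $names, bool $caseInsensitive = true): array
{
    $pattern = '/^something_\d+\.txt$/' . ($caseInsensitive ? 'i' : '');
    // Compare against the bare file name, so full paths from glob() work too.
    $baseNames = array_map('basename', $names);
    // preg_grep() preserves keys, so re-index for a clean list.
    return array_values(preg_grep($pattern, $baseNames));
}

$names = [
    'Something_2015020820.txt',
    'something_5294032944.txt.a',
    'something_2015324234.txt',
    'Something_2014435353.txt.a',
];
print_r(filterSomethingFiles($names));        // both .txt files
print_r(filterSomethingFiles($names, false)); // only something_2015324234.txt
```

Anchoring the pattern with $ is what excludes the names ending in ".a".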
A better option might be to use PHP's Iterator classes, so you can specify much more advanced rules.
$directoryIterator = new RecursiveDirectoryIterator($directory);
$iteratorIterator = new RecursiveIteratorIterator($directoryIterator);
$fileList = new RegexIterator($iteratorIterator, '/^.*\.(txt|TXT)$/');
foreach($fileList as $file) {
echo $file;
}
You can use the scandir to get all your files. Then preg_match to filter those files down to ones that don't end with .a. Then filemtime can be used to pull the time the file was last modified. array_multisort can then be used to sort by the time the file was last modified and maintain the key/filename association.
This works as I expected:
date_default_timezone_set('America/New_York'); // change to your timezone
$directory = 'Your Directory/'; // note the trailing slash
$files = scandir($directory);
$time = array();
foreach($files as $file) {
    // skip directories and anything ending in .a
    if(!preg_match('~\.a$~', $file) && !is_dir($directory . $file)) {
        $time[$file] = filemtime($directory . $file);
    }
}
array_multisort($time);
Then to handle the outputting you could do something like:
foreach($time as $file => $da_time){
echo $file . ' '. date('m/d/Y H:i:s', $da_time) . "\n";
}
The preg_match can have i added after the closing ~ delimiter so the regex is case insensitive; or you could just change the a to [aA].
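The steps above can also be collected into one function. This is a sketch, not the answer's exact code: the name filesOldestFirst is hypothetical, the pattern is passed in by the caller, and asort() is swapped in for array_multisort() because it sorts by value while always keeping the filename keys.

```php
<?php
// Hypothetical helper: list regular files in $dir whose names match $pattern,
// ordered from oldest to newest modification time.
function filesOldestFirst(string $dir, string $pattern): array
{
    $times = [];
    foreach (scandir($dir) as $name) {
        $path = $dir . DIRECTORY_SEPARATOR . $name;
        // is_file() also skips "." and ".." as well as subdirectories.
        if (is_file($path) && preg_match($pattern, $name)) {
            $times[$name] = filemtime($path);
        }
    }
    asort($times); // sort by timestamp, keep the filename keys
    // Purely numeric names become integer keys, so cast back to strings.
    return array_map('strval', array_keys($times));
}
```

A call such as filesOldestFirst('/path/to/dir', '/^something_\d+\.txt$/i') would return the matching names case-insensitively; dropping the i modifier makes the match case sensitive.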
In PHP, I have two paths on a server that both have a matching part. I'd like to join them, but delete the part that is equal.
EXAMPLE:
Path #1:
/home7/username/public_html/dir/anotherdir/wp-content/uploads
Path #2:
/dir/anotherdir/wp-content/uploads/2011/09/image.jpg
You see the part /dir/anotherdir/wp-content/uploads is the same in both strings, but when I simply join them I would have some directories twice.
The output I need is this:
/home7/username/public_html/dir/anotherdir/wp-content/uploads/2011/09/image.jpg
Since the dirs can change on different servers I need a dynamic solution that detects the matching part from #2 and removes it on #1 so I can trail #2 right after #1 :)
$path1 = "/home7/username/public_html/dir/anotherdir/wp-content/uploads";
$path2 = "/dir/anotherdir/wp-content/uploads/2011/09/image.jpg";
echo $path1 . substr($path2, strpos($path2, basename($path1)) + strlen(basename($path1)));
The problem is more specific than generic string matching. Rather than looking for equal parts of two strings, look for an equal directory structure.
That means comparing the pieces between the '/' separators.
So basically you need to match directory names. In your case, the last directories of the first path may be the same as the first directories of the second path.
So I suggest reading the first path from its end, one '/' at a time, until you find a directory name equal to the first directory name of the second path. If there is a match, the rest of the first path from that point on must also appear at the start of the second path. If that check fails, repeat the search with the next candidate position in the first path.
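One possible reading of this idea is sketched below. It is a variant rather than a literal transcription: instead of scanning backwards directory by directory, it tries the longest possible overlap first and then shorter ones. The function name joinOverlappingPaths is hypothetical, and the sketch assumes both inputs are non-empty, absolute, '/'-separated paths.

```php
<?php
// Hypothetical helper: join $base and $tail, dropping the longest run of
// directory names that ends $base and also starts $tail.
function joinOverlappingPaths(string $base, string $tail): string
{
    $a = explode('/', trim($base, '/'));
    $b = explode('/', trim($tail, '/'));

    // Try the longest possible overlap first, then shorter ones.
    for ($n = min(count($a), count($b)); $n > 0; $n--) {
        if (array_slice($a, -$n) === array_slice($b, 0, $n)) {
            $b = array_slice($b, $n); // drop the duplicated directories
            break;
        }
    }

    // With no overlap at all, the two paths are simply concatenated.
    return '/' . implode('/', array_merge($a, $b));
}

echo joinOverlappingPaths(
    '/home7/username/public_html/dir/anotherdir/wp-content/uploads',
    '/dir/anotherdir/wp-content/uploads/2011/09/image.jpg'
), "\n";
// /home7/username/public_html/dir/anotherdir/wp-content/uploads/2011/09/image.jpg
```

Comparing whole directory names rather than raw characters avoids false matches such as "uploads" against "my-uploads".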
This code can help you:
$str1 = $argv[1];
$str2 = $argv[2];
//clean
$str1 = trim(str_replace("//", "/", $str1), "/");
$str2 = trim(str_replace("//", "/", $str2), "/");
$paths1 = explode("/", $str1);
$paths2 = explode("/", $str2);
$j = 0;
$found = false;
$output = '';
for ($i=0; $i<count($paths1); $i++) {
$item1 = $paths1[$i];
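$item2 = isset($paths2[$j]) ? $paths2[$j] : null; // $paths2 may already be exhausted
if ($item1 === $item2) {
if (!$found)
$found = true; //first matching directory (a flag, so a match at index 0 also counts)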
$j++;
} else if ($found) {
//we have found a subdir so remove
$output = "/".implode("/", array_slice($paths1, 0, $i))
."/".implode("/", array_slice($paths2, $j));
$found = false;
break;
}
}
//final checking
if ($found) {
$output = "/".implode("/", $paths1)
."/".implode("/", array_slice($paths2, $j));
}
print "FOUND?: ".(!empty($output)?$output:'No')."\n";
This detects the matching directories, cuts the first path at that point, and appends the remainder of the second path.
The code also accepts two paths that only share a "partial" run of directories, like:
/path1/path2/path3
/path2/other/file.png
will output:
/path1/path2/other/file.png
That drops "path3"; with a few changes the check can be made stricter.
How about using similar_text()? Note that it returns the number of matching characters rather than the matching text itself, so you would still need to locate the common part, remove it from the first string, and append the second.
Hacking up what I thought was the second simplest type of regex (extract a matching string from some strings, and use it) in php, but regex grouping seems to be tripping me up.
Objective
take a ls of files, output the commands to format/copy the files to have the correct naming format.
Resize copies of the files to create thumbnails. (not even dealing with that step yet)
Failure
My code fails at the regex step, because although I just want to filter out everything except a single regex group, when I get the results, it's always returning the group that I want -and- the group before it, even though I in no way requested the first backtrace group.
Here is a fully functioning, runnable version of the code on the online ide:
http://ideone.com/2RiqN
And here is the code (with a cut down initial dataset, although I don't expect that to matter at all):
<?php
// Long list of image names.
$file_data = <<<HEREDOC
07184_A.jpg
Adrian-Chelsea-C08752_A.jpg
Air-Adams-Cap-Toe-Oxford-C09167_A.jpg
Air-Adams-Split-Toe-Oxford-C09161_A.jpg
Air-Adams-Venetian-C09165_A.jpg
Air-Aiden-Casual-Camp-Moc-C09347_A.jpg
C05820_A.jpg
C06588_A.jpg
Air-Aiden-Classic-Bit-C09007_A.jpg
Work-Moc-Toe-Boot-C09095_A.jpg
HEREDOC;
if($file_data){
$files = preg_split("/[\s,]+/", $file_data);
// Split up the files based on the newlines.
}
$rename_candidates = array();
$i = 0;
foreach($files as $file){
$string = $file;
$pattern = '#(\w)(\d+)_A\.jpg$#i';
// Use the second regex group for the results.
$replacement = '$2';
// This should return only group 2 (any number of digits), but instead group 1 is somehow always in there.
$new_file_part = preg_replace($pattern, $replacement, $string);
// Example good end result: <img src="images/ch/ch-07184fs.jpg" width="350" border="0">
// Save the rename results for further processing later.
$rename_candidates[$i]=array('file'=>$file, 'new_file'=>$new_file_part);
// Rename the images into a standard format.
echo "cp ".$file." ./ch/ch-".$new_file_part."fs.jpg;";
// Echo out some commands for later.
echo "<br>";
$i++;
if($i>10){break;} // Just deal with the first 10 for now.
}
?>
Intended result for the regex: 788750
Intended result for the code output (multiple lines of): cp air-something-something-C485850_A.jpg ./ch/ch-485850.jpg;
What's wrong with my regex? Suggestions for simpler matching code would be appreciated as well.
Just a guess:
$pattern = '#^.*?(\w)(\d+)_A\.jpg$#i';
This includes the whole filename in the match. Otherwise preg_replace() will really only substitute the end of each string - it only applies the $replacement expression on the part that was actually matched.
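As a simpler alternative, preg_match() can return the capture group directly, so nothing outside the match has to be replaced away. The sketch below is an assumption-laden illustration: the function name extractImageNumber is hypothetical, and the pattern uses a single (\d+) group because with a leading (\w) a name such as 07184_A.jpg would lose its first digit to that group.

```php
<?php
// Hypothetical helper: return the digits that precede "_A.jpg", or null.
function extractImageNumber(string $file): ?string
{
    // (\d+) is the only capture group; preg_match() stores it in $m[1].
    if (preg_match('#(\d+)_A\.jpg$#i', $file, $m)) {
        return $m[1];
    }
    return null;
}

foreach (['07184_A.jpg', 'Adrian-Chelsea-C08752_A.jpg'] as $file) {
    $number = extractImageNumber($file);
    echo "cp $file ./ch/ch-{$number}fs.jpg;\n";
}
// cp 07184_A.jpg ./ch/ch-07184fs.jpg;
// cp Adrian-Chelsea-C08752_A.jpg ./ch/ch-08752fs.jpg;
```

Returning null for non-matching names lets the caller skip files that do not follow the naming scheme.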
Scan Dir and Explode
You know what? A simpler way to do it in PHP is to use a scandir and explode combo:
$dir = scandir('/path/to/directory');
foreach($dir as $file)
{
$ext = pathinfo($file,PATHINFO_EXTENSION);
if($ext!='jpg') continue;
$a = explode('-',$file); //grab the end of the string after the -
$newfilename = end($a); //if there is no dash just take the whole string
$newlocation = './ch/ch-'.str_replace(array('C','_A'),'', basename($newfilename,'.jpg')).'fs.jpg';
echo "#copy($file, $newlocation)\n";
}
#and you are done :)
explode: basically a filename like blah-2.jpg is turned into array('blah', '2.jpg'), and taking the end() of that gets the last element. It's almost the same as array_pop(), except that end() does not remove the element from the array.
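The explode()/end() steps can be tried without touching the filesystem. In this sketch the function name numberFromName is hypothetical; it only repeats the string handling from the loop above on sample names taken from the question.

```php
<?php
// Hypothetical helper mirroring the explode()/end() steps above.
function numberFromName(string $file): string
{
    $parts = explode('-', $file);  // e.g. ['Air', 'Adams', 'Venetian', 'C09165_A.jpg']
    $last = end($parts);           // the whole name when there is no dash
    // Drop the extension, then the "C" prefix and the "_A" suffix.
    return str_replace(['C', '_A'], '', basename($last, '.jpg'));
}

echo numberFromName('Air-Adams-Venetian-C09165_A.jpg'), "\n"; // 09165
echo numberFromName('07184_A.jpg'), "\n";                     // 07184
```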
Working Example
Here's my ideone code: http://ideone.com/gLSxA