I have an S3 bucket full of images whose naming follows a simple pattern. The first 6 digits group images by listing number; the trailing digit(s) are non-sequential but follow a reliable pattern (0 thru 99). I'm capturing the six digits that start the filename in a variable $ln.
/*
https://s3.amazonaws.com/stroupenwmls2/602665_10.jpg
https://s3.amazonaws.com/stroupenwmls2/602665_12.jpg
https://s3.amazonaws.com/stroupenwmls2/602665_13.jpg
https://s3.amazonaws.com/stroupenwmls2/602665_15.jpg
*/
What I want to do is populate a 'listing' img src attribute with the url to an image, if one exists for that listing (if not, I provide a no-image.jpg). And I'm looping thru many different listings to create my web page.
I'm struggling with the logic to grab the first image that matches the $ln variable. Here is what I've tried, with no luck (just produces a 0):
$bucket = 'https://s3.amazonaws.com/stroupenwmls2/';
$ln = '602665';
$string = $bucket . $ln . '_';
// match the pattern '_xx.jpg', with 1 or 2 numbers
$image = preg_match('/^_[0-9]{1,2}\.(jpg|jpeg|png|gif)/i', $string);
Then in my web app:
<img src="<?php echo $image ?>">
I'm an idiot when it comes to using preg_match, what I really need is some sort of wildcard parameter. I'm sure I'm making this way too complicated.
The problem is that you're not matching against the image paths; you're matching against what I assume you intend to be part of your regular expression. Also note that preg_match returns 1 or 0 (match or no match), not the matched string. See below:
$bucket = 'https://s3.amazonaws.com/stroupenwmls2/';
$ln = '602665';
// preg_quote() escapes the dots and slashes in the URL so they match literally.
$re = preg_quote($bucket . $ln . '_', '/') . '[0-9]{1,2}\.(jpg|jpeg|png|gif)';
// let's say you have an array called $img_list;
// loop through each path in the list, searching strings
// that match the regular expression constructed in $re.
// if you find a match, return it.
// you'd probably want to define a function to do this for you,
// and call it with $ln and the array as parameters.
foreach ($img_list as $img) {
    // this returns either 0 or 1 depending on match.
    // return the first one, and we're done.
    if (preg_match('/^' . $re . '/i', $img)) {
        return $img;
    }
}
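As a minimal sketch of the function suggested in the comments above: the name firstListingImage and the $imgList array are hypothetical, and the sketch assumes you already have the full image URLs in an array (it does not query S3 itself). The no-image.jpg fallback is the one mentioned in the question.

```php
<?php
// Hypothetical helper: returns the first URL in $imgList that belongs to
// listing $ln, or $fallback when no image matches.
function firstListingImage(string $bucket, string $ln, array $imgList, string $fallback = 'no-image.jpg'): string
{
    // preg_quote() makes the dots and slashes of the URL match literally.
    $re = '/^' . preg_quote($bucket . $ln . '_', '/') . '[0-9]{1,2}\.(jpg|jpeg|png|gif)$/i';
    foreach ($imgList as $img) {
        if (preg_match($re, $img) === 1) {
            return $img;
        }
    }
    return $fallback;
}

// Sample data (assumed): in practice this list would come from the bucket.
$bucket = 'https://s3.amazonaws.com/stroupenwmls2/';
$imgList = [
    $bucket . '602664_3.jpg',
    $bucket . '602665_10.jpg',
    $bucket . '602665_12.jpg',
];
echo firstListingImage($bucket, '602665', $imgList), "\n";
echo firstListingImage($bucket, '999999', $imgList), "\n";
```

The return value can then be echoed straight into the img src attribute.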
I have a string with a wildcard at the end, but I don't know how many characters that string will be. How can I use GlobIterator and RegexIterator to match similar file names? The second match returns all the files from a directory, but I don't want that. I need a proper regular expression. I don't want to match the last set before the extension (ex. the files sized 250M, 500M, etc.)
$iterator = new GlobIterator($this->srcDir . $identifier . ".*");
MATCH ON
/var/www/import/2014047-0216/YukonGold.A2014047.1620.721.*
This returns the correct files.
/var/www/import/2014047-0216/YukonGold.A2014047.1620.721.250m.jpg
/var/www/import/2014047-0216/YukonGold.A2014047.1620.721.500m.jpg
MATCH ON
/var/www/import/2014047-0216/YukonGold.A2014047.1620.*
Returns the files:
/var/www/import/2014047-0216/YukonGold.A2014047.1620.250m.jpg
/var/www/import/2014047-0216/YukonGold.A2014047.1620.500m.jpg
/var/www/import/2014047-0216/YukonGold.A2014047.1620.721.250m.jpg
/var/www/import/2014047-0216/YukonGold.A2014047.1620.721.500m.jpg
EXPECTED OUTPUT
/var/www/import/2014047-0216/YukonGold.A2014047.1620.721.*
/var/www/import/2014047-0216/YukonGold.A2014047.1620.721.250m.jpg
/var/www/import/2014047-0216/YukonGold.A2014047.1620.721.500m.jpg
/var/www/import/2014047-0216/YukonGold.A2014047.1620.*
/var/www/import/2014047-0216/YukonGold.A2014047.1620.250m.jpg
/var/www/import/2014047-0216/YukonGold.A2014047.1620.500m.jpg
You should use it inside a RegexIterator:
// Notice that there is no expansion pattern used here
$path = '/var/www/import/2014047-0216/YukonGold.A2014047.1620.';
$re = '~\Q' . $path . '\E(?:[^.]+\.)?\w+$~';
$regexIterator = new RegexIterator(new GlobIterator("{$path}*"), $re);
foreach ($regexIterator as $filename) {
echo $filename . "\n";
}
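To see what the pattern accepts without touching the filesystem, here is a small sketch that applies the same regular expression to the paths from the question using preg_grep. The helper name filterByPrefix is hypothetical, a ^ anchor has been added, and the sketch assumes the prefix contains neither "~" nor "\E".

```php
<?php
// Sample paths copied from the question; nothing is read from disk.
$paths = [
    '/var/www/import/2014047-0216/YukonGold.A2014047.1620.250m.jpg',
    '/var/www/import/2014047-0216/YukonGold.A2014047.1620.500m.jpg',
    '/var/www/import/2014047-0216/YukonGold.A2014047.1620.721.250m.jpg',
    '/var/www/import/2014047-0216/YukonGold.A2014047.1620.721.500m.jpg',
];

// Hypothetical helper: keeps paths made of the literal prefix, at most one
// extra dot-separated part (e.g. "250m."), and a final word (the extension).
function filterByPrefix(string $prefix, array $paths): array
{
    $re = '~^\Q' . $prefix . '\E(?:[^.]+\.)?\w+$~';
    return array_values(preg_grep($re, $paths));
}

print_r(filterByPrefix('/var/www/import/2014047-0216/YukonGold.A2014047.1620.', $paths));
print_r(filterByPrefix('/var/www/import/2014047-0216/YukonGold.A2014047.1620.721.', $paths));
```

The first call keeps only the two 1620.*m.jpg files, because the optional group allows a single extra part before the extension.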
I need some help with refining my current search.
I have folder with images that are named as:
20171116-category_title.jpg (where first number is date yyyymmdd)
My current search looks like this:
<?php
// string to search in a filename.
if(isset($_POST['question'])){
$searchString = $_POST['question'];
}
// image files in my/dir
$imagesDir = '';
$files = glob($imagesDir . '*.{jpg,jpeg,png,gif}', GLOB_BRACE);
// array populated with files found
// containing the search string.
$filesFound = array();
// iterate through the files and determine
// if the filename contains the search string.
foreach($files as $file) {
$name = pathinfo($file, PATHINFO_FILENAME);
// determines if the search string is in the filename.
if(strpos(strtolower($name), strtolower($searchString))) {
$filesFound[] = $file;
}
}
// output the results.
echo json_encode($filesFound, JSON_UNESCAPED_UNICODE);
?>
And this works just fine but...
I would like to limit the search to the "title" part of the .jpg name, i.e. the part after the underscore "_". After that (if possible) I would like to extend the search as follows:
If the AJAX POST sends the format abc+xyz, the "+" delimiter practically means 2 queries.
The first part (abc) targets the "category", which sits between the hyphen and the underscore. The second part (xyz) is the title search from my first question, applied only to the files that already matched the category.
Your tips are more than welcome!
Thank you!
For the first part of your question, the exact pattern you use depends on the format of your category strings. If you will never have underscores _ in the category, here's one solution:
foreach($files as $file) {
// $name = "20171116-category_title"
$name = pathinfo($file, PATHINFO_FILENAME);
// $title = "title", assuming your categories will never have "_".
// The regular expression matches 8 digits, followed by a hyphen,
// followed by anything except an underscore, followed by an
// underscore, followed by anything
$title = preg_filter('/\d{8}-[^_]+_(.+)/', '$1', $name);
// Now search based on your $title, not $name
// *NOTE* this test is not safe, see update below.
if(strpos(strtolower($title), strtolower($searchString))) {
If your categories can or will have underscores, you'll need to adjust the regular expression based on some format you can be sure of.
For your 2nd question, you need to first separate your query into addressable parts. Note though that + is typically how spaces are encoded in URLs, so using it as a delimiter means you will never be able to use search terms with spaces. Maybe that's not a problem for you, but if it is you should try another delimiter, or maybe simpler would be to use separate search fields, e.g. 2 inputs on your search form.
Anyway, using +:
if(isset($_POST['question'])){
// $query will be an array with 0 => category term, and 1 => title term
$query = explode('+', $_POST['question']);
}
Now in your loop you need to identify not just the $title part of the filename, but also the $category:
$category = preg_filter('/\d{8}-([^_]+)_.+/', '$1', $name);
$title = preg_filter('/\d{8}-[^_]+_(.+)/', '$1', $name);
Once you have those, you can use them in your final test for a match:
if( strpos(strtolower($category), strtolower($query[0])) && strpos(strtolower($title), strtolower($query[1])) ) {
UPDATE
I just noticed your match test has a problem. strpos can return 0 if a match is found starting at position 0. 0 is a falsey result, which means your test will fail even though there's a match. You need to explicitly test against FALSE, as described in the docs:
if( strpos(strtolower($category), strtolower($query[0])) !== FALSE
&& strpos(strtolower($title), strtolower($query[1])) !== FALSE ) {
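Putting the pieces together, a sketch of the whole filter might look like the following. The function name searchImages and the sample filenames are made up for illustration; the sketch assumes categories never contain "_" and uses stripos, which is the case-insensitive form of strpos.

```php
<?php
// Hypothetical helper: keeps files whose category contains $categoryTerm
// and whose title contains $titleTerm (both case-insensitive).
function searchImages(array $files, string $categoryTerm, string $titleTerm): array
{
    $found = [];
    foreach ($files as $file) {
        $name = pathinfo($file, PATHINFO_FILENAME);
        // Capture category and title in one pass; skip names in another format.
        if (preg_match('/^\d{8}-([^_]+)_(.+)$/', $name, $m) !== 1) {
            continue;
        }
        // Compare with !== false so a hit at position 0 still counts.
        if (stripos($m[1], $categoryTerm) !== false && stripos($m[2], $titleTerm) !== false) {
            $found[] = $file;
        }
    }
    return $found;
}

// Made-up filenames following the yyyymmdd-category_title.jpg format.
$files = [
    '20171116-cars_red-volvo.jpg',
    '20171117-cars_blue-saab.jpg',
    '20171118-boats_red-canoe.jpg',
];
$query = explode('+', 'car+red');
print_r(searchImages($files, $query[0], $query[1]));
```

Both search terms here match at position 0 of their part, which is exactly the case the plain strpos test would miss.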
In PHP, I have two paths on a server that both have a matching part. I'd like to join them, but delete the part that is equal.
EXAMPLE:
Path #1:
/home7/username/public_html/dir/anotherdir/wp-content/uploads
Path #2:
/dir/anotherdir/wp-content/uploads/2011/09/image.jpg
You see the part /dir/anotherdir/wp-content/uploads is the same in both strings, but when I simply join them I would have some directories twice.
The output I need is this:
/home7/username/public_html/dir/anotherdir/wp-content/uploads/2011/09/image.jpg
Since the dirs can change on different servers I need a dynamic solution that detects the matching part from #2 and removes it on #1 so I can trail #2 right after #1 :)
$path1 = "/home7/username/public_html/dir/anotherdir/wp-content/uploads";
$path2 = "/dir/anotherdir/wp-content/uploads/2011/09/image.jpg";
echo $path1 . substr($path2, strpos($path2, basename($path1)) + strlen(basename($path1)));
The problem here is not that generic. Rather than looking at it as matching equal parts of two strings, look at it as matching equal directory structure.
That means you need to concentrate on the strings between the '/' separators.
So basically you need to match directory names. Moreover, in your case the last part of the first path's directory structure may coincide with the beginning (starting from the first character) of the second path.
So I suggest reading the first path from the end, one '/'-separated segment at a time, and looking for a segment that matches the first folder name in the second path. If a match occurs, the rest of the first path, from that index to its end, should appear at the start of the second path. If this condition fails, repeat the process with the next candidate index.
This code can help you:
$str1 = $argv[1];
$str2 = $argv[2];
//clean
$str1 = trim(str_replace("//", "/", $str1), "/");
$str2 = trim(str_replace("//", "/", $str2), "/");
$paths1 = explode("/", $str1);
$paths2 = explode("/", $str2);
$j = 0;
$found = false;
$output = '';
for ($i=0; $i<count($paths1); $i++) {
    $item1 = $paths1[$i];
    // $paths2 may run out before $paths1 does
    $item2 = $paths2[$j] ?? null;
    if ($item1 == $item2) {
        // use a boolean flag: storing $i here would be falsy when $i is 0
        $found = true; //first point
        $j++;
    } else if ($found) {
        //we have found a subdir so remove
        $output = "/".implode("/", array_slice($paths1, 0, $i))
            ."/".implode("/", array_slice($paths2, $j));
        $found = false;
        break;
    }
}
//final checking
if ($found) {
    $output = "/".implode("/", $paths1)
        ."/".implode("/", array_slice($paths2, $j));
}
print "FOUND?: ".(!empty($output)?$output:'No')."\n";
This detects the equal segments, cuts the first string at that point, and copies the remaining part from the second string.
This code also accepts two strings that share only "partial" segments, like:
/path1/path2/path3
/path2/other/file.png
which will output:
/path1/path2/other/file.png
It removes "path3", but with a few changes it can be made more strict.
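One possible stricter variant, sketched under the assumption that both inputs are absolute paths: only treat segments as overlapping when the tail of the first path is exactly equal to the head of the second, and otherwise concatenate them unchanged. The function name joinOverlappingPaths is hypothetical.

```php
<?php
// Hypothetical helper: joins two paths, dropping the overlap where the
// trailing segments of $path1 equal the leading segments of $path2.
function joinOverlappingPaths(string $path1, string $path2): string
{
    $a = explode('/', trim($path1, '/'));
    $b = explode('/', trim($path2, '/'));
    // Try the longest possible overlap first, then shorter ones.
    for ($len = min(count($a), count($b)); $len > 0; $len--) {
        if (array_slice($a, -$len) === array_slice($b, 0, $len)) {
            return '/' . implode('/', array_merge($a, array_slice($b, $len)));
        }
    }
    // No overlap at all: plain concatenation.
    return '/' . implode('/', array_merge($a, $b));
}

echo joinOverlappingPaths(
    '/home7/username/public_html/dir/anotherdir/wp-content/uploads',
    '/dir/anotherdir/wp-content/uploads/2011/09/image.jpg'
), "\n";
```

With this variant, /path1/path2/path3 and /path2/other/file.png have no overlap, so nothing is removed from either path.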
How about using similar_text, as described in this link? It measures the matching chars between two strings (note that it returns their count, not the matching text itself). Once you have the matching part, replace it in the first string with an empty string and append the second.
Hacking up what I thought was the second simplest type of regex (extract a matching string from some strings, and use it) in php, but regex grouping seems to be tripping me up.
Objective
take a ls of files, output the commands to format/copy the files to have the correct naming format.
Resize copies of the files to create thumbnails. (not even dealing with that step yet)
Failure
My code fails at the regex step: although I just want to filter out everything except a single regex group, the result always contains the group that I want -and- the text before it, even though I in no way requested the first backreference group.
Here is a fully functioning, runnable version of the code on the online ide:
http://ideone.com/2RiqN
And here is the code (with a cut down initial dataset, although I don't expect that to matter at all):
<?php
// Long list of image names.
$file_data = <<<HEREDOC
07184_A.jpg
Adrian-Chelsea-C08752_A.jpg
Air-Adams-Cap-Toe-Oxford-C09167_A.jpg
Air-Adams-Split-Toe-Oxford-C09161_A.jpg
Air-Adams-Venetian-C09165_A.jpg
Air-Aiden-Casual-Camp-Moc-C09347_A.jpg
C05820_A.jpg
C06588_A.jpg
Air-Aiden-Classic-Bit-C09007_A.jpg
Work-Moc-Toe-Boot-C09095_A.jpg
HEREDOC;
if($file_data){
$files = preg_split("/[\s,]+/", $file_data);
// Split up the files based on the newlines.
}
$rename_candidates = array();
$i = 0;
foreach($files as $file){
$string = $file;
$pattern = '#(\w)(\d+)_A\.jpg$#i';
// Use the second regex group for the results.
$replacement = '$2';
// This should return only group 2 (any number of digits), but instead group 1 is somehow always in there.
$new_file_part = preg_replace($pattern, $replacement, $string);
// Example good end result: <img src="images/ch/ch-07184fs.jpg" width="350" border="0">
// Save the rename results for further processing later.
$rename_candidates[$i]=array('file'=>$file, 'new_file'=>$new_file_part);
// Rename the images into a standard format.
echo "cp ".$file." ./ch/ch-".$new_file_part."fs.jpg;";
// Echo out some commands for later.
echo "<br>";
$i++;
if($i>10){break;} // Just deal with the first 10 for now.
}
?>
Intended result for the regex: 788750
Intended result for the code output (multiple lines of): cp air-something-something-C485850_A.jpg ./ch/ch-485850.jpg;
What's wrong with my regex? Suggestions for simpler matching code would be appreciated as well.
Just a guess:
$pattern = '#^.*?(\w)(\d+)_A\.jpg$#i';
This includes the whole filename in the match. Otherwise preg_replace() will really only substitute the end of each string - it only applies the $replacement expression on the part that was actually matched.
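Since only the digits are wanted, another option is to capture them with preg_match instead of rewriting the string with preg_replace. The sketch below is an alternative to the pattern above, not the same technique, and the helper name extractItemNumber is hypothetical. It drops the (\w) group, because for an all-digit name such as 07184_A.jpg that group would consume the leading 0.

```php
<?php
// Hypothetical helper: returns the digits directly before "_A.jpg",
// or null when the filename does not follow that format.
function extractItemNumber(string $file): ?string
{
    // (\d+) captures every digit before "_A.jpg", including leading zeros.
    return preg_match('/(\d+)_A\.jpg$/i', $file, $m) === 1 ? $m[1] : null;
}

foreach (['07184_A.jpg', 'Adrian-Chelsea-C08752_A.jpg', 'C05820_A.jpg'] as $file) {
    $number = extractItemNumber($file);
    if ($number !== null) {
        // Echo out the copy command in the format from the question.
        echo "cp {$file} ./ch/ch-{$number}fs.jpg;\n";
    }
}
```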
Scan Dir and Explode
You know what? A simpler way to do it in php is to use scandir and explode combo
$dir = scandir('/path/to/directory');
foreach($dir as $file)
{
$ext = pathinfo($file,PATHINFO_EXTENSION);
if($ext!='jpg') continue;
$a = explode('-',$file); //grab the end of the string after the -
$newfilename = end($a); //if there is no dash just take the whole string
$newlocation = './ch/ch-'.str_replace(array('C','_A'),'', basename($newfilename,'.jpg')).'fs.jpg';
echo "#copy($file, $newlocation)\n";
}
#and you are done :)
explode: basically a filename like blah-2.jpg is turned into an array('blah','2.jpg'); and then taking the end() of that gets the last element. It's almost the same as array_pop(), except that end() does not remove the element from the array.
Working Example
Here's my ideone code http://ideone.com/gLSxA
I have a following string and I want to extract image123.jpg.
..here_can_be_any_length "and_here_any_length/image123.jpg" and_here_also_any_length
image123 can be any length (newimage123456 etc) and with extension of jpg, jpeg, gif or png.
I assume I need to use preg_match, but I am not really sure and like to know how to code it or if there are any other ways or function I can use.
You can use:
if(preg_match('#".*?\/(.*?)"#',$str,$matches)) {
$filename = $matches[1];
}
Alternatively you can extract the entire path between the double quotes using preg_match and then extract the filename from the path using the function basename:
if(preg_match('#"(.*?)"#',$str,$matches)) {
$path = $matches[1]; // extract the entire path.
$filename = basename ($path); // extract file name from path.
}
What about something like this :
$str = '..here_can_be_any_length "and_here_any_length/image123.jpg" and_here_also_any_length';
$m = array();
if (preg_match('#".*?/([^\.]+\.(jpg|jpeg|gif|png))"#', $str, $m)) {
var_dump($m[1]);
}
Which, here, will give you :
string(12) "image123.jpg"
I suppose the pattern could be a bit simpler -- you could not check the extension, for instance, and accept any kind of file ; but not sure it would suit your needs.
Basically, here, the pattern :
starts with a "
takes any number of characters until a / : .*?/
then takes any number of characters that are not a . : [^\.]+
then checks for a dot : \.
then comes the extension -- one of those you decided to allow : (jpg|jpeg|gif|png)
and, finally, the end of pattern, another "
And the whole portion of the pattern that corresponds to the filename is surrounded by (), so it's captured and returned in $m[1]
$string = '..here_can_be_any_length "and_here_any_length/image123.jpg" and_here_also_any_length';
$data = explode('"',$string);
$basename = basename($data[1]);