Regular Expression String Break - php

I am fairly new to regex. I have been trying to split a string to get its initial part so that I can create folders.
Here are a few examples of the strings that I need to split.
test1-792X612.jpg
test-with-multiple-hyphens-612X792.jpg
Is there a way using regular expression that I can get test1 and test-with-multiple-hyphens?

You can use a regex like this:
(.*?)-\d+x\d+
Working demo
The idea is that the pattern will match the string with the -NumXNum but capture the previous content. Note the case insensitive flag.
MATCH 1
1. [0-5] `test1`
MATCH 2
1. [18-44] `test-with-multiple-hyphens`
If you don't want to use the insensitive flag, you could change the regex to:
(.*?)-\d+[Xx]\d+
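As a minimal sketch of how this pattern might be used from PHP, the snippet below wraps it in / delimiters and adds the i modifier for case-insensitive matching. The variable names $files and $folders are illustrative assumptions, not part of the question.

```php
<?php
// Sample filenames taken from the question.
$files = [
    'test1-792X612.jpg',
    'test-with-multiple-hyphens-612X792.jpg',
];

$folders = [];
foreach ($files as $file) {
    // Group 1 lazily captures everything before the "-<digits>x<digits>" suffix.
    // The "i" modifier lets the lowercase "x" in the pattern match "X" as well.
    if (preg_match('/(.*?)-\d+x\d+/i', $file, $m)) {
        $folders[] = $m[1];
    }
}

print_r($folders);
```

Because preg_match() returns the leftmost match, the capture starts at the beginning of the filename even though the pattern has no ^ anchor.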

If you're certain that all filenames end with 000X000 (where each 0 stands for any digit), this should work:
/^(.*)-[0-9]{3}X[0-9]{3}\.jpg$/
The value from (.*) will contain the part that you're looking for.
In case there could be more or fewer numbers, but at least one:
/^(.*)-[0-9]+X[0-9]+\.jpg$/

You can use this simple regex:
(.+)(?=-.+$)
Explanations:
(.+) : Capture desired part
(?=-.+$) : (Positive lookahead) The captured part must be followed by a hyphen and the rest of the string
Live demo

If I understood your question correctly, you want to break the hyphenated parts of a file into directories. The expression (.*?)-([^-]+\.jpg)$ will capture everything before and after the last - in a .jpg file. You can then use preg_match() to match/capture these groups and explode() to split the - into different directories.
$files = array(
'test1-792X612.jpg',
'test-with-multiple-hyphens-612X792.jpg',
);
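foreach ($files as $file) {
    // $matches[1] is everything before the last hyphen, $matches[2] is the rest
    if (preg_match('/(.*?)-([^-]+\.jpg)$/', $file, $matches)) {
        $directories = explode('-', $matches[1]);
        $file = $matches[2];
        // Print the results shown in the comments below
        echo $file, "\n";
        print_r($directories);
    }
}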
// 792X612.jpg
// Array
// (
// [0] => test1
// )
//
// 612X792.jpg
// Array
// (
// [0] => test
// [1] => with
// [2] => multiple
// [3] => hyphens
// )

Related

How to numerically sort an array like this: ['11--2017 name.png','1--2016 name.png','2--1999 name.png']

Am I correct that character precedence would order these like this:
1--2016 name.png, 11--2017 name.png, 2--1999 name.png
Numerically, however, they would be like this:
1--2016 name.png, 2--1999 name.png, 11--2017 name.png
That is, if I'm looking at the first numbers alone. How do you numerically sort an array with strings like this? Namely, integers appended with "--".
It's important to note that these "strings" are actually pathnames which cannot be renamed. See glob for more information.
Edit, after modified question:
After your edit, all the existing answers in this thread no longer apply. Also, please read the entire answer rather than only copying and pasting a piece of code. In my original answer, I say:
if you have a value like “12--3”, it will be sorted like “123”
So you could have seen right away that your real case does not match the sample you provided.
This second solution sorts an array by the number at the start of each path's basename, where that number is followed by two dashes. It applies to the following cases:
String                          Will be sorted by
------------------------------  -----------------
/Absolute/Path/12--             12
/Absolute/Path/12--2001.png     12
/12--2001.png                   12
12--2001.png                    12
a12--2001.png                   a12--2001.png
-12--2001.png                   -12--2001.png
Having this array:
[
'/path/to/image/1--2016 name.png',
'/path/to/image/11--2017.png',
'/path/to/image/2--1999.png'
]
You can replace the regular expression pattern of the original solution below with this pattern:
~^(.*/)?(\d+)--[^/]*$~
And above array will be sorted in this way:
Array
(
[0] => /path/to/image/1--2016 name.png
[1] => /path/to/image/2--1999.png
[2] => /path/to/image/11--2017.png
)
eval.in demo
Pattern explanation:
~
^ # Start of string
(.*/)? # Group 1 (optional): zero-or-more characters followed by a slash
(\d+) # Group 2: one-or-more digits
-- # two dashes
[^/]* # zero-or-more characters, except slash
$ # End of string
~
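One way this pattern might be plugged into usort() is sketched below. The replacement string '$2' and the (int) casts are assumptions about how the pattern is meant to be used; the answer itself only gives the pattern. Note one difference from the table above: a path the pattern does not match is simply cast to an integer here (usually 0) instead of being compared as a string.

```php
<?php
$paths = [
    '/path/to/image/1--2016 name.png',
    '/path/to/image/11--2017.png',
    '/path/to/image/2--1999.png',
];

$pattern = '~^(.*/)?(\d+)--[^/]*$~';

usort($paths, function ($a, $b) use ($pattern) {
    // '$2' replaces the whole path with the digits captured before the two dashes.
    $numA = (int) preg_replace($pattern, '$2', $a);
    $numB = (int) preg_replace($pattern, '$2', $b);

    return $numA - $numB;
});

print_r($paths);
```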
In the future, take a look at How to create a Minimal, Complete, and Verifiable example
Original answer (for original question):
There are surely many ways to obtain your result. Using usort and preg_replace:
$array = ['11--','23--','1--'];
usort(
    $array,
    function ($a, $b) {
        // Strip every non-digit character, then compare what is left as integers
        return (int) preg_replace('~\D~', '', $a) - (int) preg_replace('~\D~', '', $b);
    }
);
$array now is:
Array
(
[0] => 1--
[1] => 11--
[2] => 23--
)
The above solution sorts your array by deleting [1] all non-digit characters.
So, if you have a value like 12--3, it will be sorted like 123. Consequently, it doesn't work on non-integer or negative numbers.
[1] Actually, the original array values are not changed.
If you wanted a quick fix to getting this done, you could:
$strings = array('5--', '2--', '11--');
$newStrings = array();
foreach ($strings as $string) {
$stringNew = str_replace('--', '', $string);
array_push($newStrings, $stringNew);
}
sort($newStrings);
$doneArray = array();
foreach ($newStrings as $newString) {
array_push($doneArray, $newString.'--');
}
// $doneArray is the new array full of the sorted strings.
I didn't really bother with the variable names, but that's a nice way to do it.
natsort
See here.
I'm not sure how glob sorts things as they come in, but I thought that sort would have ordered them correctly, but natsort will do the trick.
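A short sketch of the natsort() suggestion, using the sample names from the question. The array_values() call is an optional extra added here because natsort() keeps the original keys.

```php
<?php
$files = ['11--2017 name.png', '1--2016 name.png', '2--1999 name.png'];

// natsort() sorts in place using "natural order", so 2 comes before 11.
natsort($files);

// Reindex from 0, since natsort() preserves the original keys.
$files = array_values($files);

print_r($files);
```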

How Do you split a string like this? PHP

I am currently trying to create a random image generator in PHP, and I'm having a hard time setting the file path. I can get all the file paths, but they are in one long string, like this:
" ../images/box1/IMG_3158.JPG../images/box1/IMG_3161.JPG../images/box1/IMG_3163.JPG../images/box1/IMG_3158.JPG../images/box1/IMG_3161.JPG../images/box1/IMG_3163.JPG"
with the " .." marking the beginning of a new file.
How would I explode (or something of the kind) each file path so they are returned separately?
Here is one way you can do this.
$data = '../images/box1/IMG_3158.JPG../images/box1/IMG_3161.JPG../images/box1/IMG_3163.JPG../images/box1/IMG_3158.JPG../images/box1/IMG_3161.JPG../images/box1/IMG_3163.JPG';
$files = preg_split('/(?<!^)(?=\.{2})/', $data);
print_r($files);
Output
Array
(
[0] => ../images/box1/IMG_3158.JPG
[1] => ../images/box1/IMG_3161.JPG
[2] => ../images/box1/IMG_3163.JPG
[3] => ../images/box1/IMG_3158.JPG
[4] => ../images/box1/IMG_3161.JPG
[5] => ../images/box1/IMG_3163.JPG
)
Regular Expression:
(?<! look behind to see if there is not:
^ the beginning of the string
) end of look-behind
(?= look ahead to see if there is:
\.{2} '.' (2 times)
) end of look-ahead
<?php
// Each new file path starts with "..", so we explode on ".." and then prepend it to each file path again.
$string ="../images/box1/IMG_3158.JPG../images/box1/IMG_3161.JPG../images/box1/IMG_3163.JPG../images/box1/IMG_3158.JPG../images/box1/IMG_3161.JPG../images/box1/IMG_3163.JPG";
$file_path=explode('..',$string);
$i=0;
while(isset($file_path[++$i])){
$file_path[$i]="..".$file_path[$i];
echo $file_path[$i]."<br />";
}
?>
http://ideone.com/MTRr9Q
Just use explode()
$file_paths = explode('..', $input);
Example
$string = "../images/box1/IMG_3158.JPG../images/box1/IMG_3161.JPG../images/box1/IMG_3163.JPG../images/box1/IMG_3158.JPG../images/box1/IMG_3161.JPG../images/box1/IMG_3163.JPG";
$file_paths = explode('..', $string);
var_dump($file_paths);
This strips off the ".." at the beginning of each path, so you need to prepend it yourself. For more complicated situations, preg_split() would be the appropriate tool. Given that your string's format doesn't change, explode() will do.
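A minimal sketch of that explode() approach with the ".." put back, assuming no file name itself contains two consecutive dots. The names $string and $paths are illustrative.

```php
<?php
$string = '../images/box1/IMG_3158.JPG../images/box1/IMG_3161.JPG../images/box1/IMG_3163.JPG';

$paths = [];
foreach (explode('..', $string) as $part) {
    // The string starts with "..", so the first piece is empty; skip it.
    if ($part === '') {
        continue;
    }
    // Restore the ".." that explode() removed.
    $paths[] = '..' . $part;
}

print_r($paths);
```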
If you are able to introduce an otherwise unused character between each path as you import(?) the list of path names, you could then use that character as your delimiter instead of ".." (refer to Ali's answer).

Regular Expression for getting sub-string, if not exist get other

I have a regular expression for getting a sub-string from a string; if a particular sub-string doesn't exist, it should fall back to another sub-string.
The regular expression I am trying is this:
\#\s*\((.*?)\)|\((.*?)\)
But this does not work: it always returns the second alternative's sub-string instead of the first.
The example string is
some text (2nd Sub-string) # (First sub-string)
And it gives me this result:
Array
(
[0] => (2nd Sub-string)
[1] =>
[2] => 2nd Sub-string
)
Why don't you simply get both strings (for which your regexp works correctly) and check their existence programmatically? Something like:
$num = preg_match_all(
"/\#\s*\((.*?)\)|\((.*?)\)/",
"some text (2nd Sub-string) # (First sub-string)",
$matches, PREG_SET_ORDER
);
var_dump($num, $matches);
if($num < 2)
{
// no second match, read first
}
if(!array_key_exists(2, $matches[1]))
{
// another way to put it
}
HTH.
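A single alternation returns whichever alternative matches leftmost in the string, which is why the plain (…) part wins in the example. One possible way to express the preference in code, sketched here with a hypothetical helper named extract_substring(), is to try the preferred pattern on its own first and fall back to the other only if it fails.

```php
<?php
// Hypothetical helper: prefer the "# (...)" form, fall back to any "(...)".
function extract_substring(string $text): ?string
{
    // Preferred: parentheses that follow a "#".
    if (preg_match('/#\s*\((.*?)\)/', $text, $m)) {
        return $m[1];
    }
    // Fallback: the first parenthesised part anywhere in the string.
    if (preg_match('/\((.*?)\)/', $text, $m)) {
        return $m[1];
    }
    return null;
}

$preferred = extract_substring('some text (2nd Sub-string) # (First sub-string)');
$fallback  = extract_substring('some text (2nd Sub-string)');

var_dump($preferred, $fallback);
```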

RegEx for hashtag separated string

I have bunch of strings like this:
a#aax1aay222b#bbx4bby555bbz6c#mmm1d#ara1e#abc
And what I need to do is to split them up based on the hashtag position to something like this:
Array
(
[0] => A
[1] => AAX1AAY222
[2] => B
[3] => BBX4BBY555BBZ6
[4] => C
[5] => MMM1
[6] => D
[7] => ARA1
[8] => E
[9] => ABC
)
So, as you can see, the character right before the hashtag is captured, plus everything after the hashtag up to just before the next char+hashtag.
I have the following RegEx, which works fine only when each part ends with a numeric value.
Here is the RegEx setup:
preg_split('/([A-Z])+#/', $text, 0, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
And it works fine with something like this:
C#mmm1D#ara1
But, if I change it to this (removing the numbers):
C#mmmD#ara
Then this is the result, which is not what I want:
Array
(
[0] => C
[1] => D
)
I've looked at this question and this one also, which are similar but none of them worked for me.
So, my question is: why does it work only when a number follows, and how can I solve it?
Here you can see some of the sample strings which I have:
a#123b#abcc#def456 // A:123, B:ABC, C:DEF456
a#abc1def2efg3b#abcdefc#8 // A:ABC1DEF2EFG3, B:ABCDEF, C:8
a#abcdef123b#5c#xyz789 // A:ABCDEF123, B:5, C:XYZ789
P.S. Strings are case-insensitive.
P.P.S. In case you are wondering what on earth these strings are: they are user-submitted answers to a questionnaire, and I can't do anything to them like refactoring, as they are already stored and just need to be processed.
Why not use explode()?
If you look at my examples you will see that I need to capture the character right before the # as well. If you think it's possible with explode() please post the output as well, thanks!
Update
Should we focus on why /([A-Z])+#/ works only if numbers are included? Thanks.
Instead of using preg_split(), decide what you want to match instead:
A set of "words" if followed by either <any-char># or <end-of-string>.
A character if immediately followed by #.
$str = 'a#aax1aay222b#bbx4bby555bbz6c#mmm1d#ara1e#abc';
preg_match_all('/\w+(?=.#|$)|\w(?=#)/', $str, $matches);
Demo
This expression uses two look-ahead assertions. The results are in $matches[0].
Update
Another way of looking at it would be this:
preg_match_all('/(\w)#(\w+)(?=\w#|$)/', $str, $matches);
print_r(array_combine($matches[1], $matches[2]));
Each entry starts with a single character, followed by a hash, followed by X characters until either the end of the string is encountered or the start of a next entry.
The output is this:
Array
(
[a] => aax1aay222
[b] => bbx4bby555bbz6
[c] => mmm1
[d] => ara1
[e] => abc
)
If you still want to use preg_split(), you can remove the + and it should work as expected:
'/([A-Z])#/i'
That way you match only the hashtag and ONE alphabetic character before it, not all of them.
Example: http://codepad.viper-7.com/z1kFDb
Edit: Added a case-insensitive flag i in the pattern.
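A short sketch of that preg_split() call with the flags from the question. With ([A-Z])+ the repeated group swallows a whole run of letters before the # and keeps only the last one, which is why purely alphabetic values disappeared; a single ([A-Z]) takes just the key character.

```php
<?php
$text = 'a#aax1aay222b#bbx4bby555bbz6c#mmm1d#ara1e#abc';

// Split on "<one letter>#" and keep the captured letter in the result.
$parts = preg_split(
    '/([A-Z])#/i',
    $text,
    0,
    PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE
);

// Values made only of letters are kept as well.
$lettersOnly = preg_split(
    '/([A-Z])#/i',
    'C#mmmD#ara',
    0,
    PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE
);

print_r($parts);
print_r($lettersOnly);
```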
Use explode() rather than Regexp
$tmpArray = explode("#","a#aax1aay222b#bbx4bby555bbz6c#mmm1d#ara1e#abc");
$myArray = array();
for($i = 0; $i < count($tmpArray) - 1; $i++) {
if (substr($tmpArray[$i],0,-1)) $myArray[] = substr($tmpArray[$i],0,-1);
if (substr($tmpArray[$i],-1)) $myArray[] = substr($tmpArray[$i],-1);
}
if (count($tmpArray) && $tmpArray[count($tmpArray) - 1]) $myArray[] = $tmpArray[count($tmpArray) - 1];
edit: I updated my answer to reflect better reading the questions
You can use the explode() function, which splits the string on the hash signs, as stated in the answers given before.
$myArray = explode("#",$string);
For the string 'a#aax1aay222b#bbx4bby555bbz6c#mmm1d#ara1e#abc' this returns something like
$myArray = array('a', 'aax1aay222b', 'bbx4bby555bbz6c' ....);
All you need now is to take the last character of each string in the array as a separate item.
$copy = array();
foreach($myArray as $item){
$beginning = substr($item,0,strlen($item)-1); // this takes all characters except the last one
$ending = substr($item,-1); // this takes the last one
$copy[] = $beginning;
$copy[] = $ending;
} // end foreach
This is an example, not tested.
EDIT
Instead of substr($item,0,strlen($item)-1); you might use substr($item,0,-1);.

regex assistance

I am trying to match a semi-dynamically generated string, so I can check that it's in the correct format and then extract the information I need from it. My problem is that no matter how hard I try to grasp regex, I can't fathom it for the life of me, even with the help of so-called generators.
What I have is a couple of different strings like the following: [#img:1234567890], [#user:1234567890] and [#file:file_name-with.ext]. Strings like these are intended to pass through a filter so they can be replaced with links and/or more readable names. But try as I might, I can't come up with a regex for any of them.
I am looking for the format [#word:], from which I will strip the [, ], #, and word so I can then query my DB for whatever it refers to and work with it accordingly. Just the regex bit is holding me back.
Not sure what you mean by generators. I always use online matchers to see that my test cases work. #Virendra almost had it except forgot to escape the [] characters.
/\[#(\w+):(.*)\]/
You need to start and end with a regex delimiter, in this case the '/' character.
Then we escape the '[', which regex would otherwise treat as the start of a character class, hence the '\['.
Next we match a literal '#' symbol.
Now we want to save this next match so we can use it later, so we surround it with ().
\w matches a single word character (a letter, digit, or underscore), so \w+ matches a run of them. Spaces and punctuation are not matched.
Again match a literal :.
It may be useful to have the second part in a match group as well, so (.*) will match any character any number of times, and save it for you.
Then we escape the closing ] as we did earlier.
Since it sounds like you want to use the matches later in a query we can use preg_match to save the matches to an array.
$pattern = '/\[#(\w+):(.*)\]/';
$subject = '[#user:1234567890]';
preg_match($pattern, $subject, $matches);
print_r($matches);
Would output
array(
[0] => '[#user:1234567890]', // Full match
[1] => 'user', // First match
[2] => '1234567890' // Second match
)
An especially helpful tool I've found is txt2re
Here's what I would do.
<pre>
<?php
$subj = 'An image:[#img:1234567890], a user:[#user:1234567890] and a file:[#file:file_name-with.ext]';
preg_match_all('~(?<match>\[#(?<type>[^:]+):(?<value>[^\]]+)\])~',$subj,$matches,PREG_SET_ORDER);
foreach ($matches as &$arr) unset($arr[0],$arr[1],$arr[2],$arr[3]);
print_r($matches);
?>
</pre>
This will output
Array
(
[0] => Array
(
[match] => [#img:1234567890]
[type] => img
[value] => 1234567890
)
[1] => Array
(
[match] => [#user:1234567890]
[type] => user
[value] => 1234567890
)
[2] => Array
(
[match] => [#file:file_name-with.ext]
[type] => file
[value] => file_name-with.ext
)
)
And here's a pseudo version of how I would use the preg_replace_callback() function:
function replace_shortcut($matches) {
global $users;
switch (strtolower($matches['type'])) {
case 'img' : return '<img src="images/img_'.$matches['value'].'.jpg" />';
case 'file' : return ''.$matches['value'].'';
// add id of each user in array
case 'user' : $users[] = (int) $matches['value']; return '%s';
default : return $matches['match'];
}
}
$users = array();
$replaceArr = array();
$subj = 'An image:[#img:1234567890], a user:[#user:1234567890] and a file:[#file:file_name-with.ext]';
// escape percentage signs to avoid complications in the vsprintf function call later
$subj = strtr($subj,array('%'=>'%%'));
$subj = preg_replace_callback('~(?<match>\[#(?<type>[^:]+):(?<value>[^\]]+)\])~','replace_shortcut',$subj);
if (!empty($users)) {
// connect to DB and check users
$query = " SELECT `id`,`nick`,`date_deleted` IS NOT NULL AS 'deleted'
FROM `users` WHERE `id` IN ('".implode("','",$users)."')";
// query
// ...
// and catch results
while ($row = $con->fetch_array()) {
// position of this id in users array:
$idx = array_search($row['id'],$users);
$nick = htmlspecialchars($row['nick']);
$replaceArr[$idx] = $row['deleted'] ?
"<span class=\"user_deleted\">{$nick}</span>" :
"{$nick}";
// delete this key so that we can check id's not found later...
unset($users[$idx]);
}
// in here:
foreach ($users as $key => $value) {
$replaceArr[$key] = '<span class="user_unknown">User'.$value.'</span>';
}
// replace each user reference marked with %s in $subj
$subj = vsprintf($subj,$replaceArr);
} else {
// remove extra percentage signs we added for vsprintf function
$subj = preg_replace('~%{2}~','%',$subj);
}
unset($query,$row,$nick,$idx,$key,$value,$users,$replaceArr);
echo $subj;
You can try something like this:
/\[#(\w+):([^]]*)\]/
\[ escapes the [ character (otherwise interpreted as a character set); \w means any "word" character, and [^]]* means any non-] character (to avoid matching past the end of the tag, as .* might). The parens group the various matched parts so that you can use $1 and $2 in preg_replace to generate the replacement text:
echo preg_replace('/\[#(\w+):([^]]*)\]/', '$1 $2', '[#link:abcdef]');
prints link abcdef
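To illustrate why [^]]* is the safer choice, the sketch below runs both patterns over a string holding two tags. The sample text and the variable names $greedy and $bounded are assumptions made for the demonstration.

```php
<?php
$subject = 'An image: [#img:1234567890], a user: [#user:1234567890]';

// Greedy .* runs to the LAST "]" in the string, so both tags collapse into one match.
preg_match_all('/\[#(\w+):(.*)\]/', $subject, $greedy, PREG_SET_ORDER);

// [^]]* stops at the first "]", so each tag is matched separately.
preg_match_all('/\[#(\w+):([^]]*)\]/', $subject, $bounded, PREG_SET_ORDER);

echo count($greedy), " match with .*\n";
echo count($bounded), " matches with [^]]*\n";
print_r($bounded);
```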
