Split string with regular expressions - php

I have this string:
EXAMPLE|abcd|[!PAGE|title]
I want to split it like this:
Array
(
[0] => EXAMPLE
[1] => abcd
[2] => [!PAGE|title]
)
How to do it?
Thank you.

If you don't need anything more than what you described, this is like parsing a CSV with | as the separator and [ ] as the quote characters, so (\[.*?\]+|[^\|]+)(?=\||$) should do the job, I think.
EDIT: Changed the regex; it now accepts strings like [asdf]].[]asf]
Explanation:
(\[.*?\]+|[^\|]+) -> This group has two alternatives (it matches either 1.1 or 1.2):
1.1 \[.*?\]+ -> Matches a [, then as few characters as possible, then one or more ]
1.2 [^\|]+ -> Matches one or more characters that are not |
(?=\||$) -> A lookahead requiring that the match is followed by a | or the end of the string. This is what lets the regex accept strings like the one in the EDIT above.
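A minimal PHP sketch of how the lookahead-based pattern might be applied with preg_match_all(). The second input string is made up for illustration; it only mimics the stray-bracket case mentioned in the EDIT.

```php
<?php
// Pattern from the answer: bracketed field OR run of non-pipe characters,
// each of which must be followed by a pipe or the end of the string.
$pattern = '#(\[.*?\]+|[^\|]+)(?=\||$)#';

// The string from the question.
preg_match_all($pattern, 'EXAMPLE|abcd|[!PAGE|title]', $simple);
print_r($simple[0]);

// Hypothetical input with extra brackets inside the bracketed field.
preg_match_all($pattern, 'a|[asdf]].[]asf]|b', $tricky);
print_r($tricky[0]);
```

The lookahead is what forces the bracketed alternative to keep extending until the closing ] sits right before a | or the end of the string.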

Given your example, you could use (\[.*?\]|[^|]+).
preg_match_all("#(\[.*?\]|[^|]+)#", "EXAMPLE|abcd|[!PAGE|title]", $matches);
print_r($matches[0]);
// output:
Array
(
[0] => EXAMPLE
[1] => abcd
[2] => [!PAGE|title]
)

Use this regex: (?<=\||^)(((\[.*\|?.*\])|(.+?)))(?=\||$)
(?<=\||^) Positive lookbehind
1st alternative: \| Literal `|`
2nd alternative: ^ Start of string
1st capturing group (((\[.*\|?.*\])|(.+?)))
2nd capturing group ((\[.*\|?.*\])|(.+?))
1st alternative: (\[.*\|?.*\])
3rd capturing group (\[.*\|?.*\])
\[ Literal `[`
.* Any character (except newline), zero or more times
\|? Literal `|`, zero or one time
.* Any character (except newline), zero or more times
\] Literal `]`
2nd alternative: (.+?)
4th capturing group (.+?)
.+? Any character (except newline), one or more times [lazy]
(?=\||$) Positive lookahead
1st alternative: \| Literal `|`
2nd alternative: $ End of string
g modifier: global. All matches (don't return on first match). PHP has no g modifier; use preg_match_all() instead.

A Non-regex solution:
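$str = str_replace('[', ']', "EXAMPLE|abcd|[!PAGE|title]");
$arr = str_getcsv($str, '|', ']');
Note that the enclosure characters are stripped, so the third element comes back as !PAGE|title without the surrounding brackets.
If you expect things like "[[]]", you would have to escape the inner brackets with slashes, in which case a regex might be the better option.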

http://de2.php.net/manual/en/function.explode.php
$array= explode('|', $string);

Related

Regex for find value between curly braces which have pipe separator

$str = "({max_w} * {max_h} * {key|value}) / {key_1|value}";
I have the formula above and want to match only the values inside curly braces that contain a pipe separator. The problem is that my pattern also returns the values without a pipe separator. I am new to regex, so I don't know much about it. I tried the following:
preg_match_all("^\{(|.*?|)\}^", $str, $matches, PREG_PATTERN_ORDER);
It gives the output below:
Array
(
[0] => key|value
[1] => max_w
[2] => max_h
[3] => key_1|value
)
Expected output
Array
(
[0] => key|value
[1] => key_1|value
)
Not sure about PHP. Here's the general regex that will do this.
{([^{}]*\|[^{}]*)}
You can use
(?<={)[^}]*\|[^}]*(?=})
For the given string, the two matches are marked by the carets:
({max_w} * {max_h} * {key|value}) / {key_1|value}
                      ^^^^^^^^^      ^^^^^^^^^^^
(?<={) is a positive lookbehind. Arguably, the positive lookahead (?=}) is not needed if it is known that all braces appear in matching, non-overlapping pairs.
The pattern \{(|.*?|)\} contains two alternation operators | that can be omitted, because the empty alternatives to their left and right add nothing useful.
That leaves \{(.*?)} where the . can match any character, including a pipe, so the pattern does not guarantee that a pipe appears between the braces.
Instead, you can use a pattern that never crosses a curly brace or a pipe, so that it matches exactly one pipe in between.
{\K[^{}|]*\|[^{}|]*(?=})
{ Match the opening {
\K Forget what has been matched so far
[^{}|]* Match zero or more characters other than {, } and |
\| Match a | character
[^{}|]* Match zero or more characters other than {, } and |
(?=}) Assert a closing } directly to the right
$str = "({max_w} * {max_h} * {key|value}) / {key_1|value}";
$pattern = "/{\K[^{}|]*\|[^{}|]*(?=})/";
preg_match_all($pattern, $str, $matches);
print_r($matches[0]);
Output
Array
(
[0] => key|value
[1] => key_1|value
)
Or using a capture group:
{([^{}|]*\|[^{}|]*)}
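For completeness, a small sketch of how the capture-group variant could be used from PHP. With a capture group, the values without the braces end up in $matches[1], while $matches[0] still holds the full matches including the braces.

```php
<?php
$str = "({max_w} * {max_h} * {key|value}) / {key_1|value}";

// Capture-group variant of the pattern: group 1 is the text between the braces.
preg_match_all('/{([^{}|]*\|[^{}|]*)}/', $str, $matches);

print_r($matches[0]); // full matches, braces included
print_r($matches[1]); // captured values, braces excluded
```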

regex expected value in a position depends on a random value in another position

I need a regex to find all shortcode tag pairs that look like this: [sc1-g-data]b[/sc1-g-data]. The number next to sc can vary, but it must be the same in both tags.
So something like \[sc(.*?)\-((.|\n)*?)\[\/sc(.*?)\- won't work, because it also matches mismatched tag pairs such as [sc1-g-data]b[/sc2-g-data], which I don't want.
In other words, the number expected in the second tag depends on the arbitrary number in the first tag.
You may use a regex like:
\[(sc\d*-[^\]\[]*)\]([\s\S]*?)\[\/\1\]
\[ - a [ char
(sc\d*-[^\]\[]*) - Capturing group 1: sc, 0+ digits, -, and then 0+ chars other than ] and [
\] - a ] char
([\s\S]*?) - Capturing group 2: any 0+ chars, as few as possible
\[\/ - a [/ string
\1 - the same text stored in Group 1
\] - a ] char
PHP demo:
$pattern = '~\[(sc\d*-[^][]*)](.*?)\[/\1]~s';
$string = '[sc1-g-data]a[/sc1-g-data] ';
if (preg_match($pattern, $string, $matches)) {
print_r($matches);
}
Mind the use of a single-quoted string literal. If you use a double-quoted one, you will need to write \\1 instead of \1, because '\1' != "\1" in PHP.
Output:
Array
(
[0] => [sc1-g-data]a[/sc1-g-data]
[1] => sc1-g-data
[2] => a
)
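The quoting caveat can be checked directly. This is only a sketch; the second test string is the mismatched pair from the question.

```php
<?php
$single  = '\1';   // two characters: a backslash and "1"
$double  = "\1";   // one character: the octal escape for byte 0x01
$escaped = "\\1";  // two characters again, same as the single-quoted form

var_dump(strlen($single), strlen($double), $single === $escaped);

// Double-quoted version of the pattern, with the backreference written as \\1.
$pattern = "~\[(sc\d*-[^][]*)](.*?)\[/\\1]~s";

$ok  = preg_match($pattern, '[sc1-g-data]a[/sc1-g-data]');
$bad = preg_match($pattern, '[sc1-g-data]b[/sc2-g-data]');
var_dump($ok, $bad);
```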
If your tags are just anything between brackets [blah][/blah] you can use:
\[(.*?)\].*?\[\/\1\]
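A quick sketch of that generic pattern with preg_match_all(). The sample text is invented for illustration and contains one deliberately mismatched pair, which the backreference should reject.

```php
<?php
// Hypothetical input: [i]...[/b] is a mismatched pair.
$text = '[b]bold[/b] [i]oops[/b] [u]under[/u]';

// \1 must repeat whatever group 1 captured in the opening tag.
preg_match_all('~\[(.*?)\].*?\[\/\1\]~', $text, $matches);

print_r($matches[0]); // the complete tag pairs
print_r($matches[1]); // the tag names
```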

Regex of number inside brackets

I need to get the float number inside the brackets.
I tried '([0-9]*[.])?[0-9]+', but it returns the first number in the string, such as the 6 in the first example.
I also tried
'/\((\d+)\)/'
but it returns 0.
Please note that the extracted number may be either an int or a float.
Can you please help?
Because you need to match the brackets as well, add escaped parentheses \( and \) to the regular expression:
$str = 'Serving size 6 pieces (40)';
$str1 = 'Per bar (41.5)';
preg_match('#\(([0-9]*[.]?[0-9]+)\)#', $str, $matches);
print_r($matches);
preg_match('#\(([0-9]*[.]?[0-9]+)\)#', $str1, $matches);
print_r($matches);
Output:
Array
(
[0] => (40)
[1] => 40
)
Array
(
[0] => (41.5)
[1] => 41.5
)
You could escape the brackets:
$str = 'Serving size 6 pieces (41.5)';
if (preg_match('~\((\d+\.?\d*)\)~', $str, $matches)) {
    print_r($matches);
}
Outputs:
Array
(
[0] => (41.5)
[1] => 41.5
)
Regex:
\( # open bracket
( # capture group
\d+ # one or more digits
\.? # optional literal dot (escaped; a bare . would match any character)
\d* # optional digits
) # end capture group
\) # close bracket
You could also use this to allow at most one digit after the dot:
'~\((\d+\.?\d?)\)~'
You need to escape the brackets
preg_match('/\((\d+(?:\.\d+)?)\)/', $search, $matches);
Explanation:
\( escaped bracket to look for
( open subpattern
\d a digit
+ one or more occurrences of the preceding token
( open group
?: don't save the data in a subpattern (non-capturing)
\. escaped dot
\d a digit
+ one or more occurrences of the preceding token
) close group
? one or no occurrence of the preceding group
) close subpattern
\) escaped closing bracket to look for
It matches numbers like 1, 1.1, 11, 11.11, 111 and 111.111, but NOT .1 or a lone dot.
https://regex101.com/r/ei7bIM/1
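As a rough check of the list above, the pattern can be run against a few bracketed samples. The sample strings are made up; a null result means the pattern did not match.

```php
<?php
$pattern = '/\((\d+(?:\.\d+)?)\)/';

// Hypothetical samples: three that should match, two that should not.
$samples = ['(1)', '(1.1)', '(11.11)', '(.1)', '(.)'];

$results = [];
foreach ($samples as $sample) {
    // Store the captured number, or null when there is no match.
    $results[$sample] = preg_match($pattern, $sample, $m) ? $m[1] : null;
}

var_dump($results);
```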
You could match an opening parenthesis, use \K to reset the starting point of the reported match and then match your value:
\(\K\d+(?:\.\d+)?(?=\))
That would match:
\( Match (
\K Reset the starting point of the reported match
\d+ Match one or more digits
(?: Non capturing group
\.\d+ Match a dot and one or more digits
)? Close non capturing group and make it optional
(?= Positive lookahead that asserts what follows is
\) Match )
) Close positive lookahead
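A short sketch of the \K variant in PHP, using the two sample strings from the earlier answer. Because of \K and the lookahead, the number is the whole reported match, so it can be read from $m[0] without a capture group.

```php
<?php
$pattern = '/\(\K\d+(?:\.\d+)?(?=\))/';

$samples = ['Serving size 6 pieces (40)', 'Per bar (41.5)'];

$values = [];
foreach ($samples as $sample) {
    if (preg_match($pattern, $sample, $m)) {
        // $m[0] holds only the number; the parentheses are not part of the match.
        $values[] = $m[0];
    }
}

print_r($values);
```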

Split String With preg_match

I have this string:
$productList="
Saluran Dua(Bothway)-(TAN007);
Speedy Password-(INET PASS);
Memo-(T-Memo);
7-pib r-10/10-(AM);
FBI (R/N/M)-(Rr/R(A));
";
I want a result like this:
Array(
[0]=>TAN007
[1]=>INET PASS
[2]=>T-Memo
[3]=>AM
[4]=>Rr/R(A)
);
I used:
$separator = '/\-\(([A-z ]*)\)/';
preg_match_all($separator, $productList, $match);
$value=$match[1];
but the result is:
Array(
[0]=>INET PASS
[1]=>AM
);
There must be something wrong in my code. Can anybody help with this?
Your regex does not include all the characters that can appear in the piece of text you want to capture.
The correct regex is:
$match = array();
preg_match_all('/-\((.*)\);/', $productList, $match);
Explanation (from the inside to outside):
.* matches any sequence of characters (except newlines);
(.*) is the expression above put into parentheses to capture the match in $match[1];
-\((.*)\); is the above in context: it matches only if it is preceded by -( and followed by );. The parentheses are escaped so that they are treated as literal characters rather than as regex grouping;
there is no need to escape - in a regex; it has a special meaning only inside character classes ([A-Z], for example), and even there, if the dash (-) comes right after the [ or right before the ], it has no special meaning; e.g. [-A-Z] means a dash (-) or any capital letter (A to Z).
Now, print_r($match[1]); looks like this:
Array
(
[0] => TAN007
[1] => INET PASS
[2] => T-Memo
[3] => AM
[4] => Rr/R(A)
)
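The remark about the dash inside a character class can be verified with a few one-liners. This is just an illustrative sketch with a made-up test string.

```php
<?php
// Dash right after [ is a literal dash.
$leading = preg_match('/^[-A-Z]+$/', 'A-B');

// Dash right before ] is a literal dash as well.
$trailing = preg_match('/^[A-Z-]+$/', 'A-B');

// Without a dash in the class, the string no longer matches.
$without = preg_match('/^[A-Z]+$/', 'A-B');

var_dump($leading, $trailing, $without);
```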
For the 1st line you need 0-9,
for the 3rd line you need a - in the character class, and
for the last line you need ( and ).
Try this:
#\-\(([a-zA-Z/0-9(\)\- ]*)\)#
Try this regex:
$separator = '#\-\(([A-Za-z0-9/\-\(\) ]*)\)#';
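A sketch of how this character-class pattern behaves on the product list from the question. The class is greedy and includes ), so the engine backtracks by one character to leave the final ) for the closing \).

```php
<?php
// Product list from the question, one entry per line.
$productList = implode("\n", [
    'Saluran Dua(Bothway)-(TAN007);',
    'Speedy Password-(INET PASS);',
    'Memo-(T-Memo);',
    '7-pib r-10/10-(AM);',
    'FBI (R/N/M)-(Rr/R(A));',
]);

$separator = '#\-\(([A-Za-z0-9/\-\(\) ]*)\)#';
preg_match_all($separator, $productList, $match);

print_r($match[1]);
```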

How do I break a string into words at the position of a number

I have some string data with alphanumeric values, like us01name, phc01name and others, i.e. letters + number + letters.
I would like to get the leading letters + number as the first string and the remainder as the second.
How can I do this in PHP?
You can use a regular expression:
// the if statement checks that there is a match
if (preg_match('/([A-Za-z]+[0-9]+)([A-Za-z]+)/', $string, $matches) > 0) {
    $firstbit = $matches[1];
    $nextbit = $matches[2];
}
Just to break the regular expression down into parts so you know what each bit does:
( Begin group 1
[A-Za-z]+ One or more letters (either case)
[0-9]+ One or more digits
) End group 1
( Begin group 2
[A-Za-z]+ One or more letters (either case)
) End group 2
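One caveat when writing letter classes: the range [A-z] is not the same as [A-Za-z], because it also spans the ASCII characters that sit between Z and a ([, \, ], ^, _ and `). A small sketch to illustrate, using the sample value from the question:

```php
<?php
// The underscore lies between "Z" and "a" in ASCII, so [A-z] accepts it.
$loose  = preg_match('/^[A-z]$/', '_');
$strict = preg_match('/^[A-Za-z]$/', '_');
var_dump($loose, $strict);

// Splitting the sample value with the explicit letter class.
preg_match('/([A-Za-z]+[0-9]+)([A-Za-z]+)/', 'us01name', $matches);
var_dump($matches[1], $matches[2]);
```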
Try this code:
preg_match('~([^\d]+\d+)(.*)~', "us01name", $m);
var_dump($m[1]); // 1st string + number
var_dump($m[2]); // 2nd string
OUTPUT
string(4) "us01"
string(4) "name"
Even this more restrictive regex will also work for you:
preg_match('~([A-Z]+\d+)([A-Z]+)~i', "us01name", $m);
You could use preg_split() on the digits with the delimiter-capture flag. It returns all pieces, so you'd have to put them back together. However, in my opinion it is more intuitive and flexible than a complete-pattern regex. Plus, preg_split() is underused :)
Code:
$str = 'user01jason';
$pieces = preg_split('/(\d+)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
print_r($pieces);
Output:
Array
(
[0] => user
[1] => 01
[2] => jason
)
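Putting the pieces back together could look something like this. It is only a sketch, and it assumes the input always starts with letters followed by a number, as in the question; the variable names are made up.

```php
<?php
$str = 'us01name';
$pieces = preg_split('/(\d+)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);

// First string: the leading letters plus the captured digits.
$first = $pieces[0] . $pieces[1];

// Second string: everything after the first number, glued back together.
$second = implode('', array_slice($pieces, 2));

var_dump($first, $second);
```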
