I'm having trouble splitting this string into an array with the structure I need. The string is:
ATTRIBUTE1: +VALUE1;
ATTRIBUTE2: -VALUE2%;
I essentially need an array like this:
array (
[0] => "ATTRIBUTE1",
[1] => "+",
[2] => "VALUE1",
[3] => "%"
)
array (
[0] => "ATTRIBUTE2", ...
)
The "%" is optional, but the +/- sign is not. Any help would be appreciated!
You could use regex:
$text= "ATTRIBUTE1: +VALUE1;\nATTRIBUTE2: -VALUE2%;";
echo "STRING\n" . $text . "\n\n";
preg_match_all("~
^ # match start of line
([^:]+):\s* # match anything that's not a ':' (attribute), followed by a colon and spaces
([+-]) # match a plus or a minus sign
([^%;]+) # match anything that's not a '%' or ';' (value)
(%?) # optionally match percent sign
;\s*$ # match ';' then optional spaces and end of line
~mx", $text, $matches, PREG_SET_ORDER);
print_r($matches);
Prints:
STRING
ATTRIBUTE1: +VALUE1;
ATTRIBUTE2: -VALUE2%;
Array
(
[0] => Array
(
[0] => ATTRIBUTE1: +VALUE1;
[1] => ATTRIBUTE1
[2] => +
[3] => VALUE1
[4] =>
)
[1] => Array
(
[0] => ATTRIBUTE2: -VALUE2%;
[1] => ATTRIBUTE2
[2] => -
[3] => VALUE2
[4] => %
)
)
You might have to play around with the regex, but it's got comments now so it shouldn't be too hard to figure it out.
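If you want exactly the four-element arrays from the question, without the full match at index 0, one option is to slice each match. This is only a sketch under the assumption that the input looks like the sample above; the variable name $rows is mine and the pattern is the commented one with the comments stripped.

```php
<?php
$text = "ATTRIBUTE1: +VALUE1;\nATTRIBUTE2: -VALUE2%;";

// Same pattern as above, written on one line (no x flag needed).
preg_match_all('~^([^:]+):\s*([+-])([^%;]+)(%?);\s*$~m', $text, $matches, PREG_SET_ORDER);

// Drop index 0 (the full match) so each row is [attribute, sign, value, percent].
$rows = array_map(function ($m) {
    return array_slice($m, 1);
}, $matches);

print_r($rows);
```

The percent slot is an empty string when the line has no "%", so you may want to filter it out depending on how you consume the rows.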
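One thing you could do, for a single line such as "ATTRIBUTE2: -VALUE2%;":
// Strip the trailing ";" first, otherwise the "%" check below never matches.
$parts = explode(": ", rtrim($str, "; \n"));
$array[0] = $parts[0];                 // attribute name
$array[1] = substr($parts[1], 0, 1);   // the +/- sign
if (substr($parts[1], -1) == "%") {
    $array[2] = substr($parts[1], 1, -1);  // value without sign and "%"
    $array[3] = "%";
} else {
    $array[2] = substr($parts[1], 1);      // value without sign
}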
Related
I need some way of capturing the text outside square brackets. So for example, the following string:
My [ground]name[test]Jhon[random]petor [shorts].
I'm using the preg_match_all expression below, but the result is not what I expected:
preg_match_all("/\[[^\]]*\]/", $text, $matches);
It gives me the text inside the square brackets.
Result :
Array (
[0] => [ground]
[1] => [test]
[2] => [random]
[3] => [shorts]
)
Expected output:
Array (
[0] => [My]
[1] => [name]
[2] => [Jhon]
[3] => [petor]
)
Any help would be great.
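You can extend the pattern by adding \K, which resets the start of the reported match and so discards the bracketed part matched so far, and then add an alternation that matches one or more word characters. The bracket branch leaves empty matches behind, which array_filter removes.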
\[[^][]+]\K|\w+
$re = '/\[[^][]+]\K|\w+/';
$str = 'My [ground]name[test]Jhon[random]petor [shorts].';
preg_match_all($re, $str, $matches);
print_r(array_values(array_filter($matches[0])));
Output
Array
(
[0] => My
[1] => name
[2] => Jhon
[3] => petor
)
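If the \K trick feels too dense, a two-step version may be easier to read. This is a sketch of an alternative, not the approach above: it first blanks out the bracketed parts and then collects the remaining words. The variable name $outside is just illustrative.

```php
<?php
$str = 'My [ground]name[test]Jhon[random]petor [shorts].';

// Step 1: replace every [...] group with a space so neighbouring words stay separated.
$outside = preg_replace('/\[[^\]]*\]/', ' ', $str);

// Step 2: collect the words that are left.
preg_match_all('/\w+/', $outside, $matches);

print_r($matches[0]);
```

It makes two passes over the string, which should not matter for short inputs like this one.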
I have a string similar to this:
word1/word2/word3/<b>word3</b>
I want to split this string on the forward slash so that I get the following result:
Array = (
[0] => 'word1',
[1] => 'word2',
[2] => 'word3',
[3] => '<b>word3</b>'
);
But I'm unable to get the above result. Instead I'm getting the following result
Array = (
[0] => 'word1',
[1] => 'word2',
[2] => 'word3',
[3] => '<b>word3<',
[4] => 'b>'
);
What regular expression should I pass to preg_split to get the expected result?
With preg_split function and specific regex pattern:
$s = 'word1/word2/word3/<b>word3</b>';
$result = preg_split('~(?<!<)/~', $s);
print_r($result);
~ - the regex delimiter
(?<!<)/ - negative lookbehind assertion, assures that forward slash / is not preceded by <
The output:
Array
(
[0] => word1
[1] => word2
[2] => word3
[3] => <b>word3</b>
)
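Note that the lookbehind only protects a slash that directly follows "<", so a self-closing tag such as <br/> would still be split. If that matters, a sketch using the PCRE verbs (*SKIP)(*F) skips whole tags instead. The input string below is hypothetical, and the pattern assumes no ">" appears inside an attribute value.

```php
<?php
// Hypothetical input: a closing tag and a self-closing tag, both containing "/".
$s = 'word1/word2/<b>word3</b>/line<br/>break';

// Match a whole tag and throw it away with (*SKIP)(*F); otherwise match a slash.
$result = preg_split('~<[^>]*>(*SKIP)(*F)|/~', $s);

print_r($result);
```

For anything beyond simple inline tags, an HTML parser is likely a safer choice than a regex.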
I have a string with data that looks like this:
$string = '
foo=bar
badge_name_foo=foo
bar_badge_name=bar
bar=baz
';
I want to match all *_badge_name and badge_name_* strings.
The regex I'm using is this:
preg_match_all('~(?:(\w+)_)?badge_name(?:_(\w+))?~', $string, $matches, PREG_SET_ORDER);
The result is:
Array
(
[0] => Array
(
[0] => badge_name_foo
[1] =>
[2] => foo
)
[1] => Array
(
[0] => bar_badge_name
[1] => bar
)
)
The *_badge_name case works fine, but for badge_name_* there is always an empty value. How can I get rid of it with preg_match_all?
Expected result should be:
Array
(
[0] => Array
(
[0] => badge_name_foo
[1] => foo
)
[1] => Array
(
[0] => bar_badge_name
[1] => bar
)
)
It seems you need the branch reset feature:
Alternatives inside a branch reset group share the same capturing groups. The syntax is (?|regex) where (?| opens the group and regex is any regular expression. If you don't use any alternation or capturing groups inside the branch reset group, then its special function doesn't come into play. It then acts as a non-capturing group.
Use
(?|(\w+)_badge_name|badge_name_(\w+))
^^^
PHP demo:
$re = '/(?|(\w+)_badge_name|badge_name_(\w+))/';
$str = 'foo=bar
badge_name_foo=foo
bar_badge_name=bar
bar=baz';
preg_match_all($re, $str, $matches);
print_r($matches);
Result:
Array
(
[0] => Array
(
[0] => badge_name_foo
[1] => bar_badge_name
)
[1] => Array
(
[0] => foo
[1] => bar
)
)
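The result above is grouped by capture group. To get the per-match layout from the question, the same pattern can be combined with the PREG_SET_ORDER flag the question already used. A small sketch:

```php
<?php
$re = '/(?|(\w+)_badge_name|badge_name_(\w+))/';
$str = "foo=bar\nbadge_name_foo=foo\nbar_badge_name=bar\nbar=baz";

// PREG_SET_ORDER returns one array per match: [full match, group 1].
preg_match_all($re, $str, $matches, PREG_SET_ORDER);

print_r($matches);
```

Because both alternatives write into group 1, every match has exactly two elements and no empty placeholder.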
I'm trying to split a string at each occurrence of an optional space followed by a number and a dot.
$string = "1.1Kumar/Sandeep MR*T0148.4801 12.23Pal/Sandeep MR*T643.948";
$regex1 = "/(\s*[0-9]+\.)/";
$regex2 = "/(?<=\s)[0-9]+\./";
I need to split at 1. and 12. only.
The first regex gives:
Array
(
[0] =>
[1] => 1Kumar/Sandeep MR*T
[2] => 4801
[3] => 23Pal/Sandeep MR*T
[4] => 948
)
The second regex gives:
Array
(
[0] => 1.1Kumar/Sandeep MR*T0148.4801
[1] => 23Pal/Sandeep MR*T643.948
)
I am trying to get:
Array
(
[0] => 1Kumar/Sandeep MR*T0148.4801
[1] => 23Pal/Sandeep MR*T643.948
)
For your example string this will work:
\b\d+\.
It makes sure there is a word boundary before the numeric part (the start of the string or a preceding space provides one), so digits that follow a letter, as in T0148., are not matched.
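A sketch of how this could be plugged into preg_split, assuming the sample string from the question. The leading \s* and the PREG_SPLIT_NO_EMPTY flag are my additions, used to drop the space before 12. and the empty first piece.

```php
<?php
$string = "1.1Kumar/Sandeep MR*T0148.4801 12.23Pal/Sandeep MR*T643.948";

// \s* swallows the space before the delimiter; \b requires a word boundary before the digits.
// PREG_SPLIT_NO_EMPTY drops the empty piece in front of the leading "1.".
$parts = preg_split('/\s*\b\d+\./', $string, -1, PREG_SPLIT_NO_EMPTY);

print_r($parts);
```

This relies on record numbers being preceded by a space or the start of the string, as in the sample.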
Given the subject
AB: CD:DEF: HIJ99:message packet - no capture
I have crafted the following regex to correctly capture the 2-5 character targets, each of which is followed by a colon:
/\s{0,1}([0-9a-zA-Z]{2,5}):\s{0,1}/
which returns my matches even if erroneous spaces are added before or after the targets
[0] => AB
[1] => CD
[2] => DEF
[3] => HIJ99
However, if the message packet contains a colon in it anywhere, for example
AB: CD:DEF: HIJ99:message packet no capture **or: this either**
it of course includes [4] => or in the resulting set, which is not desired. I want to limit the matches to a consecutive run from the beginning of the string and, once that run is broken, stop looking for matches in the remainder.
Edit 1:
Also tried ^(\s{0,1}([0-9a-zA-Z]{2,5}):\s{0,1}){1,5} to force checking from the beginning of the string for multiple matches, but then I lose the individual matches
[0] => Array
(
[0] => AB: CD:DEF: HIJ99:
)
[1] => Array
(
[0] => HIJ99:
)
[2] => Array
(
[0] => HIJ99
)
Edit 2:
Keep in mind the subject is not fixed.
AB: CD:DEF: HIJ99:message packet - no capture
could just as easily be
ZY:xw:VU:message packet no capture or: this either
for the matches we are trying to pull, with the subject being variable as well. Just trying to filter out the chance of matching a ":" in the message packet
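You could use \G, which anchors each match at the position where the previous match ended (or at the start of the string for the first match), to match only a consecutive run.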
$str = 'AB: CD:DEF: HIJ99:message packet no capture or: this either';
preg_match_all('/\G\s*([0-9a-zA-Z]{2,5}):\s*/', $str, $m);
print_r($m[1]);
Output:
Array
(
[0] => AB
[1] => CD
[2] => DEF
[3] => HIJ99
)
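Since the subject is variable, it may help to wrap the pattern in a small function and try it on both sample subjects from the question. The helper name leadingTargets is hypothetical; the pattern is the \G one shown above.

```php
<?php
// Hypothetical helper: returns the 2-5 character targets at the start of $subject.
function leadingTargets(string $subject): array
{
    // \G only allows a match at offset 0 or where the previous match ended,
    // so the first non-matching stretch ends the search.
    preg_match_all('/\G\s*([0-9a-zA-Z]{2,5}):\s*/', $subject, $m);
    return $m[1];
}

print_r(leadingTargets('AB: CD:DEF: HIJ99:message packet no capture or: this either'));
print_r(leadingTargets('ZY:xw:VU:message packet no capture or: this either'));
```

One caveat: if the message itself starts with a 2-5 character word followed by a colon, that word would still be captured as part of the run.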
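How about a negative lookbehind that rejects any candidate preceded by seven consecutive non-colon characters: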
$str = 'AB: CD:DEF: HIJ99:message packet no capture or: this either';
preg_match_all('/(?<![^:]{7})([0-9a-zA-Z]{2,5}):/', $str, $m);
print_r($m);
Output:
Array
(
[0] => Array
(
[0] => AB:
[1] => CD:
[2] => DEF:
[3] => HIJ99:
)
[1] => Array
(
[0] => AB
[1] => CD
[2] => DEF
[3] => HIJ99
)
)