I am trying to split a string at each (optional space) number followed by a dot.
$string = "1.1Kumar/Sandeep MR*T0148.4801 12.23Pal/Sandeep MR*T643.948";
$regex1 = "/(\s*[0-9]+\.)/";
$regex2 = "/(?<=\s)[0-9]+\./";
I need to split at "1." and "12.".
The first regex gives:
Array
(
[0] =>
[1] => 1Kumar/Sandeep MR*T
[2] => 4801
[3] => 23Pal/Sandeep MR*T
[4] => 948
)
The second regex gives:
Array
(
[0] => 1.1Kumar/Sandeep MR*T0148.4801
[1] => 23Pal/Sandeep MR*T643.948
)
I am trying to get:
Array
(
[0] => 1Kumar/Sandeep MR*T0148.4801
[1] => 23Pal/Sandeep MR*T643.948
)
For your example string, this will work:
\b\d+\.
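It requires a word boundary before the digits, which the start of the string or a preceding space provides. The digits after T (0148. and 643.) are not matched, because there is no word boundary between a letter and a digit.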
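A minimal sketch of how that pattern might be used with preg_split. The leading \s* and the PREG_SPLIT_NO_EMPTY flag are additions assumed here (they are not part of the answer above); they swallow the space before "12." and drop the empty piece before "1.".

```php
<?php
$string = "1.1Kumar/Sandeep MR*T0148.4801 12.23Pal/Sandeep MR*T643.948";

// \s*  : optional whitespace in front of the delimiter (assumed addition)
// \b   : word boundary, so digits glued to a letter (T0148.) are ignored
// \d+\.: the number and the dot to split on
$parts = preg_split('/\s*\b\d+\./', $string, -1, PREG_SPLIT_NO_EMPTY);

print_r($parts);
// Array
// (
//     [0] => 1Kumar/Sandeep MR*T0148.4801
//     [1] => 23Pal/Sandeep MR*T643.948
// )
```

Note that a word boundary also exists right after a dot, so this only works as long as the digits following a dot (4801, 948) are not themselves followed by another dot.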
I need some way of capturing the text outside square brackets. So for example, the following string:
My [ground]name[test]Jhon[random]petor [shorts].
I am using the preg_match_all expression below, but the result is not what I expected:
preg_match_all("/\[[^\]]*\]/", $text, $matches);
It gives me the text that is inside the square brackets.
Result :
Array (
[0] => [ground]
[1] => [test]
[2] => [random]
[3] => [shorts]
)
Expected output:
Array (
[0] => [My]
[1] => [name]
[2] => [Jhon]
[3] => [petor]
)
Any help would be great.
You can extend the pattern by adding \K, which discards what has been matched so far, and then use an alternation to match one or more word characters. The bracketed parts then produce empty matches, which array_filter removes.
\[[^][]+]\K|\w+
$re = '/\[[^][]+]\K|\w+/';
$str = 'My [ground]name[test]Jhon[random]petor [shorts].';
preg_match_all($re, $str, $matches);
print_r(array_values(array_filter($matches[0])));
Output
Array
(
[0] => My
[1] => name
[2] => Jhon
[3] => petor
)
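As a variation, here is a sketch that uses the PCRE backtracking verbs (*SKIP)(*F) instead of \K. This is a different technique from the answer above: the bracketed parts are matched and then deliberately failed, so they never show up in the result and no array_filter is needed.

```php
<?php
$str = 'My [ground]name[test]Jhon[random]petor [shorts].';

// Left alternative : match a [...] group, then (*SKIP)(*F) discards it
//                    and resumes the search after the closing bracket.
// Right alternative: a run of word characters outside any brackets.
$re = '/\[[^\[\]]*\](*SKIP)(*F)|\w+/';

preg_match_all($re, $str, $matches);
print_r($matches[0]);
// Array
// (
//     [0] => My
//     [1] => name
//     [2] => Jhon
//     [3] => petor
// )
```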
I have a string similar to this:
word1/word2/word3/<b>word3</b>
I want to explode this string by forward slash. So that I can get the following result.
Array = (
[0] => 'word1',
[1] => 'word2',
[2] => 'word3',
[3] => '<b>word3</b>'
);
But I'm unable to get the above result. Instead I'm getting the following result
Array = (
[0] => 'word1',
[1] => 'word2',
[2] => 'word3',
[3] => '<b>word3<',
[4] => 'b>'
);
What regular expression should I pass to preg_split to get the expected result?
With preg_split function and specific regex pattern:
$s = 'word1/word2/word3/<b>word3</b>';
$result = preg_split('~(?<!<)/~', $s);
print_r($result);
~ - the regex delimiter
(?<!<)/ - a forward slash with a negative lookbehind assertion, which ensures that the slash is not preceded by <
The output:
Array
(
[0] => word1
[1] => word2
[2] => word3
[3] => <b>word3</b>
)
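The lookbehind only protects a slash that directly follows <. If the input could also contain self-closing tags such as <br/> (a hypothetical case, not in the question), one possible sketch is to skip whole tags with (*SKIP)(*F) and split on the remaining slashes. The $withBr string is made up for illustration.

```php
<?php
$s      = 'word1/word2/word3/<b>word3</b>';
$withBr = 'a/<br/>b'; // hypothetical input with a self-closing tag

// <[^>]*>(*SKIP)(*F) : match a complete tag and throw it away, so no
//                      slash inside it can act as a delimiter
// /                  : any slash that is left is a real separator
$pattern = '~<[^>]*>(*SKIP)(*F)|/~';

$result   = preg_split($pattern, $s);
$resultBr = preg_split($pattern, $withBr);

print_r($result);   // word1, word2, word3, <b>word3</b>
print_r($resultBr); // a, <br/>b
```

This assumes tags never contain a literal > inside an attribute value.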
I'm trying to split a sentence at the .!? while keeping them, but for some reason it's not working correctly. What am I doing wrong?
$input = "hi i am1. hi i am2.";
$inputX = preg_split("~[.!?]+\K\b~", $input);
print_r($inputX);
Result:
Array ( [0] => hi i am1. hi i am2. )
Expected Result:
Array ( [0] => hi i am1. [1] => hi i am2. )
I am not sure if you need to do a preg_split() but try preg_match_all() if that is an option:
$input = "hi i am1. hi i am2.";
preg_match_all("/[^\.\?\!]+[\.\!\?]/", $input,$matched);
print_r($matched);
Gives you:
Array
(
[0] => Array
(
[0] => hi i am1.
[1] => hi i am2.
)
)
Try it without \b. There is no word boundary between the punctuation and the space that follows it (or the end of the string), so with \b the pattern never matches.
$input = "hi i am1. hi i am2.?! hi i am2.?";
$inputX = preg_split("~(?>[.!?]+)\K(?!$)~", $input);
print_r($inputX);
The (?!$) prevents a split at the very end of the string, so no empty trailing element is produced. The atomic group (?>...) prevents a split inside a run of punctuation at the end of the string, such as ?!. (without the atomic group, the regex would backtrack and split after the !, leaving a single . as the last element). Output:
Array
(
[0] => hi i am1.
[1] => hi i am2.?!
[2] => hi i am2.?
)
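If the whitespace between sentences should be dropped as well, a possible variation is to split on the whitespace itself and use a lookbehind to require punctuation before it. This is a sketch of a different pattern, not the one from the answer above.

```php
<?php
$input = "hi i am1. hi i am2.?! hi i am2.?";

// (?<=[.!?]) : the split point must be preceded by sentence punctuation
// \s+        : the whitespace between sentences is the delimiter, so it
//              is removed and the punctuation stays with its sentence
$sentences = preg_split('~(?<=[.!?])\s+~', $input, -1, PREG_SPLIT_NO_EMPTY);

print_r($sentences);
// Array
// (
//     [0] => hi i am1.
//     [1] => hi i am2.?!
//     [2] => hi i am2.?
// )
```

Sentences that are not separated by whitespace (for example "one.two") would stay together with this pattern.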
I hope this is what you are expecting:
$input = "hi i am1. hi i !am?2."; // i have added other ?! symbols also
$inputX = preg_split("/(\.|\!|\?)/", $input,-1,PREG_SPLIT_DELIM_CAPTURE);
print_r($inputX);
Output:
Array ( [0] => hi i am1 [1] => . [2] => hi i [3] => ! [4] => am [5] => ? [6] => 2 [7] => . [8] => )
Given the subject:
AB: CD:DEF: HIJ99:message packet - no capture
I have crafted the following regex to capture correctly the 2-5 character targets which are all followed by a colon.
/\s{0,1}([0-9a-zA-Z]{2,5}):\s{0,1}/
which returns my matches even if erroneous spaces are added before or after the targets
[0] => AB
[1] => CD
[2] => DEF
[3] => HIJ99
However, if the message packet contains a colon in it anywhere, for example
AB: CD:DEF: HIJ99:message packet no capture **or: this either**
it of course includes [4] => or in the resulting set, which is not desired. I want to limit the matches to a consecutive run from the beginning of the string; once that run is broken, stop looking for matches in the remainder.
Edit 1:
Also tried ^(\s{0,1}([0-9a-zA-Z]{2,5}):\s{0,1}){1,5} to force checking from the beginning of the string for multiple matches, but then I lose the individual matches
[0] => Array
(
[0] => AB: CD:DEF: HIJ99:
)
[1] => Array
(
[0] => HIJ99:
)
[2] => Array
(
[0] => HIJ99
)
Edit 2:
Keep in mind that the subject is not fixed.
AB: CD:DEF: HIJ99:message packet - no capture
could just as easily be
ZY:xw:VU:message packet no capture or: this either
for the matches we are trying to pull, with the subject being variable as well. I am just trying to rule out matching a ":" in the message packet.
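You could use \G to match consecutively. \G anchors each match at the position where the previous match ended (or at the start of the string for the first match), so matching stops at the first gap.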
$str = 'AB: CD:DEF: HIJ99:message packet no capture or: this either';
preg_match_all('/\G\s*([0-9a-zA-Z]{2,5}):\s*/', $str, $m);
print_r($m[1]);
Output:
Array
(
[0] => AB
[1] => CD
[2] => DEF
[3] => HIJ99
)
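Building on the \G pattern, here is a sketch that also returns the remaining message packet. The function name splitTargets is hypothetical; it simply measures how much of the subject the consecutive matches consumed.

```php
<?php
// Hypothetical helper: returns [targets, rest of the subject].
function splitTargets(string $subject): array
{
    // \G keeps every match glued to the end of the previous one.
    preg_match_all('/\G\s*([0-9a-zA-Z]{2,5}):\s*/', $subject, $m);

    // The full matches are contiguous from offset 0, so their combined
    // length is where the message packet begins.
    $consumed = strlen(implode('', $m[0]));

    return [$m[1], substr($subject, $consumed)];
}

[$targets, $rest] = splitTargets('AB: CD:DEF: HIJ99:message packet no capture or: this either');
print_r($targets); // AB, CD, DEF, HIJ99
echo $rest, "\n";  // message packet no capture or: this either
```

A packet that itself starts with a 2-5 character word followed by a colon would still be captured as a target; the pattern alone cannot tell the two apart.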
How about a negative lookbehind that rejects any target preceded by seven non-colon characters:
$str = 'AB: CD:DEF: HIJ99:message packet no capture or: this either';
preg_match_all('/(?<![^:]{7})([0-9a-zA-Z]{2,5}):/', $str, $m);
print_r($m);
Output:
Array
(
[0] => Array
(
[0] => AB:
[1] => CD:
[2] => DEF:
[3] => HIJ99:
)
[1] => Array
(
[0] => AB
[1] => CD
[2] => DEF
[3] => HIJ99
)
)
I'm having trouble splitting this string into an array in the pattern I need it to be. The string is:
ATTRIBUTE1: +VALUE1;
ATTRIBUTE2: -VALUE2%;
I essentially need an array like this:
array (
[0] => "ATTRIBUTE1",
[1] => "+",
[2] => "VALUE1",
[3] => "%"
)
array (
[0] => "ATTRIBUTE2", ...
)
The "%" is optional, but the +/- sign is not. Any help would be appreciated!
You could use regex:
$text= "ATTRIBUTE1: +VALUE1;\nATTRIBUTE2: -VALUE2%;";
echo "STRING\n" . $text . "\n\n";
preg_match_all("~
^ # match start of line
([^:]+):\s* # match anything that's not a ':' (attribute), followed by a colon and spaces
([+-]) # match a plus or a minus sign
([^%;]+) # match anything that's not a '%' or ';' (value)
(%?) # optionally match percent sign
;\s*$ # match ';' then optional spaces and end of line
~mx", $text, $matches, PREG_SET_ORDER);
print_r($matches);
Prints:
STRING
ATTRIBUTE1: +VALUE1;
ATTRIBUTE2: -VALUE2%;
Array
(
[0] => Array
(
[0] => ATTRIBUTE1: +VALUE1;
[1] => ATTRIBUTE1
[2] => +
[3] => VALUE1
[4] =>
)
[1] => Array
(
[0] => ATTRIBUTE2: -VALUE2%;
[1] => ATTRIBUTE2
[2] => -
[3] => VALUE2
[4] => %
)
)
You might have to play around with the regex, but it's got comments now so it shouldn't be too hard to figure it out.
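To get exactly the four-element arrays asked for in the question, the full match at index 0 can be dropped from each result. A sketch, with the hypothetical function name parseAttributes and the same pattern written on one line:

```php
<?php
// Hypothetical helper: one [attribute, sign, value, percent] array per line.
function parseAttributes(string $text): array
{
    preg_match_all(
        '~^([^:]+):\s*([+-])([^%;]+)(%?);\s*$~m',
        $text,
        $matches,
        PREG_SET_ORDER
    );

    $rows = [];
    foreach ($matches as $m) {
        // Index 0 is the whole line; keep only the captured groups.
        $rows[] = array_slice($m, 1);
    }
    return $rows;
}

$rows = parseAttributes("ATTRIBUTE1: +VALUE1;\nATTRIBUTE2: -VALUE2%;");
print_r($rows);
// [0] => ATTRIBUTE1, +, VALUE1, (empty string)
// [1] => ATTRIBUTE2, -, VALUE2, %
```

When the percent sign is absent, the last element is an empty string rather than missing.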
One thing you could do:
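// $str holds a single line, e.g. "ATTRIBUTE2: -VALUE2%;"
$parts = explode(": ", rtrim($str, ";")); // drop the trailing ';' first
$array[0] = $parts[0];                    // attribute name
$array[1] = substr($parts[1], 0, 1);      // leading + or - sign
if (substr($parts[1], -1) == "%") {
    $array[2] = substr($parts[1], 1, -1); // value without sign and '%'
    $array[3] = "%";
} else {
    $array[2] = substr($parts[1], 1);     // value without sign
}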