Trying to learn to write a RegExp for PHP - php

I am trying to learn how to write a RegEx, but all my searches seem to lead to unclear information. So my question is twofold:
1) Does anyone have a good source on how a newbie could learn to write RegExs?
2) How could I write a RegEx that breaks the string 1y 311d 16h 42m into variables?
I'm looking to take the above text string and break it into something like:
$duration[years] = 1;
$duration[days] = 311;
$duration[hours] = 16;
$duration[minutes] = 42;
Please note that the number of digits may not always be the same; for example, the days could be two digits, like 25d. Some parts could also be omitted: I might get just days and hours. Lastly, the order might change; perhaps days are written before years, etc.
I know I could do this easily with an explode function and strpos, but I really want to learn Regex so I am using this as an example as I understand they can be very powerful for things like this.

1) Some useful pages:
http://www.regular-expressions.info
https://regex101.com/
http://php.net/manual/en/book.pcre.php
2) Specifically, this:
$pattern = '/(?:(?P<years>\d+)y\s*)?(?:(?P<days>\d+)d\s*)?(?:(?P<hours>\d+)h\s*)?(?:(?P<minutes>\d+)m\s*)?/';
preg_match($pattern, '1y 311d 16h 42m', $duration);
print_r($duration);
// Array
// (
// [0] => 1y 311d 16h 42m
// [years] => 1
// [1] => 1
// [days] => 311
// [2] => 311
// [hours] => 16
// [3] => 16
// [minutes] => 42
// [4] => 42
// )
preg_match($pattern, '311d 42m', $duration);
print_r($duration);
// Array
// (
// [0] => 311d 42m
// [years] =>
// [1] =>
// [days] => 311
// [2] => 311
// [hours] =>
// [3] =>
// [minutes] => 42
// [4] => 42
// )
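As the output shows, $duration carries every value twice (by name and by number), plus empty strings for parts that were absent. Here is a minimal sketch of one way to reduce it to the shape asked for in the question, assuming you only want the named parts that were actually present. The variable $m and the filtering step are additions for illustration, not part of the pattern itself:

```php
<?php
$pattern = '/(?:(?P<years>\d+)y\s*)?(?:(?P<days>\d+)d\s*)?(?:(?P<hours>\d+)h\s*)?(?:(?P<minutes>\d+)m\s*)?/';
preg_match($pattern, '311d 42m', $m);

// Keep only the named (string) keys whose value is not empty,
// then cast the remaining values to integers.
$duration = array_map('intval', array_filter($m, function ($value, $key) {
    return is_string($key) && $value !== '';
}, ARRAY_FILTER_USE_BOTH));

print_r($duration);
```

With the input above this leaves only the days and minutes entries.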
This pattern assumes a fixed order, though. If the order can change, a regular expression is not the best tool. It's still possible in this case, but rather awkward. Here it is:
$pattern = "/(?=.*?(?:(?P<years>\d+)y|$))(?=.*?(?:(?P<days>\d+)d|$))(?=.*?(?:(?P<hours>\d+)h|$))(?=.*?(?:(?P<minutes>\d+)m|$))/";
preg_match($pattern, '311d 16h 1y', $duration);
print_r($duration);
// Array
// (
// [0] =>
// [years] => 1
// [1] => 1
// [days] => 311
// [2] => 311
// [hours] => 16
// [3] => 16
// )
Entering these patterns (without the leading and trailing slashes) in regex101 will give you the explanation of what exactly it is trying to match. Find other examples from the regex tag questions and enter them as well, and try to see how they work. Experience is the best teacher.
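If the order really can vary, another option is to stop matching the whole string at once and collect every number/unit pair with preg_match_all instead. This is only a sketch and not the lookahead approach above: the function name parse_duration and the unit-to-name map are hypothetical, and it assumes each unit appears at most once.

```php
<?php
// Hypothetical helper, not from the original answer.
function parse_duration(string $text): array
{
    // Assumed mapping from unit letter to array key.
    $names = ['y' => 'years', 'd' => 'days', 'h' => 'hours', 'm' => 'minutes'];
    $duration = [];

    // Each match is a run of digits followed directly by one unit letter.
    preg_match_all('/(\d+)([ydhm])\b/', $text, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $duration[$names[$m[2]]] = (int) $m[1];
    }
    return $duration;
}

print_r(parse_duration('311d 16h 1y'));
```

The keys come back in the order they appear in the input, so omitted or reordered parts need no special handling.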

Related

php split / cluster binary into chunks based on next 1 in cycle

I need to figure out a method using PHP to chunk the 1's and 0's into sections.
1001 would look like: array(100,1)
1001110110010011 would look like: array(100,1,1,10,1,100,100,1,1)
It gets different when the sequence starts with 0s: I would like each leading 0 to get its own block until the first 1 is reached.
00110110 would look like (0,0,1,10,1,10)
How would this be done with PHP?
You can use preg_match_all to split your string, using the following regex:
10*|0
This matches either a 1 followed by some number of 0s, or a 0. Since a regex always tries to match the parts of an alternation in the order they occur, the second part will only match 0s that are not preceded by a 1, that is those at the start of the string. PHP usage:
$beatstr = '1001110110010011';
preg_match_all('/10*|0/', $beatstr, $m);
print_r($m);
$beatstr = '00110110';
preg_match_all('/10*|0/', $beatstr, $m);
print_r($m);
Output:
Array
(
[0] => Array
(
[0] => 100
[1] => 1
[2] => 1
[3] => 10
[4] => 1
[5] => 100
[6] => 100
[7] => 1
[8] => 1
)
)
Array
(
[0] => Array
(
[0] => 0
[1] => 0
[2] => 1
[3] => 10
[4] => 1
[5] => 10
)
)
Demo on 3v4l.org
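A quick way to convince yourself that the pattern skips nothing is to glue the chunks back together and compare with the input. A small sketch, where chunk_bits is a hypothetical wrapper name around the same pattern:

```php
<?php
// Hypothetical helper wrapping the pattern from the answer.
function chunk_bits(string $bits): array
{
    // A 1 followed by any number of 0s, or a single (leading) 0.
    preg_match_all('/10*|0/', $bits, $m);
    return $m[0];
}

$chunks = chunk_bits('00110110');
print_r($chunks);

// Concatenating the chunks should reproduce the original string.
var_dump(implode('', $chunks) === '00110110');
```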

conversion from gis to lat/long give me big trouble in charset

I have a string like
$input="16°28'60,00''"
that is stored in my DB as TEXT utf8_general_ci.
I'm trying to convert it to the decimal lat/long system, so I wrote a function that splits the input and converts it.
I'm indexing $input like an array, and at position 2 I get a strange result that breaks my function:
$input[2]---> 'b"Â"'
Position 2 is where the "°" is.
The next line checks whether the "°" exists, but because of this error it can't work:
if($tempD == iconv("UTF-8", "ISO-8859-1//TRANSLIT", '°'))
How can I fix that?
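Indexing a string with $input[2] works on bytes, and "°" takes two bytes in UTF-8 (0xC2 0xB0), so you only get the first half of the character. If the format of the DB string is always the same, just grab the numbers out and you don't need to bother with the degree, minute and second symbols at all.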
$input = "16°28'60,00''";
preg_match_all("/(\d+)/", $input, $match);
print_r($match);
Output:
Array
(
[0] => Array
(
[0] => 16
[1] => 28
[2] => 60
[3] => 00
)
[1] => Array
(
[0] => 16
[1] => 28
[2] => 60
[3] => 00
)
)
Now you have each number and you can convert it easily.
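To go from those numbers to decimal degrees, the usual formula is degrees + minutes/60 + seconds/3600. Here is a small sketch, assuming the format is always degrees, minutes, seconds with an optional decimal comma (or point) in the seconds. dms_to_decimal is a hypothetical name, and hemisphere/sign handling is left out because the sample string has none:

```php
<?php
// Hypothetical helper, not from the original answer.
function dms_to_decimal(string $dms): ?float
{
    // Three numbers separated by any non-digit bytes; the seconds may
    // carry a fractional part written with a comma or a point.
    if (!preg_match('/(\d+)\D+(\d+)\D+(\d+(?:[.,]\d+)?)/', $dms, $m)) {
        return null;
    }
    $seconds = (float) str_replace(',', '.', $m[3]);
    return (int) $m[1] + (int) $m[2] / 60 + $seconds / 3600;
}

printf("%.6f\n", dms_to_decimal("16°28'60,00''")); // 16.483333
```

Because the pattern only looks at digits and skips everything else with \D+, the multi-byte "°" never has to be compared or converted.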

PHP: Can preg_match include unmatched groups?

Can the preg_match() function include groups it did not find in the matches array?
Here is the pattern I'm using:
/^([0-9]+)(\.[0-9]+)?\s?([^iIbB])?([iI])?([bB])?$/
What I'm trying to do is parse a human-readable size into bytes. This pattern fits my requirement, but only if I can retrieve matches in the absolute group order.
This can produce up to 5 match groups, which would result in a matches array with indices 0-5. However, if the string does not match all groups, then the matches array may have, for example, group 5 actually at index 3.
What I'd like is for the final group in that pattern (5) to always be at the same index of the matches array. Because multiple groups are optional, it's very important that when reading the matches array we know which group in the expression got matched.
Example situation: The regex tester at regexr.com will show all 5 groups including those not matched always in the correct order. By enabling the "global" and "multi-line" flags and using the following text, you can hover over the blue matches for a good visual.
500.2 KiB
256M
700 Mb
1.2GiB
You'll notice that not all groups are always matched, however the group indexes are always in the correct order.
Edit: Yes I did already try this in PHP with the following:
$matches = [];
$matchesC = 0;
$matchesN = 6;
if (!preg_match("/^([0-9]+)(\.[0-9]+)?\s?([^iIbB])?([iI])?([bB])?$/", $size, $matches) || ($matchesC = count($matches)) < $matchesN) {
print_r($matches);
throw new \Exception(sprintf("Could not parse size string. (%d/%d)", $matchesC, $matchesN));
}
When $size is "256M" that print_r($matches); returns:
Array
(
[0] => 256M
[1] => 256
[2] =>
[3] => M
)
Groups 4 and 5 are missing.
Groups that do not participate in the match are not initialized, so groups 4 and 5 have no value for the string '256M'. preg_match drops such unmatched groups from the end of the array, while an unmatched group in the middle (like group 2 here) is reported as an empty string.
In your case, you can make the capturing groups themselves non-optional and make the patterns inside them optional instead, so every group always participates, possibly matching an empty string:
$arr = array('500.2 KiB', '256M', '700 Mb', '1.2GiB');
foreach ($arr as $s) {
if (preg_match('~^([0-9]+)(\.[0-9]+)?\s?([^ib]?)(i?)(b?)$~i', $s, $m)) {
print_r($m);
}
}
Output:
Array
(
[0] => 500.2 KiB
[1] => 500
[2] => .2
[3] => K
[4] => i
[5] => B
)
Array
(
[0] => 256M
[1] => 256
[2] =>
[3] => M
[4] =>
[5] =>
)
Array
(
[0] => 700 Mb
[1] => 700
[2] =>
[3] => M
[4] =>
[5] => b
)
Array
(
[0] => 1.2GiB
[1] => 1
[2] => .2
[3] => G
[4] => i
[5] => B
)
See the PHP demo.
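On newer PHP versions there is also a flag aimed at this. This is a sketch, assuming PHP 7.4 or later, where, as far as I know, PREG_UNMATCHED_AS_NULL reports every unmatched group as null, including trailing ones, so the array always has the same size. On older versions the trailing groups may still be dropped:

```php
<?php
// The original pattern from the question, with optional groups left as they are.
$pattern = '/^([0-9]+)(\.[0-9]+)?\s?([^iIbB])?([iI])?([bB])?$/';

if (preg_match($pattern, '256M', $m, PREG_UNMATCHED_AS_NULL)) {
    // Unmatched groups are null instead of being empty or missing.
    var_dump(count($m));
    var_dump($m[3]);
    var_dump($m[5]);
}
```

This also lets you tell a group that matched an empty string apart from one that did not match at all.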
You can use T-Regx which can handle such cases with ease! It always checks whether a group is matched, even if it's last and unmatched. It also can differentiate between "" (matched empty) or null (unmatched):
pattern('^([0-9]+)(\.[0-9]+)?\s?([^iIbB])?([iI])?([bB])?$')
->match($size)
->first(function (Match $match) {
// whether the group was used in a pattern
$match->hasGroup(14);
// whether the group was matched, even if last or empty string
$match->matched(5);
// group, or default value if not matched
$match->group(5)->orReturn('unmatched');
});

How to manipulate complex strings in php?

I am trying to group a bunch of text from a string and create an array from it.
The string is something like this:
<em>string</em> and the <em>test</em> here.
tableBegin rowNumber:2, columnNumber:2 11 22 33 44 tableEnd
<em>end</em> text here
I was hoping to get an array like the following:
array (0 => '<em>string</em> and the <em>test</em> here.',
1=>'rowNumber:2',
2=>'columnNumber:2',
3=>'11',
4=>'22',
5=>'33',
6=>'44',
7=>'<em>end</em> text here')
11,22,33,44 are the table cell data the user enters. I want to make them have unique index but keep the rest of texts together.
tableBegin and tableEnd are just the check for the table cell data
Any help or tips? Thanks a lot!
You may try the following, note that you need PHP 5.3+:
$string = '<em>string</em> and the <em>test</em> here.
tableBegin rowNumber:2, columnNumber:2 11 22 33 44 tableEnd
SOme other text
tableBegin rowNumber:3, columnNumber:3 11 22 33 44 55 tableEnd
<em>end</em> text here';
$array = array();
preg_replace_callback('#tableBegin\s*(.*?)\s*tableEnd\s*|.*?(?=tableBegin|$)#s', function($m)use(&$array){
if(isset($m[1])){ // If group 1 exists, which means if the table is matched
$array = array_merge($array, preg_split('#[\s,]+#s', $m[1])); // add the split string to the array
// split by one or more whitespace or comma --^
}else{// Else just add everything that's matched
if(!empty($m[0])){
$array[] = $m[0];
}
}
}, $string);
print_r($array);
Output
Array
(
[0] => <em>string</em> and the <em>test</em> here.
[1] => rowNumber:2
[2] => columnNumber:2
[3] => 11
[4] => 22
[5] => 33
[6] => 44
[7] => SOme other text
[8] => rowNumber:3
[9] => columnNumber:3
[10] => 11
[11] => 22
[12] => 33
[13] => 44
[14] => 55
[15] => <em>end</em> text here
)
Regex explanation
tableBegin : match tableBegin
\s* : match a whitespace zero or more times
(.*?) : match everything ungreedy and put it in group 1
\s* : match a whitespace zero or more times
tableEnd : match tableEnd
\s* : match a whitespace zero or more times
| : or
.*?(?=tableBegin|$) : match everything until tableBegin or the end of the string
The s modifier : make dots also match newlines
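The two branches explained above can also be consumed with preg_match_all, which avoids the by-reference callback. This is a sketch under a few assumptions: split_tables is a hypothetical name, surrounding text is trimmed, and with PREG_SET_ORDER group 1 is only set when the table branch matched, which is the same isset check the callback version relies on.

```php
<?php
// Hypothetical helper: same pattern as above, consumed with preg_match_all.
function split_tables(string $string): array
{
    $pattern = '#tableBegin\s*(.*?)\s*tableEnd\s*|.*?(?=tableBegin|$)#s';
    preg_match_all($pattern, $string, $sets, PREG_SET_ORDER);

    $result = [];
    foreach ($sets as $m) {
        if (isset($m[1])) {
            // Table branch: split the body on whitespace and commas.
            $cells = preg_split('#[\s,]+#', $m[1], -1, PREG_SPLIT_NO_EMPTY);
            $result = array_merge($result, $cells);
        } elseif (trim($m[0]) !== '') {
            // Text branch: keep the surrounding text as one element.
            $result[] = trim($m[0]);
        }
    }
    return $result;
}

print_r(split_tables('intro text tableBegin a:1, b:2 11 22 tableEnd outro'));
```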
Here is the ugly way to do it, if you can't find a Regex guru out there.
So, this is your text
$string = "<em>string</em> and the <em>test</em> here.
tableBegin rowNumber:2, columnNumber:2 11 22 33 44 tableEnd
<em>end</em> text here";
And this is my code
$E = explode(' ', $string);
$A = $E[0].$E[1].$E[2].$E[3].$E[4].$E[5];
$B = $E[17].$E[18].$E[19];
$All = [$A, $E[8],$E[9], $E[11], $E[12], $E[13], $E[14], $B];
print_r($All);
And this is the output
Array
(
[0] => stringandthetesthere.
[1] => rowNumber:2,
[2] => columnNumber:2
[3] => 11
[4] => 22
[5] => 33
[6] => 44
[7] => endtexthere
)
Of course, the <em> tags won't be visible unless you view the page source.

preg_match to match substring of three numbers consecutively?

I have a fairly big string:
$text_arr="101104105106109111112113114116117120122123124";
I want to split it into three-digit numbers like 101, 104, 105 and store them in $array. What should I do?
I tried doing this:
preg_match_all('/[0-9]{3}$/',"$text_arr",$array);
The easiest way to do this is with preg_split():
$result = preg_split('/(\d{3})/', $text_arr, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
The result:
Array
(
[0] => 101
[1] => 104
[2] => 105
[3] => 106
[4] => 109
[5] => 111
[6] => 112
[7] => 113
[8] => 114
[9] => 116
[10] => 117
[11] => 120
[12] => 122
[13] => 123
[14] => 124
)
Though you could use a regular expression for this, a simple standard function is likely to be faster:
$groups = str_split($text_arr, 3); // returns the array you want
See the str_split() documentation for details.
You have to remove the $ anchor from the end of your expression. It ties the match to the end of the string, so only the last three digits are returned.
Try it like this:
preg_match_all('/[0-9]{3}/', $text_arr, $array);
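To see the effect of the anchor side by side, here is a short sketch on a shortened input; the variable names $anchored and $all are just for illustration:

```php
<?php
$text = '101104105106';

// With the trailing $, only the last three digits can match.
preg_match_all('/[0-9]{3}$/', $text, $anchored);
print_r($anchored[0]);

// Without it, every consecutive group of three digits is returned.
preg_match_all('/[0-9]{3}/', $text, $all);
print_r($all[0]);

// For input made up only of digits, this is the same as str_split().
var_dump($all[0] === str_split($text, 3));
```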
Or simply use this, which is the simplest code:
<?php
$string = "101104105106109111112113114116117120122123124";
$parts = str_split($string, 3);
$res=implode(',',$parts);
echo($res);
?>
