preg_match - getting position of regex match in the matched string - php

So I know that if you pass in the PREG_OFFSET_CAPTURE flag you get the index of the regex match in the original "haystack", but what if I want the index of a captured group within the whole match?
Simple example:
Original String: "Have a <1 + 2> day today"
My regular expression /<1 ([+|-]) 2>/
So in the example I am matching whatever symbol is between the 1 and 2. If I did this in preg_match with the PREG_OFFSET_CAPTURE flag, the index for the matched symbol would be 10. I really would like it to return 3 though.
Is there any way to do this?

The only way is to subtract the offset of the whole match (7) from the offset of the capturing group (10): 10 - 7 = 3
$group_offset = $matches[1][1] - $matches[0][1];
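Put together, a minimal sketch of this subtraction with the string and pattern from the question might look like this (the variable names are only illustrative):

```php
<?php
$haystack = 'Have a <1 + 2> day today';

// With PREG_OFFSET_CAPTURE every entry of $matches is array(text, byte offset).
if (preg_match('/<1 ([+|-]) 2>/', $haystack, $matches, PREG_OFFSET_CAPTURE)) {
    $whole_offset = $matches[0][1];                 // offset of "<1 + 2>" in the haystack
    $group_offset = $matches[1][1] - $whole_offset; // offset of "+" inside the match
    echo $group_offset, "\n";                       // prints 3
}
```

Note that the offsets are byte offsets, so the result is a byte position inside the match.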

You could use a more tricky way by using preg_replace_callback:
$string = 'I have a <1 + 2> day today and a foo <4 - 1> week.';
$match = array();
preg_replace_callback('/<\d+ ([+|-]) \d+>/', function ($m) use (&$match) {
    $match[] = array($m[0], $m[1], strpos($m[0], $m[1]));
}, $string); // PHP 5.3+ required (anonymous function)
print_r($match);
Output:
Array
(
[0] => Array
(
[0] => <1 + 2>
[1] => +
[2] => 3
)
[1] => Array
(
[0] => <4 - 1>
[1] => -
[2] => 3
)
)
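One caveat with the strpos() approach: it returns the first occurrence of the captured text inside the match, which is not necessarily where the group matched. A hedged sketch, using a made-up input with a negative number and a slightly extended pattern, that compares it with offset subtraction over several matches:

```php
<?php
// Hypothetical input: the "-" of the operator is not the first "-" in the match.
$string = 'a <-1 - 2> b <3 + 4> c';

preg_match_all('/<-?\d+ ([+-]) -?\d+>/', $string, $matches,
    PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

$relative = array();
foreach ($matches as $m) {
    // $m[0] = array(whole match, offset), $m[1] = array(group 1, offset)
    $relative[] = array(
        'match'     => $m[0][0],
        'by_offset' => $m[1][1] - $m[0][1],
        'by_strpos' => strpos($m[0][0], $m[1][0]),
    );
}
print_r($relative);
```

For the first match the two values differ (4 by offset, 1 by strpos), so the subtraction is the safer choice when the captured text can also occur earlier in the match.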

Related

need some help on regex in preg_match_all()

I need to extract the ticket number from a string such as "Ticket#999999". How do I do this using regex?
My current regex works if there are several digits after "Ticket#" (e.g. Ticket#9999), but it does not work if there is only one (e.g. Ticket#9). Please help.
Current regex:
preg_match_all('/(Ticket#[0-9])\w\d+/i',$data,$matches);
thank you.
In your pattern, [0-9] matches one digit, \w matches one more word character (a letter, digit or underscore; here, another digit) and \d+ matches one or more digits, so at least three characters are required after #.
Use
preg_match_all('/Ticket#([0-9]+)/i',$data,$matches);
This will match:
Ticket# - a literal string Ticket#
([0-9]+) - Group 1 capturing 1 or more digits.
PHP demo:
$data = "Ticket#999999 ticket#9";
preg_match_all('/Ticket#([0-9]+)/i',$data,$matches, PREG_SET_ORDER);
print_r($matches);
Output:
Array
(
[0] => Array
(
[0] => Ticket#999999
[1] => 999999
)
[1] => Array
(
[0] => ticket#9
[1] => 9
)
)
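To see the difference, here is a small sketch that runs both patterns on the same sample string (the sample data is made up for illustration):

```php
<?php
$data = 'Ticket#999999 Ticket#99 Ticket#9';

// Original pattern: [0-9], \w and \d+ together need at least three characters after "#".
preg_match_all('/(Ticket#[0-9])\w\d+/i', $data, $old);
// Fixed pattern: one or more digits after "#", captured in group 1.
preg_match_all('/Ticket#([0-9]+)/i', $data, $new);

print_r($old[0]); // only Ticket#999999
print_r($new[1]); // 999999, 99, 9
```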

PHP regex to extract special string

I am trying to use regex to extract a certain syntax, in my case something like "10.100" or "20.111", in which 2 numbers are separated by dot(.) . So if I provide "a 10.100", it will extract 10.100 from the string. If I provide "a 10.100 20.101", it will extract 10.100 and 20.101.
Until now I have tried to use
preg_match('/^.*([0-9]{1,2})[^\.]([0-9]{1,4}).*$/', $message, $array);
but still no luck. Please provide any suggestion because I don't have strong regex knowledge. Thanks.
You may use
\b[0-9]{1,2}\.[0-9]{1,4}\b
Details:
\b - a leading word boundary
[0-9]{1,2} - 1 or 2 digits
\. - a dot
[0-9]{1,4} - 1 to 4 digits
\b - a trailing word boundary.
If you do not care about the whole word option, just remove \b. Also, to match just 1 or more digits, you may use + instead of the limiting quantifiers. So, perhaps
[0-9]+\.[0-9]+
will also work for you.
See a PHP demo:
$re = '/[0-9]+\.[0-9]+/';
$str = 'I am trying to use regex to extract a certain syntax, in my case something like "10.100" or "20.111", in which 2 numbers are separated by dot(.) . So if I provide "a 10.100", it will extract 10.100 from the string. If I provide "a 10.100 20.101", it will extract 10.100 and 20.101.';
preg_match_all($re, $str, $matches);
print_r($matches[0]);
Output:
Array
(
[0] => 10.100
[1] => 20.111
[2] => 10.100
[3] => 10.100
[4] => 10.100
[5] => 20.101
[6] => 10.100
[7] => 20.101
)
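The effect of the word boundaries and the limiting quantifiers is easiest to see on an input that contains a longer number. A small sketch, with a made-up sample string:

```php
<?php
$str = 'a 10.100 123.45678';

// Bounded version: 1-2 digits, a dot, 1-4 digits, as a whole "word".
preg_match_all('/\b[0-9]{1,2}\.[0-9]{1,4}\b/', $str, $bounded);
// Same quantifiers without \b: may match inside a longer number.
preg_match_all('/[0-9]{1,2}\.[0-9]{1,4}/', $str, $unbounded);
// Unlimited version: any number of digits on both sides of the dot.
preg_match_all('/[0-9]+\.[0-9]+/', $str, $any);

print_r($bounded[0]);   // 10.100
print_r($unbounded[0]); // 10.100, 23.4567
print_r($any[0]);       // 10.100, 123.45678
```

Which variant fits depends on whether a partial match inside a longer number should count.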
Regex: /\d+(?:\.\d+)/
1. \d+ matches one or more digits.
2. (?:\.\d+) matches a dot followed by one or more digits, like .1234
Code snippet:
<?php
ini_set('display_errors', 1);
$string='a 10.100 20.101';
preg_match_all('/\d+(?:\.\d+)/', $string, $array);
print_r($array);
Output:
Array
(
[0] => Array
(
[0] => 10.100
[1] => 20.101
)
)
If the part before the dot must be exactly two digits:
$decimals = "10.5 100.50 10.250";
preg_match_all('/\b\d{2}\.\d+\b/', $decimals, $output);
print_r($output[0]);
Output:
Array
(
[0] => 10.5
[1] => 10.250
)

Catching ids and its values from a string with preg_match

I was wondering how I can write a preg_match pattern that catches:
id=4
where 4 can be any number. How can I search a string for this?
Would /^id=[0-9]/ be correct? I'm asking because I'm not really good with preg_match.
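For 4 to be any number, we must allow the digit to repeat:
/^id\=[0-9]+/
The backslash before the equals sign is optional (= has no special meaning here), and the plus after [0-9] means one or more digits. Note that ^ only matches id= at the start of the string.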
You should go with the following:
/id=(\d+)/g
Explanations:
id= - Literal id=
(\d+) - Capturing group: \d matches any digit from 0 to 9; + repeats it one or more times (as many as possible)
/g - modifier: global. All matches (don't return on first match)
If you want to grab all ids and its values in PHP you could go with:
$string = "There are three ids: id=10 and id=12 and id=100";
preg_match_all("/id=(\d+)/", $string, $matches);
print_r($matches);
Output:
Array
(
[0] => Array
(
[0] => id=10
[1] => id=12
[2] => id=100
)
[1] => Array
(
[0] => 10
[1] => 12
[2] => 100
)
)
Note: In many regex flavors you need the /g modifier to find all matches. PHP does not support /g; it provides preg_match_all for that instead, so all you need to do is remove the g from the regex.
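For the single id from the question, preg_match with a capturing group is enough. A minimal sketch, assuming a query-string-like input invented for the example:

```php
<?php
$string = 'page.php?id=4&sort=asc';

// No ^ anchor here, because "id=" is not at the start of this string.
if (preg_match('/id=(\d+)/', $string, $m)) {
    $id = (int) $m[1]; // group 1 holds the digits after "id="
    echo $id, "\n";    // prints 4
}
```

If the input may also contain keys such as uid=, putting \b in front of id avoids matching inside them.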

regex match between 2 strings

For example I have the text
a1aabca2aa3adefa4a
I want to extract 2 and 3 (the digits between abc and def) with a regex, so 1 and 4 should not be included in the result.
I tried this
if (preg_match_all('#abc(?:a(\d)a)+def#is', file_get_contents('test.txt'), $m, PREG_SET_ORDER))
    print_r($m);
I get this
Array
(
[0] => Array
(
[0] => abca2aa3adef
[1] => 3
)
)
But I want this
Array
(
[0] => Array
(
[0] => abca2aa3adef
[1] => 2
[2] => 3
)
)
Is this possible with one preg_match_all call? How can I do it?
Thanks
preg_match_all(
    '/\d         # match a digit
    (?=.*def)    # only if followed by <anything> + def
    (?!.*abc)    # and not followed by <anything> + abc
    /x',
    $subject, $result, PREG_PATTERN_ORDER);
$result = $result[0];
works on your example. It assumes that there is exactly one instance of abc and def per line in your string.
The reason why your attempt didn't work is that your capturing group (\d) that matches the digit is within another, repeated group (?:a(\d)a)+. With every repetition, the result of the capture is overwritten. This is how regular expressions work.
In other words - see what's happening during the match:
Current position   Current part of regex    Capturing group 1
--------------------------------------------------------------
a1a                no match, advancing...   undefined
abc                abc                      undefined
a2a                (?:a(\d)a)               2
a3a                (?:a(\d)a) (repeated)    3 (overwrites 2)
def                def                      3
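The overwriting can be observed directly. A minimal sketch with the string from the question (without the PREG_SET_ORDER wrapping, since preg_match returns a single match):

```php
<?php
$subject = 'a1aabca2aa3adefa4a';

preg_match('/abc(?:a(\d)a)+def/', $subject, $m);

// $m[0] is the whole match, $m[1] only the last capture of group 1.
print_r($m);
```

The whole match is abca2aa3adef, but group 1 holds only 3, the value from the final repetition.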
You ask if it is possible with a single preg_match_all.
Indeed it is.
This code outputs exactly what you want.
<?php
$subject='a1aabca2aa3adefa4a';
$pattern='/abc(?:a(\d)a+(\d)a)def/m';
preg_match_all($pattern, $subject, $all_matches,PREG_OFFSET_CAPTURE | PREG_PATTERN_ORDER);
$res[0]=$all_matches[0][0][0];
$res[1]=$all_matches[1][0][0];
$res[2]=$all_matches[2][0][0];
var_dump($res);
?>
Here is the output:
array
0 => string 'abca2aa3adef' (length=12)
1 => string '2' (length=1)
2 => string '3' (length=1)
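The pattern above is tied to exactly two digits. If the number of digits between abc and def can vary, one option (not taken from the answers here, and using two calls rather than one) is to isolate the section first and then collect the digits from it. A sketch under that assumption:

```php
<?php
$subject = 'a1aabca2aa3adefa4a';
$digits = array();

// Step 1: capture the text between "abc" and "def" (lazy, first such section only).
if (preg_match('/abc(.*?)def/s', $subject, $section)) {
    // Step 2: collect every digit inside that section.
    preg_match_all('/\d/', $section[1], $found);
    $digits = $found[0];
}
print_r($digits); // 2, 3
```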

Replacing based on position in string

Is there a way using regex to replace characters in a string based on position?
For instance, one of my rewrite rules for a project I’m working on is “replace o with ö if o is the next-to-last vowel and even numbered (counting left to right).”
So, for example:
heabatoik would become heabatöik (o is the next-to-last vowel, as well as the fourth vowel)
habatoik would not change (o is the next-to-last vowel, but is the third vowel)
Is this possible using preg_replace in PHP?
Starting with the beginning of the subject string, you want to match 2n + 1 vowels followed by an o, but only if the o is followed by exactly one more vowel:
$str = preg_replace(
    '/^((?:(?:[^aeiou]*[aeiou]){2})*)' . # 2n vowels, n >= 0
    '([^aeiou]*[aeiou][^aeiou]*)' .      # odd-numbered vowel
    'o' .                                # even-numbered vowel is o
    '(?=[^aeiou]*[aeiou][^aeiou]*$)/',   # exactly one more vowel
    '$1$2ö',
    'heaeafesebatoik');
To do the same but for an odd-numbered o, match 2n leading vowels rather than 2n + 1:
$str = preg_replace(
    '/^((?:(?:[^aeiou]*[aeiou]){2})*)' . # 2n vowels, n >= 0
    '([^aeiou]*)' .                      # followed by non-vowels
    'o' .                                # odd-numbered vowel is o
    '(?=[^aeiou]*[aeiou][^aeiou]*$)/',   # exactly one more vowel
    '$1$2ö',
    'habatoik');
If one doesn't match, then it performs no replacement, so it's safe to run them in sequence if that's what you're trying to do.
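For convenience, the first (even-numbered) rule can be wrapped in a function and checked against the two words from the question. The function name is hypothetical, and the sketch assumes lowercase ASCII input and a UTF-8 encoded source file:

```php
<?php
// Hypothetical helper: replace "o" with "ö" when it is the next-to-last vowel
// and an even-numbered vowel (counting from the left).
function umlaut_even_o($word)
{
    return preg_replace(
        '/^((?:(?:[^aeiou]*[aeiou]){2})*)' . // 2n vowels, n >= 0
        '([^aeiou]*[aeiou][^aeiou]*)' .      // odd-numbered vowel
        'o' .                                // even-numbered vowel is o
        '(?=[^aeiou]*[aeiou][^aeiou]*$)/',   // exactly one more vowel
        '$1$2ö',
        $word
    );
}

echo umlaut_even_o('heabatoik'), "\n"; // heabatöik
echo umlaut_even_o('habatoik'), "\n";  // habatoik (unchanged)
```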
You can use preg_match_all to split the string into vowel/non-vowel parts and process that.
e.g. something like
preg_match_all('/([aeiou])|([^aeiou]+)/',
    $in,
    $out, PREG_PATTERN_ORDER);
Depending on your specific needs, you may need to modify the placement of ()*+? in the regex.
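As a sketch of that idea (the rule check itself is an illustration, not part of the answer): split the word into parts, note which parts are vowels, and rewrite the next-to-last vowel when it qualifies. The variable names are made up for the example.

```php
<?php
$in = 'heabatoik';

// Each part is either a single vowel or a run of non-vowels.
preg_match_all('/[aeiou]|[^aeiou]+/', $in, $out);
$parts = $out[0];

// Remember which parts are vowels.
$vowelIdx = array();
foreach ($parts as $i => $part) {
    if (preg_match('/^[aeiou]$/', $part)) {
        $vowelIdx[] = $i;
    }
}

$n = count($vowelIdx);
// The next-to-last vowel is vowel number $n - 1; it must be even-numbered and an "o".
if ($n >= 2 && ($n - 1) % 2 === 0 && $parts[$vowelIdx[$n - 2]] === 'o') {
    $parts[$vowelIdx[$n - 2]] = 'ö';
}
$result = implode('', $parts);
echo $result, "\n"; // heabatöik
```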
I'd like to expand on Schmitt's answer. (I don't have enough points to add a comment; I'm not trying to steal his thunder.) I would use the PREG_OFFSET_CAPTURE flag, as it returns not only the vowels but also their locations. This is my solution:
const LETTER = 0;   // index of the matched vowel in each match entry
const LOCATION = 1; // index of its offset in the string
$string = 'heabatoik';
preg_match_all('/[aeiou]/', $string, $out, PREG_OFFSET_CAPTURE);
$vowels = $out[0];
$count = count($vowels);
// the next-to-last vowel is vowel number $count - 1 (counting from 1):
// replace it if that number is even and the vowel is an o
if ($count >= 2 && ($count - 1) % 2 == 0 && $vowels[$count - 2][LETTER] == 'o')
    $string = substr_replace($string, 'ö', $vowels[$count - 2][LOCATION], 1);
Note: this is what preg_match_all stores in $out with PREG_OFFSET_CAPTURE:
preg_match_all('/[aeiou]/', 'heabatoik', $out, PREG_OFFSET_CAPTURE);
print_r($out);
Array
(
[0] => Array
(
[0] => Array
(
[0] => e
[1] => 1
)
[1] => Array
(
[0] => a
[1] => 2
)
[2] => Array
(
[0] => a
[1] => 4
)
[3] => Array
(
[0] => o
[1] => 6
)
[4] => Array
(
[0] => i
[1] => 7
)
)
)
This is how I would do it:
$str = 'heabatoik';
$vowels = preg_replace('#[^aeiou]+#i', '', $str);
$length = strlen($vowels);
if ( $length > 1 && $length % 2 && $vowels[$length - 2] == 'o' ) {
    // replace the o that is followed by exactly one more vowel
    $str = preg_replace('#o([^aeiou]*[aeiou][^aeiou]*)$#', 'ö$1', $str);
}
