regex match as 1 unit - php

I am using PHP to run a regex on some strings.
The strings look like:
somethingsomethin.somethingsomething.extension
I want to match the part between the 2 periods, including the 2 periods themselves:
.somethingsomething.
I came up with something simple like: \..+\.
The problem is that it matches all the periods in something like this:
somethingsomethin....somethingsomething....extension is matched as ....somethingsomething.... when I only want .somethingsomething. to match.
How can I get my regex to match as "1 unit" and to match only once?

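Since an unescaped . matches any character, including a literal period, exclude periods from the middle part: \.[^.]+\. or possibly \.\w+\. if the middle only contains word characters.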

The . matches any character, so \..+\. runs from the first period to the last one in your example. Try this:
<?php
$str = 'somethingsomethin....somethingsomething....extension';
preg_match('#\.\w+\.#', $str, $m);
print_r($m);
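To make the difference visible, here is a minimal sketch that runs the original greedy pattern and the two suggested fixes against the problem string. The array keys and variable names are only illustrative.

```php
<?php
// Compare the original greedy pattern with the two suggested fixes.
$str = 'somethingsomethin....somethingsomething....extension';

$patterns = [
    'greedy'    => '#\..+\.#',     // .+ also consumes periods
    'negated'   => '#\.[^.]+\.#',  // [^.]+ stops at the next period
    'wordchars' => '#\.\w+\.#',    // \w+ never matches a period
];

$results = [];
foreach ($patterns as $name => $pattern) {
    preg_match($pattern, $str, $m);
    $results[$name] = $m[0];
    echo $name, ': ', $m[0], "\n";
}
```

The greedy version returns ....somethingsomething.... while the other two return .somethingsomething. because neither [^.] nor \w can step over a period.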

Related

preg_replace - similar patterns

I have a string that contains something like "LAB_FF, LAB_FF12" and I'm trying to use preg_replace to look for both patterns and replace them with different strings, using this pattern:
/LAB_[0-9A-F]{2}|LAB_[0-9A-F]{4}/
So input would be
LAB_FF, LAB_FF12
and the output would need to be
DAB_FF, HAD_FF12
Problem is, for the second string, it interprets it as "LAB_FF" instead of "LAB_FF12" and so the output is
DAB_FF, DAB_FF
I've tried splitting the input line out using 2 different preg_match statements, the first looking for the {2} pattern and the second looking for the {4} pattern. This sort of works in that I can get the correct output into 2 separate strings but then can't combine the two strings to give the single amended output.
\b is a word boundary, meaning the match must end where the word ends instead of stopping partway through a longer token.
https://regex101.com/r/upY0gn/1
$pattern = "/\bLAB_[0-9A-F]{2}\b|\bLAB_[0-9A-F]{4}\b/";
Following up on the comment on the other answer about how to replace the string, this is one way.
With each alternative in its own capture group, the output array gets an empty entry for each group that failed ahead of the one that matched.
In this case one (the first).
Then it's just a matter of substr.
$re = '/(\bLAB_[0-9A-F]{2}\b)|(\bLAB_[0-9A-F]{4}\b)/';
$str = 'LAB_FF12';
preg_match($re, $str, $matches);
var_dump($matches);
// Index 0 is the full match, so the substitutes line up with groups 1 and 2.
$substitutes = ["", "DAB", "HAD"];
for ($i = 1; $i < count($matches); $i++) {
    if ($matches[$i] != "") {
        // Drop the leading "LAB" and prepend the replacement prefix.
        $result = $substitutes[$i] . substr($matches[$i], 3);
        break;
    }
}
echo $result;
https://3v4l.org/gRvHv
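As an alternative to matching and rebuilding each token separately, the whole input line can probably be handled in one pass with preg_replace_callback. This is only a sketch: it assumes the replacement prefix depends solely on whether the hex part has 2 or 4 digits, and it puts the 4-digit branch first inside a single group.

```php
<?php
// One-pass replacement: the prefix is chosen from the length of the hex part.
$input = 'LAB_FF, LAB_FF12';

$output = preg_replace_callback(
    '/\bLAB_([0-9A-F]{4}|[0-9A-F]{2})\b/',
    function (array $m): string {
        // $m[1] is the hex part; 4 digits -> HAD, 2 digits -> DAB
        $prefix = strlen($m[1]) === 4 ? 'HAD' : 'DAB';
        return $prefix . '_' . $m[1];
    },
    $input
);

echo $output, "\n";
```

For the sample input this prints DAB_FF, HAD_FF12, which is the single amended string the question asks for.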
You can specify exact amounts in one set of curly braces, e.g. {2,4}.
Just tested this and seems to work:
/LAB_[0-9A-F]{2,4}/
LAB_FF, LAB_FFF, LAB_FFFF
EDIT: My mistake, that actually matches between 2 and 4 characters. Alternatives are tried left to right and the first one that succeeds wins, so if you put the longer one first it matches that when it can, e.g.
/LAB_([0-9A-F]{4}|[0-9A-F]{2})/
LAB_FF, LAB_FFFF
EDIT2: The following will match LAB_ followed by an even number of hex characters:
/LAB_([0-9A-F]{2})+/
LAB_FF, LAB_FFFF, LAB_FFFFFF...
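The ordering point from the first edit can be checked with a short sketch that runs the question's pattern and the reordered one against the same input. The variable names are illustrative.

```php
<?php
$input = 'LAB_FF, LAB_FF12';

// Shorter alternative first: the 2-digit branch wins even on LAB_FF12.
preg_match_all('/LAB_[0-9A-F]{2}|LAB_[0-9A-F]{4}/', $input, $shortFirst);

// Longer alternative first: LAB_FF12 is matched in full.
preg_match_all('/LAB_([0-9A-F]{4}|[0-9A-F]{2})/', $input, $longFirst);

print_r($shortFirst[0]);
print_r($longFirst[0]);
```

The first call yields LAB_FF twice, the second yields LAB_FF and LAB_FF12.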

PHP match for strings between two (starting, ending) delimiters

String;
RandomValue1:|RandomSentence1.|RandomValue2:|RandomSentence2.|
I'm trying to match RandomSentence1. and RandomSentence2. I figured the "." in the sentence could be used to help the matching since every sentence ends with a period. If the period isn't included in my match, I'm OK with that. I've never been very good at RegEx but I'm always willing to try and learn. From the results on here I haven't been able to come up with anything that works. I'd be coding this in PHP. I believe either preg_match() or preg_split() would be the function to use here.
I initially tried: .*:\|.*\.\|
But that just matches the entire string since it ends with .|.
Then I tried this: .*:\|\s*(.*?)\s*\|
But that only matched the RandomSentence2.
These are adaptations of what I've found online.
This regex should capture them all. It looks for a run of characters that are neither . nor |, followed by a . and a |:
preg_match_all('/([^.|]+\.)\|/', $string, $matches);
print_r($matches[1]);
An alternate if you want to do something with the other entries would be to split and then find what you want. Split on | then grep for array values ending in .:
$matches = preg_grep('/\.$/', explode('|', $string));
Since you already know there is a dot at the end, you can just match them all with something simple like (?<=\|)[^|.]+(?=\.\|)
https://regex101.com/r/ZsHcWq/1
(?<= \| )
[^|.]+
(?= \.\| )
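For comparison, a small sketch that runs the capture-group pattern and the lookaround pattern on the sample string from the question. The result variable names are only illustrative.

```php
<?php
$string = 'RandomValue1:|RandomSentence1.|RandomValue2:|RandomSentence2.|';

// Capture group version: keeps the trailing period.
preg_match_all('/([^.|]+\.)\|/', $string, $withDot);

// Lookaround version: the period and pipes are asserted, not consumed.
preg_match_all('/(?<=\|)[^|.]+(?=\.\|)/', $string, $withoutDot);

print_r($withDot[1]);
print_r($withoutDot[0]);
```

The first result holds RandomSentence1. and RandomSentence2. with their periods; the second holds the same sentences without them.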

PHP regex: find the first occurrence of a given pattern in a string

my string can be
new-york-10036
or
chicago-55036
the desired result is
new-york
chicago
and I basically want to remove everything that comes after the first dash (-) that is followed by a number.
It seems easy but I don't know how.
You can use a positive lookahead, like so:
(.+)(?=\-\d)
The regex reads: "get me everything up to a point that is immediately followed by a dash and then a digit". Because .+ is greedy, that point is the last such dash in the string.
Given the input new-york-10036 the regex is going to capture only new-york. In PHP you can get the matched string with:
$string = 'new-york-10036';
$regex = '/(.+)(?=\-\d)/';
preg_match($regex, $string, $return);
echo $return[0] . "\n";
It outputs new-york.
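If an input could contain more than one dash followed by a digit, the greedy .+ stops at the last one rather than the first. A lazy quantifier is one way to stop at the first. The sketch below uses a hypothetical input, not one from the question, to show the difference.

```php
<?php
// Hypothetical input with two "dash + digit" positions.
$string = 'new-york-10036-2';

preg_match('/^.+(?=-\d)/', $string, $greedy);  // stops at the LAST dash + digit
preg_match('/^.+?(?=-\d)/', $string, $lazy);   // stops at the FIRST dash + digit

echo $greedy[0], "\n";
echo $lazy[0], "\n";
```

Here the greedy pattern returns new-york-10036 and the lazy one returns new-york. For the two inputs in the question both variants give the same result.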

PHP RegExp match and substitute

I am testing a RegExp with the online regexr.com tool. I will test the string against multiple cases, but I can't get substitution to work.
The RegEx for matching the string is:
/^[0-9]{1,3}[0-9]{6,7}$/
Which matches local mobile number in my country like this:
0921234567
But then I want to substitute the number in this way: add a "+" sign, add my country code "385", add a "." sign, and then finally add the matched number with the leading zero stripped.
Final number will be:
+385.921234567
I have a basic idea of how to insert the matched string, but I am not sure how to prepend characters and strip the zero from the matched string in the following substitution pattern:
\+$&\n\t
I will use PHP preg_replace function.
EDIT:
As someone wisely mentioned, there is a possibility that there will be one, two or no leading zeros, but I will create separate test cases with a regex just testing the number of zeros. Doing so in one regex seems too complicated for now.
Possible numbers will be:
0921234567
00111921234567
Where 111 is country code. I know that some country codes consist of 2 or 3 digits, but I will create special cases, for most country codes.
You can use this preg_replace to strip optional zeroes from start of your mobile #:
$str = preg_replace('~^0*(\d{7,9})$~', '+385.$1', $str);
^[0-9]([0-9]{1,2}[0-9]{6,7})$
You just need to add groups. Replace by +385.$1. See demo.
https://regex101.com/r/cJ6zQ3/22
$re = "/^[0-9]([0-9]{1,2}[0-9]{6,7})$/m";
$str = "0921234567\n";
$subst = "+385.$1";
$result = preg_replace($re, $subst, $str);
I would use a 2-step solution:
1. Check if we match the main regex.
2. Replace the number by prepending "+", the country code and "." to the number without leading zeros.
PHP code:
$re = "/^[0-9]{7,10}$/";
$str = "0921234567";
if (preg_match($re, $str, $match)) {
echo "+385." . preg_replace('/^0+/', '', $match[0]);
}
Note that splitting the character class into two parts in your regex pattern makes no sense when not using capture groups. ^[0-9]{7,10}$ is then the same as ^[0-9]{1,3}[0-9]{6,7}$, meaning match 7 to 10 digits from start to end of the string.
Leading zeros are easily trimmed from the start with /^0+/ regex.
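The two steps can also be wrapped in a small helper. This is only a sketch: the function name toInternational and the default country code parameter are assumptions for illustration, and the nullable return type needs PHP 7.1 or later.

```php
<?php
// Hypothetical helper; the name and the default country code are assumptions.
function toInternational(string $number, string $countryCode = '385'): ?string
{
    // Step 1: accept only 7 to 10 digits.
    if (!preg_match('/^[0-9]{7,10}$/', $number)) {
        return null;
    }
    // Step 2: strip leading zeros and prepend "+<code>.".
    return '+' . $countryCode . '.' . preg_replace('/^0+/', '', $number);
}

var_dump(toInternational('0921234567'));
var_dump(toInternational('abc'));
```

The first call returns +385.921234567 and the second returns null because the input fails the validation step.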

Regex to extract substring

really struggling with this...hopefully someone can put me on the right path to a solution.
My input string is structured like this:
66-2141-A-AC107-7
I'm interested in extracting the string 'AC107' using a single regular expression. I know how to do this with other PHP string functions, but I have to do this with a regular expression.
What I need is to extract all data between the third and fourth hyphens. The structure of each section is not fixed (i.e., 66 may be 8798709 and 2141 may be 38). The number of hyphens is guaranteed (i.e., there will always be a total of four (4) hyphens).
Any help/guidance is greatly appreciated!
This will do what you need:
(?:[^-]*-){3}([^-]+)
Explanation:
(?:[^-]*-) Look for zero or more non-hyphen characters followed by a hyphen
{3} Look for three of the blocks just described
([^-]+) Capture all the consecutive non-hyphen characters from that point forward (will automatically cut off before the next hyphen)
You can use it in PHP like this:
$str = '66-2141-A-AC107-7';
preg_match('/^(?:[^-]*-){3}([^-]+)/', $str, $matches);
echo $matches[1]; // prints AC107
This should look for anything followed by a hyphen 3 times, and then group 2 (the second set of parentheses) will hold your value, followed by another hyphen and anything else.
/^(.*-){3}(.*)-(.*)/
You can access it by using $2. In php, it would be like this:
$string = '66-2141-A-AC107-7';
preg_match('/^(.*-){3}(.*)-(.*)/', $string, $matches);
$special_id = $matches[2];
print $special_id;
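Both patterns can be checked side by side against inputs whose first sections vary in length, as the question describes. This is a minimal sketch; the second input is made up from the alternative values mentioned in the question, and $found is just an illustrative variable.

```php
<?php
$inputs = ['66-2141-A-AC107-7', '8798709-38-A-AC107-7'];

$found = [];
foreach ($inputs as $str) {
    preg_match('/^(?:[^-]*-){3}([^-]+)/', $str, $a);  // negated class version
    preg_match('/^(.*-){3}(.*)-(.*)/', $str, $b);     // greedy .* version
    $found[] = [$a[1], $b[2]];
    echo $a[1], ' ', $b[2], "\n";
}
```

Both versions print AC107 for each input. Because .* can itself match hyphens, the second pattern relies on the guarantee of exactly four hyphens, whereas [^-] cannot cross a hyphen in any input.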
