Match every occurrence of the same string after one specific character chain REGEX - php

Thanks to anyone who tries to help me.
I am struggling to write a regex that handles this case:
I want every match of "Heure Pleine Saison Basse" that occurs after the first occurrence of "Acheminement conso".
Using the raw text below, I want to match "Heure Pleine Saison Basse" 3, 5, 6 and 7, but not 1 and 2.
Do not use the numbers in the pattern; they are only there to help you understand which strings I want to match.
This example regex only matches the last occurrence:
Acheminement[\s\S]*(Heure Pleine Saison Basse)
Here is a raw text example:
Electricité n° de\n
compteur ancien\n
index nouvel\n
index conso\n
kWh/Qté prix unitaire\n
HT en euros montant HT\n
en euros taux de\n
TVA\n
Contribution cee du 14/07/22 au 13/08/22 143020,00495 70,7920,0%\n
Evolutions arenh du 14/07/22 au 13/08/22 14302-0,03149 -450,3720,0%\n
Consommation  du 14/07/22 au 13/08/22 154\n
Heure Pleine Saison Basse 1
Heure Pleine Saison Basse 2
Heure Creuse Saison Basse 2
Acheminement conso\n
kWh/Qté prix unitaire\n
HT en euros montant HT\n
en euros taux de\n
TVA\n
Composante de comptage du 1
Composante de comptage du 2
Composante de soutirage du 1
Composante de soutirage du 2
Composante de gestion 1
Composante de gestion 2
Consommation du 14/07/22 au 31/07/22 Heure Pleine Saison Basse 56200,02000 112,4020,0%\n
Heure Creuse Saison Basse 26840,01700 45,6320,0%\n
Consommation du 01/08/22 au 13/08/22\n
Heure Pleine Saison Basse 3
Heure Creuse Saison Basse 4
Heure Pleine Saison Basse 5
Heure Pleine Saison Basse 6
Heure Pleine Saison Basse 7
Services et prestations techniques conso\n
kWh/Qté prix unitaire\n
HT en euros montant HT\n
en euros taux de\n
TVA\n
Espace Client Gratuit\n
Taxes et Contributions conso\n

You can use
'/(?:\G(?!\A)|Acheminement conso)[\s\S]*?\KHeure Pleine Saison Basse/u'
'/(?:\G(?!\A)|Acheminement conso).*?\KHeure Pleine Saison Basse/su'
Details:
(?:\G(?!\A)|Acheminement conso) - either Acheminement conso or the end of the previous match (\G on its own also matches at the start of the string; the (?!\A) negative lookahead rules that position out)
[\s\S]*? - any zero or more chars as few as possible
\K - omit the text matched so far
Heure Pleine Saison Basse - a fixed string.
The u flag makes PCRE treat the pattern and the subject as UTF-8, which is advisable when the text contains non-ASCII characters such as the é in Electricité.
The s flag makes . match any character, including line breaks.
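As a rough illustration of how the first pattern behaves with preg_match_all, here is a minimal sketch. The sample text is a shortened, made-up version of the question's input, and the variable names are only for illustration. It uses PREG_OFFSET_CAPTURE to read the label digit that follows each match.

```php
<?php
// Shortened, hypothetical sample; the numbers only label the occurrences.
$text = <<<TXT
Heure Pleine Saison Basse 1
Heure Pleine Saison Basse 2
Acheminement conso
Heure Pleine Saison Basse 3
Heure Creuse Saison Basse 4
Heure Pleine Saison Basse 5
TXT;

$pattern = '/(?:\G(?!\A)|Acheminement conso)[\s\S]*?\KHeure Pleine Saison Basse/u';

// PREG_OFFSET_CAPTURE returns [matched text, byte offset] pairs.
$count = preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE);

$labels = [];
foreach ($matches[0] as [$match, $offset]) {
    // Read the label digit that follows the match and one space.
    $labels[] = substr($text, $offset + strlen($match) + 1, 1);
}

echo $count, ': ', implode(', ', $labels), PHP_EOL; // prints "2: 3, 5"
```

Because of \K, each reported match is only the fixed string itself, so the offsets point at the start of "Heure Pleine Saison Basse" rather than at "Acheminement conso".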

Related

Conditional regex length based on the first character

There is a string of digits that I need to validate with PHP preg_match.
If it starts with 10, 20 or 30, I need 7 more digits after the initial 2; in any other case I need exactly 8 digits and don't care what the leading characters are.
The first part is the simple one:
/^(1|2|3)0\d{7}$/
But how can I add an ELSE part? There I need a simple
^\d{8}$
I need to match these examples:
101234567
201234567
12345678
33445566
You may use
^(?:[1-3]0\d{7}|(?![1-3]0)\d{8})$
Details
^ - start of string
(?: - start of a non-capturing group:
[1-3]0\d{7} - 1, 2 or 3, then 0 and any 7 digits
| - or
(?![1-3]0)\d{8} - no 10, 20 or 30 immediately at the start of the string are allowed, then any 8 digits are matched
) - end of the group
$ - end of the string.
Here's an alternative using (?(?=regex)then|else) aka conditionals:
^(?(?=[1-3]0)[1-3]0\d{7}|\d{8})$
It literally says: if [1-3]0 is right at the start, match [1-3]0\d{7}, else match \d{8}.
Demo: https://regex101.com/r/LXoHyk/1 (examples shamelessly taken from Wiktor's answer)
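To compare the two approaches, here is a small sketch that runs both patterns through preg_match. The first four inputs are the examples from the question; the last two are hypothetical negative cases added for illustration, as are the variable names.

```php
<?php
// Both patterns from the answers above, wrapped in PHP delimiters.
$lookahead   = '/^(?:[1-3]0\d{7}|(?![1-3]0)\d{8})$/';
$conditional = '/^(?(?=[1-3]0)[1-3]0\d{7}|\d{8})$/';

// The first four inputs come from the question; the last two are
// hypothetical negative cases (8 digits starting with 10, and 9 digits
// not starting with 10, 20 or 30).
$inputs = ['101234567', '201234567', '12345678', '33445566', '10123456', '123456789'];

$results = [];
foreach ($inputs as $input) {
    $a = preg_match($lookahead, $input) === 1;
    $b = preg_match($conditional, $input) === 1;
    $results[] = [$input, $a, $b];
    printf("%-10s lookahead=%s conditional=%s\n", $input, var_export($a, true), var_export($b, true));
}
```

On these inputs both patterns should agree: the four examples from the question pass and the two added cases are rejected.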

Combine two regular expressions for php

I have these two regular expression
^(((98)|(\+98)|(0098)|0)(9){1}[0-9]{9})+$
^(9){1}[0-9]{9}+$
How can I combine these two expressions?
A valid phone number starts with one of: 0098, +98, 98, 09 or 9.
Samples:
00989151855454
+989151855454
989151855454
09151855454
9151855454
You haven't provided what passes and what doesn't, but I think this will work if I understand correctly...
/^\+?0{0,2}98?/
^ Matches the start of the string
\+? Matches 0 or 1 plus symbols (the backslash is to escape)
0{0,2} Matches between 0 and 2 (0, 1, and 2) of the 0 character
9 Matches a literal 9
8? Matches 0 or 1 of the literal 8 characters
Looking at your second regex, it looks like you want to make the first part ((98)|(\+98)|(0098)|0) of your first regex optional. Put ? after it and the pattern will also accept the numbers allowed by the second regex. Change this,
^(((98)|(\+98)|(0098)|0)(9){1}[0-9]{9})+$
to,
^(?:98|\+98|0098|0)?9[0-9]{9}$
The ? after the non-capturing group makes the whole set of prefix alternatives optional.
I've made a few more corrections to the regex. {1} is redundant, since matching exactly once is the default with or without it, and you don't need to group parts of a regex unless you need the groups. I've also removed the outermost parentheses and the + after them, as they are not needed.
This regex
^(?:98|\+98|0098|0)?9[0-9]{9}$
matches
00989151855454
+989151855454
989151855454
09151855454
9151855454
Demo: https://regex101.com/r/VFc4pK/1/
However, note that this requires a 9 as the first digit after the country code or the leading 0.
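For illustration, the combined pattern could be used in PHP roughly as follows. The helper name nationalNumber, the constant PHONE_PATTERN and the extra capturing group around the 10-digit number are assumptions added for this sketch, not part of the answers; the last sample is a made-up invalid number.

```php
<?php
// Pattern from the answers above, with one extra capturing group
// (an addition for illustration) around the 10-digit national number.
const PHONE_PATTERN = '/^(?:98|\+98|0098|0)?(9[0-9]{9})$/';

// Hypothetical helper: returns the national number, or null if invalid.
function nationalNumber(string $phone): ?string
{
    return preg_match(PHONE_PATTERN, $phone, $m) === 1 ? $m[1] : null;
}

// The first five samples come from the question; the last one is a
// made-up invalid number (8 instead of 9 after the leading 0).
$samples = ['00989151855454', '+989151855454', '989151855454', '09151855454', '9151855454', '08151855454'];
foreach ($samples as $sample) {
    echo $sample, ' => ', nationalNumber($sample) ?? 'invalid', PHP_EOL;
}
```

Capturing the part after the optional prefix is one way to normalise all five accepted spellings to the same 10-digit form.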

RegExp Match PHP

Data:
N15319542045C13_1_3/61488007C13-130083_1_3/61488007C13-130083-1_1_3/P1197443641_1_3SD|1
NP1196939393_1_3SU|OD=2/7;|BNP1196939393_1_3SU|OD=2/7;|BNP1196930222_1_3SU|OD=4/11;|
NP1196930222_1_3SU|OD=4/11;|
N15319384625C13_1_3/61445794C13-130077_1_3SD||BN15319384625C13_1_3/61445794C13-130077_1_3SD||
RegExp:
(N(.*?)S([UID])\|(.*?))(?:B|\|.?$)
I am trying to find 7 matches using the regex above, but only 6 match. I am not sure how to fix it so that the 1st line matches as well.
Format:
N(key)S(action)|(value or end)
The end differs from match to match.
I solved it, in case someone else needs it:
(\x15(.*?)\x01([UID])\|(.*?))(?:.*?\x08|.*\|?$)
The regex didn't work because after S[UID] it expects two | characters, but the first input string has only one.
One fix is to make the second group optional and move the end-of-string anchor $ out of it:
(N(.*?)S([UID])\|(.*?))(?:B|\|.?)?$
Or, more simply:
N.*?S[UID]\|.*$

How to match two digits in pregmatch?

I have written a preg_match call to check whether the leading digits are 9477, and it works correctly. Now I want to extend the condition so that numbers starting with either 9477 or 9476 are valid.
Here are the conditions:
should contain 11 digits
should start with 9477 or 9476
Here is my code:
preg_match('/^9477\d{7}$/',$Mnumber)
Use an alternation between the two numbers:
preg_match('/^947(?:7|6)\d{7}$/',$Mnumber)
(?:7|6) is a non-capturing group that matches the digit 7 or 6. A non-capturing group does not store the submatch, so it is slightly cheaper than a capturing group.
You can also do:
preg_match('/^947[76]\d{7}$/',$Mnumber);
[76] is a character class that matches the digit 7 or 6
Use a character class ([]):
echo preg_match('/^947[76]\d{7}$/',$Mnumber);
Just use (9477|9476)
echo preg_match('/^(9477|9476)\d{7}$/',$Mnumber);
You can also use:
/^947(7|6)\d{7}$/
/^947[76]\d{7}$/
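A minimal sketch of the character-class variant in use; the helper name isValidNumber and the sample numbers are made up for illustration.

```php
<?php
// Hypothetical helper wrapping the character-class pattern from the answers.
function isValidNumber(string $number): bool
{
    // 947, then 7 or 6, then exactly 7 more digits: 11 digits in total.
    return preg_match('/^947[76]\d{7}$/', $number) === 1;
}

// Made-up sample numbers for illustration.
foreach (['94771234567', '94761234567', '94751234567', '9477123456'] as $number) {
    echo $number, ': ', isValidNumber($number) ? 'valid' : 'invalid', PHP_EOL;
}
```

The first two numbers are accepted; the third has the wrong fourth digit and the fourth has only 10 digits, so both are rejected.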

php regex to match text

I need a PHP regex to match text that is not preceded by the word "Total" or "maximum" (case-insensitive) in the text below.
[1]
[1m]
[1mk][1mks]
[1mark]
[1marks]
(1mk)
12mk
12 mark
13 mark
[Total: 15]
Total: 16 mark
Total 1 mark
Total 12 mark
Total: 9 mark
Total: 10 mark
[Total: 11 marks] Total 6 mark
maximum 5 marks
maximum:5 marks
Note: This text is all on one long line.
The regex should match the following:
[1]
[1m]
[1mk][1mks]
[1mark]
[1marks]
(1mk)
12mk
12 mark
13 mark
I have tried this one, but it's not working:
/(?<!Total\:\s|Total\s|maximum\s|maximum\:\s)[\[|\(]?([0-9]{1,2})(\s|(?=marks|mark|mks|mk|m|\]))?(\]|marks|mark|mks|mk|m)[\]|\)]?/i
EDIT
https://www.debuggex.com/r/yNNN_B3iQmGyYWoz
EDIT2
e.g. '12 mark' should be returned only if it is not "Total[:]\s+12 mark" or "maximum[:]\s+12 mark"
Try this: (?:\[?\b(?:Total|maximum):?\s?\d+\s?[^ ]+(*SKIP)(*FAIL))|(\d++\s?[^ )\]]*)
(Use ignore case.)
Explanation
Part 1
(?:\[? Non capturing group that may have a [
\b Boundary
(?:Total|maximum) non capturing group matching either literal
:?\s?\d+\s? Maybe a : maybe a space, some digits, maybe another space.
[^ ]+ A bunch of non spaces.
(*SKIP)(*FAIL))| Plot twist: anything matching Part 1 FAILS, and (*SKIP) makes the engine resume after the skipped text instead of retrying inside it
Part 2
This is captured, for real.
\d++\s? digits, maybe followed by a space.
[^ )\]]* And maybe stuff that's not a space, ), or ].
The PHP should look something like this:
preg_match_all(
'/(?:\[?\b(?:Total|maximum):?\s?\d+\s?[^ ]+(*SKIP)(*FAIL))|(\d++\s?[^ )\]]*)/i',
"YOUR STRING",
$matches
);
print_r($matches[0]);
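To see which parts survive, the same call can be run on a shortened, made-up line in the style of the question. The input string and variable names here are assumptions for illustration only.

```php
<?php
$pattern = '/(?:\[?\b(?:Total|maximum):?\s?\d+\s?[^ ]+(*SKIP)(*FAIL))|(\d++\s?[^ )\]]*)/i';

// Shortened, hypothetical one-line input in the style of the question.
$line = '[1m] 12 mark Total: 16 mark maximum 5 marks (1mk)';

preg_match_all($pattern, $line, $matches);

// Full matches only; the Total/maximum entries are skipped.
print_r($matches[0]); // contains "1m", "12 mark" and "1mk"
```

Note that the matches start at the digits, so the surrounding brackets and parentheses are not part of the returned strings.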
Actually, I would go for a two-step solution. First clean up the unwanted words by replacing them using this regexp:
(Total:?\s?|maximum:?\s?)
Then matching the content you really need is easy:
\[?\(?([0-9]{1,2}\s?marks?|[0-9]{1,2}\s?mk?s?)\)?\]?
I have no idea how to use debuggex.com, but I tested all the regular expressions in PSPad, so they do work.
