Regex match specific string without other string - php

So I've made this regex:
/(?!for )€([0-9]{0,2}(,)?([0-9]{0,2})?)/
to match only the first of the following two sentences:
discount of €50,20 on these items
This item on sale now for €30,20
As you might've noticed already, I'd like the amount in the 2nd sentence not to be matched because it's not the discount amount. But I'm unsure how to express this in regex, because everything I could find offers options like:
(?!foo|bar)
This option, as can be seen in my example, does not seem to be the solution to my issue.
Example:
https://www.phpliveregex.com/p/y2D
Suggestions?

You can use
(?<!\bfor\s)€(\d+(?:,\d+)?)
See the regex demo.
Details
(?<!\bfor\s) - a negative lookbehind that fails the match if there is a whole word for and a whitespace immediately before the current position
€ - a euro sign
(\d+(?:,\d+)?) - Group 1: one or more digits followed by an optional sequence of a comma and one or more digits
See the PHP demo:
$strs = ["discount of €50,20 on these items", "This item on sale now for €30,20"];
foreach ($strs as $s) {
    if (preg_match('~(?<!\bfor\s)€(\d+(?:,\d+)?)~', $s, $m)) {
        echo $m[1].PHP_EOL;
    } else {
        echo "No match!";
    }
}
Output:
50,20
No match!
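To see why the lookahead from the question does not help, here is a minimal sketch that runs both assertions on the second sentence. The variable names are only illustrative, and the u flag is an assumption added so that € is treated as a single character.

```php
<?php
$sentence = 'This item on sale now for €30,20';

// Lookahead: tested at the position of €, where the upcoming text is "€30,20",
// never "for ", so the assertion always succeeds and the amount is matched.
$withLookahead = preg_match('/(?!for )€(\d+(?:,\d+)?)/u', $sentence, $ahead);

// Lookbehind: tested against the text just before €, which is "for ",
// so the match is rejected.
$withLookbehind = preg_match('/(?<!\bfor\s)€(\d+(?:,\d+)?)/u', $sentence, $behind);

var_dump($withLookahead, $ahead[1] ?? null, $withLookbehind);
```

A lookahead inspects what follows the current position, so (?!for ) placed right before € can never see the word that precedes it.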

You could make sure to match the discount first in the line:
\bdiscount\h[^\r\n€]*\K€\d{1,2}(?:,\d{1,2})?\b
Explanation
\bdiscount\h A word boundary, match discount and a single horizontal whitespace char
[^\r\n€]*\K Match 0+ times any char except € or a newline, then reset the match buffer
€\d{1,2}(?:,\d{1,2})? Match €, 1-2 digits with an optional part matching , and 1-2 digits
\b A word boundary
Regex demo | Php demo
$re = '/\bdiscount\h[^\r\n€]*\K€\d{1,2}(?:,\d{1,2})?\b/';
$str = 'discount of €50,20 on these items €
This item on sale now for €30,20';
if (preg_match($re, $str, $matches)) {
echo($matches[0]);
}
Output
€50,20
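As a rough illustration of what \K does here, the following sketch compares it with a plain capturing group on a single line. The variable names are made up for the example, and the u flag is an assumption so that € is handled as one character.

```php
<?php
$line = 'discount of €50,20 on these items';

// With \K, everything matched before it is dropped from the reported match.
preg_match('/\bdiscount\h[^\r\n€]*\K€\d{1,2}(?:,\d{1,2})?\b/u', $line, $viaK);

// Same idea with a capturing group: the full match keeps the prefix,
// and the amount is read from group 1 instead.
preg_match('/\bdiscount\h[^\r\n€]*(€\d{1,2}(?:,\d{1,2})?)\b/u', $line, $viaGroup);

echo $viaK[0], PHP_EOL;
echo $viaGroup[0], PHP_EOL;
echo $viaGroup[1], PHP_EOL;
```

Both variants find the same amount; \K only changes which part of the text ends up in index 0.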

Related

How to echo only a part of preg_replace string?

I've searched but there's nothing that really helps. I'm learning PHP and trying to output the value of $1, after preg_replace has applied the regex rules.
I only want to be left with <span>$1</span>, and the rest of the string needs stripping. Note that the string is highly variable so I can't set custom strstr for example to remove the word 'Get' as there are many variations.
$the_coupon_title = 'Get 10% off at Walmart';
$the_coupon_title = preg_replace(
'/((£\d+\.?\d{0,2}|\d+\.?\d{0,2}%)\s+(off)+)/i',
'<span>$1</span>',
$the_coupon_title
);
echo $the_coupon_title;
?>
You can extract the match and wrap it in a span tag:
$the_coupon_title = 'Get 10% off at Walmart';
// preg_match() returns 1 or 0, so test its result instead of overwriting the title with it
if (preg_match('/(£)?\d+(?:\.\d{1,2})?(?(1)|%)\s+off/iu', $the_coupon_title, $m)) {
    echo '<span>', htmlspecialchars($m[0], ENT_QUOTES | ENT_HTML5), '</span>';
}
See the PHP demo. New regex details:
(£)? - An optional group #1: a £ char
\d+(?:\.\d{1,2})? - one or more digits and then an optional sequence of . and one or two digits
(?(1)|%) - a conditional: if Group 1 (£) matched, match nothing here, otherwise match %
\s+off - one or more whitespaces and off string.
Another approach is to match any text up to the pattern of yours, capture the value matched with the pattern, and then match the rest of the string and replace the match with the captured substring:
echo preg_replace(
'/^.*?((£)?\d+(?:\.\d{1,2})?(?(2)|%)\s+off).*/iu',
'<span>$1</span>',
$the_coupon_title);
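See another PHP demo. Note the change from (?(1)|%) to (?(2)|%), since the pound symbol is now captured into Group 2. Unlike the first snippet, this version inserts the captured text into the HTML without escaping it, so it is prone to HTML injection.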
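If the replacement approach is preferred, one possible way to keep the escaping is preg_replace_callback. This is only a sketch: highlightOffer is a hypothetical helper name, not something from the answer above, and the arrow function assumes PHP 7.4 or later.

```php
<?php
// Hypothetical helper: keep only the "<amount> off" part, escaped, inside a span.
function highlightOffer(string $title): string
{
    $result = preg_replace_callback(
        '/^.*?((£)?\d+(?:\.\d{1,2})?(?(2)|%)\s+off).*/iu',
        // $m[1] holds the captured offer text; escape it before wrapping.
        fn (array $m): string =>
            '<span>' . htmlspecialchars($m[1], ENT_QUOTES | ENT_HTML5) . '</span>',
        $title
    );

    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $title;
}

echo highlightOffer('Get 10% off at Walmart'), PHP_EOL;
echo highlightOffer('Save £5.50 off today'), PHP_EOL;
```

A title without an offer is returned unchanged, because the pattern does not match and nothing is replaced.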

Regular expression for highlighting numbers between words

Site users enter numbers in different ways, example:
from 8 000 packs
432534534
from 344454 packs
45054 packs
04 555
434654
54 564 packs
I am looking for a regular expression that can pick out the words before the digits (if there are any), the digits in any format, and the words after them (if there are any). Ideally, the spaces should be excluded.
Currently I have this pattern, but it does not work correctly.
(^[0-9|a-zA-Z].*?)\s([0-9].*?)\s([a-zA-Z]*$)
The main purpose of this is to put the strings in order, bring them to the same form, format them in PHP digit format, etc.
As a result, I need to get the text before the digits, the digits themselves and the text after them into the variables separately.
$before = 'from';
$num = '8000';
$after = 'packs';
Thank you for any help in this matter.
I think you may try this:
^(\D+)?([\d \t]+)(\D+)?$
group 1: optional(?) group that will contain anything but digit
group 2: mandatory group that will contain only digits and white space characters like space and tab
group 3: optional(?) group that will contain anything but digit
Demo
Source (run)
$re = '/^(\D+)?([\d \t]+)(\D+)?$/m';
$str = 'from 8 000 packs
432534534
from 344454 packs
45054 packs
04 555
434654
54 564 packs
';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
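foreach ($matches as $matchgroup)
{
    // Optional groups that did not take part in the match may be missing, so default to ''
    echo "before: ".($matchgroup[1] ?? '')."\n";
    echo "number:".preg_replace('/\D/m','',$matchgroup[2])."\n";
    echo "after:".($matchgroup[3] ?? '')."";
    echo "\n\n\n";
}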
I corrected your regex and added groups, the regex looks like this:
^(?<before>[a-zA-Z]+)?\s?(?<number>[0-9].*?)\s?(?<after>[a-zA-Z]+)?$
Test regex here: https://regex101.com/r/QLEC9g/2
By using groups you can easily separate the words and numbers, and handle them any way you want.
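A minimal sketch of how the named groups could be read in PHP follows. splitLine is a hypothetical helper name, and stripping the spaces from the number is an assumption based on the desired $num = '8000' in the question.

```php
<?php
// Hypothetical helper: split one line into before / number / after parts.
function splitLine(string $line): ?array
{
    $re = '/^(?<before>[a-zA-Z]+)?\s?(?<number>[0-9].*?)\s?(?<after>[a-zA-Z]+)?$/';
    if (!preg_match($re, $line, $m)) {
        return null; // no digits on this line
    }

    return [
        'before' => $m['before'] ?? '',
        // Strip the spaces inside the number, e.g. "8 000" -> "8000".
        'number' => preg_replace('/\s+/', '', $m['number']),
        // Trailing groups that did not take part in the match may be absent.
        'after'  => $m['after'] ?? '',
    ];
}

print_r(splitLine('from 8 000 packs'));
print_r(splitLine('04 555'));
```

Since the number group uses .*? it can also swallow letters on lines with more than one word after the digits, so this sketch only covers the sample inputs from the question.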
Your pattern does not match because there are 4 required parts that all expect 1 character to be present:
(^[0-9|a-zA-Z].*?)\s([0-9].*?)\s([a-zA-Z]*$)
  ^^^^^^^^^^^^    ^^ ^^^^^    ^^
The other thing to note is that the first character class [0-9|a-zA-Z] can also match digits (you can omit the |, as inside a character class it matches a literal pipe char).
If you would allow all other chars than digits on the left and right, and there should be at least a single digit present, you can use a negated character class [^\d\r\n]* optionally matching any character except a digit or a newline:
^([^\d\r\n]*)\h*(\d+(?:\h+\d+)*)\h*([^\d\r\n]*)$
^ Start of string
([^\d\r\n]*) Capture group 1, match any char except a digit or a newline
\h* Match optional horizontal whitespace chars
(\d+(?:\h+\d+)*) Capture group 2, match 1+ digits and optionally repeat matching spaces and 1+ digits
\h* Match optional horizontal whitespace chars
([^\d\r\n]*) Capture group 3, match any char except a digit or a newline
$ End of string
See a regex demo and a PHP demo.
For example
$re = '/^([^\d\r\n]*)\h*(\d+(?:\h+\d+)*)\h*([^\d\r\n]*)$/m';
$str = 'from 8 000 packs
test from 8 000 packs test
432534534
from 344454 packs
45054 packs
04 555
434654
54 564 packs';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
foreach($matches as $match) {
    list(,$before, $num, $after) = $match;
    echo sprintf(
        "before: %s\nnum:%s\nafter:%s\n--------------------\n",
        $before, preg_replace("/\h+/", "", $num), $after
    );
}
Output
before: from
num:8000
after:packs
--------------------
before: test from
num:8000
after:packs test
--------------------
before:
num:432534534
after:
--------------------
before: from
num:344454
after:packs
--------------------
before:
num:45054
after:packs
--------------------
before:
num:04555
after:
--------------------
before:
num:434654
after:
--------------------
before:
num:54564
after:packs
--------------------
If there should be at least a single digit present, and the only allowed characters are a-z for the word(s), you can use a case insensitive pattern:
(?i)^((?:[a-z]+(?:\h+[a-z]+)*)?)\h*(\d+(?:\h+\d+)*)\h*((?:[a-z]+(?:\h+[a-z]+)*)?)?$
See another regex demo and a php demo.

Match any string in the format (+-)(digit or letter)(colon)

I need a regex to find any string that matches the format: a '+' or a '-', followed by a number or a letter, followed by a colon ':'.
Example:
"+2: Each player discards a card.\n−X: Return target nonlegendary creature card with converted mana cost X from your graveyard to the battlefield.\n−8: You get an emblem with \"Whenever a creature dies, return it to the battlefield under your control at the beginning of the next end step.\"
Should match "+2:", "-X:" and "-8:".
I've done /[0-9a-z]:/i but I can't match the plus and minus.
Thanks in advance guys.
You may use
$re = '/[−+-]?[0-9a-z]:/iu';
$str = '+2: Each player discards a card.\\n−X: Return target nonlegendary creature card with converted mana cost X from your graveyard to the battlefield.\\n−8: You get an emblem with \\"Whenever a creature dies, return it to the battlefield under your control at the beginning of the next end step.';
if (preg_match_all($re, $str, $matches)) {
print_r($matches[0]);
}
See the PHP demo
The [−+-]? part matches an optional −, - or + char.
If you want to support any other "minus" looking chars, use
$re = '/[−+\p{Pd}]?[0-9a-z]:/iu';
The \p{Pd} matches dash punctuation chars, but not the − char, unfortunately.
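The remark about \p{Pd} can be checked with a short sketch. The sample text is shortened from the question, the variable names are illustrative, and the "\u{2212}" escape assumes PHP 7 or later.

```php
<?php
$hyphen = '-';         // U+002D HYPHEN-MINUS
$minus  = "\u{2212}";  // U+2212 MINUS SIGN, the char used in the question's text

// \p{Pd} covers the hyphen but not the minus sign, which is a math symbol (\p{Sm}).
$hyphenIsPd = preg_match('/^\p{Pd}$/u', $hyphen);
$minusIsPd  = preg_match('/^\p{Pd}$/u', $minus);
$minusIsSm  = preg_match('/^\p{Sm}$/u', $minus);

// So the minus sign has to stay in the class explicitly, next to \p{Pd}.
$text = "+2: Draw a card.\n{$minus}X: Return a card.\n-8: You get an emblem.";
preg_match_all('/[\x{2212}+\p{Pd}][0-9a-z]:/iu', $text, $found);
print_r($found[0]);
```

Here the sign is required rather than optional, which is closer to the format described in the question.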

word boundaries preg_replace

I would like to catch any piece of string that matches %[a-z0-9], respecting the following examples :
1. %xxxxxxxxxxxxx //match
2. this will work %xxxxxx but not this%xxxxxxxxx. //match 1st, not 2nd
3. and also %xxxxxxxxxx. //match
4. just a line ending with %xxxxxxxxxxx //match
5. %Xxxxxxxxxxx //no match
6. 100% of dogs //no match
7. 65%. Begining of new phrase //no match
8. 65%.Begining of new phrase //no match
It can be at the beginning of the string or at the end, but not in the middle of a word. It can of course be in the string as a word (separated by space).
I have tried
/(\b)%[a-z0-9]+(\b)/
/(^|\b)%[a-z0-9]+($|\b)/
/(\w)%[a-z0-9]+(\w)/
and others like this, but I can't get it to work the way I want. I guess the \b token does not work in example 2 because there is a boundary before the % sign.
Any help would be greatly appreciated.
Try
/\B%[a-z0-9]+\b/
You don't have a word boundary \b between a space and the %, but you have one between s and %.
\B is the opposite of \b: it matches only where there is no word boundary.
See it here on regex101
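A quick sketch of this pattern against a few of the question's examples is given below. findTokens is a hypothetical helper name used only for the illustration.

```php
<?php
// Hypothetical helper returning every %token found by the \B...\b pattern.
function findTokens(string $line): array
{
    preg_match_all('/\B%[a-z0-9]+\b/', $line, $m);
    return $m[0];
}

print_r(findTokens('this will work %xxxxxx but not this%xxxxxxxxx.'));
print_r(findTokens('and also %xxxxxxxxxx.'));
print_r(findTokens('100% of dogs'));
print_r(findTokens('%Xxxxxxxxxxx'));
```

The % in this%xxxxxxxxx and in 100% follows a word character, so there is a word boundary before it and \B rejects the position.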
%[a-z0-9]+(?=\s|$)|(?:^|(?<=\s))%[a-z0-9]+
Try this. See the demo.
https://regex101.com/r/iS6jF6/20
$re = "/%[a-z0-9]+(?=\\s|$)|(?:^|(?<=\\s))%[a-z0-9]+/m";
$str = "1. %xxxxxxxxxxxxx //match\n2. this will work %xxxxxx but not this%xxxxxxxxx. //match 1st, not 2nd\n3. and also %xxxxxxxxxx. //match\n4. just a line ending with %xxxxxxxxxxx //match\n5. %Xxxxxxxxxxx //no match\n6. 100% of dogs //no match\n7. 65%. Begining of new phrase //no match\n8. 65%.Begining of new phrase //no match";
preg_match_all($re, $str, $matches);
or
%[a-z0-9]+\b|\b%[a-z0-9]+

Find first instance of character, then stop at space?

I think I need to use some kind of regex but struggling...
I have a string e.g.
the cat sat on the mat and $10 was all it cost
I want to return
$10
And is there a universal name for currency codes so I could return £10 if it was
the cat sat on the mat and £10 was all it cost
Or a way to add more characters to the expression
If you want to match all currency symbols, use the following regex:
/\p{Sc}\d+(\.\d+)?\b/u
explanation:
/ # regex delimiter
\p{Sc} # a currency symbol
\d+ # 1 or more digits
(\.\d+)? # optionally followed by a dot and one or more digits
\b # word boundary
/ # regex delimiter
u # unicode
Have a look at this site to see the meaning of \p{Sc} (Currency Symbol)
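As a small sketch of this pattern in use, the helper below returns the first amount it finds. firstAmount is a hypothetical name, and the sample strings are taken from or modeled on the question.

```php
<?php
// Hypothetical helper: return the first currency amount in a string, or null.
function firstAmount(string $text): ?string
{
    // The u flag is needed so that \p{Sc} and multi-byte symbols work as expected.
    return preg_match('/\p{Sc}\d+(\.\d+)?\b/u', $text, $m) ? $m[0] : null;
}

var_dump(firstAmount('the cat sat on the mat and $10 was all it cost'));
var_dump(firstAmount('the cat sat on the mat and £10 was all it cost'));
var_dump(firstAmount('the cat sat on the mat and €10.50 was all it cost'));
var_dump(firstAmount('the cat sat on the mat'));
```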
You can use
/(\$.*?) /
(note there is a space after the closing parenthesis)
If you want to add more symbols, then use brackets:
$str = 'the cat sat on the mat and £10 was all it cost';
$matches = array();
preg_match( '/([\$£].*?) /', $str, $matches );
This will work if the currency symbol precedes the value, and if there is a space following the value. You might want to check for more general cases, such as the value being at the end of a sentence with no trailing space etc.
$string = 'the cat sat on the mat and $10 was all it cost';
$found = preg_match_all('/[$£]\d+/u',$string,$results); // u flag so £ is one character; \d+ so a bare symbol is not matched
if ($found)
var_dump($results);
This may work for you
$string = "the cat sat on the mat and $10 was all it cost";
preg_match('/ ([$£])([0-9]+)/u', $string, $matches); // the original \] left the character class unclosed
echo "<pre>";
print_r($matches);
