Match any string in the format (+-)(digit or letter)(colon)

I need a regex to find any string that matches the format: a '+' or a '-', followed by a number or a letter, followed by a colon ':'.
Example:
"+2: Each player discards a card.\n−X: Return target nonlegendary creature card with converted mana cost X from your graveyard to the battlefield.\n−8: You get an emblem with \"Whenever a creature dies, return it to the battlefield under your control at the beginning of the next end step.\"
Should match "+2:", "-X:" and "-8:".
I've done /[0-9a-z]:/i but I can't match the plus and minus.
Thanks in advance guys.

You may use
$re = '/[−+-]?[0-9a-z]:/iu';
$str = '+2: Each player discards a card.\\n−X: Return target nonlegendary creature card with converted mana cost X from your graveyard to the battlefield.\\n−8: You get an emblem with \\"Whenever a creature dies, return it to the battlefield under your control at the beginning of the next end step.';
if (preg_match_all($re, $str, $matches)) {
print_r($matches[0]);
}
See the PHP demo
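The [−+-]? part matches an optional −, - or + char. Because of the ?, a digit or letter followed by a colon also matches when no sign precedes it; remove the ? if the sign is required.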
If you want to support any other "minus" looking chars, use
$re = '/[−+\p{Pd}]?[0-9a-z]:/iu';
The \p{Pd} matches dash punctuation chars, but not the − char, unfortunately.
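If the sign must be present, as the question describes, a variant without the ? quantifier could look like the sketch below. The sample text is shortened, and the trailing "Note:" is made up here to show that a letter before a colon is skipped when no sign precedes it.

```php
<?php
// Sketch: same character class as above, but the sign is mandatory (no "?").
$re = '/[−+\p{Pd}][0-9a-z]:/iu';

// Shortened sample text; the trailing "Note:" is made up for illustration.
$str = "+2: Each player discards a card.\n−X: Return target card.\n-8: You get an emblem. Note: done.";

preg_match_all($re, $str, $matches);
print_r($matches[0]); // "+2:", "−X:", "-8:"
```

The ASCII hyphen in "-8:" is covered by \p{Pd}, while the − char (U+2212) is matched by its explicit entry in the class.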

Related

Get the last letter and everything after it in PHP

I have a few hundred thousand strings that are laid out like the following
AX23784268B2
LJ93842938A1
MN39423287S
IY289383N2
With PHP I'm racking my brain how to return B2, A1, S, and N2.
Tried all sorts of substr, strstr, strlen manipulation and am coming up short.
echo substr('MN39423287S', -2); // returns 7S, not S

This is a simpler regexp than the other answer:
preg_match('/[A-Z][^A-Z]*$/', $token, $matches);
echo $matches[0];
[A-Z] matches an uppercase letter, [^A-Z] matches any character that is not an uppercase letter. * makes the preceding pattern match any number of times (including 0), and $ matches the end of the string.
So this matches a letter followed by any number of non-letters at the end of the string.
$matches[0] contains the portion of the string that the entire regexp matched.
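As a quick check, the pattern can be run over the four sample strings from the question. This is only a small sketch; the variable names $tokens and $suffixes are chosen here for illustration.

```php
<?php
// The four sample tokens from the question.
$tokens = ['AX23784268B2', 'LJ93842938A1', 'MN39423287S', 'IY289383N2'];

$suffixes = [];
foreach ($tokens as $token) {
    // preg_match() returns 1 only when the pattern matched.
    if (preg_match('/[A-Z][^A-Z]*$/', $token, $matches)) {
        $suffixes[] = $matches[0];
    }
}

echo implode(', ', $suffixes), PHP_EOL; // B2, A1, S, N2
```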

There are many ways to do this.
One example would be a regex:
<?php
$regex = "/.+([A-Z].*)$/";
$tokens = [
    'AX23784268B2',
    'LJ93842938A1',
    'MN39423287S',
    'IY289383N2',
];
foreach ($tokens as $token)
{
    preg_match($regex, $token, $matches);
    var_dump($matches[1]);
    // B2, A1, S, N2
}
How the regex works:
.+ - one or more of any character except newline, as many as possible (greedy)
( - create a group
[A-Z] - match any A-Z character
.* - also match any characters after it, if any
) - end group
$ - match the end of the string

Regex match specific string without other string

So I've made this regex:
/(?!for )€([0-9]{0,2}(,)?([0-9]{0,2})?)/
to match only the first of the following two sentences:
discount of €50,20 on these items
This item on sale now for €30,20
As you might've noticed already, I'd like the amount in the 2nd sentence not to be matched because it's not the discount amount. But I'm quite unsure how to do this in regex, because all I could find were options like:
(?!foo|bar)
This option, as can be seen in my example, does not seem to be the solution to my issue.
Example:
https://www.phpliveregex.com/p/y2D
Suggestions?

You can use
(?<!\bfor\s)€(\d+(?:,\d+)?)
See the regex demo.
Details
(?<!\bfor\s) - a negative lookbehind that fails the match if there is a whole word for and a whitespace immediately before the current position
€ - a euro sign
(\d+(?:,\d+)?) - Group 1: one or more digits followed with an optional sequence of a comma and one or more digits
See the PHP demo:
$strs= ["discount of €50,20 on these items","This item on sale now for €30,20"];
foreach ($strs as $s){
if (preg_match('~(?<!\bfor\s)€(\d+(?:,\d+)?)~', $s, $m)) {
echo $m[1].PHP_EOL;
} else {
echo "No match!";
}
}
Output:
50,20
No match!

You could make sure to match the discount first in the line:
\bdiscount\h[^\r\n€]*\K€\d{1,2}(?:,\d{1,2})?\b
Explanation
\bdiscount\h A word boundary, match discount and a single horizontal whitespace char
[^\r\n€]*\K Match 0+ times any char except € or a newline, then reset the match buffer
€\d{1,2}(?:,\d{1,2})? Match €, 1-2 digits with an optional part matching , and 1-2 digits
\b A word boundary
Regex demo | Php demo
$re = '/\bdiscount\h[^\r\n€]*\K€\d{1,2}(?:,\d{1,2})?\b/';
$str = 'discount of €50,20 on these items €
This item on sale now for €30,20';
if (preg_match($re, $str, $matches)) {
echo($matches[0]);
}
Output
€50,20

How to change all words to upper-case but exclude Roman numerals?

I'm trying to fix some manually typed addresses. I need to apply ucwords on the whole address but I want to keep all the roman numerals in uppercase and the letters after the house number.
VIA PIPPO III 74A
should become:
Via Pippo III 74A
How can I achieve this?

Use a negative lookahead to find words that are not Roman numerals:
/\b(?![LXIVCDM]+\b)([A-Z]+)\b/
Explanation:
\b - assert position at a word boundary
(?! - negative lookahead
[LXIVCDM]+ - match any character from the list one or more times
\b - assert position at a word boundary
) - end of negative lookahead
([A-Z]+) - Group 1: any uppercase letter, one or more times
\b - assert position at a word boundary
Effectively, this matches any word that isn't entirely composed of the characters in the list [LXIVCDM] - that is, it matches any word that is not a Roman numeral. Note that ordinary words made up only of these letters, such as MIX, are skipped too.
Regex101 Demo
Now, use preg_replace_callback() to capture these words, convert them into lower case, and then capitalize the first letter:
$input = 'VIA PIPPO III 74A';
$pattern = '/\b(?![LXIVCDM]+\b)([A-Z]+)\b/';
$output = preg_replace_callback($pattern, function($matches) {
return ucfirst(strtolower($matches[0]));
}, $input);
var_dump($output);
Output:
string(17) "Via Pippo III 74A"
Demo

To selectively uppercase parts of a string via mb_eregi_replace():
$str = mb_eregi_replace('\b([0-9]{1,4}[a-z]{1,2})\b', "strtoupper('\\1')", $str, 'e');
Full example of fixing a manually typed address: uppercase the first letter of each word, and keep Roman numerals and the letters (A, B, C) after the house number in uppercase:
function ucAddress($str) {
// first lowercase all and use the default ucwords
$str = ucwords(strtolower($str));
// let's fix the default ucwords...
// uppercase letters after house number (was lowercased by the strtolower above)
$str = mb_eregi_replace('\b([0-9]{1,4}[a-z]{1,2})\b', "strtoupper('\\1')", $str, 'e');
// the same for roman numerals
$str = mb_eregi_replace('\bM{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b', "strtoupper('\\0')", $str, 'e');
return $str;
}
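The 'e' option of the mb_ereg functions was deprecated in PHP 7.1, so on current PHP versions a callback-based variant may be needed. Below is a minimal sketch of the same idea using preg_replace_callback() instead; the function name ucAddressModern is hypothetical, and the (?=[MDCLXVI]+\b) lookahead is an addition not present in the answer above, meant to restrict the Roman numeral pattern to whole words made of numeral letters.

```php
<?php
// Hypothetical helper name; a sketch, not the code from the answer above.
function ucAddressModern(string $str): string
{
    // Lowercase everything, then capitalize the first letter of each word.
    $str = ucwords(strtolower($str));

    // Re-uppercase the letters that follow a house number, e.g. "74a" -> "74A".
    $str = preg_replace_callback(
        '/\b[0-9]{1,4}[a-z]{1,2}\b/i',
        fn(array $m): string => strtoupper($m[0]),
        $str
    );

    // Re-uppercase Roman numerals, e.g. "Iii" -> "III".
    // The lookahead only lets whole words made of numeral letters through.
    return preg_replace_callback(
        '/\b(?=[MDCLXVI]+\b)M{0,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})\b/i',
        fn(array $m): string => strtoupper($m[0]),
        $str
    );
}

echo ucAddressModern('VIA PIPPO III 74A'), PHP_EOL; // Via Pippo III 74A
```

Like the original, this treats any word that happens to be a valid Roman numeral as one, so short words built only from those letters would be uppercased as well.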

Find first instance of character, then stop at space?

I think I need to use some kind of regex but struggling...
I have a string e.g.
the cat sat on the mat and $10 was all it cost
I want to return
$10
And is there a universal name for currency codes so I could return £10 if it was
the cat sat on the mat and £10 was all it cost
Or a way to add more characters to the expression

If you want to match all currency codes, use the following regex:
/\p{Sc}\d+(\.\d+)?\b/u
explanation:
/ # regex delimiter
\p{Sc} # a currency symbol
\d+ # 1 or more digits
(\.\d+)? # optionally followed by a dot and one or more digits
\b # word boundary
/ # regex delimiter
u # unicode
Have a look at this site to see the meaning of \p{Sc} (Currency Symbol)
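A short usage sketch with the two sentences from the question follows. It assumes the script file and the input are UTF-8 encoded, which the u modifier requires; the variable names are chosen here for illustration.

```php
<?php
// Pattern from the answer; the "u" modifier requires UTF-8 input.
$pattern = '/\p{Sc}\d+(\.\d+)?\b/u';

$sentences = [
    'the cat sat on the mat and $10 was all it cost',
    'the cat sat on the mat and £10 was all it cost',
];

$amounts = [];
foreach ($sentences as $sentence) {
    if (preg_match($pattern, $sentence, $m)) {
        $amounts[] = $m[0]; // full match: currency symbol plus number
    }
}

print_r($amounts); // "$10" and "£10"
```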

You can use
/(\$.*?) /
(note there is a space after the closing parenthesis)
If you want to add more symbols, then use brackets:
$str = 'the cat sat on the mat and £10 was all it cost';
$matches = array();
preg_match( '/([\$£].*?) /u', $str, $matches ); // the u modifier makes £ count as one character
This will work if the currency symbol precedes the value, and if there is a space following the value. You might want to check for more general cases, such as the value being at the end of a sentence with no trailing space etc.
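One way to cover the more general cases mentioned above is to match the digits themselves instead of relying on a trailing space. The sketch below assumes that amounts look like 10 or 10.50 and that only $ and £ are needed; the second and third sample sentences are made up for illustration.

```php
<?php
// Sketch: match the symbol and the digits directly, so no trailing space is needed.
// Assumes amounts look like 10 or 10.50; add more symbols to [$£] as needed.
$pattern = '/[$£]\d+(?:\.\d+)?/u';

$samples = [
    'the cat sat on the mat and $10 was all it cost',
    'all it cost was £10.',
    'the total is £10.50',
];

$found = [];
foreach ($samples as $sample) {
    if (preg_match($pattern, $sample, $m)) {
        $found[] = $m[0];
    }
}

print_r($found); // "$10", "£10", "£10.50"
```

A full stop that ends the sentence is left out of the match, because the decimal part is only consumed when digits follow the dot.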

$string = 'the cat sat on the mat and $10 was all it cost';
$found = preg_match_all('/[$£]\d+/u',$string,$results); // \d+ requires at least one digit; u treats £ as one character
if ($found)
var_dump($results);

This may work for you
$string = 'the cat sat on the mat and $10 was all it cost';
preg_match('/ ([$£])([0-9]+)/u', $string, $matches);
echo "<pre>";
print_r($matches);

Regular expression to match hyphenated words

How can I extract hyphenated strings from this string line?
ADW-CFS-WE CI SLA Def No SLANAME CI Max Outage Service
I just want to extract "ADW-CFS-WE" from it, but I have been unsuccessful for the past few hours. I'm stuck with the simple regex "(.*)", which selects the whole string shown above.

You can probably use:
preg_match("/\w+(-\w+)+/", ...)
The \w+ matches one or more word characters (letters, digits or underscore), i.e. one word. The group ( ) that follows matches one or more additional parts, each consisting of a hyphen and another word.
The trick with regular expressions is often specificity. Using .* will often match too much.
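Applied to the line from the question, this could look like the following sketch. The group is written as non-capturing, (?:...), which is a small change from the pattern above and does not affect the overall match.

```php
<?php
$line = 'ADW-CFS-WE CI SLA Def No SLANAME CI Max Outage Service';

// (?:...) is a non-capturing group; the pattern needs at least one hyphen.
if (preg_match('/\w+(?:-\w+)+/', $line, $matches)) {
    echo $matches[0], PHP_EOL; // ADW-CFS-WE
}
```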

$input = "ADW-CFS-WE X-Y CI SLA Def No SLANAME CI Max Outage Service";
preg_match_all('/[A-Z]+-[A-Z-]+/', $input, $matches);
foreach ($matches[0] as $m) {
    echo $m . "\n";
}
Note that this solution assumes that only uppercase A-Z can match. If that's not the case, insert the correct character class. For example, if you want to allow arbitrary letters (like a and Ä), replace [A-Z] with \p{L} and add the u modifier.
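A minimal sketch of the \p{L} variant follows. The sample string is made up for illustration, and UTF-8 input is assumed.

```php
<?php
// Made-up sample containing non-ASCII and lowercase letters.
$input = 'ÄB-cd-ef X-Y plain words';

// \p{L} matches any Unicode letter; the "u" modifier enables UTF-8 mode.
preg_match_all('/\p{L}+-[\p{L}-]+/u', $input, $matches);

print_r($matches[0]); // "ÄB-cd-ef", "X-Y"
```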

Just catch every whitespace-free word ([^\s]) that contains at least one '-'.
The following expression will do it:
<?php
$z = "ADW-CFS-WE CI SLA Def No SLANAME CI Max Outage Service";
$r = preg_match('#([^\s]*-[^\s]*)#', $z, $matches);
var_dump($matches);

The following pattern assumes the data is at the beginning of the string, contains only capitalized letters and may contain a hyphen before each group of one or more of those letters:
<?php
$str = 'ADW-CFS-WE CI SLA Def No SLANAME CI Max Outage Service';
if (preg_match('/^(?:-?[A-Z]+)+/', $str, $matches)) // preg_match() returns 0 when nothing matches, so do not compare with false
var_dump($matches);
Result:
array(1) {
[0]=>
string(10) "ADW-CFS-WE"
}
