PHP regex match SKU (several patterns) in string


There are record names that mix several types of SKU, which may contain symbols, digits, etc.
Examples:
Name of product 67304-4200-52-21
67304-4200-52 Name of product
67304-4200 Name of product
38927/6437 Name of product
BKK1MBM06-02 Name of product
BKK1MBM06 Name of product
I need to preg_match (PHP) only the SKU part, which can contain any symbols in any combination.
So I wrote this pattern:
/\d+\/\d+|\d+-?\d+-?\d+-?\d+|\bbkk.*\b/i
It works, but not with the [BKK*] SKUs.
Is there a way to combine all these types of SKU in one pattern?

The pattern \d+-?\d+-?\d+-?\d+ requires at least 4 digits, because all the hyphens are optional. In the example data, however, the numeric part always has at least one hyphen and consists of 2, 3 or 4 groups of digits.
You could repeat the part with the hyphen and digits 1 or more times, and instead of .*\b use \S*\b to match optional non-whitespace chars, which will backtrack until the last word boundary.
Note that if you use a delimiter other than / in PHP, you don't have to escape the / in the pattern.
Using a case insensitive match:
\b(?:\d+(?:-\d+)+|bkk\S*|\d+\/\d+)\b
Explanation
\b A word boundary to prevent a partial word match
(?: Non capture group for the alternatives
\d+(?:-\d+)+ Match 1+ digits and repeat 1 or more times matching - and again 1+ digits (or use {1,3} instead of +)
| Or
bkk\S* Match bkk and optional non whitespace characters
| Or
\d+\/\d+ Match 1+ digits / and 1+ digits
) Close the non capture group
\b A word boundary
See a regex101 demo.
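A minimal sketch of how the pattern above could be applied to the sample names from the question. The `~` delimiter and the `extractSku` helper name are choices made for this illustration only; they are not part of the answer.

```php
<?php
// Hypothetical helper: returns the first SKU-like token in a product name, or null.
function extractSku(string $name): ?string
{
    // "~" is the delimiter, so "/" needs no escaping; "i" makes "bkk" case-insensitive.
    $pattern = '~\b(?:\d+(?:-\d+)+|bkk\S*|\d+/\d+)\b~i';

    return preg_match($pattern, $name, $m) === 1 ? $m[0] : null;
}

$names = [
    'Name of product 67304-4200-52-21',
    '67304-4200-52 Name of product',
    '67304-4200 Name of product',
    '38927/6437 Name of product',
    'BKK1MBM06-02 Name of product',
    'BKK1MBM06 Name of product',
];

foreach ($names as $name) {
    echo extractSku($name), PHP_EOL;
}
```

Each line of output should be just the SKU part of the corresponding name. A name without any SKU yields null.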

Use
\d+(?:\d+(?:-?\d+){3}|\/\d+)|\b[bB][kK][kK][A-Za-z0-9-]*
See regex proof.
REGEX101 EXPLANATION
1st Alternative \d+(?:\d+(?:-?\d+){3}|\/\d+)
\d matches a digit (equivalent to [0-9])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
Non-capturing group (?:\d+(?:-?\d+){3}|\/\d+)
1st Alternative \d+(?:-?\d+){3}
\d matches a digit (equivalent to [0-9])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
Non-capturing group (?:-?\d+){3}
{3} matches the previous token exactly 3 times
- matches the character - with index 45 decimal (2D hex or 55 octal) literally (case sensitive)
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
\d matches a digit (equivalent to [0-9])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
2nd Alternative \/\d+
\/ matches the character / with index 47 decimal (2F hex or 57 octal) literally (case sensitive)
\d matches a digit (equivalent to [0-9])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
2nd Alternative \b[bB][kK][kK][A-Za-z0-9-]*
\b assert position at a word boundary: (^\w|\w$|\W\w|\w\W)
Match a single character present in the list below [bB]
bB matches a single character in the list bB (case sensitive)
Match a single character present in the list below [kK]
kK matches a single character in the list kK (case sensitive)
Match a single character present in the list below [kK]
kK matches a single character in the list kK (case sensitive)
Match a single character present in the list below [A-Za-z0-9-]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
a-z matches a single character in the range between a (index 97) and z (index 122) (case sensitive)
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
- matches the character - with index 45 decimal (2D hex or 55 octal) literally (case sensitive)

Related

Regular expression to find empty functions

I would like to use a regular expression that finds only the empty functions in PHP files.
For example
function name_not_important()
{
}

Regex can be function\s[^\(]+\([^)]*\)(\n)*{(\n)*}
From https://regex101.com/:
function matches the characters function literally (case sensitive)
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
Match a single character not present in the list below [^\(]
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
\( matches the character ( literally (case sensitive)
\( matches the character ( literally (case sensitive)
Match a single character not present in the list below [^)]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
) matches the character ) literally (case sensitive)
\) matches the character ) literally (case sensitive)
1st Capturing Group (\n)*
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data
\n matches a line-feed (newline) character (ASCII 10)
{ matches the character { literally (case sensitive)
2nd Capturing Group (\n)*
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data
\n matches a line-feed (newline) character (ASCII 10)
} matches the character } literally (case sensitive)
Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
Note: This regex assumes the braces are not indented: only newlines may appear between the closing parenthesis, the { and the }.
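As a rough sketch, the same idea can be tried from PHP with preg_match_all. The sample source and variable names are made up for illustration, and the pattern is a slightly relaxed variant (an assumption, not the answer's exact regex) that uses \s* so that spaces, tabs and \r\n between the tokens are tolerated as well.

```php
<?php
// Hypothetical sample source; in practice this could come from file_get_contents().
$source = <<<'PHP'
function name_not_important()
{
}

function has_body()
{
    return 1;
}
PHP;

// \s* instead of (\n)* also accepts indentation and Windows line endings.
$pattern = '/function\s+[^\(]+\([^)]*\)\s*\{\s*\}/';

preg_match_all($pattern, $source, $matches);
$emptyFunctions = $matches[0]; // full text of every empty function found

print_r($emptyFunctions);
```

Only the first function is reported, because the second one has a statement between its braces.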

Regular expression to fix my long string that only partially repeats format

I have this string that I want to clean up using PHP and regex:
Name/__text,Password/__text,Profile/__text,Locale/__text,UserType/__text,Passwor
dUpdateDate/__text,Columns/0/Name/__text,Columns/0/Label/__text,Columns/0/Order/
__text,Columns/1/Name/__text,Columns/1/Label/__text,Columns/1/Order/__text,Colum
ns/2/Name/__text,Columns/2/Label/__text,Columns/2/Order/__text,Columns/3/Name/__
text,Columns/3/Label/__text,Columns/3/Order/__text,Columns/4/Name/__text,Columns
/4/Label/__text,Columns/4/Order/__text,Columns/5/Name/__text,Columns/5/Label/__t
ext,Columns/5/Order/__text,Columns/6/Name/__text,Columns/6/Label/__text,Columns/
6/Order/__text,Columns/7/Name/__text,Columns/7/Label/__text,Columns/7/Order/__te
xt,Columns/8/Name/__text,Columns/8/Label/__text,Columns/8/Order/__text,Columns/9
/Name/__text,Columns/9/Label/__text,Columns/9/Order/__text,Columns/10/Name/__tex
t,Columns/10/Label/__text,Columns/10/Order/__text,Columns/11/Name/__text,Columns
/11/Label/__text,Columns/11/Order/__text,Columns/12/Name/__text,Columns/12/Label
/__text,Columns/12/Order/__text,Columns/13/Name/__text,Columns/13/Label/__text,C
olumns/13/Order/__text,MailAddress/__text,Description/__text,Columns/14/Name/__t
ext,Columns/14/Label/__text,Columns/14/Order/__text,Columns/15/Name/__text,Colum
ns/15/Label/__text,Columns/15/Order/__text
I want it to be Password,Profile,Locale,UserType,PasswordUpdateDate,Name,Label,Order...
I'm removing the /text or /__text after the word, but there are only sometimes things like Columns/0/ before the word to remove.
I tried this (below) regular expression in the regex tester, but it misses the first few items that don't have the Columns/2/ type of thing before it. I can't use a regex that will grab what's before /__text, because the / before the word is optional, like for the first Name. Any ideas how to do this? It's tough to search for this pattern or info on how to create it. Any help would be great!
[A-Za-z\/0-9]+\/([A-Za-z]+)\/[__text]

Probably easier to just match what you want and then join the matches with commas. Match a word (\w+) followed by /__text:
preg_match_all('#(\w+)/__text#', $string, $matches);
$result = implode(',', $matches[1]);
You could also use ([A-Za-z0-9]+) and add anything else instead of (\w+) in case it could be First_Name, First-Name, Firstname0 etc...
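For reference, a self-contained run of this approach on a shortened version of the input (the sample string is trimmed from the question for brevity):

```php
<?php
// Shortened sample of the input string from the question.
$string = 'Name/__text,Password/__text,Columns/0/Name/__text,'
        . 'Columns/0/Label/__text,Columns/0/Order/__text';

// Capture every word that is directly followed by "/__text".
preg_match_all('#(\w+)/__text#', $string, $matches);
$result = implode(',', $matches[1]);

echo $result, PHP_EOL; // Name,Password,Name,Label,Order
```

Prefixes such as Columns/0/ are skipped automatically, because they are never followed directly by /__text.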

Regex:
(\w+)\/__text(?:(,)(?:Columns\/\d+\/)*)*
Demo
Explanation:
/(\w+)\/__text(?:(,)(?:Columns\/\d+\/)*)*/g
1st Capturing Group (\w+)
\w+ matches any word character (equal to [a-zA-Z0-9_])
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
\/ matches the character / literally (case sensitive)
__text matches the characters __text literally (case sensitive)
Non-capturing group (?:(,)(?:Columns\/\d+\/)*)*
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
2nd Capturing Group (,)
, matches the character , literally (case sensitive)
Non-capturing group (?:Columns\/\d+\/)*
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
Columns matches the characters Columns literally (case sensitive)
\/ matches the character / literally (case sensitive)
\d+ matches a digit (equal to [0-9])
\/ matches the character / literally (case sensitive)
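The answer does not say how the pattern is meant to be used. One plausible reading, shown here purely as an assumption, is a preg_replace call that keeps the captured word ($1) and the captured comma ($2):

```php
<?php
// Shortened sample input, trimmed from the question.
$string = 'Name/__text,Columns/0/Name/__text,Columns/0/Label/__text';

// Assumption: each match is replaced by the word ($1) plus the comma ($2), if any.
$pattern = '/(\w+)\/__text(?:(,)(?:Columns\/\d+\/)*)*/';
$result = preg_replace($pattern, '$1$2', $string);

echo $result, PHP_EOL;
```

For the last item no comma is captured, so $2 is empty and nothing trails the final word.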

PHP regex: each word must end with dot

Can someone help me write a pattern for the preg_match function?
Every word in the string must end with a dot
The first character of the string must be [a-zA-Z]
After each dot there can be a space
There can't be two spaces next to each other
The last character must be a dot (logically, after a word)
Examples:
"Ing" -> false
"Ing." -> true
".Ing." -> false
"Xx Yy." -> false
"XX. YY." -> true
"XX.YY." -> true
Can you please help me test the string? My pattern is
/^(([a-zA-Z]+)(?! ) \.)+\.$/
I know it's wrong, but I can't figure it out. Thanks

Check how this fits your needs.
/^(?:[A-Z]+\. ?)+$/i
^ matches start
(?: opens a non-capture group for repetition
[A-Z]+ with i flag matches one or more alphas (lower & upper)
\. ? matches a literal dot followed by an optional space
)+ all this once or more until $ end
Here's a demo at regex101
If you want to disallow space at the end, add negative lookbehind: /^(?:[A-Z]+\. ?)+$(?<! )/i
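A small sketch that runs the lookbehind variant against the examples from the question. The helper name `isDottedWords` is hypothetical and only used for this illustration.

```php
<?php
// Hypothetical helper wrapping the pattern above; the lookbehind rejects a trailing space.
function isDottedWords(string $s): bool
{
    return preg_match('/^(?:[A-Z]+\. ?)+$(?<! )/i', $s) === 1;
}

$examples = ['Ing', 'Ing.', '.Ing.', 'Xx Yy.', 'XX. YY.', 'XX.YY.'];
foreach ($examples as $example) {
    printf("%-10s -> %s\n", '"' . $example . '"', isDottedWords($example) ? 'true' : 'false');
}
```

The printed results should line up with the true/false list given in the question.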

Try this:
$string = "Ing
Ing.
.Ing.
Xx Yy.
XX. YY.
XX.YY.";
if (preg_match_all('/^([A-Za-z]{1,}\.[ ]{0,1})+$/m', $string, $matches)) {
// $matches[0] lists the lines that satisfy the rules
} else {
// No line matched
}
The Regex in detail:
^ Assert position at the beginning of a line (at beginning of the string or after a line break character)
( Match the regular expression below and capture its match into backreference number 1
[A-Za-z] Match a single character present in the list below
A character in the range between “A” and “Z”
A character in the range between “a” and “z”
{1,} Between one and unlimited times, as many times as possible, giving back as needed (greedy)
\. Match the character “.” literally
[ ] Match the character “ ”
{0,1} Between zero and one times, as many times as possible, giving back as needed (greedy)
)+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
$ Assert position at the end of a line (at the end of the string or before a line break character)

Password Regular expression with four criteria

I am trying to write a regular expression in PHP to ensure a password meets the following criteria:
It should be at least 8 characters long
It should include at least one special character
It should include at least one capital letter.
I have written the following expression:
$pattern=([a-zA-Z\W+0-9]{8,})
However, it doesn't seem to work as per the listed criteria. Could I get another pair of eyes to aid me please?

Your regex - ([a-zA-Z\W+0-9]{8,}) - searches a larger text for a substring of at least 8 characters made up of English letters, non-word characters (anything other than [a-zA-Z0-9_]), and digits in any mix, so it does not enforce 2 of your requirements (the special character and the capital letter). These can be enforced with look-aheads.
Here is a fixed regex:
^(?=.*\W.*)(?=.*[A-Z].*).{8,}$
You can replace [A-Z] with \p{Lu} if you also want to match/allow non-English uppercase letters. You can also consider using \p{S} instead of \W, or refine your definition of a special character by adding symbols or character classes, e.g. [\p{P}\p{S}] (this will also include all Unicode punctuation).
An enhanced regex version:
^(?=.*[\p{S}\p{P}].*)(?=.*\p{Lu}.*).{8,}$
A human-readable explanation:
^ - Beginning of a string
(?=.*\W.*) - Requirement to have at least 1 non-word character
OR (?=.*[\p{S}\p{P}].*) - At least 1 Unicode special or punctuation symbol
(?=.*[A-Z].*) - Requirement to have at least 1 uppercase English letter
OR (?=.*\p{Lu}.*) - At least 1 Unicode uppercase letter
.{8,} - Requirement to have at least 8 characters
$ - End of string
See Demo 1 and Demo 2 (Enhanced regex)
Sample code:
if (preg_match('/^(?=.*\W.*)(?=.*[A-Z].*).{8,}$/u', $header)) {
// PASS
}
else {
# FAIL
}
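A compact sketch of the first (ASCII) variant wrapped in a function. The helper name `isStrongPassword` and the sample passwords are assumptions for illustration. The trailing .* inside each look-ahead is dropped here, which does not change what the look-aheads accept.

```php
<?php
// Hypothetical helper: true when the password is at least 8 characters long and
// contains at least one non-word character and one uppercase ASCII letter.
function isStrongPassword(string $password): bool
{
    return preg_match('/^(?=.*\W)(?=.*[A-Z]).{8,}$/u', $password) === 1;
}

var_dump(isStrongPassword('Passw0rd!')); // meets all three rules
var_dump(isStrongPassword('password!')); // no capital letter
var_dump(isStrongPassword('Password1')); // no special character
var_dump(isStrongPassword('Pw1!'));      // too short
```

Note that an underscore counts as a word character, so it does not satisfy the \W requirement.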

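Using positive lookaheads (?=...) we check the password requirements. Note that in the pattern below the lookaheads are applied one after another rather than all at the start of the string, so the capital letter has to appear before the special character.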
Requirements for strong password:
At least 8 chars long
At least 1 Capital Letter
At least 1 Special Character
Regex:
^((?=[\S]{8})(?:.*)(?=[A-Z]{1})(?:.*)(?=[\p{S}])(?:.*))$
PHP implementation:
if (preg_match('/^((?=[\S]{8})(?:.*)(?=[A-Z]{1})(?:.*)(?=[\p{S}])(?:.*))$/u', $password)) {
# Strong Password
} else {
# Weak Password
}
Examples:
12345678 - WEAK
1234%fff - WEAK
1234_44A - WEAK
133333A$ - STRONG
Regex Explanation:
^ assert position at start of the string
1st Capturing group ((?=[\S]{8})(?:.*)(?=[A-Z]{1})(?:.*)(?=[\p{S}])(?:.*))
(?=[\S]{8}) Positive Lookahead - Assert that the regex below can be matched
[\S]{8} match a single character present in the list below
Quantifier: {8} Exactly 8 times
\S match any kind of visible character [\P{Z}\H\V]
(?:.*) Non-capturing group
.* matches any character (except newline) [unicode]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
(?=[A-Z]{1}) Positive Lookahead - Assert that the regex below can be matched
[A-Z]{1} match a single character present in the list below
Quantifier: {1} Exactly 1 time (meaningless quantifier)
A-Z a single character in the range between A and Z (case sensitive)
(?:.*) Non-capturing group
.* matches any character (except newline) [unicode]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
(?=[\p{S}]) Positive Lookahead - Assert that the regex below can be matched
[\p{S}] match a single character present in the list below
\p{S} matches math symbols, currency signs, dingbats, box-drawing characters, etc
(?:.*) Non-capturing group
.* matches any character (except newline) [unicode]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
$ assert position at end of the string
u modifier: unicode: In PHP, pattern and subject strings are treated as UTF-8. Also causes escape sequences to match unicode characters
Demo:
https://regex101.com/r/hE2dD2/1

php - regex - how to extract a number with decimal (dot and comma) from a string (e.g. 1,120.01)?

How can I extract a number with a decimal part (dot and comma) from a string (e.g. 1,120.01)?
I have a regex, but it doesn't seem to play well with commas:
preg_match('/([0-9]+\.[0-9]+)/', $s, $matches);

The correct regex for matching numbers with commas and decimals is as follows (The first two will validate that the number is correctly formatted):
decimal optional (two decimal places)
^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*(?:\.[0-9]{2})?$
Debuggex Demo
Explained:
number (decimal optional)
^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*(?:\.[0-9]{2})?$
Options: case insensitive
Assert position at the beginning of the string «^»
Match a single character present in the list below «[+-]?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
The character “+” «+»
The character “-” «-»
Match a single character in the range between “0” and “9” «[0-9]{1,3}»
Between one and 3 times, as many times as possible, giving back as needed (greedy) «{1,3}»
Match the regular expression below «(?:,?[0-9]{3})*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the character “,” literally «,?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match a single character in the range between “0” and “9” «[0-9]{3}»
Exactly 3 times «{3}»
Match the regular expression below «(?:\.[0-9]{2})?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match the character “.” literally «\.»
Match a single character in the range between “0” and “9” «[0-9]{2}»
Exactly 2 times «{2}»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»
Will Match:
1,432.01
456.56
654,246.43
432
321,543
454325234.31 (the comma is optional, so ungrouped digits also pass)
Will Not Match:
324,123.432
,,,312,.32
123,.23
decimal mandatory (two decimal places)
^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*\.[0-9]{2}$
Debuggex Demo
Explained:
number (decimal required)
^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*\.[0-9]{2}$
Options: case insensitive
Assert position at the beginning of the string «^»
Match a single character present in the list below «[+-]?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
The character “+” «+»
The character “-” «-»
Match a single character in the range between “0” and “9” «[0-9]{1,3}»
Between one and 3 times, as many times as possible, giving back as needed (greedy) «{1,3}»
Match the regular expression below «(?:,?[0-9]{3})*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the character “,” literally «,?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match a single character in the range between “0” and “9” «[0-9]{3}»
Exactly 3 times «{3}»
Match the character “.” literally «\.»
Match a single character in the range between “0” and “9” «[0-9]{2}»
Exactly 2 times «{2}»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»
Will Match:
1,432.01
456.56
654,246.43
324.75
Will Not Match:
1,43,2.01
456,
654,246
324.7523
Matches numbers separated by commas or dots indiscriminately:
^(\d+(\.|,))+(\d)+$
Debuggex Demo
Explained:
Matches Numbers Separated by , or .
^(\d+(\.|,))+(\d)+$
Options: case insensitive
Match the regular expression below and capture its match into backreference number 1 «(\d+(\.|,))+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Note: You repeated the capturing group itself. The group will capture only the last iteration. Put a capturing group around the repeated group to capture all iterations. «+»
Match a single digit 0..9 «\d+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the regular expression below and capture its match into backreference number 2 «(\.|,)»
Match either the regular expression below (attempting the next alternative only if this one fails) «\.»
Match the character “.” literally «\.»
Or match regular expression number 2 below (the entire group fails if this one fails to match) «,»
Match the character “,” literally «,»
Match the regular expression below and capture its match into backreference number 3 «(\d)+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Note: You repeated the capturing group itself. The group will capture only the last iteration. Put a capturing group around the repeated group to capture all iterations. «+»
Match a single digit 0..9 «\d»
Will Match:
1,32.543,2
5456.35,3.2,6.1
2,7
1.6
Will Not Match:
1,.2 // two ., side by side
1234,12345.5467. // ends in a .
,125 // begins in a ,
,.234 // begins in a , and two symbols side by side
123,.1245. // ends in a . and two symbols side by side
Note: wrap either in a group and then just pull the group, let me know if you need more specifics.
Description: This type of RegEx works with any language really (PHP, Python, C, C++, C#, JavaScript, jQuery, etc). These Regular Expressions are good for currency mainly.
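A short sketch that checks a few candidates against the two anchored patterns above (decimals optional and decimals mandatory). The variable names and the choice of candidates are for illustration only.

```php
<?php
// The two anchored patterns from above: decimals optional vs. mandatory.
$optionalDecimals = '/^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*(?:\.[0-9]{2})?$/';
$requiredDecimals = '/^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*\.[0-9]{2}$/';

foreach (['1,432.01', '432', '324,123.432', '123,.23'] as $candidate) {
    printf(
        "%-12s optional: %-3s required: %s\n",
        $candidate,
        preg_match($optionalDecimals, $candidate) ? 'yes' : 'no',
        preg_match($requiredDecimals, $candidate) ? 'yes' : 'no'
    );
}
```

Because both patterns are anchored with ^ and $, they validate a whole string; to pull a number out of surrounding text the anchors would have to be removed.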

You can use this regex: -
/((?:[0-9]+,)*[0-9]+(?:\.[0-9]+)?)/
Explanation: -
/(
(?:[0-9]+,)* # Match 1 or more repetition of digit followed by a `comma`.
# Zero or more repetition of the above pattern.
[0-9]+ # Match one or more digits before `.`
(?: # A non-capturing group
\. # A dot
[0-9]+ # Digits after `.`
)? # Make the fractional part optional.
)/
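A minimal sketch of using this pattern for extraction and then converting the result to a float. The input string and the comma-stripping step are assumptions added for illustration; the answer itself only supplies the pattern.

```php
<?php
// Hypothetical input for illustration.
$s = 'Total due: 1,120.01 USD';

if (preg_match('/((?:[0-9]+,)*[0-9]+(?:\.[0-9]+)?)/', $s, $matches) === 1) {
    $raw = $matches[1];                           // the number as written, commas included
    $number = (float) str_replace(',', '', $raw); // drop the thousands separators
    echo $raw, ' => ', $number, PHP_EOL;
}
```

Since the pattern is not anchored, it picks up the first number-like run anywhere in the string.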

Add the comma to the range that can be in front of the dot:
/([0-9,]+\.[0-9]+)/
# ^ Comma
And this regex:
/((?:\d,?)+\d\.[0-9]*)/
Will only match
1,067120.01
121,34,120.01
But not
,,,.01
,,1,.01
12,,,.01
# /(
# (?:\d,?) Matches a digit followed by an optional comma
# + And at least one or more of the previous
# \d Followed by a digit (To prevent it from matching `1234,.123`)
# \. Followed by a dot
# in case the dot should be optional, add a `?` after it.
# [0-9]* Followed by any number of digits; if fraction digits are mandatory, replace the `*` with a `+`
# )/

The locale-aware float (%f) might be used with sscanf.
$result = sscanf($s, '%f');
That doesn't split the parts into an array though. It simply parses a float.
See also: http://php.net/manual/en/function.sprintf.php
A regex approach:
/([0-9]{1,3}(?:,[0-9]{3})*\.[0-9]+)/

This should work
preg_match('/\d{1,3}(,\d{3})*(\.\d+)?/', $s, $matches);

Here is a great working regex. This accepts numbers with commas and decimals.
/^-?(?:\d+|\d{1,3}(?:,\d{3})+)?(?:\.\d+)?$/
