Extract a number from parentheses with a PHP regex

I need help with a regex in PHP.
I need to extract the number from the parentheses; in some cases the string contains more than one pair of parentheses.
Here are two example strings:
2.0 16V Quadrifoglio (114 kW / 155 PS)
1.4 TB (940FXB1A) (125 kW / 170 PS)
I needed it to look like this:
2.0 16V Quadrifoglio 155 WORD
1.4 TB 170 WORD
This is the code I have:
$text = '2.0 16V Quadrifoglio (114 kW / 155 PS)';
preg_match('#\((.*?)\)#', $text, $match);
print $match[1];
And the result is:
114 kW / 155 PS
How can I get just the number from the parentheses?

You need to capture just the number after the /, and replace the whole parenthesis expression with that.
$newText = preg_replace('#\([^)]*/\s*([^)]*)\)#', '$1', $text);
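A quick check of that replacement against both sample strings, as a sketch rather than a final solution: it keeps the PS unit and leaves a parenthesised group without a slash untouched, so more handling would be needed to reach the exact output requested. The names $strings and $results are only for illustration.

```php
<?php
// Sample inputs taken from the question.
$strings = [
    '2.0 16V Quadrifoglio (114 kW / 155 PS)',
    '1.4 TB (940FXB1A) (125 kW / 170 PS)',
];

// Replace each "( ... / value )" group with whatever follows the slash.
$results = preg_replace('#\([^)]*/\s*([^)]*)\)#', '$1', $strings);

print_r($results);
// [0] => 2.0 16V Quadrifoglio 155 PS
// [1] => 1.4 TB (940FXB1A) 170 PS
```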

I needed it to look like this:
2.0 16V Quadrifoglio 155 WORD
1.4 TB 170 WORD
Pattern: ~^([^(]*).*?(\d+) PS.*~
Replacement: $1$2 WORD
Demo: https://regex101.com/r/GiJDi5/2
Output:
2.0 16V Quadrifoglio 155 WORD
1.4 TB 170 WORD
PHP: (Demo)
$strings = [
'2.0 16V Quadrifoglio (114 kW / 155 PS)',
'1.4 TB (940FXB1A) (125 kW / 170 PS)'
];
var_export(preg_replace('~^([^(]*).*?(\d+) PS.*~', '$1$2 WORD', $strings));
Output:
array (
0 => '2.0 16V Quadrifoglio 155 WORD',
1 => '1.4 TB 170 WORD',
)

Related

PHP - Regex optimization split string in parts

In PHP I am trying to write a regex that splits a string into different parts as array elements.
For example, these are my strings:
$string1 = "For a serving of 100 g Sugars: 2.3 g (Approximately)";
$string2 = "For a serving of 100 g Saturated Fat: 5.8 g (Approximately)";
$string3 = "For a portion of 100 g Energy Value: 290 kcal (Approximately)";
And I want to extract specific information from these strings:
$arrayString1 = array('100 g','Sugars', '2.3 g');
$arrayString2 = array('100 g','Saturated Fat', '5.8 g');
$arrayString3 = array('100 g','Energy Value', '290 kcal');
I made this regex :
(^For a serving of )([\d g]*)([^:]*)(: )([\d.\d]*)( )([a-z]*)
Do you have any idea how to optimize this regex?
Thanks
You could make it a bit more specific matching the g or kcal and the digits.
To match all examples, you can use an alternation to match either of the alternatives (?:serving|portion)
Instead of using 7 capturing groups, you can use 3 capturing groups.
You can omit the first capturing group (^For a serving of ) and combine the values of the digits and the unit.
^For\h+a\h+(?:serving|portion)\h+of\h+(\d+\h+g)\h+([^:\r\n]+):\h+(\d+(?:\.\d+)?\h+(?:g|kcal))\b
^ Start of string
For\h+a\h+(?:serving|portion)\h+of\h+ Match the beginning of the string with either serving or portion
(\d+\h+g)\h+ Capture group 1, match 1+ digits and g
([^:\r\n]+):\h+ Capture group 2, match 1+ times any char except : or a newline, followed by matching : and 1+ horizontal whitespace chars
( Capture group 3
\d+(?:\.\d+)? Match 1+ digits with an optional decimal part
\h+(?:g|kcal) Match 1+ horizontal whitespace chars and either g or kcal
)\b Close group 3 and a word boundary to prevent the unit being part of a longer word
Regex demo | Php demo
For example
$pattern = "~^For\h+a\h+(?:serving|portion)\h+of\h+(\d+\h+g)\h+([^:\r\n]+):\h+(\d+(?:\.\d+)?\h+(?:g|kcal))\b~";
$strings = [
    "For a serving of 100 g Sugars: 2.3 g (Approximately)",
    "For a serving of 100 g Saturated Fat: 5.8 g (Approximately)",
    "For a portion of 100 g Energy Value: 290 kcal (Approximately)"
];
foreach ($strings as $string) {
    preg_match($pattern, $string, $matches);
    array_shift($matches); // drop the full match, keep the 3 groups
    print_r($matches);
}
Output
Array
(
[0] => 100 g
[1] => Sugars
[2] => 2.3 g
)
Array
(
[0] => 100 g
[1] => Saturated Fat
[2] => 5.8 g
)
Array
(
[0] => 100 g
[1] => Energy Value
[2] => 290 kcal
)

PHP string split regular

Regular exp = (Digits)*(A|B|DF|XY)+(Digits)+
I'm really confused about this pattern.
I want to split this kind of string in PHP. Can someone help me?
My input may look something like this:
A1234
B 1239
1A123
12A123
1A 1234
12 A 123
1234 B 123456789
12 XY 1234567890
and I want to convert it to this:
Array
(
[0] => 12
[1] => XY
[2] => 1234567890
)
<?php
$input = "12 XY 123456789";
print_r(preg_split('/\d*[(A|B|DF|XY)+\d+]+/', $input, 3));
//print_r(preg_split('/[\s,]+/', $input, 3));
//print_r(preg_split('/\d*[\s,](A|B)+[\s,]\d+/', $input, 3));
You may match and capture the numbers, letters, and numbers:
$input = "12 XY 123456789";
if (preg_match('/^(?:(\d+)\s*)?(A|B|DF|XY)(?:\s*(\d+))?$/', $input, $matches)){
array_shift($matches);
print_r($matches);
}
See the PHP demo and the regex demo.
^ - start of string
(?:(\d+)\s*)? - an optional sequence of:
(\d+) - Group 1: one or more digits
\s* - 0+ whitespaces
(A|B|DF|XY) - Group 2: A, B, DF or XY
(?:\s*(\d+))? - an optional sequence of:
\s* - 0+ whitespaces
(\d+) - Group 3: one or more digits
$ - end of string.
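To see how the optional groups behave, here is a small sketch that runs the same pattern over some of the sample inputs from the question. The helper name splitCode is made up for this illustration; note that a missing leading number comes back as an empty string.

```php
<?php
// Hypothetical helper: returns [leading digits, letters, trailing digits] or null.
function splitCode(string $input): ?array
{
    if (preg_match('/^(?:(\d+)\s*)?(A|B|DF|XY)(?:\s*(\d+))?$/', $input, $m)) {
        array_shift($m);   // drop the full match, keep only the groups
        return $m;
    }
    return null;           // input does not fit the expected shape
}

foreach (['A1234', 'B 1239', '1A123', '12 A 123', '12 XY 1234567890'] as $input) {
    echo $input, ' => ', json_encode(splitCode($input)), PHP_EOL;
}
```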

How to parse mobile number without special character

I want to extract a mobile number without its special characters, for example:
+61-426 861 479 ====> 61 426 861 479
PHP preg_match_all
preg_match_all('/(\d{2}) (\d{3}) (\d{3}) (\d{3})/', $part,$matches);
if (count($matches[0])){
foreach ($matches[0] as $mob) {
$records['mobile'][] = $mob;
}
}
Expected Output
+61-426 861 479 ====> 61 426 861 479
You are missing the + and the - in your pattern. You might update your pattern to use 2 capturing groups and use preg_match_all. To add the mobile number to the array you could concatenate the first and the second index.
\+(\d{2})-(\d{3}(?: \d{3}){2})\b
Regex demo | Php demo
For example
$part = "+61-426 861 478 +61-426 861 479 ";
preg_match_all('/\+(\d{2})-(\d{3}(?: \d{3}){2})\b/', $part, $matches, PREG_SET_ORDER, 0);
if (count($matches)) {
foreach ($matches as $mob) {
$records['mobile'][] = $mob[1] . ' ' . $mob[2];
}
}
print_r($records);
Result
Array
(
[mobile] => Array
(
[0] => 61 426 861 478
[1] => 61 426 861 479
)
)
If the number is the only string, you might also remove all the non digits using \D+ and replace with a space. Then use ltrim to remove the leading space from the +. See a php demo.
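A minimal sketch of that last suggestion, assuming the input holds nothing but the one phone number (the variable names are illustrative):

```php
<?php
$number = '+61-426 861 479';

// Collapse every run of non-digits ("+", "-", spaces) into a single space...
$spaced = preg_replace('/\D+/', ' ', $number);

// ...then drop the space left behind by the leading "+".
$clean = ltrim($spaced);

echo $clean, PHP_EOL; // 61 426 861 479
```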

Extract fragments of text via PHP and REGEXP

Assuming I have the string variable:
$str = '
[WhiteTitle "GM"]
[WhiteCountry "Cuba"]
[BlackCountry "United States"]
1. d4 d5 2. Nf3 Nf6 3. e3 c6 4. c4 e6 5. Nc3 Nbd7 6. Bd3 Bd6
7. O-O O-O 8. e4 dxe4 9. Nxe4 Nxe4 10. Bxe4 Nf6 11. Bc2 h6
12. b3 b6 13. Bb2 Bb7 14. Qd3 g6 15. Rae1 Nh5 16. Bc1 Kg7
17. Rxe6 Nf6 18. Ne5 c5 19. Bxh6+ Kxh6 20. Nxf7+ 1-0
';
I would like to extract some information from that variable into an array that looks like this:
Array {
["WhiteTitle"] => "GM",
["WhiteCountry"] => "Cuba",
["BlackCountry"] => "United States"
}
Thanks.
Here is a safer and more compact solution:
$re = '~\[([^]["]*?)\s*"([^]"]+)~'; // Defining the regex
$str = "[WhiteTitle \"GM\"]\n[WhiteCountry \"Cuba\"]\n[BlackCountry \"United States\"]\n\n1. d4 d5 2. Nf3 Nf6 3. e3 c6 4. c4 e6 5. Nc3 Nbd7 6. Bd3 Bd6\n7. O-O O-O 8. e4 dxe4 9. Nxe4 Nxe4 10. Bxe4 Nf6 11. Bc2 h6\n12. b3 b6 13. Bb2 Bb7 14. Qd3 g6 15. Rae1 Nh5 16. Bc1 Kg7\n17. Rxe6 Nf6 18. Ne5 c5 19. Bxh6+ Kxh6 20. Nxf7+ 1-0";
preg_match_all($re, $str, $matches); // Getting all matches
print_r(array_combine($matches[1],$matches[2])); // Creating the final array with array_combine
See IDEONE PHP demo, and a regex demo.
Regex details:
\[ - opening [
([^]["]*?) - Group 1 matching 0+ characters other than ", [ and ], as few as possible up to
\s* - 0+ whitespaces (to trim the first value)
" - a double quote
([^]"]+) - Group 2 matching 1+ characters other than ] and "
You can use:
preg_match_all('/\[(.*?) "(.*?)"\]/m', $str, $matches, PREG_SET_ORDER);
print_r($matches);
This returns every match as an array: key 0 holds the complete match, key 1 the first captured part, and key 2 the second:
Output:
Array
(
[0] => Array
(
[0] => [WhiteTitle "GM"]
[1] => WhiteTitle
[2] => GM
)
[1] => Array
(
[0] => [WhiteCountry "Cuba"]
[1] => WhiteCountry
[2] => Cuba
)
[2] => Array
(
[0] => [BlackCountry "United States"]
[1] => BlackCountry
[2] => United States
)
)
If you want it in the format you asked you can use simple looping for this:
$array = array();
foreach($matches as $match){
$array[$match[1]] = $match[2];
}
print_r($array);
Output:
Array
(
[WhiteTitle] => GM
[WhiteCountry] => Cuba
[BlackCountry] => United States
)
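As an alternative to the loop, array_column can build the same key/value array directly from the PREG_SET_ORDER result. This is only a sketch using a shortened copy of the input; array_column needs PHP 5.5 or later.

```php
<?php
$str = '[WhiteTitle "GM"]
[WhiteCountry "Cuba"]
[BlackCountry "United States"]

1. d4 d5 2. Nf3 Nf6 1-0';

preg_match_all('/\[(.*?) "(.*?)"\]/', $str, $matches, PREG_SET_ORDER);

// Use group 1 of each match as the key and group 2 as the value.
$array = array_column($matches, 2, 1);

print_r($array);
```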
You can use something like this:
<?php
$string = <<< EOF
[WhiteTitle "GM"]
[WhiteCountry "Cuba"]
[BlackCountry "United States"]
1. d4 d5 2. Nf3 Nf6 3. e3 c6 4. c4 e6 5. Nc3 Nbd7 6. Bd3 Bd6
7. O-O O-O 8. e4 dxe4 9. Nxe4 Nxe4 10. Bxe4 Nf6 11. Bc2 h6
12. b3 b6 13. Bb2 Bb7 14. Qd3 g6 15. Rae1 Nh5 16. Bc1 Kg7
17. Rxe6 Nf6 18. Ne5 c5 19. Bxh6+ Kxh6 20. Nxf7+ 1-0
EOF;
$final = array();
preg_match_all('/\[(.*?)\s+(".*?")\]/', $string, $matches, PREG_PATTERN_ORDER);
for($i = 0; $i < count($matches[1]); $i++) {
$final[$matches[1][$i]] = $matches[2][$i];
}
print_r($final);
Output:
Array
(
[WhiteTitle] => "GM"
[WhiteCountry] => "Cuba"
[BlackCountry] => "United States"
)
Ideone Demo:
http://ideone.com/wQYshT
Regex Explanation:
\[(.*?)\s+(".*?")\]
Match the character “[” literally «\[»
Match the regex below and capture its match into backreference number 1 «(.*?)»
Match any single character that is NOT a line break character (line feed) «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match a single character that is a “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line) «\s+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the regex below and capture its match into backreference number 2 «(".*?")»
Match the character “"” literally «"»
Match any single character that is NOT a line break character (line feed) «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “"” literally «"»
Match the character “]” literally «\]»

PHP Number Substring

How can I find all the numbers contained in a string, except the ones that are attached to a letter (like A1)?
For example in a String "saddfs 2300 dfsfd 45 A3 A6" I only want to get 2300 and 45.
I know that
preg_match_all('!\d+!', $string, $nums);
can find all numbers, but I don't want to find the numbers from A3 and A6 too.
Thanks!
Just use word boundary or string boundaries:
preg_match_all('!(^|\b)\d+(\b|$)!', $string, $nums);
Some tests:
php > preg_match_all('!(^|\b)\d+(\b|$)!', 'saddfs 2300 dfsfd 45 A3 A6', $nums);
php > print_r($nums[0]);
Array
(
[0] => 2300
[1] => 45
)
php > preg_match_all('!(^|\b)\d+(\b|$)!', 'saddfs 2300 dfsfd 45 A3 A6 123', $nums);
php > print_r($nums[0]);
Array
(
[0] => 2300
[1] => 45
[2] => 123
)
php > preg_match_all('!(^|\b)[0-9]+(\b|$)!', '789 saddfs 2300 dfsfd 45 A3 A6 123', $nums);
php > print_r($nums[0]);
Array
(
[0] => 789
[1] => 2300
[2] => 45
[3] => 123
)
UPDATE: changed \d to [0-9] per Zsolt Szilagy's suggestion.
Non-robust, quick-and-dirty -- and wrong -- solution:
$ php -a
Interactive shell
php > preg_match_all('/\W\d+\W/', 'saddfs 2300 dfsfd 45 A3 A6', $matches);
php > print_r($matches);
Array
(
[0] => Array
(
[0] => 2300
[1] => 45
)
)
Update: per Aleks G's suggestion, here are the pitfalls of this solution:
First problem: this fails to match pure numbers at the very beginning or end of a string. To handle that, follow Aleks G's pattern, which puts the anchor characters in capturing sub-patterns:
preg_match_all('/(^|\W)\d+(\W|$)/', '2300 df A6 242 sfd 45', $matches);
You could make the pattern non-capturing ('/(?:^|\W)\d+(?:\W|$)/') to signal your intent that the parentheses are for grouping, not for capturing -- but this is purely optional as the values you still want remain in $matches[0].
Second problem: \b and \W are not quite the same thing. \b is a "word boundary" (a zero-width position) while \W is "not a word character" (it consumes a character). Compare the result of Aleks G's answer with mine and you'll see that \b gives back pure numbers while \W gives back the surrounding spaces as well.
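The difference can be seen side by side in this sketch (the sample string and variable names are only for illustration). Because \W consumes the neighbouring character, two numbers separated by a single space compete for that space, so the second one is skipped.

```php
<?php
$string = 'saddfs 2300 45 A3 A6';

// \b is zero-width: only the digits end up in the match.
preg_match_all('/\b[0-9]+\b/', $string, $boundary);

// \W consumes a character on each side, so the spaces are part of the match.
preg_match_all('/(?:^|\W)[0-9]+(?:\W|$)/', $string, $nonWord);

var_export($boundary[0]); // contains '2300' and '45'
var_export($nonWord[0]);  // contains only ' 2300 ' (with spaces); 45 is missed
```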
Update: per Zsolt Szilagy's comment, \d is not always limited to 0 through 9. Depending on the matching mode (for example with the u modifier in PHP), it can also match digit characters from other scripts, such as Arabic-Indic digits. Use the character class [0-9] if you want only the ASCII digits.
