Find the number of occurrences of a given value within a string? - php

Consider the string below.
$string="Lorem ipsum $ 1000 ,ipsum $2000 sopr $250 gerb $ 150 dfkuer fsdf erwer 1020 $ gsdfasdtwe qw $ 5000 efk kdfgksgdf 2000 $ sdhfgsd fsdf 620 $ sdfjg jsdf3000$";
I need to find out how many numbers there are in this string, counting only numbers that are 1000 or greater and that are preceded or followed by a $ symbol.
Example: $1000, $ 1000, 1000$, or 1000 $, and only values of 1000 or above.

Using preg_match_all() and a foreach loop:
$string="Lorem ipsum $ 1000 ,ipsum $2000 sopr $250 gerb $ 150 dfkuer fsdf erwer 1020 $ gsdfasdtwe qw $ 50000 efk kdfgksgdf 2000 $ sdhfgsd fsdf 620 $ sdfjg jsdf3000$";
preg_match_all('/(\$\s?)(?P<before>\d{4,})|(?P<after>\d{4,})(\s?\$)/', $string, $m);
$tmp = array_filter($m["before"]) + array_filter($m["after"]);
$number = array();
foreach($tmp as $n){
if($n >= 1000){
if(isset($number[$n])){
$number[$n]++;
}else{
$number[$n] = 1;
}
}
}
print_r($number);
// Key => number, value => number of occurrences
I've used \d{4,} to match numbers with four or more digits, which are normally 1000 or higher. However, a number with a leading zero such as 0500 would also be matched, so I used a foreach loop to filter out anything below 1000.
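As a hedged alternative to the manual counting loop, the sketch below uses a PCRE branch-reset group so that both alternatives capture into group 1, and then lets array_count_values() do the tallying. The variable names ($values, $counts, $total) are illustrative, and the input is the string from the question.

```php
<?php
// Sketch only: the question's sample string, single-quoted so "$" is never interpolated.
$string = 'Lorem ipsum $ 1000 ,ipsum $2000 sopr $250 gerb $ 150 dfkuer fsdf erwer 1020 $ gsdfasdtwe qw $ 5000 efk kdfgksgdf 2000 $ sdhfgsd fsdf 620 $ sdfjg jsdf3000$';

// (?| ... ) is a branch-reset group: whichever alternative matches, the number lands in group 1.
preg_match_all('/(?|\$\s?(\d{4,})|(\d{4,})\s?\$)/', $string, $m);

// Cast to int and drop anything below 1000 (for example a zero-padded "0500").
$values = array_filter(array_map('intval', $m[1]), function ($n) {
    return $n >= 1000;
});

$counts = array_count_values($values); // number => occurrences
$total  = count($values);              // how many qualifying numbers were found

print_r($counts);
echo $total, "\n";
```

Here $total answers the "how many" part of the question, while $counts gives the per-number breakdown that the foreach loop builds by hand.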

Try this:
$string ="Lorem ipsum $ 1000 ,ipsum $2000 sopr $250 gerb $ 150 dfkuer fsdf erwer 1020 $ gsdfasdtwe qw $ 5000 efk kdfgksgdf 2000 $ sdhfgsd fsdf 620 $ sdfjg jsdf3000$";
preg_match_all('/\$\s?(?P<pr>\d{4,})|(?P<fl>\d{4,})\s?\$/',$string,$match);
$res = array_merge(array_filter($match['pr']),array_filter($match['fl']));
echo "<pre>";
print_r($res);
Output:
Array
(
[0] => 1000
[1] => 2000
[2] => 5000
[3] => 1020
[4] => 2000
[5] => 3000
)

<?php
$string="Lorem ipsum $ 1000 ,ipsum $2000 sopr $250 gerb $ 150 dfkuer fsdf erwer 1020 $ gsdfasdtwe qw $ 5000 efk kdfgksgdf 2000 $ sdhfgsd fsdf 620 $ sdfjg jsdf3000$";
$pattern = "#([$][\s]*)?([1-9]\d{3})([\s]*[$])?#";
//(?<=$|$\s)
//(?=$|\s$)
preg_match_all($pattern, $string, $out);
print_r($out[2]);
Array
(
[0] => 1000
[1] => 2000
[2] => 1020
[3] => 5000
[4] => 2000
[5] => 3000
)
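Note that both $ groups in the pattern above are optional, so a bare four-digit number with no $ next to it would also be captured, and only the first four digits of a longer number are taken. The sketch below contrasts it with a stricter alternation. The sample text and the $strict pattern are my own illustration, not part of the answer.

```php
<?php
// Hypothetical sample: 4321 has no "$" next to it and should not be counted.
$sample = 'id 4321 costs $ 1500 or 2500$';

$loose  = '#([$][\s]*)?([1-9]\d{3})([\s]*[$])?#';          // pattern from the answer
$strict = '#(?|[$]\s*([1-9]\d{3,})|([1-9]\d{3,})\s*[$])#'; // assumed stricter variant

preg_match_all($loose, $sample, $a);
preg_match_all($strict, $sample, $b);

print_r($a[2]); // includes 4321, which has no "$"
print_r($b[1]); // only the numbers that sit next to a "$"
```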

Related

Parse strictly formatted text containing multiple entries with no delimiting character

I have a string containing multiple product orders which have been joined together without a delimiter.
I need to parse the input string and convert sets of three substrings into separate rows of data.
I tried splitting the string using the split() and strstr() functions, but could not generate the desired result.
How can I convert this statement into different columns?
RM is Malaysian Ringgit
From this statement:
"2 x Brew Coffeee Panas: RM7.42 x Tongkat Ali Ais: RM8.6"
Into separate rows:
2 x Brew Coffeee Panas: RM7.4
2 x Tongkat Ali Ais: RM8.6
And these two rows into this table in the DB:
Table: Products
Product Name | Quantity | Total Amount (RM)
Brew Coffeee Panas | 2 | 7.4
Tongkat Ali Ais | 2 | 8.6
*Note: the "total amount" substrings will reliably have a numeric value with precision to one decimal place.
You could use regex if your string format is consistent. Here's an expression that could do that:
(\d) x (.+?): RM(\d+\.\d)
Basic usage
$re = '/(\d) x (.+?): RM(\d+\.\d)/';
$str = '2 x Brew Coffeee Panas: RM7.42 x Tongkat Ali Ais: RM8.6';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
var_export($matches);
Which gives
array (
0 =>
array (
0 => '2 x Brew Coffeee Panas: RM7.4',
1 => '2',
2 => 'Brew Coffeee Panas',
3 => '7.4',
),
1 =>
array (
0 => '2 x Tongkat Ali Ais: RM8.6',
1 => '2',
2 => 'Tongkat Ali Ais',
3 => '8.6',
),
)
Group 0 will always be the full match, after that the groups will be quantity, product and price.
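To make the group order harder to mix up when building database rows, the same expression can be written with named groups. This is a minimal sketch: the group names (qty, name, amount) and the $rows layout are assumptions chosen to mirror the table in the question.

```php
<?php
$str = '2 x Brew Coffeee Panas: RM7.42 x Tongkat Ali Ais: RM8.6';

// Same pattern as above, with named groups instead of positional ones.
$re = '/(?<qty>\d) x (?<name>.+?): RM(?<amount>\d+\.\d)/';
preg_match_all($re, $str, $matches, PREG_SET_ORDER);

$rows = [];
foreach ($matches as $m) {
    // Each $m holds both numeric and named keys; only the named ones are used here.
    $rows[] = [
        'name'     => $m['name'],
        'quantity' => (int) $m['qty'],
        'amount'   => (float) $m['amount'],
    ];
}

print_r($rows);
```

Each element of $rows can then be bound to a prepared INSERT statement for the Products table.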
Capture one or more digits
Match the space, x, space
Capture one or more non-colon characters until the first occurring colon
Match the colon, space, then RM
Capture the float value that has a max decimal length of 1 (the OP says in a comment under the question: "it only take one decimal place for the amount")
There are no "lazy quantifiers" in my pattern, so the regex can move most swiftly.
This regex pattern is as Accurate as the sample data and requirement explanation allows, as Efficient as it can be because it only contains greedy quantifiers, as Concise as it can be thanks to the negated character class, and as Readable as the pattern can be made because there are no superfluous characters.
Code: (Demo)
$string = '2 x Brew Coffeee Panas: RM7.42 x Tongkat Ali Ais: RM8.6';
var_export(
preg_match_all('~(\d+) x ([^:]+): RM(\d+\.\d)~', $string, $m)
? array_slice($m, 1) // omit the fullstring matches
: [] // if there are no matches
);
Output:
array (
0 =>
array (
0 => '2',
1 => '2',
),
1 =>
array (
0 => 'Brew Coffeee Panas',
1 => 'Tongkat Ali Ais',
),
2 =>
array (
0 => '7.4',
1 => '8.6',
),
)
You can add the PREG_SET_ORDER argument to the preg_match_all() call to aid in iterating the matches as rows.
preg_match_all('~(\d+) x ([^:]+): RM(\d+\.\d)~', $string, $matches, PREG_SET_ORDER);
foreach ($matches as $match) {
echo '<tr><td>' . implode('</td><td>', array_slice($match, 1)) . '</td></tr>';
}
You can use a regex like this:
/(\d+)\sx\s([^:]+):\sRM(\d+\.?\d?)(?=\d|$)/
Explanation:
(\d+) captures one or more digits
\s matches a whitespace character
([^:]+): captures one or more non : characters that come before a : character (you can also use something like [a-zA-Z0-9\s]+): if you know exactly which characters can exist before the : character - in this case lower case and upper case letters, digits 0 through 9 and whitespace characters)
(\d+\.?\d?) captures one or more digits, followed by a . and another digit if they exist
(?=\d|$) is a positive lookahead which matches a digit after the main expression without including it in the result, or the end of the string
You can also add the PREG_SET_ORDER flag to preg_match_all() to group the results:
PREG_SET_ORDER
Orders results so that $matches[0] is an array of first set of matches, $matches[1] is an array of second set of matches, and so on.
Code example:
<?php
$txt = "2 x Brew Coffeee Panas: RM7.42 x Tongkat Ali Ais: RM8.62 x B026 Kopi Hainan Kecil: RM312 x B006 Kopi Hainan Besar: RM19.5";
$pattern = "/(\d+)\sx\s([^:]+):\sRM(\d+\.?\d?)(?=\d|$)/";
if(preg_match_all($pattern, $txt, $matches, PREG_SET_ORDER)) {
print_r($matches);
}
?>
Output:
Array
(
[0] => Array
(
[0] => 2 x Brew Coffeee Panas: RM7.4
[1] => 2
[2] => Brew Coffeee Panas
[3] => 7.4
)
[1] => Array
(
[0] => 2 x Tongkat Ali Ais: RM8.6
[1] => 2
[2] => Tongkat Ali Ais
[3] => 8.6
)
[2] => Array
(
[0] => 2 x B026 Kopi Hainan Kecil: RM31
[1] => 2
[2] => B026 Kopi Hainan Kecil
[3] => 31
)
[3] => Array
(
[0] => 2 x B006 Kopi Hainan Besar: RM19.5
[1] => 2
[2] => B006 Kopi Hainan Besar
[3] => 19.5
)
)
The first thing I would do is perform a simple replacement using preg_replace to insert a delimiter, with the aid of a back-reference to the captured item, based upon the known format of a single decimal place. Anything beyond that single decimal place forms part of the next item - the quantity in this case.
$str="2 x Brew Coffeee Panas: RM7.42 x Tongkat Ali Ais: RM8.625 x Koala Kebabs: RM15.23 x Fried Squirrel Fritters: RM32.4";
# qty price
# 2 7.4
# 2 8.6
# 25 15.2
# 3 32.4
/*
Our RegEx to find the decimal precision,
to split the string apart and the quantity
*/
$pttns=(object)array(
'repchar' => '#(RM\d{1,}\.\d{1})#',
'splitter' => '#(\|)#',
'combo' => '#^((\d{1,}) x)(.*): RM(\d{1,}\.\d{1})$#'
);
# create a new version of the string with our specified delimiter - the PIPE
$str = preg_replace( $pttns->repchar, '$1|', $str );
# split the string into pieces - discard empty items
$a=array_filter( preg_split( $pttns->splitter, $str, -1 ) );
# iterate through matches - find the quantity, item & price
foreach($a as $str){
preg_match($pttns->combo,$str,$matches);
$qty=$matches[2];
# the item capture starts with the space that follows "x", so trim it
$item=trim($matches[3]);
$price=$matches[4];
# print the price with one decimal place; %d would truncate 7.4 to 7
printf('%s %d %.1f<br />',$item,$qty,$price);
}
Which yields:
Brew Coffeee Panas 2 7.4
Tongkat Ali Ais 2 8.6
Koala Kebabs 25 15.2
Fried Squirrel Fritters 3 32.4

How to keep the delimiter in the next item in preg_split?

I split a string by a set of characters like this:
$str = 'a-1 90 b55 0 -4 4 c9';
$array = preg_split('#(?<=[abc])#',$str, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
This keeps the delimiter at the end of the previous element:
Array
(
[0] => a
[1] => -1 90 b
[2] => 55 0 -4 4 c
[3] => 9
)
but I want to keep it at the start of the next item, like this:
Array
(
[0] => a-1 90
[1] => b55 0 -4 4
[2] => c9
)
Use lookahead instead of lookbehind:
$str = 'a-1 90 b55 0 -4 4 c9';
$array = preg_split('#(?=[abc])#',$str, -1, PREG_SPLIT_NO_EMPTY);
print_r($array);
Since you're not using any capture group in your regex, there is no need for the PREG_SPLIT_DELIM_CAPTURE flag.
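A runnable sketch of the lookahead split follows. Each piece keeps the whitespace that sat in front of the next delimiter, so the array_map('trim', ...) step is added here on the assumption that the trailing spaces are unwanted; $parts is an illustrative name.

```php
<?php
$str = 'a-1 90 b55 0 -4 4 c9';

// Split at every position that is directly followed by a, b or c.
// The zero-width match before the leading "a" yields an empty piece,
// which PREG_SPLIT_NO_EMPTY discards.
$parts = preg_split('#(?=[abc])#', $str, -1, PREG_SPLIT_NO_EMPTY);

// Assumption: trailing spaces such as in "a-1 90 " are not wanted.
$parts = array_map('trim', $parts);

print_r($parts);
```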

Regex Optionally match a pattern multiple times

I have a string and I want to match a specific pattern optionally as many times as may occur.
My String
0.91 0.45 0.69 58 47 45 23 83 90 $595 NO IDL
After 45 and before $595 there could be up to 6 more numbers. How can I optionally match the repeating numbers in that space?
Here's what I have so far:
/([\d.]+) ([\d.]+) ([\d.]+)? (\d+) (\d+) (\d+) \$(\d+)/ig
Here are some samples with expected outputs:
0.91 0.45 0.69 58 47 45 23 83 90 $595 NO IDL
output: array([0] => 0.91,
[1] => 0.45,
[2] => 0.69,
[3] => 58,
[4] => 47,
[5] => 45,
[6] => 23,
[7] => 83,
[8] => 90,
[9] => 595)
0.91 0.45 0.69 58 47 45 $595 NO IDL
output: array([0] => 0.91,
[1] => 0.45,
[2] => 0.69,
[3] => 58,
[4] => 47,
[5] => 45,
[6] => 595)
0.91 0.45 0.69 0.63 58 47 45 $595 NO IDL
output: Does not match the pattern because we only want 3 of the first items to contain decimals.
This seems to split the last number into multiple numbers. I can't figure out what's going on.
I am using PHP's preg_match function for this, so I would like no empty elements in the resulting array if possible. Thanks.
You may validate the string with a positive lookahead triggered at the start of the string, and then match all numbers from the start up to the currency value once the validation succeeds:
'~(?:\G(?!^)|^(?=\d+\.\d+ \d+\.\d+ \d+(?:\.\d+)?(?: \d+)* \$\d))\s*\$?\K\d+(?:\.\d+)?~'
Details
(?:\G(?!^)|^(?=\d+\.\d+ \d+\.\d+ \d+(?:\.\d+)?(?: \d+)* \$\d)) - either the end of the previous match (\G(?!^)) or the start of the string (^) that is followed with:
  \d+\.\d+ - a number with a fractional part
  (space) - a space
  \d+\.\d+ - a second number with a fractional part
  (space) - a space
  \d+ - 1+ digits
  (?:\.\d+)? - an optional fractional part
  (?: \d+)* - 0+ sequences of a space followed with 1+ digits
  (space) - a space
  \$\d - a $ and a digit.
\s* - 0+ whitespaces
\$? - an optional $ char
\K - match reset operator
\d+(?:\.\d+)? - an int/float number (1+ digits followed with an optional sequence of . and 1+ digits).
PHP demo:
$strs = ['0.91 0.45 0.69 58 47 45 23 83 90 $595 NO IDL','0.91 0.45 0.69 58 47 45 $595 NO IDL','0.91 0.45 0.69 0.63 58 47 45 $595 NO IDL'];
$rx = '~(?:\G(?!^)|^(?=\d+\.\d+ \d+\.\d+ \d+(?:\.\d+)?(?: \d+)* \$\d))\s*\$?\K\d+(?:\.\d+)?~';
foreach ($strs as $s) {
echo "$s:\n";
if (preg_match_all($rx, $s, $matches)) {
print_r($matches[0]);
echo "---------\n";
} else {
echo "NO MATCH!!!\n---------\n";
}
}
Output:
0.91 0.45 0.69 58 47 45 23 83 90 $595 NO IDL:
Array
(
[0] => 0.91
[1] => 0.45
[2] => 0.69
[3] => 58
[4] => 47
[5] => 45
[6] => 23
[7] => 83
[8] => 90
[9] => 595
)
---------
0.91 0.45 0.69 58 47 45 $595 NO IDL:
Array
(
[0] => 0.91
[1] => 0.45
[2] => 0.69
[3] => 58
[4] => 47
[5] => 45
[6] => 595
)
---------
0.91 0.45 0.69 0.63 58 47 45 $595 NO IDL:
NO MATCH!!!
---------
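If the \G-based pattern feels hard to maintain, a two-step approach may be easier to read: validate and capture the whole line with one preg_match() call, then split the run of integers with explode(). This is only a sketch under the question's stated format (three decimals, three fixed integers plus up to six optional ones, then the $ amount); parseLine() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: returns the list of numbers, or null when the line does not fit the format.
function parseLine(string $line): ?array
{
    // 3 decimals, then 3 to 9 integers (3 fixed plus up to 6 optional), then "$" and digits.
    $rx = '~^(\d+\.\d+) (\d+\.\d+) (\d+\.\d+)((?: \d+){3,9}) \$(\d+)~';
    if (!preg_match($rx, $line, $m)) {
        return null;
    }
    // Group 4 holds the whole integer run, e.g. " 58 47 45".
    $ints = explode(' ', trim($m[4]));
    return array_merge([$m[1], $m[2], $m[3]], $ints, [$m[5]]);
}

print_r(parseLine('0.91 0.45 0.69 58 47 45 23 83 90 $595 NO IDL'));
var_dump(parseLine('0.91 0.45 0.69 0.63 58 47 45 $595 NO IDL')); // a fourth decimal: no match
```

Because every value comes from a required group or from explode(), the result contains no empty elements.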
This should give you the expected results:
/([\d\$.]+)/ig
You can match the fixed part of the string up to 45, which is the 6th number, and then use \G to capture each number that follows.
Explanation
(?:\d+\.\d+)(?: \d+\.\d+){2} Match the number at the start (digits with a decimal part) 3 times
(?: \d+){3} Match a space followed by digits 3 times. That will match up to 45
\s* Match zero or more whitespace characters
| Or
\G(?!^) Assert the position at the end of the previous match; the negative lookahead ensures this is not the start of the string
(\d+)\s Capture the digits in a capturing group and match the whitespace that follows
(?:\d+\.\d+)(?: \d+\.\d+){2}(?: \d+){3}\s*|\G(?!^)(\d+)\s
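A short PHP sketch of how this pattern could be used to pull out only the optional numbers after 45 is shown below. The filtering step is my assumption: with the default PREG_PATTERN_ORDER, group 1 is an empty string for the first (prefix) match, so it is dropped.

```php
<?php
$s  = '0.91 0.45 0.69 58 47 45 23 83 90 $595 NO IDL';
$re = '~(?:\d+\.\d+)(?: \d+\.\d+){2}(?: \d+){3}\s*|\G(?!^)(\d+)\s~';

preg_match_all($re, $s, $m);

// $m[1] has one entry per match; the prefix match leaves group 1 empty.
$extra = array_values(array_filter($m[1], function ($v) {
    return $v !== '';
}));

print_r($extra); // the numbers between 45 and the $ amount
```

The $ amount itself is not captured by this pattern, since \G fails at the $ character.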

Using preg_match_all from title

I need to extract from post title strings like:
12ml
12 ml
123ml
123 ml
12.3ml
12.3 ml
Right now I'm using:
preg_match_all("/[0-9]+\sml/i", $post->post_title, $percentage);
if(isset($percentage[0][0]) && $percentage[0][0] != "" ){
$text = $percentage[0][0]." ";
}
echo $text;
But I don't know how to make it handle numbers with a decimal point.
You could do:
$str = "abc 12ml def 12 ml xyz 123ml tuv 123 ml jhsfg 12.3ml qjsdfkjfhg 12.3 ml";
if (preg_match_all("/\d+(?:\.\d+)?\s*ml/i", $str, $percentage)) {
print_r($percentage);
}
Output:
Array
(
[0] => Array
(
[0] => 12ml
[1] => 12 ml
[2] => 123ml
[3] => 123 ml
[4] => 12.3ml
[5] => 12.3 ml
)
)
Explanation:
/ : regex delimiter
\d+ : 1 or more digits
(?: : start non capture group
\. : a dot
\d+ : 1 or more digits
)? : end group, optional
\s* : 0 or more spaces
ml : literally ml
/i : regex delimiter, flag case insensitive
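If only the numeric value is needed rather than the whole "12.3 ml" string, the number can be wrapped in a capturing group and cast to float. This is a sketch with a made-up title; in the question's code the input would be $post->post_title. The \b after ml is an added assumption so that words merely starting with "ml" are not matched.

```php
<?php
// Hypothetical title used for illustration.
$title = 'Face serum 12.3 ml travel size';

$amount = null;
// Group 1 captures just the number; "ml" must end at a word boundary.
if (preg_match('/(\d+(?:\.\d+)?)\s*ml\b/i', $title, $m)) {
    $amount = (float) $m[1];
}

var_dump($amount);
```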

Finding (regex?) 10 digits in a row (PHP)

I am facing a problem I am not able to solve. I have a string consisting of unneeded text and 10-digit numbers that always start with "2" or "6". I need to get those 10-digit numbers into an array. I thought of regex and found the article "Regular Expression for matching a numeric sequence?", which is pretty close to what I need (except for the descending/ascending part). However, as I have never been able to understand regex, I can't modify it to fit my needs. If anyone could help me out here, I would highly appreciate it!
Here is a sample of my string:
".........693 7098469 - ZQH X Bop. Hrtepou 50 flerpoUrroXn ........210 5014166 - 0E000PA E KapaoAn Anpn-rPou 21
EAArivtg .....................................................210 9618677 - MAPIA KapaoAri Arpn-rptou 21 Elanvolo .. 210 9643623 - MAPIA E ...................................................... 210 9643887 - MAPIA 0 loucrrivou 8 HX.toOrran ..............210 9914534 AIPITAKHE APTEMIOE n Avrtnopou 22
Reptcrrept ....._.........._......._................697 7440896 , -10AN."
Thank you very much in advance!
Greetings from Greece!
As I see it, the digits in your string have a space between them, and if you want to make your selection strict, this is the regex:
[62]\d{2}\s*\d{7}
Explanation:
[62] # Start with 6 or 2
\d{2} # 2 more digits
\s* # any number of white spaces
\d{7} # 7 more digits
and PHP code which has preg_match_all to match all occurrences of those strings:
preg_match_all("/[62]\d{2}\s*\d{7}/", $text, $matches);
Output:
Array
(
[0] => 693 7098469
[1] => 210 5014166
[2] => 210 9618677
[3] => 210 9643623
[4] => 210 9643887
[5] => 210 9914534
[6] => 697 7440896
)
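If the numbers are wanted as plain 10-digit strings, the whitespace can be removed from each match afterwards instead of from the whole input. The sketch below uses a shortened, made-up excerpt of the sample text, and the (?<!\d) and (?!\d) guards are an added assumption to avoid matching inside longer digit runs.

```php
<?php
// Shortened excerpt in the style of the question's sample text.
$text = '.........693 7098469 - ZQH ........210 5014166 - MAPIA .. 210 9643623 - MAPIA';

// Same core pattern, guarded so it cannot start or end inside a longer number.
preg_match_all('/(?<!\d)[62]\d{2}\s*\d{7}(?!\d)/', $text, $matches);

// Strip the inner whitespace from each match only.
$numbers = array_map(function ($n) {
    return preg_replace('/\s+/', '', $n);
}, $matches[0]);

print_r($numbers);
```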
Maybe like this:
<?php
$x=
".........693 7098469 - ZQH X Bop. Hrtepou 50 flerpoUrroXn ........210 5014166 - 0E000PA E KapaoAn Anpn-rPou 21 EAArivtg ....................................................210 9618677 - MAPIA KapaoAri Arpn-rptou 21 Elanvolo .. 210 9643623 - MAPIA E ...................................................... 210 9643887 - MAPIA 0 loucrrivou 8 HX.toOrran ..............210 9914534 AIPITAKHE APTEMIOE n Avrtnopou 22
Reptcrrept ....._.........._......._................697 7440896 , -10AN.";
$x=str_replace(' ','',$x);
preg_match_all('/((2|6)\d{9})/',$x,$matches);
print_r($matches[0]);
And the result:
Array
(
[0] => 6937098469
[1] => 2105014166
[2] => 2109618677
[3] => 2109643623
[4] => 2109643887
[5] => 2109914534
[6] => 6977440896
)
There is a pretty cool page that visualizes the regex for better understanding:
https://www.debuggex.com/
This should work:
((?:2|6)[0-9]{2} [0-9]{7})
