Get number before the percent sign - php

I have a string like this:
$string = "Half Board, 10% Off & 100 Euro Star - Save £535";
The percentage can be anywhere in the string.
This is what I have, but I don't like the fact that it uses preg_replace AND a loop, which are heavy. I'd like a regex that would do it in one operation.
$string = "Half Board, 10% Off & 100 Euro Star - Save £535";
$string_array = explode(" ", $string);
$pattern = '/[^0-9,.]*/'; // non-digits
foreach($string_array as $pc) {
if(stristr($pc, '%')) {
$percent = preg_replace($pattern, '', $pc);
break;
}
}
echo $percent;
exit;

Update:
From the code you added to your question, I get the impression your percentages might look like 12.3% or even .50%. In which case, the regex you're looking for is this:
if (preg_match_all('/(\d+|\d*[.,]\d{1,2})(?=\s*%)/','some .50% and 5% text with 12.5% random percentages and 123 digits',$matches))
{
print_r($matches);
}
Which returns:
Array
(
[0] => Array
(
[0] => .50
[1] => 5
[2] => 12.5
)
[1] => Array
(
[0] => .50
[1] => 5
[2] => 12.5
)
)
the expression explained:
(\d+|\d*[.,]\d{1,2}): is an OR -> either match digits \d+, or \d* zero or more digits, followed by a decimal separator ([.,]) and 1 or 2 digits (\d{1,2})
(?=\s*%): only if the aforementioned group is followed by zero or more whitespace characters and a % sign
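If only the first percentage is needed, as in the question, the same expression could be used with preg_match() instead of preg_match_all(). The following is a minimal sketch; the helper name first_percentage() is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper: returns the first number that is followed by a
// percent sign (optionally separated by whitespace), or null if none exists.
function first_percentage(string $text): ?string
{
    // Same alternation as explained above: plain digits, or an optional
    // integer part followed by a decimal separator and 1 or 2 digits.
    if (preg_match('/(?:\d+|\d*[.,]\d{1,2})(?=\s*%)/', $text, $m)) {
        return $m[0];
    }
    return null;
}

echo first_percentage("Half Board, 10% Off & 100 Euro Star - Save £535"), "\n";
```

Returning null for the no-match case lets the caller distinguish "no percentage present" from a percentage of 0.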
Using a regular expression, with a positive lookahead, you can get exactly what you want:
if (preg_match_all('/\d+(?=%)/', 'Save 20% if you buy 5 iPhone chargers (excluding 9% tax)', $matches))
{
print_r($matches[0]);
}
gives you:
array (
0 => '20',
1 => '9'
)
Which is, I believe, what you are looking for
The regex works like this:
\d+ matches at least 1 digit (as many as possible)
(?=%): provided they are followed by a % sign
Because of the lookahead, the 5 isn't matched in the example I gave, because it's followed by a space, not a % sign.
If your string might be malformed (have any number of spaces between the digit and the % sign) a lookahead can deal with that, too. As ridgerunner pointed out to me, only lookbehinds need to be of fixed size, so:
preg_match_all('/\d+(?=\s*%)/', $txt, $matches)
The lookahead works like this
\s*: matches zero or more whitespace chars
%: and percent sign
Hence, both 123 % and 123% fit the pattern, and will match.
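As a quick check of the whitespace-tolerant lookahead, here is a small sketch. The sample sentence is made up for illustration.

```php
<?php
// Made-up sample: two percentages (one with a space before the sign)
// and one plain number that should be ignored.
$text = 'Was 123 % off, now 45% off, for 67 apples';

preg_match_all('/\d+(?=\s*%)/', $text, $matches);

// $matches[0] holds the full matches; "67" is absent because it is
// followed by " apples", not by optional whitespace and a % sign.
print_r($matches[0]);
```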
A good place to read up on regex's is regular-expressions.info
If "complex" regex's (ie with lookaround assertions) aren't your cup of tea (yet, though I strongly suggest learning to use them), you could resort to splitting the string:
$parts = array_map('trim', explode('%', $string));
$percentages = array();
foreach($parts as $part)
{
if (preg_match('/\d+$/', $part, $match))
{//if is required, because the last element of $parts might not end with a number
$percentages[] = $match[0];
}
}
Here, I simply use the % as a delimiter to create an array, trim each section (to avoid trailing whitespace), and then proceed to check each substring, matching any number at the end of it:
'get 15% discount'
['get 15', 'discount']
/\d+$/, 'get 15' = [15]
But that's an awful lot of work; using a lookahead is way easier.

$str = "Half Board, 10% Off & 100 Euro Star - Save £535";
preg_match("|\d+|", $str, $arr);
print_r($arr);

Try with explode (split() was removed in PHP 7), like
$str_arr = explode(' ', $str);
$my_str = explode('%', $str_arr[2]); // index 2 holds "10%" in the sample string
echo $my_str[0];

This should work:
$str = "Save 20% on iPhone chargers...";
if (preg_match_all('/\d+(?=%)/', $str, $match))
print_r($match[0]);
Live Demo: http://ideone.com/FLKtE9

Related

Standardize/Sanitize variably-formatted phone numbers to be purely 10-digit strings

Before I store user-supplied phone numbers in my database, I need to standardize/sanitize the string to consist of exactly 10 digits.
I want to end up with 1112223333 from all of these potential input values:
(111)222-3333
111-222-3333
111.222.3333
+11112223333
11112223333
In the last two strings, there's a 1 as the country code.
I was able to make some progress with:
preg_replace('/\D/', '', mysqli_real_escape_string($conn, $_POST["phone"]));
Can anyone help me to fix up the strings that have more than 10 digits?
Use your preg_replace, which already handles all but the last two inputs. Then check the length of the string and remove the first digit if the string is longer than 10 digits.
$str = preg_replace('/\D/', '', mysqli_real_escape_string($conn, $_POST["phone"]));
if(strlen($str) > 10){
$str = substr($str, 1);
}
If you want to parse phone numbers, a very useful library is giggsey/libphonenumber-for-php. It is based on Google's libphonenumber, it has also a demo online to show how it works
Do it in two passes:
$phone = [
'(111)222-3333',
'111-222-3333',
'111.222.3333',
'+11112223333',
'11112223333',
'+331234567890',
];
# remove non digit
$res = preg_replace('/\D+/', '', $phone);
# keep only 10 digit
$res = preg_replace('/^\d+(\d{10})$/', '$1', $res);
print_r($res);
Output:
Array
(
[0] => 1112223333
[1] => 1112223333
[2] => 1112223333
[3] => 1112223333
[4] => 1112223333
[5] => 1234567890
)
This task can/should be accomplished by making just one pass over the string to replace unwanted characters.
.* #greedily match zero or more of any character
(\d{3}) #capture group 1
\D* #greedily match zero or more non-digits
(\d{3}) #capture group 2
\D* #greedily match zero or more non-digits
(\d{4}) #capture group 3
$ #match end of string
Matching the position of the end of the string ensures that the final 10 digits from the string are captured and any extra digits at the front of the string are ignored.
Code: (Demo)
$strings = [
'(111)222-3333',
'111-222-3333',
'111.222.3333',
'+11112223333',
'11112223333'
];
foreach ($strings as $string) {
echo preg_replace(
'/.*(\d{3})\D*(\d{3})\D*(\d{4})$/',
'$1$2$3',
$string
) . "\n---\n";
}
Output:
1112223333
---
1112223333
---
1112223333
---
1112223333
---
1112223333
---
The same result can be achieved by changing the third capture group to be a lookahead and only using two backreferences in the replacement string. (Demo)
echo preg_replace(
'/.*(\d{3})\D*(\d{3})\D*(?=\d{4}$)/',
'$1$2',
$string
);
Finally, a much simpler pattern can be used to purge all non-digits, but this alone will not trim the string down to 10 characters. Calling substr() with a starting offset of -10 will ensure that the last 10 digits are preserved. (Demo)
echo substr(preg_replace('/\D+/', '', $string), -10);
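The substr() approach could be wrapped in a small helper that also rejects inputs with fewer than ten digits, which substr() alone would let through. This is only a sketch; the function name normalize_phone() and the null-on-failure convention are assumptions, not something stated in the question.

```php
<?php
// Hypothetical helper: strips every non-digit, then keeps the last 10 digits.
// Returns null when fewer than 10 digits remain (an assumed convention).
function normalize_phone(string $raw): ?string
{
    $digits = preg_replace('/\D+/', '', $raw);

    return strlen($digits) >= 10 ? substr($digits, -10) : null;
}

foreach (['(111)222-3333', '+11112223333', '222-3333'] as $input) {
    var_export(normalize_phone($input));
    echo "\n";
}
```

Like the one-liner above, this keeps the trailing ten digits, so any leading country code is discarded regardless of its length.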
As a side note, you should use a prepared statement to interact with your database instead of relying on escaping which may have vulnerabilities.
Use str_replace with an array of the characters you want to remove.
$str = "(111)222-3333 111-222-3333 111.222.3333 +11112223333";
echo str_replace(["(", ")", "-", "+", "."], "", $str);
https://3v4l.org/80AWc

How to find all occurrences of a word that's preceded by a number/decimal/fraction

I'm looking for a regular expression that can match characters that are preceded by a number (integer, decimal or fraction) plus 0 or more spaces
e.g.
$str1="12.5km of road";
$str2="1/2 mile";
$str3="1 l milk";
In the case of $str1, for example, I need something like:
$searchString="km";
preg_match("/THE_REGEX_I_NEED".$searchString."/", $str1, $arrayOfMatches);
I'm not competent with writing regex, so any help here would be appreciated!
You can use:
$str1="12.5km of road";
if (preg_match_all('~\d+(?:[/.]\d+)?\s*(\S+)~', $str1, $arr))
print_r($arr[1]);
EDIT: To match only known strings use this code:
$str1="2 miles of road in 50 states";
if (preg_match_all('~\d+(?:[/.]\d+)?\s*(miles|km)\b~', $str1, $arr))
print_r($arr[1]);
OUTPUT:
Array
(
[0] => miles
)
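Since the question builds the pattern from a $searchString variable, the list of known units could be assembled at runtime. The sketch below assumes a hypothetical helper find_units() and passes each unit through preg_quote() so that characters with special meaning in a regex are treated literally.

```php
<?php
// Hypothetical helper: returns every listed unit that directly follows a
// number (integer, decimal or fraction) and optional whitespace.
function find_units(string $text, array $units): array
{
    // Escape each unit, including the "~" delimiter used below.
    $quoted = array_map(function ($unit) {
        return preg_quote($unit, '~');
    }, $units);

    $pattern = '~\d+(?:[/.]\d+)?\s*(' . implode('|', $quoted) . ')\b~';

    return preg_match_all($pattern, $text, $m) ? $m[1] : [];
}

print_r(find_units('12.5km of road', ['km', 'mile']));
print_r(find_units('1/2 mile', ['km', 'mile']));
```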

Regular expression to match digits preceded by a dot

I have a string:
Product, Q.ty: 1, Price: 120.00
I want to select everything after the first comma up to the last two decimal digits (.00) - or, in other words, select the Product, which will be variable though; what is not variable is , Q.t and it is also known that the last two characters in the string will be two digits preceded by a dot . - However only the last one will be always 0, the one preceding it could be anything 0-9, but always a digit.
I've used this to match the string:
preg_replace('/' . preg_quote(', Q.t') . '.*?' . preg_quote('.00') . '/', '', $data );
the problem is that it fails when the last two digits are not 00 but something else like 50, 40, 30 etc. If I use the same regex with a single digit '0', it won't work either because it will catch the first 0 in a string like in my earlier example and will leave out the remaining 0.
How to adjust this expression to catch a group of digits preceded by a '.' dot?
*one further note: this preg_replace is inside a foreach loop; some data won't match at all the pattern I'm trying to pass; which is ok, so in those cases I can print the strings the way they are; but for the cases in the foreach where there's a match, I want to replace part of the string with nothing*
Thank you
/([^,]*), Q\.ty: (\d*), Price: (\d*\.\d{2})/
By using ([^,]*), it will use the comma in the string as the first delimiter. This will capture the beginning of the string up to the first comma, the second match will be quantity and the last match will be the price.
So your provided string:
Product, Q.ty: 1, Price: 120.00
will return
$1 = Product
$2 = 1
$3 = 120.00
on a side note I don't know if that period after Q in Q.ty is intentional in your example or just a typo.
Why not just
/(\d+\.\d{2})$/
which would capture any trailing "numbers" with a decimal place?
You can try
(.+?), (Q\.ty: \d+, .+?\.\d{2})
This should capture everything from the first comma to the last two decimal digits into $2, with the product label being kept in $1
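For the goal stated in the question (removing everything from ", Q.t" up to the trailing two decimal digits and leaving non-matching strings untouched), a preg_replace() along these lines might be enough. It is a sketch that assumes the price is always the last thing in the string; the helper name strip_details() is made up.

```php
<?php
// Hypothetical helper: removes the ", Q.ty: ..., Price: NNN.NN" tail.
// Strings that do not match the pattern are returned unchanged.
function strip_details(string $data): string
{
    // ", Q.t" is the known fixed part; \.\d{2}$ requires a dot and exactly
    // two digits at the very end of the string.
    return preg_replace('/, Q\.t.*\.\d{2}$/', '', $data);
}

echo strip_details('Product, Q.ty: 1, Price: 120.50'), "\n";
echo strip_details('Some other line'), "\n";
```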
I figured someone (there always is) would say "You can get the pieces with str_replace() and explode()." However it's not faster.
<?php
$string = "Product, Q.ty: 1, Price: 120.00";
$removals = array(",",":");
$stime = microtime(true); // true returns a float, so the subtraction below is valid
$nstring = str_replace($removals,'',$string);
$parts = explode(" ",$nstring);
echo microtime(true)-$stime."secs\n";
print_r($parts);
$pattern = "!^([A-Za-z]+),\s([A-Za-z.]+)\:\s([0-9]+),\s([A-Za-z]+):\s([0-9.]+)$!";
$ptime = microtime(true);
$m = preg_match($pattern,$string,$matches);
echo microtime(true)-$ptime."secs\n";
print_r($matches);
?>
Output
4.0999999999958E-5secs
Array
(
[0] => Product
[1] => Q.ty
[2] => 1
[3] => Price
[4] => 120.00
)
3.5000000000007E-5secs
Array
(
[0] => Product, Q.ty: 1, Price: 120.00
[1] => Product
[2] => Q.ty
[3] => 1
[4] => Price
[5] => 120.00
)
Using a more literal approach, provided the $string doesn't deviate, does not improve the performance of the preg_match function.
$pattern = "!^(Product), (Q\.ty): ([0-9]+), (Price): ([0-9.]+)$!";
If you want a literal dot, you should escape it: \.

Split string on non-alphanumeric characters and on positions between digits and non-digits

I'm trying to split a string by non-alphanumeric delimiting characters AND between alternations of digits and non-digits. The end result should be a flat array consisting of alphabetic strings and numeric strings.
I'm working in PHP, and would like to use REGEX.
Examples:
ES-3810/24MX should become ['ES', '3810', '24', 'MX']
CISCO1538M should become ['CISCO' , '1538', 'M']
The input file sequence can be indifferently DIGITS or ALPHA.
The separators can be non-ALPHA and non-DIGIT chars, as well as a change from a DIGIT sequence to an ALPHA sequence, and vice versa.
The command to match all occurrences of a regex is preg_match_all(), which outputs a multidimensional array of results. The regex is very simple: any digit ([0-9]) one or more times (+) or (|) any letter ([[:upper:][:lower:]]) one or more times (+). Note that [A-z] is not a safe shorthand for "all letters": it also matches the punctuation characters that sit between Z and a in ASCII ([, \, ], ^, _ and `).
The textarea and php tags are included for convenience, so you can drop this into your php file and see the results.
<textarea style="width:400px; height:400px;">
<?php
foreach( array(
"ES-3810/24MX",
"CISCO1538M",
"123ABC-ThatsHowEasy"
) as $string ){
// get all matches into an array
preg_match_all("/[0-9]+|[[:upper:][:lower:]]+/",$string,$matches);
// it is the 0th match that you are interested in...
print_r( $matches[0] );
}
?>
</textarea>
Which outputs in the textarea:
Array
(
[0] => ES
[1] => 3810
[2] => 24
[3] => MX
)
Array
(
[0] => CISCO
[1] => 1538
[2] => M
)
Array
(
[0] => 123
[1] => ABC
[2] => ThatsHowEasy
)
$str = "ES-3810/24MX35 123 TEST 34/TEST";
$str = preg_replace(array("#[^A-Z0-9]+#i","#\s+#","#([A-Z])([0-9])#i","#([0-9])([A-Z])#i"),array(" "," ","$1 $2","$1 $2"),$str);
echo $str;
$data = explode(" ",$str);
print_r($data);
I could not think of a cleaner way.
The most direct preg_ function to produce the desired flat output array is preg_split().
Because it doesn't matter what combination of alphanumeric characters are on either side of a sequence of non-alphanumeric characters, you can greedily split on non-alphanumeric substrings without "looking around".
After that preliminary obstacle is dealt with, then split on the zero-length positions between a digit and a non-digit OR between a non-digit and a digit.
/ #starting delimiter
[^a-z\d]+ #match one or more non-alphanumeric characters
| #OR
\d\K(?=\D) #match a number, then forget it, then lookahead for a non-number
| #OR
\D\K(?=\d) #match a non-number, then forget it, then lookahead for a number
/ #ending delimiter
i #case-insensitive flag
Code: (Demo)
var_export(
preg_split('/[^a-z\d]+|\d\K(?=\D)|\D\K(?=\d)/i', $string, 0, PREG_SPLIT_NO_EMPTY)
);
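The snippet above leaves $string undefined. Here is a minimal sketch that runs the same pattern against the two inputs from the question; the wrapper name split_alnum() is hypothetical.

```php
<?php
// Hypothetical wrapper around the preg_split() call shown above.
function split_alnum(string $string): array
{
    return preg_split(
        '/[^a-z\d]+|\d\K(?=\D)|\D\K(?=\d)/i',
        $string,
        0,
        PREG_SPLIT_NO_EMPTY
    );
}

foreach (['ES-3810/24MX', 'CISCO1538M'] as $input) {
    var_export(split_alnum($input));
    echo "\n";
}
```

The PREG_SPLIT_NO_EMPTY flag matters here: a zero-length split point can sit directly next to a delimiter character, and without the flag the result could contain empty strings.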
preg_match_all() isn't a silly technique, but it doesn't return the array, it returns the number of matches and generates a reference variable containing a two dimensional array of which the first element needs to be accessed. Admittedly, the pattern is shorter and easier to follow. (Demo)
var_export(
preg_match_all('/[a-z]+|\d+/i', $string, $m) ? $m[0] : []
);

How do i break string into words at the position of number

I have some string data with alphanumeric values, like us01name, phc01name and others, i.e. letters + number + letters.
I would like to get the first letters + number in the first string and the remainder in the second.
How can I do it in PHP?
You can use a regular expression:
// if statement checks there's at least one match
if(preg_match('/([A-Za-z]+[0-9]+)([A-Za-z]+)/', $string, $matches) > 0){
$firstbit = $matches[1];
$nextbit = $matches[2];
}
Just to break the regular expression down into parts so you know what each bit does:
( Begin group 1
[A-Za-z]+ As many alphabet characters as there are (upper or lower case)
[0-9]+ As many digits as there are
) End group 1
( Begin group 2
[A-Za-z]+ As many alphabet characters as there are (upper or lower case)
) End group 2
Try this code:
preg_match('~([^\d]+\d+)(.*)~', "us01name", $m);
var_dump($m[1]); // 1st string + number
var_dump($m[2]); // 2nd string
OUTPUT
string(4) "us01"
string(4) "name"
Even this more restrictive regex will also work for you:
preg_match('~([A-Z]+\d+)([A-Z]+)~i', "us01name", $m);
You could use preg_split on the digits with the pattern capture flag. It returns all pieces, so you'd have to put them back together. However, in my opinion it is more intuitive and flexible than a complete pattern regex. Plus, preg_split() is underused :)
Code:
$str = 'user01jason';
$pieces = preg_split('/(\d+)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
print_r($pieces);
Output:
Array
(
[0] => user
[1] => 01
[2] => jason
)
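Putting the pieces back together could look like the sketch below. The helper name split_at_first_number() is made up, and returning null when the string contains no digits is an assumed convention.

```php
<?php
// Hypothetical helper: returns [letters + first number, remainder],
// or null when the string contains no digits at all.
function split_at_first_number(string $str): ?array
{
    $pieces = preg_split('/(\d+)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);

    // With at least one digit run there are at least three pieces:
    // text before, the captured digits, text after.
    if (count($pieces) < 3) {
        return null;
    }

    $first  = $pieces[0] . $pieces[1];
    $second = implode('', array_slice($pieces, 2));

    return [$first, $second];
}

print_r(split_at_first_number('us01name'));
print_r(split_at_first_number('phc01name'));
```

Joining everything after the first two pieces keeps any later digit runs in the second string, so an input such as us01name2x would yield us01 and name2x.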
