I need to validate measurements entered into a form generated by PHP.
I intend to compare them to upper and lower control limits and decide if they fail or pass.
As a first step, I imagine a PHP function which accepts strings representing engineering measurements and converts them to pure numbers before the comparison.
At the moment I'm only expecting measurements of small voltages and currents, so strings like
'1.234uA', '2.34 nA', '39.9mV', or '-1.003e-12'
will be converted to
1.234e-6, 2.34e-9, 3.99e-2 and -1.003e-12, respectively.
But the method should be generalisable to any measured quantity.
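function convert($value) {
    // Map each unit prefix to its exponent.
    $units = array('p' => 'e-12',
                   'n' => 'e-9',
                   'u' => 'e-6',
                   'm' => 'e-3');
    $unitstring = implode("", array_keys($units));
    $matches = array();
    // The optional minus sign sits outside the alternation so it applies to both number forms.
    $pattern = "/^(-?(?:\\d*\\.\\d+|\\d+))\\s*([$unitstring])([a-z])$/i";
    $result = preg_match($pattern, $value, $matches);
    if ($result) {
        // strtolower() because the /i flag also lets an upper-case prefix match.
        $retval = $matches[1] . $units[strtolower($matches[2])] . $matches[3];
    } else {
        $retval = $value;
    }
    return $retval;
}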
So to explain what the above does:
$units is an array to map unit-prefix to the exponent.
$unitstring conglomerates the units into a single string (in the example it would be 'pnum')
The regular expression will match an optional -, followed by either 0 or more digits, a period and 1 or more digits OR 1 or more digits, followed by one of the unit prefixes (only one) and then a single alphabetical character. There can be any amount of whitespace between the number and the units.
Because of the parentheses and the use of preg_match, the number section, the unit prefix, and the unit are each captured separately in the array $matches as elements 1, 2, and 3. (Element 0 contains the entire matched string.)
$result will be 1 if it matched the regex, 0 otherwise.
$retval is built by concatenating the number, the exponent (looked up from the unit prefix) and the unit letter. If the input does not match, as with -1.003e-12, which has no prefix or unit, the original string is returned unchanged.
Of course you can tweak some things, but in general this is a good start. Hope it helps.
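Since the later comparison needs a real number rather than a string, a variant that returns a float might look like the sketch below. The function name engineeringToFloat, the extra k/M/G prefixes and the choice to return null for unparseable input are assumptions made for illustration, not part of the code above. A prefix is only recognised when a unit letter follows it, so a bare "5m" is not read as milli.

```php
<?php
// Sketch: convert an engineering-notation string to a float.
// Name, prefix table and null-on-failure are illustrative assumptions.
function engineeringToFloat(string $value): ?float
{
    // SI prefix => power of ten. Case matters: 'm' is milli, 'M' is mega.
    $prefixes = [
        'p' => -12, 'n' => -9, 'u' => -6, 'm' => -3,
        'k' => 3, 'M' => 6, 'G' => 9,
    ];
    $prefixClass = implode('', array_keys($prefixes));

    // Group 1: the number, optionally with its own exponent.
    // Group 2: an optional prefix, accepted only when a unit letter follows.
    $pattern = '/^\s*([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)\s*'
             . '(?:([' . $prefixClass . '])(?=[A-Za-z]))?[A-Za-z]*\s*$/';

    if (!preg_match($pattern, $value, $m)) {
        return null; // not a recognisable measurement
    }

    // Trailing groups that did not take part in the match are absent from $m.
    $prefix = $m[2] ?? '';
    $power = ($prefix === '') ? 0 : $prefixes[$prefix];

    return ((float) $m[1]) * (10 ** $power);
}

var_dump(engineeringToFloat('2.34 nA'));
var_dump(engineeringToFloat('-1.003e-12'));
var_dump(engineeringToFloat('not a number'));
```

Because the result comes from a floating-point multiplication, compare it against the control limits with <= and >= rather than testing for exact equality.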
In your function, first define the exponent for each unit prefix: -6 for u, -3 for m, and so on.
Then split the string into the number and the prefix (micro (u), milli (m), etc.).
Say the number is $num and the exponent looked up from the prefix is $exp:
$x = 0;
while (abs($num) >= 10)
{
    $num = $num / 10;
    $x++; // $x keeps track of how far the decimal point has moved
}
$exp = $exp + $x; // the exponent grows by the number of places moved
echo $num . 'e' . $exp;
That may do it.
My own possibly simple-minded approach has been to use an array of patterns in preg_replace
function convert($value) {
    // One pattern per prefix, each followed by A or V, mapped to its exponent.
    $patterns = array('/p[av]/i', '/n[av]/i', '/u[av]/i', '/m[av]/i');
    $replacements = array('e-12', 'e-9', 'e-6', 'e-3');
    return preg_replace($patterns, $replacements, $value);
}
And it could be extended to higher prefixes, but it seems heavy-handed to keep adding increasingly complex regexes to the $patterns array.
Edit: The comparison, later, should interpret the return value as a real number.
I'm hoping someone can suggest something more elegant.
I have a bunch of emails that I read as text in my program and they all have phone numbers such as these:
+370 655 54298
+37065782505
37069788505
865782825
65782825
(686) 51852
How would I go about finding them and saving them into a variable?
For now I am doing it like this:
$found = preg_match('^[0-9\-\+]{9,15}^', $text, $num);
But it does not work at all.
Have a look at the "libphonenumber" Google Library.
There are two functions you may find useful
isPossibleNumber - quickly guessing whether a number is a possible phonenumber by using only the length information, much faster than a full validation.
isValidNumber - full validation of a phone number for a region using length and prefix information.
This should work https://regex101.com/r/E2PzRN/2
#\+?\(?\d+\)?\s?\d+\s?\d+#
<?php
$regex = '#\+?\(?\d+\)?\s?\d+\s?\d+#';
$x = [
'+370 655 54298',
'+37065782505',
'37069788505',
'865782825',
'hjtgfjtdfjtgdfjt',
'65782825',
'(686) 51852',
];
foreach ($x as $y) {
if (preg_match($regex, $y, $match)) {
echo $match[0] . "\n";
}
}
Check it in action here https://3v4l.org/6AlQa
We distinguish here 3 types of phone numbers.
The first type is this one:
+37065782505
37069788505
865782825
65782825
Here the leading + is optional, and we assume that these numbers have at least 7 digits.
The regular expression obtained is therefore
(\+?[0-9]{7,})
The second type is this one:
+370 655 54298
Here we have a first block consisting of a + followed by 2 to 6 digits and then several other blocks of 2 to 6 digits and separated by spaces.
The regular expression obtained is therefore
(\+[0-9]{2,6}(\s[0-9]{2,6})+)
The last type is this one:
(686) 51852
This is a first block consisting of 2 to 6 digits surrounded by parentheses and then several other blocks of 2 to 6 digits and separated by spaces.
The regular expression obtained is therefore
(\([0-9]{2,6}\)(\s[0-9]{2,6})+)
The complete extraction code is therefore
preg_match_all("#(\+?[0-9]{7,})|(\+[0-9]{2,6}(\s[0-9]{2,6})+)|(\([0-9]{2,6}\)(\s[0-9]{2,6})+)#",$text,$out);
$found = $out[0];
where $found is an array.
I would suggest stripping out '+', '(', ')' and ' ', then testing the result with ctype_digit.
Once those characters are removed, check whether what remains is all digits. This assumes the input is supposed to be a phone number; if you run it on an email address, the result is false.
var_dump(ctype_digit(str_replace([' ', '+', '(', ')'], '', '(686) 51852')));
bool(true)
var_dump(ctype_digit(str_replace([' ', '+', '(', ')'], '', 'r#pm.mr')));
bool(false)
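The strip-and-test idea can be wrapped in a small helper that also checks the digit count, which filters out short numeric fragments. This is only a sketch: the name normalizePhone, the inclusion of '-' among the stripped characters and the 7 to 15 digit bounds are assumptions (15 is the E.164 maximum; 7 follows the minimum used in the answer above).

```php
<?php
// Sketch: reduce a candidate to its digits and accept it only if the
// digit count is plausible. Bounds and helper name are illustrative.
function normalizePhone(string $candidate): ?string
{
    // Drop the formatting characters that may legitimately appear.
    $digits = str_replace([' ', '+', '(', ')', '-'], '', $candidate);

    // ctype_digit() is false for '' and for anything containing a non-digit.
    if (!ctype_digit($digits)) {
        return null;
    }

    $length = strlen($digits);
    return ($length >= 7 && $length <= 15) ? $digits : null;
}

var_dump(normalizePhone('(686) 51852'));
var_dump(normalizePhone('r#pm.mr'));
```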
I have several values(strings mostly) which I want to process and I was wondering which way is the best to use in my case.
What I will have will be a foreach loop in which I want to have a check and insert into the database edited values.
Structure example:
foreach($values as $value)
{
//string check is going to be here
// .....
//insert the data into the database
$sql = "INSERT INTO results VALUES ('', 'xx', 'xx', 'xx', 'xx')";
$result = mysql_query($sql);
}
Example of values I might get:
value, value
Somewhere in the universe
3.532523, -55.523525
value - value
What I want to do is not accept (actually change their value to 'Not given')
a) numbers
b) strings longer than 10 characters
c) no spaces between the words
d) if there is a , or - only keep the first part of the string before these
A string check which I am testing for example is if I have a value like this
=> string, string
I only want to keep the first part, which is done by
$str = 'value, value';
$str = substr($str, 0, stripos($str, ','));
Which technique would handle all of these checks better: preg_match and preg_replace, or substr and stripos?
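This regex may match what you want to keep, but it also allows digits and _:
/^\w{1,10}/
To reject all-digit strings and avoid _, add a lookahead that requires a letter after at most nine leading digits:
/^(?=\d{0,9}[a-zA-Z])[a-zA-Z\d]{1,10}/
If digits are not allowed at all:
/^[a-zA-Z]{1,10}/
(The lower bound is written out because PCRE has traditionally treated {,10} as literal text rather than as a quantifier.)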
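For comparison, the four rules from the question can also be handled with one split and a few plain checks. The sketch below rests on assumptions: the name cleanValue is made up, rule d) is applied first, and rule c) is read as "reject anything that still contains whitespace".

```php
<?php
// Sketch of rules a) to d) from the question. The function name and the
// order in which the rules are applied are assumptions.
function cleanValue(string $value): string
{
    // d) keep only the part before the first ',' or '-'
    $first = trim(preg_split('/[,-]/', $value, 2)[0]);

    // a) numbers, b) longer than 10 characters, c) more than one word
    if ($first === ''
        || is_numeric($first)
        || strlen($first) > 10
        || preg_match('/\s/', $first)
    ) {
        return 'Not given';
    }

    return $first;
}

echo cleanValue('value, value'), "\n";
echo cleanValue('Somewhere in the universe'), "\n";
```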
I've got a large string that I want to split into an array every 50 words. I thought about using str_split, but realised that it won't take words into consideration; it just splits after x characters.
I've read about str_word_count but can't work out how to put the two together.
What I've got at the moment is:
$outputArr = str_split($output, 250);
foreach($outputArr as $arOut){
echo $arOut;
echo "<br />";
}
But I want each item of the array to hold 50 words instead of 250 characters.
Any help will be much appreciated.
Assuming that str_word_count is sufficient for your needs¹, you can simply call it with 1 as the second parameter and then use array_chunk to group the words in groups of 50:
$words = str_word_count($string, 1);
$chunks = array_chunk($words, 50);
You now have an array of arrays; to join every 50 words together and make it an array of strings you can use
foreach ($chunks as &$chunk) { // important: iterate by reference!
$chunk = implode(' ', $chunk);
}
¹ Most probably it is not. If you want to get what most humans consider acceptable results when processing written language you will have to use preg_split with some suitable regular expression instead.
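A minimal sketch of the preg_split variant mentioned in the footnote: splitting on whitespace keeps punctuation and digits attached to their words, which str_word_count would drop. The helper name chunkWords and the whitespace-only definition of a word are assumptions.

```php
<?php
// Sketch: split on runs of whitespace, regroup into chunks of $size words,
// and join each chunk back into a string. chunkWords is a hypothetical name.
function chunkWords(string $text, int $size = 50): array
{
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    return array_map(
        function (array $chunk) {
            return implode(' ', $chunk);
        },
        array_chunk($words, $size)
    );
}

foreach (chunkWords('one two, three four; five', 2) as $piece) {
    echo $piece, "<br />\n";
}
```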
There's another way:
<?php
$someBigString = <<<SAMPLE
This, actually, is a nice' old'er string, as they said, "divided and conquered".
SAMPLE;
// change this to whatever you need to:
$number_of_words = 7;
$arr = preg_split("#([a-z]+[a-z'-]*(?<!['-]))#i",
$someBigString, $number_of_words + 1, PREG_SPLIT_DELIM_CAPTURE);
$res = implode('', array_slice($arr, 0, $number_of_words * 2));
echo $res;
Demo.
I consider preg_split a better tool (than str_word_count) here. Not because the latter is inflexible (it is not: you can define what symbols can make up a word with its third param), but because preg_split will essentially stop processing the string after getting N items.
The trick, as is quite common with this function, is to capture the delimiters as well, then use them to reconstruct the string with the first N words (where N is given) AND the punctuation marks preserved.
(Of course, the regex used in my example does not strictly follow str_word_count's locale-dependent behavior. But it still restricts words to letters, ' and -, with the latter two not allowed at the beginning or end of a word.)
This is for an osCommerce contribution called
("Automatically add multiple products with attribute to cart from external source")
This existing code uses sscanf to 'explode' a string that represents a
- product ID,
- a productOption,
- and quantity:
sscanf('28{8}17[1]', '%d{%d}%d[%f]',
$productID, // 28
$productOptionID, $optionValueID, //{8}17 <--- Product Options!!!
$productQuantity //[1]
);
This works great if there is only 1 'set' of Product Options (e.g. {8}17).
But this procedure needs to be adapted so that it can handle multiple Product Options, and put them into an array, e.g.:
'28{8}17{7}15{9}19[1]' //array(8=>17, 7=>15, 9=>19)
OR
'28{8}17{7}15[1]' //array(8=>17, 7=>15)
OR
'28{8}17[1]' //array(8=>17)
Thanks in advance. (I'm a pascal programmer)
You should not try to do complex recursive parses with one sscanf. Stick it in a loop. Something like:
<?php
$str = "28{8}17{7}15{9}19[1]";
#$str = "28{8}17{7}15[1]";
#$str = "28{8}17[1]";
sscanf($str,"%d%s",$prod,$rest);
printf("Got prod %d\n", $prod);
while (sscanf($rest,"{%d}%d%s",$opt,$id,$rest) === 3) // require all three fields so a malformed string cannot keep the loop running
{
printf("opt=%d id=%d\n",$opt,$id);
}
sscanf($rest,"[%d]",$quantity);
printf("Got qty %d\n",$quantity);
?>
Regular expressions may be worth a look here:
$a = '28{8}17{7}15{9}19[1]';
$matches = null;
preg_match_all('~\\{[0-9]{1,3}\\}[0-9]{1,3}~', $a, $matches);
To get the other things
$id = (int) $a; // ;)
$quantity = substr($a, strrpos($a, '[')+ 1, -1);
A small update, following the comment:
$a = '28{8}17{7}15{9}19[1]';
$matches = null;
preg_match_all('~\\{([0-9]{1,3})\\}([0-9]{1,3})~', $a, $matches, PREG_SET_ORDER);
$result = array();
foreach ($matches as $entry) {
$result[$entry[1]] = $entry[2];
}
sscanf() is not the ideal tool for this task because it doesn't handle recurring patterns and I don't see any real benefit in type casting or formatting the matched subexpressions.
If this was purely a text extraction task (in other words your incoming data was guaranteed to be perfectly formatted and valid), then I could have recommended a cute solution that used strtr() and parse_str() to quickly generate a completely associative multi-dimensional output array.
However, when you commented "with sscanf I had an infinite loop if there is a missing bracket in the string (because it looks for open and closing {}s). Or if I leave out a value. But with your regex solution, if I drop a bracket or leave out a value", then this means that validation is an integral component of this process.
For that reason, I'll recommend a regex pattern that both validates the string and breaks the string into its meaningful parts. There are several logical aspects to the pattern but the hero here is the \G metacharacter that allows the pattern to "continue" matching where the pattern last finished matching in the string. This way we have an array of continuous fullstring matches to pull data from when creating your desired multidimensional output.
The pattern ^\d+(?=.+\[\d+]$)|\G(?!^)(?:{\K\d+}\d+|\[\K\d+(?=]$)) in preg_match_all() generates the following type of output in the fullstring element ([0]):
[id], [option0, option1, ...](optional), [quantity]
The first branch in the pattern (^\d+(?=.+\[\d+]$)) validates that the string starts with the id number and ends with a number wrapped in square brackets, which represents the quantity.
The second branch begins with the "continue" character and contains two logical branches itself. The first matches an option expression (and forgets the leading { thanks to \K) and the second matches the number in the quantity expression.
To create the associative array of options, target the "middle" elements (if there are any), then split the strings on the lingering } and assign these values as key-value pairs.
This is a direct solution because it only uses one preg_ call and it does an excellent job of validating and parsing the variable length data.
Code: (Demo with a battery of test cases)
if (!preg_match_all('~^\d+(?=.+\[\d+]$)|\G(?!^)(?:{\K\d+}\d+|\[\K\d+(?=]$))~', $test, $m)) {
echo "invalid input";
} else {
var_export(
[
'id' => array_shift($m[0]),
'quantity' => array_pop($m[0]),
'options' => array_reduce(
$m[0],
function($result, $string) {
[$key, $result[$key]] = explode('}', $string, 2);
return $result;
},
[]
)
]
);
}
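For input that is already known to be well formed, the strtr() and parse_str() shortcut mentioned earlier could look something like the sketch below. This is one plausible reading of that idea, not the original author's code; the function name parseCartString and the key names id, options and quantity are illustrative, and it performs no validation at all.

```php
<?php
// Sketch for trusted input only: rewrite the string as a query string
// and let parse_str() build the nested array. No validation is done.
function parseCartString(string $input): array
{
    // strtr() with an array replaces in a single pass, so the ']' added
    // by the '}' rule is not touched by the ']' => '' rule.
    $query = 'id=' . strtr($input, [
        '{' => '&options[',
        '}' => ']=',
        '[' => '&quantity=',
        ']' => '',
    ]);

    parse_str($query, $result);
    return $result;
}

var_export(parseCartString('28{8}17{7}15[1]'));
```

All values come back as strings, and a string with no {option}value pairs simply produces no options key.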
I have a form in which people will be entering dollar values.
Possible inputs:
$999,999,999.99
999,999,999.99
999999999
99,999
$99,999
The user can enter a dollar value however they wish. I want to read the inputs as doubles so I can total them.
I tried just typecasting the strings to doubles but that didn't work. Total just equals 50 when it is output:
$string1 = "$50,000";
$string2 = "$50000";
$string3 = "50,000";
$total = (double)$string1 + (double)$string2 + (double)$string3;
echo $total;
A regex won't convert your string into a number. I would suggest that you use a regex to validate the field (confirm that it fits one of your allowed formats), and then just loop over the string, discarding all non-digit and non-period characters. If you don't care about validation, you could skip the first step. The second step will still strip it down to digits and periods only.
By the way, you cannot safely use floats when calculating currency values. You will lose precision, and very possibly end up with totals that do not exactly match the inputs.
Update: Here are two functions you could use to verify your input and to convert it into a decimal-point representation.
function validateCurrency($string)
{
return preg_match('/^\$?(\d{1,3})(,\d{3})*(\.\d{2})?$/', $string) ||
preg_match('/^\$?\d+(\.\d{2})?$/', $string);
}
function makeCurrency($string)
{
$newstring = "";
$array = str_split($string);
foreach($array as $char)
{
if (($char >= '0' && $char <= '9') || $char == '.')
{
$newstring .= $char;
}
}
return $newstring;
}
The first function will match the bulk of currency formats you can expect "$99", "99,999.00", etc. It will not match ".00" or "99.", nor will it match most European-style numbers (99.999,00). Use this on your original string to verify that it is a valid currency string.
The second function will just strip out everything except digits and decimal points. Note that by itself it may still return invalid strings (e.g. "", "....", and "abc" come out as "", "....", and ""). Use this to eliminate extraneous commas once the string is validated, or possibly use this by itself if you want to skip validation.
You don't ever want to represent monetary values as floats!
For example, take the following (seemingly straight forward) code:
$x = 1.0;
for ($ii=0; $ii < 10; $ii++) {
$x = $x - .1;
}
var_dump($x);
You might assume that it would produce the value zero, but that is not the case. Since $x is a floating-point number, it actually ends up being a tiny bit more than zero (1.38777878078E-16). That isn't a big deal in itself, but it means that comparing the value with another value isn't guaranteed to give the expected result. For example, $x == 0 would produce false.
http://p2p.wrox.com/topic.asp?TOPIC_ID=3099
goes through it step by step
[edit] typical...the site seems to be down now... :(
Not a one-liner, but if you strip out the commas you can do this (pseudocode):
m/^\$?(\d+)(?:\.(\d\d))?$/
$value = $1 + $2/100;
That allows $9.99 but not $9. or $9.9, and fails to complain about misplaced thousands separators (bug or feature?).
There is a potential locale issue here, because you are assuming that thousands are separated with ',' and cents with '.', but in Europe it is often the opposite (e.g. 1.000,99).
I recommend not to use a float for storing currency values. You can get rounding errors if the sum gets large. (Ok, if it gets very large.)
Better use an integer variable with a large enough range, and store the input in cents, not dollars.
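A minimal sketch of that approach, assuming US-style input (optional $, optional comma grouping, optional two-digit cents). The name toCents and the choice to return null for anything else are assumptions.

```php
<?php
// Sketch: validate a US-style amount and return it as integer cents.
// toCents is a hypothetical name; null signals input that did not validate.
function toCents(string $input): ?int
{
    // Either properly grouped thousands or a plain run of digits,
    // followed by an optional '.' and exactly two digits.
    $pattern = '/^\$?(\d{1,3}(?:,\d{3})*|\d+)(?:\.(\d{2}))?$/';

    if (!preg_match($pattern, trim($input), $m)) {
        return null;
    }

    $dollars = (int) str_replace(',', '', $m[1]);
    $cents = isset($m[2]) ? (int) $m[2] : 0; // group 2 is absent without cents

    return $dollars * 100 + $cents;
}

$total = toCents('$50,000') + toCents('$50000') + toCents('50,000');
echo $total / 100, "\n";
```

The sum stays exact because only integers are added; the division by 100 happens once, for display. On a 32-bit PHP build the integer range is much smaller, so very large amounts would need a different representation.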
I believe that you can accomplish this with printf, which is similar to the C function of the same name. Its parameters can be somewhat esoteric, though. You can also use PHP's number_format function.
Assuming that you are getting real money values, you could simply strip characters that are not digits or the decimal point:
(pseudocode)
newnumber = replace(oldnumber, /[^0-9.]/, //)
Now you can convert using something like
double(newnumber)
However, this will not take care of strings such as "5.6.3" and other such non-money strings. Which raises the question, "Do you need to handle badly formatted strings?"
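In PHP the pseudocode above could be written roughly as follows, with one extra check so that leftovers such as "5.6.3" are rejected rather than silently converted. The name stripToNumber and the null return for bad input are assumptions; the result is kept as a string so the caller decides how to convert it.

```php
<?php
// Sketch: strip everything except digits and '.', then accept the result
// only if it is a plain decimal number. stripToNumber is a hypothetical name.
function stripToNumber(string $input): ?string
{
    $clean = preg_replace('/[^0-9.]/', '', $input);

    // Rejects '', '....' and '5.6.3'.
    return preg_match('/^\d+(\.\d+)?$/', $clean) ? $clean : null;
}

var_dump(stripToNumber('$999,999,999.99'));
var_dump(stripToNumber('5.6.3'));
```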