php's range function behavior - php

PHP.net's documentation on the range function is a little lacking. The following calls produce unexpected (to me, anyway) results when given character ranges.
$m = range('A','z');
print_r($m);
$m = range('~','"');
print_r($m);
I'm looking for a reference that might explicitly define its behavior.

The issue is that range treats its arguments like integers, and if you give it a single character it will convert it to its ASCII character code.
In the first case, you're getting all characters between character 'A' (integer 65) and character 'z' (integer 122). This is expected behavior for those of us coming from a C (or C-like language) background.
This is one of the rare cases where PHP converts single characters to their ASCII codes rather than parsing the string as an integer the way it normally does. Most of the PHP documentation is better at telling you when to expect this. The strpos documentation, for example, notes:
Needle
If needle is not a string, it is converted to an integer and applied as the ordinal value of a character.
The documentation for range is strangely quiet about it.

Consider:
foreach (range('A','z') as $c)
    echo $c."\n";
to be equivalent to:
for ($i = ord('A'); $i <= ord('z'); ++$i)
    echo chr($i)."\n";
Likewise, your second example is equivalent to (since ord('~') > ord('"')):
for ($i = ord('~'); $i >= ord('"'); --$i)
    echo chr($i)."\n";
It's not well documented, but that's how it is supposed to work.
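To make the equivalence concrete, here is a small sketch (the variable names are only for illustration). It shows that range('A','z') also picks up the punctuation characters that sit between 'Z' and 'a' in ASCII, and one possible way to get letters only by merging two ranges.

```php
<?php
// range() with two single-character strings walks the character codes between them.
$mixed = range('A', 'z');
echo count($mixed), "\n";   // 58 elements: ord('z') - ord('A') + 1
echo $mixed[26], "\n";      // '[' is the first character after 'Z'

// The range runs downward when the first argument has the higher code.
$down = range('~', '"');
echo $down[0], ' ', end($down), "\n";

// Letters only: build the two alphabetic ranges separately and merge them.
$letters = array_merge(range('A', 'Z'), range('a', 'z'));
echo implode('', $letters), "\n";
```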

That is because " has a lower character code than ~. Try:
$m = range('A','z'); print_r($m);
$m = range('z','A'); print_r($m);
The characters are ordered by their chr (ASCII table) values:
http://www.asciitable.com/
The array is returned in the direction given by the two parameters.

Related

PHP detect variable length string contains any character other than 1

Using PHP I sometimes have strings that look like the following:
111
110
011
1111
0110012
What is the most efficient way (preferably without regex) to determine if a string contains any character other than the character 1?
Here's a one-line code solution that can be put into a conditional etc.:
strlen(str_replace('1','',$mystring))==0
It strips out the "1"s and sees if there's anything left.
User Don't Panic commented that str_replace could be replaced by trim:
strlen(trim($mystring, '1'))==0
which removes leading and trailing 1s and sees if there's anything left. This would work for the particular case in OP's request but the first option will also tell you how many non-"1" characters you have (if that information matters). Depending on implementation, trim might run slightly faster because PHP doesn't have to check any characters between the first and last non-"1" characters.
You could also use a string like a character array and iterate through from the beginning until you find a character which is not =='1' (in which case, return true) or reach the end of the array (in which case, return false).
Finally, though OP here said "preferably without regex," others open to regexes might use one:
preg_match("/[^1]/", $mystring)==1
Another way to do it:
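if (base_convert($string, 2, 2) === $string) {
    // $string has only 0 and 1 characters.
}
Since your $string is basically a binary number, you can check it with base_convert. Note that base_convert drops leading zeros, so a string such as '011' comes back as '11' and fails this check even though it contains only 0 and 1. It can also lose precision on very long inputs.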
How it works:
var_dump(base_convert('110', 2, 2)); // 110
var_dump(base_convert('11503', 2, 2)); // 110
var_dump(base_convert('9111111111111111111110009', 2, 2)); // 11111111111111111111000
If the value returned by base_convert differs from the input, the string contains characters other than 0 and 1.
If you want to check whether the string has only 1 characters:
if(array_sum(str_split($string)) === strlen($string)) {
    // $string has only 1 characters.
}
You retrieve the single digits with str_split and sum them with array_sum. If the result isn't the same as the length of the string, there are digits other than 1 in the string. Be aware that the sum can also match by coincidence: '20' sums to 2 and has length 2.
Another option is to treat the string like an array of characters and look for something that is not 1. If one is found, break out of the for loop:
for ($i = 0; $i < strlen($mystring); $i++) {
    if ($mystring[$i] != '1') {
        echo 'FOUND!';
        break;
    }
}
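One more regex-free variant, offered only as a sketch: strspn is not mentioned in the answers above, and the helper name hasOtherThanOne is made up for this example. strspn returns the length of the leading run that consists only of the given characters, so comparing it with strlen tells you whether anything else appears.

```php
<?php
// Hypothetical helper: true if $s contains any character other than '1'.
function hasOtherThanOne(string $s): bool
{
    // strspn() counts how many leading bytes of $s are '1'.
    // If that is shorter than the whole string, something else is in there.
    return strspn($s, '1') !== strlen($s);
}

// The sample strings from the question.
foreach (['111', '110', '011', '1111', '0110012'] as $sample) {
    echo $sample, ' => ', hasOtherThanOne($sample) ? 'yes' : 'no', "\n";
}
```

Like the str_replace and trim versions, this treats an empty string as containing nothing but 1s.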

how to use similar text php code in arabic

Trying to use php similar_text() with arabic, but it's not working.
However it works great with english.
<?php
$var = similar_text("ياسر","عمار",$per);
echo $var;
?>
output: 5
That's the wrong result; it should be 2. Is there a similar_text() that works with Arabic letters?
Here's one I'm using
//from http://www.phperz.com/article/14/1029/31806.html
function mb_split_str($str) {
    preg_match_all("/./u", $str, $arr);
    return $arr[0];
}
//based on http://www.phperz.com/article/14/1029/31806.html, added percent
function mb_similar_text($str1, $str2, &$percent) {
    $arr_1 = array_unique(mb_split_str($str1));
    $arr_2 = array_unique(mb_split_str($str2));
    $similarity = count($arr_2) - count(array_diff($arr_2, $arr_1));
    // use character counts, not byte counts, so the percentage matches the character-based similarity
    $percent = ($similarity * 200) / (mb_strlen($str1, 'UTF-8') + mb_strlen($str2, 'UTF-8'));
    return $similarity;
}
So
$var = mb_similar_text('عمار', 'ياسر', $per);
output: $var = 2, $per = 50
Because Arabic text consists of multibyte characters, PHP's byte-oriented string functions (such as similar_text()) do not give the expected results.
echo(strlen("عمار"));
The above code outputs: 8
echo(mb_strlen("عمار", "UTF-8"));
Using the mb_strlen function with the UTF-8 encoding specified, the output is: 4 (the correct number of characters).
You can use the mb_ functions to make your own version of the similar_text function: http://php.net/manual/en/ref.mbstring.php
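As a rough sketch of what such a home-made version could look like: the function names below (mbSimilarTextSketch, similarChars) are invented for this example, and the code assumes valid UTF-8 input. It follows the same general idea as similar_text (find the longest common run, then recurse on the parts to its left and right), but compares whole characters instead of bytes. It uses preg_split with the u modifier rather than the mbstring extension.

```php
<?php
// Hypothetical helper: similarity between two arrays of characters.
function similarChars(array $x, array $y): int
{
    $n = count($x);
    $m = count($y);
    $best = 0;
    $bi = 0;
    $bj = 0;
    // Find the longest run of equal characters (first one found wins ties).
    for ($i = 0; $i < $n; $i++) {
        for ($j = 0; $j < $m; $j++) {
            $k = 0;
            while ($i + $k < $n && $j + $k < $m && $x[$i + $k] === $y[$j + $k]) {
                $k++;
            }
            if ($k > $best) {
                $best = $k;
                $bi = $i;
                $bj = $j;
            }
        }
    }
    if ($best === 0) {
        return 0;
    }
    // Add the similarity of the pieces left and right of the common run.
    return $best
        + similarChars(array_slice($x, 0, $bi), array_slice($y, 0, $bj))
        + similarChars(array_slice($x, $bi + $best), array_slice($y, $bj + $best));
}

// Hypothetical wrapper: splits both strings into UTF-8 characters first.
function mbSimilarTextSketch(string $a, string $b): int
{
    $x = preg_split('//u', $a, -1, PREG_SPLIT_NO_EMPTY);
    $y = preg_split('//u', $b, -1, PREG_SPLIT_NO_EMPTY);
    return similarChars($x, $y);
}

echo mbSimilarTextSketch('عمار', 'ياسر'), "\n"; // 2
echo mbSimilarTextSketch('World', 'Word'), "\n"; // 4
```

A percentage can be derived in the usual way, as twice the similarity divided by the combined character count of both strings, times 100.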
Just for the record, and hopefully to be of some help, I want to clarify the behavior of the similar_text() function when it is given multi-byte character strings (including Arabic strings).
The function simply treats each byte of the input string as an individual character (which means it supports neither multi-byte characters nor Unicode).
The byte streams of the عمار and ياسر strings are shown below in hexadecimal: bytes are separated by . and the end of a character is marked with : instead. These are the UTF-16BE bytes, which match the code points U+0639, U+0645, and so on. In UTF-8 the same strings are stored as D8.B9:D9.85:D8.A7:D8.B1 and D9.8A:D8.A7:D8.B3:D8.B1, and the byte-wise comparison also yields 5:
06.39:06.45:06.27:06.31 <-- Byte stream for عمار
|| || || || ||
06.4A:06.27:06.33:06.31 <-- Byte stream for ياسر
As you can tell, five bytes match, which is why the function returns 5 in this case (every two hexadecimal digits represent one byte).
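If you want to inspect the bytes yourself, a quick sketch with bin2hex works. This assumes the script file is saved as UTF-8, so the hex output shows the UTF-8 bytes that PHP actually compares.

```php
<?php
// Assumption: this file is saved as UTF-8.
echo bin2hex('عمار'), "\n";  // d8b9d985d8a7d8b1
echo bin2hex('ياسر'), "\n";  // d98ad8a7d8b3d8b1

// strlen() counts bytes, so each 4-character word reports 8.
echo strlen('عمار'), "\n";

// similar_text() compares those bytes, not characters.
echo similar_text('ياسر', 'عمار'), "\n"; // 5
```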

PHP increment operator

<?php
$s = "pa99";
$s++;
echo $s;
?>
The above code outputs "pb00".
What I wanted was "pa100" and so on.
But if the string is just "pa", I want it to go to "pb", which already works well with the increment operator.
You are, as Michael says, trying to increment a string - it Does Not Work That Way (tm). What you want to do is this:
<?php
$s = "pa"; //we're defining the string separately!
$i = 99; //no quotes, this is a number
$i++;
echo $s.$i; //concatenate $i onto $s
?>
There's no automated way to increment a string (aa, ab, etc) the way you're asking. You could turn each letter into a number between 1-26 and increment them, and then increment the previous one on overflow. That's kind of messy, though.
To separate the integer from the string, try this:
PHP split string into integer element and string
From the docs:
PHP follows Perl's convention when dealing with arithmetic operations on character variables and not C's. For example, in PHP and Perl $a = 'Z'; $a++; turns $a into 'AA', while in C a = 'Z'; a++; turns a into '[' (ASCII value of 'Z' is 90, ASCII value of '[' is 91). Note that character variables can be incremented but not decremented and even so only plain ASCII characters (a-z and A-Z) are supported. Incrementing/decrementing other character variables has no effect, the original string is unchanged.
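<?php
$s = "pa99";
// Split at the first digit: letters before it, digits after it.
$pos = strcspn($s, '0123456789');
$s_letter = substr($s, 0, $pos);
$s_number = substr($s, $pos);
if ($s_number === '') {
    $s_letter++; // no numeric part: "pa" becomes "pb"
} else {
    $s_number++; // numeric part only: "99" becomes 100
}
$s_result = $s_letter.$s_number;
echo $s_result; // pa100
?>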
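The Perl-style behavior quoted from the docs can be checked with a few one-liners. This is only an illustrative sketch; the variable names are arbitrary.

```php
<?php
// Alphanumeric strings are incremented "odometer style", carrying within
// letters and digits separately.
$a = 'Z';
$a++;               // 'AA'
$b = 'Az';
$b++;               // 'Ba'
$c = 'pa99';
$c++;               // 'pb00': both 9s roll over and the carry moves into 'a'

// A purely numeric string is treated as a number instead.
$d = '99';
$d++;               // int 100

var_dump($a, $b, $c, $d);
```

This is why the numeric part has to be split off and incremented on its own to get "pa100".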

PHP method for stripping duplicate chars from a multibyte string?

Arrrgh. Does anyone know how to create a function that's the multibyte character equivalent of the PHP count_chars($string, 3) command?
Such that it will return a list of ONLY ONE INSTANCE of each unique character. If that was English and we had
"aaabggxxyxzxxgggghq xcccxxxzxxyx"
It would return "abcgh qxyz" (Note the space IS counted).
(The order isn't important in this case, can be anything).
If Japanese kanji (not sure browsers will all support this):
漢漢漢字漢字私私字私字漢字私漢字漢字私
And it will return just the 3 kanji used:
漢字私
It needs to work on any UTF-8 encoded string.
Hey Dave, you're never going to see this one coming.
php > $kanji = '漢漢漢字漢字私私字私字漢字私漢字漢字私';
php > $not_kanji = 'aaabcccbbc';
php > $pattern = '/(.)\1+/u';
php > echo preg_replace($pattern, '$1', $kanji);
漢字漢字私字私字漢字私漢字漢字私
php > echo preg_replace($pattern, '$1', $not_kanji);
abcbc
What, you thought I was going to use mb_substr again?
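In regex-speak, it's looking for any one character followed by one or more instances of that same character. The matched region is then replaced with the one character that matched. As the output shows, this only collapses runs of adjacent duplicates; a character that reappears later in the string is kept.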
The u modifier turns on UTF-8 mode in PCRE, in which it deals with UTF-8 sequences instead of 8-bit characters. As long as the string being processed is UTF-8 already and PCRE was compiled with Unicode support, this should work fine for you.
Hey, guess what!
$not_kanji = 'aaabbbbcdddbbbbccgggcdddeeedddaaaffff';
$l = mb_strlen($not_kanji);
$unique = array();
for($i = 0; $i < $l; $i++) {
    $char = mb_substr($not_kanji, $i, 1);
    if(!array_key_exists($char, $unique))
        $unique[$char] = 0;
    $unique[$char]++;
}
echo join('', array_keys($unique));
This uses the same general trick as the shuffle code. We grab the length of the string, then use mb_substr to extract it one character at a time. We then use that character as a key in an array. We're taking advantage of PHP's ordered arrays: keys stay in the order in which they were first defined. Once we've gone through the string and identified all of the characters, we grab the keys and join 'em back together in the same order that they appeared in the string. You also get a per-character count from this technique.
This would have been much easier if there was such a thing as mb_str_split to go along with str_split. (PHP 7.4 and later do provide mb_str_split.)
(No Kanji example here, I'm experiencing a copy/paste bug.)
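For comparison, here is a shorter sketch of the same idea that splits on character boundaries with preg_split and the u modifier instead of looping over mb_substr. The function name mbUniqueChars is made up for this example, and it assumes the input is valid UTF-8 (preg_split returns false otherwise).

```php
<?php
// Hypothetical helper: each distinct character once, in order of first appearance.
function mbUniqueChars(string $input): string
{
    // With the u modifier, the empty pattern splits between UTF-8 characters, not bytes.
    $chars = preg_split('//u', $input, -1, PREG_SPLIT_NO_EMPTY);
    // array_unique() keeps the first occurrence of every value.
    return implode('', array_unique($chars));
}

echo mbUniqueChars('漢漢漢字漢字私私字私字漢字私漢字漢字私'), "\n"; // 漢字私
echo mbUniqueChars('aaabggxxyxzxxgggghq xcccxxxzxxyx'), "\n";
```

On PHP 7.4 or later, mb_str_split($input) could stand in for the preg_split call.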
Here, try this on for size:
function mb_count_chars_kinda($input) {
    $l = mb_strlen($input);
    $unique = array();
    for($i = 0; $i < $l; $i++) {
        $char = mb_substr($input, $i, 1);
        if(!array_key_exists($char, $unique))
            $unique[$char] = 0;
        $unique[$char]++;
    }
    return $unique;
}
function mb_string_chars_diff($one, $two) {
    $left = array_keys(mb_count_chars_kinda($one));
    $right = array_keys(mb_count_chars_kinda($two));
    return array_diff($left, $right);
}
print_r(mb_string_chars_diff('aabbccddeeffgg', 'abcde'));
/* =>
Array
(
[5] => f
[6] => g
)
*/
You'll want to call this twice, the second time with the left string on the right, and the right string on the left. The output will be different -- array_diff just gives you the stuff in the left side that's missing from the right, so you have to do it twice to get the whole story.
Please check the iconv_strlen function from PHP's iconv extension. I can't speak for Asian encodings, but it works fine for Western and Eastern European languages. In any case it gives you some freedom!
$name = "My string";
$name_array = str_split($name);
$name_array_uniqued = array_unique($name_array);
print_r($name_array_uniqued);
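Much easier. Use str_split to turn the phrase into an array with each character as an element. Then use array_unique to remove duplicates. Pretty simple. Nothing complicated. I like it that way. Note that str_split works on bytes, so this only suits single-byte text; it will break UTF-8 characters such as the kanji in the question.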

How to add currency strings (non-standardized input) together in PHP?

I have a form in which people will be entering dollar values.
Possible inputs:
$999,999,999.99
999,999,999.99
999999999
99,999
$99,999
The user can enter a dollar value however they wish. I want to read the inputs as doubles so I can total them.
I tried just typecasting the strings to doubles but that didn't work. Total just equals 50 when it is output:
$string1 = "$50,000";
$string2 = "$50000";
$string3 = "50,000";
$total = (double)$string1 + (double)$string2 + (double)$string3;
echo $total;
A regex won't convert your string into a number. I would suggest that you use a regex to validate the field (confirm that it fits one of your allowed formats), and then just loop over the string, discarding all non-digit and non-period characters. If you don't care about validation, you could skip the first step. The second step will still strip it down to digits and periods only.
By the way, you cannot safely use floats when calculating currency values. You will lose precision, and very possibly end up with totals that do not exactly match the inputs.
Update: Here are two functions you could use to verify your input and to convert it into a decimal-point representation.
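function validateCurrency($string)
{
    // Either grouped thousands (999,999.99) or plain digits (999999.99),
    // with an optional leading $ and optional cents. The dot is escaped
    // so that it matches only a literal decimal point.
    return preg_match('/^\$?(\d{1,3})(,\d{3})*(\.\d{2})?$/', $string) ||
        preg_match('/^\$?\d+(\.\d{2})?$/', $string);
}
function makeCurrency($string)
{
    $newstring = "";
    $array = str_split($string);
    foreach($array as $char)
    {
        if (($char >= '0' && $char <= '9') || $char == '.')
        {
            $newstring .= $char;
        }
    }
    return $newstring;
}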
The first function will match the bulk of currency formats you can expect "$99", "99,999.00", etc. It will not match ".00" or "99.", nor will it match most European-style numbers (99.999,00). Use this on your original string to verify that it is a valid currency string.
The second function will just strip out everything except digits and decimal points. Note that by itself it may still return invalid strings (e.g. "", "....", and "abc" come out as "", "....", and ""). Use this to eliminate extraneous commas once the string is validated, or possibly use this by itself if you want to skip validation.
You don't ever want to represent monetary values as floats!
For example, take the following (seemingly straight forward) code:
$x = 1.0;
for ($ii=0; $ii < 10; $ii++) {
    $x = $x - .1;
}
var_dump($x);
You might assume that it would produce the value zero, but that is not the case. Since $x is a floating point, it actually ends up being a tiny bit more than zero (1.38777878078E-16), which isn't a big deal in itself, but it means that comparing the value with another value isn't guaranteed to be correct. For example $x == 0 would produce false.
http://p2p.wrox.com/topic.asp?TOPIC_ID=3099
goes through it step by step
[edit] typical...the site seems to be down now... :(
not a one liner, but if you strip out the ','s you can do: (this is pseudocode)
m/^\$?(\d+)(?:\.(\d\d))?$/
$value = $1 + $2/100;
That allows $9.99 but not $9. or $9.9, and it fails to complain about misplaced thousands separators (bug or feature?).
There is a potential locale issue here because you are assuming that thousands are separated with ',' and cents with '.', but in much of Europe it is the opposite (e.g. 1.000,99).
I recommend not to use a float for storing currency values. You can get rounding errors if the sum gets large. (Ok, if it gets very large.)
Better use an integer variable with a large enough range, and store the input in cents, not dollars.
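A minimal sketch of the integer-cents idea, assuming US-style input (',' for thousands, '.' for cents) and a 64-bit PHP build. The function name dollarsToCents is hypothetical, not something from the answers above.

```php
<?php
// Hypothetical helper: converts a US-style dollar string to integer cents,
// or returns null when the input does not look like a dollar amount.
function dollarsToCents(string $input): ?int
{
    // Drop the dollar sign and thousands separators.
    $clean = str_replace(['$', ','], '', trim($input));
    // Whole dollars, optionally followed by exactly two cent digits.
    if (preg_match('/^(\d+)(?:\.(\d{2}))?$/', $clean, $m) !== 1) {
        return null;
    }
    $cents = isset($m[2]) ? (int) $m[2] : 0;
    return (int) $m[1] * 100 + $cents;
}

// Total the three strings from the question without touching floats.
$total = 0;
foreach (['$50,000', '$50000', '50,000'] as $amount) {
    $total += dollarsToCents($amount) ?? 0;
}
echo $total, "\n";                          // 15000000 (cents)
echo number_format($total / 100, 2), "\n";  // 150,000.00
```

Because the commas are simply removed, this sketch does not notice misplaced thousands separators; validate first if that matters.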
I believe that you can accomplish this with printf, which is similar to the C function of the same name. Its parameters can be somewhat esoteric, though. You can also use PHP's number_format function.
Assuming that you are getting real money values, you could simply strip characters that are not digits or the decimal point:
(pseudocode)
newnumber = replace(oldnumber, /[^0-9.]/, //)
Now you can convert using something like
double(newnumber)
However, this will not take care of strings such as "5.6.3" and other such non-money strings. Which raises the question, "Do you need to handle badly formatted strings?"
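In PHP, the pseudocode above could look roughly like the following sketch. The helper name stripToNumber is made up, and the extra preg_match is one possible way to reject leftovers such as "5.6.3".

```php
<?php
// Hypothetical helper: strips everything except digits and '.', then checks
// that what is left is a plain decimal number. Returns null otherwise.
function stripToNumber(string $old): ?string
{
    $new = preg_replace('/[^0-9.]/', '', $old);
    // Reject leftovers such as '', '....' or '5.6.3'.
    return preg_match('/^\d+(\.\d+)?$/', $new) === 1 ? $new : null;
}

var_dump(stripToNumber('$50,000'));   // string(5) "50000"
var_dump(stripToNumber('5.6.3'));     // NULL
```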
