Put something after the nth digit, rather than nth character? - php

I'm working on an autocomplete for SSN-like numbers in PHP. So if the user searches for '123', it should find the number 444123555. I want to bold the results, thus, 444<b>123</b>555. I then, however, want to format it as an SSN - thus creating 444-<b>12-3</b>555.
Is there some way to say 'put the dash after the nth digit'? Because I don't want the nth character, just the nth digit - if I could say 'put a dash after the third digit and the fifth digit, ignoring non-numeric characters like <, b, and >' that would be awesome. Is this doable in a regex?
Or is there a different method that's escaping me here?

Just iterate over the string and check that each character is a digit and count the digits as you go.
That will be so much faster than regex, even if regex were a feasible solution here (which I am not convinced it is).

This will do exactly what you asked for:
$str = preg_replace('/^ ((?:\D*\d){3}) ((?:\D*\d){2}) /x', '$1-$2-', $str);
The (?:\D*\d) will match any number of non-digits, then a digit. By repeating that n times, you match n digits, "ignoring" everything else.
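The same repetition idea extends to other groupings. Here is a minimal sketch, assuming a hypothetical helper named dashAfterDigitGroups (my own name, not part of the answer above) that builds the pattern from a list of group sizes. It also assumes the separator is a plain character with no `$` or backslash in it, since those have a special meaning in a replacement string.

```php
<?php
// Hypothetical helper (not from the answer): inserts $sep after each group of digits.
// $groupSizes = [3, 2] means "after the 3rd digit, then after 2 more digits".
function dashAfterDigitGroups(string $str, array $groupSizes, string $sep = '-'): string
{
    $pattern = '/^';
    $replacement = '';
    foreach (array_values($groupSizes) as $i => $size) {
        // One capture group per digit group: skip non-digits, take a digit, $size times
        $pattern .= '((?:\D*\d){' . (int) $size . '})';
        $replacement .= '${' . ($i + 1) . '}' . $sep;
    }
    $pattern .= '/';

    // If the string has too few digits the pattern does not match
    // and preg_replace returns the input unchanged.
    return preg_replace($pattern, $replacement, $str) ?? $str;
}

echo dashAfterDigitGroups('444<b>123</b>555', [3, 2]), PHP_EOL;
// 444-<b>12-3</b>555
```

Because the pattern is anchored with ^, only the leading digits are grouped; the rest of the string is left as it is.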

Here's a simple function using an iterative approach as Platinum Azure suggests:
function addNumberSeparator($numString, $n, $separator = '-')
{
    $numStringLen = strlen($numString);
    $numCount = 0;
    for ($i = 0; $i < $numStringLen; $i++) {
        // Count only digit characters; tags and other symbols are skipped
        if (ctype_digit($numString[$i])) {
            $numCount++;
            // Insert the separator right after the nth digit
            if ($numCount == $n) {
                return substr($numString, 0, $i + 1) . $separator . substr($numString, $i + 1);
            }
        }
    }
    // Fewer than $n digits: return the input unchanged
    return $numString;
}
$string = '444<b>123</b>555';
$string = addNumberSeparator($string, 3);
$string = addNumberSeparator($string, 5);
echo $string;
This outputs the following:
444-<b>12-3</b>555
That will, of course, only work with a non-numeric separator character. Not the most polished piece of code, but it should give you a start!
Hope that helps.
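A possible variation, sketched under my own assumptions: a single pass that takes all digit positions at once. The name addSeparatorsAfterDigits and the $positions parameter are hypothetical. Since the string is scanned only once, the inserted separators are never re-counted, so a numeric separator would work here too.

```php
<?php
// Hypothetical helper: $positions lists the digit counts after which
// $separator should be inserted, e.g. [3, 5] for an SSN-like number.
function addSeparatorsAfterDigits(string $s, array $positions, string $separator = '-'): string
{
    $out = '';
    $digitsSeen = 0;
    $length = strlen($s);
    for ($i = 0; $i < $length; $i++) {
        $out .= $s[$i];
        // Only digits advance the counter; markup such as <b> is copied through
        if (ctype_digit($s[$i])) {
            $digitsSeen++;
            if (in_array($digitsSeen, $positions, true)) {
                $out .= $separator;
            }
        }
    }
    return $out;
}

echo addSeparatorsAfterDigits('444<b>123</b>555', [3, 5]), PHP_EOL;
// 444-<b>12-3</b>555
```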

If you want to get the formatted number and the surrounding text:
<?php
preg_match("/(.*)(\d{3})(12)(3)(.*)/", "assd444123666as555", $match);
$str = $match[1];
if($match[2]!=="") $str.=$match[2]."-<b>";
$str.=$match[3]."-".$match[4]."</b>";
if($match[5]!=="") $str.=$match[5];
echo $str;
?>
If you want only the formatted number:
<?php
preg_match("/(.*)(\d{3})(12)(3)(.*)/", "as444123666as555", $match);
$str = "";
if($match[2]!=="") $str.=$match[2]."-<b>";
$str.=$match[3]."-".$match[4]."</b>";
echo $str;
?>
Sorry, but the question is a bit ambiguous.

Related

Add a space on a string but counting right to left

I've seen some answers like "Add space after every 4th character", using
echo wordwrap('1234567890' , 4 , '-' , true )
But in this case I need to count the characters from right to left.
For example, to format a phone number in a user-friendly way: 123-123-1234. The problem is that sometimes the user also submits an area code, and if I start normally from left to right I can get this: 012-312-3123-4. So I am thinking of starting from right to left.
Any ideas?
A regex with a lookahead assertion that there are one or more groups of 4 characters between the matched position and the end of the string should do this for you.
echo preg_replace("/(?=(.{4})+$)/", "-", "1234567890");
// 12-3456-7890
You'll need to handle strings with an exact multiple of 4 characters which will end up with a hyphen at the beginning. You could either add a lookbehind assertion to the regex or it might be easier to read if you trim the hyphen off afterwards.
echo preg_replace("/(?=(.{4})+$)/", "-", "123456789012");
// -1234-5678-9012
echo preg_replace("/(?<=.)(?=(.{4})+$)/", "-", "123456789012");
// 1234-5678-9012
echo ltrim(preg_replace("/(?=(.{4})+$)/", "-", "123456789012"), "-");
// 1234-5678-9012
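This works (the grouping is done with str_split, because wordwrap would treat the padding spaces as word breaks):
function myFormat($s, $len, $delimiter = "-")
{
    $techChar = " ";
    // Left-pad to a multiple of $len so that the groups line up from the right
    $newLen = (int) (ceil(strlen($s) / $len) * $len);
    $s = str_pad($s, $newLen, $techChar, STR_PAD_LEFT);
    // Cut into fixed-size groups and join them with the delimiter
    $s = implode($delimiter, str_split($s, $len));
    // Remove the padding again
    $s = ltrim($s, $techChar);
    return $s;
}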
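Another way to count from the right, as a rough sketch: reverse the string, split it into fixed-size chunks, and reverse the result back. The function name groupFromRight is made up for this example, and the code works on bytes, so it assumes single-byte input such as digits.

```php
<?php
// Hypothetical helper: groups $s into chunks of $len characters,
// counted from the right-hand end. Assumes single-byte characters.
function groupFromRight(string $s, int $len, string $delimiter = '-'): string
{
    // After reversing, the "rightmost" characters come first,
    // so str_split produces the groups in right-to-left order.
    $chunks = str_split(strrev($s), $len);

    // The delimiter is reversed too, so a multi-character delimiter
    // reads correctly once the whole string is flipped back.
    return strrev(implode(strrev($delimiter), $chunks));
}

echo groupFromRight('1234567890', 4), PHP_EOL;   // 12-3456-7890
echo groupFromRight('123456789012', 4), PHP_EOL; // 1234-5678-9012
```

No trimming step is needed with this approach, because an exact multiple of the group size never produces a leading delimiter.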

php regex replace each character with asterisk

I am trying to do something like this:
hide user names except for the first 3 characters.
Example:
apple -> app**
google -> goo***
abc12345 -> abc*****
I am currently using php like this:
$string = "abcd1234";
$regex = '/(?<=^(.{3}))(.*)$/';
$replacement = '*';
$changed = preg_replace($regex,$replacement,$string);
echo $changed;
and the result looks like this:
abc*
But I want to replace every single character except for the first 3, like this:
abc*****
How should I do this?
Don't use regex, use substr_replace:
$var = "abcdef";
$charToKeep = 3;
echo strlen($var) > $charToKeep ? substr_replace($var, str_repeat ( '*' , strlen($var) - $charToKeep), $charToKeep) : $var;
Keep in mind that regexes are good for matching patterns in strings, but there are a lot of functions already designed for string manipulation.
This will output:
abc***
Try this function. You can specify how many characters should be visible and which character should be used as the mask:
$string = "abcd1234";
echo hideCharacters($string, 3, "*");
function hideCharacters($string, $visibleCharactersCount, $mask)
{
if(strlen($string) < $visibleCharactersCount)
return $string;
$part = substr($string, 0, $visibleCharactersCount);
return str_pad($part, strlen($string), $mask, STR_PAD_RIGHT);
}
Output:
abc*****
Your regex matches all the characters after the first 3 in one go, so the whole run is replaced with a single hard-coded *.
You can use
'~(^.{3}|(?!^)\G)\K.~'
and replace with *.
This regex matches either the first 3 characters (with ^.{3}) or the position where the previous successful match ended (with \G; the (?!^) lookahead keeps \G from also matching at the start of the string). It then drops what has been matched so far from the match value (with \K) and matches any single character but a newline with the final dot.
For example:
$re = '~(^.{3}|(?!^)\G)\K.~';
$strs = array("aa","apple", "google", "abc12345", "asdddd");
foreach ($strs as $s) {
$result = preg_replace($re, "*", $s);
echo $result . PHP_EOL;
}
Another possible solution is to concatenate the first three characters with a string of * repeated the correct number of times:
$text = substr($string, 0, 3).str_repeat('*', max(0, strlen($string) - 3));
The call to max() is needed to keep str_repeat() from receiving a negative argument, which happens when $string is shorter than 3 characters. A negative count triggers a warning (a ValueError as of PHP 8).
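One caveat worth noting: strlen() and substr() count bytes, so the solutions above can miscount when the name contains multibyte UTF-8 characters. Below is a minimal sketch of a character-based variant, assuming UTF-8 input. The name maskAfter is hypothetical, and it relies only on PCRE's u flag rather than on the mbstring extension.

```php
<?php
// Hypothetical helper: keeps the first $keep characters and masks the rest.
// Assumes $s is UTF-8; characters are counted, not bytes.
function maskAfter(string $s, int $keep, string $mask = '*'): string
{
    // With the u flag, an empty pattern splits the string between UTF-8 characters
    $chars = preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        // Invalid UTF-8: return the input untouched (an arbitrary choice for this sketch)
        return $s;
    }

    $visible = array_slice($chars, 0, $keep);
    $hidden  = max(0, count($chars) - $keep);

    return implode('', $visible) . str_repeat($mask, $hidden);
}

echo maskAfter('abcd1234', 3), PHP_EOL; // abc*****
echo maskAfter('aa', 3), PHP_EOL;       // aa
```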

PHP wrapping last two letters of string with HTML

I am running into a problem trying to do a replacement on a few strings. Essentially what I have is a bunch of prices on my page that look like
RMB148.00
What I am trying to do is run a replace on only the last 2 digits so that I end up with something like
RMB148<sup>00</sup>
Preg replace works fine for the RMB part because it is always there.
My problem is that the last two digits can be anything, depending on the price, so I can't just remove and replace them; I need to wrap HTML <sup> tags around them.
$string = $product['price'];
$string = preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $string);
echo preg_replace('/RMB/', '<sup class="currency-sym">RMB</sup>', $string, 1);
Assuming the last two characters are digits, you could just
$string=preg_replace('/(\d\d)$/', '<sup class="currency-sym">\1</sup>', $string);
If not,
$string=preg_replace('/(..)$/', '<sup class="currency-sym">\1</sup>', $string);
should do the trick.
Alternatively, use
$string=substr($string,0,-2).'<sup class="currency-sym">'.substr($string,-2).'</sup>';
Here is a regex solution that looks for the final digit notation at the end of your string.
$string = 'RMB148.00';
$string = preg_replace('/(\d+)\.(\d{2})\z/','$1<sup>$2</sup>',$string);
echo $string;
You could use the following with the explode() function:
$string = explode ('.', $product['price']);
$new_string = $string[0].'<sup>'. $string [1]. '</sup>';
And do the regex for the RMB the same way.
Code.
<?php
$string = '14842.00';
$string = substr($string, 0, strlen($string) - 2) . '<sup>' . substr($string, strlen($string) - 2, 2) . '</sup>';
echo $string;
Explanation.
substr($s, $i, $l) returns $l characters of $s, starting at index $i (indexes start from zero).
So the first call, substr($string, 0, strlen($string) - 2), returns the whole string except the last two characters.
The second call, substr($string, strlen($string) - 2, 2), returns only the last two characters.
You should use a regex with a capture group. The $1 in the replacement argument refers to whatever (\d{2}) captured in the pattern argument. preg_replace() only replaces the part of the string that the pattern matched. This pattern matches a literal . followed by two digits at the end of the string. Since the . is not included in the replacement argument, it does not show up in the resulting $string.
$string = preg_replace('/\.(\d{2})$/', '<sup>$1</sup>', $string);
Of course, you could use one preg_replace to do what you want:
$string = preg_replace('/^(RMB)(\d+)(\.(\d{2}))?$/', "<sup class='currency-sym'>$1</sup>$2<sup>$4</sup>", $string);
The second example may be a good idea if you want a consistent DOM structure, but note that it creates an empty <sup></sup> when there is no decimal part.
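If the empty <sup></sup> is unwanted, one option is to match first and build the markup by hand. This is only a sketch: the function name formatPrice is hypothetical, and it assumes prices always look like RMB followed by digits and an optional two-digit decimal part.

```php
<?php
// Hypothetical helper: wraps the currency symbol and the decimals in <sup> tags.
// Strings that do not look like "RMB123" or "RMB123.45" are returned unchanged.
function formatPrice(string $price): string
{
    if (!preg_match('/^(RMB)(\d+)(?:\.(\d{2}))?$/', $price, $m)) {
        return $price;
    }

    $html = '<sup class="currency-sym">' . $m[1] . '</sup>' . $m[2];

    // $m[3] is only set when the decimal part was present
    if (isset($m[3]) && $m[3] !== '') {
        $html .= '<sup>' . $m[3] . '</sup>';
    }

    return $html;
}

echo formatPrice('RMB148.00'), PHP_EOL;
// <sup class="currency-sym">RMB</sup>148<sup>00</sup>
echo formatPrice('RMB148'), PHP_EOL;
// <sup class="currency-sym">RMB</sup>148
```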

Unicode (UTF8) string word count in PHP

I need to have the word count of the following unicode string. Using str_word_count:
$input = 'Hello, chào buổi sáng';
$count = str_word_count($input);
echo $count;
the result is
7
which is apparently wrong.
How to get the desired result (4)?
$tags = 'Hello, chào buổi sáng';
$word = explode(' ', $tags);
echo count($word);
Here's a demo: http://codepad.org/667Cr1pQ
Here is a quick and dirty regex-based (using Unicode) word counting function:
function mb_count_words($string) {
    // \p{Pd} (dash punctuation) needs braces; \pPd would mean "any punctuation, or the letter d"
    preg_match_all('/[\pL\pN\p{Pd}]+/u', $string, $matches);
    return count($matches[0]);
}
A "word" is anything that contains one or more of:
Any alphabetic letter
Any digit
Any hyphen/dash
This would mean that the following contains 5 "words" (4 normal, 1 hyphenated):
echo mb_count_words('Hello, chào buổi sáng, chào-sáng');
Now, this function is not well suited for very large texts; though it should be able to handle most of what counts as a block of text on the internet. This is because preg_match_all needs to build and populate a big array only to throw it away once counted (it is very inefficient). A more efficient way of counting would be to go through the text character by character, identifying unicode whitespace sequences, and incrementing an auxiliary variable. It would not be that difficult, but it is tedious and takes time.
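A small variation on the regex approach, offered as a sketch rather than a benchmarked improvement: preg_match_all() returns the number of full matches, so the count can be read from the return value without touching a matches array. Whether that is noticeably cheaper on large texts is an assumption that would need measuring. The name countWordsUnicode is hypothetical, and \pM is added here so that combining marks in decomposed text stay inside their word.

```php
<?php
// Hypothetical helper: counts runs of letters, combining marks, digits and dashes.
// Assumes $text is valid UTF-8.
function countWordsUnicode(string $text): int
{
    // The return value is the number of full pattern matches (or false on error)
    $count = preg_match_all('/[\pL\pM\pN\p{Pd}]+/u', $text);

    return $count === false ? 0 : $count;
}

// Characters are written as escapes so the example does not depend on file encoding
echo countWordsUnicode("Hello, ch\u{00E0}o bu\u{1ED5}i s\u{00E1}ng"), PHP_EOL; // 4
```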
You may use this function to count unicode words in given string:
function count_unicode_words( $unicode_string ){
    // First remove all the punctuation marks & digits
    $unicode_string = preg_replace('/[[:punct:][:digit:]]/', '', $unicode_string);
    // Now replace every whitespace character (tabs, new lines, spaces) with a single space
    $unicode_string = preg_replace('/[[:space:]]/', ' ', $unicode_string);
    // The words are now separated by spaces and can be split into an array
    // I have included \n\r\t here as well, but only space will also suffice
    $words_array = preg_split( "/[\n\r\t ]+/", $unicode_string, 0, PREG_SPLIT_NO_EMPTY );
    // Now we can get the word count by counting the array elements
    return count($words_array);
}
All credits go to the author.
I'm using this code to count words. You can try this:
$s = 'Hello, chào buổi sáng';
$s1 = array_map('trim', explode(' ', $s));
$s2 = array_filter($s1, function($value) { return $value !== ''; });
echo count($s2);

Condensed function to strip double letters away from a string (PHP)

I need to remove every double-letter occurrence from a word (e.g. "attached" has to become "aached").
I wrote this function:
function strip_doubles($string, $positions) {
    for ($i = 0; $i < strlen($string); $i++) {
        $stripped_word[] = $string[$i];
    }
    foreach ($positions['word'] as $position) {
        unset($stripped_word[$position], $stripped_word[$position + 1]);
    }
    $returned_string = "";
    foreach ($stripped_word as $key => $value) {
        $returned_string .= $stripped_word[$key];
    }
    return $returned_string;
}
where $string is the word to be stripped and $positions is an array containing the positions of any first double letter.
It works perfectly, but how would a real programmer write the same function in a more condensed way? I have a feeling it should be possible to do the same thing without three loops and so much code.
Non-regex solution, tested:
$string = 'attached';
$stripped = '';
for ($i=0,$l=strlen($string);$i<$l;$i++) {
$matched = '';
// if current char is the same as the next, skip it
while (substr($string, $i, 1)==substr($string, $i+1, 1)) {
$matched = substr($string, $i, 1);
$i++;
}
// if current char is NOT the same as the matched char, append it
if (substr($string, $i, 1) != $matched) {
$stripped .= substr($string, $i, 1);
}
}
echo $stripped;
You should use a regular expression. It matches on certain characteristics and can replace the matched occurrences with some other string(s).
Something like
$result = preg_replace('#([a-zA-Z]{1})\1#i', '', $string);
should work. It tells the regexp to match one character from a-z followed by the match itself, thus effectively two identical characters in a row. The # characters mark the start and end of the regexp. If you want more characters than just a-z and A-Z, you could use other classes such as [a-zA-Z0-9]{1}, or .{1} for any character; for Unicode letters (including combining marks), use \p{L}\p{M}* together with the u flag.
The i flag after the last # means 'case insensitive' and will instruct the regexp to also match combinations with different cases, like 'tT'. If you want only combinations in the same case, so 'tt' and 'TT', then remove the 'i' from the flags.
The '' tells the regexp to replace the matched occurrences (the two identical characters) with an empty string.
See http://php.net/manual/en/function.preg-replace.php and http://www.regular-expressions.info/
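To make the Unicode variant concrete, here is a minimal sketch wrapped in a function. The name stripDoubles is hypothetical, the match is case-sensitive (no i flag), and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper: removes every pair of identical adjacent letters.
// Case-sensitive: 'tt' is removed, 'tT' is kept.
function stripDoubles(string $word): string
{
    // (\p{L}\p{M}*) captures one letter plus any combining marks,
    // \1 requires exactly the same text again right after it.
    $result = preg_replace('/(\p{L}\p{M}*)\1/u', '', $word);

    // preg_replace returns null on error (e.g. invalid UTF-8); fall back to the input
    return $result ?? $word;
}

echo stripDoubles('attached'), PHP_EOL;    // aached
echo stripDoubles('Mississippi'), PHP_EOL; // Miiii
```

Note that the replacement is a single pass: letters that only become adjacent after a pair is removed (the run of i's in the second example) are not stripped again.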
