Is there a nice way to iterate on the characters of a string? I'd like to be able to do foreach, array_map, array_walk, array_filter etc. on the characters of a string.
Type casting/juggling didn't get me anywhere (it puts the whole string into the array as a single element), and the best solution I've found is simply using a for loop to construct the array. It feels like there should be something better. I mean, if you can index on it, shouldn't you be able to iterate as well?
This is the best I've got
function stringToArray($s)
{
$r = array();
for($i=0; $i<strlen($s); $i++)
$r[$i] = $s[$i];
return $r;
}
$s1 = "textasstringwoohoo";
$arr = stringToArray($s1); //$arr now has character array
$ascval = array_map('ord', $arr); //so i can do stuff like this
foreach ($arr as $curChar) {....}
// array_filter takes the array first, then the callback
$evenAsciiOnly = array_filter($arr, function($x) {return ord($x) % 2 === 0;});
Is there either:
A) A way to make the string iterable
B) A better way to build the character array from the string (and if so, how about the other direction?)
I feel like I'm missing something obvious here.
Use str_split to iterate ASCII strings (since PHP 5.0)
If your string contains only ASCII (i.e. "English") characters, then use str_split.
$str = 'some text';
foreach (str_split($str) as $char) {
var_dump($char);
}
Use mb_str_split to iterate Unicode strings (since PHP 7.4)
If your string might contain Unicode (i.e. "non-English") characters, then you must use mb_str_split.
$str = 'μυρτιὲς δὲν θὰ βρῶ';
foreach (mb_str_split($str) as $char) {
var_dump($char);
}
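Once the string is an array of characters, the array functions from the question apply directly, and implode() covers the other direction (array back to string). This is only a minimal sketch, assuming PHP 7.4+ with the mbstring extension; the sample word and variable names are made up for illustration.

```php
<?php
// Assumes PHP 7.4+ with the mbstring extension enabled.
$chars = mb_str_split('héllo');          // ['h', 'é', 'l', 'l', 'o']

// array_map works on the characters like on any other array.
$upper = array_map('mb_strtoupper', $chars);

// array_filter takes the array first and the callback second.
$onlyL = array_filter($chars, function ($c) { return $c === 'l'; });

// implode() is the other direction: array of characters back to a string.
$joined = implode('', $upper);
echo $joined, PHP_EOL;                   // HÉLLO
```

Note that array_filter keeps the original keys, so wrap the result in array_values() if you need a re-indexed list.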
Iterate string:
for ($i = 0; $i < strlen($str); $i++){
echo $str[$i];
}
If your strings are in Unicode you should use preg_split with /u modifier
From comments in php documentation:
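// Polyfill for PHP < 7.4; newer versions ship mb_str_split() natively,
// and redeclaring it would be a fatal error.
if (!function_exists('mb_str_split')) {
function mb_str_split( $string ) {
# Split at all position not after the start: ^
# and not before the end: $
return preg_split('/(?<!^)(?!$)/u', $string );
}
}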
You can also just access $s1 like an array, if you only need to access it:
$s1 = "hello world";
echo $s1[0]; // -> h
For those who are looking for the fastest way to iterate over strings in PHP, I've prepared a benchmark.
The first method in which you access string characters directly by specifying its position in brackets and treating string like an array:
$string = "a sample string for testing";
$char = $string[4]; // equals to m
I myself thought this was the fastest method, but I was wrong.
As with the second method (which is used in the accepted answer):
$string = "a sample string for testing";
$string = str_split($string);
$char = $string[4]; // equals to m
This method is going to be faster because we are indexing a real array rather than treating a string as one.
Calling the last line of each of the above methods 1,000,000 times led to these benchmarking results:
Using string[i]
0.24960017204285 Seconds
Using str_split
0.18720006942749 Seconds
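Which means the second method is faster, by roughly 25% in this run. Note that the measurement covers only the indexed read, not the str_split call itself.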
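If you want to repeat the comparison on your own machine, something along these lines should do. It is only a rough sketch: the loop structure, the variable names and the use of microtime() are my assumptions, not the original benchmark script, and the numbers will differ per PHP version and hardware.

```php
<?php
$string = "a sample string for testing";
$iterations = 1000000;

// Method 1: index into the string directly.
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $char = $string[4];
}
$direct = microtime(true) - $start;

// Method 2: split once (outside the timed loop), then index into the array.
$array = str_split($string);
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $char = $array[4];
}
$viaArray = microtime(true) - $start;

printf("Using string[i]: %.6f seconds\n", $direct);
printf("Using str_split: %.6f seconds\n", $viaArray);
```

Moving the str_split call inside the timed section would show what the conversion itself costs, which matters if you only read a few characters.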
Most of the answers forgot about non-English characters!
strlen counts BYTES, not characters. That is why it and its sibling functions work fine with English characters: those are stored in 1 byte each in both UTF-8 and ASCII. For anything else, you need to use the multibyte string functions mb_*
This will work with any character encoded in UTF-8
// 8 characters in 12 bytes
$string = "abcdأبتث";
$charsCount = mb_strlen($string, 'UTF-8');
for($i = 0; $i < $charsCount; $i++){
$char = mb_substr($string, $i, 1, 'UTF-8');
var_dump($char);
}
This outputs
string(1) "a"
string(1) "b"
string(1) "c"
string(1) "d"
string(2) "أ"
string(2) "ب"
string(2) "ت"
string(2) "ث"
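The byte versus character difference is easy to check yourself. A small sketch, assuming the file is saved as UTF-8 and mbstring is available; it reuses the string from above:

```php
<?php
// Assumes the source file is saved as UTF-8 and mbstring is available.
$string = "abcdأبتث";

echo strlen($string), PHP_EOL;              // 12 (bytes)
echo mb_strlen($string, 'UTF-8'), PHP_EOL;  // 8 (characters)

// str_split() cuts per byte, so each Arabic letter is torn into two pieces.
$bytes = str_split($string);
echo count($bytes), PHP_EOL;                // 12

// Indexing with [] is also per byte: this is half a character, not "أ".
$firstArabicByte = $string[4];
echo bin2hex($firstArabicByte), PHP_EOL;    // d8
```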
Expanding on @SeaBrightSystems' answer, you could try this:
$s1 = "textasstringwoohoo";
$arr = str_split($s1); //$arr now has character array
Hmm... There's no need to complicate things. The basics work great always.
$string = 'abcdef';
$len = strlen( $string );
$x = 0;
Forward Direction:
while ( $len > $x ) echo $string[ $x++ ];
Outputs: abcdef
Reverse Direction:
while ( $len ) echo $string[ --$len ];
Outputs: fedcba
// Unicode Codepoint Escape Syntax in PHP 7.0
$str = "cat!\u{1F431}";
// IIFE (Immediately Invoked Function Expression) in PHP 7.0
$gen = (function(string $str) {
for ($i = 0, $len = mb_strlen($str); $i < $len; ++$i) {
yield mb_substr($str, $i, 1);
}
})($str);
var_dump(
true === $gen instanceof Traversable,
// PHP 7.1
true === is_iterable($gen)
);
foreach ($gen as $char) {
echo $char, PHP_EOL;
}
Related
If one is experienced in PHP, then one knows how to find whole words in a string and their position using a regex and preg_match() or preg_match_all. But, if you're looking instead for a lighter solution, you may be tempted to try with strpos(). The question emerges as to how one can use this function without it detecting substrings contained in other words. For example, how to detect "any" but not those characters occurring in "company"?
Consider a string like the following:
"Will *any* company do *any* job, (are there any)?"
How would one apply strpos() to detect each appearance of "any" in the string? Real life often involves more than merely space delimited words. Unfortunately, this sentence didn't appear with the non-alphabetical characters when I originally posted.
I think you could probably just replace all the non-word characters you care about (e.g., what about hyphenations?) with spaces and test for " word ":
var_dump(firstWordPosition('Will company any do any job, (are there any)?', 'any'));
var_dump(firstWordPosition('Will *any* company do *any* job, (are there any)?', 'any'));
function firstWordPosition($str, $word) {
// There are others, maybe also pass this in or array_merge() for more control.
// Escapes such as "\n" need double quotes; every entry is one character, so offsets are preserved.
$nonchars = ["'",'"','.',',','!','?','(',')','^','$','#','*',"\n","\r","\t"];
// You could also do a strpos() with an if and another argument passed in.
// Note that we're padding $str with spaces to match at the beginning/end.
$pos = stripos(str_replace($nonchars, ' ', " $str "), " $word ");
// The match starts at the space before the word, which cancels out the
// leading pad, so $pos is already the word's offset in the original $str.
return $pos;
}
Gives 13 for the first call and 6 for the second (offset from 0)
https://3v4l.org/qh9Rb
<?php
$subject = "any";
$b = " ";
$delimited = "$b$subject$b";
$replace = array("?","*","(",")",",",".");
$str = "Will *any* company do *any* job, (are there any)?";
echo "\nThe string: \"$str\"";
$temp = str_replace($replace,$b,$str);
while ( ($pos = strpos($temp,$delimited)) !== false )
{
echo "\nThe subject \"$subject\" occurs at position ",($pos + 1);
for ($i=0,$max=$pos + 1 + strlen($subject); $i <= $max; $i++) {
$temp[$i] = $b;
}
}
See demo
The script defines a word boundary as a blank space. If the string has non-alphabetical characters, they are replaced with blank space and the result is stored in $temp. As the loop iterates and detects $subject, each of its characters changes into a space in order to locate the next appearance of the subject. Considering the amount of work involved one may wonder if such effort really pays off compared to using a regex with a preg_ function. That is something that one will have to decide themselves. My purpose was to show how this may be achieved using strpos() without resorting to the oft repeated conventional wisdom of SO which advocates using a regex.
There is an option if you are loath to create a replacement array of non-alphabetical characters, as follows:
<?php
function getAllWholeWordPos($s,$word){
$b = " ";
$delimited = "$b$word$b";
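$retval = []; // start with an array: appending to false is deprecated as of PHP 8.1, and count(false) throws in PHP 8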
for ($i=0, $max = strlen( $s ); $i < $max; $i++) {
if ( !ctype_alpha( $s[$i] ) ){
$s[$i] = $b;
}
}
while ( ( $pos = stripos( $s, $delimited) ) !== false ) {
$retval[] = $pos + 1;
for ( $i=0, $max = $pos + 1 + strlen( $word ); $i <= $max; $i++) {
$s[$i] = $b;
}
}
return $retval;
}
$whole_word = "any";
$str = "Will *$whole_word* company do *$whole_word* job, (are there $whole_word)?";
echo "\nString: \"$str\"";
$result = getAllWholeWordPos( $str, $whole_word );
$times = count( $result );
echo "\n\nThe word \"$whole_word\" occurs $times times:\n";
foreach ($result as $pos) {
echo "\nPosition: ",$pos;
}
See demo
Note, this example with its update improves the code by providing a function which uses a variant of strpos(), namely stripos() which has the added benefit of being case insensitive. Despite the more labor-intensive coding, the performance is speedy; see performance.
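For comparison, here is roughly what the regex route mentioned in the question looks like. This is only a sketch: the function name wholeWordPositions is made up, and it relies on \b treating letters, digits and underscore as word characters, which may or may not match your idea of a "word".

```php
<?php
// Hypothetical helper name; returns the byte offsets of every whole-word match.
function wholeWordPositions(string $haystack, string $word): array
{
    // preg_quote() keeps regex metacharacters in $word literal; "i" = case-insensitive.
    $pattern = '/\b' . preg_quote($word, '/') . '\b/i';
    preg_match_all($pattern, $haystack, $matches, PREG_OFFSET_CAPTURE);

    // Each entry of $matches[0] is [matched text, offset]; keep only the offsets.
    return array_column($matches[0], 1);
}

$str = "Will *any* company do *any* job, (are there any)?";
print_r(wholeWordPositions($str, 'any')); // offsets 6, 23 and 44
```

The "any" inside "company" is skipped because there is no word boundary between "p" and "a".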
Try the following code
<!DOCTYPE html>
<html>
<body>
<?php
echo strpos("I love php, I love php too!","php");
?>
</body>
</html>
Output: 7
How can I format an arbitrary string according to a flexible pattern? The only solution I came up with is using regular expressions, but then I need 2 "patterns" (one for the search and one for the output).
Example:
$str = '123ABC5678';
Desired output: 12.3AB-C5-67.8
I would like to use a pattern in a variable (one that a user can easily define without knowledge of regular expressions). It could look like this:
$pattern = '%%.%%%-%%-%%.%';
So the user would just have to use 2 different characters (% and .)
A solution with regex would look like this:
$str = '123ABC5678';
$pattern_src = '#(.{2})(.{3})(.{2})(.{2})(.{1})#';
$pattern_rpl = "$1.$2-$3-$4.$5";
$res = preg_replace($pattern_src, $pattern_rpl, $str);
//$res eq 12.3AB-C5-67.8
Way too complicated since the user would need to define $pattern_src and $pattern_rpl. If the string could vary in length, it would be even more complex to explain.
Yes, I could write a function/parser that builds the required regular expressions based on a simple user pattern like %%.%%%-%%-%%.%. But I wonder if there is any "built in" way to achieve this with php? I was thinking about sprintf etc., but that doesn't seem to do the trick. Any ideas?
I was thinking about sprintf etc., but that doesn't seem to do the trick.
You're on the right track. You can accomplish this with vsprintf as follows:
$str = '123ABC5678';
$pattern = '%%.%%%-%%-%%.%';
echo vsprintf(str_replace('%', '%s', $pattern), str_split($str));
Output:
12.3AB-C5-67.8
This assumes the number of % characters in $pattern matches the length of $str.
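If you cannot guarantee that assumption, you could check it up front. A sketch of one way to do that; the function name formatWithPattern and the choice to throw an exception are mine, not part of the answer above:

```php
<?php
// Hypothetical wrapper: refuses input whose length does not match the pattern.
function formatWithPattern(string $str, string $pattern): string
{
    $placeholders = substr_count($pattern, '%');
    if ($placeholders !== strlen($str)) {
        throw new InvalidArgumentException(
            "Pattern expects $placeholders characters, got " . strlen($str)
        );
    }
    // Same trick as above: every % becomes %s and receives one character.
    return vsprintf(str_replace('%', '%s', $pattern), str_split($str));
}

echo formatWithPattern('123ABC5678', '%%.%%%-%%-%%.%'), PHP_EOL; // 12.3AB-C5-67.8
```

Like str_split, this works per byte, so it is only suitable for single-byte input.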
Why not write a simple parser that works as follows:
For each character of pattern:
if you match percent character, output next character from input
if you match any other character, output it
$str = '123ABC5678';
$pattern = '%%.%%%-%%-%%.%';
if (strlen($str) < substr_count($pattern, '%'))
die('The length of the input string is lower than the number of placeholders');
$len = strlen($pattern);
$stringIndex = 0;
$output = '';
for ($i = 0; $i < $len; $i++) {
if ($pattern[$i] === '%') {
$output .= $str[$stringIndex];
$stringIndex++;
} else {
$output .= $pattern[$i];
}
}
echo $output;
I have a similar solution that looks like this.
<?php
$format = '%%.%%%-%%-%%.%';
$string = '123ABC5678';
$new_string = '';
$c = 0;
for( $i = 0; $i < strlen( $format ); $i++ )
{
if( $format[ $i ] == '%' )
{
$new_string .= $string[ $c ];
$c++;
}
else
{
$new_string .= $format[ $i ];
}
}
echo $new_string;
Output:
12.3AB-C5-67.8
How about this pattern from the user?
2.3-2-2.1
Where a number in the pattern means n chars, and a dot or dash means add a dot or dash.
Now you make a regex to parse the user input:
preg_match_all("/(.)/", $User_input, $pattern);
Now you will have an array with either numbers or dots and dashes.
So loop through the array and build the string:
$string = '123ABC5678';
$User_input = "2.3-2-2.1";
preg_match_all("/(.)/", $User_input, $pattern);
$i=0;
$str="";
foreach($pattern[1] as $val){
if(is_numeric($val)){
$str .= substr($string,$i,$val);
$i=$i+$val;
}else{
$str .= $val;
}
}
echo $str;
https://3v4l.org/5eg5G
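One limitation of matching a single character at a time is that a count like 10 would be read as 1 followed by 0. A possible variation, sketched under the assumption that runs of digits should be read as one number; the function name formatByCounts is made up:

```php
<?php
// Hypothetical variant: "\d+" captures multi-digit counts, "\D" any single separator.
function formatByCounts(string $input, string $userPattern): string
{
    preg_match_all('/\d+|\D/', $userPattern, $tokens);

    $offset = 0;
    $out = '';
    foreach ($tokens[0] as $token) {
        if (ctype_digit($token)) {
            // A number: copy that many characters from the input.
            $out .= substr($input, $offset, (int) $token);
            $offset += (int) $token;
        } else {
            // Anything else is a literal separator.
            $out .= $token;
        }
    }
    return $out;
}

echo formatByCounts('123ABC5678', '2.3-2-2.1'), PHP_EOL; // 12.3AB-C5-67.8
echo formatByCounts('123ABC5678', '4-6'), PHP_EOL;       // 123A-BC5678
```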
Given two equal-length strings, is there an elegant way to get the offset of the first different character?
The obvious solution would be:
for ($offset = 0; $offset < $length; ++$offset) {
if ($str1[$offset] !== $str2[$offset]) {
return $offset;
}
}
But that doesn't look quite right, for such a simple task.
You can use a nice property of bitwise XOR (^) to achieve this: Basically, when you xor two strings together, the characters that are the same will become null bytes ("\0"). So if we xor the two strings, we just need to find the position of the first non-null byte using strspn:
$position = strspn($string1 ^ $string2, "\0");
That's all there is to it. So let's look at an example:
$string1 = 'foobarbaz';
$string2 = 'foobarbiz';
$pos = strspn($string1 ^ $string2, "\0");
printf(
'First difference at position %d: "%s" vs "%s"',
$pos, $string1[$pos], $string2[$pos]
);
That will output:
First difference at position 7: "a" vs "i"
So that should do it. It's very efficient since it's only using C functions, and requires only a single copy of memory of the string.
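To see what the XOR actually produces, you can dump the intermediate string as hex. A small illustration using the same two strings; the variable $xor is just a name I picked:

```php
<?php
$string1 = 'foobarbaz';
$string2 = 'foobarbiz';

// Equal bytes XOR to 0x00; 'a' (0x61) ^ 'i' (0x69) gives 0x08.
$xor = $string1 ^ $string2;
echo bin2hex($xor), PHP_EOL;     // 000000000000000800

// strspn() counts the leading run of null bytes, i.e. the common prefix length.
$pos = strspn($xor, "\0");
echo $pos, PHP_EOL;              // 7

// For identical strings the whole result is null bytes, so $pos equals the length.
var_dump(strspn('same' ^ 'same', "\0") === strlen('same')); // bool(true)
```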
Edit: A MultiByte Solution Along The Same Lines:
function getCharacterOffsetOfDifference($str1, $str2, $encoding = 'UTF-8') {
return mb_strlen(
mb_strcut(
$str1,
0, strspn($str1 ^ $str2, "\0"),
$encoding
),
$encoding
);
}
First the difference at the byte level is found using the above method and then the offset is mapped to the character level. This is done using the mb_strcut function, which is basically substr but honoring multibyte character boundaries.
var_dump(getCharacterOffsetOfDifference('foo', 'foa')); // 2
var_dump(getCharacterOffsetOfDifference('©oo', 'foa')); // 0
var_dump(getCharacterOffsetOfDifference('f©o', 'fªa')); // 1
It's not as elegant as the first solution, but it's still a one-liner (and if you use the default encoding a little bit simpler):
return mb_strlen(mb_strcut($str1, 0, strspn($str1 ^ $str2, "\0")));
If you convert a string to an array of single character one byte values you can use the array comparison functions to compare the strings.
You can achieve a similar result to the XOR method with the following.
$string1 = 'foobarbaz';
$string2 = 'foobarbiz';
$array1 = str_split($string1);
$array2 = str_split($string2);
$result = array_diff_assoc($array1, $array2);
$num_diff = count($result);
$first_diff = key($result);
echo "There are " . $num_diff . " differences between the two strings. <br />";
echo "The first difference between the strings is at position " . $first_diff . ". (Zero Index) '$string1[$first_diff]' vs '$string2[$first_diff]'.";
Edit: Multibyte Solution
$string1 = 'foobarbaz';
$string2 = 'foobarbiz';
// '//u' splits between every UTF-8 character; the flags belong in the fourth argument (the third is $limit).
$array1 = preg_split('//u', $string1, -1, PREG_SPLIT_NO_EMPTY);
$array2 = preg_split('//u', $string2, -1, PREG_SPLIT_NO_EMPTY);
$result = array_diff_assoc($array1, $array2);
$num_diff = count($result);
$first_diff = key($result);
echo "There are " . $num_diff . " differences between the two strings.\n";
echo "The first difference between the strings is at position " . $first_diff . ". (Zero Index) '{$array1[$first_diff]}' vs '{$array2[$first_diff]}'.\n";
I wanted to add this as a comment to the best answer, but I do not have enough points.
$string1 = 'foobarbaz';
$string2 = 'foobarbiz';
$pos = strspn($string1 ^ $string2, "\0");
if ($pos < min(strlen($string1), strlen($string2))) {
printf(
'First difference at position %d: "%s" vs "%s"',
$pos, $string1[$pos], $string2[$pos]
);
} else if ($pos < strlen($string1)) {
print 'String1 continues with ' . substr($string1, $pos);
} else if ($pos < strlen($string2)) {
print 'String2 continues with ' . substr($string2, $pos);
} else {
print 'String1 and String2 are equal';
}
string strpbrk ( string $haystack , string $char_list )
strpbrk() searches the haystack string for any of the characters in char_list.
The return value is the substring of $haystack which begins at the first matched character.
As an API function it should be zippy. Then loop through once, looking for offset zero of the returned string to obtain your offset.
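A short illustration of how an offset can be derived from the strpbrk() return value. The sample text and variable names are made up, and strcspn() is shown as a possible shortcut that yields the offset directly:

```php
<?php
$haystack = 'This is a test';

// strpbrk() returns the rest of the string from the first "s" or "t" onwards
// (case-sensitive, so the leading "T" does not count).
$rest = strpbrk($haystack, 'st');
echo $rest, PHP_EOL;                       // s is a test

// The offset is whatever was cut off the front.
$offset = ($rest === false) ? false : strlen($haystack) - strlen($rest);
echo $offset, PHP_EOL;                     // 3

// strcspn() counts the leading characters that are NOT in the list,
// which is the same offset without building the substring.
echo strcspn($haystack, 'st'), PHP_EOL;    // 3
```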
I'm trying to strip every third character (in the example, a period). Below is my best guess; it is as close as I've gotten, but I'm missing something, probably minor. Also, would this method (if I could get it working) be better than a regex match and remove?
$arr = 'Ha.pp.yB.ir.th.da.y';
$strip = '';
for ($i = 1; $i < strlen($arr); $i += 2) {
$arr[$i] = $strip;
}
One way you can do it is:
<?php
$oldString = 'Ha.pp.yB.ir.th.da.y';
$newString = "";
for ($i = 0; $i < strlen($oldString ); $i++) // loop the length of the string
{
if (($i+1) % 3 != 0) // skip every third letter
{
$newString .= $oldString[$i]; // build up the new string
}
}
// $newString is HappyBirthday
echo $newString;
?>
Alternatively the explode() function might work, if the letter you're trying to remove is always the same one.
This might work:
echo preg_replace('/(..)./', '$1', 'Ha.pp.yB.ir.th.da.y');
To make it general purpose:
echo preg_replace('/(.{2})./', '$1', $str);
where 2 in this context means you are keeping two characters, then discarding the next.
A way of doing it:
$old = 'Ha.pp.yB.ir.th.da.y';
$arr = str_split($old); #break string into an array
#iterate over the array, but only do it over the characters which are a
#multiple of three (remember that arrays start with 0)
for ($i = 2; $i < count($arr); $i+=2) {
#remove current array item
array_splice($arr, $i, 1);
}
$new = implode($arr); #join it back
Or, with a regular expression:
$old = 'Ha.pp.yB.ir.th.da.y';
$new = preg_replace('/(..)\./', '$1', $old);
#selects any two characters followed by a dot character
#alternatively, if you know that the two characters are letters,
#change the regular expression to:
#/(\w{2})\./
I'd just use array_map and a callback function. It'd look roughly like this:
function remove_third_char( $text ) {
return substr( $text, 0, 2 );
}
$text = 'Ha.pp.yB.ir.th.da.y';
$new_text = str_split( $text, 3 );
$new_text = array_map( "remove_third_char", $new_text );
// do whatever you want with new array
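To get a string back out of that array, implode() should be enough. A compact sketch of the whole round trip; it uses an inline closure instead of the named callback, which is only a stylistic choice:

```php
<?php
$text = 'Ha.pp.yB.ir.th.da.y';

// Cut into chunks of three: 'Ha.', 'pp.', ..., 'da.', 'y'.
$chunks = str_split($text, 3);

// Keep the first two characters of each chunk (the last chunk is just 'y').
$kept = array_map(function ($chunk) {
    return substr($chunk, 0, 2);
}, $chunks);

// Join the pieces back into one string.
$result = implode('', $kept);
echo $result, PHP_EOL; // HappyBirthday
```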
I am making a method that requires your password to contain at least one capital and one symbol or number.
I was thinking of splitting the string into loose chars and then using preg_match to check whether it contains a capital and a symbol/number.
I did something like this in ActionScript, but I can't figure out what this is called in PHP. I can't find a way to put every char of a word in an array.
AS3 example
for(var i:uint = 0; i < thisWordCode.length -1 ; i++)
{
thisWordCodeVerdeeld[i] = thisWordCode.charAt(i);
//trace (thisWordCodeVerdeeld[i]);
}
Thanks,
Matthy
You can convert a string to an array with str_split and use foreach:
$chars = str_split($str);
foreach($chars as $char){
// your code
}
You can access characters in strings in the same way as you would access an array index, e.g.
$length = strlen($string);
$thisWordCodeVerdeeld = array();
for ($i=0; $i<$length; $i++) {
$thisWordCodeVerdeeld[$i] = $string[$i];
}
You could also do:
$thisWordCodeVerdeeld = str_split($string);
However you might find it is easier to validate the string as a whole string, e.g. using regular expressions.
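A sketch of what whole-string validation could look like for the rule in the question. The function name and the exact character classes are assumptions: here a "capital" is an ASCII A-Z, and "symbol or number" is anything that is neither an ASCII letter nor whitespace.

```php
<?php
// Hypothetical helper; adjust the character classes to your own definition.
function hasCapitalAndSymbolOrNumber(string $password): bool
{
    $hasCapital = preg_match('/[A-Z]/', $password) === 1;
    // Anything that is not an ASCII letter or whitespace counts as symbol/number.
    $hasSymbolOrNumber = preg_match('/[^A-Za-z\s]/', $password) === 1;

    return $hasCapital && $hasSymbolOrNumber;
}

var_dump(hasCapitalAndSymbolOrNumber('Password1')); // bool(true)
var_dump(hasCapitalAndSymbolOrNumber('password1')); // bool(false)
var_dump(hasCapitalAndSymbolOrNumber('Password'));  // bool(false)
```

No splitting into characters is needed this way, since preg_match scans the whole string.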
Since str_split() function is not multibyte safe, an easy solution to split UTF-8 encoded string is to use preg_split() with u (PCRE_UTF8) modifier.
preg_split( '//u', $str, -1, PREG_SPLIT_NO_EMPTY )
You can access a string using [], as you do for arrays:
$stringLength = strlen($str);
for ($i = 0; $i < $stringLength; $i++)
$char = $str[$i];
Try this, it works better for UTF-8 characters (Kurdish, Persian and Arabic):
<?php
$searchTerm = "هێڵەگ";
$chars = preg_split("//u", $searchTerm, -1, PREG_SPLIT_NO_EMPTY); // NO_EMPTY drops the empty strings at both ends
foreach($chars as $char){echo $char."<br>";}
?>