PHP count of occurrences of characters of a string within another string

Let's say I have two strings.
$needle = 'AGUXYZ';
$haystack = 'Agriculture ID XYZ-A';
I want to count how often characters that are in $needle occur in $haystack. In $haystack, there are the characters 'A' (twice), 'X', 'Y' and 'Z', all of which are in the needle, thus the result is supposed to be 5 (case-sensitive).
Is there any function for that in PHP or do I have to program it myself?
Thanks in advance!

You can calculate the length of the original string and the length of the string without these characters. The difference between them is the number of matches.
Basically,
$needle = 'AGUXYZ';
$haystack = 'Agriculture ID XYZ-A';
Here is the part that does the work. In one line.
$count = strlen($haystack) - strlen(str_replace(str_split($needle), '', $haystack));
Explanation: The first part is self-explanatory. The second part is the length of the string without the characters in the $needle string. This is done by replacing each occurrence of any character in $needle with an empty string.
To do this, we split $needle into an array, one character per item, using str_split, then pass it to str_replace as the $search argument. str_replace replaces each occurrence of any item in the $search array with an empty string.
Echo it out,
echo "Count = $count\n";
you get:
Count = 5

Try this;
function count_occurences($char_string, $haystack, $case_sensitive = true){
if($case_sensitive === false){
$char_string = strtolower($char_string);
$haystack = strtolower($haystack);
}
$characters = str_split($char_string);
$character_count = 0;
foreach($characters as $character){
$character_count = $character_count + substr_count($haystack, $character);
}
return $character_count;
}
To use;
$needle = 'AGUXYZ';
$haystack = 'Agriculture ID XYZ-A';
print count_occurences($needle, $haystack);
You can set the third parameter to false to ignore case.
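As a rough sketch, the case flag can also be combined with the length-difference trick from the str_replace answer. The function name count_set_chars and its $caseSensitive parameter are made up for illustration; the case-insensitive branch relies on the built-in str_ireplace. Unlike the substr_count loop, a character that appears twice in $needle is not counted twice here.

```php
<?php
// Hypothetical helper: counts the characters of $haystack that belong to the set in $needle.
function count_set_chars(string $needle, string $haystack, bool $caseSensitive = true): int
{
    if ($needle === '') {
        return 0; // nothing to look for
    }
    $chars = str_split($needle);
    // Remove every character of the set, then compare lengths.
    $stripped = $caseSensitive
        ? str_replace($chars, '', $haystack)
        : str_ireplace($chars, '', $haystack);
    return strlen($haystack) - strlen($stripped);
}

echo count_set_chars('AGUXYZ', 'Agriculture ID XYZ-A'), "\n";        // 5
echo count_set_chars('AGUXYZ', 'Agriculture ID XYZ-A', false), "\n"; // 8
```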

There's no built-in function that handles character sets, but you can simply use the substr_count function in a loop, like this:
<?php
$sourceCharacters = str_split('AGUXYZ');
$targetString = 'Agriculture ID XYZ-A';
$occurrenceCount = array();
foreach($sourceCharacters as $currentCharacter) {
$occurrenceCount[$currentCharacter] = substr_count($targetString, $currentCharacter);
}
print_r($occurrenceCount);
?>

There is no specific method to do this, but this built in method can surely help you:
$count = substr_count($haystack , $needle);
edit: I just reported the general substr_count method. In your particular case you need to call it for each character inside $needle (thanks @Alan Whitelaw)

If you are not interested in the character distribution, you could use a Regex
echo preg_match_all("/[$needle]/", $haystack, $matches);
which returns the number of full pattern matches (which might be zero), or FALSE if an error occurred. The solution offered by @thai above should be significantly faster though.
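One caveat with the one-liner is that $needle is interpolated into the character class unescaped, so a needle containing ], ^, - or a backslash would change the pattern. A minimal sketch that escapes it first with preg_quote follows; the function name count_chars_regex is hypothetical, and matching is byte-based, so multi-byte characters are not handled.

```php
<?php
// Hypothetical helper: counts characters of $haystack that appear in $needle, via a character class.
function count_chars_regex(string $needle, string $haystack): int
{
    if ($needle === '') {
        return 0; // an empty character class would be an invalid pattern
    }
    // preg_quote escapes regex metacharacters (including "-" and "]") and the "/" delimiter.
    $class = preg_quote($needle, '/');
    return (int) preg_match_all('/[' . $class . ']/', $haystack);
}

echo count_chars_regex('AGUXYZ', 'Agriculture ID XYZ-A'); // 5
```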
If the character distribution is of any importance, you can use count_chars:
$needle = 'AGUXYZ';
$haystack = 'Agriculture ID XYZ-A';
$occurences = array_intersect_key(
count_chars($haystack, 1),
array_flip(
array_map('ord', str_split($needle))
)
);
The result is an array keyed by the byte (ASCII) value of each matched character, with the number of occurrences as the value.
You can then iterate over it with
foreach($occurences as $char => $amount) {
printf("There is %d occurences of %s\n", $amount, chr($char));
}
You could still pass the $occurences array to array_sum to calculate the total.
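For completeness, here is a small sketch of that last step, reusing the same count_chars expression and summing the per-character counts. The variable name $total is only illustrative.

```php
<?php
$needle = 'AGUXYZ';
$haystack = 'Agriculture ID XYZ-A';

// Per-character counts, keyed by byte value, restricted to the characters of $needle.
$occurrences = array_intersect_key(
    count_chars($haystack, 1),
    array_flip(array_map('ord', str_split($needle)))
);

// Sum the individual counts to get the overall total.
$total = array_sum($occurrences);
echo $total; // 5
```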

substr_count will get you close. However, it will not do individual characters. So you could loop over each character in $needle and call this function while summing the counts.

There is a PHP function substr_count to count the number of instances of a character in a string. It would be trivial to extend it for multiple characters:
function substr_multi_count ($haystack, $needle, $offset = 0, $length = null) {
$ret = 0;
if ($length === null) {
$length = strlen($haystack) - $offset;
}
for ($i = strlen($needle); $i--; ) {
$ret += substr_count($haystack, $needle[$i], $offset, $length); // count one character of $needle per iteration
}
return $ret;
}

I have a recursive method to overcome this:
function countChar($str){
if(strlen($str) == 0) return 0;
if(substr($str,-1) == "x") return 1 + countChar(substr($str,0,-1));
return 0 + countChar(substr($str,0,-1));
}
echo countChar("xxSR"); // 2
echo countChar("SR"); // 0
echo countChar("xrxrpxxx"); // 5

I'd do something like:
split the string to chars (str_split), and then
use array_count_values to get an array of characters with the respective number of occurrences.
Code:
$needle = 'AGUXYZ';
$string = "asdasdadas asdadas asd asdsd";
$array_chars = str_split($string);
$value_count = array_count_values($array_chars);
for ($i = 0; $i < strlen($needle); $i++) // $needle is a string, so use strlen(), not count()
echo $needle[$i] . " occurs " .
($value_count[$needle[$i]] ?? 0) . " times\n"; // ?? avoids an undefined-index notice for absent characters

Related

PHP remove values below a given value in a "|"-separated string

I have this value:
$numbers= "|800027|800036|800079|800097|800134|800215|800317|800341|800389";
And I want to remove the values below 800130 including the starting "|". I guess it is possible, but I can not find any examples anywhere. If anyone can point me to the right direction I would be thankful.
You could split the input string on pipe, then remove all array elements which, when cast to numbers, are less than 800130. Then, recombine to a pipe delimited string.
$input= "|800027|800036|800079|800097|800134|800215|800317|800341|800389";
$input = ltrim($input, '|');
$numbers = explode("|", $input);
$array = [];
foreach ($numbers as $number) {
if ($number >= 800130) array_push($array, $number);
}
$output = implode("|", $array);
echo "|" . $output;
This prints:
|800134|800215|800317|800341|800389
This should work as well:
$numbers= "|800027|800036|800079|800097|800134|800215|800317|800341|800389";
function my_filter($value) {
return ($value >= "800130");
}
$x = explode("|", $numbers); // Convert to array
$y = array_filter($x, "my_filter"); // Filter out elements
$z = implode("|", $y); // Convert to string again
echo $z;
Note that it's not necessary to have different variables (x,y,z). It's just there to make it a little bit easier to follow the code :)
PHP has a built in function preg_replace_callback which takes a regular expression - in your case \|(\d+) - and applies a callback function to the matched values. Which means you can do this with a simple comparison of each matched value...
$numbers= "|800027|800036|800079|800097|800134|800215|800317|800341|800389";
echo preg_replace_callback("/\|(\d+)/", function($match){
return $match[1] < 800130 ? "" : $match[0];
}, $numbers);
Use the explode and implode functions and delete the values that are less than 800130:
$numbers= "|800027|800036|800079|800097|800134|800215|800317|800341|800389";
$values = explode("|", $numbers);
$count = count($values); // cache the length: unset() shrinks the array while we loop
for ($i=1;$i<$count;$i++) {
if (intval($values[$i])<800130) {
unset($values[$i]);
}
}
// Notice I didn't start the $i index from 0 in the for loop above because the string is starting with "|", the first index value for explode is ""
// If you will not do this, you will get "|" in the end in the resulting string, instead of start.
$result = implode("|", $values);
echo $result;
It will print:
|800134|800215|800317|800341|800389
You can split them with a regex and then filter the array, keeping the values that are not below 800130:
$numbers= "|800027|800036|800079|800097|800134|800215|800317|800341|800389";
$kept = '|'.join('|', array_filter(preg_split('/\|/', $numbers, -1, PREG_SPLIT_NO_EMPTY), fn($n) => $n >= 800130));
echo $kept;
This prints:
|800134|800215|800317|800341|800389

Working with substr_count() and arrays in PHP

So what I need is to compare a string to an array (string as a haystack and array as a needle) and get the elements from the string that repeat within the array. For this purpose I've taken a sample function for using an array as a needle in the substr_count function.
$animals = array('cat','dog','bird');
$toString = implode(' ', $animals);
$data = array('a');
function substr_count_array($haystack, $needle){
$initial = 0;
foreach ($needle as $substring) {
$initial += substr_count($haystack, $substring);
}
return $initial;
}
echo substr_count_array($toString, $data);
The problem is that if I search for a character such as 'a', it gets through the check and validates as a legit value because 'a' is contained within the first element. So the above outputs 1. I figured this was due to the foreach() but how do I bypass that? I want to search for a whole string match, not partial.
You can break up the $haystack into individual words, then do an in_array() check over it to make sure the word exists in that array as a whole word before doing your substr_count():
$animals = array('cat','dog','bird', 'cat', 'dog', 'bird', 'bird', 'hello');
$toString = implode(' ', $animals);
$data = array('cat');
function substr_count_array($haystack, $needle){
$initial = 0;
$bits_of_haystack = explode(' ', $haystack);
foreach ($needle as $substring) {
if(!in_array($substring, $bits_of_haystack))
continue; // skip this needle if it doesn't exist as a whole word
$initial += substr_count($haystack, $substring);
}
return $initial;
}
echo substr_count_array($toString, $data);
Here, cat is 2, dog is 2, bird is 3, hello is 1 and lion is 0.
Edit: here's another alternative using array_keys() with the search parameter set to the $needle:
function substr_count_array($haystack, $needle){
$bits_of_haystack = explode(' ', $haystack);
return count(array_keys($bits_of_haystack, $needle[0]));
}
Of course, this approach requires a string as the needle. I'm not 100% sure why you need to use an array as the needle, but perhaps you could do a loop outside the function and call it for each needle if you need to - just another option anyway!
Just throwing my solution in the ring here; the basic idea, as outlined by scrowler as well, is to break up the search subject into separate words so that you can compare whole words.
function substr_count_array($haystack, $needle)
{
$substrings = explode(' ', $haystack);
return array_reduce($substrings, function($total, $current) use ($needle) {
return $total + count(array_keys($needle, $current, true));
}, 0);
}
The array_reduce() step is basically this:
$total = 0;
foreach ($substrings as $substring) {
$total = $total + count(array_keys($needle, $substring, true));
}
return $total;
The array_keys() expression returns the keys of $needle for which the value equals $substring. The size of that array is the number of occurrences.
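As a hedged alternative sketch of the same whole-word idea, the haystack words can be counted once with array_count_values and then looked up per needle. The function name count_whole_words is made up for illustration, and it assumes words are separated by single spaces, as in the question.

```php
<?php
// Hypothetical helper: total number of whole-word occurrences of any needle in $haystack.
function count_whole_words(string $haystack, array $needles): int
{
    // Count how often each space-separated word occurs in the haystack.
    $wordCounts = array_count_values(explode(' ', $haystack));
    $total = 0;
    // array_unique() keeps a needle listed twice from being counted twice.
    foreach (array_unique($needles) as $needle) {
        $total += $wordCounts[$needle] ?? 0;
    }
    return $total;
}

$animals = 'cat dog bird cat dog bird bird hello';
echo count_whole_words($animals, ['cat']), "\n"; // 2
echo count_whole_words($animals, ['a']), "\n";   // 0, partial matches are ignored
```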

strpos() with multiple needles?

I am looking for a function like strpos() with two significant differences:
To be able to accept multiple needles. I mean thousands of needles at once.
To search for all occurrences of the needles in the haystack and to return an array of starting positions.
Of course it has to be an efficient solution not just a loop through every needle. I have searched through this forum and there were similar questions to this one, like:
Using an array as needles in strpos
Define multiple needles using stripos
Can't search an array in PHP in_array for the presence of multiple needles
but neither of them was what I am looking for. I am using strpos just to illustrate my question better; probably something entirely different has to be used for this purpose.
I am aware of Zend_Search_Lucene and I am interested if it can be used to achieve this and how (just the general idea)?
Thanks a lot for Your help and time!
Try preg_match with multiple alternatives:
if (preg_match('/word|word2/i', $str))
See also: Checking for multiple strpos values
Here's some sample code for my strategy:
function strpos_array($haystack, $needles, $offset=0) {
$matches = array();
//Avoid the obvious: when haystack or needles are empty, return no matches
if(empty($needles) || empty($haystack)) {
return $matches;
}
$haystack = (string)$haystack; //Pre-cast non-string haystacks
$haylen = strlen($haystack);
//Allow negative (from end of haystack) offsets
if($offset < 0) {
$offset += $haylen;
}
//Wrap a single (non-array) needle in an array so the loops below handle both cases
if(!is_array($needles)) {
$needles = array($needles);
}
$needles = array_unique($needles); //Not necessary if you are sure all needles are unique
//Precalculate needle lengths to save time
foreach($needles as &$origNeedle) {
$origNeedle = array((string)$origNeedle, strlen($origNeedle));
}
//Find matches
for(; $offset < $haylen; $offset++) {
foreach($needles as $needle) {
list($needle, $length) = $needle;
if($needle === substr($haystack, $offset, $length)) { //strict comparison, so numeric-looking strings are not compared as numbers
$matches[] = $offset;
break;
}
}
}
return($matches);
}
I've implemented a simple brute force method above that will work with any combination of needles and haystacks (not just words). For possibly faster algorithms check out:
Aho–Corasick string matching algorithm
Other Solution
function strpos_array($haystack, $needles, $theOffset=0) {
$matches = array();
if(empty($haystack) || empty($needles)) {
return $matches;
}
$haylen = strlen($haystack);
if($theOffset < 0) { // Support negative offsets
$theOffset += $haylen;
}
foreach($needles as $needle) {
$needlelen = strlen($needle);
$offset = $theOffset;
while(($match = strpos($haystack, $needle, $offset)) !== false) {
$matches[] = $match;
$offset = $match + $needlelen;
if($offset >= $haylen) {
break;
}
}
}
return $matches;
}
I know this doesn't answer the OP's question but wanted to comment since this page is at the top of Google for strpos with multiple needles. Here's a simple solution to do so (again, this isn't specific to the OP's question - sorry):
$img_formats = array('.jpg','.png');
$missing = array();
foreach ( $img_formats as $format )
if ( stripos($post['timer_background_image'], $format) === false ) $missing[] = $format;
if (count($missing) == 2)
return array("save_data"=>$post,"error"=>array("message"=>"The background image must be in a .jpg or .png format.","field"=>"timer_background_image"));
If 2 items are added to the $missing array that means that the input doesn't satisfy any of the image formats in the $img_formats array. At that point you know that you can return an error, etc. This could easily be turned into a little function:
function m_stripos( $haystack = null, $needles = array() ){
//return early if missing arguments
if ( !$needles || !$haystack ) return false;
// create an array to evaluate at the end
$missing = array();
//Loop through needles array, and add to $missing array if not satisfied
foreach ( $needles as $needle )
if ( stripos($haystack, $needle) === false ) $missing[] = $needle;
//If the count of $missing and $needles is equal, we know there were no matches, return false..
if (count($missing) == count($needles)) return false;
//If we're here, be happy, return true...
return true;
}
Back to our first example, this time using the function:
$needles = array('.jpg','.png');
if ( !m_stripos( $post['timer_background_image'], $needles ) )
return array("save_data"=>$post,"error"=>array("message"=>"The background image must be in a .jpg or .png format.","field"=>"timer_background_image"));
Of course, what you do after the function returns true or false is up to you.
It seems you are searching for whole words. In this case, something like this might help. As it uses built-in functions, it should be faster than custom code, but you have to profile it:
$words = str_word_count($str, 2);
$word_position_map = array();
foreach($words as $position => $word) {
if(!isset($word_position_map[$word])) {
$word_position_map[$word] = array();
}
$word_position_map[$word][] = $position;
}
// assuming $needles is an array of words
$result = array_intersect_key($word_position_map, array_flip($needles));
Storing the information (like the needles) in the right format will improve the runtime ( e.g. as you don't have to call array_flip).
Note from the str_word_count documentation:
For the purpose of this function, 'word' is defined as a locale dependent string containing alphabetic characters, which also may contain, but not start with "'" and "-" characters.
So make sure you set the locale right.
You could use a regular expression, they support OR operations. This would however make it fairly slow, compared to strpos.
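A minimal sketch of that regex idea, which returns every starting position in one pass, is given below. The function name strpos_all is hypothetical. It builds an alternation of escaped needles inside a lookahead so that overlapping matches are reported as well, and it is byte-based. With thousands of needles the pattern becomes very large and may hit PCRE limits, so treat this as an illustration rather than a tuned solution.

```php
<?php
// Hypothetical helper: all byte offsets in $haystack at which any needle starts.
function strpos_all(string $haystack, array $needles): array
{
    // Drop empty needles and duplicates.
    $needles = array_unique(array_filter(array_map('strval', $needles), fn($n) => $n !== ''));
    if (count($needles) === 0) {
        return [];
    }
    // Escape each needle and join them with "|" (regex OR).
    $alternation = implode('|', array_map(fn($n) => preg_quote($n, '/'), $needles));
    // The lookahead matches zero characters, so every position is tested, overlaps included.
    $pattern = '/(?=(?:' . $alternation . '))/';
    if (preg_match_all($pattern, $haystack, $m, PREG_OFFSET_CAPTURE) === false) {
        throw new RuntimeException('PCRE error code ' . preg_last_error());
    }
    // Each entry of $m[0] is [matched text, offset]; keep only the offset.
    return array_map(fn($hit) => $hit[1], $m[0]);
}

print_r(strpos_all('one two three four', ['five', 'three', 'o'])); // offsets 0, 6, 8, 15
```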
How about a simple solution using array_map()?
$string = 'one two three four';
$needles = array( 'five' , 'three' );
$strpos_arr = array_map( function ( $check ) use ( $string ) {
return strpos( $string, $check );
}, $needles );
The result is an array whose keys are the indexes of the needles in $needles and whose values are the starting positions, or false (which print_r shows as an empty value) when a needle is not found.
//print_r( $strpos_arr );
Array
(
[0] =>
[1] => 8
)

Finding matching portions of two strings in PHP

I'm looking for a simple way to find matching portions of two strings in PHP (specifically in the context of a URI)
For example, consider the two strings:
http://2.2.2.2/~machinehost/deployment_folder/
and
/~machinehost/deployment_folder/users/bob/settings
What I need is to chop off the matching portion of these two strings from the second string, resulting in:
users/bob/settings
before appending the first string as a prefix, forming an absolute URI.
Is there some simple way (in PHP) to compare two arbitrary strings for matching substrings within them?
EDIT: as pointed out, I meant the longest matching substring common to both strings
Assuming your strings are $a and $b, respectively, you can use this:
$a = 'http://2.2.2.2/~machinehost/deployment_folder/';
$b = '/~machinehost/deployment_folder/users/bob/settings';
$len_a = strlen($a);
$len_b = strlen($b);
// $len_b - $p is the candidate overlap length; it can never exceed strlen($a)
for ($p = max(0, $len_b - $len_a); $p < $len_b; $p++)
if (substr($a, $len_a - ($len_b - $p)) == substr($b, 0, $len_b - $p))
break;
$result = $a.substr($b, $len_b - $p);
echo $result;
The result is http://2.2.2.2/~machinehost/deployment_folder/users/bob/settings.
Finding the longest common match can also be done using regex.
The below function will take two strings, use one to create a regex, and execute it against the other.
/**
* Determine the longest common match within two strings
*
* @param string $str1
* @param string $str2 Two strings in any order.
* @param boolean $case_sensitive Set to true to force
* case sensitivity. Default: false (case insensitive).
* @return string The longest string - first match.
*/
function get_longest_common_subsequence( $str1, $str2, $case_sensitive = false ) {
// First check to see if one string is the same as the other.
if ( $str1 === $str2 ) return $str1;
if ( ! $case_sensitive && strtolower( $str1 ) === strtolower( $str2 ) ) return $str1;
// We'll use '#' as our regex delimiter. Any character can be used as we'll quote the string anyway,
$delimiter = '#';
// We'll find the shortest string and use that to check substrings and create our regex.
$l1 = strlen( $str1 );
$l2 = strlen( $str2 );
$str = $l1 <= $l2 ? $str1 : $str2;
$str2 = $l1 <= $l2 ? $str2 : $str1;
$l = min( $l1, $l2 );
// Next check to see if one string is a substring of the other.
if ( $case_sensitive ) {
if ( strpos( $str2, $str ) !== false ) {
return $str;
}
}
else {
if ( stripos( $str2, $str ) !== false ) {
return $str;
}
}
// Regex for each character will be of the format (?:a(?=b))?
// We also need to capture the last character, but this prevents us from matching strings with a single character. (?:.|c)?
$reg = $delimiter;
for ( $i = 0; $i < $l; $i++ ) {
$a = preg_quote( $str[ $i ], $delimiter );
$b = $i + 1 < $l ? preg_quote( $str[ $i + 1 ], $delimiter ) : false;
$reg .= sprintf( $b !== false ? '(?:%s(?=%s))?' : '(?:.|%s)?', $a, $b );
}
$reg .= $delimiter;
if ( ! $case_sensitive ) {
$reg .= 'i';
}
// Resulting example regex from a string 'abbc':
// '#(?:a(?=b))?(?:b(?=b))?(?:b(?=c))?(?:.|c)?#i';
// Perform our regex on the remaining string
$str = $l1 <= $l2 ? $str2 : $str1;
if ( preg_match_all( $reg, $str, $matches ) ) {
// $matches is an array with a single array with all the matches.
return array_reduce( $matches[0], function( $a, $b ) {
$al = strlen( $a );
$bl = strlen( $b );
// Return the longest string, as long as it's not a single character.
return $al >= $bl || $bl <= 1 ? $a : $b;
}, '' );
}
// No match - Return an empty string.
return '';
}
It'll generate a regex using the shorter of the two strings, although performance will most likely be the same either way. It may incorrectly match strings with recurring substrings, and we're limited to matching strings of two characters or more, unless they are equal or one is a substring of the other. For instance:
// Works as intended.
get_longest_common_subsequence( 'abbc', 'abc' ) === 'ab';
// Returns incorrect substring based on string length and recurring substrings.
get_longest_common_subsequence( 'abbc', 'abcdef' ) === 'abc';
// Does not return any matches, as all recurring strings are only a single character long.
get_longest_common_subsequence( 'abc', 'ace' ) === '';
// One of the strings is a substring of the other.
get_longest_common_subsequence( 'abc', 'a' ) === 'a';
Regardless, it functions using an alternate method and the regex can be refined to tackle additional situations.
I'm not sure I understand your full request, but the idea is:
Let A be your URL and B your "/~machinehost/deployment_folder/users/bob/settings"
search B in A -> you get an index i (where i is the position of the first / of B in A)
let l = length(A)
You need to cut B from (l-i) to length(B) to grab the last part of B (/users/bob/settings)
I have not tested yet, but if you really need, I can help you make this brilliant (ironical) solution work.
Note that it may be possible with regular expressions like
$pattern = '#' . preg_quote($B, '#') . '(.*?)#'; // a PCRE pattern needs delimiters, and $B must be escaped
$res = array();
preg_match_all($pattern, $A, $res);
Edit: I think your last comment invalidates my response. But what you want is to find substrings. So you can first start with a heavy algorithm trying to find B[1:i] in A for i in {2, length(B)} and then apply some dynamic programming.
There does not seem to be out-of-the-box code for your requirement, so let's look for a simple way.
For this exercise I used two methods, one for finding the longest match and another one to chop off the matching portion.
The FindLongestMatch() method takes apart a path and, piece by piece, looks for a match in the other path, keeping just one match, the longest one (no arrays, no sorting).
The RemoveLongestMatch() method takes the suffix or 'remainder' after the position where the longest match was found.
Here is the full source code:
<?php
function FindLongestMatch($relativePath, $absolutePath)
{
static $_separator = '/';
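$splitted = array_reverse(explode($_separator, $absolutePath));
$match = ''; // initialize so the first concatenation and comparison below are well-defined
$longestMatch = '';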
foreach ($splitted as &$value)
{
$matchTest = $value.$_separator.$match;
if(IsSubstring($relativePath, $matchTest))
$match = $matchTest;
if (!empty($value) && IsNewMatchLonger($match, $longestMatch))
$longestMatch = $match;
}
return $longestMatch;
}
//Removes from the first string the longest match.
function RemoveLongestMatch($relativePath, $absolutePath)
{
$match = findLongestMatch($relativePath, $absolutePath);
$positionFound = strpos($relativePath, $match);
$suffix = substr($relativePath, $positionFound + strlen($match));
return $suffix;
}
function IsNewMatchLonger($match, $longestMatch)
{
return strlen($match) > strlen($longestMatch);
}
function IsSubstring($string, $subString)
{
return strpos($string, $subString) > 0;
}
This is a representative subset of Test Cases:
//TEST CASES
echo "<br>-----------------------------------------------------------";
echo "<br>".$absolutePath = 'http://2.2.2.2/~machinehost/deployment_folder/';
echo "<br>".$relativePath = '/~machinehost/deployment_folder/users/bob/settings';
echo "<br>Longest match: ".findLongestMatch($relativePath, $absolutePath);
echo "<br>Suffix: ".removeLongestMatch($relativePath, $absolutePath);
echo "<br>-----------------------------------------------------------";
echo "<br>".$absolutePath = 'http://1.1.1.1/root/~machinehost/deployment_folder/';
echo "<br>".$relativePath = '/root/~machinehost/deployment_folder/users/bob/settings';
echo "<br>Longest match: ".findLongestMatch($relativePath, $absolutePath);
echo "<br>Suffix: ".removeLongestMatch($relativePath, $absolutePath);
echo "<br>-----------------------------------------------------------";
echo "<br>".$absolutePath = 'http://2.2.2.2/~machinehost/deployment_folder/users/';
echo "<br>".$relativePath = '/~machinehost/deployment_folder/users/bob/settings';
echo "<br>Longest match: ".findLongestMatch($relativePath, $absolutePath);
echo "<br>Suffix: ".removeLongestMatch($relativePath, $absolutePath);
echo "<br>-----------------------------------------------------------";
echo "<br>".$absolutePath = 'http://3.3.3.3/~machinehost/~machinehost/subDirectory/deployment_folder/';
echo "<br>".$relativePath = '/~machinehost/subDirectory/deployment_folderX/users/bob/settings';
echo "<br>Longest match: ".findLongestMatch($relativePath, $absolutePath);
echo "<br>Suffix: ".removeLongestMatch($relativePath, $absolutePath);
Running previous Test Cases provides the following output:
http://2.2.2.2/~machinehost/deployment_folder/
/~machinehost/deployment_folder/users/bob/settings
Longest match: ~machinehost/deployment_folder/
Suffix: users/bob/settings
http://1.1.1.1/root/~machinehost/deployment_folder/
/root/~machinehost/deployment_folder/users/bob/settings
Longest match: root/~machinehost/deployment_folder/
Suffix: users/bob/settings
http://2.2.2.2/~machinehost/deployment_folder/users/
/~machinehost/deployment_folder/users/bob/settings
Longest match: ~machinehost/deployment_folder/users/
Suffix: bob/settings
http://3.3.3.3/~machinehost/~machinehost/subDirectory/deployment_folder/
/~machinehost/subDirectory/deployment_folderX/users/bob/settings
Longest match: ~machinehost/subDirectory/
Suffix: deployment_folderX/users/bob/settings
Maybe you can take the idea of this piece of code and turn it into something that you find useful for your current project.
Let me know if it worked for you too. By the way, oreX's answer looks good too.
Try this.
http://pastebin.com/GqS3UiPD

How to find first non-repetitive character from a string?

I've spent half a day trying to figure this out and finally got a working solution.
However, I feel like this can be done in a simpler way.
I think this code is not really readable.
Problem: Find the first non-repetitive character in a string.
$string = "abbcabz";
In this case, the function should output "c".
The reason I use concatenation instead of $input[index_to_remove] = '' to remove a character from a given string is that the assignment just leaves an empty cell, so my return value $input[0] does not return the character I want.
For instance,
$str = "abc";
$str[0] = '';
echo $str;
This will output "bc"
But actually if I test,
var_dump($str);
it will give me:
string(3) "bc"
Here is my intention:
Given: input
while first char exists in substring of input {
get index_to_remove
input = chars left of index_to_remove . chars right of index_to_remove
if dupe of first char is not found from substring
remove first char from input
}
return first char of input
Code:
function find_first_non_repetitive2($input) {
while(strpos(substr($input, 1), $input[0]) !== false) {
$index_to_remove = strpos(substr($input,1), $input[0]) + 1;
$input = substr($input, 0, $index_to_remove) . substr($input, $index_to_remove + 1);
if(strpos(substr($input, 1), $input[0]) === false) { // strict check: position 0 is a valid match, not "not found"
$input = substr($input, 1);
}
}
return $input[0];
}
<?php
// In an array mapped character to frequency,
// find the first character with frequency 1.
echo array_search(1, array_count_values(str_split('abbcabz')));
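Python:
def first_non_repeating(s):
    for c in s:
        # a character is non-repeating only if it occurs exactly once in the whole string
        if s.count(c) == 1:
            return c
    return None
Same in PHP:
function find_first_non_repetitive($s)
{
    for ($i = 0; $i < strlen($s); $i++) {
        // a character is non-repetitive only if it occurs exactly once in the whole string
        if (substr_count($s, $s[$i]) === 1)
            return $s[$i];
    }
    return null;
}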
Pseudocode:
Array N;
For each letter in string
if letter not exists in array N
Add letter to array and set its count to 1
else
go to its position in array and increment its count
End for
for each position in array N
if value at position == 1
return the letter at position and exit for loop
else
//do nothing (for clarity)
end for
Basically, you find all distinct letters in the string and associate each one with a count of how many times it occurs in the string. Then you return the first one that has a count of 1.
The complexity of this method is O(n^2) in the worst case if using plain arrays, since each lookup is a linear scan. You can use an associative array to improve its performance.
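The pseudocode above can be sketched in PHP with an associative array as the counter, which keeps each lookup cheap. The function name first_unique_char is made up for illustration, and the sketch works on bytes, so multi-byte characters are not handled.

```php
<?php
// Hypothetical helper: first character that occurs exactly once, or null if there is none.
function first_unique_char(string $s): ?string
{
    $counts = [];
    $len = strlen($s);
    // First pass: count every character.
    for ($i = 0; $i < $len; $i++) {
        $counts[$s[$i]] = ($counts[$s[$i]] ?? 0) + 1;
    }
    // Second pass: walk the string in its original order and return the first count of 1.
    for ($i = 0; $i < $len; $i++) {
        if ($counts[$s[$i]] === 1) {
            return $s[$i];
        }
    }
    return null;
}

echo first_unique_char('abbcabz'); // c
```

Walking the original string in the second pass, rather than the counter array, makes the "first in input order" requirement explicit.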
1- Use a sorting algorithm like mergesort (or quicksort, which has better performance with small inputs)
2- Then check for repeated characters:
non-repeated characters will stand alone
repeated characters will follow each other
Performance: sort + compare
Performance: O(n log n) + O(n) = O(n log n)
For example
$string = "abbcabz"
$string = mergesort ($string)
// $string = "aabbbcz"
Then take the first char from the string and compare it with the next one; if they match, it is repeated,
so move to the next different character and compare again.
The first character that does not match its neighbor is non-repeated.
Note that sorting discards the original order, so this finds the non-repeated character that sorts first, which is not necessarily the one that appears first in the input.
This can be done in much more readable code using some standard PHP functions:
// Count number of occurrences for every character
$counts = count_chars($string);
// Keep only unique ones (create_function() was removed in PHP 8, so use a closure)
$counts = array_filter($counts, function ($n) { return $n == 1; });
// Convert to a list, then to a string containing every unique character
$chars = array_map('chr', array_keys($counts));
$chars = implode($chars);
// Get a string starting from the any of the characters found
// This "strpbrk" is probably the most cryptic part of this code
$substring = strlen($chars) ? strpbrk($string, $chars) : '';
// Get the first character from the new string
$char = strlen($substring) ? $substring[0] : '';
// PROFIT!
echo $char;
$str="abbcade";
$checked= array(); // we will store all checked characters in this array, so we do not have to check them again
for($i=0; $i<strlen($str); $i++)
{
$c=0;
if(in_array($str[$i],$checked)) continue;
$checked[]=$str[$i];
for($j=$i+1;$j<strlen($str);$j++)
{
if($str[$i]==$str[$j])
{
$c=1;
break;
}
}
if($c!=1)
{
echo "First non repetive char is:".$str[$i];
break;
}
}
This should replace your code...
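$array = str_split($string);
$array = array_count_values($array);
// array_filter passes only the value by default; create_function() no longer exists in PHP 8
$array = array_filter($array, function ($count) { return $count == 1; });
// The filtered array keeps the original character order, so its first key is the answer
$first_non_repeated_letter = key($array);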
Edit: spoke too soon. Took out 'array_unique', thought it actually dropped duplicate values. But character order should be preserved to be able to find the first character.
Here's a function in Scala that would do it:
def firstUnique(chars:List[Char]):Option[Char] = chars match {
case Nil => None
case head::tail => {
val filtered = tail filter (_!=head)
if (tail.length == filtered.length) Some(head) else firstUnique(filtered)
}
}
scala> firstUnique("abbcabz".toList)
res5: Option[Char] = Some(c)
And here's the equivalent in Haskell:
firstUnique :: [Char] -> Maybe Char
firstUnique [] = Nothing
firstUnique (head:tail) = let filtered = (filter (/= head) tail) in
if (tail == filtered) then (Just head) else (firstUnique filtered)
*Main> firstUnique "abbcabz"
Just 'c'
You can solve this more generally by abstracting over lists of things that can be compared for equality:
firstUnique :: Eq a => [a] -> Maybe a
Strings are just one such list.
This can also be done using array_key_exists while building an associative array from the string. Each character becomes a key, and its value counts the occurrences.
$sample = "abbcabz";
$check = [];
for($i=0; $i<strlen($sample); $i++)
{
if(!array_key_exists($sample[$i], $check))
{
$check[$sample[$i]] = 1;
}
else
{
$check[$sample[$i]] += 1;
}
}
echo array_search(1, $check);
