Split an array with a regular expression


I'm wondering if it is possible to truncate an array by using a regular expression.
In particular I have an array like this one:
$array = array("AaBa","AaBb","AaBc","AaCa","AaCb","AaCc","AaDa"...);
I have this string:
$str = "AC";
I'd like the slice of $array from the start to the last occurrence of a string matching /A.C./ (in the sample, "AaCc" at index 5):
$result = array("AaBa","AaBb","AaBc","AaCa","AaCb","AaCc");
How can I do this? I thought I might use array_slice, but I don't know how to use a RegEx with it.

Here's my bid
function split_by_contents($ary, $pattern){
if (!is_array($ary)) return FALSE; // brief error checking
// keep track of the last position we had a match, and the current
// position we're searching
$last = -1; $c = 0;
// iterate over the array
foreach ($ary as $k => $v){
// check for a pattern match
if (preg_match($pattern, $v)){
// we found a match, record it
$last = $c;
}
// increment place holder
$c++;
}
// if we found a match, return up until the last match
// if we didn't find one, return what was passed in
return $last != -1 ? array_slice($ary, 0, $last + 1) : $ary;
}
Update
My original answer has a $limit argument that served no purpose. I did originally have a different direction I was going to go with the solution, but decided to keep it simple. However, below is the version that implements that $limit. So...
function split_by_contents($ary, $pattern, $limit = 0){
// really simple error checking
if (!is_array($ary)) return FALSE;
// track the location of the last match, the index of the
// element we're on, and how many matches we have found
$last = -1; $c = 0; $matches = 0;
// iterate over all items (use foreach to keep key integrity)
foreach ($ary as $k => $v){
// test for a pattern match
if (preg_match($pattern, $v)){
// record the last position of a match
$last = $c;
// if there is a specified limit, capture up until
// $limit number of matches, then exit the loop
// and return what we have
if ($limit > 0 && ++$matches == $limit){
break;
}
}
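// increment position counter
$c++;
}
// if we found a match, return up until the last match
// if we didn't find one, return what was passed in
return $last != -1 ? array_slice($ary, 0, $last + 1) : $ary;
}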

I think the easiest way might be with a foreach loop, then using a regex against each value - happy to be proven wrong though!
One alternative could be to implode the array first...
$array = array("AaBa","AaBb","AaBc","AaCa","AaCb","AaCc","AaDa"...);
$string = implode('~~',$array);
//Some regex to split the string up as you want, guessing something like
// '!~~A.C.~~!' will match just what you want?
$result = explode('~~',$string);
If you'd like a hand with the regex I can do, just not 100% on exactly what you're asking - the "A*C*"-->"AaCc" bit I'm not too sure on?

Assuming incremental numeric indices starting from 0
$array = array("AaBa","AaBb","AaBc","AaCa","AaCb","AaCc","AaDa");
$str = "AC";
$regexpSearch = '/^'.implode('.',str_split($str)).'.$/';
$slicedArray = array_slice($array,
0,
array_pop(array_keys(array_filter($array,
function($entry) use ($regexpSearch) {
return preg_match($regexpSearch,$entry);
}
)
)
)+1
);
var_dump($slicedArray);
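This requires PHP >= 5.3.0 (for the closure) and triggers the notice
Strict standards: Only variables should be passed by reference
because array_pop() is handed a function result rather than a variable. If no match is found, it still returns the first element.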
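A possible alternative that sidesteps the notice is sketched below, assuming PHP >= 7.3 for array_key_last(). The function name slice_through_last_match is made up for illustration. preg_grep() does the matching and, unlike the snippet above, the input is returned unchanged when nothing matches (the same choice the first answer makes).

```php
<?php
// Hypothetical helper, not from the original answers.
function slice_through_last_match(array $items, string $pattern): array
{
    // Re-index so that keys equal positions, which array_slice() works with.
    $items = array_values($items);
    // preg_grep() returns the matching entries with their keys preserved.
    $matches = preg_grep($pattern, $items);
    if (!$matches) {
        return $items; // no match: return the input unchanged
    }
    // The key of the last match is its position in $items (PHP >= 7.3).
    return array_slice($items, 0, array_key_last($matches) + 1);
}

$array = array("AaBa", "AaBb", "AaBc", "AaCa", "AaCb", "AaCc", "AaDa");
$result = slice_through_last_match($array, '/^A.C.$/');
```

Here $result holds every element up to and including the last one that matches the pattern.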

Related

recursively get user input value in array values

I am learning recursion and I want to create a search engine that takes a value from the user and returns, from an array, all the values which together make up the word that the user typed.
For example I have this array :
$array = array('it', 'pro', 'gram', 'grammer', 'mer', 'programmer');
$string = "itprogrammer";
If anyone can help I appreciate it a lot. Thank you for your help.

Here is a recursive function that will do what you want. It loops through the array, looking for words that match the beginning of the string. If it finds one, it then recursively tries to find words in the array (excluding the word already matched) which match the string after the first match has been removed.
function find_words($string, $array) {
// if the string is empty, we're done
if (strlen($string) == 0) return array();
$output = array();
for ($i = 0; $i < count($array); $i++) {
// does this word match the start of the string?
if (stripos($string, $array[$i]) === 0) {
$match_len = strlen($array[$i]);
$this_match = array($array[$i]);
// see if we can match the rest of the string with other words in the array
$rest_of_array = array_merge($i == 0 ? array() : array_slice($array, 0, $i), array_slice($array, $i+1));
if (count($matches = find_words(substr($string, $match_len), $rest_of_array))) {
// yes, found a match, return it
foreach ($matches as $match) {
$output[] = array_merge($this_match, $match);
}
}
else {
// was end of string or didn't match anything more, just return the current match
$output[] = $this_match;
}
}
}
// return all matches found (an empty array if there were none)
return $output;
}
You can display the output in the format you desire with:
$wordstrings = array();
if ($words_array = find_words($string, $array)) { // an empty array is falsy
foreach ($words_array as $words) {
$wordstrings[] = implode(', ', $words);
}
echo implode("<br>\n", $wordstrings);
}
else {
echo "No match found!";
}
I made a slightly more complex example (demo on rextester):
$array = array('pro', 'gram', 'merit', 'mer', 'program', 'it', 'programmer');
$strings = array("programmerit", "probdjsabdjsab", "programabdjsab");
Output:
string: 'programmerit' matches:
pro, gram, merit<br>
pro, gram, mer, it<br>
program, merit<br>
program, mer, it<br>
programmer, it
string: 'probdjsabdjsab' matches:
pro
string: 'programabdjsab' matches:
pro, gram<br>
program
Update
Updated code and demo based on OPs comments about not needing to match the whole string.

How to check array data that matches from random characters in php?

I have an array like below:
$fruits = array("apple","orange","papaya","grape");
I have a variable like below:
$content = "apple";
I need a condition like: if this variable matches at least one of the array elements, do something. The catch is that $content is a random rearrangement of the characters of one of the words in the array, like below:
$content = "eaplp"; // a random shuffle of the characters of the actual word "apple"
What I have done so far is below:
$countcontent = count($content);
for($a=0;$a==count($fruits);$a++){
$countarr = count($fruits[$a]);
if($content == $fruits[$a] && $countcontent == $countarr){
echo "we got".$fruits[$a];
}
}
I tried to count how many letters each word has and compare: if the length of the string matches the length of one of the array elements, treat it as a match. Is there something better that we could do?

We can check if an array contains some value with in_array. So you can check if your $fruits array contains the string "apple" with,
in_array("apple", $fruits)
which returns a boolean.
If the order of the letters is random, we can sort the string alphabetically with this function:
function sorted($s) {
$a = str_split($s);
sort($a);
return implode($a);
}
Then map this function to your array and check if it contains the sorted string.
$fruits = array("apple","orange","papaya","grape");
$content = "eaplp";
$inarr = in_array(sorted($content), array_map("sorted", $fruits));
var_dump($inarr);
//bool(true)
Another option is array_search. The benefit from using array_search is that it returns the position of the item (if it's found in the array, else false).
$pos = array_search(sorted($content), array_map("sorted", $fruits));
echo ($pos !== false) ? "$fruits[$pos] found." : "not found.";
//apple found.

This will also work but in a slightly different manner.
I split the strings into arrays and sort them to match each other.
Then I use array_slice to only match the number of characters in $content, if they match it's a match.
This means it will also match, in a "loose" way, "apple juice" or "applecurd".
Not sure this is wanted but figured it could be useful for someone.
$fruits = array("apple","orange","papaya","grape","apple juice", "applecurd");
$content = "eaplp";
$content = str_split($content);
$count = count($content);
foreach($fruits as $fruit){
    $arr_fruit = str_split($fruit);
    // sort $content to match order of $arr_fruit
    $SortCont = array_merge(array_intersect($arr_fruit, $content), array_diff($content, $arr_fruit));
    // if the first n characters match call it a match
    if(array_slice($SortCont, 0, $count) == array_slice($arr_fruit, 0, $count)){
        echo "match: " . $fruit ."\n";
    }
}
output:
match: apple
match: apple juice
match: applecurd
https://3v4l.org/hHvp3
It is also comparable in speed with t.m.adams answer. Sometimes faster sometimes slower, but note how this code can find multiple answers. https://3v4l.org/IbuuD

I think this is the simplest way to answer that question. Some of the algorithms above seem to be "overkill".
$fruits = array("apple","orange","papaya","grape");
$content = "eaplp";
foreach ($fruits as $key => $fruit) {
$fruit_array = str_split($fruit); // split the string into array
$content_array = str_split($content); // split the content into array
// check if there's no difference between the 2 new array
if ( sizeof(array_diff($content_array, $fruit_array)) === 0 ) {
echo "we found the fruit at key: " . $key;
return;
}
}

What about using only native PHP functions.
$index = array_search(count_chars($content), array_map('count_chars', $fruits));
If $index is not false (compare with !== false, because a match at position 0 is also falsy), it is the position of $content inside $fruits.
P.S. Be aware that count_chars might not be the fastest approach to that problem.
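As a hedged sketch of the same idea: count_chars() in mode 1 returns only the bytes that actually occur, so comparing two such arrays checks that both strings use the same characters the same number of times. The function name find_scrambled_index is hypothetical and not part of the answer above.

```php
<?php
// Hypothetical helper: index of the first entry that is a rearrangement
// of $needle, or false when there is none.
function find_scrambled_index(string $needle, array $haystack)
{
    $target = count_chars($needle, 1); // byte value => number of occurrences
    foreach ($haystack as $index => $candidate) {
        if (count_chars($candidate, 1) == $target) {
            return $index;
        }
    }
    return false;
}

$fruits = array("apple", "orange", "papaya", "grape");
$index = find_scrambled_index("eaplp", $fruits);
// Strict comparison, because index 0 is a valid (but falsy) result.
echo ($index !== false) ? $fruits[$index] . " found." : "not found.";
```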

With a random token to search for a value in your array, you have a problem with false positives. That can give misleading results depending on the use case.
For search use cases, for example mistyped words, I would implement a filter that produces an array of matches. If necessary, one could sort the results by Levenshtein distance to fetch the most likely result.
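Ranking by edit distance could look roughly like this. It is only a sketch: the function name rank_by_levenshtein and the sample input are assumptions, while levenshtein() and asort() are PHP built-ins.

```php
<?php
// Hypothetical helper: order candidates by edit distance to $needle.
function rank_by_levenshtein(string $needle, array $candidates): array
{
    $distances = array();
    foreach ($candidates as $candidate) {
        $distances[$candidate] = levenshtein($needle, $candidate);
    }
    asort($distances); // smallest distance first, keys are kept
    return array_keys($distances);
}

$fruits = array("orange", "apple", "grape");
$ranked = rank_by_levenshtein("aple", $fruits);
// $ranked[0] is the candidate closest to the mistyped input
```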
String length solution
Very easy to implement.
False positives: Nearly every string with the same length like apple and grape would match.
Example:
$matching = array_filter($fruits, function ($fruit) use ($content) {
return strlen($content) == strlen($fruit);
});
if (count($matching)) {
// do your stuff...
}
Regular expression solution
It compares the string length and, in a limited way, the characters contained. It is moderately easy to implement and performs well on large data sets.
False positives: A content like abc would match bac but also bbb.
Example:
$matching = preg_grep(
'#^['.preg_quote($content, '#').']{'.strlen($content).'}$#',
$fruits
);
if (count($matching)) {
// do your stuff...
}
Alphanumeric sorting solution
The most accurate approach, but also a slow one in PHP.
False positives: A content like abc would match on bac or cab.
Example:
$normalizer = function ($value) {
$tokens = str_split($value);
sort($tokens);
return implode($tokens);
};
$matching = array_filter($fruits, function ($fruit) use ($content, $normalizer) {
return ($normalizer($fruit) == $normalizer($content));
});
if (count($matching)) {
// do your stuff...
}

Here's a clean approach. Returns unscrambled value early if found, otherwise returns null. Only returns an exact match.
function sortStringAlphabetically($stringToSort)
{
$splitString = str_split($stringToSort);
sort($splitString);
return implode($splitString);
}
function getValueFromRandomised(array $dataToSearch, $dataToFind)
{
$sortedDataToFind = sortStringAlphabetically($dataToFind);
foreach ($dataToSearch as $value) {
if (sortStringAlphabetically($value) === $sortedDataToFind) {
return $value;
}
}
return null;
}
$fruits = ['apple','orange','papaya','grape'];
$content = 'eaplp';
$dataExists = getValueFromRandomised($fruits, $content);
var_dump($dataExists);
// string(5) "apple"
Not found example:
$content = 'eaplpo';
$dataExists = getValueFromRandomised($fruits, $content);
var_dump($dataExists);
// NULL
Then you can use it (eg) like this:
echo !empty($dataExists) ? $dataExists . ' was found' : 'No match found';
NOTE: This is case sensitive, meaning it won't find "Apple" from "eaplp". That can be resolved by applying strtolower() to both values before they are compared.

How about looping through the array, and using a flag to see if it matches?
$flag = false;
foreach($fruits as $fruit){
if($fruit == $content){
$flag = true;
}
}
if($flag == true){
//do something
}

I like t.m.adams answer but I also have a solution for this issue:
array_search_random(string $needle, array $haystack [, bool $strictcase = FALSE ]);
Description: Searches a string in array elements regardless of the position of the characters in the element.
needle: the characters you are looking for as a string
haystack: the array you want to search
strictcase: if set to TRUE needle 'mood' will match 'mood' and 'doom' but not 'Mood' and 'Doom', if set to FALSE (=default) it will match all of these.
Function:
function array_search_random($needle, $haystack, $strictcase=false){
if($strictcase === false){
$needle = strtolower($needle);
}
$needle = str_split($needle);
sort($needle);
$needle = implode($needle);
foreach($haystack as $straw){
if($strictcase === false){
$straw = strtolower($straw);
}
$straw = str_split($straw);
sort($straw);
$straw = implode($straw);
if($straw == $needle){
return true;
}
}
return false;
}

if(in_array("apple", $fruits)){
    // true statement
}else{
    // else statement
}

Efficient way to check if any of the prefixes stored in comma separated list is the prefix of a word

I have a comma separated list of prefixes stored in a variable
$prefixes = "fa,go,urg";
and a word stored in another variable
$word = "good";
Now I want to know an efficient way to check whether any of the prefixes stored in $prefixes is a prefix of $word.
My intention is
If any of the prefixes stored in $prefixes is the prefix of the word stored in $word return TRUE.
If none of the prefixes stored in $prefixes is the prefix of the word stored in $word return FALSE.
Note: the comma-separated list of prefixes is provided by the user through a text box.

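One thing that can be done is to put the prefixes in an array, and then check whether $word is present in the array $preArr using in_array. Note that in_array tests for exact equality, so this only succeeds when $word is itself one of the listed prefixes.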
in_array
(PHP 4, PHP 5, PHP 7)
in_array — Checks if a value exists in an array
$prefixes = "fa,go,urg";
$preArr = explode(',', $prefixes); // Convert to array
$word = "good";
if (in_array($word, $preArr)) {
echo "Success!";
} else {
echo "Failure!";
}

The substr function can achieve the desired result. It extracts the beginning of the word, which can then be compared with each of the prefixes.
From the PHP Manual:
substr — Return part of a string
Description
string substr ( string $string , int $start [, int $length ] )
Returns the portion of string specified by the start and length parameters.
Try this:
$prefixes = "fa,go,urg";
$word = "good";
$arr = explode(',', $prefixes); // Convert to array
$found = false;
foreach ($arr as $prefix) {
    // compare the start of $word with each non-empty prefix
    if ($prefix !== '' && substr($word, 0, strlen($prefix)) === $prefix) {
        $found = true; // "go" is a prefix of "good"
        break;
    }
}

Your problem can be solved in a few ways. The most straightforward is a simple check: iterate over the prefixes and compare each one against the first N characters of $word, where N = strlen($prefixesInArray[$i]).
function inPrefixArr($prefixes, $word) {
    $prefixesInArray = explode(',', $prefixes);
    for ($i = 0; $i < count($prefixesInArray); $i++) {
        // strlen(), not count(), gives the length of a string
        $len = strlen($prefixesInArray[$i]);
        if ($len > 0 && $len <= strlen($word)) {
            if ($prefixesInArray[$i] === substr($word, 0, $len)) {
                return true;
            }
        }
    }
    return false;
}
This checks whether any of the prefixes is a prefix of the given word in O(mn) time, where n is the number of prefixes and m is the maximum prefix length. For a one-off check it is hard to do better in time or space.
As it seems you weren't asking for a theoretical/CS question, there are other interesting ways to implement this in other data structures which can yield better runtimes if you do this repeatedly.
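One way to amortise repeated checks, which is not necessarily what this answer had in mind, is to compile the prefix list once into a single anchored regular expression and reuse it for every word. The sketch below assumes that approach; the function name build_prefix_regex is hypothetical.

```php
<?php
// Hypothetical helper: turn "fa,go,urg" into '/^(?:fa|go|urg)/'.
function build_prefix_regex(string $prefixes): ?string
{
    // Trim whitespace around each entry and drop empty ones.
    $parts = array_filter(array_map('trim', explode(',', $prefixes)), 'strlen');
    if (!$parts) {
        return null; // nothing usable in the list
    }
    // Escape regex metacharacters, since the list comes from user input.
    $quoted = array_map(function ($p) {
        return preg_quote($p, '/');
    }, $parts);
    return '/^(?:' . implode('|', $quoted) . ')/';
}

$regex = build_prefix_regex("fa,go,urg");
$isPrefixed = $regex !== null && preg_match($regex, "good") === 1;
```

Building the pattern costs one pass over the list; each later check is a single preg_match() call.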

Regex for number comparison?

I would like to use a regex that returns true/false depending on whether a 5-digit input matches data in the database. The order of the digits does not matter, but the digits themselves must match exactly.
Eg: In database I have 12345
When I key a 5-digit value into the search, I want to find out whether its digits match the digits of 12345.
If I key in 34152- it should return true
If I key in 14325- it should return true
If I key in 65432- it should return false
If I key in 11234- it should return false
Eg: In database I have 44512
If I key in 21454- it should return true
If I key in 21455- it should return false
How can I do this in PHP with a regex?

This is a way avoiding regex
<?php
function cmpkey($a,$b){
$aa = str_split($a); sort($aa);
$bb = str_split($b); sort($bb);
return ( implode("",$aa) == implode("",$bb));
}
?>

Well, it's not going to be a trivial regex, I can tell you that. You could do something like this:
$chars = count_chars($input, 1);
$numbers = array();
foreach ($chars as $char => $frequency) {
if (is_numeric(chr($char))) {
$numbers[chr($char)] = $frequency;
}
}
// that converts "11234" into array(1 => 2, 2 => 1, 3 => 1, 4 => 1)
Now, since MySQL doesn't support assertions in regex, you'll need to do this in multiple regexes:
$where = array();
foreach ($numbers AS $num => $count) {
$not = "[^$num]";
$regex = "^";
for ($i = 0; $i < $count; $i++) {
$regex .= "$not*$num";
}
$regex .= "$not*\$"; // close with the end-of-string anchor
$where[] = "numberField REGEXP '$regex'";
}
$where = '((' . implode(') AND (', $where).'))';
That'll produce:
(
(numberField REGEXP '^[^1]*1[^1]*1[^1]*$')
AND
(numberField REGEXP '^[^2]*2[^2]*$')
AND
(numberField REGEXP '^[^3]*3[^3]*$')
AND
(numberField REGEXP '^[^4]*4[^4]*$')
)
That should do it for you.
It's not pretty, but it should take care of all of the possible permutations for you, assuming that your stored data format is consistent...
But, depending on your needs, you should try to pull it out and process it in PHP. In which case the regex would be far simpler:
^(?=.*1.*1)(?=.*2)(?=.*3)(?=.*4)\d{5}$
Or, you could also pre-sort the number before you insert it. So instead of inserting 14231, you'd insert 11234. That way, you always know the sequence is ordered properly, so you just need to do numberField = '11234' instead of that gigantic beast above...
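The pre-sorting idea can be sketched as follows. The helper name sort_digits is hypothetical; the same function would be applied both when storing a number and when normalising the search input, so that a plain equality check suffices.

```php
<?php
// Hypothetical helper: return the digits of $number in ascending order.
function sort_digits(string $number): string
{
    $digits = str_split($number);
    sort($digits, SORT_STRING);
    return implode('', $digits);
}

$stored = sort_digits("14231");                   // value to insert
$matches = sort_digits("34152") === sort_digits("12345");
```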

Try using
^(?=.*1)(?=.*2)(?=.*3)(?=.*4)(?=.*5).{5}$
This will get much more complicated, when you have duplicate numbers.
You really should not do this with regex. =)
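To illustrate how the pattern grows once duplicates are involved, here is a rough sketch that generates one lookahead per distinct digit, each demanding an exact number of occurrences. The function name build_permutation_regex is an assumption, and this is meant for PCRE in PHP, not for MySQL.

```php
<?php
// Hypothetical helper: build a PCRE pattern that matches any rearrangement
// of the digits in $stored, duplicates included.
function build_permutation_regex(string $stored): string
{
    if (!preg_match('/^\d+\z/', $stored)) {
        throw new InvalidArgumentException('digits only');
    }
    $lookaheads = '';
    // count_chars() mode 1: byte value => number of occurrences
    foreach (count_chars($stored, 1) as $byte => $count) {
        $d = chr($byte);
        // exactly $count occurrences of $d, with only other digits in between
        $lookaheads .= sprintf('(?=(?:[^%1$s]*%1$s){%2$d}[^%1$s]*$)', $d, $count);
    }
    return '/^' . $lookaheads . '\d{' . strlen($stored) . '}$/';
}

$regex = build_permutation_regex("44512");
$ok = preg_match($regex, "21454") === 1;
```

Even for five digits the generated pattern is long, which supports the point that sorting or counting the digits is the simpler route.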

How to find first non-repetitive character from a string?

I've spent half day trying to figure out this and finally I got working solution.
However, I feel like this can be done in simpler way.
I think this code is not really readable.
Problem: Find first non-repetitive character from a string.
$string = "abbcabz";
In this case, the function should output "c".
The reason I use concatenation instead of $input[index_to_remove] = '' to remove a character from the string is that the assignment just leaves an empty cell, so my return value $input[0] does not return the character I want.
For instance,
$str = "abc";
$str[0] = '';
echo $str;
This will output "bc"
But actually if I test,
var_dump($str);
it will give me:
string(3) "bc"
Here is my intention:
Given: input
while first char exists in substring of input {
get index_to_remove
input = chars left of index_to_remove . chars right of index_to_remove
if dupe of first char is not found from substring
remove first char from input
}
return first char of input
Code:
function find_first_non_repetitive2($input) {
while(strpos(substr($input, 1), $input[0]) !== false) {
$index_to_remove = strpos(substr($input,1), $input[0]) + 1;
$input = substr($input, 0, $index_to_remove) . substr($input, $index_to_remove + 1);
if(strpos(substr($input, 1), $input[0]) == false) {
$input = substr($input, 1);
}
}
return $input[0];
}

<?php
// In an array mapped character to frequency,
// find the first character with frequency 1.
echo array_search(1, array_count_values(str_split('abbcabz')));

Python:
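def first_non_repeating(s):
    for i, c in enumerate(s):
        # c must not occur earlier (its first occurrence is at i) or later
        if s.find(c) == i and s.find(c, i+1) < 0:
            return c
    return None
Same in PHP:
function find_first_non_repetitive($s)
{
    for ($i = 0; $i < strlen($s); $i++) {
        // $s[$i] must not occur earlier or later in the string
        if (strpos($s, $s[$i]) === $i && strpos($s, $s[$i], $i+1) === FALSE)
            return $s[$i];
    }
    return null;
}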

Pseudocode:
Array N;
For each letter in string
if letter not exists in array N
Add letter to array and set its count to 1
else
go to its position in array and increment its count
End for
for each position in array N
if value at position == 1
return the letter at position and exit for loop
else
//do nothing (for clarity)
end for
Basically, you find all distinct letters in the string, and for each letter, you associate it with a count of how many of that letter exist in the string. then you return the first one that has a count of 1
The complexity of this method is O(n^2) in the worst case if using arrays. You can use an associative array to improve its performance.

1- Use a sorting algorithm like mergesort (quicksort performs better on small inputs)
2- Then check for repeated characters
non-repeated characters will stand alone
repeated ones will follow each other
Performance : sort + compare
Performance : O(n log n) + O(n) = O(n log n)
For example
$string = "abbcabz";
$string = mergesort($string);
// $string = "aabbbcz"
Then take the first char from the string and compare it with the next one; if they match, it is repeated,
so move to the next different character and compare again.
The first character with no matching neighbour is non-repeated. Note that sorting discards the original order, so this finds the smallest non-repeated character, which is not necessarily the first one in the input.

This can be done in much more readable code using some standard PHP functions:
// Count number of occurrences for every character
$counts = count_chars($string);
// Keep only unique ones (a closure replaces create_function, which was removed in PHP 8)
$counts = array_filter($counts, function ($n) { return $n == 1; });
// Convert to a list, then to a string containing every unique character
$chars = array_map('chr', array_keys($counts));
$chars = implode($chars);
// Get a string starting from any of the characters found
// This "strpbrk" is probably the most cryptic part of this code
$substring = strlen($chars) ? strpbrk($string, $chars) : '';
// Get the first character from the new string
$char = strlen($substring) ? $substring[0] : '';
// PROFIT!
echo $char;

$str="abbcade";
$checked= array(); // we will store all checked characters in this array, so we do not have to check them again
for($i=0; $i<strlen($str); $i++)
{
$c=0;
if(in_array($str[$i],$checked)) continue;
$checked[]=$str[$i];
for($j=$i+1;$j<strlen($str);$j++)
{
if($str[$i]==$str[$j])
{
$c=1;
break;
}
}
if($c!=1)
{
echo "First non-repeated char is:".$str[$i];
break;
}
}

This should replace your code...
$array = str_split($string);
$array = array_count_values($array);
$array = array_filter($array, function ($val) { return $val == 1; });
// key() returns the first remaining key, i.e. the first non-repeated letter
$first_non_repeated_letter = key($array);
Edit: spoke too soon. Took out 'array_unique', thought it actually dropped duplicate values. But character order should be preserved to be able to find the first character.

Here's a function in Scala that would do it:
def firstUnique(chars:List[Char]):Option[Char] = chars match {
case Nil => None
case head::tail => {
val filtered = tail filter (_!=head)
if (tail.length == filtered.length) Some(head) else firstUnique(filtered)
}
}
scala> firstUnique("abbcabz".toList)
res5: Option[Char] = Some(c)
And here's the equivalent in Haskell:
firstUnique :: [Char] -> Maybe Char
firstUnique [] = Nothing
firstUnique (head:tail) = let filtered = (filter (/= head) tail) in
if (tail == filtered) then (Just head) else (firstUnique filtered)
*Main> firstUnique "abbcabz"
Just 'c'
You can solve this more generally by abstracting over lists of things that can be compared for equality:
firstUnique :: Eq a => [a] -> Maybe a
Strings are just one such list.

Can be also done using array_key_exists during building an associative array from the string. Each character will be a key and will count the number as value.
$sample = "abbcabz";
$check = [];
for($i=0; $i<strlen($sample); $i++)
{
if(!array_key_exists($sample[$i], $check))
{
$check[$sample[$i]] = 1;
}
else
{
$check[$sample[$i]] += 1;
}
}
echo array_search(1, $check);

We Keep Coding
