"Unfolding" a String - php

I have a set of strings; each string has a variable number of segments separated by pipes (|), e.g.:
$string = 'abc|b|ac';
Each segment with more than one char should be expanded into all the possible one-char combinations. For 3 segments, the following "algorithm" works wonderfully:
$result = array();
$string = explode('|', 'abc|b|ac');
foreach (str_split($string[0]) as $i)
{
    foreach (str_split($string[1]) as $j)
    {
        foreach (str_split($string[2]) as $k)
        {
            $result[] = implode('|', array($i, $j, $k)); // more...
        }
    }
}
print_r($result);
Output:
$result = array('a|b|a', 'a|b|c', 'b|b|a', 'b|b|c', 'c|b|a', 'c|b|c');
Obviously, for more than 3 segments the code starts to get extremely messy, since I need to add (and check) more and more inner loops. I tried coming up with a dynamic solution but I can't figure out how to generate the correct combination for all the segments (individually and as a whole). I also looked at some combinatorics source code but I'm unable to combine the different combinations of my segments.
I'd appreciate it if anyone could point me in the right direction.

Recursion to the rescue (you might need to tweak a bit to cover edge cases, but it works):
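function explodinator($str) {
    // Split into segments, then split each segment into its characters
    $segments = explode('|', $str);
    $pieces = array_map('str_split', $segments);
    return e_helper($pieces);
}
function e_helper($pieces) {
    // Base case: a single segment expands to its own characters
    if (count($pieces) == 1)
        return $pieces[0];
    $first = array_shift($pieces);
    $subs = e_helper($pieces);
    $result = array(); // initialize so the function never returns an undefined variable
    foreach ($first as $char) {
        foreach ($subs as $sub) {
            $result[] = $char . '|' . $sub;
        }
    }
    return $result;
}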
print_r(explodinator('abc|b|ac'));
Outputs:
Array
(
[0] => a|b|a
[1] => a|b|c
[2] => b|b|a
[3] => b|b|c
[4] => c|b|a
[5] => c|b|c
)
As seen on ideone.

This looks like a job for recursive programming! :P
I first looked at this and thought it was going to be a one-liner (and it probably is in Perl).
There are other non-recursive ways (enumerate all combinations of indexes into segments then loop through, for example) but I think this is more interesting, and probably 'better'.
$str = explode('|', 'abc|b|ac');
$strlen = count( $str );
$results = array();
function splitAndForeach( $bchar , $oldindex, $tempthread) {
global $strlen, $str, $results;
$temp = $tempthread;
$newindex = $oldindex + 1;
if ( $bchar != '') { array_push($temp, $bchar ); }
if ( $newindex <= $strlen ){
print "starting foreach loop on string '".$str[$newindex-1]."' \n";
foreach(str_split( $str[$newindex - 1] ) as $c) {
print "Going into next depth ($newindex) of recursion on char $c \n";
splitAndForeach( $c , $newindex, $temp);
}
} else {
$found = implode('|', $temp);
print "Array length (max recursion depth) reached, result: $found \n";
array_push( $results, $found );
$temp = $tempthread;
$index = 0;
print "***************** Reset index to 0 *****************\n\n";
}
}
splitAndForeach('', 0, array() );
print "your results: \n";
print_r($results);

You could have two arrays: the alternatives and a current counter.
$alternatives = array(array('a', 'b', 'c'), array('b'), array('a', 'c'));
$counter = array(0, 0, 0);
Then, in a loop, you increment the "last digit" of the counter, and if that is equal to the number of alternatives for that position, you reset that "digit" to zero and increment the "digit" to its left. This works just like counting with decimal numbers.
The string for each step is built by concatenating the $alternatives[$i][$counter[$i]] for each digit.
You are finished when the "first digit" becomes as large as the number of alternatives for that digit.
Example: for the above variables, the counter would get the following values in the steps:
0,0,0
0,0,1
1,0,0 (overflow in the last two digits)
1,0,1
2,0,0 (overflow in the last two digits)
2,0,1
3,0,0 (finished, since the first "digit" has only 3 alternatives)
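The counting scheme above can be sketched as follows. This is only an illustration under some assumptions: the function name unfoldWithCounter is made up for this example, every segment is assumed to be non-empty, and instead of stopping when the first "digit" reaches its limit, the sketch stops when the carry runs off the left end, which amounts to the same thing.

```php
<?php
// Hypothetical helper (not from the answer above): expands 'abc|b|ac'
// by counting through one "digit" per segment, like an odometer.
// Assumes every segment contains at least one character.
function unfoldWithCounter(string $input): array
{
    // One list of alternatives per segment, e.g. [['a','b','c'], ['b'], ['a','c']].
    $alternatives = array_map('str_split', explode('|', $input));
    $counter = array_fill(0, count($alternatives), 0);
    $results = [];

    while (true) {
        // Build the string for the current counter state.
        $chars = [];
        foreach ($counter as $i => $digit) {
            $chars[] = $alternatives[$i][$digit];
        }
        $results[] = implode('|', $chars);

        // Increment the last digit; on overflow reset it and carry to the left.
        $pos = count($alternatives) - 1;
        while ($pos >= 0 && ++$counter[$pos] === count($alternatives[$pos])) {
            $counter[$pos] = 0;
            $pos--;
        }
        if ($pos < 0) {
            break; // the first digit overflowed, so every combination was emitted
        }
    }
    return $results;
}

print_r(unfoldWithCounter('abc|b|ac'));
```

For 'abc|b|ac' this yields the same six strings, in the same order, as the nested loops in the question. Because the state is a flat array rather than a call stack, the number of segments does not affect the shape of the code.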

Related

Trying to fix SKU numbers from 55A_3 to A55_3 with PHP in SQL

I have SKU numbers imported from a CSV file into SQL DB.
The patterns look like:
55A_3
345W_1+04B_1
128T_2+167T_2+113T_8+115T_8
I am trying to move all the letters in front of the numbers.
like:
A55_3
W345_1+B04_1
T128_2+T167_2+T113_8+T115_8
My best idea was to search for 345W and so on, and replace each match with W345 and so on:
$sku = "345W_1+04B_1";
$B_range_num = range(0,400);
$B_range_let = range("A","Z");
then generating the find and replace arrays
$B_find =
$B_replace =
maybe just using str_replace??
$res = str_replace($B_find,$B_replace,$sku);
Result should be for all SKU numbers
W345_1+B04_1
Any ideas?
You can use preg_replace to do this job: look for some digits followed by some letters, which are in turn followed by either an _ with digits, a +, or the end of the string, and then swap the order of the digits and letters:
$skus = array('55A_3',
'345W_1+04B_1',
'128T_2+167T_2+113T_8+115T_8',
'55A');
foreach ($skus as &$sku) {
$sku = preg_replace('/(\d+)([A-Z]+)(?=_\d+|\+|$)/', '$2$1', $sku);
}
print_r($skus);
Output:
Array
(
[0] => A55_3
[1] => W345_1+B04_1
[2] => T128_2+T167_2+T113_8+T115_8
[3] => A55
)
Demo on 3v4l.org
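Here I defined a function that handles this specific format for any number of +-separated parts. Note that it assumes each part has exactly one trailing letter before the underscore.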
$str = '128T_2+167T_2+113T_8+115T_8';
echo convertProductSku($str);
function convertProductSku($str) {
$arr = [];
$parts = explode('+', $str);
foreach ($parts as $part) {
list($first, $second) = array_pad(explode('_', $part), 2, null);
$letter = substr($first, -1);
$number = substr($first, 0, -1);
$arr[] = $letter . $number . ($second ? '_' . $second : '');
}
return implode('+', $arr);
}

Delete duplicate words in array with sentences in PHP

I have a string array with words and sentences included.
For example:
array("dog","cat","the dog is running","some other text","some","text")
And I want to remove duplicate words, leaving only unique words in it. I want to remove these words even in sentences.
The result should look like:
array("dog","cat","the is running","other","some","text")
I tried the array_unique function but it didn't work.
You can use array_unique after a loop that combines explode and array_push:
$res = [];
foreach($arr as $e) {
array_push($res, ...explode(" ", $e));
}
print_r(array_unique($res));
Reference:
array_push, explode, array_unique
Live example: 3v4l
If you want to keep the sentences use:
$arr = array("dog","cat","the dog is running","some other text","some","text");
// sort first to get the shortest sentence first
usort($arr, function ($a, $b) {return count(explode(" ", $a)) - count(explode(" ", $b)); });
$words = [];
$res = [];
foreach($arr as $e) {
$res[] = trim(preg_replace('/\s+/', ' ', strtr($e, $words))); // strip words already seen, then collapse leftover spaces
foreach(explode(" ", $e) as $w)
$words[$w] = ''; // map every new word to an empty string so later entries lose it
}
print_r($res);
This solution is not pretty, but should get the job done and meet some of the edge cases at hand. I'm assuming that no more than one space separates words in a sentence string and that you want to preserve original ordering.
The approach is to walk the array twice, once to filter out duplicate single words, then once again to filter out duplicate words in sentences. This guarantees priority for single words. Finally, ksort the array (this is the ugly part from a time complexity standpoint: everything is O(max_len_sentence * n) up until now).
$arr = ["dog","cat","the dog is running","some other text","some","text"];
$seen = [];
$result = [];
foreach ($arr as $i => $e) {
if (preg_match("/^\w+$/", $e) &&
!array_key_exists($e, $seen)) {
$result[$i] = $e;
$seen[$e] = 1;
}
}
foreach ($arr as $i => $e) {
$words = explode(" ", $e);
if (count($words) > 1) {
$filtered = [];
foreach ($words as $word) {
if (!array_key_exists($word, $seen)) {
$seen[$word] = 0;
}
if (++$seen[$word] < 2) {
$filtered[]= $word;
}
}
if ($filtered) {
$result[$i] = implode(" ", $filtered); // separator first: the reversed argument order was removed in PHP 8
}
}
}
ksort($result);
$result = array_values($result);
print_r($result);
Output
Array
(
[0] => dog
[1] => cat
[2] => the is running
[3] => other
[4] => some
[5] => text
)

How to display specific letters from a list of words in a loop?

I received this challenge in an interview and I would like some help solving it.
Using the input string: PHP CODING TECH, produce the following output.
PCT
PHCT
PHPCT
PHPCOT
PHPCODT
PHPCODIT
PHPCODINT
PHPCODINGT
PHPCODINGTE
PHPCODINGTEC
PHPCODINGTECH
As I understand it, the logic is to explode the input string on the spaces and then in a loop structure, display the leading letter(s) of each word as a single string. During each iteration (after the first), the earliest incomplete word displays an additional leading letter.
This is my coding attempt:
$str = "PHP CODING TECH";
$a = explode(' ', $str);
for ($i=0; $i < count($a); $i++) {
for ($j=0; $j < strlen($a[$i]) ; $j++) {
//echo "<pre>";
$b[$i][$j] = explode(' ', $a[$i][$j]);
}
}
echo "<pre>";
print_r($b);
Code: (Demo) (or with DO-WHILE())
$input = "PHP CODING TECH";
$counters = array_fill_keys(explode(' ', $input), 1); // ['PHP' => 1, 'CODING' => 1, 'TECH' => 1]
$bump = false; // permit outer loop to run
while (!$bump) { // while still letters to output....
$bump = true; // stop after this iteration unless more letters to output
foreach ($counters as $word => &$len) { // $len is mod-by-ref for incrementing
echo substr($word, 0, $len); // echo letters using $len
if ($bump && isset($word[$len])) { // if no $len has been incremented during inner loop...
++$len; // increment this word's $len
$bump = false; // permit outer loop to run again
}
}
echo "\n"; // separate outputs
}
Output:
PCT
PHCT
PHPCT
PHPCOT
PHPCODT
PHPCODIT
PHPCODINT
PHPCODINGT
PHPCODINGTE
PHPCODINGTEC
PHPCODINGTECH
Explanation:
I am generating an array of words and initial lengths from the exploded input string. $bump is dual-purpose; it not only controls the outer loop, it also dictates the word which gets a length increase within the inner loop. $len is "modifiable by reference" so that any given word's $len value can be incremented and stored for use in the next iteration. isset() is used on $word[$len] to determine if the current word has more available letters to output in the next iteration; if not, the next word gets a chance (until all words are fully displayed).
And while I was waiting for this page to be reopened, I whacked together an alternative method:
$input = "PHP CODING TECH";
$words = explode(' ', $input); // generates: ['PHP', 'CODING', 'TECH']
$master = ''; // initialize for first offset and then concatenation
foreach ($words as $word) {
$offsets[] = strlen($master); // after loop, $offsets = [0, 3, 9]
$master .= $word; // after loop, $master = 'PHPCODINGTECH'
}
$master_offsets = range(0, strlen($master)); // generates: [0,1,2,3,4,5,6,7,8,9,10,11,12]
do {
foreach ($offsets as $offset) {
echo $master[$offset];
}
echo "\n";
} while ($master_offsets !== ($offsets = array_intersect($master_offsets, array_merge($offsets, [current(array_diff($master_offsets, $offsets))])))); // add first different offset from $master_offsets to $offsets until they are identical

php remove duplicate words in an array

Sorry, English is not my native language, so the question title may not be very good. I want to do something like this.
$str = array("Lincoln Crown","Crown Court","go holiday","house fire","John Hinton","Hinton Jailed");
Here is an array. "Lincoln Crown" contains "Lincoln" and "Crown", so any later element that contains either of these two words should be removed; "Crown Court" (which contains "Crown") is therefore removed.
Likewise, "John Hinton" contains "John" and "Hinton", so "Hinton Jailed" (which contains "Hinton") is removed. The final output should look like this:
$output = array("Lincoln Crown","go holiday","house fire","John Hinton");
My PHP skills are not good, and this is not as simple as using array_unique() or array_diff(), so I am asking for help. Thanks.
I think this might work :P
function cool_function($strs){
// Black list
$toExclude = array();
foreach($strs as $s){
// If it's not on blacklist, then search for it
if(!in_array($s, $toExclude)){
// Explode into blocks
foreach(explode(" ",$s) as $block){
// Search the block on array
$found = preg_grep("/" . preg_quote($block, "/") . "/", $strs); // pass the delimiter so a "/" inside a word is escaped too
foreach($found as $k => $f){
if($f != $s){
// Place each found item that's different from current item into blacklist
$toExclude[$k] = $f;
}
}
}
}
}
// Unset all keys that were found
foreach($toExclude as $k => $v){
unset($strs[$k]);
}
// Return the result
return $strs;
}
$strs = array("Lincoln Crown","Crown Court","go holiday","house fire","John Hinton","Hinton Jailed");
print_r(cool_function($strs));
Dump:
Array
(
[0] => Lincoln Crown
[2] => go holiday
[3] => house fire
[4] => John Hinton
)
Seems like you would need a loop and then build a list of words in the array.
Like:
<?php
// Store existing array's words; elements will compare their words to this array
// if an element's words are already in this array, the element is deleted
// else the element has its words added to this array
$arrayWords = array();
// Loop through your existing array of elements
foreach ($existingArray as $key => $phrase) {
// Get element's individual words
$words = explode(" ", $phrase);
// Assume the element will not be deleted
$keepWords = true;
// Loop through the element's words
foreach ($words as $word) {
// If one of the words is already in arrayWords (another element uses the word)
if (in_array($word, $arrayWords)) {
// Delete the element
unset($existingArray[$key]);
// Indicate we are not keeping any of the element's words
$keepWords = false;
// Stop the foreach loop
break;
}
}
// Only add the element's words to arrayWords if the entire element stays
if ($keepWords) {
$arrayWords = array_merge($arrayWords, $words);
}
}
?>
Here is how I would do it in your case:
$words = array();
foreach($str as $key =>$entry)
{
$entryWords = explode(' ', $entry);
$isDuplicated = false;
foreach($entryWords as $word)
if(in_array($word, $words))
$isDuplicated = true;
if(!$isDuplicated)
$words = array_merge($words, $entryWords);
else
unset($str[$key]);
}
var_dump($str);
Output:
array (size=4)
0 => string 'Lincoln Crown' (length=13)
2 => string 'go holiday' (length=10)
3 => string 'house fire' (length=10)
4 => string 'John Hinton' (length=11)
I can imagine quite a few techniques that can provide your desired output, but the logic that you require is poorly defined in your question. I am assuming that whole word matching is required -- so word boundaries should be used in any regex patterns. Case sensitivity isn't mentioned. I am unsure if only fully unique elements (multi-word strings) should have their words entered into the black list. I'll offer a few snippets, but choosing the appropriate technique will depend on exact logical requirements.
Demo
$output = [];
$blacklist = [];
foreach ($input as $string) {
if (!$blacklist || !preg_match('/\b(?:' . implode('|', $blacklist) . ')\b/', $string)) {
$output[] = $string;
}
foreach(explode(' ', $string) as $word) {
$blacklist[$word] = preg_quote($word);
}
}
var_export($output);
Demo
$output = [];
$blacklist = [];
foreach ($input as $string) {
$words = explode(' ', $string);
foreach ($words as $word) {
if (in_array($word, $blacklist)) {
continue 2;
}
}
array_push($blacklist, ...$words);
$output[] = $string;
}
var_export($output);
And my favorite because it performs fewest iterations in the parent loop, is more compact, and doesn't require the declaration/maintenance of a blacklist array.
Demo
$output = [];
while ($input) {
$output[] = $words = array_shift($input);
$input = preg_grep('~\b(?:\Q' . str_replace(' ', '\E|\Q', $words) . '\E)\b~', $input, PREG_GREP_INVERT);
}
var_export($output);
You can explode each string in the original array and then compare per-words using a loop (comparing each word from one array with each word from another, and if they match, remove the whole array).
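The idea in the previous paragraph might look roughly like this. It is a sketch under assumptions the question leaves open: matching is case-sensitive and on whole words, words are separated by single spaces, and only the words of strings that are kept can cause later strings to be removed. The function name dropStringsSharingWords is made up for this example.

```php
<?php
// Hypothetical helper: keeps a string only if it shares no word
// with any string that was kept before it.
function dropStringsSharingWords(array $strings): array
{
    $kept = [];
    $seenWords = [];
    foreach ($strings as $string) {
        $words = explode(' ', $string);
        // array_intersect() returns the words that were already claimed
        // by an earlier kept string; a non-empty result means a clash.
        if (array_intersect($words, $seenWords)) {
            continue; // shares at least one word, so drop the whole string
        }
        $kept[] = $string;
        $seenWords = array_merge($seenWords, $words);
    }
    return $kept;
}

$str = array("Lincoln Crown", "Crown Court", "go holiday", "house fire", "John Hinton", "Hinton Jailed");
print_r(dropStringsSharingWords($str));
```

With the array from the question this keeps "Lincoln Crown", "go holiday", "house fire" and "John Hinton". Unlike unset(), it returns a re-indexed array.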
array_unique() example
<?php
$input = array("a" => "green", "red", "b" => "green", "blue", "red");
$result = array_unique($input);
print_r($result);
?>
output:
Array
(
[a] => green
[0] => red
[1] => blue
)
Source

Find most repeated sub strings in array

I have an array:
$myArray=array(
'hello my name is richard',
'hello my name is paul',
'hello my name is simon',
'hello it doesn\'t matter what my name is'
);
I need to find the sub string (min 2 words) that is repeated the most often, maybe in an array format, so my return array could look like this:
$return=array(
array('hello my', 3),
array('hello my name', 3),
array('hello my name is', 3),
array('my name', 4),
array('my name is', 4),
array('name is', 4),
);
So I can see from this array of arrays how often each string was repeated amongst all strings in the array.
Is the only way to do it like this?..
function repeatedSubStrings($array){
foreach($array as $string){
$phrases=//Split each string into maximum number of sub strings
foreach($phrases as $phrase){
//Then count the $phrases that are in the strings
}
}
}
I've tried a solution similar to the above but it was too slow, processing around 1000 rows per second, can anyone do it faster?
A solution to this might be
function getHighestRecurrence($strs){
/*Storage for individual words*/
$words = Array();
/*Process multiple strings*/
if(is_array($strs))
foreach($strs as $str)
$words = array_merge($words, explode(" ", $str));
/*Prepare single string*/
else
$words = explode(" ",$strs);
/*Array for word counters*/
$index = Array();
/*Aggregate word counters*/
foreach($words as $word)
/*Increment count or create if it doesn't exist*/
(isset($index[$word]))? $index[$word]++ : $index[$word] = 1;
/*Sort array by highest value*/
arsort($index);
/*Return the word*/
return key($index);
}
While this has a higher runtime, I think it's simpler from an implementation perspective:
$substrings = array();
foreach ($myArray as $str)
{
$subArr = explode(" ", $str);
for ($i=0;$i<count($subArr);$i++)
{
$substring = "";
for ($j=$i;$j<count($subArr);$j++)
{
if ($i==0 && ($j==count($subArr)-1))
break;
$substring = trim($substring . " " . $subArr[$j]);
if (str_word_count($substring, 0) > 1)
{
if (array_key_exists($substring, $substrings))
$substrings[$substring]++;
else
$substrings[$substring] = 1;
}
}
}
}
arsort($substrings);
print_r($substrings);
I'm assuming by "substring" you really mean "substring split along word boundaries" since that's what your example shows.
In that case, assuming any maximum repeated substring will do (since there may be ties), you can always choose just a single word as a maximum repeated substring, if you think about it. For any phrase "A B", the phrases "A" and "B" individually must occur at least as often as "A B" because they both occur every time "A B" does and they may occur at other times. Therefore, a single word must have a count that at least ties with any substring that contains that word.
So you just need to split all phrases into a set of unique words, and then just count the words and return one of the words with the highest count. This will run way faster than actually counting every possible substring.
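A minimal sketch of that word-counting idea follows. It takes the reasoning above at face value, so it ignores the two-word minimum from the question, and ties are broken arbitrarily. The function name mostFrequentWord is made up for this example, and array_key_first() needs PHP 7.3 or later.

```php
<?php
// Hypothetical helper: returns one word with the highest total count
// across all strings (an empty string if there are no words at all).
function mostFrequentWord(array $strings): string
{
    $words = [];
    foreach ($strings as $string) {
        // Split on runs of whitespace and ignore empty pieces.
        foreach (preg_split('/\s+/', $string, -1, PREG_SPLIT_NO_EMPTY) as $word) {
            $words[] = $word;
        }
    }
    $counts = array_count_values($words); // word => number of occurrences
    arsort($counts);                      // highest count first
    return (string) array_key_first($counts);
}

$myArray = array(
    'hello my name is richard',
    'hello my name is paul',
    'hello my name is simon',
    'hello it doesn\'t matter what my name is'
);
echo mostFrequentWord($myArray), "\n";
```

Each word is visited once, so the counting is linear in the total number of words; the final sort is over distinct words only.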
This should run in O(n) time
$twoWordPhrases = function($str) {
    $words = preg_split('#\s+#', $str, -1, PREG_SPLIT_NO_EMPTY);
    $phrases = array();
    // A plain for loop yields nothing for fewer than two words,
    // whereas range(0, -1) would count downward and produce bogus phrases
    for ($offset = 0; $offset < count($words) - 1; $offset++) {
        $phrases[] = array_slice($words, $offset, 2);
    }
    return $phrases;
};
$frequencies = array();
foreach ($myArray as $str) {
    $phrases = $twoWordPhrases($str);
    foreach ($phrases as $phrase) {
        // Use the joined pair as the array key for counting
        $key = join('/', $phrase);
        if (!isset($frequencies[$key])) {
            $frequencies[$key] = 0;
        }
        $frequencies[$key]++;
    }
}
print_r($frequencies);
