Could you tell me why Codility gives me the following error, please?
Running solution... Compilation successful.
Example test: [-1, 3, -4, 5, 1, -6, 2, 1] Output (stderr): Invalid
result type, int expected. RUNTIME ERROR (tested program terminated
unexpectedly)
Detected some errors.
My solution was written in PHP.
function solution($A) {
$N = count($A);
$Ps = array();
foreach ( $A as $KeyP => $P ) {
$sum = 0;
if ( $KeyP == 0 ) {
for ( $x = 1; $x < $N; $x++ ) {
$sum += $A[$x];
}
if ( $sum == $P ) {
$Ps[] = $KeyP;
}
}
else {
if ( ($KeyP+1) == $N ) {
for ( $z = 0; $z < $KeyP; $z++) :
$sum += $A[$z];
endfor;
if ( ( $sum >= 0 ) AND ( $sum < $N ) ) {
$Ps[] = $KeyP;
}
}
else {
$sum1 = 0;
$sum2 = 0;
for ( $z = 0; $z < $KeyP; $z++ ) :
$sum1 += $A[$z];
endfor;
for ( $y = ( $KeyP+1 ); $y <= ($N-1); $y++ ) :
$sum2 += $A[$y];
endfor;
if ( $sum1 == $sum2 ) {
if ( $sum1 < $N ) {
$Ps[] = $KeyP;
}
}
}
}
}
return ( count($Ps) <= 0 ) ? -1: $Ps;
}
Given the following array, my function produced this output:
array(-1, 3, -4, 5, 1, -6, 2, 1);
Output
Array ( [0] => 1 [1] => 3 [2] => 7 )
It's just what the task asks for, but Codility shows me all those errors.
The demo task is below:
This is a demo task.
A zero-indexed array A consisting of N integers is given. An
equilibrium index of this array is any integer P such that 0 ≤ P < N
and the sum of elements of lower indices is equal to the sum of
elements of higher indices, i.e. A[0] + A[1] + ... + A[P−1] = A[P+1]
+ ... + A[N−2] + A[N−1]. Sum of zero elements is assumed to be equal to 0. This can happen if P = 0 or if P = N−1.
For example, consider the following array A consisting of N = 8
elements:
A[0] => -1
A[1] => 3
A[2] => -4
A[3] => 5
A[4] => 1
A[5] => -6
A[6] => 2
A[7] => 1
P = 1 is an equilibrium index of this array, because:
A[0] = −1 = A[2] + A[3] + A[4] + A[5] + A[6] + A[7]
P = 3 is an equilibrium index of this array, because:
A[0] + A[1] + A[2] = −2 = A[4] + A[5] + A[6] + A[7]
P = 7 is also an equilibrium index, because:
A[0] + A[1] + A[2] + A[3] + A[4] + A[5] + A[6] = 0 and there are no
elements with indices greater than 7.
P = 8 is not an equilibrium index, because it does not fulfill the
condition 0 ≤ P < N.
Write a function:
function solution($A);
that, given a zero-indexed array A consisting of N integers, returns
any of its equilibrium indices. The function should return −1 if no
equilibrium index exists.
For example, given array A shown above, the function may return 1, 3
or 7, as explained above.
Assume that:
N is an integer within the range [0..100,000]; each element of array A
is an integer within the range [−2,147,483,648..2,147,483,647].
Complexity:
expected worst-case time complexity is O(N); expected worst-case space
complexity is O(N), beyond input storage (not counting the storage
required for input arguments). Elements of input arrays can be
modified.
Thank you.
For the Codility error, please check this post:
https://stackoverflow.com/a/19804284/4369087
Try this one. I made the code more readable by introducing two functions,
sumRight() and sumLeft(), in which I use built-in PHP functions.
array_sum(): calculates the sum of values in an array;
it returns the sum as an integer or float, or 0 if the array is empty.
array_slice(): extracts a slice of the array; it returns
the sequence of elements from the array as specified by the
offset and length parameters.
So every time we loop over the array, we calculate the sums of the left and right slices of the array around the given position $i:
<?php
function solution(array $a)
{
$result = [];
$count = count($a);
for($i = 0; $i < $count; $i++) {
if(sumLeft($a, $i-1) === sumRight($a, $i+1)) {
$result[] = $i;
}
}
return count($result) ? $result : -1;
}
function sumRight(array $a, float $position): float
{
return array_sum(array_slice($a, $position));
}
function sumLeft(array $a, float $position): float
{
return array_sum(array_slice($a, 0, $position + 1));
}
echo "<pre>";
print_r(solution([-1, 3, -4, 5, 1, -6, 2, 1]));
output:
Array
(
[0] => 1
[1] => 3
[2] => 7
)
The Tape Equilibrium question is worded super confusingly.
P is the values of the array $A
P values must be greater than 0 and less than N
N is the total number of indexes in the array $A
$A is a non-empty array consisting of N integers
The part "0 < P < N" doesn't make sense. If N is the integers in non-empty array A then N should be the length of the array and data such as [100,200] should not work because P=100 & P=200, both of which are greater than 2.
My opinion is that the explanation is poorly worded. However, the example is clear, so the solution is relatively straightforward if you can get past the opening text.
A non-empty array A consisting of N integers is given. Array A represents numbers on a tape.
Any integer P, such that 0 < P < N, splits this tape into two
non-empty parts: A[0], A[1], ..., A[P − 1] and A[P], A[P + 1], ...,
A[N − 1].
The difference between the two parts is the value of: |(A[0] + A[1] +
... + A[P − 1]) − (A[P] + A[P + 1] + ... + A[N − 1])|
In other words, it is the absolute difference between the sum of the
first part and the sum of the second part.
The part that stumped me initially was that I was returning NULL for erroneous array input and receiving a Codility error that NULL was returned when it was expecting an INT. I thought, how can I return an INT when there is no solution for bad data? Turns out, they want you to return a zero.
Here is my 100/100 PHP solution:
function solution($A) {
if ( empty($A) ) {
return 0;
}
$count = count($A);
if ($count == 1) {
return $A[0];
}
for ($i=0, $max_position = $count - 1; $i<$max_position; $i++) {
if (!is_int($A[$i])) {
return;
}
if ($i == 0) {
$left = $A[0];
$right = array_sum($A) - $left;
$min = abs($left - $right);
}
else {
$left += $A[$i];
$right -= $A[$i];
$min = min([$min, abs($left - $right)]);
}
}
return $min;
}
Three things to take note of:
If $A is empty, return an integer (0), because NULL results in a
Codility error.
If $A has only one element, return an integer (0), because you cannot split a single element into two parts like they want, and
NULL results in a Codility error.
Apparently, invalid non-integer values are not tested: you can see where I check !is_int($A[$i]) and then "return;", which
should have resulted in a NULL (and an error about the NULL not
being an int), but I did not get one.
Assume that: N is an integer within the range [0..100,000]; each element of array A is an integer within the range [−2,147,483,648..2,147,483,647].
This means I don't see any reason to check if an element isInt() as suggested by @Floyd.
For best efficiency, always break/return as early as possible.
There is no reason to iterate on the first or last index in the array because splitting the array on the first element will result in nothing on the "left side" and splitting on the last element will result in nothing on the "right side". For this reason, start iterating from 1 and only iterate while $i is less than array count - 1.
I struggled to follow the asker's coding attempt, but it seems likely that iterating/evaluating the bookend elements was to blame.
Code -- not actually tested on Codility: (Demo)
$test = [-1, 3, -4, 5, 1, -6, 2, 1];
function solution(array $array): int {
for ($i = 1, $count = count($array) - 1; $i < $count; ++$i) {
if (array_sum(array_slice($array, 0, $i)) === array_sum(array_slice($array, $i + 1))) {
return $i;
}
}
return -1;
}
echo solution($test);
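For reference, here is a minimal sketch of the O(N) approach the task statement asks for. It assumes the input is a plain list of integers, and it returns a single index rather than an array, which is the result type Codility reports as expected. Unlike the loop above, it also considers P = 0 and P = N−1, since the task statement counts those as valid positions. The function name equilibriumIndex is my own and not part of the task, and I have not run it on Codility.

```php
<?php
// Hypothetical name; returns one equilibrium index or -1, always an int.
function equilibriumIndex(array $A): int
{
    $total = array_sum($A); // sum of all elements
    $left = 0;              // sum of the elements before index $p
    foreach ($A as $p => $value) {
        // The elements after $p sum to $total - $left - $value.
        if ($left == $total - $left - $value) {
            return $p;
        }
        $left += $value;
    }
    return -1; // no equilibrium index (also covers the empty array)
}

echo equilibriumIndex([-1, 3, -4, 5, 1, -6, 2, 1]), "\n"; // prints 1
```

Keeping a running left sum avoids re-slicing and re-summing the array at every position, so the whole scan is a single pass after one call to array_sum().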
I'm trying to build an algorithm for processing bracket sheet of competitions. I need to go through a range of numbers. For each number there will be the athlete name. Numbers are assigned to athletes randomly but the number's pairing must always stay the same. There are two groups odd and even, i.e. A and B.
The only problem is that I can't find the proper algorithm to iterate over the numbers in exactly the following order:
Group A:
--------
1
17
9
25
------
5
21
13
29
------
3
19
11
27
------
7
23
15
31
Group B:
--------
2
18
10
26
------
6
22
14
30
------
4
20
12
28
------
8
24
16
32
Could someone please help with advice or example of how to get the output above?
EDIT 1:
The example above is the bracket sheet for 32 athletes! Same logic must be applied if you use a sheet for 4,8,16,64 or 128 athletes!
EDIT 2:
Let's make it more clear with examples of the sheet for 4 athletes and then the sheet for 16 athletes.
The sheet for 4 athletes:
Group A:
--------
1
3
Group B:
--------
2
4
The sheet for 16 athletes:
Group A:
--------
1
9
5
13
------
3
11
7
15
Group B:
--------
2
10
6
14
------
4
12
8
16
EDIT 3:
The last part, is that I'm planning to have an array with athlete name and its status in it.
By status I mean that, if the athlete has been a champion previously (strong), then he/she gets 1 for status, if the athlete's previous achievements are not known or minimal (weak), then the status is 0. It's done that way, so we could separate strongest athletes into different groups and make sure that they will not fight against each other in the first fight but rather meet each other closer to the semi-final or final.
Example of PHP array:
$participants = array(
array("John", 0),
array("Gagan", 0),
array("Mike Tyson", 1),
array("Gair", 0),
array("Gale", 0),
array("Roy Johnes", 1),
array("Galip", 0),
array("Gallagher", 0),
array("Garett", 0),
array("Nikolai Valuev", 1),
array("Garner", 0),
array("Gary", 0),
array("Gelar", 0),
array("Gershom", 0),
array("Gilby", 0),
array("Gilford", 0)
);
From this example we see that those, who have status 1 must be in different groups, i.e. A and B. But we have only two groups of numbers odd and even and in this example, there are 3 strong athletes. Thus two of them will be at the same group. The final result must be, that those two strong athletes, that got in the same group, must not meet at the very first fight (it means that they will not be on the same pair of numbers and as far away from each other as possible, so they wouldn't meet on the second fight as well).
Then randomly, I'm planning to rearrange the array and send athletes to the bracket sheet - every time, with different numbers, every time, those that have a flag 1 go to different groups and/or never meet at the first fight and every time, athletes' names assigned to the same pair of numbers.
Considering the number of participants is always a power of 2, this piece of code should give you the order you're expecting.
function getOrder($numberOfParticipants) {
$order = array(1, 2);
for($i = 2; $i < $numberOfParticipants; $i <<= 1) {
$nextOrder = array();
foreach($order as $number) {
$nextOrder[] = $number;
$nextOrder[] = $number + $i;
}
$order = $nextOrder;
}
return $order; // which is for instance [1, 17, 9, 25, and so on...] with 32 as argument
}
About the way it works, let's take a look at what happens when doubling the number of participants.
Participants | Order
2 | 1 2
4 | 1 3=1+2 2 4=2+2
8 | 1 5=1+4 3 7=3+4 2 6=2+4 4 8=4+4
... |
N | 1 X Y Z ...
2N | 1 1+N X X+N Y Y+N Z Z+N ...
The algorithm I used is the exact same logic. I start with an array containing only [1, 2] and $i is actually the size of this array. Then I'm computing the next line until I reach the one with the right number of participants.
On a side note: $i <<= 1 does the same as $i *= 2. You can read the documentation about bitwise operators for further explanations.
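To make the doubling step concrete, here is a small self-contained sketch that repeats the same logic as getOrder() above under a hypothetical name (bracketOrder) and splits the result into the two groups from the question. Splitting with array_chunk() is my own addition; it relies on the first half of the order holding the odd numbers and the second half the even ones, which follows from starting with [1, 2] and only ever adding even offsets.

```php
<?php
// Same doubling logic as getOrder() above, under a hypothetical name
// so that the snippet runs on its own.
function bracketOrder(int $numberOfParticipants): array
{
    $order = [1, 2];
    for ($i = 2; $i < $numberOfParticipants; $i *= 2) {
        $next = [];
        foreach ($order as $number) {
            $next[] = $number;      // keep the existing slot
            $next[] = $number + $i; // pair it with the slot $i further on
        }
        $order = $next;
    }
    return $order;
}

// First half of the order is group A (odd), second half is group B (even).
list($groupA, $groupB) = array_chunk(bracketOrder(16), 8);
echo implode(' ', $groupA), "\n"; // 1 9 5 13 3 11 7 15
echo implode(' ', $groupB), "\n"; // 2 10 6 14 4 12 8 16
```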
About strong athletes, as you want to keep as much randomness as possible, here is a solution (probably not optimal but that's what I first thought):
1. Make two arrays, one with the strongs and one with the weaks.
2. If there are no strongs or only a single one, just shuffle the whole array and go to step 8.
3. If there are more strongs than weaks (dunno if it can happen in your case but better safe than sorry), shuffle the strongs and put the last ones with the weaks so both arrays are the same size.
4. Otherwise, fill up the strongs with null elements so the array size is a power of 2, then shuffle it.
5. Shuffle the weaks.
6. Prepare as many groups as there are elements in the strongs array, put one of the strongs in each group (or none if you have a null element), and complete with as many weaks as needed.
7. Shuffle each group.
8. Return the participants, ordered the same way as the array returned by the previous function.
And the corresponding code:
function splitStrongsAndWeaks($participants) {
$strongs = array();
$weaks = array();
foreach($participants as $participant) {
if($participant != null && $participant[1] == 1)
$strongs[] = $participant;
else
$weaks[] = $participant;
}
return array($strongs, $weaks);
}
function insertNullValues($elements, $totalNeeded)
{
$strongsNumber = count($elements);
if($strongsNumber == $totalNeeded)
return $elements;
if($strongsNumber == 1)
{
if(mt_rand(0, 1))
array_unshift($elements, null);
else
$elements[] = null;
return $elements;
}
if($strongsNumber & 1)
$half = ($strongsNumber >> 1) + mt_rand(0, 1);
else
$half = $strongsNumber >> 1;
return array_merge(insertNullValues(array_splice($elements, 0, $half), $totalNeeded >> 1), insertNullValues($elements, $totalNeeded >> 1));
}
function shuffleParticipants($participants, $totalNeeded) {
list($strongs, $weaks) = splitStrongsAndWeaks($participants);
// If there are only weaks or a single strong, just shuffle them
if(count($strongs) < 2) {
shuffle($participants);
$participants = insertNullValues($participants, $totalNeeded);
}
else {
shuffle($strongs);
// If there are more strongs, we need to put some with the weaks
if(count($strongs) > $totalNeeded / 2) {
list($strongs, $strongsToWeaks) = array_chunk($strongs, $totalNeeded / 2);
$weaks = array_merge($weaks, $strongsToWeaks);
$neededGroups = $totalNeeded / 2;
}
// Else we need to make sure the number of groups will be a power of 2
else {
$neededGroups = 1 << ceil(log(count($strongs), 2));
if(count($strongs) < $neededGroups)
$strongs = insertNullValues($strongs, $neededGroups);
}
shuffle($weaks);
// Computing needed non null values in each group
$neededByGroup = $totalNeeded / $neededGroups;
$neededNonNull = insertNullValues(array_fill(0, count($participants), 1), $totalNeeded);
$neededNonNull = array_chunk($neededNonNull, $neededByGroup);
$neededNonNull = array_map('array_sum', $neededNonNull);
// Creating groups, putting 0 or 1 strong in each
$participants = array();
foreach($strongs as $strong) {
$group = array();
if($strong != null)
$group[] = $strong;
$nonNull = array_shift($neededNonNull);
while(count($group) < $nonNull)
$group[] = array_shift($weaks);
while(count($group) < $neededByGroup)
$group[] = null;
// Shuffling again each group so you can get for instance 1 -> weak, 17 -> strong
shuffle($group);
$participants[] = $group;
}
// Flattening to get a 1-dimension array
$participants = call_user_func_array('array_merge', $participants);
}
// Returned array contains participants ordered the same way as getOrder()
// (eg. with 32 participants, first will have number 1, second number 17 and so on...)
return $participants;
}
If you want the resulting array to have as indexes the number in the bracket, you can simply do:
$order = getOrder(count($participants));
$participants = array_combine($order, shuffleParticipants($participants, count($order)));
Okay, I finally managed to convert my Tcl code to PHP! I changed some things too:
<?php
// Function generating order participants will be placed in array
function getBracket($L) {
// List will hold insert sequence
$list = array();
// Bracket will hold final order of participants
$bracket = array();
// The algorithm to generate the insert sequence
for ($n = 1; $n <= $L; $n += 1) {
// If 'perfect' number, just put it (Perfect no.s: 2, 4, 8, 16, 32, etc)
if (substr(log($n)/log(2), -2) == ".0") {
$list[] = $n;
// If odd number, stuff...
} elseif ($n % 2 == 1) {
$list[] = $list[($n-1)/2];
// Else even number, stuff...
} else {
$list[] = $list[$n/2-1]+$n/2;
}
}
// Insert participant order as per insert sequence
for ($i = 1; $i <= sizeof($list); $i += 1) {
$id = $i-1;
array_splice($bracket, $list[$id], 0, $i);
}
return $bracket;
}
// Find number of participants over 'perfect' number if any
function cleanList($L) {
for ($d = 1; $L > $d; $d += 1) {
$sq = $L-pow(2,$d);
if($sq == 0) {break;}
if($sq < 0) {
$d = pow(2,$d-1);
$diff = $L-$d;
break;
}
}
return $diff;
}
$participants = array(
array(0, "John", 2),
array(1, "Gagan", 1),
array(2, "Mike Tyson", 1),
array(3, "Gair", 1),
array(4, "Gale", 0),
array(5, "Roy Johnes", 0),
array(6, "Galip", 0),
array(7, "Gallagher", 0),
array(8, "Garett", 0),
array(9, "Nikolai Valuev", 0),
array(10, "Garner", 1),
array(11, "Gary", 0),
array(12, "Gelar", 0),
array(13, "Gershom", 1),
array(14, "Gilby", 0),
array(15, "Gilford", 1),
array(16, "Arianna", 0)
);
// Extract strength of participant
foreach ($participants as $array) {
$finorder[] = $array[2];
}
// Sort by strength, strongest first
array_multisort($finorder,SORT_DESC,$participants);
$order = array();
$outside = array();
// Remove participants above 'perfect' number
$remove = cleanList(sizeof($participants));
for ($r = 1; $r <= $remove; $r += 1) {
$removed = array_shift($participants);
$outside[] = $removed;
}
// Get corresponding bracket
$res = getBracket(sizeof($participants));
foreach ($res as $n) {
$order[] = $n;
}
// Align bracket results with participant list
array_multisort($order, $participants);
$participants = array_combine($res, $participants);
echo "The final arrangement of participants\n";
print_r($participants);
print_r($outside);
?>
Codepad demo
To get the logic for the order of insertion of elements, I used this pattern.
Also, since I'm not too familiar with PHP, there might be ways to make some things shorter, but oh well, as long as it works ^^
EDIT: Fixed an issue with first participant sorting and added new ticket numbers. For results without old ticket numbers, see here.
EDIT2: Managed to move keys into arrays; see here.
EDIT3: I thought that 'extra' participants should go outside the bracket. If you want null instead in the bracket, you can use this.
EDIT4: Somehow, PHP versions on codepad broke some stuff... fixing it below and removing initial index...:
<?php
// Function generating order participants will be placed in array
function getBracket($L) {
// List will hold insert sequence
$list = array();
// Bracket will hold final order of participants
$bracket = array();
// The algorithm to generate the insert sequence
for ($n = 1; $n <= $L; $n += 1) {
// If 'perfect' number, just put it (Perfect no.s: 1, 2, 4, 8, 16, 32, etc)
// A power of two has a single bit set, so $n & ($n - 1) is 0 (this also covers $n == 1)
if (($n & ($n - 1)) == 0) {
$list[] = $n;
// If odd number, stuff...
} elseif ($n % 2 == 1) {
$list[] = $list[($n-1)/2];
// Else even number, stuff...
} else {
$list[] = $list[$n/2-1]+$n/2;
}
}
// Insert participant order as per insert sequence
for ($i = 1; $i <= sizeof($list); $i += 1) {
$id = $list[$i-1]-1;
array_splice($bracket, $id, 0, $i);
}
return $bracket;
}
// Find number of participants over 'perfect' number if any
function cleanList($L) {
for ($d = 1; $L > $d; $d += 1) {
$diff = $L-pow(2,$d);
if($diff == 0) {break;}
if($diff < 0) {
$diff = pow(2,$d)-$L;
break;
}
}
return $diff;
}
$participants = array(
array("John", 2),
array("Gagan", 1),
array("Mike Tyson", 1),
array("Gair", 1),
array("Gale", 0),
array("Roy Johnes", 0),
array("Galip", 0),
array("Gallagher", 0),
array("Garett", 0),
array("Nikolai Valuev", 0),
array("Garner", 1),
);
// Extract strength of participant
foreach ($participants as $array) {
$finorder[] = $array[1]; // strength is now at index 1, since the initial index was removed
}
// Sort by strength, strongest first
array_multisort($finorder,SORT_DESC,$participants);
$order = array();
// Add participants until 'perfect' number
$add = cleanList(sizeof($participants));
for ($r = 1; $r <= $add; $r += 1) {
$participants[] = null;
}
// Get corresponding bracket
$res = getBracket(sizeof($participants));
// Align bracket results with participant list
foreach ($res as $n) {
$order[] = $n;
}
array_multisort($order, $participants);
$participants = array_combine($res, $participants);
echo "The final arrangement of participants\n";
print_r($participants);
?>
This sketchy code might be what you want:
<?php
class Pair
{
public $a;
public $b;
function __construct($a, $b) {
if(($a & 1) != ($b & 1))
throw new Exception('Invalid Pair');
$this->a = $a;
$this->b = $b;
}
}
class Competition
{
public $odd_group = array();
public $even_group = array();
function __construct($order) {
$n = 1 << $order;
$odd = array();
$even = array();
for($i = 0; $i < $n; $i += 4) {
$odd[] = $i + 1;
$odd[] = $i + 3;
$even[] = $i + 2;
$even[] = $i + 4;
}
shuffle($odd);
shuffle($even);
for($i = 0; $i < count($odd); $i += 2) {
$this->odd_group[] = new Pair($odd[$i], $odd[$i+1]);
$this->even_group[] = new Pair($even[$i], $even[$i+1]);
}
echo "Odd\n";
for($i = 0; $i < count($this->odd_group); ++$i) {
$pair = $this->odd_group[$i];
echo "{$pair->a} vs. {$pair->b}\n";
}
echo "Even\n";
for($i = 0; $i < count($this->even_group); ++$i) {
$pair = $this->even_group[$i];
echo "{$pair->a} vs. {$pair->b}\n";
}
}
}
new Competition(5);
?>
I need a function that randomizes an array similar to what shuffle does, with the difference that each element has different chances.
For example, consider the following array:
$animals = array('elephant', 'dog', 'cat', 'mouse');
Elephant has a higher chance of getting the first index than dog. Dog has a higher chance than cat, and so on. For example, in this particular case elephant could have a 40% chance of landing in the 1st position, 30% of landing in the 2nd, 20% in the 3rd and 10% in the last.
So, after the shuffling, the first elements in the original array will be more likely (but not for sure) to be in the first positions and the last ones in the last positions.
A normal shuffle may be implemented as:
1. dropping items randomly over some range
2. picking them up from left to right
We can adjust the dropping step: drop each element not into the whole range but into a sliding window. Let N be the number of elements in the array, w the window width, and off the amount the window moves at each step. Then off*(N-1) + w is the total width of the range.
Here's a function, which distorts elements' positions, but not completely at random.
function weak_shuffle($a, $strength) {
$len = count($a);
if ($len <= 1) return $a;
$out = array();
$M = mt_getrandmax();
$w = round($M / ($strength + 1)); // width of the sliding window
$off = ($M - $w) / ($len - 1); // offset of that window for each step.
for ($i = 0; $i < $len; $i++) {
do {
$idx = intval($off * $i + mt_rand(0, $w));
} while(array_key_exists($idx, $out));
$out[$idx] = $a[$i];
}
ksort($out);
return array_values($out);
}
$strength = 0 ~normal shuffle.
$strength = 0.25 ~your desired result (40.5%, 25.5%, 22%, 12% for elephant)
$strength = 1 first item will never be after last one.
$strength >= 3 array is actually never shuffled
Playground for testing:
$animals = array( 'elephant', 'dog', 'cat', 'mouse' );
$pos = array(0,0,0,0);
for ($iter = 0; $iter < 100000; $iter++) {
$shuffled = weak_shuffle($animals, 0.25);
$idx = array_search('elephant', $shuffled);
$pos[$idx]++;
}
print_r($pos);
Try to use this algorithm:
$animals = [ 'elephant', 'dog', 'cat', 'mouse' ]; // you can add more animals here
$shuffled = [];
$count = count($animals);
foreach($animals as $chance => $animal) {
$priority = ceil(($count - $chance) * 100 / $count);
$shuffled = array_merge($shuffled, array_fill(0, $priority, $animal));
}
shuffle($shuffled);
$animals = array_unique($shuffled);
You have an array, let's say of n elements. The probability that the i'th element will go to the j'th position is P(i, j). If I understood well, the following formula holds:
(P(i1, j1) >= P(i2, j2)) <=> (|i1 - j1| <= |i2 - j2|)
Thus, you have a Galois connection between the distance in your array and the shuffle probability. You can use this Galois connection to implement your exact formula if you have one. If you don't have a formula, you can invent one, which will meet the criteria specified above. Good luck.
I have an array like this:
$sports = array(
'Softball - Counties',
'Softball - Eastern',
'Softball - North Harbour',
'Softball - South',
'Softball - Western'
);
I would like to find the longest common prefix of the string. In this instance, it would be 'Softball - '
I am thinking that I would follow this process
$i = 1;
// loop to the length of the first string
while ($i < strlen($sports[0])) {
// grab the left most part up to i in length
$match = substr($sports[0], 0, $i);
// loop through all the values in array, and compare if they match
foreach ($sports as $sport) {
if ($match != substr($sport, 0, $i)) {
// didn't match, return the part that did match
return substr($sport, 0, $i-1);
}
} // foreach
// increase string length
$i++;
} // while
// if you got to here, then all of them must be identical
Questions
Is there a built in function or much simpler way of doing this ?
For my 5-line array that is probably fine, but if I were to do arrays of several thousand lines, there would be a lot of overhead, so I would have to be more calculated with my starting value of $i, e.g. $i = halfway along the string; if that fails, then $i/2 until it works, then increment $i by 1 until we succeed, so that we do the least number of comparisons to get a result.
Is there a formula/algorithm out already out there for this kind of problem?
If you can sort your array, then there is a simple and very fast solution.
Simply compare the first item to the last one.
If the strings are sorted, any prefix common to all strings will be common to the sorted first and last strings.
sort($sport);
$s1 = $sport[0]; // First string
$s2 = $sport[count($sport)-1]; // Last string
$len = min(strlen($s1), strlen($s2));
// While we still have string to compare,
// if the indexed character is the same in both strings,
// increment the index.
for ($i=0; $i<$len && $s1[$i]==$s2[$i]; $i++);
$prefix = substr($s1, 0, $i);
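Wrapped up as a function, the idea above could look like the sketch below. The name commonPrefixBySorting is hypothetical, and the sketch assumes every element is a string. It passes SORT_STRING explicitly, which is my own choice: the argument relies on plain lexicographic ordering, and the default comparison may treat numeric-looking strings as numbers.

```php
<?php
// Hypothetical helper wrapping the sort-and-compare idea above.
// Assumes every element of $strings is a string.
function commonPrefixBySorting(array $strings): string
{
    if ($strings === []) {
        return '';
    }
    sort($strings, SORT_STRING); // sorts a local copy, byte-wise
    $first = $strings[0];
    $last = $strings[count($strings) - 1];
    $len = min(strlen($first), strlen($last));
    $i = 0;
    // Advance while both extremes agree on the character at position $i.
    while ($i < $len && $first[$i] === $last[$i]) {
        $i++;
    }
    return substr($first, 0, $i);
}

$sports = [
    'Softball - Counties',
    'Softball - Eastern',
    'Softball - North Harbour',
    'Softball - South',
    'Softball - Western',
];
echo commonPrefixBySorting($sports), "\n"; // prints "Softball - "
```

Note that the sort itself still has to compare strings, so this is short to write but not necessarily the fewest comparisons overall.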
I would use this:
$prefix = array_shift($array); // take the first item as initial prefix
$length = strlen($prefix);
// compare the current prefix with the prefix of the same length of the other items
foreach ($array as $item) {
// check if there is a match; if not, decrease the prefix by one character at a time
while ($length && substr($item, 0, $length) !== $prefix) {
$length--;
$prefix = substr($prefix, 0, -1);
}
if (!$length) {
break;
}
}
Update
Here’s another solution, iteratively comparing each n-th character of the strings until a mismatch is found:
$pl = 0; // common prefix length
$n = count($array);
$l = strlen($array[0]);
while ($pl < $l) {
$c = $array[0][$pl];
for ($i=1; $i<$n; $i++) {
if ($array[$i][$pl] !== $c) break 2;
}
$pl++;
}
$prefix = substr($array[0], 0, $pl);
This is even more efficient as there are only at most numberOfStrings·commonPrefixLength atomic comparisons.
I implemented @diogoriba's algorithm in code, with this result:
Finding the common prefix of the first two strings, then comparing it with each following string from the 3rd onward and trimming it whenever a string does not share it, wins in situations where the strings have more in common in their prefixes than not.
But bumperbox's original algorithm (apart from the bugfixes) wins where the strings have less in common in their prefixes than not. Details are in the code comments!
Another idea I implemented:
First look for the shortest string in the array, and use its length for the comparison rather than simply that of the first string. In the code, this is implemented with the custom-written function arrayStrLenMin().
This can bring down iterations dramatically, but the function arrayStrLenMin() may itself cause (more or fewer) iterations.
Simply starting with the length of the first string in the array seems quite clumsy, but may turn out to be effective if arrayStrLenMin() needs many iterations.
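As a compact illustration of the shortest-string idea (not the arrayStrLenMin() implementation that follows), here is a minimal sketch using built-in functions. The name commonPrefixBounded is hypothetical, and the sketch assumes every element is a string. Since no common prefix can be longer than the shortest string, that length bounds the scan and no index ever runs past the end of a string.

```php
<?php
// Hypothetical helper: bounds the scan by the length of the shortest string.
// Assumes every element of $strings is a string.
function commonPrefixBounded(array $strings): string
{
    if ($strings === []) {
        return '';
    }
    $strings = array_values($strings);           // make sure index 0 exists
    $limit = min(array_map('strlen', $strings)); // shortest string length
    $first = $strings[0];
    for ($pos = 0; $pos < $limit; $pos++) {
        foreach ($strings as $s) {
            if ($s[$pos] !== $first[$pos]) {
                // First mismatch: everything before $pos is common.
                return substr($first, 0, $pos);
            }
        }
    }
    // No mismatch within the bound: the prefix has the shortest string's length.
    return substr($first, 0, $limit);
}

echo commonPrefixBounded(['/Vol/1/a/1', '/Vol/1/a/2', '/Vol/1/b']), "\n"; // prints "/Vol/1/"
```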
Get the maximum common prefix of strings in an array with as few iterations as possible (PHP)
Code + Extensive Testing + Remarks:
function arrayStrLenMin ($arr, $strictMode = false, $forLoop = false) {
$errArrZeroLength = -1; // Return value for error: Array is empty
$errOtherType = -2; // Return value for error: Found other type (than string in array)
$errStrNone = -3; // Return value for error: No strings found (in array)
$arrLength = count($arr);
if ($arrLength <= 0 ) { return $errArrZeroLength; }
$cur = 0;
foreach ($arr as $key => $val) {
if (is_string($val)) {
$min = strlen($val);
$strFirstFound = $key;
// echo("Key\tLength / Notification / Error\n");
// echo("$key\tFound first string member at key with length: $min!\n");
break;
}
else if ($strictMode) { return $errOtherType; } // At least 1 type other than string was found.
}
if (! isset($min)) { return $errStrNone; } // No string was found in array.
// SpeedRatio of foreach/for is approximately 2/1 as discussed at:
// http://juliusbeckmann.de/blog/php-foreach-vs-while-vs-for-the-loop-battle.html
// If $strFirstFound is found within the first 1/SpeedRatio (=0.5) of the array, "foreach" is faster!
if (! $forLoop) {
foreach ($arr as $key => $val) {
if (is_string($val)) {
$cur = strlen($val);
// echo("$key\t$cur\n");
if ($cur == 0) { return $cur; } // 0 is the shortest possible string, so we can abort here.
if ($cur < $min) { $min = $cur; }
}
// else { echo("$key\tNo string!\n"); }
}
}
// If $strFirstFound is found after the first 1/SpeedRatio (=0.5) of the array, "for" is faster!
else {
for ($i = $strFirstFound + 1; $i < $arrLength; $i++) {
if (is_string($arr[$i])) {
$cur = strlen($arr[$i]);
// echo("$i\t$cur\n");
if ($cur == 0) { return $cur; } // 0 is the shortest possible string, so we can abort here.
if ($cur < $min) { $min = $cur; }
}
// else { echo("$i\tNo string!\n"); }
}
}
return $min;
}
function strCommonPrefixByStr($arr, $strFindShortestFirst = false) {
$arrLength = count($arr);
if ($arrLength < 2) { return false; }
// Determine loop length
/// Find shortest string in array: Can bring down iterations dramatically, but the function arrayStrLenMin() itself can cause ( more or less) iterations.
if ($strFindShortestFirst) { $end = arrayStrLenMin($arr, true); }
/// Simply start with length of first string in array: Seems quite clumsy, but may turn out effective, if arrayStrLenMin() needs many iterations.
else { $end = strlen($arr[0]); }
for ($i = 1; $i <= $end + 1; $i++) {
// Grab the part from 0 up to $i
$commonStrMax = substr($arr[0], 0, $i);
echo("Match: $i\t$commonStrMax\n");
// Loop through all the values in array, and compare if they match
foreach ($arr as $key => $str) {
echo(" Str: $key\t$str\n");
// Didn't match, return the part that did match
if ($commonStrMax != substr($str, 0, $i)) {
return substr($commonStrMax, 0, $i-1);
}
}
}
// Special case: No mismatch (hence no return) happened until loop end!
return $commonStrMax; // Thus entire first common string is the common prefix!
}
function strCommonPrefixByChar($arr, $strFindShortestFirst = false) {
$arrLength = count($arr);
if ($arrLength < 2) { return false; }
// Determine loop length
/// Find shortest string in array: Can bring down iterations dramatically, but the function arrayStrLenMin() itself can cause ( more or less) iterations.
if ($strFindShortestFirst) { $end = arrayStrLenMin($arr, true); }
/// Simply start with length of first string in array: Seems quite clumsy, but may turn out effective, if arrayStrLenMin() needs many iterations.
else { $end = strlen($arr[0]); }
for ($i = 0 ; $i <= $end + 1; $i++) {
// Grab char $i
$char = substr($arr[0], $i, 1);
echo("Match: $i\t"); echo(str_pad($char, $i+1, " ", STR_PAD_LEFT)); echo("\n");
// Loop through all the values in array, and compare if they match
foreach ($arr as $key => $str) {
echo(" Str: $key\t$str\n");
// Didn't match, return the part that did match
if ($char != $str[$i]) { // Same functionality as ($char != substr($str, $i, 1)). Same efficiency?
return substr($arr[0], 0, $i);
}
}
}
// Special case: No mismatch (hence no return) happened until loop end!
return substr($arr[0], 0, $end); // Thus entire first common string is the common prefix!
}
function strCommonPrefixByNeighbour($arr) {
$arrLength = count($arr);
if ($arrLength < 2) { return false; }
/// Get the common string prefix of the first 2 strings
$strCommonMax = strCommonPrefixByChar(array($arr[0], $arr[1]));
if ($strCommonMax === false) { return false; }
if ($strCommonMax == "") { return ""; }
$strCommonMaxLength = strlen($strCommonMax);
/// Now start looping from the 3rd string
echo("-----\n");
for ($i = 2; ($i < $arrLength) && ($strCommonMaxLength >= 1); $i++ ) {
echo(" STR: $i\t{$arr[$i]}\n");
/// Compare the maximum common string with the next neighbour
/*
//// Compare by char: Method unsuitable!
// Iterate from string end to string beginning
for ($ii = $strCommonMaxLength - 1; $ii >= 0; $ii--) {
echo("Match: $ii\t"); echo(str_pad($arr[$i][$ii], $ii+1, " ", STR_PAD_LEFT)); echo("\n");
// If you find the first mismatch from the end, break.
if ($arr[$i][$ii] != $strCommonMax[$ii]) {
$strCommonMaxLength = $ii - 1; break;
// BUT!!! We may falsely assume that the string from the first mismatch until the beginning matches! This new neighbour string is completely "unexplored land"; there might be differing chars closer to the beginning. This method is not suitable. Better use string comparison than char comparison.
}
}
*/
//// Compare by string
for ($ii = $strCommonMaxLength; $ii > 0; $ii--) {
echo("MATCH: $ii\t$strCommonMax\n");
if (substr($arr[$i],0,$ii) == $strCommonMax) {
break;
}
else {
$strCommonMax = substr($strCommonMax,0,$ii - 1);
$strCommonMaxLength--;
}
}
}
return substr($arr[0], 0, $strCommonMaxLength);
}
// Tests for finding the common prefix
/// Scenarios
$filesLeastInCommon = array (
"/Vol/1/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/a/1",
"/Vol/2/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/a/2",
"/Vol/1/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/b/1",
"/Vol/1/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/b/2",
"/Vol/2/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/b/c/1",
"/Vol/2/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/a/1",
);
$filesLessInCommon = array (
"/Vol/1/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/a/1",
"/Vol/1/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/a/2",
"/Vol/1/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/b/1",
"/Vol/1/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/b/2",
"/Vol/2/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/b/c/1",
"/Vol/2/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/a/1",
);
$filesMoreInCommon = array (
"/Voluuuuuuuuuuuuuumes/1/a/a/1",
"/Voluuuuuuuuuuuuuumes/1/a/a/2",
"/Voluuuuuuuuuuuuuumes/1/a/b/1",
"/Voluuuuuuuuuuuuuumes/1/a/b/2",
"/Voluuuuuuuuuuuuuumes/2/a/b/c/1",
"/Voluuuuuuuuuuuuuumes/2/a/a/1",
);
$sameDir = array (
"/Volumes/1/a/a/",
"/Volumes/1/a/a/aaaaa/2",
);
$sameFile = array (
"/Volumes/1/a/a/1",
"/Volumes/1/a/a/1",
);
$noCommonPrefix = array (
"/Volumes/1/a/a/",
"/Volumes/1/a/a/aaaaa/2",
"Net/1/a/a/aaaaa/2",
);
$longestLast = array (
"/Volumes/1/a/a/1",
"/Volumes/1/a/a/aaaaa/2",
);
$longestFirst = array (
"/Volumes/1/a/a/aaaaa/1",
"/Volumes/1/a/a/2",
);
$one = array ("/Volumes/1/a/a/aaaaa/1");
$empty = array ( );
// Test Results for finding the common prefix
/*
I tested my functions in many possible scenarios.
The results, the common prefixes, were always correct in all scenarios!
Just try a function call with your individual array!
Considering iteration efficiency, I also performed tests:
I put echo functions into the functions where iterations occur, and measured the number of CLI line output via:
php <script with strCommonPrefixByStr or strCommonPrefixByChar> | egrep "^ Str:" | wc -l GIVES TOTAL ITERATION SUM.
php <Script with strCommonPrefixByNeighbour> | egrep "^ Str:" | wc -l PLUS | egrep "^MATCH:" | wc -l GIVES TOTAL ITERATION SUM.
My hypothesis was proven:
strCommonPrefixByChar wins in situations where the strings have less in common in their beginning (=prefix).
strCommonPrefixByNeighbour wins where there is more in common in the prefixes.
*/
// Test Results Table
// Used Functions | Iteration amount | Remarks
// $result = (strCommonPrefixByStr($filesLessInCommon)); // 35
// $result = (strCommonPrefixByChar($filesLessInCommon)); // 35 // Same amount of iterations, but much fewer characters compared because ByChar instead of ByString!
// $result = (strCommonPrefixByNeighbour($filesLessInCommon)); // 88 + 42 = 130 // Loses in this category!
// $result = (strCommonPrefixByStr($filesMoreInCommon)); // 137
// $result = (strCommonPrefixByChar($filesMoreInCommon)); // 137 // Same amount of iterations, but much fewer characters compared because ByChar instead of ByString!
// $result = (strCommonPrefixByNeighbour($filesLeastInCommon)); // 12 + 4 = 16 // Far the winner in this category!
echo("Common prefix of all members:\n");
var_dump($result);
// Tests for finding the shortest string in array
/// Arrays
// $empty = array ();
// $noStrings = array (0,1,2,3.0001,4,false,true,77);
// $stringsOnly = array ("one","two","three","four");
// $mixed = array (0,1,2,3.0001,"four",false,true,"seven", 8888);
/// Scenarios
// I list them from fewest to most iterations, which is not necessarily equivalent to slowest to fastest!
// For speed consider the remarks in the code considering the Speed ratio of foreach/for!
//// Fewest iterations (immediate abort on "Found other type", use "for" loop)
// foreach( array($empty, $noStrings, $stringsOnly, $mixed) as $arr) {
// echo("NEW ANALYSIS:\n");
// echo("Result: " . arrayStrLenMin($arr, true, true) . "\n\n");
// }
/* Results:
NEW ANALYSIS:
Result: Array is empty!
NEW ANALYSIS:
Result: Found other type!
NEW ANALYSIS:
Key Length / Notification / Error
0 Found first string member at key with length: 3!
1 3
2 5
3 4
Result: 3
NEW ANALYSIS:
Result: Found other type!
*/
//// Fewer iterations (immediate abort on "Found other type", use "foreach" loop)
// foreach( array($empty, $noStrings, $stringsOnly, $mixed) as $arr) {
// echo("NEW ANALYSIS:\n");
// echo("Result: " . arrayStrLenMin($arr, true, false) . "\n\n");
// }
/* Results:
NEW ANALYSIS:
Result: Array is empty!
NEW ANALYSIS:
Result: Found other type!
NEW ANALYSIS:
Key Length / Notification / Error
0 Found first string member at key with length: 3!
0 3
1 3
2 5
3 4
Result: 3
NEW ANALYSIS:
Result: Found other type!
*/
//// More iterations (No immediate abort on "Found other type", use "for" loop)
// foreach( array($empty, $noStrings, $stringsOnly, $mixed) as $arr) {
// echo("NEW ANALYSIS:\n");
// echo("Result: " . arrayStrLenMin($arr, false, true) . "\n\n");
// }
/* Results:
NEW ANALYSIS:
Result: Array is empty!
NEW ANALYSIS:
Result: No strings found!
NEW ANALYSIS:
Key Length / Notification / Error
0 Found first string member at key with length: 3!
1 3
2 5
3 4
Result: 3
NEW ANALYSIS:
Key Length / Notification / Error
4 Found first string member at key with length: 4!
5 No string!
6 No string!
7 5
8 No string!
Result: 4
*/
//// Most iterations (No immediate abort on "Found other type", use "foreach" loop)
// foreach( array($empty, $noStrings, $stringsOnly, $mixed) as $arr) {
// echo("NEW ANALYSIS:\n");
// echo("Result: " . arrayStrLenMin($arr, false, false) . "\n\n");
// }
/* Results:
NEW ANALYSIS:
Result: Array is empty!
NEW ANALYSIS:
Result: No strings found!
NEW ANALYSIS:
Key Length / Notification / Error
0 Found first string member at key with length: 3!
0 3
1 3
2 5
3 4
Result: 3
NEW ANALYSIS:
Key Length / Notification / Error
4 Found first string member at key with length: 4!
0 No string!
1 No string!
2 No string!
3 No string!
4 4
5 No string!
6 No string!
7 5
8 No string!
Result: 4
*/
Probably there is some terribly well-regarded algorithm for this, but just off the top of my head, if you know your commonality is going to be on the left-hand side like in your example, you could do way better than your posted methodology by first finding the commonality of the first two strings, and then iterating down the rest of the list, trimming the common string as necessary to achieve commonality or terminating with failure if you trim all the way to nothing.
I think you're on the right way. But instead of incrementing i when all of the string passes, you could do this:
1) Compare the first 2 strings in the array and find out how many common characters they have. Save the common characters in a separate string called maxCommon, for example.
2) Compare the third string w/ maxCommon. If the number of common characters is smaller, trim maxCommon to the characters that match.
3) Repeat and rinse for the rest of the array. At the end of the process, maxCommon will have the string that is common to all of the array elements.
This will add some overhead because you'll need to compare each string w/ maxCommon, but will drastically reduce the number of iterations you'll need to get your results.
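The three steps above could be sketched in PHP roughly as follows. This is only a minimal sketch, not the poster's own code: the function names commonPrefixOfTwo and maxCommonPrefix are made up for illustration, and it assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: longest common prefix of exactly two strings.
function commonPrefixOfTwo(string $a, string $b): string {
    $limit = min(strlen($a), strlen($b));
    $i = 0;
    while ($i < $limit && $a[$i] === $b[$i]) {
        $i++;
    }
    return substr($a, 0, $i);
}

// Steps 1-3: seed maxCommon with the first string, then trim it
// against every following string; stop early once nothing is left.
function maxCommonPrefix(array $strings): string {
    $strings = array_values($strings); // normalise keys to 0..n-1
    if (count($strings) === 0) {
        return '';
    }
    $maxCommon = $strings[0];
    for ($k = 1; $k < count($strings) && $maxCommon !== ''; $k++) {
        $maxCommon = commonPrefixOfTwo($maxCommon, $strings[$k]);
    }
    return $maxCommon;
}
```

Called with array('Softball - Counties', 'Softball - Eastern', 'Softball - North Harbour'), maxCommonPrefix() returns 'Softball - '. Each string is only ever compared against the current, already trimmed candidate.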
I assume that by "common part" you mean "longest common prefix". That is much simpler to compute than any common substring.
This cannot be done without reading (n+1) * m characters in the worst case and n * m + 1 in the best case, where n is the length of the longest common prefix and m is the number of strings.
Comparing one letter at a time achieves that efficiency (Big Theta (n * m)).
Your proposed algorithm runs in Big Theta(n^2 * m), which is much, much slower for large inputs.
The third proposed algorithm of finding the longest prefix of the first two strings, then comparing that with the third, fourth, etc. also has a running time in Big Theta(n * m), but with a higher constant factor. It will probably only be slightly slower in practice.
Overall, I would recommend just rolling your own function, since the first algorithm is too slow and the two others will be about equally complicated to write anyway.
Check out Wikipedia for a description of Big Theta notation.
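To make the character-count argument concrete, here is a rough PHP sketch of the one-letter-at-a-time scan that also counts how many characters it reads. The function name commonPrefixWithReadCount is hypothetical, and the accounting is an assumption of this sketch: only fetching a character counts as a read, length checks are treated as free.

```php
<?php
// Hypothetical sketch: column-by-column scan that also counts character reads.
// Returns array(prefix, reads).
function commonPrefixWithReadCount(array $strings): array {
    $strings = array_values($strings); // normalise keys to 0..m-1
    $m = count($strings);
    if ($m === 0) {
        return array('', 0);
    }
    $reads = 0;
    for ($i = 0; $i < strlen($strings[0]); $i++) {
        $c = $strings[0][$i];
        $reads++;
        for ($k = 1; $k < $m; $k++) {
            if ($i >= strlen($strings[$k])) {
                // String $k ended: the prefix cannot grow any further.
                return array(substr($strings[0], 0, $i), $reads);
            }
            $reads++;
            if ($strings[$k][$i] !== $c) {
                // First mismatch in column $i.
                return array(substr($strings[0], 0, $i), $reads);
            }
        }
    }
    // The whole first string is a prefix of all the others.
    return array($strings[0], $reads);
}
```

With n = 3 and m = 3, the input array('abcx', 'abcx', 'abcz') needs 12 reads, which is (n+1) * m, because the mismatch only shows up in the last string. The input array('abcx', 'abcy', 'abcz') stops after 11 reads.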
Here's an elegant, recursive implementation in JavaScript:
function prefix(strings) {
switch (strings.length) {
case 0:
return "";
case 1:
return strings[0];
case 2:
// compute the prefix between the two strings
var a = strings[0],
b = strings[1],
n = Math.min(a.length, b.length),
i = 0;
while (i < n && a.charAt(i) === b.charAt(i))
++i;
return a.substring(0, i);
default:
// return the common prefix of the first string,
// and the common prefix of the rest of the strings
return prefix([ strings[0], prefix(strings.slice(1)) ]);
}
}
not that I know of
yes: instead of comparing the substring from 0 to length i, you can simply check the ith character (you already know that characters 0 to i-1 match).
Short and sweet version, perhaps not the most efficient:
/// Return length of longest common prefix in an array of strings.
function _commonPrefix($array) {
if(count($array) < 2) {
if(count($array) == 0)
return false; // empty array: undefined prefix
else
return strlen($array[0]); // 1 element: trivial case
}
$len = min(array_map('strlen',$array)); // initial upper limit: length of the shortest string (max would read past the end of shorter strings).
$prevval = reset($array);
while(($newval = next($array)) !== FALSE) {
for($j = 0 ; $j < $len ; $j += 1)
if($newval[$j] != $prevval[$j])
$len = $j;
$prevval = $newval;
}
return $len;
}
// TEST CASE:
$arr = array('/var/yam/yamyam/','/var/yam/bloorg','/var/yar/sdoo');
print_r($arr);
$plen = _commonPrefix($arr);
$pstr = substr($arr[0],0,$plen);
echo "Res: $plen\n";
echo "==> ".$pstr."\n";
echo "dir: ".dirname($pstr.'aaaa')."\n";
Output of the test case:
Array
(
[0] => /var/yam/yamyam/
[1] => /var/yam/bloorg
[2] => /var/yar/sdoo
)
Res: 7
==> /var/ya
dir: /var
@bumperbox
Your basic code needed some correction to work in ALL scenarios!
Your loop only compares until one character before the last character!
The mismatch can possibly occur 1 loop cycle after the latest common character.
Hence you have to at least check until 1 character after your first string's last character.
Hence your loop condition must be "$i <= strlen($sports[0]) + 1" or, equivalently, "$i < strlen($sports[0]) + 2".
Currently your algorithm fails
if the first string is completely included in all other strings,
or completely included in all other strings except the last character.
In my next answer/post, I will attach iteration optimized code!
Original Bumperbox code PLUS correction (PHP):
function shortest($sports) {
$i = 1;
// loop to the length of the first string
while ($i < strlen($sports[0])) {
// grab the left most part up to i in length
// REMARK: Culturally biased towards LTR writing systems. Better say: Grab from beginning...
$match = substr($sports[0], 0, $i);
// loop through all the values in array, and compare if they match
foreach ($sports as $sport) {
if ($match != substr($sport, 0, $i)) {
// didn't match, return the part that did match
return substr($sport, 0, $i-1);
}
}
$i++; // increase string length
}
}
function shortestCorrect($sports) {
$i = 1;
while ($i <= strlen($sports[0]) + 1) {
// Grab the string from its beginning with length $i
$match = substr($sports[0], 0, $i);
foreach ($sports as $sport) {
if ($match != substr($sport, 0, $i)) {
return substr($sport, 0, $i-1);
}
}
$i++;
}
// Special case: No mismatch happened until loop end! Thus entire str1 is common prefix!
return $sports[0];
}
$sports1 = array(
'Softball',
'Softball - Eastern',
'Softball - North Harbour');
$sports2 = array(
'Softball - Wester',
'Softball - Western',
);
$sports3 = array(
'Softball - Western',
'Softball - Western',
);
$sports4 = array(
'Softball - Westerner',
'Softball - Western',
);
echo("Output of the original function:\n"); // Failure scenarios
var_dump(shortest($sports1)); // NULL rather than the correct 'Softball'
var_dump(shortest($sports2)); // NULL rather than the correct 'Softball - Wester'
var_dump(shortest($sports3)); // NULL rather than the correct 'Softball - Western'
var_dump(shortest($sports4)); // Works only because the first string is at least two characters longer than the common prefix!
echo("\nOutput of the corrected function:\n"); // All scenarios work
var_dump(shortestCorrect($sports1));
var_dump(shortestCorrect($sports2));
var_dump(shortestCorrect($sports3));
var_dump(shortestCorrect($sports4));
How about something like this? It can be further optimised by not having to check the lengths of the strings if we can use the null terminating character (but I am assuming Python strings have their length cached somewhere?)
def find_common_prefix_len(strings):
    """
    Given a list of strings, finds the length of the common prefix in all of them.
    So
    apple
    applet
    application
    would return 4
    """
    if not strings:
        return 0  # no strings: without this guard the loop below never terminates
    prefix = 0
    curr_index = -1
    num_strings = len(strings)
    string_lengths = [len(s) for s in strings]
    while True:
        curr_index += 1
        ch_in_si = None
        for si in range(0, num_strings):
            if curr_index >= string_lengths[si]:
                return prefix
            else:
                if si == 0:
                    ch_in_si = strings[0][curr_index]
                elif strings[si][curr_index] != ch_in_si:
                    return prefix
        prefix += 1
I would use a recursive algorithm like this:
1 - get the first string in the array
2 - call the recursive prefix method with the first string as a param
3 - if prefix is empty return no prefix
4 - loop through all the strings in the array
4.1 - if any of the strings does not start with the prefix
4.1.1 - call recursive prefix method with prefix - 1 as a param
4.2 return prefix
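A rough PHP sketch of these steps might look like the following. The names recursivePrefix and commonPrefixRecursive are invented for illustration, "prefix - 1" is read here as "the prefix without its last character", and PHP 7 or later is assumed.

```php
<?php
// Hypothetical sketch of the steps above: shrink the candidate prefix by one
// character and recurse until every string starts with it.
function recursivePrefix(array $strings, string $prefix): string {
    if ($prefix === '') {
        return ''; // step 3: no prefix
    }
    foreach ($strings as $s) {
        // step 4.1: does $s start with $prefix?
        if (strncmp($s, $prefix, strlen($prefix)) !== 0) {
            // step 4.1.1: drop the last character and try again
            return recursivePrefix($strings, substr($prefix, 0, -1));
        }
    }
    return $prefix; // step 4.2
}

// Steps 1-2: start with the first string as the candidate prefix.
function commonPrefixRecursive(array $strings): string {
    if (count($strings) === 0) {
        return '';
    }
    return recursivePrefix($strings, reset($strings));
}
```

Note that the recursion depth can reach the length of the first string, and every level rescans the array from the start, so this is more a direct transcription of the steps than an efficient solution.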
// Common prefix
$common = '';
$sports = array(
'Softball T - Counties',
'Softball T - Eastern',
'Softball T - North Harbour',
'Softball T - South',
'Softball T - Western'
);
// find the length of the shortest string
$minLen = strlen($sports[0]);
foreach ($sports as $s){
if($minLen > strlen($s))
$minLen = strlen($s);
}
// flag to break out of inner loop
$flag = false;
// The possible common string length does not exceed the minimum string length.
// The following solution is O(n^2); this can be improved.
for ($i = 0 ; $i < $minLen; $i++){
$tmp = $sports[0][$i];
foreach ($sports as $s){
if($s[$i] != $tmp)
$flag = true;
}
if($flag)
break;
else
$common .= $sports[0][$i];
}
print $common;
The solutions here work only for finding commonalities at the beginning of strings. Here is a function that looks for the longest common substring anywhere in an array of strings.
http://www.christopherbloom.com/2011/02/24/find-the-longest-common-substring-using-php/
The top answer seemed a bit long, so here's a concise solution with a runtime of O(n^2).
function findLongestPrefix($arr) {
return array_reduce($arr, function($prefix, $item) {
$length = min(strlen($prefix), strlen($item));
while (substr($prefix, 0, $length) !== substr($item, 0, $length)) {
$length--;
}
return substr($prefix, 0, $length);
}, $arr[0]);
}
print findLongestPrefix($sports); // Softball -
For what it's worth, here's another alternative I came up with.
I used this for finding the common prefix for a list of product codes (i.e. where there are multiple product SKUs that have a common series of characters at the start):
/**
* Try to find a common prefix for a list of strings
*
* @param array $strings
* @return string
*/
function findCommonPrefix(array $strings)
{
$prefix = '';
$chars = array_map("str_split", $strings);
$matches = call_user_func_array("array_intersect_assoc", $chars);
if ($matches) {
$i = 0;
foreach ($matches as $key => $value) {
if ($key != $i) {
unset($matches[$key]);
}
$i++;
}
$prefix = join('', $matches);
}
return $prefix;
}
This is an addition to @Gumbo's answer. If you want to ensure that the chosen common prefix does not break words, use this. I am just having it look for a blank space at the end of the chosen string. If that exists we know that there was more to all of the phrases, so we truncate it.
function product_name_intersection($array){
$pl = 0; // common prefix length
$n = count($array);
$l = strlen($array[0]);
$first = current($array);
while ($pl < $l) {
$c = $array[0][$pl];
for ($i=1; $i<$n; $i++) {
if (!isset($array[$i][$pl]) || $array[$i][$pl] !== $c) break 2;
}
$pl++;
}
$prefix = substr($array[0], 0, $pl);
if ($pl < strlen($first) && substr($prefix, -1, 1) != ' ') {
$prefix = preg_replace('/\W\w+\s*(\W*)$/', '$1', $prefix);
}
$prefix = preg_replace('/^\W*(.+?)\W*$/', '$1', $prefix);
return $prefix;
}
Sharing a TypeScript solution for this question. I split it into 2 methods, just to keep it clean while at it.
function longestCommonPrefix(strs: string[]): string {
let output = '';
if(strs.length > 0) {
output = strs[0];
if(strs.length > 1) {
for(let i=1; i <strs.length; i++) {
output = checkCommonPrefix(output, strs[i]);
}
}
}
return output;
};
function checkCommonPrefix(str1: string, str2: string): string {
let output = '';
let len = Math.min(str1.length, str2.length);
let i = 0;
while(i < len) {
if(str1[i] === str2[i]) {
output += str1[i];
} else {
i = len;
}
i++;
}
return output;
}
I have a table that looks like this:
        <22    23-27
8-10    1.3    1.8
11-13   2.2    2.8
14-16   3.2    3.8
and it goes on. So I'd like to lookup a value like this:
lookup(11,25)
and get the response, in this case 2.8. What is the best data structure to use for this? I have the data in CSV format.
I'm looking to program this in PHP.
Thank you.
I'm certainly not claiming this is the best or most efficient data structure, but this is how I'd map your data into a two-dimensional PHP array that very closely resembles your raw data:
$fp = fopen('data.csv', 'r');
$cols = fgetcsv($fp);
array_shift($cols); // remove empty first item
$data = array();
while ($row = fgetcsv($fp)) {
list($min, $max) = explode('-', $row[0]);
// TODO: Handle non-range values here (e.g. column header "<22")
$data["$min-$max"] = array();
for ($x = 0; $x < count($cols); $x++) {
$data["$min-$max"][$cols[$x]] = $row[$x + 1];
}
}
You'd then need to add some parsing logic in your lookup function:
function lookup($row, $col) {
global $data; // $data is built at file scope above, so bring it into the function's scope
$return = null;
// Loop through all rows
foreach ($data as $row_name => $cols) {
list($min, $max) = explode('-', $row_name);
if ($min <= $row && $max >= $row) {
// If row matches, loop through columns
foreach ($cols as $col_name => $value) {
// TODO: Add support for "<22"
list($min, $max) = explode('-', $col_name);
if ($min <= $col && $max >= $col) {
$return = $value;
break;
}
}
break;
}
}
return $return;
}
How about some kind of two dimensional data structure.
X "coordinates" being <22, 23-27
Y "coordinates" being ...
A two dimensional Array would probably work for this purpose.
You will then need some function to map the specific X and Y values to the ranges, but that should not be too hard.
Database structure:
values
------
value
x_range_start
x_range_end
y_range_start
y_range_end
Code:
function lookup(x, y) {
sql = "
SELECT * FROM values
WHERE
x >= x_range_start
AND
x <= x_range_end
AND
y >= y_range_start
AND
y <= y_range_end
"
/---/
}
Your data would map to the database like so:
        <22    23-27
8-10    1.3    1.8
11-13   2.2    2.8
14-16   3.2    3.8
(value, x start, x end, y start, y end)
1.3, 0, 22, 8, 10
1.8, 23, 27, 8, 10
2.2, 0, 22, 11, 13
...
Basically store the x and y axis start and end numbers for each value in the table.
I'm partial to the 2 Dimensional array with a "hash" function that maps the ranges into specific addresses in the table.
So your underlying data structure would be a 2 dimensional array:
     0     1
0    1.3   1.8
1    2.2   2.8
2    3.2   3.8
Then you would write two functions:
int xhash(int);
int yhash(int);
That take the original arguments and convert them into indexes into your array. So xhash performs the conversion:
8-10 0
11-13 1
14-16 2
Finally, your lookup operation becomes.
function lookup($x, $y)
{
global $data; // the 2 dimensional array shown above, defined at file scope
$xIndex = xhash($x);
$yIndex = yhash($y);
// Handle invalid indices!
return $data[$xIndex][$yIndex];
}
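For the small example table, the two index functions could be sketched like this. It is only an illustration under stated assumptions: the ranges are hard-coded from the three rows and two columns shown in the question (the real table "goes on"), the names lookupByIndex and $table are made up, -1 is used as an arbitrary "no range matched" marker, and PHP 7.1 or later is assumed for the nullable return type.

```php
<?php
// Hypothetical range-to-index helpers for the example table.
// Both return -1 when the value falls outside every known range.
function xhash(int $x): int {
    if ($x >= 8 && $x <= 10) return 0;   // row "8-10"
    if ($x >= 11 && $x <= 13) return 1;  // row "11-13"
    if ($x >= 14 && $x <= 16) return 2;  // row "14-16"
    return -1;
}

function yhash(int $y): int {
    if ($y < 22) return 0;               // column "<22"
    if ($y >= 23 && $y <= 27) return 1;  // column "23-27"
    return -1;                           // e.g. 22 is not covered by the columns shown
}

// Hypothetical lookup that takes the table as a parameter instead of a global.
function lookupByIndex(array $data, int $x, int $y): ?float {
    $xIndex = xhash($x);
    $yIndex = yhash($y);
    if ($xIndex < 0 || $yIndex < 0) {
        return null; // invalid indices
    }
    return $data[$xIndex][$yIndex];
}

$table = array(
    array(1.3, 1.8),
    array(2.2, 2.8),
    array(3.2, 3.8),
);
```

With this table, lookupByIndex($table, 11, 25) returns 2.8, matching the lookup(11,25) example from the question, while a value outside every range yields null.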
Well, the other answers all use 2D arrays, which means using a 2D loop to retrieve it. Which, if your ranges are age ranges or something similar, may be finite (there are only so many age ranges!), and not an issue (what's a few hundred iterations?). If your ranges are expected to scale to enormous numbers, a play on a hash map may be your best bet. So, you create a hashing function that turns any number into the relevant range, then you do direct lookups, instead of a loop. It would be O(1) access instead of O(n^2).
So your hash function could be like: function hash(n) { if (n < 22) return 1; if (n < 25) return 2; return -1; }, and then you can specify your ranges in terms of those hash values (1, 2, etc.), and then just go $data[hash(11)][hash(25)]
The simplest option: create an array of arrays, where each inner array consists of 5 elements: minX, maxX, minY, maxY, value. In your case it would be
$data = array(
array(8, 10, 0, 22, 1.3),
array(8, 10, 23, 27, 1.8),
array(11, 13, 0, 22, 2.2),
// etc.
);
Then write a loop that goes through every element and compares the min and max values with your arguments:
function find($x, $y) {
global $data; // $data is defined at file scope above
foreach($data as $e) {
if($x >= $e[0] && $x <= $e[1] && $y >= $e[2] && $y <= $e[3])
return $e[4];
}
return null; // no range matched
}
With a small dataset this will work fine; if your dataset is bigger you should consider using a database.