What's wrong with my semi complete Sudoku solver? - php

I'm trying to practice algorithm questions and am currently attempting a Sudoku solver. Please bear in mind that it isn't finished yet: I haven't added backtracking for cells that could take more than one value. My current issue is that the if statement that checks whether a cell has only one possible value is failing, because the partially filled Sudoku it returns is wrong.
Also feel free to give me tips on how to speed things up etc.
function sudoku(array $puzzle): array
{
// Return the solved puzzle as a 9 × 9 grid
$data = [];
for ($i = 0; $i < 8; $i++) {
for ($a = 0; $a < 8; $a++) {
if ($puzzle[$i][$a] == 0) {
$horizontal_missing = getHorizontalNumbers($puzzle[$i]);
$vertical_missing = getVerticalNumbers($puzzle, $a);
$square_missing = getSquareNumbers($puzzle, $i, $a);
$intersect = array_intersect($horizontal_missing,$vertical_missing,$square_missing);
if (count($intersect) == 1) {
sort($intersect);
$puzzle[$i][$a] = $intersect[0];
$i = 0;
$a = 0;
}
}
}
}
return $puzzle;
}
function getSquareNumbers($p, $row, $col)
{
$sectors = [1 => [0, 2], 2 => [3, 5], 3 => [6, 8]];
$across = getSector($sectors, $row);
$down = getSector($sectors, $col);
switch (($across * $down)) {
case 1:
$row = [
$p[0][0], $p[0][1], $p[0][2],
$p[1][0], $p[1][1], $p[1][2],
$p[2][0], $p[2][1], $p[2][2]
];
break;
case 2:
$row = [
$p[0][3], $p[0][4], $p[0][5],
$p[1][3], $p[1][4], $p[1][5],
$p[2][3], $p[2][4], $p[2][5]
];
break;
case 3:
$row = [
$p[0][6], $p[0][7], $p[0][8],
$p[1][6], $p[1][7], $p[1][8],
$p[2][6], $p[2][7], $p[2][8]
];
break;
case 4:
$row = [
$p[3][0], $p[3][1], $p[3][2],
$p[4][0], $p[4][1], $p[4][2],
$p[5][0], $p[5][1], $p[5][2]
];
break;
case 5:
$row = [
$p[3][3], $p[3][4], $p[3][5],
$p[4][3], $p[4][4], $p[4][5],
$p[5][3], $p[5][4], $p[5][5]
];
break;
case 6:
$row = [
$p[3][6], $p[3][7], $p[3][8],
$p[4][6], $p[4][7], $p[4][8],
$p[5][6], $p[5][7], $p[5][8]
];
break;
case 7:
$row = [
$p[6][0], $p[6][1], $p[6][2],
$p[7][0], $p[7][1], $p[7][2],
$p[8][0], $p[8][1], $p[8][2]
];
break;
case 8:
$row = [
$p[6][3], $p[6][4], $p[6][5],
$p[7][3], $p[7][4], $p[7][5],
$p[8][3], $p[8][4], $p[8][5]
];
break;
case 9:
$row = [
$p[6][6], $p[6][7], $p[6][8],
$p[7][6], $p[7][7], $p[7][8],
$p[8][6], $p[8][7], $p[8][8]
];
break;
}
return getHorizontalNumbers($row);
}
function getSector($sectors, $num)
{
if (($sectors[1][0] <= $num) && ($num <= $sectors[1][1])) {
return 1;
} else if (($sectors[2][0] <= $num) && ($num <= $sectors[2][1])) {
return 2;
} else if (($sectors[3][0] <= $num) && ($num <= $sectors[3][1])) {
return 3;
}
}
function getHorizontalNumbers($row)
{
$missing = [];
for ($i = 1; $i <= 9; $i++) {
if (!in_array($i, $row)) {
$missing[] = $i;
}
}
return $missing;
}
function getVerticalNumbers($puzzle, $col)
{
$row = [
$puzzle[0][$col],
$puzzle[1][$col],
$puzzle[2][$col],
$puzzle[3][$col],
$puzzle[4][$col],
$puzzle[5][$col],
$puzzle[6][$col],
$puzzle[7][$col],
$puzzle[8][$col]
];
return getHorizontalNumbers($row);
}
$data = sudoku([
[5, 3, 0, 0, 7, 0, 0, 0, 0],
[6, 0, 0, 1, 9, 5, 0, 0, 0],
[0, 9, 8, 0, 0, 0, 0, 6, 0],
[8, 0, 0, 0, 6, 0, 0, 0, 3],
[4, 0, 0, 8, 0, 3, 0, 0, 1],
[7, 0, 0, 0, 2, 0, 0, 0, 6],
[0, 6, 0, 0, 0, 0, 2, 8, 0],
[0, 0, 0, 4, 1, 9, 0, 0, 5],
[0, 0, 0, 0, 8, 0, 0, 7, 9]
]);
$string = '';
$count = 0;
foreach($data as $key => $row){
foreach($row as $cell){
$count++;
if ($count == 9){
$string.= $cell.", \n";
$count = 0;
} else {
$string.= $cell.", ";
}
}
}
echo nl2br($string);
Surely, if I only fill in a cell when there is exactly ONE value common to the vertical line, the horizontal line and the square, there shouldn't be any errors in the SEMI-filled Sudoku so far, yet there are. What am I missing? My brain can't compute lol.

function test_sectors($puzzle=null) {
if (!$puzzle) {
$puzzle = [
[1,1,1, 2,2,2, 3,3,3],
[1,1,1, 2,2,2, 3,3,3],
[1,1,1, 2,2,2, 3,3,3],
[4,4,4, 5,5,5, 6,6,6],
[4,4,4, 5,5,5, 6,6,6],
[4,4,4, 5,5,5, 6,6,6],
[7,7,7, 8,8,8, 9,9,9],
[7,7,7, 8,8,8, 9,9,9],
[7,7,7, 8,8,8, 9,9,9]
];
}
for ($i=0; $i<9; $i++) {
for ($j=0; $j<9; $j++) {
$expected = $puzzle[$i][$j];
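$missing = getSquareNumbers($puzzle, $i, $j);
// getSquareNumbers returns the digits MISSING from the square, so the right
// square was read only if $expected is the single digit that is not missing
if (count($missing) == 8 && !in_array($expected, $missing)) {
echo ".";
} else {
echo "F (wrong square read at coord [{$i}][{$j}], expected sector {$expected})\n";
}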
}
}
}
test_sectors();
I wrote this function (I don't have PHP at hand right now, so it is untested) to check the behaviour of your getSquareNumbers function, which looks odd to me. You can use it to see whether the correct square is selected for every possible coordinate. You can automate it too, but that is up to you.
Edit:
Rewritten in unit-test style!

In getSquareNumbers you assume that $across * $down gives a unique value for each square, but that is not true:
For example, 1*3 is 3, but 3*1 is also 3. Furthermore, the product can never be 5, 7 or 8, so those cases are never reached. So in many cases your code is looking at the wrong square.
You can avoid a lot of duplicated code by using the modulo operator and slicing the rows of the grid at the computed indices.
Secondly, you seem to swap the meaning of $down and $across. The first should be derived from the row and the second from the column, but the code relates them the other way round.
Here is the suggested fix for that function:
function getSquareNumbers($p, $row, $col)
{
$down = $row - $row % 3; // First row of the square: 0, 3 or 6
$across = $col - $col % 3; // First column of the square: 0, 3 or 6
// The third argument of array_slice is a length, not an end index
$row = [...array_slice($p[$down], $across, 3),
...array_slice($p[$down+1], $across, 3),
...array_slice($p[$down+2], $across, 3)];
return getHorizontalNumbers($row);
}
Now your grid will be filled more. It may still contain some zeroes: those are cells for which this single-candidate rule cannot settle a value on its own. Note also that the loops in sudoku() run while $i < 8 and $a < 8, so row 8 and column 8 are never visited; the bounds should be < 9. For the cells that remain, you need to implement backtracking to continue the process in an alternative direction.
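The backtracking step mentioned above could look roughly like the sketch below. It is not the asker's or the answerer's code: candidates() and solve() are hypothetical names, the grid is assumed to be a 9 × 9 array of integers with 0 for empty cells, and the search simply tries every legal digit in the first empty cell and undoes the guess when it leads to a dead end.

```php
<?php
// Hypothetical helpers: candidates() and solve() are illustrative names,
// not functions from the question or the answer.

// Digits that can legally go into cell ($r, $c) of a 9 x 9 grid (0 = empty).
function candidates(array $grid, int $r, int $c): array
{
    $used = [];
    for ($k = 0; $k < 9; $k++) {
        $used[$grid[$r][$k]] = true; // same row
        $used[$grid[$k][$c]] = true; // same column
    }
    $r0 = $r - $r % 3; // top-left corner of the 3 x 3 square
    $c0 = $c - $c % 3;
    for ($i = $r0; $i < $r0 + 3; $i++) {
        for ($j = $c0; $j < $c0 + 3; $j++) {
            $used[$grid[$i][$j]] = true;
        }
    }
    $free = [];
    for ($d = 1; $d <= 9; $d++) {
        if (!isset($used[$d])) {
            $free[] = $d;
        }
    }
    return $free;
}

// Fills $grid in place; returns false if the grid has no solution.
function solve(array &$grid): bool
{
    for ($r = 0; $r < 9; $r++) {
        for ($c = 0; $c < 9; $c++) {
            if ($grid[$r][$c] != 0) {
                continue; // already filled
            }
            foreach (candidates($grid, $r, $c) as $d) {
                $grid[$r][$c] = $d; // tentative guess
                if (solve($grid)) {
                    return true;
                }
            }
            $grid[$r][$c] = 0; // every guess failed: undo and backtrack
            return false;
        }
    }
    return true; // no empty cell left
}
```

Calling solve($puzzle) on the grid from the question fills it in place. Filling the single-candidate cells first, as the original loop tries to do, would only be an optimisation on top of this.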

Related

Get all the combinations a number can be split into

I have a fixed number (e.g. 50) and a fixed length (e.g. 2):
I want to find all the combinations of that length that sum to the number (both can vary, of course).
For example, for 50, and a length of 2, I want to get:
[[1, 49], [2, 48], [3, 47], [4, 46], ...],
For 50 and a length of 4, I want to get:
[[1, 1, 1, 47], [1, 1, 2, 46], [1, 1, 3, 45], [1, 1, 4, 44], ...],
I have no clue how to do this, or whether there's a name in mathematics for this.
This is what I have (doesn't have to use generators):
function getCombinations(array $numbers, int $sum, int $setLength) : \Generator {
$numbersLowestFirst = $numbers;
asort($numbersLowestFirst);
$lowestNumber = 0;
foreach ($numbersLowestFirst as $number) {
// array might be associative and we want to preserve the keys
$lowestNumber = $number;
break;
}
$remainingAmount = $sum - $lowestNumber;
$remainingNumberofItems = $setLength - 1;
$possibleItemCombinations = array_pad([$remainingAmount], $remainingNumberofItems, $lowestNumber);
}
I first wrote it in JS, then translated it to PHP. See below for a JS demonstration. The idea is what @Barmar described in the comments.
Note: There will be duplicates. If you care about that, sort each combination ("triplet") and drop the repeats. See the JS solution below for that.
function getCombinations($sum, $n)
{
if ($n == 1) {
return [[$sum]];
}
$arr = [];
for ($i = 0; $i <= $sum / 2; $i++) {
$combos = getCombinations($sum - $i, $n - 1);
foreach ($combos as $combo) {
$combo[] = $i;
$arr[] = $combo;
}
}
return ($arr);
}
The same idea in JS, which is more convenient to test in a browser.
function getCombinations(sum, n) {
if (n == 1) return [[sum]];
var arr = []
for (var i = 0; i <= sum / 2; i++) {
var combos = getCombinations(sum - i, n - 1)
combos.forEach(combo => {
arr.push([i, ...combo]);
})
}
// removing duplicates
arr = Object.values(arr.reduce(function(agg, triplet) {
triplet = triplet.sort((a, b) => a - b); // numeric sort; the default sort compares as strings
agg["" + triplet] = triplet;
return agg;
}, {}))
return arr;
}
console.log(getCombinations(5, 3))
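If the duplicates are unwanted, one option is to never generate them in the first place. The sketch below is an assumption about what the asker wants (positive parts only, order ignored): it builds each combination in non-decreasing order, so every multiset appears exactly once. partitions() and its $min parameter are hypothetical names, not part of the answer above.

```php
<?php
// Hypothetical helper: all ways to write $sum as $length positive integers,
// listed in non-decreasing order so that no combination appears twice.
function partitions(int $sum, int $length, int $min = 1): array
{
    if ($length === 1) {
        // The last part takes whatever is left, if it is large enough.
        return $sum >= $min ? [[$sum]] : [];
    }
    $result = [];
    // The first part can be at most $sum / $length; otherwise the remaining
    // parts could not all be at least as large as it.
    for ($first = $min; $first * $length <= $sum; $first++) {
        foreach (partitions($sum - $first, $length - 1, $first) as $rest) {
            $result[] = array_merge([$first], $rest);
        }
    }
    return $result;
}

echo json_encode(partitions(5, 3)); // prints [[1,1,3],[1,2,2]]
```

For a sum of 50 and a length of 2 this yields the 25 pairs from [1, 49] to [25, 25].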

best way to count the bigger elements on the right and left side of an array

For example, in php
$arr = [9, 4, 3, 5, 2, 6];
then,
$output = [[0,0], [1,2], [2,2], [1,1], [4,1], [1,0]];
[0, 0] = 9 has 0 bigger elements on both sides
[1, 2] = 4 has 1 bigger element on the left (9) and 2 on the right (5, 6) ... [ 9 > 4 ] - [ 5 > 4, 6 > 4 ]
[2, 2] = 3 has 2 bigger elements on the left (9, 4) and 2 on the right (5, 6)
[1, 1] = 5 has 1 bigger element on the left (9) and 1 on the right (6)
[4, 1] = 2 has 4 bigger elements on the left (9, 4, 3, 5) and 1 on the right (6)
[1, 0] = 6 has 1 bigger element on the left (9) and 0 on the right (there are no elements after 6)
I want it in O(n log(n)), is it possible?
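Looks like there are lots of answers already. You could do this with some of the array functions, as some of the answers did, but since this is probably for your homework it's best to keep it simple. I keep track of whether the left or the right count should be incremented on each iteration by using the $side variable. Note that this relies on the values being distinct: $side flips as soon as an equal value is seen, so a duplicate to the left of the current element would flip it too early.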
$arr = [9, 4, 3, 5, 2, 6];
$results = [];
for ($x = 0; $x < count($arr); $x++) {
$results[$x] = [0,0];
$side = 0;
for ($y = 0; $y < count($arr); $y++) {
if ($arr[$y] > $arr[$x]) {
$results[$x][$side]++;
} elseif ($arr[$x] == $arr[$y]) {
$side = 1;
}
}
}
You need to loop through $arr to get each value, and then, in the loop, loop again through $arr to get the other values. Then, in the second loop, you build your output array by comparing both the value (to know if the number is, indeed, bigger) and the key (to know if it's on the left or the right).
$arr = array(9, 4, 3, 5, 2, 6);
$output = array();
foreach ($arr as $key=>$value) {
$out = array(0, 0);
foreach ($arr as $key2=>$value2) {
if ($key2 == $key) # If it's the same element
continue;
if ($value2 > $value) {
if ($key2 < $key)
$out[0]++;
else
$out[1]++;
}
}
$output[] = $out;
}
print_r($output);
Try this:
function fix_array($array) {
$return_array = array();
foreach ($array as $i => $value){
$left = array_slice($array, 0, $i);
$count_left = count(array_filter($left, function($var) use($value){
return $var > $value;
}));
$right = array_slice($array, $i + 1);
$count_right = count(array_filter($right, function($var) use($value){
return $var > $value;
}));
$return_array[] = [$count_left, $count_right];
}
return $return_array;
}
$arr = [9, 4, 3, 5, 2, 6];
$new_array = fix_array($arr);
print_r($new_array);
Simply compare it with left and right values. Try this:
$arr = [9, 4, 3, 5, 2, 6];
$total = count($arr);
$new_arr=array();
foreach ($arr as $key => $value) {
$left = 0;
$right = 0;
for ($i=0; $i < $total; $i++) {
if($key > $i && $arr[$i] > $arr[$key])
{
$left++;
}
elseif ($key < $i && $arr[$i] > $arr[$key]) {
$right++;
}
}
$new_arr[]=[$left,$right];
}
echo "<pre>";
print_r($new_arr);
Try the following code using array_walk()
<?php
$arr = [9, 4, 3, 5, 2, 6];
$finalArray =[];
array_walk($arr, function($value,$key) use(&$finalArray,&$arr) {
$param ['pre_val']=0;
$param ['post_val']=0;
$param ['current_index'] = $key;
$param ['current_value'] = $value;
$arr2 = $arr;
array_walk($arr2, function(&$value,$key) use(&$finalArray,&$param) {
if($key < $param['current_index']){
if($value > $param['current_value']){$param['pre_val'] ++;}
}else{
if($value > $param['current_value']){$param['post_val'] ++;}
}
$finalArray[$param['current_index']][0] = $param['pre_val'];
$finalArray[$param['current_index']][1] = $param['post_val'];
});
});
print_r($finalArray);
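All of the loops above compare every pair of elements, so they run in O(n²). The O(n log n) bound asked for in the question is reachable with a Fenwick tree (binary indexed tree) over the ranks of the values. The following is a sketch under the assumption that the values are integers; countGreaterBothSides() is a hypothetical name.

```php
<?php
// Hypothetical helper: for each element, count the strictly greater elements
// to its left and to its right, in O(n log n) using a Fenwick tree.
function countGreaterBothSides(array $arr): array
{
    $n = count($arr);
    // Coordinate compression: map each distinct value to a rank 1..$m.
    $sorted = array_values(array_unique($arr));
    sort($sorted);
    $rank = array_flip($sorted); // value => 0-based rank (integer values assumed)
    $m = count($sorted);

    $result = array_fill(0, $n, [0, 0]);
    foreach ([0, 1] as $side) { // 0 = scan left to right, 1 = scan right to left
        $tree = array_fill(0, $m + 1, 0);
        for ($seen = 0; $seen < $n; $seen++) {
            $i = $side === 0 ? $seen : $n - 1 - $seen;
            $r = $rank[$arr[$i]] + 1;
            // How many already-seen elements are <= the current one?
            $notGreater = 0;
            for ($p = $r; $p > 0; $p -= $p & -$p) {
                $notGreater += $tree[$p];
            }
            $result[$i][$side] = $seen - $notGreater;
            // Record the current element in the tree.
            for ($p = $r; $p <= $m; $p += $p & -$p) {
                $tree[$p]++;
            }
        }
    }
    return $result;
}

echo json_encode(countGreaterBothSides([9, 4, 3, 5, 2, 6]));
// prints [[0,0],[1,2],[2,2],[1,1],[4,1],[1,0]]
```

Each element costs two O(log n) tree operations per scan, and duplicates are handled because only strictly greater values are counted.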

Find key of array based on variable value between array values [duplicate]

This question already has answers here:
Find a matching or closest value in an array
(13 answers)
Closed 7 years ago.
I need to find the "left-near" key of a fixed array for a given value, i.e. the key of the largest element that is not greater than the searched value.
The searched value is (in this case) always between 1 and 779.
An example makes it clearer:
$fixedArr = [ 0, 5, 8, 20, 40, 60, 90, 135, 780 ];
$search = 42; // $result = $arr[4] -> 4;
$search = 110; // $result = $arr[6] -> 6;
$search = 134; // $result = $arr[6] -> 6;
$search = 135; // $result = $arr[7] -> 7;
I tried a foreach loop, but with no luck. Any ideas?
Thanks
searched value is always between (in this case) 1 and 779
$fixedArr = [ 0, 5, 8, 20, 40, 60, 90, 135, 780 ];
$search = 42;
for ($i = 0; $i < count($fixedArr); $i++)
if ($search < $fixedArr[$i]) break;
echo $i-1;
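Because $fixedArr is sorted, the linear scan above can be replaced by a binary search when the array is large. This is only a sketch: leftNearKey() is a hypothetical name, and it assumes an ascending array with consecutive integer keys starting at 0.

```php
<?php
// Hypothetical helper: key of the largest element that is <= $search,
// or -1 if every element is greater than $search.
function leftNearKey(array $sorted, int $search): int
{
    $lo = 0;
    $hi = count($sorted) - 1;
    $key = -1;
    while ($lo <= $hi) {
        $mid = intdiv($lo + $hi, 2);
        if ($sorted[$mid] <= $search) {
            $key = $mid;     // candidate found; a better one may lie to the right
            $lo = $mid + 1;
        } else {
            $hi = $mid - 1;  // too large; look to the left
        }
    }
    return $key;
}

$fixedArr = [0, 5, 8, 20, 40, 60, 90, 135, 780];
echo leftNearKey($fixedArr, 42); // prints 4
```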
This may help you:
$fixedArr = [ 0, 5, 8, 20, 40, 60, 90,135,780 ];
//
$search = 111; // nearest is on the left: key 6 (value 90)
//$search = 110; // $result = $arr[6] -> 6;
//
function leftORright($fixedArr,$search){
$max = max($fixedArr)+1;
$near = array(
'left'=>array('key'=>'none','value'=>'none','bool'=>false),
'right'=>array('key'=>'none','value'=>$max,'bool'=>false),
'center'=>array('key'=>'none','value'=>'none','bool'=>false)
);
foreach($fixedArr as $k=>$v){
if($v == $search){
$near['center']['key'] = $k;
$near['center']['value'] = $v;
}
if($v < $search){
$near['left']['key'] = $k;
$near['left']['value'] = $v;
}
if($v > $search and $near['right']['value'] > $v){
$near['right']['key'] = $k;
$near['right']['value'] = $v;
}
}
//decide near left or right
$respright = $near['right']['value'] - $search;
$respleft = $search - $near['left']['value'] ;
$right_left_equals = false;
if($near['center']['value'] !== 'none'){
$near['center']['bool'] = true;
}else if($respleft < $respright && $near['left']['key']!='none'){
$near['left']['bool'] = true;
}else if($respleft > $respright && $near['right']['key']!='none'){
$near['right']['bool'] = true;
}else if($near['center']['value'] != 'none'){
$near['center']['value'] = true;
}else{
$right_left_equals = true;
}
//var_dump($near);
//Result is:
foreach($near as $k=>$v){
foreach($v as $k2=>$v2){
if($v2===true){
var_dump('near is for '.$k);
return $v;
}
}
}
//equal for right and left
if($right_left_equals){
var_dump('near right left are equals');
return array($near['right'],$near['left']);
}
}
$result = leftORright($fixedArr,$search);
var_dump($result);
response:
string 'near is for left' (length=16)
array (size=3)
'key' => int 6
'value' => int 90
'bool' => boolean true
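array_search only helps when the searched value is itself an element of the array:
array_search — Searches the array for a given value and returns the corresponding key if successful
$key = array_search(135, $fixedArr); // -> 7
$key = array_search(42, $fixedArr); // -> false, because 42 is not an element of the array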
Your question is similar to finding the closest one:
<?php
$fixedArr = [ 0, 5, 8, 20, 40, 60, 90, 135, 780 ];
function getClosest($search, $arr) {
$left = 0;
foreach ($arr as $val) {
if ($search > $val)
$left = $val;
elseif ($search < $val) {
$right = $val;
break;
}
else {
$right = $val;
break;
}
}
return array($left, $right, array_search($left, $arr), array_search($right, $arr), (($search - $left) > ($right - $search) ? array_search($right, $arr) : array_search($left, $arr)));
}
print_r(getClosest(4, $fixedArr));
?>
Try this. It works as long as the searched value is at least the smallest and less than the largest array value:
$fixedArr = [ 0, 5, 8, 20, 40, 60, 90, 135, 780 ];
$find = 14;
for ($i=0; $i < count($fixedArr); $i++) {
if($fixedArr[$i] <= $find){
$large[] = $fixedArr[$i];
}else{
$small[] = $fixedArr[$i];
}
}
$near1 = max($large);
$near2 = min($small);
echo "Value $find lies between $near1 and $near2";
echo "<br>";
if($find >= ($near1 + $near2) / 2){
echo "Closest value is: $near2";
}else{
echo "Closest value is: $near1";
}
Output:
Value 14 lies between 8 and 20
Closest value is: 20

Can we include condition as an argument in custom functions?

Suppose I have a multidimensional array
$num1 = array(1, 4, 6, 12, 15, 16, 21, 34, 25, 29);
$num2 = array(1, 5, 18, 19, 23, 19, 23, 45, 23, 16);
$array = array($num1, $num2);
I want to extract all the values from $array where the $array[0] values meet some condition e.g. have a value between 10 & 20.
to get the required values from $array I can use this code:
$count = count($array[0]);
$new_array = array_fill(0, $count, array());
for($i = 0; $i < $count; $i++)
{
if($array[0][$i] >= 10 && $array[0][$i] <= 20)
{
$new_array[0][] = $array[0][$i];
$new_array[1][] = $array[1][$i];
}
}
//I get the array that I need
print_r($new_array);
This is the code I need to change every time
$array[0][$i] >= 10 && $array[0][$i] <= 20 (condition --> values from the first sub array are >= 10 and <= 20)
result would be
Array(0 => Array(0 => 12, 1 => 15, 2 => 16), 1 => Array(0 => 19, 1 => 23, 2 => 19))
another condition
$array[1][$i] >= 20 && $array[1][$i] <= 30 (condition --> values from the second sub array are >= 20 and <= 30)
result would be
Array(0 => Array(0 => 15, 1 => 21, 2 => 25), 1 => Array(0 => 23, 1 => 23, 2 => 23))
I need to do such operations with different columns using different conditions. So, instead of writing code for looping every time, I want to create a function with condition as an argument. Is it possible, if so how?
I would like to have a function with three arguments as shown below
function_name ($array, $column_num, $condition)
Any alternative solutions are also welcome.... :)
Code I used finally to get this done....
<?php
function mdarray_condition_extract($array, $column, $condition)
{
$count = count($array[0]);
$nsac = count($array);
$new_array = array_fill(0, $nsac, array());
for ($i = 0; $i < $count; $i++)
{
$valToTest = $array[$column][$i];
if ($condition($valToTest))
{
for($k = 0; $k < $nsac; $k++)
{
$new_array[$k][] = $array[$k][$i];
}
}
}
return $new_array;
}
$array = array(array(1, 4, 6, 12, 15, 16), array(1, 5, 18, 19, 23, 19));
$columns = array(0,1,0);
$conditions = [
0 => function($val){return $val >= 10 && $val <= 20;},
1 => function($val){return $val >= 20 && $val <= 30;},
2 => function($val){return $val == 6 || $val == 20;}
];
$combo = array($columns, $conditions);
$condcount = count($combo[0]);
for($i = 0; $i < $condcount; $i++)
{
print_r(mdarray_condition_extract($array, $combo[0][$i], $combo[1][$i])); echo "<br><br>";
}
?>
Thanks all for the response, it helped me in a great way...!!
Your existing code is very close.
function filterParallelArrays($array, $predicateFilterIndex, $predicate)
{
$count = count($array[0]);
$new_array = array_fill(0, count($array), array()); // one empty result array per sub-array
for ($i = 0; $i < $count; $i++)
{
$valToTest = $array[$predicateFilterIndex][$i];
if ($predicate($valToTest))
{
$new_array[0][] = $array[0][$i];
$new_array[1][] = $array[1][$i];
}
}
return $new_array;
}
$predicate = function($val)
{
return $val >= 10 && $val <= 20;
};
print_r(filterParallelArrays($array, 0, $predicate));
Your question is about filtering your arrays. For that, PHP has the built-in function array_filter(). It takes a callback for filtering values.
If I understood your question correctly, you want to apply a filter to your sub-arrays, probably with different conditions. Normally, if you do that statically, it is:
$array[0] = array_filter($array[0], function($x)
{
return $x>=10 && $x<=20; //item between 10 and 20
});
But if you have a predefined condition for each column, you can fill a map:
$conditions = [
//column index => condition callback:
0 => function($x){ return $x>=10 && $x<=20; },
1 => function($x){ return $x==50 || $x==80; },
// etc.
];
foreach($array as $key=>$column)
{
if(array_key_exists($key, $conditions))
{
$array[$key] = array_filter($array[$key], $conditions[$key]);
}
}
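So array_filter() lets you do this natively. I don't think you need your own custom function, because wrapping a native function makes little sense in this case. Note that array_filter() preserves the original keys and filters each sub-array independently, so after filtering the sub-arrays no longer line up position by position.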
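If the sub-arrays have to stay aligned, as in the expected results shown in the question, the key-preserving behaviour of array_filter() can be combined with array_intersect_key(). This is a sketch, not the answerer's code; filterColumns() is a hypothetical name.

```php
<?php
// Hypothetical helper: test $condition on one sub-array and keep the matching
// positions in every sub-array, so the columns stay in sync.
function filterColumns(array $array, int $column, callable $condition): array
{
    // array_filter preserves keys, so the keys of $kept are the positions that passed.
    $kept = array_filter($array[$column], $condition);
    $result = [];
    foreach ($array as $row) {
        // Keep the same positions in each sub-array, then reindex from 0.
        $result[] = array_values(array_intersect_key($row, $kept));
    }
    return $result;
}

$array = [
    [1, 4, 6, 12, 15, 16, 21, 34, 25, 29],
    [1, 5, 18, 19, 23, 19, 23, 45, 23, 16],
];
$between10and20 = function ($x) { return $x >= 10 && $x <= 20; };
echo json_encode(filterColumns($array, 0, $between10and20));
// prints [[12,15,16],[19,23,19]]
```

The condition is an ordinary callable, so any closure can be passed as the third argument.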
Given your latest update, you could do something like this. But I'm still not really sure what your objective is here, so this might be totally out to lunch.
function array_function($array, $key, $lcond, $rcond)
{
// $array has to be passed in; it is not visible inside the function otherwise
$count = count($array[0]);
$new_array = array_fill(0, count($array), array());
for($i = 0; $i < $count; $i++)
{
if($array[$key][$i] >= $lcond && $array[$key][$i] <= $rcond)
{
$new_array[0][] = $array[0][$i];
$new_array[1][] = $array[1][$i];
}
}
// return the filtered sub-arrays
return $new_array;
}

Finding frequent sequence of numbers in an array

Given the array (3, 5, 1, 3, 5, 48, 4, 7, 13, 55, 65, 4, 7, 13, 32),
the frequent sequences of numbers are (3, 5) with f=2 and (4, 7, 13) with f=2.
Is there an algorithm or pseudocode to find them?
Update (1):
If (7, 13) also occurs on its own, it should be counted towards the longest sequence that contains it by updating that sequence's frequency, so
(4, 7, 13) gets f=3, and so on.
Update (2):
In the case of (1,2,3,4,1,2,3,4,1,2,7,8,7,8,3,4,3,4,1,2) the output should be (1,2,3,4), (3,4,1,2)
and (7,8). To make it clear, consider each number as a word; you want to find the most frequent phrases.
It is common to see the same word(s) in many phrases, but a phrase that is a substring of another
phrase should not be reported as a phrase of its own; instead it updates the frequency of each phrase that includes it.
** EDIT ** : slightly better implementation, which now also returns the frequencies and has a better sequence filter.
function getFrequences($input, $minimalSequenceSize = 2) {
$sequences = array();
$frequences = array();
$len = count($input);
for ($i=0; $i<$len; $i++) {
$offset = $i;
for ($j=$i+$minimalSequenceSize; $j<$len; $j++) {
if ($input[$offset] == $input[$j]) {
$sequenceSize = 1;
$sequence = array($input[$offset]);
while (($offset + $sequenceSize < $j)
&& ($j + $sequenceSize < $len) // stay inside the array
&& ($input[$offset+$sequenceSize] == $input[$j+$sequenceSize])) {
if (false !== ($seqIndex = array_search($sequence, $sequences))) {
// we already have this sequence, since we found a bigger one, remove the old one
array_splice($sequences, $seqIndex, 1);
array_splice($frequences, $seqIndex, 1);
}
$sequence[] = $input[$offset+$sequenceSize];
$sequenceSize++;
}
if ($sequenceSize >= $minimalSequenceSize) {
if (false !== ($seqIndex = array_search($sequence, $sequences))) {
$frequences[$seqIndex]++;
} else {
$sequences[] = $sequence;
$frequences[] = 2; // we have two occurances already
}
// $i += $sequenceSize; // move $i so we don't reuse the same sub-sequence
break;
}
}
}
}
// remove sequences that are sub-sequence of another frequence
// ** comment this to keep all sequences regardless **
$len = count($sequences);
for ($i=0; $i<$len; $i++) {
$freq_i = $sequences[$i];
for ($j=$i+1; $j<$len; $j++) {
$freq_j = $sequences[$j];
$freq_inter = array_intersect($freq_i, $freq_j);
if (count($freq_inter) != 0) {
$len--;
if (count($freq_i) > count($freq_j)) {
array_splice($sequences, $j, 1);
array_splice($frequences, $j, 1);
$j--;
} else {
array_splice($sequences, $i, 1);
array_splice($frequences, $i, 1);
$i--;
break;
}
}
}
}
return array($sequences, $frequences);
};
Test case
header('Content-type: text/plain');
$input = array(3, 5, 1, 3, 5, 48, 4, 7, 13, 55, 3, 5, 65, 4, 7, 13, 32, 5, 48, 4, 7, 13);
list($sequences, $frequences) = getFrequences($input);
foreach ($sequences as $i => $s) {
echo "(" . implode(',', $s) . ') f=' . $frequences[$i] . "\n";
}
** EDIT ** : here's an update to the function. It was almost completely rewritten... tell me if this is what you were looking for. I also added a redundancy check to prevent counting the same sequence, or subsequence, twice.
function getFrequences2($input, $minSequenceSize = 2) {
$sequences = array();
$last_offset = 0;
$last_offset_len = 0;
$len = count($input);
for ($i=0; $i<$len; $i++) {
for ($j=$i+$minSequenceSize; $j<$len; $j++) {
if ($input[$i] == $input[$j]) {
$offset = 1;
$sub = array($input[$i]);
while ($i + $offset < $j && $j + $offset < $len) {
if ($input[$i + $offset] == $input[$j + $offset]) {
array_push($sub, $input[$i + $offset]);
} else {
break;
}
$offset++;
}
$sub_len = count($sub);
if ($sub_len >= $minSequenceSize) {
// $sub must contain more elements than the last sequence found
// otherwise we will count the same sequence twice
if ($last_offset + $last_offset_len >= $i + $sub_len) {
// we already saw this sequence... ignore
continue;
} else {
// save offset and sub_len for future check
$last_offset = $i;
$last_offset_len = $sub_len;
}
foreach ($sequences as & $sequence) {
$sequence_len = count($sequence['values']);
if ($sequence_len == $sub_len && $sequence['values'] == $sub) {
//echo "Found add-full ".var_export($sub, true)." at $i and $j...\n";
$sequence['frequence']++;
break 2;
} else {
if ($sequence_len > $sub_len) {
$end = $sequence_len - $sub_len;
$values = $sequence['values'];
$slice_len = $sub_len;
$test = $sub;
} else {
$end = $sub_len - $sequence_len;
$values = $sub;
$slice_len = $sequence_len;
$test = $sequence['values'];
}
for ($k=0; $k<=$end; $k++) {
if (array_slice($values, $k, $slice_len) == $test) {
//echo "Found add-part ".implode(',',$sub)." which is part of ".implode(',',$values)." at $i and $j...\n";
$sequence['values'] = $values;
$sequence['frequence']++;
break 3;
}
}
}
}
//echo "Found new ".implode(',',$sub)." at $i and $j...\n";
array_push($sequences, array('values' => $sub, 'frequence' => 2));
break;
}
}
}
}
return $sequences;
};
In Python3
>>> from collections import Counter
>>> count_hash=Counter()
>>> T=(3, 5, 1, 3, 5, 48, 4, 7, 13, 55, 65, 4, 7, 13, 32)
>>> for i in range(2,len(T)+1):
...     for j in range(len(T)+1-i):
...         count_hash[T[j:j+i]]+=1
...
>>> for k,v in count_hash.items():
...     if v >= 2:
...         print(k,v)
...
(3, 5) 2
(4, 7, 13) 2
(7, 13) 2
(4, 7) 2
Do you need to filter the (7,13) and the (4,7) out? What happens if there was also (99, 7, 14) in the sequence?
a Counter is just like a hash used to keep track of the number of times we see each substring
The two nested for loops produce all the substrings of T, using count_hash to accumulate the count of each substring.
The final for loop filters out all those substrings that only occurred once
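Since the question was asked in a PHP context, the same counting idea can be sketched in PHP as well. This is a rough port, not part of the Python answer: countRuns() is a hypothetical name, and each run is joined into a comma-separated string because PHP arrays cannot be used as array keys.

```php
<?php
// Hypothetical helper: count every contiguous run of at least $minLen values
// and keep the runs that occur two or more times.
function countRuns(array $values, int $minLen = 2): array
{
    $counts = [];
    $n = count($values);
    for ($len = $minLen; $len <= $n; $len++) {
        for ($start = 0; $start + $len <= $n; $start++) {
            // Join the run into a string so it can serve as an array key.
            $key = implode(',', array_slice($values, $start, $len));
            $counts[$key] = ($counts[$key] ?? 0) + 1;
        }
    }
    // Drop the runs that were seen only once.
    return array_filter($counts, function ($c) { return $c >= 2; });
}

print_r(countRuns([3, 5, 1, 3, 5, 48, 4, 7, 13, 55, 65, 4, 7, 13, 32]));
```

Like the Python loop, this still reports (4, 7) and (7, 13) alongside (4, 7, 13); filtering those out is a separate step.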
Here is a version with a filter
from collections import Counter

def substrings(t, minlen=2):
    tlen = len(t)
    return (t[j:j+i] for i in range(minlen, tlen+1) for j in range(tlen+1-i))

def get_freq(*t):
    counter = Counter(substrings(t))
    for k in sorted(counter, key=len):
        v=counter[k]
        if v < 2:
            del counter[k]
            continue
        for t in substrings(k):
            if t in counter:
                if t==k:
                    continue
                counter[k]+=counter[t]-v
                del counter[t]
    return counter

print(get_freq(3, 5, 1, 3, 5, 48, 4, 7, 13, 55, 65, 4, 7, 13, 32, 4, 7))
print(get_freq(1,2,3,4,1,2,3,4,1,2,7,8,7,8,3,4,3,4,1,2))
the output is
Counter({(4, 7, 13): 3, (3, 5): 2})
Counter({(1, 2, 3, 4, 1, 2): 8, (7, 8): 2}) # Is this the right answer?
Which is why I asked how the filtering should work for the sequence I gave in the comments
Ok, just to start off the discussion.
Create another array/map, call this the weightage array.
Start iterating on the values array.
For each value in the values array, increment the corresponding position in the weightage array. E.g. for 3 do weightage[3]++, for 48 do weightage[48]++.
After the iteration, the weightage array contains the repetition counts.
