Improving algorithm using MySQL - php

The code below is written mainly in PHP, but I am hoping to speed up the process, and parsing strings in PHP is slow.
Assume the following, where I get a string from the database and convert it into an array.
$data['options_list'] = array(
"Colours" => array('red','blue','green','purple'),
"Length" => array('3','4','5','6'),
"Voltage" => array('6v','12v','15v'),
);
Each of these subarrays will be a dropdown select list, and the end user can select exactly 1 value from each of the select lists.
When the user hits submit, I will want to match the submitted values against a "price table" pre-defined by the admins. Potentially "red" and "6v" would cost $5, but "red" and "5"(length) and "6v" would cost $6.
The question is, how to do so?
Currently the approach I have taken is such:
Upon submission of the form (of the 3 select lists), I get the relevant price rules set by the admin from the database. I've made an example of results.
$data['price_table'] =
array(
'red;4'=>'2',
'red;5'=>'3',
'red;6'=>'4',
'blue;3'=>'5',
'blue;4'=>'6',
'blue;5'=>'7',
'blue;6'=>'8',
'green;3'=>'9',
'green;4'=>'10',
'green;5'=>'11',
'green;6'=>'12',
'purple;3'=>'13',
'purple;4'=>'14',
'purple;5'=>'15',
'purple;6'=>'16',
'red;3'=>'1',
'red;3;12v'=>'17',
'blue;6;15v'=>'18',
);
Note : The order of the above example can be of any order, and the algorithm should work.
I then explode each of the above keys into an array, and take the result with the best matching score.
$option_choices = $this->input->post('select');
$score = 0;
foreach($data['price_table'] as $key=>$value)
{
$temp = 0;
$keys = explode(';',$key);
foreach($keys as $k)
{
if(in_array($k, $option_choices))
{
$temp++;
}else{
$temp--;
}
}
if($temp > $score)
{
$score = $temp;
$result = $value;
}
}
echo "Result : ".$result;
Examples of expected results:
Selected options: "red","5"
Result: 3
Selected Options: "3", "red"
Result: 1
Selected Options: "red", "3", "12v"
Result: 17
The current method works as expected. However, handling this in PHP is slow. I've thought of using JSON, but that would mean giving the users my whole price table, which isn't really what I am looking for. I have also thought of using another language (e.g. Python), but that wouldn't be practical considering the costs. That leaves me with MySQL.
If someone can suggest a cheap and cost-efficient way to do this, please provide an example. Better still if you could provide a PHP solution to this that works fast.
Thank you!

It looks like you did work to make the results read faster but you're still parsing and testing every array part against the full list? This would probably run faster moving the search to MySQL and having extra columns there.
Since you can control the array (or test string) perhaps try fixed length strings:
$results = explode("\n", "
1 Red v1
22 Blue v2
333 Green v3");
$i = 0;
while($i < count($results)) {
$a = substr($results[$i], 0, 10); // first fixed-width field
$b = substr($results[$i], 10, 20); // second fixed-width field
$c = substr($results[$i], strrpos($results[$i], ' ') + 1); // everything after the last space
if(stripos($userInput, $a . $b . $c) !== false) {
// parse more...
}
$i++;
}
Supposedly JavaScript is good at this with memoization:
http://addyosmani.com/blog/faster-javascript-memoization/
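For the price-table question itself, one way to avoid exploding and scoring every rule in PHP is to treat the table as a hash and look up candidate keys directly. The sketch below is only an illustration under two assumptions that are not stated in the question: a rule applies only when all of its parts are selected, and every rule key lists its parts in the same order as the option groups. The function name findPrice is made up for this example.

```php
<?php
// Hypothetical helper: return the price of the most specific rule whose parts
// are all among the selected options, or null if no rule applies.
function findPrice(array $optionsList, array $priceTable, array $selected): ?string
{
    // Put the selected values into the group order (Colours, Length, Voltage).
    $ordered = [];
    foreach ($optionsList as $values) {
        foreach ($values as $v) {
            if (in_array($v, $selected, true)) {
                $ordered[] = $v;
                break; // exactly one choice per select list
            }
        }
    }

    $n = count($ordered);
    $best = null;
    $bestSize = 0;
    // Enumerate every non-empty subset of the selection as a bitmask
    // (at most 7 subsets for three select lists).
    for ($mask = 1; $mask < (1 << $n); $mask++) {
        $parts = [];
        for ($i = 0; $i < $n; $i++) {
            if ($mask & (1 << $i)) {
                $parts[] = $ordered[$i];
            }
        }
        $key = implode(';', $parts);
        // A hash lookup replaces the scan over the whole price table.
        if (isset($priceTable[$key]) && count($parts) > $bestSize) {
            $best = $priceTable[$key];
            $bestSize = count($parts);
        }
    }
    return $best;
}

$optionsList = [
    'Colours' => ['red', 'blue', 'green', 'purple'],
    'Length'  => ['3', '4', '5', '6'],
    'Voltage' => ['6v', '12v', '15v'],
];
// A reduced copy of the price table from the question.
$priceTable = [
    'red;3'       => '1',
    'red;5'       => '3',
    'red;3;12v'   => '17',
    'blue;6;15v'  => '18',
];

echo findPrice($optionsList, $priceTable, ['red', '3', '12v']); // 17
```

The number of lookups depends only on the number of select lists, not on the size of the price table, so this may help when the table is large. It is not a drop-in replacement for the scoring loop: partial matches that include a non-selected part are never considered here.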

Related

How do you set up an array to hold '0's for empty locations so graphs(Highcharts) can format them correctly?

I'm trying to get my Highcharts graph to work. The reason I'm having so much trouble with it this time is that I have to keep the program adaptable for future changes to my columns (named issues1 through 12).
The goal is pretty simple: I just need to grab the issues between hours 1-12 during a certain time period, then create a graph.
My idea is to create a view that organizes the desired information, because there is a lot more to that table that I left out, and then write an SQL query to organize the data from there. I realize this might be overkill, but I'm an intern and my supervisor probably did it to help make it simple for me.
There are 4 different places I need to use SQL to make the Table work.
X-Axis
Day shift numbers
Swing shift numbers
Night shift numbers
In my code, the X-Axis works fine, since it just outputs the names.
xAxis: {
categories: [
<?php
foreach ($xAxisresult as $Xrow) {
echo "'" . $Xrow['IssueName'] . "'" . ',';
}
?>
]
I believe the Day/Swing/Grave SQL statements should all be similar, so I'm just going to focus on one. But this is where the problem starts with how I have it set up. I tried to run an if statement where I compare the two arrays I have set up and try to match the IssueName columns.
name: 'Day',
data: [
<?php
foreach ($Dresult as $Drow) {
if ($Xrow['IssueName'] == $Drow['IssueName']){
echo $Drow['Issues'] . ',';
}
else{
echo $Drow['Issues'] . ',';
}
}
You guys can most likely see a lot of what's wrong here. What I need is a loop or array that detects an empty spot in the data and outputs a 0 there, so the data stays aligned with the categories.
Sorry for the wall of Text, I just wanted to give you guys as much information as possible.
To answer your question: here is how to create an array that holds zero values and merge it with the data array (I assume that is what you need).
You can use array_fill to create the array with zeros, and use array_replace to replace with the data array.
$arr = array_fill(0, 10, 0); //[0,0,0,0,0,0,0,0,0,0]
$data = [2 => 15, 5 =>10, 7 => 16]; // your data
$new = array_replace($arr, $data);
var_dump($new); // [0,0,15,0,0,10,0,16,0,0]
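Applied to the chart in the question, the same idea could be keyed by issue name instead of by numeric position. This is only a sketch: the row contents below are invented, and it assumes that $xAxisresult and $Dresult are arrays of rows with the IssueName and Issues columns shown in the question.

```php
<?php
// Hypothetical rows, shaped like $xAxisresult and $Dresult in the question.
$xAxisresult = [
    ['IssueName' => 'Jam'],
    ['IssueName' => 'Leak'],
    ['IssueName' => 'Power'],
];
$Dresult = [
    ['IssueName' => 'Jam',   'Issues' => 4],
    ['IssueName' => 'Power', 'Issues' => 2],
];

// Start every category at 0, keyed by issue name, in x-axis order.
$names = array_column($xAxisresult, 'IssueName');
$zeros = array_fill_keys($names, 0);

// Overlay the counts that actually exist for this shift.
$counts  = array_column($Dresult, 'Issues', 'IssueName');
$dayData = array_values(array_replace($zeros, $counts));

echo implode(',', $dayData); // 4,0,2
```

Because array_replace keeps the key order of its first argument, the values stay lined up with the x-axis categories as long as every name in the shift data also appears on the x-axis.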

Assign random but even amount from an array to a list array in PHP?

I have two arrays. One array, $People, currently holds 44 individuals. Let's just assume that currently it's
$People = array('1','2',...,'44');.
I have another array of 15 elements.
$Test = array('A','B',...'O');
Now I want to be able to assign the test randomly to each individual. I know how to do this using random function in php.
Where it gets tricky for me, and what I need help with, is how to even out the test array. What I mean is that, since there are currently 44 individuals (this array will grow in the future), I want 14 test versions to have 3 individuals and 1 version to have 2. So I want the test array to even out. I also want it to handle growth of the $People array.
Ex: Test Version D will have individual '4', '25'. Every other version has three random individuals.
A few ideas I came up with are things like running a random integer on the $Test array, so that whichever gets the highest/lowest gets 2 individuals and the rest get three. This would cause a problem when I increase the size of the $People array; to deal with that I thought about using modulus to figure out beforehand how the $Test array will even out.
I can do this and other ideas but am pretty sure there has to be a better way to do this.
Clarifying your situation:
You want to distribute the values inside $People randomly amongst your $Test array. The problem you stated you are having is that the amount of values in $People isn't always perfectly dividable by the amount of values in $Test, and you aren't sure how to go about implementing code to distribute the values evenly.
Proposed solution:
You could draw the values one by one from a shuffled copy of $People inside a foreach loop and put them in a new array called $Result. A conditional checks whether all the values have been taken from the shuffled $People array: if($count>=$arrayCount), where $arrayCount=count($shuffledPeople);. Once all the values have been assigned, you first set $bool to false (so that the while loop does not iterate again), and then you break; out of the foreach loop.
$Result =[];//the array containing the results
$shuffledPeople = $People;
shuffle($shuffledPeople);//mixing up the array
$arrayCount = count($shuffledPeople);//finding the total amount of people
$count = 0;
$bool = TRUE;
while ($bool)
{
foreach($Test as $value)
{
$Result[$value][] = $shuffledPeople[$count];
$count++;
if ($count>=$arrayCount)
{
$bool = FALSE;
break;
}
}
}
To view the results, all you would need to do is:
foreach ($Result as $key => $value)
{
echo "{$key}: <br>";
if (is_array($value))
{
foreach ($value as $something)
{
echo "-->{$something}<br>";
}
}
else
{
echo "-->{$value}<br>";
}
}
I believe that this is what you want to do...
Assume that you have $people and $test arrays. You want to know how many people per test...
$ppt = ceil(sizeof($people)/sizeof($test));
Now, $ppt is the people per test. The next step is to shuffle up the people so they are randomly assigned to the tests.
shuffle($people);
Now, we can chunk up the people into sub-arrays such that each sub-array is assigned to a test. Remember, the chunks are random now because we shuffled.
$chunks = array_chunk($people, $ppt);
Now, everyone in $chunks[0] will take $test[0]. Everyone in $chunks[1] will take $test[1]. Everyone in $chunks[2] will take $test[2].
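One caveat with chunking by ceil(): when the head count does not fit the chunk size well, the last tests can end up with nobody. With 46 people and 15 tests, for example, the chunk size becomes 4, which yields only 12 chunks. A round-robin assignment using the modulus avoids that. The following is a minimal sketch with made-up sizes, not a replacement for either answer above.

```php
<?php
$people = range(1, 46);     // hypothetical head count
$tests  = range('A', 'O');  // the 15 test versions
shuffle($people);           // random order, so the assignment is random

// Deal people out to the tests like cards: person i goes to test i mod 15.
$assignment = array_fill_keys($tests, []);
foreach ($people as $i => $person) {
    $assignment[$tests[$i % count($tests)]][] = $person;
}

// Group sizes never differ by more than one, whatever the head count.
$sizes = array_map('count', $assignment);
echo min($sizes) . ' to ' . max($sizes) . ' people per test';
```

Every test gets at least floor(people / tests) individuals, and the remainder is spread one each over the first few tests. Shuffle $tests as well if it should not always be the same versions that receive the extra person.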

Find index of value in associative array in php?

If you have an array $p that you populated in a loop like so:
$p[] = array( "id"=>$id, "Name"=>$name);
What's the fastest way to search for John in the Name key, and if found, return the $p index? Is there a way other than looping through $p?
I have up to 5000 names to find in $p, and $p can also potentially contain 5000 rows. Currently I loop through $p looking for each name, and if found, parse it (and add it to another array), splice the row out of $p, and break 1, ready to start searching for the next of the 5000 names.
I was wondering if there is a faster way to get the index than looping through $p, e.g. an isset-type way?
Thanks for taking a look guys.
Okay so as I see this problem, you have unique ids, but the names may not be unique.
You could initialize the array as:
array($id=>$name);
And your searches can be like:
array_search($name,$arr);
This should work well, as PHP's native needle-in-a-haystack search is likely to be better implemented than a hand-written loop.
e.g.
$id = 2;
$name= 'Sunny';
$arr = array($id=>$name);
echo array_search($name,$arr);
Echoes 2
The major advantage in this method would be code readability.
If you know that you are going to need to perform many of these types of search within the same request then you can create an index array from them. This will loop through the array once per index you need to create.
$piName = array();
foreach ($p as $k=>$v)
{
$piName[$v['Name']] = $k;
}
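Once such an index exists, each name can be checked with isset() instead of a loop over $p. The sketch below uses invented sample rows and a made-up $wanted list. Note that if a name occurs more than once, the index only remembers the last position.

```php
<?php
// Hypothetical sample data in the shape described in the question.
$p = [
    ['id' => 1, 'Name' => 'Anna'],
    ['id' => 2, 'Name' => 'John'],
    ['id' => 3, 'Name' => 'Sunny'],
];

// Build the name => position index once.
$piName = [];
foreach ($p as $k => $v) {
    $piName[$v['Name']] = $k;
}

// Look up many names without scanning $p again.
$wanted = ['John', 'Zed', 'Anna'];
$found  = [];
foreach ($wanted as $name) {
    if (isset($piName[$name])) {
        $found[$name] = $piName[$name]; // position of the row in $p
    }
}

print_r($found); // John => 1, Anna => 0
```

With 5000 names against 5000 rows, this replaces up to 25 million comparisons with one pass to build the index plus 5000 hash lookups.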
If you only need to perform one or two searches per page then consider moving the array into an external database, and creating the index there.
$index = 0;
$search_for = 'John';
$result = array_reduce($p, function($r, $v) use (&$index, $search_for) {
if($v['Name'] == $search_for) {
$r[] = $index;
}
++$index;
return $r;
}, array()); // start from an empty array so $result is never null
$result will contain all the indices of elements in $p where the element with key Name had the value John. (This of course only works for an array that is indexed numerically beginning with 0 and has no “holes” in the index.)
Edit: Possibly even easier to just use array_filter, but that will not return the indices only, but all array element where Name equals John – but indices will be preserved:
$result2 = array_filter($p, function($elem) {
return $elem["Name"] == "John" ? true : false;
});
var_dump($result2);
What suits your needs better, resp. which one is maybe faster, is for you to figure out.

Fastest way to find relevant results in array from an input array

As a mostly front-end developer, this is in the realm of computer science that I don't often delve into, but here's my scenario:
I've got an input string, split on spaces, say "pinto beans"
I've got an array of results to search, which contains results like:
["beans, mung","beans, pinto","beans, yellow","beans, fava"]
what might be the quickest way (preferably in javascript or php) to find the most "relevant" results, aka most matches, for instance, in the above case, I would like to sort the return array so that "beans, pinto" is put at the top, and the rest come below, and any other results would go below those.
My first attempt at this would be to do something like matching each result item against each input item, and incrementing matches on each one, then sorting by most matches to least.
This approach would require me to iterate through the entire result array a ton of times though, and I feel that my lack of CS knowledge is leaving me without the best solution here.
/* EDIT: Here's how I ended up dealing with the problem: */
Based on crazedfred's suggestion and the blog post he mentioned (which was VERY helpful), I wrote some php that basically uses a combination of the trie method and the boyer-moore method, except searching from the beginning of the string (as I don't want to match "bean" in "superbean").
I chose php for the ranking based on the fact that I'm using js libraries, and getting real benchmarks while using convenience functions and library overhead wouldn't produce the testable results I'm after, and I can't guarantee that it won't explode in one browser or another.
Here's the test data:
Search String: lima beans
Result array (from db): ["Beans, kidney","Beans, lima","Beans, navy","Beans, pinto","Beans, shellie","Beans, snap","Beans, mung","Beans, fava","Beans, adzuki","Beans, baked","Beans, black","Beans, black turtle soup","Beans, cranberry (roman)","Beans, french","Beans, great northern","Beans, pink","Beans, small white","Beans, yellow","Beans, white","Beans, chili","Beans, liquid from stewed kidney beans","Stew, pinto bean and hominy"]
First, I drop both the search string and the result array into php variables, after explode()ing the string on spaces.
then, I precompile my patterns to compare the results to:
$max = max(array_map('strlen',$input));
$reg = array();
for($m = 0; $m < $max; $m++) {
$reg[$m] = "";
for($ia = 0; $ia < count($input); $ia++) {
if(isset($input[$ia][$m])) { // shorter words have no character at this position
$reg[$m] .= $input[$ia][$m];
}
}
}
this gives me something like : ["lb","ie","ma","an","s"]
then, I basically take each result string (split on spaces), and match a case insensitive character class with the corresponding character number to it. If at any point during that comparison process I don't get any matches, I skip the word. This means if only 1 result starts with "b" or "l", I'll only run one comparison per WORD, which is really fast. Basically I'm taking the part of trie that compiles the searches together, and the constant speedup of the Boyer-Moore stuff.
Here's the php - I tried whiles, but got SIGNIFICANTLY better results with foreaches:
$sort = array();
foreach($results as $result) {
$matches = 0;
$resultStrs = explode(' ', $result);
foreach($resultStrs as $r) {
$strlen = strlen($r);
for($p = 0; $p < $strlen; $p++) {
if($reg[$p])
preg_match('/^['.$reg[$p].']/i',$r[$p],$match);
if($match==true) {
$matches++;
} else {
break 2;
}
}
}
$sort[$result] = $matches;
}
That outputs an array with the results on the keys, and how many character matches we got in total on the values.
The reason I put it that way is to avoid key collisions that would ruin my data, and more importantly, so I can do a quick asort and get my results in order.
That order is in reverse, and on the keys, so after the above code block, I run:
asort($sort);
$sort = array_reverse(array_keys($sort));
That gives me a properly indexed array of results, sorted most to least relevant. I can now just drop that in my autocomplete box.
Because speed is the whole point of this experiment, here's my results - obviously, they depend partially on my computer.
2 input words, 40 results: ~5ms
2 input words, (one single character, one whole) 126 results: ~9ms
Obviously there's too many variables at stake for these results to mean much to YOU, but as an example, I think it's pretty impressive.
If anyone sees something wrong with the above example, or can think of a better way than that, I'd love to hear about it. The only thing I can think of maybe causing problems right now, is if I were to search for the term lean bimas, I would get the same result score as lima beans, because the pattern isn't conditional based on the previous matches. Because the results I'm looking for and the input strings I'm expecting shouldn't make this happen very often, I've decided to leave it how it is for now, to avoid adding any overhead to this quick little script. However, if I end up feeling like my results are being skewed by it, I'll come back here and post about how I sorted that part.
I'll suggest a solution in JavaScript, but it can work in PHP too.
In this way you don't need to use nested loops, you can use just a sorting function and some regular expression.
Something like this:
var query = 'pinto beans';
var results = [ 'beans, mung','beans, pinto','beans, yellow','beans, fava' ];
// Build a regular expression for your
// query like /pinto|beans/g joining each
// query item with the alternation operator
var pattern = new RegExp( query.split( ' ' ).join( '|' ), 'g' );
// Define a function for custom sorting
function regsort( a, b ) {
var ra = a.match( pattern );
var rb = b.match( pattern );
// return the indexing based on
// any single query item matched
return ( rb ? rb.length : 0 ) - ( ra ? ra.length : 0 );
}
// Just launch the sorting
var sorted = results.sort( regsort );
// Should output the right sort:
// ["beans, pinto", "beans, mung", "beans, yellow", "beans, fava"]
console.log( sorted );
I'm not sure if this is the fastest way to handle your request, but it could be a nice way to avoid a nested loop plus string comparison.
Hope this helps!
Ciao
Since you specifically noted that it could be in several languages, I'll leave my answer in pseudocode so you can adapt to the language of your choice.
Since you are matching array-to-array, performance is going to vary a lot based on your implementation, so trying several ways and considering exactly when, how and how often this will be used is a good idea.
The simple way is to leave the data as-is and run an O(n^2) search:
for (every element in array X)
for (every element in array Y)
if (current X == current Y)
add current X to results
return results
If you sort the arrays first (a sorting algorithm such as quicksort is implemented for you in many languages, check your documentation!), then the actual matching is quicker. Use whatever string comparison your language has:
Sort array X
Sort array Y
Let A = first element of X
Let B = first element of Y
while (A and B are in array)
if (A > B)
Next B
else if (A < B)
Next A
else //match!
Add A to results
Next A
Next B
//handle case where one array is larger (trivial loop)
return results
Now the important question for the above solution is whether sorting the arrays saves time compared with the plain O(n^2) search. Usually, moving elements around in arrays is fast whereas string comparisons are not, so it may be worth it. Again, try both.
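Since PHP was one of the languages asked for, the sorted-merge pseudocode might translate roughly as follows. This is a sketch, and the function name sortedIntersect is made up. It assumes plain byte-wise string comparison is good enough, so that sort() with SORT_STRING and strcmp() agree on ordering.

```php
<?php
// Hypothetical helper: values present in both arrays, found by sorting
// both and walking them in step with two cursors.
function sortedIntersect(array $x, array $y): array
{
    sort($x, SORT_STRING);
    sort($y, SORT_STRING);

    $i = 0;
    $j = 0;
    $results = [];
    while ($i < count($x) && $j < count($y)) {
        $cmp = strcmp($x[$i], $y[$j]);
        if ($cmp > 0) {
            $j++;               // A > B: advance B
        } elseif ($cmp < 0) {
            $i++;               // A < B: advance A
        } else {
            $results[] = $x[$i]; // match
            $i++;
            $j++;
        }
    }
    // Leftover elements in the longer array cannot match anything.
    return $results;
}

print_r(sortedIntersect(['pinto', 'beans'], ['beans', 'mung', 'fava'])); // beans
```

After the two sorts, the walk itself touches each element at most once, which is where the saving over the nested loops comes from.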
Finally, there's this crazy algorithm the Mailinator guy dreamed up for doing huge string comparisons in constant time using some awesome data structures. Never tried it myself, but it must work since he runs his whole site on very low-end hardware. He blogged about it here if you're interested. (Note: the blog post is about filtering spam, so some words in the post may be slightly NSFW.)
VERY PRIMITIVE
Just wanted to get that tidbit out of the way up-front. But here is a simple implementation of sorting based on the terms used. I will be up-front and mention that word variations (e.g. searching for "seasoned" and the result comes back as "seasonal") will have no effect on sorting (in fact, these words will, for all intents and purposes, be thought of as being just as different as "apple" and "orange" due to their suffix).
With that out of the way, here is one method to do what you're looking for. I'm using this more as an introduction to how you can implement it; it's up to you how you want to implement the "scoreWord" function. I also use jQuery's .each() method just because I'm lazy, but this can be replaced very easily with a for statement.
Anyways, here's one version, I hope it helps.
var search = "seasoned pinto beans";
var results = ["beans, mung","beans, pinto","beans, yellow","beans, pinto, seasonal","beans, fava"];
// scoreWord
// returns a numeric value representing how much of a match this is to the search term.
// the higher the number, the more definite a match.
function scoreWord(result){
var terms = search.toLowerCase().split(/\W/),
words = result.toLowerCase().split(/\W/),
score = 0;
// go through each word found in the result
$.each(words,function(w,word){
// and then through each word found in the search term
$.each(terms,function(t,term){
// exact word matches score higher (a whole point)
if (term==word)
score++;
// a word found in a word should be considered a partial
// match and carry less weight (1/2 point in this case)
else if (term.indexOf(word)!=-1||word.indexOf(term)!=-1)
score+=0.5;
});
});
// return the score
return score;
}
// go through and sort the array.
results.sort(function(a,b){
// grab each result's "score", then compare them to see who rates higher in
// the array
var aScore = scoreWord(a), bScore = scoreWord(b);
return (aScore>bScore?-1:(aScore<bScore?1:0));
});
// they are officially "sorted" by relevance at this point.
You can precompute a map which maps a word to many results.
var results = ["beans, mung","beans, pinto","beans, yellow","beans, fava"];
var index = {};
for (var i = 0; i < results.length; i ++) {
results[i].replace(/\w+/g, function(a) {
if (!index[a]) {
index[a] = [i];
} else {
index[a].push (i);
}
});
}
When searching, you can split the query into words.
function doSearch(searchString) {
var words = [];
var searchResults = [];
var currentIndex;
searchString.replace(/\w+/g, function(a) {
words.push (a);
});
Build the search result as a copy of the results array, but I put it in object so that it can hold both the text and the score.
for (var i = 0; i < results.length; i ++) {
searchResults.push ({
text: results[i],
score: 0
});
}
Then, for each search word, increase the score in the search results.
for (var i = 0; i < words.length; i ++) {
if ((currentIndex = index[words[i]])) {
for (var j = 0; j < currentIndex.length; j ++) {
searchResults[currentIndex[j]].score ++;
}
}
}
Finally, sort it by score.
searchResults.sort (function(a, b) {
return b.score - a.score;
});
return searchResults;
}
Calling doSearch("pinto beans") returns an array of search results along with their scores.
[{text:"beans, pinto", score:2}, {text:"beans, mung", score:1}, {text:"beans, yellow", score:1}, {text:"beans, fava", score:1}]
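The question also allowed PHP, so here is how the same precomputed word index might look there. This is a sketch only: the function names buildIndex and search are invented, and words are lowercased so that "Beans" and "beans" count as the same word, which the JavaScript version above does not do.

```php
<?php
// Hypothetical helper: map each word to the positions of the results containing it.
function buildIndex(array $results): array
{
    $index = [];
    foreach ($results as $i => $text) {
        preg_match_all('/\w+/', strtolower($text), $m);
        foreach (array_unique($m[0]) as $word) {
            $index[$word][] = $i;
        }
    }
    return $index;
}

// Hypothetical helper: rank results by how many query words they contain.
// Assumes $results is a plain list indexed from 0.
function search(array $results, array $index, string $query): array
{
    $scores = array_fill(0, count($results), 0);
    preg_match_all('/\w+/', strtolower($query), $m);
    foreach ($m[0] as $word) {
        foreach ($index[$word] ?? [] as $i) {
            $scores[$i]++;
        }
    }
    arsort($scores); // highest score first, positions preserved as keys
    $ranked = [];
    foreach ($scores as $i => $score) {
        $ranked[] = $results[$i];
    }
    return $ranked;
}

$results = ['beans, mung', 'beans, pinto', 'beans, yellow', 'beans, fava'];
$index   = buildIndex($results);
$ranked  = search($results, $index, 'pinto beans');

echo $ranked[0]; // beans, pinto
```

The index only needs rebuilding when the result list changes, so it could be cached between requests if the list comes from a database that rarely changes.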

What's quicker, an array lookup (including array build) or an IF stack?

I was wondering which was better:
$lookup = array( "a" => 1, "b" => 2, "c" => 3 );
return $lookup[$key];
or
if ( $key == "a" ) return 1;
else if ( $key == "b" ) return 2;
else if ( $key == "c" ) return 3;
or maybe just a nice switch...
switch($key){
case "a": return 1;
case "b": return 2;
case "c": return 3;
}
I always prefer the first method as I can separate the data from the code. At this scale it looks quite silly, but on a larger scale, with thousands of lines of lookup entries, how much longer is PHP going to take building an array and then only checking maybe 1 or 2 entries per request?
I think it'd have to be tested and clocked, but I'd say the bigger and more complicated the array, the slower it's going to become.
PHP should be able to handle lookups faster than I can in PHP code, but building the array in the first place surely takes up a lot of time.
For anything with measurable performance (not only 3 entries) lookup is fastest way. That's what hash tables are for.
First, it's easy to test it yourself.
Second, and more importantly, which is most appropriate for the code you're using? The amount of time you'll save is negligible in any case.
There will be a tipping point; you will just have to test to find it. My guess is that with 3 items you are better off with if/then/else. This is a nice article on bit counting which compared computing the number of bits and using lookups. Spoiler: Lookups won!
Are you building the array every time, or can you build it once and cache it?
If you are building it every time, I cannot see how that could be faster. Building the array by itself should take longer than the chained if()s. (Adding one item to the array would be close in time to one if(), but you'd have to add every item, whereas you could exit from the if() chain early.)
If you can use a cached array, then I think that would be the clear winner.
So I did a bit of testing with this example and got the following results:
emptyfunction: 0.00000087601416110992430969503855231472755349386716
lookuparray: 0.00000136602194309234629100648257538086483009465155
makearrayonly: 0.00000156002373695373539708814922266633118397294311
makearray: 0.00000174602739810943597796187489595842734502184612
ifblock: 0.00000127001986503601083772739543942265072473674081
switchblock: 0.00000131001937389373773757957151314679222764425504
Each was inside a method, so I also included the time for an empty method. They were run 1,000,000 times each and then averaged out.
Just doing a lookup (without the building of the array) is actually slower than an if block (uses a global lookup the same as my code) and just by a fraction slower than a switch block.
I can't be bothered scaling this up to hundreds of if statements but it just shows that the if statement is faster even at this level against a single lookup.
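For anyone who wants to repeat this kind of measurement, a timing harness could look roughly like the sketch below. It is not the code that produced the numbers above. The function names are invented, and the lookup array is kept in a static variable on the assumption that you want to measure the lookup alone rather than the array build.

```php
<?php
// Lookup variant: the array is built once and reused across calls.
function viaLookup(string $key): ?int
{
    static $lookup = ['a' => 1, 'b' => 2, 'c' => 3];
    return $lookup[$key] ?? null;
}

// If-chain variant from the question.
function viaIf(string $key): ?int
{
    if ($key === 'a') return 1;
    if ($key === 'b') return 2;
    if ($key === 'c') return 3;
    return null;
}

// Average seconds per call over $iterations calls with the same key.
function timeIt(callable $fn, string $key, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn($key);
    }
    return (microtime(true) - $start) / $iterations;
}

$iterations = 100000;
printf("lookup:  %.10f s per call\n", timeIt('viaLookup', 'c', $iterations));
printf("ifblock: %.10f s per call\n", timeIt('viaIf', 'c', $iterations));
```

Results will vary with the PHP version, opcode caching and the key being looked up, so it is worth running it with the last key in the chain, as here, and with a key that is missing altogether.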
If you've got thousands of entries, an array lookup will win hands down. The associative array might be a bit slow, but finding an array key is much faster than doing thousands of if() blocks (not to mention the time it takes to type it all out!)
You could test how long it takes to see if the value is not there as well. My guess is the array key lookup will be faster. And if you have to query it twice or more, caching the array should make it faster.
But speed is not the most important thing here. The array key is better for the long term, where you want to add and remove data. You can also get the data from another source in the future.
If you just want to look up a value
you use an array.
If you want to take an action then
if and switch both have their uses.
This is a little test for array manipulations
{
$x = 0;
foreach ($test as $k => $v) {
$x = sprintf("%s=>%s\n",$k,$v);
}
}
{
$x = 0;
reset($test);
while (list($k, $v) = each($test)) {
$x = sprintf("%s=>%s\n",$k,$v);
}
}
{
$x = 0;
$k = array_keys($test);
$co = sizeof($k);
for ($it = 0; $it < $co; $it++) {
$x = sprintf("%s=>%s\n",$k[$it],$test[$k[$it]]);
}
}
{
$x = 0;
reset($test);
while ($k = key($test)) {
$x = sprintf("%s=>%s\n",$k,current($test)); next($test);
}
}
access time (ms)
8.1222
10.3221
9.7921
8.9711
