I'm writing a small algorithm in PHP that goes through n number of movies with ratings, and will store the top 5. I'm not reading from a datafile, but from a stream so I cannot simply order the movies by rating.
My question is: what is the most efficient way to keep track of the top 5 rated movies as I read the stream? Currently I do the following:
1. Read in 5 movies (into an array called movies[]), with two keys: movies[][name] and movies[][rating].
2. Order the array by movies[][rating] using array_multisort() (the highest rating now sits at movies[4]).
3. Read in the next movie.
4. If this new movie's rating > movies[0][rating], replace movies[0] with this new movie.
5. Re-order the list.
6. Repeat steps 3-5 until finished.
My method works, but requires a sort on the list after every read. I believe this to be an expensive method mostly due to the fact that every time I use array_multisort() I must do a for loop on 5 movies just to build the index to sort on. Can anyone suggest a better way to approach this?
Linked lists would work here.
Build a linked list that chains the first 5 movies in the correct order. For each new movie, just start at the end of the chain and walk it until your movie is between one with a higher rating and one with a lower rating. Then insert your link into the list there. If the movie was better than the worst (and thus your list is now 6 long), just remove the last link in the chain, and you are back to 5.
No sorting, no indexing.
Your algorithm looks fine. I am not sure how the arrays are implemented in PHP. From an algorithm point of view: use a heap instead of an array.
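A minimal sketch of the heap idea, assuming PHP 7+ with the SPL classes available. MovieMinHeap and top_movies() are hypothetical names, and each movie is assumed to be an array with 'name' and 'rating' keys as in the question. The heap keeps the lowest-rated of the retained movies on top, so each new movie needs only one comparison to decide whether it belongs in the top 5.

```php
<?php
// Min-heap ordered by rating: the lowest-rated retained movie sits on top.
// (Hypothetical class name; SplHeap puts the "greatest" element by compare() on top.)
class MovieMinHeap extends SplHeap
{
    protected function compare($a, $b): int
    {
        // Treat a lower rating as "greater" so that it rises to the top.
        return $b['rating'] <=> $a['rating'];
    }
}

// Hypothetical helper: returns the $limit best movies, highest rating first.
function top_movies(iterable $stream, int $limit = 5): array
{
    $heap = new MovieMinHeap();
    foreach ($stream as $movie) {
        if (count($heap) < $limit) {
            $heap->insert($movie);
        } elseif ($movie['rating'] > $heap->top()['rating']) {
            $heap->extract();        // drop the current worst
            $heap->insert($movie);
        }
    }
    $top = [];
    while (!$heap->isEmpty()) {
        $top[] = $heap->extract();   // comes out lowest first
    }
    return array_reverse($top);
}
```

For only 5 entries the gain over a small sorted array is probably negligible; the heap pays off when the number of retained items grows.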
No point in re-sorting after every read since you really only need to insert a new entry. Use the following algorithm, it's likely to get you the best speed. It's basically an unrolled loop, not the most beautiful code.
set movies[0..4].rating to -1.
while more movies in stream:
    read in next movie.
    if movie.rating < movies[0].rating:
        next while
    if movie.rating < movies[1].rating:
        movies[0] = movie
        next while
    if movie.rating < movies[2].rating:
        movies[0] = movies[1]
        movies[1] = movie
        next while
    if movie.rating < movies[3].rating:
        movies[0] = movies[1]
        movies[1] = movies[2]
        movies[2] = movie
        next while
    if movie.rating < movies[4].rating:
        movies[0] = movies[1]
        movies[1] = movies[2]
        movies[2] = movies[3]
        movies[3] = movie
        next while
    movies[0] = movies[1]
    movies[1] = movies[2]
    movies[2] = movies[3]
    movies[3] = movies[4]
    movies[4] = movie
At the end, you have your sorted list of movies. If there are fewer than 5, the remaining slots will have a rating of -1, so you'll know they're invalid. This assumes that the rating of a real movie is zero or greater, but you can adjust the sentinel value if it isn't.
If you need to adjust it for more than 5 movies, you can. The best bet would be to roll the checks back up into a loop. At some point, however, it's going to become more efficient to sort than to use this method, which is only really good for a small data set.
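A sketch of what the rolled-up version might look like in PHP, for any list size. insert_into_top() is a hypothetical name, movies are assumed to be arrays with a 'rating' key, and, unlike the pseudocode above, this version keeps the list sorted highest first and needs no -1 sentinels.

```php
<?php
// Hypothetical helper: inserts $movie into $top, which is kept sorted by
// rating (highest first) and never grows beyond $limit entries.
function insert_into_top(array &$top, array $movie, int $limit = 5): void
{
    // Walk up from the worst entry until we find one that is not worse.
    $pos = count($top);
    while ($pos > 0 && $top[$pos - 1]['rating'] < $movie['rating']) {
        $pos--;
    }
    if ($pos >= $limit) {
        return; // not good enough to make the list
    }
    // Insert at $pos; wrapping in an array keeps $movie a single element.
    array_splice($top, $pos, 0, [$movie]);
    if (count($top) > $limit) {
        array_pop($top); // drop the entry that fell off the end
    }
}
```

Called once per movie read from the stream, this does at most $limit comparisons and never sorts.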
"My method works, but requires a sort on the list after every read."
No it doesn't; it only requires a sort after you find a new movie whose rating is > movies[0][rating].
This method seems efficient to me. You only sort occasionally when there's a new entry for the top 5, which will happen less the more movies you process.
How big is the list? I'm guessing it's not an option to keep the entire list in memory, and sort it at the end?
There is no need for two keys in the array. An array with the name as key and the rating as value will do. Sort it with arsort().
The algorithm is not perfect; you can do it optimally with a linked list. That said, I think a linked list implemented in PHP will actually be slower than a function call to arsort() for 6 elements. For big-O estimation, you can assume that sorting 6 elements takes constant time.
You'll only sort when you encounter a movie rated higher than the current lowest of the top 5, so in the average case you'll do it less and less often as you progress. You'll sort on every movie only in the worst-case scenario, where the input arrives sorted from lowest rated to highest.
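As a rough illustration of the name => rating idea, here is a sketch assuming PHP 7+. keep_top() is a hypothetical helper, and the sample data is made up. Note that two movies with the same name would overwrite each other with this layout.

```php
<?php
// Hypothetical helper: adds one movie to a name => rating map and trims
// the map back to the $limit best entries.
function keep_top(array $top, string $name, $rating, int $limit = 5): array
{
    $top[$name] = $rating;
    if (count($top) > $limit) {
        arsort($top);                              // highest rating first, keys preserved
        $top = array_slice($top, 0, $limit, true); // drop the lowest entry
    }
    return $top;
}

// Made-up sample stream.
$stream = [
    ['name' => 'A', 'rating' => 3], ['name' => 'B', 'rating' => 9],
    ['name' => 'C', 'rating' => 1], ['name' => 'D', 'rating' => 7],
    ['name' => 'E', 'rating' => 5], ['name' => 'F', 'rating' => 8],
    ['name' => 'G', 'rating' => 2],
];

$top = [];
foreach ($stream as $movie) {
    $top = keep_top($top, $movie['name'], $movie['rating']);
}
arsort($top); // final ordering, in case fewer than 6 movies were read
```

To sort only when needed, you could first compare the new rating against min($top) once the map is full and skip the movie if it is not higher.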
Here’s what I would do:
// let's say get_next_movie() returns an array with 'rating' and 'name' keys
// (this assumes ratings are integers from 0 to 5)
while ($m = get_next_movie()) {
    $ratings[$m['rating']][] = $m['name'];
    $temp_ratings = $ratings;
    $top5 = array();
    $rating = 5;
    // stop as soon as five movies have been collected
    while (count($top5) < 5) {
        if (!empty($temp_ratings[$rating])) {
            $top5[] = array_shift($temp_ratings[$rating]);
        } elseif ($rating > 0) {
            --$rating;
        } else {
            break;
        }
    }
    // $top5 has the current top 5 :-)
}
$ratings array looks like this, each rating has array of movies inside:
Array
(
    [5] => Array
        (
            [0] => Five!
        )
    [3] => Array
        (
            [0] => Three
            [1] => Threeeeee
            [2] => Thr-eee-eee
        )
    [4] => Array
        (
            [0] => FOR
        )
)
Maybe this can be of help.
class TopList {
    private $items = array();
    private $indexes = array();
    private $count = 0;
    private $total = 5;
    private $lowest;
    private $sorted = false;

    public function __construct($total = null) {
        if (is_int($total))
            $this->total = $total;
        $this->lowest = -1 * (PHP_INT_MAX - 1);
    }

    public function addItem($index, $item) {
        if ($index <= $this->lowest)
            return;
        $setLowest = $this->count === $this->total;
        if ($setLowest) {
            // Remove the last added entry that has the lowest index.
            // (To remove the first added one instead, use
            // array_search($this->lowest, $this->indexes).)
            $lowestKeys = array_keys($this->indexes, $this->lowest);
            // end() takes its argument by reference, so it needs a variable
            $lowestIndex = end($lowestKeys);
            unset($this->indexes[$lowestIndex], $this->items[$lowestIndex]);
        } else {
            ++$this->count;
            $setLowest = $this->count === $this->total;
        }
        $this->indexes[] = $index;
        $this->items[] = $item;
        $this->sorted = false;
        if ($setLowest)
            $this->lowest = min($this->indexes);
    }

    public function getItems() {
        // Sort lazily, only when the items are actually requested.
        if (!$this->sorted) {
            array_multisort($this->indexes, SORT_DESC, $this->items);
            $this->sorted = true;
        }
        return $this->items;
    }
}

$top5 = new TopList(5);
foreach ($movies as $movie) {
    $top5->addItem($movie['rating'], $movie);
}
var_dump($top5->getItems());
I have a controller function in CodeIgniter that looks like this:
$perm = $this->job_m->getIdByGroup();
foreach($perm as $pe=>$p)
{
$pId = $p['id'];
$result = $this->job_m->getDatapermission($pId);
}
$data['permission'] = $result;
What I need to do is list the data in the result in the view, but I get only the last value while using this method. How can I pass all the results to the view?
Store it in an array. Like this:
foreach($perm as $pe=>$p){
$result[] = $this->job_m->getDatapermission($p['id']);
}
Because $result is not an array...
try this:
$result=array();
foreach($perm as $pe=>$p)
{
$pId = $p['id'];
$result[] = $this->job_m->getDatapermission($pId);
}
$data['permission'] = $result;
Note:
My answer uses a counter so that a single group's result can be displayed when needed.
Since you need to loop over $result to display it, it is probably an array or object returned by $query->result(), so things could get a bit complex.
Example: if $perm is an array of 5 items (or groups), the counter assigns keys 1-5 instead of the 0-4 that [] would assign (which could be misleading). Using the first view example, you could display a single group's values by passing its key via a URL segment, which makes the code more flexible and reusable. E.g. if you want to show just the results for group 2, $result[2] does exactly that; otherwise the next code runs. See my comments in the code.
$perm = $this->job_m->getIdByGroup();
$counter = 1;
foreach($perm as $pe=>$p)
{
$pId = $p['id'];
$result[$counter] = $this->job_m->getDatapermission($pId);
$counter++;
}
$data['permission'] = $result;
As mentioned in the note above:
I added a counter (key) so you can target a specific group. If the groups are:
Men, Women, Boys, Girls, Children; you'd know Women is group two (2). If you want to display values for just that group, you don't need to rewrite the code below; just pass the group key, which is as easy as knowing its position in the sequence. To display everything without restrictions, use the second view example. To use both, wrap them in an if statement.
// To access it, you could target a specific group like this:
if (isset($permission[2])) {
    foreach ($permission[2] as $key => $value) {
        echo $value->columnname;
    }
}
// To get all results:
if (isset($permission)) {
    foreach ($permission as $array) {
        foreach ($array as $key => $value) {
            echo $value->columnname;
        }
    }
}
I'm new to web developing.
This is part of a phone service, and I'm trying to filter through 3 different arrays that are filled with strings from three database searches: $sfaa, $sfipc, and $sfuaa. I have to filter the three database arrays to locate available customer service agents. The output would be an array filled with the IVR_Number to dial.
Here's an example of the string: "'Id', 'IVR_Number', 'Market_Id'"
I have to explode the string in order to get my data from each value in the arrays. Then, based on a one-to-many id in each string, I have to check whether the id from $sfaa is in $sfipc or $sfuaa. If not, I have to build an array with the filtered records, and from there I have to locate a value from the exploded string in $sfaa that belongs to that id. I wrote the following code, but there's got to be an easier way, I hope. The client has to wait for these results before moving forward. There are usually only 10 or 15 records.
This code works; I'm just wondering if there is an easier way to do this.
Any tips?
// formula needed to filter above results and fill $aadl array
// explode each active agent array
$activeagentsfec=0;
$aaivra= array();
$aaida= array();
foreach ($sfaa as $aavalue)
{
${'aadetails'.$activeagentsfec} = explode("'",$aavalue);
${'aaivr'.$activeagentsfec} = ${'aadetails'.$activeagentsfec}[5];
${'aaid'.$activeagentsfec} = ${'aadetails'.$activeagentsfec}[1];
array_push($aaivra, ${'aaivr'.$activeagentsfec});
array_push($aaida,${'aaid'.$activeagentsfec});
$activeagentsfec++;
}
// explode each inprogress call array
$activecallsfec=0;
$actida= array();
$acfida= array();
foreach ($sfipc as $acvalue)
{
${'acdetails'.$activecallsfec} = explode("'",$acvalue);
${'actid'.$activecallsfec} = ${'acdetails'.$activecallsfec}[5];
${'acfid'.$activecallsfec} = ${'acdetails'.$activecallsfec}[7];
array_push($actida, ${'actid'.$activecallsfec});
array_push($acfida, ${'acfid'.$activecallsfec});
$activecallsfec++;
}
// explode each unavailable agent
$unavailableagentsfec=0;
$uaaida= array();
foreach ($sfuaa as $uavalue)
{
${'uadetails'.$unavailableagentsfec} = explode("'",$uavalue);
${'uaaid'.$unavailableagentsfec} = ${'uadetails'.$unavailableagentsfec}[3];
array_push($uaaida, ${'uaaid'.$unavailableagentsfec});
$unavailableagentsfec++;
}
// create available agent array by id
$aaafec=0;
$aada= array();
foreach ($aaida as $aaidavalue)
{
if (in_array($aaidavalue,$actida,true))
$aaafec++;
elseif(in_array($aaidavalue,$acfida,true))
$aaafec++;
elseif(in_array($aaidavalue,$uaaida,true))
$aaafec++;
else
array_push($aada, $aaidavalue);
}
// available agent array by ivr
$aadl= array();
foreach ($aada as $aadavalue)
{
$aaaivrsv= array_search($aadavalue,$aaida,true);
array_push($aadl,$aaivra[$aaaivrsv]);
}
Given what you were saying in the comments, I'll try to give you some useful thoughts...
You carry out much the same process to parse $sfaa, $sfipc, and $sfuaa - explode, get certain columns. If you had some way to abstract that process, with a generic function for the parsing, that returns the data in a better format, called three times on each array, you'd see better through your code.
In the same way, your process is tightly coupled to the current state of the data - e.g. ${'acdetails'.$activecallsfec}[5] is the value you want today, but will it always be at that position? Something generic, where you look up a column by name, might save you a lot of trouble...
Finally, when merging data, if the data is sorted beforehand the merge can be a lot quicker: seeking N items in an unsorted list of M takes O(n*m) operations, but if both lists are sorted, a single merge pass takes O(m+n).
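To make the first two suggestions concrete, here is a rough sketch assuming PHP 7+. parse_record() and available_numbers() are hypothetical names, the column list and the sample rows are invented for illustration (the real layout of $sfaa, $sfipc and $sfuaa is not shown in the question), and the parser assumes that values never contain commas or quotes.

```php
<?php
// Hypothetical parser: turns "'a', 'b', 'c'" into a column-name => value map.
// Assumes the values themselves contain no commas or single quotes.
function parse_record(string $line, array $columns): array
{
    $fields = array_map(function ($field) {
        return trim($field, " '"); // strip surrounding spaces and quotes
    }, explode(',', $line));
    // Fails (false in PHP 7, ValueError in PHP 8) if the counts differ.
    return array_combine($columns, $fields);
}

// Hypothetical filter: IVR numbers of agents whose id is not in $busyIds.
function available_numbers(array $agents, array $busyIds): array
{
    $busy = array_flip($busyIds); // ids become keys, so lookups use isset()
    $numbers = [];
    foreach ($agents as $agent) {
        if (!isset($busy[$agent['Id']])) {
            $numbers[] = $agent['IVR_Number'];
        }
    }
    return $numbers;
}

// Invented sample data in the format quoted in the question.
$columns = ['Id', 'IVR_Number', 'Market_Id'];
$rows = ["'1', '5550001', '10'", "'2', '5550002', '10'", "'3', '5550003', '20'"];
$agents = array_map(function ($line) use ($columns) {
    return parse_record($line, $columns);
}, $rows);
$numbers = available_numbers($agents, ['2']);
```

The busy ids would come from parsing the in-progress and unavailable lists the same way and merging their id columns into one array.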
I've taken the time to go through your code... Unless you're using some of its variables elsewhere, here is a shorter equivalent:
// formula needed to filter above results and fill $aadl array
// explode each active agent array
$aaivra= array();
$aaida= array();
foreach ($sfaa as $aavalue)
{
$a = explode("'",$aavalue);
array_push($aaivra, $a[5]);
array_push($aaida,$a[1]);
}
// explode each inprogress call array
$actida= array();
$acfida= array();
foreach ($sfipc as $acvalue)
{
$a = explode("'",$acvalue);
array_push($actida, $a[5]);
array_push($acfida, $a[7]);
}
// explode each unavailable agent
$uaaida= array();
foreach ($sfuaa as $uavalue)
{
$a= explode("'",$uavalue);
array_push($uaaida, $a[3]);
}
// create available agent array by id
$aada= array();
foreach ($aaida as $aaidavalue)
{
if (!in_array($aaidavalue,$actida,true) &&
!in_array($aaidavalue,$acfida,true) &&
!in_array($aaidavalue,$uaaida,true))
array_push($aada, $aaidavalue);
}
// available agent array by ivr
$aadl= array();
foreach ($aada as $aadavalue)
{
$aaaivrsv= array_search($aadavalue,$aaida,true);
array_push($aadl,$aaivra[$aaaivrsv]);
}
I have the following array:
$masterlist=[$companies][$fieldsofcompany][0][$number]
The third dimension only exists if the field selected from $fieldsofcompany = position 2 which contains the numbers array. Other positions contain regular variables. The 3rd dimension is always 0 (the numbers array) or Null. Position 4 contains numbers.
I want to cycle through all companies and remove from the $masterlist all companies which contain duplicate numbers.
My current implementation is this code:
for($i=0;$i<count($masterlist);$i++)
{
if($masterlist[$i][2][0][0] != null)
$id = $masterlist[$i][0];
for($j=0;$j<count($masterlist[$i][2][0]);$j++)
{
$number = $masterlist[$i][2][0][$j];
$query = "INSERT INTO numbers VALUES('$id','$number')";
mysql_query($query);
}
}
Which inserts numbers and associated IDs into a table. I then select unique numbers like so:
SELECT ID,number
FROM numbers
GROUP BY number
HAVING (COUNT(number)=1)
This strikes me as incredibly brain-dead. My question is what is the best way to do this? I'm not looking for code per se, but approaches to the problem. For those of you who have read this far, thank you.
For starters, you should prune the data before sticking it into the database.
Keep a lookup table that keeps track of each 'number'.
If the number is not in the lookup table, use it and mark it; otherwise, if it's already in the lookup table, you can ignore it.
Using an array for the lookup table, with the keys being the 'number', you can use isset() to test whether the number has appeared before.
Example pseudocode:
if(!isset($lookupTable[$number])){
$lookupTable[$number]=1;
//...Insert into database...
}
Now that I think I understand what you really want, you might want to stick with your two-pass approach but skip the MySQL detour.
In the first pass, gather numbers and duplicate companies:
$duplicate_companies = array();
$number_map = array();
foreach ($masterlist as $index => $company)
{
    // Skip companies without a numbers array (isset() is false for null too).
    if (!isset($company[2][0][0]))
        continue;
    foreach ($company[2][0] as $number)
    {
        if (!isset($number_map[$number]))
        {
            // We have not seen this number before, associate it
            // with the first company index.
            $number_map[$number] = $index;
        }
        else
        {
            // Both the current company and the one with the index stored
            // in $number_map[$number] are duplicates.
            $duplicate_companies[] = $index;
            $duplicate_companies[] = $number_map[$number];
        }
    }
}
In the second pass, remove the duplicates we have found from the master list:
foreach (array_unique($duplicate_companies) as $index)
{
unset($masterlist[$index]);
}
I'm coding a project that generates two arrays containing data. One array contains data for a specific country and the other contains data for all countries.
For example, if a user from the US makes a request, we will generate two arrays with data. One with data only for the US and the other with data for worldwide, including the US. I want to give the US array a 60% chance of being selected if the visitor is from the US. That means the other array will have a 40% chance of being selected.
How does one code this??
if(rand(1, 100) <= $probability_for_first_array)
{
use_the($first_array);
}
else
{
use_the($second_array);
}
I find this a straightforward, easy to read solution
<?php
$us_data = "us";
$worldwide_data = "worldwide";
$probabilities = array($us_data => 0.60, $worldwide_data => 0.40);
/* Code courtesy of Jesse Farmer
 * For more details see http://goo.gl/fzq5
 */
function get_data($prob)
{
    // Draw from 1..1000 (not 0..1000) so each outcome is worth exactly 0.1%.
    $random = mt_rand(1, 1000);
    $offset = 0;
    foreach ($prob as $key => $probability)
    {
        $offset += $probability * 1000;
        if ($random <= $offset)
        {
            return $key;
        }
    }
    // Fall back to the last key if rounding leaves a small gap.
    return $key;
}
?>
Gabi's example is fine for two sets, but if you have more data sets to pick from, the if-else structure is not appropriate.
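For more than two data sets, one possible sketch (assuming PHP 7+) uses whole-number weights, which sidesteps floating-point rounding. weighted_pick() is a hypothetical name and the weights shown are only examples.

```php
<?php
// Hypothetical helper: picks one key at random, with probability
// proportional to its non-negative integer weight.
function weighted_pick(array $weights)
{
    $total = array_sum($weights);
    if ($total < 1) {
        return null; // nothing to pick from
    }
    $roll = mt_rand(1, $total);
    foreach ($weights as $key => $weight) {
        $roll -= $weight;
        if ($roll <= 0) {
            return $key;
        }
    }
    return null; // not reached when all weights are non-negative integers
}

// Example: 60% US data, 40% worldwide data.
$choice = weighted_pick(['us' => 60, 'worldwide' => 40]);
```

Adding a third data set is then just another entry in the array, e.g. ['us' => 50, 'europe' => 20, 'worldwide' => 30].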
I have a PHP script which reads a large CSV and performs certain actions, but only if the "username" field is unique. The CSV is used in more than one script, so changing the input from the CSV to only contain unique usernames is not an option.
The very basic program flow (which I'm wondering about) goes like this:
$allUsernames = array();
while($row = fgetcsv($fp)) {
$username = $row[0];
if (in_array($username, $allUsernames)) continue;
$allUsernames[] = $username;
// process this row
}
Since this CSV could actually be quite large, it's that in_array bit which has got me thinking. The most ideal situation when searching through an array for a member is if it is already sorted, so how would you build up an array from scratch, keeping it in order? Once it is in order, would there be a more efficient way to search it than using in_array(), considering that it probably doesn't know the array is sorted?
Not keeping the array in order, but how about this kind of optimization? I'm guessing isset() for an array key should be faster than in_array() search.
$allUsernames = array();
while($row = fgetcsv($fp)) {
$username = $row[0];
if (isset($allUsernames[$username])) {
continue;
} else {
$allUsernames[$username] = true;
// do stuff
}
}
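The way to build up an array from scratch in sorted order is an insertion sort. In PHP, using the binary_search() function shown further down:
$list = array();
foreach ($elems_to_insert as $element) {
    // find the position that keeps $list sorted, then insert there
    $index = binary_search($list, $element);
    array_splice($list, $index, 0, array($element));
}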
Although, it might actually turn out to be faster to just create the array in unsorted order and then sort it once at the end (PHP's built-in sort functions use quicksort).
And to find an element in a sorted list:
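function binary_search($list, $element) {
    // Returns the index of the first item that is >= $element, i.e. the
    // position where $element should be inserted (count($list) if none).
    $start = 0;
    $end = count($list);
    while ($start < $end) {
        // integer midpoint; plain "/" would yield a float index
        $mid = intdiv($start + $end, 2);
        if ($list[$mid] < $element) {
            $start = $mid + 1;
        }
        else {
            $end = $mid;
        }
    }
    return $end;
}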
With this implementation you'd have to check that the returned index is within the array and then test $list[$end] to see if it is the element you want, since if the element isn't in the array, this returns the point where it should be inserted (which can be count($list), one past the last item). I did it that way so it'd be consistent with the previous code sample. If you want, you could check $list[$end] === $element in the function itself.
The array type in php is an ordered map (php array type). If you pass in either ints or strings as keys, you will have an ordered map...
Please review item #6 in the above link.
in_array() does not benefit from having a sorted array. PHP just walks along the whole array as if it were a linked list.