I need a way to get the highest-scoring players within a salary cap, e.g. 50,000.
There is a similar question here: Algorithm to select Player with max points but with a given cost.
Basically, I have to select the optimal 9-player lineup.
I googled a lot and found that this can be achieved using linear programming, but I don't know how to use LP in PHP.
Any idea how I can achieve this, or is there another way to do it?
If you store the information in arrays, I believe you could achieve the result using array_multisort, which gives a result similar to SQL's ORDER BY, for example ORDER BY points DESC, salary ASC. This returns the array with the highest-scoring players first; if any of them have the same number of points, the one with the lowest salary comes first.
The answer to this question shows how to use array_multisort.
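As a minimal sketch of the array_multisort idea above (the player names, points and salaries are made-up sample data, and the array layout is an assumption):

```php
<?php
// Hypothetical sample data: names, points and salaries are made up.
$players = [
    ['name' => 'A', 'points' => 30, 'salary' => 9000],
    ['name' => 'B', 'points' => 42, 'salary' => 8000],
    ['name' => 'C', 'points' => 42, 'salary' => 7000],
    ['name' => 'D', 'points' => 18, 'salary' => 4000],
];

// array_multisort() works on columns, so pull out the two sort keys.
$points = array_column($players, 'points');
$salary = array_column($players, 'salary');

// Sort by points DESC, then salary ASC; $players is reordered to match.
array_multisort($points, SORT_DESC, $salary, SORT_ASC, $players);

// Names in sorted order: the two 42-point players come first, cheaper one leading.
$order = array_column($players, 'name');
```

Keep in mind that this only ranks the players; on its own it does not guarantee the best nine that fit under the salary cap.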
Related
I have a slight problem. I have a dataset containing values measured by a weather station, which I want to analyze further using a MySQL database and PHP.
Basically, the first column of the table contains the date, and the other columns contain temperature, humidity, pressure etc.
Now, calculating the mean, standard deviation, max, min etc. is quite simple. However, there are no built-in functions for other parameters I need, such as kurtosis.
What I need is, for example, to calculate the skewness, mean, standard deviation etc. for the individual months, then days etc.
For the built-in functions it is easy; for example, finding some of the parameters for the individual months would be:
SELECT AVG(Temp), STD(Temp), MAX(Temp)
FROM database
GROUP BY YEAR(Date), MONTH(Date)
Obviously, I cannot use this for the more advanced parameters. I thought about ways of achieving this and could only think of one solution: I wrote a function by hand that processes the values and calculates things such as kurtosis using the relevant formulae. But that means I would need to create arrays of data for each month, day, etc., depending on what I am currently calculating. So, for example, I would first need to take the data and split it into arrays, let's say Jan11, Feb11, Mar11, ..., where each array contains the data for that month. Then I would apply the function to those arrays and create new variables with the results (let's say kurtosis_jan11, kurtosis_feb11 etc.).
Now to my question. I need help with splitting the data. The problem is that I don't know in advance in which month the data starts and in which it ends, so I cannot set fixed variables for this. The program first has to check the first month and then create a new array for each month, day etc. until it reaches the last record.
That would be one possible solution, but if anyone has other ideas about how to get around this problem, I would very much appreciate your help.
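One possible way to do the splitting without knowing the first and last month in advance is to use the "YYYY-MM" prefix of each date as an array key, so the groups are created as they are encountered. The sketch below assumes each row has a 'Date' string in "YYYY-MM-DD" form and a numeric 'Temp'; the function names groupByMonth and shapeStats and the sample readings are hypothetical, and the statistics are the population (not sample) versions.

```php
<?php
// Split rows into one array per month, creating keys as they are encountered.
// Assumes 'Date' starts with "YYYY-MM" and 'Temp' is numeric.
function groupByMonth(array $rows): array
{
    $groups = [];
    foreach ($rows as $row) {
        $key = substr($row['Date'], 0, 7);   // e.g. "2011-01"
        $groups[$key][] = (float) $row['Temp'];
    }
    return $groups;
}

// Population skewness and excess kurtosis from central moments.
// Returns null when they are undefined (no data or zero variance).
function shapeStats(array $values): ?array
{
    $n = count($values);
    if ($n === 0) {
        return null;
    }
    $mean = array_sum($values) / $n;
    $m2 = $m3 = $m4 = 0.0;
    foreach ($values as $v) {
        $d = $v - $mean;
        $m2 += $d ** 2;
        $m3 += $d ** 3;
        $m4 += $d ** 4;
    }
    $m2 /= $n;
    $m3 /= $n;
    $m4 /= $n;
    if ($m2 == 0.0) {
        return null;
    }
    return [
        'skewness' => $m3 / ($m2 ** 1.5),
        'kurtosis' => $m4 / ($m2 ** 2) - 3.0,
    ];
}

// Hypothetical readings standing in for the query result.
$rows = [
    ['Date' => '2011-01-03', 'Temp' => 1],
    ['Date' => '2011-01-10', 'Temp' => 2],
    ['Date' => '2011-01-15', 'Temp' => 3],
    ['Date' => '2011-01-21', 'Temp' => 4],
    ['Date' => '2011-01-30', 'Temp' => 5],
    ['Date' => '2011-02-01', 'Temp' => 2],
    ['Date' => '2011-02-02', 'Temp' => 2],
];

// One result per month, keyed by "YYYY-MM".
$stats = array_map('shapeStats', groupByMonth($rows));
```

Grouping by day works the same way with a 10-character prefix instead of 7.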
You can write more complex queries to achieve this. There are some examples at http://users.drew.edu/skass/sql/ , including skewness.
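SELECT AVG(Temp), STD(Temp), MAX(Temp)
FROM database
WHERE Date BETWEEN date_from AND date_to
GROUP BY YEAR(Date), MONTH(Date)
I think you want to group the data within a date range. Note that the date filter belongs in WHERE, before GROUP BY; HAVING is applied after grouping and cannot filter on the ungrouped Date column.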
I'm trying to paginate user-submitted information into a catalog. At first I had something like this:
/?page=3&count=20&sort=date
$floor = ($page-1)*$count;
$ceiling = $count;
SELECT * FROM catalog ORDER BY date ASC LIMIT $floor, $ceiling
This, as I have read, is bad since it will count all the results, not stopping at the limit (floor+ceiling).
Now I'm trying to make it faster by paging relative to the last item on the page:
/?last_date=2012&count=20&sort=date
$ceiling = $count;
SELECT * FROM catalog WHERE date>$last_date ORDER BY date ASC LIMIT $ceiling
However, this won't work, right? Some dates will be the same. For the sake of argument, let's assume that I cannot use a more precise timestamp. For instance, sorting by price would only go to 2 decimal places, and there would definitely be overlap.
Is there anything I can do to make this improvement work, or should I revert to my previous query?
In order to use this, you'd have to constrain by date in both directions. What you could do is multipage each date range if it is too long, otherwise, select the whole set of dates. The disadvantage is that page sizes will be somewhat arbitrary and it might take longer to page through results in some cases.
This would mean you'd also have to first do a count() on that date range so you'd know if the next page would be in this or the next range of dates.
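A different technique from the date-range approach above, and a common one for duplicate sort values, is to break ties with a second, unique column and page on the pair. The sketch below assumes the catalog table has a unique id column, which the question does not show; the function name pageAfter and the sample rows are hypothetical. It applies the rule to an in-memory array so the behaviour is easy to check, with the roughly equivalent SQL in the leading comment.

```php
<?php
// Roughly equivalent SQL, assuming a unique `id` column (not shown in the question):
//   SELECT * FROM catalog
//   WHERE date > :last_date OR (date = :last_date AND id > :last_id)
//   ORDER BY date ASC, id ASC
//   LIMIT :count

// In-memory illustration of the same rule: return the next $count rows after $cursor.
function pageAfter(array $rows, ?array $cursor, int $count): array
{
    if ($count < 1) {
        return [];
    }
    // Order by (date, id) so rows with equal dates have a stable position.
    usort($rows, fn($a, $b) => [$a['date'], $a['id']] <=> [$b['date'], $b['id']]);

    $page = [];
    foreach ($rows as $row) {
        $afterCursor = $cursor === null
            || ([$row['date'], $row['id']] <=> [$cursor['date'], $cursor['id']]) > 0;
        if ($afterCursor) {
            $page[] = $row;
            if (count($page) === $count) {
                break;
            }
        }
    }
    return $page;
}

// Hypothetical rows: three share one date, two share another.
$rows = [
    ['id' => 3, 'date' => '2012-01-01'],
    ['id' => 1, 'date' => '2012-01-01'],
    ['id' => 5, 'date' => '2012-01-02'],
    ['id' => 2, 'date' => '2012-01-01'],
    ['id' => 4, 'date' => '2012-01-02'],
];

// Walk through all pages of size 2, remembering the last row as the cursor.
$seen = [];
$cursor = null;
while ($page = pageAfter($rows, $cursor, 2)) {
    foreach ($page as $row) {
        $seen[] = $row['id'];
    }
    $cursor = end($page);
}
```

Each row is visited exactly once even though a page boundary falls in the middle of a run of identical dates.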
I ended up using my first pagination example.
If I understand correctly, I think the LIKE operator in the WHERE clause of your SQL statement should help. But I could be wrong; can you specify more precisely what exactly is going wrong?
I have a PHP rating system (1-5) in which some judges rate some products. I want the results for these products to be fair. What normally happens is that some judges are very strict and rate products only in the range 1-2, while others rate products only in the range 4-5. Some judge properly across the full 1-5 range.
Can someone give an idea or help in creating an algorithm that accounts for harsh judges by scaling the judges' ratings before computing the product score?
I thought of taking the mean of each judge's scores across all products, but is that the way forward, or does someone have a better alternative for getting fair results?
Edit
The rating system is not for an e-commerce application. Here there are only a few judges, say 10, who rate all the products. The product may be a song in a contest, for example. Some of the judges may be very strict and some very liberal. There may be several contests, so I have to record the ratings of these very strict and very liberal judges for other contests too, and set a rule for them.
Simply put, you assign a weight to a judge based on the range of their typical votes (note, they must not be aware of this weight, or they will throw the system off.) Judges who always vote a single score get the lowest weight. Judges that give things a wide range of scores are considered more accurate.
This also assumes that these judges judge products with a fair range of quality; so if you give them a bunch of good or bad products and expect a range of vote levels, it might be unrealistic.
What you're looking for is the judge with the highest standard deviation (highest variation) in votes having the highest weight, whereas the judge with the lowest would have the least.
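A minimal sketch of this weighting, assuming ratings are stored as $ratings[judge][product] and that each judge's weight is simply their population standard deviation (the exact weight formula, the function names and the sample scores are assumptions, not something stated above):

```php
<?php
// Population standard deviation of a list of numbers.
function stdDev(array $values): float
{
    $n = count($values);
    if ($n === 0) {
        return 0.0;
    }
    $mean = array_sum($values) / $n;
    $sumSq = 0.0;
    foreach ($values as $v) {
        $sumSq += ($v - $mean) ** 2;
    }
    return sqrt($sumSq / $n);
}

// $ratings[judge][product] = score (1-5).
// Each judge's weight is the spread of their own votes.
function weightedScores(array $ratings): array
{
    $totals = [];
    $weightSums = [];
    foreach ($ratings as $scores) {
        $weight = stdDev(array_values($scores));
        foreach ($scores as $product => $score) {
            $totals[$product] = ($totals[$product] ?? 0.0) + $weight * $score;
            $weightSums[$product] = ($weightSums[$product] ?? 0.0) + $weight;
        }
    }
    $result = [];
    foreach ($totals as $product => $total) {
        // null when every judge of this product has zero spread.
        $result[$product] = $weightSums[$product] > 0
            ? $total / $weightSums[$product]
            : null;
    }
    return $result;
}

// Hypothetical votes: a strict judge, a wide-ranging judge, and one who always gives 5.
$ratings = [
    'strict' => ['song1' => 1, 'song2' => 2],
    'wide'   => ['song1' => 1, 'song2' => 5],
    'flat'   => ['song1' => 5, 'song2' => 5],
];
$scores = weightedScores($ratings);
```

With these numbers the judge who always votes 5 gets weight 0 and the wide-ranging judge dominates, which is the behaviour described above.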
The non-algorithmic solution is (essentially) to run the algorithm on the judges and then pick, American Idol style, judges that balance each other out to get what feels like an accurate result. In that case, you'd want to note each judge's average vote as well as their standard deviation, and perhaps pick three judges: one with a wide standard deviation and two with narrow ones, one high and one low (liberal and strict). This way they don't feel like they get 'less voice' because they are stricter or looser.
Then again, that could be an impetus for them to be less/more strict - if they are too easy or too hard on the product consistently they 'lose voice'.
It sounds like you may be trying to apply an algorithmic solution to a non-algorithmic problem. I'd think about why some "judges" vote only 1-2 and others vote only 4-5.
One possible cause could be self-selection. For example, people who bought an item online may be more likely to review the item if they were particularly disappointed or particularly pleased with their purchase. If this is your problem, you could try to encourage shoppers to vote more, so that even those who had a non-extreme experience come back to vote.
Another possible issue may be guidance. Maybe your explanation of the rating system isn't clear to the judges. You can try to add a description of what each rating means, and see if that improves the quality of data.
In summary, any kind of a solution to your rating problem will need to have a "human" component and take into account the full story of how the judges choose ratings and why. There is not a whole lot that a ranking algorithm can do if your input data is poor quality. On the other hand, if your data has decent quality, then taking a mean works quite well.
One unrelated problem with taking a mean is that an item with one 5-star rating will rank above an item with a hundred 5-star ratings plus one 4-star rating. One simple solution is Laplace smoothing, which addresses the problem by effectively starting every item with one vote of each value (1, 2, 3, 4, 5). You don't display the "smoothed" values, but you use them when sorting. See the post "How Not To Sort By Average Rating" for an alternative solution.
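The smoothing described above can be sketched as follows; the function name smoothedMean is hypothetical, and the one-vote-per-value prior is taken directly from the description.

```php
<?php
// Mean of the real votes plus one imaginary vote for each value on the scale.
function smoothedMean(array $votes, array $scale = [1, 2, 3, 4, 5]): float
{
    $all = array_merge($votes, $scale);
    return array_sum($all) / count($all);
}

// One 5-star vote: (5 + 15) / 6
$single = smoothedMean([5]);

// A hundred 5-star votes plus one 4-star vote: (504 + 15) / 106
$popular = smoothedMean(array_merge(array_fill(0, 100, 5), [4]));
```

The raw means would rank the single-vote item first (5.0 against roughly 4.99); the smoothed values reverse that order.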
How about a truncated mean? Here is a good explanation of the idea.
EDIT
Let's say you have votes like: [1,4,3,2,5,1,1,3,2,4].
You need to sort the array in ascending order, giving you: [1,1,1,2,2,3,3,4,4,5].
Then let's say you want to get rid of 25% of the votes at each end, which is 3 votes (2.5 rounded up). You simply discard three votes from the left and three from the right, giving you [2,2,3,3].
Then, use arithmetic mean to get 2.5.
EDIT 2
Depending on your database schema, you could query the database to return the votes in ascending order. Then work out how many votes to discard, use array_slice() to drop them (see its documentation), and calculate the arithmetic mean of what remains.
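Put together, the steps above might look like this. The function name truncatedMean is hypothetical; the fraction is applied to each end and rounded up, as in the worked example.

```php
<?php
// Mean of the votes after discarding $fraction of them from each end.
// Returns null if nothing would be left.
function truncatedMean(array $votes, float $fraction): ?float
{
    sort($votes);
    $n = count($votes);
    $drop = (int) ceil($n * $fraction);   // votes to discard at each end
    $keep = $n - 2 * $drop;
    if ($keep <= 0) {
        return null;
    }
    $kept = array_slice($votes, $drop, $keep);
    return array_sum($kept) / count($kept);
}

$result = truncatedMean([1, 4, 3, 2, 5, 1, 1, 3, 2, 4], 0.25);
```

If the votes already come back sorted from the database, the sort() call is redundant but harmless.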
I'm designing a site and don't know how to design the logic of the rating system.
The outcome I want is for an item with 4 stars and 1000 votes to be ranked higher than an item with 1 vote of 5 stars. However, I don't want an item with 1 star and 1000 votes to be ranked higher than an item with 4 stars and 200 votes.
Does anyone have any ideas or advice on what to do?
I found these two questions
Sorting by weighted rating in SQL?
MySQL Rating System - Find Rating
Both have their drawbacks, and in the first one I don't understand what the accepted answer means by "You may want to denormalize this rating value into event for performance reasons if you have a lot of ratings coming in." Could you share some insight? Thank you!
Here's a quick sketch of such a system, which works by defining a bonus factor xₙ for each star value n. According to your question you want:
x₄*4*1000 > x₅*1*5
and
x₁*1*1000 < x₄*4*200
Setting the factors to, for example, x₁=1, x₄=2 and x₅=2 will satisfy this, but you will of course want to adjust them and add the missing factors.
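To check the two inequalities with those example factors, here is a small sketch. The function name bonusScore is hypothetical, and summing over a star-value histogram is an assumption about how the idea would extend to items with mixed votes; factors for 2 and 3 stars are deliberately left undefined, as above.

```php
<?php
// $histogram maps star value => number of votes.
// $factors maps star value => bonus factor.
function bonusScore(array $histogram, array $factors): float
{
    $score = 0.0;
    foreach ($histogram as $stars => $votes) {
        if (!isset($factors[$stars])) {
            throw new InvalidArgumentException("No factor defined for $stars stars");
        }
        $score += $factors[$stars] * $stars * $votes;
    }
    return $score;
}

// Only the factors given in the example; the others still need to be chosen.
$factors = [1 => 1, 4 => 2, 5 => 2];

$fourStars1000 = bonusScore([4 => 1000], $factors); // 2 * 4 * 1000
$fiveStars1    = bonusScore([5 => 1], $factors);    // 2 * 5 * 1
$oneStar1000   = bonusScore([1 => 1000], $factors); // 1 * 1 * 1000
$fourStars200  = bonusScore([4 => 200], $factors);  // 2 * 4 * 200
```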
He means you should store the rating data in the event table as well (and thus have redundant data) to optimize for performance.
See the Wikipedia article on denormalization: http://en.wikipedia.org/wiki/Denormalization
The data you have to determine the rank of items is:
average rating
number of ratings
The hard part is probably making the rules for the ranking. For example: if the average rating for an item is > 4 and the number of ratings is < 4, treat it as if it were rated 3.9.
For convenience, I would put this value (how to treat the items for ranking) in the item-table.
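The example rule above could be expressed like this; the function name rankingValue and the sample items are hypothetical, and the thresholds are the ones from the example, not recommended values.

```php
<?php
// Value used for ranking only: highly rated items with very few ratings are capped.
function rankingValue(float $average, int $count): float
{
    if ($average > 4 && $count < 4) {
        return 3.9;
    }
    return $average;
}

// Hypothetical items: label => ranking value.
$items = [
    'one 5-star vote'  => rankingValue(5.0, 1),
    '1 star, 1000x'    => rankingValue(1.0, 1000),
    '4 stars, 1000x'   => rankingValue(4.0, 1000),
];

// Highest ranking value first, keeping the labels as keys.
arsort($items);
$ranked = array_keys($items);
```

This value is what would be stored in the item table and used in ORDER BY.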
Is there a way to specify a sorting procedure for ORDER BY, or some kind of custom logic? What I need is to check some other data for the column being ordered, which is also in the row. For example if one column has a higher value than another, but a certain condition isn't met, it's sorted as lower. Right now I pull all the data in the column, sort it in PHP with usort(), and then paginate it, but this is a pretty bad performance hog. I would really like to move it into MySQL, is it possible? If so, how? :P
Thanks in advance!
An example of the problem is on the website here: the records are sorted by win percentage, but players who have played 1 game end up on top with a 100% win rate. I'd like to set a threshold on games played and sort players below it lower, even though their win percentage is higher.
You can order by multiple expressions:
ORDER BY games_played < 10, wins / losses DESC
The first expression sorts all players who have played 10 or more games above all players who have played fewer than 10 games: the comparison evaluates to 0 (false) or 1 (true), and the default ascending order puts 0 first. The second expression sorts by win/loss ratio and is only used to tie-break rows that are equal on the first expression. This means that a player who has played 10 games will always appear above a player who has played only 9 games, regardless of their win/loss ratios.
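To make the two-key ordering concrete, here is the same logic as a usort() comparator, purely as an illustration of what the ORDER BY does (the goal remains to let MySQL do the sorting). It uses wins divided by games played, as in the question's win percentage, rather than wins/losses; the column names and sample players are assumptions.

```php
<?php
// Win rate as a fraction of games played; 0.0 for players with no games.
function winRate(array $p): float
{
    return $p['games_played'] > 0 ? $p['wins'] / $p['games_played'] : 0.0;
}

function comparePlayers(array $a, array $b): int
{
    // First key: 0 for players at or above the threshold, 1 for those below (ascending).
    $first = (int) ($a['games_played'] < 10) <=> (int) ($b['games_played'] < 10);
    if ($first !== 0) {
        return $first;
    }
    // Second key: win rate, descending.
    return winRate($b) <=> winRate($a);
}

// Hypothetical players.
$players = [
    ['name' => 'X', 'games_played' => 1,  'wins' => 1],
    ['name' => 'Y', 'games_played' => 20, 'wins' => 15],
    ['name' => 'Z', 'games_played' => 50, 'wins' => 30],
    ['name' => 'W', 'games_played' => 9,  'wins' => 3],
];

usort($players, 'comparePlayers');
$names = array_column($players, 'name');
```

The one-game player with a 100% win rate now sorts below both established players.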