I have an app that works with an idea of "redemption codes" (schema: ID, NAME, USES, CODE). And example would be "32, Stack Overflow, 75, 75%67-15hyh"
So this code is given to the SO community, let's say, and it has 75 redemptions. You redeem it by entering some shipping info and the code. When entered, this check is preformed:
if (code exists){
if (count_entries_where_code=$code < $uses_set_at_creation){
//enter information into DB for processing
{
//echo "sorry, not a real code"
}
So the number of total uses is hardcoded, but the current # of redemptions is generated with a SQL query (count_results from entry_data WHERE code=$code). This part works fine, but here is the question:
At the view page where I manage the codes, I have the basic setup (in pseudo PHP, with the real code separated into an MVC setup):
$results = "SELECT * FROM codes";
foreach ($result as $code){
echo $code->code;
echo $code->name;
//etc. It's actually all in a nice HTML table.
}
So I want to have a column listing "# of uses remaining on code". Should something like this be stored in the DB, and drawn out that way? It would be easier to generate with the foreach loop, but I don't usually prefer to store "generated" statistics like that. Is there a clever way to get those results onto the correct rows of the table created with the foreach loop?
(I'm fine with code so I don't need a working/great syntax example, just an explanation of a pattern that might fit this problem, and maybe a discussion of a common design for something like this. Am I right to avoid storing generate-able data like # of uses left? etc.)
Am I right to avoid storing generate-able data like # of uses left?
Yes, you are correct to not store computed values.
Computation logic can change, and working with a stored computed value to reverse engineer it can be a nightmare - if it is possible at all in some cases.
It sounds like you want to combine the two queries:
SELECT c.id,
c.name,
c.uses,
c.code,
x.num_used
FROM CODES c
JOIN (SELECT ed.code,
COUNT(*) 'num_used'
FROM ENTRY_DATA ed
GROUP BY ed.code) x ON x.code = c.code
When you run your query to get the codes for the page add a subquery to get the number of used codes from the entry_data table.
select codes.id, codes.name, codes.uses, codes.code (select count(code) from entry_data where entry_data.code=codes.code ) as used_codes
Id use code_id as a foreign key and not code.
This is all assuming i'm reading your problem correctly
Related
I'm working on an application which is a large database of chemical substances (approx 250,000 but rising) and associated data. I'm looking at ways to optimise the way searching is performed.
The application is running under PHP 7.0.27, MariaDB 5.5.56, and Apache 2.4.6
The application allows searching by chemical name and various chemical codes (such as EC number and CAS number). The schema is such that there are separate tables to hold the data, and the relationships of which codes apply to which chemicals.
These tables are in the database:
substances - unique ID and name for each chemical substance.
ecs - a list of EC Numbers
ecs_substances - which EC Number(s) apply to which substances
cas - a list of CAS Numbers
cas_substances - which CAS Number(s) apply to which substances
Note: there are other tables than the ones above where similar logic will apply, but for now I want to focus on these for this example.
It is possible for a substance to have multiple EC/CAS numbers, and a small number do not have them - i.e. it's not a simple 1:1 relationship.
The application has search fields for the substance name (substances.name), EC number (ecs.value) CAS number (cas.value). These can be used on their own, or in conjuction with each other. For example: find a substance by name, or find a substance by name and CAS number.
I believe the "quickest" way of performing a search for any given value would be to use a LIKE condition on the specific table required. So if I want to look up substances which have "acids" as part of the name:
SELECT id FROM substances WHERE name LIKE '%acids%' LIMIT 0,250
However the results that the application gives are shown in a table which includes headings for substance name, CAS number, EC number. It also allows the results to be ordered on a column (e.g. order by substance name, CAS, EC, etc). This requires JOIN conditions.
I'm doing it like this:
$sql = 'SELECT
DISTINCT(substances.`id`),
substances.`name`,
"" AS cas_number,
"" AS ec_number
FROM
substances ';
// Search - EC Number, or if trying to order by EC column (JOIN has to occur to make that possible)
if ( (isset($search['ecNumber'])) || (isset($order['column']) && ($order['column'] == 'ec_number')) ) {
$sql .= ' LEFT JOIN ecs_substances ON substances.id = ecs_substances.substance_id LEFT JOIN ecs ON ecs_substances.ec_id = ecs.id ';
}
// Search - CAS Number, or if trying to order by CAS column (JOIN has to occur to make that possible)
if ( (isset($search['casNumber'])) || (isset($order['column']) && ($order['column'] == 'cas_number')) ) {
$sql .= ' LEFT JOIN cas_substances ON cas_substances.substance_id = substances.id LEFT JOIN cas ON cas_substances.cas_id = cas.id ';
}
The problem is that because of all the JOINs that are occurring it's slowing down how quickly the results can be obtained.
Benchmark: The first query I posted which just uses a LIKE condition on 1 table will execute in 140ms, whereas it's taking 506ms for the same search criteria with all of the JOIN statements in the second block of code.
I'd like to know if there are ways to optimise this such that the time taken to present results to the user decreases.
It's worth mentioning that the results are displayed in DataTables and PHP is producing a JSON feed of the results. The LIMIT 0,250 is something the end user can override by setting results per page, but I'm happy to limit them to say no more than 500 per page.
Some things I've looked into are:
Caching the JSON. Not a big fan of this because the data is updated quite regularly. The data presented must always be what is in the database, not some cached copy.
Do a search on the required table as in the first code sample. Update the other columns with ajax. This would "appear" to give instant results on the column the user has searched and then quickly thereafter populate the other columns required by the DataTable. This seems incredibly fiddly to do and I don't know whether it's really a good idea.
Consider FULLTEXT because it allows for much faster searching than LIKE with a leading wildcard %. `MATCH(col) AGAINST('+acid' IN BOOLEAN MODE)
Sounds like you need a "many:many" mapping table. Tips on efficiency in such: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
Consider using GROUP_CONCAT(cas) for provideing a comma-list of CASs.
JSON does not seem practical. And even less so since you are using only MySQL 5.5.
I think a response time of half a second is quite good, given what you want to do. You must have done all necessary database optimizations? (db type, indexes, etc).
There are several things you could explore:
Prepare all possible searches and store them in a database for quick access. This may sound stupid but this is how I often achieve fast searches. It's difficult for me to judge what the best way to do this, with your data, would be. You could start by adding a TEXT column to your substances table and store all the information about the substance in it: It's name, and all EC/CAS numbers. Separate the items with something like '|', or any other character not used in searches. I would call that the 'search' column. Alternatively you could make a new table, just for searching with that column in it, and the id of the substance. Now you can make one search input field for all three types of data and search in one column only. Would that work for you? Would it be faster? Possibly, but I cannot guarantee it. I don't know, but it's quite easy to try. There is a disadvantage: You would have to update that column with every change in the database.
Use a proper search engine. Several are available for mariadb. Start at: https://mariadb.com/kb/en/library/about-sphinxse It basically does something far more advanced than what I described under point 1: Prepare a database with data for optimized searching.
Still, a response of half a second would be something I could live with.
I have a string in a database field (called term6eyfs) that is made up of numbers -> 555555.
I want to count how many of them have a particular number in a particular position.
I have tried the following code, but I'm met with a Boolean given... error
$pos=2;
$analyse_ot="SELECT COUNT(*) AS ot_count FROM base, users
WHERE base.base_id=$base
AND
users.base_id=$base
AND
users.SUBSTRING(term6eyfs,$pos,1)='4'";
$result_ot=mysqli_query($con,$analyse_ot);
$row_ot = mysqli_fetch_assoc($result_ot);//this is where I get the error
$total_ot= $row_ot['ot_count'];
$otper=($total_ot/$total)*100;
I'm guessing that they way I have constructed my query (particularly the final line) isn't correct, but why?
Based on the additional details you gave in the comment I'd say that is what you are looking for:
SELECT COUNT(*) AS ot_count
FROM base, users
WHERE base.base_id=$base
AND users.base_id=$base
AND SUBSTRING(users.term6eyfs,$pos,1)='4'";
(this assumes that "term6eyfs" is the name of a column in the users table)
The context of this query is unclear. But in general it does make sense to use "parameter binding" to inject php variable values into a query string. You want to read about that, it enhances security and robustness.
Also reconsider if you really want to use the , operator to join those two tables. That operator is extremely slow, usually a LEFT JOIN delivery a much better performance.
Change users.substring() to substring
$analyse_ot="SELECT COUNT(*) AS ot_count FROM base, users
WHERE base.base_id=$base
AND
users.base_id=$base
AND
SUBSTRING(term6eyfs,$pos,1)='4'";
I'm making a micro-blogging website. The users can follow each other. I've to make stream of posts (activity stream) for the current user ( $userid ) based on the users the current user is following, like in Twitter. I know two ways of implementing this. Which one is better?
Tables:
Table: posts
Columns: PostID, AuthorID, TimeStamp, Content
Table: follow
Columns: poster, follower
The first way, by joining these two tables:
select `posts`.* from `posts`,`follow` where `follow`.`follower`='$userid' and
`posts`.`AuthorID`=`follow`.`poster` order by `posts`.`postid` desc
The second way is by making an array of users the $userid is following (posters), then doing php implode on this array, and then doing where in:
One thing I'll like to tell here that I'm storing the the number of users a user is following in the `following` record of the `user` table, so here I'll use this number as a limit when extracting the list of posters - the 'followingList':
function followingList($userid){
$listArray=array();
$limit="select `following` from `users` where `userid`='$userid' limit 1";
$limit=mysql_query($limit);
$limit=mysql_fetch_row($limit);
$limit= (int) $limit[0];
$sql="select `poster` from `follow` where `follower`='$userid' limit $limit";
$result=mysql_query($sql);
while($data = mysql_fetch_row($result)){
$listArray[] = $data[0];
}
$posters=implode("','",$listArray);
return $posters;
}
Now I've a comma separated list of user IDs the current $userid is following.And now selecting the posts to make the activity stream:
$posters=followingList($userid);
$sql = "select * from `posts` where (`AuthorID` in ('$posters'))
order by `postid` desc";
Which of the two methods is better?
And can knowing the total number of following (number of users the current user is following), make things faster in the first method as it's doing in the second method?
Any other better method?
You should go all the way with the first option. Always try as much as possible to process the data on the mysql server instead of in your PHP code. PHP will not implicitly cache the results of the operations while MySQL will do it.
The most important thing is to make sure you index your data correctly. Try using "EXPLAIN" statements to make sure you have optimized your database as much as possible and use #1 to link your data together.
http://dev.mysql.com/doc/refman/5.0/en/explain.html
This will allow you later to compute statistics also, while the second method requires you to process a part of the statistics.
The first important point is that PHP is good at building pages but very bad are managing data, everything manipulated by PHP will fill the memory and no special behavior can be applied in PHP to prevent using to much memory, except crashing.
On the other side the datatase job is to analyse relation between the tables, real number used by the query (cardinality of indexes and statictics on rows and index usage in fact), and a lot of different mechanism can be choosen by the engine depending on the size of data (merge joins, temporary tables, etc). That means you could have 256.278.242 posts and 145.268 users, with 5.684 average followers the datatabase job would be to find the fastest way to give you an answer. Well, when you hit really big numbers you'll see that all databases are not equal, but that's another problem.
On the PHP side Retrieving the list of users from the fisrt query coudl became very long (with a big number of followed users, let's say 15.000. Simply building the query string with 15 000 identifiers inside would take a quite big amount a memory. Trasnferring this new query to the SQL server would also be slow. It's definitively the wrong way.
Now be careful of the way you build your SQL request. A request is something you should be able to read from the top to the end, explaining what you really want. This will help the SQL (good) engine in choosing the right solution.
select `posts`.*
from `posts`
INNER JOIN `follow` ON posts`.`AuthorID`=`follow`.`poster`
where `follow`.`follower`='#userid'
order by `posts`.`postid` desc
LIMIT 15
Several remarks:
I have used an INNER JOIN.I want an INNER JOIN, let's write it, it will be easier to read for me later and it should be the same for the query analyser.
if #userid is an int do not use quotes. Please use ints for identifiers (this is really faster than strings). And on the PHP side cast the int "SELECT ..." . (int) $user_id ." ORDER ... or use query with parameters (This is for security).
I have used a LIMIT 15, maybe an offset could be used as well, if you want to show some pagination control around the posts. Let's say this query will retrieve 15.263 documents from my 5.642 folowwed users, you do not want, and the user do not want, to show theses 15.263 documents on a web page. And knowing with $limit that the number is 15.263 is a good thing but certainly not for a request limit. You know this number, but the database may know it as well if it has a good query analyser and some good internal statistics.
The request limit has several goals
1. Limit the size of data transfered from the database to your PHP script
2. Limit the memory usage of your PHP script (an array with 15.263 documents containg some HTMl stuff... ouch)
3. Limit the size of the final user output (and get a faster response)
I am trying to create a script that finds a matching percentage between my table rows. For example my mySQL database in the table products contains the field name (indexed, FULLTEXT) with values like
LG 50PK350 PLASMA TV 50" Plasma TV Full HD 600Hz
LG TV 50PK350 PLASMA 50"
LG S24AW 24000 BTU
Aircondition LG S24AW 24000 BTU Inverter
As you may see all of them have some same keyword. But the 1st name and 2nd name are more similar. Additionally, 3rd and 4th have more similar keywords between them than 1st and 2nd.
My mySQL DB has thousands of product names. What I want is to find those names that have more than a percentage (let's say 60%) of similarity.
For example, as I said, 1st, 2nd (and any other name) that match between them with more than 60%, will be echoed in a group-style-format to let me know that those products are similar. 3rd and 4th and any other with more than 60% matching will be echoed after in another group, telling me that those products match.
If it is possible, it would be great to echo the keywords that satisfy all the grouped matching names. For example LG S24AW 24000 BTU is the keyword that is contained in 3rd and 4th name.
At the end I will create a list of all those keywords.
What I have now is the following query (as Jitamaro suggested)
Select t1.name, t2.name From products t1, products t2
that creates a new name field next to all other names. Excuse me that I don't know how to explain it right but this is what it does: (The real values are product names like above)
Before the query
-name-
A
B
C
D
E
After the query
-name- -name-
A A
B A
C A
D A
E A
A B
B B
C B
D B
E B
.
.
.
Is there a way either with mySQL or PHP that will find me the matching names and extract the keywords as I described above? Please share code examples.
Thank you community.
Query the DB with LIKE OR REGEXP:
SELECT * FROM product WHERE product_name LIKE '%LG%';
SELECT * FROM product WHERE product_name REGEXP "LG";
Loop the results and use similar_text():
$a = "LG 50PK350 PLASMA TV 50\" Plasma TV Full HD 600Hz"; // DB value
$b = "LG TV 50PK350 PLASMA 50\"" ; // USER QUERY
$i = similar_text($a, $b, $p);
echo("Matched: $i Percentage: $p%");
//outputs: Matched: 21 Percentage: 58.3333333333%
Your second example matches 62.0689655172%:
$a = "LG S24AW 24000 BTU"; // DB value
$b = "Aircondition LG S24AW 24000 BTU Inverter" ; // USER QUERY
$i = similar_text($a, $b, $p);
echo("Matched: $i Percentage: $p%");
You can define a percentage higher than, lets say, 40%, to match products.
Please note that similar_text() is case SensItivE so you should lower case the string.
As for your second question, the levenshtein() function (in MySQL) would be a good candidate.
When I look at your examples, I consider how I would try to find similar products based on the title. From your two examples, I can see one thing in each line that stands out above anything else: the model numbers. 50PK350 probably doesn't show up anywhere other than as related to this one model.
Now, MySQL itself isn't designed to deal with questions like this, but some bolt-on tools above it are. Part of the problem is that querying across all those fields in all positions is expensive. You really want to split it up a certain way and index that. The similarity class of Lucene will grant a high score to words that rarely appear across all data, but do appear as a high percentage of your data. See High level explanation of Similarity Class for Lucene?
You should also look at Comparison of full text search engine - Lucene, Sphinx, Postgresql, MySQL?
Scoring each word against the Lucene similarity class ought to be faster and more reliable. The sum of your scores should give you the most related products. For the TV, I'd expect to see exact matches first, then some others of the same size, then brand, then TVs in general, etc.
Whatever you do, realize that unless you alter the data structures by using another tool on top of the SQL system to create better data structures, your queries will be too slow and expensive. I think Lucene is probably the way to go. Sphinx or other options not mentioned may also be up for consideration.
This is trickier than it seems and there is information missing in your post:
How are people going to use this auto-complete function?
Is it relevant that you can find all names for a product? Because apparently not all stores name their products similarly so a clerk might not be able to find the product (s)he found.
Do you have information about which product names are for the same product?
Is it relevant from which store you're searching? where is this auto-complete used?
Should the auto-complete really only suggest products that match all the words you typed? (it's not so hard, technically, to correct typos)
I think you need a more clear picture of what you (or better yet: the users) want this auto-complete function to do.
An auto-complete function is very much a user-friendly type feature. It aids the user, possibly in a fuzzy way so there is no single right answer. You have to figure out what works best, not what is easiest to do technically.
First figure out what you want, then worry about technology.
One possible solution is to use Damerau-Levenstein distance. It could be used like this
select *
from products p
where DamerauLevenstein(p.name, '*user input here*')<=*X*
You'll have to figure out X that suites your needs best. It should be integer greater than zero. You could have it hard-coded, parameterized or calculated as needed.
The trickiest thing here is DamerauLevenstein. It has to be stored procedure, that implements Damerau-Levenstein algorithm. I don't have MySQL here, so I might write it for you later this day.
Update: MySQL does not support arrays in stored procedures, so there is no way to implement Damerau-Levenstein in MySQL, except using temporary table for each function call. And that will result in terrible performance. So you have two options: loop through the results in PHP with levenstein like Alix Axel suggests, or migrate your database to PostgreSQL, where arrays are supported.
There is also an option to create User-Defined function, but this requires writing this function in C, linking it to MySQL and possibly rebuilding MySQL, so this way you'll just add more headache.
Your approach seems sound. For matching similar products, I would suggest a trigram search. There's a pretty decent explanation of how this works along with the String::Trigram Perl module.
I would suggest using trigram search to get a list of matches, perhaps coupled with some manual review depending on how much data you have to deal with and how frequent you need to add new products. I've found this approach to work quite well in practice.
Maybe you want to find the longest common substring from the 2 strings? Then you need to compute a suffix tree for each of your strings see here http://en.wikipedia.org/wiki/Longest_common_substring_problem.
If you want to check all names against each other you need a cross join in mysql. There are many ways to achieve this:
1. Select a, b From t1, t2
2. Select a, b From t1 Join t2
3. Select a, b From t1 Cross Join t2
Then you can loop through the result. This is the same when I say create a 2d array with n^2-(n-1) elements and each element is connected with each other.
P.S.: Select t1.name, t2.name From products t1, products t2
It sounds like you've gone through all this trouble to explain a complex scenario, then said that you want to ignore the optimal answers and just get us to give you the "handshake" protocol (everything is compared to everything that hasn't been compared to it yet). So... pseudocode:
select * from table order by id
while (result) {
select * from table where id > result_id
}
That will do it.
If your database simply had a UPC code as one of it's fields, and this field was well-maintained, i.e., you could trust that it was entered correctly by the database maintainer and correctly reflected what the item was -- then you wouldn't need to do all of the work you suggest.
An even better idea might be to have a UPC field in your next database -- and constrain it as unique.
Database users attempt to put an-already-existing UPC into the database -- they get an error.
Database maintains its integrity.
And if such a database maintained its integrity -- the necessity of doing what you suggest never arises.
This probably doesn't help much with your current task (apologies) -- but for a future similar database -- you might wish to think about it...
I`d advise you to use some fulltext search engine, like sphinx. It has possibilities to implement any algorithm you want. For example, you may use "quorom" or "any" searches.
It seems that you might always want to return the shortest string?? That's more or a question than anything. But then you might have something like...
SELECT * FROM products LIMIT 1
WHERE product_name like '%LG%'
ORDER BY LENGTH(product_name) ASC
This is a clustering problem, which can be resolved by a data mining method. ( http://en.wikipedia.org/wiki/Cluster_analysis) It requires a lot of memory and computation intensive operations which is not suitable for database engine. Otherwise, separate data mining, text mining, or business analytics software wouldn't have existed.
This question is similar :) to this one:
What is the best way to implement a substring search in SQL?
Trigram can easily find similar rows, and in that question i posted a php+mysql+trigram solution.
You can use LIKE to find similar product names within the table. For example:
SELECT * FROM product WHERE product_name LIKE 'LG%';
Here is another idea (but I'm voting for levenshtein()):
Create a temporary table of all words used in names and their frequencies.
Choose range of results (most popular words are probably words like LCD or LED, most unique words could be good, they might be product actual names).
Suggest for each of result words either:
results with those words
results containing longest substring (like this: http://forums.mysql.com/read.php?10,277997,278020#msg-278020 ) of those words.
Ok, I think I was trying to implement very much similar thing. It can work the same as the google chrome address box. When you type the address it gives you the suggestions. This is what you are trying to achieve as far I am concerned.
I cannot give you exact solution to that but some advice.
You need to implement the dropdown box where someone starts to enter the product they are looking for
Then you need to get the current value of the dropdown and then run query like guy posted above. Can be "SELECT * FROM product WHERE product_name LIKE 'LG%';"
Save results of the query
Refresh the page
Add the results of the query to the dropdown
Note:
You need to save the query results somewhere like the text file with the HTML code i.e. "option" LG TS 600"/option" (add <> brackets to option of course). This values will be used for populating your option box after the page refresh. You need to set up the users session for the user to get the same results for the same user, otherwise if more users would use the search at the same time it could clash. So, with the search id and session id you can match them then. You can save it in the file or the table. Table would be more convenient. It is actually in my sense the whole subsystem for that what are you looking for.
I hope it helps.
OK - I'll get straight to the point - here's the PHP code in question:
<h2>Highest Rated:</h2>
<?php
// Our query base
$query = $this->db->query("SELECT * FROM code ORDER BY rating DESC");
foreach($query->result() as $row) {
?>
<h3><?php echo $row->title." ID: ";echo $row->id; ?></h3>
<p class="author"><?php $query2 = $this->db->query("SELECT email FROM users WHERE id = ".$row->author);
echo $query2->row('email');?></p>
<?php echo ($this->bbcode->Parse($row->code)); ?>
<?php } ?>
Sorry it's a bit messy, it's still a draft. Anyway, I researched ways to use a Ratings system - previously I had a single 'rating' field as you can see by SELECT * FROM code ORDER BY rating DESC. However I quickly realised calculating averages like that wasn't feasible, so I created five new columns - rating1, rating2, rating3, rating4, rating5. So when 5 users rating something 4 stars, rating4 says 5... does that make sense? Each ratingx column counts the number of times the rating was given.
So anyway: I have this SQL statement:
SELECT id, (ifnull(rating1,0) + ifnull(rating2,0) + ifnull(rating3,0) + ifnull(rating4,0) + ifnull(rating5,0)) /
((rating1 IS NOT NULL) + (rating2 IS NOT NULL) + (rating3 IS NOT NULL) + (rating4 IS NOT NULL) + (rating5 IS NOT NULL)) AS average FROM code
Again messy, but hey. Now what I need to know is how can I incorporate that SQL statement into my script? Ideally you'd think the overall query would be 'SELECT * FROM code ORDER BY (that really long query i just stated) DESC' but I can't quite see that working... how do I do it? Query, store the result in a variable, something like that?
If that makes no sense sorry! But I really appreciate the help :)
Jack
You should go back to the drawing board completely.
<?php
$query = $this->db->query("SELECT * FROM code ORDER BY rating DESC");
foreach($query->result() as $row) {
$this->db->query("SELECT email FROM users WHERE id = ".$row->author;
}
Anytime you see this in your code, stop what you're doing immediately. This is what JOINs are for. You almost never want to loop over the results of a query and issue multiple queries from within that loop.
SELECT code.*, users.email
FROM code
JOIN users ON users.id = code.author
ORDER BY rating DESC
This query will grab all that data in a single resultset, removing the N+1 query problem.
I'm not addressing the rest of your question until you clean up your question some and clarify what you're trying to do.
if you would like to change your tables again, here is my suggestion:
why don't you store two columns: RatingTotal and RatingCount, each user that rates it will increment RatingCount by one, and whatever they vote (5,4,4.2, etc) is added to RatingTotal. You could then just ORDER BY RatingTotal/RatingCount
also, I hope you store which users rated each item, so they don't vote multiple times! and swing the average their way.
First, I'd decide whether your application is write-heavy or read-heavy. If there are a lot more reads than writes, then you want to minimize the amount of work you do on reads (like this script, for example). On the assumption that it's read-heavy, since most webapps are, I'd suggest maintaining the combined average in a separate column and recalculating it whenever a user adds a new rating.
Other options are:
Try ordering by the calculated column name 'average'. SQL Server supports this. . not sure about mysql.
Use a view. You can create a view on your base table that does the average calculation for you and you can query against that.
Also, unrelated to your question, don't do a separate query for each user in your loop. Join the users table to the code table in the original query.
You should include it in the SELECT part:
SELECT *, (if ....) AS average FROM ... ORDER BY average
Edit: assuming that your ifnull statement actually works...
You might also want to look into joins to avoid querying the database again for every user; you can do everything in 1 select statement.
Apart from that I would also say that you only need one average and the number of total votes, that should give you all the information you need.
Some excellent ideas, but I think the best way (as sidereal said that it's more read heavy that write heavy) would be to have columns rating and times_rated, and just do something like this:
new_rating = ((times_rated * rating) + current_rating) / (times_rated + 1)
current_rating being the rating being applied when the person clicks the little stars. This simply weights the current user's rating in an average with the current rating.