Need some guidance on a PHP point system - php

I am trying to build a point system that checks how many points a user has and gives them a specific title.
I have prepared a table that the PHP script can refer to when deciding which title a member should get.
MySQL table structure:
name: ptb
structure: pts, title
For example, with 100 points you gain the title "Veteran", and with 500 points you gain the title "Pro". Let's say the ptb table has the rows pts:100, title:veteran and pts:500, title:pro.
However, I have run into a confusing point.
How can I use PHP to determine which title to give the user from the ptb table data?
A user with 100 points or more gets the title Veteran, BUT 500 is also MORE THAN 100, which means the PHP script also needs to make sure the user is below 500 points.
I am still not sure how to do this in PHP, as I am confused myself.
I hope someone can understand and give me some guidelines.
THANKS!

Select all records whose threshold the user has reached, sort them so the highest threshold comes first, and cut off the rest.
SELECT title FROM ptb WHERE pts <= $points ORDER BY pts DESC LIMIT 1
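As a minimal sketch of that lookup, here it is run against an in-memory SQLite database from Python, standing in for MySQL and PHP. The helper name title_for is made up for illustration, and the value is passed as a bound parameter rather than interpolated like $points.

```python
import sqlite3

def title_for(conn, points):
    """Return the title with the highest threshold <= points, or None."""
    row = conn.execute(
        "SELECT title FROM ptb WHERE pts <= ? ORDER BY pts DESC LIMIT 1",
        (points,),  # bound parameter instead of string interpolation
    ).fetchone()
    return row[0] if row else None

# Demo data matching the rows described in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ptb (pts INTEGER PRIMARY KEY, title TEXT NOT NULL)")
conn.executemany("INSERT INTO ptb VALUES (?, ?)", [(100, "Veteran"), (500, "Pro")])

print(title_for(conn, 250))  # Veteran
```

A user below the lowest threshold gets no row back, so the caller has to decide what "no title" means.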

(PiTheNumber's solution doesn't work very well if you want to retrieve titles for multiple users.)
Since the points will change over time and multiple users can have the same title, it sounds like this should be two tables:
CREATE TABLE users (
userid ...whatever type,
points INTEGER NOT NULL DEFAULT 0,
PRIMARY KEY(userid)
);
CREATE TABLE titles (
title VARCHAR(50),
minpoints INTEGER NOT NULL DEFAULT 0,
PRIMARY KEY (title),
UNIQUE INDEX (minpoints)
);
Then....
SELECT u.userid, u.points, t.title
FROM users u, titles t
WHERE u.points>=t.minpoints
AND ....other criteria for filtering output....
AND NOT EXISTS (
-- no other title that the user also qualifies for has a higher threshold
SELECT 1
FROM titles t2
WHERE t2.minpoints<=u.points
AND t2.minpoints>t.minpoints
);
(there are other ways to write the query)
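A small sketch of the two-table approach, run on SQLite from Python as a stand-in for MySQL. The integer userid type, the sample users, and the 'Newcomer' title at 0 points are assumptions made for the demo. The NOT EXISTS subquery discards every title for which the user also qualifies for a higher one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (userid INTEGER PRIMARY KEY, points INTEGER NOT NULL DEFAULT 0);
CREATE TABLE titles (title VARCHAR(50) PRIMARY KEY,
                     minpoints INTEGER NOT NULL DEFAULT 0 UNIQUE);
-- 'Newcomer' is a hypothetical base title so that every user matches a row
INSERT INTO titles VALUES ('Newcomer', 0), ('Veteran', 100), ('Pro', 500);
INSERT INTO users VALUES (1, 50), (2, 100), (3, 499), (4, 500);
""")

QUERY = """
SELECT u.userid, u.points, t.title
FROM users u
JOIN titles t ON t.minpoints <= u.points
WHERE NOT EXISTS (
    SELECT 1 FROM titles t2
    WHERE t2.minpoints <= u.points     -- the user also qualifies for t2
      AND t2.minpoints > t.minpoints   -- and t2 ranks above t
)
ORDER BY u.userid
"""
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```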

Related

Pagination Offset Issues - MySQL

I have an orders grid holding 1 million records. The page has pagination, sort and search options. If the sort order is set to customer name with a search key and the page number is 1, it works fine.
SELECT * FROM orders WHERE customer_name like '%Henry%' ORDER BY
customer_name desc limit 10 offset 0
It becomes a problem when the User clicks on the last page.
SELECT * FROM orders WHERE customer_name like '%Henry%' ORDER BY
customer_name desc limit 10 offset 100000
The above query takes forever to load. Index is set to the order id, customer name, date of order column.
I can use this solution https://explainextended.com/2009/10/23/mysql-order-by-limit-performance-late-row-lookups/ if I don't have a non-primary key sort option, but in my case sorting is user selected. It will change from Order id, customer name, date of order etc.
Any help would be appreciated. Thanks.
Problem 1:
LIKE "%..." -- The leading wildcard requires a full scan of the data, or at least until it finds the 100000+10 rows. Even
... WHERE ... LIKE '%qzx%' ... LIMIT 10
is problematic, since there are probably not 10 such names. So it needs a full scan of your million names.
... WHERE name LIKE 'James%' ...
will at least start in the middle of the table-- if there is an index starting with name. But still, the LIMIT and OFFSET might conspire to require reading the rest of the table.
Problem 2: (before you edited your Question!)
If you leave out the WHERE, do you really expect the user to page through a million names looking for something?
This is a UI problem.
If you have a million rows, and the output is ordered by Customer_name, that makes it easy to see the Aarons and the Zywickis, but not anyone else. How would you get to me (James)? Either you have 100K links and I am somewhere near the middle, or the poor user would have to press [Next] 'forever'.
My point is that the database is not the place to introduce efficiency.
In some other situations, it is meaningful to go to the [Next] (or [Prev]) page. In these situations, "remember where you left off", then use that to efficiently reach into the table. OFFSET is not efficient. More on Pagination
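One way to "remember where you left off" is keyset pagination: pass the sort key of the last row shown and seek past it instead of counting rows with OFFSET. The sketch below uses SQLite from Python as a stand-in for MySQL. The next_page helper, the index name, and the sample data are invented for the demo, and it sorts ascending by (customer_name, id) so that ties on the name are broken by the primary key.

```python
import sqlite3

def next_page(conn, last_name, last_id, page_size=10):
    """Return the rows that follow (last_name, last_id) in (customer_name, id) order."""
    return conn.execute(
        """
        SELECT id, customer_name FROM orders
        WHERE customer_name > ?
           OR (customer_name = ? AND id > ?)
        ORDER BY customer_name, id
        LIMIT ?
        """,
        (last_name, last_name, last_id, page_size),
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_name TEXT NOT NULL)")
# The index lets the database seek straight to the starting point.
conn.execute("CREATE INDEX idx_name_id ON orders (customer_name, id)")
conn.executemany(
    "INSERT INTO orders (id, customer_name) VALUES (?, ?)",
    [(i, f"Customer{i % 7:02d}") for i in range(1, 36)],
)

page1 = next_page(conn, "", 0)                            # "" sorts before every name
page2 = next_page(conn, page1[-1][1], page1[-1][0])       # continue after the last row shown
```

This only supports [Next]/[Prev] style navigation, not jumping to an arbitrary page number, and each user-selectable sort column would need its own index and seek condition.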
I use a special concept for this. First I have a table called pager. It contains a primary key pager_id and some values to identify a user (user_id, session_id), so that the pager data can't be stolen.
Then I have a second table called pager_filter. It consists of 3 ids:
pager_id int unsigned not NULL # id of table pager
order_id int unsigned not NULL # store the order here
reference_id int unsigned not NULL # reference into the data table
primary key(pager_id,order_id);
As the first operation, I select all records matching the filter rules from the data table and insert them into pager_filter (ROW_NUMBER() needs an OVER clause and MySQL 8.0 or later):
DELETE FROM pager_filter WHERE pager_id = $PAGER_ID;
INSERT INTO pager_filter (pager_id,order_id,reference_id)
SELECT $PAGER_ID AS pager_id, ROW_NUMBER() OVER (ORDER BY $ORDERING) AS order_id, data_id AS reference_id
FROM data_table
WHERE $CONDITIONS
ORDER BY $ORDERING
After filling the filter table you can use an inner join for pagination:
SELECT d.*
FROM pager_filter f
INNER JOIN data_table d ON d.data_id = f.reference_id
WHERE f.pager_id = $PAGER_ID AND f.order_id BETWEEN 100000 AND 100099
ORDER BY f.order_id
or
SELECT d.*
FROM pager_filter f
INNER JOIN data_table d ON d.data_id = f.reference_id
WHERE f.pager_id = $PAGER_ID
ORDER BY f.order_id
LIMIT 100 OFFSET 100000
Hint: all of the code above is untested pseudo code.

Caching big data, alternative query or other indexes?

I have a problem. I am working on highscores, and for those highscores you need to build a ranking based on skill experience and latest update time (to see who reached the score first in case the skill experience is the same).
The problem is that with the query I wrote, it takes 28 (skills) x 0.7 seconds to build a personal highscore page showing a player's rank on each list. Requesting this in the browser is just not doable: the page takes far too long to load, and I need a solution.
MySQL version: 5.5.47
The query I wrote:
SELECT rank FROM
(
SELECT highscore.playerID, (@rowID := @rowID + 1) AS rank
FROM
(
SELECT hs.playerID
FROM highscores AS hs
INNER JOIN overall AS o ON hs.playerID = o.playerID
WHERE hs.skillID = ?
AND o.game_mode = ?
ORDER BY hs.skillExperience DESC,
hs.updateTime ASC
) highscore,
(SELECT @rowID := 0) r
) data
WHERE data.playerID = ?
As you can see, I first have to build a whole result set that gives a full ranking for that game mode and skill, and only then can I select the rank for the playerID. I cannot let the query run only until it finds the player, because MySQL doesn't offer such a function. If I specified WHERE playerID = ? in the inner query, it would return a single row, so the rank would always be 1.
The highscores table has 550k rows.
I have tried storing the result set for each skillID/game mode combination, JSON-encoded, in a temp table, and I have tried storing it in files, but both ended up quite slow as well, because the files are really huge and take time to process.
Highscores table:
CREATE TABLE `highscores` (
`playerID` INT(11) NOT NULL,
`skillID` INT(10) NOT NULL,
`skillLevel` INT(10) NOT NULL,
`skillExperience` INT(10) NOT NULL,
`updateTime` BIGINT(20) NOT NULL,
PRIMARY KEY (`playerID`, `skillID`)
)
COLLATE='utf8_general_ci'
ENGINE=MyISAM;
Overall table has got 351k rows
Overall table:
CREATE TABLE `overall` (
`playerID` INT(11) NOT NULL,
`playerName` VARCHAR(50) NOT NULL,
`totalLevel` INT(10) NOT NULL,
`totalExperience` BIGINT(20) NOT NULL,
`updateTime` BIGINT(20) NOT NULL,
`game_mode` ENUM('REGULAR','IRON_MAN','IRON_MAN_HARDCORE') NOT NULL DEFAULT 'REGULAR',
PRIMARY KEY (`playerID`, `playerName`)
)
COLLATE='utf8_general_ci'
ENGINE=MyISAM;
Explain Select result from the query:
Does anybody have a solution for me?
No useful index for WHERE
The last 2 lines of the EXPLAIN (#3 DERIVED):
WHERE hs.skillID = ?
AND o.game_mode = ?
Since neither table has a suitable index to use for the WHERE clause, the optimizer decided to do a table scan of one of them (overall), then reach into the other (highscores). Having one of these indexes would help, at least some:
highscores: INDEX(skillID)
overall: INDEX(game_mode, ...) -- note that an index only on a low-cardinality ENUM is rarely useful.
(More in a minute.)
No useful index for ORDER BY
The optimizer sometimes decides to use an index for the ORDER BY instead of for the WHERE. But
ORDER BY hs.skillExperience DESC,
hs.updateTime ASC
cannot use an index, even though both are in the same table. This is because DESC and ASC are different. Changing ASC to DESC would have an impact on the resultset, but would allow
INDEX(skillExperience, updateTime)
to be used. Still, this may not be optimal. (More in a minute.)
Covering index
Another form of optimization is to build a "covering index". That is an index that has all the columns that the SELECT needs. Then the query can be performed entirely in the index, without reaching over to the data. The SELECT in question is the innermost:
( SELECT hs.playerID
FROM highscores AS hs
INNER JOIN overall AS o ON hs.playerID = o.playerID
WHERE hs.skillID = ?
AND o.game_mode = ?
ORDER BY hs.skillExperience DESC, hs.updateTime ASC
) highscore,
For hs: INDEX(skillID, skillExperience, updateTime, playerID) is "covering" and has the most important item (skillID, from the WHERE) first.
For o: INDEX(game_mode, playerID) is "covering". Again, game_mode must be first.
If you change the ORDER BY to be DESC and DESC, then add another index for hs: INDEX(skillExperience, updateTime, skillID, playerID). Now the first 2 columns must be in that order.
Conclusion
It is not obvious which of those indexes the optimizer would prefer. I suggest you add both and let it choose.
I believe that (1) the innermost query is consuming the bulk of time, and (2) there is nothing to optimize in the outer SELECTs. So, I leave that as my recommendation.
Much of this is covered in my Indexing Cookbook.
An important question first: how often does the rank of all players change? Do you really need real-time statistics? Probably not. Pick a time interval for refreshing the statistics, e.g. 10 minutes. You can then run a cron job that inserts fresh rank statistics into a separate table, like this:
/* lock */
TRUNCATE TABLE rank_stat; /* or mark the rows as old to keep history, instead of truncating */
INSERT INTO rank_stat (a, b, c, d) <your query here>;
/* unlock */
Users (browsers) then select read-only statistics from this table (which can be split into pages).
If the rankings do not change frequently, you could instead recalculate them on the relevant game events or player achievements.
These are only recommendations, because you have not described your full environment, but I think they can lead you to the right solution.
It doesn't look like you really need to rank everyone; you just want to find out how many people are ahead of the current player. You should be able to get a simple count of how many players have a better score, or the same score with an earlier date, than the current player. That count plus one is the current player's rank.
-- assumes highscores has a gamemode column; in the question's schema it lives in overall.game_mode
SELECT COUNT(*) + 1 AS rank FROM highscores
join highscores playerscore
on playerscore.skillID = highscores.skillID
and playerscore.gamemode = highscores.gamemode
where highscores.skillID = ?
AND highscores.gamemode = ?
and playerscore.playerID = ?
and (highscores.skillExperience > playerscore.skillExperience
or (highscores.skillExperience = playerscore.skillExperience
and highscores.updateTime < playerscore.updateTime));
(I joined the table to itself and aliased the second instance as playerscore so it was slightly less confusing)
You could probably even simplify it to one query by grouping and parsing the results within your language of choice.
SELECT
highscores.gamemode as gamemode,
highscores.skillID as skillID,
COUNT(*) + 1 as rank
FROM highscores
join highscores playerscore
on playerscore.skillID = highscores.skillID
and playerscore.gamemode = highscores.gamemode
where playerscore.playerID = ?
and (highscores.skillExperience > playerscore.skillExperience
or (highscores.skillExperience = playerscore.skillExperience
and highscores.updateTime < playerscore.updateTime))
-- a skill where nobody is ahead produces no row; treat a missing row as rank 1
group by highscores.gamemode, highscores.skillID;
Not quite sure about the grouping bit though.
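A sketch of the counting idea against the question's own schema, where game_mode lives in overall rather than in highscores. It runs on SQLite from Python as a stand-in for MySQL. The tables are trimmed to the columns the query touches, and the player_rank helper and sample rows are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE highscores (playerID INTEGER, skillID INTEGER, skillExperience INTEGER,
                         updateTime INTEGER, PRIMARY KEY (playerID, skillID));
CREATE TABLE overall (playerID INTEGER PRIMARY KEY, game_mode TEXT NOT NULL);
INSERT INTO overall VALUES (1, 'REGULAR'), (2, 'REGULAR'), (3, 'IRON_MAN'), (4, 'REGULAR');
INSERT INTO highscores VALUES (1, 0, 500, 10), (2, 0, 500, 5), (3, 0, 900, 1), (4, 0, 100, 2);
""")

RANK_SQL = """
SELECT COUNT(*) + 1 AS player_rank
FROM highscores AS me
JOIN highscores AS hs ON hs.skillID = me.skillID
JOIN overall AS o ON o.playerID = hs.playerID
WHERE me.playerID = ? AND me.skillID = ? AND o.game_mode = ?
  AND (hs.skillExperience > me.skillExperience
       OR (hs.skillExperience = me.skillExperience
           AND hs.updateTime < me.updateTime))   -- earlier update wins a tie
"""

def player_rank(conn, player_id, skill_id, game_mode):
    """Rank = 1 + number of players in the same mode who are ahead.
    Note: an unknown player also yields 1, so check existence separately."""
    return conn.execute(RANK_SQL, (player_id, skill_id, game_mode)).fetchone()[0]

print(player_rank(conn, 1, 0, "REGULAR"))  # 2
```

Only the rows ahead of the player are counted, so nothing has to be numbered or materialized. The covering indexes suggested earlier in the thread would still matter for the hs side of the join.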

Calculating top 5 users with the most points in the current category or category below it

I have a points system on my website and points are gained for different achievements within the different website categories. Categories can have sub-categories and parent categories using the 'category_relations' table with 'links_to' and 'links_from' fields holding the relevant category_id's.
What I'd like to do is fetch the top 5 users with the most points in the current category and any categories directly below it.
My 'points_awarded' table has all the records of any points awarded and to what users from what categories:
user_id,
points_amount,
plus (tinyint boolean if it's points added or not),
minus (tinyint boolean if it's point penalty or not),
category_id
I don't really know where to start with this. Will I need two queries, one to fetch all sub-category ids and another that uses them to fetch the SUM() of the points? Is it possible to do it in one query?
I'd need more information about your database tables to be sure but something like this will probably work:
SELECT
`user_id`,
SUM(IF(`plus`, `points_amount`, 0)) - SUM(IF(`minus`, `points_amount`, 0)) AS `points`
FROM
`points_awarded`
WHERE
`category_id` = $cat_id
OR `category_id` IN(
SELECT `links_to`
FROM `category_relations`
WHERE `links_from` = $cat_id
)
GROUP BY `user_id`
ORDER BY `points` DESC
LIMIT 5
I'm curious though, why do you have a plus field and a minus field? If plus is false, can't we assume it's minus? Why have either field anyway, why not just make points_amount a signed field?
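A runnable sketch of the single-query approach, using SQLite from Python as a stand-in for MySQL. It assumes, as the answer does, that links_from holds the parent category and links_to the child. The sample rows are made up, and CASE replaces MySQL's IF() because SQLite has no IF function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category_relations (links_from INTEGER, links_to INTEGER);
CREATE TABLE points_awarded (user_id INTEGER, points_amount INTEGER,
                             plus INTEGER, minus INTEGER, category_id INTEGER);
-- category 1 has children 2 and 3; category 4 is a grandchild (child of 2)
INSERT INTO category_relations VALUES (1, 2), (1, 3), (2, 4);
INSERT INTO points_awarded VALUES
  (10, 50, 1, 0, 1), (10, 20, 0, 1, 2),
  (11, 40, 1, 0, 3),
  (12, 100, 1, 0, 4),
  (13, 5, 1, 0, 1);
""")

TOP_SQL = """
SELECT user_id,
       SUM(CASE WHEN plus THEN points_amount ELSE 0 END)
     - SUM(CASE WHEN minus THEN points_amount ELSE 0 END) AS points
FROM points_awarded
WHERE category_id = :cat
   OR category_id IN (SELECT links_to FROM category_relations
                      WHERE links_from = :cat)
GROUP BY user_id
ORDER BY points DESC, user_id
LIMIT 5
"""
top = conn.execute(TOP_SQL, {"cat": 1}).fetchall()
print(top)
```

User 12 is left out because category 4 is two levels below category 1, and the subquery only follows direct links.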

Need support with figuring out a query for a cron job

I have a voting system for articles. Articles are stored in 'stories' table and all votes are stored in 'votes' table. id in 'stories' table is equal to item_name in 'votes' table (therefore each vote is related to article with item_name).
I want to make it so when sum of votes gets to 10 it updates 'showing' field in 'stories' table to value of "1".
I was thinking about setting up a cron job that runs every hour to check all posts that have showing = 0. If showing = 0, it sums up the votes related to that article and sets showing = 1 if the sum of votes is >= 10. I'm not sure this is efficient, as it might take up a lot of server resources.
So could anyone suggest a cron job that could do the task?
Here is my database structure:
Stories table
Votes table
Edit:
For example this row from 'stories' table:
id| 12
st_auth | author name
st_date | story date
st_title| story title
st_category| story category
st_body| story body
showing| 0 for unapproved and 1 for approved
This row is related to this one from 'votes' table
id| 83
item_name| 12 (id of article)
vote_value| 1 for upvote -1 for downvote
...
Couple of things:
Why did you name the column item_name in the votes table, when it is actually the id of the article? I would recommend making its type match the article table, i.e. an int(11) rather than a varchar(255). Also, you should add a foreign key constraint to the votes table, so that if an article is ever deleted you don't orphan rows in the votes table.
Why is the vote_value column an int(11)? If it can only be two states (1, or -1) you can do a tinyint(1) signed (for the -1).
The ip column in the votes table is a bit concerning. If you are regulating 'unique' votes by ip, did you account for proxy ips? Something like this should be handled at the account level, so several users from the same proxy IP can issue individual votes.
I wouldn't do a cronjob for determining whether the showing column should be flagged 0 or 1. Rather, I would issue a count every time a vote was cast against the article. So if someone up-voted or down-voted, calculate the new value of the story, and store it in cache for future reads.
Using this query, you get a list of all articles that have votes, plus a column containing the sum of the associated votes.
SELECT s.*, SUM(v.vote_value) AS votes_total
FROM stories AS s INNER JOIN votes AS v
ON v.item_name = s.id
GROUP BY s.id
This way, you can create a view that you filter on votes_total >= 10, without needing the cron job.
Or you can use it as a normal query, something like this:
SELECT * FROM (
SELECT s.*, SUM(v.vote_value) AS votes_total
FROM stories AS s INNER JOIN votes AS v
ON v.item_name = s.id
GROUP BY s.id
) AS totals WHERE votes_total >= 10;
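If the showing flag still has to be stored, the same aggregate can drive a single UPDATE, whether it runs from a cron job or right after each vote. This is only a sketch on SQLite from Python, standing in for the unnamed database. The tables are trimmed, item_name is assumed to be an integer matching stories.id, and the sample votes are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stories (id INTEGER PRIMARY KEY, st_title TEXT,
                      showing INTEGER NOT NULL DEFAULT 0);
CREATE TABLE votes (id INTEGER PRIMARY KEY, item_name INTEGER NOT NULL,
                    vote_value INTEGER NOT NULL);
INSERT INTO stories (id, st_title) VALUES (12, 'ten upvotes'), (13, 'net nine'), (14, 'no votes');
""")
# Story 12: ten upvotes (sum 10). Story 13: eleven up, two down (sum 9).
votes = [(12, 1)] * 10 + [(13, 1)] * 11 + [(13, -1)] * 2
conn.executemany("INSERT INTO votes (item_name, vote_value) VALUES (?, ?)", votes)

APPROVE_SQL = """
UPDATE stories
SET showing = 1
WHERE showing = 0
  AND (SELECT COALESCE(SUM(v.vote_value), 0)
       FROM votes v
       WHERE v.item_name = stories.id) >= 10
"""
approved = conn.execute(APPROVE_SQL).rowcount        # number of stories flipped to showing = 1
showing = dict(conn.execute("SELECT id, showing FROM stories"))
print(approved, showing)
```

Because only rows with showing = 0 are examined, repeated runs touch fewer stories over time.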
I would use a trigger (an insert trigger) and handle your logic there, in the database itself.
This would remove the polling code (the cron job) altogether.
I would also make your foreign key (in VOTES) match the primary key (in STORIES), at least in type.
Using a trigger instead of polling will be much cleaner in the long run.
You don't specify your database, but in TSQL (for SQL Server) it could be close to this
CREATE TRIGGER myTrigger
ON VOTES
FOR INSERT
AS
DECLARE @I INT --HOLDS SUM OF VOTES
DECLARE @IN VARCHAR(255) --HOLDS FK ID FOR LOOKUP INTO STORIES IF UPDATE REQUIRED
SELECT @IN = ITEM_NAME FROM INSERTED
SELECT @I = SUM(VOTE_VALUE) FROM VOTES WHERE ITEM_NAME = @IN
IF (@I >= 10)
BEGIN
UPDATE STORIES SET SHOWING = 1 WHERE ID = @IN --This is why your PK/FK should be refactored
END

Designing "relevance-based" search?

In my application (PHP/MySQL/JS), I have a search functionality built in. One of the search criteria contains checkboxes for various options, and as such, some results would be more relevant than others, should they contain more or less of each option.
For example, the options are A and B. If I search for both options A and B, Result 1 containing only option A is 50% relevant, while Result 2 containing both options A and B is 100% relevant.
Prior, I'd just be doing simple SQL queries based on form input, but this one's a little harder, since it's not as simple as data LIKE "%query%", but rather, some results are more valuable to some search queries, and some aren't.
I have absolutely no idea where to begin... does anybody have relevant (ha!) reading material to direct me to?
Edit: After mulling it over, I'm thinking something involving an SQL script to get the raw data, followed by many many rounds of parsing is something I'd have to do...
Nothing cacheable, though? :(
Have a look at the Lucene project.
It is available in many languages.
This is the PHP port:
http://framework.zend.com/manual/en/zend.search.lucene.html
It indexes the items to search and returns relevance-weighted search results, which is better than select x from y where name like '%pattern%' style searching.
What you need is a powerful search engine, like Solr. While you could implement this on top of MySQL, other tools already provide it out of the box.
Here's an idea: do the comparisons and sum the results. The higher the sum, the more criteria match.
How about a (stupid) table like this:
name
dob_year
dob_month
dob_day
Find the person who shares the most of the three date components with 3/15/1980:
SELECT (dob_year = 1980) + (dob_month = 3) + (dob_day = 15) as strength, name
from user
order by strength desc
limit 1
A good WHERE clause and index would be required to keep you from doing a table scan, but...
You could even add a weight to a column, e.g.
SELECT ((dob_year = 1980)*2)
Good luck.
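The sum-of-comparisons idea can be tried out with a few rows. Below is a minimal sketch on SQLite from Python, standing in for MySQL; both evaluate a comparison to 0 or 1. The sample people and the weight of 2 on the year are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user (name TEXT, dob_year INTEGER, dob_month INTEGER, dob_day INTEGER)"
)
conn.executemany(
    "INSERT INTO user VALUES (?, ?, ?, ?)",
    [("Ann", 1980, 3, 15), ("Bob", 1980, 7, 15), ("Cy", 1975, 3, 15), ("Dee", 1990, 1, 1)],
)

# Each comparison yields 1 or 0; the year match is weighted double.
STRENGTH_SQL = """
SELECT name,
       (dob_year = ?) * 2 + (dob_month = ?) + (dob_day = ?) AS strength
FROM user
ORDER BY strength DESC, name
"""
ranked = conn.execute(STRENGTH_SQL, (1980, 3, 15)).fetchall()
print(ranked)
```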
Given your answer to my comment, here's an example on how you might do it:
First the tables:
CREATE TABLE `items` (
`id` int(11) NOT NULL,
`name` varchar(80) NOT NULL
);
CREATE TABLE `criteria` (
`cid` int(11) NOT NULL,
`option` varchar(80) NOT NULL,
`value` int(1) NOT NULL
);
Then an example of some items and criteria:
INSERT INTO items (id, name) VALUES
(1,'Name1'),
(2,'Name2'),
(3,'Name3');
INSERT INTO criteria VALUES
(1,'option1',1) ,(1,'option2',1) ,(1,'option3',0),
(2,'option1',0) ,(2,'option2',1) ,(2,'option3',1),
(3,'option1',1) ,(3,'option2',0) ,(3,'option3',1);
This would create 3 items and 3 options and assign options to them.
Now there are multiple ways you can order by a certain "strength". The simplest would be:
SELECT i.*, c1.value + c3.value AS strength
FROM items i
JOIN criteria c1 ON c1.cid = i.id AND c1.option = 'option1'
JOIN criteria c3 ON c3.cid = i.id AND c3.option = 'option3'
ORDER BY strength DESC
This would show you all the items that have option 1 or option 3, but those with both options would be ranked "higher".
This works well if you're doing a search on 2 options. But let's assume you make a search on all 3 options. All the items now share the same strength, this is why it's important to assign "weights" to options.
You could make the value your strength, but that might not help you if your queries don't always assign the same weights to the same options everywhere. This can be easily achieved on a per-query basis with the following query:
SELECT i.* , IF(c1.value, 2, 0) + IF(c3.value, 1, 0) AS strength
FROM items i
JOIN criteria c1 ON c1.cid = i.id AND c1.option = 'option1'
JOIN criteria c3 ON c3.cid = i.id AND c3.option = 'option3'
ORDER BY strength DESC
Try the queries out and see if it's what you need.
I would also like to note that this is not the best solution in terms of processing power. I'd recommend you add indexes, make the option field an integer, cache results wherever possible.
Leave a comment if you have any questions or anything to add.
