Completely arbitrary sort order in MySQL with PHP

I have a table in MySQL that I'm accessing from PHP. For example, let's have a table named THINGS:
things.ID - int primary key
things.name - varchar
things.owner_ID - int for joining with another table
My select statement to get what I need might look like:
SELECT * FROM things WHERE owner_ID = 99;
Pretty straightforward. Now, I'd like users to be able to specify a completely arbitrary order for the items returned from this query. The list will be displayed, they can then click an "up" or "down" button next to a row and have it moved up or down the list, or possibly a drag-and-drop operation to move it to anywhere else. I'd like this order to be saved in the database (same or other table). The custom order would be unique for the set of rows for each owner_ID.
I've searched for ways to provide this ordering without luck. I've thought of a few ways to implement this, but help me fill in the final option:
1. Add an INT column and set its value to whatever I need to get rows returned in my order. This presents the problem of scanning row-by-row to find the insertion point, and possibly needing to update the sort column of the preceding/following rows.
2. Have "next" and "previous" columns, implementing a linked list. Once I find my place, I'll just have to update max 2 rows to insert the row. But this requires scanning for the location from row #1.
3. Some SQL/relational DB trick I'm unaware of...
I'm looking for an answer to #3 because it may be out there, who knows. Plus, I'd like to offload as much as I can on the database.

From what I've read you need a new table containing the ordering of each user, say it's called *user_orderings*.
This table should contain the user ID, the ID of the thing, and the position of the thing in that user's list. The pair (user_id, thing_id) should be the PK. The downside is that you need to update this table every time the order changes, but you can get the things for a user in the order he/she wants by joining user_orderings with the things table and using ORDER BY on the position column. It should work.
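The ordering table described above might be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL so it runs on its own, and the column name `position` and the helpers `things_in_order` and `move_up` are hypothetical names, not something from the answer. An "up" click becomes a swap of two position values.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE things (id INTEGER PRIMARY KEY, name TEXT, owner_id INTEGER);
    CREATE TABLE user_orderings (
        user_id  INTEGER NOT NULL,
        thing_id INTEGER NOT NULL REFERENCES things(id),
        position INTEGER NOT NULL,          -- hypothetical column name
        PRIMARY KEY (user_id, thing_id)
    );
    INSERT INTO things VALUES (1, 'apple', 99), (2, 'banana', 99), (3, 'cherry', 99);
    INSERT INTO user_orderings VALUES (99, 1, 1), (99, 2, 2), (99, 3, 3);
""")

def things_in_order(user_id):
    """Return the thing names for one user, sorted by the saved position."""
    rows = conn.execute(
        """SELECT t.name
           FROM user_orderings o
           JOIN things t ON t.id = o.thing_id
           WHERE o.user_id = ?
           ORDER BY o.position""",
        (user_id,),
    )
    return [name for (name,) in rows]

def move_up(user_id, thing_id):
    """Swap a thing with the one directly above it; only two rows change."""
    row = conn.execute(
        "SELECT position FROM user_orderings WHERE user_id = ? AND thing_id = ?",
        (user_id, thing_id),
    ).fetchone()
    if row is None:
        return  # unknown thing for this user
    pos = row[0]
    above = conn.execute(
        """SELECT thing_id, position FROM user_orderings
           WHERE user_id = ? AND position < ?
           ORDER BY position DESC LIMIT 1""",
        (user_id, pos),
    ).fetchone()
    if above is None:
        return  # already at the top
    with conn:  # both updates commit together
        conn.execute(
            "UPDATE user_orderings SET position = ? WHERE user_id = ? AND thing_id = ?",
            (above[1], user_id, thing_id),
        )
        conn.execute(
            "UPDATE user_orderings SET position = ? WHERE user_id = ? AND thing_id = ?",
            (pos, user_id, above[0]),
        )
```

Calling `move_up(99, 3)` swaps the positions of 'cherry' and 'banana'. A drag-and-drop to an arbitrary slot would need to shift a range of positions instead, which this sketch does not cover.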

The simplest expression of an ordered list is: 3,1,2,4. We can store this as a string in the parent table; so if our table is photos with the foreign key profile_id, we'd place our photo order in profiles.photo_order. We can then consider this field in our order by clause by utilizing the find_in_set() function. This requires either two queries or a join. I use two queries but the join is more interesting, so here it is:
select photos.photo_id, photos.caption
from photos
join profiles on profiles.profile_id = photos.profile_id
where photos.profile_id = 1
order by find_in_set(photos.photo_id, profiles.photo_order);
Note that you would probably not want to use find_in_set() in a where clause due to performance implications, but in an order by clause, there are few enough results to make this fast.
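The two-query variant mentioned above could look roughly like this. It is a sketch only: sqlite3 stands in for MySQL, the sample rows are made up, and the helpers `photos_in_order` and `move_photo` are hypothetical. Sorting happens in application code because SQLite has no FIND_IN_SET().

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE profiles (profile_id INTEGER PRIMARY KEY, photo_order TEXT);
    CREATE TABLE photos (photo_id INTEGER PRIMARY KEY, profile_id INTEGER, caption TEXT);
    INSERT INTO profiles VALUES (1, '3,1,2,4');
    INSERT INTO photos VALUES (1, 1, 'beach'), (2, 1, 'city'),
                              (3, 1, 'forest'), (4, 1, 'lake');
""")

def _stored_ids(profile_id):
    """Query 1: read the comma-separated order string as a list of id strings."""
    (order_csv,) = conn.execute(
        "SELECT photo_order FROM profiles WHERE profile_id = ?", (profile_id,)
    ).fetchone()
    return order_csv.split(",") if order_csv else []

def photos_in_order(profile_id):
    """Query 2: fetch the photos, then sort them by their rank in the string."""
    rank = {int(pid): i for i, pid in enumerate(_stored_ids(profile_id))}
    rows = conn.execute(
        "SELECT photo_id, caption FROM photos WHERE profile_id = ?", (profile_id,)
    ).fetchall()
    # Photos missing from the string sort last here; MySQL's FIND_IN_SET()
    # returns 0 for them, so in the SQL version they would sort first.
    return sorted(rows, key=lambda r: rank.get(r[0], len(rank)))

def move_photo(profile_id, photo_id, new_index):
    """Move one id to a new slot and save the whole string in a single UPDATE."""
    ids = _stored_ids(profile_id)
    ids.remove(str(photo_id))
    ids.insert(new_index, str(photo_id))
    with conn:
        conn.execute(
            "UPDATE profiles SET photo_order = ? WHERE profile_id = ?",
            (",".join(ids), profile_id),
        )
```

The appeal of this design is that any reordering, whether an up/down click or a drag-and-drop, is one write to one row.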

Related

storing sum() results in database vs calculating during runtime

I'm new to sql & php and unsure about how to proceed in this situation:
I created a mysql database with two tables.
One is just a list of users with their data, each having a unique id.
The second one awards certain amounts of points to users, with relevant columns being the user id and the amount of awarded points. This table is supposed to get new entries regularly and there's no limit to how many times a single user can appear in it.
On my php page I now want to display a list of users sorted by their point total.
My first approach was creating a "points_total" column in the user table, intending to run some kind of query that would calculate and update the correct total for each user every time new entries are added to the other table. To retrieve the data I could then use a very simple query and even use sql's sort features.
However, while it's easy to update the total for a specific user with the sum where function, I don't see a way to do that for the whole user table. After all, plain sql doesn't offer the ability to iterate over each row of a table, or am I missing a different way?
I could probably do the update by going over the table in php, but then again, I'm not sure if that is even a good approach in the first place, because in a way storing the point data twice (the total in one table and then the point breakdown with some additional information in a different table) seems redundant.
A different option would be forgoing the extra column, and instead calculating the sums every time the PHP page is accessed, then doing the sorting with PHP. However, I suppose this would be much slower than having the data ready in the database, which could be a problem if the tables have a lot of entries?
I'm a bit lost here so any advice would be appreciated.

To get the total points awarded, you could use a query similar to this:
SELECT
`users`.`user_name`,
`users`.`user_id`, -- qualified, since user_id exists in both tables
SUM(`points`.`points_award`) AS `points`,
COUNT(`points`.`points_award`) AS `numberOfAwards`
FROM `users`
JOIN `points`
ON `users`.`user_id` = `points`.`user_id`
GROUP BY `users`.`user_id`, `users`.`user_name`
ORDER BY `users`.`user_name`; -- or `points` DESC to sort by point total
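As a rough illustration of both options from the question, the sketch below computes the totals at query time and also refreshes a cached column for every user in one statement, without iterating row by row. It assumes Python's built-in sqlite3 as a stand-in for MySQL; the sample rows and the helper names `leaderboard` and `refresh_totals` are made up for the example.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id      INTEGER PRIMARY KEY,
        user_name    TEXT,
        points_total INTEGER NOT NULL DEFAULT 0  -- the cached total from the question
    );
    CREATE TABLE points (user_id INTEGER, points_award INTEGER);
    INSERT INTO users (user_id, user_name) VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO points VALUES (1, 10), (2, 5), (1, 7), (2, 30);
""")

def leaderboard():
    """Compute totals on the fly and let the database do the sorting."""
    return conn.execute("""
        SELECT u.user_name, COALESCE(SUM(p.points_award), 0) AS total
        FROM users u
        LEFT JOIN points p ON p.user_id = u.user_id   -- keeps users with no awards
        GROUP BY u.user_id, u.user_name
        ORDER BY total DESC, u.user_name
    """).fetchall()

def refresh_totals():
    """Update the cached column for all users with one correlated subquery."""
    with conn:
        conn.execute("""
            UPDATE users
            SET points_total = (
                SELECT COALESCE(SUM(p.points_award), 0)
                FROM points p
                WHERE p.user_id = users.user_id
            )
        """)
```

With an index on `points.user_id`, the on-the-fly version is often fast enough that the cached column is not needed, though that depends on table size and traffic.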

MySQL : For big storage, should I use a single heavy column or a table with thousand of rows?

I'm building a like system for a website and I'm facing a dilemma.
I have a table where all the items which can be liked are stored. Call it the "item table".
In order to preserve the speed of the server, should I:
1. Add a column to the item table.
This means I have to search (with a regex in my PHP) inside a string holding the IDs of all the users who have liked the item, each time a user likes an item, in order to verify whether the user in question has already liked the item. If so, I show a different button in my HTML.
Problem > If I have (by chance) 3000 likes on an item, I fear the string will become very big and heavy to regex each time there is a like on it...
2. Add a specific new table (LikedBy) and record each like separately with the ID of the liker, the name of the item and the state of the like (liked or not).
Problem > In this case, I fear for the MySQL server with thousands of rows to analyze each time a new user likes a popular item...
Server version: 5.5.36-cll-lve MySQL Community Server (GPL) by Atomicorp
Should I put the load on the PHP script or the MySQL database? Which is the most performant (and scalable)?
If, for some reason, my question does not make sense, could anyone tell me the right way to do this?
thx.

You should create another table, call it likes_table, containing id_user int and id_item int. That's how it should be done; if you go with your first proposed solution, your database won't be normalized and you'll run into many issues in the future.
To get the count of likes for an item:
SELECT COUNT(*) FROM likes_table WHERE id_item='id_item_you_are_looking_for';
To get who liked what:
SELECT id_item FROM likes_table WHERE id_user='id_user_you_are_looking_for';
No regex needed, and your database is well normalized so data can be found easily. You can tell MySQL to index id_user and id_item and make the pair unique in likes_table; that way all your queries will run much faster.

With MySQL you can set the user ID and the item ID as a unique pair. This should improve performance by a lot.
Your table would have these 2 columns: item id, and user id. Every row would be a like.
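A minimal sketch of the one-row-per-like table with a unique (item, user) pair is shown below. It uses Python's sqlite3 as a stand-in for MySQL, and the helper names `like`, `has_liked` and `like_count` are hypothetical. SQLite spells the duplicate-tolerant insert `INSERT OR IGNORE`; the MySQL counterpart is `INSERT IGNORE`.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE likes_table (
        id_user INTEGER NOT NULL,
        id_item INTEGER NOT NULL,
        PRIMARY KEY (id_item, id_user)   -- unique pair; also serves as the index
    )
""")

def like(id_user, id_item):
    """Record a like; a repeated like by the same user is silently ignored."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO likes_table (id_user, id_item) VALUES (?, ?)",
            (id_user, id_item),
        )

def has_liked(id_user, id_item):
    """Primary-key lookup, so the cost does not grow with the number of likes."""
    row = conn.execute(
        "SELECT 1 FROM likes_table WHERE id_item = ? AND id_user = ?",
        (id_item, id_user),
    ).fetchone()
    return row is not None

def like_count(id_item):
    """Count the likes of one item."""
    return conn.execute(
        "SELECT COUNT(*) FROM likes_table WHERE id_item = ?", (id_item,)
    ).fetchone()[0]
```

Because the check for an existing like is an index lookup rather than a scan, a popular item with thousands of likes costs about the same to check as an item with three.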

SQL: Deleting old records only so long as there are not newer matching records?

I've got a really big collection of data in a postgres database where I'd like to nuke data past a particular age... but I do not want it nuking the latest iteration of data from any given location & site combination.
Basically, I've got a really big table that has a location (bigint), site (bigint), readdate (bigint), and a little accompanying data (note: there will be multiple entries for a given site, location, and readdate - but anything on the same readdate is considered part of the same scan, and needs to be kept for a given location).
Currently, I've just got it set to get rid of all old records... but the possibility exists that a particular site and location combination will stop giving out data for a while, and I'd like to preserve the final state if that happens. I'm doing the SQL queries from PHP, so I'm pretty sure I could hack together some highly ugly code that finds the latest readdate for any given site & location combination, then either deletes stuff older than that for that location, or deletes based on the calendar limit (whichever gives the lesser date), but I'd prefer to put the decision-making workload in the SQL query, rather than having to first get a list of all location, site, and max(readdate) entries, then iterate over them in PHP making individual delete queries.
My current query (which doesn't do what I want, as it deletes everything before $limit) is declared by:
$query="DELETE FROM votwdata WHERE readdate < '".$limit."';";
any ideas for a good revision?

If I understand what you are trying to do, you have a number of fields that might be the same, and you want to keep the most recent record. Assuming you have a sequential ID or a created_at on each record, you can run a subquery to identify the records you want to delete. For example:
select max(id),data1,data2 from table group by data1,data2;
That will pull the most recent record for a unique data1 and data2. You can run that as an inline query, joining it back to the original table.
select t.* from table t, (select max(id) "id",data1,data2 from table group by data1,data2) t2
where t.id=t2.id;
That will give you the most recent records. You can do a left join and look for the null values to find anything that is not a most-recent record and can therefore be deleted.
select t.id,t2.id
from table t left join (select max(id) "id",data1,data2 from table group by 2,3) t2 on t.id=t2.id
where t2.id is null;
That gives you all the records that you want to delete.
Okay, that's the dirty way - refactor away.
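Applied to the table from the question, the same idea can be written as a single DELETE with a correlated subquery, so no per-location loop is needed in PHP. The sketch below is an assumption-laden illustration: it runs on Python's sqlite3 rather than PostgreSQL, the `payload` column and sample rows are invented, and `purge` is a hypothetical helper. The DELETE statement itself is plain SQL that PostgreSQL should accept as well.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE votwdata (location INTEGER, site INTEGER, readdate INTEGER, payload TEXT);
    INSERT INTO votwdata VALUES
        (1, 1, 100, 'old scan a'), (1, 1, 100, 'old scan b'),
        (1, 1, 300, 'fresh'),
        (2, 1,  50, 'stale'),
        (2, 1, 120, 'last known a'), (2, 1, 120, 'last known b');
""")

def purge(limit):
    """Delete rows older than `limit`, except each location/site's latest scan."""
    with conn:
        conn.execute("""
            DELETE FROM votwdata
            WHERE readdate < ?
              AND readdate < (SELECT MAX(v2.readdate)
                              FROM votwdata v2
                              WHERE v2.location = votwdata.location
                                AND v2.site = votwdata.site)
        """, (limit,))
    return conn.execute(
        "SELECT location, site, readdate, payload FROM votwdata "
        "ORDER BY location, site, readdate, payload"
    ).fetchall()
```

In the sample data, location 2 stopped reporting at readdate 120. After `purge(200)` both rows of that final scan survive, because they share the maximum readdate for their location and site.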

CREATE VIEW for MYSQL for last 30 days

I know I am writing my queries wrong, and when we get a lot of traffic, our database gets hit HARD and the page slows to a grind...
I think I need to write queries based on CREATE VIEW from the last 30 days from the CURDATE ?? But not sure where to begin or if this will be MORE efficient query for the database?
Anyways, here is a sample query I have written..
$query_Recordset6 = "SELECT `date`, title, category, url, comments
FROM cute_news
WHERE category LIKE '%45%'
ORDER BY `date` DESC";
Any help or suggestions would be great! I have about 11 queries like this, but I am confident if I could get help on one of these, then I can implement them to the rest!!

Putting a wildcard on the left side of a value comparison:
LIKE '%xyz'
...means that an index cannot be used, even if one exists. You might want to consider using Full Text Searching (FTS), which means adding full text indexing.
Normalizing the data would be another step to consider - categories should likely be in a separate table.

SELECT `date`, title, category, url, comments
FROM cute_news
WHERE category LIKE '%45%'
ORDER BY `date` DESC
The LIKE '%45%' means a full table scan will need to be performed. Are you perhaps storing a list of categories in the column? If so creating a new table storing category and news_article_id will allow an index to be used to retrieve the matching records much more efficiently.

OK, time for psychic debugging.
In my mind's eye, I see that query performance would be improved considerably through database normalization, specifically by splitting the multi-valued category column into a separate table that has two columns: the primary key for cute_news and the category ID.
This would also allow you to directly link said table to the categories table without having to parse it first.
Or, as Chris Date said: "Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else)."

Anything with LIKE '%XXX%' is going to be slow. It's a slow operation.
For something like categories, you might want to separate categories out into another table and use a foreign key in the cute_news table. That way you can have category_id, and use that in the query, which will be MUCH faster.
Also, I'm not quite sure why you're talking about using CREATE VIEW. Views will not really help you with speed, not unless it's a materialized view, which MySQL doesn't support natively.

If your database is getting hit hard, the solution isn't to make a view (the view is still basically the same amount of work for the database to do), the solution is to cache the results.
This is especially applicable since, from what it sounds like, your data only needs to be refreshed once every 30 days.

I'd guess that your category column is a list of category values like "12,34,45,78" ?
This is not good relational database design. One reason it's not good is as you've discovered: it's incredibly slow to search for a substring that might appear in the middle of that list.
Some people have suggested using fulltext search instead of the LIKE predicate with wildcards, but in this case it's simpler to create another table so you can list one category value per row, with a reference back to your cute_news table:
CREATE TABLE cute_news_category (
news_id INT NOT NULL,
category INT NOT NULL,
PRIMARY KEY (news_id, category),
FOREIGN KEY (news_id) REFERENCES cute_news(news_id)
) ENGINE=InnoDB;
Then you can query and it'll go a lot faster:
SELECT n.`date`, n.title, c.category, n.url, n.comments
FROM cute_news n
JOIN cute_news_category c ON (n.news_id = c.news_id)
WHERE c.category = 45
ORDER BY n.`date` DESC
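A one-off migration from the comma-separated column to the new table might be sketched like this. It rests on several assumptions: the category column really holds lists such as "12,45" (the answers only guess this), sqlite3 stands in for MySQL, and the sample rows, the extra index, and the helpers `migrate_categories` and `titles_in_category` are hypothetical. The index leading with category is an addition of this sketch, since the primary key above starts with news_id.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cute_news (news_id INTEGER PRIMARY KEY, date TEXT,
                            title TEXT, category TEXT);
    CREATE TABLE cute_news_category (
        news_id  INTEGER NOT NULL REFERENCES cute_news(news_id),
        category INTEGER NOT NULL,
        PRIMARY KEY (news_id, category)
    );
    -- Assumed extra index so lookups by category do not scan the table.
    CREATE INDEX idx_category ON cute_news_category (category, news_id);
    INSERT INTO cute_news VALUES
        (1, '2010-01-01', 'first',  '12,45'),
        (2, '2010-01-02', 'second', '145,7'),
        (3, '2010-01-03', 'third',  '45');
""")

def migrate_categories():
    """Split each comma-separated list into one (news_id, category) row per value."""
    rows = conn.execute("SELECT news_id, category FROM cute_news").fetchall()
    pairs = [
        (news_id, int(value))
        for news_id, csv in rows
        for value in (csv or "").split(",")
        if value.strip()
    ]
    with conn:
        conn.executemany(
            "INSERT OR IGNORE INTO cute_news_category (news_id, category) VALUES (?, ?)",
            pairs,
        )

def titles_in_category(category):
    """Exact match on the integer category, newest first."""
    rows = conn.execute("""
        SELECT n.title
        FROM cute_news n
        JOIN cute_news_category c ON n.news_id = c.news_id
        WHERE c.category = ?
        ORDER BY n.date DESC
    """, (category,))
    return [title for (title,) in rows]
```

Besides speed, the exact match fixes a correctness problem: with the sample rows, LIKE '%45%' also matches the article in category 145, while the join returns only the two articles that are really in category 45.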

Any answer is a guess, show:
- the relevant SHOW CREATE TABLE outputs
- the EXPLAIN output from your common queries.
And Bill Karwin's comment certainly applies.
After all this & optimizing, sampling the data into a table with only the last 30 days could still be desired, in which case you're better off running a daily cronjob to do just that.

What is the best approach to list a user's recent activities in PHP/MySQL?

I want to list the recent activities of a user on my site without doing too many queries. I have a table where I list all the things the user did with the date.
page_id - reference_id - reference_table - created_at - updated_at
The reference_id is the ID I need to search for in the reference_table (example: comments). If I would do a SELECT on my activity table I would then have to query:
SELECT * FROM reference_table where id = reference_id LIMIT 1
An activity can be a comment, a page update or a subscription. Depending which one it is, I need to fetch different data from other tables in my database
For example, if it is a comment, I need to fetch the author's name and the comment; if it is a reply, I need to fetch the original comment's username, etc.
I've looked into UNION keyword to union all my tables but I'm getting the error
1222 - The used SELECT statements have a different number of columns
and it seems rather complicated to make it work, because the number of columns has to match, none of my tables have the same number of columns, and I'm not too fond of creating columns just for the fun of it.
I've also looked into the CASE statement which also requires the amount of columns to match if I remember correctly (I could be wrong for this one though).
Does anyone have an idea of how I could list the recent activities of a user without doing too many queries?
I am using PHP and MySQL.

You probably want to split out the different activities into different tables. This will give you more flexibility in how you query the data.
If you choose to use UNION, make sure that you use the same number of columns in each SELECT query that the UNION is composed of.
EDIT:
I was down-voted for my response, so perhaps I can give a better explanation.
Split Table into Separate Tables and UNION
I recommended this technique because it will allow you to be more explicit about the resources you are querying. Having a single table for inserting is convenient, but you will always have to do separate queries to join with other tables to get meaningful information. Also, your database schema will be obfuscated by a single column being a foreign key for different tables depending on the data stored in that row.
You could have tables for comment, update and subscription. These would have their own data which could be queried on individually. If, say, you wanted to look at ALL user activity, you could somewhat easily use a UNION as follows:
(SELECT 'comment' AS type, title, comment_id AS id, created FROM comment)
UNION
(SELECT 'update' AS type, title, update_id AS id, created FROM `update`) -- UPDATE is a reserved word, so the table name needs backticks
UNION
(SELECT 'subscription' AS type, title, subscription_id AS id, created
FROM subscription)
ORDER BY created DESC;
This will provide you with a listing view. You could then link to the details of each type or load it on an ajax call.
You could accomplish this with the method that you are currently using, but this will actually eliminate the need for the 'reference_table' and will accomplish the same thing in a cleaner way (IMO).

The problem is that UNION should be used just to get similar recordsets together. If you try to unify two different queries (for example, with different columns being fetched) it's an error.
If the nature of the queries is different (having different column count, or data types) you'll need to make several different queries and treat them all separately.
Another approach (less elegant, I guess) would be LEFT JOINing your activities table with all the others, so you'll end up with a recordset with a lot of columns, and you'll need to check for each row which columns should be used depending on the activity nature.
Again, I'd rather stick with the first one, since the second produces a rather sparse recordset.

With UNION you don't have to select all of the columns from each table; each SELECT just has to return the same number of columns, with compatible data types in matching positions.
So you could do something like this:
SELECT name, comment as description
FROM Comments
UNION
SELECT name, reply as description
FROM Replies
And it wouldn't matter whether the Comments and Replies tables themselves have the same number of columns.

This really depends on the amount of traffic on your site. The UNION approach is straightforward and possibly the logically correct one, but performance will suffer if your site is heavily loaded, since a UNIONed query is hard to index.
Joining might be good, but again, in terms of performance and code clarity, it's not the best of ways.
Another, totally different approach is to create an 'activities' table, which is updated on every activity (in addition to the real activity record, just for this purpose). In classic terms of DB correctness you should avoid this approach, since it creates duplicate data in your system. I, however, have found it very useful in terms of performance.
[Another side note about the UNION approach, if you decide to take it: if the SELECTs differ in the number of columns, you can select placeholder values in some of them, for example: (SELECT UserId, UserName FROM users) UNION (SELECT 0, UserName FROM notes)]
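The placeholder trick can be sketched for an activity feed as follows. Everything specific here is an assumption made for illustration: sqlite3 stands in for MySQL, the two tables and their columns are invented, and `recent_activity` is a hypothetical helper. Each SELECT pads the columns it lacks with NULL so that the column counts match, which avoids error 1222.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comment (comment_id INTEGER PRIMARY KEY, user_id INTEGER,
                          body TEXT, created TEXT);
    CREATE TABLE subscription (subscription_id INTEGER PRIMARY KEY, user_id INTEGER,
                               page_id INTEGER, created TEXT);
    INSERT INTO comment VALUES (1, 7, 'nice page', '2012-03-01');
    INSERT INTO subscription VALUES (1, 7, 42, '2012-03-02');
    INSERT INTO comment VALUES (2, 8, 'other user', '2012-03-03');
""")

def recent_activity(user_id, limit=10):
    """One query for the whole feed; the `kind` column says how to read each row."""
    return conn.execute("""
        SELECT 'comment' AS kind, comment_id AS id, body AS detail,
               NULL AS page_id, created
        FROM comment WHERE user_id = ?
        UNION ALL
        SELECT 'subscription', subscription_id, NULL, page_id, created
        FROM subscription WHERE user_id = ?
        ORDER BY created DESC
        LIMIT ?
    """, (user_id, user_id, limit)).fetchall()
```

UNION ALL is used because rows from different tables cannot be duplicates of each other here, so the de-duplication pass of a plain UNION would be wasted work. The PHP side would then branch on `kind` to decide which columns to display.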
