I have a SQLite DB with about 24k records in one table and 15 in another. The table with 15 records holds information about forms that need to be completed by users (roughly 1k users). The table with 24k records holds information about which forms have been completed, by whom, and when. When a user logs in, there is a wait of roughly 3/4 of a second while the queries run to determine what the user has finished so far. Too long for my client. I know I can't be doing my queries in the best way, because they are contained within a loop, but I cannot seem to figure out how to optimize them.
The queries run as follows:
1) Select all of the forms and information
$result = $db->query("SELECT * FROM tbl_forms");
while($row = $result->fetchArray()){
//Run other query 2 here
}
2) For each form/row, run a query that figures out the most recent completion information for that form and that user.
$complete = $db->querySingle("SELECT * FROM tbl_completion AS forms1
WHERE userid='{$_SESSION['userid']}' AND form_id='{$row['id']}' AND forms1.id IN
(SELECT MAX(id) FROM tbl_completion
GROUP BY tbl_completion.userid, tbl_completion.form_id)", true);
There are 15 forms, so a total of 16 queries run. However, with my table structure, I'm unsure how to get the "most recent" (i.e. max tbl_completion id) form information using one joined query instead.
My table structure looks like so:
tbl_forms:
id | form_name | deadline | required | type | quicklink
tbl_completion:
id | userid | form_id | form_completion | form_path | timestamp | accept | reject
Edit: Index on tbl_forms (id), index on tbl_forms (id, form_name), index on tbl_completion (id)
I've tried using a query that is like:
SELECT * FROM tbl_completion AS forms1
LEFT OUTER JOIN tbl_forms ON forms1.form_id = tbl_forms.id
WHERE forms1.userid='testuser' AND forms1.id IN
(SELECT MAX(id) FROM tbl_completion GROUP BY tbl_completion.userid, tbl_completion.form_id)
This gives me the most up-to-date information about the forms completed, as well as the form information. The only problem is that I need to output all of the forms in a table (like: Form 1 - Incomplete, Form 2 - Completed, etc.). I cannot seem to figure out how to make it work with tbl_forms as the left table, getting all of the form info plus the "latest" tbl_completion info. I also tried a three-way LEFT OUTER JOIN with the last "table" being a temp table holding the max id, but it was very slow AND didn't give me what I wanted.
Can anybody help?? Is there a better optimized query I can run once, or can I do something else on the DB side to speed this up? Thank you in advance.
You're missing indexes. See:
DOs and DONTs for Indexes
Also, the subquery SELECT MAX(id) FROM tbl_completion GROUP BY tbl_completion.userid, tbl_completion.form_id could presumably discard unneeded rows if you toss your userid into a WHERE clause.
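For example, something along these lines (just a sketch using the column names from the question, with 'testuser' standing in for the bound user id) keeps tbl_forms on the left so every form shows up once, and restricts the MAX(id) lookup to the one user:
-- Composite index so the per-user, per-form lookup doesn't scan the whole table
CREATE INDEX idx_completion_user_form ON tbl_completion (userid, form_id, id);

-- Forms with no completion row for this user come back with NULL completion
-- columns, i.e. "Incomplete".
SELECT f.*, c.*
FROM tbl_forms AS f
LEFT OUTER JOIN tbl_completion AS c
  ON c.form_id = f.id
 AND c.userid = 'testuser'
 AND c.id = (SELECT MAX(id)
             FROM tbl_completion
             WHERE userid = 'testuser' AND form_id = f.id)
ORDER BY f.id;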
It sounds like you might be running into the concurrency limitations of SQLite. SQLite does not support concurrent writes, so if you have a lot of users, you end up having a lot of contention. You should consider migrating to another DBMS in order to satisfy your scaling needs.
I'm trying to write a single statement that selects all columns (*) and sums one column from the same table, depending on conditions.
I wrote this statement (based on Multiple select statements in Single query):
SELECT ( SELECT SUM(Amount) FROM 2_1_journal), ( SELECT * FROM 2_1_journal WHERE TransactionPartnerName = ? )
I understand that SELECT SUM(Amount) FROM 2_1_journal will sum all values in column Amount (not based on a condition).
But first I want to understand what the correct statement is.
With the above statement I get the error SQLSTATE[21000]: Cardinality violation: 1241 Operand should contain 1 column(s).
I can't make sense of the error message. From the advice here (MySQL - Operand should contain 1 column(s)) I understand that the subquery SELECT * FROM 2_1_journal WHERE TransactionPartnerName = ? must select only one column?
I tried changing the statement to SELECT ( SELECT * FROM 2_1_journal WHERE TransactionPartnerName = ? ), ( SELECT SUM(Amount) FROM 2_1_journal), but I get the same error...
What would be the correct statement?
SELECT *, (SELECT SUM(Amount) FROM 2_1_journal)
FROM 2_1_journal
WHERE TransactionPartnerName = ?
This sums up Amount over the entire table and "appends" that total to every row where TransactionPartnerName matches the parameter you bind in the client code.
If you want to limit the sum to the same criteria as the rows you select, just include it:
SELECT *, (SELECT SUM(Amount) FROM 2_1_journal WHERE TransactionPartnerName = ?)
FROM 2_1_journal
WHERE TransactionPartnerName = ?
A whole different thing: table names like 2_1_journal are strong indicators of a broken database design. If you can redo it, you should look into how to normalize the database properly. It will most likely pay off many times over.
With regard to normalization (added later):
Since the current design uses keys in table names (such as the 2 and 1 in 2_1_journal), I'll quickly illustrate how I think you can vastly improve that design. Let's say that the table 2_1_journal has the following data (I'm just guessing here because the tables haven't been described anywhere yet):
title | posted | content
------+------------+-----------------
Yes! | 2013-01-01 | This is just a test
2nd | 2013-01-02 | Another test
This stuff belongs to user 2 in company 1. But hey! If you look at the rows, the fact that this data belongs to user 2 in company 1 is nowhere to be found.
The problem is that this design violates one of the most basic principles of database design: don't use keys in object (here: table) names. A clear indication that something is very wrong is if you have to create new tables if something new is added. In this case, adding a new user or a new company requires adding new tables.
This issue is easily fixed. Create one table named journal. Next, use the same columns, but add another two:
company | user | title | posted | content
--------+------+-------+------------+-----------------
1 | 2 | Yes! | 2013-01-01 | This is just a test
1 | 2 | 2nd | 2013-01-02 | Another test
Doing it like this means:
You never add or modify tables unless the application changes.
Doing joins across companies or users (and anything else that used to be part of the table naming scheme) is now possible with a single, fairly simple select statement.
Enforcing integrity is easy - if you upgrade the application and want to change the tables, the changes don't have to be repeated for each company and user. More importantly, this lowers the risk of having the application get out of sync with the tables in the database (such as adding the field comments to all x_y_journal tables, but forgetting 5313_4324_journal, causing the application to break only when user 5313 logs in). This is the kind of problem you don't want to deal with.
I am not writing this because it is a matter of personal taste. Databases are just designed to handle tables that are laid out as I describe above. The design where you use object keys as part of table names has a host of other problems associated with it that are very hard to deal with.
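To make this concrete, here is a minimal sketch of the single journal table and one cross-company query (the column types are my assumptions, since the real schema hasn't been shown):
CREATE TABLE journal (
    company INT NOT NULL,
    `user`  INT NOT NULL,
    title   VARCHAR(100),
    posted  DATE,
    content TEXT
);

-- All of user 2's entries across every company in one plain query,
-- something the x_y_journal naming scheme cannot express at all:
SELECT company, title, posted
FROM journal
WHERE `user` = 2
ORDER BY posted;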
I have a web application that stores points in a table, and total points in the user table as below:
User Table
user_id | total_points
Points Table
id | date | user_id | points
Every time a user earns a point, the following steps occur:
1. Enter points value to points table
2. Calculate SUM of the points for that user
3. Update the user table with the new SUM of points (total_points)
The values in the user table might get out of sync with the sum in the points table, and I want to be able to recalculate the SUM of all points for every user once in a while (e.g. once a month). I could write a PHP script that loops through each user in the user table, finds the sum for that user, and updates total_points, but that would be a lot of SQL queries.
Is there a better(efficient) way of doing what I am trying to do?
Thanks...
A more efficient way to do this would be the following:
User Table
user_id
Points Table
id | date | user_id | points
Total Points View
user_id | total_points
A view is effectively a select statement disguised as a table. The select statement would be: SELECT "user_id", SUM("points") AS "total_points" FROM "Points Table" GROUP BY "user_id". To create a view, execute CREATE VIEW "Total Points View" AS <SELECT STATEMENT> where SELECT STATEMENT is the previous select statement.
Once the view has been created, you can treat it as you would any regular table.
P.S.: I don't know that the quotes are necessary unless your table names actually contain spaces, but it's been a while since I worked with MySQL, so I don't remember its idiosyncrasies.
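Concretely, assuming the tables are simply named users and points (no spaces, so no quoting needed), the MySQL statements would look roughly like this:
CREATE VIEW total_points_view AS
SELECT user_id, SUM(points) AS total_points
FROM points
GROUP BY user_id;

-- The view can then be queried like any table, e.g. joined back to users so
-- that users with no points rows still show up with a total of 0:
SELECT u.user_id, COALESCE(v.total_points, 0) AS total_points
FROM users AS u
LEFT JOIN total_points_view AS v ON v.user_id = u.user_id;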
You have to use triggers for this, to keep the user's total points in sync with the points table. Something like:
DELIMITER //

CREATE TRIGGER UpdateUserTotalPoints AFTER INSERT ON points
FOR EACH ROW
BEGIN
    -- Recompute the totals from the points table and copy them into
    -- users.total_points (note: this recomputes every user's total on
    -- each insert).
    UPDATE users u
    INNER JOIN
    (
        SELECT user_id, SUM(points) AS totalPoints
        FROM points
        GROUP BY user_id
    ) p ON u.user_id = p.user_id
    SET u.total_points = p.totalPoints;
END //

DELIMITER ;
Note: as pointed out by @FireLizzard, if the records in the points table are frequently updated or deleted, you also need AFTER UPDATE and AFTER DELETE triggers to keep the two tables in sync. In that case, @FireLizzard's solution will be the better one.
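For example, an AFTER DELETE trigger following the same pattern could look like this (a sketch only; OLD.user_id refers to the row that was just deleted):
DELIMITER //

CREATE TRIGGER UpdateUserTotalPointsOnDelete AFTER DELETE ON points
FOR EACH ROW
BEGIN
    -- Recompute only the affected user's total; COALESCE covers the case
    -- where the user has no points rows left.
    UPDATE users u
    SET u.total_points = (SELECT COALESCE(SUM(p.points), 0)
                          FROM points p
                          WHERE p.user_id = OLD.user_id)
    WHERE u.user_id = OLD.user_id;
END //

DELIMITER ;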
If you want it once a month, you can't do it with MySQL alone. You have too much logic here, and putting too much logic in the database is not the correct way to go. The trigger from Karan Punamiya could be nice, but it will update the user table on every insert into the points table, and that's not what you seem to want.
As for wanting to be able to remove points: just add new negated rows in points, don't remove any row (that would break the history trace).
If you really want it periodically, you can run a cron script that does that, or even call your PHP script ;)
I have some code that needs to look through three tables to display information for a Minecraft top list.
The tables are the following:
servers
id | user_id | name | information | websitename | websiteurl | postdate.. etc
This is the main table that contains the server name, website information (full description of the server), etc.
vote
id | server_id | username | ipaddress | votetimestamp
On the website I allow players to vote every 24 hours. All votes get inserted into this table with the user's in-game name (username), the server id and the time of the vote.
ping
id | server_id | min_player | max_player | motd | pingtimestamp
This table gets updated every 10 minutes by another script run as a cron job on my web server, using fsockopen calls.
Doing this I can find out whether the server is offline or online, how many players are online, and how many players can be online at a time.
On my index page I have a script that should pull data from all three tables onto the web page and display every server in the database, ordered from the server with the most votes down to the server with the least.
I can pull out every server that already has a vote in the vote table; however, a server that hasn't received a vote yet won't get listed, which it should.
This is the SQL code I use.
SELECT DISTINCT(ping.server_id), COUNT(vote.server_id) AS count, servers.id,
servers.name, servers.server_ip, ping.min_player,ping.max_player,ping.motd
FROM servers,vote,ping
WHERE servers.id = vote.server_id
AND servers.id = ping.server_id
GROUP BY servers.id
ORDER BY count DESC
LIMIT $start, $per_page
I'm sure this is simple enough, but I've tried a few things now and nothing really seems to work.
Would be a good idea to mention at this point that SQL is not really my strong suit.
Edit
I have tried removing the DISTINCT from the query, but then it returns multiple rows for every server, i.e. the same server is displayed more than once.
Every server should only be displayed once, sorted from top to bottom by vote count, from the server with the most votes to the one with the least.
First, don't use the comma-delimited syntax where you simply list the tables in your From clause; instead use the Join syntax. It would help if you showed us some sample inputs and expected outputs. The fact that you want all servers whether or not they have a vote means you will probably need to use Left Joins:
Select servers.id
     , Count(Distinct vote.id) As cnt  -- Distinct, so multiple ping rows don't inflate the vote count
     , servers.name
     , servers.server_ip
     , Min(ping.min_player) As min_player
     , Max(ping.max_player) As max_player
     , Min(ping.motd) As Min_Motd
From servers
Left Join vote
  On vote.server_id = servers.id
Left Join ping
  On ping.server_id = servers.id
Group By servers.id
       , servers.name
       , servers.server_ip
       , ping.server_id
Order By cnt Desc  -- most-voted servers first, as in the original query
You were not clear on the structure and purpose of the ping table nor the nature of the results if there are multiple rows in the ping table for the same servers.id value. In the above example, I guessed as to the aggregate functions to use for ping.min_player, ping.max_player and ping.motd. In addition, I assumed you really did want all servers rows and not necessarily those that contained a value in the ping table and so I used a Left Join from servers to ping.
In addition, if servers.id is the primary key of the servers table and since you are using MySQL, you do not need to enumerate all of its columns in the Group By (but you would in other database products). Since it wasn't stated whether servers.id is the primary key, I enumerated them.
How do I connect two MySQL tables that I am inserting data into at the same time?
I have a table customers, which is my app table, and a table user, which is used by a library in my framework; this library handles users and authorization.
I want to have user_id (which is the id from user) in customers, but I am inserting into both tables at the same time.
Any ideas? Thanks!
The PHP function mysql_insert_id() gives you the id of the last record inserted into a table. So, from my understanding, if you're inserting a user you could get that id and then insert it into the other table?
http://php.net/manual/en/function.mysql-insert-id.php
Or have I understood your question wrongly?
It is simply not possible. Nothing can happen at the same time in a program.
What is possible is to:
Start a database transaction
perform your first query
retrieve the table's key; if it's an autoincrement, there are built-in ways to retrieve the last inserted key in every database API
perform your second query using the retrieved key as a parameter
Commit the transaction; but if an error occurred, you need to roll back the whole transaction
This is how it is done, and it behaves exactly like you want it to.
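In MySQL terms, the sequence above looks roughly like this (the column names are assumptions, since only the table names user and customers were given):
START TRANSACTION;

-- 1. Insert the framework's user row first.
INSERT INTO `user` (username) VALUES ('jane');

-- 2. LAST_INSERT_ID() returns the auto-increment id generated by the
--    previous INSERT on this connection; use it as the foreign key.
INSERT INTO customers (user_id, name) VALUES (LAST_INSERT_ID(), 'Jane Doe');

COMMIT;
-- If either INSERT fails, issue ROLLBACK instead of COMMIT.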
How about a link table?
users_customers
user_id | customer_id
=====================
1 | 648785552
5 | 145778304
4 | 654566055
You can then join the tables together using this table, for example like this:
SELECT users.name, customers.address
FROM users
JOIN users_customers
ON users_customers.user_id = users.id
JOIN customers
ON users_customers.customer_id = customers.id
How can I number my results so that the lowest ID is #1 and the highest ID is #numberOfResults?
Example: if I have a table with only 3 rows, whose IDs are 24, 87, and 112, it would pull like this:
ID  | 24 | 87 | 112
Num |  1 |  2 |   3
The reason why I want this, is my manager wants items to be numbered like item1, item2, etc. I initially made it so it used the ID but he saw them like item24, item87, item112. He didn't like that at all and wants them to be like item1, item2, item3. I personally think this is going to lead to problems because if you are deleting and adding items, then item2 will not always refer to the same thing and may cause confusion for the users. So if anyone has a better idea I would like to hear it.
Thanks.
I agree with the comments about not using a numbering scheme like this if the numbers are going to be used for anything other than a simple ordered display of items with numbers. If the numbers are actually going to be tied to something, then this is a really bad idea!
Use a variable, and increment it in the SELECT statement:
SELECT
  id,
  (@row := @row + 1) AS `row`
FROM `table`,
  (SELECT @row := 0) AS row_count;
Example:
CREATE TABLE `table1` (
  `id` int(11) NOT NULL auto_increment,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

INSERT INTO table1 VALUES (24), (87), (112);
SELECT
  id,
  (@row := @row + 1) AS `row`
FROM table1,
  (SELECT @row := 0) AS row_count;
+-----+------+
| id | row |
+-----+------+
| 24 | 1 |
| 87 | 2 |
| 112 | 3 |
+-----+------+
How it works
@row is a user-defined variable. It is necessary to set it to zero before the main SELECT statement runs. This can be done like this:
SELECT @row := 0;
or like this:
SET @row := 0;
But it is handy to tie the two statements together. This can be done by creating a derived table, which is what happens here:
FROM `table`,
  (SELECT @row := 0) AS row_count;
The second SELECT (the derived table) actually gets run first. Once that's done, it's just a case of incrementing the value of @row for every row retrieved:
@row := @row + 1
The @row value is incremented every time a row is retrieved. It will always generate a sequential list of numbers, no matter what order the rows are accessed. So it's handy for some things, and dangerous for other things...
It sounds like it would be better just to generate that number in your code instead of trying to come up with some sort of convoluted way of doing it in SQL. When looping through your elements, just maintain a counter there.
What is the ID being used for?
If it's only for quick and easy reference then that's fine, but if it's to be used for deleting or managing in any way as you mentioned then your only option would be to assign a new ID column that is unique for each row in the table. Doing this is pointless though because that duplicates the purpose of your initial ID column.
My company had a similar challenge on a CMS system that used an order field to sort the articles on the front page of the site. The users wanted a "promote, demote" icon that they could click that would move an article up or down.
Again, not ideal, but the strategy we used was to build a promote function and an accompanying demote function. Each one looked up the current sort value with a query, added or subtracted one from the previous or next value, respectively, and then set the value of the item being promoted or demoted. It was also vital to engineer the record insert to set the initial sort value of newly added records correctly, so inserts wouldn't create a duplicate value; this was also enforced at the DB level for safety's sake. The user was never allowed to key in the sort value directly, only to promote or demote via the icons. To be honest, it worked quite well for the user.
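One way to realize the promote step in MySQL is to swap sort values with the row directly above; here is a rough sketch, assuming a table articles(id, sort_order) (both names are invented here, and edge cases such as already being at the top are left out). Demote is the mirror image:
START TRANSACTION;

-- Current sort value of the article being promoted (id 42 here).
SELECT sort_order INTO @current FROM articles WHERE id = 42;

-- The article directly above it.
SELECT id, sort_order INTO @above_id, @above
FROM articles
WHERE sort_order < @current
ORDER BY sort_order DESC
LIMIT 1;

-- Swap the two sort values, so the overall set of values stays intact.
-- (If sort_order has a UNIQUE constraint, route the swap through a
-- temporary value first.)
UPDATE articles SET sort_order = @current WHERE id = @above_id;
UPDATE articles SET sort_order = @above   WHERE id = 42;

COMMIT;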
If you have to go this route... it's not impossible. But there is brain damage involved...