I want to save the top 100 results for a game daily.
I have two tables: Users and Stats.
The Users table has two columns: userID (mediumint(8) unsigned AUTO_INCREMENT) and userName (varchar(500)).
The Stats table has three columns: time (date), userID (mediumint(8) unsigned AUTO_INCREMENT) and result (tinyint(3) unsigned).
Now, every time I run the daily job I have an array of 100 results with user names. So here's what I need to do:
For every result in the array:
get the user id from the Users table, or if the user doesn't exist in the Users table then create an entry and get the id;
insert the current date, the user id and the result into the Stats table.
What would be the optimal way to do this in PHP and MySQL? Is there a way to avoid running 200 queries in a 'for' loop?
Thanks for your time guys.
200 queries per day is nothing. You can leave everything as is and there won't be a single problem.
Why do you have an array with user names where you ought to have user ids instead?
MySQL's INSERT statement supports multiple VALUES tuples, so you could assemble a string like
VALUES (time, userid, result), (time, userid, result), (time, userid, result)
and run it all at once.
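For the Stats table from the question, the assembled statement would look roughly like this (the ids and results are made-up values that your PHP code would fill in):
INSERT INTO Stats (`time`, userID, result) VALUES
(CURDATE(), 42, 17),
(CURDATE(), 57, 23),
(CURDATE(), 64, 9);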
Also note that in the Stats table userID should not be AUTO_INCREMENT: it references Users.userID rather than generating its own key (and you may prefer a plain INT over MEDIUMINT there).
Use a pair of prepared statements. You'll still be running each one 100 times, but the query itself will already be parsed (and even cached by the DB server), it'll just have 100 different sets of parameters to be run with, which is quite efficient.
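In PHP this is normally done through PDO or mysqli prepared statements; purely at the SQL level the same idea looks roughly like this sketch, using the tables from the question and made-up values:
PREPARE sel_user FROM 'SELECT userID FROM Users WHERE userName = ?';
PREPARE ins_stat FROM 'INSERT INTO Stats (`time`, userID, result) VALUES (CURDATE(), ?, ?)';
-- repeated for each of the 100 results:
SET @name = 'some_user';
EXECUTE sel_user USING @name;        -- look up the user id (insert the user first if missing)
SET @uid = 42, @res = 17;
EXECUTE ins_stat USING @uid, @res;   -- record today's result
-- when finished:
DEALLOCATE PREPARE sel_user;
DEALLOCATE PREPARE ins_stat;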
I'm not sure the best way to phrase this question!
I have a mysql database that needs to retrieve and store the past 24 data values, and the data always needs to be the last 24 data values.
I have this fully working, but I am sure there must be a better way to do it!
Just now, my mysql database has columns for id, timestamp, etc, and then 24 data columns:
data_01
data_02
data_03
data_04
data_05
etc
There are multiple rows for different ids.
I run a cron job every hour, which deletes column 'data_24', and then renames all columns:
data_01 -> data_02
data_02 -> data_03
data_03 -> data_04
data_04 -> data_05
data_05 -> data_06
etc
And then adds a new, blank column:
data_01
The new data is then added into this new, blank column.
Does this sound like a sensible way to do this, or is there any better way??
My concern with this method is that the column deleting, renaming and adding has to be done first, before the new data is retrieved, so that the new column exists for adding data.
If the data retrieve fails for any reason, my table then has a column with NULL as a data value.
Renaming columns for something like this is not a good idea.
I'm curious how you insert and update this data, but there must be a better way to do this.
Two things that seem feasible:
Not renaming the columns, but shifting the data to the next column. Note that MySQL evaluates the SET assignments from left to right, so shift starting from the last column:
update YourTable
set data_24 = data_23,
data_23 = data_22,
...,
data_02 = data_01,
data_01 = :newvalue;
Or by spreading the data over 24 rows instead of having 24 columns. Each data value becomes a row in your table (or in a new table where your id is a foreign key). Every time you insert a new value, you also delete the oldest value for that same id. You can do this in one atomic transaction, so there will never be more or fewer than 24 rows per id.
insert into YourTable(id, data)
values (:id, :newvalue);
delete from YourTable
where id = :id
order by timestamp asc
limit 1;
This will multiply the number of rows (but not the amount of data) by 24, so for 1000 rows (like you mentioned), you're talking about 24000 rows, which is still peanuts if you have the proper indexes.
We got tables in MySQL with over 100 million rows. Manipulating 24000 rows is WAY easier than rewriting a complete table of 1000 rows, which is essentially what you're doing by renaming the columns.
So the second option certainly has my preference. It will provide you with a simple structure, and should you ever decide to not clean up old data, or move that to a separate job, or stick to 100 items instead of 24, then you can easily do that by changing 3 lines of code, instead of completely overhauling your table structure and the application with it.
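Regarding "the proper indexes" mentioned above: since the delete and any "give me the current 24 values" query both filter on id and sort by timestamp, one composite index covers them. A sketch with assumed index name:
CREATE INDEX idx_yourtable_id_ts ON YourTable (id, timestamp);
-- fetch the current values for one id, newest first
SELECT data
FROM YourTable
WHERE id = :id
ORDER BY timestamp DESC
LIMIT 24;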
It doesn't look like a sensible way of doing things, to be honest.
IMHO, having multiple rows instead of one wide table is much more flexible.
You can define columns (id, entity_id, value, created); then you'll be able to write your records in a "log" manner.
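A minimal sketch of such a log table (the value column, types and index name are assumptions, not something from the question):
CREATE TABLE my_table (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    entity_id INT UNSIGNED NOT NULL,      -- which series the value belongs to
    value DECIMAL(10,2) NOT NULL,         -- the hourly data value
    created DATETIME NOT NULL,
    KEY idx_entity_created (entity_id, created)
);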
When you need to select the data in the same way as it used to be, you can use the MySQL view for that. Something like
CREATE VIEW my_view AS
SELECT data_01, ..., data_24 -- here you should put the aggregated values aliased as data_01 ... data_24
FROM my_table
WHERE my_table.created >= DATE_SUB(NOW(), INTERVAL 1 DAY)
GROUP BY ... -- here you should aggregate the fields by hours
ORDER BY created;
I have a MySQL database that is becoming really large. I can feel the site becoming slower because of this.
Now, on a lot of pages I only need a certain part of the data. For example, I store information about users every 5 minutes for history purposes. But on one page I only need the information that is the newest (not the whole history of data). I achieve this by a simple MAX(date) in my query.
Now I'm wondering if it wouldn't be better to make a separate table that just stores the latest data so that the query doesn't have to search for the latest data from a specific user between millions of rows but instead just has a table with only the latest data from every user.
The con here would be that I have to run 2 queries to insert the latest history in my database every 5 minutes, i.e. insert the new data in the history table and update the data in the latest history table.
The pro would be that MySQL has a lot less data to go through.
What are common ways to handle this kind of issue?
There are a number of ways to handle slow queries in large tables. The three most basic ways are:
1: Use indexes, and use them correctly. It is important to avoid table scans on large tables; this is almost always your most significant performance hit with single queries.
For example, if you're querying something like: select max(active_date) from activity where user_id=?, then create an index on the activity table for the user_id column. You can have multiple columns in an index, and multiple indexes on a table.
CREATE INDEX idx_user ON activity (user_id)
2: Use summary/"cache" tables. This is what you have suggested. In your case, you could apply an insert trigger to your activity table, which will update your summary table whenever a new row gets inserted. This means you won't need your code to execute two queries. For example:
CREATE TRIGGER update_summary
AFTER INSERT ON activity
FOR EACH ROW
UPDATE activity_summary SET last_active_date=new.active_date WHERE user_id=new.user_id
You can change that to check if a row exists for the user already and do an insert if it is their first activity. Or you can insert a row into the summary table when a user registers...Or whatever.
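One way to cover that first-activity case is to let the trigger upsert instead. This sketch assumes user_id is the primary or a unique key on activity_summary, and the trigger name is just illustrative:
CREATE TRIGGER upsert_summary
AFTER INSERT ON activity
FOR EACH ROW
INSERT INTO activity_summary (user_id, last_active_date)
VALUES (new.user_id, new.active_date)
ON DUPLICATE KEY UPDATE last_active_date = new.active_date;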
3: Review the query! Use MySQL's EXPLAIN command to grab a query plan and see what the optimizer does with your query. Use it to ensure that the optimizer is avoiding table scans on large tables (and either create or force an index if necessary).
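For example, with the query from point 1 (user id 42 is just a placeholder), you would check that the key column of the output shows idx_user and that the rows estimate stays small:
EXPLAIN SELECT MAX(active_date) FROM activity WHERE user_id = 42;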
This is my first question here, sorry if I'm breaking any etiquette.
I'm kind of into coding, but sometimes I find it hard to follow a path of logical steps.
Currently I'm working on my own small web app, where I have public events, and I'm making my own guestlist.
So far I've solved these things (I think so):
Getting all events from the Facebook page;
Getting all attendees from the current event view;
Live search function and alphabetical ordering of the array.
To Do:
Complete the CHECK button function - when checked, the person gets removed from the list;
Other analysis functions.
Problem:
Currently I'm getting all attendees from a JSON string, then converting it to an array and putting it all in the database. I can't decide on the SQL logic.
I have the whole list of people (json -> array -> db); then it reads from the db and shows which ones are checked and which ones are not, by comparing against the table built from the JSON.
The current algorithm is: get the JSON, and in a foreach cycle, on every load, write to the DB; using INSERT IGNORE, rows with the same userid are skipped, so I have a db of all attendees.
How to arrange my database? I'm thinking about making tables:
guests - USERID ; EVENT ID; NAME; [for huge list of all people]
checkins - USERID; CHECKEDEVENTID; DATETIME; [for getting stats]
My goal is to make a "checking in" door app, so in the end I can see that certain users attend certain kinds of events more than others...
So how could I produce stats like: EVENT - attended by Y people out of X; and more global SQL queries like: USER Y came to EVENTS A, B, C; or: most check-ins happen in a given timespan [probably some bars or a chart]...
Should I make a new table for each event to store all its guests, to see attendee statistics, plus a check-in table for check-in stats?
For what you refer to as the "Check" feature, it sounds like you want (roughly*) the following tables:
create table users
(
userid int unsigned NOT NULL,
username varchar(64)
);
create table events
(
eventid int unsigned NOT NULL,
eventname varchar(64),
eventstart date,
eventlength float
);
create table checkin_activity
(
userid int unsigned not null,
eventid int unsigned not null,
checkin_time datetime
);
* This is a highly simplified database schema. You'll want to make sure you add the necessary keys, constraints, etc., and make sure the data types on your columns are appropriate. I didn't give that much thought with this quick example.
Using the entries in the USERS and EVENTS tables, you'll populate the CHECKIN_ACTIVITY table with what you refer to as the "Check" button. You can join queries against these tables as needed to run reports and so on.
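For example, a per-event attendance report could look roughly like this sketch (it assumes the simplified columns above and that "attended" means "checked in"):
SELECT e.eventname,
       COUNT(DISTINCT ca.userid) AS people_checked_in
FROM events e
LEFT JOIN checkin_activity ca ON ca.eventid = e.eventid
GROUP BY e.eventid, e.eventname;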
NOTE: You mention:
The current algorithm is: get the JSON, and in a foreach cycle, on every load, write to the DB; using INSERT IGNORE, rows with the same userid are skipped, so I have a db of all attendees
You should avoid writing to the database within a for loop (into a table I didn't account for above; let's call it the EVENT_ATTENDEES table). Instead, build a single multi-row INSERT query and execute it once, so you're not hitting the database's transaction handler n times.
INSERT INTO event_attendees (eventid, name) VALUES
(1, 'John'),
(1, 'Jane'),
(1, 'Colin');
This is especially important if this kind of load is something you'll be doing often.
Hello, I have a MySQL database and all I want is basically to get a value from a second table based on a query against the first table.
I have figured out something like this, but it is not working.
select src, dst_number, state, duration
from cdrs, area_code_infos
where SUBSTRING(cdrs.src,2,3) = area_code_infos.`npa`;
Please help me figure this out. I have tried running multiple queries one after the other in PHP, but when the page still hadn't loaded after 45 minutes of waiting, I gave up.
Thanks,
I assume the tables are fairly big, and you are also doing an unindexed query: basically the substring has to be calculated for every row.
Whenever you do a join, you want to make sure both of the joined fields are indexed.
One option would be to create another column containing the substring calculation and then create an index on that.
However, a better option would be to have an areaCodeInfosID column and set it as a foreign key to the area_code_infos table.
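A sketch of the first option, using hypothetical column and index names (on newer MySQL versions you could also use a generated column instead of maintaining src_npa yourself):
ALTER TABLE cdrs ADD COLUMN src_npa CHAR(3);
UPDATE cdrs SET src_npa = SUBSTRING(src, 2, 3);
CREATE INDEX idx_cdrs_src_npa ON cdrs (src_npa);
CREATE INDEX idx_area_code_infos_npa ON area_code_infos (npa);

SELECT c.src, c.dst_number, c.state, c.duration
FROM cdrs c
JOIN area_code_infos a ON a.npa = c.src_npa;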
Scenario 1
I have one table, let's say "member". In that table I have 7 fields (memid, login_name, password, age, city, phone, country). The table has 10K records and I need to fetch one record, so I'm using a query like this:
mysql_query("select * from member where memid=999");
Scenario 2
I have the same table called "member", but I'm splitting it into two tables: member and member_txt. In the member_txt table I have (memid, age, phone, city, country) and in the member table I have (memid, login_name, password).
Which is the best scenario to fetch the data quickly: keeping a single table, or splitting the table into two with a reference?
Note: I need to fetch this data in PHP and MySQL. Please let me know which is the best method to follow.
we have 10K records
For your own health, use the single table approach.
As long as you are using a primary key for memid, things are going to be lightning fast. This is because PRIMARY KEY automatically creates an index, which tells MySQL the exact location of the data and eliminates the full scan it would otherwise have to do.
From http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html
Indexes are used to find rows with specific column values quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows. The larger the table, the more this costs. If the table has an index for the columns in question, MySQL can quickly determine the position to seek to in the middle of the data file without having to look at all the data. If a table has 1,000 rows, this is at least 100 times faster than reading sequentially. If you need to access most of the rows, it is faster to read sequentially, because this minimizes disk seeks.
Your second approach only makes your system more complex, and provides no benefits.
Use scenario 1.
Please make memid a primary/unique key; then having one table is faster than having two tables.
In general you should not see much impact on performance with 10K rows as long as you are accessing them by your primary key.
Fetching data from one table is also faster than fetching it from two tables.
If you want to optimize further, use the column names in the SELECT statement instead of the * operator.
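A sketch of both suggestions together (assuming memid is not already the primary key; the column names come from the question):
ALTER TABLE member ADD PRIMARY KEY (memid);

SELECT memid, login_name, age, city, phone, country
FROM member
WHERE memid = 999;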