I have a big table of records, the onlines table (more than 40 million records).
I want to show which users were online at any given time, but executing this on the server fails.
For example, when I send a request for the users online in the last week, it does not work, because the table has a very large number of records.
This is my example PHP code:
$d = $_GET['date'];
$time = time() - 60*60 * 24 * $d;
$phql = "SELECT DISTINCT aid FROM onlines WHERE time > '$time'";
So, do you have any better tips?
Thanks.
Use EXPLAIN SELECT ... to see which indexes are defined on your table and which ones the query actually uses. For big tables in particular, make sure the columns you filter on are indexed; in this case, time.
You can create an index by:
CREATE INDEX time_index ON onlines (time);
This should speed up the query. If you do not care about potential data loss or persistence, you might look into using an in-memory table to avoid I/O. That speeds up queries significantly, but the table will be emptied if the server restarts or MySQL is shut down.
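For this particular query, a composite index that also includes aid would let MySQL resolve the DISTINCT aid scan from the index alone. A sketch (the timestamp literal is just an example value):
CREATE INDEX time_aid_index ON onlines (time, aid);

-- time first for the range filter; adding aid makes it a covering index,
-- so EXPLAIN should show "Using index" for:
EXPLAIN SELECT DISTINCT aid FROM onlines WHERE time > 1371600000;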
I am running queries on a table that has thousands of rows:
$sql="select * from call_history where extension_number = '0536*002' and flow = 'in' and DATE(initiated) = '".date("Y-m-d")."' ";
and it's taking forever to return results.
The SQL itself is
select *
from call_history
where extension_number = '0536*002'
and flow = 'in'
and DATE(initiated) = 'dateFromYourPHPcode'
Is there any way to make it run faster? Should I put the DATE(initiated) = '".date("Y-m-d")."' condition before the extension_number condition in the WHERE clause?
Or should I select all rows where DATE(initiated) = '".date("Y-m-d")."', put that in a while loop, and then run all my other queries (where extension_number = ...) within that loop?
Here are some suggestions:
1) Replace SELECT * with only the fields you actually need.
2) Add indexes on the table fields you filter and sort on.
3) Avoid running queries in loops; this causes multiple round trips to the SQL server.
4) Fetch all the data at once.
5) Apply a LIMIT clause where appropriate; don't select all the records.
6) Fire two different queries: one for counting the total number of records and the other for fetching the records for the current page (e.g. 10, 20, 50, etc.) - see the sketch after this list.
7) If applicable, create database views and query them instead of the base tables.
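For points 5 and 6, the two-query paging pattern could look roughly like this, reusing the call_history columns from the question (20 rows per page is just an example):
-- One query for the total, used to build the pager:
SELECT COUNT(*) FROM call_history WHERE extension_number = '0536*002' AND flow = 'in';

-- One query per page, fetching only the columns and rows you need (rows 21-40 here):
SELECT extension_number, flow, initiated
FROM call_history
WHERE extension_number = '0536*002' AND flow = 'in'
ORDER BY initiated DESC
LIMIT 20 OFFSET 20;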
Thanks
The order of clauses under WHERE is irrelevant to optimization.
Pro-tip, also suggested by somebody else: Never use SELECT * in a query in a program
unless you have a good reason to do so. "I don't feel like writing out the names of the columns I need" isn't a good reason. Always enumerate the columns you need. MySQL and other database systems can often optimize things in surprising ways when the list of data columns you need is available.
Your query contains this selection criterion.
AND DATE(initiated) = 'dateFromYourPHPcode'
Notice that this search criterion takes the form
FUNCTION(column) = value
This form of search defeats the use of any index on that column. Your initiated column has a TIMESTAMP data type. Try this instead:
AND initiated >= 'dateFromYourPHPcode'
AND initiated < 'dateFromYourPHPcode' + INTERVAL 1 DAY
This will find all the initiated items in the particular day. And, because it doesn't use a function on the column value it can use an index range scan to do that, which performs well. It may, or may not, also help without an index. It's worth a try.
I suspect the ideal index for this particular search would be created by
ALTER TABLE call_history
ADD INDEX flowExtInit (flow, extension_number, initiated)
You should ask the administrator of the database to add this index if your query needs good performance.
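Putting the range predicate and that index together, the full statement could end up looking like this (a sketch; the date literal stands in for the value from your PHP code):
SELECT extension_number, flow, initiated   -- list only the columns you actually need
FROM call_history
WHERE flow = 'in'
  AND extension_number = '0536*002'
  AND initiated >= '2013-06-19'
  AND initiated <  '2013-06-19' + INTERVAL 1 DAY;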
You should add an index to your table. That way MySQL will fetch rows faster. I have not tested it, but the command should look like this:
ALTER TABLE `call_history` ADD INDEX `callhistory` (`extension_number`,`flow`,`initiated`);
(Note that you cannot put the expression DATE(initiated) in an index this way; index the raw initiated column and use the range condition shown above instead.)
I have a table in my database, and I'm using MySQL to insert/update/create info; however, the time it takes to execute queries is slow. For example:
First, when a user runs my app it verifies they can use it by retrieving all IPs from the db with
SELECT IP FROM user_info
Then, using a while loop, if the IP is in the db it runs another query:
"SELECT Valid FROM user_info WHERE IP='".$_SERVER['REMOTE_ADDR']."'"
If Valid is 1 then they can use it; otherwise they can't. However, if their IP is not found in the db, it creates a new entry for them using
"INSERT INTO user_info (IP,First) VALUES ('".$_SERVER['REMOTE_ADDR']."','".date("Y/m/d")."')"
Once this first script has finished, it accesses another one. That one was supposed to update the db every minute, but I don't think I can accomplish that now; the update query is this:
"UPDATE user_info SET Name='".$Name."', Status='".$Status."', Last='".$Last."', Address='".$Address."' WHERE IP='".$_SERVER['REMOTE_ADDR']."'"
Altogether it takes on average 2.2 seconds, and there's only 1 row in the table at the moment.
My question is: how do I speed up MySQL queries? I've read about indexes and how they can help improve performance, but I do not fully understand how to use them. Any light shed on this topic would help.
Nubcake
Indexes will become very important as your site grows, but they won't help when you have only one row in your table and it cannot be the reason why your queries take so long. You seem to have some sort of fundamental problem with your database. Maybe it's a latency issue.
Try starting with some simpler queries like SELECT * FROM users LIMIT 1 or even just SELECT 1 and see if you still get bad performance.
The fewer the queries, the lower the latency of the system. Try merging queries made on the same table. For example, your second and third queries can be merged and you can execute
INSERT INTO user_info ...... WHERE Valid=1 AND IP= ...
Check the number of rows affected to know if a new row was added or not.
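A sketch of one way to do that merge, assuming IP is declared UNIQUE (or is the primary key) in user_info: INSERT IGNORE only adds the row when the IP is new, and the affected-row count tells you which case you hit.
<?php
// Sketch only: assumes a unique index on user_info.IP.
$ip = mysql_real_escape_string($_SERVER['REMOTE_ADDR']);
mysql_query("INSERT IGNORE INTO user_info (IP, First) VALUES ('$ip', '" . date("Y/m/d") . "')");

if (mysql_affected_rows() > 0) {
    // New visitor: the row was just created.
} else {
    // Known visitor: fetch Valid with a single follow-up query.
    $res = mysql_query("SELECT Valid FROM user_info WHERE IP = '$ip'");
    $valid = mysql_result($res, 0);
}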
Also, do not open/close your sql connection at any point in between. The overheads of establishing a new connection could be high.
You could make IP the primary key, which also gives it an index.
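A sketch of what that could look like (the column type is an assumption; a primary key requires IP to be NOT NULL and unique):
ALTER TABLE user_info
  MODIFY IP VARCHAR(45) NOT NULL,   -- 45 characters also fits IPv6 addresses
  ADD PRIMARY KEY (IP);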
So what I am trying to do is make a trending algorithm; I need help with the SQL code, as I can't get it to work.
There are three aspects to the algorithm: (I am completely open to ideas on a better trend algorithm)
1. Plays during 24h / total plays of the song
2. Plays during 7d / total plays of the song
3. Plays during 24h / the play count of the most played item over 24h (whichever item leads the play count over 24h)
Each aspect is to be worth 0.33, for a maximum value of 1.0 being possible.
The third aspect is necessary because newly uploaded items would automatically be at the top unless there was a way to drop them down.
The table is called aud_plays and the columns are:
PlayID: Just an auto-incrementing ID for the table
AID: The id of the song
IP: ip address of the user listening
time: UNIX time code
I have tried a few SQL queries, but I'm pretty stuck and unable to get this to work.
In your aud_songs table (the one that AID points to), add the following columns:
Last24hrPlays INT -- use BIGINT if you plan on getting billion+
Last7dPlays INT
TotalPlays INT
In your aud_plays table create an AFTER INSERT trigger that will increment aud_songs.TotalPlays.
UPDATE aud_songs SET TotalPlays = TotalPlays + 1 WHERE id = INSERTED.aid
Calculating your trending in real time for every request would be taxing on your server, so it's best to just run a job to update the data every ~5 minutes. So create a SQL Agent Job to run every X minutes that updates Last7dPlays and Last24hrPlays.
UPDATE aud_songs SET Last7dPlays = (SELECT COUNT(*) FROM aud_plays WHERE aud_plays.aid = aud_songs.id AND aud_plays.time BETWEEN GetDate()-7 AND GetDate()),
Last24hrPlays = (SELECT COUNT(*) FROM aud_plays WHERE aud_plays.aid = aud_songs.id AND aud_plays.time BETWEEN GetDate()-1 AND GetDate())
I would also recommend removing old records from aud_plays (possibly those older than 7 days, since you will have the TotalPlays trigger).
It should be easy to figure out how to calculate your 1 and 2 (from the question). Here's the SQL for 3.
SELECT cast(Last24hrPlays as float) / (SELECT MAX(Last24hrPlays) FROM aud_songs) FROM aud_songs WHERE aud_songs.id = #ID
NOTE I made the T-SQL pretty generic and unoptimized to illustrate how the process works.
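Since the question itself is about MySQL and time holds UNIX timestamps, a rough MySQL translation might look like this (a sketch; it reuses the aud_songs columns above, and the event scheduler stands in for a SQL Agent Job and must be enabled):
-- Keep TotalPlays current on every new play.
CREATE TRIGGER aud_plays_after_insert
AFTER INSERT ON aud_plays
FOR EACH ROW
  UPDATE aud_songs SET TotalPlays = TotalPlays + 1 WHERE id = NEW.aid;

-- Recalculate the rolling counters every 5 minutes.
CREATE EVENT refresh_play_counts
ON SCHEDULE EVERY 5 MINUTE
DO
  UPDATE aud_songs s
  SET s.Last24hrPlays = (SELECT COUNT(*) FROM aud_plays p
                         WHERE p.aid = s.id AND p.time >= UNIX_TIMESTAMP() - 86400),
      s.Last7dPlays   = (SELECT COUNT(*) FROM aud_plays p
                         WHERE p.aid = s.id AND p.time >= UNIX_TIMESTAMP() - 604800);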
I have currently created a Facebook-like page that pulls notifications from different tables, let's say about 8 tables. Each table has a different structure with different columns, so the first thing that came to mind was to have a global table, like a table of contents, and refresh it with every new hit. I know inserts are resource intensive, but I was hoping that since it is a fairly static table I'd only add maybe one new record every 100 visitors, so I thought "MAYBE" I could get away with this, but I was wrong. I managed to get deadlocks from just three people hammering the website.
So anyways, now I have to redo it using a different method. Initially I was going to use views, but I have an issue with views: the selected table has to contain the id of a user. Here is an example of a select statement from PHP:
$get_events = "
SELECT id, " . $userId . ", 'admin_events', 0, event_start_time
FROM admin_events
WHERE CURDATE() < event_start_time AND
NOT EXISTS(SELECT id
FROM admin_event_registrations
WHERE user_id = " . $userId . " AND admin_events.id = event_id) AND
NOT EXISTS(SELECT id
FROM admin_event_declines
WHERE user_id = " . $userId . " AND admin_events.id = event_id) AND
event_capacity > (SELECT COUNT(*) FROM admin_event_registrations WHERE event_id = admin_events.id)
LIMIT 1
Sorry about the messiness. In any event, as you can see, I need to return the user id from the page as a selected column from the table. I could not figure out how to do that with views, so I don't think views are the way I will be heading, because there are a lot more of these types of queries. I come from an MSSQL background, and I love stored procedures, so if there are stored procedures for MySQL, that would be excellent.
Next I started thinking about temp tables. The table will be in memory, the table will be probably 150 rows max, and there will be no deadlocks. Is it still very expensive to do inserts on a temp table? Will I end up crashing the server? Right now we have maybe 100 users per day, but I want to try to be future proof when we get more users.
After a long thought, I figured that the only way is to use PHP and get all the results as an array. The problem is that I'd get something like:
$my_array[0]["date_created"] = <current_date>
The problem with the above is that I have to sort by date_created, but this is a multi-dimensional array.
Anyway, to pull 150 to 200 records max from a database, which approach would you take? Temp table, view, or PHP?
Some thoughts:
Temp Tables:
Temporary tables only last as long as the session is alive. If you run the code in a PHP script, the temporary table will be destroyed automatically when the script finishes executing.
Views:
These are mainly for hiding complexity, in that you create a view with a join and then access it like a single table. The underlying code is a SELECT statement.
PHP Array:
A bit more cumbersome than SQL to get data from. However, PHP does have some functions to make life easier but no real query language.
Stored Procedures:
There are stored procedures in MySQL - see: http://dev.mysql.com/doc/refman/5.0/en/stored-routines-syntax.html
My Recommendation:
First, re-write your query using the MySQL Query Analyzer: http://www.mysql.com/products/enterprise/query.html
Now I would use PDO to put the values into an array using PHP. This still leaves the initial heavy lifting to the DB engine and keeps you from making multiple calls to the DB server.
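A sketch of that PDO approach, assuming a PDO connection in $pdo and aliasing each source table's date column to date_created so the rows from all 8 tables share one sort key:
<?php
// Sketch only: one source table shown; repeat the block for the other tables.
$rows = array();

$stmt = $pdo->prepare(
    "SELECT id, :uid AS user_id, 'admin_events' AS source, event_start_time AS date_created
     FROM admin_events
     WHERE CURDATE() < event_start_time"
);
$stmt->execute(array(':uid' => $userId));
$rows = array_merge($rows, $stmt->fetchAll(PDO::FETCH_ASSOC));

// Sorting the multi-dimensional array by date_created, newest first:
usort($rows, function ($a, $b) {
    return strcmp($b['date_created'], $a['date_created']);
});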
Try this:
SELECT ae.id, " . $userId . ", 'admin_events', 0, ae.event_start_time
FROM admin_events AS ae
LEFT JOIN admin_event_registrations AS aer
    ON ae.id = aer.event_id
    AND aer.user_id = " . $userId . "
LEFT JOIN admin_event_declines AS aed
    ON ae.id = aed.event_id
    AND aed.user_id = " . $userId . "
WHERE aer.id IS NULL
    AND aed.id IS NULL
    AND CURDATE() < ae.event_start_time
    AND ae.event_capacity > (
        SELECT COUNT(*)
        FROM admin_event_registrations AS aer2
        WHERE aer2.event_id = ae.id)
LIMIT 1
It still has a correlated subquery for the capacity check, but you will find that it is much faster than the other options given. MySQL can join tables easily (they should all use the same storage engine, though). Note that the user_id conditions belong in the ON clauses of the LEFT JOINs: putting them in the WHERE clause alongside the IS NULL checks would contradict them and return no rows. Written this way, the LEFT JOIN / IS NULL pattern acts as an anti-join, and it should reduce your overall query time significantly.
The problem is that you are using correlated subqueries. I imagine that your query takes a little while to run if it's not in the query cache? That's what would be causing your table to lock and causing contention.
Switching the table type to InnoDB would help, but your core problem is your query.
150 to 200 records is a very small amount. MySQL does support stored procedures, but this isn't something you would need them for. Inserts are not resource intensive, but a lot of them at once or in sequence can cause issues; use the bulk (multi-row) insert syntax where you can.
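For reference, bulk insert syntax just means batching several rows into one INSERT. A hypothetical example (the notifications table and its columns are made up for illustration):
-- One round trip instead of three separate INSERT statements.
INSERT INTO notifications (user_id, source, date_created)
VALUES (1, 'admin_events', NOW()),
       (2, 'admin_events', NOW()),
       (3, 'admin_events', NOW());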
I will create 5 tables, namely data1, data2, data3, data4 and data5. Each table can only store 1000 records.
When there is a new entry, or when I want to insert new data, I must do a check:
<?php
$data1 = mysql_query("SELECT * FROM data1");
if (mysql_num_rows($data1) > 1000) {
    $data2 = mysql_query("SELECT * FROM data2");
    if (mysql_num_rows($data2) > 1000) {
        // and so on...
    }
}
I think this is not the right way, is it? I mean, if I am user 4500, it would take some time to do all the checks. Is there any better way to solve this problem?
I haven't decided the numbers; it could be 5000 or 10000 records. The reason is flexibility and portability. Well, one of my SQL gurus suggested I do it this way.
Unless your guru was talking about something like partitioning, I'd seriously doubt his advice. If your database can't handle more than 1000, 5000 or 10000 rows, look for another database. Unless you have a really specific example of how a record limit will help you, it probably won't. With the amount of overhead it adds, it probably only complicates things for no gain.
A properly set up database table can easily handle millions of records. Splitting it into separate tables will most likely increase neither flexibility nor portability. If you accumulate enough records to run into performance problems, congratulate yourself on a job well done and worry about it then.
Read up on how to count rows in MySQL.
Depending on which database engine you are using, COUNT(*) operations on InnoDB tables are quite expensive, so those counts should be maintained by triggers and tracked in an adjacent information table.
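A sketch of that trigger-maintained count, with made-up names for the counter table and triggers:
CREATE TABLE data_row_counts (
  table_name VARCHAR(64) NOT NULL PRIMARY KEY,
  row_count  INT UNSIGNED NOT NULL DEFAULT 0
);

-- Seed the counter once, then keep it in step with data1 (repeat for data2 ... data5).
INSERT INTO data_row_counts VALUES ('data1', (SELECT COUNT(*) FROM data1));

CREATE TRIGGER data1_after_insert AFTER INSERT ON data1
FOR EACH ROW
  UPDATE data_row_counts SET row_count = row_count + 1 WHERE table_name = 'data1';

CREATE TRIGGER data1_after_delete AFTER DELETE ON data1
FOR EACH ROW
  UPDATE data_row_counts SET row_count = row_count - 1 WHERE table_name = 'data1';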
The structure you describe is often designed around a mapping table first. One queries the mapping table to find the destination table associated with a primary key.
You can keep a "tracking" table to keep track of the current table between requests.
Also be on the alert for race conditions (use transactions, or ensure only one process is running at a time).
Also, don't do $data1 = mysql_query("SELECT * FROM data1"); with nested ifs. Do something like:
$i = 1;
do {
    // mysql_result() pulls the COUNT(*) value out of the result set
    $result = mysql_query("SELECT COUNT(*) FROM data$i");
    $rowCount = (int) mysql_result($result, 0);
    $i++;
} while ($rowCount >= 1000);
I'd be surprised if MySQL doesn't have some fancy-pants way to manage this automatically (or at least, better than what I'm about to propose), but here's one way to do it.
1. Insert the record into data.
2. Check the length of data.
3. If it is >= 1000:
- CREATE TABLE dataX LIKE data;
(X will be the number of tables you have + 1)
- INSERT INTO dataX SELECT * FROM data;
- TRUNCATE TABLE data;
This means you will always be inserting into the 'data' table, and 'data1', 'data2', 'data3', etc are your archived versions of that table.
You can create a MERGE table like this:
CREATE TABLE all_data ([col_definitions]) ENGINE=MERGE UNION=(data1,data2,data3,data4,data5);
Then you would be able to count the total rows with a query like SELECT COUNT(*) FROM all_data. (Note that a MERGE table requires all of the underlying tables to be identical MyISAM tables.)
If you're using MySQL 5.1 or above, you can let the database handle this (nearly) automatically using partitioning:
Read this article or the official documentation
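A minimal sketch of what that could look like, with made-up columns, using HASH partitioning to spread rows over 5 partitions inside a single logical table:
CREATE TABLE data (
  id      INT UNSIGNED NOT NULL AUTO_INCREMENT,
  payload VARCHAR(255) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB
  PARTITION BY HASH(id)
  PARTITIONS 5;

-- The application just inserts into data; MySQL decides which partition each row lands in.
INSERT INTO data (payload) VALUES ('example row');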