I'm trying to get a function to run so that, say one minute after a message has been posted to the database, it removes that row.
My database looks like this:
CREATE TABLE `messages` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(35) NOT NULL,
`account_number` int(25) NOT NULL,
`message_content` text NOT NULL,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`status` enum('1','0') NOT NULL,
`txt` varchar(20) NOT NULL DEFAULT 'Unread',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='Table for storing notifications/messages' AUTO_INCREMENT=27 ;
My current code looks like:
$sql = "DELETE FROM messages WHERE created < DATE_SUB( NOW( ) , INTERVAL 2 MINUTE )";
$this->db->query($sql);
Basically, like I said, after one or two minutes this needs to remove the message from the database by the ID it's grabbing. How does one achieve this? The code I have shown above isn't working. I am using ActiveRecord, if that helps.
Thanks in advance.
If you only want to delete a single message by ID, a better strategy would be to have a scheduled job such as cron delete records at regular intervals, or to have a process other than your web server perform the deletions. Then you can flag the records that you want deleted, or pass the IDs to that process.
Does that help?
There are a few ways to tackle this, depending on your access to your server.
Use MySQL events (http://dev.mysql.com/doc/refman/5.1/en/events.html). These let you schedule your SQL query to run automatically, e.g. every minute (see the sketch after the reference query below). You do, however, need sufficient privileges on your database to use this (the EVENT privilege to create events, and SUPER to turn the event scheduler on).
Use crontab (Unix) or a scheduled task (Windows) to execute a PHP script containing your SQL query every minute.
Put your SQL query at the top of your page. This is of course not perfect performance-wise, but depending on how busy your site/application is or how big your database/table is, I doubt you'll notice the difference.
For reference, this SQL will work:
DELETE FROM messages WHERE created < (NOW() - INTERVAL 30 SECOND)
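And for the first option, here is a minimal sketch of the event-based variant (the event name is illustrative; it assumes the event scheduler is enabled and you have the EVENT privilege):

-- Turn the scheduler on once (needs SUPER): SET GLOBAL event_scheduler = ON;
CREATE EVENT purge_old_messages
ON SCHEDULE EVERY 1 MINUTE
DO
  DELETE FROM messages
  WHERE created < (NOW() - INTERVAL 2 MINUTE);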
I have two tables: music and listenTrack. listenTrack tracks the unique plays of each song. I am trying to get results for the popular songs of the month. I'm getting my results, but they are just taking too long. Below are my tables and query.
430,000 rows
CREATE TABLE `listentrack` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`sessionId` varchar(50) NOT NULL,
`url` varchar(50) NOT NULL,
`date_created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`ip` varchar(150) NOT NULL,
`user_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=731306 DEFAULT CHARSET=utf8
12500 rows
CREATE TABLE `music` (
`music_id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`title` varchar(50) DEFAULT NULL,
`artist` varchar(50) DEFAULT NULL,
`description` varchar(255) DEFAULT NULL,
`genre` int(4) DEFAULT NULL,
`file` varchar(255) NOT NULL,
`url` varchar(50) NOT NULL,
`allow_download` int(2) NOT NULL DEFAULT '1',
`plays` bigint(20) NOT NULL,
`downloads` bigint(20) NOT NULL,
`faved` bigint(20) NOT NULL,
`dateadded` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`music_id`)
) ENGINE=MyISAM AUTO_INCREMENT=15146 DEFAULT CHARSET=utf8
SELECT COUNT(listenTrack.url) AS total, listenTrack.url
FROM listenTrack
LEFT JOIN music ON music.url = listenTrack.url
WHERE DATEDIFF(DATE(date_created),'2009-08-15') = 0
GROUP BY listenTrack.url
ORDER BY total DESC
LIMIT 0,10
This query isn't very complex and the rows aren't too large, I don't think.
Is there any way to speed this up? Or can you suggest a better solution? This is going to be a cron job at the beginning of every month, but I would also like to do by-the-day results as well.
Oh, by the way, I am running this locally; it takes over 4 minutes to run, but on prod it takes about 45 seconds.
I'm more of a SQL Server guy but these concepts should apply.
I'd add indexes:
On listenTrack, add an index on (url, date_created)
On music, add an index on (url)
These indexes should speed the query up tremendously (I originally had the table names mixed up - fixed in the latest edit).
For the most part you should also index any column that is used in a JOIN. In your case, you should index both listentrack.url and music.url
@jeff s - An index on date_created alone wouldn't help, because you are running that column through a function first, so MySQL cannot use an index on it. Often you can rewrite a query so that the indexed column is referenced directly, like:
DATEDIFF(DATE(date_created),'2009-08-15') = 0
becomes
date_created >= '2009-08-15' AND date_created < '2009-08-16'
This will filter down to records from 2009-08-15 and allow any indexes on that column to be candidates. Note that MySQL might still NOT use that index; it depends on other factors.
Your best bet is to make a dual index on listentrack(url, date_created)
and then another index on music.url
These 2 indexes will cover this particular query.
Note that if you run EXPLAIN on this query you are still going to see "Using filesort", because it has to sort the grouped records (via a temporary table) to do the ORDER BY.
In general you should always run your query under EXPLAIN to get an idea on how MySQL will execute the query and then go from there. See the EXPLAIN documentation:
http://dev.mysql.com/doc/refman/5.0/en/using-explain.html
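A hedged sketch of those two suggestions combined (index names are illustrative, and the column order of the composite index may be worth experimenting with, as noted elsewhere in this thread):

ALTER TABLE listentrack ADD INDEX idx_url_created (url, date_created);
ALTER TABLE music ADD INDEX idx_url (url);

-- Same report, but with a sargable date range instead of DATEDIFF():
SELECT COUNT(listenTrack.url) AS total, listenTrack.url
FROM listenTrack
LEFT JOIN music ON music.url = listenTrack.url
WHERE listenTrack.date_created >= '2009-08-15'
  AND listenTrack.date_created < '2009-08-16'
GROUP BY listenTrack.url
ORDER BY total DESC
LIMIT 0, 10;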
Try creating an index that will help with the join:
CREATE INDEX idx_url ON music (url);
I think I might have missed the obvious before. Why are you joining the music table at all? You do not appear to be using the data in that table, and you are performing a left join which is not required, right? I think having this table in the query will make it much slower and will not add any value. Take all references to music out, unless the url match is required, in which case you need an inner join so that rows without a matching value are excluded.
I would add new indexes, as the others mention. Specifically I would add:
music (url)
listentrack (date_created, url)
This will improve your join a ton.
Then I would look at the query; you are forcing the system to perform work on each row of the table. It would be better to rephrase the date restriction as a range.
Not sure of the syntax off the top of my head:
WHERE date_created >= '2009-08-15 00:00:00' AND date_created < '2009-08-16 00:00:00'
That should allow it to rapidly use the index to locate the appropriate records. The combined two-column index on listentrack should allow it to find the records based on the date and URL. You should experiment; the columns might be better off in the other order, (url, date_created).
The EXPLAIN plan for this query should say "Using index" in the right-hand column for both tables. That means it will not have to hit the data in the table to calculate your counts.
I would also check the memory settings that you have configured for MySQL. It sounds like you do not have enough memory allocated. Be very careful about the differences between server-level settings and per-thread settings: a server with a 10MB cache is pretty small, while a thread with a 10MB cache can use a lot of memory very quickly.
Jacob
Pre-grouping and then joining makes things a lot faster with MySQL/MyISAM. (I suspect this is less necessary with other DBs.)
This should perform about as fast as the non-joined version:
SELECT
total, a.url, title
FROM
(
SELECT COUNT(*) as total, url
from listenTrack
WHERE DATEDIFF(DATE(date_created),'2009-08-15') = 0
GROUP BY url
ORDER BY total DESC
LIMIT 0,10
) as a
LEFT JOIN music ON music.url = a.url
;
P.S. - Mapping between the two tables with an id instead of a url is sound advice.
Why are you repeating the url in both tables?
Have listentrack hold a music_id instead, and join on that. Gets rid of the text search as well as the extra index.
Besides, it's arguably more correct. You're tracking the times that a particular track was listened to, not the url. What if the url changes?
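A rough sketch of that refactor (column and index names are illustrative, and the backfill assumes every listentrack.url has exactly one matching music row; MyISAM won't enforce a foreign key, so this relies on the backfill being correct):

ALTER TABLE listentrack ADD COLUMN music_id INT NULL;

UPDATE listentrack lt
JOIN music m ON m.url = lt.url
SET lt.music_id = m.music_id;

ALTER TABLE listentrack ADD INDEX idx_music_created (music_id, date_created);

-- The monthly/daily report then joins on the integer key:
SELECT COUNT(*) AS total, m.music_id, m.title
FROM listentrack lt
JOIN music m ON m.music_id = lt.music_id
WHERE lt.date_created >= '2009-08-15' AND lt.date_created < '2009-08-16'
GROUP BY m.music_id, m.title
ORDER BY total DESC
LIMIT 10;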
After you add indexes, you may want to explore adding a new column that stores date_created as a Unix timestamp, which can make these comparisons quicker.
I am not certain why you have the diff function though, as it appears you are looking for all rows that were updated on a particular date.
You may want to look at your query as it seems to have an error.
If you use unit tests then you can compare the results of your query and a query using a unix timestamp instead.
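If you do go down that road, a minimal sketch might look like this (the column name is illustrative; benchmark before committing to it):

ALTER TABLE listentrack ADD COLUMN created_unix INT UNSIGNED NULL;
UPDATE listentrack SET created_unix = UNIX_TIMESTAMP(date_created);

-- The date filter then becomes a plain integer range:
-- WHERE created_unix >= UNIX_TIMESTAMP('2009-08-15')
--   AND created_unix <  UNIX_TIMESTAMP('2009-08-16')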
You might want to add an index to the url field of both tables.
Having said that, when I converted from MySQL to SQL Server 2008, with the same queries and the same database structures, the queries ran 1-3 orders of magnitude faster.
I think some of it had to do with the RDBMS (MySQL's optimizer is not so good...) and some of it might have had to do with how the RDBMS reserves system resources, although the comparisons were made on production systems where only the DB would run.
The statements below would probably help speed up the query.
CREATE INDEX music_url_index ON music (url) USING BTREE;
CREATE INDEX listenTrack_url_index ON listenTrack (url) USING BTREE;
You really need to know the total number of comparisons and row scans that are happening. To get that answer, look at the example here of how to do that using EXPLAIN: http://www.siteconsortium.com/h/p1.php?id=mysql002.
I have a table with 5 simple fields. The total number of rows in the table is about 250.
When I run a single DELETE query from phpMyAdmin, it is processed in 0.05 sec (always).
The problem is that my PHP application (PDO connection) runs the same query among other queries, and there this query is extremely slow (about 10 sec), as is another SELECT query on a table with 5 rows (about 1 sec). It happens only sometimes!
The other queries (about 100) always complete with normal response times.
What could the problem be, or how can I find out what it is?
Table:
CREATE TABLE `list_ip` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`type` CHAR(20) NOT NULL DEFAULT '',
`address` CHAR(50) NOT NULL DEFAULT '',
`description` VARCHAR(50) NOT NULL DEFAULT '',
`datetime` DATETIME NOT NULL DEFAULT '1000-01-01 00:00:00',
PRIMARY KEY (`id`),
INDEX `address` (`address`),
INDEX `type` (`type`),
INDEX `datetime` (`datetime`) ) COLLATE='utf8_general_ci' ENGINE=InnoDB;
Query:
DELETE FROM list_ip WHERE address='1.2.3.4' AND type='INT' AND datetime<='2017-12-06 08:04:30';
As I said before, the table has only 250 rows. The size of the table is 96 KiB.
I also tested with an empty table and it's slow too.
Wrap your query in EXPLAIN and see if it's doing a sequential scan rather than using indexes. EXPLAIN would be my first stop in determining whether I have a data model problem (bad or missing indexes would be one such issue).
About EXPLAIN: https://dev.mysql.com/doc/refman/5.7/en/explain.html
Another tool I'd recommend is running 'mytop' and looking at the server activity/load during those times when it's bogging down. http://jeremy.zawodny.com/mysql/mytop/
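A minimal sketch of that first step against the schema above (the composite index name is illustrative; it simply matches the WHERE clause in case the plan shows a full scan):

EXPLAIN DELETE FROM list_ip
WHERE address='1.2.3.4' AND type='INT' AND datetime<='2017-12-06 08:04:30';

-- If the plan looks poor, a composite index covering the whole predicate could be tried:
ALTER TABLE list_ip ADD INDEX idx_addr_type_dt (address, type, datetime);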
There was a network problem. I uninstalled a Docker app along with some network peripherals and it looks much better now.
I am writing a little PHP script which simply returns data from a MySQL table using the query below:
"SELECT * FROM data where status='0' limit 1";
After reading the data, I update the status using the ID of that particular row with the query below:
"Update data set status='1' WHERE id=" . $db_field['id'];
Things are working well for a single client. Now I want to make this particular page available to multiple clients. There are more than 20 clients which will access the same page at almost the same time, continuously (24/7). Is there a possibility that two or more clients read the same data from the table? If yes, how do I solve it?
Thanks
You are right to consider concurrency. Unless you have only 1 PHP thread responding to client requests, there's really nothing to stop them each from handing out the same row from data to be processed - in fact, since they will each run the same query, they'll each almost certainly hand out the same row.
The easiest way to solve that problem is locking, as suggested in the accepted answer. That may work if the time the PHP server thread takes to run the SELECT ... FOR UPDATE or LOCK TABLES ... UNLOCK TABLES (non-transactional) is minimal, such that other threads can wait while each thread runs this code (it's still wasteful, as they could be processing some other data row, but more on that later).
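For reference, a hedged sketch of that locking variant, using the table and column names from the question (requires InnoDB; :claimed_id is a placeholder for the id read back in PHP):

START TRANSACTION;
SELECT * FROM data WHERE status = '0' ORDER BY id LIMIT 1 FOR UPDATE;  -- row stays locked until COMMIT
UPDATE data SET status = '1' WHERE id = :claimed_id;
COMMIT;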
There is a better solution, though it requires a schema change. Imagine you have a table such as this:
CREATE TABLE `data` (
`data_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`data` blob,
`status` tinyint(1) DEFAULT '0',
PRIMARY KEY (`data_id`)
) ENGINE=InnoDB;
You don't have any way to transactionally update "the next processed record" because the only field you have to update is status. But imagine your table looks more like this:
CREATE TABLE `data` (
`data_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`data` blob,
`status` tinyint(1) DEFAULT '0',
`processing_id` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`data_id`)
) ENGINE=InnoDB;
Then you can write a query something like this to claim the "next" row to be processed with your processing ID:
UPDATE data
SET processing_id = @unique_processing_id
WHERE processing_id IS NULL AND status = 0 LIMIT 1;
And any SQL engine worth a darn will make sure you don't have 2 distinct processing IDs accounting for the same record to be processed at the same time. Then at your leisure, you can
SELECT * FROM data WHERE processing_id = @unique_processing_id;
and know that you're getting a unique record every time.
This approach also lends itself well to durability concerns; you're basically identifying the batch processing run per data row, meaning you can account for each batch job, whereas before you were potentially only accounting for the data rows.
I would probably implement the @unique_processing_id by adding a second table for this metadata (the auto-increment key is the real trick to this, but other data processing metadata could be added):
CREATE TABLE `data_processing` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`data_id` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
and using that as a source for your unique IDs, you might end up with something like:
INSERT INTO data_processing SET date=NOW();
SET @unique_processing_id = (SELECT LAST_INSERT_ID());
UPDATE data
SET processing_id = @unique_processing_id
WHERE status = 0 LIMIT 1;
UPDATE data
JOIN data_processing ON data_processing.id = data.processing_id
SET data_processing.data_id = data.data_id;
SELECT * FROM data WHERE processing_id = @unique_processing_id;
-- you are now ready to marshal the data to the client ... and ...
UPDATE data SET status = 1
WHERE status = 0
AND processing_id = @unique_processing_id
LIMIT 1;
Thus you solve your concurrency problem, and put yourself in better shape to audit for durability as well, depending on how you set up the data_processing table; you could track thread IDs, processing state, etc. to help verify that the data is really done being processed.
There are other solutions - a message queue might be ideal, allowing you to queue each unprocessed data object's ID to the clients directly (or through a PHP script) and then provide an interface for that data to be retrieved and marked processed separately from the queueing of the "next" data. But as far as "MySQL-only" solutions go, the concepts behind what I've shown you here should serve you pretty well.
The answer you seek might be to use transactions. I suggest you read the following post and its accepted answer:
PHP + MySQL transactions examples
If not, there is also table locking you should look at:
13.3.5 LOCK TABLES and UNLOCK TABLES
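A rough sketch of the table-locking variant, with the question's table and column names (:claimed_id is a placeholder for the id read back in PHP; error handling omitted):

LOCK TABLES data WRITE;
SELECT * FROM data WHERE status = '0' LIMIT 1;
UPDATE data SET status = '1' WHERE id = :claimed_id;
UNLOCK TABLES;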
I suggest you use sessions for this: you can save that ID in the session, so if one client is already working on that record, you can prevent another client from accessing it.
I'm trying to make a queue using MySQL (I know, shame on me!). The way I have it set up, an update is done to set a receiver ID on a queue item; after the update takes place, I select the updated item by the receiver ID.
The problem I'm facing is that when I run the update and then do the select, the select query returns true instead of a result set. This seems to happen when a rapid burst of requests is made.
Does anyone have any idea why this is happening?
Thanks in advance.
Schema:
CREATE TABLE `Queue` (
`id` char(11) NOT NULL DEFAULT '',
`status` varchar(20) NOT NULL DEFAULT '',
`createdAt` datetime DEFAULT NULL,
`receiverId` char(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Dequeue:
update `'.self::getTableName().'`
set
`status` = 'queued',
`receiverId` = '%s'
where
`status` = 'queued'
and `receiverId` is null
order by id
limit 1;
select
*
from
`'.self::getTableName().'`
where
`receiverId` = \'%s\'
order by id
desc limit 1
This sounds like a race condition of some kind. You're using MyISAM, so it's possible an update might be deferred (especially if there's a lot of traffic on that table).
The true return indicates that your select query completed properly but returned an empty result set (no rows). If your logic when that happens is to wait, say, 50 milliseconds, and try again, you may find that things work correctly.
Edit: You could try locking the table from before you do the UPDATE until you've done the last SELECT. But that might foul up the performance of other parts of your app. The best thing to do is make your app robust in the face of race conditions.
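A rough sketch of the table-locking idea, with the receiver ID hard-coded for illustration (your real code builds the table name and receiver ID dynamically):

LOCK TABLES Queue WRITE;
UPDATE Queue
SET status = 'queued', receiverId = 'worker-1'
WHERE status = 'queued' AND receiverId IS NULL
ORDER BY id
LIMIT 1;
SELECT * FROM Queue WHERE receiverId = 'worker-1' ORDER BY id DESC LIMIT 1;
UNLOCK TABLES;

That guarantees the SELECT sees the row the UPDATE just claimed, at the cost of serializing every dequeue.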
I'm adding an "activity log" to a busy website; it should show the user the last N actions relevant to them and allow going to a dedicated page to view all the actions, search them, etc.
The DB used is MySQL and I'm wondering how the log should be stored. I've started with a single MyISAM table used for FULLTEXT searches. To avoid extra select queries, on every action: 1) an insert into that table happens, and 2) the APC cache for each user is updated, so on the next page request MySQL is not used. The cache has a long lifetime, and if it's missing, the first AJAX request from the user recreates it.
I'm caching the last 3 events for each user, so when a new event happens, I grab the current cache, add the new event at the beginning, and remove the oldest event, so there are always 3 of them in the cache. Every page of the site has a small box displaying them.
Is this a proper setup? How would you recommend implementing this sort of feature?
The schema I have is:
CREATE DATABASE `audit`;
CREATE TABLE `event` (
`eventid` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`userid` INT UNSIGNED NOT NULL ,
`createdat` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ,
`message` VARCHAR( 255 ) NOT NULL ,
`comment` TEXT NOT NULL
) ENGINE = MYISAM CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER DATABASE `audit` DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE `audit`.`event` ADD FULLTEXT `search` (
`message` ( 255 ) ,
`comment` ( 255 )
);
Based on your schema, I'm guessing that (caching aside) you'll be inserting many records per second, and running fairly infrequent queries along the lines of select * from event where userid = ? order by createdat desc, probably with a paging strategy (thus requiring "limit x" at the end of the query) to show the user their history.
You probably also want to find all users affected by a particular type of event - though more likely in an off-line process (e.g. a nightly mail to all users who have updated their password); that might require a query along the lines of select userid from event where message like 'password_updated'.
Are there likely to be many cases where you want to search the body text of the comment?
You should definitely read the MySQL Manual on tuning for inserts; if you don't need to search the free-text "comment", I'd leave that index off. I'd also consider a regular index on the "message" column.
It might also make sense to introduce the concept of a "message_type" so you can enforce relational consistency (rather than relying on your code to correctly spell "password_updat3"). For instance, you might have an "event_type" table with a foreign key relationship to your event table.
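A hedged sketch of that idea (table and column names are illustrative; enforcing the foreign key would also mean converting the event table from MyISAM to InnoDB):

CREATE TABLE `event_type` (
  `event_type_id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `name` VARCHAR(64) NOT NULL UNIQUE
) ENGINE=InnoDB CHARACTER SET utf8 COLLATE utf8_unicode_ci;

-- event would then reference it instead of free-text messages:
ALTER TABLE `event`
  ADD COLUMN `event_type_id` INT UNSIGNED NULL,
  ADD CONSTRAINT `fk_event_type`
    FOREIGN KEY (`event_type_id`) REFERENCES `event_type` (`event_type_id`);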
As for caching - I'm guessing users would only visit their history page infrequently. Populating the cache when they visit the site, on the off-chance they might visit their history (if I've understood your design), immediately limits the scalability of your solution to how many history records you can fit into your cache; as the history table will grow very quickly for your users, this could quickly become a significant factor.
For data like this, which moves quickly and is rarely visited, caching may not be the right solution.
This is how Prestashop does it:
CREATE TABLE IF NOT EXISTS `ps_log` (
`id_log` int(10) unsigned NOT NULL AUTO_INCREMENT,
`severity` tinyint(1) NOT NULL,
`error_code` int(11) DEFAULT NULL,
`message` text NOT NULL,
`object_type` varchar(32) DEFAULT NULL,
`object_id` int(10) unsigned DEFAULT NULL,
`id_employee` int(10) unsigned DEFAULT NULL,
`date_add` datetime NOT NULL,
`date_upd` datetime NOT NULL,
PRIMARY KEY (`id_log`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=6 ;
My advice would be to use a schemaless storage system; they perform better for high-volume logging data.
Consider:
Redis
MongoDB
Riak
Or any other NoSQL system