mysqli::query() returns true on select queries - php

I'm trying to make a queue using MySQL (I know, shame on me!). The way I have it set up, an update sets a receiver ID on a queue item; after the update takes place, I select the updated item by the receiver ID.
The problem I'm facing is that when I run the update and then do the select, the select query returns true instead of a result set. This seems to happen when many requests are made in rapid succession.
Does anyone have any idea why this is happening?
Thanks in advance.
Schema:
CREATE TABLE `Queue` (
`id` char(11) NOT NULL DEFAULT '',
`status` varchar(20) NOT NULL DEFAULT '',
`createdAt` datetime DEFAULT NULL,
`receiverId` char(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Dequeue:
update `'.self::getTableName().'`
set
`status` = 'queued',
`receiverId` = '%s'
where
`status` = 'queued'
and `receiverId` is null
order by id
limit 1;
select
*
from
`'.self::getTableName().'`
where
`receiverId` = \'%s\'
order by id
desc limit 1

This sounds like a race condition of some kind. You're using MyISAM, so it's possible an update might be deferred (especially if there's a lot of traffic on that table).
The true return indicates that your select query completed properly but returned an empty result set (no rows). If your logic when that happens is to wait, say, 50 milliseconds and try again, you may find that things work correctly.
Edit: You could try locking the table from before you do the UPDATE until you've done the last SELECT. But that might foul up the performance of other parts of your app. The best thing to do is make your app robust in the face of race conditions.
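The table-locking idea could look roughly like the sketch below. It assumes the table is literally named Queue, and 'abc123' is a hypothetical stand-in for the receiver ID that the PHP code substitutes for %s. The assignment to status is left out here only because the original statement sets it to the value it already has.

```sql
-- Hold a write lock so no other connection can read or write Queue
-- between the UPDATE and the SELECT (MyISAM has no transactions).
LOCK TABLES `Queue` WRITE;

-- Claim the oldest unclaimed item for this receiver.
UPDATE `Queue`
SET `receiverId` = 'abc123'          -- hypothetical receiver ID
WHERE `status` = 'queued'
  AND `receiverId` IS NULL
ORDER BY `id`
LIMIT 1;

-- Read back the item that was just claimed.
SELECT *
FROM `Queue`
WHERE `receiverId` = 'abc123'
ORDER BY `id` DESC
LIMIT 1;

UNLOCK TABLES;
```

If the SELECT comes back with no rows, nothing was available to claim; the caller can treat that as an empty queue and retry later.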

Related

Slow GroupBy query for filter results Laravel [duplicate]

I have 2 tables: music and listenTrack. listenTrack tracks the unique plays of each song. I am trying to get results for popular songs of the month. I'm getting my results, but they are just taking too long. Below are my tables and query.
430,000 rows
CREATE TABLE `listentrack` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`sessionId` varchar(50) NOT NULL,
`url` varchar(50) NOT NULL,
`date_created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`ip` varchar(150) NOT NULL,
`user_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=731306 DEFAULT CHARSET=utf8
12500 rows
CREATE TABLE `music` (
`music_id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`title` varchar(50) DEFAULT NULL,
`artist` varchar(50) DEFAULT NULL,
`description` varchar(255) DEFAULT NULL,
`genre` int(4) DEFAULT NULL,
`file` varchar(255) NOT NULL,
`url` varchar(50) NOT NULL,
`allow_download` int(2) NOT NULL DEFAULT '1',
`plays` bigint(20) NOT NULL,
`downloads` bigint(20) NOT NULL,
`faved` bigint(20) NOT NULL,
`dateadded` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`music_id`)
) ENGINE=MyISAM AUTO_INCREMENT=15146 DEFAULT CHARSET=utf8
SELECT COUNT(listenTrack.url) AS total, listenTrack.url
FROM listenTrack
LEFT JOIN music ON music.url = listenTrack.url
WHERE DATEDIFF(DATE(date_created),'2009-08-15') = 0
GROUP BY listenTrack.url
ORDER BY total DESC
LIMIT 0,10
This query isn't very complex and the tables aren't too large, I don't think.
Is there any way to speed this up? Or can you suggest a better solution? This is going to be a cron job at the beginning of every month, but I would also like to get per-day results as well.
Oh, by the way, I am running this locally, where it takes over 4 minutes to run; on prod it takes about 45 seconds.
I'm more of a SQL Server guy but these concepts should apply.
I'd add indexes:
On ListenTrack, add an index with url, and date_created
On Music, add an index with url
These indexes should speed the query up tremendously (I originally had the table names mixed up - fixed in the latest edit).
For the most part you should also index any column that is used in a JOIN. In your case, you should index both listentrack.url and music.url
@jeff s - An index on listentrack.date_created wouldn't help because you are running that column through a function first, so MySQL cannot use an index on it. Often, you can rewrite a query so that the indexed column is compared directly, like:
DATEDIFF(DATE(date_created),'2009-08-15') = 0
becomes
date_created >= '2009-08-15' and date_created < '2009-08-16'
This will filter down to records from 2009-08-15 and allow any indexes on that column to be candidates. Note that MySQL might NOT use that index; it depends on other factors.
Your best bet is to make a dual index on listentrack(url, date_created)
and then another index on music.url
These 2 indexes will cover this particular query.
Note that if you run EXPLAIN on this query you are still going to get a using filesort because it has to write the records to a temporary table on disk to do the ORDER BY.
In general you should always run your query under EXPLAIN to get an idea on how MySQL will execute the query and then go from there. See the EXPLAIN documentation:
http://dev.mysql.com/doc/refman/5.0/en/using-explain.html
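Put together, the two indexes and the range rewrite might look like the sketch below. The index names are made up for illustration, and the table name is written in lowercase to match the CREATE TABLE statement, since table names can be case-sensitive depending on the platform.

```sql
-- Composite index on the large table, single-column index for the join.
CREATE INDEX idx_listentrack_url_date ON listentrack (url, date_created);
CREATE INDEX idx_music_url ON music (url);

-- Ask MySQL how it plans to run the rewritten query.
EXPLAIN
SELECT COUNT(listentrack.url) AS total, listentrack.url
FROM listentrack
LEFT JOIN music ON music.url = listentrack.url
WHERE date_created >= '2009-08-15'
  AND date_created <  '2009-08-16'
GROUP BY listentrack.url
ORDER BY total DESC
LIMIT 0,10;
```

In the EXPLAIN output, the key column shows which index (if any) was chosen for each table, and the rows column gives the optimizer's estimate of how many rows it expects to examine.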
Try creating an index that will help with the join:
CREATE INDEX idx_url ON music (url);
I think I might have missed the obvious before. Why are you joining the music table at all? You do not appear to be using any data from that table, and the left join is not required, right? I think having this table in the query will make it much slower without adding any value. Take all references to music out, unless you need to restrict the results to urls that exist in music, in which case you need an inner join so that rows without a matching value are excluded.
I would add new indexes, as the others mention. Specifically I would add:
music url
listentrack date_created,url
This will improve your join a ton.
Then I would look at the query: you are forcing the system to perform work on each row of the table. It would be better to rephrase the date restriction as a range.
Not sure of the syntax off the top of my head:
where date_created >= '2009-08-15 00:00:00' and date_created < '2009-08-16 00:00:00'
That should allow it to rapidly use the index to locate the appropriate records. The combined two-key index on listentrack should allow it to find the records based on the date and URL. You should experiment; the columns might be better off going in the other direction (url, date_created) in the index.
The explain plan for this query should say "using index" on the right hand column for both. That means that it will not have to hit the data in the table to calculate your sums.
I would also check the memory settings that you have configured for MySQL. It sounds like you do not have enough memory allocated. Be very careful on the differences between server based settings and thread based settings. The server with a 10MB cache is pretty small, a thread with a 10MB cache can use a lot of memory quickly.
Jacob
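As a rough sketch of the range approach applied to the monthly report, assuming the join to music is dropped and the index name is free to choose:

```sql
-- Date first so the range condition can narrow the scan; url second
-- so the grouping column can be read from the index as well.
CREATE INDEX idx_listentrack_date_url ON listentrack (date_created, url);

-- Top 10 urls for August 2009, using a half-open date range.
SELECT COUNT(*) AS total, url
FROM listentrack
WHERE date_created >= '2009-08-01 00:00:00'
  AND date_created <  '2009-09-01 00:00:00'
GROUP BY url
ORDER BY total DESC
LIMIT 0,10;
```

The half-open range (>= start, < next start) avoids having to spell out the last second of the month, and the same pattern works for a single day.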
Pre-grouping and then joining makes things a lot faster with MySQL/MyISAM. (I suspect less of this is needed with other DBs.)
This should perform about as fast as the non-joined version:
SELECT
total, a.url, title
FROM
(
SELECT COUNT(*) as total, url
from listenTrack
WHERE DATEDIFF(DATE(date_created),'2009-08-15') = 0
GROUP BY url
ORDER BY total DESC
LIMIT 0,10
) as a
LEFT JOIN music ON music.url = a.url
;
P.S. - Mapping between the two tables with an id instead of a url is sound advice.
Why are you repeating the url in both tables?
Have listentrack hold a music_id instead, and join on that. Gets rid of the text search as well as the extra index.
Besides, it's arguably more correct. You're tracking the times that a particular track was listened to, not the url. What if the url changes?
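A possible migration to a music_id reference is sketched below. The index name is an assumption. One detail worth noting: date_created is declared with ON UPDATE CURRENT_TIMESTAMP, so the backfill assigns the column to itself to keep the original timestamps from being reset.

```sql
-- Add the numeric reference to the play log.
ALTER TABLE listentrack ADD COLUMN music_id INT(11) DEFAULT NULL;

-- Backfill it from the existing url match.
UPDATE listentrack
JOIN music ON music.url = listentrack.url
SET listentrack.music_id = music.music_id,
    listentrack.date_created = listentrack.date_created;  -- keep old timestamp

CREATE INDEX idx_listentrack_music_date ON listentrack (music_id, date_created);

-- Same report, joined on the integer key instead of the url string.
SELECT COUNT(*) AS total, music.music_id, music.title
FROM listentrack
JOIN music ON music.music_id = listentrack.music_id
WHERE listentrack.date_created >= '2009-08-15'
  AND listentrack.date_created <  '2009-08-16'
GROUP BY music.music_id, music.title
ORDER BY total DESC
LIMIT 0,10;
```

Once everything reads from music_id, the url column on listentrack could be dropped, but that is a separate decision.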
After you add indexes then you may want to explore adding a new column for the date_created to be a unix_timestamp, which will make math operations quicker.
I am not certain why you have the diff function though, as it appears you are looking for all rows that were updated on a particular date.
You may want to look at your query as it seems to have an error.
If you use unit tests then you can compare the results of your query and a query using a unix timestamp instead.
You might want to add an index to the url field of both tables.
Having said that, when I converted from MySQL to SQL Server 2008, with the same queries and same database structures, the queries ran 1-3 orders of magnitude faster.
I think some of it had to do with the RDBMS (MySQL's optimizers are not so good...) and some of it might have had to do with how each RDBMS reserves system resources. The comparisons were made on production systems where only the database was running.
This below would probably work to speed up the query.
CREATE INDEX music_url_index ON music (url) USING BTREE;
CREATE INDEX listenTrack_url_index ON listenTrack (url) USING BTREE;
You really need to know the total number of comparisons and row scans that are happening. To get that answer look at the code here of how to do that using explain http://www.siteconsortium.com/h/p1.php?id=mysql002.

Logical Help Needed in a PHP Script

I am writing a little PHP script which simply returns data from a MySQL table using the query below:
"SELECT * FROM data where status='0' limit 1";
After reading the data I update the status, using the id of that particular row, with the query below:
"Update data set status='1' WHERE id=" . $db_field['id'];
Things are working well for a single client. Now I want to make this page available to multiple clients. There are more than 20 clients which will access the same page at almost the same time, continuously (24/7). Is there a possibility that two or more clients read the same data from the table? If yes, how do I solve it?
Thanks
You are right to consider concurrency. Unless you have only 1 PHP thread responding to client requests, there's really nothing to stop them each from handing out the same row from data to be processed - in fact, since they will each run the same query, they'll each almost certainly hand out the same row.
The easiest way to solve that problem is locking, as suggested in the accepted answer. That may work if the time the PHP server thread takes to run the SELECT...FOR UPDATE or LOCK TABLE ... UNLOCK TABLES (non-transactional) is minimal, such that other threads can wait while each thread runs this code ( it's still wasteful, as they could be processing some other data row, but more on that later).
There is a better solution, though it requires a schema change. Imagine you have a table such as this:
CREATE TABLE `data` (
`data_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`data` blob,
`status` tinyint(1) DEFAULT '0',
PRIMARY KEY (`data_id`)
) ENGINE=InnoDB;
You don't have any way to transactionally update "the next processed record" because the only field you have to update is status. But imagine your table looks more like this:
CREATE TABLE `data` (
`data_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`data` blob,
`status` tinyint(1) DEFAULT '0',
`processing_id` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`data_id`)
) ENGINE=InnoDB;
Then you can write a query something like this to mark the "next" row to be processed with your 'processing id':
UPDATE data
SET processing_id = @unique_processing_id
WHERE processing_id IS NULL and status = 0 LIMIT 1;
And any SQL engine worth a darn will make sure you don't have 2 distinct processing IDs accounting for the same record to be processed at the same time. Then at your leisure, you can
SELECT * FROM data WHERE processing_id = @unique_processing_id;
and know that you're getting a unique record every time.
This approach also lends itself well to durability concerns; you're basically identifying the batch processing run per data row, meaning you can account for each batch job, whereas before you were potentially only accounting for the data rows.
I would probably implement the @unique_processing_id by adding a second table for this metadata (the auto-increment key is the real trick to this, but other data processing metadata could be added):
CREATE TABLE `data_processing` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`data_id` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
and using that as a source for your unique IDs, you might end up with something like:
INSERT INTO data_processing SET date=NOW();
SET @unique_processing_id = (SELECT LAST_INSERT_ID());
UPDATE data
SET processing_id = @unique_processing_id
WHERE processing_id IS NULL AND status = 0 LIMIT 1;
UPDATE data
JOIN data_processing ON data_processing.id = data.processing_id
SET data_processing.data_id = data.data_id
WHERE data_processing.id = @unique_processing_id;
SELECT * from data WHERE processing_id = @unique_processing_id;
-- you are now ready to marshal the data to the client ... and ...
UPDATE data SET status = 1
WHERE status = 0
AND processing_id = @unique_processing_id
LIMIT 1;
Thus solving your concurrency problem, and putting you in better shape to audit for durability as well, depending on how you set up data_processing table; you could track thread IDs, processing state, etc. to help verify that the data is really done being processed.
There are other solutions - a message queue might be ideal, allowing you to queue each unprocessed data object's ID to the clients directly (or through a PHP script) and then provide an interface for that data to be retrieved and marked processed separately from the queueing of the "next" data. But as far as "MySQL-only" solutions go, the concepts behind what I've shown you here should serve you pretty well.
The answer you seek might be using transactions. I suggest you read the following post and its accepted answer:
PHP + MySQL transactions examples
If not, there is also table locking you should look at:
13.3.5 LOCK TABLES and UNLOCK TABLES
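A minimal sketch of the transactional route, assuming the data table uses InnoDB (row locks and transactions are not available on MyISAM) and has the id and status columns from the question. The variable name @claimed_id is made up for the example.

```sql
-- Reset the variable so a "no rows" result cannot leave a stale id behind.
SET @claimed_id = NULL;

START TRANSACTION;

-- Lock the next unprocessed row; other sessions running the same
-- statement wait here until this transaction commits.
SELECT id INTO @claimed_id
FROM data
WHERE status = '0'
ORDER BY id
LIMIT 1
FOR UPDATE;

-- Mark it as taken while the lock is still held.
UPDATE data SET status = '1' WHERE id = @claimed_id;

COMMIT;

-- Fetch the claimed row for the client (no rows if nothing was available).
SELECT * FROM data WHERE id = @claimed_id;
```

An index on status is worth considering here, since without one the locking read may end up locking far more rows than the single row it returns.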
I would suggest using a session for this...
You can save that id into the session...
so you can check whether one client is already working on that record, and if so, not allow another client to access it...

Codeigniter - Delete a record after 1 minute

I'm trying to get a function to run so that once a message has been posted to the database after say 1 minute it removes that row.
My database looks like this:
CREATE TABLE `messages` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(35) NOT NULL,
`account_number` int(25) NOT NULL,
`message_content` text NOT NULL,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`status` enum('1','0') NOT NULL,
`txt` varchar(20) NOT NULL DEFAULT 'Unread',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='Table for storing notifications/messages' AUTO_INCREMENT=27 ;
My current code looks like:
$sql = "DELETE FROM messages WHERE created < DATE_SUB( NOW( ) , INTERVAL 2 MINUTE )";
$this->db->query($sql);
Basically, like I said, after 1-2 minutes this needs to remove the message from the database by the ID it's grabbing. How does one achieve this? The code I have shown above isn't working. I am using ActiveRecord, if that helps anyone help me.
Thanks in advance.
If you only want to delete a single message by ID, a better strategy would be to have a periodic job like cron that deletes records at periodic intervals, or have another process other than your web server that performs deletions. Then you can flag the records that you want deleted, or pass the ID's to that process.
Does that help?
There are a few ways to tackle this, depending on your access to your server.
Use MySQL events (http://dev.mysql.com/doc/refman/5.1/en/events.html). This will allow you to execute your SQL query on a schedule, e.g. every minute. You do however need the EVENT privilege on the database, and the event scheduler has to be switched on, which needs the SUPER privilege.
Use crontab (Unix) or a scheduled task (Windows) to execute a PHP script containing your SQL query every minute.
Put your SQL query at the top of your page. This is of course not perfect performance-wise, but depending on how busy your site/application is or how big your database/table is, I doubt you'll notice the difference.
For reference: this sql will work:
DELETE FROM messages WHERE created < (NOW() - INTERVAL 30 SECOND)
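For the MySQL events option, a sketch could look like this. The event name purge_old_messages is made up, and the one-minute values are only there to match the question.

```sql
-- The scheduler is off by default on many installs; turning it on
-- globally requires the SUPER privilege.
SET GLOBAL event_scheduler = ON;

-- Every minute, remove messages older than one minute.
CREATE EVENT purge_old_messages
ON SCHEDULE EVERY 1 MINUTE
DO
  DELETE FROM messages
  WHERE created < (NOW() - INTERVAL 1 MINUTE);
```

Because created is declared with ON UPDATE CURRENT_TIMESTAMP, any later update to a row (marking it read, for instance) resets its timestamp and postpones its deletion.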

mysql effecient query (select and update)

I have a table whose structure is as follows:
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`ttype` int(1) DEFAULT '19',
`title` mediumtext,
`tcode` char(2) DEFAULT NULL,
`tdate` int(11) DEFAULT NULL,
`visit` int(11) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `tcode` (`tcode`),
KEY `ttype` (`ttype`),
KEY `tdate` (`tdate`)
ENGINE=MyISAM
I have two queries on x.php, as follows:
SELECT * FROM table_name WHERE id='10' LIMIT 1
UPDATE table_name SET visit=visit+1 WHERE id='10' LIMIT 1
My first question is whether updating 'visit' in this table causes reindexing and decreases performance or not. Note that 'visit' is not a key.
A second method may be to create a new table that contains 'visit', as follows:
`newid` int(10) unsigned NOT NULL ,
`visit` int(11) DEFAULT '0',
PRIMARY KEY (`newid`),
ENGINE=MyISAM
So selecting by
SELECT w.*,q.visit FROM table_name w LEFT JOIN table_name2 q
ON (w.id=q.newid) WHERE w.id='10' LIMIT 1
UPDATE table_name2 SET visit=visit+1 WHERE newid='10' LIMIT 1
Is the second method preferred over the first? Which one would have better performance and be quicker?
Note: all SQL queries are run by PHP (mysql_query command). Also, I need the first table's indexes for other queries on other pages.
I'd say your first method is the best, and simplest. Updating visit will be very fast and no updating of indexes needs to be performed.
I'd prefer the first, and have used that for similar things in the past with no problems. You can remove the limit clause since id is your primary key you will never have more than 1 result, although the query optimizer probably does this for you.
There was a question someone asked earlier to which I responded with a solution you may want to consider as well. When you do 'count' columns you lose the ability to mine the data later. With a transaction table not only can you get 'views' counts, but you can also query for date ranges etc. Sure you will carry the weight of storing potentially hundreds of thousands of rows, but the table is narrow and indices numeric.
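A transaction-style visit table along those lines might look like the sketch below. The table and column names (visit_log, page_id, visited_at) and the dates are made up for illustration; page_id would hold the id from the main table.

```sql
-- One narrow row per visit instead of a running counter.
CREATE TABLE visit_log (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  page_id INT UNSIGNED NOT NULL,
  visited_at DATETIME NOT NULL,
  PRIMARY KEY (id),
  KEY page_visited (page_id, visited_at)
) ENGINE=MyISAM;

-- Record a visit to item 10.
INSERT INTO visit_log (page_id, visited_at) VALUES (10, NOW());

-- Visits to item 10 during one month.
SELECT COUNT(*) AS visits
FROM visit_log
WHERE page_id = 10
  AND visited_at >= '2009-08-01'
  AND visited_at <  '2009-09-01';
```

The trade-off is the one described above: more rows to store, in exchange for being able to slice the counts by date later.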
I cannot see a solution on the database side... Perhaps you can do it in PHP: if the user has a PHP session, you could, for example, only update the visit count every 10th time, like:
<?php
session_start();
// Count views in the session; the first request sets the counter to 1.
$_SESSION['count'] = isset($_SESSION['count']) ? $_SESSION['count'] + 1 : 1;
if ($_SESSION['count'] >= 10) {
// Flush ten accumulated views to the database in one UPDATE.
do_the_function_that_updates_the_count_plus_10();
$_SESSION['count'] = 0;
}
Of course you lose some counts this way, but perhaps that is not so important?
