I am using InnoDB as the storage engine, and my queries sometimes run slowly; quite often they take 20 seconds or more.
I know this can be solved by changing innodb_flush_log_at_trx_commit to 2 in my.cnf, and I would like to do that, but I am on shared hosting and the provider will not allow me to change it.
MySQL version: 5.6.32-78.1
I also checked the current value from the MySQL client:
mysql> SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
+--------------------------------+-------+
| Variable_name                  | Value |
+--------------------------------+-------+
| innodb_flush_log_at_trx_commit | 1     |
+--------------------------------+-------+
1 row in set
And I tried this query:
mysql> SET GLOBAL innodb_flush_log_at_trx_commit=2;
but that is not allowed either, because I don't have the SUPER privilege to perform this action.
I have a database with 25 tables; 4 of them have 4000+ records and the rest have fewer than 100 records.
So is there any other way to speed up query performance? Any help will be appreciated.
Use profiling to check the time spent in each step:
set profiling=1;
Run your query.
List the recorded queries: show profiles;
Show the time spent in each stage: show profile block io, cpu for query N;
Find the stage with the highest Duration.
Possible problems: a missing index, ORDER BY, filesort, use of a temporary table...
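For reference, a minimal profiling session might look like the sketch below; the table, column and query here are only placeholders for your own slow query:
set profiling = 1;
SELECT * FROM your_table WHERE some_column = 'some_value';  -- the slow query you are investigating
show profiles;                                              -- note the Query_ID of that statement
show profile block io, cpu for query 1;                     -- replace 1 with that Query_ID
set profiling = 0;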
So, I'm using Wordpress, MySQL (8.0.16), InnoDB.
The wp_options table normally is 13 MB. The problem is, it suddenly (at least within a span of a few days) becomes 27 GB and then stops growing, because there's no more space available. Those 27 GB are considered data, not indexes.
Dumping and importing the table gives you a table of the normal size. The number of entries is around 4k, while the auto-increment value is 200k+. Defragmenting the table with ALTER TABLE wp_options ENGINE = InnoDB; brings the on-disk size back to normal, but MySQL still reports the old size, even after a server restart.
+------------+------------+
| Table      | Size in MB |
+------------+------------+
| wp_options | 26992.56   |
+------------+------------+
1 row in set (0.00 sec)
MySQL logs don't say much:
2019-08-05T17:02:41.939945Z 1110933 [ERROR] [MY-012144] [InnoDB] posix_fallocate(): Failed to preallocate data for file ./XXX/wp_options.ibd, desired size 4194304 bytes. Operating system error number 28. Check that the disk is not full or a disk quota exceeded. Make sure the file system supports this function. Some operating system error numbers are described at http://dev.mysql.com/doc/refman/8.0/en/operating-system-error-codes.html
2019-08-05T17:02:41.941604Z 1110933 [Warning] [MY-012637] [InnoDB] 1048576 bytes should have been written. Only 774144 bytes written. Retrying for the remaining bytes.
2019-08-05T17:02:41.941639Z 1110933 [Warning] [MY-012638] [InnoDB] Retry attempts for writing partial data failed.
2019-08-05T17:02:41.941655Z 1110933 [ERROR] [MY-012639] [InnoDB] Write to file ./XXX/wp_options.ibd failed at offset 28917628928, 1048576 bytes should have been written, only 774144 were written. Operating system error number 28. Check that your OS and file system support files of this size. Check also that the disk is not full or a disk quota exceeded.
2019-08-05T17:02:41.941673Z 1110933 [ERROR] [MY-012640] [InnoDB] Error number 28 means 'No space left on device'
My guess is that something starts adding options (something transient-related, maybe?) and never stops.
The question is, how to debug it? Any help/hints would be appreciated.
Hourly Cron to defragment looks like a very bad solution.
UPD:
1 day has passed, and free disk space has decreased by 7 GB. The current auto-increment value is 206975 (it was 202517 yesterday, when there were 27 GB free). So roughly 4.5K new entries = 7 GB, which is about 1.6 MB per row, I guess?
mysql> SELECT table_name AS `Table`, round(((data_length + index_length) / 1024 / 1024), 2) `Size in MB` FROM information_schema.TABLES WHERE table_schema = 'XXX' AND table_name = 'wp_options';
+------------+------------+
| Table      | Size in MB |
+------------+------------+
| wp_options | 7085.52    |
+------------+------------+
1 row in set (0.00 sec)
mysql> select ENGINE, TABLE_NAME,Round( DATA_LENGTH/1024/1024) as data_length , round(INDEX_LENGTH/1024/1024) as index_length, round(DATA_FREE/ 1024/1024) as data_free from information_schema.tables where DATA_FREE > 0 and TABLE_NAME = "wp_options" limit 0, 10;
+--------+------------+-------------+--------------+-----------+
| ENGINE | TABLE_NAME | data_length | index_length | data_free |
+--------+------------+-------------+--------------+-----------+
| InnoDB | wp_options |        7085 |            0 |         5 |
+--------+------------+-------------+--------------+-----------+
I will monitor the dynamics of how free space decreases, maybe that would shed some more light on the problem.
UPD (final)
I had a feeling it was something stupid, and I was right. There was a flush_rewrite_rules(); of all things unholy right in the functions.php. Examining the general log was helpful.
One possibility is that you're seeing incorrect statistics about the table size.
MySQL 8.0 tries to cache the statistics about tables, but there seem to be some bugs in the implementation. Sometimes it shows table statistics as NULL, and sometimes it shows values, but fails to update them as you modify table data.
See https://bugs.mysql.com/bug.php?id=83957 for example, a bug that discusses the problems with this caching behavior.
You can disable the caching. It may cause queries against the INFORMATION_SCHEMA or SHOW TABLE STATUS to be a little bit slower, but I would guess it's no worse than in versions of MySQL before 8.0.
SET GLOBAL information_schema_stats_expiry = 0;
The integer value is the number of seconds MySQL keeps statistics cached. If you query the table stats, you may see old values from the cache, until they expire and MySQL refreshes them by reading from the storage engine.
The default value for the cache expiration is 86400, or 24 hours. That seems excessive.
See https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_information_schema_stats_expiry
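If you only need an accurate one-off reading without touching the global setting, the variable also has session scope, so a sketch like this should work (schema name 'XXX' taken from the question's own queries):
SET SESSION information_schema_stats_expiry = 0;
SELECT table_name, round((data_length + index_length) / 1024 / 1024, 2) AS `Size in MB`
FROM information_schema.tables
WHERE table_schema = 'XXX' AND table_name = 'wp_options';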
If you think Wordpress is writing to the table, then it might be. You can enable the binary log or the query log to find out. Or just observe SHOW PROCESSLIST for a few minutes.
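If you go the query-log route, a minimal sketch using the general log written to a table could look like this (be aware the general log grows quickly, so switch it off when you are done):
SET GLOBAL log_output = 'TABLE';
SET GLOBAL general_log = 'ON';
-- ...wait a bit while the suspect writes happen...
SELECT event_time, command_type, argument
FROM mysql.general_log
ORDER BY event_time DESC
LIMIT 20;
SET GLOBAL general_log = 'OFF';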
You might have a wordpress plugin that is updating or inserting into a table frequently. You can look for the latest update_time:
SELECT * FROM INFORMATION_SCHEMA.TABLES
ORDER BY UPDATE_TIME DESC LIMIT 3;
Watch this to find out which tables are written to most recently.
There are caveats to this UPDATE_TIME stat. It isn't always in sync with the queries that updated the table, because writes to tablespace files are asynchronous. Read about it here: https://dev.mysql.com/doc/refman/8.0/en/tables-table.html
Have you tried the slow query log? That might give you some hint about where all the queries come from.
Do you have a staging site, and is the problem replicated there? If so, then turn off all plugins (as long as turning off the plugin doesn't break your site) on your staging site to see if the problem stops. If it does, then turn them on again one at a time to find out which plugin is causing the problem.
If you don't have a staging site, then you can try doing this on live, but keep in mind you will be messing with your live functionality (generally not recommended). In this case, just remove the plugins you absolutely do not need, and hopefully one of them is the culprit. If this works, then add them one at a time again until you find the plugin causing the problem.
My suggestion above assumes a plugin is causing this problem. Out of the box WordPress doesn't do this (not that I've heard of). It's possible there is custom programming doing this, but you would need to let us know more details concerning any recent scripts.
The biggest problem at this point is that you don't know what the problem is. Until you do, it's hard to take corrective measures.
Some plugins fail to clean up after themselves. Chase down what plugin(s) you added about the time the problem started. Look in the rows in options to see if there are clues that further implicate specific plugins.
No, nothing in MySQL's statistics, etc. can explain a 27 GB 'error' in the calculations. No amount of OPTIMIZE TABLE, etc. will fix more than a fraction of that. You need to delete most of the rows. Show us some of the recent rows (high AUTO_INCREMENT ids).
If the table gets frequent deletes/updates/inserts, you can run OPTIMIZE TABLE yourTable;
It needs to be run in a maintenance window.
It will free space for reuse, but the disk space used will not decrease.
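If you try it, a quick before/after check of the reusable space might look like this sketch (schema name 'XXX' as in the question's queries):
SELECT round(data_free / 1024 / 1024) AS data_free_mb
FROM information_schema.tables
WHERE table_schema = 'XXX' AND table_name = 'wp_options';
OPTIMIZE TABLE wp_options;
SELECT round(data_free / 1024 / 1024) AS data_free_mb
FROM information_schema.tables
WHERE table_schema = 'XXX' AND table_name = 'wp_options';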
I created a basic PHP page which queries a MySQL DB (MyISAM), read only. It has a table with 7 fields and ~6 million records.
Something like :
ID | Category | Name | Price | Date ...
Running a basic select query on this table takes around 4-5 ms:
SELECT * FROM mytable WHERE ID = 'myID';
I've already set up caching, so if I run the same select again it's instant.
Is there any easy way to cache all the select queries automatically? (Generate all the selects and run them after server restart, or something like that.)
Like I said, it's a read-only DB, so the solution doesn't need to be dynamic.
Thank you.
I have a PHP script that retrieves rows from a database and then performs work based on the contents. The work can be time consuming (but not necessarily computationally expensive) and so I need to allow multiple scripts to run in parallel.
The rows in the database looks something like this:
+---------------------+---------------+------+-----+---------------------+----------------+
| Field               | Type          | Null | Key | Default             | Extra          |
+---------------------+---------------+------+-----+---------------------+----------------+
| id                  | bigint(11)    | NO   | PRI | NULL                | auto_increment |
.....
| date_update_started | datetime      | NO   |     | 0000-00-00 00:00:00 |                |
| date_last_updated   | datetime      | NO   |     | 0000-00-00 00:00:00 |                |
+---------------------+---------------+------+-----+---------------------+----------------+
My script currently selects rows with the oldest dates in date_last_updated (which is updated once the work is done) and does not make use of date_update_started.
If I were to run multiple instances of the script in parallel right now, they would select the same rows (at least some of the time) and duplicate work would be done.
What I'm thinking of doing is using a transaction to select the rows, update the date_update_started column, and add a WHERE condition to the SELECT so that it only picks up rows whose date_update_started is older than some threshold (to ensure another script isn't already working on them). E.g.
$sth = $dbh->prepare('
START TRANSACTION;
SELECT * FROM table WHERE date_update_started < UTC_TIMESTAMP() - INTERVAL 1 DAY ORDER BY date_last_updated LIMIT 1000;
UPDATE table SET date_update_started = UTC_TIMESTAMP() WHERE id IN (SELECT id FROM table WHERE date_update_started < UTC_TIMESTAMP() - INTERVAL 1 DAY ORDER BY date_last_updated LIMIT 1000);
COMMIT;
');
$sth->execute(); // in real code some values will be bound
$rows = $sth->fetchAll(PDO::FETCH_ASSOC);
From what I've read, this is essentially a queue implementation and seems to be frowned upon in MySQL. All the same, I need to find a way to allow multiple scripts to run in parallel, and after the research I've done this is what I've come up with.
Will this type of approach work? Is there a better way?
I think your approach could work, as long as you also mark the selected rows with some kind of identifier showing they are currently being worked on. It could be as #JuniusRendel suggested, and I would even consider an additional string key (a random value or instance id) for cases where the script errors out and does not complete gracefully, since you will have to clear these fields once you update the rows back after your work.
The problem I see with this approach is the chance that two scripts run at the same moment and select the same rows before they are marked as locked. How serious that is depends on the kind of work you do on the rows: if both scripts produce the same end result, the only problems are wasted time and server memory (not small issues, but I will put them aside for now...). If the two scripts would produce different updates, the problem is that you could end up with the wrong update in the table.
#Jean mentioned the second approach you can take, which involves MySQL locks. I am not an expert on the subject, but it seems like a good approach, and the SELECT ... FOR UPDATE statement could give you what you are looking for: you can do the select and the update in the same call, which is faster than two separate queries and reduces the risk of other instances selecting these rows, as they will be locked.
SELECT ... FOR UPDATE lets you run a select statement and lock the matched rows for updating, so your statements could look like:
START TRANSACTION;
SELECT * FROM tb where field='value' LIMIT 1000 FOR UPDATE;
UPDATE tb SET lock_field='1' WHERE field='value' LIMIT 1000;
COMMIT;
Locks are powerful, but be careful that they don't affect your application in other sections. Check whether the rows that are locked for update are requested somewhere else in your application (maybe for the end user) and what will happen in that case.
Also, the tables must be InnoDB, and it is recommended that the columns used in the WHERE clause have a MySQL index; if not, you may lock the whole table or run into gap locks.
There is also a possibility that the locking process, especially when running parallel scripts, will be heavy on your CPU and memory.
Here is another read on the subject: http://www.percona.com/blog/2006/08/06/select-lock-in-share-mode-and-for-update/
Hope this helps, and would like to hear how you progressed.
We have something like this implemented in production.
To avoid duplicates, we do a MySQL UPDATE like this (I modified the query to resemble your table):
UPDATE queue SET id = LAST_INSERT_ID(id), date_update_started = ...
WHERE date_update_started IS NULL AND ...
LIMIT 1;
We do this UPDATE in a single transaction, and we leverage the LAST_INSERT_ID function. When called with an argument like that, it stores the value in the session; in this case it is the ID of the single (LIMIT 1) queue row that has been updated (if there is one).
Just after that, we do:
SELECT LAST_INSERT_ID();
When called without an argument, it retrieves the previously stored value, giving us the ID of the queue item that has to be processed.
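Putting the pieces together, a sketch of the full claim sequence might look like this (column names mirror the example above; the ROW_COUNT() check is an extra safeguard I would add, since LAST_INSERT_ID() keeps its previous value when no row matches):
UPDATE queue
SET id = LAST_INSERT_ID(id), date_update_started = UTC_TIMESTAMP()
WHERE date_update_started IS NULL
ORDER BY date_last_updated
LIMIT 1;
SELECT ROW_COUNT();       -- 0 means nothing was claimed, so stop here
SELECT LAST_INSERT_ID();  -- otherwise this is the id of the claimed row
SELECT * FROM queue WHERE id = LAST_INSERT_ID();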
Edit: Sorry, I totally misunderstood your question
You could just put a "locked" column on your table, set it to true on the entries your script is working on, and set it back to false when it's done.
In my case I have added 3 other timestamp (integer) columns: target_ts, start_ts, done_ts.
You run:
UPDATE table SET locked = TRUE WHERE target_ts<=UNIX_TIMESTAMP() AND ISNULL(done_ts) AND ISNULL(start_ts);
and then
SELECT * FROM table WHERE target_ts<=UNIX_TIMESTAMP() AND ISNULL(start_ts) AND locked=TRUE;
Do your jobs and update each entry one by one (to avoid data inconsistencies), setting the done_ts property to the current timestamp (you can also unlock them now). You can update target_ts to the next update you wish, or you can ignore this column and just use done_ts for your select.
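For completeness, the per-entry finishing step could be a sketch like this, assuming the id primary key from the question's table:
UPDATE table SET done_ts = UNIX_TIMESTAMP(), locked = FALSE WHERE id = ?;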
Each time the script runs I would have the script generate a uniqid.
$scriptInstance = uniqid();
I would add a script instance column to hold this value as a varchar and put an index on it. When the script runs I would use select for update inside of a transaction to select your rows based on whatever logic, excluding rows with a script instance, and then update those rows with the script instance. Something like:
START TRANSACTION;
SELECT * FROM table WHERE script_instance = '' AND date_update_started < UTC_TIMESTAMP() - INTERVAL 1 DAY ORDER BY date_last_updated LIMIT 1000 FOR UPDATE;
UPDATE table SET date_update_started = UTC_TIMESTAMP(), script_instance = '{$scriptInstance}' WHERE script_instance = '' AND date_update_started < UTC_TIMESTAMP() - INTERVAL 1 DAY ORDER BY date_last_updated LIMIT 1000;
COMMIT;
Now those rows will be excluded from other instances of the script. Do your work, then update the rows to set the script instance back to null or blank, and also update your date last updated column.
You could also use the script instance to write to another table called "current instances" or something like that, and have the script check that table to get a count of running scripts to control the number of concurrent scripts. I would add the PID of the script to the table as well. You could then use that information to create a housekeeping script to run from cron periodically to check for long running or rogue processes and kill them, etc.
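As a sketch of that housekeeping idea (the table and column names here are purely illustrative), the tracking table could look like:
CREATE TABLE current_instances (
  script_instance VARCHAR(32) NOT NULL PRIMARY KEY,  -- value from uniqid()
  pid             INT UNSIGNED NOT NULL,             -- value from getmypid()
  started_at      DATETIME NOT NULL
) ENGINE=InnoDB;
-- each script inserts its row on startup and deletes it on exit;
-- a cron job can count running instances or flag rows older than some threshold
SELECT COUNT(*) FROM current_instances;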
I have a system working exactly like this in production. We run a script every minute to do some processing, and sometimes that run can take more than a minute.
We have a table column for status: 0 for NOT RUN YET, 1 for FINISHED, and any other value means it is under way.
The first thing the script does is to update the table, setting a line or multiple lines with a value meaning that we are working on that line. We use getmypid() to update the lines that we want to work on, and that are still unprocessed.
When we finish the processing, the script updates the lines that have the same process ID, marking them as finished (status 1).
This way we avoid having scripts try to process a line that is already being processed, and it works like a charm. This doesn't mean that there isn't a better way, but it does get the work done.
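A minimal sketch of that idea in SQL (the table name and the literal PID are placeholders; in the real script the PID comes from getmypid() and is bound as a parameter):
-- claim up to 100 unprocessed lines for this process (status 0 = not run yet)
UPDATE jobs SET status = 12345 WHERE status = 0 LIMIT 100;
-- work on the lines claimed by this PID
SELECT * FROM jobs WHERE status = 12345;
-- mark them as finished (status 1)
UPDATE jobs SET status = 1 WHERE status = 12345;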
I have used a stored procedure for very similar reasons in the past. We used SELECT ... FOR UPDATE to lock the rows while a selected flag was updated to remove those entries from any future selects. It looked something like this:
DELIMITER $$
CREATE PROCEDURE `select_and_lock`()
BEGIN
  START TRANSACTION;
  SELECT your_fields FROM a_table WHERE some_stuff=something
    AND selected = 0 FOR UPDATE;
  UPDATE a_table SET selected = 1 WHERE some_stuff=something AND selected = 0;
  COMMIT;
END$$
DELIMITER ;
No reason it has to be done in a stored procedure, though, now that I think about it.
Visitor opens url, for example
/transport/cars/audi/a6
or
/real-estate/flats/some-city/city-district
I plan a separate table for cars and for real estate (a separate table for each top-level category).
Based on the URL (PHP explode creates an array):
$array[0] - based on this value I know which table to SELECT from
And so on for $array[1], $array[2] ...
For example, RealEstate table may look like:
IdOfAd | RealEstateType | Location1 | Location2 | TextOfAd | and so on
----------------------------------------------------------------------
1 | flat | City1 | CityDistric1 | text.. |
2 | land | City2 | CityDistric2 | text.. |
And the MySQL query to display ads would be something like:
SELECT `TextOfAd`, `and so on...`
FROM RealEstate
WHERE RealEstateType = ? AND Location1 = ? AND Location2 = ?
// and possibly additional AND .. AND
LIMIT $start, $limit
Thinking about performance: hopefully, after some time, the number of active ads will be high (I also plan not to delete expired ads, just to change a column value to 0 so they are not displayed in the listing SELECT, but are still shown when visited directly from a search engine).
What do I need to do (change the database design, or SELECT in some other way) if, for example, the number of rows in the table grows to 100 000 or millions?
I am also thinking about moving expired ads to another table (for which performance is not important). For example, a user comes from a search engine to a URL with an expired ad: first select from the main table, and if nothing is found, then select from the expired-ads table. Is this some kind of solution?
Two hints:
Use ENGINE=InnoDB when creating your table. InnoDB uses row-level locking, which is MUCH better for bigger tables, as this allows rows to be read much faster, even when you're updating some of them.
ADD INDEX on suitable columns. Indexing big tables can reduce search times by several orders of magnitude. They're easy to forget and a pain to debug! More than once I've been investigating a slow query, realised I forgot a suitable INDEX, added it, and had immediate results on a query that used to take 15 seconds to run.
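For the example query above, a sketch of such an index on the RealEstate table would be a composite index over the columns used in the equality filters (the index name is arbitrary):
ALTER TABLE RealEstate ADD INDEX idx_type_location (RealEstateType, Location1, Location2);
With that in place, the WHERE RealEstateType = ? AND Location1 = ? AND Location2 = ? lookup can be resolved from the index instead of scanning the whole table.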
I'm new to MySQL and I have coded my first PHP-MySQL application.
I have a list of 30 devices from which I want users to add their preferred devices to their account.
So I created MySQL table "A", where device IDs are stored under a particular user ID, like this:
UserID | Device Id
1 | 33
1 | 21
1 | 52
2 | 12
2 | 45
3 | 22
3 | 08
1 | 5
more.....
Say I have 5000 user IDs and 30 device IDs,
and each user has 15 device IDs (on average) under his account records.
Then there will be 5000 x 15 = 75000 records in table "A".
So, my question: is there any limit on how many records we can store in a MySQL table?
Is my approach for storing records as described above correct? Will it affect query performance as more users are added?
Or is there a better way to do that?
It's very unlikely that you will approach the limitations of a MySQL table with two columns that are only integers.
If you're really concerned about query performance, you can just go ahead and throw an index on both columns. It's likely that the cost of inserting / updating your table will be negligible even with an index on the device ID. If your database gets huge, it can speed up queries such as "which users prefer this device". Your queries that ask "what devices do this user prefer" will also be fast with an index on user.
I would just say to make this table a simple two column table with a two-part composite key (indexed). This way, it will be as atomic as possible and won't require any of the tom-foolery that some may suggest to "increase performance."
Keep it atomic and normal -- your performance will be fine and you won't exceed any limitations of your DBMS
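A sketch of that two-column, composite-key design (table and index names are just illustrative):
CREATE TABLE user_device (
  user_id   INT UNSIGNED NOT NULL,
  device_id INT UNSIGNED NOT NULL,
  PRIMARY KEY (user_id, device_id),  -- fast "devices for this user" lookups, no duplicate pairs
  KEY idx_device (device_id)         -- fast "users who prefer this device" lookups
) ENGINE=InnoDB;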
I don't see anything wrong with it. If you are hoping to save some server space, don't worry about it; let your database do the underlying job. Index your database properly with an ID (int(10), primary, auto-increment). Think about scalability when it is actually needed. Your first target should be to complete the application you are making, then test it. If you find that it causes any lag or problems, then start working on solutions. Don't bother yourself with things you might never face.
But considering the scale of your application (75k to 100k records), it shouldn't be much of a task. Alternatively, you can have a schema like this for your users:
(device_table)
device_id
23
45
56

(user_table)
user_id | device_id
1       | 23,45,67,45,23
2       | 45,67,23,45
That is, storing the device IDs as a comma-separated list and then getting the device IDs for a particular user as:
$device_for_user = explode(',', $device_id);
where, of course, $device_id is retrieved from the MySQL database.
So you'll have:
$device_for_user[0] = 23
$device_for_user[1] = 45
and so on.
But this method isn't a very good design or approach; just for your information, it is one way of doing it.