Why "The connection was reset"? Joomla & MySQL - PHP

SELECT t1.id, t1.tx_id, t1.tx_date, t1.bx_date, t1.method, t1.theater_id,
       t1.showtime_id, t1.category_id, t1.amount, t1.fname, t1.status, t1.mobile,
       (CASE WHEN (t4.type = '1')
             THEN ((t1.full_tickets * 2) + (t1.half_tickets))
             ELSE (t1.full_tickets + t1.half_tickets)
        END) AS no_seats,
       u.username
FROM `reservation` AS t1
LEFT JOIN `theatercategories` AS t4 ON t1.category_id = t4.id
JOIN `users` AS u ON u.id = t1.user_id
WHERE t1.bx_date >= '2012-08-01'
  AND t1.bx_date <= '2012-08-31'
ORDER BY t1.id DESC
The query above returns a "The connection was reset" error. It loads 75,000 records (75,195 total; the query took 15.2673 sec). I use MySQL with Joomla. What seems to be the issue?
Please guide me. Thanks.

There are a number of possible solutions ... it depends on the "why", so it ends up being a bit of trial and error. On a fresh install, that's tricky to determine. But if you made a recent "major" change, that's the place to start looking - like modifying virtual hosts or adding/enabling XDebug.
Here's a list of things I've used/done/tried in the past:
- Check for infinite loops ... in particular, looping through a SQL fetch result which works 99% of the time except the 1% it doesn't. In one case, I was using the results of two previous queries as the upper and lower bounds of a for loop ... and occasionally got an upper bound of a UINT max ... har har har (vomit). (See the sketch after this list.)
- Copying ./php/libmysql.dll to the windows/system32 directory (particularly if you see "Parent: child process exited with status 3221225477 -- Restarting" in your log files ... check out http://www.java-samples.com/showtutorial.php?tutorialid=1050).
- If you modify PHP's error_reporting at runtime ... in certain circumstances this can cause PHP to degenerate into an unstable state.
- If, say, your PHP code modifies the superglobals or fiddles around with other deep and personal background system variables (nah, who would ever do such evil hackery? ahem).
- If you converted your MySQL to something other than MyISAM or mysqli. There is a known bug in MySQL involving MyISAM, the UTF8 character set and indexes (http://bugs.mysql.com/bug.php?id=4541). The solution is to use the InnoDB engine (e.g. SET GLOBAL storage_engine='InnoDB';). Doing that changes how new tables are created, which might slightly alter the way results are returned to a fetch statement ... leading to an infinite loop, a malformed dataset, etc. (although this change should not hang the database itself).
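As a concrete illustration of the first item above, here's a minimal sketch of the for-loop pitfall (hypothetical bounds and sanity cap, not from the original post):
<?php
// Suppose $lower and $upper came from two earlier queries, and one of them
// occasionally returns garbage such as UINT max (4294967295).
$lower = 1;
$upper = 4294967295; // bogus upper bound

// Sanity-check the bounds before looping; otherwise the loop runs "forever"
// and the browser eventually reports "The connection was reset".
$maxExpectedRows = 100000; // tune to your data
if ($upper < $lower || ($upper - $lower) > $maxExpectedRows) {
    throw new RuntimeException("Suspicious loop bounds: $lower..$upper");
}
for ($i = $lower; $i <= $upper; $i++) {
    // ... process row $i ...
}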
Other helpful items are to ramp up the debug reporting for PHP and Apache in their config files and restart the servers. The log files sometimes give a clue as to at least where the problem might reside. If it happens after your page content was finished, it's more likely in the PHP settings. If it's during page construction, check your PHP code. Etc., etc.
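For example, a typical way to crank up that reporting while debugging (standard php.ini and httpd.conf directives; restart the servers afterwards):
; php.ini
error_reporting = E_ALL
display_errors = On
log_errors = On
error_log = /var/log/php_errors.log

# Apache httpd.conf
LogLevel debug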
Hope the above laundry list helps ...

Just refresh your DB connection; it might have been disconnected for some reason.
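A minimal sketch of such a check, assuming a mysqli handle (hypothetical credentials):
<?php
$db = new mysqli("localhost", "user", "password", "joomla");
// ping() returns false if the connection has gone away (timeout, server
// restart, ...); in that case, simply open a fresh connection.
if (!$db->ping()) {
    $db = new mysqli("localhost", "user", "password", "joomla");
}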

Related

MySQL query takes extremely long from PHP

I have a query that takes 0.0002s in phpMyAdmin and hundreds of seconds if I run it from PHP. Here it is:
SELECT id, id_pages, link, childlink, url, hiddencontent, cansearch,
(
SELECT p.id
FROM pages as p
WHERE pages.hiddencontent=1
AND p.id_pages IS NOT NULL
AND p.hiddencontent=0
AND pages.id=p.id_pages
order by p.npp asc
limit 1
) as id_firstchild
FROM pages
It returns around 24k rows and I don't know why it takes so long. My friend tried it on his PC and it ran lightning fast, and his PC is no better than mine. I don't know the reason for this PHP behavior; maybe I should make some changes in the configuration file?
You have two questions:
Why the timing?
Is the "Query cache" turned on? That's about the only way it can run in 0.2ms. (Any non-trivial SELECT that runs in under 1ms almost certainly did not run, but was found in that cache.)
And, as pointed out by others, phpMyAdmin silently adds a LIMIT. However, other clues (mostly in the comments) point to the Query cache giving anomalous results.
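If you want to verify this, the cache status is easy to inspect (MySQL 5.x; the query cache was removed entirely in MySQL 8.0):
SHOW VARIABLES LIKE 'query_cache%';   -- query_cache_type, query_cache_size, ...
SHOW STATUS LIKE 'Qcache%';           -- Qcache_hits, Qcache_inserts, ...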
How to speed up.
SELECT id, id_pages, link, childlink, url, hiddencontent, cansearch,
    IF(hiddencontent = 1, NULL,      -- to avoid doing the SELECT
       ( SELECT p.id
         FROM pages AS p
         WHERE p.hiddencontent = 0
           AND `outer`.id = p.id_pages   -- fails on NULL
         ORDER BY p.npp ASC
         LIMIT 1
       )) AS id_firstchild
FROM pages AS `outer`   -- clarify which is which (backticks: OUTER is a reserved word)
and have this 'composite' and 'covering' index:
INDEX(hiddencontent, id_pages, npp, id)
Two improvements:
Avoid calling the subquery when not needed.
Have an index that will allow the subquery to look at only one row, and only in the index's BTree, hence be 'blazingly fast'.
Whenever you try to decide whether searching or fetching is slow, put a LIMIT 1 at the end of your query (and comment out the ORDER BY, if there is one). This way you get just the first row, so you will know how long that takes. It should be blazing fast.
Another useful technique is to enclose the whole query in another one, like SELECT ... FROM (SELECT ...), where the outer query just counts the rows returned. This gives you the total time needed to identify all the rows that would be fetched, without actually fetching them. That tells you whether the SQL itself is written poorly or it's just a lot of data to fetch. (If both of the above run fast and the query is still slow, it's the fetch.)
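For example, a sketch of that wrapper using the question's column list (without the correlated subquery, as a baseline):
SELECT COUNT(*) FROM (
    SELECT id, id_pages, link, childlink, url, hiddencontent, cansearch
    FROM pages
) AS t;
-- If this is fast but the full query is slow, the cost is in the subquery or the fetch.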
You can also make use of EXPLAIN to see if your performance issue is because of insufficient or improper indexing.
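For instance (the "type" and "key" columns below are standard EXPLAIN output):
EXPLAIN
SELECT id, id_pages, link, childlink, url, hiddencontent, cansearch
FROM pages;
-- type = ALL means a full table scan; key shows which index, if any, is used.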
As for phpMyAdmin, the first comment on your post pretty much sums it up: phpMyAdmin adds a LIMIT, so the query will run faster even if the query itself is slow.

How to parallelize requests with MySQL+PHP+nginx?

Recent Discovery
Among everything else I tried, I replaced my JMeter profile with a custom JavaScript that hit each of my API endpoints in turn in an infinite loop, then ran this script in parallel in different browsers (one Firefox, one Chrome, one Safari) to try to rule out issues related to all of my connections coming from the same source (same user agent, same cookies, same session ID, etc.).
When I did this, I noticed that all of my issues went away. The queries ran in parallel and the app was a lot more responsive than JMeter would have you believe.
It seemed impossible to me that JMeter would be serializing the requests, since it's a de facto standard for load testing, so I started trying to reproduce the behavior.
In an attempt to re-create the JMeter results, I wrote the following two PHP scripts which (hopefully) simulated my Yii application:
slow.php
<?php
session_start();
$_SESSION['some'] = 'value';
// Yii is calling session_write_close() almost immediately after
// the session is initialized, but to try and exacerbate issues,
// I've commented it out:
// session_write_close();
$dsn = "mysql:host=localhost;dbname=platypus;unix_socket=/tmp/mysql.sock";
// Yii was using whatever the default persistence behavior was, but to try
// and exacerbate issues I requested a persistent connection. (Note that
// PDO::ATTR_PERSISTENT only takes effect when passed to the constructor,
// not via setAttribute() afterwards.)
$pdo = new PDO($dsn, "user", "password", [PDO::ATTR_PERSISTENT => true]);
// Simulate a query running for 1 second by issuing a 1-second sleep
$pdo->query("DO SLEEP(1)");
echo "Done";
fast.php
<?php
session_start();
$_SESSION['some'] = 'value';
$dsn = "mysql:host=localhost;dbname=platypus;unix_socket=/tmp/mysql.sock";
$pdo = new PDO($dsn, "user", "password", [PDO::ATTR_PERSISTENT => true]);
// Simulate a query running for 0.1 seconds
$pdo->query("DO SLEEP(0.1)");
echo "Done";
Running JMeter against these two new endpoints, there was no serialization of requests. Everything ran in parallel: fast.php always returned in 100-150ms and slow.php always returned in 1000-1050ms, even as I scaled up to 3, 4, and 5 threads. I was able to watch things collapse at 11 threads, but that's because I exceeded the number of PHP-FPM worker processes.
So to summarize:
The issue only occurs when profiling my API with JMeter and is not inherent in the app itself
The issue isn't just a JMeter bug, but is somehow tied to my application or Yii 1.1
I tried but could not come up with a minimum repro case
Despite the issue being non-existent when profiling with other tools, lots of people responded and gave lots of helpful information:
Avoid persistent connections in PHP (could cause multiple requests to share a connection, probably not)
Avoid session locking by calling session_write_close() as early as possible (see the sketch after this list)
Ensure you have enough PHP worker threads to handle the number of simultaneous connections
MySQL fully supports parallel requests (if the hardware can handle it)
Be wary of table locking (any transaction with an UPDATE statement could potentially lock the tables)
MyISAM does table-level locking instead of row-level locking
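To illustrate the session advice above, a minimal sketch (hypothetical DSN and query):
<?php
session_start();
$userId = $_SESSION['user_id'] ?? null;  // read what you need first
session_write_close();                   // releases the session lock, so parallel
                                         // requests for this session can proceed
$pdo = new PDO("mysql:host=localhost;dbname=platypus", "user", "password");
$pdo->query("DO SLEEP(1)");              // slow work no longer serializes requests
echo "Done";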
Original Post
I inherited a web application and I'm trying to make sense of its performance profile so I can start optimizing it for speed.
One thing I noticed pretty early on is that requests to the server are being serialized. For instance, suppose I have three endpoints with response times like so:
/api/endpoint1 --> 50ms
/api/endpoint2 --> 1000ms
/api/endpoint3 --> 100ms
If I hit a single endpoint, I measure the appropriate response times. But when I set up a script to hit all 3 at once I will sometimes see something like the following:
endpoint1: 50ms
endpoint2: 1050ms
endpoint3: 1150ms
Clearly the call to endpoint3 was queued and waiting for the call to endpoint2 to finish before it got a response.
My first thought was that this should be trivially solved with multithreading, so I took a look at the server configuration. PHP-FPM's process manager was set to "dynamic" with "start_servers" of 1, "max_children" of 5, and "max_spare_servers" of 2. For testing purposes I swapped this to "static" so that 5 PHP processes would remain open for handling connections in parallel (more than the 3 for the number of endpoints I was hitting, so they should be able to process simultaneously)
This had no impact on performance, so I looked at my nginx config. "worker_processes" was set to 1 with "worker_connections" set to 1024. I know that nginx uses an event loop model, so it shouldn't be blocking while it waits for a response from PHP-FPM. But just in case, I bumped up "worker_processes" to 5
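For reference, these are the knobs in question (values as described above; file paths vary by distro):
; PHP-FPM pool config (e.g. /etc/php-fpm.d/www.conf)
pm = static
pm.max_children = 5

# nginx.conf
worker_processes 5;
events {
    worker_connections 1024;
}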
Still no effect. So next I looked at the database. All 3 endpoints had to hit the database, and I know for a fact that the 1000ms response time is mostly spent waiting on a long-running database query. I tried setting "thread_pool_size" to 5, and within the MySQL REPL I also set "innodb_parallel_read_threads" and "mysqlx_min_worker_threads" to 5
Still, my requests were getting serialized. When I logged into the MySQL REPL and typed show processlist; while my script was running (using a while-true loop to repeatedly hit those 3 API endpoints), I noticed that there was only ever one connection for the web application's user
Unfortunately I'm not sure if my issue lies with the database (not allowing more than one connection), with PHP-FPM (not processing more than one request at a time), or with nginx (not forwarding more than one request at a time to PHP-FPM). I'm also not sure how to figure out which one is acting as the bottleneck
Update
Looking around some more I found this SO post which seems to suggest that MySQL doesn't support parallel queries from the same user (e.g. from the web application user)
Is this true? Surely such a ubiquitous database engine wouldn't have such a performance flaw, especially given how commonly it's used with AWS for massively scaled applications. I understand that for simple "read from disk" queries parallelizing them wouldn't improve performance since they'd just have to sit in a queue waiting on disk I/O, but modern databases have in-memory caches, and most of the really slow operations like filesort tend to happen in memory. There's no reason a disk-bound query couldn't run in parallel (make a request to disk and start waiting on I/O) while a cpu-bound query is busy sorting a table in RAM. The context switching may slightly slow down the cpu-bound queries, but if slowing those down from 1000ms to 1200ms means my 5ms query can run in 5 ms, I think that's worth it.
My queries
Here are the queries for my 3 endpoints. Note that the timings listed are the response time for the full HTTP pipeline (from browser request to response) so this includes overhead from nginx and PHP, plus any post-processing of the query done in PHP. That said, the query in endpoint 2 makes up 99% of the runtime, and locks the database so that endpoints 1 and 3 are queued up instead of returning quickly.
endpoint1 (50ms)
SELECT * FROM Widget WHERE id = 1 LIMIT 1
(Note that 50ms is the full response time for the endpoint, not how long the query takes. This query is clearly on the order of microseconds)
endpoint2 (1000ms)
USE platypus;
SELECT `t`.`(49 fields)` AS `t0_cX`,
`site`.`(29 fields)` AS `t2_cX`,
`customer`.`(26 fields)` AS `t4_cX`,
`domain`.`(20 fields)` AS `t6_c0`,
`domain-general_settings`.`(18 fields)` AS `t8_cX`,
`domain-access_settings`.`(17 fields)` AS `t9_cX`,
`customer-general_settings`.`(18 fields)` AS `t10_cX`,
`customer-access_settings`.`(17 fields)` AS `t11_cX`,
`site-general_settings`.`(18 fields)` AS `t12_cX`,
`site-access_settings`.`(17 fields)` AS `t13_cX`,
`backup_broadcast`.`(49 fields)` AS `t14_cX`,
`playlists`.`(11 fields)` AS `t16_cX`,
`section`.`(10 fields)` AS `t17_cX`,
`video`.`(16 fields)` AS `t18_cX`,
`general_settings`.`(18 fields)` AS `t19_cX`,
`access_settings`.`(17 fields)` AS `t20_cX`
FROM `broadcast` `t`
LEFT OUTER JOIN `site` `site`
ON ( `t`.`site_id` = `site`.`id` )
LEFT OUTER JOIN `customer` `customer`
ON ( `site`.`customer_id` = `customer`.`id` )
LEFT OUTER JOIN `domain` `domain`
ON ( `customer`.`domain_id` = `domain`.`id` )
LEFT OUTER JOIN `generalsettings` `domain-general_settings`
ON ( `domain`.`general_settings_id` =
`domain-general_settings`.`id` )
LEFT OUTER JOIN `accesssettings` `domain-access_settings`
ON
( `domain`.`access_settings_id` = `domain-access_settings`.`id` )
LEFT OUTER JOIN `generalsettings` `customer-general_settings`
ON ( `customer`.`general_settings_id` =
`customer-general_settings`.`id` )
LEFT OUTER JOIN `accesssettings` `customer-access_settings`
ON ( `customer`.`access_settings_id` =
`customer-access_settings`.`id` )
LEFT OUTER JOIN `generalsettings` `site-general_settings`
ON ( `site`.`general_settings_id` =
`site-general_settings`.`id` )
LEFT OUTER JOIN `accesssettings` `site-access_settings`
ON ( `site`.`access_settings_id` =
`site-access_settings`.`id` )
LEFT OUTER JOIN `broadcast` `backup_broadcast`
ON ( `t`.`backup_broadcast_id` = `backup_broadcast`.`id` )
AND ( backup_broadcast.deletion IS NULL )
LEFT OUTER JOIN `playlist_broadcast` `playlists_playlists`
ON ( `t`.`id` = `playlists_playlists`.`broadcast_id` )
LEFT OUTER JOIN `playlist` `playlists`
ON
( `playlists`.`id` = `playlists_playlists`.`playlist_id` )
LEFT OUTER JOIN `section` `section`
ON ( `t`.`section_id` = `section`.`id` )
LEFT OUTER JOIN `video` `video`
ON ( `t`.`video_id` = `video`.`id` )
AND ( video.deletion IS NULL )
LEFT OUTER JOIN `generalsettings` `general_settings`
ON ( `t`.`general_settings_id` = `general_settings`.`id` )
LEFT OUTER JOIN `accesssettings` `access_settings`
ON ( `t`.`access_settings_id` = `access_settings`.`id` )
WHERE
(
(
t.id IN (
SELECT `broadcast`.id FROM broadcast
LEFT JOIN `mediashare` `shares`
ON ( `shares`.`media_id` = `broadcast`.`id` )
AND `shares`.media_type = 'Broadcast'
WHERE
(
(
broadcast.site_id IN(
'489', '488', '253', '1083', '407'
)
OR
shares.site_id IN(
'489', '488', '253', '1083', '407'
)
)
)
)
)
AND
(
(
(
(t.deletion IS NULL)
)
)
AND
(
IF(
t.backup_mode IS NULL,
t.status,
IF(
t.backup_mode = 'broadcast',
backup_broadcast.status,
IF(
t.backup_mode = 'embed',
IF(
t.backup_embed_status,
t.backup_embed_status,
IF(
'2020-01-08 16:34:52' < t.date,
1,
IF(
t.date > Date_sub(
'2020-01-08 16:34:52',
INTERVAL IF(t.expected_duration IS NULL, 10800, t.expected_duration) second
),
10,
12
)
)
),
t.status
)
)
) != 0
)
)
)
LIMIT 10;
This query takes roughly 1000ms to run, but the PHP for the endpoint is extremely simple (run the query, return the results as JSON) and only adds a couple milliseconds of overhead
endpoint 3 (100ms)
SELECT * FROM platypus.Broadcast
WHERE deletion IS NULL
AND site_id IN (SELECT id FROM platypus.Site
WHERE deletion IS NULL
AND customer_id = 7);
There's additional validation on the PHP side here which makes this endpoint take 100ms. The SQL, as you can see, is still fairly simple.
Create Table Statements
As there is a post length limit on Stack Overflow, I cannot show the CREATE TABLE for every table touched by endpoint 2, but I can show at least one. The others use the same engine.
CREATE TABLE `Widget` (
`id` int unsigned NOT NULL AUTO_INCREMENT,
`widget_name` varchar(255) NOT NULL,
`widget_description` varchar(255) NOT NULL,
`status` varchar(255) NOT NULL,
`date_created` datetime NOT NULL,
`date_modified` datetime NOT NULL,
`auto_play` varchar(255) NOT NULL,
`on_load_show` varchar(255) NOT NULL,
`widget_content_source` varchar(255) NOT NULL,
`associated_sites` text NOT NULL,
`author_id` int NOT NULL,
`associated_sections` text,
`after_date` datetime DEFAULT NULL,
`before_date` datetime DEFAULT NULL,
`show_playlists` varchar(255) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
`is_classic` tinyint(1) NOT NULL,
`default_site` int unsigned DEFAULT NULL,
`auth_code_url` varchar(255) DEFAULT NULL,
`widget_layout_id` int unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `fk_Widget_widget_layout_id_WidgetLayout_id` (`widget_layout_id`),
CONSTRAINT `fk_Widget_widget_layout_id_WidgetLayout_id` FOREIGN KEY (`widget_layout_id`) REFERENCES `WidgetLayout` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1412 DEFAULT CHARSET=utf8
Note
Notice that endpoint 2 doesn't even touch the Widget table, but endpoint 1 (which ONLY touches the Widget table) is also queued up. This eliminates the possibility of table locking.
When observing the process list in MySQL, only one connection is ever being made to the database from the application user. The issue may therefore lie in my PHP configuration.
Explain for query 2
Attached is the EXPLAIN SELECT ... query for endpoint 2
Simpler experiments
To try and determine where the parallel pipeline was falling apart, I created two simple scripts:
sleep.php
<?php
sleep(5);
echo "Done sleeping";
return.php
<?php
echo "That's all";
Doing this (sleeping in PHP) and running my script to hit these two endpoints with 3 threads, I saw no issues. return.php always came back in ~11 milliseconds, despite sleep.php taking 5066ms on average. I then tried doing the sleeping in MySQL:
sleep.php
<?php
$pdo = new PDO("...", "user", "pass");
$pdo->query("DO SLEEP(5)");
echo "Done sleeping";
This, again, had no issues. The sleeping endpoint did not block the non-sleeping one.
This means that the issue does not exist at the nginx, PHP, or PDO level - but that there must be some kind of table or row locking going on. I'm going to re-enable the general query log and scan through every query being performed to see if I can figure out what's happening.
Final Update
If you scroll up to "Recent Discovery" at the top of this post, you'll notice that I've modified my understanding of the problem.
I was not having an issue with parallelization, but with JMeter. I have failed to create a simple repro case, but I know now that the issue does not lie with my application but rather with how I'm profiling it.
MySQL + PHP + Apache has 'always' been very good at running separate SQL statements in 'parallel'. If separate users issue HTTP requests, they will naturally go through Apache quickly (probably in sequence, but fast) and get to separate instances of PHP (assuming Apache has been configured with enough 'children'). Each PHP script will make its own connection to MySQL. MySQL will accept multiple connections quite rapidly (assuming max_connections is high enough, which it is by default). Each MySQL connection will work independently (barring low-level database locks, mutexes, etc.). Each will finish when it finishes; ditto for PHP, and for Apache returning results to the user.
I assume (without knowing for sure) that nginx works similarly.
Note: I suggest that Apache (and nginx) does things serially. But I suspect it takes on the order of a millisecond to hand off an HTTP request to PHP, so this "serial" step won't explain the timings you found.
I conclude that one of these is not really happening:
The configuration at each step is not allowing 3 children/connections/etc, or
There are 3 separate HTTP requests.
There are 3 separate PHP scripts.
The 3 SQL statements are not blocking each other. (Please provide the SQL.) Note: ENGINE=MyISAM uses table locking; this, alone, might explain the problem. (Please provide SHOW CREATE TABLE.)
It may be possible (after seeing the SQL) to speed up the SQL, thereby diminishing the overall problem of sluggishness.
Queries
Assuming id is the PRIMARY KEY of each table, then these other indexes may be beneficial at speeding up Query 2:
backup_broadcast: (deletion, id)
shares: (media_type, media_id, site_id)
broadcast: (site_id, id)
video: (deletion, id)
playlists_playlists: (playlist_id, broadcast_id)
playlist_broadcast smells like a "many-to-many mapping" table. If so, I recommend following the tips in http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table . (Ditto for any similar tables.)
OR and IN ( SELECT ... ) tend to be inefficient constructs. But it sounds like you don't have any control over the queries?
Is that a LIMIT without an ORDER BY?? Do you care which 10 rows you get?? It will not be predictable.
What happens with that huge number of columns? Seems like most of them will be the same every time you run the query, thereby making it mostly a waste of time??
For query 3, site needs INDEX(deletion, customer_id) (in either order). However, reformulating the query to use a JOIN or EXISTS would probably run faster.
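A sketch of that reformulation (same tables as query 3; the EXISTS form and index are suggestions, not tested against this schema):
ALTER TABLE platypus.Site ADD INDEX (deletion, customer_id);

SELECT b.*
FROM platypus.Broadcast AS b
WHERE b.deletion IS NULL
  AND EXISTS ( SELECT 1
               FROM platypus.Site AS s
               WHERE s.id = b.site_id
                 AND s.deletion IS NULL
                 AND s.customer_id = 7 );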
I think you have an issue with PHP session locking: your second and third queries are trying to access the same PHP session, and are waiting.
Try to call session_write_close as soon as you can, to free your PHP session. "As soon as you can" means: as soon as you are sure you will not write any more data into the session.
Another thread where you can read about the dangers of not handling your session properly.
A long explanation of what a PHP session is (I did not read it all, sorry).
An easy way to check this is to try with 2 browsers, or in anonymous/incognito mode: your cookies will not be shared, so you should get 2 sessions that do not block each other.
MySQL can handle a lot of parallel queries, but you can't run more than one query at a time on each connection. The way PHP is usually set up, each request goes to a different thread/process, so each process has its own connection to MySQL and the problem you mention is avoided - unless you use persistent connections inside PHP, in which case you might end up using the same connection for each request. If that's the case, it should be easy to disable them and go back to the standard one-database-connection-per-request model.
My first guess is that endpoint 2 triggers some locking on the database, and that's why endpoint 3's query is queued until endpoint 2's query finishes. This can be fixed by changing the logic in the code (avoid or minimize locking of the database), or by changing the database configuration or the table engines used to better suit the application's needs. Example: InnoDB uses row-level locking, while MyISAM locks the whole table.
Profiling will be really helpful if you don't mind configuring it. I suggest having a look at Blackfire.io, New Relic or Xdebug profiling if you go this route. You will be able to find the bottlenecks faster that way.
Hm... too long for a comment.
Slightly simplified: every engine has one queue where it gathers queries to be executed, and depending on the hardware it uses 2, 3 or even more threads to work through them. The more threads that are running, the more time every query needs, because of locks - for example, the engine locks an entire table when it inserts a new row with auto_increment (a quick search will turn up many examples of such locks). And of course every query needs memory and other resources, which it has to share with all the other software running on the server.
With clusters you pay the price of the overhead needed to manage multiple SQL servers.
So from the SQL server's side it is parallel; however, you need the hardware to support many threads/many engines (which should only be used very carefully).
Of course you can have many users in SQL, but for convenience's sake you usually have one per app, or sometimes even one per server. The same user can access the database simultaneously, though you can disable that, of course.
Your PHP runs in parallel, because web servers are built to serve parallel requests, and there it doesn't matter whether they run PHP, Python (Django) or JavaScript (Node.js) - Apache, IIS, nginx and many more; every technology has its perks, and of course the more modules you add to an engine, the slower it gets.
So everything is parallel to a certain degree, and you can increase the power of such systems, as you see with cloud providers, virtual servers and so on.
You only notice the limits at events like the launch of Pokemon Go, or new games where even the huge cloud providers crash. Or the disaster with ObamaCare, where nothing was tested at that scale, whichever idi... was responsible.
Parallelizing such tasks is difficult, because both the web server and the SQL server have caches, to a degree, where they park requests that are made often, but usually every request needs its own data.
In reality everything is much more complicated, starting with CPUs with 3 pipelines, multiple cores and shared memory (which caused Meltdown and its brethren), through tables or databases that reside entirely in memory for high performance, to web servers that run entirely in CPU cache, which is much faster than memory or hard drives...

MySQL unable to update a row when multiple selects are in process or taking too much time

I have a table called Settings which has only one row. The settings are critically important everywhere in my program; that one row is read by 200 to 300 users every second. I haven't used any caching yet. The problem: I cannot update a value like Limit in the Settings table (for example, changing the limit from 5 to 10) from an API.
Example: Limit Products 5 to 10. The update query runs forever.
From Workbench I can update the record, but from the admin panel through the API it doesn't update, or takes too much time. The table is InnoDB.
1. Already tried locking with read - write.
2. Transactions.
3. Made a view of the table and tried to update through it; the same issue remains.
4. The update query is fine from Workbench, but through the API it runs all day.
Is there any way I can lock the read operations on the table and update it? I have only one row in the table.
Any help would be highly appreciated. Thanks in advance.
This sounds like a really good use case for using query cache.
The query cache stores the text of a SELECT statement together with the corresponding result that was sent to the client. If an identical statement is received later, the server retrieves the results from the query cache rather than parsing and executing the statement again. The query cache is shared among sessions, so a result set generated by one client can be sent in response to the same query issued by another client.
The query cache can be useful in an environment where you have tables that do not change very often and for which the server receives many identical queries.
To enable the query cache, you can run:
SET GLOBAL query_cache_size = 1000000;
And then edit your mysql config file (typically /etc/my.cnf or /etc/mysql/my.cnf):
query_cache_size=1000000
query_cache_type=2
query_cache_limit=100000
And then for your query you can change it to:
SELECT SQL_CACHE * FROM your_table;
That should make it so you are able to update the table (as it won't be constantly locked). Note that the config-file changes require a server restart to take effect.
As an alternative, you could implement cacheing in your PHP application. I would use something like memcached, but as a very simplistic solution you could do something like:
$path = "/path/to/settings.json";
// Refresh from MySQL at most once per minute; otherwise reuse the cached file.
$settings = file_exists($path) ? json_decode(file_get_contents($path), true) : null;
$minute = intval(date('i'));
if (!isset($settings['minute']) || $settings['minute'] !== $minute) {
    $settings = get_settings_from_mysql();
    $settings['minute'] = $minute;
    file_put_contents($path, json_encode($settings), LOCK_EX);
}
Are the queries being run in the context of transactions with, say, a transaction isolation level of REPEATABLE READ? It sounds like the update isn't able to complete due to a lock on the table, in which case caching isn't likely to help you, as on a write the cache will be purged. More information on repeatable reads can be found at https://www.percona.com/blog/2012/08/28/differences-between-read-committed-and-repeatable-read-transaction-isolation-levels/.
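If you want to check, the isolation level and any current lock waits are easy to inspect from a MySQL client:
SHOW VARIABLES LIKE 'tx_isolation';   -- MySQL 5.x (renamed transaction_isolation in 8.0)
SHOW PROCESSLIST;                     -- look for the hung UPDATE and its state
SHOW ENGINE INNODB STATUS;            -- the TRANSACTIONS section lists lock waits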

SELECT+UPDATE to avoid returning the same result

I have a cron task running every x seconds on n servers. It will "SELECT FROM table WHERE time_scheduled < CURRENT_TIME" and then perform a lengthy task on this result set.
My problem is: how do I avoid having two separate servers perform the same task at the same time?
The idea is to update time_scheduled with a set interval after selecting it. But if two servers happen to run the query at the same time, that will be too late, no?
All ideas are welcome. It doesn't have to be a strictly MySQL solution.
Thanks!
I am guessing you have a single MySQL instance, and connections from your n servers to run this processing job. You're implementing a job queue here.
The table you mention needs to use the InnoDB access method (or one of the other transaction-friendly access methods offered by Percona or MariaDB).
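If any of your tables are still MyISAM, converting is a one-liner (a sketch; `table` is the placeholder name used throughout this answer):
ALTER TABLE `table` ENGINE=InnoDB;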
Do these items in your table need to be processed in batches? That is, are they somehow inter-related? Or is it possible for your server processes to handle them one-by-one? This is an important question, because you'll get better load balancing between your server processes if you can handle them individually or in small batches. Let's assume the small batches.
The idea is to prevent any server process from grabbing onto a row in your table if some other server process has that row. I've had to do this kind of thing a lot, and here is my suggestion; I know this works.
First, add an integer column to your table. Call it "working" or some such thing. Give it a default value of zero.
Second, assign a permanent id number to each server. The last part of the server's IP address (for example, if the server's IP address is 10.1.0.123, the id number is 123) is a good choice, because it's probably unique in your environment.
Then, when a server's grabbing work to do, use these two SQL queries.
UPDATE table
SET working = :this_server_id
WHERE working = 0
AND time_scheduled < CURRENT_TIME
ORDER BY time_scheduled
LIMIT 1
SELECT table_id, whatever, whatever
FROM table
WHERE working = :this_server_id
The first query will consistently grab a batch of rows to work on. If another server process comes in at the same time, it won't ever grab the same rows, because no process can grab rows unless working = 0. Notice that the LIMIT 1 will limit your batch size. You don't have to do this, but you can. I also threw in ORDER BY to process the rows first that have been waiting the longest. That's probably a useful way to do things.
The second query retrieves the information you need to do the work. Don't forget to retrieve the primary key values (I called them table_id) for the rows you're working on.
Then, your server process does whatever it needs to do.
When it's done, it needs to throw the row back into the queue for a later time. To do that, the server process needs to set the time_scheduled to whatever it needs to be, then to set working = 0. So, for example, you could run this query for each row you're processing.
UPDATE table
SET time_scheduled = CURRENT_TIME + INTERVAL 5 MINUTE,
working = 0
WHERE table_id = ?table_id_from_previous_query
That's it.
Except for one thing. In the real world these queuing systems get fouled up sometimes. Server processes crash. Etc. Etc. See Murphy's Law. You need a monitoring query. That's easy in this system.
This query will give a list of all jobs that are more than five minutes overdue, along with the server that's supposed to be working on them.
SELECT working, COUNT(*) stale_jobs
FROM table
WHERE time_scheduled < CURRENT_TIME - INTERVAL 5 MINUTE
GROUP BY working
If this query comes up empty, all is well. If it comes up with lots of jobs with working set to zero, your servers aren't keeping up. If it comes up with jobs with working set to some server's id number, that server is taking a lunch break.
You can reset all the jobs assigned to the server that's gone to lunch with this query, if need be.
UPDATE table
SET working=0
WHERE working=?server_id_at_lunch
By the way, a compound index on (working, time_scheduled) will probably help this perform well.
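Putting the pieces together, a minimal PHP sketch of one pass of the claim/fetch/release cycle (hypothetical PDO credentials and a table literally named jobs; the SQL mirrors the queries above):
<?php
$pdo = new PDO("mysql:host=localhost;dbname=queue", "user", "password");
$serverId = 123; // this server's permanent id number

// 1. Claim at most one due row; other servers can no longer grab it.
$pdo->prepare("UPDATE jobs SET working = :sid
               WHERE working = 0 AND time_scheduled < CURRENT_TIME
               ORDER BY time_scheduled LIMIT 1")
    ->execute([':sid' => $serverId]);

// 2. Fetch whatever we claimed (possibly nothing).
$fetch = $pdo->prepare("SELECT table_id FROM jobs WHERE working = :sid");
$fetch->execute([':sid' => $serverId]);

foreach ($fetch->fetchAll(PDO::FETCH_ASSOC) as $row) {
    // ... perform the lengthy task for $row['table_id'] ...

    // 3. Reschedule the row and release it back into the queue.
    $pdo->prepare("UPDATE jobs SET time_scheduled = CURRENT_TIME + INTERVAL 5 MINUTE,
                                   working = 0
                   WHERE table_id = :tid")
        ->execute([':tid' => $row['table_id']]);
}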

Would I gain a performance boost by normalizing this table?

I'm running a sort of forum modification code on my server. In my database, I have a HUGE table called core, which basically includes all the admin-editable settings, of which there are a lot. My question is, should I normalize this?
I've created a fiddle to show you how large this table is: http://sqlfiddle.com/#!2/f4536/1
You'll see some columns called gear_notifications, gear_bank, gear_* etc. These indicate whether a certain system is turned on. For example, if gear_bank=1, then the Bank System is turned on. At the moment, in the __construct of my DB file, I run the following query:
$settings = mysql_query("SELECT GROUP_CONCAT(d.domain) AS domains, GROUP_CONCAT(d.zb) AS zb_info, c.* FROM core c JOIN domains d ON d.cid = c.id WHERE c.id ='$cid' LIMIT 1");
Ignoring the JOIN, you can straight away see a problem here: the query returns EVERY field from the core table, regardless of whether the corresponding system is turned on. For example, if gear_bank=0, the query will still return bank_name, bank_history_perpage, bank_* etc. While this doesn't present any physical problem with how the code runs (it can just ignore any data it does not need), I'd much prefer it if it didn't have to load that data.
Would I be better off creating one table called core which has all the gear_* values, then corresponding tables (core_bank, core_* etc) for their corresponding values?
I've been querying this (bad pun, sorry!) for a while now and I just don't know enough to work out whether this will provide a performance boost for my code. Note that the above query will be run on EVERY page.
If I were to switch to the new system, i.e. the multiple tables, how would I get all the information I need? I don't want to run one query on core to work out which systems are turned on, then subsequently run an extra query on each corresponding table. I'd much prefer one query which JOINs all the necessary tables based on the values of gear_* in core. Would that be possible?
I've not yet released this forum modification so I can make as many changes as I like without any real-world impact :) .
If you split your core into multiple tables you can then use NULL to indicate if a particular entry has info for a particular system. The joins will then work as expected.
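A sketch of what that could look like (core_bank and its core_id column are hypothetical; bank_name and bank_history_perpage are the columns mentioned in the question):
SELECT c.*, b.bank_name, b.bank_history_perpage
FROM core AS c
LEFT JOIN core_bank AS b ON b.core_id = c.id
WHERE c.id = 1;
-- The b.* columns come back NULL when the bank system is off (no core_bank row).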
However, your table really is not that big and it's doubtful that you will actually notice any speed improvement at the application level by breaking it up.
And if you don't need to query on individual columns then just use one TEXT column, put all your attributes in an array and then serialize the array into the text column.
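A sketch of that approach, using JSON as the serialization format (settings_blob is a hypothetical TEXT column):
<?php
$pdo = new PDO("mysql:host=localhost;dbname=forum", "user", "password");
$settings = ['bank_name' => 'First National', 'bank_history_perpage' => 25];

// Write the whole settings array into one TEXT column ...
$pdo->prepare("UPDATE core SET settings_blob = :blob WHERE id = :id")
    ->execute([':blob' => json_encode($settings), ':id' => 1]);

// ... and read it back as an array.
$row = $pdo->query("SELECT settings_blob FROM core WHERE id = 1")->fetch();
$settings = json_decode($row['settings_blob'], true);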
