I was experiencing some problems with duplicate queries slowing down the rendering of an HTML table (very similar SELECT queries inside a while loop), so I created some simple caching functions in PHP:
check_cache()
write_cache()
return_cache()
These functions prevent the server from querying the database at all, which sped things up quite a lot!
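For reference, here is a minimal sketch of what such functions might look like, using a file-based cache keyed on the query text; the cache directory and 60-second lifetime are assumptions for illustration, not the original implementation:
function cache_path($query) {
    // Key the cache file on a hash of the exact query text.
    return '/tmp/query_cache_' . md5($query);
}

function check_cache($query, $max_age = 60) {
    $file = cache_path($query);
    return file_exists($file) && (time() - filemtime($file)) < $max_age;
}

function write_cache($query, $rows) {
    file_put_contents(cache_path($query), serialize($rows));
}

function return_cache($query) {
    return unserialize(file_get_contents(cache_path($query)));
}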
Later I read that MySQL caches SELECT statements:
The query cache stores the text of a SELECT statement together with the corresponding result that was sent to the client. If an identical statement is received later, the server retrieves the results from the query cache rather than parsing and executing the statement again.
Why does this increase performance if MySQL is already doing this?
Possible issues
1) If your application updates tables frequently, the query cache will be constantly purged and you won't get any benefit from it.
2) The query cache is not supported for partitioned tables.
3) The query cache does not work in an environment where you have multiple mysqld servers updating the same MyISAM tables.
4) The SELECT statements must be identical.
Related
I have one MySQL table which is controlled strictly by an admin. Data entry is very low, but queries against that table are high. Since the table's content will not change much, I was thinking of using the MySQL query cache with PHP, but when I googled about it I got confused with memcached.
What is the basic difference between memcached and mysqlnd_qc ?
Which is most suitable for me as per below condition ?
I also intend to extend the same for an autocomplete box; which will be suitable in such a case?
My queries will mostly return fewer than 30 rows of only a few bytes of data each, and will use the same SELECT queries. I am on a single server and no load sharing will be done. Thank you in advance.
If your query is always the same, i.e. you do SELECT title, stock FROM books WHERE stock > 5 and your condition never changes to stock > 6 etc., I would suggest using MySQL Query Cache.
Memcached is a key-value store. Basically it can cache anything you can turn into key => value. There are a lot of ways you can implement caching with it. You could query your 30 rows from the database and then cache them row by row, but I don't see a reason to do that here if you're returning the same set of rows over and over. The most basic example I can think of for memcached is:
// Run the query
$result = mysqli_query($con, "SELECT title, stock FROM books WHERE stock > 5");
// Fetch the result into one array
$rows = mysqli_fetch_all($result, MYSQLI_ASSOC);
// Put the result into memcache with a 30-second lifetime.
$memcache_obj->add('my_books', serialize($rows), false, 30);
Then do a $memcache_obj->get('my_books'); and unserialize it to get the same results.
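In code, the read side (with a fallback for a cache miss) might look like this sketch, reusing the $memcache_obj and key from above:
$cached = $memcache_obj->get('my_books');
if ($cached !== false) {
    $rows = unserialize($cached);
} else {
    // Cache miss (expired or never set): re-run the query and
    // repopulate the cache as shown above.
}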
But since you're using the same query over and over, why add the complication when you can let MySQL handle all the caching for you? Remember that if you go with the memcached option, you need to set up a memcached server as well as implement logic to check whether the result is already in the cache and whether the records have changed in the database.
I would recommend using MySQL query cache over memcached in this case.
One thing you need to be careful about with the MySQL query cache, though, is that your query must be exactly the same: no extra blank spaces or comments whatsoever. This is because MySQL does no parsing when comparing an incoming query string against the cache; any extra character anywhere makes it a different query.
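For example, these two statements return the same rows, but because their text differs they are stored as two separate cache entries:
// Cached as one entry:
$a = mysqli_query($con, "SELECT title, stock FROM books WHERE stock > 5");
// Cached as a second, separate entry (different case and spacing):
$b = mysqli_query($con, "select title,stock from books where stock > 5");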
Peter Zaitsev explains the MySQL query cache very well at http://www.mysqlperformanceblog.com/2006/07/27/mysql-query-cache/, which is worth a look. Make sure you don't need anything that the MySQL query cache does not support, as Peter Zaitsev mentions.
If the queries run fast enough and do not really slow down your application, do not cache them. With a table this small, MySQL will keep it in its own cache. If your application and database are on the same server, the benefit will be very small, maybe not even measurable at all.
So, for your third question, it also depends on how you query the underlying tables. Most of the time it is sufficient to let MySQL cache internally. Another approach is to generate all the possible combinations and store those, so MySQL does not need to compute the matching rows and can return the right one straight away.
As a general rule: build your application without caching and only add caches for things that do not change often, if a) the computation of the result set is complex and time-consuming, or b) you have multiple application instances calling the database over a network. In those cases, caching results in better performance.
Also, if you run PHP in a web server like Apache, caching inside your program does not add much benefit, as the cache only lives for the current page. An external cache (like memcached) is then needed to share cached results across multiple requests.
What is the basic difference between memcached and mysqlnd_qc ?
There is essentially nothing in common between them at all.
Which is most suitable for me as per below condition ?
The MySQL query cache.
I also intend to extend the same for an autocomplete box; which will be suitable in such a case?
Sphinx Search
This seems like a pretty basic question but one I don't know the answer to.
I wrote a script in PHP that loops through some data and then performs an UPDATE to records in our database. There are roughly 150,000 records, so the script certainly takes a while to complete.
Could I potentially harm or interfere with the data insertion if I run a basic SELECT statement?
Say... I want to ensure that the script is working properly, so I run a basic SELECT COUNT() to see whether the count increases in real time as the script runs. Is this possible, or would it screw something up?
Thank you!
Generally a SELECT call is incapable of "causing harm" provided you're not talking about SQL injection problems.
The InnoDB engine, which you should be using, has what's called Multi-Version Concurrency Control, or MVCC for short. It means that until your UPDATE statement (or the transaction it is part of) finishes, the SELECT will be run against the last consistent database state.
If you're using MyISAM, which is a very bad idea in most production environments due to the limitations of that engine and the way the data is stored without a rollback journal, the SELECT call will probably block until the UPDATE is applied since it does not support MVCC.
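A sketch of what this looks like in practice, using two separate connections to simulate the update script and a monitoring query (credentials, table, and column names are placeholders):
// Two independent connections: one writing, one reading.
$writer = new mysqli('localhost', 'user', 'pass', 'mydb');
$reader = new mysqli('localhost', 'user', 'pass', 'mydb');

$writer->begin_transaction();
$writer->query("UPDATE records SET processed = 1 WHERE id <= 1000");

// On InnoDB, this SELECT does not block and reads the last committed
// snapshot, so it will not see the uncommitted UPDATE above.
$result = $reader->query("SELECT COUNT(*) FROM records WHERE processed = 1");

$writer->commit(); // only now will subsequent reads see the change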
I have a rapidly growing, write-heavy PHP/MySql application that inserts new rows at a rate of a dozen or so per second into an INNODB table of several million rows.
I started out using realtime INSERT statements and then moved to PHP's file_put_contents to write entries to a file and LOAD DATA INFILE to get the data into the database. Which is the better approach?
Are there any alternatives I should consider? How can I expect the two methods to handle collisions and increased load in the future?
Thanks!
Think of LOAD DATA INFILE as a batch method of inserting data. It eliminates the overhead of firing up an insert query for every row and is therefore much faster. However, you lose some control over error handling: it's much easier to handle an error on a single insert query than on one row in the middle of a file.
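A sketch of that batch workflow in PHP (the file path, table, and columns are assumptions, and the mysqld server process must be able to read the file):
// Append each entry to a flat file as it arrives -- no DB round trip.
$line = implode("\t", array($userId, $action, date('Y-m-d H:i:s'))) . "\n";
file_put_contents('/tmp/pending_rows.txt', $line, FILE_APPEND | LOCK_EX);

// Periodically (e.g. from a cron job), bulk-load the file in one shot.
// In production you would rename the file first so rows appended during
// the load are not lost when the file is truncated afterwards.
$db = new mysqli('localhost', 'user', 'pass', 'mydb');
$db->query("LOAD DATA INFILE '/tmp/pending_rows.txt'
            INTO TABLE events
            FIELDS TERMINATED BY '\\t'
            (user_id, action, created_at)");
file_put_contents('/tmp/pending_rows.txt', '');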
Depending on whether you can afford the data inserted by PHP not being instantly available in the table, INSERT DELAYED might be an option.
MySQL will accept the data to be inserted and deal with the insertion later on, putting it into a queue. So this won't block your PHP application while MySQL ensures the data is inserted later.
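Usage is a one-word change to the statement (the table and columns below are illustrative; note that INSERT DELAYED only works with MyISAM, MEMORY, and ARCHIVE tables, and was removed in MySQL 5.7):
// The call returns immediately; MySQL queues the row and writes it later.
mysql_query("INSERT DELAYED INTO access_log (user_id, page, hit_at)
             VALUES (42, '/home', NOW())");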
As it says in the manual:
Another major benefit of using INSERT DELAYED is that inserts from many clients are bundled together and written in one block. This is much faster than performing many separate inserts.
I have used this for logging data where losing a record is not fatal, but if you want to be protected from server crashes while INSERT DELAYED data has not yet been inserted, you could look into replicating the changes to a dedicated slave machine.
The way we deal with our inserts is to send them to a message queue system like ActiveMQ. From there we have a separate application that loads the inserts using LOAD DATA INFILE in batches of about 5000. Error handling can still take place with the infile, and it processes the inserts much faster. If setting up a message queue is outside the scope of your application, there is no reason file_put_contents would not be an acceptable option, especially if it's already implemented and working fine.
Additionally you may want to test disabling indexes during writes to see if that improves performance.
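A sketch of that test (DISABLE KEYS suspends maintenance of non-unique indexes only, and only on MyISAM tables; the table name is illustrative):
// Suspend non-unique index maintenance during the bulk write...
mysql_query("ALTER TABLE events DISABLE KEYS");

// ...run the inserts or LOAD DATA INFILE here...

// ...then rebuild the indexes in a single pass.
mysql_query("ALTER TABLE events ENABLE KEYS");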
It doesn't sound like you should be using InnoDB. Regardless, a dozen inserts per second should not be problematic even on crappy hardware, unless, possibly, your data model is very complex. But for that, LOAD DATA INFILE is very good because, among other things, it rebuilds the indexes only once, as opposed to on every insert. So using files is a decent approach, but do make sure you open them in append-only mode.
In the long run (1k+ writes/s), look at other databases, particularly Cassandra, for write-heavy applications.
If you do go the SQL insert route, wrap the PDO execute statements in a transaction. Doing so will greatly speed up the process.
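A minimal sketch of that pattern (the DSN, credentials, and statement are placeholders):
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$stmt = $pdo->prepare('INSERT INTO events (user_id, action) VALUES (?, ?)');

// One transaction around the whole batch means one flush to disk
// instead of one per row.
$pdo->beginTransaction();
foreach ($rows as $row) {
    $stmt->execute(array($row['user_id'], $row['action']));
}
$pdo->commit();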
LOAD DATA is disabled on some servers for security reasons:
http://dev.mysql.com/doc/mysql-security-excerpt/5.0/en/load-data-local.html
Also I don't enjoy writing my applications upside down to maintain database integrity.
I have about 10 tables with ~10,000 rows each which need to be pulled very often.
For example, list of countries, list of all schools in the world, etc.
PHP can't persist this stuff in memory (to my knowledge), so I would have to query the server with a SELECT * FROM table every time. Should I use memcached here? At first thought it's a clear, absolute yes, but on second thought, wouldn't MySQL already be caching for me, making this almost redundant?
I don't have much understanding of how MySQL caches data (or whether it even caches entire tables).
You could use MySQL query cache, but then you are still using DB resources to establish the connection and execute the query. Another option is opcode caching if your pages are relatively static. However I think memcached is the most flexible solution. For example if you have a list of countries which need to be accessed from various code-points within your application, you could pull the data from the persistent store (mysql), and store them into memcached. Then the data is available to any part of your application (including batch processes and cronjobs) for any business requirement.
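For instance, a small get-or-load helper along these lines (the key, one-hour lifetime, and query are illustrative):
function get_countries($memcache, $db) {
    $rows = $memcache->get('countries');
    if ($rows === false) {
        // Cache miss: load from MySQL and keep the result for an hour.
        $result = mysqli_query($db, 'SELECT id, name FROM countries');
        $rows = mysqli_fetch_all($result, MYSQLI_ASSOC);
        $memcache->set('countries', $rows, 0, 3600);
    }
    return $rows;
}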
I'd suggest reading up on the MySQL query cache:
http://dev.mysql.com/doc/refman/5.6/en/query-cache.html
You do need some kind of a cache here, certainly; layers of caching within and surrounding the database are considerably less efficient than what memcached can provide.
That said, if you're jumping to the conclusion that the Right Thing is to cache the query itself, rather than to cache the content you're generating based on the query, I think you're jumping to conclusions -- more analysis is needed.
What data, other than the content of these queries, is used during output generation? Would a page cache or page fragment cache (or caching reverse-proxy in front) make more sense? Is it really necessary to run these queries "often"? How frequently does the underlying data change? Do you have any kind of a notification event when that happens?
Also, SELECT * queries without a WHERE clause are a "code smell" (indicating that something probably is being done the Wrong Way), especially if not all of the data pulled is directly displayed to the user.
I have a particular PHP page that, for various reasons, needs to save ~200 fields to a database. These are 200 separate insert and/or update statements. Now the obvious thing to do is reduce this number but, like I said, for reasons I won't bother going into I can't do this.
I wasn't expecting this problem. Selects seem reasonably performant in MySQL but inserts/updates aren't (it takes about 15-20 seconds to do this update, which is naturally unacceptable). I've written Java/Oracle systems that can happily do thousands of inserts/updates in the same time (in both cases running local databases; MySQL 5 vs OracleXE).
Now in something like Java or .NET I could quite easily do one of the following:
1. Write the data to an in-memory write-behind cache (i.e. it would know how to persist to the database and could do so asynchronously);
2. Write the data to an in-memory cache and use the PaaS (Persistence as a Service) model, i.e. a listener on the cache would persist the fields; or
3. Simply start a background process that could persist the data.
The minimal solution is to have a cache that I can simply update, and which will separately go and update the database in its own time (i.e. the call returns immediately after updating the in-memory cache). This can either be a global cache or a session cache (although a global shared cache does appeal in other ways).
Any other solutions to this kind of problem?
mysql_query('INSERT INTO tableName VALUES(...),(...),(...),(...)');
The multi-row query statement above is better, but there is another way to improve insert performance.
Follow these steps:
1. Create a CSV (comma-separated delimited) file or a simple text file and write out all the data you want to insert, using a file-writing mechanism (like the FileOutputStream class in Java).
2. Use this command:
LOAD DATA INFILE 'data.txt' INTO TABLE table2
FIELDS TERMINATED BY '\t';
3. If you are not clear about this command, see the MySQL documentation for LOAD DATA INFILE.
You should be able to do 200 inserts relatively quickly, but it will depend on lots of factors. If you are using a transactional engine and doing each one in its own transaction, don't - that creates way too much I/O.
If you are using a non-transactional engine, it's a bit trickier. Using a single multi-row insert is likely to be better as the flushing policy of MySQL means that it won't need to flush its changes after each row.
You really want to be able to reproduce this on your production-spec development box and analyse exactly why it's happening. It should not be difficult to fix.
Of course, another possibility is that your inserts are slow because of extreme sized tables or large numbers of indexes - in which case you should scale your database server appropriately. Inserting lots of rows into a table whose indexes don't fit into RAM (or doesn't have RAM correctly configured to be used for caching those indexes) generally gets pretty smelly.
BUT don't try to look for a way of complicating your application when there is a way of simply tuning it instead, keeping the current algorithm.
One more solution you could use (instead of tuning MySQL :) ) is a JMS server with a STOMP connection driver for PHP, to write data to the database server asynchronously. ActiveMQ has built-in support for the STOMP protocol, and there is the StompConnect project, which is a STOMP proxy for any JMS-compliant server (OpenMQ, JBossMQ, etc.).
You can update your local cache (hopefully memcached) and then push the write requests through beanstalkd.
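A sketch of the producer side, assuming the pda/pheanstalk client library (v3-style API) and an illustrative tube name; a separate worker process would reserve these jobs and apply the writes:
use Pheanstalk\Pheanstalk;

$pheanstalk = new Pheanstalk('127.0.0.1');

// Update the cache immediately so readers see the new value...
$memcache_obj->set('record_' . $id, $fields, 0, 3600);

// ...and queue the database write for a background worker.
$pheanstalk->useTube('db-writes')->put(json_encode(array(
    'id'     => $id,
    'fields' => $fields,
)));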
I would suspect a problem with your SQL inserts; it really shouldn't take that long. Would prepared queries help? Does your MySQL server need more memory dedicated to the keyspace? I think some more questions need to be asked.
How are you doing the inserts? Are you doing one insert per record:
mysql_query('INSERT INTO tableName VALUES(...)');
mysql_query('INSERT INTO tableName VALUES(...)');
mysql_query('INSERT INTO tableName VALUES(...)');
mysql_query('INSERT INTO tableName VALUES(...)');
mysql_query('INSERT INTO tableName VALUES(...)');
or are you using a single query
mysql_query('INSERT INTO tableName VALUES(...),(...),(...),(...)');
The latter of the two options is substantially faster; from experience, the first option takes much longer because PHP must wait for each query to finish before moving to the next.
Look at the statistics for your database while you do the inserts. I'm guessing that one of your updates locks the table, and therefore all your statements are queued up, which produces the delay you're seeing. Another thing to look into is your index creation/updating, because the more indices you have on a table, the slower all UPDATE and INSERT statements get.
Another thing is that I think you are using MyISAM (table engine), which locks the entire table on UPDATE. I suggest you use InnoDB instead. InnoDB is slower on SELECT queries, but faster on INSERT and UPDATE because it only locks the row it's working on, not the entire table.
consider this:
mysql_query('START TRANSACTION');
mysql_query('INSERT INTO tableName VALUES(...)');
mysql_query('INSERT INTO tableName VALUES(...)');
mysql_query('INSERT INTO tableName VALUES(...)');
mysql_query('INSERT INTO tableName VALUES(...)');
mysql_query('INSERT INTO tableName VALUES(...)');
mysql_query('COMMIT');
Note that if your table is INSERT-ONLY (no deletes, and no updates on variable-length columns), then inserts will not lock or block reads when using MyISAM.
This may or may not improve insert performance, but it could help if you are having concurrent insert/read issues.
I'm using this, and only purging old records daily, followed by OPTIMIZE TABLE.
You can use cURL with PHP to do asynchronous database manipulations.
One possible solution is to fork each query into a separate thread, but PHP does not support threads. We could use the PCNTL functions, but they are a bit tricky to use. I prefer this other solution for creating a fork and performing asynchronous operations.
Refer to this:
http://gonzalo123.wordpress.com/2010/10/11/speed-up-php-scripts-with-asynchronous-database-queries/
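The idea boils down to firing an HTTP request at a worker script without waiting for the response. A minimal sketch of that trick using curl_multi (the worker URL is a placeholder):
// curl_multi_exec kicks the request off and returns immediately,
// instead of blocking until the worker finishes its database work.
$ch = curl_init('http://localhost/db_worker.php?task=update');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$mh = curl_multi_init();
curl_multi_add_handle($mh, $ch);

$running = null;
curl_multi_exec($mh, $running);
// The main script continues; the worker runs in parallel.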