I have an auction site that sometimes becomes heavily loaded, and MySQL is usually the process consuming most of the memory and CPU. The situation I have is as follows.
An AJAX request goes to MySQL every second for every user who is online and watching the auction, to check the bid count against a previously stored value. If anyone places a bid, the count is different, so this AJAX call triggers a second one that retrieves records and displays, in a table, the bids specific to the user who is watching / logged in. I'm limiting this to the first 10 rows to reduce load.
However, the problem is that if there are 50 users online and one of them places a bid, 50 queries go to MySQL, all of them detect that the bid count has changed, and each issues further queries to fetch the records to display for that user.
The bigger problem is that if there are 500 users online, 500 queries go to MySQL just to detect a change, and if a bid is placed another 500 queries (one specific to each online user) follow, which can potentially crash the server.
Note: currently there is a single MySQL connection object, used as a singleton in a PHP file that is responsible for executing queries, retrieving records, etc.
I'm essentially looking for a solution where 500 queries don't go to MySQL when 500 users are online, but all of them still get an update whenever one of them places a bid on a particular auction. Any ideas / suggestions are highly welcome.
How can I best implement a solution for this scenario that reduces the load on MySQL?
Resource-wise we are fairly OK, running a VPS4 on HostGator. The only problem is CPU / memory usage, which hits 95% when many users are placing bids.
I'd appreciate some suggestions.
It sounds like you will want to take a look at memcached or some other caching service. You can have a process querying MySQL and writing the results into memcached, and the AJAX endpoint reading directly from memcached to retrieve the rows.
Memcached does not maintain relational consistency, but querying it is much less resource-consuming than querying MySQL every single time.
PHP has a very nice interface for working with memcached: the Memcache extension.
The website of the memcached project.
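To make that concrete, here is a minimal sketch of the polling endpoint with the bid count cached for one second (this uses the newer Memcached extension; the key scheme, table and column names, and credentials are placeholders of mine, not something from your code). With the count cached, any number of polling users produces at most one COUNT query per auction per second.
<?php
// poll_bid_count.php - hypothetical polling endpoint (sketch only)

$cache = new Memcached();
$cache->addServer('127.0.0.1', 11211);
$db = new mysqli('localhost', 'user', 'pass', 'auction_db');   // placeholder credentials

$auctionId = (int)$_GET['auction_id'];
$key = "auction:{$auctionId}:bid_count";                        // assumed key scheme

$count = $cache->get($key);
if ($count === false) {
    // Cache miss: only this request touches MySQL; everyone else reads the cached value.
    $stmt = $db->prepare('SELECT COUNT(*) FROM bids WHERE auction_id = ?');
    $stmt->bind_param('i', $auctionId);
    $stmt->execute();
    $stmt->bind_result($count);
    $stmt->fetch();
    $stmt->close();

    $cache->set($key, (int)$count, 1);   // 1-second TTL: ~1 query/second instead of 1 per user
}

header('Content-Type: application/json');
echo json_encode(['bidCount' => (int)$count]);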
There are a few other caching services. You might also want to look at query caching in MySQL, but that would still require separate connections to MySQL, which is resource-consuming either way.
In the short term, you could also just run the detailed query directly. It will return nothing when there's nothing to update (which replaces the first query!).
That might buy you some time for caching or deeper analysis of your query speed.
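A sketch of that short-term approach, assuming the client keeps track of the highest bid ID it has already rendered (table and column names are placeholders): an empty result set means nothing changed, so the separate count check disappears.
<?php
// new_bids.php - hypothetical single endpoint replacing the count check + fetch

$db = new mysqli('localhost', 'user', 'pass', 'auction_db');   // placeholder credentials

$auctionId  = (int)$_GET['auction_id'];
$lastSeenId = (int)$_GET['last_seen_bid_id'];                   // tracked by the client

$stmt = $db->prepare(
    'SELECT bid_id, bidder_name, amount, placed_at
       FROM bids
      WHERE auction_id = ? AND bid_id > ?
      ORDER BY bid_id DESC
      LIMIT 10'
);
$stmt->bind_param('ii', $auctionId, $lastSeenId);
$stmt->execute();
$rows = $stmt->get_result()->fetch_all(MYSQLI_ASSOC);

header('Content-Type: application/json');
echo json_encode($rows);   // [] when there is nothing new to display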
I have a Debian VPS configured with a standard LAMP stack.
On this server there is only one site (a shop), which has a few cron jobs, mostly PHP scripts. One of them is an update script executed via the Lynx browser, which sends tons of queries.
When this script runs (it takes 3-4 minutes to complete) it consumes all of MySQL's resources, and the site almost stops working (pages generate in 30-60 seconds instead of 1-2 s).
How can I limit this script (i.e. extend its execution time by limiting the resources available to it) so that other services can run properly? I believe there is a simple solution to the problem but I can't find it. It seems my Google superpowers have been limited for the last two days.
You don't have access to modify the offending script, so fixing this requires database administrator work, not programming work. Your task is called tuning the MySQL database.
(I guess you already asked your vendor for help with this, and they said no.)
Run top or htop while the script runs. Is the CPU pinned at 100%? Is RAM exhausted?
1) Just live with it, and run the update script at a time of day when your web site doesn't have many visitors. Fairly easy, but not a real solution.
2) As an experiment, add RAM to your VPS instance. It may let MySQL do things all-in-RAM that it's presently putting on the hard drive in temporary tables. If it helps, that may be a way to solve your problem with a small amount of work, and a larger server rental fee.
3) Add some indexes to speed up the queries in your script, so each query gets done faster. The question is, what indexes will help? (Just adding indexes randomly generally doesn't help much.)
First, figure out which queries are slow. Run the command SHOW FULL PROCESSLIST repeatedly while your script runs. The Info column in the result shows all the running queries. Copy them into a text file to keep them. (Or you can use MySQL's slow query log, about which you can read online.)
Second, analyze the worst offending queries to see whether there's an obvious index to add. Telling you how to do that in general is beyond the scope of a Stack Overflow answer. You might ask another question about a specific query. Before you do, please read this note about asking good SQL questions, and pay attention to the section on query performance.
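Incidentally, if you would rather not sit at the console copying queries by hand for that first step, a rough capture script along these lines works too (the credentials, the 2-second threshold and the log path are assumptions):
<?php
// capture_queries.php - hypothetical helper run from the CLI while the cron script executes

$db  = new mysqli('localhost', 'root', 'secret');               // placeholder credentials
$log = fopen('/tmp/slow-queries.log', 'a');

for ($i = 0; $i < 300; $i++) {                                  // watch for about 5 minutes
    $result = $db->query('SHOW FULL PROCESSLIST');
    while ($row = $result->fetch_assoc()) {
        // 'Time' is how long the statement has been running; 'Info' is the SQL text.
        if ($row['Command'] === 'Query' && (int)$row['Time'] >= 2 && !empty($row['Info'])) {
            fwrite($log, date('c') . " ({$row['Time']}s) {$row['Info']}\n");
        }
    }
    sleep(1);
}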
4) It's possible your script is SELECTing many rows, or using SELECT to summarize many rows, from tables that also need to be updated when users visit your web site. In that case your visitors may be waiting for those SELECTs to finish. If you could change the script, you could put this statement right before the long-running SELECTs.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
This allows the SELECT statement after it to do a "dirty read", in which it might get an earlier version of an updated row. See here.
Or, if you can figure out how to insert one statement into your obscured script, put this one right after it opens a database session.
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
Without access to the source code, though, you only have one way to see if this is the problem. That is, access the MySQL server from a privileged account, right before your script runs, and give these SQL commands.
SHOW VARIABLES LIKE 'tx_isolation';
SET GLOBAL TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
then see if the performance problem improves. Set it back after your script finishes, probably like this (depending on the tx_isolation value retrieved above):
SET GLOBAL TRANSACTION ISOLATION LEVEL READ COMMITTED;
Warning: a permanent global change to the isolation level might foul up your application if it relies on transaction consistency. This is just an experiment.
5) Harass the script's author to fix this problem.
Slow queries? High CPU? High I/O? Then you must look at the queries. You cannot "tune your way out of a performance problem". Tuning might give you a few percent improvement; fixing the indexes and queries is likely to give you a lot more improvement.
See this for finding the 'worst' queries; then come back with SELECTs, EXPLAINs, and SHOW CREATE TABLEs for help.
I'm currently working on a project which is like an e-commerce site. There are hundreds of thousands of records in the database tables, and I have to use JOIN operations on them to fetch data, because the project has a query builder for selecting data by criteria. It takes too much time to fetch the data, so I'm using LIMIT to show some number of records (e.g. 10) per page. Now I have come across the concept of memcached, and I thought of using it for my project, since the expensive fetch would then only have to happen once. But I still have some doubts.
Will too many cache files be a problem? I mean, a lot of files will be created, since there will be one cache file for each page of each module, so the count will reach approximately 10,000 cache files.
Let's assume the number of files is not a problem. But what about updating the cached entries using replace() when a row is added to or deleted from the middle of a table? And here the tables are updated roughly every week.
So I'm in a dilemma: should I go for memcached or not? If anyone can advise and answer with an explanation, it will be appreciated.
If your website executes many of the same MySQL queries that frequently return the same data, then yes, there is probably some benefit to running memcached.
Problem:
"There are hundreds of thousands of records...It takes too much time to fetch the data".
This probably indicates a problem with your schema. Properly indexed, even when using JOINs, the queries should be able to execute quickly (< 0.1 seconds). Run an EXPLAIN query on the queries that are taking a long time to run and see if they can be improved.
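For example, prefixing one of the slow queries with EXPLAIN shows whether an index is being used (the SELECT below is only a stand-in for one of your queries, and the credentials are placeholders):
<?php
// explain_check.php - sketch: dump the query plan for a suspect query

$db = new mysqli('localhost', 'user', 'pass', 'shop');          // placeholder credentials

$slowQuery = 'SELECT p.*, c.name
                FROM products p
                JOIN categories c ON c.id = p.category_id
               WHERE p.price > 100
               LIMIT 10';                                       // stand-in for a real slow query

$plan = $db->query('EXPLAIN ' . $slowQuery)->fetch_all(MYSQLI_ASSOC);

// In the output, key = NULL together with a large "rows" estimate usually
// means a useful index is missing on the joined or filtered columns.
print_r($plan);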
Answer to Question 1
There won't be an issue with too many cache files. Memcached stores all cached information in memory (hence the name), so no disk files are used. Cached objects are stored in RAM and accessed directly from RAM.
Answer to Question 2
Not exactly sure what you are asking here, but if your application updates or deletes information from the database, then it is critical that the cache items affected by those updates and deletes are removed. If the application doesn't remove the affected cached items, then the next time the data is queried, cached results which are no longer valid may be returned. Make sure any cached data either has an appropriate expiration time set, or is removed from the cache by the application when the data in the database changes.
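A minimal invalidation sketch (the key scheme and the weekly-update hook are assumptions): delete the affected keys right after the write, and give every cached entry a TTL as a safety net.
<?php
// Sketch only: pair every write to the table with an explicit cache delete.

$cache = new Memcached();
$cache->addServer('127.0.0.1', 11211);

function invalidateModulePages(Memcached $cache, string $module, int $pageCount): void
{
    // Call this right after the weekly INSERT/UPDATE/DELETE touches the table.
    for ($page = 1; $page <= $pageCount; $page++) {
        $cache->delete("pages:{$module}:{$page}");              // assumed key scheme
    }
}

// When populating the cache, always set an expiration as well, so stale
// entries die on their own even if an invalidation is ever missed:
// $cache->set("pages:{$module}:{$page}", $rows, 3600);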
Hope that helps.
I would start not from memcached but from figuring out what the bottleneck is. Your tables have roughly a million rows. I don't know the size of a row, but my educated guess is that it is less than 1 KB, based on the fact that a browser window accommodates the information from one record.
So there is probably about 1 GB of information in your database. Correct me if I'm wrong. If that's true, then the whole database should be automatically cached in RAM by MySQL.
Once your database is entirely in RAM, then with properly organized indexes the cost of a query should be roughly linear in the size of the result set, which is measured in kilobytes because it has to fit in a browser window.
So my advice is to determine the size of the database and to look at the output of the top command to see how much memory MySQL is consuming. Once you are sure your database sits entirely in memory, run EXPLAIN against your most popular queries and add indexes to your database according to its results. Even if your database is bigger than the available RAM, I still recommend looking at the EXPLAIN results, because it really helps a lot.
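To get that size figure, you can sum data and index sizes from information_schema (a sketch; credentials are placeholders) and compare the result with the memory that top shows MySQL using:
<?php
// db_size.php - sketch: report data + index size per schema

$db = new mysqli('localhost', 'user', 'pass');                  // placeholder credentials

$sql = "SELECT table_schema,
               ROUND(SUM(data_length + index_length) / 1024 / 1024) AS size_mb
          FROM information_schema.tables
         GROUP BY table_schema";

foreach ($db->query($sql) as $row) {
    echo "{$row['table_schema']}: {$row['size_mb']} MB\n";
}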
I have an AJAX search facility on my website, and when I search for something on the live site, the page freezes for a short period until the results arrive (the tables have no more than 20 entries). Nothing else on the website is clickable during that time, but it doesn't freeze the computer; I can still click other tabs in the browser, etc.
I am using this query in MySQL/InnoDB, which takes 0.031 sec to run:
select * from members m where
memberID LIKE '%$keyword%' OR
memberName LIKE '%$keyword%' AND
memberTypeID=2;
I think it is partly related to the connection to the server and partly to the server's performance. How can I improve this?
I use Bootstrap pagination to put all the data in a paginated table that has search, sort, page and entries-per-page options, all of which are handled on the client side.
There is no exact method to measure how powerful your server should be, but you can always make predictions.
Solution 1: If you are using complex database queries, you should go for a powerful dedicated server.
Solution 2: You could make the AJAX call to the database asynchronous, so other things keep loading and the page doesn't hang.
More about AJAX call: Get data from mysql database using php and jquery ajax
Mostly it depends on two things:
1) Size of the website (each time someone opens the website, your bandwidth is consumed).
2) Memory used by SQL queries (suppose you have 100,000 records in a table: a simple SELECT query won't consume that much, but if your query is complex, contains joins and is not optimized properly, then it may consume a lot).
In my experience, a 2 MB website with 4 tables of approximately 50,000 records each can work fine on shared hosting if you have fewer than 50 users an hour.
Anything above that and you need a virtual private server. (This is not a fixed rule; a few providers offer very powerful shared hosting, but the price is also high.)
In your case I guess you are using free hosting or your localhost.
We are building a social website using PHP (Zend Framework) and MySQL, with the server running Apache.
There is a requirement where, in the dashboard, the application will fetch data for different events (there are about 12 of them) and update the user's dashboard based on them. We expect the total number of users to be around 500k to 700k, with on average about 20% of users online at any one time (at peak times we expect 50% of users to be online).
The problem is that, as per our current design, the event data will be placed in a MySQL database. I don't think running a few hundred thousand queries concurrently against MySQL would be a good idea, even if we use Amazon RDS. So we are considering using DynamoDB (or Redis, or some other NoSQL option) alongside MySQL.
So the question is: would having data in both MySQL and a NoSQL database give us the scalability we need for our web application, or should we consider some other solution?
Thanks.
You do not need to duplicate your data. One option is to use the ElastiCache service that Amazon provides to give yourself in-memory caching. This will get rid of many of your database calls and, in a sense, remove that bottleneck, but it can be expensive. If you can sacrifice real-time updates, then you can get away with slowing down the requests or caching data locally for the user; say, cache the next N events on the browser if possible and display them instead of making another request to the servers.
If it has to be real time, then look at ElastiCache and tweak the number of cache nodes you need to handle your estimated amount of traffic. There is no point in duplicating your data: keep it in a single DB if it makes sense to keep it there. If you have some relational information that you need and also a variable-schema part of the system, then you can use both databases, but not to load-balance them against each other.
I would also start to think about the bottlenecks in your architecture, and about how well your application can scale if you actually reach your estimated numbers.
I agree with #sean, there's no need to duplicate the database. Have you thought about something with auto-scalability, like Xeround? A solution like that can scale out automatically across several nodes when you have throughput peaks and later scale back in, so you don't have to commit to a larger, more expensive instance just because of seasonal peaks.
Additionally, if I understand correctly, no code changes are required for this auto-scalability. So I'd say that unless you need to duplicate your data across MySQL and a NoSQL DB for reasons other than scalability, go for a single DB with auto-scaling.
If I have a single page whose contents are generated from a MySQL database (a simple query to display the contents of a cell containing HTML), how many users can hit that page at once without crashing the database?
I have to display the same data across multiple domains, so instead of duplicating the page on three domains, I'm going to load it into MySQL and use this route to display it. However, I want to find out how many concurrent connections I could handle without crashing the database.
Can anyone point me in the right direction for finding this out? I'm assuming this small query shouldn't be a huge load, but if 10,000 visitors hit it at once, then what?
You need to check your setting for max_connections; you can get more information about this by looking at the following link: http://www.electrictoolbox.com/update-max-connections-mysql/
Are you using any framework? If so, most have caching built in, which should solve the issue, assuming the information isn't being updated on a moment-to-moment basis. Even if it is, try to cache whatever parts of the page you expect not to change that often.
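Even without a framework, a few lines of local caching give the same effect. This sketch assumes the APCu extension is installed and uses placeholder table/column names and a 60-second TTL:
<?php
// Cache the generated HTML in shared memory so repeated hits within the TTL
// never touch MySQL at all.

$html = apcu_fetch('shared_page_html', $found);
if (!$found) {
    $db   = new mysqli('localhost', 'user', 'pass', 'content_db');  // placeholder credentials
    $row  = $db->query('SELECT body FROM pages WHERE id = 1')->fetch_assoc();
    $html = $row['body'];
    apcu_store('shared_page_html', $html, 60);                      // assumed 60-second TTL
}

echo $html;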
how many users can hit that page at once without crashing the database?
All of them.
It's a bad question.
The database should not crash just because users are trying to connect. There are lots of limits on how many PHP clients can open concurrent connections to the database. Really, you should have the DBMS configured so that it handles the situation gracefully, i.e. max_connections should be what limits the number of clients, not the amount of memory available to the DBMS.
If you mean how many concurrent connections can you support without connections being rejected / queued, that's something VERY different. And it's nearly a sensible question.
In that case, it's going to depend on the amount of memory you've got, how you've configured the DBMS, how fast the CPU is, what size the content is, how many content items there are, how PHP connects to the DBMS, how far away the PHP is from the DBMS....
You're more likely to run into connection limits on the webserver before you hit them on the DBMS, but the method for establishing what those limits are is the same for both: generate a controlled number of connections and measure how much memory it uses, then repeat for different numbers and gather lots of data. Draw a graph and see where the line crosses your resource limits.
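A crude way to run that experiment against the DBMS (a sketch, not a proper load-testing tool; the host and credentials are placeholders): open N idle connections, watch memory on the database server, then repeat with a different N and plot the numbers.
<?php
// hold_connections.php - run from the CLI: php hold_connections.php 200

$n = (int)($argv[1] ?? 100);
$connections = [];

for ($i = 0; $i < $n; $i++) {
    $conn = @new mysqli('db-host', 'user', 'pass');             // placeholder host/credentials
    if ($conn->connect_errno) {                                 // e.g. max_connections reached
        echo "Connection {$i} failed: {$conn->connect_error}\n";
        break;
    }
    $connections[] = $conn;
}

echo "Holding " . count($connections) . " connections open; watch memory usage, then press Enter to release...\n";
fgets(STDIN);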
so instead of duplicating the page on three domains, I'm going to load it into mysql
The number of domains is pretty much irrelevant, unless you're explicitly looking for a method of avoiding ESI.
but if 10,000 visitors
Really? On a typical site that would mean perhaps a million concurrent HTTP sessions.