Certain pages on my site are populated via numerous Ajax calls, often 20+ running asynchronously. The problem I'm having is that each call creates a new MySQL connection, causing me to receive the 'max_user_connections' error, which I'm told is capped at 20. I've tried turning async off, but as the jQuery documentation states, this will often freeze up the page. I have considered turning this into one Ajax call and letting PHP perform the loop, but the idea was to display each element as it becomes available (some of the data is gathered from external sources, so the timing can vary).
Is this something that can be fixed with mysql_pconnect?
Is this something that can be fixed with mysql_pconnect?
No - you need to either:
1) increase the number of connections MySQL will handle (not recommended - not a scalable solution without replication)
2) improve the speed of the ajax call handler (e.g. query tuning, database/opcode/http level caching)
3) decrease the frequency of the ajax calls - e.g. by skipping the call if the last one occurred within 500 ms
20+ Ajax calls are a concern even in the absence of the MySQL issue. The browser would not make these 20 calls simultaneously, since it caps the maximum number of HTTP connections to a single domain at 2, 4 or 8.
I would combine these requests and bring the number of Ajax calls down to 6-8. As you mentioned, the rest of the calls get data from external sources. Do these calls talk to MySQL as well?
In addition, these things might help:
Can you cache some of these things (e.g. username, user full name) in the HTTP session?
Can you cache some data out of MySQL in memcached, or even a file? When new data becomes available, the cache can be updated (see the sketch below).
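For example, here is a minimal sketch of the memcached idea; the cache key, TTL and the get_user_stats() function are assumptions - substitute your own query:

$memcached = new Memcached();
$memcached->addServer('127.0.0.1', 11211);

$key  = 'user_stats_' . $userId;         // $userId comes from the request
$data = $memcached->get($key);

if ($data === false) {                   // cache miss - hit MySQL once
    $data = get_user_stats($userId);     // hypothetical function running the MySQL query
    $memcached->set($key, $data, 10);    // keep it for 10 seconds
}

echo json_encode($data);                 // the Ajax handler returns the cached data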
I'm building a PHP application which has a database containing approximately 140 URLs.
The goal is to download a copy of the contents of these web pages.
I've already written code which reads the URLs from my database then uses cURL to grab a copy of each page. It then gets everything between <body> </body> and writes it to a file. It also takes into account redirects, e.g. if I go to a URL and the response code is 302, it will follow the appropriate link. So far so good.
This all works OK for a number of URLs (maybe 20 or so), but then my script times out due to the max_execution_time being set to 30 seconds. I don't want to override or increase this, as I feel that's a poor solution.
I've thought of 2 work arounds but would like to know if these are a good/bad approach, or if there are better ways.
The first approach is to use a LIMIT on the database query so that it splits the task up into 20 rows at a time (i.e. run the script 7 separate times, if there were 140 rows). I understand that with this approach it still needs to call the script, download.php, 7 separate times, so I would need to pass in the LIMIT figures.
The second is to have a script where I pass in the ID of each individual database record I want the URL for (e.g. download.php?id=2) and then do multiple Ajax requests to it (download.php?id=2, download.php?id=3, download.php?id=4, etc.). Based on $_GET['id'] it could do a query to find the URL in the database, etc. In theory I'd be doing 140 separate requests, as it's a one-request-per-URL setup.
I've read some other posts which have pointed to queueing systems, but these are beyond my knowledge. If this is the best way then is there a particular system which is worth taking a look at?
Any help would be appreciated.
Edit: There are 140 URLs at the moment, and this is likely to increase over time. So I'm looking for a solution that will scale without hitting any timeout limits.
I don't agree with your logic. If the script is running OK and it needs more time to finish, just give it more time; that is not a poor solution. What you are suggesting makes things more complicated and will not scale well as your URLs increase.
I would suggest moving your script to the command line, where there is no time limit, rather than using the browser to execute it.
When you have an unknown list which will take an unknown amount of time, asynchronous calls are the way to go.
Split your script into a single page download (like you proposed, download.php?id=X).
From the "main" script get the list from the database, iterate over it and send an ajax call to the script for each one. As all the calls will be fired all at once, check for your bandwidth and CPU time. You could break it into "X active task" using the success callback.
You can either set the download.php file to return success data or to save it to a database with the id of the website and the result of the call. I recommend the later because you can then just leave the main script and grab the results at a later time.
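A rough sketch of what download.php?id=X could look like (the table name "sites", its columns and the connection details are assumptions):

// download.php?id=X
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');

$stmt = $pdo->prepare('SELECT url FROM sites WHERE id = ?');
$stmt->execute([(int) $_GET['id']]);
$url = $stmt->fetchColumn();

// Fetch the page with cURL, following redirects as in the original script
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$html = curl_exec($ch);
curl_close($ch);

// Save the result against the id so the main script can collect it later
$save = $pdo->prepare('UPDATE sites SET body = ? WHERE id = ?');
$save->execute([$html, (int) $_GET['id']]);

echo json_encode(['id' => (int) $_GET['id'], 'ok' => $html !== false]);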
You can't increase the time limit indefinitely and you can't wait indefinitely for the request to complete, so you need "fire and forget", and that's what asynchronous calls do best.
As #apokryfos pointed out, depending on the timing of this sort of "backup" you could fit this into a task scheduler (like cron). If you call it "on demand", put it in a GUI; if you call it "every X time", set up a cron task pointing to the main script - it will do the same.
What you are describing sounds like a job for the console. The browser is for the users to see, but your task is something that the programmer will run, so use the console. Or schedule the file to run with a cron-job or anything similar that is handled by the developer.
Execute all the requests simultaneously using stream_socket_client(). Save all the socket handles in an array.
Then loop through the array of handles with stream_select() to read the responses.
It's almost like multi-tasking within PHP.
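A rough sketch of the idea (the host names are placeholders, plain HTTP on port 80 is assumed, and the connects here are blocking for brevity - all the requests are still in flight before any response is read):

$hosts = ['example.com', 'example.org', 'example.net'];  // placeholders

$sockets   = [];
$responses = [];
foreach ($hosts as $i => $host) {
    $s = stream_socket_client("tcp://{$host}:80", $errno, $errstr, 30);
    if ($s === false) {
        continue;                                  // skip hosts that fail to connect
    }
    // Send the request straight away, then switch to non-blocking reads
    fwrite($s, "GET / HTTP/1.0\r\nHost: {$host}\r\nConnection: close\r\n\r\n");
    stream_set_blocking($s, false);
    $sockets[$i]   = $s;
    $responses[$i] = '';
}

// All requests are now in flight; read whichever responses are ready
while ($sockets) {
    $read   = $sockets;
    $write  = null;
    $except = null;
    if (stream_select($read, $write, $except, 5) === false) {
        break;
    }
    foreach ($read as $s) {
        $i     = array_search($s, $sockets, true);
        $chunk = fread($s, 8192);
        if ($chunk !== false && $chunk !== '') {
            $responses[$i] .= $chunk;
        } elseif (feof($s)) {                      // remote side finished
            fclose($s);
            unset($sockets[$i]);
        }
    }
}
// $responses now holds the raw HTTP responses, keyed by the original index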
Which one is best to choose: server-side or client-side?
I have a PHP function something like:
function insert($argument)
{
    // do some heavy MySQL work, such as a stored-procedure call,
    // that takes roughly 1.5 seconds
}
I have to call this function about 500 times.
for ($i = 1; $i <= 500; $i++)
{
    insert($argument);
}
I have two options:
a) call it in a loop in PHP (server-side) --> the server may time out
b) call it in a loop via JavaScript (AJAX) --> it takes a long time.
Please suggest the best one, or a third option if there is one.
If I understand correctly, your server still needs to do all the work, so you can't use the client's computer to lessen the power needed on your server, which leaves you with a choice of the following:
Let the client ask the server 500 times. This will easily let you show the process for the client, giving him the satisfactory knowledge that something is happening, or
Let the server do everything to skip the 500 extra round trip times, and extra overhead needed to process the 500 requests.
I would probably go with 1 if it's important that the client doesn't give up early, or 2 if it's important that the job is done all the way through, as the client might stop the requests after 300.
EDIT: With regard to your comment, I would then suggest having a "start work" button on the client that tells the server to start the job. Your server then tells a background service (which can be created in PHP) to do the work, and that service can write its progress to a file or a database. Then the client and the PHP server are free to time out and log out without problems. You can later update the page to see whether the work has completed in the background, reading the result from the database or file. That way you minimize both time and dependencies.
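A very rough sketch of that setup; the file names, the worker script and the Unix-style backgrounding are all assumptions:

// start_work.php - kicks off the background worker and returns immediately
exec('php worker.php > /dev/null 2>&1 &');
echo 'started';

// worker.php - does the heavy loop and records its progress in a file,
// so the browser and the web request are free to time out
set_time_limit(0);
for ($i = 1; $i <= 500; $i++) {
    insert($argument);                             // the heavy function from the question
    file_put_contents('progress.txt', "$i/500");   // the page polls this value
}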
You have not given any context for what you are trying to achieve - of key importance here are performance and whether a set of values should be treated as a single transaction.
The further the loop is from the physical storage (not just the DBMS), the bigger the performance impact. For most web applications the biggest performance bottleneck is the network latency between the client and the webserver - even if you are relatively close, say 50 milliseconds away, and have keepalives working properly, it will take a minimum of 25 seconds to carry out this operation for 500 data items.
For optimal performance you should be sending the data to the DBMS in the least number of DML statements - you've mentioned MySQL, which supports multi-row inserts, and if you're using MySQLi you can also submit multiple DML statements in the same database call (although the latter just eliminates the chatter between PHP and the DBMS, while a single DML statement inserting multiple rows also reduces chatter between the DBMS and the storage). Depending on the data structure and optimization, this should take in the region of tens of milliseconds to insert hundreds of rows - both methods will be much, MUCH faster than having the loop running in the client, even if the latency were 0.
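For example, a minimal sketch of a multi-row INSERT with MySQLi; the table, column and the $rows variable are made up for illustration:

$mysqli = new mysqli('localhost', 'user', 'pass', 'mydb');

$values = [];
foreach ($rows as $row) {                          // $rows = the 500 values to insert
    $values[] = "('" . $mysqli->real_escape_string($row) . "')";
}

// One statement inserting hundreds of rows - a single round trip to the DBMS
$mysqli->query('INSERT INTO my_table (my_column) VALUES ' . implode(',', $values));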
The length of time the transaction is in progress will determine the likelihood of the transaction failing - the faster method will therefore be thousands of times more reliable than the Ajax method.
As Krycke suggests, using the client to do some of the work will not save resources on your system - there is the additional overhead of the webserver, PHP instances and the DBMS connection. Although these are relatively small, they add up quickly. If you test both approaches you will find that having the loop in PHP or in the database will result in significantly less effort and therefore greater capacity on your server.
Once I had a script which ran for tens of minutes. My solution was to start the long request through AJAX with a 1-second timeout and check for the result in separate AJAX calls. The experience for the user is better than waiting a long time for a response from PHP without AJAX.
$.ajax({
...
timeout: 1000
})
So finally I got this:
a) Use AJAX if you want to be sure that it will complete. It is also user-friendly, as the user gets regular responses between AJAX calls.
b) Use a server-side script if you are fairly sure that the server will not go down in between and you want less load on the client.
Now I am using a server-side script with a waiting-message window for the user; the user waits for the successful-submission message, otherwise they have to try again, with a 90-95% probability that it will succeed on the first attempt.
I've looked around and haven't found a pre-existing answer to this question.
Info
My site relies on Ajax, Apache, MySQL, and PHP.
I have already built my site and it works well; however, as soon as too many users begin to connect (when receiving roughly 200+ requests per second) the server performs very poorly.
This site is very reliant on ajax. The main page of the site performs an ajax request every second so if 100 people are online, I'm receiving at least 100 requests per second.
These ajax queries invoke mysql queries on the server-side. These queries return small datasets. The returned datasets will change very often so I'd imagine caching would be ineffective.
Questions
1) What configuration practices would be best to help me increase the maximum number of requests per second? This applies to Ajax, MySQL, PHP, and Apache.
2) For Apache, do I want persistent connections (the KeepAlive directive) to be "On" or "Off"? As I understand, Off is useful if you are expecting many users, but On is useful for ajax and I need both of these things.
3) When I test the server's performance on serving a plain, short html page with no ajax (and involving only 1 minor mysql query) it still performs very poorly when this page gets 200+ requests per second. I'd imagine this must be due to apache configuration / server resources. What options do I have to improve this?
Thanks for any help!
Depending on the actual user needs, caching can be implemented in different patterns. In many cases the users don't really need updates every second, and/or the data can be cached for longer periods of time and simply made to look like it updates a lot. It depends...
Just to give some ideas:
Does every user need to get really unique, user-specific responses from the Ajax requests, or is the data the same or similar for all users or for sub-groups of users?
Does it make sense to push updates every second to every user?
Can the users notice the difference if the data is cached for, let's say, 10 seconds?
If the data is really unique for every user, but doesn't get updated for every user every second, couldn't you use data refreshing (invalidate the cached data when the data actually changes)?
I used RequireJS to lazy-load the JS, HTML and CSS files. For the server to serve loads of assets you need to keep the KeepAliveTimeout at 15.
I have a DB with over 5 million rows, and for each row I have to do an HTTP POST to a server with some parameters, at a maximum rate of 500 connections. Each POST request takes 12 seconds to process, so as old connections complete I have to open new ones and maintain ~500 connections. I then have to update the DB with the values returned from these web calls.
How do I make the webcalls as above?
My app is in PHP. Can I use PHP, or should I switch to something else for this?
Actually you can definitely do this with PHP using a technique called long-polling. Basically how it works is the client machine pings the server and says "Do you have anything for me" and the server sees that it does not. Instead of responding it holds onto the request and responds when it has something to send.
Long polling is a method that is used by both DrupalChat and the APE project (AJAX Push Engine).
http://drupal.org/project/drupalchat
http://www.ape-project.org/
Here is some more info on push tech: http://en.wikipedia.org/wiki/Push_technology and http://en.wikipedia.org/wiki/Comet_%28programming%29
And here is a stackoverflow post about it: How do I implement basic "Long Polling"?
Now I have to say that 12 seconds is really dang long for a DB query to run. It sounds like either the query needs to be optimized or the DB does (or both). Have you normalized the database and set up good table and inter-table indexing?
Now as for preventing DB update collisions, you need to use transactions (which both Postgres and newer versions of MySQL offer, along with most enterprise DB systems). Transactions will allow you to roll back DB changes, reserve table IDs and things like that.
http://en.wikipedia.org/wiki/Database_transaction
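A brief sketch of the transaction idea with MySQLi; the table and column names are made up, and mysqli_report() is used so MySQLi throws exceptions and the rollback actually fires:

mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);   // make MySQLi throw on errors
$mysqli = new mysqli('localhost', 'user', 'pass', 'mydb');

$mysqli->begin_transaction();
try {
    // Two related changes: either both are kept or both are rolled back
    $stmt = $mysqli->prepare('UPDATE url_queue SET result = ?, done = 1 WHERE id = ?');
    $stmt->bind_param('si', $result, $id);                   // $result came back from the web call
    $stmt->execute();

    $log = $mysqli->prepare('INSERT INTO call_log (url_id, finished_at) VALUES (?, NOW())');
    $log->bind_param('i', $id);
    $log->execute();

    $mysqli->commit();
} catch (Exception $e) {
    $mysqli->rollback();                                      // undo the partial change
}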
PHP isn't the right tool to make long-running scripts, since it by default has a maximum execution time which is pretty short. You might look into using Python for this task. Also note that you can call external scripts from PHP (such as Python scripts) using the system() function, if the only reason you're using PHP is to make it easy to integrate a web front-end.
However, you can do this in PHP with a cron job by simply having your PHP script handle a single row at a time, and having the cron job call the script every second. Just maintain the index into the table elsewhere (either elsewhere in the DB, or just write the number to a file).
If you wanted to saturate your 500-connection limit, have your script do 40 rows at a time: 40 rows/second is roughly 500 rows / 12 seconds.
I have a basic HTML file, using jQuery's ajax, that is connecting to my polling.php script every 2 seconds.
The polling.php script simply connects to MySQL, checks for IDs newer than my hidden, stored current ID, and then echoes if there is anything new. Since the JavaScript is connecting every 2 seconds, I am getting thousands of connections in TIME_WAIT, just for my client. This is because my script is re-connecting to MySQL over and over again. I have tried mysql_pconnect but it didn't help any.
Is there any way I can get PHP to open 1 connection, and continue to query using it? Instead of reconnecting every single time and making all these TIME_WAIT connections. Unsure what to do here to make this work properly.
I actually ended up doing basic long polling. I made a simple PHP script that runs an infinite while loop and queries every 2 seconds. If it finds something new, it echoes it out and breaks the loop. My jQuery simply connects to it with Ajax and waits for a response; on response, it updates my page and restarts the polling. Very simple!
PS, the Long Polling method also reduces browser memory issues, as well as drastically reduces the TIME_WAIT connections on the server.
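For anyone curious, a rough sketch of what that polling.php could look like (using PDO rather than the old mysql_* functions, made-up table/column names, and a 60-second cap added as a safety net so the request can't run forever):

// polling.php
set_time_limit(70);                                // let the request outlive the default 30s
$lastId = (int) $_GET['last_id'];                  // the hidden "current ID" from the page
$pdo    = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');

$start = time();
while (time() - $start < 60) {                     // safety cap so the request can't hang forever
    $stmt = $pdo->prepare('SELECT id, message FROM updates WHERE id > ? ORDER BY id');
    $stmt->execute([$lastId]);
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

    if ($rows) {                                   // something new - answer and stop
        echo json_encode($rows);
        break;
    }
    sleep(2);                                      // otherwise wait and check again
}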
There's no trivial way of doing this, as pconnect doesn't work across multiple web page calls. However, some approaches to minimise the database throughput would be:
Lower the polling time. (2 seconds is perhaps a bit excessive?)
Have a "master" PHP script that runs every 'n' seconds, extracts the data from the database and saves it in the appropriate format (serialised PHP array, XML, HTML data, etc.) in the filesystem. (I'd recommend writing to a temp file and then renaming over the existing one to minimise any partial file collection issues.) The Ajax requested PHP page would then simply use the information in this data file.
In terms of executing the master PHP script, you could either use cron or simply let the first user who requests the page trigger it when the contents of the data file are deemed too stale. (You could use the data file's timestamp for this purpose, via the filemtime function.) I'd personally use the latter approach, as cron is overkill for this purpose.
You could take this even further and use memcached instead of a flat file, etc. if so required. (That said, it would perhaps be an over-complex solution at this stage.)
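A rough sketch of the data-file approach; the paths and the build_data() function are assumptions:

// refresh_data.php - the "master" script: write to a temp file, then rename it
// into place so readers never see a partially written file
$dataFile = '/path/to/data.json';

$tmp = tempnam(dirname($dataFile), 'data_');
file_put_contents($tmp, json_encode(build_data()));   // hypothetical function running the queries
rename($tmp, $dataFile);                               // atomic when on the same filesystem

// data.php - the Ajax-requested page: serve the file, regenerating it first
// if it's older than, say, 30 seconds (the filemtime check mentioned above)
$dataFile = '/path/to/data.json';
if (!file_exists($dataFile) || time() - filemtime($dataFile) > 30) {
    include 'refresh_data.php';                        // the first visitor refreshes stale data
}
echo file_get_contents($dataFile);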