I'm a bit confused because the logical/programmer brain in me says that if all things are constant, the speed of a function must be constant.
I am working on a PHP web application with jqGrid as a front end for showing the data. I am testing on my personal computer, so network traffic does not apply. I make an HTTP request to a PHP function, it returns the data, and then jqGrid renders it. What has me befuddled is that Firebug sometimes reports the request taking 300-600 milliseconds and sometimes 3.68 seconds. I can run the request over and over again and get radically different response times.
The query is the same. The number of users on the system is the same. No network latency. Same code. I'm not running other applications on the computer while testing. I could understand query caching improving performance on subsequent requests, but the speed is just fluctuating wildly with no rhyme or reason.
So, my question is, what else can cause such variability in the response time? How can I determine what's doing it? More importantly, is there any way to get things more consistent?
If you use the Apache benchmarking tool (ab), you can gather statistics over many requests at several concurrency levels. It reports min, mean, median and max access times (plus standard deviation and percentiles), broken down into connect, processing and waiting time, which makes it an extremely useful tool for working out whether this is really a problem or an aberration.
While it can't diagnose the cause of performance problems, it can tell you whether you really do have one.
The first thing you should do is profile your code (see Simplest way to profile a PHP script). This will show you where your bottleneck is, and then you can figure out why your response times are fluctuating so much.
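Before reaching for a full profiler like Xdebug or XHProf (which the linked question covers), even a crude timing sketch can show whether the time is going into the query or into building the response. The stage names and the fetch function below are placeholders for your own code, not anything from your application:

    <?php
    // Rough manual timing -- fetch_grid_rows() is a hypothetical stand-in for your query code.
    $timings = array();

    $start = microtime(true);
    $rows = fetch_grid_rows();                 // run the MySQL query
    $timings['query'] = microtime(true) - $start;

    $start = microtime(true);
    $json = json_encode($rows);                // build the jqGrid response
    $timings['encode'] = microtime(true) - $start;

    error_log('timings: ' . json_encode($timings));
    echo $json;

Comparing these numbers across fast and slow requests should quickly tell you which stage is fluctuating.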
If you are using Firebug to determine speed, you need to consider that both Firefox and PHP cache resources. PHP in particular will often have an opcode cache enabled to cut subsequent run times; this shows up as one long first run followed by a series of short runs on refresh. I agree with rfw: go deeper in your analysis.
I'm developing an online sudoku game, with ActionScript 3.
I made the game and asked people to test it. It works, but the website goes down constantly. I'm using 000webhost, and I suspect it's a bandwidth-usage precaution.
My application updates the current puzzle by parsing a JSON string every 2 seconds, and of course when players enter a number it sends a GET request to update the MySQL database. Do you think this causes a lot of data traffic?
How can I see the bandwidth usage value?
And how should I decrease the data traffic between Flash and MySQL (or PHP, really)?
Thanks!
There isn't enough information for a straight answer, and if there were, it'd probably take more time to figure out. But here's some directions you could look into.
Bandwidth may or may not be an issue. There are many things that could happen, you may very well put too much strain on the HTTP server, run out of worker threads, have your MySQL tables lock up most of the time, etc.
What you're doing indeed sounds like it's putting a lot of strain on the server. Monitoring this from the client side won't tell you much; you need to look at server-side metrics, but you generally don't have access to those unless you have at least a VPS.
Transmitting data as JSON is easier to implement and debug, but a more efficient way to send data (binary instead of strings) is AMF: http://en.wikipedia.org/wiki/Action_Message_Format
One PHP implementation for the server side part is AMFPHP: http://www.silexlabs.org/amfphp/
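In AMFPHP, a service is (roughly speaking) just a plain PHP class dropped into the folder your AMFPHP install scans for services, and its public methods become callable from ActionScript over AMF. A minimal sketch of what the sudoku update could look like; the class, column and connection details here are invented for illustration:

    <?php
    // Hypothetical AMFPHP service -- place wherever your AMFPHP install expects services.
    class SudokuService
    {
        // Called from Flash via AMF instead of a $_GET request to a PHP page.
        public function setCell($puzzleId, $rowNum, $colNum, $value)
        {
            $pdo = new PDO('mysql:host=localhost;dbname=sudoku', 'user', 'pass');
            $stmt = $pdo->prepare(
                'UPDATE cells SET value = ? WHERE puzzle_id = ? AND row_num = ? AND col_num = ?'
            );
            $stmt->execute(array($value, $puzzleId, $rowNum, $colNum));
            return true;   // the return value is serialized back to the Flash client as AMF
        }
    }

The win is that the request and response bodies are compact binary AMF rather than URL-encoded strings and JSON.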
Alternatively, your example is a good use case for the remote shared objects in Flash Media Server (or similar products). A remote shared object is exactly what you're doing with MySQL: it creates a common memory space where you can store whatever data and it keeps that data synchronised with all the clients. Automagically. :)
You can start from here: http://livedocs.adobe.com/flashmediaserver/3.0/hpdocs/help.html?content=00000100.html
I'm implementing a delay system so that any IP I deem abusive will automatically get an incremental delay via sleep().
My question is: will this result in added CPU usage and thus kill my site anyway if the attacker just keeps opening new instances while being delayed? Or does the sleep() command use minimal CPU/memory, so it won't be much of a burden on a small script? I don't wish to flat-out deny them, as I'd rather they not learn about the limit in an obvious way, but I'm willing to hear why I should.
[ Please no discussion on why I'm deeming an IP abusive on a small site, because here's why: I recently built a script that cURLs a page and returns information to the user, and I noticed a few IPs spamming my stupid little script. cURLing too often sometimes renders my results unobtainable from the server I'm polling, and legitimate users get screwed out of their results. ]
The sleep does not use any CPU or memory beyond what is already used by the process accepting the call.
The problem you will face with implementing sleep() is that you will eventually run out of file descriptors while the attacker's requests sit around waiting for your sleep to time out, and then your site will appear to be down to anyone else who tries to connect.
This is a classic DDoS scenario -- the attacker does not actually try to break into your machine (they may also try to do that, but that is a different story); instead they try to harm your site by using up every resource you have, whether that is bandwidth, file descriptors, threads for processing, etc. When one of those resources is used up, your site appears to be down even though your server is not actually down.
The only real defense here is to either not accept the calls at all, or to have a dynamic firewall configuration which filters them out -- or a router/firewall box which does the same thing upstream of your server.
I think the issue with this would be that you could potentially have a LARGE number of sleeping threads lying around the system. If you detect abuse, immediately send back an error and be done with it.
My worry with your method is repeat abusers whose timeout climbs to several hours. Their threads will stick around for a long time even though they aren't using the CPU. There are other resources to keep in mind besides just CPU.
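If you do decide that rejecting outright is acceptable, the cheap version is to answer the abusive IP immediately and free the worker rather than holding it in sleep(). A minimal sketch; is_abusive() is a stand-in for however you track offending IPs:

    <?php
    // Hypothetical check -- e.g. look up a per-IP counter in APC, memcached or a table.
    function is_abusive($ip)
    {
        return false;   // placeholder
    }

    if (is_abusive($_SERVER['REMOTE_ADDR'])) {
        header('HTTP/1.1 429 Too Many Requests');
        exit;   // frees the worker instead of tying it up in sleep()
    }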
Sleep() is a function that "blocks" execution for a specific amount of time. It isn't the equivalent of:
while ($x < 1000000);   // a busy-wait that spins the CPU instead of yielding
As that would cause 100% CPU usage. It simply puts the process into a "Blocked" state in the Operating System and then puts the process back into the "Ready" state after the timer is up.
Keep in mind, though, that PHP has a default 30-second execution timeout. I'm not sure whether sleep() counts against it (I would doubt it, since it's a system call rather than script execution).
Your host may not like you having so many "Blocked" processes, so be careful of that.
EDIT: According to Does sleep time count for execution time limit?, it would appear that sleep() is not counted against "max execution time" under Linux, as I expected. Apparently it is counted under Windows.
If you are doing what I also tried, I think you're going to be in the clear.
My authentication script built out something similar to Atwood's hellbanning idea. SessionIDs were captured in RAM and rotated on every page call. If conditions weren't met, I would flag that particular Session with a demerit. After three, I began adding sleep() calls to their executions. The limit was variable, but I settled on 3 seconds as a happy number.
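For what it's worth, a stripped-down sketch of that demerit-then-sleep flow (the threshold of three and the 3-second cap are just the numbers I settled on, and the session key is arbitrary):

    <?php
    session_start();

    if (!isset($_SESSION['demerits'])) {
        $_SESSION['demerits'] = 0;
    }
    // Elsewhere in the auth code, suspicious behaviour does: $_SESSION['demerits']++;

    if ($_SESSION['demerits'] > 3) {
        // Incremental delay, capped so a repeat offender can't park a worker for hours.
        $delay = min($_SESSION['demerits'] - 3, 3);
        sleep($delay);
    }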
With authentication, the attacker relies on performing a certain number of attempts per second to make it worth their while to attack. If this is their focal point, introducing sleep makes the system look slower than it really is, which in my opinion will make it less desirable to attack.
If you slow them down instead of flat out telling them no, you stand a slightly more reasonable chance of looking less attractive.
That being said, it is security through a "type" of obfuscation, so you can't really rely on it too terribly much. It's just another factor in my overall recipe :)
Why is HTTP-Polling so laggy?
What I have is a button, and whenever a user clicks it a MySQL database field gets updated and the value is displayed to the user. I'm polling every 800 milliseconds and it's very laggy/glitchy. Sometimes when clicking the button it doesn't register it. And I actually need to be polling quite a bit more frequent than every 800 milliseconds.
This is also with just 1 user on the website at a time... When in the end there is going to be many at once.
HTTP-streaming/Long-polling/Websockets instead of polling
When you need real-time information you should avoid (frequent) polling. Below I will try to explain why this is wrong. You could compare it to a child in the back of your car screaming "are we there yet" every second while you reply "we are not there yet" all the time.
Instead you would like to have something like long-polling/HTTP streaming or WebSockets. You could compare this to a child in the back of your car telling you to let him know when "we are there" instead of asking every second. You can imagine this is way more efficient than the previous example.
To be honest I don't think PHP is the right tool for this kind of application (yet). Some options you have available are:
hosted solutions:
http://pusherapp.com:
Pusher is a hosted API for quickly, easily and securely adding scalable realtime functionality via WebSockets to web and mobile apps. Our free Sandbox plan includes up to 20 connections and 100,000 messages per day. Simply upgrade to a paid plan when you're ready.
http://beaconpush.com/:
Beaconpush is a push service for creating real-time web apps using HTML5 WebSockets and Comet.
host yourself:
http://socket.io:
Socket.IO aims to make realtime apps possible in every browser and mobile device, blurring the differences between the different transport mechanisms.
When you get very big, the "host yourself" solution is going to be less expensive, but on the other hand using something like pusherapp will get you started more easily (friendly API) and is not that expensive. For example, pusherapp's "Bootstrap" plan allows 100 concurrent connections and 200,000 messages per day for $19 per month (though while you're small, beaconpush is cheaper, so do the math :)). As a side note, that plan does not include SSL, so it can't be used for sensitive data. I guess a dedicated machine (VPS) will cost you about the same amount of money for a simple website, and you will also have to manage the streaming solution yourself, but when you get bigger that is probably much more attractive.
Memory instead of Disc
whenever a user clicks it a MySQL database field gets updated and the value is displayed to the user
Compared to memory, disc I/O (MySQL in its standard mode) is extremely slow. You should be using an in-memory store, for example redis (which also has persistent snapshots) or memcached (completely in memory), to speed up the process. I myself really like redis for its insane speed, simplicity and persistent snapshots. http://redistogo.com/ offers a free plan with 5MB of memory which will probably cover your needs. If not, the mini plan of $5 a month will probably cover you, but when you get even bigger a VPS will be cheaper and, in my opinion, the preferred solution.
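To make the disc-versus-memory point concrete, here is a rough sketch of the write/read path using the Memcached extension (the key name and the MySQL fallback are made up; redis via phpredis would look very similar):

    <?php
    $cache = new Memcached();
    $cache->addServer('127.0.0.1', 11211);

    // On the button click: update MySQL as before, then mirror the value in memory:
    //     $cache->set('button_value', $newValue);

    // On each poll: answer from memory, falling back to MySQL only on a cache miss.
    $value = $cache->get('button_value');
    if ($value === false) {
        // ... query MySQL here, then $cache->set('button_value', $value); ...
    }

    header('Content-Type: application/json');
    echo json_encode(array('value' => $value));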
Best solution
The best solution (especially if you are getting big) is to host socket.io/redis yourself on a VPS (costs money). If you're really small I would use redistogo; if not, I would host it myself. I would also start with something like beaconpush/pusherapp because of its simplicity (you can get started immediately). Hosting socket.io yourself (I'd advise playing with it on your own machine before you get big) is pretty simple, but in my opinion more difficult than beaconpush/pusherapp.
Laggy/glitchy? Sounds like a client-side problem. As does the button thing. I'd get your JavaScript in order first.
As for polling, 0.8 seconds sounds a bit time-critical. I don't know about most countries, but here in the third world simple network packets may get delayed for as long as a few seconds. (Not to mention connection drops, packet losses and the speed of light.) Is your application ready to cope with all that?
As for an alternate approach, I agree with @Vern in that an interrupt-driven one would be much better. In HTTP terms, it translates to a long-standing HTTP request that does not receive a response until the server has some actual data to send, minimizing delay and bandwidth. (AFAIK) it's an older technique than AJAX, though it has been named more recently. Search for "COMET" and you'll end up with both client- and server-side libraries.
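For reference, the server side of such a long-standing request can be sketched in plain PHP, with all the caveats about tied-up workers mentioned elsewhere on this page; get_current_value() is a placeholder for your own lookup (ideally against a cache, not MySQL):

    <?php
    // Minimal long-poll sketch: hold the request until the value changes or we time out.
    set_time_limit(40);                         // stay under the web server's own timeout
    $last = isset($_GET['last']) ? $_GET['last'] : null;

    $deadline = time() + 30;
    do {
        $current = get_current_value();         // hypothetical lookup
        if ($current !== $last) {
            break;                              // something changed, respond immediately
        }
        usleep(250000);                         // check every 250 ms server-side
    } while (time() < $deadline);

    header('Content-Type: application/json');
    echo json_encode(array('value' => $current));

The client issues its next request as soon as this one returns, so it sees changes almost immediately without hammering the server every 800 ms.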
There are many things that might cause the lag you are experiencing. Your server might be able to process the requests fast enough, but if the connection between your client and the server is slow, you'll see the obvious lag.
The first thing you should try is to ping the server and see what response time you're getting.
Secondly, rather than poll, you might want to consider an interrupt-driven approach. This means that only when your server replies will you send out your next request. This makes sense so that many clients won't flood the server with requests to the point where the server cannot cope. This is especially true when the RTT (round-trip time) of your request is pretty long.
Hope it helps. Cheers!
A good place to start would be to use a tool like Firebug in Mozilla Firefox that will allow you to watch the requests being sent to the server and look for bottlenecks.
Firebug will break down each part of the request, so you can see if you are having trouble talking to the server or if it is simply taking a long time to come up with a response.
Along with @Vern's answer I would also say that, if at all possible, I would have the server cache the data ahead of time so that all of the clients pull from that same cache rather than making separate MySQL calls for the same data on every update. Then you just have your PHP update the cache whenever the actual DB data changes.
By cache I mean having PHP write to a file on the server side; clients then simply look at the contents of that one file to see the most updated info. There might be better ways of caching, but since I have never done this personally before, this is the first solution that popped into my mind.
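A bare-bones version of that file cache might look like this; the path and the shape of the data are only examples, and on a busy site you would want an atomic write (write to a temp file, then rename):

    <?php
    $cacheFile = '/tmp/button_state.json';      // example path

    // Writer: call this whenever the real MySQL value changes.
    function update_cache($cacheFile, $data)
    {
        file_put_contents($cacheFile, json_encode($data), LOCK_EX);
    }

    // Reader: every polling client reads this instead of hitting MySQL.
    function read_cache($cacheFile)
    {
        $raw = @file_get_contents($cacheFile);
        return $raw === false ? null : json_decode($raw, true);
    }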
I'm developing a website where every front-end piece is written in JavaScript and communication with the server is done through JSON. What I'm hesitating about is this: is requesting every single piece of data with its own HTTP request OK, or is it completely unacceptable? (After all, many web developers replace multiple image requests with CSS sprites.)
Can you give me a hint please?
Thanks
It really depends upon the overall server load and bandwidth use.
If your site is very low traffic and is under no CPU or bandwidth burden, write your application in whatever manner is (a) most maintainable (b) lowest chance to introduce bugs.
Of course, if the latency involved in making thirty HTTP requests for data is too awful, your users will hate you :) even if your server is very lightly loaded. Thirty times even 30 milliseconds equals an unhappy experience. So it depends very much on how much data each client needs to render each page or action.
If your application starts to suffer from too many HTTP connections, then you should look at bundling together the data that is always used together -- it wouldn't make sense to send your entire database to every client on every connection :) -- so try to hit the 'lowest hanging fruit' first, and combine the data together that is always used together, to reduce extra connections.
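As a concrete example of that bundling, a single endpoint can return everything a page needs in one JSON payload; the keys and loader functions here are invented purely for illustration:

    <?php
    // Hypothetical combined endpoint: one HTTP request instead of three.
    header('Content-Type: application/json');

    echo json_encode(array(
        'user'          => load_user_profile(),    // each of these placeholder loaders
        'notifications' => load_notifications(),   // used to be its own separate
        'settings'      => load_settings(),        // HTTP request from the client
    ));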
If you can request multiple related things at once, do it.
But there's no real reason against sending multiple HTTP requests - that's how AJAX apps usually work. ;)
The reason for using sprites instead of single small images is to reduce loading times, since only one file has to be loaded instead of tons of small files at once, and the combined image is already available whenever it needs to be displayed.
My personal philosophy is:
The initial page load should be AJAX-free.
The page should operate without JavaScript well enough for the user to do all basic tasks.
With JavaScript, use AJAX in response to user actions and to replace full page reloads with targeted AJAX calls. After that, use as many AJAX calls as seem reasonable.
I'm in the process of developing my first major project. It's a light-weight content management system.
I have developed all of my own framework for the project. I'm sure that will attract many flames, and a few 'tut-tut's, but it seems to be doing quite well so far.
I'm seeing page generation times of anywhere from 5-15 milliseconds. (An example, just in case my numbers are wrong, is 0.00997686386108 seconds).
I want to make sure that the application is as efficient as possible. While it looks good in my testing environment, I want to be sure that it will perform as well as possible in the real world.
Should I be concerned about these numbers - and thus, take the time to fine tune MySQL and my interaction with it?
Edit: Additionally, are there some tools or methods that people can recommend for saturating a system, and reporting the results?
Additional Info: My 'testing' system is a spare web hosting account that I have over at BlueHost. Thus, I would imagine that any performance I see (positive or negative) would be roughly indicative of what I would see in the 'real world'.
Performing well in your testing environment is a good start, but there are other issues you'll need to think about as well (if you haven't already). Here are a couple I can think of off the top of my head:
How does your app perform as data sizes increase? Usually a test environment has very little data. With lots of data, things like poorly optimized queries, missing indexes, etc. start to cause issues where they didn't before. Performance can start to degrade exponentially with respect to data size if things are not designed well.
How does your app perform under load? Sometimes apps perform great with one or two users, but resource contention or concurrency issues start to pop up when lots of users get involved.
You're doing very well at 5-15 ms. You're not going to know how it performs under load by any method other than throwing load at it, though.
As mentioned in another question: what I often miss is the fact that most websites could increase their speed enormously by optimizing their frontend, not their backend. Have a look at this superb list about speeding up your frontend from yahoo.com (a couple of the items are sketched in PHP after the list):
Minimize HTTP Requests
Use a Content Delivery Network
Add an Expires or a Cache-Control Header
Gzip Components
Put Stylesheets at the Top
Put Scripts at the Bottom
Avoid CSS Expressions
Make JavaScript and CSS External
Reduce DNS Lookups
Minify JavaScript and CSS
Avoid Redirects
Remove Duplicate Scripts
Configure ETags
Make Ajax Cacheable
Flush the Buffer Early
Use GET for AJAX Requests
Post-load Components
Preload Components
Reduce the Number of DOM Elements
Split Components Across Domains
Minimize the Number of iframes
No 404s
Reduce Cookie Size
Use Cookie-free Domains for Components
Minimize DOM Access
Develop Smart Event Handlers
Choose <link> over @import
Avoid Filters
Optimize Images
Optimize CSS Sprites
Don't Scale Images in HTML
Make favicon.ico Small and Cacheable
Keep Components under 25K
Pack Components into a Multipart Document
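A couple of those items (far-future Expires/Cache-Control headers and gzipping) can be handled straight from PHP when you can't touch the web server configuration. A rough sketch, assuming the response really is safe to reuse for a day:

    <?php
    // Cache headers for a response assumed safe to reuse for 24 hours.
    header('Cache-Control: public, max-age=86400');
    header('Expires: ' . gmdate('D, d M Y H:i:s', time() + 86400) . ' GMT');

    // Gzip the output if the zlib extension and the client both support it.
    if (!ob_start('ob_gzhandler')) {
        ob_start();
    }

    echo render_page();   // placeholder for whatever builds the page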
5-15 milliseconds is totally acceptable as a page generation time. But what matters most is how well your system performs with many people accessing your content at the same time. So you need to test your system under a heavy load, and see how well it scales.
As for tuning, setting up a clever cache policy is often more effective than tuning MySQL, especially when your database and your HTTP server are on different machines. There are very good Qs and As about caching on StackOverflow if you need advice on that topic (I like that one, maybe because I wrote it :))
It depends on a few factors. The most important is how much traffic you're expecting the site to get.
If your site is going to be fairly low traffic (maybe 1,000,000 page views per day - an average of around 11 per second), it should be fine. You'll want to test this - use an HTTP benchmarking tool to run lots of requests in parallel, and see what kind of results you get.
Remember that the more parallel requests you're handling, the longer each request will take. The important numbers are how many parallel requests you can handle before the average time becomes unacceptable, and the rate at which you can handle requests.
Taking that 1,000,000 views per day example - you want to be able to handle far more than 11 requests per second. Likely at least 20, and at least 10 parallel requests.
You also want to test this with a representative dataset. There's no point benchmarking a CMS with one page, if you're expecting to have 100. Take your best estimate, double it, and test with a data set at least that large.
As long as you're not doing something stupid in your code, the single biggest improvement you can make is caching. If you make sure to set up appropriate caching headers, you can stick a reverse proxy (such as Squid) in front of your web server. Squid will serve anything that's in its cache directly, leaving your PHP application to handle only unique or updated page requests.
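The key to letting Squid (or any shared cache) take that load is sending cache headers it will honour. A minimal, illustrative sketch with an arbitrary 60-second lifetime:

    <?php
    // Allow shared caches such as Squid to reuse this page for 60 seconds (example value);
    // note that responses which set cookies are usually not cached by shared proxies.
    header('Cache-Control: public, max-age=60, s-maxage=60');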