I'm currently rewriting my site using my own framework (it's very simple and does exactly what I need; I've no need for something like Zend or CakePHP). I've done a lot of work making sure everything is cached properly, caching pages in files to avoid SQL queries and generally limiting the number of SQL queries.
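The file cache is nothing fancy - roughly along the lines of the sketch below (the cache path, the 300-second TTL and build_front_page() are just placeholders standing in for the real rendering code):

    <?php
    // Minimal sketch of a file-based page cache. Assumes a writable ./cache directory.
    function cached_page($key, $ttl, callable $render)
    {
        $file = __DIR__ . '/cache/' . md5($key) . '.html';

        // Serve from the cache file if it exists and is still fresh.
        if (is_file($file) && (time() - filemtime($file)) < $ttl) {
            return file_get_contents($file);
        }

        // Otherwise build the page (running the SQL queries), then store it.
        $html = $render();
        file_put_contents($file, $html, LOCK_EX);

        return $html;
    }

    // Usage: cache the front page for 5 minutes.
    echo cached_page('front-page', 300, function () {
        return build_front_page();   // hypothetical function that does the real work
    });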
Overall it looks like it's very speedy. The average time taken for the front page (measured over 100 runs) is 0.046152 seconds.
But one thing I'm not sure about is whether I've done enough to reduce PHP memory usage. The only time I've ever encountered problems with it is when uploading large files.
Using memory_get_peak_usage(TRUE), which I THINK returns the highest amount of memory used whilst the script has been running, the average (taken over 100 times) is 1572864 bytes.
Is that good?
I realise you don't know what it is I'm doing (it's rather simple: get the 10 latest articles, the comment count for each, the user controls, popular tags in the sidebar, etc.). But would you be at all worried by a script using that sort of memory getting hit 50,000 times a day, or once every second at peak times?
I realise that this is a very open-ended question. Hopefully you can understand that it's a bit of a stab in the dark and I'm really just looking for some reassurance that it's not going to die horribly come relaunch day.
EDIT: Just a mini experiment I did for myself. I downloaded and installed WordPress - a default installation with no extra add-ons, just one user and one post - and it used 10.5 megabytes of memory, or 11010048 bytes. Quite pleased with my 1.5 MB now.
Memory usage values can vary heavily and are subject to fluctuation, but as you already say in your update, a regular WordPress instance is much, much fatter than that. I have had great trouble getting the WordPress backend running with a memory_limit of sixteen megabytes - let alone when plug-ins come into play. So from that, I'd say a peak of 1.5 megabytes while performing normal tasks is quite okay.
Generation time depends heavily on the hardware your site runs on, obviously. However, a generation time of 0.046152 seconds sounds very okay to me under normal circumstances.
It is a subjective question. PHP has a lot of overhead, and when you call the function with TRUE that overhead will be included. You'll see what I mean when you call the function in a simple Hello World script. Also keep in mind that results can differ greatly depending on whether PHP is run as an Apache module or via FastCGI.
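For instance, a minimal script like the following sketch shows the baseline the engine itself accounts for; TRUE reports memory allocated from the system (including engine overhead), while FALSE reports only what the script actually requested:

    <?php
    // Baseline check: how much memory does PHP report for a script
    // that does essentially nothing?
    echo "Hello World\n";

    // TRUE: memory allocated from the system (includes engine overhead).
    printf("peak (real):    %d bytes\n", memory_get_peak_usage(true));

    // FALSE: only the memory the script itself actually used.
    printf("peak (emalloc): %d bytes\n", memory_get_peak_usage(false));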
Unfortunately, no one can provide assurances. There will always be unforeseen variables that can bring down a site. Perform load testing. Use a code profiler to narrow down the location of any bottlenecks and see if there are ways to make those code blocks more efficient.
Encyclopaedia Britannica thought they were prepared when they launched their ad-supported encyclopedia ten years ago. The developers didn't know they would be announcing it on Good Morning America the day of the launch. The whole thing came crashing down for days.
As long as your systems aren't swapping, your memory usage is reasonable. Any additional concern is just premature optimization.
We have a PHP/MySQL/Apache web app which holds a rating system. From time to time we do full recalculations of the ratings, which means about 500 iterations of calculation, each taking 4-6 minutes and depending on the results of the previous iteration (i.e., parallel solutions are not possible). The time is mostly taken by MySQL queries and by loops over each rated player (about 100,000 players per iteration, but the complex logic linking players together leaves no room for parallelization here either).
The problem is that when we start the recalculation in the plain old way (one PHP POST request), it dies about 30-40 minutes after starting (which gives only 10-15 completed iterations). The question "why does it die?" and other optimization issues are kind of out of scope now - the logic is too complex and needs to be refactored, maybe even rewritten in another language/infrastructure, yes, but we have no resources (time/people) for that now. We just need to make things work in the least annoying way.
So, the question: what is the best way to organize such a recalculation, if possible, so that the site admin can start it with just one click, forget about it for a day, and it still gets the job done?
I found a few pieces of advice on the web for similar problems, but no silver bullet:
move the iterations (and, therefore, the timeouts) from the server to the client by using AJAX requests instead of a plain old PHP request - this could freeze the browser (and AJAX's async nature is a poor fit for sequential iterations);
make PHP start a backend service which does the work (as advised here) - it sounds like a lot of work and I have no idea how to implement it (a rough sketch of what this could look like follows below).
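For illustration only, a sketch of the second option under some assumptions (the paths are made up, and recalc.php is assumed to loop over the ~500 iterations itself and log its own progress) might be as simple as spawning a detached CLI process from the admin page:

    <?php
    // Hypothetical admin-only endpoint: start the recalculation as a
    // detached background process, so the web request returns immediately
    // and the work is no longer bound by max_execution_time.
    $cmd = sprintf(
        'nohup php %s > /var/log/recalc.log 2>&1 & echo $!',
        escapeshellarg('/path/to/recalc.php')   // made-up path to the CLI worker
    );
    $pid = (int) shell_exec($cmd);

    // The CLI SAPI has no execution time limit by default, so recalc.php
    // can run for as long as the iterations take.
    echo "Recalculation started (pid $pid). Check back tomorrow.\n";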
So, I humbly ask for any advice possible in such a situation.
I have created a simple web application with my own framework, and I'm unsure how much dividing the PHP code into many files (for reusability) will affect performance. I have used CodeIgniter, and compared to it my framework has more files to process.
In order to properly answer this question you have to know various things about your hard drive: its IOPS, cluster size, seek time, SATA connection, and/or RAID configuration.
Once you know this stuff and can calculate the time it takes to read a specific file size from your disk, you can begin calculating how many requests per second would bog down your system.
Once you know this then you need to anticipate how many users are going to hit the system at once.
Another factor is CPU power and RAM speed because if your script is complex or uses a lot of memory then your CPU will be doing a lot of work and hopefully the RAM can keep up.
If you don't want to follow all these steps, then run a while() loop that creates, reads, and deletes 5000 dynamic files of between 4 and 50 KB each, and use microtime(true) to benchmark it.
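A rough version of that benchmark (just a sketch; the temp directory and file sizes are arbitrary) could look like this:

    <?php
    // Create, read, and delete 5000 small files of random size (4-50 KB)
    // and time the whole run.
    $dir = sys_get_temp_dir() . '/fs_bench';
    @mkdir($dir);

    $start = microtime(true);

    $i = 0;
    while ($i < 5000) {
        $path = "$dir/file_$i.tmp";
        $data = str_repeat('x', rand(4 * 1024, 50 * 1024));

        file_put_contents($path, $data);   // create
        file_get_contents($path);          // read
        unlink($path);                     // delete

        $i++;
    }

    rmdir($dir);

    printf("5000 create/read/delete cycles took %.3f seconds\n", microtime(true) - $start);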
If you are on a shared hosting plan then your only option might be to implement the benchmarking idea at various peak and down times. I will bet that a 2am benchmark will fare much better than a 2pm one.
Good luck!
Theoretically, the number of files matters, but in practice the effect is small. If one file is divided into 2 files you won't notice it, but if you divide a file into 100 files it might start to matter.
I wrote a PHP CLI script that mixes two audio (.WAV PCM) files (with some math involved), so PHP needs to crunch through thousands (if not millions) of samples with unpack(), do math on them, and save them with pack().
Now, I don't need actual info on how to do the mixing or anything; as the title says, I'm looking for ways to speed this process up, since the script needs 30 seconds of processing time to produce 10 seconds of audio output.
Things that I tried:
Cache the audio files in memory and crunch through them with substr() instead of fseek()/fread(). Performance gain: 3 seconds.
Write the output file in 5000-sample chunks. Performance gain: 10 seconds. (A rough sketch of this chunked read/mix/write approach follows below.)
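Purely as an illustration of the chunked approach (assuming two 16-bit mono little-endian PCM files with plain 44-byte headers and the same sample rate; real WAV parsing is more involved), the inner loop looks roughly like this:

    <?php
    $chunkSamples   = 5000;   // samples per chunk, as in the post
    $bytesPerSample = 2;      // 16-bit PCM

    $a   = fopen('a.wav', 'rb');
    $b   = fopen('b.wav', 'rb');
    $out = fopen('mix.wav', 'wb');

    // Copy the header of the first file as-is (sizes would need fixing up later).
    fwrite($out, fread($a, 44));
    fseek($b, 44);

    while (true) {
        $bufA = fread($a, $chunkSamples * $bytesPerSample);
        $bufB = fread($b, $chunkSamples * $bytesPerSample);
        if ($bufA === false || $bufA === '' || $bufB === false || $bufB === '') {
            break;
        }

        // unpack() the 16-bit little-endian samples of both chunks at once.
        $samplesA = unpack('v*', $bufA);
        $samplesB = unpack('v*', $bufB);

        $mixed = '';
        $count = min(count($samplesA), count($samplesB));
        for ($i = 1; $i <= $count; $i++) {
            // Convert unsigned to signed, average the two samples, re-pack.
            $sa = $samplesA[$i] >= 0x8000 ? $samplesA[$i] - 0x10000 : $samplesA[$i];
            $sb = $samplesB[$i] >= 0x8000 ? $samplesB[$i] - 0x10000 : $samplesB[$i];
            $s  = (int) (($sa + $sb) / 2);
            $mixed .= pack('v', $s & 0xFFFF);
        }

        // Write the whole chunk at once instead of sample by sample.
        fwrite($out, $mixed);
    }

    fclose($a);
    fclose($b);
    fclose($out);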
After those optimizations I ended up at approximately 17 seconds of processing time for 10 seconds of audio output. What bugs me is that other tools can do simple audio operations like mixing two files in real time, or even much faster.
Another idea I had was parallelization, but I refrained from that due to the extra problems it would cause (like calculating correct seek positions for the forks/threads and other related things).
So am I missing something, or is this actually good performance for a PHP CLI script?
Thanks for everyone's input on this one.
I rewrote the thing in C++ and can now perform the above actions in less than a second.
I'd never have thought that the speed difference would be that huge (the compiled application is ~40x faster).
Hey,
I currently have 300+ qps on my MySQL server. There are roughly 12,000 UIP a day and no cron jobs, on fairly heavy PHP websites. I know it's pretty hard to judge whether it's OK without seeing the website, but do you think that it is total overkill?
What is your experience? If I optimize the scripts, do you think that I would be able to get a substantially lower qps? I mean, if I only get down to 200 qps that won't help me much. Thanks.
currently have 300+ qps on my MySQL server
Your website can run on a Via C3, good for you!
do you think that it is total overkill?
That depends on whether it's:
1 page/s doing 300 queries - yeah, you've got a problem;
30-60 pages/s doing 5-10 queries each - then you've got no problem.
12000 UIP a day
We had a site with 50-60,000 UIP a day, and it ran on a Via C3 (your toaster is a datacenter compared to that crap server), but the torrent tracker used about 50% of the CPU, so only half of that tiny CPU was available to the website - which never seemed to use any significant fraction of it anyway.
What is your experience?
If you want to know whether you are going to kill your server, or whether your website is optimized, the following has close to zero information content:
UIP (unless you get Facebook-like numbers)
queries/s (unless you're above 10,000 - I've seen a cheap dual core blast out 20,000 qps using Postgres)
But the following is extremely important:
dynamic pages/second served
number of queries per page
time duration of each query (ALL OF THEM - a simple way to log this per page is sketched after this list)
server architecture
vmstat, iostat outputs
database logs
webserver logs
database's own slow_query, lock, and IO logs and statistics
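As a starting point for the queries-per-page and per-query timing numbers, a thin wrapper like this (just a sketch, assuming mysqli) is often enough:

    <?php
    // Route every query through this helper so each page can report how
    // many queries it ran and how long each one took.
    $GLOBALS['query_log'] = array();

    function timed_query(mysqli $db, $sql)
    {
        $start  = microtime(true);
        $result = $db->query($sql);
        $GLOBALS['query_log'][] = array(
            'sql' => $sql,
            'ms'  => round((microtime(true) - $start) * 1000, 2),
        );
        return $result;
    }

    // At the end of the request, dump the totals to the error log.
    register_shutdown_function(function () {
        $total = 0;
        foreach ($GLOBALS['query_log'] as $entry) {
            $total += $entry['ms'];
        }
        error_log(sprintf(
            '%s: %d queries, %.2f ms total',
            isset($_SERVER['REQUEST_URI']) ? $_SERVER['REQUEST_URI'] : 'cli',
            count($GLOBALS['query_log']),
            $total
        ));
    });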
You're not focusing on the right metric...
I think you are missing the point here. Whether 300+ qps is too much depends heavily on the website itself, on the users per second that visit it, on the background scripts that run concurrently, and so on. You should test and/or compute an average query throughput for your server to understand whether 300+ qps is fair or not. And, by the way, it depends on what these queries are asking for (a couple of fields, or large amounts of binary data?).
Surely, if you optimize the scripts and/or reduce the number of queries, you can lower the load on the database, but without specific data we cannot properly answer your question. To bring a 300+ qps load down below 200 qps, you would need to cut your total queries by at least a third on average.
Optimizing a script can do wonders. I've taken scripts that took 3 minutes before down to 0.5 seconds after, simply by optimizing how the calls were made to the server. That is an extreme situation, of course. I would focus mainly on minimizing the number of queries by combining them where possible. Maybe get creative with your queries to include more information in each hit.
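As a made-up example of that kind of combining (the table and column names here are purely illustrative), a per-article COUNT(*) query repeated ten times can usually be folded into the main query:

    <?php
    // One round trip instead of 1 + 10: fetch the latest articles together
    // with their comment counts. $db is assumed to be an existing mysqli
    // connection; articles/comments are hypothetical tables.
    $sql = "
        SELECT a.id, a.title, COUNT(c.id) AS comment_count
        FROM articles a
        LEFT JOIN comments c ON c.article_id = a.id
        GROUP BY a.id, a.title, a.created_at
        ORDER BY a.created_at DESC
        LIMIT 10
    ";

    $result = $db->query($sql);
    while ($row = $result->fetch_assoc()) {
        printf("%s (%d comments)\n", $row['title'], $row['comment_count']);
    }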
And going from 300 to 200 qps is actually a huge improvement. That's a 33% drop in traffic to your server... that's significant.
You should not focus on the script; focus on the server.
You are not saying whether these 300+ queries are causing issues. If your server is not dead, there's no reason to lower the number. And if you have already done optimization, you should focus on the server: upgrade it or buy more servers.
I have a map. On this map I want to show live data collected from several tables, some of which have astounding numbers of rows. Needless to say, fetching this information takes a long time. Also, pinging is involved: depending on servers being offline or far away, collecting this data can take anywhere from 1 to 10 minutes.
I want the map to be snappy and responsive, so I've decided to add a new table to my database containing only the data the map needs. That means I need a background process to update the information in my new table continuously. Cron jobs are of course a possibility, but I want the refreshing of data to happen as soon as the previous interval has completed. And what if the number of offline IP addresses suddenly spikes and the loop takes longer to run than the interval of the cron job?
My own solution is to create an infinite loop in PHP that runs from the command line. This loop would refresh the data for the map into MySQL, as well as record other useful data such as loop time, failed ping attempts, etc., then restart after a short pause (a few seconds).
However, I'm being repeatedly told by people that a PHP script running forever is BAD. After a while it will hog gigabytes of RAM (and other terrible things).
Partly I'm writing this question to confirm whether this is in fact the case, but some tips and tricks on how to write a clean loop that doesn't leak memory (if that is possible) wouldn't go amiss either. Opinions on the matter would also be appreciated.
I will mark as correct the reply that I feel sheds the most light on the issue.
The loop should live in one script which activates/calls the actual worker script as a different process, much like cron does.
That way, even if memory leaks and uncollected memory accumulates, it will/should be freed after each cycle.
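A bare-bones sketch of that outer loop (runner.php and worker.php here are hypothetical names; the worker does one full refresh and then exits) could be:

    <?php
    // runner.php: keeps looping forever; the worker runs as a separate PHP
    // process, so any memory it leaks is returned to the OS when it exits.
    set_time_limit(0);

    while (true) {
        $start = microtime(true);

        // Run the worker and wait for it to finish.
        passthru(PHP_BINARY . ' ' . escapeshellarg(__DIR__ . '/worker.php'), $exitCode);

        if ($exitCode !== 0) {
            error_log("worker.php exited with code $exitCode");
        }

        error_log(sprintf('cycle took %.1f s', microtime(true) - $start));
        sleep(5);   // short pause before the next refresh
    }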
However, I'm being repeatedly told by people that a PHP script running forever is BAD. After a while it will hog gigabytes of RAM (and other terrible things)
This used to be very true. Previous versions of PHP had horrible garbage collection, so long-running scripts could easily accidentally consume far more memory than they were actually using. PHP 5.3 introduced a new garbage collector that can understand and clean up circular references, the number one cause of "memory leaks." It's enabled by default. Check out that link for more info and pretty graphs.
As long as your code takes steps to allow variables to go out of scope at proper times and otherwise unset variables that will no longer be used, your script should not consume unnecessary amounts of memory just because it's PHP.
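In loop form, that advice boils down to something like this sketch (fetch_map_data() and store_map_data() are stand-ins for the real work):

    <?php
    // Long-running loop that explicitly releases its big structures
    // each cycle (PHP 5.3+, where the cycle collector is available).
    while (true) {
        $rows = fetch_map_data();    // hypothetical: pull the fresh data
        store_map_data($rows);       // hypothetical: write the summary table

        // Let the large array go out of scope explicitly...
        unset($rows);

        // ...and give the collector a chance to reclaim circular references.
        gc_collect_cycles();

        error_log('peak memory: ' . memory_get_peak_usage(true) . ' bytes');
        sleep(5);
    }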
I don't think it's bad; as with anything that you want to run continuously, you just have to be more careful.
There are libraries out there to help you with the task. Have a look at System_Daemon, which released RC 1 just over a month ago and allows you to "set options like max RAM usage".
Rather than running an infinite loop, I'd be tempted to go with the cron option you mention, in conjunction with a database table entry or flat file that stores a "currently active" status bit, to ensure that you don't have overlapping processes attempting to run at the same time. (A rough sketch of this approach follows at the end of this answer.)
Whilst I realise that this would mean a minor delay before you perform the next iteration, this is probably a better idea anyway as:
It'll let the RDBMS perform any pending low-priority updates, etc. that may well have been on hold due to the amount of activity you've been carrying out.
Even if you neatly unset all the temporary variables you've been using, it's still possible that PHP will "leak" memory, although recent improvements (5.2 introduced a new memory management system and garbage collection was overhauled in 5.3) should hopefully mean that this is less of an issue.
In general, it'll also be easier to deal with other issues (if the DB connection temporarily goes down due to a config change and restart for example) if you use the cron approach, although in an ideal world you'd cater for such eventualities in your code anyway. (That said, the last time I checked, this was far from an ideal world.)
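One common variant of that "currently active" flag is an flock()-based lock file rather than a database flag; purely as a sketch (the lock path and refresh_map_data() are made up), when the job is kicked off by cron every minute it could look like this:

    <?php
    // Started by cron; overlapping runs simply exit instead of stacking up.
    $lock = fopen('/tmp/map-refresh.lock', 'c');   // made-up lock path

    if (!flock($lock, LOCK_EX | LOCK_NB)) {
        // The previous run is still going; let it finish.
        exit(0);
    }

    refresh_map_data();   // hypothetical: the actual refresh work

    flock($lock, LOCK_UN);
    fclose($lock);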
First, I fail to see why you need a daemon script in order to provide the functionality you describe.
Cron jobs are of course a possibility, but I want the refreshing of data to happen as soon as the previous interval has completed
Then neither a cron job nor a daemon is the way to solve the problem (unless the daemon becomes the data sink for the scripts). I'd spawn a dissociated process when the data is available, using a locking strategy to avoid concurrency.
Long-running PHP scripts are not intrinsically bad, but the reference-counting garbage collector does not deal with every possible scenario for cleaning up memory; more recent implementations have a more advanced collector (a circular-reference checker) which should clean up a lot more.