I wrote a PHP-CLI script that mixes two audio files (.WAV PCM), with some math involved, so PHP needs to crunch through thousands (if not millions) of samples with unpack(), do math on them and save them with pack().
Now, I don't need actual info on how to do the mixing or anything; as the title says, I'm looking for possibilities to speed this process up, since the script needs 30 seconds of processing time to produce 10 seconds of audio output.
Things that I tried:
Cache the audio files in memory and crunch through them with substr() instead of fseek()/fread(). Performance gain: 3 seconds.
Write the output file in 5000-sample chunks (roughly as in the sketch below). Performance gain: 10 seconds.
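For reference, the inner loop now looks roughly like this. It's a simplified sketch, not my exact code; it assumes 16-bit mono PCM and that the 44-byte WAV headers are handled elsewhere:

    <?php
    // Simplified sketch: both inputs cached in memory, samples read with
    // substr(), mixed, and written out in chunks. Assumes 16-bit mono PCM.
    $a = file_get_contents('in_a.wav');
    $b = file_get_contents('in_b.wav');
    $out = fopen('mix_data.raw', 'wb'); // raw sample data; WAV header omitted

    $samples = (min(strlen($a), strlen($b)) - 44) >> 1; // 2 bytes per sample
    $buffer  = '';

    for ($i = 0; $i < $samples; $i++) {
        $pos = 44 + $i * 2;
        // 'v' = unsigned 16-bit little-endian; convert to signed for the math
        $x = unpack('v', substr($a, $pos, 2)); $x = $x[1];
        if ($x >= 0x8000) { $x -= 0x10000; }
        $y = unpack('v', substr($b, $pos, 2)); $y = $y[1];
        if ($y >= 0x8000) { $y -= 0x10000; }

        $buffer .= pack('v', ((int) (($x + $y) / 2)) & 0xFFFF); // naive average mix

        if ($i % 5000 === 4999) {   // flush every 5000 samples
            fwrite($out, $buffer);
            $buffer = '';
        }
    }
    fwrite($out, $buffer);
    fclose($out);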
After those optimizations I ended up at approximately 17 seconds of processing time for 10 seconds of audio output. What bugs me is that other tools can do simple audio operations like mixing two files in real time or even much faster.
Another idea I had was parallelization, but I refrained from that due to the extra problems it would introduce (like calculating correct seek positions for the forks/threads and other related things).
So am I missing something, or is this actually good performance for a PHP-CLI script?
Thanks for everyone's input on this one.
I rewrote the thing in C++ and can now perform the above actions in less than a second.
I'd never have thought the speed difference would be that huge (the compiled application is ~40x faster).
Related
I have created a simple web application with my own framework, and I'm confused about how much dividing the PHP code into many files for reusability affects performance. I have used CodeIgniter, but compared to it, my framework has more files to process.
In order to properly answer this question you have to know various things about your hard drive: its IOPS, cluster size, seek time, SATA connection, and/or RAID configuration.
Once you know this stuff and can calculate the time it takes to read a specific file size from your disk, then you can begin calculating how many requests per second would bog down your system.
Once you know this then you need to anticipate how many users are going to hit the system at once.
Another factor is CPU power and RAM speed: if your script is complex or uses a lot of memory, your CPU will be doing a lot of work, and hopefully the RAM can keep up.
If you don't want to follow all these steps, then run a while() loop that creates, reads, and deletes 5000 dynamic files between 4 and 50 KB each, and use microtime(true) to benchmark it.
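Something along these lines (the directory and size range are arbitrary choices):

    <?php
    // Quick-and-dirty disk benchmark: create, read back and delete 5000
    // small files, timing the whole run with microtime(true).
    $dir = sys_get_temp_dir() . '/io_bench';
    if (!is_dir($dir)) {
        mkdir($dir);
    }

    $start = microtime(true);
    $i = 0;
    while ($i < 5000) {
        $file = $dir . '/bench_' . $i . '.tmp';
        file_put_contents($file, str_repeat('x', rand(4 * 1024, 50 * 1024))); // 4-50 KB
        $data = file_get_contents($file);
        unlink($file);
        $i++;
    }
    printf("5000 create/read/delete cycles took %.3f seconds\n", microtime(true) - $start);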
If you are on a shared hosting plan then your only option might be to implement the benchmarking idea at various peak and down times. I will bet that a 2am benchmark will fare much better than a 2pm one.
Good luck!
Theoretically, the number of files matters, but in practice it has little effect. For example, splitting one file into 2 files won't be noticeable, but if you divide a file into 100 files then it might matter.
Hello to everyone on this WONDERFUL site!!!!
I am in the process of coding a PHP script, and it is expected to have over 5000 lines of code when finished. Each 100 lines or so will be separated by elseifs, so only about 100 lines will need to be processed when it runs.
My question is: does PHP process every line, or will it literally skip lines if the conditions are not met? I want to know if it makes a difference in processing time. Is one large file broken up with elseifs the same as multiple files?
Thank you all in advance!
The skipped lines will still need to be parsed and compiled, which can result in a significant overhead for each execution of the script.
However, if you use a PHP accelerator that caches the compiled bytecode, this overhead will disappear completely.
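For example, assuming the OPcache extension (one such bytecode cache) is available, a quick sanity check might look like this:

    <?php
    // Sketch: check whether a bytecode cache (here, OPcache) is actually active.
    if (function_exists('opcache_get_status') && ($status = opcache_get_status(false)) !== false) {
        echo $status['opcache_enabled'] ? "OPcache is enabled\n" : "OPcache is loaded but disabled\n";
    } else {
        echo "No OPcache available - every request re-parses and re-compiles the script\n";
    }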
Indeed, PHP will preprocess all of your code and compile it to memory. Then only the parts where the conditions are met will be executed.
So loading thousands of lines of code is slower than loading a few, but loading one big PHP file is faster than loading many small ones because of disk accesses.
PHP will have to look at all the code before it starts. Having it in one big file might be quicker, but it's one helluva kahuna to maintain.
Consider control statements like switch if you're writing lots of if..else.. stuff (see the sketch after these points).
Consider caching plugins to speed stuff up
Consider reducing code redundancy by using functions and modularising code. From your description, I have an awful vision of a main() with 5000 lines, which will be a nightmare to pick up in 6 months' time.
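To make the last two points concrete, here is a hypothetical sketch (the action names are made up) of a switch that dispatches to small functions instead of one 5000-line if/elseif chain:

    <?php
    // Hypothetical sketch: dispatch on a single value and keep each branch
    // in its own small function (or its own include file).
    function handleLogin()  { /* ~100 lines of login handling */ }
    function handleSearch() { /* ~100 lines of search handling */ }
    function handleLogout() { /* ~100 lines of logout handling */ }

    $action = isset($_GET['action']) ? $_GET['action'] : '';

    switch ($action) {
        case 'login':  handleLogin();  break;
        case 'search': handleSearch(); break;
        case 'logout': handleLogout(); break;
        default:       echo 'Unknown action';
    }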
I have a PHP function file with ~2,000 lines, and I always include it for each page. Do that many lines make my site slower? Should I separate the function file into different files? Thank you.
Why do you have 2k lines of code that must run on every page? Depending on what it's doing (processing / database calls / etc.) it could significantly slow down the site, not to mention clutter up the application itself.
If I'm understanding you right, it's simply a bunch of functions? If so, they aren't run unless you call them, and as such won't slow things down noticeably.
If there are only functions inside that file, then it shouldn't matter much. The functions aren't executed until you actually call them, although the whole file is still parsed and checked for errors, so that may take some time.
And if by 2000 rows you mean 2000 lines, then you shouldn't worry too much. Yes, it is ideal to separate related functions into different files, but if you're going to include all of those files anyway, you're just adding to your overhead with more include calls.
Any number of additional lines will slow things down to some small degree, but it's usually not enough to notice. And 2000 lines isn't a lot -- for one file it kinda is, but most frameworks include that much without batting an eye, and they run fine.
Don't worry about it unless the site is actually slow -- and consider algorithmic improvements before worrying about something like whether PHP is parsing too much.
It would have some effect, but it would be very minimal and more than likely it will be nothing noticeable.
It comes down to how fast your server's disk system is vs. how fast the CPU is.
Disk speed will determine how fast those 2000 lines can get found on the disk and slurped into memory and fed into the PHP parser. CPU speed will determine how fast the PHP parser can process the 2000 lines (and I'm ignoring other factors such as memory speed, we'll just pretend the computer is just a cpu with a disk).
The only way to say for sure is to benchmark it. Maybe one big file is far faster than multiple smaller ones on your particular development server, but the exact opposite on the production machine.
So, let's fake up some numbers. Assume the filesystem and physical disk together take a constant 0.5 seconds to locate any file on the disk, and 0.2 seconds "per 1000 lines" to load it. Then let's assume that PHP's parser has perfect performance and takes exactly 0.1 seconds to parse 1000 lines, with no startup overhead. Of course these are ludicrous numbers, but it's just for illustration.
So, with your 2000 line file, we end up with:
0.5seconds to locate on disk
2000/1000 * 0.2 = 0.4 seconds to load
2000/1000 * 0.1 = 0.2 seconds to parse
= 1.1 seconds in total to make your 2000 line file available for use,
of which 0.5 seconds is locating the file
and 0.6 seconds is loading/parsing it.
Now let's say you've rejigged things and split the file into smaller chunks. You start using one particular script heavily, which requires 3 of those smaller scripts to be loaded. Let's pretend those smaller chunks are all 500 lines each. So.
0.5 seconds * 3 files = 1.5 seconds to locate on disk
500/1000 * 0.2 * 3 = 0.3 seconds to load
500/1000 * 0.1 * 3 = 0.15 seconds to parse
= 1.95 seconds in total, much slower than the single bigger file,
of which 1.5 seconds is locating the files
and 0.45 seconds is loading/parsing them.
So, with this contrived example, you've reduced your loading/parsing overhead by 0.15 seconds (25% reduction), but you've TRIPLED the disk time.
So, again, the only way to say what'll work best in your situation is to benchmark it. Run a series of loads with the single large monolithic file, versus a series of loads with multiple smaller fragments.
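A rough sketch of such a benchmark (the file names are made up, and each file is included only once per run to avoid redeclare errors, so run the script several times and average the numbers):

    <?php
    // Hypothetical benchmark: time one cold include of the monolithic file
    // against three smaller fragments.
    $t = microtime(true);
    include 'big_monolithic.php';
    $big = microtime(true) - $t;

    $t = microtime(true);
    include 'fragment_a.php';
    include 'fragment_b.php';
    include 'fragment_c.php';
    $fragments = microtime(true) - $t;

    printf("one big file: %.5f s, three fragments: %.5f s\n", $big, $fragments);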
I'm currently rewriting my site using my own framework (it's very simple and does exactly what I need; I've no need for something like Zend or CakePHP). I've done a lot of work making sure everything is cached properly, caching pages in files to avoid SQL queries and generally limiting the number of SQL queries.
Overall it looks like it's very speedy. The average time taken for the front page (taken over 100 times) is 0.046152 microseconds.
But one thing I'm not sure about is whether I've done enough to reduce PHP memory usage. The only time I've ever encountered problems with it is when uploading large files.
Using memory_get_peak_usage(TRUE), which I THINK returns the highest amount of memory used whilst the script has been running, the average (taken over 100 times) is 1572864 bytes.
Is that good?
I realise you don't know what it is I'm doing (it's rather simple: get the 10 latest articles, the comment count for each, the user controls, popular tags in the sidebar, etc.). But would you be at all worried about a script using that sort of memory getting hit 50,000 times a day? Or once every second at peak times?
I realise that this is a very open-ended question. Hopefully you can understand that it's a bit of a stab in the dark, and I'm really just looking for some reassurance that it's not going to die horribly come relaunch day.
EDIT: Just a mini experiment I did for myself. I downloaded and installed WordPress; a default installation with no extra add-ons, just one user and one post, used 10.5 megabytes of memory, or "11010048 bytes". Quite pleased with my 1.5 MB now.
Memory usage values can vary heavily and are subject to fluctuation, but as you already say in your update, a regular WordPress instance is much, much fatter than that. I have had great trouble getting the WordPress backend running with a memory_limit of sixteen megabytes - let alone when plug-ins come into play. So from that, I'd say a peak of 1.5 megabytes performing normal tasks is quite okay.
Generation time obviously depends heavily on the hardware your site runs on. However, a generation time of 0.046152 seconds (I assume you mean seconds here) sounds fine to me under normal circumstances.
It is a subjective question. PHP has a lot of overhead, and when you call the function with TRUE, that overhead is included. You'll see what I mean when you call the function in a simple Hello World script. Also keep in mind that results can differ greatly depending on whether PHP runs as an Apache module or via FastCGI.
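For example, a sketch of that Hello World check (the exact numbers will vary by PHP version and SAPI):

    <?php
    // Bare "Hello World": whatever memory_get_peak_usage() reports here is
    // essentially PHP's own baseline overhead, not your application's.
    echo "Hello World\n";
    printf("peak (real allocated): %d bytes\n", memory_get_peak_usage(true));
    printf("peak (used by script): %d bytes\n", memory_get_peak_usage(false));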
Unfortunately, no one can provide assurances. There will always be unforeseen variables that can bring down a site. Perform load testing. Use a code profiler to narrow down the location of any bottlenecks to see if there are ways to make those code blocks more efficient.
Encyclopaedia Britannica thought they were prepared when they launched their ad-supported encyclopedia ten years ago. The developers didn't know they would be announcing it on Good Morning America the day of the launch. The whole thing came crashing down for days.
As long as your systems aren't swapping, your memory usage is reasonable. Any additional concern is just premature optimization.
Is there some way to execute a PHP script every 40 milliseconds?
I don't know if a cron job is the right way, because 25 times per second requires a lot of CPU.
Well, if PHP isn't the correct language, what language should I use?
I am making an online game, and I need something to process what is happening in the game: move the characters, calculate projectile paths, etc.
If you try to invoke a PHP script every 40 milliseconds, that will involve:
Creating a process
Loading PHP
Loading and compiling the script
Running the compiled script
Removing the process and all of its memory
You're much better off putting your work into the body of a loop, and then using time_sleep_until at the end of the loop to finish out the rest of your 40 milliseconds. Then you run your PHP program once.
Keep in mind, this needs to be a standalone PHP program; running it out of a web page will cause the web server to time out on that page and end your script prematurely.
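A minimal sketch of that loop, assuming a hypothetical gameTick() function holds the per-tick work:

    <?php
    // Long-running CLI loop: do one tick of work, then sleep until the next
    // 40 ms boundary. gameTick() is a made-up placeholder.
    function gameTick() {
        // move characters, update projectile paths, ...
    }

    $interval = 0.040; // 40 milliseconds
    $next = microtime(true);
    while (true) {
        gameTick();
        $next += $interval;
        if ($next > microtime(true)) {
            time_sleep_until($next);      // sleep off the rest of this tick
        } else {
            $next = microtime(true);      // the tick overran; don't try to catch up
        }
    }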
Every 40 milliseconds would be impressive. It's not really suited for cron, which runs on 1-minute boundaries.
Perhaps if you explained why you need that level of performance, we could make some better suggestions.
Another thing you have to understand is that it takes time to create processes under UNIX - this may be better suited to a long running task started once and just doing the desired activity every 40ms.
Update: For an online game with that sort of performance, I think you seriously need to consider having a fat client running on the desktop.
By that I mean a language compiled to machine language (not interpreted) and where the bulk of the code runs on the client, using the network only for transmitting information that needs to be shared.
I don't doubt that interpreted languages are suitable for less performance-intensive games, but from personal experience I don't think you'll be able to get away with them for this purpose.
PHP is a slow, interpreted language. Just opening a file takes it almost that amount of time. Executing a PHP script every 40 milliseconds would lead to a huge queue and a crash very quickly. This definitely sounds like a task you don't want to use PHP for; use a daemon or other fast, compiled binary instead. What are you looking to do?
As far as I know, a cron job can only be executed every minute; that's the smallest interval possible. I'm left wondering why you need such a short interval between executions.
If you really want it to be PHP, I guess you should keep the process running through a shell, as some kind of daemon, instead of opening/closing it all the time.
I do not know how to do it but I guess you can at least get some inspiration from this post:
http://kevin.vanzonneveld.net/techblog/article/create_daemons_in_php/
As everyone else is saying, starting a new process every 40 ms doesn't sound like a good idea. It would be interesting to know what you're trying to do. What do you want to do if one execution for some reason takes more than 40 ms? If you're not careful you might get lots of processes running simultaneously, stepping on each other's toes.
Which language depends a lot on what you're trying to do, but you should choose a language with thread support so you don't have to fork a new process all the time. Java or Python might be suitable.
I'm not so sure every 40 ms is realistic if the back-end job has to deal with things like database queries. You'd probably do better working out a way to be adaptive to system conditions and trying hard to run N times per second, rather than every 40 ms like clockwork. Again, this depends on the complexity of what you need to accomplish behind the curtain.
PHP is probably not the best language to write this with. This is for several reasons:
Depending on the version of PHP, garbage collection may be broken. If you daemonize, you run a risk of leaking memory N times a second.
Other reasons detailed in this answer.
Try using C or Python and keep track of how long each iteration takes. This lets you make a 'best effort' to run N times a second, or every 40 ms, whichever is greater. This avoids your process falling perpetually behind, where every time it finishes it's already late to get started again.
Again, I'm not sure how long these tasks should take under a 'worst case' system load, so my answer may or may not apply in full. Regardless, I advise you not to write a standalone daemon in PHP.
PHP is the wrong language for this job. If you want to do something updating that fast in a browser, you need to use JavaScript. PHP is only for the backend, which means everything PHP does has to be sent from your server to the browser and then rendered.