What kind of ideas or tips and tricks do you have for boosting PHP performance?
Something like,
I use:
$str = 'my string';
if(isset($str[3]))
Instead of:
if(strlen($str) > 3)
Which is a bit faster.
Or storing values as keys instead of as values in an array, which makes checking whether an entry exists much faster. Hence using isset($arr[$key]) instead of in_array($key, $arr) (or array_key_exists($key, $arr)).
Shoot your ideas, I would love to hear them.
Use a profiler and measure your performance.
Optimise the areas that need it.
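When a full profiler is not at hand, a rough timing harness already shows whether a micro-optimization matters. The sketch below assumes PHP 7.3 or later (for hrtime()); the bench() helper and the iteration count are made up for illustration, and the numbers will differ between machines and PHP versions.

```php
<?php
// Hypothetical helper: run a callable many times and return the elapsed milliseconds.
function bench(callable $fn, int $iterations = 100000): float
{
    $start = hrtime(true);                 // monotonic clock, in nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return (hrtime(true) - $start) / 1e6;  // nanoseconds to milliseconds
}

$str = 'my string';

$viaIsset  = bench(function () use ($str) { return isset($str[3]); });
$viaStrlen = bench(function () use ($str) { return strlen($str) > 3; });

printf("isset:  %.2f ms\nstrlen: %.2f ms\n", $viaIsset, $viaStrlen);
```

If the two figures are within noise of each other, the more readable variant is the better choice.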
Typical areas that will give you the most bang for your effort in a typical PHP website:
Think about database queries carefully. They often take up most of the execution time.
Don't include code you don't need.
Don't write your own versions of the built-in functions: the built-in ones are compiled C and will be faster than your L33T PHP version.
Use an opcode cache.
Most PHP accelerators work by caching the compiled bytecode of PHP scripts to avoid the overhead of parsing and compiling source code on each request (some or all of which may never even be executed). To further improve performance, the cached code is stored in shared memory and directly executed from there, minimizing the amount of slow disk reads and memory copying at runtime.
Don't do a lot of these things; you'll make your code unreadable, or at least harder to read, even for yourself.
Leave these things to the Zend Engine, or to the accelerator of your choice (actually an opcode cache).
Optimizations like these may be faster now, but they may actually get slower if the Zend Engine developers start to auto-optimize things like these.
Ex: one may speed up the strlen() function a lot by giving up on z-strings (zero-terminated) and using l-strings (the length being stored in the first char, or word). This in turn would end up making your (pre-optimized) script slower if you optimize like this.
Use parameterized SQL instead of building query strings for mysql_query(). And reduce the overall number of database queries. Everything else is a shallow optimization.
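A minimal sketch of what parameterized SQL can look like with PDO. To keep it self-contained it assumes the pdo_sqlite driver and an in-memory SQLite database standing in for MySQL; the users table and its columns are made up for the example.

```php
<?php
// Assumption: in-memory SQLite replaces MySQL so the example runs without a server.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

// Prepare once, execute many times; the values are never spliced into the SQL text.
$insert = $db->prepare('INSERT INTO users (name) VALUES (:name)');
foreach (['alice', "o'brien"] as $name) {
    $insert->execute([':name' => $name]);
}

// The same placeholder style works for lookups.
$select = $db->prepare('SELECT id FROM users WHERE name = :name');
$select->execute([':name' => "o'brien"]);
$id = $select->fetchColumn();

echo "id of o'brien: ", $id, "\n";
```

With a MySQL server only the DSN passed to the PDO constructor would change; the prepare/execute calls stay the same.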
Related
I've been working with MySQL for a while, but I've never really used the supported mathematical functions, such as FLOOR(), SQRT(), CRC32(), etc.
Is it faster / better to use these functions in queries rather than just doing the same on the result set with PHP?
EDIT: I don't think this question is a duplicate of this, as my question is about mathematical functions, listed on the page I linked, not CONCAT() or NOW() or any other function as in that question. Please consider this before flagging.
It is more efficient to do this in PHP.
Faster depends on the machines involved, if you're talking about faster for one user. If you're talking about faster for a million users hitting a website, then it's more efficient to do these calculations in PHP.
The load of a webserver running PHP is very easily distributed over a large number of machines. These machines can run in parallel, handling requests from visitors and fetching necessary information from the database. The database, however, is not easy to run in parallel. Issues such as replication or sharding are complex and can require specialty software and properly organized data to function well. These are expensive solutions compared to adding another PHP installation to a server array.
Because of this, a CPU cycle on the database machine is far more valuable than one on the webserver. So you should perform these math functions on the webserver, where CPU cycles are cheaper and significantly easier to parallelize.
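As an illustration of moving the math to the web server, the sketch below applies PHP's sqrt(), floor() and crc32() to rows that were fetched with a plain SELECT. The rows and column names are made up, and the result types may differ slightly from what MySQL would return (for instance, PHP's floor() returns a float).

```php
<?php
// Made-up rows, standing in for a result set fetched without any SQL math.
$rows = [
    ['name' => 'alpha', 'area' => 16.0, 'price' => 2.7],
    ['name' => 'beta',  'area' => 2.25, 'price' => 9.99],
];

// Do the calculations in PHP instead of in the query.
foreach ($rows as &$row) {
    $row['side']     = sqrt($row['area']);    // instead of SQRT(area)
    $row['price_lo'] = floor($row['price']);  // instead of FLOOR(price)
    $row['checksum'] = crc32($row['name']);   // instead of CRC32(name)
}
unset($row); // drop the reference left over from the loop

print_r($rows);
```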
There's no general answer to that. You certainly shouldn't go out of your way to do math in SQL instead of PHP; it really doesn't make that much of a difference, if there is any. However, if you're doing an SQL query anyway, and you have the choice of doing the operation in PHP before you send it to MySQL or in the query itself... it still won't make much of a difference. Oftentimes there will be a logical difference, in terms of when and how often exactly the operation is performed and where it needs to be performed and where the code to do this is best kept for good maintainability and reuse. That should be your first consideration, not performance.
Overall, you have to do some really complex math for any of it to make any difference whatsoever. Some simple math operations are trivial in virtually any language and environment. If in doubt, benchmark your specific case.
Like deceze said there's probably not going to be much difference in speed between using math functions in SQL and PHP. If you're really worried, you should always benchmark both of your use cases.
However, one example comes to mind where it is probably better to use SQL math functions than PHP math functions: when you don't need to perform any additional operations on the results from the DB. If you do your operations in MySQL, you avoid having to loop through results in PHP.
There is an additional consideration to think of and that's scaling. You usually have one MySQL server (if you have more than one, then you probably already know all of this). But you can have many web servers connecting to the same MySQL server (e.g. behind a load balancer).
In that case, it's going to be better to move the computation to PHP to take the load off MySQL. It's "easier" to add more web servers than to increase the performance of MySQL. In theory you can add an unlimited number of web servers, but the amount of memory and the processor speed / number of cores in a MySQL server is finite. You can scale MySQL in some other ways, like using MySQL Cluster or doing master-slave replication and reading from slaves, but that will always be more complicated / harder to do.
MySQL is faster within the scope of an SQL query. PHP is faster in PHP code. If you issue an SQL query just to compute SQRT(), it will definitely be slower (unless PHP is broken) because of the MySQL parser and networking overhead.
As my web projects are getting bigger, I wonder if PHP interprets this code,
<?php
function helloWorldOutput($helloworldVariable) {
    echo 'Hello World' . $helloworldVariable;
}
helloWorldOutput("I am PHP");
?>
slower than this:
<?php
function a($b) { echo 'Hello World' . $b; }
a("I am PHP");
?>
Because PHP is an interpreted language without compiled binary I think the second sample should be a bit faster. Is that true, and is there any kind of pre-interpreting mechanic which caches a faster version of the code in PHP?
Yes, it will take some extra time to parse/compile the larger code to byte code. The time is usually negligible, so you should probably just not worry about it since there are better ways to deal with the time spent compiling.
What gives quite a bit more of a performance boost is a PHP accelerator, for example APC, which caches the compiled code and eliminates the whole compile step except on the first access to a page.
Using an accelerator will remove any possible downside with keeping your code commented and clear, and lets you concentrate on functionality instead of shortening your code.
Parsing the first version and the calls to it will take longer. So if you use the first version and call a function with a name that long many, many times, the second version will be slightly faster, purely because of the parsing. As for the actual function execution: no, both functions will be equally fast.
Still my advice is do not ever attempt to do such micro-optimizations. Performance will improve just slightly, readability will suffer greatly.
the second example has fewer characters, which means it's faster to parse,
php runs some sort of bytecode internally so execution speed will not differ much.
the slowest bit is probably reading the file from disk, and short code will win that race easily.
I generally include 1 functions file into the header of my site. This site is pretty high traffic and I like to make every little thing the best that I can, so my question here is:
Is it better to include multiple smaller function files with just the code that's needed for that page, or does it really make no difference to just load it all as 1 big file? My current functions file has all the functions for my whole site; it's about 4,000 lines long and is loaded on every single page load sitewide. Is that bad?
It's difficult to say. 4,000 lines isn't that large in the realms of file parsing. In terms of code management, that's starting to get on the unwieldy side, but you're not likely to see much of a measurable performance difference by breaking it up into 2, 5 or 10 files, and having pages include only the few they need (it's better coding practice, but that's a separate issue). Your differential in number-of-lines read vs. number-of-files that the parser needs to open doesn't seem large enough to warrant anything significant. My initial reaction is that this is probably not an issue you need to worry about.
On the opposite side of the coin, I worked on an enterprise-level project where some operations had an include() tree that often extended into the hundreds of files. Profiling these operations indicated that the time taken by the include() calls alone made up 2-3 seconds of a 10 second load operation (this was PHP4).
If you can install extensions on your server, you should take a look at APC.
It is free, by the way ;-) but you must be admin of your server to install it, so it's generally not provided on shared hosting...
It is what is called an "opcode cache".
Basically, when a PHP script is called, two things happen:
the script is "compiled" into opcodes
the opcodes are executed
APC keeps the opcodes in RAM, so the file doesn't have to be re-compiled each time it is called, and that's a great thing for both CPU load and performance.
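To check whether an opcode cache is actually active, something like the following can be dropped into a test page. Note the swap: this sketch assumes OPcache (bundled with PHP since 5.5) rather than APC, since that is the opcode cache current PHP versions ship with.

```php
<?php
// Assumption: OPcache, not APC, is the opcode cache being checked.
$enabled = false;
if (function_exists('opcache_get_status')) {
    $status  = opcache_get_status(false);   // false: leave out the per-script details
    $enabled = is_array($status) && !empty($status['opcache_enabled']);
}

echo $enabled ? "opcode cache is active\n" : "opcode cache is not active\n";
```

On the command line this usually reports "not active", because OPcache is disabled for CLI scripts unless opcache.enable_cli is turned on.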
To answer the question a bit more:
4,000 lines is not that much, performance-wise. Open a couple of files of any big application or framework, and you'll rapidly get to a couple thousand lines
a really important thing to take into account is maintainability: what will be easier to work with for you and your team?
loading many small files might imply many system calls, which are slow; but those would probably be cached by the OS... So probably not that relevant
If you are doing even 1 database query, this one (including network round-trip between PHP server and DB server) will probably take more time than the parsing of a couple thousand lines ;-)
I think it would be better if you could split the functions file up into components that are appropriate for each page, and load those components in the appropriate pages. Just my 2 cents!
p/s: I'm a PHP amateur and I'm trying my hands on making a PHP site; I'm not using any functions. So can you enlighten me on what functions would you need for a site?
In my experience having a large include file which gets included everywhere can actually kill performance. I worked on a browser game where we had all game rules as dynamically generated PHP (among others) and the file weighed in at around 500 KiB. It definitely affected performance and we considered generating a PHP extension instead.
However, as usual, I'd say you should do what you're doing now until it is a performance problem and then optimize as needed.
If you load a 4000 line file and use maybe 1 function that is 10 lines, then yes I would say it is inefficient. Even if you used lots of functions of a combined 1000 lines, it is still inefficient.
My suggestion would be to group related functions together and store them in separate files. That way if a page only deals with, for example, database functions you can load just your database functions file/library.
Another reason for splitting the functions up is maintainability. If you need to change a function you need to find it in your monolithic include file. You may also have functions that are very, very similar but don't even realise it. Sorting functions by what they do allows you to compare them and get rid of things you don't need or merge two functions into one more general purpose function.
Most of the time disk I/O is what will kill your server, so I think the fewer files you fetch from disk the better. Furthermore, if it is possible to install APC, then the file will be stored compiled in memory, which is a big win.
Generally it is better, file management wise, to break stuff down into smaller files because you only need to load the files that you actually use. But, at 4,000 lines, it probably won't make too much of a difference.
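I'd suggest a solution similar to this:
// Load a function library on demand ("/path/to/lib/" is a placeholder).
function inc_lib($name)
{
    include_once "/path/to/lib/" . $name . ".lib.php";
}
// Load a class file on demand.
function inc_class($name)
{
    include_once "/path/to/lib/" . $name . ".class.php";
}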
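For classes, the same load-only-what-is-used idea can be handed to PHP itself through spl_autoload_register(). The sketch below is only an illustration: the temporary directory, the Greeter class and the <ClassName>.class.php naming scheme are all made up so that the example can run on its own.

```php
<?php
// Hypothetical layout for the demo: one class per file, <ClassName>.class.php, in $libDir.
$libDir = sys_get_temp_dir() . '/autoload_demo_' . getmypid();
if (!is_dir($libDir)) {
    mkdir($libDir, 0777, true);
}
file_put_contents(
    $libDir . '/Greeter.class.php',
    "<?php\nclass Greeter\n{\n    public function hello() { return 'hello'; }\n}\n"
);

// PHP calls this closure the first time an unknown class name is used.
spl_autoload_register(function (string $class) use ($libDir) {
    $file = $libDir . '/' . $class . '.class.php';
    if (is_file($file)) {
        require_once $file;
    }
});

$greeter = new Greeter();      // Greeter.class.php is loaded here, on demand
echo $greeter->hello(), "\n";
```

Autoloading only covers classes, interfaces and traits; plain function libraries still need an explicit include such as inc_lib().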
What are some of the most expensive operations in PHP? I know things like overusing the @ operator can be expensive. What else would you consider?
serialize() is slow, as is eval(), create_function(), and spawning additional processes via system() and related functions.
beware of anything APC can't cache -- conditional includes, eval()ed code, etc.
Opening database connections. Always cache your connections and re-use them.
Object cloning
Regular expressions. Always use the normal string operations over a regular expression operation if you don't need the functionality of a regexp, e.g. use str_replace() over preg_replace() where possible.
Logging and disk writes can be slow - eliminate unnecessary logging and file operations
Some micro-optimizations that are good practice, but won't make much difference to your bottom line performance:
Using echo is faster than print
Concatenating variables is faster than using them inline in a double-quoted string.
Using echo with a list of arguments is faster than concatenating the arguments. Example: echo 'How are you ',$name,' I am fine ',$var1 is faster than echo 'How are you '.$name.' I am fine '.$var1
Develop with Notices and Warnings turned on. Making sure they don't get triggered saves PHP from having to run error control on them.
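Two of the points above, sketched with a made-up sample string: a fixed substring can be handled by plain string functions without involving the regex engine, and echo accepts several arguments so nothing has to be concatenated just for output.

```php
<?php
$text = 'The quick brown fox';

// Fixed substring: plain string functions are enough.
$viaString = str_replace('quick', 'slow', $text);
$hasFox    = strpos($text, 'fox') !== false;

// The same replacement through the regex engine; only worth it when a real pattern is needed.
$viaRegex = preg_replace('/quick/', 'slow', $text);

// echo takes a comma-separated list of arguments.
echo $viaString, ' / ', $viaRegex, "\n";
```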
Rather than trying to figure out potential areas that are slow, use a profiling tool. Installing Xdebug was probably one of the easiest and best things I've done to improve the code I write. Pair it with WinCacheGrind (or the equivalent viewer for your OS) for best results.
"Hello $name"
syntax is slower than
'Hello ' . $name
also, magic methods such as __get(), __set(), __call(), etc. are slow
and, if you care so much, you can use optimized structures from SPL
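A short sketch of two SPL structures, with made-up values. Whether they actually beat a plain array depends on the PHP version and the workload, so treat this as a starting point for your own measurements.

```php
<?php
// SplFixedArray: integer-indexed and fixed in size.
$fixed = new SplFixedArray(3);
$fixed[0] = 10;
$fixed[1] = 20;
$fixed[2] = 30;

$sum = 0;
foreach ($fixed as $value) {
    $sum += $value;
}

// SplQueue: first in, first out.
$queue = new SplQueue();
$queue->enqueue('first');
$queue->enqueue('second');
$next = $queue->dequeue();

echo $sum, ' ', $next, ' ', count($queue), "\n";
```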
Anything that's going though a network connection -- like calling a webservice, for instance : it'll generally take more time than doing an operation locally.
(Even if it doesn't cost much CPU, it'll cost time)
I'd say SQL queries inside loops. Such as this:
foreach ($db->query('SELECT * FROM categories') as $cat)
{
    foreach ($db->query('SELECT * FROM items WHERE cat_id = ' . $cat['cat_id']) as $item)
    {
    }
}
Which, for the record, could be replaced by a single query, something like this:
$sql = 'SELECT c.*, i.*
FROM categories c
LEFT JOIN items i USING (cat_id)
ORDER BY c.cat_order';
foreach ($db->query($sql) as $row)
{
}
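The single joined query returns flat rows, so the nesting has to be rebuilt in PHP. A minimal sketch of that regrouping step follows; the rows are hard-coded and the cat_name, item_id and item_name columns are assumptions, since the original tables are not shown.

```php
<?php
// Made-up rows in the shape a categories/items LEFT JOIN would return.
$rows = [
    ['cat_id' => 1, 'cat_name' => 'Books', 'item_id' => 10,   'item_name' => 'Novel'],
    ['cat_id' => 1, 'cat_name' => 'Books', 'item_id' => 11,   'item_name' => 'Atlas'],
    ['cat_id' => 2, 'cat_name' => 'Games', 'item_id' => null, 'item_name' => null], // category without items
];

// One pass over the rows, grouping items under their category.
$categories = [];
foreach ($rows as $row) {
    $id = $row['cat_id'];
    if (!isset($categories[$id])) {
        $categories[$id] = ['name' => $row['cat_name'], 'items' => []];
    }
    if ($row['item_id'] !== null) {
        $categories[$id]['items'][] = $row['item_name'];
    }
}

print_r($categories);
```

The number of queries stays at one no matter how many categories exist.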
curl_exec() is very slow, compared to typical operations. Also, most str_* operations are faster than regex operations.
json_encode is faster than serialize
Concatenating in a loop is faster than implode()
People think that @ is expensive maybe only because this saying is quite widespread on the web.
quoting from: http://www.php.net/manual/en/language.operators.errorcontrol.php#102543
If you're wondering what the performance impact of using the @
operator is, consider this example. Here, the second script (using
the @ operator) takes 1.75x as long to execute...almost double the
time of the first script.
So while yes, there is some overhead, per iteration, we see that the @
operator added only .005 ms per call. Not reason enough, imho, to
avoid using the @ operator.
real 0m7.617s user 0m6.788s sys 0m0.792s
vs
real 0m13.333s user 0m12.437s sys 0m0.836s
It is nearly impossible to "overuse" an operator, and it is often worth using if it does the operation you want.
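Where the error-control operator (@) is only there to silence a missing array key, the warning can be avoided instead of suppressed. A small sketch with a made-up $config array:

```php
<?php
$config = ['host' => 'localhost'];   // made-up data; 'port' is deliberately missing

$suppressed = @$config['port'];                                 // warning is raised, then silenced; yields null
$coalesced  = $config['port'] ?? 3306;                          // PHP 7+: no warning is raised at all
$checked    = isset($config['port']) ? $config['port'] : 3306;  // same idea on older PHP

var_dump($suppressed, $coalesced, $checked);
```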
foreach() statements, especially with nesting, are frequently expensive; though that's as much my naive -and occasionally poorly-planned- approach to programming as php's fault.
Though I think it's true, also, of JS and other languages, so almost certainly my fault. =/
From my own experience the most expensive operation in real terms is the echo statement. Try to join all strings together before outputting them to the browser. Next come database calls, especially joins!
Code can also sometimes get a 10x performance increase simply by refactoring your algorithms and data structures. Take any program and try to halve its length; can you halve it again?
uniqid() is stupid expensive. Don't use it to generate lots of unique identifiers.
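One possible alternative, assuming the identifiers only need to be unique and hard to guess rather than time-ordered like uniqid() output: build them from random_bytes() (PHP 7+). The random_id() helper is made up for this sketch, and whether it is faster on a given system is worth benchmarking.

```php
<?php
// Hypothetical helper: a random hex identifier, 2 hex characters per byte.
function random_id(int $bytes = 16): string
{
    return bin2hex(random_bytes($bytes));
}

// Generate a batch; using the ids as array keys shows whether any collided.
$ids = [];
for ($i = 0; $i < 1000; $i++) {
    $ids[random_id()] = true;
}

echo count($ids), " distinct ids\n";
```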
A few minutes ago, I asked whether it was better to perform many queries at once at log in and save the data in sessions, or to query as needed. I was surprised by the answer, (to query as needed). Are there other good rules of thumb to follow when building PHP/MySQL multi-user apps that speed up performance?
I'm looking for specific ways to create the most efficient application possible.
hashing
know your hashes (arrays/tables/ordered maps/whatever you call them). a hash lookup is very fast, and sometimes, if you have O(n^2) loops, you may reduce them to O(n) by organizing them into an array (keyed by primary key) first and then processing them.
an example:
foreach ($results as $result)
    if (in_array($result->id, $other_results))
        $found++;
is slow - in_array loops through the whole $other_results, resulting in O(n^2).
foreach ($other_results as $other_result)
    $hash[$other_result->id] = true;
foreach ($results as $result)
    if (isset($hash[$result->id]))
        $found++;
the second one is a lot faster (depending on the result sets - the bigger, the faster), because isset() is (almost) constant time. actually, this is not a very good example - you could do this even faster using built in php functions, but you get the idea.
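One way the built-in functions could be used for the same count, as a sketch: it assumes the ids are unique integers or strings (array_flip() would collapse duplicates), and the two result lists are made up.

```php
<?php
// Made-up result sets: objects with a public id property.
$results       = [(object) ['id' => 1], (object) ['id' => 2], (object) ['id' => 3]];
$other_results = [(object) ['id' => 2], (object) ['id' => 3], (object) ['id' => 4]];

// array_column() can read public properties from objects (PHP 7.0+).
$ids      = array_column($results, 'id');
$otherIds = array_column($other_results, 'id');

// array_flip() turns the ids into keys; array_intersect_key() then compares keys only.
$found = count(array_intersect_key(array_flip($ids), array_flip($otherIds)));

echo $found, "\n";
```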
optimizing (My)SQL
my.cnf: i don't have any idea how much performance you can gain by optimizing your mysql configuration instead of leaving the default. but i've read you can ignore every postgresql benchmark that used the default configuration. afaik configuration matters less with mysql, but why ignore it? rule of thumb: try to fit the whole database into memory :)
explain [query]: an obvious one, a lot of people get wrong. learn about indices. there are rules you can follow, you can benchmark it and you can make a huge difference. if you really want it all, learn about the different types of indices (btrees, hashes, ...) and when to use them.
caching
caching is hard, but if done right it makes the difference (not a difference). in my opinion: if you can live without caching, don't do it. it often adds a lot of complexity and points of failure. google did a bit of proxy caching once (to make the intertubes faster), and some people saw private information of others.
in php, there are 4 different kinds of caching people regularly use:
query caching: almost always translates to memcached (sometimes to APC shared memory). store the result set of a certain query to a fast key/value (=hashing) storage engine. queries (now lookups) become very cheap.
output caching: store your generated html for later use (instead of regenerating it every time). this can result in the biggest speed-ups, but somewhat works against PHPs dynamic nature.
browser caching: what about etags and http responses? if done right you may avoid most of the work right at the beginning! most php programmers ignore this option because they have no idea what HTTP is.
opcode caching: APC, zend optimizer and so on. makes php code load faster. can help with big applications. got nothing to do with (slow) external datasources though, and the potential is somewhat limited.
sometimes it's not possible to live without caches, e.g. when it comes to thumbnails. image resizing is very expensive, but fortunately easy to control (most of the time).
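the query-caching idea boils down to "look up by key, compute only on a miss". a minimal sketch: the cached() helper is hypothetical, and a static array stands in for memcached or APC shared memory, so entries only live for the current request here.

```php
<?php
// Hypothetical helper: return the cached value for $key, or compute and store it for $ttl seconds.
// Assumption: a static array replaces a real key/value store such as memcached.
function cached(string $key, int $ttl, callable $compute)
{
    static $store = [];
    $now = time();
    if (isset($store[$key]) && $store[$key]['expires'] > $now) {
        return $store[$key]['value'];            // hit: skip the expensive work
    }
    $value = $compute();                         // miss: do the work once
    $store[$key] = ['value' => $value, 'expires' => $now + $ttl];
    return $value;
}

// Stand-in for an expensive query; counts how often it really runs.
$calls = 0;
$loadTopItems = function () use (&$calls) {
    $calls++;
    return ['apple', 'pear'];
};

$first  = cached('top_items', 60, $loadTopItems);
$second = cached('top_items', 60, $loadTopItems);

echo 'query ran ', $calls, " time(s)\n";
```

the hard part the text warns about is not this lookup but invalidation: deciding when a stored entry must be thrown away.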
profiler
xdebug shows you the bottlenecks of your application. if your app is too slow, it's helpful to know why.
queries in loops
there are (php-)experts who do not know what a join is (and for every one you educate, two new ones without that knowledge will surface - and they will write frameworks, see schnalle's law). sometimes, those queries-in-loops are not that obvious, e.g. if they come with libraries. count the queries - if they grow with the results shown, there is something wrong.
inexperienced developers do have a primal, insatiable urge to write frameworks and content management systems
schnalle's law
Optimize your MySQL queries first, then the PHP that handles it, and then lastly cache the results of large queries and searches. MySQL is, by far, the most frequent bottleneck in an application. A poorly designed query can take two to three times longer than a well designed query that only selects needed information.
Therefore, if your queries are optimized before you cache them, you have saved a good deal of processing time.
However, on some shared hosts caching is file-system only thanks to a lack of Memcached. In this instance it may be better to run smaller queries than it is to cache them, as the seek time of the hard drive (and waiting for access due to other sites) can easily take longer than the query when your site is under load.
Cache.
Cache.
Speedy indexed queries.