include selectively or globally? - php

With this question, I aim to understand the inner workings of PHP a little better.
Assume that you have a 50K library. The library is loaded with a bunch of handy functions that you use here and there. Also assume that these functions are needed/used by, say, 10% of the pages of your site. But your home page definitely needs it.
Now, the question is... should you use a global include that points to this library - across the board - so that ALL the pages ( including the 90% that do not need the library ) will get it, or should you selectively add that include reference only on the pages you need?
Before answering this question, let me point "why" I ask this question...
When you include that reference, PHP may be caching it. So the performance hit I worry about may be a one-time deal, as opposed to an every-time cost. Once that one-time cost is out of the way, subsequent loads may not be as bad as one might think. That's all because of the smart caching mechanisms that PHP deploys, which I do not have deep knowledge of, hence the question...
Since the front page needs that library anyway, the argument could be why not keep that library warm and fresh in the memory and get it served across the board?
When answering this question, please approach the matter strictly from a caching/performance point of view, not from a convenience point of view just to avoid that the discussion shifts to a programming style and the do's and don'ts.
Thank you

Measure it, then you know.
The caching benefit is likely marginal after the first hit, since the OS will cache the file as well, but that saves only the I/O hit (granted, this is not nothing). However, you will still incur the processing hit. If you include your 50K of code into a "Hello World" page, you will still pay the CPU and memory penalty to load and parse that 50K of source, even if you do not execute any of it. Unless you run an opcode cache, that part of the processing will most likely not be cached in any way.
In general, CPU is extremely cheap today, so it may not be "worth saving". But that's why you need to actually measure it, so you can decide yourself.
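One way to "measure it" is to time the include directly. The following is only a rough sketch: the generated file and the demo_helper_* functions are hypothetical stand-ins for the real 50K library, and the numbers will vary with the machine and with whether an opcode cache is active.

```php
<?php
// Hypothetical stand-in for the 50K library: a generated file of trivial functions.
$libPath = sys_get_temp_dir() . '/demo_lib_' . uniqid() . '.php';
$source = "<?php\n";
for ($i = 0; $i < 1000; $i++) {
    $source .= "function demo_helper_{$i}(\$x) { return \$x + {$i}; }\n";
}
file_put_contents($libPath, $source);

// Time and weigh the include itself; none of the functions are called here.
$memBefore = memory_get_usage();
$start = microtime(true);
require $libPath;
$elapsedMs = (microtime(true) - $start) * 1000;
$memDelta = memory_get_usage() - $memBefore;

printf("library size: %d bytes\n", filesize($libPath));
printf("include time: %.3f ms, memory: %d bytes\n", $elapsedMs, $memDelta);

unlink($libPath);
```

Running the same measurement against the real library, on the real server, is what actually answers the question; a single run is noisy, so repeat it a few times before drawing conclusions.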

The caching I think you are referring to would be opcode caching from something like APC? All that does is prevent PHP from needing to interpret the source each time. You still take some hit for each include or require you are using. One paradigm is to scrap the procedural functions and use classes loaded via __autoload(). That makes for a simple use-on-demand strategy with large apps. Also agree with Will that you should measure this if you are concerned. Premature optimization never helps.
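A minimal sketch of the load-on-demand idea follows. Note that __autoload() was deprecated in PHP 7.2 and removed in PHP 8.0, so the sketch uses spl_autoload_register() instead. The StringTools class and the one-class-per-file directory layout are assumptions made up for the illustration.

```php
<?php
// Hypothetical layout: one class per file, file name equals class name.
$classDir = sys_get_temp_dir() . '/demo_classes_' . uniqid();
mkdir($classDir);
file_put_contents(
    $classDir . '/StringTools.php',
    "<?php\nclass StringTools {\n    public static function shout(\$s) { return strtoupper(\$s) . '!'; }\n}\n"
);

// Register a loader; PHP calls it the first time an unknown class is used.
spl_autoload_register(function ($class) use ($classDir) {
    $file = $classDir . '/' . $class . '.php';
    if (is_file($file)) {
        require $file;
    }
});

$loadedBefore = class_exists('StringTools', false); // false: do not trigger autoload
$result = StringTools::shout('hello');              // first use triggers the loader
$loadedAfter = class_exists('StringTools', false);

var_dump($loadedBefore, $result, $loadedAfter);
```

Pages that never touch StringTools never pay for opening or compiling its file, which is the use-on-demand behaviour described above.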

I very much appreciate your concerns about performance.
The short answer is that, for best performance, I'd probably conditionally include the file on only the pages that need it.
PHP's opcode caches will maintain both include files in a cached form, so you don't have to worry about keeping the cache "warm" as you might when using other types of caches. The cache will remain until there are memory limitations (not an issue with your 50K script), the source file is updated, you manually clear the cache, or the server is restarted.
That said, opcode (PHP bytecode) caching is only one part of the PHP parsing process. Every time a script is run, the bytecode is then processed to build up the functions, classes, objects, and other instance variables that are defined and optionally used within the script. This all adds up.
In this case, a simple change can lead to significant improvement in performance. Be green, every cycle counts :)

Related

Calling one page in another makes website slow?

I am making a website where I have to keep track of logged in users. So in every PHP document, I have written code to connect to the database. If I write the database connecting code in one php document and call it in other PHP documents, will it make my page slow?
Instead of putting all features on one page, what if I design features in different pages and call all features on one page? Will it slow down the loading speed of the website?
It would certainly have some impact, but this should be weighed against the benefits of code organization. In this case, I'd strongly err on the side of code organization, so I suggest breaking up your logic into multiple files. A few points in favor of this approach:
Keep in mind that you are talking server-side only. That means the delay comes from opening local files on the server, rather than, say, sending HTTP requests. This is a very fast operation on any modern computer.
"Premature optimization is the root of all evil". Until you actually have speed issues, bending over backwards for optimization's sake is widely considered a bad idea. This is because optimization tends to obfuscate code while rarely providing appreciable speed benefits. This bogs down developer comprehension and increases the likelihood of bugs.
And, as Andreas pointed out, code reuse is king. Rewriting the same code in multiple places means that making a change requires duplicating that change in all those places, which takes time and (again) increases the likelihood of bugs.

How important is caching for a site's speed with PHP?

I've just made a user-content orientated website.
It is done in PHP, MySQL and jQuery's AJAX. At the moment there is only a dozen or so submissions and already I can feel it lagging slightly when it goes to a new page (therefore running a new MySQL query)
Is it most important for me to try and optimise my MySQL queries (by prepared statements), or is it worth looking at CDNs (Amazon S3) and caching static HTML files (much like the WordPress plugin WP Super Cache) when there hasn't been new content submitted?
Which route is the most beneficial, for me as a developer, to take, ie. where am I better off concentrating my efforts to speed up the site?
Premature optimization is the root of all evil
-Donald Knuth
Optimize when you see issues, don't jump to conclusions and waste time optimizing what you think might be the issue.
Besides, I think you have more important things to work out on the site (like being able to cast multiple votes on the same question) before worrying about a caching layer.
Its done in PHP, MySQL and jQuery's AJAX, at the moment there is only a dozen or so submissions and already i can feel it lagging slightly when it goes to a new page (therefore running a new mysql query)
"Can feel it lagging slightly" – Don't feel it, know it. Run benchmarks and time your queries. Are you running queries effectively? Is the database set up with the right indexes and keys?
That being said...
CDN's
A CDN works great for serving static content. CSS, JavaScript, images, etc. This can speed up the loading of the page by minimizing the time it takes to request all the resources. It will not fix bad query practice.
Content Caching
The easiest way to implement content caching is with something like Varnish. It basically sits in front of your site and re-serves content that hasn't been updated. It is minimally intrusive and easy to set up while being amazingly effective.
Database
Is it most important for me to try and optimise my MySQL queries (by prepared statements)
Why the hell aren't you already using prepared statements? If you're doing raw SQL queries always use prepared statements unless you absolutely trust the content in the queries. Given a user content based site I don't think you can safely say that. If you notice query times running high then take a look at the database schema, the queries you are running per-page, and the amount of content you have. With a few dozen entries you should not be noticing any issue even with the worst queries.
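For reference, a prepared statement with PDO looks roughly like this. It is a sketch under stated assumptions: an in-memory SQLite database (which requires the pdo_sqlite driver) stands in for MySQL so that it runs without a server, and the submissions table is hypothetical.

```php
<?php
// Assumption: pdo_sqlite is available; SQLite in memory stands in for MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE submissions (id INTEGER PRIMARY KEY, author TEXT, body TEXT)');

// Hostile input is bound as data, never spliced into the SQL string.
$author = "Robert'); DROP TABLE submissions;--";
$insert = $pdo->prepare('INSERT INTO submissions (author, body) VALUES (:author, :body)');
$insert->execute([':author' => $author, ':body' => 'First post']);

$select = $pdo->prepare('SELECT author, body FROM submissions WHERE author = :author');
$select->execute([':author' => $author]);
$rows = $select->fetchAll(PDO::FETCH_ASSOC);

print_r($rows);
```

Against MySQL only the DSN and credentials passed to the PDO constructor would change; the prepare/execute calls stay the same.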
I checked out your site and it seems a bit sluggish to me as well, although it's not 100% clear it's the database.
A good first step here is to start on the outside and work your way in. So use something like Firebug (for Firefox), that - like similar plug-ins of its type - will allow you to break down where the time goes in loading a page.
http://getfirebug.com/
Second, per your comment above, do start using PreparedStatements where applicable; it can make a big difference.
Third, make sure your DB work is minimally complete - that means make sure you have indexes in the right place. It can be useful here to run the types of queries you get on your site and see where the time goes. EXPLAIN plans
http://dev.mysql.com/doc/refman/5.0/en/explain.html
and MySQL driver logging (if your driver supports it) can be helpful here.
If the site is still slow and you've narrowed it to use of the database, my suggestion is to do a simple optimization at first. Caching DB data, if feasible, is likely to give you a pretty big bang for the buck here. One very simple solution towards that end, especially given the stack you mention above, is to use Memcached:
http://memcached.org/
After injecting that into your stack, measure your performance + scalability and only pursue more advanced technologies if you really need to. I think you'll find that simple load balancing, caching, and a few instances of your service will go pretty far in addressing basic performance + scalability goals.
In parallel, I suggest coming up with a methodology to measure this more regularly and accurately. For example, decide how you will actually do automated latency measures and load testing, etc.
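The Memcached suggestion above usually takes the form of the cache-aside pattern: look in the cache first, run the query only on a miss, then store the result. The sketch below uses a hypothetical ArrayCache class as an in-process stand-in so it runs without a Memcached server; its get() returns false on a miss, as Memcached::get() does, and the fetch_cached helper and key name are also made up for the illustration.

```php
<?php
// Hypothetical in-process stand-in for a Memcached client.
class ArrayCache
{
    private $items = [];

    public function get($key)
    {
        if (!isset($this->items[$key])) {
            return false;
        }
        [$value, $expiresAt] = $this->items[$key];
        return $expiresAt >= time() ? $value : false;
    }

    public function set($key, $value, $ttlSeconds)
    {
        $this->items[$key] = [$value, time() + $ttlSeconds];
        return true;
    }
}

// Cache-aside: try the cache, fall back to the expensive loader, then store.
function fetch_cached($cache, $key, $ttlSeconds, callable $loader)
{
    $value = $cache->get($key);
    if ($value === false) {
        $value = $loader();
        $cache->set($key, $value, $ttlSeconds);
    }
    return $value;
}

$queryCount = 0;
$loadLatest = function () use (&$queryCount) {
    $queryCount++;                     // stands in for a real MySQL query
    return ['First post', 'Second post'];
};

$cache = new ArrayCache();
$first = fetch_cached($cache, 'latest_submissions', 60, $loadLatest);
$second = fetch_cached($cache, 'latest_submissions', 60, $loadLatest);

printf("loader ran %d time(s)\n", $queryCount);
```

The second call is served from the cache, so the stand-in query runs only once. Counting real queries per page this way is also a cheap first measurement.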
For me, optimising the DB comes first, because with any caching layer in place, once you find a problem you need to rebuild the whole cache.
There are several areas that can be optimized.
Server
CSS/JS/Images
PHP Code/Setup
mySQL Code/Setup
1st, I would use Firefox and the YSlow add-on to evaluate your website's performance; it will give server-based suggestions.
Another solution I have used is this add-on.
http://aciddrop.com/php-speedy/
"PHP Speedy is a script that you can install on your web server to automatically speed up the download time of your web pages."
2nd, I would create a static domain name like static.yourdomainane.com, in a different folder, and move all your images, css, js there. Then point all your code to that domain, and then tweak your web server settings to cache all those files.
3rd, I would look at articles/techniques like this, http://www.catswhocode.com/blog/3-ways-to-compress-css-files-using-php to help compress/optimize your static files like css/js.
4th, review all your images, and their sizes, and make sure they are fully optimized. Or, convert to using css sprites.
http://www.smashingmagazine.com/2009/04/27/the-mystery-of-css-sprites-techniques-tools-and-tutorials/
http://css-tricks.com/css-sprites/
Basically for all your main site images, move them into 1 css sprite, then change your css, to refer to different spots on that sprite to display the image needed.
5th, review your content pages: which pages change frequently, and which ones rarely change. Make the ones that rarely change into static HTML pages. Those that change frequently you can either leave as PHP pages, or create a cron job or scheduled task using the PHP command line to create new static HTML versions of the PHP page.
6th, for mySQL, I recommend you have the slow query log on, to help identify slow queries. Review your table structure and make sure your tables are well designed. Use views and stored procedures to move hard SQL logic from PHP to mySQL.
I know this is a lot, but I hope it's useful.
It depends where your slowdowns really lie. You have a lot of twitter and facebook stuff on there that could easily slow your page down significantly.
Use firebug to see if anything is being downloaded during your perceived slow loading times. You can also download the YSlow firefox plugin to give you tips on speeding up page loads.
A significant portion of perceived slowness can be due to the javascript on the page rather than your back-end. With such a small site you should not see any performance issues on the back end until you have thousands of submissions.
Is it most important for me to try and optimise my MySQL queries (by prepared statements)
Sure.
But prepared statements have nothing to do with optimization.
Nearly 99% of sites are running with no cache at all. So I don't think you really need it.
If your site is running slow, you have to profile it first and then optimise the specific places that have proven to be bottlenecks.

Using a single PHP script for an entire site

I had an idea today (that millions of others have probably already had) of putting all the site's scripts into a single file, instead of having multiple, separate ones. When submitting a form, there would also be a hidden field called something like 'action' which would represent which function in the file would handle it.
I know that things like CodeIgniter and CakePHP exist which help separate/organise the code.
Is this a good or bad idea in terms of security, speed and maintenance?
Do things like this already exist that I am not aware of?
What's the point? It's just going to make maintenance more difficult. If you're having a hard time managing multiple files, you should invest the time into finding a better text editor / IDE and stop using Notepad or whatever is making it so difficult in the first place!
Many PHP frameworks rely on the Front Controller design: a single small PHP script serves as the landing point for all requests. Based on request arguments, the front controller invokes code in other PHP scripts.
But storing all code for your site in a single file is not practical, as other people have commented.
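A rough sketch of the Front Controller idea, combined with the 'action' field from the question, could look like this. The handler names and the route table are hypothetical; in a real site each handler would live in its own file and be loaded on demand.

```php
<?php
// Hypothetical handlers, kept inline only so the sketch is self-contained.
function handle_home(array $params)
{
    return 'Welcome';
}

function handle_comment(array $params)
{
    $text = isset($params['text']) ? $params['text'] : '';
    return 'Comment received: ' . htmlspecialchars($text);
}

// Whitelist mapping the 'action' request field to a handler.
$routes = [
    'home'    => 'handle_home',
    'comment' => 'handle_comment',
];

function dispatch(array $routes, array $request)
{
    $action = isset($request['action']) ? $request['action'] : 'home';
    if (!isset($routes[$action])) {
        return '404: unknown action';
    }
    return call_user_func($routes[$action], $request);
}

// In index.php this would be: echo dispatch($routes, $_POST + $_GET);
echo dispatch($routes, ['action' => 'comment', 'text' => '<b>hi</b>']), "\n";
echo dispatch($routes, ['action' => 'delete_everything']), "\n";
```

Looking the action up in a fixed table, rather than calling whatever function name the form supplies, keeps a hidden form field from invoking arbitrary code.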
There are many forums that do this. Personally, I don't like it, mainly because if you make an error in the file, the entire site is broken until you fix it.
I like separation of each part, but I guess it has its plusses.
It's likely bad for maintenance, as you can't easily disable a section of your site for an update.
Speed: I'm not sure to be honest.
Security: You could accomplish the exact same security settings by just adding a security check to a file and then including that file in all your pages.
If you're not caching your scripts, everything in a single file means less disk I/O, and since generally, disk I/O is an expensive operation, this probably can be a significant benefit.
The thing is, by the time you're getting enough traffic for this to matter, you're probably better off going with caching anyway. I suppose it might make some limited sense, though, in special cases where you're stuck on a shared hosting environment where bandwidth isn't an issue.
Maintenance and security: composing software out of small integral pieces of code a programmer can fit inside their head (and a computer can manage neatly in memory) is almost always a better idea than a huge ol' file. Though if you wanted to make it hell for other devs to tinker with your code, the huge ol' file might serve well enough as part of an obfuscation scheme. ;)
If for some reason you were using the single-file approach to try and squeeze out extra disk I/O, then what you'd want to do is create a build process, where you do your actual development work in a series of broken-out discrete files and issue a make- or ant-like command to generate your single file.

Seriously, should I write bad PHP code? [closed]

I'm doing some PHP work recently, and in all the code I've seen, people tend to use few methods. (They also tend to use few variables, but that's another issue.) I was wondering why this is, and I found this note "A function call with one parameter and an empty function body takes about the same time as doing 7-8 $localvar++ operations. A similar method call is of course about 15 $localvar++ operations" here.
Is this true, even when the PHP page has been compiled and cached? Should I avoid using methods as much as possible for efficiency? I like to write well-organized, human-readable code with methods wherever a code block would be repeated. If it is necessary to write flat code without methods, are there any programs that will "inline" method bodies? That way I could write nice code and then ugly it up before deployment.
By the way, the code I've been looking at is from the Joomla 1.5 core and several WordPress plugins, so I assume they are people who know what they're doing.
Note: I'm pleased that everyone has jumped on this question to talk about optimization in general, but in fact we're talking about optimization in interpreted languages. At least some hint of the fact that we're talking about PHP would be nice.
How much "efficiency" do you need? Have you even measured? Premature optimization is the root of all evil, and optimization without measurement is ALWAYS premature.
Remember also the rules of Optimization Club.
The first rule of Optimization Club is, you do not Optimize.
The second rule of Optimization Club is, you do not Optimize without measuring.
If your app is running faster than the underlying transport protocol, the optimization is over.
One factor at a time.
No marketroids, no marketroid schedules.
Testing will go on as long as it has to.
If this is your first night at Optimization Club, you have to write a test case.
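In that spirit, here is a small test case for the claim quoted in the question. It is only a sketch: add_one is a hypothetical function, a single run is noisy, and the result depends heavily on the PHP version and whether an opcode cache is active.

```php
<?php
function add_one($n)
{
    return $n + 1;
}

$iterations = 1000000;

// Inline increment.
$x = 0;
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $x++;
}
$inlineSeconds = microtime(true) - $start;

// The same work through a function call.
$y = 0;
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $y = add_one($y);
}
$callSeconds = microtime(true) - $start;

printf("inline: %.4f s, function call: %.4f s\n", $inlineSeconds, $callSeconds);
printf(
    "extra cost per call: about %.1f ns\n",
    ($callSeconds - $inlineSeconds) / $iterations * 1e9
);
```

Whatever the per-call figure turns out to be, it is worth comparing with the time a single database query takes on the same page before deciding that it matters.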
I think Joomla and Wordpress are not the greatest examples of good PHP code, with no offense. I have nothing personal against the people working on it and it's great how they enable people to have a website/blog and I know that a lot of people spend all their free time on either of those projects but the code quality is rather poor (with no offense).
Review security announcements over the past year if you don't believe me; also assuming you are looking for performance from either of the two, their code does not excel there either. So it's by no means good code, but Wordpress and Joomla both excel on the frontend - pretty easy to use, people get a website and can do stuff.
And that's why they are so successful, people don't select them based on code quality but on what they enabled them to do.
To answer your performance question, yes, it's true that all the good stuff (functions, classes, etc.) slows your application down. So I guess if your application/script is all in one file, so be it. Feel free to write bad PHP code then.
As soon as you expand and start to duplicate code, you should consider the trade off (in speed) which writing maintainable code brings along. :-)
IMHO this trade off is rather small because of two things:
CPU is cheap.
Developers are not cheap.
When you need to go back into your code six months from now, ask yourself whether the nanoseconds saved running it still add up when you need to fix a nasty bug (three or four times, because of duplicated code).
You can do all sorts of things to make PHP run faster. Generally people recommend a cache, such as APC. APC is really awesome. It runs all sorts of optimizations in the background for you, e.g. caching the bytecode of a PHP file and also provides you with functions in userland to save data.
So, for example, if you parse a configuration file each time you run that script, disk I/O becomes a real cost. With a simple apc_store() and apc_fetch() you can keep the parsed configuration in APC's shared-memory (RAM) cache and retrieve it from there until the entry expires or is deleted.
APC is not the only cache, of course.
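The configuration-file example might be sketched as follows. It assumes APCu, the maintained successor to APC's user cache, whose functions are named apcu_store() and apcu_fetch(); because the extension may not be loaded (or may be disabled on the command line), the sketch falls back to a per-request static array. The cached_config helper, the cache key, and the INI file are hypothetical.

```php
<?php
// Assumption: APCu may or may not be available; a static array is the fallback.
function cached_config($path, $ttlSeconds = 300)
{
    static $local = [];
    $key = 'config:' . $path;
    $useApcu = function_exists('apcu_enabled') && apcu_enabled();

    if ($useApcu) {
        $config = apcu_fetch($key, $hit);
        if ($hit) {
            return $config;
        }
    } elseif (isset($local[$key])) {
        return $local[$key];
    }

    // Cache miss: do the disk read and the parse once.
    $config = parse_ini_file($path);
    if ($useApcu) {
        apcu_store($key, $config, $ttlSeconds);
    } else {
        $local[$key] = $config;
    }
    return $config;
}

// Hypothetical configuration file, written here only so the sketch runs.
$iniPath = sys_get_temp_dir() . '/demo_config_' . uniqid() . '.ini';
file_put_contents($iniPath, "site_name = Demo\nitems_per_page = 20\n");

$config = cached_config($iniPath);
unlink($iniPath);                       // the file is gone...
$again = cached_config($iniPath);       // ...but the cached copy still answers

print_r($again);
```

With APCu active the cached copy survives across requests until the TTL runs out; with the fallback it only saves repeated parsing within one request.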
You should see the responses to this question: Should a developer aim for readability or performance first?
To summarize the consensus: Unless you know for a fact (through testing/profiling) that your performance needs to be addressed in some specific area, readability is far more important.
In 99% of cases, you'd better worry about code understandability. Write code that is easy to test, understand and maintain.
In those few cases where performance really is critical, scripting languages like PHP are not your best choice. There's a reason many base library functions in PHP are written in C, after all.
Personally, while there may be overhead for a function call, if it means I write the code once (parameterized), and then use it in 85 places, I'm WAY further ahead because I can fix it in one place.
Scripting languages tend to give people the idea that "good enough" and "works" are the only criteria to consider when coding.
Especially with a fast interpreter like PHP's, I don't think lack of readability/maintainability is EVER worth the efficiency you may (or may not!) gain from it.
And a note about WordPress: I've done a lot of browsing of the WordPress code. Don't assume those people know anything about good code, please.
To answer your first question, yes it is true and it is also true for compiled op-code. Yes you can make your code faster by avoiding function calls except in extreme cases where your code grows too large because of code duplication.
You should do what you like "I like to write well-organized, human-readable code with methods wherever a code block would be repeated."
If you're going to commit this horrible atrocity of removing all function calls, at least use a profiler and only do it to the 10% of your code that matters.
An example of how micro-optimization leads to macro slowdowns:
If you're seriously considering manually inlining functions, consider manually unrolling loops.
JMPs are expensive, and if you can eliminate loops by unrolling and also eliminate all conditional blocks, you'll eliminate all that time wasted merely seeking around the CPU's cache.
Variable augmentation at runtime is slow too, as is pulling things out of a database, so you should inline all that data into your code as well.
Actually, loading up an interpreter merely to execute code and copy memory out to a user is exhaustively wasteful. Why don't we just pre-compute all the possible pages and store each page in memory ready to go, so it's just a mem-copy? Surely that's fast!
Ah, now we've got that slow thing called the internet between us, which is hindering user experience and limiting how much content we can use. How about we pre-compute the pages in advance, archive them all, and run them on the user's local machine? That'll be really fast!
But that's going to waste CPU cycles, lots of them, what with page load time and browser content rendering etc. We'll skip the middleman and just deliver the pages to them on printed media! Genius!
/me watches your company collapse on its face while you spend 10 years precomputing (by hand) and printing pages nobody wants to see.
This may sound silly to you, but to the rest of us, what you proposed is just that ridiculous.
Optimisation is good, but draw the line somewhere sensible so you don't have to worry about the future people who work on the code tracking you down in your sleep for leaving them such a crappy, unmaintainable codebase.
note: yes, I use gentoo. how did you guess?
Of course you shouldn't write bad PHP code. But once you have something written badly, you may always use performance as an excuse :-)
This is premature optimization. While the statement is true that a function call costs more than increasing a local integer variable (nearly everything costs more), the costs of a function call are still very low compared to a database query.
See also:
Wikipedia -> Optimization -> When to optimize
c2.com Wiki -> Premature Optimization
PHP's main strength is that it's quick and easy to get a working app. That strength comes from the opportunity to write loose (bad) code and have it still operate in a somewhat expected way.
If you are in a position to need to conserve a few CPU cycles, PHP is not what you should be using. When PHP web apps perform poorly, it is far more likely due to inefficient queries, not the speed of the code execution.
If you're that worried about every bit of efficiency, then why on earth are you using a scripting language? You should be programming in a much faster language (insert your favorite compiled language here), probably resulting in more, and less readable, code, but it'll run really fast, and you can still aim for best coding practices.
Seriously, if you're coding for running speed, you shouldn't be using PHP at all.
If you develop web applications with a MVC architectural pattern, you can greatly benefit from caching and serialization. You can cache views, or portions of it, and you can serialize models.
From experience, models often parse and generate most of the data that's being displayed. If you know a certain model won't be generating new data frequently, like a model that parses an RSS feed, you can just have it stuffed somewhere with all the parsed data and have it refreshed every once in a while.
If you look at wordpress php code, it intermingles php tags in between its html which leads to spaghetti in my mind.
Phpbb3 however is way better in that regard. For example it has a strict division between the php part, and the styles part, which are xhtml formatted files with {template} tags, parsed by a template engine. Which is much cleaner.
Write a couple 10 minute examples and run them in your profiler.
That will tell you which is faster to the millisecond.
If you don't have a profiler, post them here, and I will run them in my PHPEd profiler.
I suspect that much of the time difference, if any, comes from having to open the file that a class is stored in, but that would have to be tested too.
Then ask yourself if you care that much about a few milliseconds vs having to maintain spaghetti code - will any of your users ever notice?
Edit
The profiler won't simulate high traffic volumes, but it will tell you which method is faster for a single user, and which parts of the code are using how much time. Especially if you profile the operations being done repeatedly - say 1000 times each in a loop.
We can assume (though not always) that faster code used by a lot of people will be faster than slower code used by a lot of people.
Those who will lecture you about code micro-optimization are generally the same ones who will have 50 SQL queries per page, taking up a total of 2 seconds, because they never heard about profiling. But their code is optimized !!! (and slow as hell)
Fact : adding another webserver is not difficult. Replicating a database is.
Optimizing webserver code can be a net loss if it adds load on the DB.
Note : 2-3 ms for simple pages (like a forum topic) including SQL is a good target for a PHP website. My old website used to do that.

Creating a two-pass PHP cache system with mutable items

I want to implement a two-pass cache system:
The first pass generates a PHP file, with all of the common stuff (e.g. news items), hardcoded. The database then has a cache table to link these with the pages (eg "index.php page=1 style=default"), the database also stores an uptodate field, which if false causes the first pass to rerun the next time the page is viewed.
The second pass fills in the minor details, such as how long ago something(?) was, and mutable items like "You are logged in as...".
However, I'm not sure of an efficient implementation that supports both cached and non-cached (e.g., search) pages without a lot of code and several queries.
Right now each time the page is loaded the PHP script is run regenerating the page. For pages like search this is fine, because most searches are different, but for other pages such as the index this is virtually the same for each hit, yet generates a large number of queries and is quite a long script.
The problem is some parts of the page do change on a per-user basis, such as the "You are logged in as..." section, so simply saving the generated pages would still result in 10,000's of nearly identical pages.
The main concern is with reducing the load on the server, since I'm on shared hosting and at this point can't afford to upgrade, but the site is using a sizeable portion of the server's CPU and putting a fair load on the MySQL server.
So basically minimising how much has to be done for each page request, and not regenerating stuff like the news items on the index all the time seems a good start, compared to say search which is a far less static page.
I actually considered hard coding the news items as plain HTML, but then that means maintaining them in several places (since they may be used for searches and the comments are on a page dedicated to that news item (i.e. news.php), etc).
I second Ken's rec of PEAR's Cache_Lite library, you can use it to easily cache either parts of pages or entire pages.
If you're running your own server(s), I'd strongly recommend memcached instead. It's much faster since it runs entirely in memory and is used extensively by a lot of high-volume sites. It's a very easy, stable, trouble-free daemon to run. In terms of your PHP code, you'd use it much the same way as Cache_Lite, to cache various page sections or full pages (or other arbitrary blobs of data), and it's very easy to use since PHP has a memcache interface built in.
For super high-traffic full-page caching, take a look at doing Varnish or Squid as a caching reverse proxy server. (Pages that get served by Varnish are going to come out easily 100x faster than anything that hits the PHP interpreter.)
Keep in mind with caching, you really only need to cache things that are being frequently accessed. Sometimes it can be a trap to develop a really sophisticated caching strategy when you don't really need it. For a page like your home page that's getting hit several times a second, you definitely want to optimize it for speed; for a page that gets maybe a few hits an hour, like a month-old blog post, it's a bad idea to cache it, you only waste your time and make things more complicated and bug-prone.
I recommend not reinventing the wheel... there are some template engines that support caching, like Smarty.
For server side caching use something like Cache_Lite (and let someone else worry about file locking, expiry dates, file corruption)
You want to save the results to a file and use logic like this to pull them back out:
if filename exists
    include filename
else
    generate results
    render to html (as string)
    write to file
    output string or include file
endif
To be clear, you don't need two passes because you can save parts of the page and leave the rest dynamic.
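The logic above, applied to a single page fragment, might be sketched like this. Everything named here is hypothetical: render_news_items stands in for the expensive queries and templating, cached_fragment is a made-up helper, and freshness is decided by file age instead of the uptodate database flag described in the question.

```php
<?php
// Hypothetical stand-in for the expensive part (queries + templating).
function render_news_items()
{
    $items = ['Site launched', 'New search page'];
    $html = "<ul>\n";
    foreach ($items as $item) {
        $html .= '  <li>' . htmlspecialchars($item) . "</li>\n";
    }
    return $html . "</ul>\n";
}

// Return the cached fragment if it is fresh enough, otherwise rebuild it.
function cached_fragment($cacheFile, $maxAgeSeconds, callable $generate)
{
    if (is_file($cacheFile) && time() - filemtime($cacheFile) < $maxAgeSeconds) {
        return file_get_contents($cacheFile);
    }
    $html = $generate();
    // Write to a temp file and rename so readers never see a half-written file.
    $tmp = $cacheFile . '.' . uniqid('', true) . '.tmp';
    file_put_contents($tmp, $html);
    rename($tmp, $cacheFile);
    return $html;
}

$cacheFile = sys_get_temp_dir() . '/demo_news_' . uniqid() . '.html';
$userName = 'alice';                    // per-user part, never cached

echo '<p>You are logged in as ' . htmlspecialchars($userName) . "</p>\n";
echo cached_fragment($cacheFile, 300, 'render_news_items');
```

The "You are logged in as..." line is generated on every request while the news list comes from the file, which is the partial caching the answer describes. Deleting the cache file when a news item changes would play the role of the uptodate flag.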
As always with this type of question, my response is:
Why do you need the caching?
Is your application consuming too much IO on your database?
What metrics have you run?
You are talking about adding an extra level of complexity to your app, so you need to be very sure that you actually need it.
You might actually benefit from using the built-in MySQL query cache (available up to MySQL 5.7; it was removed in MySQL 8.0), if the database is the contention point in your system. The other option is to use Memcache.
I would recommend using an existing caching mechanism. Depending on what you really need, you might be looking for APC, memcached, various template caching libs... It's easier and faster to tune written, tested code to your needs than to write everything from scratch (usually, although there might be situations when you don't have a choice).
