Functionality via URL interface vs. include - PHP

I have been working on a project that was split over several servers, so PHP scripts were invoked through a URL interface. For example, to resize an image I would call a script on one server, either from the same machine or from one of the others, like this:
file_get_contents('http://mysite.com/resizeimg.php?img=file.jpg&x=320&y=480');
Now, this works, but we are upgrading to a new server structure where the code can live on the same machine. So instead of all these wrapper functions I could just include a file and call a function directly. My question is: is it worth the overhead of rewriting the code to do this?
I care about speed, but I'm not worried about security -- I already have a password system, and certain scripts only accept requests from certain IPs. I also care about the cost of rewriting the code, but cleaner, more understandable code matters too. What trade-offs do people see here, and ultimately, is it worth rewriting?
EDIT: I think I am going to rewrite it to include the functions. Does anyone know if it is possible to include files across several servers in the same domain? For example, if I have a server farm of 2-3 machines, can I keep some shared functionality on one of them that the others can access, but that no one can reach from the outside?

is it worth the overhead of rewriting the code to do this?
Most likely yes - an HTTP call will always be slower (and more memory-intensive) than directly including the library that does the work.
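For illustration, here is a minimal sketch of the two call styles; the resize_image() function and the lib/ path are hypothetical and assume the original resizeimg.php is refactored from reading $_GET into a plain callable function:
<?php
// Old approach (from the question): a full HTTP round trip to a script that
// may live on another server.
$resized = file_get_contents('http://mysite.com/resizeimg.php?img=file.jpg&x=320&y=480');

// New approach: include the library once and call a function directly.
// resize_image() and lib/resizeimg.php are hypothetical names.
require_once __DIR__ . '/lib/resizeimg.php';
$resized = resize_image('file.jpg', 320, 480);
Beyond avoiding the HTTP round trip, the direct call returns a real PHP value, so errors can be handled with exceptions or return codes instead of parsing a response body.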

Related

PHP Include for static remote HTML files on a CDN

I have an app that creates static HTML files. The files are intended to be hosted on a remote CDN; they'd be standard .html files.
I am wondering two things:
If it's possible to do a PHP include on these files?
Can you possibly have good performance doing it this way?
Can it be done?
To answer the question directly, yes, you technically can include a remote file using the PHP include function. In order to do this, you simply need to set the allow_url_include directive to On in your php.ini.
Depending on exactly what you intend to use this for, I would also encourage you to look at file_get_contents.
To enable remote files for file_get_contents, you will need to set allow_url_fopen to On.
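As a rough sketch of the two options just described (the CDN URL is a placeholder):
<?php
// Remote include: needs allow_url_include = On (which in turn requires
// allow_url_fopen = On). Generally discouraged; see below.
include 'http://cdn.example.com/pages/about.html';

// file_get_contents only needs allow_url_fopen = On:
$html = file_get_contents('http://cdn.example.com/pages/about.html');
if ($html !== false) {
    echo $html;
}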
Should it be done?
To answer your second question directly: many factors will determine whether you get good performance, but all in all, this approach is unlikely to make a dramatic difference.
However, there are other considerations:
From a security perspective, it is ill-advised to enable either of these directives
By delivering the file from your server instead of the CDN you will be negating all of the benefits of the CDN (see below)
Is it really necessary?
CDNs
A frequent misunderstanding when it comes to CDNs is that all they do is serve your data from a closer location, thus making the request slightly faster... This is wrong!
There are endless benefits to CDNs, but I have listed a few below (obviously it depends on configuration and provider):
They strip out unnecessary headers
No cookies are sent as the CDN tends to be on a different, thus cookie-free, domain
They handle compression
They deliver your content from the nearest location
They handle caching
... and a lot more
By serving the file from your server, you will lose all of the above benefits, unless, of course, you set the server up to handle requests in the same way (this can take time).
To conclude: personally, I would avoid remotely including your .html files into PHP and just serve them directly to the client from the CDN.
To see how you can further optimise your site, and to see the many benefits that most CDNs offer, take a look at GTMetrix.

Why is it a bad idea to tell your server to parse HTML as PHP? [closed]

You know you can make a server parse HTML pages as PHP (execute PHP code in an HTML doc) using .htaccess?
Well, some people say it's bad to do so. Why?
Some people also say it opens a security vulnerability in your application. How?
The source code is still removed before the document reaches the browser, so it can't be the case of unauthorized access to source code, right?
Let me start with a little story: back when I was a security contact at a Linux distribution vendor, the PHP security team begged Linux vendors to stop calling interpreter crashes security bugs, even when the PHP interpreter was running inside the web server (say, mod_php on Apache). (At the time, roughly one interpreter crash was being found per week.)
It took a little conversation for them to convince us that whoever supplies the running PHP code is completely trusted, and that any attempt to restrict what the scripts could do from inside the interpreter was misguided. If someone figured out how to crash the interpreter to walk around the restrictions it tried to impose (such as the entire silly safe mode pile of crap), it was not a security flaw, because safely executing untrusted scripts was never the goal of the PHP interpreter -- it never was and never would be.
I'm actually pretty happy with the end result of the discussions -- it clearly defined PHP's security goals: You should only ever allow execution of PHP code that you 100% completely trust. If you do not trust it, you do not run it. It's that simple.
Whatever operating system resources are available to the interpreter are all available and fair game, regardless of whether the script exploits a bug in the interpreter or just does something unexpected.
So, please do not allow random code to be executed in the context of your webserver unless that is what you really want.
Please use the principle of least privilege to guide what resources are available to every program.
Consider using a mandatory access control tool such as AppArmor, SELinux, TOMOYO, or SMACK to further confine what your programs can and can't do. I've worked on the AppArmor project since 2001 or so and am fairly confident that, with a day's effort, most system administrators can enhance their site's security in a meaningful way with AppArmor. Please evaluate several options, as the different tools are designed around different security models -- one or another may be a better fit.
But whatever you do, please don't run your server in a fashion that needlessly opens it up to attack via extra vectors.
The main concern is that if you ever move your code to another server, or let someone else work with your code, server settings, or .htaccess file, your HTML pages could stop being parsed by the PHP interpreter.
In that case the PHP code would be served up to the browser.
There's a security issue in that, with this setting, HTML files really are PHP files, so uploading them should be taken as seriously as uploading PHP files. Often people don't see uploading HTML files as a big deal, precisely because they don't expect them to be parsed as PHP (so others in your company might inadvertently open up a security hole). [PaulP.R.O.'s answer notes that a security problem can also arise in the opposite direction - PHP being mistaken for HTML later on, when this setting is mistakenly dropped.]
There's also a bit of a performance issue, in that every HTML file then has to be run through the PHP parser (even if it happens to contain no PHP).
Parsing HTML as PHP is bad(ish) for speed and organization reasons.
HTML files parsed as PHP will technically load slower, because you're invoking the PHP engine.
But mostly it's bad for organizational purposes: as your project expands, imagine hunting for embedded PHP code in HTML files. When browsing your project, the file extension should be a true indicator of that file's purpose. If a form submits to 'login.php' you can be reasonably sure it contains server code. But 'login.html' could just be another HTML page.
Concerning the rest of your question, I'm not sure about the security aspect, but I suspect mixing up your HTML and PHP output could lead to unnoticed XSS vulnerabilities. I'm not an expert on that, though.
It's bad for speed, and if the PHP interpreter doesn't work for some reason, the PHP code will show up in the page source. If you have, for instance, a database username and password in the PHP code, anyone could connect and access your database easily.
And as zed said, it is bad for organisational reasons. Instead of updating one file, you would need to update all of the files on your site to make a simple change.
Allowing the server to parse HTML files as PHP suggests that you are not using proper application design patterns; that is, you are blazing your own path instead of doing things the recommended way. There is a reason that designs like MVC (which separate concerns) exist.
One problem with allowing arbitrary documents to be directly called is the loss of a "front controller" (usually an index.php) which helps to tighten down the number of doorways into your application.
You can have many paths into your application, but each one is another possible attack route you must cover in your design.
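A minimal front-controller sketch, for illustration only (the route names and the pages/ layout are hypothetical):
<?php
// index.php - a single entry point for all requests. A rewrite rule sends
// every request here, so there is one doorway to validate input and load code.
$page = $_GET['page'] ?? 'home';

// Whitelist of allowed pages; anything else gets a 404.
$routes = [
    'home'  => __DIR__ . '/pages/home.php',
    'login' => __DIR__ . '/pages/login.php',
];

if (isset($routes[$page])) {
    require $routes[$page];
} else {
    http_response_code(404);
    echo 'Not found';
}
With all requests funneled through index.php, input validation, authentication, and logging only have to be implemented in one place.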

Performance difference between classes and just functions

I am writing a site in PHP and I am noticing that the page is taking 3-5 seconds to load (from a remote server), which is unacceptable. The software relies on around 12 classes to function correctly. I was wondering how much of a performance gain I would get if I rewrote most of the classes to just use regular PHP functions.
Thanks for any input.
Edit: I rely primarily on Redis, with a simple MySQL query here and there.
Functions or classes should make little to no difference (it's totally negligible): that's not what is making your website / application slow.
It's hard to give you more information, as we don't know what your setup looks like, but you might want to take a look at the answer I posted to this question: it contains some interesting ideas when it comes to the performance of a PHP application.
BTW: 12 classes is really not a big number of classes, if I may...
Rewriting the whole application procedurally is probably the worst thing you could do. Object-oriented programming is not about performance gains but about writing programmer-friendly and easily maintainable applications (among other things).
You should never think about rewriting an OO application procedurally; that's not the point. If you ever have bigger resource consumption using OO rather than procedural programming (and that is very unlikely), you should probably think about scaling the app better. Hardware nowadays is not that expensive.
On the other hand, your application has many possible bottlenecks and OO is probably not even on the list.
Did you check:
your Internet connection?
your server's Internet connection?
your ping loss to your server?
your server's configuration? (Apache/Nginx/Lighttpd or whatever it is)
your database server's configuration?
your database queries?
your server's load?
the timing for a connection to Redis?
your firewall's rules for the ports used by Redis?
your Redis' configuration? (maxclients, timeout)
If you answered NO to at least one question above, please do check that and if the problem persists, let me know!
The difference will probably not even be measurable.
Your problem most definitely isn't your code per se, but the way you access the database. Make sure your tables are appropriately indexed and you will see a substantial drop in page load times. You can use EXPLAIN SELECT ... to get further info on how a query actually runs and why it performs badly.
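For example, a minimal sketch of running EXPLAIN from PHP with PDO (the DSN, credentials, and the users table with an email column are all hypothetical):
<?php
// Run EXPLAIN against a suspect query to see how MySQL executes it.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');

$stmt = $pdo->prepare('EXPLAIN SELECT * FROM users WHERE email = ?');
$stmt->execute(['someone@example.com']);

foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    // Look at the "key" and "rows" columns: no key and a large row count
    // usually means a missing index.
    print_r($row);
}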
You won't find much of a difference, if any at all, between functions and classes.
Chances are, one or more of your classes is written inefficiently, or relies on something slow: waiting on a remote server, image processing, an overloaded database, or countless other causes. You should try to profile your code to see where the problem lies.
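Before reaching for a full profiler such as Xdebug or XHProf, even a crude timing sketch like the following can narrow things down (the section names are hypothetical):
<?php
// Rough per-section timing to see where the 3-5 seconds actually go.
$timings = [];

$start = microtime(true);
// ... fetch data from Redis ...
$timings['redis'] = microtime(true) - $start;

$start = microtime(true);
// ... run the MySQL queries ...
$timings['mysql'] = microtime(true) - $start;

$start = microtime(true);
// ... render the page ...
$timings['render'] = microtime(true) - $start;

error_log(print_r($timings, true));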

PHP: an example where allow_url_include is a good idea?

I just noticed a PHP config parameter called allow_url_include, which allows you to include a PHP file hosted elsewhere as you would a local one. This seems like a bad idea, but "why is this bad" is too easy a question.
So, my question: when would this actually be a good option? When would it actually be the best solution to some problem?
Contrary to the other responders here, I'm going to go with "No". I can't think of any situation where this would be a good idea.
Some quick responses to the other ideas:
Licensing: would be very easy to circumvent
Single library for multiple servers: I'm sorry, but this is a very dumb solution to something that should be solved by syncing files from, for example:
a source-control system
a packaging / distribution system
a build system
or a remote filesystem (NFS was mentioned)
Remote library from Google: nobody benefits from a slow, non-cached PHP library load over HTTP. This is not (asynchronous) JavaScript
I think I covered all of them..
Now..
Your question was about 'including a file hosted elsewhere', which I think you should never attempt. However, there are uses for allow_url_include. This setting covers more than just http://. It also covers user-defined protocol handlers, and I believe even phar://. For those there are quite a few valid uses.
The only things I can think of are:
for a remote library, for example the Google APIs.
Alternatively, if you are something like Facebook, with devs in different locations, and the devs use includes from different stage servers (DB, dev, staging, production).
Once again during development, including a third-party program that is in heavy beta transition, so you always get the most recent build without having to compile it yourself (for example, a remote TinyMCE beta that you are building against, which will be finished before you reach production).
However, if the remote server goes down, it kills the app, so for most people it is not a good idea for production use.
Here is one example that I can think of.
At my organization my division is responsible for both the intranet and internet sites. Because we are using two different servers, and in our case two different subdomains, I could see a case for having a single library that is used by both servers. This would allow both servers to use the same classes. This wouldn't be a security problem, because you have complete control over both servers, and it would be better than trying to maintain two versions of the same class.
Since you have control over the servers, and because having an external-facing server and an internal server requires separation (because of the firewall), this would be a better solution than trying to keep a copy of the same class in two locations.
Hmm...
[insert barrel scraping noise here]
...you could use this as a means of licensing software - in that the license key, etc. could be stored on the remote system (managed by the seller). By doing this, the seller would retain control of all the systems attempting to access the key.
However, as you say, the list of reasons this is a horrifying idea outweighs any positives in my mind.

Best practices for optimizing LAMP sites for speed? [closed]

I want to know when building a typical site on the LAMP stack how do you optimize it for the best possible load times. I am picturing a typical DB-driven site.
This is a high-level look at the question; let me break it down into each layer of the stack.
L - At the system level (setup and filesystem), what can you do to improve speed? One thing I can think of is image sizes; can compression here help optimize anything?
A - There have to be a ton of settings related to site speed in the web server. Not my forte. It probably depends a lot on how many sites are running concurrently.
M - MySQL: in a database-driven site, DB performance is key. Is there a better normalization approach, i.e. using link tables? Web developers often just make simple monolithic tables resembling 1NF, and this can kill performance.
P - Aside from performance-boosting settings like caching, what can the programmer do to affect performance at a high level? I would really like to know whether MVC design approaches hurt performance more than quick-and-dirty code. Other simple tips, like whether sessions are faster than cookies, would be interesting to know.
Obviously you have to get down and dirty into the details and find what code is slowing you down. Also, I realize that many sites have many different performance characteristics, but let's assume a typical site that has more reads than writes.
I am just wondering if we can compile a bunch of best practices, and I fully expect people to link to other questions so we can effectively work up a checklist.
My goal is to see whether, in addition to the usual performance issues, some oddball things you might not think of crop up, to go along with a best-practices summary.
So my question is, if you were starting from scratch, how would you make sure your LAMP site was fast?
Here are a few personal must-dos that I always set up in my LAMP applications.
Install mod_deflate for apache, and do not use PHP's gzip handlers. mod_deflate will allow you to compress static content, like javascript/css/static html, as well as the usual dynamic PHP output, and it's one less thing you have to worry about in your code.
Be careful with .htaccess files! Enabling .htaccess files for directories in your app means that Apache has to scan the filesystem constantly, looking for .htaccess directives. It is far better to put directives inside the main configuration or a vhost configuration, where they are loaded once. Any time you can get rid of a directory-level access file by moving it into a main configuration file, you save disk access time.
Prepare your application's database layer to utilize a connection manager of some sort (I use a Singleton for most applications; see the sketch after this list). It's not very hard to do, and reducing the number of database connections your application opens saves resources.
If you think your application will see significant load, memcached can perform miracles. Keep this in mind while you write your code... perhaps one day, instead of creating objects on the fly, you will be getting them from memcached. A little foresight will make implementation painless.
Once your app is up and running, set MySQL's slow query time to a small number and monitor the slow query log diligently. This will show you where your problem queries are coming from, and allow you to optimize your queries and indexes before they become a problem.
For serious performance tweakers, you will want to compile PHP from source. Installing from a package installs a lot of libraries that you may never use. Since PHP environments are loaded into every instance of an Apache thread, even a 5MB memory overhead from extra libraries quickly becomes 250MB of lost memory when there are 50 Apache threads in existence. I keep a list of my standard ./configure line I use when building PHP here, and I find it suits most of my applications. The downside is that if you end up needing a library, you have to recompile PHP to get it. Analyze your code and test it in a devel environment to make sure you have everything you need.
Minify your Javascript.
Be prepared to move static content, such as images and video, to a non-dynamic web server. Write your code so that any URLs for images and video are easily configured to point to another server in the future. A web server optimized for static content can easily serve tens or even hundreds of times faster than a dynamic content server.
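As a rough illustration of the connection-manager point above, here is a minimal Singleton sketch built on PDO; the DSN and credentials are placeholders, and real code would read them from configuration:
<?php
// Minimal Singleton connection manager: every caller shares one PDO
// connection for the duration of the request.
class Db
{
    private static $instance = null;

    public static function get()
    {
        if (self::$instance === null) {
            self::$instance = new PDO(
                'mysql:host=localhost;dbname=app;charset=utf8mb4',
                'user',
                'secret',
                array(PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION)
            );
        }
        return self::$instance;
    }

    private function __construct() {} // prevent direct instantiation
    private function __clone() {}     // prevent copies of the connection
}

// Usage: every caller reuses the same connection.
$rows = Db::get()->query('SELECT 1')->fetchAll();
A dependency-injected connection object works just as well; the point is simply that the application opens one connection per request instead of one per query.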
That's what I can think of off the top of my head. Googling around for PHP best practices will find a lot of tips on how to write faster/better code as well (such as: echo is faster than print).
First, realize that performance is an iterative process. You don't build a web application in a single pass, launch it, and never work on it again. On the contrary, you start small, and address performance issues as your site grows.
Now, onto specifics:
Profile. Identify your bottlenecks. This is the most important step. You need to focus your effort where you'll get the best results. You should have some sort of monitoring solution in place (like cacti or munin), giving you visibility into what's going on on your server(s)
Cache, cache, cache. You'll probably find that database access is your biggest bottleneck on the back end -- but you should verify this on your own. Fortunately, you'll probably find that a lot of your traffic is for a small set of resources. You can cache those resources in something like memcached, saving yourself the database hit and resulting in better backend performance (see the sketch after this list).
As others have mentioned above, take a look at the YDN performance rules. Consider picking up the accompanying book. This'll help you with front-end performance.
Install PHP APC, and make sure it's configured with enough memory to hold all your compiled PHP bytecode. We recently discovered that our APC installation didn't have nearly enough RAM; giving it enough room to work in cut our CPU time in half, and disk activity by 10%.
Make sure your database tables are properly indexed. This goes hand in hand with monitoring the slow query log.
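To make the caching point above concrete, here is a minimal cache-aside sketch using the stock Memcached extension; the server address, key, TTL, and query are placeholders:
<?php
// Check the cache first, fall back to the database on a miss, then store
// the result for subsequent requests.
$cache = new Memcached();
$cache->addServer('127.0.0.1', 11211);

$key  = 'articles:front_page';
$rows = $cache->get($key);

if ($rows === false && $cache->getResultCode() !== Memcached::RES_SUCCESS) {
    $pdo  = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');
    $rows = $pdo->query('SELECT id, title FROM articles ORDER BY id DESC LIMIT 20')
                ->fetchAll(PDO::FETCH_ASSOC);

    $cache->set($key, $rows, 300); // cache for five minutes
}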
The above will get you very far. That is to say, even a fairly db-heavy site should be able to survive a frontpage digg on a single modestly-spec'd server if you've done the above.
You'll eventually hit a point where the default apache config won't always be able to keep up with incoming requests. When you hit this wall, there are two things to do:
As above, profile. Monitor your apache activity -- you should have an idea of how many connections are active at any given time, in addition to the max number of active connections when you get sudden bursts of traffic
Configure apache with this in mind. This is the best guide to apache config I've seen: Practical mod_perl chapter 11
Take as much load off of apache as you can. Apache's too heavy-duty to serve static content efficiently. You should be using a lighter-weight reverse proxy (like squid) or webserver (lighttpd or nginx) to serve static content, and to take over the job of spoon-feeding bytes to slow clients. This leaves Apache to do what it does best: execute your code. Again, the mod_perl book does a good job of explaining this.
Once you've gotten this far, it's largely an issue of caching more, and keeping an eye on your database. Eventually, you'll outgrow a single server. First, you'll probably add more front end boxes, all backed by a single database server. Then you're going to have to start spreading your database load around, probably by sharding. For an excellent overview of this growth process, see this livejournal presentation
For a more in-depth look at much of the above, check out Building Scalable Web Sites, by Cal Henderson, of Flickr fame. Google has portions of the book available for preview
I've used MySQLTuner for performance analysis on my MySQL servers, and it's given good insight into further issues to Google, as well as making its own recommendations.
A resource you might find helpful is the YDN set of performance rules.
Don't forget the fact that your users will be thousands of miles away from your server, and downloading dozens of files to render a single page. That latency, and the overhead of rendering the page in their browsers can be larger than the amount of time that you spend collecting the information, and generating the page.
See the pages at Yahoo Developer Network about Best Practices for Speeding Up Your Web Site, and the YSlow tool for seeing what part of the downloading of the site is taking time.
Don't forget to turn off atime for your filesystem!
I'd recommend using Jet Profiler for MySQL to find any bad queries. I've successfully used it on a couple of my sites. Really helpful, and much easier to digest than the slow query log.
I'd recommend starting with http://highscalability.com/
As for your suggestions:
Compression for images: definitely not. Filesystem tuning: yes, that could have some effect, but a minimal one. Actually, the best option is to use an in-memory reverse proxy, or even better, a CDN.
For Apache, basically only load the modules you need; do not load anything else. Since with PHP you can only use the forking MPM, it's important to keep it slim. As for optimal settings, you have to fine-tune them to the specific application, hardware, etc. If you have enough CPU, it is recommended to use mod_deflate: the faster the server can send data to the client, the faster it can start processing the next request.
