PHP Include for static remote HTML files on a CDN

I have an app that creates static HTML files. The files are intended to be hosted on a remote CDN, they'd be standard .html files.
I am wondering two things:
Is it possible to do a PHP include on these files?
Can you possibly have good performance doing it this way?

Can it be done?
To answer the question directly, yes, you technically can include a remote file using the PHP include function. In order to do this, you simply need to set the allow_url_include directive to On in your php.ini.
Depending on exactly what you intend to use this for, I would also encourage you to look at file_get_contents.
To enable remote files for file_get_contents, you will need to set allow_url_fopen to On.
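For illustration, a minimal sketch of the file_get_contents route (the CDN URL is hypothetical, and it assumes allow_url_fopen is On):
<?php
// Sketch only: fetch a static HTML file from a remote CDN and echo it.
// Requires allow_url_fopen = On; the URL below is a made-up example.
$url = 'https://cdn.example.com/pages/about.html';

$html = file_get_contents($url);

if ($html === false) {
    // Fail gracefully instead of breaking the whole page.
    $html = '<p>Content is temporarily unavailable.</p>';
}

echo $html;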
Should it be done?
To answer your second question directly: there are many factors that will determine whether you get good performance but, all in all, it is unlikely to make a dramatic difference.
However, there are other considerations:
From a security perspective, it is ill-advised to enable either of these directives
By delivering the file from your server instead of the CDN you will be negating all of the benefits of the CDN (see below)
Is it really necessary?
CDNs
A frequent misunderstanding when it comes to CDNs is that all they do is serve your data from a closer location, thus making the request slightly faster... This is wrong!
There are endless benefits to CDNs, but I have listed a few below (it obviously depends on configuration and provider):
They strip out unnecessary headers
No cookies are sent as the CDN tends to be on a different, thus cookie-free, domain
They handle compression
They deliver your content from the nearest location
They handle caching
... and a lot more
By serving the file from your server, you will lose all of the above benefits, unless, of course, you set the server up to handle requests in the same way (this can take time).
To conclude: personally, I would avoid including your .html files in PHP remotely and just serve them directly to the client from the CDN.
To see how you can further optimise your site, and to see the many benefits that most CDNs offer, take a look at GTMetrix.

Related

What's the difference between handling gzip with PHP and Apache?

How can we handle compression in both? Whose responsibility is it? Which one is better: using Apache or PHP to compress files?
PHP compression code:
ob_start("ob_gzhandler");
Or the apache one:
AddOutputFilterByType DEFLATE text/html text/plain text/xml
Is it right that requests reach Apache first and then PHP? If so, can we infer that we should use the Apache one?
Well here is what I know, presented in a pros and cons way.
Apache:
The .htaccess code will always be executed faster, because servers cache .htaccess files by default.
With .htaccess, you can define custom rules for individual folders and the server will automatically pick them up.
With PHP, you cannot write everything in one place. There are many other things your .htaccess should have besides compression: a charset, expiry/cache control, most likely a few URL rewrite rules, permissions, robot-specific (Googlebot etc.) stuff.
As far as I know, you cannot do all of this solely with PHP, and since you may need to get all of this done, I don't see why you should combine both of them.
I have always relied on .htaccess or server level configuration to control the above enumerated aspects, and rarely ever had a problem.
PHP:
Perhaps a bit more hassle-free. With .htaccess files on shared hosting plans you are rather limited and you might run into tedious problems.
Some servers won't pick up certain commands; some (like 1and1) have a default configuration which messes with your settings (and nerves).
Probably easier to use for someone who is less of a tech person.
Overall, Apache is the winner. That's what I would go with all the time!
I don't see why either of the two should be faster, but do keep in mind that Apache can also compress CSS and JS files... You don't want to parse these files with PHP just to compress them before you deliver them to the browser.
So I would suggest using the Apache method.
In my company we usually use gzip compression on static resources. Apache asks PHP to process those resources (if necessary), then it compresses the output result. I would say that it is faster in theory (C & C++ are faster than PHP) and 'safer' to use Apache compression.
NB: "Safer" here means that the whole page is going to be compressed, whereas with the ob_start function you can forget to compress part of your web page.
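For reference, a minimal sketch of the PHP-side approach the NB refers to (it assumes Apache/mod_deflate is not already compressing the response, otherwise you would compress twice):
<?php
// Sketch: compress the whole page from PHP.
// ob_gzhandler inspects the Accept-Encoding header itself and falls back to
// plain output if the client cannot handle gzip.
if (!ini_get('zlib.output_compression')) {
    ob_start('ob_gzhandler');
} else {
    ob_start();
}

echo '<html><body>... rest of the page ...</body></html>';
// The buffer is compressed and flushed automatically when the script ends.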
You would have to run your own tests to see which is faster, but I don't believe there should be any difference in how the content is served. Using PHP, you have to handle the output buffering on your own which may be more difficult. It's more transparent with the apache method.
Apache is better since it avoids PHP memory-limit errors and is faster, because compiled code beats interpreted PHP; it is also more meaningful to do compression in a layer other than PHP.

Functionality via Url Interface vs. Include

I have been working on a project which has been split over several servers, so PHP scripts have been run through a URL interface. For example, to resize an image I would call a script on one server, either from the same server or from one of the others, as
file_get_contents('http://mysite.com/resizeimg.php?img=file.jpg&x=320&y=480');
now, this works but we are upgrading to a new server structure where the code can be on the same machine. So instead of all these wrapper functions I could just include and call a function. My question is: is it worth the overhead of rewriting the code to do this?
I do care about speed, but I don't worry about security -- I already have a password system and certain scripts only accept requests from certain IPs. I also care about the overhead of rewriting code, but cleaner, more understandable code is also important. What are the trade-offs that people see here, and ultimately is it worth it to rewrite?
EDIT: I think that I am going to rewrite it then to include the functions. Does anyone know if it is possible to include between several servers of the same domain? Like if there is a server farm where I have 2-3 servers can I have some basic functionality on one of them that the others can access but no one else could access from the outside?
is it worth the overhead of rewriting the code to do this?
Most likely yes - an HTTP call will always be slower (and more memory intensive) than directly embedding the generating library.
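A rough sketch of the difference (the library path and resize_image() function are assumptions standing in for whatever resizeimg.php actually does):
<?php
// Before: every resize costs a full HTTP round trip to the other server.
$result = file_get_contents('http://mysite.com/resizeimg.php?img=file.jpg&x=320&y=480');

// After: with the code on the same machine, include the library once and call it directly.
require_once __DIR__ . '/lib/resizeimg.php';   // hypothetical path to the extracted logic
$result = resize_image('file.jpg', 320, 480);  // hypothetical function wrapping the old script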

Need some thoughts and advice if I need to do anything more to improve performance of my webapp

I'm working on a webapp that uses a lot of AJAX to display data, and I'm wondering if I could get any advice on what else I could do to speed up the app, reduce bandwidth, etc.
I'm using PHP, MySQL, FreeBSD, Apache, and Tomcat for my environment. I own the server and have full access to all config files, etc.
I have gzip/deflate compression turned on in the Apache httpd.conf file. I have obfuscated and minified all the .js and .css files.
My webapp works in this general manner. After login, the user lands on the index.php page. All links on the index page are AJAX calls to a PHP class function that returns the HTML as a string, which is then displayed inside a div somewhere on the main index.php page.
Most of the functions return HTML strings like:
<table>
<tr>
<td>Data here</td>
</tr>
</table>
I don't return the full "<html><head>" stuff, because it already exists in the main index.php page.
However, the HTML strings returned are formatted with tabs, spaces, comments, etc. for easy reading of the code. Should I take the time to minify these fragments and remove the tabs, comments, and spaces? Or is minifying the .php output negligible because it happens on the server?
I guess I'm trying to figure out whether the way I've structured the webapp is going to cause bandwidth issues, and whether reducing the .php class file sizes would improve performance. Most of the .php classes are 40-50KB, with the largest being 99KB.
For speed, I have thought about using memcache, but don't really know if adding it after the fact is worth it and I don't quite know how to implement it. I don't know if there is any caching turned on on the server...I guess I have left that up to the browser...I'm not very well versed in the caching arena.
Right now the site doesn't appear slow, but I'm the only user...I'm just wondering if its worth the extra effort.
Any advice, or articles would be appreciated.
Thanks in advance.
My recommendation would be to NOT send the HTML over the AJAX calls. Instead, send just the underlying data (the "Data here" part) as JSON, then process that data with a client-side function that decorates it with the right HTML and injects it into the DOM. This will drastically speed up the AJAX calls.
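As a sketch of that idea (the endpoint and field names are invented), the PHP side would return only the data and leave the markup to a small client-side function:
<?php
// data.php (hypothetical endpoint): return raw rows as JSON instead of a pre-built <table>.
header('Content-Type: application/json');

$rows = array(
    array('name' => 'Alice', 'total' => 42),
    array('name' => 'Bob',   'total' => 17),
);

echo json_encode($rows);
// On the client, a small JavaScript function loops over the JSON, builds the
// <table> rows, and injects them into the target <div>.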
Memcache provides an API that allows you to cache data. What you additionally need (and what, in my opinion, is more important) is a strategy for what to cache and when to invalidate the cache. This cannot be determined by looking at the source code; it comes from how your site is used.
However, an opcode cache (e.g. APC) could be used right away.
Code formatting is for humans, not for machines.
As part of your optimisation you should strip it out.
Or simply add a flag check in your application: when a certain condition is met (like debug mode), return nicely formatted output; otherwise strip the whitespace, since it means nothing to the machine.
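A minimal sketch of that flag check (the DEBUG_MODE constant and the whitespace stripping are purely illustrative):
<?php
// Sketch: keep readable markup in debug mode, strip it otherwise.
define('DEBUG_MODE', false); // hypothetical flag

function render_fragment($html) {
    if (DEBUG_MODE) {
        return $html; // nicely formatted, with tabs and comments, for humans
    }
    // Collapse the whitespace between tags; the machine does not care about it.
    return trim(preg_replace('/>\s+</', '><', $html));
}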
APC
You should always use APC to compile and cache PHP scripts into opcode.
Why?
Code hardly changes after deployment.
If every script is already opcode, your server does not need to compile plain-text scripts into opcode on the fly.
Compile once, use many times.
What are the benefits?
Fewer execution cycles spent compiling plain-text scripts.
Less memory consumed (the two are related).
Simple math: if a request is served in 2 seconds in your current environment and with APC it is served in 0.5 seconds, you gain 4 times the performance; the 2 seconds that served one request can now serve 4. That means where you could previously fit 50 concurrent users, you can now allow 200.
Memcache - NO GO?
It depends: in a single-host environment you will probably not gain much. The biggest advantage of memcache is information sharing and distribution (which means a multiple-server environment: cache once, use many times).
etc?
Serve static files with an expiration header (prime-cache concept: no request is the fastest request, and it saves bandwidth).
Cache your expensive requests in memcache/disk cache or even a database (expensive requests such as report/statistics generation); see the sketch after this list.
Always review your code for optimisation opportunities (but do not overdo it).
Always benchmark and compare the results (before and after).
Fine-tune your Apache/Tomcat configuration.
Consider recompiling PHP with only the minimum libraries/extensions and loading the necessary libraries at run time only (for example, if the application uses mysqli and not PDO, there is no reason to keep PDO loaded).
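As a sketch of the "cache your expensive request" point above (the server address, cache key, and generate_report() function are assumptions):
<?php
// Sketch: cache an expensive report in memcached for 10 minutes.
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211); // assumed local memcached instance

$key    = 'report:monthly';
$report = $mc->get($key);

if ($report === false) {
    $report = generate_report();  // hypothetical expensive call (report/statistics generation)
    $mc->set($key, $report, 600); // expire after 600 seconds
}

echo $report;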

SSI or PHP Include()?

Basically, I am launching a site soon and I predict a LOT of traffic. For scenario's sake, let's say I will have 1m uniques a day. The data will be static, but I need to have includes as well.
I will only include an HTML page inside another HTML page, nothing dynamic (I have my reasons, which I won't disclose to keep this simple).
My question is: performance-wise, which is faster?
<!--#include virtual="page.htm" -->
or
<?php include 'page.htm'; ?>
Performance-wise, the fastest option is storing the templates elsewhere, generating the full HTML up front, and regenerating it when your templates change.
If you really want a comparison between PHP and SSI, I guess SSI is probably faster and, more importantly, not having PHP is a lot lighter on the RAM needed by the webserver's processes/threads, thereby enabling you to have more Apache threads/processes to serve requests.
SSI is built in to Apache, while Apache has to spawn a PHP process to process .php files, so I would expect SSI to be somewhat faster and lighter.
I'll agree with the previous answer, though, that going the PHP route will give you more flexibility to change in the future.
Really, any speed difference that exists is likely to be insignificant in the big picture.
Perhaps you should look into HipHop for PHP, which compiles PHP into C++. Since C++ is compiled, it's way faster. Facebook uses it to reduce the load on their servers.
https://github.com/facebook/hiphop-php/wiki/
I don't think anyone can answer this definitively for you. It depends on your web server configuration, operating system and filesystem choices, complexity of your SSI usage, other competing processes on your server, etc.
You should put together some sample files and run tests on the server you intend to deploy on. Use some http testing tools such as ab or siege or httperf or jmeter to generate some load and compare the two approaches. That's the best way to get an answer that's correct for your environment.
Using PHP with mod_php and an opcode cache like APC might be very quick because it would cache high-demand files automatically. If you turn off apc.stat it won't have to hit the disk at all to serve the PHP script (with the caveat that this makes it harder to update the PHP script on a running system).
You should also make sure you follow other high-scalability best practices. Use a CDN for static resources, optimize your scripts and stylesheets, etc. Get books by Steve Souders and Theo & George Schlossnagle and read them cover to cover.
I suggest you use a web cache like Squid or, for something more sophisticated, Oracle Web Cache.

Rolling and packing PHP scripts

I was just reading over this thread where the pros and cons of using include_once and require_once were being debated. From that discussion (particularly Ambush Commander's answer), I've taken away the fact(?) that any sort of include in PHP is inherently expensive, since it requires the processor to parse a new file into OP codes and so on.
This got me to thinking.
I have written a small script which will "roll" a number of JavaScript files into one (appending all of the contents into another file), such that it can be packed to reduce HTTP requests and overall bandwidth usage.
Typically for my PHP applications, I have one "includes.php" file which is included on each page, and that then includes all the classes and other libraries which I need. (I know this probably isn't the best practice, but it works - the __autoload feature of PHP5 is making this better in any case.)
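For illustration, that setup looks roughly like this (the class and path names are made up):
<?php
// includes.php style: every class is parsed on every request, needed or not.
require_once 'lib/Database.php';
require_once 'lib/Session.php';
require_once 'lib/ImageResizer.php';

// __autoload style: a class file is only parsed the first time the class is actually used.
function __autoload($class) {
    require_once 'lib/' . $class . '.php';
}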
Should I apply the same "rolling" technique on my PHP files?
I know of that saying about premature optimisation being evil, but let's take this question as theoretical, ok?
There is a problem with Apache/PHP on Windows which causes an application to be extremely slow when loading or even touching too many files (a page which loads approx. 50-100 files may spend a few seconds just on file handling). This problem appears both with including/requiring and with working with files (fopen, file_get_contents, etc.).
So if you (or, more likely, anybody else, given the age of this post) ever run your app on Apache/Windows, reducing the number of loaded files is absolutely necessary. Combine multiple PHP classes into one file (an automated script for this would be useful; I haven't found one yet) or be careful not to touch any unneeded file in your app.
That would depend somewhat on whether it is more work to parse several small files or one big one. If you require files on an as-needed basis (not saying you necessarily should do things that way), then presumably for some execution paths there would be considerably less compilation required than if all your code was rolled into one big PHP file that the parser had to encode in its entirety, whether it was needed or not.
In keeping with the question, this is thinking aloud more than expertise on the internals of the PHP runtime - it doesn't sound as though there is any real-world benefit to getting too involved with this at all. If you run into a serious slowdown in your PHP, I would be very surprised if the use of require_once turned out to be the bottleneck.
As you've said: "premature optimisation ...". Then again, if you're worried about performance, use an opcode cache like APC, which makes this problem almost disappear.
This isn't an answer to your direct question, just about your "js packing".
If you leave your JavaScript files alone and allow them to be included individually in the HTML source, the browser will cache those files. Then on subsequent requests, when the browser requests the same JavaScript file, your server will return a 304 Not Modified header and the browser will use the cached version. However, if you're packing the JavaScript files together on every request, the browser will re-download the file on every page load.
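If you do end up serving a packed file through PHP, one possible workaround (a sketch; the file path is hypothetical) is to emit validation headers yourself so browsers can still get a 304:
<?php
// Sketch: serve a pre-packed JS file from PHP while still allowing 304 responses.
$file  = __DIR__ . '/cache/packed.js'; // hypothetical pre-built file
$mtime = filemtime($file);
$etag  = '"' . md5($file . $mtime) . '"';

header('Content-Type: application/javascript');
header('ETag: ' . $etag);
header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $mtime) . ' GMT');

if (isset($_SERVER['HTTP_IF_NONE_MATCH']) && trim($_SERVER['HTTP_IF_NONE_MATCH']) === $etag) {
    header('HTTP/1.1 304 Not Modified');
    exit;
}

readfile($file);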
