Can nginx serve up files cached by PHP? - php

As part of a new CMS that I'm planning, I'm going to be caching the HTML output of some "static" content. I say "static" because no content stored in a database is really static, but it's close enough. The system will MD5 hash the request URL and save a minified version of the HTML output in a cache folder. The next time the page is requested, the CMS will check if a cached version exists, then it'll check the age of the file and then serve up the cached version, thus saving a lot of resources (especially when it comes to DB calls).
Obviously getting nginx to serve up static files without going near PHP is going to be much quicker. So is there a way to get nginx to hash the request URL and check a temp folder to see if it exists?

Yes!
You can get nginx to cache the response from the fcgi. It is using the HttpProxyModule that you must have enabled into nginx.
Here is a good guide on how to do it.
Alternatively you can use nginx with memcache too. There are actually quite a few options to you. Both work really well!

Related

php header location vs file_get_contents performace

I have a website where the client search a term and results are retrieved through an ajax request. On php side, the called script check the date of the cache (cache are files) and if it's older than an established time it refreshes the results, else it return the cache file content: die(file_get_contents($cache_path));
The cache time is a few hours, an to refresh it takes just a few seconds, so the greatest part of the requests will end up in cache response.
So I thought that using header("location: $cache_path"); would be less stressful for the server, because it simply tells the browser to get the contents from the cache file without passing it through the script.
The downside is that the cache file path would become public (which is not biggest problem ever, because the content is the same), but, you know, it's never good to give the resources locations...
So, performance wise, is there a big difference between file_get_contents and redirecting? The average cache file size is 120kb... Any other ideas and suggestions?
You can use "internal redirect": through X-Accel-Redirect header for nginx or X-Sendfile for Apache. In this case you don't show any additional URLs to a client and don't deal with cache files in you script.
For configuration details you can read an official documentation or, of course, other SO questions (like this one).

Laravel development on remote server experiences CSS/JavaScript change delays

I develop on a remote server. I'm not sure if this is a laravel caching tendancy, but CSS and JavaScript changes tend to be delayed longer than usual. I can make instant HTML and php changes, but sometimes it takes more than a few minutes for CSS and JavaScript changes to be reflected on a page. I do know that .blade.php files are generated and cached within an app/storage/views folder, but even when I delete those the changes are not reflected right away.
I have tried on Chrome and Firefox.
Any ideas? Thanks!
CSS and Javascript are not handled by Laravel.
You can check it from the .htaccess of the public folder
Most likely you are hitting the cache browser. One solution to avoid browser cache is to append to the css or javascript a unique identifier whenever there is a new release, e.g.:
site/my/css.css?12345
^
|
+ - Change this to force a fresh copy
What ended up working best was to clear the cached views on the server, as #dynamic hinted at (Laravel and Blade store these cached files).
AS WELL, to close the tab completely and re-access the webpage fresh. There were cases where it was still cached, and so I would just switch to another browser.

Use a cached version of a php page unless the database has changed

I've looked at similar questions about caching in PHP and I'm still stumped as to how to check whether the database has changed without making a new call to the database, which would defeat the point of caching.
I understand technically how to implement caching in PHP -- using ETag and Last Modified headers, output buffering, storing static files, etc. What is tripping me up is how to determine when to serve up a new version of a page instead of a cached version. If the database content has changed, I want to show the new version and not the cached version.
For example, let's say I have a page that displays details about a product. Generally, once the product info is stored in the database, it won't change much. But occasionally there might be an edit to the product description or a price change. If the product has a new price, I don't want to show the user the old price by using a cached version of the page. For that reason, updating the cached content every hour doesn't seem sufficient. Not to mention that that's too often for the content that doesn't change, the real problem is that it won't update the content fast enough when there is a change.
So should I store something (e.g., an ETag value or a static html file) every time the product database is updated through a form in the Admin area of the application? What am I missing here?
[Note: Not interested in using a caching library here. I'd like to learn how to do it in straight PHP for now.]
Caching is a pretty complex topic, because you can cache all sort of data in various places. Usually you implement caching to relieve bottlenecks in your server structure.
In your setup you can cache data at three different locations:
1) Clientside, between client and server
You would use this method to save bandwidth and shorten loading times for the user. You can achieve this by setting cache related fields into the http header (Cache-Control, Expires, ETag and so on).
If you use Cache-Control or Expires, the decision wether to load an updated version from the server or not purely depends on the client. So even if there is a new version available, the user won't see it. On the plus side you are saving lots of cpu cycles on the server, because your php script won't be executed.
If you use ETag, you can inform the client on each request, if the version of the requested content has changed. But your php script will be executed on each request, even if the ETag is unchanged.
2) Serverside, between client and server
This kind of caching primarily reduces high cpu load on your server. It won't affect the amount of traffic generated between client and server.
You can use a client proxy like Varnish to store rendered responses on the server side. The good thing is, that you have full control over the cache. If an updated version of a requested content is available, you can simply purge the old version from the cache, so that a new version is generated from your php script and stored in your cache.
Every response that is cacheable will only be generated one time and then be served from cache to the clients.
3) In your application
If you are heavily using your database, you should consider using a fast key value store like memcached to cache query results. Of course you have to adjust your database classes for this (first ask memcached, if memcached doesn't have the result ask the database and store the result into memcached), but the performance gain will be quite impressive, because memcached is really fast.
Sometimes it even makes sense to store data solely in memcached, if the data doesn't has to be persisted permanently (php sessions for example).
I had also faced the same problem long back (I dont know if you will find my way to be correct).
Why I needed the cache :-
What my site use to do was, it use to update the database by running the script on cron.php file and index.php use to show the listing from database (this use to take ages to load )
my solution :-
every time a new list was created or updated I unlinked the cache file then on index.php page I checked if cache file exists load cache or else load content from database also at the same time write this data to the cache file so next time when user requests for index.php file

Routing .htaccess to GitHub

I was wondering if there was a way to basically host a site on your server so you can run PHP, but have the actual code hosted on GitHub. In other words...
If a HTTP request went to:
http://mysite.com/docs.html
It'd request and pull in the content (via file_get_contents() or something):
https://raw.github.com/OscarGodson/Core.js/master/docs.html
Or, if they went to:
http://mysite.com/somedir/another/core.js
It'd pull down:
https://raw.github.com/OscarGodson/Core.js/master/somedir/another/core.js
I know GitHub has their own DNS servers, but id rather host it on my so i can run server side code. What would the htaccess code look like for this?
This is beyond the capabilities of .htaccess files, if the requirement is to run the PHP embedded in the HTML stored on github.com at the server on yourserver.com simply by a configuration line like a redirect in the .htaccess file.
A .htaccess file is typically used to provide directives to the Apache web server. These directives can indicate, for example, access permissions, popup password protection, linkages between URLs and the server's file system, handlers for certain types of files when fetched by the server before delivery to the browser, and redirects from one URL to another URL.
An .htaccess file can issue redirects for http://mysite.com/somedir/another/core.js to https://raw.github.com.... but then the browser will be pointed to raw.github.com, not mysite.com. Tricks can be done with frames to make this redirection less transparent to the human at the browser... but these dont affect the fact that the data comes from github.com without ever going to the server at mysite.com
In particular, PHP tags embedded in the HTML on github.com are never received by mysite.com's server and therefore will not run. Probably not want you want. Unless some big changes have occurred in Apache, .htaccess files will not set up that workflow. It might be possible for some expert to write an apache module to do it, but I am not sure.
What you can do is put a cron job on mysite.com that git pull's from github.com every few minutes. Perhaps that is what you want to do instead?
If the server can run PHP code, you can do this.
Basically, in the .htaccess file you use a RewriteRule to send all paths to a PHP script on your server. For example, a request for /somedir/anotherdir/core.js becomes /my-script.php/somedir/anotherdir/core.js. This is how a lot of app frameworks operate. When my-script.php runs the "real" path is in the PATH_INFO variable.
From that point the script could then fetch the file from GitHub. If it was HTML or JavaScript or an image, it could just pass it along to the client. (To do things properly, though, you'll want to pass along all the right headers, too, like ETag and Last-Modified and then also check those files, so that caching works properly and you don't spend a lot of time transferring files that don't need to be transferred again and again. Otherwise your site will be really slow.)
If the file is a PHP file, you could download it locally, then include it into the script in order to execute it. In this case, though, you need to make sure that every PHP file is self-contained, because you don't know which files have been fetched from GitHub yet, so if one file includes another you need to make sure the files dependent on the first file are downloaded, too. And the files dependent on those files, also.
So, in short, the .htaccess part of this is really simple, it's just a single RewriteRule. The complexity is in the PHP script that fetches files from GitHub. And if you just do the simplest thing possible, your site might not work, or it will work but really painfully slowly. And if you do a ton of genius level work on that script, you could make it run OK.
Now, what is the goal here? To save yourself the trouble of logging into the server and typing git pull to update the server files? I hope I've convinced you that trying to fetch files on demand from GitHub will be even more trouble than that.

Choosing a PHP caching technique: output caching into files vs. opcode caching

I've heard of two caching techniques for the PHP code:
When a PHP script generates output it stores it into local files. When the script is called again it check whether the file with previous output exists and if true returns the content of this file. It's mostly done with playing around the "output buffer". Somthing like this is described in this article.
Using a kind of opcode caching plugin, where the compiled PHP code is stored in memory. The most popular of this one is APC, also eAccelerator.
Now the question is whether it make any sense to use both of the techniques or just use one of them. I think that the first method is a bit complicated and time consuming in the implementation, when the second one seem to be a simple one where you just need to install the module.
I use PHP 5.3 (PHP-FPM) on Ubuntu/Debian.
BTW, are there any other methods to cache PHP code or output, which I didn't mention here? Are they worth considering?
You should always have an opcode cache like APC. Its purpose is to speed up the parsing of your code, and will be bundled into PHP in a future version. For now, it's a simple install on any server and doesn't require you write or change any code.
However, caching opcodes doesn't do anything to speed up the actual execution of your code. Your bottlenecks are usually time spent talking to databases or reading to/from disk. Caching the output of your program avoids unnecessary resource usage and can speed up responses by orders of magnitude.
You can do output caching many different ways at many different places along your stack. The first place you can do it is in your own code, as you suggested, by buffering output, writing it to a file, and reading from that file on subsequent requests.
That still requires executing your PHP code on each request, though. You can cache output at the web server level to skip that as well. Crafting a set of mod_rewrite rules will allow Apache to serve the static files instead of the PHP code when they exist, but you'll have to regenerate the cached versions manually or with a scheduled task, since your PHP code won't be running on each request to do so.
You can also stick a proxy in front of your web server and use that to cache output. Varnish is a popular choice these days and can serve hundreds of times more request per second with caching than Apache running your PHP script on the same server. The cache is created and configured at the proxy level, so when it expires, the request passes through to your script which runs as it normally would to generate the new version of the page.
You know, for me, optcache , filecache .. etc only use for reduce database calls.
They can't speed up your code. However, they improve the page load by using cache to serve your visitors.
With me, APC is good enough for VPS or Dedicated Server when I need to cache widgets, $object to save my mySQL Server.
If I have more than 2 Servers, I like to used Memcache , they are good on using memory to cache. However it is up to you, not everyone like memcached, and not everyone like APC.
For caching whole web page, I ran a lot of wordpress, and I used APC, Memcache, Filecache on some Cache Plugins like W3Total Cache. And I see ( my own exp ): Filecache is good for caching whole website, memory cache is good for caching $object
Filecache will increase your CPU if your hard drive is slow, and Memory cache is terrible if you don't have enough memory on your VPS.
An SSD HDD will be super good speed to read / write file, but Memory is always faster. However, Human can't see what is difference between these speed. You only pick one method base on your project and your server ( RAM, HDD ) or are you on a shared web hosting?
If I am on a shared hosting, without root permission, without php.ini, I like to use phpFastCache, it a simple file cache method with set, get, stats, delete only.
In Addition, I like to use .htaccess to cache static files like images, js, css or by html headers. They will help visitors speed up your page, and save your server bandwidth.
And If you can use .htaccess to redirect to static .html cache if you cache whole page is a great thing.
In future, APC or some Optcache will be bundle into PHP version, but I am sure all the cache can't speed up your code, they use to:
Reduce Database / Query calls.
Improve the speed of page load by use cache to serve.
Save your API Transactions ( like Bing ) or cURL request...
etc...
A lot of times, when it comes to PHP web applications, the database is the bottleneck. As such, one of the best things you can do is to use memcached to cache results in memory. You can also use something like xhprof to profile your code, and really dial in on what's taking the most time.
Yes, those are two different cache-techniques, and you've understood them correctly.
but beware on 1):
1.) Caching script generated output to files or proxies may render problems
if content change rapidly.
2.) x-cache exists too and is easy to install on ubuntu.
regards,
/t
I don't know if this really would work, but I came across a performance problem with a PHP script that I had. I have a plain text file that stores data as a title and a URL tab separated with each record separated by a new line. My script grabs the file at each URL and saves it to its own folder.
Then I have another page that actually displays the local files (in this case, pictures) and I use a preg_replace() to change the output of each line from the remote url to a relative one so that it can be displayed by the server. My tab separated file is now over 1 MB and it takes a few SECONDS to do the preg_replace(), so I decided to look into output caching. I couldn't find anything definitive, so I figured I would try my own hand at it and here's what I came up with:
When I request the page to view stuff locally, I try to read it from a variable in a global scope. If this is empty, it might be that this application hasn't run yet and this global needs populated. If it was empty, read from an output file (plain html file that literally shows everything to output) and save the contents to the global variable and then display the output from the global.
Now, when the script runs to update the tab separated file, it updates the output file and the global variable. This way, the portion of the script that actually does the stuff that runs slowly only runs when the data is being updated.
Now I haven't tried this yet, but theoretically, this should improve my performance a lot, although it does actually still run the script, but the data would never be out of date and I should get a much better load time.
Hope this helps.

Categories