I am currently building a caching component for my applications. It will have support for different adapters:
APC
Memcached
Files
For all of them, I need to generate a cache key. What's the best way to do this? I am considering concatenating the function name and arguments and then running md5() on the result. Is this a good strategy?
Finally, when caching objects as files on disk, how should the cache files be organized? I have a feeling that having a single cache folder and just throwing all the cache files in there would hurt performance.
The application will be hosted on Linux and Windows servers.
Both md5() and sha1() fit your need for naming cache files, since both perform well.
When saving the cache files to the file system, you can look at how Git stores its objects.
Useful links:
Benchmark: http://www.cryptopp.com/benchmarks.html
How git stores objects: http://book.git-scm.com/7_how_git_stores_objects.html
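Putting both suggestions together, here's a minimal sketch (the function names are illustrative, not from any library) of a deterministic key built from the function name and arguments, plus a Git-style fan-out directory so no single folder ends up holding every cache file:

```php
<?php
// Build a deterministic cache key from a function name and its arguments.
// serialize() keeps argument types and order significant; md5() gives a
// fixed-length, filesystem-safe name.
function cacheKey(string $function, array $args): string
{
    return md5($function . ':' . serialize($args));
}

// Git-style fan-out: the first two hex characters become a subdirectory,
// so the files spread across up to 256 folders instead of one.
function cachePath(string $baseDir, string $key): string
{
    return $baseDir . '/' . substr($key, 0, 2) . '/' . substr($key, 2);
}

$key = cacheKey('getUser', [42, 'full']);
echo cachePath('/tmp/cache', $key), "\n";
```

Because serialize() is type-sensitive, `[42]` and `['42']` produce different keys, which is usually what you want for a cache.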
How about http://php.net/manual/en/function.uniqid.php?
I'm thinking of moving some of my file system caching (via phpfastcache) over to memcache (again using phpfastcache), but I have a question about storage sizes. It's potentially a silly question, but:
If I had 80 GB of file cache storage in use, would that mean I'd need 80 GB of RAM to store the exact same cache in memory?
I suspect they use different compression methods, but I've struggled to find an answer online.
80 GB of cache is huge; I'm surprised myself that Phpfastcache handles it well. If you migrate to Memcache, you can certainly expect serious memory issues on the server.
For such a size I'd recommend more performant backends such as MongoDB/CouchDB/ArangoDB, or maybe Redis too.
If you're just looking for backend storage stats, you can have a look at the API offered by Phpfastcache: Wiki - Cache Statistics.
It will return miscellaneous statistics provided by the backends that Phpfastcache supports.
(Note: I'm the author of this Library)
I'm using Zend Framework and Zend_Cache with the File backend.
According to the ZF manual, the recommended place to store cache files is under /data/cache, but I'm thinking it would make more sense to store them under /temp/cache. Why is /data/cache preferred?
Here is a link to the part of ZF manual I mentioned:
http://framework.zend.com/manual/en/project-structure.project.html
I guess you're talking about these recommendations: Recommended Project Directory Structure.
The interesting parts are:
data/: This directory provides a place to store application data that
is volatile and possibly temporary. The disturbance of data in this
directory might cause the application to fail. Also, the information
in this directory may or may not be committed to a subversion
repository. Examples of things in this directory are session files,
cache files, sqlite databases, logs and indexes.
temp/: The temp/ folder is set aside for transient application data. This information would not typically be committed to the
applications svn repository. If data under the temp/ directory were
deleted, the application should be able to continue running with a
possible decrease in performance until data is once again restored or
recached.
So Zend doesn't recommend storing your Zend_Cache data only in data/cache/; it could also go under the temp/ directory. The real questions are: should these cache files be committed, and are they necessary for the application to run correctly? Once you've answered those, you know where to put your cache files. In my opinion, in most cases cached data should be stored under the temp/ directory.
Finally, remember that this is only a recommendation, you are always free to do the way you want.
I can't find the part of the Zend_Cache manual that recommends using data/cache as the cache directory, maybe you could link to it. I did find some examples that use ./temp/.
Either way, Zend Cache doesn't care where you decide to store the cache files, it is up to you. You just need to make sure that the directory is readable and writable by PHP.
First Some Background
I'm planning out the architecture for a new PHP web application and trying to make it as easy as possible to install. As such, I don't care what web server the end user is running so long as they have access to PHP (setting my requirement at PHP5).
But the app will need some kind of database support. Rather than working with MySQL, I decided to go with an embedded solution. A few friends recommended SQLite - and I might still go that direction - but I'm hesitant since it needs additional modules in PHP to work.
Remember, the aim is ease of installation; most lay users won't know what PHP modules their server has or even how to find their php.ini file, let alone how to enable additional tools.
My Current Objective
So my current leaning is to go with a filesystem-based data store. The "database" would be a folder, each "table" would be a specific subfolder, and each "row" would be a file within that subfolder. For example:
/public_html
/application
/database
/table
1.data
2.data
/table2
1.data
2.data
There would be other files in the database as well to define schema requirements, relationships, etc. But this is the basic structure I'm leaning towards.
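As a rough illustration of that layout (the helper names are hypothetical; only the directory scheme and the .data extension come from the proposal above), each "row" could be written and read like this:

```php
<?php
// Minimal sketch of the proposed layout: each "row" is a serialized PHP
// array stored as <database>/<table>/<id>.data.
function rowPath(string $dbDir, string $table, int $id): string
{
    return "$dbDir/$table/$id.data";
}

function writeRow(string $dbDir, string $table, int $id, array $row): void
{
    $path = rowPath($dbDir, $table, $id);
    if (!is_dir(dirname($path))) {
        mkdir(dirname($path), 0775, true); // create the "table" folder on demand
    }
    file_put_contents($path, serialize($row), LOCK_EX);
}

function readRow(string $dbDir, string $table, int $id): ?array
{
    $path = rowPath($dbDir, $table, $id);
    return is_file($path) ? unserialize(file_get_contents($path)) : null;
}
```

Note that even this tiny sketch needs locking (LOCK_EX) to stay safe under concurrent requests, which hints at the concurrency problems discussed below.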
I've been pretty happy with the way Microsoft built their Open Office XML file format (.docx/.xlsx/etc). Each file is really a ZIP archive of a set of XML files that define the document.
It's clean, easy to parse, and easy to understand.
I'd like to actually set up my directory structure so that /database is really a ZIP archive that resides on the server - a single, portable file.
But as the data store grows in size, won't this begin to affect performance on the server? Will PHP need to read the entire archive in to memory to extract it and read its composite files?
What alternatives could I use to implement this kind of file structure but still make it as portable as possible?
SQLite has been enabled by default since PHP 5, so almost all PHP5 users should have it.
I think the zip approach will cause tons of problems; for example, adding a file to a relatively large zip archive is very time-consuming. I'd also expect horrible concurrency and locking issues.
Reading zip files requires a php extension anyway, unless you went with a pure PHP solution. The downside is most php solutions WILL want to read the whole zip into memory, and will also be way slower than something that is written in C and compiled like the zip extension in PHP.
I'd choose another approach, or make SQLite/MySQL a requirement. If you use PDO for PHP, then you can allow the user to choose SQLite or MySQL and your code is no different as far as issuing queries. I think 99%+ of webhosts out there support MySQL anyway.
Using a real database will also affect your performance. It's worth loading the extra modules (and most PHP installations have at least the mysql module and probably sqlite as well) for the fact that those modules are written in C and run much faster than PHP, and have been optimized for speed. Using sqlite will help keep your web app portable, if you're willing to deal with sqlite BS.
Zip archives are great for data exchange. They aren't great for fast access, though, and they're awful for rewriting content. Both of these are extremely important for a database used by a web application.
Your proposed solution also has some specific performance issues -- the list of files in a zip archive is internally stored as a "flat" list, so accessing a file by name takes O(n) time relative to the size of the archive.
I was wondering if it would be possible to somehow speed up symfony templates by loading the files into memcached and then, instead of doing an include, grabbing them from memory. Has anyone tried this? Would it work?
Have you looked at the view cache already? This built-in system makes it possible to cache the output of actions, has a lot of configuration options, and is overridable on a per-action (and per-component) level. It works on the file level by default, but I think it is possible to configure it so that action output is cached to memcached. (Or you may have to write that part yourself.)
If you want really lightning fast pages, you should also look at the sfSuperCachePlugin, which stores the output as an HTML file in your public HTML folder. That way Apache can directly serve the pages, and doesn't need to start up PHP and symfony to generate the output.
Sorry for not having more time to give an explanation here but you can review the notes at:
http://www.symfony-project.org/book/1_2/12-Caching
under the heading:
Alternative Caching storage
Quote from the page:
"By default, the symfony cache system stores data in files on the web server hard disk. You may want to store cache in memory (for instance, via memcached) or in a database (notably if you want to share your cache among several servers or speed up cache removal). You can easily alter symfony's default cache storage system because the cache class used by the symfony view cache manager is defined in factories.yml."
good luck!
What is the best way of implementing a cache for a PHP site? Obviously, there are some things that shouldn't be cached (for example search queries), but I want to find a good solution that will make sure that I avoid the 'digg effect'.
I know there is WP-Cache for WordPress, but I'm writing a custom solution that isn't built on WP. I'm interested in either writing my own cache (if it's simple enough), or you could point me to a nice, light framework. I don't know much Apache though, so if it was a PHP framework then it would be a better fit.
Thanks.
You can use output buffering to selectively save parts of your output (those you want to cache) and display them to the next user if it hasn't been long enough. This way you're still rendering other parts of the page on-the-fly (e.g., customizable boxes, personal information).
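A minimal sketch of that output-buffering approach (the file path, TTL, and function name are arbitrary examples):

```php
<?php
// Cache one fragment of the page with output buffering: if a fresh cached
// copy exists, return it; otherwise render, capture, and store it.
function cachedFragment(string $cacheFile, int $ttl, callable $render): string
{
    if (is_file($cacheFile) && time() - filemtime($cacheFile) < $ttl) {
        return file_get_contents($cacheFile); // serve the saved copy
    }
    ob_start();
    $render();                 // the expensive part of the page
    $html = ob_get_clean();    // capture what it printed
    file_put_contents($cacheFile, $html, LOCK_EX);
    return $html;
}

// Dynamic parts (user box, personal info) are still rendered every request;
// only this fragment is reused for up to 5 minutes.
echo cachedFragment(sys_get_temp_dir() . '/sidebar.html', 300, function () {
    echo '<ul><li>Popular posts…</li></ul>';
});
```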
If a proxy cache is out of the question, and you're serving complete HTML files, you'll get the best performance by bypassing PHP altogether. Study how WP Super Cache works.
Uncached pages are copied to a cache folder with a URL structure similar to your site's. On later requests, mod_rewrite notices the cached file exists and serves it instead. Other RewriteCond directives make sure commenters/logged-in users see live PHP requests, but the majority of visitors will be served by Apache directly.
The best way to go is to use a proxy cache (Squid, Varnish) and serve appropriate Cache-Control/Expires headers, along with ETags : see Mark Nottingham's Caching Tutorial for a full description of how caches work and how you can get the most performance out of a caching proxy.
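For the header side of this, here's a small sketch of serving validation headers from PHP (the max-age value and function names are just examples):

```php
<?php
// A strong ETag derived from the response body.
function etagFor(string $body): string
{
    return '"' . md5($body) . '"';
}

// Emit freshness/validation headers so a proxy (Squid, Varnish) or the
// browser can reuse the response, and answer 304 when the client's copy
// is still current.
function sendCachedResponse(string $body): void
{
    $etag = etagFor($body);
    header('Cache-Control: public, max-age=300'); // fresh for 5 minutes
    header('ETag: ' . $etag);

    if (($_SERVER['HTTP_IF_NONE_MATCH'] ?? '') === $etag) {
        http_response_code(304); // client copy is valid; send no body
        return;
    }
    echo $body;
}
```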
Also check out memcached, and try to cache your database queries (or better yet, pre-rendered page fragments) in there.
I would recommend Memcached or APC. Both are in-memory caching solutions with dead-simple APIs and lots of libraries.
The trouble with those two is that you need to install them on your web server (or on another server, in Memcached's case).
APC
Pros:
Simple
Fast
Speeds up PHP execution also
Cons:
Doesn't work for distributed systems, each machine stores its cache locally
Memcached
Pros:
Fast(ish)
Can be installed on a separate server for all web servers to use
Highly tested, developed at LiveJournal
Used by all the big guys (Facebook, Yahoo, Mozilla)
Cons:
Slower than APC
Possible network latency
Slightly more configuration
I wouldn't recommend writing your own, there are plenty out there. You could go with a disk-based cache if you can't install software on your webserver, but there are possible race issues to deal with. One request could be writing to the file while another is reading.
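One common way around that read-while-writing race (a sketch, not taken from any particular library): write to a temporary file and rename() it into place. On POSIX filesystems rename() is atomic, so readers see either the old file or the new one, never a half-written file:

```php
<?php
// Atomic cache write: the temp file must live on the same filesystem
// as the destination for rename() to be an atomic swap.
function cacheWrite(string $path, string $data): void
{
    $tmp = $path . '.' . uniqid('', true) . '.tmp';
    file_put_contents($tmp, $data);
    rename($tmp, $path);
}

function cacheRead(string $path): ?string
{
    return is_file($path) ? file_get_contents($path) : null;
}
```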
You actually could cache search queries, even for a few seconds to a minute. Unless your db is being updated more than a few times a second, some delay would be ok.
The PHP Smarty template engine (http://www.smarty.net) includes a fairly advanced caching system.
You can find details in the caching section of the Smarty manual: http://www.smarty.net/manual/en/caching.php
You seem to be looking for a PHP cache framework.
I recommend the template system TinyButStrong, which comes with a very good CacheSystem plugin.
It's simple, light, and customizable (you can cache whichever parts of the HTML file you want), yet very powerful ^^
For simple caching of pages, or parts of pages, there's the PEAR Cache_Lite class. I also use APC and memcached for different things, but the other answers I've seen so far describe more complete, more complex systems. If you just need to save some effort rebuilding part of a page, Cache_Lite with a file-backed store is entirely sufficient, and very simple to implement.
Project Gazelle (an open source torrent site) provides a step-by-step guide to setting up Memcached, which you can easily apply to any other high-traffic website you might set up.
Grab the source and read the documentation.