We need a caching solution that caches data (text files) for anywhere from 3 days up to a week, based on user preferences and criteria. In this case memory-based caching does not make sense to us. We were referred to MemcacheDB; however, I also thought of some NoSQL solutions.
Our current application uses an RDBMS (MySQL), so I guess it makes sense to use MemcacheDB; however, NoSQL does appeal as something more on the horizon. That said, we have not deployed a production-level application on NoSQL, and the beta stuff does not sit well with management/investors. Anyhow, what are your thoughts and how would you address it?
Thank You
CouchDB and MongoDB are both great databases, but they are terrible choices for a cache layer on top of your existing RDBMS. Besides the fact that they are still fairly immature, they don't suit the purpose at all. Also, speed-wise you'd be better off going without a cache layer than using CouchDB or MongoDB--they are both slower for simple read/writes than even MySQL. Yes, the NoSQL databases are "cool", but that does not mean you should use them for something they were not meant to do.
I'd go with Memcached, as it's just about the fastest and lightest thing you'll find, and it's well-known and well-supported.
If you're worried about the appeal to management and investors, and the current system (you mention MySQL) works, why would you change? You're moving from a fairly stable project to projects still in beta, and what value are you adding if the current system already works?
As mentioned above, all CouchDB resources contain etags.
What wasn't mentioned is that you can put any HTTP caching solution in front of CouchDB and have it do etag based caching. This way you can use Varnish, nginx, whatever you want.
I'd also take a look at Cassandra ( http://cassandra.apache.org/ ). I've tried MemcacheDB and CouchDB, and somehow found Cassandra more appealing (I don't know about PHP support since I work with ColdFusion). Here's a related question: Cassandra PHP module
CouchDB already does some caching: when you get a document the server also sends the HTTP ETag header (it's the same as the document revision in CouchDB).
The next time the browser asks for the same document it sends the Etag received. If the document hasn't been modified the server responds with the HTTP code 304 Not Modified and your browser retrieves the document from its local cache.
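To make the ETag round trip concrete, here is a minimal sketch of the server-side decision in PHP. The function name is made up for illustration, and this is not CouchDB's own code, just the same comparison any HTTP server or proxy performs.

```php
<?php
// Hypothetical helper: pick the status code for a conditional GET.
// $ifNoneMatch is the raw If-None-Match request header (null if absent);
// $currentEtag is the quoted ETag of the current version of the document.
function conditionalStatus(?string $ifNoneMatch, string $currentEtag): int
{
    if ($ifNoneMatch === null) {
        return 200; // no validator sent, so send the full document
    }
    // The header may carry several comma-separated ETags, or "*".
    foreach (explode(',', $ifNoneMatch) as $candidate) {
        $candidate = trim($candidate);
        // Treat a weak validator (W/"...") like its strong form here.
        if (strncmp($candidate, 'W/', 2) === 0) {
            $candidate = substr($candidate, 2);
        }
        if ($candidate === '*' || $candidate === $currentEtag) {
            return 304; // client copy is still current
        }
    }
    return 200;
}
```

In a real request you would read the header from $_SERVER['HTTP_IF_NONE_MATCH'] and, on 304, send the status line with no body.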
However, if you have to cache files for different lengths of time based on user preferences, even if the text file changes, your best option is probably to write custom code that sends the appropriate HTTP caching headers based on those preferences.
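As a rough sketch of that custom code, assuming the user preference is stored as a number of days (the function name and the clamping to the 3 to 7 day range from the question are my own assumptions):

```php
<?php
// Build HTTP caching headers for a per-user cache lifetime.
// $days is a hypothetical user preference, clamped to the 3..7 day range.
// $now is injectable so the result can be checked deterministically.
function cacheHeadersForUser(int $days, ?int $now = null): array
{
    $days = max(3, min(7, $days));
    $maxAge = $days * 86400;            // lifetime in seconds
    $now = $now ?? time();
    return [
        // "private" because the lifetime differs per user
        'Cache-Control: private, max-age=' . $maxAge,
        'Expires: ' . gmdate('D, d M Y H:i:s', $now + $maxAge) . ' GMT',
    ];
}

// Usage in a web request (commented out so the sketch runs on the CLI):
// foreach (cacheHeadersForUser($userDays) as $line) { header($line); }
```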
For completeness another good option is Redis. You get performance comparable to Memcache but Redis also supports various data structures (hashes, lists, set, sorted sets) and atomic operations.
If you want memcached with persistence, you should check out Redis. It has all the memcached functionality (and more) along with persistence.
I have not tried it myself, but I do remember reading that Redis also supported the memcached API as well.
I have a website that requires a shared priority queue with objects that store the IP of the user and other important user posted values.
I figured I could use a sql database, but since the priority queue isn't that big and is frequently updated, something like a cache would be more appropriate.
What tools could I use from the Zend framework? I want the shared priority queue to be consistent, of course, without data races and other concurrency related problems. Yet I still want a good performance.
Any ideas?
APC, Memcache, or an actual queuing solution like beanstalkd. I'm not sure ZF has anything specific for this, except perhaps a pre-built client for some or all of these tools. Of course you could use the filesystem as well, but you need to determine whether that will work: for example, if you have multiple servers behind a load balancer, a filesystem approach on one web server is probably insufficient.
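For the single-server filesystem case, a minimal sketch could look like the following. The class name and file layout are hypothetical; flock() serialises access so concurrent requests cannot interleave their read-modify-write cycles. It only helps when every request hits the same machine.

```php
<?php
// Sketch: a small priority queue shared between PHP processes on ONE server
// through a file, with flock() guarding each read-modify-write cycle.
class FilePriorityQueue
{
    private $path;

    public function __construct(string $path)
    {
        $this->path = $path;
    }

    // Run $fn on the decoded item list under an exclusive lock, then save it.
    private function locked(callable $fn)
    {
        $fh = fopen($this->path, 'c+');   // create if missing, never truncate
        flock($fh, LOCK_EX);              // blocks until we own the lock
        $raw = stream_get_contents($fh);
        $items = $raw === '' ? [] : unserialize($raw);
        $result = $fn($items);            // $fn takes the list by reference
        ftruncate($fh, 0);
        rewind($fh);
        fwrite($fh, serialize($items));
        fflush($fh);
        flock($fh, LOCK_UN);
        fclose($fh);
        return $result;
    }

    public function push($value, int $priority): void
    {
        $this->locked(function (array &$items) use ($value, $priority) {
            $items[] = [$priority, $value];
        });
    }

    // Remove and return the highest-priority value, or null when empty.
    // Among equal priorities, the earliest pushed value wins.
    public function pop()
    {
        return $this->locked(function (array &$items) {
            if (!$items) {
                return null;
            }
            $best = 0;
            foreach ($items as $i => $item) {
                if ($item[0] > $items[$best][0]) {
                    $best = $i;
                }
            }
            $value = $items[$best][1];
            array_splice($items, $best, 1);   // also reindexes the list
            return $value;
        });
    }
}
```

The linear scan in pop() is fine for a queue that "isn't that big"; for anything larger or multi-server, one of the tools named above is the better fit.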
What are the available cache methods I could use in PHP?
Cache HTML output
Cache some variables
It would be great to implement more than one caching method, so I'd like to hear about all the ones available out there (I currently cache with files; any other ideas?)
Most PHP builds don't have a caching mechanism built in. There are extensions, though, that can take care of caching for you.
Have a look at APC or MemCache
If you are using a framework, then most come with some form of caching mechanism that you can use e.g. Zend Framework's Zend_Cache.
If you are not using a framework, then APC or Memcache, as Pelle ten Cate mentioned, can be used. The correct approach does depend on your situation, though: do you have your website or application running on more than one server, and does the information in the cache need to be shared between those servers? (If yes, then something like memcache is your answer, or maybe a database or distributed NoSQL solution if you are feeling brave.)
If your code is only running on one server, you could try something simple like serializing your variables and writing them to disk. On every request afterwards, check whether the file exists; if it does, open it and unserialize the string into the variable you need.
This is only worth it, though, if generating the variable normally would take a long time (e.g. longer than it would take to open, read, and unserialize the file on disk).
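The serialize-to-disk approach can be sketched roughly like this. The function name, file naming, and TTL parameter are my own choices for illustration.

```php
<?php
// Return the cached value for $key if its file is younger than $ttl seconds;
// otherwise call $compute, store the result on disk, and return it.
function cachedValue(string $dir, string $key, int $ttl, callable $compute)
{
    $file = $dir . '/' . sha1($key) . '.cache';
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        $data = @unserialize(file_get_contents($file));
        // Values are wrapped in a one-element array so a cached `false`
        // can be told apart from an unserialize() failure.
        if (is_array($data)) {
            return $data[0];
        }
    }
    $value = $compute();                       // the slow part
    file_put_contents($file, serialize([$value]), LOCK_EX);
    return $value;
}
```

Typical use would be something like cachedValue('/tmp/cache', 'report', 600, function () { return buildReport(); }), where buildReport() stands in for whatever is slow to generate.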
For HTML caching you are generally going to get the most mileage from using a proxy like Varnish or Squid to do it for you, but I realise that this may not be an option for you.
If it's not, then you could use the write-to-disk approach I mentioned above and save chunks of HTML to files. Look in the PHP manual for ob_start and its friends.
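A minimal sketch of that with ob_start, assuming the fragment is produced by a callback that echoes its HTML (the function name and TTL parameter are illustrative):

```php
<?php
// Render an HTML fragment through $render (which echoes its output) and
// keep the result in $file for $ttl seconds.
function cachedFragment(string $file, int $ttl, callable $render): string
{
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return file_get_contents($file);   // fresh enough, skip rendering
    }
    ob_start();                 // start capturing everything that is echoed
    $render();
    $html = ob_get_clean();     // stop capturing and fetch the buffer
    file_put_contents($file, $html, LOCK_EX);
    return $html;
}
```

The page then does echo cachedFragment(...) for the expensive parts and renders everything else live.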
Since every PHP run starts from scratch on each page request, nothing persists between calls, which makes caching moot.
Well, that's the basic view. Of course there are ways to implement caching of a sort, and a few packages and extensions do so (like Zend Extensions and APC). However, you should look very closely at whether it actually improves performance. Other methods like memcache (for DB results), or switching from PHP to e.g. Java, will often yield better results.
You can store variables in the $_SESSION, but you shouldn't keep larger HTML there.
Please check what you are actually trying to do. "Bytecode caching" (that is, saving PHP parsing time) needs to be done by the PHP runtime executable. For caching database (SQL) request/reply pairs, there is memcache. Caching HTML output can be done, but is often not a good idea.
See also an earlier answer on a similar question.
I am going to develop a social + professional networking website using Php (Zend or Yii framework). We are targeting over 5000 requests per minute. I have experience in developing advanced websites, using MVC frameworks.
But, this is the first time, I am going to develop something keeping scalability in mind. So, I will really appreciate, if someone can tell me about the technologies, I should be looking for.
I have read about memcache and APC. Which one should I look at? Also, should I use a single MySQL server or a master/slave combination (if it's the latter, then why and how)?
Thanks !
You'll probably want to architect your site to use, at minimum, a master/slave replication system. You don't necessarily need to set up replicating MySQL boxes to begin with, but you want to design your application so that database reads use a different connection than writes (even if in the beginning both connections connect to the same DB server).
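One way to sketch the read/write split is a small router that hands back the right connection per query. The class name is hypothetical, and the connections are opaque values here (in a real app they would be PDO or mysqli objects).

```php
<?php
// Sketch: route queries to separate read and write connections.
// Early on, both constructor arguments can be the same connection.
class ConnectionRouter
{
    private $writer;
    private $reader;

    public function __construct($writer, $reader)
    {
        $this->writer = $writer;
        $this->reader = $reader;
    }

    // SELECT statements go to the reader; everything else to the writer.
    public function connectionFor(string $sql)
    {
        $isRead = preg_match('/^\s*SELECT\b/i', $sql) === 1;
        return $isRead ? $this->reader : $this->writer;
    }
}
```

This is deliberately naive: reads that must see a just-written row, or SELECT ... FOR UPDATE, would need to be sent to the writer explicitly.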
You'll also want to think very carefully about what your caching strategy is going to be. I'd be looking at memcache, though with Zend_Cache you could use a file-based cache early on, and swap in memcache if/when you need it. In addition to record caching, you also want to think about (partial) page-level caching, and what kind of strategies you want to plan/implement there.
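To illustrate the "file-based now, memcache later" idea without tying it to Zend_Cache's actual API, here is a sketch with a made-up two-method interface. A memcache-backed class implementing the same interface could be swapped in later without touching the calling code.

```php
<?php
// A minimal cache interface so the storage backend can be swapped later.
// All names are hypothetical and unrelated to Zend_Cache's real API.
interface CacheBackend
{
    public function get(string $key);               // null on a miss
    public function set(string $key, $value): void;
}

// File-backed implementation for the early days.
class FileBackend implements CacheBackend
{
    private $dir;

    public function __construct(string $dir)
    {
        $this->dir = $dir;
    }

    private function path(string $key): string
    {
        return $this->dir . '/' . sha1($key);
    }

    public function get(string $key)
    {
        $file = $this->path($key);
        return is_file($file) ? unserialize(file_get_contents($file)) : null;
    }

    public function set(string $key, $value): void
    {
        file_put_contents($this->path($key), serialize($value), LOCK_EX);
    }
}

// Application code depends only on the interface.
function loadProfile(CacheBackend $cache, int $userId, callable $fetch)
{
    $key = 'profile:' . $userId;
    $profile = $cache->get($key);
    if ($profile === null) {
        $profile = $fetch($userId);   // e.g. a database lookup
        $cache->set($key, $profile);
    }
    return $profile;
}
```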
You'll also want to plan carefully how you'll handle the storage and retrieval of user-generated media. You'll want to be able to easily move that stuff off the main server onto a dedicated box to serve static content, or some kind of CDN (content distribution network).
Also, think about how you're going to handle session management, and make sure you don't do anything that will prevent you from using a non-file-based session storage ((dedicated) database, or memcache) in the future.
If you think carefully, and abstract data storage/retrieval, you'll be heading in a good direction.
Memcached is a distributed caching system, whereas APC is non-distributed and mainly an opcode cache.
If (and only if) your website has to live on different webservers (loadbalancing), you have to use memcache for distributed caching. If not, just stick to APC and its cache.
As for the MySQL database, I would advise grid hosting that can autoscale according to requirements.
Depending on the requirements of your site it's more likely the database will be your bottle neck.
MVC frameworks tend to sacrifice performance for ease of coding, especially in the case of ORM. Don't rely on the ORM; instead, benchmark different ways of querying the database and see which suits. You want to minimise the number of database queries: fetch a chunk of data at once instead of doing multiple small queries.
If you find that your PHP code is a bottleneck (profile it before optimizing), you might find Facebook's HipHop useful.
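As a small example of "one chunk instead of many small queries", this sketch builds a single parameterised IN (...) query for a list of ids. The table and column names are made up for illustration.

```php
<?php
// Build one parameterised query for a whole list of ids instead of issuing
// one query per id. Returns the SQL string and the bound parameters.
function buildBatchSelect(array $ids): array
{
    if (!$ids) {
        throw new InvalidArgumentException('At least one id is required');
    }
    // One "?" placeholder per id, e.g. "?, ?, ?"
    $placeholders = implode(', ', array_fill(0, count($ids), '?'));
    $sql = "SELECT id, name FROM users WHERE id IN ($placeholders)";
    return [$sql, array_values($ids)];
}
```

The returned pair can be passed to a prepared statement, e.g. $stmt = $pdo->prepare($sql); $stmt->execute($params); with PDO.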
I've been reading up on this subject for a while. Suddenly the day has come where this solution is a necessity, not just a dream.
Through my reading, I've seen the popular differences being (file based, memcached, shared memory (mm), sql table, and custom).
The original idea we thought of was using a ZFS or AFS mounted on each of the application servers (LAMP boxes), and pointing the session.save_path php.ini setting to a directory from that mounted path.
I'd like to hear of success stories.
John Campbell's answer here should help
What is the best way to handle sessions for a PHP site on multiple hosts?
The point he makes about NOT using only Memcached is important.
Also, as I mentioned in that question, you may want to consider the session clustering that comes with Zend Platform - but there are significant licensing costs associated with that solution.
I think storing your sessions in a database (like MySQL or PostgreSQL) will involve the least headaches, especially if you already have a DB for whatever it is your app does.
Memcached may also help, since it can store data across multiple machines, but I don't have any experience with it.
I have been using file-based sessions on shared servers for over 5 years with no problems. We have some sessions that can become quite large (>10MB) and file-based storage works very well. Typically our shared servers store the session files for each site in chrooted directories so only root can access them all. We have found this to be very reliable. Although you lose some of the functionality of database or memcached storage, there is a reason why it's the PHP default.
If you're looking into a Memcached solution for sessions - maybe you should check out Repcached. Should reduce any problems with losing sessions if servers get rebooted, etc.
about repcached
"repcached" is patch set which adds data replication feature to memcached 1.2.x.
Note: I haven't actually tried repcached yet, but thought it was worth looking into.
I'm a bit biased, but I'd recommend HTTP_Session2. (I'm working on this package) While we support traditional session handling through files we also support database (MySQL, PostgreSQL, SQlite etc. through PEAR::MDB2) and also memcached.
Personally, we use the database handler and we serve up to 100,000 users/day with no major issues. I think optimization-wise, I'd go memcached next, but a database is great as an intermediate fix which doesn't require you to bend over backwards. :-)
By the way, for more info on memcached, please check my answer on How to manage session variables in a web cluster?.
EDIT
Since you asked, here is an example (more in the API docs):
$options = array('memcache' => $memcache);
Where $memcache is an instance of PECL::Memcache, which is required. I know we lack an example, and we'll improve on that. In the meantime, our source code has pretty good inline documentation, so check out the API documentation.
What is the best way of implementing a cache for a PHP site? Obviously, there are some things that shouldn't be cached (for example search queries), but I want to find a good solution that will make sure that I avoid the 'digg effect'.
I know there is WP-Cache for WordPress, but I'm writing a custom solution that isn't built on WP. I'm interested in either writing my own cache (if it's simple enough), or you could point me to a nice, light framework. I don't know much Apache though, so if it was a PHP framework then it would be a better fit.
Thanks.
You can use output buffering to selectively save parts of your output (those you want to cache) and display them to the next user if it hasn't been long enough. This way you're still rendering other parts of the page on-the-fly (e.g., customizable boxes, personal information).
If a proxy cache is out of the question, and you're serving complete HTML files, you'll get the best performance by bypassing PHP altogether. Study how WP Super Cache works.
Uncached pages are copied to a cache folder with a URL structure similar to your site's. On later requests, mod_rewrite notes the existence of the cached file and serves it instead. Other RewriteCond directives are used to make sure commenters and logged-in users see live PHP requests, but the majority of visitors will be served by Apache directly.
The best way to go is to use a proxy cache (Squid, Varnish) and serve appropriate Cache-Control/Expires headers, along with ETags : see Mark Nottingham's Caching Tutorial for a full description of how caches work and how you can get the most performance out of a caching proxy.
Also check out memcached, and try to cache your database queries (or better yet, pre-rendered page fragments) in there.
I would recommend Memcached or APC. Both are in-memory caching solutions with dead-simple APIs and lots of libraries.
The trouble with those 2 is you need to install them on your web server or another server if it's Memcached.
APC
Pros:
Simple
Fast
Speeds up PHP execution also
Cons:
Doesn't work for distributed systems, each machine stores its cache locally
Memcached
Pros:
Fast(ish)
Can be installed on a separate server for all web servers to use
Highly tested, developed at LiveJournal
Used by all the big guys (Facebook, Yahoo, Mozilla)
Cons:
Slower than APC
Possible network latency
Slightly more configuration
I wouldn't recommend writing your own, there are plenty out there. You could go with a disk-based cache if you can't install software on your webserver, but there are possible race issues to deal with. One request could be writing to the file while another is reading.
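If you do end up with a disk-based cache, one common way around the read-while-writing race is to write to a temporary file and rename it into place. A minimal sketch (the function name is mine):

```php
<?php
// Write a cache file atomically: readers see either the old contents or
// the new contents, never a half-written file.
function atomicWrite(string $file, string $contents): bool
{
    // Create the temp file in the same directory so rename() stays on
    // one filesystem.
    $tmp = tempnam(dirname($file), 'tmp');
    if ($tmp === false) {
        return false;
    }
    if (file_put_contents($tmp, $contents) === false) {
        unlink($tmp);
        return false;
    }
    // On POSIX systems, rename() replaces the target in a single step.
    return rename($tmp, $file);
}
```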
You actually could cache search queries, even for a few seconds to a minute. Unless your db is being updated more than a few times a second, some delay would be ok.
The PHP Smarty template engine (http://www.smarty.net) includes a fairly advanced caching system.
You can find details in the caching section of the Smarty manual: http://www.smarty.net/manual/en/caching.php
You seem to be looking for a PHP cache framework.
I recommend the template system TinyButStrong, which comes with a very good CacheSystem plugin.
It's simple, light, customizable (you can cache whatever part of the html file you want), very powerful ^^
For simple caching of pages, or parts of pages, use the PEAR Cache_Lite class. I also use APC and memcache for different things, but the other answers I've seen so far are aimed at more complete, and more complex, systems. If you just need to save some effort rebuilding a part of a page, Cache_Lite with a file-backed store is entirely sufficient, and very simple to implement.
Project Gazelle (an open source torrent site) provides a step by step guide on setting up Memcached on the site which you can easily use on any other website you might want to set up which will handle a lot of traffic.
Grab the source and read the documentation.