Need a caching system that's both fast and big - PHP

Is there a caching system in PHP that's both fast and big?
If I cache to disk, the I/O goes up.
If I cache in memory, the memory is too small.
I need a comprehensive caching system that caches in memory but, when memory usage grows too large, automatically dumps large data sequentially to disk.
Is there any such thing in PHP?
The site stores a huge number of small items (about 30 KB each), millions of them. Each item is stored in its own file right now.

You should try one of these:
APC
XCache
eAccelerator
Memcache
Zend Optimizer
These are among the most widely used PHP caching solutions.

Sounds like you want Redis. It's sort of like memcache, but with disk-based persistence/overflow.
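For reference, the memory-limit and persistence behaviour lives in redis.conf; the directive names below are standard Redis configuration, and the values are purely illustrative:

```conf
maxmemory 2gb                  # cap the in-memory dataset
maxmemory-policy allkeys-lru   # evict least-recently-used keys when the cap is hit
save 900 1                     # snapshot to disk if >= 1 change in 15 minutes
appendonly yes                 # or keep an append-only file for durability
```

With a setup like this, hot data stays in RAM, cold keys get evicted, and the full dataset survives a restart from disk.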

Related

Would file system caching use the same amount of storage space as memory caching?

I'm thinking of moving some of my file system caching (via Phpfastcache) over to memcache (again using Phpfastcache), but I have a question about storage sizes. Potentially a silly question, but:
If I had 80 GB of file cache storage in use, would that equate to needing 80 GB of RAM to store the exact same cache in memory?
I'm thinking they potentially use different compression methods, but I've struggled to find an answer online.
80 GB of cache is huge; I'm surprised myself that Phpfastcache handles it well. If you migrate to Memcache, you can certainly expect serious memory issues on the server.
For that size I recommend more capable backends such as MongoDB/CouchDB/ArangoDB, or maybe Redis too.
If you're just looking for backend storage stats, you can have a look at the API offered by Phpfastcache: Wiki - Cache Statistics
It will return miscellaneous statistics provided by the backends that Phpfastcache supports.
(Note: I'm the author of this library.)
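A short sketch of reading those statistics with Phpfastcache (assuming the v8-style CacheManager API; exact method names may differ between versions, so check the wiki page above):

```php
<?php

use Phpfastcache\CacheManager;

require 'vendor/autoload.php';

// Assumption: the 'files' driver is in use; any supported backend works the same way.
$cache = CacheManager::getInstance('files');

// getStats() returns a DriverStatistic object filled in by the backend.
$stats = $cache->getStats();
echo $stats->getSize(); // approximate storage used by the cache backend
echo $stats->getInfo(); // free-form info string from the backend
```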

Are there big drawbacks to using opcache with file storage instead of RAM?

For some time now, I have been doing hobby projects in PHP again. One of the things I run is a small application that stores some JSON data, and another that does some big calculations.
As we all know, PHP itself is not the best language for big calculations: since every script has to be compiled each time it is called, it's kind of slow (compared with Go or other languages that are compiled ahead of time).
To increase the speed of PHP we can use opcache to compile everything once and keep it in RAM (random access memory), so it loads much more quickly.
But one of the things I found out is that you can use opcache with file storage as well (https://patrickkerrigan.uk/blog/php-opcache-file-cache/), instead of putting it in more expensive RAM (since storage is cheaper than memory).
I understand why you would put it in RAM, since RAM is really quick. But why not use file storage more on servers that don't have much RAM? It's slower, but it will still speed up your application, since the scripts are already compiled to opcodes.
When you are low on memory, multiple PHP scripts held in opcache eat up the RAM, whereas when you store the compiled code on disk that matters less.
I did not find a lot of articles about this. Are there big drawbacks to this approach?
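For reference, the file-based cache described in the linked article is enabled with a couple of php.ini directives (the directory path here is just an example and must exist and be writable by the web server):

```ini
; keep the normal shared-memory opcode cache, and mirror compiled scripts to disk
opcache.enable=1
opcache.file_cache=/var/tmp/opcache
; set this to 1 to skip shared memory entirely and use only the file cache
opcache.file_cache_only=0
```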

Variable caching software (APC, Memcached) performance

You need to cache arbitrary data such as the results of PHP logic within methods, database query calls, and generally any data resulting from a process (not opcode caching).
Which would you choose between third-party caching solutions like APC and Memcached? What makes you prefer those tools over caching your data on your local file system?
Thanks,
Luca
Go with Memcache. It has a lot more support and a larger community (because it can be used from multiple languages). It supports access from multiple servers, so it allows for a more scalable architecture.
That being said, still install APC or another opcode cache for PHP. It will significantly speed up PHP's execution time.
They're both different. APC is a local-machine cache specific to PHP, and memcached is a distributed cache spread across multiple computers. If you're trying to scale your applications, memcached is often preferred. If you're designing for a single server, APC will suit you better.
I personally prefer a combination of both.
Simple answer: Memcache and APC store the data in memory, not on disk. Access time is MUCH faster.
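For comparison, the user-data APIs of the two look like this (a sketch; it assumes the apc and memcached PHP extensions are loaded and a memcached daemon is running on localhost, and the cache key is made up for illustration):

```php
<?php

// APC: local, in-process user cache
apc_store('user:42', $userData, 300); // TTL of 5 minutes
$userData = apc_fetch('user:42');

// Memcached: shared daemon, reachable from several web servers
$m = new Memcached();
$m->addServer('127.0.0.1', 11211);
$m->set('user:42', $userData, 300);
$userData = $m->get('user:42');
```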

Fastest database engine for caching?

I use MySQL for my primary database, where I keep the actual objects. When an object is rendered using a template, rendering takes a lot of time.
Because of that I've decided to cache the produced HTML. Right now I store the cache in files, named appropriately, and it does work significantly faster. I am, however, aware that it is not the best way to do so.
I need a (preferably key-value) database to store my cache in. I cannot use a caching proxy because I still need to process the cached HTML. Is there such a database with a PHP front end?
Edit: If I use memcached, and I cache about a million pages, won't I run out of RAM?
Edit 2: And again, I have a lot of HTML to cache (gigabytes of it).
If I use memcached, and I cache about a million pages, won't I run out of RAM?
Memcached
memcached is also a really solid product (I like Redis more) used at all the big sites to keep them up and running. Almost all active tweets (the ones users fetch) are stored in memcached for insane performance.
If you want to be fast you should have your active dataset in memory. But if the dataset is bigger than your available memory, you should store the data in a persistent datastore such as MySQL (you should always store data in a persistent datastore anyway, because memcached is volatile). When an item is not available in memory, you try to fetch it from the datastore and cache it in memcache for future reference (with an expiry time).
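The read-through pattern described above can be sketched with the Memcache extension and PDO (the table, column, and key names here are made up for illustration, and the connections are assumed to exist):

```php
<?php

// Assumption: a memcached daemon on localhost and a PDO connection $db already exist.
$memcache = new Memcache();
$memcache->connect('127.0.0.1', 11211);

function getItem($memcache, PDO $db, $id)
{
    $key = "item:$id";
    $item = $memcache->get($key);
    if ($item === false) {
        // Cache miss: fall back to the persistent datastore...
        $stmt = $db->prepare('SELECT data FROM items WHERE id = ?');
        $stmt->execute([$id]);
        $item = $stmt->fetchColumn();
        // ...and cache it with an expiry time for future requests.
        $memcache->set($key, $item, 0, 600); // 10-minute TTL
    }
    return $item;
}
```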
Redis
I really like Redis because it is an advanced key-value store with insane performance:
Redis is an advanced key-value store. It is similar to memcached but the dataset is not volatile, and values can be strings, exactly like in memcached, but also lists, sets, and ordered sets. All these data types can be manipulated with atomic operations to push/pop elements, add/remove elements, perform server-side union, intersection, difference between sets, and so forth. Redis supports different kinds of sorting abilities.
Redis has virtual memory support, so you don't need a separate persistent datastore. I really like Redis because of all the available commands (power :)?). This tutorial by Simon Willison shows (a lot of) the raw power which Redis has.
Speed
Redis is pretty fast! 110,000 SETs/second and 81,000 GETs/second on an entry-level Linux box. Check the benchmarks.
Commits
Redis is also more actively developed: eight hours ago antirez (Redis) committed something, versus memcached, whose latest commit was on 12 November.
Install Redis
Redis is insanely easy to install. It has no dependencies. You only have to run:
make
./redis-server redis.conf # start redis
to compile and start Redis (awesome :)?).
Install Memcached
Memcached has a dependency (libevent), which makes it more difficult to install.
wget http://memcached.org/latest
tar -zxvf memcached-1.x.x.tar.gz
cd memcached-1.x.x
./configure
make && make test
sudo make install
Not totally true, because memcached has the libevent dependency and ./configure will fail if libevent is missing. Then again, there are packages, which are convenient, but they require root to install.
Redis is pretty fast: 110,000 SETs/second
If speed is a concern, why use the network layer?
According to: http://tokutek.com/downloads/mysqluc-2010-fractal-trees.pdf
InnoDB inserts ....................43,000 records per second AT ITS PEAK*;
TokuDB inserts ....................34,000 records per second AT ITS PEAK*;
G-WAN KV inserts ....100,000,000 records per second
(*) After a few thousand inserts, performance degrades severely for InnoDB and TokuDB, which end up writing to disk once their cache, the system cache, and the disk controller cache are full. See the PDF for an interesting discussion of the problems caused by the topology of the InnoDB database index (which severely breaks locality, while the fractal-tree topology scales much better... though still not linearly).
To clarify the answers into logical views, you want:
Flat files, which are as fast as the storage medium being used (disk or RAM)
An environment which caches the MRU (most recently used) items in RAM
A smart/fast hash index to all locations (what SQL systems rely on)
That combination will get you the best solution you are looking for.
For argument's sake, flat file or not (excluding a memory-only solution), all engines use some form of flat file. The magic is knowing where your data is and tuning reads to pull the data back optimally. In the '80s at IBM we used a fixed-record-length flat-file design, which wasn't optimized for disk space; it was optimized for I/O. Indexes then were based on record length * ROWID.
Now to your need: your ultimate performance at scale comes from introducing a smart combination. We host over 1 million companies, with over 10 pages per company: 10 million files, plus JS, CSS, and images.
Theory 1) You know your limitation is RAM: spool dynamic content to disk when feasible and drop features such as hit counters. Leverage NGINX or a HIGHLY tuned Apache (or, as we did, write your own web server, which we have done since 2001). The whole concept is to leverage RAM for the MOST USED content and have a very intelligent lookup for disk-based content; normally the URI is fine as the key.
Theory 2) Trend analysis and user anticipation. I have spent years researching and developing systems that track trends. If I know a user will go path A, B, C, D, then when he hits B, I have already prefetched C and D. If I know a user will go A, B but may go E or D, you have the choice to pre-cache C and E, or, for RAM's sake, prefetch only D and manually fetch C or E when the user picks one.
The web server we have developed, along with some accounting systems I have built over the years, integrates Theory 2 for prefetching, combined with smart caching. We also store the content on disk deflated, so the transport layer simply pumps the content onto the stack, as 99% of browsers support deflated streams. (It's faster to inflate before sending for that 1% than to deflate 99% of the time.)
Per the thought of memcached and swap: disk speed is your enemy; however, tying up the kernel to manage that enemy is an epic fail! If you want to beat memcached's performance, learn how to set up a RAM disk and keep your deflated HOT items there!
** DISCLAIMER: This all assumes you have enough bandwidth, i.e. your infrastructure/users' bandwidth is not your bottleneck; your servers are. #3FINC
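A RAM disk as suggested above can be created with tmpfs (the mount point and size here are examples; mounting requires root):

```shell
# one-off mount of a 512 MB RAM-backed filesystem
mkdir -p /mnt/ramcache
mount -t tmpfs -o size=512m tmpfs /mnt/ramcache

# or make it permanent with an /etc/fstab entry:
# tmpfs  /mnt/ramcache  tmpfs  size=512m  0  0
```

Files written under the mount point then live entirely in RAM, so serving deflated hot items from there avoids both disk I/O and the memcached network round trip.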
http://memcached.org/ + http://php.net/manual/en/book.memcache.php
Flat files are "technically" the fastest, but if you're looking for something with a PHP front end that just screams, take a look at Postgres.
http://www.wikivs.com/wiki/MySQL_vs_PostgreSQL#Raw_Speed
For memory caching look at memcached
http://memcached.org/
*Edit: from your edit... (redundant, yes)... if you cache that volume in memory you will have issues. Look into Postgres columnar table queries or a quasi-custom flat-file solution.
As far as I know, using the file system is actually the fastest way to cache rendered templates without resorting to storing them in memory. Any database would simply add overhead and would make the whole thing slower by comparison.
I would use memcached or APC, depending on whether you need the cache shared between servers. Memcached is a daemon you connect to, whereas APC actually lives inside the PHP instance (a little faster). Both of them store the cache in memory, so it's blazing fast.
In fact, storing the cache in files really is the fastest way to do this. But if you're really interested in putting it into a database, you can check out MongoDB. MongoDB is a document-oriented database, so there are no server-side joins; that's why it's faster than MySQL (1. with PHP, 2. there are a lot of benchmarks on the internet).

PHP opcode cached on hard disk?

I have websites developed in PHP, and I'm using an opcode cache.
But because an opcode cache like eAccelerator or APC keeps its cache in RAM, it needs too much RAM.
So I'm looking for any project or technique that caches the PHP opcodes on the hard disk.
Thanks so much.
(My website doesn't generate money, so I'm thinking about a cheaper solution.)
All opcode caches allow you to configure the maximum size of the shared memory used (look for a configuration option with shm, for SHared Memory, in the name, e.g. apc.shm_size), so you can control that they don't use too much RAM.
Some caches also allow you to cache on disk instead of (or in addition to) caching in RAM:
eAccelerator
The question is whether a small amount of shared memory, or a disk-only cache, gains you anything in performance compared to plain PHP without an opcode cache. As always when using a cache, you should benchmark this.
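eAccelerator's disk cache mentioned above is controlled by ini settings along these lines (the values are illustrative; check the eAccelerator documentation for your version):

```ini
eaccelerator.shm_size="8"                        ; MB of shared memory to use
eaccelerator.cache_dir="/var/cache/eaccelerator" ; on-disk cache location
eaccelerator.shm_only="0"                        ; 0 = also cache to disk, 1 = RAM only
```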
