What to use instead of memcache - php

I'm using memcache for caching (obviously), which is great. But I'm also using it as a cross-request/process data store. For instance, I have a web chat on one of my pages and I use memcache to store the list of online users in it. That works great, but it bothers me that if I have to flush the whole memcache server (for whatever reason) I lose the online list. I also use it to keep a record of views for some content (I then periodically update the actual rows in the DB), and if I clear the cache I lose all view data since the last write to the DB.
So what I'm asking is this: what should I use instead of memcache for this kind of thing? It needs to be fast and preferably store its data in memory. I think some NoSQL product would be a good fit here, but I've no idea which one. I'd like to use something I could also use for other cases in the future; analytics come to mind (what users search for the most, for instance).
I'm using PHP, so it needs to have good PHP bindings.

Redis! It's like memcache on steroids (the good kind). Here are some drivers.
Redis is an advanced key-value store. It is similar to memcached, but the dataset is not volatile, and values can be strings, exactly as in memcached, but also lists, sets, and sorted sets. All of these data types can be manipulated with atomic operations to push/pop elements, add/remove elements, perform server-side union, intersection, and difference between sets, and so forth. Redis also supports different kinds of sorting.
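The online-user list and view counters from the question map directly onto Redis sets and counters (SADD/SREM/SMEMBERS, INCRBY). Here is a minimal sketch of that pattern; the key names are illustrative, and a tiny array-backed stand-in mimics the commands so the sketch runs without a live server (with the phpredis extension, the same calls would go against a `Redis` instance instead):

```php
<?php
// Array-backed stand-in mimicking the few Redis commands used below.
// With phpredis installed, replace this with:
//   $r = new Redis(); $r->connect('127.0.0.1', 6379);
class MiniRedis {
    private array $sets = [];
    private array $counters = [];

    public function sAdd(string $key, string $member): void {
        $this->sets[$key][$member] = true;
    }
    public function sRem(string $key, string $member): void {
        unset($this->sets[$key][$member]);
    }
    public function sMembers(string $key): array {
        return array_keys($this->sets[$key] ?? []);
    }
    public function incrBy(string $key, int $n): int {
        return $this->counters[$key] = ($this->counters[$key] ?? 0) + $n;
    }
}

$r = new MiniRedis();

// Online list: add users when they join the chat, remove them on leave.
$r->sAdd('chat:online', 'alice');
$r->sAdd('chat:online', 'bob');
$r->sRem('chat:online', 'alice');

// View counter: increment per hit; a cron job can periodically
// flush the value into the real MySQL row.
$r->incrBy('views:article:42', 1);
$views = $r->incrBy('views:article:42', 1);

print implode(',', $r->sMembers('chat:online')) . "\n"; // bob
print $views . "\n";                                    // 2
```

Unlike memcache, Redis can snapshot this data to disk (RDB/AOF persistence), which is exactly what the question is after: a flush or restart no longer wipes the online list.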

You could try MemcacheDB. It uses exactly the same protocol as memcache, but it's a persistent store.
You could also try Cassandra or Redis.

Related

MongoDB as MySQL cache

I just had this idea and think it's a good solution for this problem, but I'm asking whether there are downsides to this method. I have a webpage that queries the database often, as many as 3-5 queries per page load. Each query makes a dozen (literally) joins, and the results of each of these queries are then used for further queries to construct PHP objects. Needless to say, the load times are ridiculous even on the cloud, but that's how it works now.
I thought about storing the already constructed objects as JSON, or in MongoDB's BSON format. Would it be a good solution to use MongoDB as a cache engine of this type? Here is how I think it would work:
When the user opens the page, if there is no data in Mongo with the proper ID, the MySQL queries fire, each returning data that is converted into a properly constructed object. The object is sent to the views, converted to JSON, and saved in Mongo.
If there is data in Mongo with the corresponding ID, it is sent to PHP and converted back into an object.
When some of the data changes in MySQL (an administrator edits/deletes content), a delete function fires that deletes the corresponding object in MongoDB as well.
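The three steps above are the classic cache-aside pattern. A minimal sketch in PHP: the function and key names are illustrative, and an in-memory array stands in for MongoDB/Redis so the sketch runs on its own; in practice the array would be replaced by calls to the actual store.

```php
<?php
// Cache-aside: return the cached object if present, otherwise build it
// from the (expensive) MySQL queries and store the JSON for next time.
function loadPage(string $id, array &$cache, callable $buildFromMysql): array {
    $key = "page:$id";
    if (isset($cache[$key])) {                 // cache hit: skip MySQL entirely
        return json_decode($cache[$key], true);
    }
    $object = $buildFromMysql($id);            // cache miss: run the joins
    $cache[$key] = json_encode($object);       // store for subsequent requests
    return $object;
}

// Invalidation: when an admin edits/deletes content, drop the cached copy.
function invalidatePage(string $id, array &$cache): void {
    unset($cache["page:$id"]);
}

// Demo with an in-memory array standing in for the cache store.
$cache = [];
$hits = 0;
$builder = function (string $id) use (&$hits): array {
    $hits++;                                   // counts real MySQL round-trips
    return ['id' => $id, 'title' => "Post $id"];
};
loadPage('7', $cache, $builder);           // miss: builds and stores
$page = loadPage('7', $cache, $builder);   // hit: served from cache, $hits stays 1
```

The pattern is the same whether the backing store is MongoDB, Redis, or memcached; only the get/set/delete calls change.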
Is this a good way to use MongoDB? What are the downsides of this method? Would Redis be better for this task? I also need NoSQL for other parts of the project, which is why I'm considering one of these two instead of memcache.
MongoDB as a cache for frequent joins and queries from MySQL has some information, but it's totally irrelevant.
I think you would be better off using memcached or Redis to cache the query results: MongoDB is more of a full database, while both memcached and Redis are optimized for caching.
However, you could implement your cache as a two-level cache. Memcached, for example, does not guarantee that data will stay in the cache (it may evict data when storage is full). This makes it hard to implement a system of tags (where, for example, you tag query results with a MySQL table name, and can then trigger expiration of all query results associated with that table). A common solution is to use memcached as the first-level cache, with a second, slower but more reliable cache behind it (which should still be faster than MySQL). MongoDB could be a good candidate for that, as long as you keep the MongoDB queries simple.
Well, you can go with memcached or Redis for caching objects. MongoDB can also be used as a cache; I use MongoDB for caching aggregation results, since unlike memcached it has the advantage of supporting a wide range of queries.
For example, in a tagging application, if I have to display the page count corresponding to each tag, the GROUP BY query scans the whole table. So I have a cronjob that computes that GROUP BY query and caches the aggregation result in Mongo. This works perfectly well for me in production. You can do the same for countless other complex computations.
Also, MongoDB's capped collections and TTL indexes are perfect for caching.

Is there a way of keeping database data in PHP while server is running?

I'm making a website that (essentially) lets the user submit a word, matches it against a MySQL database, and returns the closest match found. My current implementation is that whenever the user submits a word, the PHP script is called, it reads the database information, scans each word one-by-one until a match is found, and returns it.
I feel like this is very inefficient. I'm about to make a program that stores the list of words in a tree structure for much more effective searching. If there are tens of thousands of words in the database, I can see the current implementation slowing down quite a bit.
My question is this: instead of having to write another, separate program, and use PHP to just connect to it with every query, can I instead save an entire data tree in memory with just PHP? That way, any session, any query would just read from memory instead of re-reading the database and rebuilding the tree over and over.
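For reference, the kind of tree the question describes could be a trie keyed on letters; a minimal PHP sketch (exact-match lookup only; prefix or closest-match searching would extend `lookup`, and all names here are illustrative):

```php
<?php
// Build a trie from a word list; each node is a nested array keyed by
// letter, with '$' marking the end of a complete word.
function buildTrie(array $words): array {
    $trie = [];
    foreach ($words as $word) {
        $node = &$trie;
        foreach (str_split($word) as $ch) {
            $node = &$node[$ch];       // auto-creates the child slot
            if ($node === null) {
                $node = [];
            }
        }
        $node['$'] = true;             // mark a complete word
        unset($node);                  // break the reference between words
    }
    return $trie;
}

// Walk the trie one letter at a time: O(length of word),
// not O(number of words) like the scan-every-row approach.
function lookup(array $trie, string $word): bool {
    $node = $trie;
    foreach (str_split($word) as $ch) {
        if (!isset($node[$ch])) {
            return false;
        }
        $node = $node[$ch];
    }
    return isset($node['$']);
}

$trie = buildTrie(['cat', 'car', 'dog']);
var_dump(lookup($trie, 'car')); // true
var_dump(lookup($trie, 'ca'));  // false: a prefix, not a full word
```

The built tree could then be `serialize()`d into an external store (such as the memcached instance suggested below) so it survives between requests instead of being rebuilt on every page load.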
I'd look into running an instance of memcached on your server. http://www.memcached.org.
You should be able to store the compiled tree of data in memory there and retrieve it for use in PHP. You'll have to load it into PHP to perform your search, though, as well as architect a way for the tree in memcached to be updated when the database changes (assuming the word list can be updated, since there's not a good reason to store it in a database otherwise).
Might I suggest looking at the MEMORY storage engine in MySQL: http://dev.mysql.com/doc/refman/5.0/en/memory-storage-engine.html
You can then still use MySQL's search features on fast in-memory data.
PHP really isn't a good language for large memory structures. It's just not very memory efficient and it has a persistence problem, as you are asking about. Typically with PHP, people would store the data in some external persistent data store that is optimized for quick retrieval.
Usually people use a two fold approach:
1) Store data in the database as optimized as possible for standard queries
2) Cache results of expensive queries in memcached
If you are dealing with a lot of data that cannot be indexed easily by relational databases, then you'd probably need to roll your own daemon (e.g., written in C) that kept a persistent copy of the data structure in memory for fast querying capabilities.

Memcached Storing user accounts

I'm curious about speeding up my site using memcache. Currently I have a MySQL table with columns for a key and an email address; users log in using their key, and I query the database to check whether it's correct. The email is used in case they forget their key and want it resent.
Now, obviously each record is very small (about 19 B, I think). Do you think it would be a good idea to preload all the records (say, 1 million) into memcached and use MySQL only as the permanent record?
Memcached is intended to be a short-term caching solution for key=value pairs. While you could, in theory, use it for something like this, it isn't the intended use, and honestly I don't think you'd really gain anything from it.
Generally the best use for memcached is dynamically generated content that you want to keep consistent across a session and accessible from multiple servers (like a load-balanced environment).
If you are using just one server, then memcached won't prove very useful. Also, when the memory allotted to memcached is completely used, memcached evicts the least recently used items (LRU). Storing user account information in memcached isn't ideal.

When to use Redis instead of MySQL for PHP applications?

I've been looking at Redis. It looks very interesting. But from a practical perspective, in what cases would it be better to use Redis over MySQL?
Ignoring the whole NoSQL vs SQL debate, I think the best approach is to combine them. In other words, use MySQL for some parts of the system (complex lookups, transactions) and Redis for others (performance, counters, etc.).
In my experience, performance issues related to scalability (lots of users...) eventually force you to add some kind of cache to remove load from the MySQL server, and Redis/memcached is very good at that.
I am no Redis expert, but from what I've gathered, the two are pretty different. Redis:
Is not a relational database (no fancy data organisation)
Stores everything in memory (faster, less space, probably less safe in case of a crash)
Is less widely deployed on various webhosts (if you're not hosting yourself)
I think you might want to use Redis for when you have a small-ish quantity of data that doesn't need the relational structure that MySQL offers, and requires fast access. This could for example be session data in a dynamic web interface that needs to be accessed often and fast.
Redis could also be used as a cache for MySQL data that is accessed very often (e.g., load it when a user logs in).
I think you're asking the question the wrong way around, you should ask yourself which one is more suited to an application, rather than which application is suited to a system ;)
MySQL is a relational data store. If configured properly (e.g., using InnoDB tables), MySQL is a reliable data store offering ACID transactions.
Redis is a NoSQL database. It is faster (if used correctly) because it trades reliability for speed (it is rare to run it with fsync on every write, as this dramatically hurts performance) and transactions (which can be approximated, slowly, with SETNX).
Redis has some very neat features such as sets, lists and sorted sets.
These slides on Redis list statistics gathering and session management as examples. There is also a Twitter clone written with Redis as an example, but that doesn't mean Twitter uses Redis (Twitter uses MySQL with heavy memcache caching).
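The SETNX approximation mentioned above boils down to "set the key only if it does not exist yet", which gives a simple lock. A sketch of the pattern, array-backed here so it runs standalone; with phpredis it would be `$redis->setnx($key, $token)` plus an expiry, and the key name is illustrative:

```php
<?php
// SETNX semantics: write the key only if it is absent;
// return whether this caller "won" the race.
function setnx(array &$store, string $key, string $value): bool {
    if (array_key_exists($key, $store)) {
        return false;          // someone else holds the lock
    }
    $store[$key] = $value;
    return true;               // we acquired the lock
}

// Two workers race for the same lock; only the first acquires it.
$store = [];
$first  = setnx($store, 'lock:account:9', 'worker-a');
$second = setnx($store, 'lock:account:9', 'worker-b');

if ($first) {
    // ... critical section: update the account ...
    unset($store['lock:account:9']);   // release the lock
}
```

In a real deployment the lock key gets an expiry (Redis's `SET key value NX EX ttl` does both atomically) so a crashed worker cannot hold the lock forever.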
MySQL -
1) Structured data
2) ACID
3) Heavy transactions and lookups
Redis -
1) Unstructured data
2) Simple and quick lookups (e.g., a session token)
3) Use as a caching layer
Redis, SQL (and NoSQL) each have their benefits and drawbacks, and often live side by side:
Redis - local variables moved to a separate application
Easy to move to from local variables/a prototype
Persistent storage
Multiple users/applications all see the same data
Scalability
Failover
(-) Hard to do more advanced queries/questions on the data
NoSQL
Dump raw data into the "database"
All/most of Redis features
(-) Harder to do advanced queries, compared to SQL
SQL
Advanced queries between data
All/most of Redis features
(-) Need to place data into a "schema" (think of an Excel sheet)
(-) A bit harder to get simple values in/out than Redis/NoSQL
(Different SQL/NoSQL solutions vary. You should read up on the CAP theorem and ACID for why one system can't give you all of these simultaneously.)
According to the official website, Redis is an open source (BSD licensed) in-memory data structure store, used as a database, cache and message broker. In practice, Redis is an advanced key-value store. It is extremely fast, with impressively high throughput: it can perform approximately 110,000 SETs and about 81,000 GETs per second. It also supports a very rich set of data types. Redis keeps its data in memory but can also persist it to disk, so it comes with a trade-off: amazing speed, with dataset size limited by available memory. In this article, to have some benchmarks in comparison to MySQL, we use Redis as a caching engine only.
Read Here: Redis vs MySQL Benchmarks

Create a PHP cache system in MySQL database?

I'm creating a web service that often scrapes data from remote web pages. After scraping this data, I have a simple multidimensional array of information to use. The scraping process is fairly taxing on my server, and the page load takes a while. I was considering adding a simple cache system using a MySQL database, where I create one row per remote web page, with the array of information pulled from it stored as a JSON-encoded string. Is this a good enough system, or would something like one text file per web page be a better idea?
Since you're scraping multiple web pages and you want your data to be persistently cached, you have a few options, the best of which would be to use memcache or a database such as MySQL. Using text files is not a good idea, because you would have to serialize/deserialize your data and read from your filesystem; querying a database or memcache is many times more efficient.
Since you're probably looking for your cache to be somewhat persistent, I would suggest going with MySQL. You would simply create a table with an auto-incrementing primary key and a column for each element of your parsed JSON object. (Note that MySQL does not support array columns. To emulate them, you will need to use relational tables, or serialize your array data and store it in a text field. The former method is preferred.)
Every time you scrape a page, you would run an UPDATE statement to refresh that page's row in the database. If you put a unique index on whatever uniquely identifies the page (URL, etc.), you will get optimal look-up performance.
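A sketch of such a cache table, using SQLite through PDO so it runs self-contained (table and function names are illustrative); under MySQL the upsert would be `INSERT ... ON DUPLICATE KEY UPDATE` instead, with the same unique index on the URL:

```php
<?php
// Cache table keyed by URL; the UNIQUE index gives fast look-ups
// and lets the upsert replace a stale row in place.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE page_cache (
    url        TEXT NOT NULL UNIQUE,
    payload    TEXT NOT NULL,      -- JSON-encoded scrape result
    fetched_at INTEGER NOT NULL
)');

// Store (or refresh) the scraped data for a page.
function cachePage(PDO $db, string $url, array $data): void {
    $stmt = $db->prepare(
        'INSERT INTO page_cache (url, payload, fetched_at)
         VALUES (:url, :payload, :t)
         ON CONFLICT(url) DO UPDATE
             SET payload = excluded.payload, fetched_at = excluded.fetched_at'
    );
    $stmt->execute([':url' => $url, ':payload' => json_encode($data), ':t' => time()]);
}

// Fetch the cached data, or null on a cache miss.
function cachedPage(PDO $db, string $url): ?array {
    $stmt = $db->prepare('SELECT payload FROM page_cache WHERE url = :url');
    $stmt->execute([':url' => $url]);
    $row = $stmt->fetchColumn();
    return $row === false ? null : json_decode($row, true);
}

cachePage($db, 'https://example.com/a', ['title' => 'old']);
cachePage($db, 'https://example.com/a', ['title' => 'new']); // re-scrape: upsert
$page = cachedPage($db, 'https://example.com/a');
```

Storing the whole parsed array as one JSON column (rather than one column per element) keeps the schema stable even when the scraped structure changes; the trade-off is that you can't query inside the payload with SQL.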
If you're looking to store the cache locally on one server (e.g., if your MySQL server and HTTP server are on the same box), you might be better off using APC, a caching extension for PHP.
If you're looking to store the data remotely (e.g. a dedicated cache box) then I would go with Memcache instead of MySQL.
"When all you have is a hammer ..."
I don't tend to have particularly large APC configs, 64-128 MB max. Memcache can go to a couple of gigabytes or more (far more if you run multiple instances). Both are also transient: a restart of Apache, or of memcache (though the latter restarts less often), will lose the data.
It depends, then, on how often you are willing to process the data to produce the cache, and how long that cache would otherwise stay useful. If it's good for weeks before you re-scrape the pages, MySQL is an entirely suitable backing store.
Potential other options, depending on how many items are being cached and how big the data is, are, as you suggest, a file-based cache, SQLite, or other systems.
