Memcached Storing user accounts - php

I'm curious about speeding up my site using memcache. Currently I have a MySQL table with columns for a key and an email address. Users log in using their key, and I query the database to check that it's correct. The email is used in case they forget their key and want it resent.
Each record is obviously very small (about 19 B, I think). Do you think it would be a good idea to preload all the records (say, 1 million) into Memcached and use MySQL only for keeping a permanent record?

Memcached is intended to be a short-term caching solution that handles key/value pairs. While you could, in theory, use it for something like this, that isn't its intended use, and honestly I don't think you would gain anything from it.
Generally, the best use for Memcached is something like dynamically generated content that you want to keep consistent across a session and accessible from multiple servers (as in a load-balanced environment).

If you are using just one server, Memcached is unlikely to prove useful. Also, memcached is a cache, not a store: when its allotted memory is full, it evicts existing items (least recently used first) to make room for new ones, so a preloaded record can silently disappear. Storing user information only in Memcached is therefore not ideal.
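The usual way to combine the two is cache-aside: MySQL stays the source of truth and the cache only holds rows that were recently looked up. A minimal sketch of that idea, where `ArrayCache` is a hypothetical in-memory stand-in for a memcached client and `$loadFromDb` stands in for the MySQL query (neither name comes from the question):

```php
<?php
// Hypothetical in-memory stand-in for a memcached client (illustration only).
class ArrayCache
{
    private $items = [];

    public function get($key)
    {
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set($key, $value)
    {
        $this->items[$key] = $value;
        return true;
    }
}

// Cache-aside lookup: try the cache first, fall back to the database on a miss.
function findEmailByKey(ArrayCache $cache, callable $loadFromDb, $loginKey)
{
    $cacheKey = 'user:' . $loginKey;
    $email = $cache->get($cacheKey);
    if ($email !== false) {
        return $email;                   // cache hit, no database query
    }
    $email = $loadFromDb($loginKey);     // cache miss: ask the source of truth
    if ($email !== null) {
        $cache->set($cacheKey, $email);  // remember it for the next request
    }
    return $email;
}

// Stand-in for "SELECT email FROM users WHERE login_key = ?".
$dbCalls = 0;
$loadFromDb = function ($loginKey) use (&$dbCalls) {
    $dbCalls++;
    $rows = ['abc123' => 'alice@example.com'];
    return isset($rows[$loginKey]) ? $rows[$loginKey] : null;
};

$cache = new ArrayCache();
$first = findEmailByKey($cache, $loadFromDb, 'abc123');   // miss, goes to the "database"
$second = findEmailByKey($cache, $loadFromDb, 'abc123');  // hit, served from the cache
echo $first, ' / ', $second, ' / db calls: ', $dbCalls, PHP_EOL;
```

Because every miss falls through to the database, an evicted or flushed entry costs one extra query rather than a failed login.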

Related

What is faster? file_exists or MySQL query?

Users in my webgame are having certain player information cached in the $_SESSION of PHP.
Each time they load the game it checks if the session exists, if not they get the player information from a MySQL database and then it gets stored in the $_SESSION.
Now my problem is, what if the player information gets updated by another process or player? They can't update the $_SESSION cache of the other player.
I know memcached is most probably the solution for this, but I'm not sure if I should take the time for something like this. $_SESSION cache is doing well for me, except for this.
I was thinking about creating a MySQL table for it which gets read at every request; if there's a record for the player, the cache is recreated.
One other solution would be to create a file in a directory with the id of the player in the name of the file. On every request PHP would check with file_exists whether it should clear the cache or not.
What would you guys do? It gets executed every request so it's pretty important to get this optimized.
From a design standpoint alone I'd avoid the file_exists and directory approach. Sure, file_exists is fast, but it won't scale well... What happens if a user changes their name?
If you're using APC (and you should), you could use APC's user memory cache. As long as you're on a single server it should give you similar performance benefits to memcached without the need for a separate memory caching server process. If a user entry changes frequently, you could run into fragmentation issues with APC though. In that case, it's time to bite the bullet and go with memcached--you can even store your session data in memcached for a performance boost.
Also, neither APC nor your file_exists solution will scale to multiple load-balanced servers--you'd need a DB solution or memcached for that.
As you describe it, this is not about whether one is faster than the other: the $_SESSION approach is simply not valid because of your concurrency issue.
If your data can change concurrently, then your data storage needs to be able to handle that concurrency, and whatever caching layer you use needs to behave according to the nature of your problem.
If it is only about caching and you don't want to install memcache(d), you can go with an in-memory MySQL table (the MEMORY engine). It is not as fast as memcached, but it is still a fine solution. And make sure to create proper indexes on all your tables (maybe that is the better solution: no cache, just select from your table).
CREATE TABLE t (i INT) ENGINE = MEMORY;
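Whichever shared store is chosen (a MEMORY table, memcached, or APC), one way to let another process invalidate a player's $_SESSION cache is a per-player version number: writers bump it, and each request compares it with the version the session cache was built from. A rough sketch under those assumptions, with plain arrays standing in for the shared store and for $_SESSION, and with hypothetical function names:

```php
<?php
// $shared stands in for storage every process can see (MEMORY table, memcached, APC).
// $session stands in for one player's $_SESSION. Both are plain arrays here.
function bumpVersion(array &$shared, $playerId)
{
    $key = 'player_version:' . $playerId;
    $shared[$key] = (isset($shared[$key]) ? $shared[$key] : 0) + 1;
}

function getPlayer(array &$session, array $shared, $playerId, callable $loadFromDb)
{
    $key = 'player_version:' . $playerId;
    $current = isset($shared[$key]) ? $shared[$key] : 0;
    $stale = !isset($session['player']) || $session['player_version'] !== $current;
    if ($stale) {
        $session['player'] = $loadFromDb($playerId);  // rebuild the session cache
        $session['player_version'] = $current;        // remember which version it reflects
    }
    return $session['player'];
}

// Stand-in for the MySQL players table and the query that reads it.
$db = [7 => ['name' => 'Ann', 'gold' => 10]];
$loads = 0;
$loadFromDb = function ($id) use (&$db, &$loads) {
    $loads++;
    return $db[$id];
};

$shared = [];
$session = [];
getPlayer($session, $shared, 7, $loadFromDb);  // first request: loads from the database
getPlayer($session, $shared, 7, $loadFromDb);  // second request: served from the session

// Another process changes the player and bumps the shared version.
$db[7]['gold'] = 25;
bumpVersion($shared, 7);

$player = getPlayer($session, $shared, 7, $loadFromDb);  // version differs, so reloaded
echo $player['gold'], ' after ', $loads, ' loads', PHP_EOL;
```

The per-request cost is a single small lookup of the version number; the full player row is only re-read when something actually changed.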

What to use instead of memcache

I'm using memcache for caching (obviously), which is great. But I'm also using it as a cross-request/process data store. For instance, I have a web chat on one of my pages and I use memcache to store the list of online users in it. It works great, but it bothers me that if I have to flush the whole memcache server (for whatever reason) I lose the online list. I also use it to keep a record of views for some content (I then periodically update the actual rows in the DB), and if I clear the cache I lose all data about views since the last write to the DB.
So what I'm asking is this: what should I use instead of memcache for this kind of thing? It needs to be fast and preferably store its data in memory. I think some NoSQL product would be a good fit here, but I've no idea which one. I'd like something that I could use for other use cases in the future; analytics come to mind (what users are searching for the most, for instance).
I'm using PHP so it has to have good bindings for it.
Redis! It's like memcache on steroids (the good kind). Here are some drivers.
Redis is an advanced key-value store. It is similar to memcached but the dataset is not volatile, and values can be strings, exactly like in memcached, but also lists, sets, and ordered sets. All these data types can be manipulated with atomic operations to push/pop elements, add/remove elements, perform server-side union, intersection, and difference between sets, and so forth. Redis supports different kinds of sorting.
You could try memcachedb. It uses exactly the same protocol as memcache, but it is a persistent store.
You could also try Cassandra or Redis.

Create a PHP cache system in MySQL database?

I'm creating a web service that often scrapes data from remote web pages. After scraping this data, I have a simple multidimensional array of information to use. The scraping process is fairly taxing on my server, and the page load takes a while. I was considering adding a simple cache system using a MySQL database, where I create one row per remote web page with the array of information pulled from it stored as a JSON-encoded string. Is this a good enough system? Or would something like a text file per web page be a better idea?
Since you're scraping multiple web pages, and you want your data to be persistently cached, you have a few options -- the best of which would be to use memcache or a database such as MySQL. Using text files is not a good idea, because you would have to serialize / deserialize your data, and read from your filesystem. Querying a database or memcache is many times more efficient.
Since you're probably looking for your cache to be somewhat persistent, I would suggest going with MySQL. You would simply create a table that has an auto-incrementing primary key, with a column for each element in your parsed JSON object. (Note that MySQL does not have an array column type. To emulate one, you will need to use relational tables, or serialize your array data and store it in a text field. The former method is preferred.)
Every time you scrape a page, you would insert that page's row, or update it if it already exists (in MySQL, a single INSERT ... ON DUPLICATE KEY UPDATE statement does both). If you put a unique index on whatever you use to identify the page (URL, etc.), look-ups by that value will be fast.
If you're looking to store the cache locally on one server (e.g. if your MySQL server and HTTP server are on the same box), you might be better off using APC, a caching extension for PHP.
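As a rough illustration of such a cache table, the sketch below keys rows by URL and stores the scraped array as JSON, as the question proposes, rather than one column per element. It is only a sketch under stated assumptions: an in-memory SQLite database stands in for MySQL so that it runs on its own, the table, column, and function names are made up, and `REPLACE INTO` (accepted by both MySQL and SQLite) is used as the insert-or-overwrite statement.

```php
<?php
// Assumption: in-memory SQLite stands in for MySQL; names are illustrative only.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE page_cache (
    url        VARCHAR(255) NOT NULL PRIMARY KEY,
    payload    TEXT NOT NULL,
    fetched_at INTEGER NOT NULL
)');

// Store one scraped page; REPLACE inserts the row or overwrites the old one.
function cachePut(PDO $pdo, $url, array $data, $now)
{
    $stmt = $pdo->prepare('REPLACE INTO page_cache (url, payload, fetched_at) VALUES (?, ?, ?)');
    $stmt->execute([$url, json_encode($data), $now]);
}

// Return the cached array, or null if the row is missing or older than $maxAge seconds.
function cacheGet(PDO $pdo, $url, $maxAge, $now)
{
    $stmt = $pdo->prepare('SELECT payload, fetched_at FROM page_cache WHERE url = ?');
    $stmt->execute([$url]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    if ($row === false || $now - (int) $row['fetched_at'] > $maxAge) {
        return null;  // caller should re-scrape and call cachePut()
    }
    return json_decode($row['payload'], true);
}

$now = time();
cachePut($pdo, 'http://example.com/a', ['title' => 'Old'], $now - 100);
cachePut($pdo, 'http://example.com/a', ['title' => 'New'], $now);  // overwrites the row

$fresh = cacheGet($pdo, 'http://example.com/a', 3600, $now);
$expired = cacheGet($pdo, 'http://example.com/a', 3600, $now + 7200);
$count = (int) $pdo->query('SELECT COUNT(*) FROM page_cache')->fetchColumn();
echo $fresh['title'], ' / rows: ', $count, PHP_EOL;
```

Keeping `fetched_at` in the row lets the reader decide how stale is too stale, so the expensive scrape only runs when the cached copy has aged out.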
If you're looking to store the data remotely (e.g. a dedicated cache box) then I would go with Memcache instead of MySQL.
"When all you have is a hammer ..."
I don't tend to have particularly large APC configs, 64 - 128 MB max. Memcache can go to a couple of gigabytes or maybe more (far more if you run multiple instances). Both are also transient: a restart of Apache or of Memcache (the latter is slightly less likely, or less frequent) will lose the data.
It depends, then, on how often you are willing to process the data to produce the cache, and how long that cache could otherwise be useful for. If it is good for weeks before you re-scrape the pages, MySQL is an entirely suitable backing store.
Other potential options, depending on how many items are being cached and how big the data is, are, as you suggest, a file-based cache, SQLite, or other systems.

different databases for handling sessions...am I doing the right thing?

I'm looking for some advice on whether or not I should use a separate database to handle my sessions.
We are writing a web app for multiple users to login and check/update their account specific information. We didn't want to use the file storage method on the webserver for storing session information, so we decided to use a database (MySQL). It's working fine, but I'm wondering about performance when this gets into production.
Currently, we have two databases (rst_sessions, and rst). The "RST" database is where all the tables are stored for the webapp...they are all MYSQL InnoDB using Referential Integrity/foreign keys to link the tables. The "RST_SESSIONS" database simply has one table and all the session information gets stored there.
Here's one of my concerns. In the PHP code, if I want to run a query against "RST" then I have to select that database inside PHP ( $db->select("RST") )...when I'm done with the query I have to re-select "RST_SESSIONS" ( $db->select("RST_SESSIONS") ) or else the session-specific information doesn't get set. So, throughout the webapp the code is doing a lot of selecting and reselecting of the two databases. Is this likely to cause performance issues with a user base of, say, 10,000 - 15,000? Would we be better off moving the RST_SESSIONS table into the RST database to avoid all the selecting?
One reason we initially set things up this way was to be able to store the sessions information on a separate database server so it didn't interfere with the operations of the webapp database.
What are some of the pros and cons of both methods, and what would you suggest we do for performance? Thanks in advance.
If you're worried about performance, another solution would be to not store your sessions in a database at all, but to use something like memcached -- the PHP library for talking to memcached already provides a handler for sessions.
A couple of advantages of using memcached:
No hit to the disk: everything is in RAM.
Of course, this means sessions will be lost if your server crashes; but if a crash happens, you'll probably have bigger troubles than just losing sessions, and this is not likely to happen often.
Used in production by many websites, and works well (I'm using it for a couple of websites).
Better scalability: if you need more RAM or more CPU power for your memcached cluster, just add a couple of servers.
And I would add: once you've started using memcached, you can also use it as a caching mechanism ;-)
Now, to answer your specific questions:
Instead of selecting the DB, I would use two distinct connections:
One for the DB that's used for the application,
And another for the DB that's used for the sessions.
Of course, this means a bit more load on the server (it doubles the number of open connections), but it makes sure that, the day it becomes necessary, you'll be able to move the "session" database to another server: you'll just have to reconfigure a connection string, and as the application already uses two separate connections, it'll still work fine.
If you can live with it, just open a second connection to the database. That way you won't have to switch between databases at all. Of course, now you consume twice as many connections, and may need to bump the limit.
Unless there's some overriding reason to put your auth information in a separate database, why not put it with the rest of your data? You may find it convenient to have everything in one place.
Notice also that you can qualify your table names in your sql queries with a schema (database) name e.g.
SELECT ACTIVE
FROM RST_SESSIONS.SESSION
WHERE SID = ?;  -- bind the session ID as a parameter
This may get you out of the need to switch dbs explicitly, if they're both on the same server.

Efficient rating system

I'm building a news rating script for a website that has a lot of users. I'm trying to make this website as efficient as possible, and now I'm wondering what would be the most efficient way to keep track of the votes. Of course I don't want users to vote more than once.
My first thought was to store it in my MySQL database, but I'm worried this would have a negative influence on my website's speed because this table would get quite big.
Would storing it in a database still be the best solution, or are there any better solutions?
If you plan on having > 1,000,000 records you should make sure the table's structure is efficient (which shouldn't be hard for your example) and that you index it correctly.
Memcached would be the simplest way to implement caching and is easy to scale if your site grows and more servers are necessary.
With a properly indexed vote table, you can keep reasonable performance regardless of how large your table is (of course, beyond a certain point, your tables will be too large to fit in cache, but that would involve having a very large number of users and items).
Add in some per-user caching (on the client, in $_SESSION, or using memcached) and you can get quite a fast "no" response.
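One possible shape for such a properly indexed vote table is a composite key on article and user, which both answers "has this user already voted here?" quickly and rejects a second vote at the database level. The sketch below is an assumption-laden illustration rather than a recommended schema: in-memory SQLite stands in for MySQL so it runs on its own, and the table, column, and function names are invented.

```php
<?php
// Assumption: in-memory SQLite stands in for MySQL; names are illustrative only.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// The composite primary key is the index used for the "already voted?" check,
// and it also makes a second vote by the same user on the same article fail.
$pdo->exec('CREATE TABLE votes (
    article_id INTEGER NOT NULL,
    user_id    INTEGER NOT NULL,
    vote       INTEGER NOT NULL,
    PRIMARY KEY (article_id, user_id)
)');

// Returns true if the vote was recorded, false if the insert was rejected.
function castVote(PDO $pdo, $articleId, $userId, $vote)
{
    try {
        $stmt = $pdo->prepare('INSERT INTO votes (article_id, user_id, vote) VALUES (?, ?, ?)');
        $stmt->execute([$articleId, $userId, $vote]);
        return true;
    } catch (PDOException $e) {
        // Treated as a duplicate-key violation here; real code should inspect the error code.
        return false;
    }
}

$first = castVote($pdo, 42, 7, 1);
$second = castVote($pdo, 42, 7, -1);  // same user, same article: rejected
$other = castVote($pdo, 42, 8, 1);    // different user: accepted
$score = (int) $pdo->query('SELECT SUM(vote) FROM votes WHERE article_id = 42')->fetchColumn();
var_dump($first, $second, $other, $score);
```

Letting the key enforce "one vote per user per article" means the application does not need a separate SELECT before each INSERT.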
Since you can't use memcached I would say this. A decent database server ( decent hardware + decent db implementation) should be able to handle this quite well. A single table with a physical index on article-id and a second entry representing the vote will handle a few googillion (yes I made up the word) articles easily :P
Rationale :
Database servers maintain statistics -- read: self-tuning -- and only hot items (index + row-entries) remain in-memory.
Moral:
Don't worry about such things unless they become a problem -- i.e., if your company is the size of Facebook I would worry.
Memcached would be a very good way to do this. You need to synchronize from memcached to MySQL once in a while (I would do this using the pull model, with a cron script on your MySQL server).
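The periodic synchronization described here could look roughly like the following: requests only bump an in-memory counter, and a scheduled job applies the accumulated deltas to the database in one transaction. This is a sketch under assumptions, not the poster's implementation: a plain array stands in for memcached, in-memory SQLite stands in for MySQL, and all names are hypothetical.

```php
<?php
// Assumption: $counters stands in for memcached, in-memory SQLite for MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE article_scores (
    article_id INTEGER NOT NULL PRIMARY KEY,
    score      INTEGER NOT NULL
)');
$pdo->exec('INSERT INTO article_scores (article_id, score) VALUES (1, 10)');
$pdo->exec('INSERT INTO article_scores (article_id, score) VALUES (2, 0)');

// Request path: only touch the fast in-memory counter.
function recordVote(array &$counters, $articleId, $delta)
{
    $counters[$articleId] = (isset($counters[$articleId]) ? $counters[$articleId] : 0) + $delta;
}

// Cron path: apply the accumulated deltas in one transaction, then reset them.
function flushVotes(PDO $pdo, array &$counters)
{
    $pdo->beginTransaction();
    $stmt = $pdo->prepare('UPDATE article_scores SET score = score + ? WHERE article_id = ?');
    foreach ($counters as $articleId => $delta) {
        $stmt->execute([$delta, $articleId]);
    }
    $pdo->commit();
    $counters = [];
}

$counters = [];
recordVote($counters, 1, 1);
recordVote($counters, 1, 1);
recordVote($counters, 2, -1);
flushVotes($pdo, $counters);

$scores = $pdo->query('SELECT article_id, score FROM article_scores ORDER BY article_id')
              ->fetchAll(PDO::FETCH_KEY_PAIR);
print_r($scores);
```

With a real memcached server, any deltas not yet flushed are lost if the cache is cleared or restarted, so the flush interval bounds how much vote data can go missing.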
Did you see this?
http://destiney.com/php#Destiney_rated_images
Demo here: http://ratedsite.com/
