I used to clear the cache with flush() in PHP every day, around 5-10 times.
I want to clear only specific cached items instead of flushing the entire server cache.
This is what I plan to do:
Use Memcache::delete() to delete the item
Call Memcache::add() to store the same item again
Is it correct?
https://www.php.net/manual/en/book.memcache.php
Yes, that is correct.
But be aware that Memcache::add() will fail if the key already exists.
If you always want to write the data even though it exists already, you can use Memcache::set() instead.
Another slightly quirky thing about the PHP Memcache class is that the TTL is measured in seconds, but if it is larger than 30 days (2,592,000 seconds) it is instead interpreted as an absolute date stamp (Unix timestamp).
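The 30-day rule above can be handled in one place. This is a minimal sketch, not part of the Memcache API: the helper name memcacheExpire() and the constant are hypothetical, and it assumes you always think in "seconds from now".

```php
<?php
// Expiry values above 30 days are treated as absolute Unix timestamps.
const MEMCACHE_RELATIVE_TTL_LIMIT = 60 * 60 * 24 * 30; // 2,592,000 seconds

/**
 * Hypothetical helper: convert "seconds from now" into a value that is
 * safe to pass as the expire argument.
 */
function memcacheExpire(int $secondsFromNow, ?int $now = null): int
{
    $now = $now ?? time();
    if ($secondsFromNow > MEMCACHE_RELATIVE_TTL_LIMIT) {
        // Too large for a relative TTL: send an absolute timestamp instead.
        return $now + $secondsFromNow;
    }
    return $secondsFromNow;
}
```

With the Memcache class the result would go into the fourth argument, for example $memcache->set($key, $value, 0, memcacheExpire(3600)).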
I am using CI's sessions with a database. All of our sessions are stored in the ci_sessions table, which can accumulate a lot of rows, considering that the session_id keeps changing every 5 minutes.
Do we need to empty the table, say once a month or once a week?
While what @Marc-Audet said is true, if you take a look at the code, you can see it is a really lousy way to clean up sessions.
The constructor calls the _sess_gc function every time the class is instantiated, so basically on each request to your server if you have it autoloaded.
Then, it generates a random number below 100 and sees if that's below a certain value (by default it is 5). If this condition is met, then it will remove any rows on the session table with last_activity value less than current time minus your session expiration.
While this works for most cases, it is technically possible that (if the world is truly random) the random number generator does not generate a number below 5 for a long time, in which case, your sessions will not be cleaned up.
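To put a number on that risk, here is a small sketch. It assumes each request independently triggers cleanup with a 5 in 100 chance, as described above; the function name is made up for illustration.

```php
<?php
/**
 * Probability that a per-request cleanup with the given chance
 * never fires across $requests independent requests.
 */
function probabilityGcNeverRuns(float $chancePerRequest, int $requests): float
{
    return (1.0 - $chancePerRequest) ** $requests;
}

foreach ([10, 100, 1000] as $n) {
    printf("%5d requests: %.6f\n", $n, probabilityGcNeverRuns(0.05, $n));
}
```

Under that assumption the chance of no cleanup drops below 1% after 100 requests, so the real problem is low-traffic sites and long expiry times rather than bad luck.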
Also, if you have your session expiry time set to a long time (if you set to 0, CI will set it to 2 years) then those rows are not going to get deleted anyway. And if your site is good enough to get a decent amount of visitors, your DBA will be pointing fingers at the session table some time soon :)
It works for most cases - but I would not call it a proper solution. Their session id regeneration really should have been built to remove the records pertaining to the previous ids and the garbage collection really should not be left to a random number - in theory, it is possible that the required number is not generated as frequently as you wished.
In our case, I have removed the session garbage collection from the session library and I manually take care of it once a day (with a cron job and a reasonable session expiration time). This reduces the number of unnecessary hits to the DB and also does not leave a massive table in the DB. It is still a big table, but a lot smaller than it used to be.
Given the fact that the OP question doesn't have a CodeIgniter 2 tag, I'll answer how to deal with sessions cleanup when the database keeps growing for CodeIgniter 3.
Issue:
When you set the sess_expiration key too high (let's say 1 year) and the sess_time_to_update key low (let's say 5 minutes) in the config.php file, the session table will keep growing as users browse through your website, until the session rows expire and are garbage collected (which takes 1 year).
Solution:
Setting the sess_regenerate_destroy key to TRUE (the default is FALSE) deletes the old session when the session ID is regenerated, thus cleaning your table automatically.
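As a rough illustration, the relevant part of config.php might look like the following. The numeric values and the table name are assumptions chosen for the example, not recommendations.

```php
<?php
// application/config/config.php (CodeIgniter 3); values are illustrative.
$config['sess_driver'] = 'database';
$config['sess_save_path'] = 'ci_sessions';  // table name for the database driver
$config['sess_expiration'] = 7200;          // seconds until a session expires
$config['sess_time_to_update'] = 300;       // regenerate the session ID every 5 minutes
$config['sess_regenerate_destroy'] = TRUE;  // delete the old session row on regeneration
```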
No, CodeIgniter cleans up after itself...
Note
According to the CodeIgniter documentation:
The Session class has built-in garbage collection which clears out expired sessions so you do not need to write your own routine to do it.
CodeIgniter's Session Class probably checks the session table and cleans up expired entries. However, the documentation does not say when the clean up happens. Since there are no cron jobs as part of CodeIgniter, the clean up must occur when the Session class is invoked. I suppose if the site remains idle forever, the session table will never be cleared. But, this would be an unusual case.
CodeIgniter implements the SessionHandlerInterface (see the docs for the custom driver).
CodeIgniter defines a garbage collector method named gc() for each driver (database, file, redis, etc) or you can define your custom gc() for your custom driver.
The gc() method is passed to PHP with the session_set_save_handler() function, therefore the garbage collector is called internally by PHP based on session.gc_divisor, session.gc_probability settings.
For example, with the following settings:
session.gc_probability = 1
session.gc_divisor = 100
There is a 1% chance that the garbage collector process starts on each request.
So, you do not need to clean the session table if your settings are properly set.
When you call:
$this->session->sess_destroy();
It deletes that session's data from the database by itself.
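On PHP 7 setups, probability-based GC may effectively be disabled, depending on your php.ini: some distribution packages (Debian and Ubuntu, for example) ship with session.gc_probability = 0, and the documentation at https://www.php.net/manual/en/function.session-gc.php recommends running GC periodically instead. I stumbled upon this because a legacy application suddenly stopped working: it hit a system limit because sessions were never cleaned up. A cron job to clean up the sessions would be a good idea...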
It is always good practice to clear the table. Otherwise, if you're querying the session data for, say, creating reports, it will be slow and unreliable. So, even allowing for MySQL's performance, yes, do so.
From the memcached wiki:
When the table is full, subsequent inserts cause older data to be purged in least recently used (LRU) order.
I have the following questions:
Which data will be purged? The one which is oldest by insertion, or the one which is least recently used? For example, if d1 is the oldest item by insertion but was accessed recently, and the cache is full, will d1 be the one that gets replaced?
I am using PHP for interacting with memcached. Can I have control over how data is replaced in memcached? Like I do not want some of my data to get replaced until it expires even if the cache is full. This data should not be replaced instead other data can be removed for insertion.
When data is expired is it removed immediately?
What is the impact of the number of keys stored on memcached performance?
What is the significance of -k option in memcached.conf? I am not able to understand what "lock down all paged memory" means. Also, the description in README is not sufficient.
When memcached needs to store new data in memory, and the memory is already full, what happens is this:
memcached searches for a suitable* expired entry, and if one is found, it replaces the data in that entry. This answers point 3): data is not removed immediately; instead the space is reallocated when new data needs to be set
if no expired entry is found, the one that is least recently used is replaced
*Keep in mind how memcached deals with memory: it allocates blocks of different sizes, so the size of the data you are going to set in the cache plays a role in deciding which entry is removed. The entries are 2K, 4K, 8K, 16K... etc up to 1M in size.
All this information can be found in the documentation, so just read it carefully. As @deceze says, memcached does not guarantee that the data will be available in memory and you have to be prepared for a cache miss storm. One interesting approach to avoid a miss storm is to set the expiration time with some random offset, say 10 + [0..10] minutes, which means some items will be stored for 10 minutes and others for up to 20 minutes (the goal is that not all of the items expire at the same time).
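The random-offset idea can be sketched as follows. The helper name jitteredTtl() is hypothetical; the values mirror the "10 + [0..10] minutes" example.

```php
<?php
/**
 * Hypothetical helper: base TTL plus a random offset, so that items
 * cached at the same moment do not all expire together.
 */
function jitteredTtl(int $baseSeconds, int $maxJitterSeconds): int
{
    return $baseSeconds + random_int(0, $maxJitterSeconds);
}

// 10 minutes plus up to 10 more minutes of jitter.
$ttl = jitteredTtl(600, 600);
```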
And if you want to preserve something in the cache, you have to do two things:
a warm-up script, that asks cache to load the data. So it is always recently used
2 expiration times for the item: one real expiration time, let's say in 30 minutes; another - cached along with the item - logical expiration time, let's say in 10 minutes. When you retrieve the data from the cache, you check the logical expiration time and if it is expired - reload data and set it in the cache for another 30 minutes. In this way you'll never hit the real cache expiration time, and the data will be periodically refreshed.
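The two-expiry scheme can be sketched like this. To stay self-contained it uses a plain array as a stand-in for memcached and takes the current time as a parameter; all function names are invented for the example.

```php
<?php
/** Store the value together with the time at which it should be refreshed. */
function wrapWithLogicalExpiry($value, int $logicalTtl, int $now): array
{
    return ['value' => $value, 'refresh_at' => $now + $logicalTtl];
}

function needsRefresh(array $entry, int $now): bool
{
    return $now >= $entry['refresh_at'];
}

/**
 * Read through the cache, reloading once the logical expiry has passed.
 * $cache is a plain array standing in for memcached (an assumption).
 */
function getWithLogicalExpiry(array &$cache, string $key, callable $loader, int $logicalTtl, int $now)
{
    $entry = $cache[$key] ?? null;
    if ($entry === null || needsRefresh($entry, $now)) {
        $entry = wrapWithLogicalExpiry($loader(), $logicalTtl, $now);
        // With a real client, this is where the entry would be stored
        // using the longer "real" expiration time (e.g. 30 minutes).
        $cache[$key] = $entry;
    }
    return $entry['value'];
}
```

As long as the item is requested at least once between the logical and the real expiry, the real expiry is never reached.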
5) What is the significance of -k option in "memcached.conf". I am not able to understand what does "Lock down all paged memory" means. Also description in README is also not sufficient.
No matter how much memory you allocate for memcached, it only uses the amount it needs, i.e. memory is allocated as items are actually stored. The -k option makes memcached lock its memory pages (via mlockall()), so the operating system will not swap them out to disk. This is why the option's description warns that it can be dangerous with large caches.
I'll most probably be using MemCache for caching some database results.
As I have never written any caching before, I thought it would be a good idea to ask those of you who have. The system I'm writing may have concurrently running scripts at some point. This is what I'm planning on doing:
I'm writing a banner exchange system.
The information about banners is stored in the database.
There are different sites, with different traffic, loading a PHP script that generates code for those banners (so that the banners are displayed on the client's site).
When a banner is displayed for the first time, it gets cached with memcache.
The banner has a cache life time for example 1 hour.
Every hour the cache is renewed.
The potential problems I see in this task are at steps 4 and 6.
If we have, for example, 100 sites with big traffic, the script may have several instances running simultaneously. How could I guarantee that when the cache expires it'll get regenerated once and the data will be intact?
How could I guarantee that when the cache expires it'll get regenerated once and the data will be intact?
The approach to caching I take is, for lack of a better word, a "lazy" implementation. That is, you don't cache something until you retrieve it once, with the hope that someone will need it again. Here's the pseudo code of what that algorithm would look like:
// returns false if there is no value or the value is expired
result = cache_check(key)
if (!result)
{
    result = fetch_from_db()
    // set it for next time, until it expires anyway
    cache_set(key, result, expiry)
}
This works pretty well for what we want to use it for, as long as you use the cache intelligently and understand that not all information is the same. For example, in a hypothetical user comment system, you don't need an expiry time because you can simply invalidate the cache whenever a new user posts a comment on an article, so the next time comments are loaded, they're recached. Some information however (weather data comes to mind) should get a manual expiry time since you're not relying on user input to update your data.
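A runnable version of that lazy pattern might look like the sketch below. ArrayCache is a made-up in-memory stand-in for a real cache client, and the current time is passed in explicitly so the behaviour is easy to follow; a TTL of 0 covers the "no expiry, invalidate manually" case.

```php
<?php
/** Minimal in-memory stand-in for a cache client (an assumption for this sketch). */
final class ArrayCache
{
    private $items = [];

    /** Returns false on a miss or when the entry has expired. */
    public function get(string $key, int $now)
    {
        if (!isset($this->items[$key])) {
            return false;
        }
        [$value, $expiresAt] = $this->items[$key];
        if ($expiresAt !== 0 && $now >= $expiresAt) {
            unset($this->items[$key]);
            return false;
        }
        return $value;
    }

    /** A TTL of 0 means "no expiry". */
    public function set(string $key, $value, int $ttl, int $now): void
    {
        $this->items[$key] = [$value, $ttl === 0 ? 0 : $now + $ttl];
    }

    /** Manual invalidation, e.g. after a new comment is posted. */
    public function delete(string $key): void
    {
        unset($this->items[$key]);
    }
}

/** Lazy caching: only hit the database on a cache miss. */
function cachedFetch(ArrayCache $cache, string $key, callable $fetchFromDb, int $ttl, int $now)
{
    $result = $cache->get($key, $now);
    if ($result === false) {
        $result = $fetchFromDb();
        $cache->set($key, $result, $ttl, $now);
    }
    return $result;
}
```

Note that this sketch does not by itself stop several concurrent requests from regenerating the same entry at the moment it expires.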
For what it's worth, memcache works well in a clustered environment and you should find that setting something like that up isn't hard to do, so this should scale pretty easily to whatever you need it to be.
Using a PHP script I need to update a number every 5 seconds while somebody is on my page. So let's say I have 300 visitors, each one spending about 1 minute on the page and every 5 seconds they stay on the page the number will be changed...which is a total of 3600 changes per minute. I would prefer to update the number in my MySQL database, except I'm not sure if it's not too inefficient to have so many MySQL connections (just for the one number change), when I could just change the number in a file.
P.S.: I have no idea whether 3600 connections/minute is a high number or not, but what about this case in general, considering an even higher number of visitors? What is the most efficient way to do this?
Doing 3,600 reads and writes per minute against the same file is simply out of the question. It's complicated (you need to be extremely careful with file locking), it's going to perform awfully and sooner or later your data will get corrupted.
DBMSs like MySQL are designed for concurrent access. If they can't cope with your load, a file won't do it better.
It will fail eventually if the user count grows, but the performance depends on your server setup and on other tasks related to this update.
You can run a quick test: open 300 persistent connections to your database and fire as many queries as you can in a minute.
If you don't need it to be transactional (the order of executed queries is not important), then I suggest you use memcached (or Redis if you need to save data to disk) for this instead.
If you save to file, you have to solve concurrency issues (and all but the currently reading/writing process will have to wait). The db solves this for you. For better performance you could use memcached.
Maybe you could do without this "do every 5s for each user" by another means (e.g. saving current time and subtracting next time the user does something). This depends on your real problem.
Don't even think about trying to handle this with files - it's just not going to work unless you build a lock queue manager - and if you're going to all that trouble you might as well use the daemon to manage the value rather than just queue locks.
Using a DBMS is the simplest approach.
For a more efficient but massively more esoteric approach, write a single-threaded socket server daemon and have the clients connect to that. (there's a lib here for doing the socket handling, and there's a PEAR class for running PHP as a daemon)
Files aren't transactional, and you don't want to lose the count, so the database is the way to go.
memcached's incr command is faster than the database and was, I think, the basis of one really fast view-counting setup.
If you use, say, a key per hour, then when a page view happens you increment page:time, and a background process can collect the counts from the past hour and insert them into a database. If memcached fails you might lose the count for that hour, but you will not have double counted or missed any, and keeping counts per period gives interesting statistics.
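The key-per-hour idea could be sketched as follows. A plain array stands in for memcached, and the key layout and function names are assumptions made for the example.

```php
<?php
/** Build a counter key such as "views:home:2024010115" (page plus UTC hour). */
function hourlyCounterKey(string $page, int $timestamp): string
{
    return sprintf('views:%s:%s', $page, gmdate('YmdH', $timestamp));
}

/** Count one page view in the bucket for the current hour. */
function recordView(array &$counters, string $page, int $timestamp): int
{
    $key = hourlyCounterKey($page, $timestamp);
    $counters[$key] = ($counters[$key] ?? 0) + 1;
    return $counters[$key];
}

/** Background job: read and remove a finished hour so it can be saved to the database. */
function collectHour(array &$counters, string $page, int $timestampInHour): int
{
    $key = hourlyCounterKey($page, $timestampInHour);
    $count = $counters[$key] ?? 0;
    unset($counters[$key]);
    return $count;
}
```

With a real client the increment would be a call such as Memcache::increment(), which does not create a missing key, so each hourly counter would need to be seeded first (for example with add()).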
Using a dedicated temporary file will certainly be the most efficient disk access you can have. However, you will not be protected from concurrent access to the file in case your server uses multiple threads or processes. If what you want to do is update 1 number per user, then using a $_SESSION sub-variable will work, and I believe this is stored in memory, so it should be very efficient. Then you can easily store this number into your database every 5 minutes per user.
I'm caching tweets on my site (with 30 min expiration time). When the cache is empty, the first user to find out will repopulate it.
However, at that time the Twitter API may not return a 200. In that case I'd like to prolong the previous data for another 30 mins. But the previous data will already be lost.
So instead I'd like to look into repopulating the cache, say, 5 minutes before expiration time so that I don't lose any data.
So how do I know the expiration time of an item when using php's memcache::get()?
Also, is there a better way of doing this?
In that case, isn't this the better logic?
If the cache is older than 30 minutes, attempt to pull from Twitter
If new data was successfully retrieved, overwrite the cache
Cache data for an indefinite amount of time (or much longer than you intend to cache anyway)
Note the last time the cache was updated (current time) in a separate key
Rinse, repeat
The point being, only replace the data with something new if you have it, don't let the old data be thrown away automatically.
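A sketch of that logic is shown below. A plain array stands in for memcached, the ":updated_at" key suffix is an invented convention, and the fetch callback is assumed to return null when the API call fails.

```php
<?php
/**
 * Return cached data, refreshing it only when it is stale AND the
 * fetch succeeds. Old data is never discarded on a failed fetch.
 */
function refreshIfStale(array &$cache, string $key, callable $fetch, int $maxAge, int $now)
{
    $updatedAt = $cache[$key . ':updated_at'] ?? null;
    $isStale = $updatedAt === null || ($now - $updatedAt) >= $maxAge;

    if ($isStale) {
        $fresh = $fetch(); // assumption: null signals a failed API call
        if ($fresh !== null) {
            $cache[$key] = $fresh;
            $cache[$key . ':updated_at'] = $now;
        }
    }

    // Either the new data, or whatever was cached before.
    return $cache[$key] ?? null;
}
```

Because the update time is only written after a successful fetch, a failed attempt is simply retried on the next request.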
Don't store critical data in memcached. It guarantees nothing.
If you always need to get the "latest good" cache, you need to store the data in some persistent storage, such as a database or a flat file.
In this case, if nothing is found in the cache, you make the Twitter API request. If it fails, you read the data from persistent storage, and on the next HTTP request you go through the same steps again.
Or you can put the data from persistent storage into memcache with a pretty short lifetime, a few minutes for example (1-5), to give the Twitter servers time to recover, and repeat the request after it expires.
When you put your data into memcache, you also set how long the cache is valid. So you could also store the time when the cache entry was created and/or when it will expire. After fetching from the cache, you can then check how much time is left until the entry expires and decide what you want to do.
But letting the cache be repopulated on a user visit can still be risky. Say you want to repopulate the cache when it is ~5 minutes from expiring, and no visitors arrive during the last 6 minutes before expiry: the cache will still expire and no one will cause it to be repopulated. If you want to be sure that the cache entry always exists, you need to check periodically, for example with a cron job that checks the cache and refills it.