I need to hold a semi-static large object in cache so I don't need to request it every time from database. Something like $_SESSION, but not tied to a session, because the data are common to all users.
I can cache that data client side once I've got it, but I would like to avoid disturbing the database with SELECT queries for large data that (almost) never changes.
Also, I cannot add modules (like APC cache) in this environment.
I could store my data in a file, say a JSON file, which I read with PHP instead of querying the database, but accessing the filesystem is also costly if PHP needs to do it many times per second AND the file is not tiny.
Is there a built-in way in PHP to store objects in memory, common to all PHP instances?
EDIT: Could I use $_SESSION as storage space, forcing the session_id to always be the same? Is it dangerous? I don't use sessions for the application itself. I tried it and it works.
Most operating systems will keep the result of a disk read in the filesystem cache.
This means the disk will not be hit each time. File-based storage is actually pretty quick for multiple reads of the same file, since it's really coming straight from memory.
As long as "pretty large" still means it fits in memory, this approach should be fine.
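A minimal sketch of that file-based approach, assuming the shared data has already been exported once to a hypothetical data/common.json (for example by a cron job or at deploy time); repeated reads come from the OS cache rather than the disk:

<?php
// A request-level wrapper around a hypothetical, pre-generated cache file.
function load_common_data(): array
{
    static $data = null;                       // reuse within the same request
    if ($data === null) {
        // The OS keeps frequently read files in its page cache, so this
        // usually costs a memory copy rather than a real disk read.
        $json = file_get_contents(__DIR__ . '/data/common.json');
        $data = json_decode($json, true);
    }
    return $data;
}

$config = load_common_data();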
Related
I'm designing an application in PHP which involves Trie data structure.
For time efficient prefix search, I'm using Trie.
I'm constructing the Trie using records from the database.
Now, the database has millions of records, so it is not feasible to build the Trie from scratch and then search it for every new user request.
Instead, can I create the Trie only once and somehow store it, so that it does not have to be re-created for every new user request and searching can be done immediately? Is there some way I can cache the created Trie (not just for one user session, but for all user requests) using PHP?
Any help would be much appreciated.
You have a couple of standard options.
Cache the database result in memory, using a simple cache like memcached
Cache using Redis, perhaps taking advantage of some of its extra features. This might involve a process where you load the data into a structure in Redis and have your trie search code work against Redis directly rather than against the database result set.
In either case, you are going to cache the result for some period of time that is acceptable, and since the database result will be in memory in some form, there is no load placed on the RDBMS.
In your related question, you indicated that the raw serialized form of the variable would be about 200 MB in size. That is well within the maximum object size (512 MB) for Redis, but could be problematic for memcached. I personally use Redis for most app-server caching these days.
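A minimal cache-aside sketch of the first option using the phpredis extension; the key name and buildTrieFromDatabase() are placeholders for your own code:

<?php
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

$trie   = false;
$cached = $redis->get('app:trie');            // false on a cache miss
if ($cached !== false) {
    $trie = unserialize($cached);
}
if ($trie === false) {
    $trie = buildTrieFromDatabase();          // the one expensive hit on the RDBMS
    $redis->set('app:trie', serialize($trie), 3600);   // keep for one hour
}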
I am currently programming a PHP site which, at the moment, needs to query a large amount of data (about 4-5 MB) every time. I already have a session going and wanted to ask whether it's good practice to store that data in the session variable?
The current plan is to also maintain a table in the database recording when each table was last changed. If that timestamp is newer, the data would be queried again; if not, the data from the session variable would be used, since it is still consistent...
Is this a good way to avoid querying too much data? And what speed impact would the site see when a session is about 5 MB in size?
Thanks in advance!
It's not really good practice (it will make PHP chew far more memory than it really should), but I'm not sure how it will affect performance.
I suppose the real question is this: Why do you need to store so much in the session? If it's information that is meant to be accessible between sessions, then you should be storing it in a database and loading it 'at need'.
If it's binary data (images, files, etc.) that are only relevant while the session is valid, then store it in a temporary file for the user (look at tempnam() and sys_get_temp_dir()), then store the temporary filename on the session.
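A minimal sketch of that temp-file idea; random_bytes() stands in for whatever binary payload you actually generated:

<?php
session_start();

// Stand-in for the real binary data (an image, an export file, ...).
$blob = random_bytes(1024);

if (!isset($_SESSION['blob_file'])) {
    // Write the payload to a per-user temp file and keep only its path
    // in the session, so the session itself stays small.
    $path = tempnam(sys_get_temp_dir(), 'sess_blob_');
    file_put_contents($path, $blob);
    $_SESSION['blob_file'] = $path;
}

// Later requests read it back from the temp file instead of the session.
$data = file_get_contents($_SESSION['blob_file']);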
No, it's not good practice to do this.
Points to consider:
By default, the session data is stored on disk in a temp folder. Every time you call session_start() (i.e. every page load), it has to load all of that data into memory and populate the session array. If you're loading large amounts of data, this could have performance implications.
Also, since you're loading this large chunk of data every time, it means that each page load will take more memory. This reduces the number of concurrent users that your server can support.
If you're doing this for caching purposes to reduce hits to your DB, there are much better solutions available. APCu, Memcache, Redis and others can all do a much better job of caching your data than your proposed custom-written solution. There are also wrapper libraries available that make it even easier and allow you to mix and match between caching solutions. If you're using a framework like Laravel or Symfony, there may be caching classes built into your framework. Alternatively, you could try a stand-alone library like phpFastCache. But also, don't forget that modern DB engines have their own caching mechanisms built in, so repeated calls to the same or similar queries should be reasonably fast anyway.
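For example, a minimal cache-aside sketch with APCu (assuming the apcu extension is available); fetch_rows_from_db() is a placeholder for your real query:

<?php
function get_rows(): array
{
    $rows = apcu_fetch('app:rows', $hit);     // $hit is set to true on a cache hit
    if (!$hit) {
        $rows = fetch_rows_from_db();         // placeholder for the expensive query
        apcu_store('app:rows', $rows, 300);   // keep for 5 minutes
    }
    return $rows;
}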
I'd like to make a very, very small, but persistent data structure that I can reference quickly server-side, and I'm not sure how.
Basically, what I want is an array that holds little structures that hold 3-10 strings in them. The array would be of size somewhere from 50-5,000 (expandable).
I was considering using a database, but that seems like overkill in this case. I was considering using a file that held JSON, but that just doesn't seem right (I think my server would have to load the file, parse the file, then return every time the cgi is called).
I'd like to be able to have PHP get something out of this persistent data structure in constant, fast time every time it's called.
I'm currently using just vanilla Apache and PHP.
Even without a file, APC can store that data: see apc_fetch and apc_store. The only problem is that the data is restricted to one server, so as soon as you have clusters or multiple servers they don't share the data. (http://www.php.net/manual/de/ref.apc.php)
If multiple servers are involved, memcached or redis are worth a check. Redis has built-in arrays.
Edit:
Check whether json_encode/json_decode are as fast as serialize/unserialize for your scenario, or even faster; the JSON library can be really fast. It drops some PHP-specific data (class names etc.), which is probably unnecessary for you.
Edit 2: If the server crashes, the plain APC solution will lose all data. That is the reason you should also write it to a file if needed. APC lives inside the Apache process, so it will be faster than memcached or Redis.
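A minimal sketch of that APC-plus-file idea, with a placeholder key and file path; on a cache miss (for example after an Apache restart) the data is rebuilt from the file copy:

<?php
function load_structure(): array
{
    $data = apc_fetch('app:structure', $hit);
    if ($hit) {
        return $data;
    }
    // APC was emptied (restart/crash): fall back to the file copy.
    $data = unserialize(file_get_contents('/var/cache/app/structure.ser'));
    apc_store('app:structure', $data);
    return $data;
}

function save_structure(array $data): void
{
    apc_store('app:structure', $data);                                    // fast path
    file_put_contents('/var/cache/app/structure.ser', serialize($data));  // durable copy
}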
I'm planning on saving my users' often-used parameters, i.e. name, picture, etc., in session variables as opposed to pulling them from the MySQL table each time they are needed. Saving often-used parameters in variables rather than a database should in theory be more efficient, but because I'm not sure how SESSION variables are saved, I'm not sure this is true. Does anyone know if pulling info from a SESSION variable is more efficient than querying the MySQL table?
The term variable is used loosely as SESSION "variables" are stored in files in the server's temporary directory.
You would think reading files is more costly than reading from a database; after all, a database is essentially a file too, but one that is optimized for this purpose, unlike temporary session files.
Yes, pulling information from a session variable is more efficient than querying a database for that info. However, loading the information INTO the session variables requires reading a file off of your server's file system and into RAM, which depending on many factors (disk speed, IO load, DB speed, etc.) might be slower or faster than reading the same information from a DB. Without information on your specific setup, it's hard to say. One thing to keep in mind: if you plan on growing and using more than one web server, you will need to write some custom session handlers to either store your sessions on a central server (possibly a database), memcache, or a shared mount point where all your web servers can go to fetch the session files.
In the end, putting something into the session and using it from there can be more efficient than loading it from the DB every time, but you are still loading it from somewhere, and so, knowledge of your hardware and your setup will be your best guide.
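A minimal sketch of the load-once-per-session pattern being discussed; the table, columns and the assumption that $_SESSION['user_id'] is set at login are placeholders for your own setup:

<?php
session_start();

if (!isset($_SESSION['profile'])) {
    // First request of the session: one query, then reuse the result.
    $pdo  = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
    $stmt = $pdo->prepare('SELECT name, picture FROM users WHERE id = ?');
    $stmt->execute([$_SESSION['user_id']]);   // assumed to be set at login
    $_SESSION['profile'] = $stmt->fetch(PDO::FETCH_ASSOC);
}

echo $_SESSION['profile']['name'];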
The default Session handler for PHP stores that info to disk; one unique temporary file per session. The issues you may come across are if the disk/file system gets overloaded, or if your data becomes stale.
If you're making a trip to disk to access the session, there is slightly less overhead than accessing MySQL, but you're still making a trip to disk upon every page request. You can try to use an in-memory Session handler.
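One way to do that, assuming the memcached PECL extension is installed and a memcached server is running on localhost:11211, is to point PHP's session handler at memcached (the exact handler name and save_path format depend on which extension you have):

<?php
ini_set('session.save_handler', 'memcached');   // handler provided by ext-memcached
ini_set('session.save_path', 'localhost:11211');
session_start();

// Session data now lives in memory on the memcached server, not in /tmp.
$_SESSION['counter'] = ($_SESSION['counter'] ?? 0) + 1;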
Session variables are preferred for persisting a relatively small amount of temporary data. They're good for "sessions".
Use a database for everything else. Especially for:
larger amounts of data,
for any kind of "transaction", or
for data that needs to be persisted between "sessions".
This article is somewhat dated, and it doesn't apply to PHP per se ... but it should give you some idea about the relative efficiencies of filesystem (e.g. NTFS) vs database (e.g. MSSQL):
To Blob or Not To Blob: MS Research white paper
Yes it's more efficient to use session variables.
Typically, session variables are stored on the server in the /tmp directory (you can check your PHP info file to see how yours is configured).
And because they're stored on the server, you can assume they're just as secure as the rest of your server.
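For example, a quick way to check where your sessions actually live without hunting through the full phpinfo() output:

<?php
echo ini_get('session.save_handler'), "\n";  // usually "files"
echo ini_get('session.save_path'), "\n";     // often /tmp or /var/lib/php/sessions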
Yes, it is more efficient. The session is saved on the server. However, with or without sessions, you still need to check whether the user is logged in and has the correct session ID. It also depends on the number of columns and rows involved and many other things.
I'm using the session array to cache chunks of information retrieved from the db:
$result = mysql_query('select * from table');
array_push($_SESSION['data'],new Data(mysql_fetch_assoc($result)));
My question is, is there a limit/a sizeable amount of information that can/should be passed around in a session? Is it ill advised or significantly performance hindering to do this?
By default, $_SESSION data is stored on disk in the /tmp directory of your server. As long as you have enough room in there AND you aren't hitting your PHP memory limit, you're fine.
However, if you're attempting to cache a query that is the SAME for a large number of users, you might want to use something like APC or memcache that isn't tied to the individual user. Otherwise, you're essentially going to cache the same result once per user instead of leveraging a single cache across all users.
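A minimal sketch of that shared-cache approach with the Memcached extension; run_expensive_query() and the key name are placeholders:

<?php
$mc = new Memcached();
$mc->addServer('localhost', 11211);

$rows = $mc->get('shared:query');             // false on a cache miss
if ($rows === false) {
    $rows = run_expensive_query();            // placeholder for the real DB work
    $mc->set('shared:query', $rows, 600);     // shared by every user for 10 minutes
}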
I think the answer would depend on where you are storing your data and how fast you can transfer it there.
If the data is 44 MB big and you are on a 1000BASE-T network, you can expect it to take about a second to actually transfer there, and another second to transfer back.
If you use local memory, then you have a finite amount of memory on the machine.
If you use disk, then you have load/save times (disk is slow).
But also keep in mind, PHP has a finite amount of memory it allows a script to use. I think the default setting is 8 MB.
If you are talking about large blocks of data, you may want to consider Redis, Tokyo Cabinet or other key/value stores. Or even a backend interface to manipulate the data/cache it for you without transferring it through PHP.
Because Session data is stored in a file (or database record) on your server, it shouldn't matter too much how much data you store in it. I would just advise against huge objects.
You might want to look at APC or memcached to cache the results instead, as it is not a per-user cache, and it uses the memory instead of files.
The session is serialized and written to disk by default, so depending on the size and the number of users, things can become slow. However, both things can be changed (read the session manual at http://php.net/session for all details), for example by using memcache for in-memory storage of the data. The best thing is to try it out in an environment as similar as possible to the live system and check the resulting load and throughput.
Mmm, tricky. I think you could save it in the session. The real question is: do you want all that information serialized and unserialized every time a client makes a request?
I think it would be OK to save it there if you use all that information on every page of your website, but that is improbable. It would be better to save that information in a directory like /temptables/sometable/, with each file named after the session. You can use session_id() to get the name, and load the information in the pages that need it with:
$info = unserialize(file_get_contents('/temptables/sometable/'.session_id().'.ser'));
and saving with:
file_put_contents('/temptables/sometable/'.session_id().'.ser', serialize($info));
But you need a cron job to clean that directory of old files. You can do that by taking the session ID from the filename and either starting that session and checking for some variable like 'itsalive', or doing something like file_exists(session_save_path().'/sess_'.$sessionId) to decide whether the temporary file should be deleted.
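A minimal sketch of such a cleanup job, assuming the default files session handler and the directory layout from the answer above:

<?php
foreach (glob('/temptables/sometable/*.ser') as $file) {
    $sessionId   = basename($file, '.ser');
    $sessionFile = session_save_path() . '/sess_' . $sessionId;
    if (!file_exists($sessionFile)) {
        unlink($file);   // the owning session is gone, so drop its cached data
    }
}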