Let's say we have a PHP array with ~ 200 keys containing site data, globally shared for all users.
This array is constructed from an SQL database, which takes too long. We want to store this array.
What's the difference (mainly in speed) between storing the array with apc_store() versus serializing it and saving it to a .php file on disk, then retrieving it with either apc_fetch() or file_get_contents() plus unserialize()?
Which would be faster? Why not use the file? Why use the cache?
EDIT: One reason to use a file instead of a cache (for me) is that I can access the file from the CLI/shell/root with cron.
From best to worst:
APC is in-memory and very fast; it's serialized and unserialized automatically for you.
memcached is in-memory too, and a bit slower than APC. This is more than compensated for by the fact that it allows you to use the same cache across multiple servers.
unserialize(file_get_contents()) involves hitting the disk, but is faster than parsing php. It's an OK option if you don't have APC, memcached, or equivalent in-memory caching.
var_export() to create a php file that you then include is slower than unserializing a string because the file needs to be parsed -- in addition to hitting the disk. The plus side is that it lets you easily edit the array if you ever need to.
serialize() into a variable held in a php file offers the worst of each: a disk hit, parsing of php and unserializing the data.
(There might also be something to be said about having proper indexes in your database. Fetching 200 rows to build an array shouldn't be slow.)
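A minimal sketch of the two approaches being compared; build_site_data_from_db(), the cache key, and the file path are all invented placeholders:

// Option 1: APC: in-memory, serialization is handled for you.
if (($siteData = apc_fetch('site_data')) === false) {
    $siteData = build_site_data_from_db();              // the slow SQL step (hypothetical)
    apc_store('site_data', $siteData, 3600);            // keep it for an hour
}

// Option 2: serialized file on disk.
$cacheFile = '/tmp/site_data.ser';
if (is_file($cacheFile)) {
    $siteData = unserialize(file_get_contents($cacheFile));
} else {
    $siteData = build_site_data_from_db();
    file_put_contents($cacheFile, serialize($siteData), LOCK_EX);
}

The file variant also keeps the CRON/CLI use case from the question possible, since any PHP CLI script can read and unserialize the same file.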
Related
I need to hold a semi-static large object in cache so I don't need to request it every time from database. Something like $_SESSION, but not tied to a session, because the data are common to all users.
I can cache that data client side once I've got it, but I would like to avoid hitting the database with SELECT queries for large data that (almost) never changes.
Also, I cannot add modules (like APC cache) in this environment.
I could store my data in a file, say JSON, which I read with PHP instead of querying the db, but accessing the filesystem is also costly if PHP needs to do it many times per second AND the file size is not tiny.
Is there a built in way in php to store objects in memory, common to all php instances?
EDIT: Could I use $_SESSION as storage space, forcing the session_id to always be the same? Is it dangerous? I don't use sessions for the application itself. I tried it and it works.
Most operating systems will store the result of reading from disk in their cache.
This means that the disk will not be hit each time. File-based storage is actually pretty quick for multiple reads of the same file, as it's really just coming directly from memory.
As long as "pretty large" still means it fits in memory, this approach should be fine.
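As a tiny illustration (the path and format are hypothetical), repeated reads like this mostly come out of the OS page cache rather than the physical disk:

// First request after the file changes pays for a real disk read;
// later requests are typically served straight from the OS page cache.
$data = json_decode(file_get_contents('/var/cache/app/site_data.json'), true);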
I'd like to make a very, very small, but persistent data structure that I can reference quickly server-side, and I'm not sure how.
Basically, what I want is an array that holds little structures that hold 3-10 strings in them. The array would be of size somewhere from 50-5,000 (expandable).
I was considering using a database, but that seems like overkill in this case. I also considered a file holding JSON, but that just doesn't seem right (I think my server would have to load the file, parse it, and then return the result every time the CGI is called).
I'd like to be able to have PHP get something out of this persistent data structure in constant, fast time every time it's called.
I'm currently using just vanilla Apache and PHP.
Even without a file, APC can store that data: see apc_fetch and apc_store. The only problem is that the data is restricted to one server, so as soon as you have clusters or multiple servers, they won't share the data. (http://www.php.net/manual/de/ref.apc.php)
If multiple servers are involved, memcached or redis are worth a check. Redis has built-in arrays.
Edit:
Check whether json_encode/json_decode is as fast as serialize/unserialize for your scenario, or even faster; the JSON library can be really fast. It drops some PHP-specific data that is probably unnecessary for you (object class names, etc.).
Edit2: If the server crashes, the plain APC solution will lose all its data, which is why you should also write it to a file if needed. APC lives inside the Apache/PHP process, so it will be faster than memcached or redis.
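A rough sketch of that combination, APC for speed plus a file copy so the data survives a restart; every name here is invented:

function load_shared_data() {
    $data = apc_fetch('shared_data');
    if ($data !== false) {
        return $data;                                    // fast path: shared memory
    }

    $file = '/var/cache/app/shared_data.ser';
    if (is_file($file)) {
        $data = unserialize(file_get_contents($file));   // survives an Apache restart
    } else {
        $data = fetch_shared_data_from_db();             // hypothetical slow rebuild
        file_put_contents($file, serialize($data), LOCK_EX);
    }

    apc_store('shared_data', $data);                     // repopulate APC for the next hit
    return $data;
}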
I have an array containing my reference variables, and in my scripts I need to fetch just one or two of them. In the current system, I have to include the entire array (and all its elements) to use a single element. It seems that using a database is better for two reasons:
One record is read instead of the entire array
Variables can be easily edited
However, there is a major drawback to using a database: on every PHP run, we need to make a connection to the database.
Since simple database systems like SQLite have no server, persistent connections are not available the way they are with full database servers like MySQL.
In action,
$db = new SQLite3('mysqlitedb.db');
takes more time (and consumes more resources) than
include 'array.php';
Is there any solution, such as a basic database system with fast connections, that could serve as a replacement for the PHP array and include file?
In other words, I need a simple database system whose connection cost is comparable with fopen. However, even CDB, which is incredibly fast, is not fast enough on the initial connection.
By including the static array file you are essentially doing what caching systems do when they pull a result from a database. You are loading a pre-digested result directly from disk.
All database connections have some overhead (certainly more than including a rendered file). You use a database when you need operational maintainability for your data, but this comes at the cost of application overhead.
If you are not worried about persistence of the data, you may want to look at using a caching system like APC, memcached or redis.
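A minimal sketch of that "pre-digested" approach, assuming the SQLite file from the question (table and column names are invented): regenerate array.php whenever the reference data changes, and every other request only pays for the include.

// rebuild_cache.php: run whenever the reference data changes (e.g. from cron).
$db = new SQLite3('mysqlitedb.db');                      // the slower source of truth
$rows = array();
$res = $db->query('SELECT name, value FROM reference');  // hypothetical table
while ($row = $res->fetchArray(SQLITE3_ASSOC)) {
    $rows[$row['name']] = $row['value'];
}

// Write a plain PHP file that returns the array; including it later costs
// one (usually opcode-cached) file read and no database connection at all.
file_put_contents('array.php', '<?php return ' . var_export($rows, true) . ';', LOCK_EX);

// Consumer side:
// $data = include 'array.php';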
Have you considered caching the variables? You could use APC or Memcached for this purpose. They will both be faster than a database since the data is stored in the RAM, not on the disc.
It will still be slower than just including the array.
I need to implement a kind of fast caching mechanism for a PHP application. It works something like this: multiple node servers request data from a central server (via a JSON service). The node servers should cache the responses on the file system in some fast, efficient way. And that is the question - what would be the optimal solution for the storage part? I have a few options in mind - XML (I've heard it can be inefficient with many records), storing an array definition with the content in a PHP file, or just dumping an array of records to a file. Which of these would be most efficient for that scenario? Or maybe something else? I should note that it must be implemented in clean PHP >= 5.2 without any additional libraries or SQL.
Given the information you have provided, I would suggest simply dumping the JSON string to a file. This means there are no external libs or SQL engines required.
You could use XML if you want something that is "human readable" too; however, XML isn't as quick, and you would of course have to spend additional time generating the XML before you could store the cached data.
Reading is then simply a case of getting the string from the file and running it through json_decode. If you only require parts of the data and not the entire lot, you could improve read performance by splitting the JSON object into blocks and writing them to individual files; this trades off some of the write speed (not too much) but makes the read speed better.
Write speed could be made even better by writing to a partition configured with the ext2 filesystem.
However, unless you're working with large data sets and multiple cache files, there is no real reason to go to that sort of optimisation extent; writing the JSON to a file as a string and reading it back should be more than good enough for you.
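A rough sketch of that approach under the stated constraints (plain PHP >= 5.2, no extra libraries); the function name, URL, and cache path are made up:

// Cache the raw JSON response on disk and reuse it until it is $ttl seconds old.
function get_cached_json($url, $cacheFile, $ttl = 300) {
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
        $json = file_get_contents($cacheFile);           // served from disk / OS cache
    } else {
        $json = file_get_contents($url);                 // hit the central server
        if ($json !== false) {
            file_put_contents($cacheFile, $json, LOCK_EX);
        }
    }
    return json_decode($json, true);                     // decode only when needed
}

// Usage:
// $records = get_cached_json('http://central/api/records', '/tmp/records.json', 600);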
You shouldn't generate XML files to cache content for only one application. Generating and parsing the XML is pure overhead, and it requires many more bytes.
Generating PHP files is effective, but there are some issues with it:
- Possible parsing errors
- Data could be cached twice (the OS filesystem cache plus the PHP opcode cache)
I would prefer writing cache files as simple serialized PHP data because it has low parsing overhead and is very effective. You can also speed it up by using a binary serializer like igbinary or msgpack.
Btw: if you cache data from a remote service on different web nodes, I would recommend using a caching server like memcached ;)
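For illustration, a small sketch of such a serialized-file cache that uses igbinary when that extension happens to be loaded and falls back to the built-in serializer otherwise (paths and function names are up to you):

function cache_write($file, $data) {
    $payload = function_exists('igbinary_serialize')
        ? igbinary_serialize($data)                      // compact binary format
        : serialize($data);                              // built-in fallback
    file_put_contents($file, $payload, LOCK_EX);
}

function cache_read($file) {
    $payload = file_get_contents($file);
    return function_exists('igbinary_unserialize')
        ? igbinary_unserialize($payload)
        : unserialize($payload);
}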
I'm using the session array to cache chunks of information retrieved from the db:
$result = mysql_query('select * from table');
array_push($_SESSION['data'],new Data(mysql_fetch_assoc($result)));
My question is, is there a limit/a sizeable amount of information that can/should be passed around in a session? Is it ill advised or significantly performance hindering to do this?
By default, $_SESSION data is stored on disk in the /tmp directory of your server. As long as you have enough room in there AND you aren't hitting your PHP memory limit, you're fine.
However, if you're attempting to cache a query that is the SAME for a large number of users, you might want to use something like APC or memcache that isn't tied to the individual user. Otherwise, you're essentially going to cache the same result once per user, and not leverage a cache across all users.
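For example, a sketch using the memcached extension (server address, key, and TTL are placeholders), so the result is cached once for everybody instead of once per session:

$m = new Memcached();
$m->addServer('127.0.0.1', 11211);

$rows = $m->get('table_dump');
if ($rows === false) {                                   // cache miss: run the query once
    $result = mysql_query('select * from table');
    $rows = array();
    while ($row = mysql_fetch_assoc($result)) {
        $rows[] = $row;
    }
    $m->set('table_dump', $rows, 300);                   // shared by all users for 5 minutes
}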
I think the answer would depend on where you are storing your data and how fast you can transfer it there.
If the data is 44 MB, and you are on a 1000base-T network, you can expect it to take about 1 second to actually transfer THERE. And 1 second to transfer back.
If you use local memory, then you have a finite amount of memory in the machine.
If you use disk, then you have load/save times (disk is slow).
But also keep in mind, PHP has a finite amount of memory it allows a script to use. I think the default setting is 8 MB.
If you are talking about large blocks of data, you may want to consider Redis, Tokyo Cabinet or other key/value stores. Or even a backend interface to manipulate the data/cache it for you without transferring it through PHP.
Because Session data is stored in a file (or database record) on your server, it shouldn't matter too much how much data you store in it. I would just advise against huge objects.
You might want to look at APC or memcached to cache the results instead, as it is not a per-user cache, and it uses the memory instead of files.
The session is serialized and written to disk by default, so depending on the size and the number of users, things can become slow. However, both things can be changed (read the session manual at http://php.net/session for all the details), for example by using memcache for in-memory storage of the data. The best thing is to try it out in an environment as similar as possible to the live system and check the resulting load and throughput.
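As an illustration of that switch, assuming the pecl memcached extension is installed (the handler name and address depend on which extension and version you use):

// Store session data in memcached instead of files in /tmp;
// must run before session_start().
ini_set('session.save_handler', 'memcached');
ini_set('session.save_path', '127.0.0.1:11211');
session_start();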
Mmm, tricky. I think you could save it in the session. The real question is: do you want all that information serialized and unserialized every time a client makes a request?
I think it would be OK to save it in there if you use all that information on every page of your website, but that is improbable. It would be better to save that information in a directory like /temptables/sometable/, with each file named after the session. You can use session_id() to get it, and load the information on the pages that need it with:
$info = unserialize(file_get_contents('/temptables/sometable/'.session_id().'.ser'));
and saving with:
file_put_contents('/temptables/sometable/'.session_id().'.ser', serialize($info));
But you need a cron job to clean old files out of that directory. You can do it by taking the session id from the filename and checking for some variable, like 'itsalive', using session_start(), or by doing something like file_exists(session_save_path().'/sess_'.$session_id) to check whether you should delete the temporary file.
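A rough sketch of such a cron cleanup job, assuming the directory and .ser extension from the snippets above and the default file-based session handler:

// cleanup.php: run from cron; delete cache files whose session no longer
// exists or that have not been touched for a day.
$maxAge = 86400;
$files = glob('/temptables/sometable/*.ser');
foreach ((array) $files as $file) {
    $sessionId   = basename($file, '.ser');
    $sessionFile = session_save_path() . '/sess_' . $sessionId;  // may be empty on CLI; adjust as needed
    if (!file_exists($sessionFile) || (time() - filemtime($file)) > $maxAge) {
        unlink($file);
    }
}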