I have an array containing my reference variables, and my scripts usually need only one or two of them. In the current system, I have to include the entire array (and all its elements) to use a single element. It seems that using a database would be better for two reasons:
One record is read instead of the entire array
Variables can be easily edited
However, there is a major drawback to using a database: on every PHP run, we need to open a connection to the database.
Since simple database systems like SQLite have no server, persistent connections do not work the way they do with full database servers like MySQL.
In action,
$db = new SQLite3('mysqlitedb.db');
takes more time (and consumes more resources) than
include 'array.php';
Is there a basic database system (with a fast connection) that could replace the PHP array and include file?
In other words, I need a simple database system whose connection cost is comparable to fopen. However, even CDB, which is incredibly fast, is not fast enough on the initial connection.
By including the static array file you are essentially doing what caching systems do when they pull a result from a database. You are loading a pre-digested result directly from disk.
All database connections have some overhead (certainly more than including a rendered file). You use a database when you need operational maintainability for your data, but this comes at the cost of application overhead.
If you are not worried about persistence of the data, you may want to look at using a caching system like APC, memcached or Redis.
Have you considered caching the variables? You could use APC or Memcached for this purpose. They will both be faster than a database since the data is stored in RAM, not on disk.
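The "pre-digested result" idea above can be sketched as follows: write the array out once as PHP source with var_export(), then let later requests simply include that file. The array contents and the file name are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical reference data; in practice this would come from your database.
$reference = [
    'site_name' => 'Example',
    'max_items' => 50,
    'languages' => ['en', 'de'],
];

// Hypothetical cache location.
$cacheFile = sys_get_temp_dir() . '/reference_array.php';

// Write the array as plain PHP source; var_export() emits parsable code.
$source = '<?php return ' . var_export($reference, true) . ';' . PHP_EOL;
file_put_contents($cacheFile, $source, LOCK_EX);

// A later request only needs to include the file to get the array back.
$loaded = include $cacheFile;

echo $loaded['site_name'], PHP_EOL;
```

Regenerating the file whenever the underlying data changes keeps the "easily edited" benefit of a database while requests still pay only the cost of an include.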
It will still be slower than just including the array.
I need to hold a semi-static large object in cache so I don't need to request it every time from database. Something like $_SESSION, but not tied to a session, because the data are common to all users.
I can cache that data client-side once I have it, but I would like to avoid hitting the database with SELECT queries for large data that (almost) never changes.
Also, I cannot add modules (like APC cache) in this environment.
I could store my data in a file, say JSON, which I read with PHP instead of querying the database, but accessing the filesystem is also costly if PHP needs to do it many times per second AND the file size is not tiny.
Is there a built-in way in PHP to store objects in memory, common to all PHP instances?
EDIT: Could I use $_SESSION as storage space, forcing session_id to always be the same? Is it dangerous? I don't use sessions for the application itself. I tried it and it works.
Most operating systems keep the results of disk reads in their cache.
This means that the disk will not be hit each time. File-based storage is actually pretty quick for multiple reads of the same file, as it's really just coming directly from memory.
As long as "pretty large" still means "fits in memory", this approach should be fine.
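A minimal sketch of the file-based approach described above, assuming PHP 7.3 or later (for JSON_THROW_ON_ERROR). The function name load_shared_data and the file path are hypothetical. The file read is normally served from the operating system's cache, and the static variable avoids decoding the same file twice within one request.

```php
<?php
/**
 * Load shared data from a JSON file, decoding it at most once per request.
 * The function name and the file path are hypothetical.
 */
function load_shared_data(string $path): array
{
    static $cache = [];   // memo that lives for the current request only

    if (!isset($cache[$path])) {
        // Repeated reads of the same file usually come from the OS cache.
        $json = file_get_contents($path);
        if ($json === false) {
            throw new RuntimeException("Cannot read $path");
        }
        $cache[$path] = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    }

    return $cache[$path];
}

// Demo: write a small file, then load it.
$path = sys_get_temp_dir() . '/shared_data.json';
file_put_contents($path, json_encode(['countries' => ['DE', 'FR'], 'version' => 3]));

$data = load_shared_data($path);
echo $data['version'], PHP_EOL;
```

This does not share decoded objects between PHP processes; each request still pays the decoding cost once.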
I'm designing an application in PHP which involves Trie data structure.
For time efficient prefix search, I'm using Trie.
I'm constructing the Trie using records from the database.
Now, the database has millions of records, so it is not feasible to create the Trie and then search it on every new user request.
Instead, can I create the Trie only once and somehow store it, so that it does not have to be re-created for every new user request and searching can be done immediately? Is there some way I can cache the created Trie (not just for one user session, but for all user requests) using PHP?
Any help would be much appreciated.
You have a couple of standard options.
Cache the database result in memory, using a simple cache like memcached
Cache using Redis, perhaps taking advantage of some of its extra features. This might involve a process where you load the data into a structure in Redis and have your trie search code work against Redis directly rather than the database result set.
In either case, you are going to cache the result for some period of time that is acceptable, and since the database result will be in memory in some form, there is no load placed on the RDBMS.
In your related question, you indicated that the raw serialized form of the variable would be about 200 MB in size. That is well within the max object size (512 MB) for Redis, but could be problematic for memcached. I personally use Redis for most app server caching these days.
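The build-once, reuse-afterwards pattern described above might look like the sketch below. It is an illustration under stated assumptions, not a Redis or memcached integration: the serialized trie is written to a local file as a stand-in for the cache server, and all function names (trie_insert, trie_search, get_trie) and the word list are hypothetical.

```php
<?php
// Insert a word into a trie stored as nested arrays.
function trie_insert(array &$trie, string $word): void
{
    $node = &$trie;
    foreach (str_split($word) as $ch) {
        if (!isset($node['children'][$ch])) {
            $node['children'][$ch] = ['children' => [], 'end' => false];
        }
        $node = &$node['children'][$ch];
    }
    $node['end'] = true;
}

// Return all stored words that start with $prefix, sorted.
function trie_search(array $trie, string $prefix): array
{
    $node = $trie;
    $chars = $prefix === '' ? [] : str_split($prefix);
    foreach ($chars as $ch) {
        if (!isset($node['children'][$ch])) {
            return [];
        }
        $node = $node['children'][$ch];
    }

    $found = [];
    $stack = [[$node, $prefix]];
    while ($stack) {
        [$current, $word] = array_pop($stack);
        if ($current['end']) {
            $found[] = $word;
        }
        foreach ($current['children'] as $ch => $child) {
            $stack[] = [$child, $word . $ch];
        }
    }
    sort($found);
    return $found;
}

// Build the trie only if no cached copy exists; otherwise reuse the copy.
// The cache file stands in for memcached or Redis here.
function get_trie(string $cacheFile, callable $loadWords): array
{
    if (is_file($cacheFile)) {
        return unserialize(file_get_contents($cacheFile));
    }
    $trie = ['children' => [], 'end' => false];
    foreach ($loadWords() as $word) {
        trie_insert($trie, $word);
    }
    file_put_contents($cacheFile, serialize($trie), LOCK_EX);
    return $trie;
}

// Demo: the loader (a stand-in for the database query) runs only once.
$loads = 0;
$loader = function () use (&$loads): array {
    $loads++;
    return ['car', 'cart', 'cat', 'dog'];
};
$cacheFile = sys_get_temp_dir() . '/trie_' . uniqid() . '.cache';

$trie = get_trie($cacheFile, $loader);   // builds and stores
$trie = get_trie($cacheFile, $loader);   // reuses the stored copy
$matches = trie_search($trie, 'ca');
```

Note that every request still has to unserialize the whole structure before searching it, which is why the answer suggests having the search code work against Redis directly for very large data.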
I'd like to make a very, very small, but persistent data structure that I can reference quickly server-side, and I'm not sure how.
Basically, what I want is an array that holds little structures that hold 3-10 strings in them. The array would be of size somewhere from 50-5,000 (expandable).
I was considering using a database, but that seems like overkill in this case. I was considering using a file that held JSON, but that just doesn't seem right (I think my server would have to load the file, parse the file, then return every time the cgi is called).
I'd like to be able to have PHP get something out of this persistent data structure in constant, fast time every time it's called.
I'm currently using just vanilla Apache and PHP.
APC can store that data even without a file: see apc_fetch and apc_store. The only problem is that the data is restricted to one server, so as soon as you have a cluster or multiple servers, they won't share the data. (http://www.php.net/manual/de/ref.apc.php)
If multiple servers are involved, memcached or Redis are worth a look. Redis has built-in list and hash types.
Edit:
Check whether json_encode/json_decode are as fast as serialize/unserialize for your scenario, or even faster; jsonlib can be really fast. JSON drops some PHP-specific data, which is probably unnecessary for you (object names etc.).
Edit2: If the server crashes, the plain APC solution will lose all data. That is the reason you should also write it to a file if needed. APC lives inside the Apache process, so it will be faster than memcached or Redis.
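A minimal sketch of the "memory cache plus file backup" idea from the edits above. It is an assumption-laden illustration: it uses apcu_fetch and apcu_store from the APCu extension (the user-cache successor to the apc_* functions named in the answer), guarded by function_exists so the code still runs where the extension is missing. The helper names cache_get and cache_set, the key and the file path are hypothetical.

```php
<?php
// Read from the in-memory cache first, then fall back to the backup file.
function cache_get(string $key, string $file)
{
    if (function_exists('apcu_fetch')) {
        $value = apcu_fetch($key, $hit);
        if ($hit) {
            return $value;
        }
    }
    if (!is_file($file)) {
        return null;   // nothing cached anywhere
    }
    $value = unserialize(file_get_contents($file));
    if (function_exists('apcu_store')) {
        apcu_store($key, $value);   // re-warm memory after a restart
    }
    return $value;
}

// Write to the file (survives a crash) and to memory (fast reads).
function cache_set(string $key, string $file, $value): void
{
    file_put_contents($file, serialize($value), LOCK_EX);
    if (function_exists('apcu_store')) {
        apcu_store($key, $value);
    }
}

// Demo with a small array of string tuples.
$file = sys_get_temp_dir() . '/small_structs.cache';
$records = [['alice', 'admin', 'en'], ['bob', 'editor', 'de']];

cache_set('small_structs', $file, $records);
$again = cache_get('small_structs', $file);
```

Swapping serialize/unserialize for json_encode/json_decode in the two file operations is a one-line change each, which makes it easy to time both variants for your data.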
I'm making a website that (essentially) lets the user submit a word, matches it against a MySQL database, and returns the closest match found. My current implementation is that whenever the user submits a word, the PHP script is called, it reads the database information, scans each word one-by-one until a match is found, and returns it.
I feel like this is very inefficient. I'm about to make a program that stores the list of words in a tree structure for much more effective searching. If there are tens of thousands of words in the database, I can see the current implementation slowing down quite a bit.
My question is this: instead of having to write another, separate program, and use PHP to just connect to it with every query, can I instead save an entire data tree in memory with just PHP? That way, any session, any query would just read from memory instead of re-reading the database and rebuilding the tree over and over.
I'd look into running an instance of memcached (http://www.memcached.org) on your server.
You should be able to store the compiled tree of data in memory there and retrieve it for use in PHP. You'll have to load it into PHP to perform your search, though, as well as architect a way for the tree in memcached to be updated when the database changes (assuming the word list can be updated, since there's not a good reason to store it in a database otherwise).
Might I suggest looking at the memory table type in mysql: http://dev.mysql.com/doc/refman/5.0/en/memory-storage-engine.html
You can then still use mysql's searching features on fast "in memory" data.
PHP really isn't a good language for large memory structures. It's just not very memory efficient and it has a persistence problem, as you are asking about. Typically with PHP, people would store the data in some external persistent data store that is optimized for quick retrieval.
Usually people use a two-fold approach:
1) Store data in the database as optimized as possible for standard queries
2) Cache results of expensive queries in memcached
If you are dealing with a lot of data that cannot be indexed easily by relational databases, then you'd probably need to roll your own daemon (e.g., written in C) that kept a persistent copy of the data structure in memory for fast querying capabilities.
I'm creating a web service that often scrapes data from remote web pages. After scraping this data, I have a simple multidimensional array of information to use. The scraping process is fairly taxing on my server, and the page load takes a while. I was considering adding a simple cache system using a MySQL database, where I create one row per remote web page with the array of information pulled from it stored as a JSON-encoded string. Is this a good enough system? Or would something like a text file per web page be a better idea?
Since you're scraping multiple web pages, and you want your data to be persistently cached, you have a few options, the best of which would be to use memcache or a database such as MySQL. Using text files is not a good idea, because you would have to serialize / deserialize your data, and read from your filesystem. Querying a database or a memcache is many times more efficient.
Since you're probably looking for your cache to be somewhat persistent, I would suggest going with MySQL. You would simply create a table that has an auto-incrementing primary key, with a column for each element in your parsed JSON object. (Note that MySQL currently does not support arrays. In order to emulate them, you will need to use relational tables, or serialize your array data and store it in a text field. The former method is preferred.)
Every time you scrape a page, you would run an UPDATE statement to update that individual page's information in the database. If you specify a unique index on whatever you use to uniquely identify your page (URL / etc), you will achieve optimal look-up performance.
If you're looking to store the cache locally on one server (e.g. if your MySQL server and HTTP server are on the same box), you might be better off using APC, a caching extension for PHP.
If you're looking to store the data remotely (e.g. a dedicated cache box) then I would go with Memcache instead of MySQL.
"When all you have is a hammer ..."
I don't tend to have particularly large APC configs, 64 - 128MB max. Memcache can go to a couple of gigabytes or maybe more (far more if you run multiple instances). Both are also transient: a restart of Apache or of Memcache (the latter is slightly less likely, or less frequent) will lose the data.
It depends, then, on how often you are willing to process the data to produce the cache, and how long that cache could otherwise be useful for. If it is good for weeks before you re-scrape the pages, MySQL is an entirely suitable backing store.
Potential other options, depending on how many items are being cached and how big the data is, are, as you suggest, a file-based cache, SQLite, or other systems.
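For comparison, the file-based option mentioned above can be sketched in a few lines: one JSON file per scraped URL, reused until it is older than a chosen lifetime. The function name cached_scrape, the directory, the lifetime and the fake scraper are all hypothetical; the real scraping code would be passed in as the callable.

```php
<?php
// Return cached data for $url if it is younger than $ttl seconds;
// otherwise run $scrape, store its result as JSON, and return it.
function cached_scrape(string $url, int $ttl, callable $scrape, string $dir): array
{
    $file = $dir . '/' . sha1($url) . '.json';
    clearstatcache(true, $file);   // make sure we see the current mtime

    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return json_decode(file_get_contents($file), true);
    }

    $data = $scrape($url);
    file_put_contents($file, json_encode($data), LOCK_EX);
    return $data;
}

// Demo with a fake scraper that counts how often it is called.
$dir = sys_get_temp_dir() . '/scrape_cache_' . uniqid();
mkdir($dir);

$calls = 0;
$fakeScrape = function (string $url) use (&$calls): array {
    $calls++;
    return ['url' => $url, 'title' => 'Demo page'];
};

$first  = cached_scrape('http://example.com/a', 3600, $fakeScrape, $dir);
$second = cached_scrape('http://example.com/a', 3600, $fakeScrape, $dir);
```

The same function signature would work with a MySQL or Memcache backend behind it, so the choice of store can be changed later without touching the scraping code.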