A general PHP question about organizing a website: for efficiency purposes, is it better to store data from MySQL queries into global arrays, or to make a new query every time data is needed? I am thinking specifically of a sports stats-oriented website, with a lot of data that does not necessarily change very often.
I have heard that storing the data in arrays is much more efficient, but I don't see how, since global variables are only global within the scope of the current PHP page. Ideally, I'd like to populate all my arrays once when I start my server. Should I use session variables then? I haven't heard of anybody doing that.
A session variable won't solve the problem either, as the session isn't global across users (unless you hack around it by giving every visitor the same session_id).
If you have a lot of traffic and you need to save queries, then use a cache server like memcached or Redis.
If you can't install memcached or Redis, you can create a PHP file that contains the arrays and include it in your scripts, i.e. use file caching. The downside of this approach is memory: the whole data set has to be read into memory by every PHP script, for every visitor. So if the database is not the bottleneck, you're better off keeping the queries.
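For illustration, a minimal sketch of that file-caching approach (the player_stats table, the cache path and the one-hour lifetime are assumptions for the example):

<?php
// Rebuild the cache file when it is missing or older than an hour.
$cacheFile = __DIR__ . '/cache/stats.php';

if (!is_file($cacheFile) || filemtime($cacheFile) < time() - 3600) {
    $pdo   = new PDO('mysql:host=localhost;dbname=sports', 'user', 'pass');
    $stats = $pdo->query('SELECT * FROM player_stats')->fetchAll(PDO::FETCH_ASSOC);
    // var_export() renders the array as PHP source, so including the file
    // later simply returns the pre-digested data.
    file_put_contents($cacheFile, '<?php return ' . var_export($stats, true) . ';', LOCK_EX);
}

$stats = include $cacheFile; // every script reads the array from disk, not the DB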
Related
I am currently programming a PHP site, which at the moment needs to query a large amount of data (about 4-5 MB) every time. I already have a session going and wanted to ask: is it good practice to store that data in the session variable?
The current plan is to also maintain a table in the database recording when each table last changed. If that timestamp is newer than the cached copy, the data is queried again; if not, the data in the session variable is reused, since it is still consistent (roughly as sketched below).
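Roughly, something like this (the last_modified table and the column names are just placeholders):

<?php
// Compare the table's last-change timestamp with the copy cached in the session.
session_start();
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

$stmt = $pdo->prepare('SELECT changed_at FROM last_modified WHERE table_name = ?');
$stmt->execute(['big_table']);
$changedAt = (int) $stmt->fetchColumn();

if (!isset($_SESSION['big_table_cached_at']) || $_SESSION['big_table_cached_at'] < $changedAt) {
    // Stale or missing: re-query the large data set and cache it in the session.
    $_SESSION['big_table'] = $pdo->query('SELECT * FROM big_table')->fetchAll(PDO::FETCH_ASSOC);
    $_SESSION['big_table_cached_at'] = $changedAt;
}

$data = $_SESSION['big_table'];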
Is this a good way to avoid querying too much data? And what speed impact would the site see when a session is about 5 MB in size?
Thanks in advance!
It's not really good practice (it will make PHP chew far more memory than it really should), but I'm not sure how it will affect performance.
I suppose the real question is this: Why do you need to store so much in the session? If it's information that is meant to be accessible between sessions, then you should be storing it in a database and loading it 'at need'.
If it's binary data (images, files, etc.) that is only relevant while the session is valid, store it in a temporary file for the user (look at tempnam() and sys_get_temp_dir()) and keep only the temporary filename in the session.
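For example (a minimal sketch; $binaryData stands in for whatever blob you're holding on to):

<?php
// Write the large blob to a per-user temp file and keep only its path in the session.
session_start();

$tmpFile = tempnam(sys_get_temp_dir(), 'sess_');
file_put_contents($tmpFile, $binaryData);
$_SESSION['blob_path'] = $tmpFile;       // a few bytes in the session instead of megabytes

// On a later request, read it back (and unlink it when the session ends).
$binaryData = file_get_contents($_SESSION['blob_path']);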
No, it's not good practice to do this.
Points to consider:
By default, the session data is stored on disk in a temp folder. Every time you call session_start() (i.e. on every page load), PHP has to read all of that data into memory and populate the session array. If you're loading large amounts of data, this can have performance implications.
Also, since you're loading this large chunk of data every time, it means that each page load will take more memory. This reduces the number of concurrent users that your server can support.
If you're doing this for caching purposes to reduce hits to your DB, there are much better solutions available. APCu, Memcache, Redis and others can all do a much better job of caching your data than your proposed custom-written solution. There are also wrapper libraries that make it even easier and let you mix and match caching backends. If you're using a framework like Laravel or Symfony, there may be caching classes built into it. Alternatively, you could try a stand-alone library like phpFastCache. And don't forget that modern DB engines have their own caching mechanisms built in, so repeated calls to the same or similar queries should be reasonably fast anyway.
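As an illustration, a read-through cache with APCu can look roughly like this (the "users" key, the query and the TTL are made up for the example):

<?php
// Return the cached user list, falling back to the database on a cache miss.
function getUsers(PDO $pdo): array
{
    $users = apcu_fetch('users', $hit);
    if (!$hit) {
        $users = $pdo->query('SELECT * FROM users')->fetchAll(PDO::FETCH_ASSOC);
        apcu_store('users', $users, 300); // keep it for 5 minutes
    }
    return $users;
}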
For large arrays, is it better to save the data to global variables or to query the database each time I need them? In my situation, keeping them in local scope and passing them to functions isn't an option.
I'm using WordPress, and on most pages I get every user and all the metadata attached to them. Often I use these variables in multiple places on the same page. Unfortunately WordPress won't let me pass variables between templates, so I'm stuck either using global variables or calling the database each time. Eventually this will be hundreds of users with a lot of metadata attached to each. Should I call the database each time to keep the variables local, or should I save them to global variables to save on database queries? What are the considerations? Should I worry about performance, overhead, and/or other issues?
Thanks so much!
The only real solution to your problem is using some kind of cache system (Memcache and Redis are your best options). Fortunately, there are plenty of Wordpress plugins that make the integration an easy thing. For instance:
Redis: https://wordpress.org/plugins/redis-object-cache/
Memcache: https://wordpress.org/plugins/memcached/
EDIT
If you only want to cache a few database calls, you can forget about WordPress plugins and write a bit of code yourself. Let's say you only want to cache the call that retrieves the list of users from the database, and let's assume you are using Memcache for the task (Memcache stores key-value pairs and allows super fast access to a value given its key).
Query Memcache asking for the key "users".
Memcache doesn't have that key yet, so you'll get a cache miss; after it, query your database to retrieve the user list. Now serialize the database response (serialize and json_encode are two different ways to do this) and store the key "users" along with this serialized value in your Memcache.
Next time you query your Memcache asking for "users", you'll get a hit. At that point you just have to unserialize the value and work with your user list.
And that's all. Now you just have to decide what you want to cache and apply this procedure to those elements.
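Put together, the flow looks roughly like this with the php-memcached extension (the server address, key name and query are assumptions; note that this extension serializes arrays for you, so the explicit serialize step above only matters with lower-level clients):

<?php
$memcached = new Memcached();
$memcached->addServer('127.0.0.1', 11211);

$users = $memcached->get('users');
if ($memcached->getResultCode() === Memcached::RES_NOTFOUND) {
    // Cache miss: hit the database once, then store the result for next time.
    $pdo   = new PDO('mysql:host=localhost;dbname=wordpress', 'user', 'pass');
    $users = $pdo->query('SELECT ID, user_login FROM wp_users')->fetchAll(PDO::FETCH_ASSOC);
    $memcached->set('users', $users, 600); // expire after 10 minutes
}
// $users now holds the list, whether it came from the cache or the database.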
You shouldn't have to perform the call more than once per page, though you may have to execute it once on every page. So I would suggest creating some sort of class to interact with your database that you can call on to get the data you need. I would also recommend using stored procedures and functions on your database instead of straight queries, since this helps both with security and with separating application logic from data access.
I read a lot about how evil global variables are in PHP, but I am trying to optimize code I am writing. In this web app a lot of functions use the same data (up to about 50 items at once) to perform numerous operations, and the data itself is stored in a database.
I have two options: a) fetch the data from the database EVERY TIME a function needs it, or b) fetch the data ONCE and store it in (a) global variable(s).
When it comes to performance, which option is best?
There is nothing wrong with "global variables" as such. What is frowned upon is passing data into functions with the global keyword (although using that keyword for data that genuinely is global is okay).
Yes, within one script instance (and with a sane amount of data), there is no point in reaching for the database again for the same data. Fetch it once and then use it in whatever functions need it. That's fine, and there's nothing wrong with it.
When it comes to performance, here is the best advice of all: care about performance only when you have a concrete reason to.
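That said, to make the fetch-once approach concrete, here is a minimal sketch (the table and function names are illustrative):

<?php
// Load the shared rows a single time per request and reuse them everywhere.
function loadItems(PDO $pdo): array
{
    static $items = null;            // cached for the rest of this request
    if ($items === null) {
        $items = $pdo->query('SELECT * FROM items LIMIT 50')->fetchAll(PDO::FETCH_ASSOC);
    }
    return $items;
}

$pdo    = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$first  = loadItems($pdo);  // runs the query
$second = loadItems($pdo);  // reuses the already-fetched array, no second query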
I have an array containing my reference variables, and in my scripts I need to fetch one or two of them. In the current system, I have to include the entire array (and its elements) to use a single element. It seems that using a database would be better for two reasons:
One record is read instead of the entire array
Variables can be easily edited
However, there is a major drawback to using a database: on every PHP run, we need to make a connection to the database.
Since simple database systems like SQLite have no server, persistent connections don't help the way they do with full database servers like MySQL.
In action,
$db = new SQLite3('mysqlitedb.db');
takes more time (and consumes more resources) than
include 'array.php';
Is there any solution for having a basic database system (with a fast connection) as a replacement for the PHP array and include file?
In other words, I need a simple database system whose connection is as fast as fopen. However, even CDB, which is incredibly fast, is not fast enough on the initial connection.
By including the static array file you are essentially doing what caching systems do when they pull a result from a database. You are loading a pre-digested result directly from disk.
All database connections have some overhead (certainly more than including a rendered file). You use a database when you need operational maintainability for your data, but this comes at the cost of application overhead.
If you are not worried about persistence of the data, you may want to look at using a caching system like APC, memcached or Redis.
Have you considered caching the variables? You could use APC or Memcached for this purpose. Both will be faster than a database since the data is stored in RAM, not on disk.
It will still be slower than just including the array.
Is it "better" (more efficient, faster, more secure, etc) to (A) cache data that is used on every page load in the $_SESSION array (but still querying a table for a flag to reload the data fresh), or (B) to load it from the database each time?
I'm using the cache method (A), but I'm worried that with hundreds of users, memory could become an issue. It's just simple data, like first name, last name, birthday, etc.
With either method, there's still a query being run. Thoughts?
If your data is used on every page and is the same for all users, I wouldn't cache it in $_SESSION (which means keeping a separate copy of that data for each user), but with another mechanism, like:
A file
In memory, with APC, for instance (if you have only 1 server)
In memory, with memcached, for instance (if you have several servers)
If your data requires long calculations or several DB queries to obtain, caching it in the database could be another possibility (it would mean only 1 query to fetch it back, and fewer calculations).
If your data is not the same for each user (which seems to be the case in your situation, as you are caching names, birthdates, ...) :
I would make sure I only cache what is necessary
Once you only have a small amount of data to cache, putting it in the session should be quite OK
If you really have that many users, you'll probably have some other scalability problems, and will most likely come to use something like memcached anyway ; which means you'll have some other way of caching ;-)
As a sidenote: if you are doing the same query over and over again, your DB server should cache it by itself (for MySQL, it would go into the "query cache"); so it would not be as bad as you think, I suppose -- even if not that well optimized ^^
It depends on what your session handler is. Your session handler could be MySQL, in which case the question would not be which is better, but how to optimize your session handling.
The default PHP session handler is files, but it can be changed to mysql quite easily.
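For example, a MySQL-backed handler is just a class implementing SessionHandlerInterface (a sketch in PHP 8 syntax; the sessions table layout with id, data and updated_at columns is an assumption):

<?php
class MysqlSessionHandler implements SessionHandlerInterface
{
    public function __construct(private PDO $pdo) {}

    public function open(string $path, string $name): bool { return true; }
    public function close(): bool { return true; }

    public function read(string $id): string|false
    {
        $stmt = $this->pdo->prepare('SELECT data FROM sessions WHERE id = ?');
        $stmt->execute([$id]);
        return (string) $stmt->fetchColumn(); // empty string when the session is new
    }

    public function write(string $id, string $data): bool
    {
        $stmt = $this->pdo->prepare('REPLACE INTO sessions (id, data, updated_at) VALUES (?, ?, ?)');
        return $stmt->execute([$id, $data, time()]);
    }

    public function destroy(string $id): bool
    {
        return $this->pdo->prepare('DELETE FROM sessions WHERE id = ?')->execute([$id]);
    }

    public function gc(int $max_lifetime): int|false
    {
        $stmt = $this->pdo->prepare('DELETE FROM sessions WHERE updated_at < ?');
        $stmt->execute([time() - $max_lifetime]);
        return $stmt->rowCount();
    }
}

$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
session_set_save_handler(new MysqlSessionHandler($pdo), true);
session_start();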
If you're talking about non-user-specific data, then just save it to the DB. Worry about optimizing if you run into problems later. It is usually much more beneficial to use a better design pattern than to think about optimizing beforehand. Design your code so you can easily switch to a different storage handler, and you won't have optimization problems later.
If it is user specific, use the session, but use an appropriate session handler if necessary.