Retrieving Records : Array VS DB - php

I'm concerned about my page loading speed; I know there are a lot of factors that affect page loading time.
Is retrieving records (categories) from an array instead of the DB faster?
Thanks

It is faster to keep it all in PHP until you have an absurd number of records and use up RAM.
BUT, both of these things are super fast. Selecting a handful of records from a single table that has an index should take less than a millisecond. Are you sure that you know the source of your web page slowness?
I would be a little bit cautious about having your data in your code. It will make your system less maintainable. How will users change categories?
This gets back to deciding whether you want your site static versus dynamic.

Yes, of course retrieving data from an array is much faster than retrieving data from a database, but arrays and databases usually have totally different use cases: data in an array is static (you type the values in code or in a separate file and can't modify them), while data in a database is dynamic.

Yes, it's probably faster to have an array of your categories directly in your PHP script, especially if you need all the categories on every page load. This makes it possible for APC to cache the array (if you have APC running), and also lessens the traffic to/from the database.
But is this where your bottleneck is? It seems to me that the categories should already be cached in the query cache and therefore be easily retrieved. If this is not your biggest bottleneck, chances are you won't see any decrease in loading times. Make sure to profile your application to find the large bottlenecks, or you will waste your time getting only small performance gains.
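For the APC idea above, a minimal sketch; the cache key, TTL, table name, and columns are assumptions:

// Hypothetical sketch: serve categories from APC, falling back to the DB.
function get_categories(PDO $pdo)
{
    $categories = apc_fetch('site_categories', $success);
    if (!$success) {
        $stmt = $pdo->query('SELECT id, name FROM categories');
        $categories = $stmt->fetchAll(PDO::FETCH_KEY_PAIR);
        apc_store('site_categories', $categories, 3600); // refresh hourly
    }
    return $categories;
}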

If you store categories in a database, you have to connect to the database, prepare a SQL statement, send it to the server, fetch the result set, and (probably) store the results in an array. (But you'll probably already have a connection to the database anyway, and hardware and software are designed to do this kind of work quickly.)
Storing and retrieving categories from a database trades speed for maintenance. Your data is always up to date; it might take a little longer to get it.
You can also store categories as constants or as literals in an array assignment. It would be smart to generate the constants or the array literals from data stored in the database, but you might not have to do that for every page load. If "categories" doesn't change much, you might be able to get away with generating the code once or twice a day, plus whenever someone adds a category. It depends on your application.
Storing and retrieving categories from an array trades maintenance for speed. Your data loads a little faster; it might be incomplete.
The unsatisfying answer is that you're not going to be able to tell how different storage and page generation strategies affect page loading speed until you test them. And even testing isn't that easy, because the effect of changing server and database parameters can be, umm, surprising.
(You can also generate static pages from the database using php. I suggest you test some static pages to give you an idea of "best case" performance.)

Related

MySQL or JSON for data retrieval

So, I have a situation and I need a second opinion. I have a database and it's working great with all foreign keys, indexes and stuff, but when I reach a certain number of visitors, around 700-800 concurrent visitors, my server hits a bottleneck and displays "Service temporarily unavailable." So I had an idea: what if I pull data from JSON instead of the database? I mean, I would still update the database, but on each update I would regenerate the JSON file and pull data from it to show on my homepage. That way I would not press my CPU too hard and I would be able to have some kind of cache on the user end.
What you are describing is caching.
Yes, it's a common optimization to avoid over-burdening your database with query load.
The idea is you store a copy of data you had fetched from the database, and you hold it in some form that is quick to access on the application end. You could store it in RAM, or in a JSON file. Some people operate a Memcached or Redis in-memory database as a shared resource, so your app can run many processes or threads that access the same copy of data in RAM.
It's typical that your app reads some given data many times for every single time it updates the data. The greater this ratio of reads to writes, the better the savings in terms of lightening the load on your database.
It can be tricky, however, to keep the data in cache in sync with the most recent changes in the database. In other words, how do all the cache copies know when they should re-fetch the data from the database?
There's an old joke about this:
There are only two hard things in Computer Science: cache invalidation and naming things.
— Phil Karlton
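As a minimal read-through sketch of this idea with the PHP Memcached extension; the key name, TTL, and fetch_homepage_rows() helper are hypothetical:

// Hypothetical read-through cache: try RAM first, fall back to the DB.
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

$rows = $mc->get('homepage_rows');
if ($mc->getResultCode() === Memcached::RES_NOTFOUND) {
    $rows = fetch_homepage_rows(); // your existing DB queries
    // A short TTL bounds how stale the cache can get (one answer to the
    // invalidation problem); deleting the key on every write is another.
    $mc->set('homepage_rows', $rows, 60);
}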
So after another few days of exploring and trying to get the right answer, this is what I have done. I decided to create another table, instead of JSON, and put all the data that was supposed to go in the JSON file in the table.
WHY?
Number one reason is that MySQL has the ability to lock tables while they're being updated; JSON has not.
Number two is that I will go down from a few dozen queries to just one, the simplest query: SELECT * FROM table.
Number three is that I have better control over content this way.
Number four: while I was searching for an answer I found out that some people had issues with JSON availability when a lot of concurrent connections requested the same JSON; I would never have a problem with availability.

problems due to large number of sql queries in a php page

I am working on a project where I need to put a large number of SQL queries on a single page.
My question is: will I have any problems in the future if my site gets heavy traffic?
I do not want my site to slow down.
Please suggest some way so that the number of queries does not affect my site performance.
I am working in PHP.
A SQL query may look like:
$selectcomments=mysql_query("select `comment`,`email`,`Date` from `fk_views` where (`onid`='$idselect_forcomments' and comment !='') order by Date asc");
Of course, if your site gets bigger, you will have problems putting everything on one page. That's logical, and you can't change it.
Different solutions:
Pagination: you could create a pagination system (plenty of tutorials out there, e.g. http://net.tutsplus.com/tutorials/php/how-to-paginate-data-with-php/; see the sketch at the end of this answer).
If it's possible, divide your pages. Don't have all the comments on one and only one page. Try to have different pages, with different type of data, so it'll divide the load.
It's obvious that if your database gets too big, it'll be impossible to simply dump all the data on one page. Even the quickest browsers would crash.
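For the pagination suggestion above, a minimal LIMIT/OFFSET sketch with PDO; the table and columns come from the question, the page size is an assumption:

// Hypothetical pagination: 50 comments per page, page number from the URL.
$perPage = 50;
$page    = isset($_GET['page']) ? max(1, (int) $_GET['page']) : 1;
$offset  = ($page - 1) * $perPage;

// LIMIT/OFFSET are forced to integers above, so interpolating them is safe.
$stmt = $pdo->prepare(
    "SELECT comment, email, Date FROM fk_views
      WHERE onid = ? AND comment != ''
      ORDER BY Date ASC
      LIMIT $perPage OFFSET $offset"
);
$stmt->execute(array($idselect_forcomments));
$comments = $stmt->fetchAll(PDO::FETCH_ASSOC);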
One thing you can do is use Memcached. It will cache those results, so the next visitor who clicks on the same page reads the cached objects instead of running the queries against SQL again.
Another trick: ORDER BY Date ASC on a huge result can really slow down the query if it forces a full table scan; in that case it may be better and faster to sort on the PHP side.
Also, as Yannik said: pagination (this is basic, of course) and dividing pages.
You can reduce the pagination delay by pre-executing SQL with Ajax, e.g. getting the total result count for the pagination.
Yes, obviously if you have a lot of queries on a single page then even moderate traffic can flood your database with queries.
A few tips:
1) Work on your database structure: how you have created tables, which table stores what, normalization, etc. Try to optimise the storage and retrieval of information so that a single query fetches the maximum information. This will reduce the calls to the database.
2) Never store and fetch redundant info (like age, which you can calculate from DOB) from the database.
3) Pagination (as pointed out earlier).
4) Caching.
5) If you are updating small portions of your page at a time, then instead of reloading the entire page, use AJAX to update the necessary portions. It will also increase interactivity.

Which is faster / more efficient - lots of little MySQL queries or one big PHP array?

I have a PHP/MySQL based web application that has internationalization support by way of a MySQL table called language_strings with the string_id, lang_id and lang_text fields.
I call the following function when I need to display a string in the selected language:
public function get_lang_string($string_id, $lang_id)
{
    $db = new Database();
    $sql = sprintf('SELECT lang_string FROM language_strings WHERE lang_id IN (1, %s) AND string_id = %s ORDER BY lang_id DESC LIMIT 1',
        $db->escape($lang_id, 'int'),
        $db->escape($string_id, 'int'));
    $row = $db->query_first($sql);
    return $row['lang_string'];
}
This works perfectly, but I am concerned that there could be a lot of database queries going on; e.g. the main menu has 5 link texts, all of which call this function.
Would it be faster to load the entire language_strings table results for the selected lang_id into a PHP array and then call that from the function? Potentially that would be a huge array with much of it redundant but clearly it would be one database query per page load instead of lots.
Can anyone suggest another more efficient way of doing this?
There isn't an answer that isn't case-specific; you really have to look at it on a case-by-case basis. Having said that, the majority of the time it will be quicker to get all the data in one query, put it into an array or object, and refer to it from there.
The caveat is whether you can pull all your data that you need in one query as quickly as running the five individual ones. That is where the performance of the query itself comes into play.
Sometimes a query that contains a subquery or two will actually be less time efficient than running a few queries individually.
My suggestion is to test it out. Get a query together that gets all the data you need, see how long it takes to execute. Time each of the other five queries and see how long they take combined. If it is almost identical, stick the output into an array and that will be more efficient due to not having to make frequent connections to the database itself.
If however, your combined query takes longer to return data (it might cause a full table scan instead of using indexes for example) then stick to individual ones.
Lastly, if you are going to use the same data over and over - an array or object will win hands down every single time as accessing it will be much faster than getting it from a database.
OK - I did some benchmarking and was surprised to find that putting things into an array rather than using individual queries was, on average, 10-15% SLOWER.
I think the reason for this was that, even if I filtered out the "uncommon" elements, inevitably there were always going to be unused elements as a matter of course.
With the individual queries I am only ever getting out what I need, and as the queries are so simple I think I am best sticking with that method.
This works for me; of course, in other situations where the individual queries are more complex, I think the method of storing common data in an array would turn out to be more efficient.
I agree with what everybody says here: it's all about the numbers.
Some additional tips:
Try to create a single memory array which holds the minimum you require. This means removing most of the obvious redundancies.
There are standard approaches for these issues in performance critical environments, like using memcached with mysql. It's a bit overkill, but this basically lets you allocate some external memory and cache your queries there. Since you choose how much memory you want to allocate, you can plan it according to how much memory your system has.
Just play with the numbers. Try using separate queries (which is the simplest approach) and stress your PHP script (like calling it hundreds of times from the command line). Measure how much time this takes and see how big the performance loss actually is. Speaking from personal experience, I usually cache everything in memory and then one day when the data gets too big, I run out of memory. Then I split everything into separate queries to save memory, and see that the performance impact wasn't that bad in the first place :)
I'm with Fluffeh on this: look into the other options at your disposal (joins, subqueries, making sure your indexes reflect the relationships in the data, but don't over-index, and test). Most likely you'll end up with an array at some point, so here's a little performance tip. Contrary to what you might expect, stuff like
$all = $stmt->fetchAll(PDO::FETCH_ASSOC);
is less memory-efficient compared to:
$all = array(); // or $all = []; in PHP 5.4+
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    $all[] = $row['lang_string'];
}
What's more: you can check for redundant data while fetching the data.
My answer is to do something in between. Retrieve all strings for a lang_id that are shorter than a certain length (say, 100 characters). Shorter text strings are more likely to be used in multiple places than longer ones. Cache the entries in a static associative array in get_lang_string(). If an item isn't found, then retrieve it through a query.
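A sketch of that hybrid, using PDO instead of the question's custom Database class; the pre-load query and the assumption that real languages have lang_id greater than 1 are mine, the 100-character cutoff is the suggestion above:

// Hypothetical hybrid: pre-load short strings once, query only on a miss.
function get_lang_string_cached(PDO $pdo, $string_id, $lang_id)
{
    static $cache = array();
    $lang_id = (int) $lang_id;

    if (!isset($cache[$lang_id])) {
        $cache[$lang_id] = array();
        // Default language (1) first, so selected-language rows (assumed
        // to have lang_id > 1) overwrite the defaults.
        $stmt = $pdo->query(
            "SELECT string_id, lang_string FROM language_strings
              WHERE lang_id IN (1, $lang_id)
                AND CHAR_LENGTH(lang_string) < 100
              ORDER BY lang_id ASC"
        );
        foreach ($stmt as $row) {
            $cache[$lang_id][$row['string_id']] = $row['lang_string'];
        }
    }

    if (!array_key_exists($string_id, $cache[$lang_id])) {
        // Miss (a long string): fall back to the question's original query.
        $stmt = $pdo->prepare(
            'SELECT lang_string FROM language_strings
              WHERE lang_id IN (1, ?) AND string_id = ?
              ORDER BY lang_id DESC LIMIT 1'
        );
        $stmt->execute(array($lang_id, $string_id));
        $cache[$lang_id][$string_id] = $stmt->fetchColumn();
    }
    return $cache[$lang_id][$string_id];
}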
I am currently at the point in my site/application where I have had to put the brakes on and think very carefully about speed. I think these speed tests should consider the volume of traffic on your server as an important variable that will affect the results. If you are putting data into JavaScript data structures and processing it on the client machine, the processing time should be more consistent. If you are requesting lots of data through MySQL via PHP (for example), this puts demand on one machine/server rather than spreading it. As your traffic grows you have to share server resources with many users, and I am thinking that this is where getting JavaScript to do more will lighten the load on the server. You can also store data on the local machine via localStorage.setItem() / localStorage.getItem() (most browsers have about 5 MB of space per domain). If you have data in the database that does not change often, you can store it on the client and then just check at 'start-up' whether it is still valid.
This is my first comment posted after having and using the account for a year, so I might need to fine-tune my rambling; I'm just voicing what I'm thinking through at present.

Which is better? An extra database call or a generated PHP file?

I want to add some static information, associated with string keys, to all of my pages. The individual PHP pages use some of that information, filtered by a query string. Which is the better approach for adding this information: generate a 100K (or larger, if more info is needed later) PHP file with an associative array, or add another DB table with this info and query that?
The first solution involves loading the 100K file every time, even if I use only some of the information on the current page. The second, on the other hand, adds an extra database call to the rendering of every page.
Which is the less costly if there are a large number of pages? Loading a PHP file or making an extra db call?
Unless it is shown to really be a bottleneck (be it including the php file or querying the database), you should choose the option that is best maintainable.
My guess is that it is the second option. Store it in a database.
Storing it in a database is a much better plan. With the database you can provide better data constraints, more easily cross reference with other data and create strong relationships. You may or may not need that at this time, but it's a much more flexible solution in the end.
What is the data used for? I'm wondering if the data you need could be stored in a session variable/cookie once it is pulled from the database which would allow you to not query the db on the rendering of every page.
If you were to leverage a PHP file, then utilizing APC or some other opcode cache will mitigate performance concerns, as your PHP files will only be recompiled when the file changes.
However, as others have noted, a database is the best place to store this stuff as it is much easier to maintain (this should be your priority to begin with).
Having ensured ease of maintenance and a working application, should you require a performance boost then generally accepted practice would be to cache this static data in an in-memory key/value store such as memcached. This will give you rapid access to your static values (for most requests).
I wouldn't call this information "static".
To me, it's just a routine call to get some information from the database, among the other calls being made to assemble the whole page. What am I missing?
And I do agree with Dennis: all optimizations should be based on real needs and profiling. Otherwise the effect could be the opposite.
If you want to utilize some caching, consider implementing a Conditional GET for the whole page.
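A minimal Conditional GET sketch using an ETag derived from the rendered page; the build_page() helper and the md5 hash are assumptions:

// Hypothetical ETag handling: skip resending the body if it hasn't changed.
$body = build_page(); // your existing page-rendering code
$etag = '"' . md5($body) . '"';

header('ETag: ' . $etag);
if (isset($_SERVER['HTTP_IF_NONE_MATCH'])
    && $_SERVER['HTTP_IF_NONE_MATCH'] === $etag) {
    header('HTTP/1.1 304 Not Modified'); // the client re-uses its cached copy
    exit;
}
echo $body;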

performance issue on displaying records

I have a table with just 3,000 records.
I render these 3,000 records on the home page without pagination; my client is not interested in pagination...
So showing the page completely takes around 1 min 15 sec. What can be done to make the page load more quickly?
My table structure:
customer table: customer id, customer name, guider id, and a few columns
guider table: guider id, guider name, and a few columns
Where's the slowdown? The query or the serving?
If the former, see the comments above. If the latter:
Enable gzip on the server. Otherwise capture the [HTML?] output to a file, compress it (zip), then serve it as a download. Same for any other format if you think something else can render it better than a browser (CSV and Open Office).
If you're outputting the data into a HTML table then you may have an issue where the browser is waiting for the end of the table before rendering it. You can either break this into multiple table chunks like every 500 records/rows or try CSS "table-layout: fixed;".
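If you can't enable gzip at the web-server level, PHP's own output buffering can compress the page; a minimal sketch:

// ob_gzhandler compresses the buffered output if the client supports it.
// Call this before any output is sent.
ob_start('ob_gzhandler');
// ... render the page as usual ...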
Check these to-dos:
SQL connection: don't open the connection in a loop; it should be a one-time connection per request.
Check your queries and analyse them for complex logic which can be replaced.
Use a standard class for the SQL connection and queries, e.g. ezSQL.
Follow SQL query best practices.
While you could implement a cache to do this, you don't necessarily need to do so, and introducing unnecessary cache structures can often cause problems of its own. Depending on where the bottleneck is, it may not even help you much, or at all.
You need to look in two places for your analysis:
1) The query you're using to get your data. Take a look at its plan, or if you're not comfortable doing that, run it in your favorite query tool and see how long it takes to come back. If it doesn't take too long, you've got a pretty good idea that your bottleneck isn't the query. If the query itself takes a long time, that's where you should focus your efforts.
2) How your page is rendering. What is the size of your page, in bytes? It may be too big. Can you cut the size down by formatting? Can you more effectively use CSS to eliminate duplicate styling on the page? Are you using a fixed or dynamic table layout? Dynamic is generally going to be quite a bit slower, especially for large tables. Try to avoid nesting tables. Do everything you can to make the page as small as possible, and keep testing!
while displaying records I want to display the guider name, so I made a function that returns the guider name
Sounds like you need to use a JOIN. Here's a simple example:
SELECT * FROM customer JOIN guider ON guider.id=customer.guider_id
This will change your page from using N + 1 (3001) queries to just one.
Make sure both guider.id and customer.guider_id are indexed and of appropriate data types (such as integers).
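A sketch of how the page loop might consume that JOIN with PDO; the exact column names are assumptions based on the question:

// One query instead of 3001: each row already carries its guider name.
$stmt = $pdo->query(
    'SELECT customer.customer_name, guider.guider_name
       FROM customer
       JOIN guider ON guider.guider_id = customer.guider_id'
);
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    echo htmlspecialchars($row['customer_name']), ' - ',
         htmlspecialchars($row['guider_name']), "<br>\n";
}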
This is a little list of what you should think about to improve performance. The importance is relative for each point, so the first is not necessarily the most important to you; that depends on the details of your project.
Check your database structure. If there are just these two tables, there might be little you can do. But keep in mind that there is stuff like indices, and with an increasing number of records a second, denormalized table structure can improve the speed of retrieving results.
Rather use one query to select your data than iterating through IDs and doing selects repeatedly.
Run a separate query for the guiders; I assume there are only a few of them. Save all guiders in a data structure, e.g. a dictionary, first, and use the foreign key to apply the correct one to the current record. This might save a lot of data which has to be transmitted from the database to your web server.
Get your result set by using something like mysqli_result::fetch_all(), which returns a 2-dimensional array with all results. This should be faster than iterating through each row with fetch_row().
Sanitize your HTML output and use (external) CSS. This will save a lot of output space if you currently format your stuff with style=" ... a lot of formatting code ..." attributes on each line. If you use one large table, split it up into multiple tables (some browsers wait for the complete table to load before rendering it).
In a lot of languages this is very important: use a string builder to concatenate your results into the output string (see the sketch after this list)!
Caching: think about generating the output once a day or once an hour. Write it to a cache file which is opened instead of querying the database and building the same stuff on every request. Maybe you want to offer this generated file as a download rather than displaying it as a plain HTML page on the web.
Last but not least, check the connections to the web server and database, the server load, and the number of requests. If your servers are running under heavy load, everything else here might help reduce the load, or you may just have to upgrade the hardware.
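On the string-builder point above: PHP's .= is already cheap, but collecting rows into an array and joining once is a common equivalent pattern; a sketch with assumed data:

// Build the output in pieces, then join once at the end.
$parts = array();
foreach ($rows as $row) {
    $parts[] = '<tr><td>' . htmlspecialchars($row['name']) . '</td></tr>';
}
echo '<table>', implode("\n", $parts), '</table>';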
LOL
everyone is talking about big-boy toys like database structure, caching and stuff,
while the problem most likely lies in mere HTML and browsers.
Just splitting the whole HTML table into chunks will let the first chunk show up immediately while the others eventually follow.
The only ones who were right are those who said to profile the whole thing first.
Trying to answer without profiling results is shooting in the dark.
