Improving data fetching performance from a MySQL database - PHP

For example, I have a table "tbl_book" with 100 records or more, with multiple columns like book_name, book_publisher, book_author, and book_rate, in the MySQL database "db_bookshop". I would like to fetch them all with one query, without iterating 100 times, ideally with only one or two passes of looping. Is that possible? Is there any trick for doing it? Generally we do something like this:
$result = mysql_query("SELECT desire_column_name FROM table_name WHERE clause");
while ($row = mysql_fetch_array($result)) {
    $row['book_name'];
    $row['book_publisher'];
    $row['book_author'];
    // ..........
    $row['book_rate'];
}
// Or we could use mysqli_query() with mysqli_fetch_row(), mysqli_fetch_array(), or mysqli_fetch_assoc().
My question is: is there any idea or trick by which we can avoid iterating 100 times to fetch 100 records? It may sound weird to some, but one of the most experienced programmers I knew told me it is possible. Unfortunately I was never able to learn it from him, and I feel sorry, because he is no longer with us. Thanks in advance for sharing your ideas.

You should not use mysql_query; the mysql extension is deprecated:
This extension is deprecated as of PHP 5.5.0, and has been removed as of PHP 7.0.0.
-- https://secure.php.net/manual/en/intro.mysql.php
When you use PDO you can fetch all items without writing the loop yourself, like this:
$connection = new PDO('mysql:host=localhost;dbname=testdb', 'dbuser', 'dbpass');
$statement = $connection->query('SELECT ...');
$rows = $statement->fetchAll();
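Applied to the table from the question, a minimal sketch might look like this (the connection credentials are placeholders; the database, table, and column names are taken from the question):
<?php
// Sketch: fetch every book row with one fetchAll() call, no explicit while loop.
$pdo = new PDO(
    'mysql:host=localhost;dbname=db_bookshop;charset=utf8mb4',
    'dbuser',
    'dbpass',
    [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]
);
$statement = $pdo->query(
    'SELECT book_name, book_publisher, book_author, book_rate FROM tbl_book'
);
$books = $statement->fetchAll(PDO::FETCH_ASSOC); // all rows in one call

// You still iterate when you actually use the data, e.g. to render it:
foreach ($books as $book) {
    echo $book['book_name'] . ' by ' . $book['book_author'] . PHP_EOL;
}
What matters for performance is the round trip to the server; fetchAll() does one query and one result transfer, and the foreach afterwards is cheap.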

The short answer - NO, it's impossible to fetch more than one record from a database without a loop.
But the real question here is that you don't actually want that.
There is no point in "just fetching" the data - you're always going to do something with it. With each row. Obviously, a loop is a natural way to do something with each row. Therefore, there is no point in trying to avoid a loop.
Which renders your question rather meaningless.
Regarding performance: the truth is that you will not experience a single performance problem related to fetching just 100 records from a database. Which renders your problem an imaginary one.
The only plausible question I can see in your post concerns your performance as a programmer, as a lack of education makes you write a lot of unnecessary code. If you manage to ask a specific question regarding that matter, you'll be shown a way to avoid the useless repetitive typing.

Have you tried using mysql_fetch_assoc?
$result = mysql_query("SELECT desire_column_name FROM table_name WHERE clause");
while ($row = mysql_fetch_assoc($result)) {
    // do stuff here, like...
    if (!empty($row['some_field'])) {
        echo $row["some_field"];
    }
}
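If you are on mysqli rather than the deprecated mysql extension, mysqli's fetch_all() pulls the whole result set in one call. A minimal sketch (it requires the mysqlnd driver; the database and table names are taken from the question, the credentials are placeholders):
<?php
// Sketch: one query, one fetch_all() call, no hand-written fetch loop.
$mysqli = new mysqli('localhost', 'dbuser', 'dbpass', 'db_bookshop');
$result = $mysqli->query(
    'SELECT book_name, book_publisher, book_author, book_rate FROM tbl_book'
);
$rows = $result->fetch_all(MYSQLI_ASSOC); // array of associative arrays

foreach ($rows as $row) {
    echo $row['book_name'] . ' (' . $row['book_publisher'] . ')' . PHP_EOL;
}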

It is possible to read all 100 records without a loop by hardcoding the key column values, but that would mean listing 100 times the number of columns, and there may be a limit on the number of columns you can select in MySQL.
e.g.,
SELECT
    CASE WHEN book_name='abc' THEN book_name END AS Name,
    CASE WHEN book_name='abc' THEN book_publisher END AS Publisher,
    CASE WHEN book_name='abc' THEN book_author END AS Author,
    CASE WHEN book_name='xyz' THEN book_name END AS Name,
    CASE WHEN book_name='xyz' THEN book_publisher END AS Publisher,
    CASE WHEN book_name='xyz' THEN book_author END AS Author,
    ...
    ...
FROM
    tbl_book;
It's not practical, but if you have only a few rows to query you might find it useful.

The time taken to ask the MySQL server for something is far greater than one iteration through a client-side while loop. So, to improve performance, the goal is to have the SELECT go to the server in one round trip. Different API calls do or do not do this; read their details.
I have written a lot of UIs with MySQL under the covers. I think nothing of fetching a few dozen rows at once, and then building a <table> (or something) with the results. I rarely fetch more than 100, not because of performance, but because 100 is (usually) too much for the user to take in on a single web page.
Also, I think nothing of issuing several, maybe dozens, of queries in support of a single web page. The delay is insignificant, especially when compared to the user's time for reading, digesting, and moving to the next page. So, I try to give the user a digestible amount of info without having to click to another page to get more. There are tradeoffs.
When it is practical to have SQL do the 'digesting', do so. It is faster for MySQL to do a SUM() and return just the total than to return dozens of rows for the client to add up. This is mostly a 'bandwidth' issue. Either way, MySQL will fetch (internally) all the needed rows.
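For example, a sketch against the book table from the first question (assuming book_rate is a numeric column):
-- One aggregated row crosses the wire instead of every book row.
SELECT COUNT(*)       AS book_count,
       SUM(book_rate) AS total_rate
FROM tbl_book;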

What's faster, db calls or resorting an array?

In a site I maintain I have a need to query the same table (articles) twice, once for each category of article. AFAICT there are basically two ways of doing this (maybe someone can suggest a better, third way?):
Perform the db query twice, meaning the db server has to sort through the entire table twice. After each query, I iterate over the cursor to generate HTML for a list entry on the page.
Perform the query just once and pull out all the records, then sort them into two separate arrays. After this, I have to iterate over each array separately in order to generate the HTML.
So it's this:
$newsQuery = $mysqli->query("SELECT * FROM articles WHERE type='news' ");
while ($newRow = $newsQuery->fetch_assoc()) {
    // generate article summary in html
}
// repeat for informational articles
vs this:
$query = $mysqli->query("SELECT * FROM articles ");
$news = array();
$info = array();
while ($row = $query->fetch_assoc()) {
    if ($row['type'] == "news") {
        $news[] = $row;
    } else {
        $info[] = $row;
    }
}
// iterate over each array separately to generate article summaries
The recordset is not very large, currently <200 rows, and will probably grow to 1000-2000. Is there a significant difference in the times between the two approaches, and if so, which one is faster?
(I know this whole thing seems awfully inefficient, but it's a poorly coded site I inherited and have to take care of without a budget for refactoring the whole thing...)
I'm writing in PHP, no framework :( , on a MySql db.
Edit
I just realized I left out one major detail. On a given page in the site, we will display (and thus retrieve from the db) no more than 30 records at once - but here's the catch: 15 info articles, and 15 news articles. On each page we pull the next 15 of each kind.
You know you can sort in the DB, right?
SELECT * FROM articles ORDER BY type
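On the PHP side that looks roughly like the sketch below (the 'news' and 'info' type values come from the question; the rest is illustrative): one query, rows already grouped by type, split into two lists as you read them.
<?php
// One round trip; ORDER BY delivers the rows already grouped by type.
$query = $mysqli->query("SELECT * FROM articles ORDER BY type");

$byType = array();
while ($row = $query->fetch_assoc()) {
    $byType[$row['type']][] = $row;
}

// Render each group separately.
foreach (array('news', 'info') as $type) {
    $articles = isset($byType[$type]) ? $byType[$type] : array();
    foreach ($articles as $article) {
        // generate the article summary in HTML here
    }
}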
EDIT
Due to the change made to the question, I'm updating my answer to address the newly revealed requirement: 15 rows for 'news' and 15 rows for not-'news'.
The gist of the question is the same: "which is faster... one query or two separate queries". The gist of the answer remains the same: each database roundtrip incurs overhead (extra time, especially over a network connection to a separate database server), so, all else being equal, reducing the number of database roundtrips can improve performance.
The new requirement really doesn't impact that. What the newly revealed requirement really impacts is the actual query to return the specified resultset.
For example:
( SELECT n.*
FROM articles n
WHERE n.type='news'
LIMIT 15
)
UNION ALL
( SELECT o.*
FROM articles o
WHERE NOT (o.type<=>'news')
LIMIT 15
)
Running that statement as a single query is going to require fewer database resources, and be faster than running two separate statements, and retrieving two disparate resultsets.
We weren't provided any indication of what the other values for type can be, so the statement offered here simply addresses two general categories of rows: rows that have type='news', and all other rows that have some other value for type.
That query assumes that type allows for NULL values, and we want to return rows that have a NULL for type. If that's not the case, we can adjust the predicate to be just
WHERE o.type <> 'news'
Or, if there are specific values for type we're interested in, we can specify that in the predicate instead
WHERE o.type IN ('alert','info','weather')
If "paging" is a requirement... "next 15", the typical pattern we see applied, LIMIT 30,15 can be inefficient. But this question isn't asking about improving efficiency of "paging" queries, it's asking whether running a single statement or running two separate statements is faster.
And the answer to that question is still the same.
ORIGINAL ANSWER below
There's overhead for every database roundtrip. In terms of database performance, for small sets (like you describe) you're better off with a single database query.
The downside is that you're fetching all of those rows and materializing an array. (But it looks like that's the approach you're using in either case.)
Given the choice between the two options you've shown, go with the single query. That's going to be faster.
As far as a different approach, it really depends on what you are doing with those arrays.
You could actually have the database return the rows in a specified sequence, using an ORDER BY clause.
To get all of the 'news' rows first, followed by everything that isn't 'news', you could
ORDER BY type<=>'news' DESC
That's MySQL shorthand for the more ANSI-standards-compliant:
ORDER BY CASE WHEN t.type = 'news' THEN 1 ELSE 0 END DESC
Rather than fetch every single row and store it in an array, you could just fetch from the cursor as you output each row, e.g.
while ($row = $query->fetch_assoc()) {
    echo "<br>Title: " . htmlspecialchars($row['title']);
    echo "<br>byline: " . htmlspecialchars($row['byline']);
    echo "<hr>";
}
The best way of dealing with a situation like this is to test it for yourself. It doesn't matter how many records you have at the moment; you can simulate whatever amount you'd like, that's never a problem. Also, 1000-2000 is really a small set of data.
I somewhat don't understand why you'd have to iterate over all the records twice. You should never retrieve all the records in a query either way, only the small subset you need to work with. In a typical site where you manage articles it's usually about 10 records per page MAX. No user will ever go through 2000 articles in a way that would require you to pull all the records at once. Utilize paging and smart querying.
// iterate over each array separately to generate article summaries
Not really sure what you mean by this, but something tells me this data should be stored in the database as well. I really hope you're not generating article excerpts on the fly for every page hit.
It all sounds to me more like a bad architecture design than anything else...
PS: I believe sorting/ordering/filtering of database data should be done on the database server, not in the application itself. You may save some traffic by doing a single query, but it won't help much if you transfer too much data at once that you won't be using anyway.

PHP/MySQL: Massive SQL query or several smaller queries?

I have a database design here that looks this in simplified version:
Table building:
id
attribute1
attribute2
Data in there is like:
(1, 1, 1)
(2, 1, 2)
(3, 5, 4)
And the tables attribute1_values and attribute2_values are structured as:
id
value
Which contains information like:
(1, "Textual description of option 1")
(2, "Textual description of option 2")
...
(6, "Textual description of option 6")
I am unsure whether this is the best setup or not, but it is done this way per the requirements of my project manager. There is definitely some truth in it, as you can now modify the text easily without messing up the IDs.
However, now I have come to a page where I need to list the attributes, so how do I go about it? I see two major options:
1) Make one big query which gathers all values from building and at the same time picks the correct textual representation from the attribute{x}_values table.
2) Make a small query that gathers all values from the building table. Then after that get the textual representation of each attribute one at a time.
What is the best option to pick? Is option 1 even faster than option 2 at all? If so, is it worth the extra trouble in terms of maintenance?
Another suggestion would be to create a view on the server with only the data you need and query from that. That would keep the work on the server end, and you can pull just what you need each time.
If you have a small number of rows in the attribute tables, then I suggest fetching them first - fetch all of them! Store them in an array, using the id as the array key.
Then you can proceed with the building data; you just have to use the respective array to look up each attribute value, as in the sketch below.
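A minimal sketch of that lookup-map idea (the table and column names come from the question; $mysqli is assumed to be an existing mysqli connection):
<?php
// Load each small attribute table once, keyed by id.
$attr1 = array();
foreach ($mysqli->query("SELECT id, value FROM attribute1_values") as $row) {
    $attr1[$row['id']] = $row['value'];
}
$attr2 = array();
foreach ($mysqli->query("SELECT id, value FROM attribute2_values") as $row) {
    $attr2[$row['id']] = $row['value'];
}

// Resolve each building's attributes from the in-memory maps, no extra queries.
foreach ($mysqli->query("SELECT id, attribute1, attribute2 FROM building") as $building) {
    $text1 = isset($attr1[$building['attribute1']]) ? $attr1[$building['attribute1']] : null;
    $text2 = isset($attr2[$building['attribute2']]) ? $attr2[$building['attribute2']] : null;
    // ... render the building using $text1 and $text2 ...
}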
I would recommend something in between. Parse the result from the first table in PHP, and figure out which attribute values you need to select from each attribute[x]_values table.
You can then select the attributes in bulk using one query per table (see the sketch below), rather than one query per attribute or one query per building.
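For illustration, a sketch of such a bulk per-table fetch ($buildingRows and $mysqli are assumed to already exist; the id list is built in PHP from the values actually referenced):
<?php
// Collect the attribute1 ids that actually appear in the building rows.
$ids = array_unique(array_column($buildingRows, 'attribute1'));
$idList = implode(',', array_map('intval', $ids)); // integers only, safe to inline

// One query fetches every needed description for this attribute table.
$attr1 = array();
$result = $mysqli->query("SELECT id, value FROM attribute1_values WHERE id IN ($idList)");
foreach ($result as $row) {
    $attr1[$row['id']] = $row['value'];
}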
Here is a PHP solution:
// $connection is assumed to be an existing mysqli connection.
$query   = "SELECT * FROM building";
$result  = mysqli_query($connection, $query);
$query   = "SELECT * FROM attribute1_values";
$result2 = mysqli_query($connection, $query);
$query   = "SELECT * FROM attribute2_values";
$result3 = mysqli_query($connection, $query);

$n = mysqli_num_rows($result);
for ($i = 1; $i <= $n; $i++) {
    $row = mysqli_fetch_array($result);

    mysqli_data_seek($result2, $row['attribute1'] - 1);
    $row2 = mysqli_fetch_array($result2);
    $attribute1 = $row2['value']; // use this as the value of attribute one for this object

    mysqli_data_seek($result3, $row['attribute2'] - 1);
    $row3 = mysqli_fetch_array($result3);
    $attribute2 = $row3['value']; // use this as the value of attribute two for this object
}
Keep in mind that this solution requires the ids in attribute1_values and attribute2_values to start at 1 and increase by 1 on every single row.
Oracle / Postgres / MySql DBA here:
Running a query many times has quite a bit of overhead. There are multiple round trips to the db, and if it's on a remote server, this can add up. The DB will likely have to parse the same query multiple times, which in MySQL will be terribly inefficient if there are tons of rows. Now, one thing that your PHP method (multiple queries) has as an advantage is that it'll use less memory, since it releases the results as they're no longer needed (if you run the query as a nested loop, that is; if you query all the results up front, you'll have a lot of memory overhead, depending on the table sizes).
The optimal result would be to run it as one query, and fetch the results one at a time, displaying each one as needed and discarding it, which can wreak havoc with MVC frameworks unless you're either comfortable running model code in your view, or you run small view fragments.
Your question is very generic, and I think that to get an answer you should give more hints about what this page will look like and how big the dataset is.
Will you get all the buildings with their attributes, or just one at a time?
Because your data structure looks very simple, and anything more powerful than a Raspberry Pi can handle it very well.
If you need one record at a time you don't need any special technique; just JOIN the tables.
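As a sketch (using the table and column names from the question), that JOIN might look like:
SELECT b.id,
       a1.value AS attribute1_text,
       a2.value AS attribute2_text
FROM building b
JOIN attribute1_values a1 ON a1.id = b.attribute1
JOIN attribute2_values a2 ON a2.id = b.attribute2
WHERE b.id = 1;  -- the building you want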
If you need to list all the buildings and you want to save db time, you have to measure your data.
If you have more attributes than buildings you have to choose one way; if you have 8 attributes and 2000 buildings, you can think of caching the attributes in an array, with one SELECT per table, and then just print them using the array. I don't think you will see any speed drop or improvement with such simple tables on a modern computer.
$att1[1] = 'description1';
$att1[2] = 'description2';
....
Never do one-at-a-time queries; try to combine them into a single one.
MySQL will cache your query and it will run much faster. PHP loops are faster than making many requests to the database.
The query cache stores the text of a SELECT statement together with the corresponding result that was sent to the client. If an identical statement is received later, the server retrieves the results from the query cache rather than parsing and executing the statement again.
http://dev.mysql.com/doc/refman/5.1/en/query-cache.html

Looking for recommended methods for storing/cacheing counts [closed]

Closed. This question is opinion-based and is not currently accepting answers. (Closed 8 years ago.)
I'm building a website using php/mysql where there will be Posts and Comments.
Posts need to show the number of comments they have. I have a count_comments column in the Posts table and update it every time a comment is created or deleted.
Someone recently advised me that denormalizing this way is a bad idea and I should be using caching instead.
My take is: You are doing the right thing. Here is why:
See the field count_comments as not being part of your data model - this is easily provable, you can delete all contents of this field and it is trivial to recreate it.
Instead see it as a cache, the storage of which is just co-located with the post - perfectly smart, as you get it for free whenever you have to query for the post(s)
I do not think this is a bad approach.
One thing I do recognize is that it's very easy to introduce side effects as the code base expands when you take a more rigid approach. The nice part is that at some point the number of rows in the database will have to be calculated or kept track of; there is not really a way of getting out of this.
I would not advise against this. There are other solutions for getting comment counts. Check out Which is fastest? SELECT SQL_CALC_FOUND_ROWS FROM `table`, or SELECT COUNT(*)
That solution is slower on selects, but requires less code to keep track of the comment count.
I will say that your approach avoids LIMIT DE-optimization, which is a plus.
This is an optimization that is almost never needed for two reasons:
1) Proper indexing will make simple counts extremely fast. Ensure that your comments.post_id column has an index (see the sketch after this list).
2) By the time you need to cache this value, you will need to cache much more. If your site has so many posts, comments, users and traffic that you need to cache the comments total, then you will almost definitely need to be employing caching strategies for much of your data/output (saving built pages to static, memcache, etc.). Those strategies will, no doubt, encompass your comments total, making the table field approach moot.
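A minimal sketch of that index and the count it speeds up, assuming a comments table with a post_id column as described:
-- Add an index so per-post comment counts become a cheap index lookup.
ALTER TABLE comments ADD INDEX idx_comments_post_id (post_id);

-- The query that benefits from it (123 is a placeholder post id):
SELECT COUNT(*) FROM comments WHERE post_id = 123;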
I have no idea what was meant by "caching", and I'll be interested in other answers besides the one I have to offer:
Removing redundant information from your database is important and, in a "believer way" (meaning I didn't really test it, it's merely speculative), I think that letting the database count the comments with COUNT() is a better way to go.
Assuming that all your comments have a post_id, all you need is something like:
SELECT COUNT(*) FROM comments WHERE post_id = {post_id_variation_here}
That way, you drop the one constant write that happens just to keep track of how many comments there are, and you improve performance.
Unless you have hundreds or thousands of hits per second on your application, there's nothing wrong with using a SQL statement like this:
select posts_field1, ..., (select count(*) from comments where comments_parent = posts_id) as commentNumber from posts
You can go with caching the HTML output of your page anyway; then no database query has to be done at all.
Maybe you could connect the post and comment tables to each other and count the comment rows with mysql_num_rows. Like so:
Post table
postid*
postcontent
Comment table
commentid
postid*
comment
And then count the comments in PHP like this:
$link = mysql_connect("localhost", "mysql_user", "mysql_password");
mysql_select_db("database", $link);
$result = mysql_query("SELECT * FROM commenttable WHERE postid = '1'", $link);
$num_rows = mysql_num_rows($result);

Is this mysql statement inefficient?

Quite simply, is such a query/statement inefficient or bad?
<?
$strSql = "SELECT * FROM clients, projects
    WHERE clients.clientID = $intClientId
    AND projects.clientID = $intClientId LIMIT 1";
$objResult = mysql_query($strSql);
if (mysql_num_rows($objResult) == 0) {
    echo("No data");
}
while ($arrRow = mysql_fetch_array($objResult)) {
?>
<h1>Sub Project(s) for: <span><?=$arrRow[clientName]?></span></h1>
<?
} ?>
In general, you should avoid using SELECT * and select only the fields you need unless absolutely necessary. Whether or not this is efficient depends on how your tables are indexed. I assume in this case that clientID is the primary key of the clients table. If you have an index on clientID in the projects table, this query should be quite fast.
There are several things that are off here:
As Michael Mior mentions, you should avoid SELECT *. This can be an efficiency issue. It can also break applications in some cases if your application makes assumptions about the columns that are in the table and then the table changes in the database.
You have a LIMIT 1 in your query, but then you loop over the results. This doesn't make sense because LIMIT 1 means you'll only get one row of results, regardless of how many rows matched your query.
You are not escaping your inputs. This may be OK in this case if, in earlier code, you have already verified that those variables definitely contain integer values. I generally just use prepared statements and avoid this problem altogether (see the sketch after this list).
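A minimal prepared-statement sketch of the same lookup (using mysqli; $mysqli is assumed to be an existing connection, and the table/column names mirror the question):
<?php
// Bind the client id instead of interpolating it into the SQL string.
$stmt = $mysqli->prepare(
    "SELECT c.clientName
       FROM clients c
       JOIN projects p ON p.clientID = c.clientID
      WHERE c.clientID = ?
      LIMIT 1"
);
$stmt->bind_param('i', $intClientId);
$stmt->execute();
$result = $stmt->get_result(); // requires the mysqlnd driver

if ($row = $result->fetch_assoc()) {
    echo '<h1>Sub Project(s) for: <span>' . htmlspecialchars($row['clientName']) . '</span></h1>';
} else {
    echo 'No data';
}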
And I also want to add one little piece of advice: try not to use PHP short tags (use <?php echo instead of <?=) from now on. They are not supported on every configuration and might create code errors and other difficulties for you in the future.
When writing, or should I say developing, new features, one should always debug and run a performance profile on the query to ensure that you're using the proper index(es). Always use EXPLAIN (or EXPLAIN EXTENDED if you're into specifics) on a query to determine its performance.
EXPLAIN SELECT * FROM clients, projects WHERE clients.clientID = 1 AND projects.clientID = 1
I also noted that you're doing a while() loop, which is unnecessary if you are only fetching a single row.
I prefer writing projects.clientID = clients.clientID so it is easier to see how the tables are joined.
The other thing I would avoid (mostly for clarity) is the while loop. Since you expect one record, there is no need to loop over the result set.
And finally, it is not a good idea to use the short tags.

One SQL query, or many in a loop?

I need to pull several rows from a table and process them in two ways:
aggregated on a key
row-by-row, sorted by the same key
The table looks roughly like this:
table (
key,
string_data,
numeric_data
)
So I'm looking at two approaches to the function I'm writing.
The first would pull the aggregate data with one query, and then query again inside a loop for each set of row-by-row data (the following is PHP-like pseudocode):
$rows = query(
    "SELECT key, SUM(numeric_data)
     FROM table
     GROUP BY key"
);
foreach ($rows as $row) {
    <process aggregate data in $row>
    $key = $row['key'];
    $row_by_row_data = handle_individual_rows($key);
}

function handle_individual_rows($key)
{
    $rows = query(
        "SELECT string_data
         FROM table WHERE key=?",
        $key
    );
    <process $rows one row at a time>
    return $processed_data;
}
Or, I could do one big query and let the code do all the work:
$rows = query(
    "SELECT key, string_data, numeric_data
     FROM table"
);
foreach ($rows as $row) {
    <process rows individually and calculate aggregates as I go>
}
Performance is not a practical concern in this application; I'm just looking to write sensible and maintainable code.
I like the first option because it's more modular -- and I like the second option because it seems structurally simple. Is one option better than the other or is it really just a matter of style?
One SQL query, for sure.
This will
Save you lots of roundtrips to database
Allow the use of more efficient GROUP BY methods
Since your aggregates may be performed equally well by the database, it will also be better for maintainability: you have all your resultset logic in one place.
Here is an example of a query that returns every row and calculates a SUM:
SELECT string_data, numeric_data, SUM(numeric_data) OVER (PARTITION BY key)
FROM table
Note that this will most probably use parallel access to calculate the SUMs for different keys, which is hardly implementable in PHP.
Same query in MySQL:
SELECT key, string_data, numeric_data,
(
SELECT SUM(numeric_data)
FROM table ti
WHERE ti.key = to.key
) AS key_sum
FROM table to
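On the PHP side, consuming such a combined resultset is a single pass. A sketch continuing the question's PHP-like pseudocode (query() is the question's placeholder helper; the reserved-word identifiers are backquoted here):
$rows = query(
    "SELECT `key`, string_data, numeric_data,
            (SELECT SUM(numeric_data) FROM `table` ti WHERE ti.`key` = t.`key`) AS key_sum
     FROM `table` t"
);
foreach ($rows as $row) {
    // process the individual row...
    // ...and the aggregate for its key arrives alongside it:
    $totalForKey = $row['key_sum'];
}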
If performance isn't a concern, I'd go with the second. Seems the tiniest bit friendlier.
If performance were a concern, my answer would be "don't think, profile". :)
The second answer is by far more clear, sensible and maintainable. You're saying the same thing with less code, which is usually better.
And I know you said performance is not a concern, but why fetch data more than you have to?
I can't be certain from the example here, but I'd like to know if there's a chance to do the aggregation and other processing right in the SQL query itself. In this case, you'd have to evaluate "more maintainable" with respect to your relative comfort level expressing that processing in SQL code vs. PHP code.
Is there something about the additional processing you need to do on each row that would prevent you from expressing everything in the SQL query itself?
I don't think you'll find many situations at all where doing a query-per-iteration of a loop is the better choice. In fact, I'd say it's probably a good rule of thumb to never do that.
In other words, the fewer round trips to the database, the better.
Depending on your data and actual tables, you might be able to let SQL do the aggregation work and select all the rows you need with one query.
One SQL query is probably a better idea.
It saves you from having to re-implement relational operations in your own code.
I think you've somehow answered your own question, because you say you have two different processings: one aggregated and one row by row.
If you want to keep everything readable and maintainable, mixing both in a single query doesn't sound right; the query would answer two different needs, so it won't be very readable.
Even if perf is not an issue, it's faster to do the aggregation on the DB server instead of doing it in code.
With only one query, the code that handles the result will mix two processings, handling rows and computing aggregations at the same time, so over time this code will tend to get confusing and buggy.
The same code might evolve over time; for instance the row-by-row part can get complex and could create bugs in the aggregation part, or the other way around.
If in the future you need to split these two treatments, it will be harder to disentangle code that, by that point, somebody else wrote ages ago...
Performance considerations aside, in terms of maintainability and readability I'd recommend to use two queries.
But keep in mind that while the performance factor might not be an issue at the moment, it can become one as the db volume grows; it's never a negligible factor in the long term...
Even if perf is not an issue, your mind is. When a musician practices, every movement is intended to improve the musician's skill. As a developer, you should develop every procedure to improve your skill. Iterative loops through data are sloppy and ugly; SQL queries are elegant. Do you want to develop more elegant code or sloppier code?
