Database structure:
id  galleryId                    type     file_name      description
1   `artists_2010-01-15_7c1ec`   `image`  `band602.jpg`  `Red Umbrella Promo`
2   `artists_2010-01-15_7c1ec`   `image`  `nov7.jpg`     `CD Release Party`
3   `artists_2010-01-15_7c1ec`   `video`  `band.flv`     `Presskit`
I'm going to pull images out for one section of an application, videos on another, etc. Is it better to make multiple mysql queries for each section like so:
$query = mysql_query("SELECT * FROM galleries WHERE galleryId='$galleryId' && type='image'");
...Or should I be building an associative array and just looping through the array over and over whenever I need to use the result set?
Thanks for the thoughts.
It depends on what's more important: readability or performance. I'd expect a single query that prefills PHP arrays to execute faster, since every extra query costs a round trip to the database, but a simple query for each section is much more readable.
Unless you know (and not just hope) you're going to get a huge amount of traffic I'd go for separate queries and then worry about optimising if it looks like it'll be a problem. At that point there'll be other things you'll want to do anyway, such as building a data access layer and adding some caching.
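For illustration, here is a rough sketch of the separate-queries approach (PDO is used here instead of the deprecated mysql_* functions; the $pdo connection and the getGalleryItems() helper name are assumptions, not part of the question):

// One simple, readable query per section, via a reusable helper.
function getGalleryItems(PDO $pdo, $galleryId, $type) {
    $stmt = $pdo->prepare(
        'SELECT * FROM galleries WHERE galleryId = :galleryId AND type = :type'
    );
    $stmt->execute(array(':galleryId' => $galleryId, ':type' => $type));
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$images = getGalleryItems($pdo, $galleryId, 'image');
$videos = getGalleryItems($pdo, $galleryId, 'video');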
If by "sections" you mean separate single pages (separate HTTP requests) that users can view, I would suggest querying per type as needed. On a page that only shows image data sets, for example, you really don't need to fetch the video data set. You won't really save much time by fetching everything, since you will be connecting to the database on every page hit anyway (I assume).
If by "sections" you mean different parts of one page, then fetch everything at once. This will save you time on querying (only one query).
Depending on the size of your data set, though, you could run into trouble with PHP's memory limit when querying for everything. You could then try raising the memory limit, but if that fails you'll probably have to fall back to query-per-type.
Using the query-per-type approach moves some of the computing load to the database server, as you will only be requesting and fetching what you really need. And you don't have to write code to filter and sort your results; filtering and sorting is something the database is generally better at than PHP code. If at all possible, enable MySQL's query cache; that will speed up these queries much more than anything you could write in PHP.
If your data is all coming from one table, I would only do one query.
I presume you are building a single page with a section for pictures, a section for video, a section for music, etc. Write your query to return results sorted by media type - iterate through all the pictures, then all the video, then all the music.
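A minimal sketch of that single-query approach, again assuming a PDO connection in $pdo and the columns shown in the question:

// Fetch everything for the gallery ordered by media type, group in PHP.
$stmt = $pdo->prepare(
    'SELECT * FROM galleries WHERE galleryId = :galleryId ORDER BY type, id'
);
$stmt->execute(array(':galleryId' => $galleryId));

$sections = array();
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    $sections[$row['type']][] = $row;   // e.g. $sections['image'], $sections['video']
}

// Later, render each section from the prefilled array:
if (!empty($sections['image'])) {
    foreach ($sections['image'] as $image) {
        echo $image['file_name'], ' - ', $image['description'], "\n";
    }
}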
Better to have multiple queries. Every time you run a query, all the data is pulled out and loaded into memory. If you have 5 different types, each page of that type ends up loading roughly 5 times as much data as it needs.
Even fetching one type at a time, you will probably want to start paginating with LIMIT/OFFSET queries fairly quickly once you have more than 100 rows, or however many you can reasonably display on one page.
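If it comes to that, a hedged sketch of LIMIT/OFFSET pagination might look like this (PDO assumed; $perPage is an arbitrary page size):

// Fetch one page of images for the gallery.
$perPage = 20;
$page    = isset($_GET['page']) ? max(1, (int) $_GET['page']) : 1;
$offset  = ($page - 1) * $perPage;

$stmt = $pdo->prepare(
    'SELECT * FROM galleries WHERE galleryId = :galleryId AND type = :type
     ORDER BY id LIMIT :limit OFFSET :offset'
);
$stmt->bindValue(':galleryId', $galleryId);
$stmt->bindValue(':type', 'image');
$stmt->bindValue(':limit', $perPage, PDO::PARAM_INT);   // bind as integers so LIMIT works
$stmt->bindValue(':offset', $offset, PDO::PARAM_INT);
$stmt->execute();
$pageOfImages = $stmt->fetchAll(PDO::FETCH_ASSOC);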
It really depends. I compared the two approaches:
IN operator
ini_set('memory_limit', '-1');
$startMemory = memory_get_usage();
$start_time = microtime(true); // start the timer
$conn = mysqli_connect("localhost", "", "", "");
$ar = array();
$sql = "SELECT * FROM table WHERE e IN (.....)";
$result = mysqli_query($conn, $sql);
while ($row = mysqli_fetch_assoc($result)) {
    $ar[$row['c']] = $row;
}
echo (memory_get_usage() - $startMemory) / 1024 / 1024, ' MB'; // 1409.7124481201 MB
$end_time = microtime(true);
echo ($end_time - $start_time) . ' Seconds'; // 5.2406549453735 Seconds
Foreach
ini_set('memory_limit', '-1');
$startMemory = memory_get_usage();
$start_time = microtime(true); // start the timer
$conn = mysqli_connect("localhost", "", "", "");
$ar = array();
$array_loop = array(....);
foreach ($array_loop as $key => $value) {
    $sql = "SELECT * FROM table WHERE e = '$value'";
    $result = mysqli_query($conn, $sql);
    while ($row = mysqli_fetch_assoc($result)) {
        $ar[$row['c']] = $row;
    }
}
echo (memory_get_usage() - $startMemory) / 1024 / 1024, ' MB'; // 42.773330688477 MB
$end_time = microtime(true);
echo ($end_time - $start_time) . ' Seconds'; // 12.469061136246 Seconds
I noticed that the foreach approach costs time but not memory, while the IN operator costs memory but not time. All the tests were run against test data generated by a SQL procedure, about 1 million rows.
Related
I am using PHP to fetch specific records from the database.
Which one is better?
1.
SELECT * FROM [table] LIMIT 50000, 10;
while ($row = $stmt->fetch()) {
    // save in array, 10 fetches total
}
or
2.
SELECT * FROM [table];
$start = 50000;
$length = 10;
$i = 0;
while ($row = $stmt->fetch()) {
    if ($i >= $start && $i < $start + $length) {
        // save in array (the loop has to fetch 50,010 rows to collect these 10)
    }
    $i++;
}
In this case, which one should I use?
Which one uses fewer database resources?
Which one is better?
Too vague: what is "better"?
Which one uses fewer database resources?
You're much better off with the first approach. It's efficient to select only as much data as you need and no more. Selecting the whole table forces your script to use a lot more memory, because all that data has to be kept live.
The best answer you'll get is: test! You can run your queries multiple times in multiple ways and see for yourself. Just use SELECT SQL_NO_CACHE... instead of the generic SELECT... to force the DB to redo the work from scratch. Measure how long it takes to run the query and process the results:
function wayOne() {
    // execute your 1st query and loop through results
}

function wayTwo() {
    // execute 2nd query and loop through results
}

// Measures the number of milliseconds it takes to execute another function
function timeThis(callable $callback) {
    $start_time = microtime(true);            // high-resolution start time, in seconds
    call_user_func($callback);
    $seconds = microtime(true) - $start_time; // duration in seconds
    return round($seconds * 1000);            // duration in milliseconds
}
$wayOneTime = timeThis('wayOne');
$wayTwoTime = timeThis('wayTwo');
You can then compare the two times. Generally (not always) a process that takes significantly less time uses fewer resources.
I have five different queries running on my about page showing basic data like the number of news stories we have on the site. I am using queries like this:
$sql4 = "SELECT `ride_id` FROM `tpf_rides` WHERE `type` LIKE '%Roller Coaster%'" ;
$result4 = $pdo->query($sql4);
$coasters = $result4->rowCount();
but wonder if there is a more efficient way. I've tried to minimize the load by only pulling IDs, but since I only need the count, can the load be lightened even more?
Also these queries only really need to run once or twice per day, not every time the page is loaded. Can someone point me in the direction of setting this up? I've never had to do this before. Thanks.
Yes, there is a more efficient way. Let the database do the counting for you:
SELECT count(*) as cnt
FROM `tpf_rides`
WHERE `type` LIKE '%Roller Coaster%';
If all the counts you are looking for are from the tpf_rides table, then you can do them in one query:
SELECT sum(`type` LIKE '%Roller Coaster%') as RollerCoaster,
sum(`type` LIKE '%Haunted House%') as HauntedHouse,
sum(`type` LIKE '%Ferris Wheel%') as FerrisWheel
FROM `tpf_rides`;
That would be even faster than running three different queries.
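For completeness, running that combined query from PHP could look roughly like this (assuming the same $pdo connection the question uses):

// Fetch all three counts in a single round trip.
$sql = "SELECT SUM(`type` LIKE '%Roller Coaster%') AS RollerCoaster,
               SUM(`type` LIKE '%Haunted House%')  AS HauntedHouse,
               SUM(`type` LIKE '%Ferris Wheel%')   AS FerrisWheel
        FROM `tpf_rides`";
$counts = $pdo->query($sql)->fetch(PDO::FETCH_ASSOC);

echo $counts['RollerCoaster']; // number of roller coasters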
If you want to run those queries only every now and then, you need to store the result somewhere. This can take the form of a pre-calculated sum you manage yourself, or a simple cache.
Below is a very simple and naive cache implementation that should work reliably on Linux. Many things can be improved here, but maybe this will give you an idea of what you could do.
The below is not compatible with the query suggested by Gordon Linoff which returns multiple counts.
The code has not been tested.
$cache_directory = "/tmp/";
$cache_lifetime = 86400; // time to keep the cache, in seconds. 24 hours = 86400 sec

$sql4 = "SELECT count(*) FROM `tpf_rides` WHERE `type` LIKE '%Roller Coaster%'";

$cache_key  = md5($sql4);                    // generate a semi-unique identifier for the query
$cache_file = $cache_directory . $cache_key; // generate the full cache file path

if (!file_exists($cache_file) || time() > filemtime($cache_file) + $cache_lifetime) {
    // cache file doesn't exist or has expired: run the query and refresh the cache
    $result4  = $pdo->query($sql4);
    $coasters = $result4->fetchColumn();
    file_put_contents($cache_file, $coasters); // store the result in the cache file
} else {
    // cache file exists and its data is still fresh
    $coasters = file_get_contents($cache_file);
}
I would strongly suggest you break this down into functions that take care of different aspects of the problem.
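One possible way to split it up, as a sketch only; the helper function names here are made up, not an established API:

// Map a query to a cache file path.
function cachePathForQuery($sql, $dir = '/tmp/') {
    return $dir . md5($sql);
}

// Is the cache file present and younger than $lifetime seconds?
function cacheIsFresh($file, $lifetime) {
    return file_exists($file) && time() <= filemtime($file) + $lifetime;
}

// Return a single cached value, refreshing it from the database when stale.
function cachedScalarQuery(PDO $pdo, $sql, $lifetime = 86400) {
    $file = cachePathForQuery($sql);
    if (cacheIsFresh($file, $lifetime)) {
        return file_get_contents($file);       // serve the cached value
    }
    $value = $pdo->query($sql)->fetchColumn(); // run the query
    file_put_contents($file, $value);          // refresh the cache
    return $value;
}

$coasters = cachedScalarQuery($pdo, "SELECT count(*) FROM `tpf_rides` WHERE `type` LIKE '%Roller Coaster%'");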
I'm having a strange time dealing with selecting from a table with about 30,000 rows.
It seems my script is using an outrageous amount of memory for what is a simple, forward-only walk over a query result.
Please note that this is a somewhat contrived, absolute bare-minimum example which bears very little resemblance to the real code, and it cannot be replaced with a simple database aggregation. It is intended to illustrate the point that each row does not need to be retained on each iteration.
<?php
$pdo = new PDO('mysql:host=127.0.0.1', 'foo', 'bar', array(
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
));
$stmt = $pdo->prepare('SELECT * FROM round');
$stmt->execute();

function do_stuff($row) {}

$c = 0;
while ($row = $stmt->fetch()) {
    // do something with the object that doesn't involve keeping
    // it around and can't be done in SQL
    do_stuff($row);
    $row = null;
    ++$c;
}

var_dump($c);
var_dump(memory_get_usage());
var_dump(memory_get_peak_usage());
This outputs:
int(39508)
int(43005064)
int(43018120)
I don't understand why 40 meg of memory is used when hardly any data needs to be held at any one time. I have already worked out I can reduce the memory by a factor of about 6 by replacing "SELECT *" with "SELECT home, away", however I consider even this usage to be insanely high and the table is only going to get bigger.
Is there a setting I'm missing, or is there some limitation in PDO that I should be aware of? I'm happy to get rid of PDO in favour of mysqli if it can not support this, so if that's my only option, how would I perform this using mysqli instead?
After creating the connection, you need to set PDO::MYSQL_ATTR_USE_BUFFERED_QUERY to false:
<?php
$pdo = new PDO('mysql:host=127.0.0.1', 'foo', 'bar', array(
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
));
$pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

// snip

var_dump(memory_get_usage());
var_dump(memory_get_peak_usage());
This outputs:
int(39508)
int(653920)
int(668136)
Regardless of the result size, the memory usage remains pretty much static.
Another option would be to do something like:
$i = $c = 0;
$query = 'SELECT home, away FROM round LIMIT 2048 OFFSET %u;';
while ($c += count($rows = codeThatFetches(sprintf($query, $i++ * 2048))) > 0)
{
foreach ($rows as $row)
{
do_stuff($row);
}
}
The whole result set (all 30,000 rows) is buffered into memory before you can start looking at it.
You should be letting the database do the aggregation and only asking it for the two numbers you need.
SELECT SUM(home) AS home, SUM(away) AS away, COUNT(*) AS c FROM round
The reality of the situation is that if you fetch all rows and expect to be able to iterate over all of them in PHP, at once, they will exist in memory.
If you really don't think SQL-powered expressions and aggregation are the solution, you could consider limiting/chunking your data processing. Instead of fetching all rows at once, do something like this:
1) Fetch 5,000 rows
2) Aggregate/Calculate intermediary results
3) unset variables to free memory
4) Back to step 1 (fetch next set of rows)
Just an idea... a rough sketch of that loop follows below.
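A rough sketch of that loop, assuming a PDO connection in $pdo and the home/away columns from the earlier snippet (do_stuff() stands in for whatever per-row work can't be done in SQL):

$chunkSize = 5000;
$offset = 0;

do {
    // 1) fetch the next chunk of rows
    $stmt = $pdo->prepare('SELECT home, away FROM round LIMIT :limit OFFSET :offset');
    $stmt->bindValue(':limit', $chunkSize, PDO::PARAM_INT);
    $stmt->bindValue(':offset', $offset, PDO::PARAM_INT);
    $stmt->execute();
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

    // 2) aggregate / calculate intermediary results for this chunk
    foreach ($rows as $row) {
        do_stuff($row);
    }

    // 3) free memory before fetching the next chunk
    $count = count($rows);
    unset($rows, $stmt);

    // 4) back to step 1 with the next offset
    $offset += $chunkSize;
} while ($count === $chunkSize);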
I haven't done this before in PHP, but you may consider fetching the rows using a scrollable cursor - see the fetch documentation for an example.
Instead of returning all the results of your query at once back to your PHP script, it holds the results on the server side and you use a cursor to iterate through them getting one at a time.
Whilst I have not tested this, it is bound to have other drawbacks such as utilising more server resources and most likely reduced performance due to additional communication with the server.
Altering the fetch style may also have an impact: by default, the documentation indicates that each row is stored both as an associative array and as a numerically indexed array, which is bound to increase memory usage.
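An untested sketch based on the PDO documentation; whether the MySQL driver actually honours the scrollable-cursor request is a separate question:

// Ask for a scrollable cursor and fetch one row at a time.
// FETCH_ASSOC keeps a single copy of each value per row (the default
// FETCH_BOTH stores both associative and numeric keys).
$stmt = $pdo->prepare('SELECT home, away FROM round',
    array(PDO::ATTR_CURSOR => PDO::CURSOR_SCROLL));
$stmt->execute();

while ($row = $stmt->fetch(PDO::FETCH_ASSOC, PDO::FETCH_ORI_NEXT)) {
    do_stuff($row);
}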
As others have suggested, reducing the number of results in the first place is most likely a better option if possible.
I have a PHP page that queries a MySQL database and returns about 20,000 rows, but the browser takes over 20 minutes to display the page. I have added an index on my database and it is being used: the query time on the command line is about 1 second for the 20,000 rows, yet in the web application it takes a long time. Does anyone know what is causing this problem, and is there a better way to improve it? Below is my PHP code to retrieve the data:
$query1 = "SELECT * FROM table WHERE Date BETWEEN '2010-01-01' AND '2010-12-31'";
$result1 = mysql_query($query1) or die('Query failed: ' . mysql_error());
while ($line = mysql_fetch_assoc($result1)) {
    echo "\t\t<tr>\n";
    $Data['Date']       = $line['Date'];
    $Data['Time']       = $line['Time'];
    $Data['Serial_No']  = $line['Serial_No'];
    $Data['Department'] = $line['Department'];
    $Data['Team']       = $line['Team'];
    foreach ($Data as $col_value) {
        echo "\t\t\t<td>$col_value</td>\n";
    }
    echo "\t\t</tr>\n";
}
Try adding an index to your date column.
Also, it's a good idea to learn about the EXPLAIN command.
As mentioned in the comments above, 1 second is still pretty long for your results.
You might consider putting all your output into a single variable and then echoing the variable once the loop is complete.
Also, browsers wait for tables to be completely formed before showing them, so that will slow your results (at least slow the process of building the results in the browser). A list may work better - or better yet a paged view if possible (as recommended in other answers).
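A small sketch of the "build the output once" idea, reusing the variables from the question:

// Build the whole table body in one string, then echo it once at the end.
$html = '';
while ($line = mysql_fetch_assoc($result1)) {
    $html .= "\t\t<tr>\n";
    foreach ($line as $col_value) {
        $html .= "\t\t\t<td>$col_value</td>\n";
    }
    $html .= "\t\t</tr>\n";
}
echo $html;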
It's not PHP that's causing it to be slow, but the browser itself rendering a huge page. Why do you have to display all that data anyway? You should paginate the results instead.
Try constructing a static HTML page with 20,000 table elements. You'll see how slow it is.
You can also improve that code:
while ($line = mysql_fetch_assoc($result1)) {
    echo "\t\t<tr>\n";
    foreach ($line as $col_value) {
        echo "\t\t\t<td>$col_value</td>\n";
        flush(); // optional, but gives your program a sense of responsiveness
    }
    echo "\t\t</tr>\n";
}
You could time each step of the script by echoing the time before and after connecting to the database, running the query, and producing the output.
This will tell you how long the different steps take. You may find out that it is indeed the traffic causing the delay and not the query.
On the other hand, when you have a table with millions of records, retrieving 20,000 of them can take a long time, even when it is indexed. 20 minutes is extreme, though...
Given 5,000 IDs of records to fetch from the database, which query, in your opinion, is faster?
Loop through the 5,000 IDs using PHP and perform a SELECT query for each one:
foreach ($ids as $id) {
    // do the query
    $r = mysql_query("SELECT * FROM TABLE WHERE ID = {$id}");
}
Or collect all ids in an array, and use SELECT * FROM TABLE WHERE ID IN (1 up to 5000)
//assuming $ids = array(1,2 ---- up to 5000);
$r = mysql_query("SELECT * FROM TABLE WHERE ID IN (".join(",",$ids).")");
Without a shadow of a doubt, loading them all in one go will be faster. Running 5,000 queries is going to be a lot slower as each query will carry a certain amount of overhead.
Also, to speed it up even more, DON'T use the * operator! Select only the fields you are going to use; if you only need the ID column, specify that. If you want all the columns, list them explicitly, because you may later add a field to the table that you do not need to retrieve.
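As a tiny illustration (the column names here are invented):

// Select only the columns you actually use instead of "*".
$ids = array_map('intval', $ids); // make sure the IDs are plain integers
$r = mysql_query("SELECT ID, title, created_at FROM TABLE WHERE ID IN (" . join(",", $ids) . ")");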
Option 2 is definitely going to be faster. 5,000 separate DB queries are going to have huge network connection overhead.
The fastest way is not to request 5,000 rows at all.
You barely need 100 to display them on one page. 5,000 is way overkill.
Sure, measure it, but I'd certainly recommend letting the database do the job.
It all depends; I hope you're not creating a connection for each call, though.
The loop can be faster if you use a prepared statement with bind variables. Prepare the statement outside the loop, then bind the variable for each ID inside the loop.
Do not underestimate the time spent on SQL parsing, especially on long-winded statements like these.
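A sketch of what that looks like with a prepared statement (PDO assumed here, since the mysql_* API in the question has no native bind support):

// Prepare (and parse) the statement once, outside the loop.
$stmt = $pdo->prepare('SELECT * FROM TABLE WHERE ID = :id');

foreach ($ids as $id) {
    $stmt->execute(array(':id' => $id));   // only the bound value changes per iteration
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    // ... use $row
}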
Option 2 is faster. With option 1 you do a full round trip to the server for each iteration.
I'd point out that in this case you might consider using paging to display the data.
Hint: measure, measure, measure. With code worth 10 minutes of your time, you will have the answer right away.
Which is faster for many queries?
Try measuring it, for example like this:
<?php
$start = microtime(true);
for ($i = 0; $i < 100000; $i++) {
    foreach ($ids as $id) {
        // do the query
        $r = mysql_query("SELECT * FROM TABLE WHERE ID = {$id}");
    }
}
$end = microtime(true);
echo 'Time (1): ' . ($end - $start) . ' sec';

$start = microtime(true);
for ($i = 0; $i < 100000; $i++) {
    // assuming $ids = array(1, 2, ... up to 5000);
    $r = mysql_query("SELECT * FROM TABLE WHERE ID IN (" . join(",", $ids) . ")");
}
$end = microtime(true);
echo 'Time (2): ' . ($end - $start) . ' sec';
?>