I just saw this article, and I need to know which performs better in these cases.
IF statement in the query
SELECT *,if( status = 1 , "active" ,"unactive") as status_val FROM comments
VS
<?php
$x = mysql_query("SELECT * FROM comments");
while( $res = mysql_fetch_assoc( $x ) ){
if( $res['status'] == 1 ){
$status_val = 'active';
}else{
$status_val = 'unactive';
}
}
?>
Cut the first 10 characters from a string
SELECT *, SUBSTR(comment, 1, 10) as min_comment FROM comments
VS
<?php
$x = mysql_query("SELECT * FROM comments");
while( $res = mysql_fetch_assoc( $x ) ){
$min_comment = substr( $res['comment'], 0, 10 );
}
?>
etc. When should I use MySQL functions and when should I use PHP functions?
It depends.
From a network point of view, PHP wins the first comparison and MySQL wins the second, because in each case the winning side sends less data over the socket, and thus the script becomes faster.
Here is a nice description of your question:
Doing calculations in MySQL vs PHP
In the case of the second example, the speed difference can be significant.
First of all, you do not know how long your comments are, so with
$x = mysql_query("SELECT * FROM comments");
while( $res = mysql_fetch_assoc( $x ) ){
$min_comment = substr( $res['comment'], 0, 10 );
}
you ask the server to return everything (that is, the full length of every comment), and this can be significant. Multiplied by the number of rows in the table, that is quite a lot of data to transfer between PHP and SQL. In the second case, with
SELECT *, SUBSTR(comment, 1, 10) as min_comment FROM comments
the work is already done on the server and requires no additional memory.
In the case of the first example, I think it is also better to do it on the SQL side, because otherwise you still need an additional loop in PHP afterwards. Apart from that, people reading your code might be confused about why exactly you need it.
In that case, using MySQL functions keeps you from looping in PHP and saves you a lot of code.
In other cases there is no alternative: for instance, when you use them in the WHERE part, as in the example below.
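For instance (an illustrative query, not from the original post), a filter like this has to be expressed with a MySQL function in the WHERE part, because PHP never even sees the rows that get filtered out:
SELECT * FROM comments WHERE LENGTH(comment) > 500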
In terms of performance, the difference is trivial in most cases: use the simplest solution.
The only real answer: measure it! You have two working solutions and want the best execution time.
I'd say it all depends on the systems on either end and the current load. DB servers are typically beefier than desktop machines, so it would usually be faster to do the whole thing on the DB side rather than split the work with PHP.
Related
I have five different queries running on my about page showing basic data like the number of news stories we have on the site. I am using queries like this:
$sql4 = "SELECT `ride_id` FROM `tpf_rides` WHERE `type` LIKE '%Roller Coaster%'" ;
$result4 = $pdo->query($sql4);
$coasters = $result4->rowCount();
but I wonder if there is a more efficient way. I've tried to minimize the load by only pulling IDs, but since I only need the count, can the load be lightened even more?
Also these queries only really need to run once or twice per day, not every time the page is loaded. Can someone point me in the direction of setting this up? I've never had to do this before. Thanks.
Yes, there is a more efficient way. Let the database do the counting for you:
SELECT count(*) as cnt
FROM `tpf_rides`
WHERE `type` LIKE '%Roller Coaster%';
If all the counts you are looking for are from the tpf_rides table, then you can do them in one query:
SELECT sum(`type` LIKE '%Roller Coaster%') as RollerCoaster,
sum(`type` LIKE '%Haunted House%') as HauntedHouse,
sum(`type` LIKE '%Ferris Wheel%') as FerrisWheel
FROM `tpf_rides`;
That would be even faster than running three different queries.
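As a rough, untested sketch (reusing the $pdo connection from the question), the combined query can be read back in a single fetch:
$sql = "SELECT sum(`type` LIKE '%Roller Coaster%') as RollerCoaster,
               sum(`type` LIKE '%Haunted House%') as HauntedHouse,
               sum(`type` LIKE '%Ferris Wheel%') as FerrisWheel
        FROM `tpf_rides`";
$counts = $pdo->query($sql)->fetch(PDO::FETCH_ASSOC);
$coasters = $counts['RollerCoaster']; // and likewise for the other counts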
If you want to run those queries only every now and then you need to keep the result stored somewhere. This can take a form of a pre-calculated sum you manage yourself or a simple cache.
Below is a very simple and naive cache implementation that should work reliably on Linux. Many things could be improved here, but maybe this will give you an idea of what you could do.
The below is not compatible with the query suggested by Gordon Linoff which returns multiple counts.
The code has not been tested.
$cache_directory = "/tmp/";
$cache_lifetime = 86400; // time to keep cache in seconds. 24 hours = 86400sec
$sql4 = "SELECT count(*) FROM `tpf_rides` WHERE `type` LIKE '%Roller Coaster%'";
$cache_key = md5($sql4); //generate a semi-unique identifier for the query
$cache_file = $cache_directory . $cache_key; // generate full cache file path
if (!file_exists($cache_file) || time() > filemtime($cache_file) + $cache_lifetime)
{
// cache file doesn't exist or has expired
$result4 = $pdo->query($sql4);
$coasters = $result4->fetchColumn();
file_put_contents($cache_file, $coasters); // store the result in a cache file
} else {
// file exists and data is up to date
$coasters = file_get_contents($cache_file);
}
I would strongly suggest you break this down into functions that take care of different aspects of the problem.
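For example, the cache logic could live in one small helper. This is just a sketch; the cached() function and its signature are made up for illustration:
// Hypothetical helper: return a cached value, recomputing it via $callback when missing or stale.
function cached($cache_key, $lifetime, $callback) {
    $cache_file = "/tmp/" . md5($cache_key);
    if (file_exists($cache_file) && time() <= filemtime($cache_file) + $lifetime) {
        return file_get_contents($cache_file);
    }
    $value = $callback();
    file_put_contents($cache_file, $value);
    return $value;
}

$coasters = cached($sql4, 86400, function () use ($pdo, $sql4) {
    return $pdo->query($sql4)->fetchColumn();
});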
I have built a system using PHP and MySQL. This system is subject to a very heavy load, with thousands of selects, updates, inserts and deletes every minute.
I would like to optimize this system, to make it faster, and reduce load on the servers.
I have already introduced memcache, but mysql data is still needed.
So my question is: which method would be best in this case?
Currently my queries would look like this:
$q = mysql_query($sql);
while(mysql_fetch_array($q)) {...
I have read that there is a little speed to gain by using mysql_fetch_assoc (?)
But perhaps there is an entirely different approach I should consider when I start optimizing this system?
Thank you all. (Apologies for my limited English skills.)
mysql_fetch_assoc, compared to mysql_fetch_array, duplicates less data and thus uses less memory, because mysql_fetch_array returns each row indexed both associatively and numerically. It is a tiny optimization, but it helps if your dataset is big.
Try to use the natural order (i.e. avoid ORDER BY in your queries) and LIMIT your result set if you can.
Batch queries: instead of running 100 separate inserts against the same table, combine them into a few multi-row inserts (see the sketch after this list).
Cache, cache, cache if you can, using Redis or memcached.
If you generate pages that can be treated as static, use HTTP caching headers so browsers do not request your site all the time.
etc. etc.
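For the batch-queries point above, a multi-row INSERT looks roughly like this (the table and columns are placeholders, not taken from the question):
INSERT INTO comments (user_id, body)
VALUES (1, 'first'), (2, 'second'), (3, 'third');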
I would recommend you use the MySQL keyword LIMIT to limit the result set.
Adding pagination to the MySQL result set will make your application lighter: the UI will load faster because there are fewer rows to fetch, and the MySQL server only receives the SELECT queries when they are actually needed.
Basically, this is the syntax of LIMIT:
SELECT * FROM Person LIMIT X,Y
Where X is the offset (the number of rows to skip) and Y is the number of rows to retrieve.
Example:
SELECT * FROM Person LIMIT 0, 10
This query will return the first ten rows of the table Person, and:
SELECT * FROM Person LIMIT 10, 10
will display the next 10.
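A rough sketch of how the offset might be computed in PHP (the variable names here are made up for illustration):
$per_page = 10;
$page     = 3;                              // current page number, 1-based
$offset   = ($page - 1) * $per_page;        // rows to skip
$q = mysql_query(sprintf("SELECT * FROM Person LIMIT %d, %d", $offset, $per_page));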
I've been doing some timing tests on various methods of getting information out of MySQL in PHP. The goal was to find the fastest way of transferring a column of data into a simple array. I've tested it against the enSEMBL database, which is usefully huge.
The following code was common for methods 1 to 8 (9 used GROUP_CONCAT & 10 used PDO):
$query = "SELECT DISTINCT `name` FROM species LIMIT 5000";
$result = $mysqli->query($query);
*Method code*
print_r(array_filter($species));
Method 1: Textbook method
while ($row = $result->fetch_row()) {
$species[] = $row[0];
}
Method 2: while and reset (NB some IDEs detect an error here)
while ($species[] = reset($result->fetch_row())) ;
Method 3: foreach and reset
foreach ($result->fetch_all() as $value) $species[] = reset($value);
Method 4: while, foreach and reset
while ($species[] = $result->fetch_row()) ;
foreach ($species as $key => $value) $species[$key] = reset($value);
Method 5: while and index
while ($row = $result->fetch_row()) $species[] = $row[0];
Method 6: foreach and index
foreach ($result->fetch_all() as $value) $species[] = $value[0];
Method 7: recurse the array
$species = call_user_func_array('array_merge', $result->fetch_all());
Method 8: array_column
$species = array_column($result->fetch_all(), 0);
Method 9: Using GROUP_CONCAT in query.
$species = explode(',', $result->fetch_row()[0]);
Method 10: PDO
$species = $sth->fetchAll(PDO::FETCH_COLUMN, 0);
Surprisingly Method 1 (Textbook) was consistently about 4 times longer than the practically identical Method 5, but took about the same time as Method 10 (PDO).
Method 2 was consistently the slowest method at 50x longer, presumably because the system is writing warnings somewhere.
Method 4 (two loops) was the second slowest, taking 10x longer.
As stated, Methods 1 (textbook) & 10 (PDO) were joint third slowest.
Method 9 was fourth slowest (2x longer, and had the disadvantage of hitting the GROUP_CONCAT limit without any warning).
The fastest method, however, wasn't consistent. Take your pick from 3, 5, 6, 7 & 8.
Method 8 (array_column) was often the fastest way to do this, but not always. However, I think it is the most elegant method and provides slightly more flexibility, as it can return an associative array using any two columns selected by your query, as in the sketch below (but don't mess with the order of the columns in the query!).
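For instance (a sketch only, assuming the species table also has an id column, which was not part of the original test):
$result = $mysqli->query("SELECT `id`, `name` FROM species LIMIT 5000");
$species = array_column($result->fetch_all(MYSQLI_ASSOC), 'name', 'id'); // id => name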
I'm having a strange time dealing with selecting from a table with about 30,000 rows.
It seems my script is using an outrageous amount of memory for what is a simple, forward only walk over a query result.
Please note that this is a somewhat contrived, absolute bare-minimum example which bears very little resemblance to the real code, and it cannot be replaced with a simple database aggregation. It is intended to illustrate the point that each row does not need to be retained from one iteration to the next.
<?php
$pdo = new PDO('mysql:host=127.0.0.1', 'foo', 'bar', array(
PDO::ATTR_ERRMODE=>PDO::ERRMODE_EXCEPTION,
));
$stmt = $pdo->prepare('SELECT * FROM round');
$stmt->execute();
function do_stuff($row) {}
$c = 0;
while ($row = $stmt->fetch()) {
// do something with the object that doesn't involve keeping
// it around and can't be done in SQL
do_stuff($row);
$row = null;
++$c;
}
var_dump($c);
var_dump(memory_get_usage());
var_dump(memory_get_peak_usage());
This outputs:
int(39508)
int(43005064)
int(43018120)
I don't understand why 40 meg of memory is used when hardly any data needs to be held at any one time. I have already worked out I can reduce the memory by a factor of about 6 by replacing "SELECT *" with "SELECT home, away", however I consider even this usage to be insanely high and the table is only going to get bigger.
Is there a setting I'm missing, or is there some limitation in PDO that I should be aware of? I'm happy to get rid of PDO in favour of mysqli if it can not support this, so if that's my only option, how would I perform this using mysqli instead?
After creating the connection, you need to set PDO::MYSQL_ATTR_USE_BUFFERED_QUERY to false:
<?php
$pdo = new PDO('mysql:host=127.0.0.1', 'foo', 'bar', array(
PDO::ATTR_ERRMODE=>PDO::ERRMODE_EXCEPTION,
));
$pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
// snip
var_dump(memory_get_usage());
var_dump(memory_get_peak_usage());
This outputs:
int(39508)
int(653920)
int(668136)
Regardless of the result size, the memory usage remains pretty much static.
Another option would be to do something like:
$i = $c = 0;
$query = 'SELECT home, away FROM round LIMIT 2048 OFFSET %u;';
// codeThatFetches() stands in for whatever runs the query and returns an array of rows
while (($n = count($rows = codeThatFetches(sprintf($query, $i++ * 2048)))) > 0)
{
    $c += $n;
    foreach ($rows as $row)
    {
        do_stuff($row);
    }
}
The whole result set (all 30,000 rows) is buffered into memory before you can start looking at it.
You should be letting the database do the aggregation and only asking it for the two numbers you need.
SELECT SUM(home) AS home, SUM(away) AS away, COUNT(*) AS c FROM round
The reality of the situation is that if you fetch all rows and expect to be able to iterate over all of them in PHP, at once, they will exist in memory.
If you really don't think using SQL powered expressions and aggregation is the solution you could consider limiting/chunking your data processing. Instead of fetching all rows at once do something like:
1) Fetch 5,000 rows
2) Aggregate/Calculate intermediary results
3) unset variables to free memory
4) Back to step 1 (fetch next set of rows)
Just an idea...
I haven't done this before in PHP, but you may consider fetching the rows using a scrollable cursor - see the fetch documentation for an example.
Instead of returning all the results of your query at once back to your PHP script, it holds the results on the server side and you use a cursor to iterate through them getting one at a time.
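A minimal sketch of that approach, adapted from the PDO fetch documentation and untested here (scrollable cursor support varies by driver):
$stmt = $pdo->prepare('SELECT * FROM round',
    array(PDO::ATTR_CURSOR => PDO::CURSOR_SCROLL));
$stmt->execute();
while ($row = $stmt->fetch(PDO::FETCH_ASSOC, PDO::FETCH_ORI_NEXT)) {
    do_stuff($row);
}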
Whilst I have not tested this, it is bound to have other drawbacks such as utilising more server resources and most likely reduced performance due to additional communication with the server.
Altering the fetch style may also have an impact: by default, the documentation indicates that each row is stored both as an associative array and as a numerically indexed array, which is bound to increase memory usage.
As others have suggested, reducing the number of results in the first place is most likely a better option if possible.
Given 5,000 IDs of records to fetch from the database, which approach, in your opinion, is faster?
Loop through the 5,000 IDs in PHP and perform a SELECT query for each one:
foreach($ids as $id){
// do the query
$r = mysql_query("SELECT * FROM TABLE WHERE ID = {$id}");
}
Or collect all the IDs in an array and use SELECT * FROM TABLE WHERE ID IN (1 up to 5000):
//assuming $ids = array(1,2 ---- up to 5000);
$r = mysql_query("SELECT * FROM TABLE WHERE ID IN (".join(",",$ids).")");
Without a shadow of a doubt, loading them all in one go will be faster. Running 5,000 queries is going to be a lot slower, as each query carries a certain amount of overhead.
Also, to speed it up even more, DON'T use the * operator! Select only the fields you are going to use; if you only need the ID column, specify just that. Even if you want all of the current columns, list them explicitly, because fields you add later should not be retrieved unnecessarily.
option 2 is definitely going to be faster. 5000 separate db queries are going to have huge network connection overhead.
The fastest way is not to request 5000 rows at all.
You barely need 100 to display them on one page. 5000 is way overkill
Sure, measure it, but I'd certainly recommend letting the database do the job.
It all depends; I hope you're not creating a connection for each call, though.
The loop can be faster if you use a prepared statement with bind variables: declare the statement outside the loop, then bind the variable for each id inside it. Do not underestimate the time that goes into SQL parsing, especially for these long-winded statements. (See the sketch below.)
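A rough sketch of that with PDO, since the old mysql_* functions have no bind variables (the $pdo connection is an assumption here; the table name is taken from the question):
$stmt = $pdo->prepare("SELECT * FROM TABLE WHERE ID = :id"); // parsed once, outside the loop
foreach ($ids as $id) {
    $stmt->execute(array(':id' => $id)); // only the bound value changes per iteration
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    // ...
}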
Option 2 is faster. With option 1 you do a full round trip to the server for each iteration.
I'd point out that in this case you might consider using paging to display the data.
Hint: Measure, Measure, Measure. With a code worth 10 minutes of your time you will have the answer right away.
Which is faster for many queries?
Try measuring it, for example like this:
<?php
$start = microtime(true);
for ($i=0;$i<100000;$i++)
{
foreach($ids as $id){
// do the query
$r = mysql_query("SELECT * FROM TABLE WHERE ID = {$id}");
}
}
$end = microtime(true);
echo 'Time (1): '.($end- $start).' sec';
$start = microtime(true);
for ($i=0;$i<100000;$i++)
{
//assuming $ids = array(1,2 ---- up to 5000);
$r = mysql_query("SELECT * FROM TABLE WHERE ID IN (".join(",",$ids).")");
}
$end = microtime(true);
echo 'Time (2): '.($end- $start).' sec';
?>
In some languages (ColdFusion comes to mind), you can run a query on the result set from a previous query. Is it possible to do something like that in php (with MySQL as the database)?
I sort of want to do:
$rs1 = do_query( "SELECT * FROM animals WHERE type = 'fish'" );
$rs2 = do_query( "SELECT * FROM rs1 WHERE name = 'trout'" );
There is no MySQL function like this for PHP; however, there is a more advanced substitute for it.
Edit: For those of you who don't know what a query of queries is, it is exactly this, and there is a reason some people do it this way. Using an AND operator is NOT the same thing! If I want results where username='animuson' for one part of my script and then want all the results out of that query where status='1', it is not logical for me to run another query using an AND operator; it is much more logical to loop through the previous results in PHP. Stop upvoting things without reading the comments on why they weren't upvoted in the first place; that's just lazy. If you don't have a clue what's being talked about, you shouldn't be upvoting or downvoting in the first place.
Well, you may want to do this without touching the db:
while($t = mysql_fetch_array($rs1)){
if($t['name'] == 'trout'){
echo 'This is the one we\'re looking for!';
break;
}
}
In PHP, it would be terribly inefficient. You would have to loop through each row and check that its name was trout. However, is there any reason you can't do
SELECT * FROM `animals` WHERE `type` = 'fish' AND `name` = 'trout'
in SQL? It would be much, much faster.
You can also do something like
select morestuff from (select stuff from table where a = b) as sub where c = d;
Use the AND keyword?
"SELECT * FROM animals WHERE type = 'fish' and name='trout'"
Also, you can use LINQ for php http://phplinq.codeplex.com/