I have five different queries running on my about page showing basic data like the number of news stories we have on the site. I am using queries like this:
$sql4 = "SELECT `ride_id` FROM `tpf_rides` WHERE `type` LIKE '%Roller Coaster%'" ;
$result4 = $pdo->query($sql4);
$coasters = $result4->rowCount();
but wonder if there is a more efficient way. I've tried to minimize the load by only pulling IDs, but since I only need the count, can the load be lightened even more?
Also these queries only really need to run once or twice per day, not every time the page is loaded. Can someone point me in the direction of setting this up? I've never had to do this before. Thanks.
Yes there is a more efficient way. Let the database do the counting for you:
SELECT count(*) as cnt
FROM `tpf_rides`
WHERE `type` LIKE '%Roller Coaster%';
If all the counts you are looking for are from the tpf_rides table, then you can do them in one query:
SELECT sum(`type` LIKE '%Roller Coaster%') as RollerCoaster,
sum(`type` LIKE '%Haunted House%') as HauntedHouse,
sum(`type` LIKE '%Ferris Wheel%') as FerrisWheel
FROM `tpf_rides`;
That would be even faster than running three different queries.
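As a rough illustration of reading all the counts from that single query in PHP, here is a minimal sketch. It assumes an in-memory SQLite database (via the pdo_sqlite extension) with made-up sample rows, purely so the snippet is self-contained; against MySQL only the connection line should need to change.

```php
<?php
// Assumption: in-memory SQLite stands in for the real MySQL database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE tpf_rides (ride_id INTEGER PRIMARY KEY, type TEXT)");

// Hypothetical sample data.
$insert = $pdo->prepare("INSERT INTO tpf_rides (type) VALUES (?)");
$types = ['Steel Roller Coaster', 'Wooden Roller Coaster', 'Haunted House', 'Ferris Wheel'];
foreach ($types as $type) {
    $insert->execute([$type]);
}

// One pass over the table: each LIKE evaluates to 1 or 0, so SUM counts the matches.
$sql = "SELECT SUM(type LIKE '%Roller Coaster%') AS RollerCoaster,
               SUM(type LIKE '%Haunted House%') AS HauntedHouse,
               SUM(type LIKE '%Ferris Wheel%') AS FerrisWheel
        FROM tpf_rides";
$counts = $pdo->query($sql)->fetch(PDO::FETCH_ASSOC);

$coasters = (int) $counts['RollerCoaster'];
$hauntedHouses = (int) $counts['HauntedHouse'];
$ferrisWheels = (int) $counts['FerrisWheel'];
echo "Coasters: $coasters, haunted houses: $hauntedHouses, ferris wheels: $ferrisWheels\n";
```

The casts to int are there because some drivers return numeric columns as strings.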
If you want to run those queries only every now and then, you need to keep the result stored somewhere. This can take the form of a pre-calculated sum you manage yourself or a simple cache.
Below is a very simple and naive cache implementation that should work reliably on Linux. Many things can be improved here, but maybe this will give you an idea of what you could do.
The below is not compatible with the query suggested by Gordon Linoff which returns multiple counts.
The code has not been tested.
$cache_directory = "/tmp/";
$cache_lifetime = 86400; // time to keep cache in seconds. 24 hours = 86400sec
$sql4 = "SELECT count(*) FROM `tpf_rides` WHERE `type` LIKE '%Roller Coaster%'";
$cache_key = md5($sql4); //generate a semi-unique identifier for the query
$cache_file = $cache_directory . $cache_key; // generate full cache file path
if (!file_exists($cache_file) || time() > filemtime($cache_file) + $cache_lifetime) // filemtime() already returns a Unix timestamp
{
// cache file doesn't exist or has expired
$result4 = $pdo->query($sql4);
$coasters = $result4->fetchColumn();
file_put_contents($cache_file, $coasters); // store the result in a cache file
} else {
// file exists and data is up to date
$coasters = file_get_contents($cache_file);
}
I would strongly suggest you break this down into functions that take care of different aspects of the problem.
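One possible way to split it up, as a minimal sketch: a single helper that owns the file handling and takes the expensive query as a callback. The function name cached_value() and its parameters are made up for this example, and the closure is only a stand-in for the real COUNT(*) query.

```php
<?php
// Hypothetical helper: returns the cached value for $key, recomputing it with
// $compute when the cache file is missing or older than $lifetime seconds.
function cached_value(string $cache_dir, string $key, int $lifetime, callable $compute): string
{
    $cache_file = rtrim($cache_dir, '/') . '/' . md5($key);
    clearstatcache(true, $cache_file); // make sure filemtime() is not stale

    if (is_file($cache_file) && time() - filemtime($cache_file) <= $lifetime) {
        return file_get_contents($cache_file); // still fresh
    }

    $value = (string) $compute();
    // LOCK_EX keeps two simultaneous requests from interleaving their writes.
    file_put_contents($cache_file, $value, LOCK_EX);
    return $value;
}

// Usage: the closure stands in for $pdo->query($sql4)->fetchColumn().
$calls = 0;
$count_query = function () use (&$calls) {
    $calls++;
    return 42; // placeholder result
};

$dir = sys_get_temp_dir();
$key = 'coaster_count_demo_' . uniqid();
$first = cached_value($dir, $key, 86400, $count_query);  // runs the callback
$second = cached_value($dir, $key, 86400, $count_query); // served from the file
echo "value: $second, callback runs: $calls\n";
```

Keeping the query behind a callback means the same helper can cache any of the five counts, each under its own key.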
Well I have this problem that I hoped someone could help me with:
So what's it about?
I have developed a PHP script that imports XML files from a folder into a database.
The XML file looks like this: XML file
Basically, the script stores information from the XML file in 5 tables, and that works correctly.
But the problem is that my file does not contain the ID information of players in the PLAYER object, so after I import everything into the database I have to run this query:
$sql = "SELECT igraci.ID, utakmice.Player_ID, utakmice.ID AS broj FROM igraci LEFT JOIN utakmice ON (igraci.Team_ID = utakmice.Team_ID) AND (igraci.Surname = utakmice.Lastname) AND (igraci.Name = utakmice.Firstname);";
$tabela = mysql_query($sql);
$row = mysql_fetch_assoc($tabela);
$totalrow = mysql_num_rows($tabela);
$i=0;
do {
$i++;
$sql = "UPDATE utakmice SET Player_ID=" . $row['ID'] . " WHERE ID = " . $row['broj'] . "";
echo $sql."<br>";
mysql_query($sql);
} while ($row = mysql_fetch_assoc($tabela));
The SELECT statement executes really fast and I have no problem with that, but the UPDATE command is making the script time out.
I have tried indexing the fields used in this query, but that didn't help, and as soon as I have more than 2200 rows the script fails.
The script ran fine on an older version of PHP, but last month we had to upgrade to 5.3 and that's where the problem started.
Is there any way that I can speed this UPDATE up?
PS: XML file is from FIBA live Cms system.
Is it the php script timing out?
Do you need to do this as a SELECT followed by potentially a large number of updates?
Could you not just use a single UPDATE statement, something like this:-
UPDATE utakmice
INNER JOIN igraci
ON (igraci.Team_ID = utakmice.Team_ID)
AND (igraci.Surname = utakmice.Lastname)
AND (igraci.Name = utakmice.Firstname)
SET utakmice.Player_ID = igraci.ID
Add an INDEX on utakmice.ID to speed up the WHERE part.
If you're not sure about performance run:
EXPLAIN SELECT * FROM utakmice WHERE ID = [x]
See if it's using an index or doing a full table scan (index is good, table scan is slow)
Apart from setting an index on ID, you can try batching your updates.
You need to build the query by concatenating CASE WHEN branches as necessary. It's worth a try, but I haven't done any performance tests to see whether it gives you a big boost here.
In the end you'd get something like:
UPDATE utakmice SET Player_ID = CASE
WHEN id = <your_first_broj_from_result> THEN <your_first_id_from_result>
WHEN id = <your_second_broj_from_result> THEN <your_second_id_from_result>
...
END
WHERE id IN (<your_first_broj_from_result>, <your_second_broj_from_result>,...)
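For what it's worth, building such a batched statement in PHP could look roughly like the sketch below. The function name build_batch_update() is invented for this example, the target column is taken to be Player_ID as in the question's own UPDATE, and the rows are assumed to have the shape returned by the question's SELECT (ID from igraci, broj from utakmice). Values are cast to int before being placed in the SQL string.

```php
<?php
// Hypothetical helper: turns rows shaped like ['ID' => igraci.ID, 'broj' => utakmice.ID]
// into one UPDATE statement with a CASE branch per row.
function build_batch_update(array $rows): string
{
    $cases = [];
    $ids = [];
    foreach ($rows as $row) {
        if ($row['broj'] === null) {
            continue; // LEFT JOIN found no matching utakmice row for this player
        }
        $broj = (int) $row['broj'];
        $player_id = (int) $row['ID'];
        $cases[] = "WHEN $broj THEN $player_id";
        $ids[] = $broj;
    }
    if ($cases === []) {
        return ''; // nothing to update
    }
    return 'UPDATE utakmice SET Player_ID = CASE ID '
        . implode(' ', $cases)
        . ' END WHERE ID IN (' . implode(', ', $ids) . ')';
}

// Made-up sample rows.
$rows = [
    ['ID' => 1, 'broj' => 10],
    ['ID' => 2, 'broj' => 11],
    ['ID' => 3, 'broj' => null],
];
$sql = build_batch_update($rows);
echo $sql, "\n";
```

With thousands of rows it is probably wise to feed the helper chunks (for example via array_chunk()) so that no single statement grows too large.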
I have to pull back a lot of information and, as a result, my page loads in about 22~24 seconds. Is there anything I can do to optimize my code?
Here is my code:
<?php
$result_rules = $db->query("SELECT source_id, destination_id FROM dbo.rules");
while($row_rules = sqlsrv_fetch_array($result_rules)){
$result_destination = $db->query("SELECT pk_id, project FROM dbo.destination WHERE pk_id=" . $row_rules['destination_id'] . " ORDER by project ASC");
while($row_destination = sqlsrv_fetch_array($result_destination)){
echo "Destination project: ";
echo "<span class='item'>".$row_destination['project']."</span>";
echo "ID: ".$row_rules['destination_id']."<br>";
if ($row_rules['source_id'] == null) {
echo "Source ID for Destination ID".$row_rules['destination_id']." is NULL<br>";
} else {
$result_source = $db->query("SELECT pk_id, project FROM dbo.source WHERE pk_id=" . $row_rules['source_id'] . " ORDER by project ASC");
while($row_source = sqlsrv_fetch_array($result_source)){
echo "Source project: ";
echo $row_source['project'];
echo " ID: ".$row_rules['source_id']."<br>";
}
}
}
}
?>
Here's what my tables look like:
Source table: pk_id:int, project:varchar(50), feature:varchar(50), milestone:varchar(50), reviewGroup:varchar(125), groupId:int
Rules table: pk_id:int, source_id:int, destination_id:int, login:varchar(50), status:varchar(50), batchId:int, srcPGroupId:int, dstPGroupId:int
Destination table: pk_id:int, project:varchar(50), feature:varchar(50), milestone:varchar(50), QAAssignedTo:varchar(50), ValidationAssignedTo:varchar(50), Priority:varchar(50), groupId:int
If you want help with optimizing queries then please provide details of the schema and the output of the explain plan.
Running nested loops is bad for performance. Running queries inside nested loops like this is a recipe for VERY poor performance. Using '*' in select is bad for performance too (particularly as you're only ever using a couple of columns).
You should start by optimizing your PHP and merging the queries:
$result_rules = $db->query(
"SELECT rule.destination_id, [whatever fields you need from dbo.rules],
dest.project AS dest_project,
src.project AS src_project,
src.pk_id as src_id
FROM dbo.rules rule
INNER JOIN dbo.destination dest
ON dest.pk_id=rule.destination_id
LEFT JOIN dbo.source src
ON src.pk_id=rule.source_id
ORDER BY rule.destination_id, dest.project, src.project");
$last_dest=false;
$last_src=false;
while($row = sqlsrv_fetch_array($result_rules)){
if ($row['destination_id']!==$last_dest) {
echo "Destination project: ";
echo "<span class='item'>".$row['dest_project']."</span>";
echo "ID: ".$row['destination_id']."<br>";
$last_dest=$row['destination_id'];
}
if (null===$row['src_id']) {
... I'll let you sort out the rest.
Add an index on (pk_id, project) so it includes all fields important for the query.
Make sure that pk_Id is indexed: http://www.w3schools.com/sql/sql_create_index.asp
Rather than using select *, return only the columns you need, unless you need all of them.
I'd also recommend moving your SQL code to the server and calling the stored procedure.
You could consider using LIMIT if your back end is mysql: http://php.about.com/od/mysqlcommands/g/Limit_sql.htm .
I'm assuming that the else clause is what's slowing up your code. I would suggest saving all the data you're going to need at the start and then accessing the array again in the else clause. Basically, you don't need this to run every time.
$result_destination = $db->query("SELECT * FROM dbo.destination WHERE pk_id=" . $row_rules['destination_id'] . " ORDER by project ASC")
You could grab the data earlier and use PHP to iterate over it.
$result_destinations = $db->query("SELECT * FROM dbo.destination ORDER by project ASC")
And then later in your code use PHP to determine the correct destination. Depending on exactly what you're doing it should shave some amount of time off.
Another consideration is the time it takes for your browser to render the html generated by your php code. The more data you are presenting, the longer it's going to take. Depending on the requirements of your audience, you might want to display only x records at a time.
There are jquery methods of increasing the number of records displayed without going back to the server.
For starters, you would want to lower the number of queries run. Doing a query, looping through those results and running another query, then looping through that result set running more queries is generally considered bad. The number of queries multiplies with every level of nesting.
For example, suppose the first query returns 100 rows and each sub-query returns 10 rows. You loop over the 100 rows and query again for each one, so you are now at 101 queries. If you then run another query for each of the 10 rows those sub-queries return, that adds 1,000 more, for 1,101 queries in total. Each query has to send data to the server (the query text), wait for a response and get data back. That is what takes so long.
Use a join to do a single query on all the tables and loop over the single result.
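A minimal sketch of that single-query loop follows. It assumes an in-memory SQLite database through PDO instead of the SQL Server connection from the question, with only the columns used here and invented sample rows, so that it runs on its own; the JOIN itself is standard SQL.

```php
<?php
// Assumption: in-memory SQLite replaces the sqlsrv connection; sample data is made up.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE source (pk_id INTEGER PRIMARY KEY, project TEXT)");
$pdo->exec("CREATE TABLE destination (pk_id INTEGER PRIMARY KEY, project TEXT)");
$pdo->exec("CREATE TABLE rules (pk_id INTEGER PRIMARY KEY, source_id INTEGER, destination_id INTEGER)");
$pdo->exec("INSERT INTO source VALUES (1, 'Alpha')");
$pdo->exec("INSERT INTO destination VALUES (1, 'Beta'), (2, 'Gamma')");
$pdo->exec("INSERT INTO rules (source_id, destination_id) VALUES (1, 1), (NULL, 2)");

// One round trip: the LEFT JOIN keeps rules whose source_id is NULL.
$sql = "SELECT r.destination_id, r.source_id,
               d.project AS dest_project, s.project AS src_project
        FROM rules r
        INNER JOIN destination d ON d.pk_id = r.destination_id
        LEFT JOIN source s ON s.pk_id = r.source_id
        ORDER BY d.project, s.project";

$lines = [];
foreach ($pdo->query($sql, PDO::FETCH_ASSOC) as $row) {
    $lines[] = "Destination project: {$row['dest_project']} ID: {$row['destination_id']}";
    if ($row['source_id'] === null) {
        $lines[] = "Source ID for Destination ID {$row['destination_id']} is NULL";
    } else {
        $lines[] = "Source project: {$row['src_project']} ID: {$row['source_id']}";
    }
}
echo implode("\n", $lines), "\n";
```

However many rules there are, this issues exactly one query, which is where most of the 22 seconds is likely to go.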
I'm counting the right answers field of a table and saving that calculated value in another table. For this I'm using two queries. The first one is the count query, and I retrieve the value using loadResult(). After that I'm updating another table with this value and the date/time. The problem is that in some cases the calculated value is not being saved, only the date/time.
The queries look something like this:
$sql = 'SELECT count(answer)
FROM #_questionsTable
WHERE
answer = 1
AND
testId = '.$examId;
$db->setQuery($sql);
$rightAnsCount = $db->loadResult();
$sql = 'UPDATE #__testsTable
SET finish = "'.date('Y-m-d H:i:s').'", rightAns='.$rightAnsCount.'
WHERE testId = '.$examId;
$db->setQuery($sql);
$db->Query();
answer = 1 means that the question was answered ok.
I think that when the 2nd query is executed the first one has not finished yet, but everything I read says that it waits for the first query to finish before going to the 2nd, and I don't know how to make the 2nd query wait for the 1st one to end.
Any help will be appreciated. Thanks!
A PHP MySQL query is synchronous, i.e. it completes before returning. Joomla!'s database class doesn't implement any sort of asynchronous or callback functionality.
While you are missing a ';', that wouldn't account for it working some of the time.
How is the rightAns column defined? E.g. what happens when your $rightAnsCount is 0?
Turn on Joomla!'s debug mode and check the generated SQL in the profile section of the output. It looks something like this:
eg.
Profile Information
Application afterLoad: 0.002 seconds, 1.20 MB
Application afterInitialise: 0.078 seconds, 6.59 MB
Application afterRoute: 0.079 seconds, 6.70 MB
Application afterDispatch: 0.213 seconds, 7.87 MB
Application afterRender: 0.220 seconds, 8.07 MB
Memory Usage
8511696
8 queries logged.
SELECT *
FROM jos_session
WHERE session_id = '5cs53hoh2hqi9ccq69brditmm7'
DELETE
FROM jos_session
WHERE ( TIME < '1332089642' )
etc...
You may need to add a semicolon to the end of your SQL queries:
...testId = '.$examId.';';
Ah, something cppl mentioned is the key, I think. You may need to account for null values from your first query.
Changing this line:
$rightAnsCount = $db->loadResult();
To this might make the difference:
$rightAnsCount = (int) $db->loadResult(); // null becomes 0; loadResult() is called only once
Basically setting to 0 if there is no result.
I am pretty sure you can do this in one query instead:
$sql = 'UPDATE #__testsTable
SET finish = NOW()
, rightAns = (
SELECT count(answer)
FROM #_questionsTable
WHERE
answer = 1
AND
testId = '.$examId.'
)
WHERE testId = '.$examId;
$db->setQuery($sql);
$db->Query();
You can also update all values in all rows in your table this way by slightly modifying your query, so you can do all rows in one go. Let me know if this is what you are trying to achieve and I will rewrite the example.
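To show the single-statement idea end to end, here is a small sketch. It uses plain PDO with an in-memory SQLite database rather than Joomla!'s database class, and the table names tests and questions are simplified stand-ins for the prefixed tables in the question. Note that COUNT returns 0 rather than NULL when no rows match, so rightAns always receives a number.

```php
<?php
// Assumption: plain PDO + in-memory SQLite instead of Joomla!'s database class.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE questions (id INTEGER PRIMARY KEY, testId INTEGER, answer INTEGER)");
$pdo->exec("CREATE TABLE tests (testId INTEGER PRIMARY KEY, finish TEXT, rightAns INTEGER)");
$pdo->exec("INSERT INTO tests (testId) VALUES (1), (2)");
// Test 1 has two right answers; test 2 has no answered questions at all.
$pdo->exec("INSERT INTO questions (testId, answer) VALUES (1, 1), (1, 1), (1, 0)");

// Count and update in one statement; placeholders avoid string concatenation.
$update = $pdo->prepare(
    "UPDATE tests
     SET finish = :finish,
         rightAns = (SELECT COUNT(*) FROM questions WHERE answer = 1 AND testId = :qid)
     WHERE testId = :tid"
);
foreach ([1, 2] as $examId) {
    $update->execute([
        ':finish' => date('Y-m-d H:i:s'),
        ':qid' => $examId,
        ':tid' => $examId,
    ]);
}

// testId => rightAns
$rightAns = $pdo->query("SELECT testId, rightAns FROM tests ORDER BY testId")
    ->fetchAll(PDO::FETCH_KEY_PAIR);
print_r($rightAns);
```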
Given 5,000 IDs of records to fetch from the database, which query, in your opinion, is faster?
Loop through 5000 IDs using PHP and perform a SELECT query for each one,
foreach($ids as $id){
// do the query
$r = mysql_query("SELECT * FROM TABLE WHERE ID = {$id}");
}
Or collect all ids in an array, and use SELECT * FROM TABLE WHERE ID IN (1 up to 5000)
//assuming $ids = array(1,2 ---- up to 5000);
$r = mysql_query("SELECT * FROM TABLE WHERE ID IN (".join(",",$ids).")");
Without a shadow of a doubt, loading them all in one go will be faster. Running 5,000 queries is going to be a lot slower as each query will carry a certain amount of overhead.
Also, to speed it up even more, DON'T use the * wildcard! Select only the fields you are going to use. If you only need the ID column, specify this! If you want all the columns, still list them explicitly, so that fields added to the table later are not retrieved unnecessarily.
option 2 is definitely going to be faster. 5000 separate db queries are going to have huge network connection overhead.
The fastest way is not to request 5000 rows at all.
You barely need 100 to display them on one page. 5000 is way overkill
Sure, measure it, but I'd certainly recommend letting the database do the job.
All depends, I hope you're not creating a connection for each call though.
Loop is faster if you use a Query Statement using bind variables. Declare the Statement off the loop; then inside the loop bind the variable per each id.
Do not underestimate the time going into SQL parsing; especially on these long winded things.
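For comparison, both variants can be written with prepared statements, roughly as in the sketch below. It assumes PDO with an in-memory SQLite database and an invented items table so that it runs standalone; which variant is faster on a real server is something to measure rather than take from this snippet.

```php
<?php
// Assumption: in-memory SQLite and a made-up "items" table.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)");
$insert = $pdo->prepare("INSERT INTO items (id, name) VALUES (?, ?)");
for ($i = 1; $i <= 50; $i++) {
    $insert->execute([$i, "item $i"]);
}

$ids = [3, 7, 42];

// Variant 1: prepare once outside the loop, bind and execute per ID.
$stmt = $pdo->prepare("SELECT id, name FROM items WHERE id = ?");
$loop_rows = [];
foreach ($ids as $id) {
    $stmt->execute([$id]);
    $loop_rows[] = $stmt->fetch(PDO::FETCH_ASSOC);
}

// Variant 2: a single query with one placeholder per ID.
$placeholders = implode(',', array_fill(0, count($ids), '?'));
$stmt = $pdo->prepare("SELECT id, name FROM items WHERE id IN ($placeholders) ORDER BY id");
$stmt->execute($ids);
$in_rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

echo count($loop_rows), " rows via loop, ", count($in_rows), " rows via IN\n";
```

Either way the IDs travel as bound parameters instead of being concatenated into the SQL. Database drivers cap the number of placeholders per statement, so a very long ID list may need to be split into chunks.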
Option 2 is faster. With option 1 you do a full round trip to the server for each iteration.
I'd point out that in this case you might consider using paging to display the data.
Hint: Measure, Measure, Measure. With a code worth 10 minutes of your time you will have the answer right away.
Which is faster for many queries?
Try measuring it, for example like this:
<?php
$start = microtime(true); // built-in; returns the current time in seconds as a float
for ($i=0;$i<100000;$i++)
{
foreach($ids as $id){
// do the query
$r = mysql_query("SELECT * FROM TABLE WHERE ID = {$id}");
}
}
$end = microtime(true);
echo 'Time (1): '.($end- $start).' sec';
$start = microtime(true);
for ($i=0;$i<100000;$i++)
{
//assuming $ids = array(1,2 ---- up to 5000);
$r = mysql_query("SELECT * FROM TABLE WHERE ID IN (".join(",",$ids).")");
}
$end = microtime(true);
echo 'Time (2): '.($end- $start).' sec';
?>
Database structure:
id  galleryId                   type     file_name      description
1   `artists_2010-01-15_7c1ec`  `image`  `band602.jpg`  `Red Umbrella Promo`
2   `artists_2010-01-15_7c1ec`  `image`  `nov7.jpg`     `CD Release Party`
3   `artists_2010-01-15_7c1ec`  `video`  `band.flv`     `Presskit`
I'm going to pull images out for one section of an application, videos on another, etc. Is it better to make multiple mysql queries for each section like so:
$query = mysql_query("SELECT * FROM galleries WHERE galleryId='$galleryId' AND type='image'");
...Or should I be building an associative array and just looping through the array over and over whenever I need to use the result set?
Thanks for the thoughts.
It depends what's more important: readability or performance. I'd expect a single query and prefilling PHP arrays would be faster to execute, since database connections are expensive, but then a simple query for each section is much more readable.
Unless you know (and not just hope) you're going to get a huge amount of traffic I'd go for separate queries and then worry about optimising if it looks like it'll be a problem. At that point there'll be other things you'll want to do anyway, such as building a data access layer and adding some caching.
If by "sections" you mean separate single pages (separate HTTP requests) that users can view, I would suggest query-per-type as needed. If on a page where there are only image data sets, you really don't need to fetch the video data set for example. You won't be really saving much time fetching everything, since you will be connecting to the database for every page hit anyway (I assume.)
If by "sections" you mean different parts of one page, then fetch everything at once. This will save you time on querying (only one query.)
But depending on the size of your data set, you could run into trouble with PHP's memory limit querying for everything, though. You could then try raising the memory limit, but if that fails you'll probably have to fall back to query-per-type.
Using the query-per-type approach moves some of the computing load to the database server, as you will only be requesting and fetching what you really need. And you don't have to write code to filter and sort your results. Filtering and sorting is something the database is generally better at than PHP code. If at all possible, enable MySQL's query cache (available in versions before MySQL 8.0, which removed it); that will speed up these queries much more than anything you could write in PHP.
If your data is all coming from one table, I would only do one query.
I presume you are building a single page with a section for pictures, a section for video, a section for music, etc. Write your query return results sorted by media type - iterate through all the pictures, then all the video, then all the music.
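A minimal sketch of that single-query approach: fetch the gallery once, sorted by type, and bucket the rows into an associative array keyed by type. It assumes PDO with an in-memory SQLite database loaded with the three sample rows from the question.

```php
<?php
// Assumption: in-memory SQLite holding the sample rows from the question.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE galleries (id INTEGER PRIMARY KEY, galleryId TEXT, type TEXT, file_name TEXT, description TEXT)");
$insert = $pdo->prepare("INSERT INTO galleries (galleryId, type, file_name, description) VALUES (?, ?, ?, ?)");
$gallery = 'artists_2010-01-15_7c1ec';
$insert->execute([$gallery, 'image', 'band602.jpg', 'Red Umbrella Promo']);
$insert->execute([$gallery, 'image', 'nov7.jpg', 'CD Release Party']);
$insert->execute([$gallery, 'video', 'band.flv', 'Presskit']);

// One query for the whole gallery, sorted so each type comes out together.
$stmt = $pdo->prepare(
    "SELECT type, file_name, description FROM galleries WHERE galleryId = ? ORDER BY type, id"
);
$stmt->execute([$gallery]);

// Bucket the rows by media type.
$by_type = [];
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    $by_type[$row['type']][] = $row;
}

// Each page section then reads only its own bucket.
foreach ($by_type['image'] ?? [] as $image) {
    echo "Image: {$image['file_name']} ({$image['description']})\n";
}
foreach ($by_type['video'] ?? [] as $video) {
    echo "Video: {$video['file_name']} ({$video['description']})\n";
}
```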
Better to have multiple queries. Every time you run a query all the data is getting pulled out and loaded into memory. If you have 5 different types, it means each page of that type is loading 5 times as much data as it needs to do.
Even with just one at a time, you are probably going to want to start paginating with LIMIT/OFFSET queries fairly quickly if you have more than 100 or however many you can reasonably display on one page at a time.
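A paginated query-per-type could be sketched as follows. The helper name fetch_page() and the 25 generated rows are invented for illustration, and an in-memory SQLite database stands in for MySQL; the LIMIT ... OFFSET clause is written the same way in both.

```php
<?php
// Assumption: in-memory SQLite with 25 generated image rows.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE galleries (id INTEGER PRIMARY KEY, type TEXT, file_name TEXT)");
$insert = $pdo->prepare("INSERT INTO galleries (type, file_name) VALUES ('image', ?)");
for ($i = 1; $i <= 25; $i++) {
    $insert->execute(["photo$i.jpg"]);
}

// Hypothetical helper: returns one page of rows of the given type (pages start at 1).
function fetch_page(PDO $pdo, string $type, int $page, int $per_page): array
{
    $offset = max(0, $page - 1) * $per_page;
    $stmt = $pdo->prepare(
        "SELECT id, file_name FROM galleries WHERE type = :type ORDER BY id LIMIT :limit OFFSET :offset"
    );
    $stmt->bindValue(':type', $type);
    // LIMIT and OFFSET must be bound as integers, not strings.
    $stmt->bindValue(':limit', $per_page, PDO::PARAM_INT);
    $stmt->bindValue(':offset', $offset, PDO::PARAM_INT);
    $stmt->execute();
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$page3 = fetch_page($pdo, 'image', 3, 10); // rows 21 to 25
echo count($page3), " rows on page 3\n";
```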
It really depends,
IN operator
ini_set('memory_limit', '-1');
$startMemory = memory_get_usage();
$start_time = microtime(true); // start the timer that is read at the end
$conn = mysqli_connect("localhost", "", "", "");
$ar = array();
$sql = "SELECT * FROM table WHERE e IN (.....)";
$result = mysqli_query($conn, $sql);
while ($row = mysqli_fetch_assoc($result)) {
$ar[$row['c']] = $row;
}
echo (memory_get_usage() - $startMemory) / 1024 / 1024, ' MB'; //1409.7124481201
$end_time = microtime(true);
echo ($end_time - $start_time) . ' Seconds'; //5.2406549453735 Seconds
Foreach
ini_set('memory_limit', '-1');
$startMemory = memory_get_usage();
$start_time = microtime(true); // start the timer that is read at the end
$conn = mysqli_connect("localhost", "", "", "");
$ar = array();
$array_loop = array(....);
foreach($array_loop as $key => $value){
$sql = "SELECT * FROM table WHERE e = '$value'";
$result = mysqli_query($conn, $sql);
while ($row = mysqli_fetch_assoc($result)) {
$ar[$row['c']] = $row;
}
}
echo (memory_get_usage() - $startMemory) / 1024 / 1024, ' MB'; //42.773330688477 MB
$end_time = microtime(true);
echo ($end_time - $start_time) . ' Seconds'; //12.469061136246 Seconds
I noticed that foreach consumes time but not memory, while the IN operator consumes memory but not time. All tests were run against about 1 million rows of test data generated by an SQL procedure.