I am a newbie in MySQL and PHP.
I have the following code to get data within a date range (day 1 to day 2, then day 2 to day 3 and so on).
function getData($query) {
    global $connect;
    $result = mysqli_query($connect, $query);
    if (!$result) {
        echo 'MySQL Error: ' . mysqli_error($connect);
        die();
    }
    return mysqli_fetch_assoc($result); // returns the first row only
}
$dayZero = date_create('2017-01-21');
$dayToday = date_create(date('Y-m-d')); // date_create() expects a date string, not a format
$diff = date_diff($dayZero, $dayToday)->format('%a');

for ($i = 0; $i < $diff; ++$i) {
    $start[$i] = date('Y-m-d', date_format($dayZero, 'U') + (24*60*60) * $i);
    $end[$i]   = date('Y-m-d', date_format($dayZero, 'U') + (24*60*60) * ($i+1));
    $days[$i]  = getData('SELECT count(*) AS "b" FROM `table_name` WHERE `timestamp` BETWEEN "'.$start[$i].'" AND "'.$end[$i].'"')['b'];
}
The code works as expected, but it runs extremely slowly. My guess is that this is because it queries the database on every loop iteration.
Is there a way to make it run faster? Or is there any optimization that I can make?
Yes! Great question. While you can execute queries as you have done, the better option is to use prepared statements. This separates the query into a prepared statement and its variables; see here:
http://www.w3schools.com/php/php_mysql_prepared_statements.asp
The actual statement or query is sent to the server only once. After this, the server waits for you to supply the variables.
This is great for performance-sensitive applications (like yours), where the server is able to make use of caching to greatly speed things up. It is also the preferred method for secure applications, since the server is protected from injection attacks.
As a final note, there are many ways to optimize SQL queries, and this is just one of them. You should always be using prepared statements, though.
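For example, here is a sketch of your loop using a single prepared statement. It assumes your existing $connect mysqli handle and the table/column names from your question, and mysqli_stmt_get_result() requires the mysqlnd driver:

$stmt = mysqli_prepare($connect,
    'SELECT COUNT(*) AS b FROM `table_name` WHERE `timestamp` BETWEEN ? AND ?');

for ($i = 0; $i < $diff; ++$i) {
    $start[$i] = date('Y-m-d', date_format($dayZero, 'U') + 86400 * $i);
    $end[$i]   = date('Y-m-d', date_format($dayZero, 'U') + 86400 * ($i + 1));

    // Only the two date values travel to the server on each iteration;
    // the query text was parsed once, up front.
    mysqli_stmt_bind_param($stmt, 'ss', $start[$i], $end[$i]);
    mysqli_stmt_execute($stmt);

    $days[$i] = mysqli_stmt_get_result($stmt)->fetch_assoc()['b'];
}

If you want to go further, a single query grouped by DATE(`timestamp`) could replace the loop entirely, but the prepared statement alone already removes the per-iteration parse overhead.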
My goal is to get the time difference between two times, one from my database and one from the client's PHP timestamp, and compare them. Then, if the time difference is less than or equal to 10 seconds, do something.
My code, which does not work, is as follows.
date_default_timezone_set('America/Chicago');
$timestamp = date('Y-m-d H:i:s');
$sqlcheck = $dbh->prepare("SELECT timestem FROM mytable WHERE UNIX_TIMESTAMP(timestem) - UNIX_TIMESTAMP('".$timestamp."') <= 10");
$sqlcheck->execute();
if ($sqlcheck === '') {
    echo "Yes";
} else {
    echo "No";
}
timestem is a DATETIME column in MySQL. $sqlcheck is meant to represent the result of the query. If the query returns nothing, then echo Yes; if it returns something, then echo No.
Without getting too convoluted in explanation, my end-goal is to check how long it has been since a database operation before a client is allowed to perform updates.
You should use the MySQL built-in TIMESTAMPDIFF function:
$sqlcheck = $dbh->prepare('SELECT timestem FROM mytable WHERE TIMESTAMPDIFF(SECOND, timestem, ?) <= 10');
$sqlcheck->execute([$timestamp]);
Note that instead of concatenating strings, I am providing the $timestamp as a parameterized argument.
To check the result, you can't just check for string equality. Instead, use fetchColumn on the executed statement:
if ($timestem = $sqlcheck->fetchColumn()) {
    echo "YES";
    // You can also use the value of `$timestem` here.
} else {
    echo "NO";
}
Note that this assumes there is only one row; for multiple rows you need a loop. Since only rows that match the condition are returned, you will never see any NO output if you use a loop.
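A minimal sketch of such a loop, reusing the $sqlcheck statement from above:

// fetchColumn() returns the next row's first column, or false when no rows remain.
while (($timestem = $sqlcheck->fetchColumn()) !== false) {
    echo "YES: $timestem\n"; // only rows matching the condition ever reach this point
}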
You can just as easily do the check in PHP, which will probably be a bit more straightforward:
$sqlcheck = $dbh->prepare( "SELECT timestem FROM mytable" );
$sqlcheck->execute();
// You missed this step
$timestem = $sqlcheck->fetchColumn();
if ( time() - strtotime( $timestem ) > 10 )
print 'yes';
else
print 'no';
Some notes:
In your code you didn't actually get the value out of the database; you used the result object in your conditional.
Your code implicitly assumes there is only ever one record in your table. That probably isn't true: you will need a WHERE condition.
You have to be very certain that all of your timestamps are in the same time zone.
This will only work if you are using PDO, not mysqli. I can't tell which one you're using from your example, but a subtle difference is that PDO will let you prepare and execute with no bound parameters, while mysqli won't.
I've got a script which is supposed to run through a MySQL database and perform a certain 'test' on the cases. Simplified: the database contains records which represent trips made by persons. Each record is a single trip, but I want to use only round trips, so I need to search the database and match two trips to each other: the trip to and the trip from a certain location.
The script is working fine. The problem is that the database contains more than 600,000 cases. I know this should be avoided if possible, but for the purpose of this script and the use of the database records later on, everything has to stay together.
Executing the script currently takes hours when run on my iMac using MAMP. Of course I made sure that it can use a lot of memory, et cetera.
My question is how could I speed things up, what's the best approach to do this?
Here's the script I have right now:
$table = $_GET['table'];
$output = '';

// Select all cases that have not been marked as invalid in a previous test
$query = "SELECT persid, ritid, vertpc, aankpc, jaar, maand, dag FROM MON.$table WHERE reasonInvalid != '1' OR reasonInvalid IS NULL";
$result = mysql_query($query) or die($output .= mysql_error());

$totalCountValid = 0;
$totalCountInvalid = 0;
$totalCount = 0;

// For each record:
while ($row = mysql_fetch_array($result)) {
    $totalCount += 1;

    // Do another query: get all rows for this person ID that share postal codes.
    // The postal codes are reversed between the two trips.
    $persid = $row['persid'];
    $ritid = $row['ritid'];
    $pcD = $row['vertpc'];
    $pcA = $row['aankpc'];
    $jaar = $row['jaar'];
    $maand = $row['maand'];
    $dag = $row['dag'];

    $thecountquery = "SELECT * FROM MON.$table WHERE persid=$persid AND vertpc=$pcA AND aankpc=$pcD AND jaar = $jaar AND maand = $maand AND dag = $dag";
    $thecount = mysql_num_rows(mysql_query($thecountquery));

    if ($thecount >= 1) {
        // No worries, this person ID has a matching return trip
        $totalCountValid += 1;
    } else {
        // Ow my, the case is invalid!
        $totalCountInvalid += 1;
        // Call markInvalid from functions.php
        markInvalid($table, '2', 'ritid', $ritid);
    }
}

// Echo the result
$output .= 'Total cases: '.$totalCount.'<br>Valid: '.$totalCountValid.'<br>Invalid: '.$totalCountInvalid;
echo $output;
Your basic problem is that you are doing the following.
1) Getting all cases that haven't been marked as invalid.
2) Looping through the cases obtained in step 1).
What you can easily do is combine the queries from steps 1) and 2) into a single query and loop over the result, as in the sketch below. This will speed things up quite a bit.
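As a sketch (using the table and column names from your question), the per-row lookup can be folded into the first query with a self-join, so MySQL finds the trips without a matching return trip in one pass:

// Sketch only: one self-join instead of 600,000 per-row lookups.
// t2 is the candidate return trip: same person, same date, postal codes reversed.
$query = "SELECT t1.ritid
          FROM MON.$table t1
          LEFT JOIN MON.$table t2
            ON  t2.persid = t1.persid
            AND t2.vertpc = t1.aankpc
            AND t2.aankpc = t1.vertpc
            AND t2.jaar   = t1.jaar
            AND t2.maand  = t1.maand
            AND t2.dag    = t1.dag
          WHERE (t1.reasonInvalid != '1' OR t1.reasonInvalid IS NULL)
            AND t2.ritid IS NULL"; // NULL here means no return trip was found
$result = mysql_query($query) or die(mysql_error());

// Every row returned is an invalid case; mark it as before.
while ($row = mysql_fetch_array($result)) {
    markInvalid($table, '2', 'ritid', $row['ritid']);
}

A composite index on (persid, vertpc, aankpc, jaar, maand, dag) should make both this join and your original inner query far faster.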
Also bear in mind the following tips.
1) Selecting all columns is not a good idea at all; the extra data takes an ample amount of time to travel over the network. I would recommend replacing the wild-card with just the columns you really need:
SELECT <only_the_columns_you_need> instead of SELECT *
2) Use indexes - sparingly, efficiently and appropriately. Understand when to use them and when not to.
3) Use views if you can.
4) Enable MySQL slow query log to understand which queries you need to work on and optimize.
log_slow_queries = /var/log/mysql/mysql-slow.log
long_query_time = 1
log-queries-not-using-indexes
5) Use correct MySQL field types and the storage engine (Very very important)
6) Use EXPLAIN to analyze your queries. EXPLAIN is a useful MySQL command that can give you great detail about how a query is run: which index is used, how many rows it needs to check through, and whether it needs file sorts, temporary tables and other nasty things you want to avoid. See the example after this list.
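For example, running the inner query from your loop through EXPLAIN will tell you whether it uses an index or scans all 600,000 rows. This is a sketch with hypothetical literal values standing in for the PHP variables, and your real table name in place of your_table:

EXPLAIN SELECT * FROM MON.your_table
WHERE persid = 123 AND vertpc = 1011 AND aankpc = 2022
  AND jaar = 2011 AND maand = 5 AND dag = 17;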
Good luck.
When I run my script I receive the following error before processing all rows of data.
maximum execution time of 30 seconds exceeded
After researching the problem, I found I should be able to extend the max_execution_time setting, which should resolve the problem.
But being in my PHP programming infancy, I would like to know if there is a more optimal way of writing my script below, so I do not have to rely on "get out of jail" cards.
The script is:
1. Taking a CSV file
2. Cherry-picking some columns
3. Trying to insert 10k rows of CSV data into a MySQL table
In my head I think I should be able to insert in chunks, but that is so far beyond my skill set that I do not even know how to write one line :\
Many thanks in advance
<?php
function processCSV()
{
    global $uploadFile;

    include 'dbConnection.inc.php';
    dbConnection("xx","xx","xx");

    $rowCounter = 0;
    $loadLocationCsvUrl = fopen($uploadFile, "r");

    if ($loadLocationCsvUrl !== false)
    {
        // fgetcsv() takes a length as its second argument, not a delimiter;
        // the default delimiter is already a comma.
        while (($locationFile = fgetcsv($loadLocationCsvUrl)) !== false)
        {
            $officeId = $locationFile[2];
            $country = htmlspecialchars(trim($locationFile[9]));
            $open = htmlspecialchars(trim($locationFile[4]));

            $insString = "insert into countrytable set officeId='$officeId', countryname='$country', status='$open'";

            // Skip the header row; a plain if is clearer than the original switch
            if ($country != 'Country')
            {
                if (!mysql_query($insString))
                {
                    echo "<p>error " . mysql_error() . "</p>";
                }
            }
            $rowCounter++;
        }
        echo "$rowCounter inserted.";
        fclose($loadLocationCsvUrl);
    }
}

processCSV();
?>
First, in 2011 you do not use mysql_query; you use mysqli or PDO with prepared statements. Then you do not need to figure out how to escape strings for SQL. You used htmlspecialchars, which is totally wrong for this purpose. Next, you could use a transaction to speed up the many inserts; MySQL also supports multi-row INSERTs. A sketch of the prepared-statement-and-transaction combination follows.
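This is a sketch only, assuming a PDO connection $pdo; the table, columns, and CSV layout are taken from your script:

// One prepared INSERT, executed for every row inside a single transaction.
// No htmlspecialchars: escaping is handled by the bound parameters.
$stmt = $pdo->prepare(
    'INSERT INTO countrytable (officeId, countryname, status) VALUES (?, ?, ?)');

$pdo->beginTransaction();
while (($locationFile = fgetcsv($loadLocationCsvUrl)) !== false) {
    if (trim($locationFile[9]) === 'Country') {
        continue; // skip the header row, as the original switch did
    }
    $stmt->execute(array($locationFile[2],
                         trim($locationFile[9]),
                         trim($locationFile[4])));
}
$pdo->commit();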
But the best bet would be to use the CSV storage engine: http://dev.mysql.com/doc/refman/5.0/en/csv-storage-engine.html - read here. You can instantly load everything into SQL and then manipulate it there as you wish. The article also shows the LOAD DATA INFILE command.
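A LOAD DATA INFILE sketch for your file layout follows. The path is hypothetical, and the column mapping is an assumption based on your script (officeId = 3rd CSV column, status = 5th, country = 10th); the @d variables discard the columns you don't need:

-- Sketch: bulk-load the CSV server-side; adjust the path and mapping to your file.
LOAD DATA INFILE '/path/to/upload.csv'
INTO TABLE countrytable
FIELDS TERMINATED BY ','
IGNORE 1 LINES
(@d1, @d2, officeId, @d4, status, @d6, @d7, @d8, @d9, countryname);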
Well, you could create a single query like this.
$query = "INSERT INTO countrytable (officeId, countryname, status) VALUES ";
$entries = array();
while ($locationFile = fgetcsv($loadLocationCsvUrl, ',')) {
// your code
$entries[] = "('$officeId', '$country', '$open')";
}
$query .= implode(', ', $enties);
mysql_query($query);
But this depends on how long your query will be and what the server's limit is set to.
As you can read in the other answers, there are better ways to meet your requirements, but I thought I should share the approach you were already thinking about.
You can try calling the following function before inserting. It sets the time limit to unlimited instead of the default 30 seconds.
set_time_limit( 0 );
I am currently learning parametrized queries as there are advantages to using them.
Could someone give some pointers by converting this block of code to a parametrized version?
Thanks.
if (isset($_GET['news_art_id']) && (!empty($_GET['news_art_id'])))
{
    $news_art_id = htmlentities(strip_tags($_GET['news_art_id']));
    $news_art_id = validate_intval($news_art_id);
    //echo $news_art_id;
    $_SESSION['news_art_id'] = $news_art_id;

    // Assign value to status.
    $onstatus = 1;
    settype($onstatus, 'integer');

    $query = 'SELECT M.id, M.j_surname, M.j_points_count, M.j_level, A.j_user_id, A.id, A.jart_title, A.jart_tags, A.jart_description, A.jart_createddate FROM jt_articles A, jt_members M WHERE M.id = A.j_user_id AND A.id = ' . check_db_query_id($news_art_id) . " AND A.jart_status = $onstatus;";
    $result = mysql_query($query) or die('Something went wrong. ' . mysql_error());
    $artrows = mysql_num_rows($result);
}
The general rule is: every variable should be bound; no inline variables at all.
Technical details: http://php.net/manual/en/pdo.prepare.php
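Applied to your code, a PDO version would look roughly like this. It is a sketch assuming $dbh is a PDO connection; the table and column names are yours:

// Both variables become bound placeholders instead of inline values.
$query = 'SELECT M.id, M.j_surname, M.j_points_count, M.j_level, A.j_user_id,
                 A.id, A.jart_title, A.jart_tags, A.jart_description,
                 A.jart_createddate
          FROM jt_articles A, jt_members M
          WHERE M.id = A.j_user_id AND A.id = :art_id AND A.jart_status = :status';

$stmt = $dbh->prepare($query);
$stmt->execute(array(':art_id' => $news_art_id, ':status' => $onstatus));

$rows    = $stmt->fetchAll(PDO::FETCH_ASSOC);
$artrows = count($rows);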
In your case there is no advantage. Remember that a parameterised query requires 2 calls to the db: one to set up the query template and parse it, the other to populate the template's params; it is typically used when looping. So in this instance you're better off calling a stored procedure (always the best choice) or using inline SQL and making sure you use http://php.net/manual/en/function.mysql-real-escape-string.php when applicable.
I'm writing a semi-simple database wrapper class and want to have a fetching method which would operate automagically: it should prepare each different statement only the first time around and just bind and execute the query on successive calls.
I guess the main question is: how does re-preparing the same MySQL statement work? Will PDO magically recognize the statement (so I don't have to) and skip the re-preparation?
If not, I'm planning to achieve this by generating a unique key for each different query and keeping the prepared statements in a private array in the database object, under their unique keys. I'm planning to obtain the array key in one of the following ways (none of which I like). In order of preference:
have the programmer pass an extra, always-the-same parameter when calling the method - something along the lines of basename(__FILE__, ".php") . __LINE__ (this method would work only if our method is called within a loop, which is the case most of the time this functionality is needed)
have the programmer pass a totally random string (most likely generated beforehand) as an extra parameter
use the passed query itself to generate the key - getting the hash of the query or something similar
achieve the same as the first bullet (above) by calling debug_backtrace
Has anyone similar experience? Although the system I'm working for does deserve some attention to optimization (it's quite large and growing by the week), perhaps I'm worrying about nothing and there is no performance benefit in doing what I'm doing?
MySQL (like most DBMSs) will cache execution plans for prepared statements, so if user A creates a plan for:
SELECT * FROM some_table WHERE a_col=:v1 AND b_col=:v2
(where :v1 and :v2 are bind vars) and then sends values to be interpolated by the DBMS, and user B then sends the same query (but with different values for interpolation), the DBMS does not have to regenerate the plan. That is, it's the DBMS which finds the matching plan, not PDO.
However, this means that each operation on the database requires at least 2 round trips (the first to present the query, the second to present the bind vars), as opposed to a single round trip for a query with literal values, so it introduces additional network costs. There is also a small cost involved in dereferencing (and maintaining) the query/plan cache.
The key question is whether this cost is greater than the cost of generating the plan in the first place.
While (in my experience) there definitely seems to be a performance benefit using prepared statements with Oracle, I'm not convinced that the same is true for MySQL - however, a lot will depend on the structure of your database and the complexity of the query (or more specifically, how many different options the optimizer can find for resolving the query).
Try measuring it yourself (hint: you might want to set the slow query threshold to 0 and write some code to convert literal values back into anonymous representations for the queries written to the logs).
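A crude sketch of that conversion; the regexes only handle simple numeric and single-quoted string literals:

// Replace literal values with '?' so structurally identical queries
// from the slow query log compare equal.
function anonymize_query($sql)
{
    $sql = preg_replace("/'(?:[^'\\\\]|\\\\.)*'/", '?', $sql); // string literals
    $sql = preg_replace('/\b\d+(\.\d+)?\b/', '?', $sql);       // numeric literals
    return $sql;
}

echo anonymize_query("SELECT * FROM some_table WHERE a_col = 42 AND b_col = 'x'");
// SELECT * FROM some_table WHERE a_col = ? AND b_col = ?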
Believe me, I've done this before, and after building a cache of prepared statements the performance gain was very noticeable - see this question: Preparing SQL Statements with PDO.
And this is the code I came up with afterwards, with cached prepared statements:
function DB($query)
{
    static $db = null;
    static $result = array();

    if (is_null($db) === true)
    {
        // First call: $query is the SQLite path, not a query
        $db = new PDO('sqlite:' . $query, null, null, array(PDO::ATTR_ERRMODE => PDO::ERRMODE_WARNING));
    }
    else if (is_a($db, 'PDO') === true)
    {
        // Cache prepared statements, keyed by the MD5 hash of the query string
        $hash = md5($query);

        if (empty($result[$hash]) === true)
        {
            $result[$hash] = $db->prepare($query);
        }

        if (is_a($result[$hash], 'PDOStatement') === true)
        {
            // Any arguments after $query are the values to bind
            if ($result[$hash]->execute(array_slice(func_get_args(), 1)) === true)
            {
                if (stripos($query, 'INSERT') === 0)
                {
                    return $db->lastInsertId();
                }
                else if (stripos($query, 'SELECT') === 0)
                {
                    return $result[$hash]->fetchAll(PDO::FETCH_ASSOC);
                }
                else if ((stripos($query, 'UPDATE') === 0) || (stripos($query, 'DELETE') === 0))
                {
                    return $result[$hash]->rowCount();
                }
                else if (stripos($query, 'REPLACE') === 0)
                {
                }
                return true;
            }
        }
        return false;
    }
}
Since I don't need to worry about collisions in queries, I've ended up using md5() instead of sha1().
OK, since I've been bashing methods of keying the queries for the cache other than simply using the query string itself, I've done a naive benchmark. The following compares using the plain query string as the key vs first creating the md5 hash:
$ php -v
PHP 5.3.0-3 with Suhosin-Patch (cli) (built: Aug 26 2009 08:01:52)
...
$ php benchmark.php
Plain string keys: 0.19465494155884 seconds
MD5 keys: 0.57781004905701 seconds
799994
The code:
<?php
error_reporting(E_ALL);

$queries = array("SELECT",
                 "INSERT",
                 "UPDATE",
                 "DELETE",
);

$query_length = 256;
$num_queries  = 256;
$iter         = 10000;

// Generate $num_queries random lowercase strings of $query_length characters
// to stand in for distinct queries.
for ($i = 0; $i < $num_queries; $i++) {
    $q = implode('',
           array_map("chr",
             array_map("rand",
               array_fill(0, $query_length, ord("a")),
               array_fill(0, $query_length, ord("z")))));
    $queries[] = $q;
}

echo count($queries), "\n";

// Variant 1: use the query string itself as the cache key.
$cache = array();
$side_effect1 = 0;
$t = microtime(true);
for ($i = 0; $i < $iter; $i++) {
    foreach ($queries as $q) {
        if (!isset($cache[$q])) {
            $cache[$q] = $q;
        }
        else {
            $side_effect1++;
        }
    }
}
echo microtime(true) - $t, "\n";

// Variant 2: hash the query with md5() first and use the hash as the key.
$cache = array();
$side_effect2 = 0;
$t = microtime(true);
for ($i = 0; $i < $iter; $i++) {
    foreach ($queries as $q) {
        $md5 = md5($q);
        if (!isset($cache[$md5])) {
            $cache[$md5] = $q;
        }
        else {
            $side_effect2++;
        }
    }
}
echo microtime(true) - $t, "\n";

echo $side_effect1 + $side_effect2, "\n";
To my knowledge, PDO does not reuse already-prepared statements; it does not analyse the query by itself, so it does not know if it is the same query.
If you want to create a cache of prepared queries, the simplest way, imho, would be to md5-hash the query string and generate a lookup table.
OTOH: how many queries are you executing per minute? If fewer than a few hundred, then you're only complicating the code; the performance gain will be minor.
Using an MD5 hash as a key, you could eventually get two queries that result in the same MD5 hash. The probability is not high, but it could happen. Don't do it. Lossy hashing algorithms like MD5 are just meant as a way to tell with high certainty whether two objects are different; they are not a safe means of identifying something.