Equivalent of a PHP loop in possibly one query - php

I have a table called 'Visits' that stores MAC addresses along with their timestamps. I have the task to check each day's MAC addresses against those of the previous days and if found, update them as 'Repeat' in today's records with the number of visits made so far (excluding today).
I have written the following PHP code that does the job nicely but the problem is that today it's taking 586.4 seconds to execute (checking 1,500 MACs against 70,000 from the previous 40 days) and it will surely become worse with each passing day.
$STH = $DBH->prepare("SELECT DISTINCT MAC FROM `Visits` WHERE TimeStamp=:TimeStamp");
$STH->bindParam(':TimeStamp', $unixDataDate);
$STH->execute();
while ($r = $STH->fetch(PDO::FETCH_ASSOC)) {
    $MAC = $r['MAC'];
    $STH2 = $DBH->prepare("SELECT COUNT(ID) FROM `Visits` WHERE MAC=:MAC AND TimeStamp<:TimeStamp");
    $STH2->bindParam(':MAC', $MAC);
    $STH2->bindParam(':TimeStamp', $unixDataDate);
    $STH2->execute();
    $prevVisits = $STH2->fetchColumn();
    if ($prevVisits > 0) {
        $STH3 = $DBH->prepare("UPDATE `Visits` SET RepeatVisitor=:RepeatVisitor WHERE MAC=:MAC AND TimeStamp=:TimeStamp");
        $STH3->bindParam(':RepeatVisitor', $prevVisits);
        $STH3->bindParam(':MAC', $MAC);
        $STH3->bindParam(':TimeStamp', $unixDataDate);
        $STH3->execute();
    }
}
Now I tried several ways to construct a query to do this job and compare execution times but I couldn't get the same results. Any help as to whether it's possible to do this task in one inexpensive query and how to format it would be greatly appreciated.

I assume that Visits.TimeStamp is a date
UPDATE Visits
JOIN (SELECT MAC, COUNT(*) AS cnt
      FROM Visits
      WHERE TimeStamp != '[Yesterday generated in PHP]'
        AND TimeStamp > '[40 days ago generated in PHP]'
      GROUP BY MAC) v2 ON v2.MAC = Visits.MAC
SET Visits.RepeatVisitor = v2.cnt
WHERE Visits.TimeStamp = '[Yesterday generated in PHP]'

Have you placed indexes on the MAC and TimeStamp columns? If not, adding them should speed things up considerably.
Also, move the prepare statements outside of the while loop: you're preparing the same query over and over, which defeats the purpose. Prepare once, execute many times.
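Following the prepare-once advice, the reworked loop might look like this. This is only a sketch: the function name markRepeatVisitors is mine, and it assumes the same Visits schema, PDO handle and $unixDataDate as in the question.

```php
<?php
// Sketch: prepare each statement once, then execute many times inside the loop.
// Assumes the Visits table layout from the question.
function markRepeatVisitors(PDO $DBH, $unixDataDate): void
{
    $count = $DBH->prepare(
        "SELECT COUNT(ID) FROM `Visits` WHERE MAC = :MAC AND TimeStamp < :TimeStamp");
    $update = $DBH->prepare(
        "UPDATE `Visits` SET RepeatVisitor = :RepeatVisitor
         WHERE MAC = :MAC AND TimeStamp = :TimeStamp");
    $macs = $DBH->prepare(
        "SELECT DISTINCT MAC FROM `Visits` WHERE TimeStamp = :TimeStamp");

    $macs->execute([':TimeStamp' => $unixDataDate]);
    while ($r = $macs->fetch(PDO::FETCH_ASSOC)) {
        $count->execute([':MAC' => $r['MAC'], ':TimeStamp' => $unixDataDate]);
        $prev = (int) $count->fetchColumn();
        if ($prev > 0) {
            $update->execute([
                ':RepeatVisitor' => $prev,
                ':MAC'           => $r['MAC'],
                ':TimeStamp'     => $unixDataDate,
            ]);
        }
    }
}
```

This avoids re-parsing the statement on every iteration, though the single-query UPDATE above should still be far faster.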

Related

MySQL PHP Order by Random for a certain time?

For example, say you wanted a random result every 10 minutes. Is there a way to achieve this with ORDER BY RAND()?
$fetch = mysqli_query($conn, "
SELECT *
FROM food
JOIN food_images ON food.size = food_images.size
ORDER BY RAND()
");
I'm also using a JOIN and am worried this might affect the answers. Thank you!
I don't have a MySQL server in front of me so most of this is a guess, but you might try as follows:
You can generate a number that changes only once every ten minutes by taking the system time in seconds, dividing by the number of seconds in ten minutes, and then casting to an integer:
$seed = (int) (time() / 600);
Then pass this value to MySQL's RAND() function as a parameter to seed the RNG, and you should get a repeatable sequence that changes every ten minutes:
$stmt = mysqli_prepare($conn, 'SELECT ... ORDER BY RAND(?)');
mysqli_stmt_bind_param($stmt, 'i', $seed);
mysqli_stmt_execute($stmt);
You can do it as:
SELECT *, rand(time_to_sec(current_time()) / 600) as ord
FROM food
JOIN food_images ON food.size = food_images.size
order by ord
The parameter of the RAND() function is the seed; the expression inside it changes only once every 10 minutes.
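Putting the seed computation on the PHP side, as the first answer suggests, can be isolated in one small function (a sketch; the function name is mine, and intdiv requires PHP 7+):

```php
<?php
// Returns a seed that stays constant within any given 10-minute window,
// so RAND(seed) gives the same ordering for the whole window.
function tenMinuteSeed(int $unixTime): int
{
    return intdiv($unixTime, 600); // 600 seconds = 10 minutes
}

// Hypothetical usage with mysqli (assumes $conn and the tables from the question):
// $stmt = mysqli_prepare($conn,
//     'SELECT * FROM food JOIN food_images ON food.size = food_images.size ORDER BY RAND(?)');
// mysqli_stmt_bind_param($stmt, 'i', tenMinuteSeed(time()));
// mysqli_stmt_execute($stmt);
```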
You can use the MySQL Event Scheduler, which, as described in the documentation:
you are creating a named database object containing one or more SQL statements to be executed at one or more regular intervals, beginning and ending at a specific date and time
And since I guess you are using PHP, you can use a PHP cron job too: Managing Cron Jobs With PHP

SQL query optimization - multiple query or DAYOFYEAR()?

I need to run queries with several conditions which will result in a large dataset. While all the conditions are straightforward, I need advice regarding 2 issues in terms of speed optimization:
1) If I need to run those queries between 1st Apr till 20th June of each year for last 10 years, I have 2 options in my knowledge:
a. Run the query 10 times
$year = 2015;
$start_month_date = "-04-01";
$end_month_date = "-06-20";
for ($i = 0; $i < 10; $i++) {
    $start = $year.$start_month_date;
    $end = $year.$end_month_date;
    $result = mysql_query("....... WHERE .... AND `event_date` BETWEEN '$start' AND '$end'");
    // PUSH THE RESULT TO AN ARRAY
    $year = $year - 1;
}
b. Run the query single time, however query will compare by DayOfYear (hence each date has to be converted to DayOfYear by the query)
$start = Date("z", strtotime("2015-04-01")) + 1;
$end = Date("z", strtotime("2015-06-20")) + 1;
$result = mysql_query("....... WHERE .... AND DAYOFYEAR(`event_date`) BETWEEN $start AND $end");
I am aware of the 1 day difference in day count for leap year with other years, but I can live with that. I am sensing 1.b is more optimized, just want to verify.
2) I have a large query with 2 sub query. When I want to limit the result by date, I should put the conditions inside or outside the sub query?
a. Inside sub query means it has to validate the condition twice
SELECT X.a, X.b, Y.c FROM
(SELECT * FROM mytable WHERE `event_date` BETWEEN '$startdate' AND '$enddate' AND `case` = 'AAA' AND .......) X,
(SELECT * FROM mytable WHERE `event_date` BETWEEN '$startdate' AND '$enddate' AND `case` = 'BBB' AND .......) Y
WHERE X.`event_date` = Y.`event_date` AND ........... ORDER BY `event_date`
b. Outside sub query means it will validate once, but has to join a larger dataset (for which I need to set SQL_BIG_SELECTS = 1)
SELECT X.a, X.b, Y.c FROM
(SELECT * FROM mytable WHERE `case` = 'AAA' AND .......) X,
(SELECT * FROM mytable WHERE `case` = 'BBB' AND .......) Y
WHERE X.`event_date` = Y.`event_date` AND X.`event_date` BETWEEN '$startdate' AND '$enddate' AND ........... ORDER BY `event_date`
Again, in my opinion 2.a is more optimized, but requesting your advise.
Thanks
(1) Running the queries 10 times with event_date BETWEEN $start AND $end will be faster when the SQL engine can take advantage of an index on event_date. This could be significant, but it depends on the rest of the query.
Also, because you are ordering the entire data set, running 10 queries is likely to be a bit faster. That's because sorting is O(n log(n)), meaning that it takes longer to sort larger data sets. As an example, sorting 100 rows might take X time units. Sorting 1000 rows might take X * 10 * log(10) time units. But, sorting 100 rows 10 times takes just X * 10 (this is for explanatory purposes).
(2) Don't use subqueries if you can avoid them in MySQL. The subqueries are materialized, which adds additional overhead. Plus, they then prevent the use of indexes. If you need to use subqueries, filter the data as much as possible in the subquery. This reduces the data that needs to be stored.
I assume you have lots of rows over the 10 years; otherwise this wouldn't be much of an issue.
Your best bet is to run EXPLAIN on the different queries you plan to use; that will tell you which index each one can use, as we currently don't know them (you didn't post the structure of the table).
1.b uses a function in the WHERE clause, so it will be terrible: it won't be able to use an index on the date (assuming there is one), so it will read the entire table.
One thing you could do is ask the database to join the result sets of the 10 queries together using UNION; MySQL would combine the results instead of PHP (see https://dev.mysql.com/doc/refman/5.0/en/union.html)
2 - As Gordon said, filter the data as much as possible. However, instead of trying options blindly, you can use EXPLAIN and the database will help you decide which one makes the most sense.
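The UNION suggestion can be sketched as a small query builder. This is a hypothetical helper (the function name is mine); the table name mytable is from the question, and your real SELECT list and extra conditions would replace the `SELECT *`:

```php
<?php
// Builds one UNION ALL query covering the same Apr 1 - Jun 20 window
// for each of the last $years years, so each branch can use an index
// on event_date. Sketch only; adapt the column list to your query.
function buildSeasonUnion(int $latestYear, int $years): string
{
    $parts = [];
    for ($i = 0; $i < $years; $i++) {
        $y = $latestYear - $i;
        $parts[] = "SELECT * FROM mytable WHERE `event_date` "
                 . "BETWEEN '$y-04-01' AND '$y-06-20'";
    }
    return implode("\nUNION ALL\n", $parts);
}
```

UNION ALL is used rather than UNION so the server doesn't pay for deduplication the year ranges can't produce anyway.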

get new result from sql every week

I have a table and I want to pick one row from it and show it to the user. Every week I want the website to automatically pick another row at random. So basically, I want to get a new result every week, not every time a user visits the page.
I am using this code right now :
$res = mysql_query("SELECT COUNT(*) FROM fruit");
$row = mysql_fetch_array($res);
$offset = rand(0, $row[0]-1);
/* the first three lines to pick a row randomly from the table */
$res = mysql_query("SELECT * FROM fruit LIMIT $offset, 1");
$row = mysql_fetch_assoc($res);
This code gets a new result every time the user visits the page, and after every refresh another random row gets chosen. I want it to update every week so the results are the same for every user. Is there a PHP command that does that? If so, how does it work?
My suggestion would be as follows:
Store the random result id and a timestamp in some other kind of persistent storage (file, DB table, etc).
Set up a cron job or other automated task to update that record weekly. If you don't have access to such solutions, you could write code to do it on each page load and check against the timestamp column; however, that's pretty inefficient.
Yes, there is. Use the date function in PHP and write each week and the corresponding row to a file using fwrite. Then, using an if statement, check if it is a new week: if it is, get a new random row, write it to the file and return it; if it isn't, return the same one for that week.
A cronjob is the best solution. Create a script weeklynumber.php, much as what you have already, that generates an entry. After this, go to your console, and open your crontab file using crontab -e.
In here, you may add
0 0 * * 0 php /path/to/weeklynumber.php
This means that every Sunday at 0:00, php /path/to/weeklynumber.php is executed.
But all of this assumes you're on UNIX and that you have access to creating cronjobs. If not, here's another solution: Hash the week number and year, and use that to generate the weekly number.
// Get the current week and year
$week = date('Wy');
// Get the MD5 hash of this
$hash = md5($week);
// Get the amount of records in the table
$count = mysql_result(mysql_query("SELECT COUNT(*) FROM fruit"),0);
// Convert the MD5 hash to an integer
$num = base_convert($hash, 16, 10);
// Use the last 6 digits of the number and take modulus $count
$num = substr($num,-6) % $count;
Note that the above will only work as long as the amount of records in your table doesn't change.
And finally, just a little note on your current method. Instead of counting rows, getting a random number from PHP, and asking your DBMS to return that row, it can all be done with a single query
SELECT * FROM fruit ORDER BY RAND() LIMIT 1
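Combining the hash idea with a deterministic offset, the weekly pick can be isolated in one small function. This is a sketch (the function name is mine); as noted above, it only stays stable while the row count doesn't change:

```php
<?php
// Deterministic offset for the week: every visitor gets the same row
// all week, and a different one the next week. Based on the md5 idea
// above, but using only the first 8 hex digits so hexdec() stays exact.
function weeklyOffset(string $weekYear, int $rowCount): int
{
    return hexdec(substr(md5($weekYear), 0, 8)) % $rowCount;
}

// Hypothetical usage with the fruit table from the question:
// $offset = weeklyOffset(date('Wy'), $count);
// $res = mysql_query("SELECT * FROM fruit LIMIT $offset, 1");
```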

Method for preventing a visitor from making more than 2 posts per day (PHP MySQL)

I've been searching for hours for a solution and either I can't find the right words to describe what I need in the search engines, or I'm just bad at finding things.
Here's the situation:
I'm currently making a website and I have a section where visitors can post messages, kind of like a "guestbook". However, I want to limit an IP address to 2 posts per day. I thought it'd be easy at first, just insert the IP address, date, and time along with all the other data into the mysql table and then do a comparison of the times and dates from each ip. Something like that.... Then as I got to working on it, so many questions came to mind. How would I compare the entries if the IP has posted more than 2 messages over a number of days? How would you even start comparing the date and time accurately? What's the best time and date format for comparison? Is there a better way such as self expiring data that you can compare to? And so on.. Sorry if this seems like such a simple task but I am having a hard time finding the answers. Tried googling everything such as "mysql php time limit", "php mysql prevent spam timer", "php mysql timer like megavideo" etc.
Just to clarify, I need a good method for preventing a visitor from posting more than 2 message per day. This is a "guestbook" kind of thing so any visitor can post. No logins.
Thank you in advance!
Create a table called exceptions with structure id, ip, date
and when a user posts something, execute a query like:
$ip = $_SERVER['REMOTE_ADDR'];
$query = "INSERT INTO exceptions SET ip = '{$ip}', date = NOW()";
and before letting the user post something, add this:
$ip = $_SERVER['REMOTE_ADDR'];
// mysql_num_rows would always be 1 for a COUNT query;
// fetch the actual count value instead
$count = mysql_result(mysql_query("select count(*)
    from exceptions where ip = '{$ip}'
    and date > DATE_SUB(NOW(), INTERVAL 24 HOUR)"), 0);
if ($count >= 2) {
    echo 'You have exceeded posting limits, please try again in 24 hours';
    exit;
}
something like this:
posts_in_last_24_hours = select count(*) from posts where ip=? and date > DATE_SUB(NOW(), INTERVAL 24 HOUR) -- 24 hours ago
if that is 2:
fail
else:
post message
If you record in your database the IP on every post, you can check if it has posted more than 2 times in the last X days with a query counting the number of records with that IP in the last X days, something like this:
SELECT COUNT(IP) FROM posts_log WHERE IP = 'the_user_ip' AND post_date > DATE_SUB(NOW(), INTERVAL X DAY)
Using a trigger
Record the IP and store that in the guestbook table.
Then create a trigger like so:
DELIMITER $$
CREATE TRIGGER bi_guestbook_each BEFORE INSERT ON guestbook FOR EACH ROW
BEGIN
  DECLARE posts_today INTEGER;
  SELECT COUNT(*) INTO posts_today FROM guestbook
    WHERE ip = new.ip AND postdate = new.postdate;
  IF posts_today > 1 THEN BEGIN
    -- Force an error by selecting from a non-existing table;
    -- this will prevent the insert from happening.
    SELECT no_more_than_2_post_per_day_allowed
    FROM before_insert_guestbook_error;
  END; END IF;
END$$
DELIMITER ;
Now you can just do inserts in your PHP code without worrying about the daily limit.
The trigger will do the checking for you automatically.
Using insert .. select
If you don't want to use triggers, use an insert ... select statement.
First make sure that field ip is defined in the table definition as NOT NULL with no default.
This will make the following insert fail when trying to insert a null value into the guestbook table.
<?php
$name = mysql_real_escape_string($_GET['name']);
$post = .....
$ip = ....
$query = "INSERT INTO guestbook (id, name, post_text, ip, postdate)
          SELECT NULL AS id
               , '$name' AS name
               , '$post' AS post_text
               , IF(COUNT(*) >= 2, NULL, '$ip') AS ip
               , CURDATE() AS postdate
          FROM guestbook WHERE ip = '$ip' AND postdate = CURDATE()";
Of course you can put the IF(COUNT(*) >= 2, NULL, ...) in place of any column that is defined as NOT NULL in the table definition.
Links:
insert ... select: http://dev.mysql.com/doc/refman/5.0/en/insert-select.html
triggers: http://dev.mysql.com/doc/refman/5.1/en/triggers.html
You could implement this simply with cookies. Anyone determined enough to get round the posting limit would be able to get round an IP block anyway, since proxies and dynamic IPs make it really easy. So blocking by cookie or by IP are about as useful as each other (not especially), but cookies are easier to implement:
If you store a counter in the cookie and set the expiry date to 24 hours in the future, you can then increment the counter when the user posts an entry, and simply check it to see if they've reached their limit.
That way you don't have to worry about mucking about with a database and any performance issues it can introduce.
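The cookie check could be factored like this. A sketch only: the cookie name post_count and the 2-post limit are assumptions from this thread, and remember a determined visitor can simply delete the cookie:

```php
<?php
// Returns true if the visitor may still post, based on a counter cookie.
// A missing cookie means no posts yet today.
function mayPost(?string $cookieValue, int $limit = 2): bool
{
    return (int) ($cookieValue ?? '0') < $limit;
}

// Hypothetical usage:
// if (mayPost($_COOKIE['post_count'] ?? null)) {
//     $n = (int) ($_COOKIE['post_count'] ?? 0) + 1;
//     setcookie('post_count', (string) $n, time() + 86400); // expires in 24h
//     // ... save the guestbook entry ...
// }
```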
Building upon Neville K's captcha point. If spam is your main concern, try out a negative captcha - it doesn't impact the user's experience since they never see it.
You've already got plenty of reasonable looking answers, so I won't add anything to that - but I do think you need to be careful using IP address as the unique identifier of the visitor. It's not a reliable way of identifying users - for instance, most people within a single company/university/government installation will appear to come from the same IP address, so this implementation would limit everyone within that structure to 2 posts per day.
The best way of avoiding spam is to include CAPTCHA - it's a pain in the backside for users, but it reliably foils robots; unless your site is super high value, spammers won't be motivated enough to have human beings post spam.

Mysql select query taking over 8 seconds yet limit is 5 records

following the answers i got on this thread PHP MySQL Query to get most popular from two different tables
I've timed the query and I'm getting disturbing times when calling it: over 8 seconds. The afrostarvideos table has about 2000 records, afrostarprofiles about 300, even after limiting the query to only five rows. Go to http://www.veepiz.com/index.php and check the generation time under popular artists. Please help optimize this query
//Get current time
$mtime = microtime();
//Split seconds and microseconds
$mtime = explode(" ",$mtime);
//Create one value for start time
$mtime = $mtime[1] + $mtime[0];
//Write start time into a variable
$tstart = $mtime;
$q = "SELECT `p`.*, SUM(`v`.`views`)+`p`.`views` AS `totalViews` FROM `afrostarprofiles` `p` LEFT OUTER JOIN `afrostarvideos` `v` ON `p`.`id` = `v`.`artistid` ".
     "GROUP BY `p`.`id` ORDER BY SUM(`v`.`views`)+`p`.`views` DESC LIMIT $pas_start,$end";
$r=mysql_query($q);
//Get current time as we did at start
$mtime = microtime();
$mtime = explode(" ",$mtime);
$mtime = $mtime[1] + $mtime[0];
//Store end time in a variable
$tend = $mtime;
//Calculate the difference
$totaltime = ($tend - $tstart);
//Output result
printf ("Page was generated in %f seconds !", $totaltime);
First thing to know - as soon as you have an ORDER BY, the LIMIT doesn't help you.
Once MySQL realizes that it needs to order the results before it can limit them, it has to go through the entire table to get all the results before it orders them (and then limits them) for you.
That said (and without seeing the EXPLAIN for your query [which would be useful]), I'm guessing that most of the time is taken by that LEFT JOIN.
Make sure that there is an index on afrostarprofiles.id (I'm guessing it's the primary key, so you're good) and afrostarvideos.artistid.
Also, unless you want to get a result even when there are no matching artists in your second join, I would recommend trying a JOIN (otherwise known as an INNER JOIN) instead of a LEFT JOIN.
Based on the previous question, you should cache this query: it's a "most popular" list and probably doesn't need to be up to date all the time. I did this on one of my websites for popular content, and it's even fun to build! One small catch: you need something like crontab on your server to schedule a script that runs the query. All in all, I hope you can cache this kind of stuff, since it's so much easier on the server.
I hope I was of help.
Others have answered most of the points, but I will add a little more on the "design" aspect: any "aggregation" or data-crunching operation on large tables will always hurt performance, since nothing is free :) so it may not be a bad idea (rather, a good one) to pre-build the results.
There are a few choices for pre-building:
Scheduled - put a cron job in place and choose the right frequency to run it.
Trigger/event based - most databases support triggers, so pick the right event (insert/update/delete of a row in the source tables) on which to rebuild the summary/aggregate table.
The choice between these options may depend on your overall application, business rules and desired results.
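For the scheduled option, the rebuild can be a single aggregate query run from cron. A sketch: the summary table artist_view_totals is a hypothetical name (it would need artist_id as its primary key for REPLACE to work), while the source tables are from the question:

```php
<?php
// Builds the SQL that rebuilds a hypothetical summary table from the
// two source tables; run it from a cron script, e.g. every 10 minutes.
function buildSummaryRebuildSql(): string
{
    return "REPLACE INTO artist_view_totals (artist_id, total_views)
            SELECT p.id, COALESCE(SUM(v.views), 0) + p.views
            FROM afrostarprofiles p
            LEFT JOIN afrostarvideos v ON v.artistid = p.id
            GROUP BY p.id";
}

// Hypothetical usage from the cron script:
// mysql_query(buildSummaryRebuildSql());
```

The page then reads the top 5 from the small pre-built table instead of aggregating 2000 video rows on every request.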
