Delete old rows in table if maximum exceeded - php

I insert a row into a table every time a user opens a post on my site; this gives me a real-time "What's happening on the site" feed.
mysql_query("INSERT INTO `just_watched` (`content_id`) VALUES ('{$id}')");
But now I have a problem: with over 100K hits a day, that is 100K new rows in this table every day. Is there a way to limit the table to a maximum of 100 rows, so that when the maximum is exceeded the oldest 90 are deleted before inserting again, or something like that? I have no idea what the right way to do this is.
My table, just_watched:
ID - content_id
ID INT(11) - AUTO_INCREMENT
content_id INT(11)

The easiest way that comes to mind is to use PHP logic to delete and insert your rows. Every time a user opens a post you add a row to the database (this you are already doing).
The new part is this:
Add a check before the insertion: before anything is inserted, count all the rows. If the count has not reached 100, just add the new row.
If it has reached 100, first run a DELETE statement and THEN insert the new row.
Example (a sketch, assuming $db is a PDO connection and $id holds the post ID):
$sql = "SELECT COUNT(*) FROM just_watched";
$count = $db->prepare($sql);
$count->execute();
if ($count->fetchColumn() >= 100) { // the table has reached 100 rows
    // Delete the oldest 90 rows
    $db->exec("DELETE FROM just_watched ORDER BY ID ASC LIMIT 90");
}
// Insert the new row in both cases; starting from 100 rows, a delete leaves 10, so you end at 11
$insert = $db->prepare("INSERT INTO just_watched (content_id) VALUES (?)");
$insert->execute(array($id));
More information on counting here. Then some more information on PDO here.
Hopefully this helps :)
Good luck.
Also information on how to set up PDO if you haven't already.
What I would do? :
At 12:00 AM every night, run a cron job that deletes all rows from the past day. But that's just some advice. Have a good one.

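Use this query to delete old rows, keeping only the last 100 (the inner SELECT is wrapped in a derived table because MySQL does not allow LIMIT directly inside an IN subquery):
DELETE FROM just_watched WHERE
ID NOT IN (SELECT ID FROM (SELECT ID FROM just_watched ORDER BY ID DESC LIMIT 100) AS newest)
You can run it from cron every n, where n is hours, minutes, or whatever period suits you.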
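For what it's worth, here is a minimal sketch of that cleanup as a PHP function you could call from the cron script. It assumes a PDO connection and the just_watched table from the question; the function name trimJustWatched and the derived-table alias newest are made up for illustration. The inner SELECT sits in a derived table because MySQL refuses LIMIT directly inside an IN/NOT IN subquery.

```php
<?php
// Hypothetical helper: keep only the newest $keep rows of just_watched.
function trimJustWatched(PDO $pdo, int $keep = 100): int
{
    // %d forces $keep into the SQL as an integer, so nothing unsafe is inlined.
    $sql = sprintf(
        'DELETE FROM just_watched WHERE ID NOT IN (
             SELECT ID FROM (
                 SELECT ID FROM just_watched ORDER BY ID DESC LIMIT %d
             ) AS newest
         )',
        $keep
    );

    // exec() returns the number of rows the DELETE removed.
    return (int) $pdo->exec($sql);
}
```

Called as trimJustWatched($pdo, 100), it returns the number of rows that were removed, which is handy for logging in the cron job.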

$numRows = mysql_num_rows(mysql_query("SELECT ID FROM just_watched"));
if ($numRows > 100){
mysql_query("DELETE FROM just_watched ORDER BY ID ASC LIMIT 90"); // oldest rows first; without ORDER BY the deleted rows are arbitrary
}
mysql_query("INSERT INTO `just_watched` (`content_id`) VALUES ('{$id}')");
I guess this should work fine.

You can get the number of rows in your table with:
$result = mysql_query("SELECT ID FROM just_watched");
$size = mysql_num_rows($result);
With the size of the table, you can check whether it's getting too big, and then remove the 90 oldest rows:
// Get the 90 oldest rows
$query = "SELECT ID AS id FROM just_watched ORDER BY ID ASC LIMIT 90";
$result = mysql_query($query);
// Go through them
while($row = mysql_fetch_object($result)) {
    // Delete the row with that id (mysql_fetch_object returns an object, not an array)
    $id = (int) $row->id;
    mysql_query("DELETE FROM just_watched WHERE ID = $id");
}
Another way would be to delete one old row each time you add a new row to the table. The only problem is that if something gets jammed, the table might get too big.

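You may use
DELETE FROM just_watched WHERE ID <= (SELECT ID FROM (SELECT ID FROM just_watched ORDER BY ID DESC LIMIT 1 OFFSET 100) AS cutoff);
MySQL's DELETE accepts only a row count in LIMIT, not an offset, so the subquery looks up the ID of the 101st newest row and the statement deletes that row and everything older. If the table has 100 rows or fewer, the subquery returns NULL and nothing is deleted. If you always run this query before you insert a new row, it'll do the job for you.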

Related

How to handle/optimize thousands of different SELECT queries to be executed?

I need to synchronize specific information between two databases (one MySQL, the other a remotely hosted SQL Server database) for thousands of rows. When I execute this PHP file it gets stuck or times out after several minutes, so I wonder how I can fix this issue and maybe also optimize the way I am "synchronizing" the data.
What the code needs to do:
Basically, for every row (= one account) in my database that gets updated, I want to fetch two specific pieces of information (= 2 SELECT queries) from another SQL Server database. To do this I use a foreach loop that runs 2 SQL queries for each row and then writes the results into 2 columns of that row. We are talking about ~10k rows that need to go through this foreach loop.
My idea which may help?
I have heard about things like PDO transactions, which are supposed to collect all those queries and send them afterwards as one package of SELECT queries, but I have no idea whether I am using them correctly or whether they even help in such cases.
This is my current code, which times out after a few minutes:
// DBH => MSSQL DB | DB => MySQL DB
$dbh->beginTransaction();
// Get all referral IDs which needs to be updated:
$listAccounts = "SELECT * FROM Gifting WHERE refsCompleted <= 100 ORDER BY idGifting ASC";
$ps_listAccounts = $db->prepare($listAccounts);
$ps_listAccounts->execute();
foreach($ps_listAccounts as $row) {
$refid=$row['refId'];
// Refsinserted
$refsInserted = "SELECT count(username) as done FROM accounts WHERE referral='$refid'";
$ps_refsInserted = $dbh->prepare($refsInserted);
$ps_refsInserted->execute();
$row = $ps_refsInserted->fetch();
$refsInserted = $row['done'];
// Refscompleted
$refsCompleted = "SELECT count(username) as done FROM accounts WHERE referral='$refid' AND finished=1";
$ps_refsCompleted = $dbh->prepare($refsCompleted);
$ps_refsCompleted->execute();
$row2 = $ps_refsCompleted->fetch();
$refsCompleted = $row2['done'];
// Update fields for local order db
$updateGifting = "UPDATE Gifting SET refsInserted = :refsInserted, refsCompleted = :refsCompleted WHERE refId = :refId";
$ps_updateGifting = $db->prepare($updateGifting);
$ps_updateGifting->bindParam(':refsInserted', $refsInserted);
$ps_updateGifting->bindParam(':refsCompleted', $refsCompleted);
$ps_updateGifting->bindParam(':refId', $refid);
$ps_updateGifting->execute();
echo "$refid: $refsInserted Refs inserted / $refsCompleted Refs completed<br>";
}
$dbh->commit();
You can do all of that in one query with a correlated sub-query:
UPDATE Gifting
SET
refsInserted=(SELECT COUNT(USERNAME)
FROM accounts
WHERE referral=Gifting.refId),
refsCompleted=(SELECT COUNT(USERNAME)
FROM accounts
WHERE referral=Gifting.refId
AND finished=1)
A correlated sub-query is essentially using a sub-query (query within a query) that references the parent query. So notice that in each of the sub-queries I am referencing the Gifting.refId column in the where clause of each sub-query. While this isn't the best for performance because each of those sub-queries still has to run independent of the other queries, it would perform much better (and likely as good as you are going to get) than what you have there.
Edit:
And just for reference. I don't know if a transaction will help here at all. Typically they are used when you have several queries that depend on each other and to give you a way to rollback if one fails. For example, banking transactions. You don't want the balance to deduct some amount until a purchase has been inserted. And if the purchase fails inserting for some reason, you want to rollback the change to the balance. So when inserting a purchase, you start a transaction, run the update balance query and the insert purchase query and only if both go in correctly and have been validated do you commit to save.
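To make the banking example a bit more concrete, here is a rough sketch using PDO transactions. The balances and purchases tables and the recordPurchase function are hypothetical, not anything from the question; the point is only that if either statement fails, both are rolled back.

```php
<?php
// Hypothetical schema: balances(account_id, amount) and purchases(account_id, price).
function recordPurchase(PDO $pdo, int $accountId, int $price): bool
{
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->beginTransaction();
    try {
        // Step 1: record the purchase.
        $insert = $pdo->prepare('INSERT INTO purchases (account_id, price) VALUES (?, ?)');
        $insert->execute([$accountId, $price]);

        // Step 2: deduct the balance, but only if the account can cover it.
        $debit = $pdo->prepare(
            'UPDATE balances SET amount = amount - ? WHERE account_id = ? AND amount >= ?'
        );
        $debit->execute([$price, $accountId, $price]);
        if ($debit->rowCount() !== 1) {
            throw new RuntimeException('insufficient funds or unknown account');
        }

        $pdo->commit();   // both statements succeeded: make them permanent
        return true;
    } catch (Exception $e) {
        $pdo->rollBack(); // undo the purchase row as well
        return false;
    }
}
```

Nothing here speeds up a batch of SELECT queries, which is why a transaction is unlikely to help with the timeout itself.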
Edit2:
If I were doing this, without doing an export/import this is what I would do. This makes a few assumptions though. First is that you are using a mssql 2008 or newer and second is that the referral id is always a number. I'm also using a temp table that I insert numbers into because you can insert multiple rows easily with a single query and then run a single update query to update the gifting table. This temp table follows the structure CREATE TABLE tempTable (refId int, done int, total int).
//get list of referral accounts
//if you are using one column, only query for one column
$listAccounts = "SELECT DISTINCT refId FROM Gifting WHERE refsCompleted <= 100"; // no ORDER BY idGifting: it is not in the DISTINCT select list
$ps_listAccounts = $db->prepare($listAccounts);
$ps_listAccounts->execute();
//loop over and get list of refIds from above.
$refIds = array();
foreach($ps_listAccounts as $row){
$refIds[] = $row['refId'];
}
if(count($refIds) > 0){
//implode into string for use in query below
$refIds = implode(',',$refIds);
//select out total count
$totalCount = "SELECT referral, COUNT(username) AS cnt FROM accounts WHERE referral IN ($refIds) GROUP BY referral";
$ps_totalCounts = $dbh->prepare($totalCount);
$ps_totalCounts->execute();
//add to array of counts
$counts = array();
//loop over total counts
foreach($ps_totalCounts as $row){
//if referral id not found, add it
if(!isset($counts[$row['referral']])){
$counts[$row['referral']] = array('total'=>0,'done'=>0);
}
//add to count
$counts[$row['referral']]['total'] += $row['cnt'];
}
$doneCount = "SELECT referral, COUNT(username) AS cnt FROM accounts WHERE finished=1 AND referral IN ($refIds) GROUP BY referral";
$ps_doneCounts = $dbh->prepare($doneCount);
$ps_doneCounts->execute();
//loop over done counts
foreach($ps_doneCounts as $row){
//if referral id not found, add it
if(!isset($counts[$row['referral']])){
$counts[$row['referral']] = array('total'=>0,'done'=>0);
}
//add to count
$counts[$row['referral']]['done'] += $row['cnt'];
}
//now loop over counts and generate insert queries to a temp table.
//I suggest using a temp table because you can insert multiple rows
//in one query and then the update is one query.
$sqlInsertList = array();
foreach($counts as $refId=>$count){
$sqlInsertList[] = "({$refId}, {$count['done']}, {$count['total']})";
}
//clear out the temp table first so we are only inserting new rows
$truncSql = "TRUNCATE TABLE tempTable";
$ps_trunc = $db->prepare($truncSql);
$ps_trunc->execute();
//make insert sql with multiple insert rows
$insertSql = "INSERT INTO tempTable (refId, done, total) VALUES ".implode(',',$sqlInsertList);
//prepare sql for insert into the temp table (on the MySQL side, next to Gifting)
$ps_insert = $db->prepare($insertSql);
$ps_insert->execute();
//sql to update existing rows
$updateSql = "UPDATE Gifting
SET refsInserted=(SELECT total FROM tempTable WHERE refId=Gifting.refId),
refsCompleted=(SELECT done FROM tempTable WHERE refId=Gifting.refId)
WHERE refId IN (SELECT refId FROM tempTable)
AND refsCompleted <= 100";
$ps_update = $db->prepare($updateSql);
$ps_update->execute();
} else {
echo "There were no reference ids found from \$db";
}
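As a possible simplification of the two grouped COUNT queries above, both numbers can be fetched in a single pass with conditional aggregation. This is only a sketch: fetchReferralCounts is a made-up name, $dbh is assumed to be the PDO handle for the database that holds accounts, and the referral IDs are assumed to be numeric, as stated earlier.

```php
<?php
// Sketch: total and finished referral counts from one grouped query.
function fetchReferralCounts(PDO $dbh, array $refIds): array
{
    if (count($refIds) === 0) {
        return []; // avoid building an invalid "IN ()" clause
    }

    // IDs are assumed numeric, so casting to int makes them safe to inline.
    $idList = implode(',', array_map('intval', $refIds));

    $sql = "SELECT referral,
                   COUNT(username) AS total,
                   SUM(CASE WHEN finished = 1 THEN 1 ELSE 0 END) AS done
            FROM accounts
            WHERE referral IN ($idList)
            GROUP BY referral";

    $counts = [];
    foreach ($dbh->query($sql, PDO::FETCH_ASSOC) as $row) {
        $counts[(int) $row['referral']] = [
            'total' => (int) $row['total'],
            'done'  => (int) $row['done'],
        ];
    }
    return $counts;
}
```

The returned array has the same shape as $counts in the code above, so the temp-table insert and the final UPDATE could stay as they are.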

Insert multiple rows in MySQL and check for random string without long delay (~80 rows each minute)

For a research project I am obtaining data from a local bus company's GPS system (through their API). I created a php cron job that runs every minute to obtain data like the vehicle, route ID, location, destination, etc. The data did not contain a unique "run number" for each bus route (a unique number so that I can track the progression of a single bus along its route), so I created my own that checks if the vehicle ID, destination, and relative time are similar, and assigns the unique "run ID" to it so that I can track the bus along its route. If no run ID exists, a random one is generated. (Any vehicle with the same "vid" and "pid" within 2 minutes of the last inserted row "timeadded" is on the same run, and this is important for my research)
Each time the cron runs (1 minute), approximately 80 rows are added into the database.
Initially the job would run quickly. However, with over 500,000 rows now, I've noticed the job can take upwards of 40 seconds. I believe it's because for each of the ~80 rows, it has to check the entire table ("vehicles") to see if the same run ID exists, essentially querying a large table and inserting a row 80 times. I want to get at least a week's worth of data (on day 4 now), at which point I can export the data, erase all rows, and start over. My question is: Is there any way I can refactor my PHP/SQL code to make the process run faster? It's been years since I've worked with SQL, so I'm sure there's a more ingenious way to insert all this data.
<?php
// Obtain data from XML
$xml = simplexml_load_file("url.xml");
foreach ($xml->vehicle as $vehicle) {
$vid = $vehicle->vid;
$tm = $vehicle->tmstmp;
$dat = substr($vehicle->tmstmp, 0, 8);
$tme = substr($vehicle->tmstmp, 9);
$lat = $vehicle->lat;
$lon = $vehicle->lon;
$hdg = $vehicle->hdg;
$pid = $vehicle->pid;
$rt = $vehicle->rt;
$des = $vehicle->des;
$pdist = $vehicle->pdist;
// Database connection and insert
mysql_connect("redacted", "redacted", "redacted") or die(mysql_error()); mysql_select_db("redacted") or die(mysql_error());
$sql_findsim = "SELECT vid, pid, timeadded, run, rt FROM vehicles WHERE vid=" . mysql_real_escape_string($vid). " AND pid=" . mysql_real_escape_string($pid). " AND rt=" . mysql_real_escape_string($rt). " AND timeadded > DATE_SUB(CURRENT_TIMESTAMP, INTERVAL 2 MINUTE);";
$handle = mysql_query($sql_findsim);
$row = mysql_fetch_row($handle);
$runid = $row[3];
if($runid !== null) {
$run = $runid;
} else {
$run = substr(md5(rand()), 0, 30);
}
$sql = "INSERT INTO vehicles (vid, tmstmp, dat, tme, lat, lon, hdg, pid, rt, des, pdist, run) VALUES ($vid,'$tm','$dat','$tme','$lat','$lon',$hdg,$pid,'$rt','$des',$pdist,'$run')";
$result = mysql_query($sql);
mysql_close();
}
?>
Thanks for any help with refactoring this code to get it to run more quickly and efficiently.
Do you have any indexes on the table? A compound index on (vid,pid,rt,timeadded) will make the query faster, avoiding a full table scan.
create index fastmagic on vehicles (vid,pid,rt,timeadded)
Alternatively, you could skip the select altogether and just do the insert without assigning the random "run" value. This will keep your cron job at "constant time", since all you're doing is appending new data.
After you've got your week of data, go back and write "second pass" code to step through each row (select * from vehicles order by timeadded). For each row, do your "select" similar to how you've already done it, then "update" the row you are processing.
If you go with the alternate, you'll probably want an autoincrement "id" integer column to make row identification clearer (if you don't already have one).
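A rough sketch of what that second pass could look like once the rows are loaded into PHP, in case it helps. It assumes each row is an array with vid, pid, rt and timeadded (as Unix seconds) and that the rows are already sorted by timeadded; the function name assignRuns and the "run-N" ID format are invented for the example, and writing the result back with UPDATE is left out.

```php
<?php
// Sketch of the second pass: label each row with a run ID after collection.
function assignRuns(array $rows, int $maxGapSeconds = 120): array
{
    $lastSeen = [];  // "vid|pid|rt" => ['time' => int, 'run' => string]
    $nextRun = 1;

    foreach ($rows as $i => $row) {
        $key = $row['vid'] . '|' . $row['pid'] . '|' . $row['rt'];
        $prev = $lastSeen[$key] ?? null;

        if ($prev !== null && $row['timeadded'] - $prev['time'] < $maxGapSeconds) {
            $run = $prev['run'];         // seen less than 2 minutes ago: same run
        } else {
            $run = 'run-' . $nextRun++;  // hypothetical ID format
        }

        $rows[$i]['run'] = $run;
        $lastSeen[$key] = ['time' => $row['timeadded'], 'run' => $run];
    }
    return $rows;
}
```

Because it only remembers the last sighting per vehicle/pattern/route, each row is handled once and there is no repeated search through the table.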
I would suggest the following:
Create a table named vehicle_ids (or some other meaningful name) with these fields:
vid, pid, run, rt
Instead of checking the vehicles table for the vid, you can check this table for the ID and insert it if it is missing (make vid auto-increment).
Normalize your table and also index your vehicles table.

show row only 100 times PHP

How can I limit how many times a result is shown? I need to limit it to 100 views.
In DB I have:
ID|NAME|PAGE|COUNT|DATE
In COUNT I want to count up to 100 and then stop showing that ID. I could do it with count < 100 and then update the specific ID. I can get the records with fewer than 100 views, but I couldn't manage to update the count for the specific ID.
The row is shown with this PHP code:
foreach($bannerGroups[0] as $ban) {
echo '<li class="right1">'.$ban->html().'</li>';
}
But I just don't know where to put the update. I tried, but all I managed was to update only one ID, while the page shows 4 banners and randomizes them on refresh. So I don't know what to do.
Also, I should say I am only learning PHP. Sorry for all the mess.
Code at http://pastebin.com/A9hJTPLE
If I understand correctly, you want to show all banners that have previously been displayed fewer than 100 times?
If that's right, you can just add that to your WHERE clause:
$bannerResult = mysql_query("SELECT * FROM table WHERE page='cat' AND `COUNT` < 100");
To update them all, you can either run a query while displaying each individual banner, or "record" the id of each and run a single query at the end, like:
$ids = array();
foreach($bannerGroups[0] as $ban) {
$ids[] = $ban['ID']; // record the ID; don't know how Banner
// class works, assuming uses indexes; maybe ID() method?
echo '<li class="right1">'.$ban->html().'</li>';
}
...
mysql_query('UPDATE table SET `COUNT` = `COUNT` + 1 WHERE ID IN (' . join(',', $ids) . ')');
UPDATE:
Based off of a comment, your Banner class doesn't have a method to retrieve the individual banner's ID. In this case, you can record the ID values when you're building your banners array:
$ids = array();
while($row=mysql_fetch_assoc($bannerResult)) {
$banners[] = new Banner($row);
$ids[] = $row['ID']; // record the ID
}
// update the `count` on each record:
mysql_query('UPDATE table SET `COUNT` = `COUNT` + 1 WHERE ID IN (' . join(',', $ids) . ')');
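One thing to watch for: if no banner qualifies, $ids is empty and the query ends in "IN ()", which is a syntax error. A small guard like the sketch below avoids that. The helper name buildCountUpdate is made up, and "table" is still just the placeholder table name used above (backticked here because it is a reserved word).

```php
<?php
// Hypothetical helper: build the UPDATE only when there are IDs to update.
function buildCountUpdate(array $ids): ?string
{
    // Cast every ID to int so the list is safe to inline in the query.
    $ids = array_map('intval', $ids);

    if (count($ids) === 0) {
        return null; // nothing was displayed, so there is nothing to update
    }

    return 'UPDATE `table` SET `COUNT` = `COUNT` + 1 WHERE ID IN ('
        . implode(',', $ids) . ')';
}
```

You would then run the query only when the function returns a string, for example: if (($sql = buildCountUpdate($ids)) !== null) { mysql_query($sql); }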
Sorry, but I got your question wrong...
First you have to add a new column such as "viewcount" to the table.
On every read, you have to increment the value in viewcount.
To do that (because MySQL does not allow sub-selects on the same table in an UPDATE), fetch the results from the DB as you already do and collect the primary keys of the records in an array.
After the view logic, fire a query like:
UPDATE foo SET viewcount = viewcount + 1 WHERE id IN (1,2,3,4,5,6...,100);
where the IN clause can easily be generated from your primary-key array with "implode(',', $arr);"
Hope this helps.
$bannerResult = mysql_query("SELECT * FROM table WHERE page='cat' AND `count`<100");
@newfurniturey figured it out. In each foreach($bannerGroups ...) loop I added $ids = $ban->getValue('id'); and then mysql_query("UPDATE dataa SET COUNT = COUNT + 1 WHERE id = '$ids'"); But is there any way to update them with a single query? Also, if an ID has already been shown 100 times I get "Warning: Invalid argument supplied for foreach() in". Any idea how to fix it? I have 4 IDs in the DB. If one of them already has 100 views (count), I get the error!
Try to limit your data source to 100 items.
It's like LIMIT 100 OFFSET x in a MySQL/PostgreSQL query or TOP 100 in MSSQL.

Deleting one record from table every 5 seconds

I want to delete one record every 5 seconds and have a cron job for it too. But once the cron starts it deletes all records at once.
Whether I use sleep 5 here or not, it does not affect the execution.
The code used is below.
mysql_select_db($database_xm, $xm);
$query_ex = "SELECT * FROM table";
$ex = mysql_query($query_ex, $xm) or die(mysql_error());
$row_ex = mysql_fetch_assoc($ex);
$RecordCount=mysql_num_rows($ex);
for ($l=0;$l<=$RecordCount;$l++) {
mysql_select_db($database_xm, $xm);
$query_ss = "delete from table2 limit 1";
$ss = mysql_query($query_ss, $xm) or die(mysql_error());
sleep(5);
ob_flush();
}
How do I delete one record every 5 seconds?
Why don't you simply pick a record and identify it from the information you already have in the result set $ex? This way you can also control the order in which records are deleted.
$row = $row_ex; // first row, already fetched above ($row_ex is a single row, not a list of rows)
while ($row) {
    $query_ss = "delete from table2 WHERE id = ".(int)$row['id']; // EXAMPLE
    $ss = mysql_query($query_ss, $xm) or die(mysql_error());
    sleep(5);
    ob_flush();
    $row = mysql_fetch_assoc($ex); // next row, or false when there are none left
}
The minimum cron interval is 1 minute, so you have to make an infinite loop, and inside it there must be a sleep(5) after every iteration.
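A minimal sketch of that idea, with one change I should name plainly: instead of a truly infinite loop it runs a fixed number of iterations, so that a script started by cron every minute does not pile up on top of the previous one. It assumes a PDO connection and that table2 has a numeric primary key called id; deleteOnePerInterval is a made-up name.

```php
<?php
// Sketch: delete one row per interval for roughly the length of one cron slot.
function deleteOnePerInterval(PDO $pdo, int $iterations = 12, int $intervalSeconds = 5): int
{
    $deleted = 0;
    for ($i = 0; $i < $iterations; $i++) {
        // Look up the oldest row first, then delete exactly that row.
        $id = $pdo->query('SELECT MIN(id) FROM table2')->fetchColumn();
        if ($id === null || $id === false) {
            break; // table is empty, nothing left to delete
        }

        $stmt = $pdo->prepare('DELETE FROM table2 WHERE id = ?');
        $stmt->execute([$id]);
        $deleted += $stmt->rowCount();

        if ($i < $iterations - 1) {
            sleep($intervalSeconds); // wait before the next delete
        }
    }
    return $deleted;
}
```

With the defaults (12 deletes, 5 seconds apart) one run takes a little under a minute, which lines up with a cron entry that fires every minute.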
In SQL Server, SET ROWCOUNT 1 will limit the number of affected rows to 1. (MySQL has no SET ROWCOUNT; there, DELETE ... LIMIT 1 does the same job.)
For example:
-- Sets limit of rows to 1
SET ROWCOUNT 1
delete from table2
-- Sets limit back to default
SET ROWCOUNT 0
Would do it.

How to get size per day of a table

I have a database with ~20 tables. Each table has a column "dtLogTime" that records the time that row was inserted. I want to figure out the size (probably kb or mb) each table is recording per day. More specifically, I'm only interested in the last 3 days. Also, these tables keep track of data up to a certain time interval (i.e. 2 weeks, 1 month, etc), meaning I lose a day's worth of data for every new day's data stored.
I came across this code that can show me the size of each table.
<?php
$link = mysql_connect('host', 'username', 'password');
$db_name = "your database name here";
$tables = array();
mysql_select_db($db_name, $link);
$result = mysql_query("SHOW TABLE STATUS");
while($row = mysql_fetch_array($result)) {
/* We return the size in Kilobytes */
$total_size = ($row[ "Data_length" ] +
$row[ "Index_length" ]) / 1024;
$tables[$row['Name']] = sprintf("%.2f", $total_size);
}
print_r($tables);
?>
When I tried doing
"SHOW TABLE STATUS WHERE dtLogTime < '2011-08-28 00:00:00'
AND dtLogTime >= '2011-08-27 00:00:00'"
it gave me an error. Is there a way to do this?
Thanks
You can include a LIKE clause to specify the table. Source: http://dev.mysql.com/doc/refman/5.6/en/show-table-status.html
SHOW TABLE STATUS
LIKE 'YourTable'
Note that the statement accepts either LIKE or WHERE, not both, and that it has no access to table columns such as dtLogTime, so it cannot be restricted to a date range.
The WHERE clause applies to the result set generated by SHOW TABLE STATUS and cannot reference actual columns of your various tables. For instance:
SHOW TABLE STATUS where Index_length = 0
Run SHOW TABLE STATUS by itself to see a list of all the legal columns you can use in the WHERE clause. Unfortunately, for your situation, you'll have to run SHOW TABLE STATUS each day and store the result somewhere.
UPDATE
For clarification, SHOW TABLE STATUS is a convenience method that interrogates the system table, INFORMATION_SCHEMA.TABLES. It will pare down the results from that system table to just the persistent tables in your current database. It doesn't perform any calculations of its own.
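If storing daily snapshots is not an option, a rough estimate is possible with a different technique from the one described above: count the rows logged per day and multiply by the Avg_row_length column that SHOW TABLE STATUS reports. The row counts could come from something like SELECT DATE(dtLogTime) AS day, COUNT(*) FROM YourTable GROUP BY day. The sketch below only does the arithmetic; estimateDailySizeKb is a made-up name, and the result ignores index size and any per-row variation.

```php
<?php
// Rough estimate: rows logged per day multiplied by the table's average row length.
// $rowsPerDay maps 'YYYY-MM-DD' => row count; $avgRowLength is in bytes.
function estimateDailySizeKb(array $rowsPerDay, int $avgRowLength): array
{
    $sizes = [];
    foreach ($rowsPerDay as $day => $rows) {
        // bytes -> kilobytes, rounded to two decimals
        $sizes[$day] = round($rows * $avgRowLength / 1024, 2);
    }
    return $sizes;
}
```

For example, 2048 rows on a day with an average row length of 100 bytes comes out at 200 KB for that day.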
