MySQL: only update if a row has been inserted before - PHP

I want to run the update query only if the row exists (and was inserted before). I have tried several different things, but the problem may be in how I am looping this. The insert works fine and creates the record; the update should take the existing value and add to it each time (10 exists + 15 added = 25, 25 exists + 15 added = 40, and so on). I tried putting the update inside the loop, but then it ran for every item in the list and produced a huge number each time. Note also that the page runs each time a link is clicked, so the user may exit and come back.
while ($store = $SQL->fetch_array($res_sh)) {
    $pm_row = $SQL->query("SELECT * FROM `wishlist` WHERE shopping_id='" . $store['id'] . "'");
    $myprice = $store['shprice'];
    $sql1 = "INSERT INTO posted (uid, price)
             SELECT '$uid', '$myprice'
             FROM posted
             WHERE NOT EXISTS (SELECT * FROM `posted` WHERE `uid` = '$namearray[id]')
             LIMIT 1";
    $query = mysqli_query($connection, $sql1);
}
$sql2 = "UPDATE posted SET `price` = price + '$myprice' WHERE shopping_id='" . $_GET['id'] . "'";
$query = mysqli_query($connection, $sql2);

By checking mysqli_affected_rows after the insert query and verifying that it actually inserted a row, you can make the update query conditional.
However, if you're running an update immediately after an insert, that suggests the two could be accomplished in a single statement. In this case, with no other context, you could simply double $myprice before inserting - though it's worth checking whether you can avoid that pattern altogether.
Additionally, though somewhat more complex, you could use SQL transactions for this, making sure you reference exactly the row you want to update. If the insert failed, your update would not happen.
Granted, if your update references the inserted row precisely, then a failed insert means the update has no row to touch anyway. For example, with a primary auto-increment key on the table, you can use mysqli_insert_id to get the last inserted ID and update the row with that ID. But that methodology can break down in a high-volume system, or in a random race event, which leads right back to single queries or transactions.
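For illustration, here is a minimal sketch of the affected-rows approach, reusing the question's variable names ($connection, $uid, $myprice, $_GET['id']); the FROM DUAL form avoids the quirk that INSERT ... SELECT ... FROM posted inserts nothing when posted is empty:

// Sketch only: insert the row if this uid has no row yet...
$sql1 = "INSERT INTO posted (uid, price)
         SELECT '$uid', '$myprice' FROM DUAL
         WHERE NOT EXISTS (SELECT 1 FROM posted WHERE uid = '$uid')";
mysqli_query($connection, $sql1);

// ...and only add to the price when the INSERT actually created a row.
if (mysqli_affected_rows($connection) > 0) {
    $sql2 = "UPDATE posted SET price = price + '$myprice'
             WHERE shopping_id = '" . $_GET['id'] . "'";
    mysqli_query($connection, $sql2);
}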

Related

MySQL query on cron overlaps, ignores 'locked' rows

I'm trying to lock a row in a table as being "in use" so that I don't process the data twice when my cron runs every minute. Because of the length of time it takes for my script to run, the cron will cause multiple instances of the script to run at once (usually around 5 or 6 at a time). For some reason, my "in use" method is not always working.
I do not want to LOCK the tables because I need them available for simultaneous processing, that is why I went the route of pseudo-locking individual rows with an 'inuse' field. I don't know of a better way to do this.
Here is an illustration of my dilemma:
<?php
// get the first row from table_1 that is not in use
$result = mysqli_query($connect, "SELECT * FROM `table_1` WHERE inuse='no'");
$rows = mysqli_fetch_array($result, MYSQLI_ASSOC);
$data1 = $rows['field1'];

// "lock" our row by setting inuse='yes'
mysqli_query($connect, "UPDATE `table_1` SET inuse='yes' WHERE field1 = '$data1'");

// insert a new row into table_2 with our data if it doesn't already exist
$result2 = mysqli_query($connect, "SELECT * FROM `table_2` WHERE field='$data2'");
$numrows = mysqli_num_rows($result2);
if ($numrows >= 1) {
    // do nothing
} else {
    // run some unrelated script to get data
    $data2 = unrelatedFunction();
    // insert our data into table_2
    mysqli_query($connect, "INSERT INTO `table_2` (field) VALUES ('$data2')");
}

// "unlock" our row in table_1
mysqli_query($connect, "UPDATE `table_1` SET inuse='no' WHERE field1 = '$data1'");
?>
You'll see here that $data2 won't be collected and inserted if a row with $data2 already exists, but that part is only error-checking and doesn't answer my question, as the error still occurs. I'm trying to understand why (even without that error-check in there) my 'inuse' method is sometimes ignored and I end up with duplicate rows in table_2 with $data2 in them.
There's a lot of time between your first SELECT and the first UPDATE in which another process can do the same operation. You're not using a transaction either, so you're not guaranteeing any order in which your changes become visible to others.
You can either move everything into a transaction with the isolation level you need and use SELECT ... FOR UPDATE syntax, or do the claiming a different way. For example, UPDATE the N rows you want to process, setting inuse to your current process ID where it is not already set; then read back the rows you marked for processing. After you finish, reset inuse.
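A rough sketch of the second suggestion, using the question's $connect handle and getmypid() as the worker marker; it assumes the inuse column is redefined to be NULL when a row is free (the original uses 'no'/'yes'):

// Claim up to 5 free rows atomically for this worker...
$pid = getmypid();
mysqli_query($connect, "UPDATE `table_1` SET inuse = '$pid' WHERE inuse IS NULL LIMIT 5");

// ...then read back only the rows this worker marked.
$result = mysqli_query($connect, "SELECT * FROM `table_1` WHERE inuse = '$pid'");
while ($row = mysqli_fetch_assoc($result)) {
    // process $row here; no other worker will pick it up meanwhile
}

// Release the rows when finished.
mysqli_query($connect, "UPDATE `table_1` SET inuse = NULL WHERE inuse = '$pid'");

Because the claiming UPDATE is a single statement, two workers can never mark the same row, which closes the gap between the SELECT and UPDATE in the original code.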

Easy way to check rows in SQL and add if not there? (over 100k rows)

I have a script that adds about 100,000 entries to SQL if they don't already exist, but it normally takes about 30 hours to check every row and add the missing ones. Is there an easier way to do this?
My code currently uses a for loop; within the loop is this:
$query = mysql_query("SELECT EXISTS (SELECT * FROM linkdb WHERE link='$currentlink')");
if (mysql_result($query, 0) == 1) {
    // link already present, nothing to do
} else {
    $qry = "INSERT INTO linkdb(link,title) VALUES('$link','$title')";
    $result = @mysql_query($qry);
}
The code above takes a very long time because it has to go through thousands of entries. If I drop the SELECT EXISTS check and use only the INSERT INTO, 90,000 entries are added within a minute, but that inserts duplicate copies of the same rows.
Please give me some advice on what I could do. These rows need to be updated almost every day.
You're looking for ON DUPLICATE KEY UPDATE. Add a unique index on link and then:
INSERT INTO linkdb(link,title) VALUES('$link','$title') ON DUPLICATE KEY UPDATE link=link;
With that said, you should not be using ext/mysql, since it is deprecated; look into PDO or mysqli instead. It would also be much better to use parameterized queries for this, to prevent SQL injection.
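As a hedged sketch of that advice, the same upsert with mysqli prepared statements might look like this (assumes $mysqli is a connected mysqli object, a UNIQUE index on link, and an illustrative $rows array of link/title pairs):

// Prepared upsert: the values are bound, never interpolated into the SQL.
$stmt = $mysqli->prepare(
    "INSERT INTO linkdb (link, title) VALUES (?, ?)
     ON DUPLICATE KEY UPDATE title = VALUES(title)"
);
foreach ($rows as $row) {  // $rows = array of ['link' => ..., 'title' => ...]
    $stmt->bind_param('ss', $row['link'], $row['title']);
    $stmt->execute();
}

Wrapping the loop in a single transaction, or batching multiple rows per INSERT, would reduce the per-statement overhead further.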
Perhaps
INSERT INTO ... ON DUPLICATE KEY UPDATE
can solve your problem.
If you don't want to update the value when there is a duplicate, you can combine the two queries into one:
INSERT INTO linkdb(link,title)
SELECT '$link','$title' FROM dual
WHERE NOT EXISTS (SELECT * FROM linkdb WHERE link='$currentlink')
In practice, you can speed up any of these queries by creating an index on linkdb(link).
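For reference, that index could be created like this (the name idx_link is illustrative):

CREATE INDEX idx_link ON linkdb (link);

Note that ON DUPLICATE KEY UPDATE specifically requires a UNIQUE index (CREATE UNIQUE INDEX ...) to detect the duplicate, while the NOT EXISTS form merely runs faster with an index of either kind.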

MySQL UPDATE (from PHP) only runs first few rows; all cells locked after one UPDATE

I run a MySQL UPDATE from PHP. It produces output, which is shown in the browser - no problems there.
The database updates a varying number of rows: if I try 10 rows, it updates only the first 5; if I try 4, it updates the first and the last.
I started making INSERTs for all rows instead, and it inserted 1000+ rows in a few seconds on this exact same database. The UPDATE seems to be way off in some way.
Maybe people have some input on why this could happen?
The main concern is that after I have run updates on the rows, the rows are "locked" against further updates through PHP. This, to my mind, is really weird, and I don't get what is going on. I can, of course, still make the updates through phpMyAdmin.
CODE as requested:
mysql_query(" UPDATE `search` SET `pr_image` = '$primage', `pr_link` = '$pr_link' WHERE `s_id` = '$id' ");
Thanks in advance.
UPDATE `search` SET `pr_image` = $primage, `pr_link` = $pr_link WHERE `s_id` = $id
Try this query instead.

Merging and Adding data with SQLite

I am writing a PHP script that runs on a cron and pulls JSON data from an API [ title (text), path (text), visitors (integer) ] and stores it in a SQLite database. Every time it runs, if it sees an existing title, it should add the new visitors count to the existing visitors. If not, it should add the new data as a new row.
Here's a simplified look at my loop:
foreach ($results as $printresults) {
    // this iterates through $title, $path and $visitors
    $existing_visitors = $db->query("SELECT SUM(visitors) FROM topten WHERE title='$title'");
    while ($existing_vis_row = $existing_visitors->fetch(SQLITE_NUM)) {
        $dupe_enter = $db->query("UPDATE topten SET title='$title', path='$path',
            visitors='$existing_vis_row[0]' WHERE title='$title'");
    }
    $db->query("INSERT INTO topten (id, title, path, visitors, time)
        VALUES (NULL, '$title', '$path', '$visitors', '$time');");
}
Then I'll do a SELECT to pull DISTINCT rows ordered by visitors and write this to a file. Since the UPDATE query adds all the visitors to these rows, it doesn't matter that there will be all the dupes. On a certain timeout, I'll drop the whole table and start collecting again, so the file doesn't get too unwieldy.
The problem is that it adds the summed visitor counts on every pass of the loop, making the visitor counts totally out of whack. But I couldn't find a better way to simply add the data together every time the script runs.
pseudo-code:
foreach ($json_records as $rec) {
    $row = SELECT visitors FROM topten WHERE title = $rec['title']
    if ($row) {
        // record exists: add visitors and update
        $sum_visitors = $row['visitors'] + $rec['visitors']
        UPDATE topten SET visitors = $sum_visitors WHERE title = $rec['title']
    } else {
        // record doesn't exist: insert new
        INSERT INTO topten (title, visitors) VALUES ($rec['title'], $rec['visitors'])
    }
}
Maybe?
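A runnable version of that pseudo-code, sketched with PDO and SQLite (the $db handle, $json_records array, and column names are assumed from the question):

// Assumes: $db = new PDO('sqlite:topten.db'); and $json_records as decoded API data.
foreach ($json_records as $rec) {
    $sel = $db->prepare("SELECT visitors FROM topten WHERE title = ?");
    $sel->execute(array($rec['title']));
    $row = $sel->fetch(PDO::FETCH_ASSOC);

    if ($row) {
        // record exists: add the new visitors to the stored count
        $upd = $db->prepare("UPDATE topten SET visitors = ? WHERE title = ?");
        $upd->execute(array($row['visitors'] + $rec['visitors'], $rec['title']));
    } else {
        // record doesn't exist: insert it fresh
        $ins = $db->prepare("INSERT INTO topten (title, path, visitors, time) VALUES (?, ?, ?, ?)");
        $ins->execute(array($rec['title'], $rec['path'], $rec['visitors'], time()));
    }
}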
Avoid dupes: set a unique key and use INSERT OR REPLACE ... instead of doing it yourself.
Something like CREATE UNIQUE INDEX 'title_path' ON topten (title, path). This makes it impossible to have two records with the same title and path fields, so if you just did a blind INSERT ...., you'd get a conflict error on a dupe.
So use INSERT OR REPLACE .... instead: it first checks any unique index and, if there is already a matching record, erases it, then performs the insert. Of course, it's all atomic (so other processes won't see the record disappear and reappear).
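One caveat: INSERT OR REPLACE erases the old row, so the running visitor sum would be lost unless you compute it first. Newer SQLite (3.24+) can keep the sum in one atomic statement with an upsert; a sketch, assuming the unique index above exists:

INSERT INTO topten (title, path, visitors)
VALUES ('$title', '$path', '$visitors')
ON CONFLICT (title, path) DO UPDATE SET visitors = visitors + excluded.visitors;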

Best way to update user rankings without killing the server

I have a website that has user ranking as a central part, but the user count has grown to over 50,000, and it is putting a strain on the server to loop through all of them to update the ranks every 5 minutes. Is there a better method that can be used to update the ranks at least every 5 minutes? It doesn't have to be PHP; it could be something run as a Perl script or similar, if that would do the job better (though I'm not sure why it would; I'm just leaving my options open here).
This is what I currently do to update ranks:
$get_users = mysql_query("SELECT id FROM users WHERE status = '1' ORDER BY month_score DESC");
$i = 0;
while ($a = mysql_fetch_array($get_users)) {
    $i++;
    mysql_query("UPDATE users SET month_rank = '$i' WHERE id = '$a[id]'");
}
UPDATE (solution):
Here is the solution code, which takes less than half a second to execute and update all 50,000 rows (make userRanks.rank an auto-incrementing primary key, as suggested by Tom Haigh).
mysql_query("TRUNCATE TABLE userRanks");
mysql_query("INSERT INTO userRanks (userid) SELECT id FROM users WHERE status = '1' ORDER BY month_score DESC");
mysql_query("UPDATE users, userRanks SET users.month_rank = userRanks.rank WHERE users.id = userRanks.id");
Make userRanks.rank an autoincrementing primary key. If you then insert userids into userRanks in descending rank order it will increment the rank column on every row. This should be extremely fast.
TRUNCATE TABLE userRanks;
INSERT INTO userRanks (userid) SELECT id FROM users WHERE status = '1' ORDER BY month_score DESC;
UPDATE users, userRanks SET users.month_rank = userRanks.rank WHERE users.id = userRanks.userid;
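The DDL for the staging table isn't shown in the post; a plausible definition, with rank as the auto-incrementing primary key so that insertion order assigns the ranks:

CREATE TABLE userRanks (
    rank   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    userid INT UNSIGNED NOT NULL
);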
My first question would be: why are you doing this polling-type operation every five minutes?
Surely rank changes will be in response to some event and you can localize the changes to a few rows in the database at the time when that event occurs. I'm pretty certain the entire user base of 50,000 doesn't change rankings every five minutes.
I'm assuming "status = '1'" indicates that a user's rank has changed, so rather than just setting that flag when the user triggers a rank change, why not calculate the rank at that time?
That would seem to be a better solution as the cost of re-ranking would be amortized over all the operations.
Now I may have misunderstood what you meant by ranking in which case feel free to set me straight.
A simple alternative for the bulk update might be something like:
SET @rnk = 0;
UPDATE users
SET month_rank = (@rnk := @rnk + 1)
ORDER BY month_score DESC;
This uses a MySQL user variable (@rnk) that is incremented on each update. Because the update is applied over the ordered list of rows, the month_rank column is set to the incremented value for each row.
Updating the users table row by row will be a time-consuming task. It would be better if you could reorganise your query so that row-by-row updates are not required.
I'm not 100% sure of the syntax (as I've never used MySQL before) but here's a sample of the syntax used in MS SQL Server 2000
DECLARE @tmp TABLE
(
    [MonthRank] [INT] IDENTITY(1,1) NOT NULL,
    [UserId] [INT] NOT NULL
)

INSERT INTO @tmp ([UserId])
SELECT [id]
FROM [users]
WHERE [status] = '1'
ORDER BY [month_score] DESC

UPDATE users
SET month_rank = [tmp].[MonthRank]
FROM @tmp AS [tmp], [users]
WHERE [users].[Id] = [tmp].[UserId]
In MS SQL Server 2005/2008 you would probably use a CTE.
Any time you have a loop of any significant size that executes queries inside, you've got a very likely antipattern. We could look at the schema and processing requirement with more info, and see if we can do the whole job without a loop.
How much time does it spend calculating the scores, compared with assigning the rankings?
Your problem can be handled in a number of ways, and honestly, more details from your server may point you in a totally different direction. But doing it that way, you are causing 50,000 little locks on a heavily read table. You might get better performance with a staging table and then some sort of transition; inserts into a table no one is reading from are probably going to be better.
Consider:
mysql_query("DELETE FROM month_rank_staging");
while ($bla) {
    mysql_query("INSERT INTO month_rank_staging VALUES ('$id', '$i')");
}
mysql_query("UPDATE month_rank_staging src, users SET users.month_rank = src.month_rank WHERE src.id = users.id");
That'll cause one (bigger) lock on the table, but might improve your situation. But again, that may be way off base depending on the true source of your performance problem. You should probably look deeper at your logs, mysql config, database connections, etc.
Possibly you could use shards by time or other category. But read this carefully before...
You can split the rank processing from the update execution: run through all the data, compute the ranks, and add each update statement to a cache; when the processing is complete, run the updates. The WHERE portion of each UPDATE should reference a primary key set to auto_increment, as mentioned in other posts. This prevents the updates from interfering with the performance of the processing, and it prevents users later in the processing queue from wrongfully taking advantage of the values from users who were processed before them (if one user's rank affects that of another). It also prevents the database from clearing out its table caches because of the SELECTs your processing code performs.
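A sketch of that split, using the question's mysql_* style (the $updates buffer is illustrative, not from the original post):

// Pass 1: compute all ranks first, buffering the statements instead of running them.
$updates = array();
$get_users = mysql_query("SELECT id FROM users WHERE status = '1' ORDER BY month_score DESC");
$i = 0;
while ($a = mysql_fetch_array($get_users)) {
    $i++;
    $updates[] = "UPDATE users SET month_rank = '$i' WHERE id = '$a[id]'";
}

// Pass 2: replay the buffered updates once processing is finished.
foreach ($updates as $sql) {
    mysql_query($sql);
}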
