Updating thousands of MySQL rows with PHP

I have a script which needs to update a value overnight, each night.
My MySQL table has 119k rows, which group into 35k rows.
For each of these groups I need to calculate the highest and the lowest value and then update the rows with the new percentage difference between them.
Right now I can't even execute the update with a LIMIT of 50 or more.
My code:
$query_updates = mysqli_query($con, "SELECT partner FROM trolls GROUP BY partner LIMIT 0, 50")
    or die(mysqli_error($con));
while ($item = mysqli_fetch_assoc($query_updates)) {
    $query_updates_prices = mysqli_query($con, "SELECT
        MIN(partner1) AS p1,
        MAX(partner2) AS p2,
        COUNT(partner3) AS p3
        FROM trolls WHERE partner='". $item["partner"] ."'")
        or die(mysqli_error($con));
    $partner  = mysqli_fetch_assoc($query_updates_prices);
    $partner1 = $partner["p1"];
    $partner2 = $partner["p2"];
    $partner3 = $partner["p3"];
    $difference = $partner1 - $partner2;
    $savings = round($difference / $partner1 * 100);
    $update_tyre = mysqli_query($con, "UPDATE trolls SET
        partner1='". $partner1 ."',
        partner2='". $partner2 ."',
        partner3='". $partner3 ."',
        partner4='". $savings ."'
        WHERE partner='". $item["partner"] ."'")
        or die(mysqli_error($con));
    echo '<strong>Updated: '. $item["partner"] .'</strong><br>';
}
How can I make this simpler / able to execute?

+1 for the cron; running it from the command line would also help, as it won't time out. However, you might have problems with the GROUP BY locking tables.
To be honest (you won't like this), if you are doing a GROUP BY on a field with that many distinct values, I would say the schema is wrong.
So I would look at redoing the tables: have a table for 'partner' and reference it from trolls; that would help.
But to give you a solution that speeds this up a touch, moves you towards a better database/table setup, and removes the locking problem, I would do this.
Step 1:
Create a table called Partners with the fields:
Field1: partner_id
Field2: partner
Field3: p1
Field4: p2
Field5: p3
Step 2:
Run the query
SELECT partner FROM trolls (this could be changed in the future to SELECT * FROM partners)
Step 3:
Check whether each partner is in Partners; if not, insert it.
Step 4:
Run your
SELECT
MIN(partner1) AS p1,
MAX(partner2) AS p2,
COUNT(partner3) AS p3
FROM trolls WHERE partner='". $item["partner"] ."'
Step 5:
Update the values from this into the Partners table, and (for the time being) update the trolls table as well.
Done.
Oh, and in case it's not already there, add an index on the partner field, as sketched below.
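A minimal sketch of that schema; the column types are assumptions, so adjust them to match the data in trolls:
CREATE TABLE partners (
    partner_id INT UNSIGNED NOT NULL AUTO_INCREMENT,  -- Field 1
    partner    VARCHAR(64) NOT NULL,                  -- Field 2
    p1         DECIMAL(10,2) DEFAULT NULL,            -- Field 3: MIN(partner1)
    p2         DECIMAL(10,2) DEFAULT NULL,            -- Field 4: MAX(partner2)
    p3         INT UNSIGNED DEFAULT NULL,             -- Field 5: COUNT(partner3)
    PRIMARY KEY (partner_id),
    UNIQUE KEY ux_partners_partner (partner)          -- makes step 3's "insert if missing" cheap
);
ALTER TABLE trolls ADD INDEX ix_trolls_partner (partner);  -- the index mentioned above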

You can do those 2 SELECTs in one:
SELECT partner, MIN(partner1) AS p1, MAX(partner2) AS p2, COUNT(partner3) AS p3
FROM trolls GROUP BY partner LIMIT 0, 50
Create a BTREE index on trolls (partner); InnoDB (MySQL 5.6+) can build it online, without locking the table:
ALTER TABLE trolls ADD INDEX ix_trolls_partner (partner), ALGORITHM=INPLACE, LOCK=NONE;
If you choose to still run those 2 SELECTs separately, use PDO->prepare instead of PDO->query; from the PDO->prepare doc on php.net:
Calling PDO::prepare() and PDOStatement::execute() for statements that will be issued multiple times with different parameter values optimizes the performance of your application by allowing the driver to negotiate client and/or server side caching of the query plan and meta information
Maybe raise max_execution_time in php.ini if it's too low (I keep it at 300 (5 minutes), but every case is different :P ).
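Putting the combined SELECT and a prepared UPDATE together, a minimal sketch (assuming a PDO connection in $pdo; the savings formula is the one from the question, with a guard against division by zero added):
$stmt = $pdo->query("SELECT partner, MIN(partner1) AS p1, MAX(partner2) AS p2, COUNT(partner3) AS p3
                     FROM trolls GROUP BY partner");
// Prepare the UPDATE once; the driver can then reuse the query plan on every execute().
$update = $pdo->prepare("UPDATE trolls
                         SET partner1 = ?, partner2 = ?, partner3 = ?, partner4 = ?
                         WHERE partner = ?");
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    $savings = $row["p1"] != 0 ? round(($row["p1"] - $row["p2"]) / $row["p1"] * 100) : 0;
    $update->execute([$row["p1"], $row["p2"], $row["p3"], $savings, $row["partner"]]);
}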


How to check if previous query was executed correctly?

I have to decrease the money in one user account and increase it in another, i.e. transfer money from one account to another.
I have this code, for example, in MySQL:
START TRANSACTION;
UPDATE accounts
SET balance = (balance - 100)
WHERE account_id = 2 AND balance > 100;
-- If the above query succeeds, then:
UPDATE accounts
SET balance = (balance + 100)
WHERE account_id = 1;
-- How can I execute the COMMIT only if everything is OK?
COMMIT;
The first query is executed only if balance > 100.
However, the second query (the second UPDATE) should be executed only if the previous query actually decreased the balance. How can I check this automatically?
Furthermore, the COMMIT; has to be executed only if the previous 2 queries have done their job.
How could this be implemented?
(I'm using PHP too, but I think this problem could easily be tackled in SQL. Am I wrong?)
Perform the operation as a single query, not as a pack of queries:
UPDATE accounts t1
CROSS JOIN accounts t2
SET t1.balance = (t1.balance - 100),
    t2.balance = (t2.balance + 100)
WHERE t1.account_id = 2 AND t1.balance > 100
  AND t2.account_id = 1;
-- or
UPDATE accounts
SET balance = balance + CASE account_id WHEN 1 THEN 100
                                        WHEN 2 THEN -100 END
WHERE account_id IN (1, 2);
And you do not need a transaction at all. (Note that the CASE variant omits the balance > 100 guard, so it only fits when the debit is known to be allowed.)
You may also check the number of rows actually altered (physically changed, not merely matched) by the previous query, and take that into account in the second one:
START TRANSACTION;
UPDATE accounts
SET balance = (balance - 100)
WHERE account_id = 2 AND balance > 100;
UPDATE accounts
SET balance = (balance + 100)
WHERE account_id = 1
  AND ROW_COUNT(); -- checks whether a row was altered by the previous statement;
                   -- if not, this statement will not alter any row either
COMMIT;
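Since the question mentions PHP, the same check can also be done there via the affected-row count; a minimal mysqli sketch (assuming an open connection in $con):
mysqli_begin_transaction($con);
mysqli_query($con, "UPDATE accounts SET balance = balance - 100
                    WHERE account_id = 2 AND balance > 100");
if (mysqli_affected_rows($con) === 1) {
    // The debit really happened, so apply the credit and commit both.
    mysqli_query($con, "UPDATE accounts SET balance = balance + 100 WHERE account_id = 1");
    mysqli_commit($con);
} else {
    // Insufficient balance: nothing was debited, so undo the transaction.
    mysqli_rollback($con);
}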

What's better: the MySQL event scheduler, or a PHP script and crontab?

I have 1 MySQL table with thousands of rows. I use it as a transaction history for my users. I also query this table on the same page for a sum of earnings per product. Here are my two MySQL calls:
$earnperproduct = mysqli_query($con,"SELECT product, SUM(amount) AS totalearn FROM wp_payout_history WHERE user=$userid GROUP BY product");
$result = mysqli_query($con,"SELECT * FROM wp_payout_history WHERE user=$userid ORDER BY date DESC");
My fear is that as the table grows, the $earnperproduct call will become too intensive and slow down page loading. So instead of running a SUM every time the page loads, I think it would be easier to update a summary table (for example wp_payout_summary_table) whenever wp_payout_history changes, replacing past values with new SUM(amount) values per user and product, and then to query something like this:
$earnperproduct = mysqli_query($con,"SELECT * FROM wp_payout_summary_table WHERE user=$userid ORDER BY product DESC");
TL;DR
What is the best method to keep a table updated for the $earnperproduct style call? Would I be better off using the MySQL event scheduler or a PHP script with a crontab? Are there any tutorials that can help me build either option?
Both of the control mechanisms you mention use time as the trigger for an action. But your description says that you really want to trigger an action when the data changes, and in a relational database the best way to trigger an action when data is changed is with a... trigger. Which makes your question a duplicate of this one.
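A minimal sketch of such a trigger (the summary table layout is an assumption; it keeps a running total per user and product and requires a UNIQUE key on (user, product) in the summary table):
CREATE TRIGGER trg_payout_history_ai
AFTER INSERT ON wp_payout_history
FOR EACH ROW
    INSERT INTO wp_payout_summary_table (user, product, totalearn)
    VALUES (NEW.user, NEW.product, NEW.amount)
    ON DUPLICATE KEY UPDATE totalearn = totalearn + NEW.amount;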
Arguably it may be more efficient to snapshot the transactions, then use something like...
INSERT INTO summary_table (user, total_amount, last_id)
SELECT
    user, SUM(amount), MAX(last_id)
FROM (
    SELECT a.user, a.total_amount AS amount, a.last_id
    FROM summary_table a
    WHERE a.user = $user_id
      AND a.last_id = (SELECT MAX(b.last_id)
                       FROM summary_table b
                       WHERE b.user = $user_id)
    UNION
    SELECT h.user, h.amount, h.id
    FROM wp_payout_history h
    WHERE h.user = $user_id
      AND h.id > (SELECT MAX(c.last_id)
                  FROM summary_table c
                  WHERE c.user = $user_id)
) ilv
GROUP BY user;
...then it doesn't really matter what you use to refresh the history - the query will always give you an up-to-date response. If you go down this route, then add a dummy integer column to the summary table, add 0 AS unaggregated_rows to the second SELECT and SUM(1) to the 4th SELECT, to work out when it will be most efficient to update the summary table.

update remaining balance to 0 if payment is greater than remaining balance

Using this query I can easily deduct the payment from the remaining balance.
Let's say my rem_balance is 3000 and my payment is 5000, so the change would be 2000.
UPDATE tbl_users SET rem_balance = (rem_balance - '5000') WHERE user_id = '2017001002'
But since I use this query, what happens is that rem_balance updates to -2000. What I want to achieve is that if the payment is greater than rem_balance, rem_balance becomes 0 and the change is kept separately.
Something like this should work
UPDATE tbl_users
SET rem_balance = (CASE WHEN rem_balance < 5000 THEN 0 ELSE rem_balance - 5000 END)
WHERE user_id = '2017001002'
Please be aware that you are using implicit conversion in your SQL ('5000' instead of 5000). It is a bad habit and it can harm the performance of your queries in certain cases. For example, if tbl_users.user_id is an integer as well, then the update will be unable to use indexes to find the row with the specific user_id and will do a sequential scan, which is very slow.
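If you also need to keep the change itself, one option is to read the balance first inside a transaction and compute the change in PHP; a minimal sketch (assuming a mysqli connection in $con, an InnoDB table so FOR UPDATE works, and the literals from the question):
mysqli_begin_transaction($con);
// Lock the row so the balance cannot change between the read and the write.
$res = mysqli_query($con, "SELECT rem_balance FROM tbl_users
                           WHERE user_id = 2017001002 FOR UPDATE");
$row = mysqli_fetch_assoc($res);
$payment = 5000;
$change  = max(0, $payment - $row["rem_balance"]);   // 2000 in the example
mysqli_query($con, "UPDATE tbl_users
                    SET rem_balance = (CASE WHEN rem_balance < $payment
                                            THEN 0 ELSE rem_balance - $payment END)
                    WHERE user_id = 2017001002");
mysqli_commit($con);
// $change now holds the part of the payment that exceeded the balance.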

How to improve MySQL database performance without changing the DB structure

I have a database that is already in use, and I have to improve the performance of the system that uses it.
There are 2 major queries running about 1000 times in a loop, and these queries have inner joins to 3 other tables each. This in turn is making the system very slow.
I actually tried to remove the queries from the loop and fetch all the data only once, processing it in PHP. But that puts too much load on memory (RAM), and the system hangs if 2 or more clients try to use it.
There is a lot of data in the tables even after removing the expired data.
I have attached the queries below.
Can anyone help me with this issue?
SELECT * FROM inventory
WHERE (region_id = 38 OR region_id = -1)
  AND (tour_opp_id = 410 OR tour_opp_id = -1)
  AND room_plan_id = 141 AND meal_plan_id = 1 AND bed_type_id = 1 AND hotel_id = 1059
  AND FIND_IN_SET(supplier_code, 'QOA,QTE,QM,TEST,TEST1,MQE1,MQE3,PERR,QKT')
  AND ('2014-11-14' BETWEEN from_date AND to_date)
ORDER BY hotel_id DESC, supplier_code DESC, region_id DESC, tour_opp_id DESC, inventory.inventory_id DESC
SELECT *, pinfo.fri AS pi_day_fri, pinfoadd.fri AS pa_day_fri, pinfochld.fri AS pc_day_fri
FROM `profit_markup`
INNER JOIN profit_markup_info AS pinfo ON pinfo.profit_id = profit_markup.profit_markup_id
INNER JOIN profit_markup_add_info AS pinfoadd ON pinfoadd.profit_id = profit_markup.profit_markup_id
INNER JOIN profit_markup_child_info AS pinfochld ON pinfochld.profit_id = profit_markup.profit_markup_id
WHERE profit_markup.hotel_id = 1059 AND (`booking_channel` = 1 OR `booking_channel` = 2)
  AND (`rate_region` = -1 OR `rate_region` = 128)
  AND (period_from <= '2014-11-14' AND period_to >= '2014-11-14')
ORDER BY profit_markup.hotel_id DESC, supplier_code DESC, rate_region DESC, operators_list DESC, profit_markup_id DESC
Since we have not seen your SHOW CREATE TABLE output and EXPLAIN EXTENDED plan, it is hard to give you one answer.
But generally speaking, in regard to your first query (BTW, I rewrote it below):
SELECT
    hotel_id, supplier_code, region_id, tour_opp_id, inventory_id
FROM
    inventory
WHERE
    region_id IN (38, -1)
    AND tour_opp_id IN (410, -1)
    AND room_plan_id = 141
    AND meal_plan_id = 1
    AND bed_type_id = 1
    AND hotel_id = 1059
    AND supplier_code IN ('QOA', 'QTE', 'QM', 'TEST', 'TEST1', 'MQE1', 'MQE3', 'PERR', 'QKT')
    AND ('2014-11-14' BETWEEN from_date AND to_date)
ORDER BY
    hotel_id DESC, supplier_code DESC, region_id DESC, tour_opp_id DESC, inventory_id DESC
Do not use * to get all the columns; you should list the columns you really need. Using * is just a lazy way of writing a query, and limiting the columns limits the amount of data being selected.
How often are the records in inventory updated/inserted/deleted? If not too often, you can consider using SQL_CACHE. However, caching a query will cause you problems if the inventory table is updated very often. In addition, to use the query cache you must check the value of query_cache_type on your server: SHOW GLOBAL VARIABLES LIKE 'query_cache_type';. If this is set to 0, the cache feature is disabled and SQL_CACHE will be ignored. If it is set to 1, the server will cache all queries unless you tell it not to using SQL_NO_CACHE. If it is set to 2, MySQL will cache a query only where the SQL_CACHE clause is used. Here is the documentation about query_cache_type.
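For example (only honored when the cache is enabled and query_cache_type permits it):
SELECT SQL_CACHE hotel_id, supplier_code, region_id, tour_opp_id, inventory_id
FROM inventory
WHERE hotel_id = 1059 AND ('2014-11-14' BETWEEN from_date AND to_date);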
If you have an index on the following columns, in this order, it will help you: (hotel_id, supplier_code, region_id, tour_opp_id, inventory_id)
ALTER TABLE inventory
ADD INDEX (hotel_id, supplier_code, region_id, tour_opp_id, inventory_id);
If possible, increase sort_buffer_size on your server, as most likely your issue here is that you are doing too much sorting.
As for the second query (BTW, I rewrote it below):
SELECT
*, pinfo.fri as pi_day_fri,
pinfoadd.fri as pa_day_fri,
pinfochld.fri as pc_day_fri
FROM
profit_markup
INNER JOIN
profit_markup_info AS pinfo ON pinfo.profit_id = profit_markup.profit_markup_id
INNER JOIN
profit_markup_add_info AS pinfoadd ON pinfoadd.profit_id = profit_markup.profit_markup_id
INNER JOIN
profit_markup_child_info AS pinfochld ON pinfochld.profit_id = profit_markup.profit_markup_id
WHERE
profit_markup.hotel_id = 1059
AND booking_channel IN (1, 2)
AND rate_region IN (-1, 128)
AND period_from <= '2014-11-14'
AND period_to >= '2014-11-14'
ORDER BY
profit_markup.hotel_id DESC, supplier_code DESC, rate_region DESC,
operators_list DESC, profit_markup_id DESC
Again, eliminate the use of * from your query.
Make sure that the following columns have the same type/collation and the same size: pinfo.profit_id, profit_markup.profit_markup_id, pinfoadd.profit_id, pinfochld.profit_id; and each one has to have an index in its table. If the columns have different types, then MySQL will have to convert the data every time to join the records; even with an index it will be slower. Also, if those columns are character types (i.e. VARCHAR()), make sure they are CHAR() with a collation of latin1_general_ci, as this will be faster for finding IDs; but if you are using INT(), even better.
Use the 3rd and 4th tricks I listed for the previous query.
Try using STRAIGHT_JOIN ("you must know what you're doing here or it will bite you!"). Here is a good thread about this: When to use STRAIGHT_JOIN with MySQL.
I hope this helps.
For the first query, I am not sure you can do much (assuming you have already indexed the fields you are ordering by) apart from replacing the * with column names (don't expect this to increase performance drastically).
For the second query, before you go through the loop and plug in selection arguments, you could create a view with all the tables joined and ordered, then make a prepared statement that selects from the view and binds the arguments in the loop; see the sketch below.
Also, if your PHP server and the database server are in two different places, it is better to do the selection through a stored procedure in the database.
(If nothing works out, then memcache is the way to go... although I have personally never done this.)
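A rough sketch of the view idea (the view name, the $pdo connection, and the loop variables are assumptions; the WHERE stays outside the view so it can be parameterized):
CREATE VIEW v_profit_markup AS
SELECT pm.*, pinfo.fri AS pi_day_fri, pinfoadd.fri AS pa_day_fri, pinfochld.fri AS pc_day_fri
FROM profit_markup pm
INNER JOIN profit_markup_info AS pinfo ON pinfo.profit_id = pm.profit_markup_id
INNER JOIN profit_markup_add_info AS pinfoadd ON pinfoadd.profit_id = pm.profit_markup_id
INNER JOIN profit_markup_child_info AS pinfochld ON pinfochld.profit_id = pm.profit_markup_id;
Then in PHP, prepare once and bind inside the loop:
$stmt = $pdo->prepare("SELECT * FROM v_profit_markup
                       WHERE hotel_id = ? AND booking_channel IN (1, 2)
                         AND rate_region IN (-1, ?)
                         AND period_from <= ? AND period_to >= ?");
foreach ($hotel_ids as $hotel_id) {   // $hotel_ids, $rate_region, $date are hypothetical loop inputs
    $stmt->execute([$hotel_id, $rate_region, $date, $date]);
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
    // ... process $rows ...
}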
Note that this improves query performance, not database performance as a whole.
For both queries, first check whether an index is available on the WHERE and ON (join) clause columns; if an index is missing, add one to improve query performance.
Check the EXPLAIN plan before creating an index.
If possible, show us the EXPLAIN plan of both queries; that will help.

managing concurrency using an UPDATE query's WHERE condition

I am using MySQL and PHP.
Table:
user_meetup
id, user_id, meetup_id, with a unique key on (user_id, meetup_id)
If my meetup is limited to 10 places, 9 users have already RSVPed, and 2 users RSVP at the same time (I just have to consider the concurrency case):
A -> SELECT COUNT(id) FROM user_meetup -> result: 9, hence go ahead
B -> SELECT COUNT(id) FROM user_meetup -> result: 9, hence go ahead
A -> INSERT ......
B -> INSERT ......
Now both of them get a place and my count becomes 11.
My solution was to add one more table:
user_meetup_count
id, meetup_id, user_count (count set to 0 as the default value; the record is created as soon as the meetup is created)
Now, if I use an update query like:
update user_meetup_count set user_count = user_count + 1 where user_count < 10
and write into the user_meetup table based on the number of rows updated by the above query:
if it returns 0 it means the meetup is full, and if it returns 1 I got the place.
Will this work if 2 users try at the same time?
Is my approach to solving the concurrency problem right?
Is there a better way?
What are the tools for testing this kind of situation?
Use LOCK TABLES before counting and UNLOCK TABLES after inserting.
Or you could use GET_LOCK and RELEASE_LOCK; with these you do not need to lock the whole table.
Or explore gap locking for InnoDB tables; with this you need to use transactions.
And you could use JMeter for testing your queries.
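A minimal GET_LOCK sketch of the RSVP check (assuming a mysqli connection in $con, with $meetup_id and $user_id coming from your application; the lock name and the 10-second timeout are arbitrary choices):
// Serialize RSVPs per meetup; in real code, check that GET_LOCK returned 1 before proceeding.
mysqli_query($con, "SELECT GET_LOCK('meetup_" . $meetup_id . "', 10)");
$res = mysqli_query($con, "SELECT COUNT(id) AS c FROM user_meetup WHERE meetup_id = " . $meetup_id);
$row = mysqli_fetch_assoc($res);
if ($row["c"] < 10) {
    // Still a free place; the unique key on (user_id, meetup_id) guards against double RSVPs.
    mysqli_query($con, "INSERT INTO user_meetup (user_id, meetup_id)
                        VALUES ($user_id, $meetup_id)");
}
mysqli_query($con, "SELECT RELEASE_LOCK('meetup_" . $meetup_id . "')");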
