I have a PHP script which starts another PHP script multiple times in a foreach loop. The other PHP scripts all write data to the same database table.
Will this cause any problems? There will be around 30 processes writing to the same database table...
Or is this handled automatically by MySQL?
Thank you!
Bye,
WorldSignia
It depends on what you are writing. INSERTs can run simultaneously without trouble; UPDATE ... WHERE ... might lead to conflicts.
Imagine you are executing UPDATE ... WHERE id=2 from two scripts at once. One might overwrite the other. You need to implement some locking facility.
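For illustration, a minimal SQL sketch of the two usual fixes (the counters table and its columns are made up):

-- Lost update: two scripts each read value=5, compute 6 in PHP,
-- and both write 6 back - one increment disappears.

-- Fix 1: let MySQL apply the change atomically, no explicit lock needed:
UPDATE counters SET value = value + 1 WHERE id = 2;

-- Fix 2: when you must read first, hold a row lock for the duration
-- (InnoDB transactions):
START TRANSACTION;
SELECT value FROM counters WHERE id = 2 FOR UPDATE; -- blocks other writers to this row
UPDATE counters SET value = 6 WHERE id = 2;
COMMIT; -- releases the lock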
You should be fine until two processes attempt to modify/retrieve the same row(s). If you suspect you might run into such problems, take a look at MySQL transactions (you need MySQL server 5.0 or later): http://dev.mysql.com/doc/refman/5.0/en/commit.html
Related
I'd like to download some JSON data (which gets updated every 15 seconds) and store it in my MariaDB with a PHP script.
Unfortunately, the database update queries take between 1 second and sometimes up to 60 seconds, depending on the size of the JSON data.
So sometimes I deadlock myself with write queries that take longer than 15 seconds, and as soon as I read/process the data I block all the write queries as well.
Obviously, I do have the wrong approach and it's more complicated than I thought.
Does anyone have a good idea how such a job can be done professionally, so that the data is updated continuously and the updates are not blocked while I read the data?
Thanks for any hints!
PS: Currently I'm using an InnoDB table, and to speed up the inserts I've set autocommit to 0 and run everything in one transaction.
I had the fastest results with LOCK TABLES ... WRITE, but of course this blocks read access as well.
Simply updating some data in MariaDB shouldn't take that long unless the update you're doing is complex. What you could consider is inserting the raw JSON (maybe even into a document database instead) and having a background process, triggered by a cronjob, read the stored raw JSON and update MariaDB from it.
Additionally, you could consider inserting data rather than updating it; appends don't contend for existing rows, which avoids the deadlocks. Doing so might require you to change your data model, so it might not be the solution you're looking for.
Other than the above, I'd recommend you look at the process you've set up and split it into multiple steps which can be run individually. That gives you fine-grained control over the timing and triggers for each step, which will prevent deadlocks if set up properly.
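As a rough sketch of the staging approach (everything here is assumed: the json_staging table, the feed URL, and the DSN):

<?php
// Fetcher script, run every 15 seconds: a single fast INSERT holds no
// long-lived locks, so it never collides with the slow processing step.
$pdo  = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$json = file_get_contents('https://example.com/feed.json');
$stmt = $pdo->prepare('INSERT INTO json_staging (payload) VALUES (?)');
$stmt->execute([$json]);
// A separate cronjob then reads unprocessed rows from json_staging and
// performs the expensive MariaDB updates without blocking this fetcher.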
Maybe this is an obvious question, but it's just something I'm unsure of. If I have two standalone PHP applications running on one LAMP server, and the two PHP applications share the same MySQL database, do I need to worry about data integrity during concurrent database transactions, or is this something that MySQL just takes care of "natively"?
What happens if the two PHP applications both try to update the same record at the same time? What happens if they try to update the same table at the same time? What happens if they both try to read data from the database at the same time? Or if one application tries to read a record at the same time as the other application is updating that record?
This depends on several factors:
- the DB engine you are using;
- the locking policy / transaction isolation you have set for your environment, or for your query:
https://dev.mysql.com/doc/refman/8.0/en/innodb-locking-reads.html
https://dev.mysql.com/doc/refman/8.0/en/innodb-locks-set.html
- the code you are using: you could use a SELECT ... FOR UPDATE to lock only the rows you want to modify:
https://dev.mysql.com/doc/refman/8.0/en/update.html
- and how you manage transactions:
https://dev.mysql.com/doc/refman/8.0/en/commit.html
This is just a brief suggestion.
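To make the last two points concrete, a minimal PDO sketch (the stock table, its columns, and the DSN are invented for illustration):

<?php
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->beginTransaction();
try {
    // lock only the row we are about to modify
    $stmt = $pdo->prepare('SELECT qty FROM stock WHERE item_id = ? FOR UPDATE');
    $stmt->execute([42]);
    $qty = (int) $stmt->fetchColumn();
    $pdo->prepare('UPDATE stock SET qty = ? WHERE item_id = ?')
        ->execute([$qty - 1, 42]);
    $pdo->commit(); // commit releases the row lock
} catch (Exception $e) {
    $pdo->rollBack(); // undo everything and release the lock on failure
    throw $e;
}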
This seems like a pretty basic question but one I don't know the answer to.
I wrote a script in PHP that loops through some data and then performs an UPDATE on records in our database. There are roughly 150,000 records, so the script certainly takes a while to complete.
Could I potentially harm or interfere with the data insertion if I run a basic SELECT statement?
Say I want to ensure that the script is working properly, so I run a basic SELECT COUNT() to see whether the numbers are changing in real time as the script runs. Is this possible, or would it screw something up?
Thank you!
Generally a SELECT call is incapable of "causing harm" provided you're not talking about SQL injection problems.
The InnoDB engine, which you should be using, has what's called Multi-Version Concurrency Control, or MVCC for short. It means that until your UPDATE statement (or the transaction the statement is part of) has finished, the SELECT will read from the last consistent database state.
If you're using MyISAM, which is a very bad idea in most production environments due to the limitations of that engine and the way the data is stored without a rollback journal, the SELECT call will probably block until the UPDATE is applied since it does not support MVCC.
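If you are not sure which engine a given table uses, one quick way to check (the schema and table names here are placeholders):

SELECT ENGINE FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'mydb' AND TABLE_NAME = 'mytable';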
I was wondering if there is a (free) tool for MySQL/PHP benchmarking.
In particular, I would like to insert thousands of rows into the MySQL database, and test the application with concurrent queries to see if it will last. That is, test the application in the worst cases.
I saw some paid tools, but no free or customizable ones.
Any suggestion? Or any script?
Thanks
Insert one record into the table.
Then do:
INSERT IGNORE INTO mytable SELECT FLOOR(RAND()*100000) FROM mytable;
Then run that line several times. Each time you will double the number of rows in the table (and doubling grows VERY fast). This is a LOT faster than generating the data in PHP or other code. You can modify which columns you SELECT RAND() for, and what the range of the numbers is. It's possible to randomly generate text too, but that's more work.
You can run this code from several terminals at once to test concurrent inserts. The IGNORE will ignore any primary key collisions.
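Putting it together, a complete run might look like this (the table, its two columns, and the SUBSTRING(MD5(RAND()), ...) trick for fake text are just one way to do it):

CREATE TABLE mytable (id INT PRIMARY KEY, name VARCHAR(10));
INSERT INTO mytable VALUES (1, 'seed');
-- each repetition roughly doubles the row count (key collisions are ignored):
INSERT IGNORE INTO mytable
SELECT FLOOR(RAND()*100000), SUBSTRING(MD5(RAND()), 1, 10) FROM mytable;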
Make a loop (probably infinite) that would keep inserting data into the database and test going from there.
for ($i = 1; $i <= 1000; $i++) {
    mysql_query("INSERT INTO testing VALUES ('" . $i . "')");
    // do some other testing
}
for($i=1;$i<5000;$i++){
$query = mysql_query("INSERT INTO something VALUES ($i)");
}
replace something with your table ;D
If you want to test concurrency, you will have to run your insert/update statements in parallel.
An easy and very simple way (without going into fork/threads and all that jazz) would be to do it in bash, as follows:
1. Create an executable PHP script
#!/usr/bin/php -q
<?php
/*your php code to insert/update/whatever you want to test for concurrency*/
?>
2. Call it within a for loop by appending & so it goes in the background.
#!/bin/bash
for((i=0; i<100; i++))
do
/path/to/my/php/script.sh &
done
wait;
You can always extend this by creating multiple PHP scripts with various insert/update/select queries and running them through the for loop (remember to change i<100 to a higher number if you want more load). Just don't forget to add the & after you call your script. (Of course, you will need to chmod +x script.sh.)
Edit: Added the wait statement, below this you can write other commands/stuff you may want to do after flooding your mysql db.
I did a quick search and found the following page at MySQL documentation => http://dev.mysql.com/doc/refman/5.0/en/custom-benchmarks.html. This page contains the following interesting links:
the Open Source Database Benchmark, available at http://osdb.sourceforge.net/.
For example, you can try benchmarking packages such as SysBench and DBT2, available at http://sourceforge.net/projects/sysbench/ and http://osdldbt.sourceforge.net/#dbt2. These packages can bring a system to its knees, so be sure to use them only on your development systems.
For MySQL to be fast you should look into Memcached or Redis to cache your queries. I like Redis a lot, and you can get a free (small) instance thanks to http://redistogo.com. Most of the time it is the READS that are killing your server, not the WRITES, which are usually less frequent. And when WRITES are frequent, losing a little data is often not a big deal: sites with big WRITE rates are for example Twitter or Facebook, and it is not the end of the world if a tweet or a Facebook wall post gets lost. As pointed out above, you can relieve the read load easily by using Memcached or Redis.
If the WRITES are killing you, you could look into bulk inserts if possible, transactional inserts, delayed inserts when not using InnoDB, or partitioning. If the data is not really critical, you could buffer the queries in memory first and then do a bulk insert periodically. This way, reads from MySQL could return stale data (which could be a problem). Then again, with Redis you could store all your data in memory, but when your server crashes you can lose data, which could be a big problem.
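For reference, a bulk (multi-row) insert simply sends many rows in one statement and one round trip, which is much cheaper than one INSERT per row (table and columns invented):

INSERT INTO events (user_id, action) VALUES
    (1, 'login'),
    (2, 'click'),
    (3, 'logout');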
I have a script that does an update function live. I would move it to a cron job, but due to some limitations I'd much rather have it live and called when the page loads.
The issue is that when there is a lot of traffic, it doesn't quite work, because it's using some random and weighted numbers, so if it's hit a bunch of times, the results aren't what we want.
So, question is. Is there a way to tell how many times a particular script is being accessed? And limit it to only once at a time?
Thank you!
The technique you are looking for is called locking.
The simplest way to do this is to create a temporary file, and remove it when the operation has completed. Other processes will look for that temporary file, see that it already exists and go away.
However, you also need to take care of the possibility of the lock's owner process crashing, and failing to remove the lock. This is where this simple task seems to become complicated.
File based locking solutions
PHP has a built-in flock() function that promises an OS-independent, file-based locking feature. This question has some practical hints on how to use it. However, the manual page warns that under some circumstances, flock() has problems with multiple instances of PHP scripts trying to get a lock simultaneously. This question seems to have more advanced answers on the issue, but they are all not trivial to implement.
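For what it's worth, a minimal flock() sketch (the lock-file path is arbitrary); note that the OS releases the lock automatically when the process dies, which covers the crashing-owner case above:

<?php
$fp = fopen('/tmp/myscript.lock', 'c'); // 'c' creates the file if missing
if ($fp === false) {
    exit(1); // cannot even open the lock file
}
if (!flock($fp, LOCK_EX | LOCK_NB)) {
    exit(0); // another instance holds the lock; give up immediately
}
// ... do the work that must not run concurrently ...
flock($fp, LOCK_UN);
fclose($fp);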
Database based locking
The author of this question - probably scared away by the complications surrounding flock() - asks for other, not file-based, locking techniques and comes up with MySQL's GET_LOCK(). I have never worked with it, but it looks pretty straightforward - if you use MySQL anyway, it may be worth a shot.
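A minimal GET_LOCK() sketch, assuming PDO and an arbitrary lock name (GET_LOCK returns 1 on success and 0 if the lock is already taken; MySQL also releases it automatically when the connection ends):

<?php
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
if ($pdo->query("SELECT GET_LOCK('my_job', 0)")->fetchColumn() != 1) {
    exit; // another connection holds the lock
}
// ... do the work that must not run concurrently ...
$pdo->query("SELECT RELEASE_LOCK('my_job')");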
Damn, this issue is complicated if you want to do it right! Interested to see whether anything more elegant comes up.
You could do something like this (requires PHP 5):
if (file_get_contents("lock.txt") == "unlocked") {
    // no lock present, so place one
    file_put_contents("lock.txt", "locked", LOCK_EX);
    // do your processing
    // ...
    // remove the lock
    file_put_contents("lock.txt", "unlocked", LOCK_EX);
}
file_put_contents() overwrites the file (as opposed to appending) by default, so the contents of the file should only ever be "locked" or "unlocked". You'll want to specify the LOCK_EX flag to ensure that the file isn't being written to by another instance of the script at the same time.
Obviously, as @Pekka mentioned in his answer, this can cause problems if a fatal error occurs (or PHP crashes, or the server crashes, etc., etc.) between placing the lock and removing it, as the file will simply remain locked. Note also that the read-then-write check is not atomic: two requests arriving at the same moment can both read "unlocked" and both proceed, so this is a best-effort lock rather than a watertight one.
Start the script with an SQL query that tests whether a timestamp field in the database is more than one day old.
If it is, write the current timestamp and execute the script.
SQL to show the idea (valid for MySQL):
UPDATE runs SET lastrun = NOW() WHERE lastrun < NOW() - INTERVAL 1 DAY
(different SQL servers will require different changes to the above)
Check how many rows were updated to see whether this run of the script got the lock.
Do not do this with two queries (a SELECT followed by an UPDATE), because then it won't be atomic anymore.
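In PHP this might look like the following sketch (the runs table matches the SQL shown above; the DSN is a placeholder):

<?php
$pdo  = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$stmt = $pdo->prepare(
    'UPDATE runs SET lastrun = NOW() WHERE lastrun < NOW() - INTERVAL 1 DAY'
);
$stmt->execute();
if ($stmt->rowCount() === 1) {
    // exactly one row changed: this process won the lock, run the job
}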