If two users execute the same PHP file, will it be executed in parallel or sequentially? Example:
If I have a database table data which only has one column id, would it be possible that the following code produces the same outcome for two different users?
1. $db=startConnection();
2. $query="SELECT id FROM data";
3. $result=$db->query($query)or die($db->error);
4. $zeile=mysqli_fetch_assoc($result);
5. $number=$zeile['id'];
6. $newnumber=$number+1;
7. echo $number;
8. $update = "UPDATE data Set id = '$newnumber' WHERE id = '$number'";
9. $db->query($update)or die($db->error);
10. mysqli_close($db);
If it is not executed in parallel, does that mean that when 100 people are loading a PHP file that has a loading time of 1 second, one of them has to wait 99 seconds?
Edit: In the comments it is stated that I could mess up my database. I guess this is how it could happen:
User A executes the file from 1.-7.; at this moment user B executes the file from 1.-7.; then A executes 8.-10. and B executes 8.-10. In this scenario both users would have the same number on the screen.
Now let's take the following example:
1. $db=startConnection();
2. $query=" INSERT INTO data VALUES ()";
3. $result=$db->query($query)or die($db->error);
4. echo $db->insert_id;
5. mysqli_close($db);
Let's say A executes the file from 1.-3.; at this moment user B executes the file from 1.-5.; after that user A executes the file from 4.-5. I guess in this scenario both would also have the same number on the screen, right? Would a transaction prevent both scenarios?
You can say that PHP files are executed in parallel (in most cases they are, but it depends on the web server).
Yes, it is possible that the following code produces the same outcome for two different users.
How to avoid this possibility?
1) If you are using MySQL, you can use transactions and "SELECT ... FOR UPDATE" to avoid this possibility. Just using a transaction wouldn't help!
2) Be sure that you are using InnoDB or any other database engine that supports transactions. For example, MyISAM doesn't support transactions. You can also have problems if any form of snapshotting is enabled in the database to handle reading locked records.
3) Example of using "SELECT ... FOR UPDATE":
$db = startConnection();
// Start transaction
$db->query("START TRANSACTION") or die($db->error);
// Your SELECT request but with "FOR UPDATE" lock
$query = "SELECT id FROM data FOR UPDATE";
$result = $db->query($query);
// Roll back changes if there is an error
if (!$result)
{
$error = $db->error; // save the message before ROLLBACK resets it
$db->query("ROLLBACK");
die($error);
}
// Fetch as an associative array so that $zeile['id'] works
$zeile = mysqli_fetch_assoc($result);
$number = $zeile['id'];
$newnumber = $number + 1;
echo $number;
$update = "UPDATE data SET id = '$newnumber' WHERE id = '$number'";
// Run the UPDATE (not the SELECT again)
$result = $db->query($update);
// Roll back changes if there is an error
if (!$result)
{
$error = $db->error;
$db->query("ROLLBACK");
die($error);
}
// Commit changes in the database after the requests executed successfully
$db->query("COMMIT");
mysqli_close($db);
Why wouldn't just using a transaction help?
A transaction alone only locks rows for writing; plain reads are not blocked. You can test the examples below by running two mysql console clients in two separate terminal windows. I did so and that's how it works.
We have client#1 and client#2, which execute in parallel.
Example #1. Without "SELECT ... FOR UPDATE":
client#1: BEGIN
client#2: BEGIN
client#1: SELECT id FROM data // fetched id = 3
client#2: SELECT id FROM data // fetched id = 3
client#1: UPDATE data Set id = 4 WHERE id = 3
client#2: UPDATE data Set id = 4 WHERE id = 3
client#1: COMMIT
client#2: COMMIT
Both clients fetched the same id (3).
Example #2. With "SELECT ... FOR UPDATE":
client#1: BEGIN
client#2: BEGIN
client#1: SELECT id FROM data FOR UPDATE // fetched id = 3
client#2: SELECT id FROM data FOR UPDATE // here! client#2 will wait for end of transaction started by client#1
client#1: UPDATE data Set id = 4 WHERE id = 3
client#1: COMMIT
client#2: client#1 ended transaction and client#2 fetched id = 4
client#2: UPDATE data Set id = 5 WHERE id = 4
client#2: COMMIT
Hey, I think such read-locks reduce performance!
"SELECT ... FOR UPDATE" only blocks other clients that also use "SELECT ... FOR UPDATE". That's good, because it means that such a read-lock doesn't affect standard "SELECT" requests without "FOR UPDATE".
Links
MySQL documentation: "SELECT ... FOR UPDATE" and other read-locks
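As a complement to the locking read, the collision from Example #1 can also be detected after the fact: the second client's UPDATE ... WHERE id = 3 matches no row once the first client has changed it. The sketch below is not the method from this answer but an optimistic "compare and retry" variant. The helper name takeNumber() is hypothetical, and an in-memory SQLite database (assuming the pdo_sqlite extension) stands in for MySQL only so that the code runs standalone; each statement runs in autocommit mode.

```php
<?php
// Sketch only: SQLite in memory stands in for the real MySQL table.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE data (id INTEGER NOT NULL)');
$pdo->exec('INSERT INTO data (id) VALUES (3)');

/**
 * Hypothetical helper: returns a number that no other caller can also get.
 * It retries whenever another client changed the row between our read and our write.
 */
function takeNumber(PDO $pdo, int $maxAttempts = 10): int
{
    $update = $pdo->prepare('UPDATE data SET id = id + 1 WHERE id = :seen');
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        $seen = (int) $pdo->query('SELECT id FROM data')->fetchColumn();
        $update->execute([':seen' => $seen]);
        if ($update->rowCount() === 1) {
            return $seen; // our UPDATE matched, so $seen belongs to this caller only
        }
        // rowCount() === 0: someone else incremented first; read again and retry
    }
    throw new RuntimeException('Could not take a number after ' . $maxAttempts . ' attempts');
}

$first = takeNumber($pdo);
$second = takeNumber($pdo);
echo $first, ' ', $second, PHP_EOL; // 3 4
```

The trade-off is that nobody waits on a lock, but a caller may have to retry under heavy contention.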
Parallel or Sequential?
Part of your question was about PHP running either parallel or sequential. As I have read everything and its opposite about that topic, I decided to test it myself.
Field testing:
On a LAMP stack running PHP 5.5 w/ Apache 2, I made a script with a very expensive loop:
function fibo($n)
{
return ($n > 1) ? fibo($n - 1) + fibo($n - 2) : 1;
}
$start = microtime(true);
print "result: ".fibo(38);
$end = microtime(true);
print " - took ".round(($end - $start), 3).' s';
Result with 1 script running:
result: 63245986 - took 19.871 s
Result with 2 scripts running at the same time in two different browser windows:
result: 63245986 - took 20.753 s
result: 63245986 - took 20.847 s
Result with 3 scripts running at the same time in three different browser windows:
result: 63245986 - took 26.172 s
result: 63245986 - took 28.302 s
result: 63245986 - took 28.422 s
[Screenshot: CPU usage while running 2 instances of the script]
[Screenshot: CPU usage while running 3 instances of the script]
So, it's parallel!
Although you can't easily use multithreading inside a PHP script (it is possible, though), Apache takes advantage of your server's multiple cores to dispatch the load.
So if your 1-second script is run by 100 users at the same time and you have 100 CPU cores, the 100th user will hardly notice anything. If you have 8 CPU cores (which is more common), then the 100th user will theoretically have to wait something like 100 / 8 = 12.5 seconds for their instance of the script to begin. In practice, as the "benchmark" shows, each thread's performance diminishes when other threads are running at the same time on other cores. So it could be a lot more. But not 100 seconds more.
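The back-of-the-envelope estimate above can be written down as a tiny function. This is an idealized model, not a measurement: it assumes every request takes the same time, each core runs exactly one request at a time, and requests start in arrival order. The function name startDelay() is made up for illustration.

```php
<?php
/**
 * Idealized model (an assumption, not a benchmark): returns how many seconds
 * the n-th request (1-based) waits before it starts running.
 */
function startDelay(int $n, int $cores, float $secondsPerRequest): float
{
    $roundsBefore = intdiv($n - 1, $cores); // full "rounds" of requests ahead of this one
    return $roundsBefore * $secondsPerRequest;
}

echo startDelay(100, 8, 1.0), PHP_EOL;   // 12
echo startDelay(100, 100, 1.0), PHP_EOL; // 0
echo startDelay(100, 1, 1.0), PHP_EOL;   // 99
```

With 8 cores the 100th request starts after 12 full rounds, which is in line with the rough 100 / 8 figure; with a single core it would be the 99 seconds from the question.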
Related
Can two Laravel workers use the same DB transaction?
I have Job Process A which will call/dispatch Job Process B if there is data in table A with flag is_processed = 0. What it does is:
-- first select data with lock
SELECT *
FROM tableA
WHERE is_proccesed = 0
LIMIT 1000
FOR UPDATE OF tableA SKIP LOCKED
-- insert data to tableB
INSERT tableB VALUES SELECT values from tableA
-- update data
UPDATE tableA SET is_proccesed = 1 where id = (from any id i have select)
Then trigger job process B:
ProcessB::dispatch(from any id i have select as string)->onQueue('queueA');
I have Job Process B which will be triggered by Job Process A or cron which works every minute.
--first select data with lock
SELECT *
FROM tableB
WHERE is_proccesed = 0 AND id in (parameter get from job A if any)
LIMIT 1000
FOR UPDATE OF tableB SKIP LOCKED
-- call API with parameter value is from tableB
-- update data
If (call API is success) then:
UPDATE tableB SET is_proccesed = 1 where id = (from any id i have select)
if (call API is fail) then:
UPDATE tableB SET is_proccesed = 0 where id = (from any id i have select)
I have a cron job running every minute that dispatches Job Process A if any row in table A has is_processed = 0.
I have a cron job running every minute that dispatches Job Process B if any row in table B has is_processed = 0.
I use supervisor to do this in real time and use max-retry for jobs that fail 3 times.
My problem is:
Job Process B calls the API twice for the same data,
I have looked through my logs, and the SELECT returned the same rows to 2 different processes at the same time (in some cases with 2000 or more rows to process),
It doesn't always happen when only a small amount of data is processed.
My question is:
Does selecting data with a lock not work with queue jobs?
Is it correct to create a cron job that manually re-dispatches the job to reprocess unsuccessful data, or should I rely only on failed-job handling to rework jobs?
I have not seen many Web languages that use database locks correctly. Without looking at the Laravel code, I would guess that it does not use database locks correctly for jobs. I know that it does not use locks for migrate. Running migrate from >2 web nodes is not safe.
If you use Redis or some other technology for jobs instead of SQL DB, a lot of concurrent problems will probably go away.
Manage your own global lock
You can manage your own lock and add synchronization between your own processes.
$results = \DB::select('SELECT GET_LOCK("process-b", 120) as obtain_lock');
if (!$results[0]->obtain_lock) { return 0; }
//120 is seconds to wait for lock or fail
//load one record
//call API
//update one record
//free lock
$results = \DB::select('SELECT RELEASE_LOCK("process-b") as release_lock');
if (!$results[0]->release_lock) { return -1; } //couldn't release lock, stop process, free mysql connection
In PostgreSQL they are called "advisory locks", but you cannot use strings as lock names; you have to use numbers.
$results = \DB::select('SELECT pg_advisory_lock(1337)');
//pg_advisory_lock() returns void and waits until the lock is granted, so there is no result to check
//load one record
//call API
//update one record
//free lock
$results = \DB::select('SELECT pg_advisory_unlock(1337) as released');
if (!$results[0]->released) { return -1; } //pg_advisory_unlock() returns false if this session did not hold the lock
Use "SELECT ... FOR UPDATE"
I'm not sure whether you are trying to use FOR UPDATE locks and they are not working, or whether you are skipping the lock intentionally.
For a FOR UPDATE lock to be held, you need to either turn off autocommit (set autocommit=0) or start a transaction.
\DB::transaction( function () use ($id) {
$results = \DB::table('table_b')->where('id', $id)->lockForUpdate()->get();
\DB::table('table_b')->where('id', $id)->update(['x' => 'y']); //x and y are placeholders for your column and value
});
Where ProcA sends jobs to ProcB, you can make 1 ProcB job for each ID that is processed=0 - OR - you can make 1 ProcB job whenever you find any processed=0 records.
So, if ProcB will only work with 1 record ID, then global lock solution is probably not good.
You can check that your lock for update is working by putting sleep() and creating 10-20 ProcB jobs with the same record ID. If you sleep for 3 seconds, and it takes 30-60 seconds to finish all ProcB jobs, then the lock for update is working properly. If they all finish in 3 seconds, then they are not respecting the lock on the record.
Bonus
Add this to your routes/console.php to get concurrent-safe artisan lockingmigrate command
$signature = 'lockingmigrate {--database= : The database connection to use}
{--force : Force the operation to run when in production}
{--path=* : The path(s) to the migrations files to be executed}
{--realpath : Indicate any provided migration file paths are pre-resolved absolute paths}
{--pretend : Dump the SQL queries that would be run}
{--seed : Indicates if the seed task should be re-run}
{--step : Force the migrations to be run so they can be rolled back individually}';
Artisan::command($signature, function ($database=false, $seed=false, $step=false, $pretend=false, $force=false) {
$results = \DB::select('SELECT GET_LOCK("artisan-migrate", 120) as migrate');
if (!$results[0]->migrate) { return -1; }
$params = [
'--pretend' => $pretend,
'--force' => $force,
'--step' => $step,
'--seed' => $seed,
];
$retval = Artisan::call('migrate', $params);
$outputLines = explode("\n", trim(\Artisan::output()));
dump($outputLines);
\DB::select('SELECT RELEASE_LOCK("artisan-migrate")');
return $retval;
})->describe('Concurrent-safe migrate');
The High Level Idea:
I have a microcontroller that can connect to my site via an HTTP request. I want to feed the device a response as soon as a change is noted in the database.
Because the end device is a client (a microcontroller), I'm not aware of a way to push data to it without setting up port forwarding, which is heavily undesired. The problem arises when trying to send data from an external network to an internal one: either (A) use port forwarding, or (B) have the client device initiate the request. That leads me to the idea of having the device send an HTTP request to a file that polls for changes.
Update:
Many thanks to Ollie Jones. I have implemented some of his suggestions here.
Jason McCreary suggested having a modified column, which is a big improvement as it should increase speed and reliability. Great suggestion! :)
If overworking the database is a concern in this example, maybe the following would work: when the data is inserted into the database, the changes are also written to a file, and the loop continuously checks that file for an update. Thoughts?
I have table1 and I want to see if a specific row (based on a UID/key) has been updated since the last time I checked, as well as continuously check for 60 seconds whether the record gets updated.
I'm thinking i can do this using the INFORMATION_SCHEMA database.
This database contains information about tables, views, columns, etc.
attempt at a solution:
<?php
$timer = time() + 60; //poll for up to 60 seconds
$KEY=$_POST['KEY'];
$done=0;
if(isset($KEY)){
//login stuff
require_once('Connections/check.php');
$mysqli = mysqli_connect($hostname_check, $username_check, $password_check,$database_check);
if (mysqli_connect_errno($mysqli))
{ echo "Failed to connect to MySQL: " . mysqli_connect_error(); }
//end login
$query = "SELECT data1, data2
FROM station
WHERE client = $KEY
AND noted = 0;";
$update=" UPDATE station
SET noted=1
WHERE client = $KEY
AND noted = 0;";
while($done==0) {
$result = mysqli_query($mysqli, $query);
mysqli_query($mysqli, $update); //do not overwrite $update: the SQL string is reused on the next loop iteration
$row_cnt = mysqli_num_rows($result);
if ($row_cnt > 0) {
$row = mysqli_fetch_array($result);
echo 'data1:'.$row['data1'].'/';
echo 'data2:'.$row['data2'].'/';
print $row[0];
$done=1;
}
else {
$current = time();
if($timer > $current){ $done=0; sleep(1); } //so if I haven't had a result update i want to loop back an check again for 60seconds
else { $done=1; echo 'done:nochange';}//60seconds pass end loop
}}
mysqli_close($mysqli);
echo 'time:'.time();
}
else {echo 'error:nokey';}
?>
Is this an adequate method? Any suggestions to improve the speed as well as the reliability?
If I understand your application correctly, your client is a microcontroller. It issues an HTTP request to your PHP / MySQL web app once in a while. The frequency of that request is up to the microcontroller, but seems to be once a minute or so.
The request basically asks, "dude, got anything new for me?"
Your web app needs to send the answer, "not now" or "here's what I have."
Another part of your app is providing the information in question. And it's doing so asynchronously with your microcontroller (that is, whenever it wants to).
To make the microcontroller query efficient is your present objective.
(Note, if I have any of these assumptions wrong, please correct me.)
Your table will need a last_update column, a microcontroller_id column or the equivalent, and a notified column. Just for grins, let's also put in value1 and value2 columns. You haven't told us what kind of data you're keeping in the table.
Your software which updates the table needs to do this:
UPDATE theTable
SET notified=0, last_update = now(),
value1=?data,
value2=?data
WHERE microcontroller_id = ?microid
It can do this as often as it needs to. The new data values replace and overwrite the old ones.
Your software which handles the microcontroller request needs to do this sequence of queries:
START TRANSACTION;
SELECT value1, value2
FROM theTable
WHERE notified = 0
AND microcontroller_id = ?microid
FOR UPDATE;
UPDATE theTable
SET notified=1
WHERE microcontroller_id = ?microid;
COMMIT;
This will retrieve the latest value1 and value2 items (your application's data, whatever it is) from the database, if it has been updated since last queried. Your php program which handles that request from the microcontroller can respond with that data.
If the SELECT statement returns no rows, your php code responds to the microcontroller with "no changes."
This all assumes microcontroller_id is a unique key. If it isn't, you can still do this, but it's a little more complicated.
Notice we didn't use last_update in this example. We just used the notified flag.
If you want to wait until sixty seconds after the last update, it's possible to do that. That is, if you want to wait until value1 and value2 stop changing, you could do this instead.
START TRANSACTION;
SELECT value1, value2
FROM theTable
WHERE notified = 0
AND last_update <= NOW() - INTERVAL 60 SECOND
AND microcontroller_id = ?microid
FOR UPDATE;
UPDATE theTable
SET notified=1
WHERE microcontroller_id = ?microid;
COMMIT;
For these queries to be efficient, you'll need this index:
(microcontroller_id, notified, last_update)
In this design, you don't need to have your PHP code poll the database in a loop. Rather, you query the database when your microcontroller checks in for an update.
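The first transaction sequence above could be wrapped in PHP roughly as follows. This is a sketch under assumptions: the function name fetchPendingUpdate() is made up, PDO is used instead of the question's mysqli calls, and an in-memory SQLite table (assuming the pdo_sqlite extension) stands in for theTable so the code runs on its own. Since SQLite does not accept FOR UPDATE, the clause is only appended when the driver is MySQL.

```php
<?php
/**
 * Hypothetical handler for the microcontroller's request.
 * Returns ['value1' => ..., 'value2' => ...] or null when nothing is new.
 */
function fetchPendingUpdate(PDO $pdo, int $microId): ?array
{
    // FOR UPDATE is MySQL/InnoDB syntax; SQLite (used only for the demo) rejects it.
    $lock = $pdo->getAttribute(PDO::ATTR_DRIVER_NAME) === 'mysql' ? ' FOR UPDATE' : '';

    $pdo->beginTransaction();
    try {
        $select = $pdo->prepare(
            'SELECT value1, value2 FROM theTable
              WHERE notified = 0 AND microcontroller_id = :id' . $lock
        );
        $select->execute([':id' => $microId]);
        $row = $select->fetch(PDO::FETCH_ASSOC);
        $select->closeCursor();

        if ($row !== false) {
            // Something new was found: mark it as delivered.
            $mark = $pdo->prepare('UPDATE theTable SET notified = 1 WHERE microcontroller_id = :id');
            $mark->execute([':id' => $microId]);
        }
        $pdo->commit();
        return $row === false ? null : $row;
    } catch (Throwable $e) {
        $pdo->rollBack();
        throw $e;
    }
}

// Demo data: one row for a made-up microcontroller id 7.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE theTable (microcontroller_id INTEGER PRIMARY KEY,
    value1 TEXT, value2 TEXT, notified INTEGER, last_update TEXT)');
$pdo->exec("INSERT INTO theTable VALUES (7, 'a', 'b', 0, '2024-01-01 00:00:00')");

$firstPoll = fetchPendingUpdate($pdo, 7);   // the new values
$secondPoll = fetchPendingUpdate($pdo, 7);  // null: already delivered
echo json_encode($firstPoll), PHP_EOL;
echo $secondPoll === null ? 'no changes' : 'changed', PHP_EOL; // no changes
```

The PHP script that answers the microcontroller would call this once per request and print either the values or a "no changes" marker.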
If all table1 changes are handled by PHP, then there's no reason to poll the database. Add the logic you need at the PHP level when you're updating table1.
For example (assuming OOP):
public function update() {
if ($row->modified > (time() - 60)) {
// perform code for modified in last 60 seconds
}
// run mysql queries
}
I have an application written in CakePHP version 1.2, and it was very slow because of heavy, unoptimized database queries. This is an example of one of them:
$pedidos_entregas = $this->Pedido->query('select pedidos.*, lojas.*, pessoas.*, formas_pagamentos.* from pedidos inner join veiculos_periodos
on pedidos.veiculos_periodo_id = veiculos_periodos.id inner join lojas
on veiculos_periodos.loja_id = lojas.id inner join pessoas
on pessoas.id = pedidos.pessoa_id inner join formas_pagamentos
on pedidos.formas_pagamento_id = formas_pagamentos.id
where
(finalizado = 1 or pedidos.id in
(
select pedido_id from status_pedidos where statu_id = 11
)
) order by entrega desc limit 200;');
I applied a 30-minute cache, and site performance improved a lot. But after the 30 minutes expire, one user will have to load the page slowly while the cache is filled again.
In each controller action that uses the cache, I read the time remaining until the cache entry expires:
$vencimento = file_get_contents(CACHE . 'cake_siv_financeiro_pedidos_entregas');
$vencimento = explode("\n", $vencimento);
$vencimento = $vencimento[0];
$agora = strtotime('now');
$faltam = ($vencimento - $agora)/60; //remaining time
echo $faltam;
With that, before the 30 minutes are up (when 10 minutes or less remain, for example), if someone accesses the page, the cache can be refreshed ahead of time.
But still, a user will have to view the page slowly, because the query has to be done.
My question is: how can I run a function after the view has been rendered for the user? I want to do something like this, but it does not work:
public function afterFilter()
{
parent::afterFilter();
//$this->atualizar_log();
$saida = $this->output;
$this->output = "";
ob_start();
print $saida;
flush();
ob_end_flush();
//I need that sleep after html returned to browser
sleep(500);
}
I have a second question. Say I have the following table:
table people
id (PK) name age
1 bob 20
2 ana 19
3 maria 50
and I run the following SQL:
UPDATE people SET age = 20 where id <3
This will affect the rows with IDs 1 and 2.
How, in CakePHP, can I get the affected ids (1 and 2) after the update?
This is necessary so that I can delete the existing caches.
It's not possible to execute code after a request. The best approach is to set up a cron job. You'll want to hook up a Cake Shell to cron - see http://book.cakephp.org/1.2/view/846/Running-Shells-as-cronjobs
If you can't use cron for whatever reason, consider having your clients fire an AJAX request to an action which updates the cache. This will happen after page load so there won't be a delay for the user.
edit: linked to 1.2 version of docs
Regarding the second question:
I don't know if there is a CakePHP way to do it, but you can still use MySQL statements in the query() method:
$sql = "SET @affids := '';
UPDATE people
SET age = 20
WHERE id < 3
AND ( SELECT @affids := CONCAT_WS(',', @affids, id) );
SELECT TRIM(LEADING ',' FROM @affids) AS affected; ";
$rs = $this->MODELNAME->query($sql);
debug($rs);
The query returns a comma-separated list of the ids affected by the update.
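If sending several statements in one query() call is not possible in your setup, a more portable alternative is to read the matching ids first and update afterwards, inside one transaction. The sketch below is not the user-variable technique above and does not use CakePHP's model layer; it uses plain PDO, a made-up function name updateAndListIds(), and an in-memory SQLite copy of the people table (assuming the pdo_sqlite extension) so it can run standalone.

```php
<?php
/**
 * Hypothetical helper: sets the age for all rows with id < $maxId
 * and returns the ids of the rows that matched.
 */
function updateAndListIds(PDO $pdo, int $maxId, int $age): array
{
    // On MySQL/InnoDB the locking read keeps the matched rows stable until commit.
    $lock = $pdo->getAttribute(PDO::ATTR_DRIVER_NAME) === 'mysql' ? ' FOR UPDATE' : '';

    $pdo->beginTransaction();
    try {
        $select = $pdo->prepare('SELECT id FROM people WHERE id < :maxId ORDER BY id' . $lock);
        $select->execute([':maxId' => $maxId]);
        $ids = array_map('intval', $select->fetchAll(PDO::FETCH_COLUMN));

        $update = $pdo->prepare('UPDATE people SET age = :age WHERE id < :maxId');
        $update->execute([':age' => $age, ':maxId' => $maxId]);

        $pdo->commit();
        return $ids;
    } catch (Throwable $e) {
        $pdo->rollBack();
        throw $e;
    }
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)');
$pdo->exec("INSERT INTO people VALUES (1, 'bob', 20), (2, 'ana', 19), (3, 'maria', 50)");

$ids = updateAndListIds($pdo, 3, 20);
echo implode(',', $ids), PHP_EOL; // 1,2
```

Note that this lists the rows matched by the WHERE clause, including bob, whose age was already 20. Adding a condition such as age <> :age to the SELECT would restrict the list to rows whose value actually changes.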
I am running 10 PHP scripts at the same time, processing in the background on Linux.
For example:
$i = 1;
while ($i <= 10) {
exec("/usr/bin/php-cli run-process.php > /dev/null 2>&1 & echo $!");
sleep(10);
$i++;
}
In run-process.php, I am having a problem with the database loop. One of the processes might already have updated the status field to 1, but the other PHP script processes don't seem to see it. For example:
$SQL = "SELECT * FROM data WHERE status = 0";
$query = $db->prepare($SQL);
$query->execute();
while ($row = $query->fetch(PDO::FETCH_ASSOC)) {
$SQL2 = "SELECT status from data WHERE number = " . $row['number'];
$qCheckAgain = $db->prepare($SQL2);
$qCheckAgain->execute();
$tempRow = $qCheckAgain->fetch(PDO::FETCH_ASSOC);
//already updated from other processs?
if ($tempRow['status'] == 1) {
continue;
}
doCheck($row);
sleep(2);
}
How do I ensure that the processes do not redo the same data?
When you have multiple processes, you need to have each process take "ownership" of a certain set of records. Usually you do this by doing an update with a limit clause, then selecting the records that were just "owned" by the script.
For example, have a field that specifies if the record is available for processing (i.e. a value of 0 means it is available). Then your update would set the value of the field to the script's process ID, or some other number unique to the process. Then you select on the process ID. When you're done processing, you can set it to a "finished" number, like 1. Update, Select, Update, repeat.
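The Update, Select, Update cycle described above might look like the following sketch. Assumptions: the function names claimBatch() and finish() are made up, the owner numbers 1001 and 1002 are arbitrary values that must differ from 0 (available) and 1 (finished), and an in-memory SQLite table (assuming the pdo_sqlite extension) replaces the real data table so the code runs standalone. The claim uses a subquery with LIMIT because SQLite builds usually lack UPDATE ... LIMIT; on MySQL you would use UPDATE ... WHERE status = 0 LIMIT n directly, since MySQL rejects LIMIT inside an IN subquery.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE data (number INTEGER PRIMARY KEY, status INTEGER NOT NULL DEFAULT 0)');
$pdo->exec('INSERT INTO data (number) VALUES (1), (2), (3), (4), (5)');

/** Hypothetical helper: claim up to $batchSize available rows for $owner and return them. */
function claimBatch(PDO $pdo, int $owner, int $batchSize): array
{
    // Update: mark available rows (status = 0) as owned by this process.
    $claim = $pdo->prepare(
        'UPDATE data SET status = :owner
          WHERE number IN (SELECT number FROM data WHERE status = 0 ORDER BY number LIMIT :n)'
    );
    $claim->bindValue(':owner', $owner, PDO::PARAM_INT);
    $claim->bindValue(':n', $batchSize, PDO::PARAM_INT);
    $claim->execute();

    // Select: read back only the rows this process owns.
    $mine = $pdo->prepare('SELECT number FROM data WHERE status = :owner ORDER BY number');
    $mine->execute([':owner' => $owner]);
    return array_map('intval', $mine->fetchAll(PDO::FETCH_COLUMN));
}

/** Hypothetical helper: mark one row as finished (status = 1). */
function finish(PDO $pdo, int $number): void
{
    $pdo->prepare('UPDATE data SET status = 1 WHERE number = :n')->execute([':n' => $number]);
}

$batchA = claimBatch($pdo, 1001, 2); // first worker
$batchB = claimBatch($pdo, 1002, 2); // second worker gets different rows
foreach ($batchA as $number) {
    finish($pdo, $number);
}
echo implode(',', $batchA), ' | ', implode(',', $batchB), PHP_EOL; // 1,2 | 3,4
```

Because the claim is a single UPDATE statement, two workers cannot both end up owning the same row.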
The reason why your script processes the same rows multiple times is the parallelisation you are creating. Process 1 reads from the database, Process 2 reads from the database, and both start to process their data.
Databases provide transactions in order to get rid of such race conditions. Have a look at what PDO provides for handling database transactions.
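For reference, PDO's transaction methods are beginTransaction(), commit() and rollBack(). A minimal sketch of how they fit together is shown below; the wrapper name runInTransaction() is made up, and an in-memory SQLite table (assuming the pdo_sqlite extension) is used only so the example runs standalone. A transaction by itself does not stop two processes from reading the same row, so in practice it would be combined with a locking read or a conditional UPDATE.

```php
<?php
/** Hypothetical wrapper: run $work inside a transaction, rolling back on any error. */
function runInTransaction(PDO $pdo, callable $work)
{
    $pdo->beginTransaction();
    try {
        $result = $work($pdo);
        $pdo->commit();
        return $result;
    } catch (Throwable $e) {
        $pdo->rollBack(); // undo everything done since beginTransaction()
        throw $e;
    }
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE data (number INTEGER PRIMARY KEY, status INTEGER NOT NULL DEFAULT 0)');
$pdo->exec('INSERT INTO data (number) VALUES (1), (2)');

// Succeeds: the change is committed.
runInTransaction($pdo, function (PDO $pdo) {
    $pdo->exec('UPDATE data SET status = 1 WHERE number = 1');
});

// Fails half-way: the change is rolled back.
try {
    runInTransaction($pdo, function (PDO $pdo) {
        $pdo->exec('UPDATE data SET status = 1 WHERE number = 2');
        throw new RuntimeException('simulated failure while processing');
    });
} catch (RuntimeException $e) {
    // expected in this demo
}

$done = (int) $pdo->query('SELECT COUNT(*) FROM data WHERE status = 1')->fetchColumn();
echo $done, PHP_EOL; // 1
```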
I am not entirely sure how or what you are processing.
You can introduce a limit clause and pass it as a parameter, so the first process does the first 10, the second does the next 10, and so on.
You need a lock such as "SELECT ... FOR UPDATE".
InnoDB supports row-level locks.
See http://dev.mysql.com/doc/refman/5.0/en/innodb-locking-reads.html for details.
When there are multiple PHP scripts running in parallel, each making an UPDATE query to the same record in the same table repeatedly, is it possible for there to be a 'lag time' before the table is updated with each query?
I have basically 5-6 instances of a PHP script running in parallel, having been launched via cron. Each script gets all the records in the items table, and then loops through them and processes them.
However, to avoid processing the same item more than once, I store the id of the last item being processed in a separate table. So this is how my code works:
function getCurrentItem()
{
$sql = "SELECT currentItemId from settings";
$result = $this->db->query($sql);
return $result->get('currentItemId');
}
function setCurrentItem($id)
{
$sql = "UPDATE settings SET currentItemId='$id'";
$this->db->query($sql);
}
$currentItem = $this->getCurrentItem();
$sql = "SELECT * FROM items WHERE status='pending' AND id > $currentItem";
$result = $this->db->query($sql);
$items = $result->getAll();
foreach ($items as $i)
{
//Check if $i has been processed by a different instance of the script, and if so,
//leave it untouched.
if ($this->getCurrentItem() > $i->id)
continue;
$this->setCurrentItem($i->id);
// Process the item here
}
But despite all the precautions, most items are being processed more than once. This makes me think that there is some lag time between the update queries being run by the PHP script and the moment the database actually updates the record.
Is that true? And if so, what other mechanism should I use to ensure that the PHP scripts always get only the latest currentItemId even when there are multiple scripts running in parallel? Would using a text file instead of the db help?
If this is run in parallel, there is little in this code to prevent race conditions.
script1:
getCurrentItem() yields Id 1234
...context switch to script2, before script 1 gets to run its update statement.
script2:
getCurrentItem() yields Id 1234
And both scripts process Id 1234
You'd want to make updating and checking the status of the item an all-or-nothing operation. You don't need the settings table; you'd do something like this (pseudo code):
SELECT * FROM items WHERE status='pending' AND id > $currentItem
foreach($items as $i) {
rows = update items set status='processing' where id = $i->id and status='pending';
if(rows == 0) //someone beat us to it and is already processing the item
continue;
process item..
update items set status='done' where id = $i->id;
}
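A runnable version of the pseudo code above could look like this. It is a sketch under assumptions: the function name tryClaim() is made up, PDO is used for database access, and an in-memory SQLite table (assuming the pdo_sqlite extension) stands in for the items table so the code runs standalone. The same UPDATE statement works unchanged on MySQL.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE items (id INTEGER PRIMARY KEY, status TEXT NOT NULL)');
$pdo->exec("INSERT INTO items VALUES (1, 'pending'), (2, 'pending')");

/** Hypothetical helper: returns true only for the one caller that flips the row to 'processing'. */
function tryClaim(PDO $pdo, int $id): bool
{
    $stmt = $pdo->prepare(
        "UPDATE items SET status = 'processing' WHERE id = :id AND status = 'pending'"
    );
    $stmt->execute([':id' => $id]);
    return $stmt->rowCount() === 1; // 0 rows: someone beat us to it
}

// Two scripts racing for item 1, simulated here as two consecutive calls.
$script1 = tryClaim($pdo, 1);
$script2 = tryClaim($pdo, 1);
var_dump($script1); // bool(true)
var_dump($script2); // bool(false)

if ($script1) {
    // ... process the item here ...
    $pdo->prepare("UPDATE items SET status = 'done' WHERE id = :id")->execute([':id' => 1]);
}
```

The check and the status change happen in one statement, so there is no window between reading the status and writing it in which another script could slip in.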
What you need is for any thread to be able to:
find a pending item
record that that item is now being worked on (in the settings table)
And it needs to do both of those in one go, without any other thread interfering half-way through.
I recommend putting the whole SQL in a stored procedure; that will be able to run the entire thing as a single transaction, which makes it safe from competing threads.