I have a case in which I need to sync an existing external table with the website's table every few minutes.
I previously used a simple foreach that looped through every record. As the table grew it became slower and slower, and it now takes a long time for around 20.000 records.
I want to make sure it creates a new record or updates an existing one.
This is what I got but it doesn't seem to update the existing rows.
$no_of_data = RemoteUser::count(); // 20.000 (example)
$webUserData = array();
for ($i = 0; $i < $no_of_data; $i++) {
// I check the external user so i can match it.
$externalUser = RemoteUser::where('UserID', $i)
->first();
if($externalUser) {
$webUserData[$i]['username'] = $externalUser->username;
$webUserData[$i]['user_id'] = $externalUser->UserID;
}
}
$chunk_data = array_chunk($webUserData, 1000);
if (isset($chunk_data) && !empty($chunk_data)) {
foreach ($chunk_data as $chunk_data_val) {
\DB::table('WebUser')->updateOrInsert($chunk_data_val);
}
}
Is there something I am missing or is this the wrong approach?
Thanks in advance
I'll try to give a complete, all-in-one answer covering some possible event-driven solutions. Ideally, you would move from statically checking each and every row to an event-driven setup in which each entry announces its own changes.
I won't list solutions per database here; I'll use MySQL by default.
I see three possible solutions:
triggers, an in-database solution if only one and the same database instance is at play
Eloquent events, an option if the Eloquent models are created or modified in one place only
MySQL replication, which can catch the events if modifications occur outside of the application (multiple applications modify the same database).
Using triggers
If you are syncing data within the same database instance (different databases) or the same database process (same database), and the data you copy doesn't need intervention by an external interpreter, you can use SQL, or any extension of SQL supported by your database, to set up triggers or prepared statements.
I assume you're using MySQL. If not, SQL triggers are quite similar across all well-known databases that support SQL.
A trigger structure has a simple layout like:
CREATE TRIGGER trigger_name
AFTER UPDATE ON table_name
FOR EACH ROW
body_to_execute
Where AFTER UPDATE is the event to catch in this example.
For an update event we want to know the data as it is after the change, so we use the AFTER UPDATE trigger.
An AFTER UPDATE trigger for your tables, with remote_user as the original and web_user as the copy, and user_id and username as the fields, would look something like
CREATE TRIGGER user_updated
AFTER UPDATE ON remote_user
FOR EACH ROW
UPDATE web_user
SET username = NEW.username
WHERE user_id = NEW.user_id;
The variables NEW and OLD are available in triggers: NEW holds the row data after the update and OLD the data before it.
For a newly inserted user the procedure is the same; we just need to create the entry in web_user.
CREATE TRIGGER user_created
AFTER INSERT ON remote_user
FOR EACH ROW
INSERT INTO web_user(user_id, username)
VALUES(NEW.user_id, NEW.username);
Hope this gives you a clear idea of how to use triggers with SQL. There is a lot of information to be found: guides, tutorials, you name it. SQL might be an old boring language created by old people with long beards, but knowing its features gives you a great advantage in solving complicated problems with simple methods.
Using Eloquent events
Laravel has a bunch of Eloquent events that get triggered when models do stuff. If models (entries in the database) are only created or modified in one place (e.g. one entry point or application), Eloquent events could be an option.
This means that you have to guarantee that every modification and/or creation goes through the Eloquent model:
Model::create([...]);
Model::find(1)->update([...]);
$model->save();
// etc
And not indirectly using DB or similar:
// won't trigger any event
DB::table('remote_users')->where('id', 1)->update([...]);
Also avoid saveQuietly() or any other model method that has been built deliberately to suppress events.
The simplest solution is to register the events directly in the model itself, using the protected static booted method.
namespace App\Models;
use Illuminate\Database\Eloquent\Model;
class SomeModel extends Model {
protected static function booted() {
static::updated(function($model) {
// access any database or service
});
static::created(function($model) {
// access any database or service
});
}
}
To put the callback on a queue, Laravel 8 and up offer the queueable function to utilize the queue.
// requires: use function Illuminate\Events\queueable;
static::updated(queueable(function($ThisModel) {
// access any database or service
}));
On Laravel 7 or lower, it would be wise to create an observer and push everything onto the queue using jobs.
Example based on your comment
If a model is present for both databases, the Eloquent events could be used as follows, where InternalModel is the main model that triggers the events (the source) and ExternalModel is the model whose database is to be kept in sync (the sync table or replica).
namespace App\Models;
use Illuminate\Database\Eloquent\Model;
class InternalModel extends Model {
protected static function booted() {
static::updated(function($InternalModel) {
ExternalModel::find($InternalModel->id)->update([
'whatever-needs' => 'to-be-updated'
]);
});
static::created(function($InternalModel) {
ExternalModel::create([
'whatever-is' => 'required-to-create',
'the-external' => 'model'
]);
});
static::deleted(function($InternalModel) {
// note that only the $InternalModel object is left; the entry in the database no longer exists.
ExternalModel::destroy($InternalModel->id);
});
}
}
And remember to use queueable() to put the work on the queue if it might take longer than expected.
If for some reason the InternalModel table gets updated without going through Eloquent, you can trigger each Eloquent event manually, provided the event() helper is accessible, to keep the sync process functional, e.g.
$model = InternalModel::find($updated_id);
// trigger the update manually
event('eloquent.updated: ' . $model::class, $model);
All Eloquent events related to the models can be triggered in such a way, so: retrieved, creating, created, updating, updated, saving, saved, deleting and so on.
I would also suggest creating an additional console command to run the sync process once before switching over to the Eloquent model events. Such a command is like the foreach you already used: it checks once whether all data is synced, something like php artisan users:sync. It can also help when events occasionally fail to trigger because of exceptions; this is rare, but it does happen once in a while.
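As a rough sketch of what the body of such a one-off sync command could look like: updateOrInsert() in the question works on a single row (attributes to match, values to set), so handing it a chunk of rows does not perform a bulk upsert. The helper below is hypothetical (syncInChunks is not a Laravel function); it only batches rows and hands each batch to a callback, so the actual write can be whatever bulk operation your Laravel version offers. The column names are taken from the question.

```php
<?php
declare(strict_types=1);

/**
 * Map remote rows to web rows and pass them to $upsert in batches.
 * Returns the number of rows handed over. Hypothetical helper, not part of Laravel.
 */
function syncInChunks(iterable $remoteRows, callable $upsert, int $chunkSize = 1000): int
{
    $batch = [];
    $total = 0;

    foreach ($remoteRows as $row) {
        // Column names follow the question: UserID/username on the remote side.
        $batch[] = ['user_id' => $row['UserID'], 'username' => $row['username']];

        if (count($batch) === $chunkSize) {
            $upsert($batch);
            $total += count($batch);
            $batch = [];
        }
    }

    // Flush the last, possibly smaller, batch.
    if ($batch !== []) {
        $upsert($batch);
        $total += count($batch);
    }

    return $total;
}

// Possible wiring inside a Laravel 8+ command (assumption: a unique index on WebUser.user_id):
// syncInChunks(
//     RemoteUser::query()->select('UserID', 'username')->cursor(),
//     function (array $rows) {
//         DB::table('WebUser')->upsert($rows, ['user_id'], ['username']);
//     }
// );
```

This reads the remote table in one pass instead of issuing one query per id, and writes in batches of 1000. On MySQL, upsert() relies on the table's primary or unique index, so without such an index on user_id every row would be inserted again.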
MySQL Replication
If triggers aren't a solution and you can't guarantee that the data is modified from one single source, replication would be my final solution.
Someone created a package for Laravel called huangdijia/laravel-trigger, which uses krowinski/php-mysql-replication or the more up-to-date fork moln/php-mysql-replication.
A few things need to be configured though:
Firstly, MySQL should be configured to save all events in a log file so they can be read.
server-id = 1
log_bin = /var/log/mysql/mysql-bin.log
expire_logs_days = 1
max_binlog_size = 100M
binlog_row_image = full
binlog-format = row
Secondly, the database user connected with the database should be granted replication privileges:
GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'user'@'host';
GRANT SELECT ON `dbName`.* TO 'user'@'host';
The general idea here is to read a log file that MySQL generates about the events that occur. Writing this answer took me a while longer because I couldn't get this package up and running within a few minutes. Though I have used it in the past and know it worked flawlessly, I wrote a smaller package which would minimize the traffic and filter out events I didn't use.
I've already opened an issue and I'm going to open several over time to get this thing up and running again.
But to give an idea of its usefulness, I'm going to explain how it works anyway.
To configure an event, listeners are put in a routes file called routes/trigger.php, where you have access to the $trigger instance (manager) to bind your listeners.
If we would put this into context of your tables, a listener would look like
$trigger->on('database_name.remote_users', 'update', function($event) {
// event will contain an `EventInfo` object with changed entry data.
});
Same would go for create (write) events on the table
$trigger->on('database_name.remote_users', 'write', function($event) {
// event will contain an `EventInfo` object with changed entry data.
});
To start listening for database events, use
php artisan trigger:start
To get a list of all listeners recognized from the routes/trigger.php use
php artisan trigger:list
To get a status of which bin file has been recognized and its current position use
php artisan trigger:status
In an ideal situation you would use supervisor to run the listener (artisan trigger:start) in the background. If the service needs to boot again because of updates made to your application, you can simply use php artisan trigger:terminate to stop it; supervisor will notice and start it again with a freshly booted application.
Update on package status
They seem to respond very well and some things have already been fixed. I can say with confidence that this package will be up and running again in a few weeks.
Normally I don't put anything in my answers that I haven't used or tested myself. But since I know this worked before, I'm giving it the benefit of the doubt that it will work again in the next several weeks. It's at least something to watch, or even to test, to get ideas on how to implement this in a real-world scenario.
Hope you enjoyed reading.
Related
I'm running into a problem that's out of my depth regarding MySQL database transactions, locking, and Laravel's Eloquent ORM. Thanks in advance for any assistance.
Laravel 5.5
PHP 7.2.25
MySQL 8.0.13 (InnoDB)
My (simplified) problem is this: I have a tasks table in my MySQL database, an endpoint to retrieve the next task, and an endpoint to mark a task as finished or inactive. The tasks table is meant to act as a sort of queue, allowing users to pull the next available task that no one else is currently working on (read: task is inactive). If a task was pulled but no update has been made in 3 hours, it should again be visible to the retrieval endpoint (in case a user quit the app unexpectedly, etc.).
My problem arises with the task-retrieval endpoint. I'm using DB::transaction and lockForUpdate() here to prevent concurrency issues (i.e. two users fetching the same task before it gets marked active). It works as expected when the request per second volume is relatively low, properly fetching a unique task for each request. At higher request frequency, though, the entire tasks table eventually locks, requests start to time out, and the database becomes unresponsive until restarted. Here's my code:
// Select the next task in the queue
$task = DB::transaction(
function () {
$t = Task::where('finished', '=', false)
->where(function ($q) {
// Task must not be active for another user
$q->where('active', '=', false)
->orWhere('updated_at', '<', DB::raw('DATE_SUB(NOW(), INTERVAL 3 HOUR)'));
})
// Prefer tasks that haven't been seen yet
->orderBy('attempts')
->lockForUpdate()
->first();
if (null !== $t) {
// Set the task as active and increment attempts
$t->active = true;
$t->attempts++;
$t->save();
}
return $t;
}
);
Some additional context:
Other requests at higher frequency with similar DB reads have no issues, so the problem here is definitely related to the transaction/locking behavior, not some other system configuration (MySQL connections, PHP threads, etc.)
The Task Eloquent model has standard Laravel timestamps, so updated_at is set automatically as a part of the save() method
Is there something obvious I'm missing here, or am I completely off base? Is what I'm attempting even possible, or do I need to move out of MySQL to a more traditional queuing system (Redis, SQS, etc.)? I'd have no problem switching to raw SQL queries rather than using the ORM if that would make this possible. Thanks for any and all guidance!
I don't know this for sure but the suggestion is too long for a comment.
I think the lockForUpdate method may cause the row to be locked even for consideration during the next search. That means one search cannot complete before the previous update has been committed. (i.e. in the first search you select row A which is inactive, then in the second search you look for all inactive rows, of which row A is still one, but you can't finish the query until row A has been unlocked).
To get around this you could try adding SKIP LOCKED (see docs). This would exclude the locked row A from the second search and allow it to return results before the previous transaction was committed.
I cannot see a method in Laravel to add this, so you would need to add the locking clauses manually using the raw query builder or perhaps ->whereRaw would work.
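A minimal sketch of the SKIP LOCKED idea using plain PDO, so it does not depend on what the query builder supports. It assumes MySQL 8.0+ (where SKIP LOCKED is available), a PDO connection in exception error mode, and the column names from the question; claimNextTask and CLAIM_NEXT_TASK_SQL are hypothetical names introduced here.

```php
<?php
declare(strict_types=1);

// Assumed schema: tasks(id, finished, active, attempts, updated_at).
const CLAIM_NEXT_TASK_SQL = <<<'SQL'
SELECT id FROM tasks
WHERE finished = 0
  AND (active = 0 OR updated_at < DATE_SUB(NOW(), INTERVAL 3 HOUR))
ORDER BY attempts
LIMIT 1
FOR UPDATE SKIP LOCKED
SQL;

/**
 * Claim the next available task and return its id, or null if none is free.
 * Rows locked by other in-flight transactions are skipped instead of waited on.
 */
function claimNextTask(PDO $pdo): ?int
{
    $pdo->beginTransaction();
    try {
        $id = $pdo->query(CLAIM_NEXT_TASK_SQL)->fetchColumn();

        if ($id === false) {
            // Nothing claimable right now.
            $pdo->commit();
            return null;
        }

        // Mark the task as active while we still hold the row lock.
        $update = $pdo->prepare(
            'UPDATE tasks SET active = 1, attempts = attempts + 1, updated_at = NOW() WHERE id = ?'
        );
        $update->execute([$id]);

        $pdo->commit();
        return (int) $id;
    } catch (Throwable $e) {
        $pdo->rollBack();
        throw $e;
    }
}
```

Within Laravel, the query builder's lock() method also accepts a raw string, so ->lock('FOR UPDATE SKIP LOCKED') in place of ->lockForUpdate() may be enough; that is worth verifying on your version. Because a locking read locks the rows it scans, an index that matches the filter and sort may also reduce how much of the table gets locked.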
I observed a strange behavior of my doctrine object. In my symfony project I'm using the Doctrine ORM to save my data in a mysql database. This works normally in most situations. I'm also using gearman in my project, a framework that allows applications to complete tasks in parallel. I have a gearman job-server running on the same machine as my apache, and I have registered a gearman worker on that machine in a separate 'screen' session using the screen window manager. This way I always have access to the standard console output of the function registered for the gearman-worker.
In the gearman-worker function I'm invoking, I have access to the doctrine object via $doctrine = $this->getContainer()->get('doctrine') and it works almost normally. But when I have changed some data in my database, doctrine still uses the old data that was stored in the database before. I'm totally confused, because I expected that by calling:
$repo = $doctrine->getRepository("PackageManagerBundle:myRepo");
$dbElement = $repo->findOneById($Id);
I would always get the current data entries from my database. This looks like a strange caching behavior, but I have no clue what I've done wrong.
I can solve this problem by registering the gearman worker and function anew:
$worker = new \GearmanWorker();
$worker->addServer();
$worker->addFunction
After that I get the current state of my database again, until I change something else. I'm observing this behavior only in my gearman worker function. In the rest of the application everything is synchronized with my database and works normally.
This is what I think may be happening. Could be wrong though.
A gearman worker is going to be a long-running process that picks up jobs to do. The first job it gets will cause doctrine to load the entity from the database into its identity map. For the second job the worker receives, however, doctrine will not perform a database lookup; it will instead check its identity map, find that the object is already loaded, and simply return the one from memory. If something else, external to the worker process, has altered the database record, then you'll end up with an object that is out of date.
You can tell doctrine to drop objects from its identity map, then it will perform a database lookup. To enforce loading objects from the database again instead of serving them from the identity map, you should use EntityManager#clear().
More info here:
https://www.doctrine-project.org/projects/doctrine-orm/en/2.6/reference/working-with-objects.html#entities-and-the-identity-map
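One way to apply this in a worker, sketched under the assumption that every job should start with an empty identity map: wrap the job handler so that clear() runs before each job. withFreshEntities is a hypothetical helper name; the only thing it requires from the entity manager is a clear() method, which Doctrine's EntityManager provides.

```php
<?php
declare(strict_types=1);

/**
 * Wrap a job handler so the entity manager is cleared before every job.
 * $entityManager is expected to be Doctrine's EntityManager (only clear() is used).
 */
function withFreshEntities(object $entityManager, callable $handler): callable
{
    return function (...$args) use ($entityManager, $handler) {
        // Detach all managed entities so the next find() hits the database.
        $entityManager->clear();

        return $handler(...$args);
    };
}

// Possible wiring in the worker (the function name 'myJob' and $handler are placeholders):
// $worker->addFunction('myJob', withFreshEntities($doctrine->getManager(), $handler));
```

Clearing detaches every managed entity, so any object kept from a previous job must be fetched again before it is modified and flushed.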
I want to keep a log of my database whenever a record is inserted into, updated in, or deleted from any table in MySQL with PHP. Please give me some ideas.
You've used the laravel tag, so I assume you want to find a 'Laravel way' to do it.
The best practice is to use Eloquent Events.
Each time Laravel hits the DB through a model, it fires some of these events: creating, created, updating, updated, saving, saved, deleting, deleted, restoring, restored
The -ing events fire before hitting the DB; the -ed events fire right after hitting the DB.
Example:
public function boot()
{
User::created(function () {
Log::info('User was just created successfully');
});
}
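To cover inserts, updates and deletes in one go, the same registration can be repeated for the created, updated and deleted events. The sketch below does that with a hypothetical helper, logModelChanges, which takes the model class and a logging callback; it assumes the model exposes its primary key as id.

```php
<?php
declare(strict_types=1);

/**
 * Register a logging listener for the created, updated and deleted events
 * of the given model class. Hypothetical helper, not part of Laravel.
 */
function logModelChanges(string $modelClass, callable $log): void
{
    foreach (['created', 'updated', 'deleted'] as $event) {
        // Equivalent to User::created(...), User::updated(...), User::deleted(...).
        $modelClass::$event(function ($model) use ($modelClass, $event, $log) {
            $log("{$modelClass} {$event}", ['id' => $model->id ?? null]);
        });
    }
}

// Possible wiring in a service provider's boot() method:
// logModelChanges(User::class, function (string $message, array $context) {
//     Log::info($message, $context);
// });
```

As with any Eloquent event, this only sees changes made through the model; queries issued directly on the DB facade bypass it.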
Read this tutorial carefully; I think it will work for you.
http://web.stanford.edu/dept/its/communications/webservices/wiki/index.php/How_to_create_logs_with_PHP
One solution is writing a helper function that logs successful insert/update sql queries if you have full control over your PHP application. A helper function can be called on every instance of insert/update queries which logs the features of the sql query you have. Some features for example can be the table name, the record id, and the type of operation i.e. insert or update, along with the system date.
In the module I have developed and have the link for below, these features are extracted from the original sql query automatically. This is one solution to the logging problem.
Here is my working solution. See the repository's Readme file to see how to add the module to your PHP application
Here is how it works:
We pass our original sql query to a function in the logging module;
The function will extract sql features from the query;
It will save all features to the log table we have set up in the database, alongside the system date and IP (other features can be extracted too);
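To make the extraction step concrete, here is a small illustrative sketch. It is not the code of the module described above; extractSqlFeatures is a hypothetical function that pulls only the operation type and the table name out of a simple INSERT, UPDATE or DELETE statement with a regular expression.

```php
<?php
declare(strict_types=1);

/**
 * Extract the operation and table name from a simple write query.
 * Returns null for anything that is not a plain INSERT, UPDATE or DELETE.
 */
function extractSqlFeatures(string $sql): ?array
{
    $pattern = '/^\s*(INSERT\s+INTO|UPDATE|DELETE\s+FROM)\s+`?([\w.]+)`?/i';

    if (!preg_match($pattern, $sql, $matches)) {
        return null;
    }

    return [
        // First keyword only: "INSERT INTO" becomes "insert".
        'operation' => strtolower(preg_split('/\s+/', $matches[1])[0]),
        'table'     => $matches[2],
    ];
}
```

A regular expression like this only handles straightforward statements; multi-table updates or subqueries would need a real SQL parser.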
To see the logged data (as an interface to what we have logged):
We call the main function in the front-end side of the module;
This main function will get as parameter the number of most recent log records it should return;
It will present the log records neatly; for example, it will collapse repeated operations on the same record into one entry. The CSS styling can be adjusted as needed.
The whole interface can be minimized or restored using a show/hide button.
For our web application, written in Laravel, we use transactions to update the database. We have separated our data across different databases (for ease, let's say "app" and "user"). During an application update, an event is fired to update some user statistics in the user database. However, this application update may be called as part of a transaction on the application database, so the code structure looks something like the following.
DB::connection('app')->beginTransaction();
// ...
DB::connection('user')->doStuff();
// ...
DB::connection('app')->commit();
It appears that any attempt to start a transaction on the user connection (since a single query already creates an implicit transaction) while in an app transaction does not work, causing a deadlock (see InnoDB status output). I also used innotop to get some more information, but while it showed the lock wait, it did not show which query was blocking it. There was a lock present on the user table, but I could not find its origin. Relevant output is shown below:
The easy solution would be to pull out the user operation from the transaction, but since the actual code is slightly more complicated (the doStuff actually happens somewhere in a nested method called during the transaction and is called from different places), this is far from trivial. I would very much like the doStuff to be part of the transaction, but I do not see how the transaction can span multiple databases.
What is the reason that this situation causes a deadlock and is it possible to run the doStuff on the user database within this transaction, or do we have to find an entirely different solution like queueing the events for execution afterwards?
I have found a way to solve this issue using a workaround. My hypothesis was that it was trying to use the app connection to update the user database, but after forcing it to use the user connection, it still locked. Then I switched it around and used the app connection to update the user database. Since we have joins between the databases, we have database users with proper access, so that was no issue. This actually solved the problem.
That is, the final code turned out to be something like
DB::connection('app')->beginTransaction();
// ...
DB::connection('app')->table('user.users')->doStuff(); // Pay attention to this line!
// ...
DB::connection('app')->commit();
I have a demo app built in php/laravel with a sqlite database and dummy data that I want to add to my portfolio for users to test. I want users to be able to interact with the app and make changes; however, I do not want those changes to be committed to the database file.
Anyone know what the best way to achieve this would be?
I have seen a few projects on themeforest/codecanyon implement something similar to this.
You can't make changes to the database and show them to the users without those changes being persisted in the database. So if you want the users to be able to test out functionality that involves storing/updating/deleting entries in the database, your best bet is to reset the database periodically (like once an hour).
You can use Task Scheduling to run some seeding classes, or whatever method you prefer, to empty the database and repopulate it with the original dummy data. This is a simple example of what you can add to the schedule method in your app/Console/Kernel.php file to run database seeding every hour (the seeders themselves need to clear the old data first):
protected function schedule(Schedule $schedule)
{
$schedule->call(function () {
Artisan::call('db:seed');
})->hourly();
}