I'll preface this by admitting slight sleep-deprivation.
The setup is as follows:
API Endpoint (Server A) receives an incoming call, and adds this to a specific queue on the RabbitMQ Server (Server B).
RabbitMQ (Server B) is simply a RabbitMQ Queue Server. Nothing more, nothing less.
Laravel Installation (Server C) is our actual Laravel install, which is meant to look for jobs on specific queues and do things with them.
We have a RabbitMQ package in the Laravel install, which allows the use of the regular Laravel Queue mechanics over a RabbitMQ connection.
The issue I've come across is that we can spawn a worker for a queue - but since we're not generating the jobs passing a $job class (the job content itself is most often a JSON array), the Laravel install has no idea what to do with the job.
So my question revolves mainly around how to approach a scenario like this. I'm thinking that using the Queue-functionality in Laravel won't do what I need it to do. Can you see an approach that I'm missing? Do I really need to spawn a daemon on a non-framework script to handle this?
Your input is much appreciated!
An alternative approach would be a listener in your Laravel application that consumes the JSON messages and acts on them.
A queue listener can be created using a package such as https://github.com/bschmitt/laravel-amqp (a generic AMQP bridge for Laravel) or https://github.com/needle-project/laravel-rabbitmq (a bridge more specialised for RabbitMQ).
The queue consumer then reads the JSON payload, saves the payload as appropriate data, then decides which jobs to dispatch as a result within the Laravel application, as handled by the https://github.com/vyuldashev/laravel-queue-rabbitmq package.
The two applications still communicate with plain JSON, and not the Laravel-oriented JSON containing the serialised job class.
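The consumer idea above can be sketched in a language-neutral way. This is a minimal Python sketch, not the API of any of the packages mentioned: the "type"/"data" message layout, the handler registry and the "call.received" message type are all assumptions made for illustration. In Laravel, each handler would dispatch a real job class instead of returning a string.

```python
import json
from typing import Callable, Dict

# Hypothetical registry: message "type" -> handler function.
HANDLERS: Dict[str, Callable[[dict], str]] = {}


def handles(message_type: str):
    """Register the decorated function as the handler for one message type."""
    def register(func: Callable[[dict], str]) -> Callable[[dict], str]:
        HANDLERS[message_type] = func
        return func
    return register


@handles("call.received")  # assumed message type, not from the original post
def store_call(data: dict) -> str:
    # A Laravel consumer would save the data and dispatch a job here.
    return f"stored call {data['call_id']}"


def consume(raw_body: str) -> str:
    """Decode one plain-JSON message and route it to its handler."""
    message = json.loads(raw_body)
    handler = HANDLERS.get(message.get("type"))
    if handler is None:
        # Unknown messages are rejected; a real consumer might dead-letter them.
        raise ValueError(f"no handler for message type {message.get('type')!r}")
    return handler(message.get("data", {}))
```

The point of the design is that Server A only needs to agree on the JSON shape; it never needs to know which job class Server C runs for a given message.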
The solution is indeed to replicate the job class on the server issuing the job. That copy does not need every dependency the job requires to actually run, because the issuing side only serializes the job when pushing it.
I am trying to set up an API system that synchronously communicates with a number of workers in Laravel. I use Laravel 5.4 and, if possible, would like to use its functionality whenever possible without too many plugins.
What I had in mind are two servers. The first one with a Laravel instance – let’s call it APP – receiving and answering requests from and to a user. The second one runs different workers, each a Laravel instance. This is how I see the workflow:
APP receives a request from user
APP puts request on a queue
Workers look for jobs on the queue and eventually one of them finds one.
Worker resolves job
Worker responds to APP OR APP finds out somehow that job is resolved
APP sends response to user
My first idea was to work with queues and beanstalkd. The problem is that this all seems to work asynchronously. Is there a way for the APP to wait for the result of one of the workers?
After some more research I stumbled upon Guzzle. Would this be a way to go?
EDIT: Some extra info on the project.
I am talking about a Restful API. E.g. a user sends a request in the form of "https://our.domain/article/1" and their API token in the header. What the user receives is a JSON formatted string like {"id":1,"name":"article_name",etc.}
The reason for using two sides is twofold. On the one hand, there is the use of different workers. On the other hand, we want all the logic of the API to be as secure as possible. If an attack succeeds, only the APP side would be compromised.
Perhaps I am making things all too difficult with the queues and all that? If you have a better approach to meet the same ends, that would of course also help.
I know your question was how to run this synchronously, but I think the actual problem you are facing is that you cannot update the first server once the worker is done. You can achieve this with broadcasting.
I have done something similar with uploads in our application. We use a Redis queue, but beanstalkd will do the same job. On top of that we use Pusher, which uses sockets that the user can subscribe to, and it looks great.
User loads the web app, connecting to the pusher server
User uploads file (at this point you could show something to tell the user that the file is processing)
Worker sees that there is a file
Worker processes file
Worker triggers an event when done or on fail
This event is broadcast to the pusher server
Since the user is listening to the pusher server the event is received via javascript
You can now show a popup or update the table with javascript (works even if the user has navigated away)
We used pusher for this but you could use redis, beanstalk and many other solutions to do this. Read about Event Broadcasting in the Laravel documentation.
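The flow above (worker finishes, event is broadcast, the subscribed user is notified) can be sketched without any real broadcast server. This is a minimal in-process Python sketch; the Broadcaster class is a toy stand-in for something like Pusher, and the channel naming, event names and the ".csv" validation rule are assumptions for illustration only.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class Broadcaster:
    """Toy stand-in for a broadcast server (assumption, not a real client)."""

    def __init__(self) -> None:
        self._listeners: DefaultDict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, channel: str, listener: Callable[[dict], None]) -> None:
        """Client side: start listening on a channel."""
        self._listeners[channel].append(listener)

    def broadcast(self, channel: str, event: dict) -> None:
        """Deliver the event to every listener on that channel only."""
        for listener in self._listeners[channel]:
            listener(event)


def process_upload(broadcaster: Broadcaster, user_id: int, filename: str) -> None:
    """Worker side: process the file, then broadcast success or failure."""
    channel = f"user.{user_id}"  # hypothetical per-user channel name
    try:
        if not filename.endswith(".csv"):  # hypothetical validation rule
            raise ValueError("unsupported file type")
        broadcaster.broadcast(channel, {"event": "upload.done", "file": filename})
    except ValueError as exc:
        broadcaster.broadcast(
            channel,
            {"event": "upload.failed", "file": filename, "error": str(exc)},
        )
```

The request that queued the job returns immediately; the result reaches the user later over the channel they subscribed to, which is why the APP never has to block waiting for a worker.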
I'm looking for a solution to add items into a queue and execute them one-by-one in a similar method to google appengine's tasks manager. Each task will be executed using a http request to a php script.
As I'm using Amazon, I understood that the best practice is to use the SNS service, which is responsible for receiving new tasks, adding them to a queue (Amazon's SQS service) and informing my PHP worker that a new task has been pushed onto the queue, so it can look for it and execute it.
There are several issues with that method (like the need to limit the number of worker instances via the worker itself, or the possibility that the task won't be in the queue yet when we call the worker, because we add the task to the queue at the same time).
I would like to hear if there are any better options or a nicer way of implementing a task manager. I prefer using Amazon's services, but I'm open to any new suggestion and am looking for the best method. Features that are missing in Amazon, like FIFO and priority support, would also be a nice addition.
Thanks!
Ben
I have found a good solution.
The AWS Elastic Beanstalk service apparently offers an option to define a new Elastic Beanstalk instance as a "worker" or a "web server". If you define it as a "worker", you can attach it to an SQS queue, and it will be responsible for polling the queue and performing the task (with the code you deploy to the instance).
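As far as I understand the worker setup, a daemon on the instance does the polling and hands each message to your deployed code as an HTTP POST, treating a 200 response as "done" and anything else as "retry later". Under that assumption, the deployed code is just a small HTTP handler. This is a minimal Python sketch: the "action" field, the status codes chosen for bad input and the port are assumptions, not part of the original answer.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_task(body: bytes) -> int:
    """Return the HTTP status for one queued task (200 = task done)."""
    try:
        task = json.loads(body)
    except ValueError:
        return 400  # body is not valid JSON
    if not isinstance(task, dict) or "action" not in task:
        return 400  # hypothetical rule: every task names an action
    # ... perform the actual work for task["action"] here ...
    return 200


class TaskHandler(BaseHTTPRequestHandler):
    """Receives one queue message per POST request."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status = handle_task(self.rfile.read(length))
        self.send_response(status)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the handler locally; the port number is an arbitrary choice."""
    HTTPServer(("127.0.0.1", port), TaskHandler).serve_forever()
```

Keeping the task logic in handle_task, separate from the HTTP plumbing, makes it easy to test without a queue or a server.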
I have implemented a command in my Symfony setup which grabs a job from the DB and then processes it.
How can I run multiple instances of the command at once, to get through jobs quicker? I know that multithreading is not supported in PHP, but seeing as the command is called from the shell, I was wondering if there was a workaround.
Call command using:
app/console job:process
The way I would solve this is to use a work queue with multiple workers. It's easier to manage and scale than manually running multiple processes and worrying about concurrency.
The simplest general-purpose queue I've found for working with PHP/Symfony is beanstalkd, which you can integrate into Symfony2 with the LeezyPheanstalkBundle.
In general, I'd suggest using the enqueue library. You can choose from a variety of available transports, from the simplest ones like filesystem and Doctrine DBAL to real ones like RabbitMQ and Amazon SQS.
Regarding the consumers, you need some sort of process manager. There are several options:
http://supervisord.org/ - You need an extra service. It has to be configured properly.
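The work-queue pattern itself is small. This is a minimal Python sketch using threads and an in-memory queue; with beanstalkd the queue would live in a separate server and the workers would be separate processes, so treat this only as an illustration of the pattern. The function name and parameters are assumptions.

```python
import queue
import threading
from typing import Callable, Iterable, List


def run_workers(jobs: Iterable, handler: Callable, worker_count: int = 4) -> List:
    """Process every job exactly once with worker_count workers on one queue."""
    pending: queue.Queue = queue.Queue()
    for job in jobs:
        pending.put(job)

    results: List = []
    results_lock = threading.Lock()

    def worker() -> None:
        while True:
            try:
                job = pending.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker is finished
            outcome = handler(job)
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because each job is taken off the queue by exactly one worker, you can raise or lower worker_count without worrying about two workers grabbing the same job. Results arrive in completion order, not submission order.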
A pure PHP process manager like this. Based on Symfony process component and pure PHP code. It can handle process reboot, correct exit on sigterm signal and a lot more.
A php\swoole process manager like this. It requires the swoole PHP extension, but its performance is amazing.
I have written a blog post on how to solve this exact problem. https://plume.baucum.me/~/Absolutely/running-multiple-processes-simultaneously-in-a-symfony-command
It is much too long to rehash everything here, but the basic concept is that your command optionally takes in the job's ID. The command will check if the ID was given. If not then it will grab all the jobs from the DB, loop over them, and recall itself with the job ID parameter. As each command is kicked off you store it in an array, and if the array is too big you sleep, for rate throttling. As commands finish you remove them from the array.
When the command is run with the job ID, it will create a lock using Symfony's lock component so that a job cannot accidentally be processed twice at once. It is important that you release the lock when the job either finishes or errors out. Once it has the ID and the lock, it will call whatever code you have written to actually process the job.
Using this technique I have taken commands that took hours to run, as it synchronously went through each task, into taking only minutes. Make sure to try different throttles to balance resource utilization and time it takes to execute your task.
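The throttling part of this technique can be sketched as follows. This is a minimal Python sketch, not the code from the blog post: the function name, the polling interval and the return value are assumptions, and the per-job lock is not modelled here.

```python
import subprocess
import time
from typing import List, Sequence, Tuple


def run_throttled(
    commands: Sequence[Sequence[str]],
    max_parallel: int = 3,
    poll_interval: float = 0.05,
) -> Tuple[List[int], int]:
    """Run each command as a child process, at most max_parallel at a time.

    Returns the exit codes in input order and the peak number of
    children that were running at once.
    """
    running = {}  # index into commands -> Popen object
    exit_codes: List = [None] * len(commands)
    next_index = 0
    peak = 0
    while next_index < len(commands) or running:
        # Remove children that have finished and record their exit code.
        for index, proc in list(running.items()):
            code = proc.poll()
            if code is not None:
                exit_codes[index] = code
                del running[index]
        # Start new children while we are under the limit.
        while next_index < len(commands) and len(running) < max_parallel:
            running[next_index] = subprocess.Popen(list(commands[next_index]))
            next_index += 1
        peak = max(peak, len(running))
        if running:
            time.sleep(poll_interval)  # throttle: wait before checking again
    return exit_codes, peak
```

Each entry in commands would be the command re-invoking itself for one job, for example the console command followed by a job ID argument (that argument is hypothetical; the original command shown takes none). Tuning max_parallel is the "try different throttles" step.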
First things first, I'm aware of this question:
Gearman: Sending data from a background worker to the client
What I want to know, is it still the case with Gearman? I'm planning on sending a batch of image URLs from a PHP web application to the gearman worker (also written in PHP; let's call it "The Main Worker") for processing asynchronously. This worker will then submit a separate task for each image to lower-tier workers (via addTask()), call runTasks() and wait for the tasks to finish, while listening to exceptions, accumulating error messages and updating the overall job status.
While I'm perfectly ok with retrieving the overall status from the Main Worker using jobStatus() calls, then just say that all of the images were processed when [false, false, 0, 0] is returned, I definitely need to be able to inform the users that some of the images couldn't be retrieved from their respective URLs or stored on the server.
I suppose I could always just store the custom data in memcache, then retrieve it from the web app, but it just seems "dirtier" to me...
I'm not trying to get any result, because from what I've seen in the manual on php.net, even the exception handling can only be done when the task is submitted synchronously, not mentioning the custom data retrieval. I just hoped that there could be something I'm missing.
If I remember correctly, we're using Ubuntu Server 12.04 with libgearman6 (v 0.27) and PHP 5.3.10. The version of the gearman extension is 1.0.2. I think the database is irrelevant here, as I will not be using it in either of the workers. And I think we're not using persistent queues right now.
Since gearman won't keep any task information in memory after a task has finished (it just reports it back for a synchronous task), you won't be able to retrieve it in your web application without storing it in a third-party location. We usually use a simple web service in the application for this, letting the worker call back to the application when a task has completed or an error has occurred. This allows us to keep the business logic about what to do when such an error happens in the application, where it belongs, and lets our workers stay more general (we might need image resizing in many apps, but some apps might want to start several sub-tasks that depend on the image resizing being done first).
As you write, you may also let the worker write the state of the task directly to the database or to memcached, but I've found that letting the application itself handle the logic works better than having to change and special-case the workers. It's also well suited to a worker framework, letting you keep the same standardized way of handling callbacks across the actual worker code.
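The callback approach can be sketched like this. It is a minimal Python sketch, not Gearman API code: fetch stands in for the per-image sub-task and notify_app stands in for the call to the application's web service, and both names as well as the report layout are assumptions.

```python
from typing import Callable, Dict, List


def main_worker(
    urls: List[str],
    fetch: Callable[[str], None],
    notify_app: Callable[[Dict], None],
) -> Dict:
    """Run one sub-task per URL, collect failures, then call back once."""
    errors = []
    for url in urls:
        try:
            fetch(url)  # hypothetical sub-task: download and store one image
        except Exception as exc:
            # The worker only records what went wrong; it does not decide
            # what the application should do about it.
            errors.append({"url": url, "error": str(exc)})
    report = {"total": len(urls), "failed": len(errors), "errors": errors}
    notify_app(report)  # in practice: an HTTP request to the application
    return report
```

The worker stays generic because it only reports facts; what to tell the user about failed images is decided by the application when it receives the report.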
I'm looking for a job queue manager in node.js which can be invoked by php. This is for a web application which needs to send emails, create pdf files and so on which I'd like to perform asynchronous of the php process.
Example of the process:
User requests a php page
Php invokes the job queue manager and adds a task
Task is executed in node.js asynchronously of php, preferably when it's a bit more quiet
Task is to execute a php script
Why this "complex" system?
We write all our web-applications in php (Zend Framework)
We'd like to start learning node.js
We need an asynchronous process (fast response!)
The "real" task should be a php script as well, to utilize already written php classes, to have easy access to database connections and be as much DRY as possible
Use cases of this system:
User registers himself, system will send welcome email
User completes ecommerce order, system will send invoice
In the end, we'd like to use node-cron as well, to perform non-system wide cron tasks (very application specific). Node-cron will invoke the job queue manager, which will subsequently run a php script.
Is there such an application already in node?
In such a case I would prefer a message queue like RabbitMQ and client side libraries like node-amqp and php-amqp. Then simply send your job from your PHP script in the queue and let nodejs pick up the job from the queue. A big advantage is that it is extensible and it is widely used and tested in the enterprise market.
One possible option is node-jobs, which uses Redis.