In my Laravel 5.4 web app, a user can request a report that takes a couple of minutes to generate because of the large amount of data involved. While the report is being generated, the user cannot do anything else in the application. To fix this, I read about queues in Laravel and moved my report generation code into a job class, but my app still blocks until the report is generated. How can I fix that?
To be absolutely clear, I will sum up my problem:
The user makes a request for report generation (my app blocks completely at this point).
My app receives the POST request in routes and calls a function from the controller class.
The controller's function dispatches a job that should generate the report and put it into the client web folder.
It sounds like you have already pretty much solved the problem by introducing a queue. Put the job in the queue, but don't keep track of its progress - allow your code to continue and return to the user. It should be possible to "fire-and-forget", and then either ask the user to check if the report is ready in a couple of minutes, or offer the ability to email it to them when it is completed.
By default, Laravel uses the sync queue driver. This driver executes queued jobs inside the same request that dispatched them, so moving the code into a job class makes no difference on its own.
You should take a look at the other drivers and run the Laravel queue worker as a background process to execute the jobs, so that they don't hold up the web request.
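The difference between the sync driver and a real queue can be sketched in a few lines. This is a language-agnostic illustration in Python, not Laravel code: `generate_report`, `handle_request_sync` and `handle_request_queued` are hypothetical names, and an in-memory queue with a thread stands in for a real queue backend and worker process.

```python
import queue
import threading
import time

jobs = queue.Queue()      # stands in for a real queue backend (database, Redis, beanstalkd)
finished_reports = []     # stands in for the folder where finished reports are written


def generate_report(user_id):
    time.sleep(0.3)       # placeholder for the slow report generation
    finished_reports.append(f"report-{user_id}.csv")


def worker():
    # Long-running worker: pulls jobs until it receives the None sentinel.
    while True:
        user_id = jobs.get()
        try:
            if user_id is None:
                return
            generate_report(user_id)
        finally:
            jobs.task_done()


def handle_request_sync(user_id):
    # Behaves like the sync driver: the job runs inside the request.
    generate_report(user_id)
    return "done"


def handle_request_queued(user_id):
    # Behaves like a real queue driver: enqueue and return straight away.
    jobs.put(user_id)
    return "accepted"
```

In Laravel 5.4 the equivalent change is, as far as I know, to set `QUEUE_DRIVER` in `.env` to something other than `sync` (for example `database` or `redis`) and to keep `php artisan queue:work` running in the background.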
I am working on a system that was developed in PHP without a framework.
It automatically runs some jobs against a third-party API every night. It loops over all the jobs in a table and calls the API using curl.
// run cron job to loop this table
ID JOB
1 updateUser
2 getLatestInfo/topicA
……
//code
// if UpdateUser
loop user table and call api to get latest info…
Other curl tasks, such as sending email / notifications, also run here …
It worked perfectly before, but recently we have gained many new users, so it now makes 50-100 API calls at the same time.
Each API call takes 10-20 seconds to respond, and we retry the call if it times out.
I checked the log: the first job alone takes 3-4 hours in total (with many errors).
I could have the cron job queue the curl calls, for example take the first 5 calls and run them every minute. But if the number of users or tasks keeps growing and the third-party API stays slow, it may take even more hours to finish.
Is there a solution that keeps listening to the job table and runs the curl calls one by one?
I want it to be triggered automatically when a new row is added to the table (like a websocket?), rather than a single PHP script running an infinite loop (so that I don't have to rerun the PHP task manually when an error occurs).
(The API keys are in the PHP project, so I hope I can do this in the same project.)
PHP scripts need to be triggered in order to do something, they can't really "run in background" (I mean, they can, technically, but PHP isn't supposed to be used that way).
Instead, one of three options is usually used to do job management:
call jobs on every call from web, along with the actual code to generate output
use external web cron service to query specific URLs tied to job execution
use local cron job on your system to call the php executable and have it execute jobs periodically
If you want an event based system, PHP is likely the wrong option. Depending on your DB system though you might be able to create a small wrapper code that subscribes to DB changes and is triggered on inserts, that then calls PHP again - but it's definitely a cleaner solution to use a more suitable programming language / environment.
I've been working on a project for a while that fetches data from an API and processes that data locally for various uses. Currently, a consumer picks up JSON objects from the message queue that it uses to trigger a matching Symfony command. The rate limiting is built in to this one consumer, is fairly simple and adjusts itself automatically to status responses from the API. The problem is, the way it is set up, it cannot run in parallel and if there is a major update to the versioned static data on the API, all processing halts while it caches the new static data.
I looked at using the rabbitmq-bundle Symfony bundle and converting the commands into separate consumers with their own channels so that they can run in parallel and no longer block each other. However, this comes with a couple of issues that I'm not sure how to handle.
The first is that I still need to limit the API calls across all the consumers. I have a wrapper for Guzzle that could, in theory, use a simple file to manage the number of calls across all instances of it. I looked at an existing token bucket library, but setting it up to work in Symfony looks problematic, as each consumer could potentially reset the number of tokens when it is restarted, so... Not sure where to go with that.
The second is that some consumers may hit data from the main API for which we do not yet have the matching version of the static data. If this happens, it needs to trigger the related consumers, but only if there isn't already a trigger in each queue... One possible solution I can see is to record the latest requested version in a file at the time a message is published to update it, and have the consumer wait for the data to be available locally. Again, I'm kind of lost about how best to handle this.
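For the first issue, one way to avoid a restarted consumer resetting the token count is to keep the bucket state in a shared store and only initialise it if it does not exist yet. The following is a minimal sketch of that idea in Python with a SQLite file, not a drop-in for any particular token bucket library; `SharedTokenBucket` and its parameters are hypothetical.

```python
import sqlite3
import time


class SharedTokenBucket:
    """Token bucket whose state lives in a SQLite file shared by all consumers."""

    def __init__(self, path, capacity, refill_per_second):
        self.capacity = capacity
        self.rate = refill_per_second
        # isolation_level=None: autocommit mode, transactions are opened explicitly.
        self.conn = sqlite3.connect(path, isolation_level=None)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS bucket ("
            "id INTEGER PRIMARY KEY CHECK (id = 1), tokens REAL, updated REAL)"
        )
        # INSERT OR IGNORE: a restarting consumer keeps the existing state.
        self.conn.execute(
            "INSERT OR IGNORE INTO bucket (id, tokens, updated) VALUES (1, ?, ?)",
            (capacity, time.time()),
        )

    def try_acquire(self, now=None):
        """Take one token if available; returns True when the call may proceed."""
        now = time.time() if now is None else now
        self.conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
        try:
            tokens, updated = self.conn.execute(
                "SELECT tokens, updated FROM bucket WHERE id = 1"
            ).fetchone()
            now = max(now, updated)  # never move the clock backwards
            tokens = min(self.capacity, tokens + (now - updated) * self.rate)
            allowed = tokens >= 1
            if allowed:
                tokens -= 1
            self.conn.execute(
                "UPDATE bucket SET tokens = ?, updated = ? WHERE id = 1", (tokens, now)
            )
            self.conn.execute("COMMIT")
            return allowed
        except Exception:
            self.conn.execute("ROLLBACK")
            raise
```

Each consumer would call `try_acquire()` before an API request and back off when it returns `False`. The same idea should carry over to Redis or a database table if a file on disk is not shared between the consumers.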
I am trying to set up an API system that synchronously communicates with a number of workers in Laravel. I use Laravel 5.4 and, if possible, would like to use its functionality whenever possible without too many plugins.
What I had in mind are two servers. The first one with a Laravel instance – let’s call it APP – receiving and answering requests from and to a user. The second one runs different workers, each a Laravel instance. This is how I see the workflow:
APP receives a request from user
APP puts request on a queue
Workers look for jobs on the queue and eventually find one.
Worker resolves job
Worker responds to APP OR APP finds out somehow that job is resolved
APP sends response to user
My first idea was to work with queues and beanstalkd. The problem is that this all seems to work asynchronously. Is there a way for the APP to wait for the result of one of the workers?
After some more research I stumbled upon Guzzle. Would this be a way to go?
EDIT: Some extra info on the project.
I am talking about a Restful API. E.g. a user sends a request in the form of "https://our.domain/article/1" and their API token in the header. What the user receives is a JSON formatted string like {"id":1,"name":"article_name",etc.}
The reason for using two sides is twofold. On the one hand, there is the use of different workers. On the other hand, we want all the logic of the API to be as secure as possible. If an attack succeeds, only the APP side would be compromised.
Perhaps I am making things all too difficult with the queues and all that? If you have a better approach to meet the same ends, that would of course also help.
I know your question was how you could run this synchronously, but I think the problem you are facing is that you are not able to update the first server after the worker is done. The way you could achieve this is with broadcasting.
I have done something similar with uploads in our application. We use a Redis queue, but beanstalkd will do the same job. On top of that we use Pusher, which uses sockets that the user can subscribe to, and it looks great.
User loads the web app, connecting to the pusher server
User uploads file (at this point you could show something to tell the user that the file is processing)
Worker sees that there is a file
Worker processes file
Worker triggers an event when done or on fail
This event is broadcast to the pusher server
Since the user is listening to the pusher server the event is received via javascript
You can now show a popup or update the table with javascript (works even if the user has navigated away)
We used Pusher for this, but you could use Redis, beanstalkd and many other solutions to do this. Read about Event Broadcasting in the Laravel documentation.
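The flow in the steps above comes down to a publish/subscribe pattern: the client subscribes to a channel, and the worker publishes an event to that channel when it finishes or fails. Here is a minimal in-memory sketch in Python; `Broadcaster`, `process_upload`, the `user.<id>` channel naming and the event names are all hypothetical, and a real setup would use Pusher or Redis instead of a dictionary of callbacks.

```python
from collections import defaultdict


class Broadcaster:
    """In-memory stand-in for a broadcast service such as Pusher or Redis pub/sub."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, channel, callback):
        self._listeners[channel].append(callback)

    def publish(self, channel, event):
        for callback in self._listeners[channel]:
            callback(event)


def process_upload(broadcaster, user_id, filename, process):
    """Worker side: process the file, then broadcast success or failure."""
    channel = f"user.{user_id}"
    try:
        process(filename)
        broadcaster.publish(channel, {"type": "upload.processed", "file": filename})
    except Exception as exc:
        broadcaster.publish(
            channel, {"type": "upload.failed", "file": filename, "error": str(exc)}
        )
```

The web request that accepted the upload never waits for `process_upload`; the user only hears about the result through the channel they subscribed to.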
I have a need for part of my application to make calls to Reddit asynchronously from my core application's workflow. I have implemented a semi-workable solution by using a Reddit API library I have built here. For those that are unaware, Reddit manages authentication via OAuth and returns a bearer and a token for a particular user that expires in 60 minutes after generation.
I have opted to use cookies to store this authorization information for the mentioned time period, as seen in the requestRedditToken() method here. If a cookie is not found (i.e. it has expired) when another request to Reddit needs to be made, another reddit token is generated. This seems like it would work just fine.
What I am having trouble with is conceptualizing how cookies are handled when integrated with a daemonized queue worker. Furthermore, I need to understand why these calls are failing periodically.
The application I'm working with, as mentioned, makes calls to Reddit. These calls are created by a job class being handled: UpdateRedditLiveThreadJob, which you can see here.
These jobs are processed by a daemonized Artisan queue worker using Laravel Forge, you can see the details of the worker here. The queue driver in this case is Redis, and the workers are monitored by Supervisor.
Here is the intended workflow of my app:
An UpdateRedditLiveThreadJob is created and thrown into the queue to be handled.
The handle() method of the job is called.
A Reddit client is instantiated, and a reddit token is requested if a cookie doesn't exist.
My Reddit client successfully communicates with Reddit.
The Job is considered complete.
What is actually happening:
The job is created.
Handle is called.
Reddit client is instantiated, something odd happens here generally.
Reddit client tries to communicate, but gets a 401 response which produces an Exception. This is indicative of a failed authorization.
The task is considered 'failed' and loops back to step 2.
Here are my questions:
Why does this flow work for the first hour, and then collapse as described above, after presumably, the cookie has expired?
I've tried my best to understand how Laravel Queues work, but I am fundamentally having a hard time conceptualizing the different types of queue management options available: queue:listen, queue:work, a daemonized queue:work running on Supervisor, etc. Is my current queue infrastructure compatible with using cookies to manage tokens?
What adjustments do I need to make to my codebase to make the app function as intended?
How will my workflow handle multiple users, who each potentially have multiple cookies?
Why does the workflow magically start working again if I restart my queue worker?
Please let me know if I'm incorrectly describing anything here or need clarification, I've tried my best to explain the problem succinctly.
Your logic is incorrect. A queue job is in fact a PHP script running on the CLI. It has no interaction with a browser. Cookies are set in a browser; see this related thread for reference.
Since you're interacting with an API, it would make more sense to store the token in a simple variable in the job (or, better yet, in that wrapper) and then re-use it within that job.
TL;DR: your wrapper is not an API client.
I know this is not a complete answer to all your questions, but it's a push in the right direction. Answering every question one by one might, in the end, still not have given you a solution to your issues ;)
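Keeping the token in a variable inside the wrapper, together with its expiry time, could look roughly like the sketch below. It is written in Python for illustration only; `TokenCache`, `fetch_token` and the 60-second safety margin are assumptions, while the 60-minute lifetime comes from the question.

```python
import time


class TokenCache:
    """Keeps an OAuth token in memory and refreshes it shortly before it expires."""

    def __init__(self, fetch_token, lifetime_seconds=3600, margin_seconds=60,
                 clock=time.monotonic):
        self._fetch_token = fetch_token   # hypothetical callable that requests a new token
        self._lifetime = lifetime_seconds
        self._margin = margin_seconds
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        # Refresh when there is no token yet or the current one is about to expire.
        if self._token is None or now >= self._expires_at - self._margin:
            self._token = self._fetch_token()
            self._expires_at = now + self._lifetime
        return self._token
```

A long-running worker process would create one `TokenCache` and call `get()` before every API request, so the expiry is tracked by the worker itself instead of by a browser cookie. With several worker processes, the same logic could sit on top of a shared cache such as Redis.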
First things first, I'm aware of this question:
Gearman: Sending data from a background worker to the client
What I want to know, is it still the case with Gearman? I'm planning on sending a batch of image URLs from a PHP web application to the gearman worker (also written in PHP; let's call it "The Main Worker") for processing asynchronously. This worker will then submit a separate task for each image to lower-tier workers (via addTask()), call runTasks() and wait for the tasks to finish, while listening to exceptions, accumulating error messages and updating the overall job status.
While I'm perfectly ok with retrieving the overall status from the Main Worker using jobStatus() calls, then just say that all of the images were processed when [false, false, 0, 0] is returned, I definitely need to be able to inform the users that some of the images couldn't be retrieved from their respective URLs or stored on the server.
I suppose I could always just store the custom data in memcache, then retrieve it from the web app, but it just seems "dirtier" to me...
I'm not trying to get any result, because from what I've seen in the manual on php.net, even exception handling can only be done when the task is submitted synchronously, not to mention custom data retrieval. I just hoped that there could be something I'm missing.
If I remember correctly, we're using Ubuntu Server 12.04 with libgearman6 (v 0.27) and PHP 5.3.10. The version of the gearman extension is 1.0.2. I think the database is irrelevant here, as I will not be using it in either of the workers. And I think we're not using persistent queues right now.
Since Gearman won't keep any task information in memory after a task has finished (it just reports it back for a synchronous task), you won't be able to retrieve it in your web application without storing it in a third-party location. We usually use a simple web service in the application for this, letting the worker call back to the application when a task has completed or an error has occurred. This allows us to keep the business logic about what we'd like to do when such an error happens in the application, where it belongs, and lets our workers be more general (we might need image resizing in many apps, but some apps might want to start several sub tasks that depend on the image resizing being done first).
As you write, you may also let the worker write the state of the task directly to the database or to memcached, but I've found that letting the application itself handle the logic, instead of having to change and special-case the workers, works better. It's also well suited for a worker framework, letting you keep the same standardized way of handling callbacks across actual worker code.
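The callback approach can be sketched as follows. The example is in Python and deliberately leaves out Gearman itself: `run_image_task`, `process_images`, `fetch` and `notify` are hypothetical names, and `notify` stands for whatever posts the payload to the application's web service.

```python
import json


def run_image_task(url, fetch, notify):
    """Lower-tier worker: process one image and report the outcome to the application."""
    try:
        fetch(url)
        payload = {"url": url, "status": "ok"}
    except Exception as exc:
        payload = {"url": url, "status": "error", "message": str(exc)}
    # The worker only reports what happened; the application decides what to do.
    notify(json.dumps(payload))
    return payload


def process_images(urls, fetch, notify):
    """Main worker: run every image task and collect the failures."""
    results = [run_image_task(url, fetch, notify) for url in urls]
    return [r for r in results if r["status"] == "error"]
```

The worker stays generic because it knows nothing about what the application does with a failure; the application can store the messages and show them to the user once the overall job is finished.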