I am trying to set up an API system that synchronously communicates with a number of workers in Laravel. I use Laravel 5.4 and would like to stick to its built-in functionality wherever possible, without too many extra packages.
What I had in mind are two servers. The first one with a Laravel instance – let’s call it APP – receiving and answering requests from and to a user. The second one runs different workers, each a Laravel instance. This is how I see the workflow:
APP receives a request from user
APP puts request on a queue
Workers look for jobs on the queue and eventually find one.
Worker resolves job
Worker responds to APP, OR APP somehow finds out that the job is resolved
APP sends response to user
My first idea was to work with queues and beanstalkd. The problem is that this all seems to work asynchronously. Is there a way for the APP to wait for the result of one of the workers?
After some more research I stumbled upon Guzzle. Would this be a way to go?
EDIT: Some extra info on the project.
I am talking about a RESTful API. E.g. a user sends a request in the form "https://our.domain/article/1" with their API token in the header. What the user receives is a JSON-formatted string like {"id":1,"name":"article_name",etc.}
The reason for using two sides is twofold. On the one hand there is the use of different workers. On the other hand we want the logic of the API to be as secure as possible. If the system gets hacked, only the APP side would be compromised.
Perhaps I am making things all too difficult with the queues and all that? If you have a better approach to meet the same ends, that would of course also help.
I know your question was how you could run this synchronously, but I think the problem you are facing is that you are not able to update the first server after the worker is done. The way you could achieve this is with broadcasting.
I have done something similar with uploads in our application. We use a Redis queue, but beanstalkd will do the same job. On top of that we use Pusher, which uses sockets that the user can subscribe to, and it looks great.
User loads the web app, connecting to the pusher server
User uploads file (at this point you could show something to tell the user that the file is processing)
Worker sees that there is a file
Worker processes file
Worker triggers an event when done or on failure
This event is broadcasted to the pusher server
Since the user is listening to the pusher server, the event is received via JavaScript
You can now show a popup or update the table with JavaScript (works even if the user has navigated away)
We used Pusher for this, but you could use Redis, beanstalkd and many other solutions to do this. Read about Event Broadcasting in the Laravel documentation.
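To make the broadcasting part concrete, here is a minimal sketch of the event a worker could fire when it finishes. The class name, channel name and payload are made up for illustration; the broadcast driver (Pusher, Redis, ...) is whatever you configure in Laravel.

```php
<?php

namespace App\Events;

use Illuminate\Broadcasting\Channel;
use Illuminate\Broadcasting\InteractsWithSockets;
use Illuminate\Contracts\Broadcasting\ShouldBroadcast;
use Illuminate\Foundation\Events\Dispatchable;
use Illuminate\Queue\SerializesModels;

// Hypothetical event fired by the worker when a job is done (or has failed).
// Implementing ShouldBroadcast makes Laravel push it to the broadcast driver.
class JobProcessed implements ShouldBroadcast
{
    use Dispatchable, InteractsWithSockets, SerializesModels;

    public $jobId;
    public $result;

    public function __construct($jobId, $result)
    {
        $this->jobId = $jobId;
        $this->result = $result;
    }

    // Channel the browser (or the APP server) subscribes to.
    public function broadcastOn()
    {
        return new Channel('jobs.' . $this->jobId);
    }
}
```

Inside the job's handle() method the worker would then call event(new JobProcessed($jobId, $result)); and whoever is listening on the jobs.{id} channel (e.g. the page, via Pusher/Laravel Echo) receives it and updates the UI.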
Related
In my Laravel 5.4 web app a user can request report generation that takes a couple of minutes due to the large amount of data. Because of this they cannot work with the application any more until the report has been generated. To fix this problem I read about queues in Laravel and moved my report generation code into a job class, but my app still blocks until the report is generated. How can I fix that?
To be absolutely clear I will sum up my problem:
User makes a request for report generation (my app completely blocks at this moment)
My app receives the POST request in its routes and calls a function from the controller class.
The controller's function dispatches a job that should generate the report and put it into the client's web folder.
It sounds like you have already pretty much solved the problem by introducing a queue. Put the job in the queue, but don't keep track of its progress - allow your code to continue and return to the user. It should be possible to "fire-and-forget", and then either ask the user to check if the report is ready in a couple of minutes, or offer the ability to email it to them when it is completed.
By default, Laravel uses the sync queue driver. This driver executes the queued jobs in the same request as the one they are created in. So this won't make any difference.
You should take a look at the other drivers and use the Laravel queue worker background process to execute jobs, to make sure they don't block the web request from completing.
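As a rough sketch (the job and controller names are hypothetical): set QUEUE_DRIVER to something other than sync in .env, dispatch the job from the controller, and run a worker process on the server.

```php
<?php
// .env: QUEUE_DRIVER=redis   (or database, beanstalkd, ...)

namespace App\Http\Controllers;

use App\Jobs\GenerateReport; // hypothetical job class
use Illuminate\Http\Request;

class ReportController extends Controller
{
    public function generate(Request $request)
    {
        // With a real queue driver this only pushes the job onto the queue
        // and returns immediately, so the user can keep working.
        dispatch(new GenerateReport($request->user()));

        return response()->json(['status' => 'Report is being generated']);
    }
}

// On the server, a separate process executes the queued jobs:
//   php artisan queue:work --tries=3
```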
Hey guys, I'm working on a website for my small startup that needs to check a database continuously for new data. I'm a mechanical engineer and don't have experience with web design and web communication. Currently I'm using an AJAX request every second to check a MySQL database (using PHP). The code compares the received data (in JSON format) and if it's different from the previous one it triggers a function to process the new data and update the UI.
Just last night I learned about web workers, web sockets and long polling and kinda overwhelmed with all the new options I have now. I’m really confused about whether I need to change my current solution and which solution would be the best. I thought maybe I should create a dedicated web worker that handles the AJAX calls in order to avoid sacrificing UI smoothness (the website should run smoothly on an average tablet).
Can anyone with experience give me some tips and directions? I learned about the Pusher API but I would like to avoid APIs for now. I feel like all the code that I have written in the past few months is inefficient after reading about web workers and web sockets…
Thanks in advance...
You should really use Google and search SO for previous posts on similar issues.
Here are a few good starters:
In what situations would AJAX long/short polling be preferred over HTML5 WebSockets?
Performance of AJAX vs Websocket REST over HTTP 2.0?
What are Long-Polling, Websockets, Server-Sent Events (SSE) and Comet?
Design/Architecture: web-socket one connection vs multiple connections
Or (outside SO):
http://dsheiko.com/weblog/websockets-vs-sse-vs-long-polling/
https://www.pubnub.com/blog/2015-01-05-websockets-vs-rest-api-understanding-the-difference/
As a quick summation:
I would probably opt for a web socket connection per client.
I would avoid polling the MySQL database (why do that?). There's really no need to waste resources. It's easier to add code to the update gateway, so that whenever the DB is updated, an event is scheduled for all listening sockets... I would consider Redis for Pub/Sub if I were using more than one process / machine for my server app.
An easier workflow would look like this:
Browser page load -> Websocket connection.
Websocket connection -> subscribe (listen to) Redis channel.
SQL update -> (triggers) Redis publish to a channel.
Redis channel publish -> notification to the (subscribed) websocket.
Notification on channel -> web socket message to client.
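To make the Redis leg concrete, here is a minimal PHP sketch of the "SQL update -> Redis publish" step, assuming the phpredis extension; the table, channel name and payload shape are made up for illustration.

```php
<?php
// Wherever the application writes new data, publish a message afterwards.

$pdo = new PDO('mysql:host=127.0.0.1;dbname=app', 'user', 'secret');
$pdo->prepare('INSERT INTO readings (value) VALUES (?)')->execute([42]);
$newRowId = $pdo->lastInsertId();

$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

// Every websocket process subscribed to this channel receives the message
// and can push it to its connected clients right away -- no DB polling.
$redis->publish('data-updates', json_encode([
    'table'  => 'readings',
    'id'     => $newRowId,
    'action' => 'insert',
]));
```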
Good luck.
Here's a simple push idea that may work for you:
Create a trigger that writes to another table when inserts/updates are done and log any relevant data there (something useful to you)
On initial load of app, get the latest updates from the secondary "log" table, store the row/event ID for comparison later
Create a poller (server-sent events) that listens to a specific script that watches said "log" table
Create a CRON job to execute the script from step 3 every X amount of time
(caveat: #3 may not work in IE, so you'd need a fallback or different solution)
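A very rough sketch of what the server-sent-events script in step 3 could look like; the change_log table and its columns are hypothetical.

```php
<?php
// Emits an SSE stream of rows added to the "log" table since the last event.

set_time_limit(0);
header('Content-Type: text/event-stream');
header('Cache-Control: no-cache');

$pdo    = new PDO('mysql:host=127.0.0.1;dbname=app', 'user', 'secret');
$lastId = isset($_SERVER['HTTP_LAST_EVENT_ID']) ? (int) $_SERVER['HTTP_LAST_EVENT_ID'] : 0;
$stmt   = $pdo->prepare('SELECT id, payload FROM change_log WHERE id > ? ORDER BY id');

while (true) {
    $stmt->execute([$lastId]);

    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
        $lastId = (int) $row['id'];
        echo "id: {$lastId}\n";
        echo "data: {$row['payload']}\n\n";
    }

    @ob_flush();
    flush();
    sleep(2); // only the small log table is polled, not the whole dataset
}
```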
I am building a Laravel web app that performs some long running queries and utilizes a couple (both internal and external) APIs. I am having a hard time figuring out why I can't handle requests in parallel. To shed some light on my issue, here is the high level overview of my system/problem via an example:
Page loads
AJAX request called on page load which GETs a BigQuery result set (long running query), cleans the data and executes a Python clustering algorithm which creates an image and returns the path to that image to the web app
Long running (~15 seconds)
Will max CPU while performing the Python clustering (at times)
AJAX request called which queries an external API for some information and simply displays it
Short running (~1-2 seconds)
The issue is that my AJAX requests are not being handled in parallel. The first one is received and the web app does not begin the other until the first is complete. I've checked the network tab in Chrome dev tools and both requests are being made in parallel but the web server is not handling them in parallel.
I cannot determine if this is an error in configuration with php, artisan, Laravel or if I have a whole other problem on my hands. I've done some testing with two simple route closures: one that simply returns a string and another which returns a string after sleep(10). When I call both with AJAX, the instantly returning route does not return until the long running request is served (after sleeping).
TL;DR: It's clear both AJAX calls are being fired and received in parallel, but how can I have my Laravel web app handle the requests in parallel (concurrently)?
For HTTP requests that might take a while, use Laravel's job structure to send the request as a job and use either the built-in queue or a third-party service provider to process the jobs. Laravel doesn't handle requests in parallel on its own, hence jobs were created.
Your problem is similar to the following thread: handle multiple post requests to same url Laravel 5
API Docs:
https://laravel.com/docs/5.1/queues#configuration
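For example, the slow BigQuery/clustering step could live in a queued job roughly like the sketch below (class and property names are made up), dispatched with dispatch(new RunClustering($datasetId)) so the short external-API call is served on its own.

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

// Hypothetical job that moves the ~15 second query/clustering work off the
// web request, so it no longer blocks the quick AJAX call.
class RunClustering implements ShouldQueue
{
    use InteractsWithQueue, Queueable, SerializesModels;

    protected $datasetId;

    public function __construct($datasetId)
    {
        $this->datasetId = $datasetId;
    }

    public function handle()
    {
        // Run the BigQuery query, the Python clustering and the image
        // generation here, then store the resulting image path (DB/cache)
        // for the front end to fetch when it is ready.
    }
}
```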
I have a need for part of my application to make calls to Reddit asynchronously from my core application's workflow. I have implemented a semi-workable solution by using a Reddit API library I have built here. For those that are unaware, Reddit manages authentication via OAuth and returns a bearer and a token for a particular user that expires in 60 minutes after generation.
I have opted to use cookies to store this authorization information for the mentioned time period, as seen in the requestRedditToken() method here. If a cookie is not found (i.e. it has expired) when another request to Reddit needs to be made, another reddit token is generated. This seems like it would work just fine.
What I am having trouble with is conceptualizing how cookies are handled when integrated with a daemonized queue worker; furthermore, I need to understand why these calls are failing periodically.
The application I'm working with, as mentioned, makes calls to Reddit. These calls are created by a job class being handled: UpdateRedditLiveThreadJob, which you can see here.
These jobs are processed by a daemonized Artisan queue worker using Laravel Forge, you can see the details of the worker here. The queue driver in this case is Redis, and the workers are monitored by Supervisor.
Here is the intended workflow of my app:
An UpdateRedditLiveThreadJob is created and thrown into the queue to be handled.
The handle() method of the job is called.
A Reddit client is instantiated, and a reddit token is requested if a cookie doesn't exist.
My Reddit client successfully communicates with Reddit.
The Job is considered complete.
What is actually happening:
The job is created.
Handle is called.
Reddit client is instantiated, something odd happens here generally.
Reddit client tries to communicate, but gets a 401 response which produces an Exception. This is indicative of a failed authorization.
The task is considered 'failed' and loops back to step 2.
Here are my questions:
Why does this flow work for the first hour and then collapse as described above, presumably after the cookie has expired?
I've tried my best to understand how Laravel Queues work, but I fundamentally am having a hard time of conceptualizing the different types of queue management options available: queue:listen, queue:work, a daemonized queue:work running on Supervisor, etc. Is my current queue infrastructure compatible with using cookies to manage tokens?
What adjustments do I need to make to my codebase to make the app function as intended?
How will my workflow handle multiple users, who each potentially have multiple cookies?
Why does the workflow magically start working again if I restart my queue worker?
Please let me know if I'm incorrectly describing anything here or need clarification, I've tried my best to explain the problem succinctly.
Your logic is incorrect. A queue job is in fact a PHP script run from the CLI. It has no interaction with a browser. Cookies are set in a browser; see this related thread for reference.
Since you're interacting with an API, it would make more sense to set the token as a simple variable in the Job (or better yet in that wrapper) and then re-use it within that job.
TL;DR: your wrapper is not an API client.
I know this is not a complete answer to all your questions, but it's a push in the right direction. Even if I had answered all your questions, in the end it might not have given any solution to your issues ;)
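As a sketch of that idea: instead of a cookie, the wrapper could cache the token server-side so a daemonized worker can reuse it between jobs. The key name and TTL below are assumptions; requestRedditToken() stands for the existing method in your wrapper.

```php
<?php

use Illuminate\Support\Facades\Cache;

class RedditClient
{
    public function token()
    {
        // 55 minutes stays inside Reddit's 60-minute expiry window
        // (Cache::remember takes minutes on Laravel 5.x).
        return Cache::remember('reddit.token', 55, function () {
            return $this->requestRedditToken();
        });
    }

    protected function requestRedditToken()
    {
        // ... the existing OAuth call to Reddit goes here ...
    }
}
```

For multiple users you would key the cache per user (e.g. 'reddit.token.' . $userId) so each token expires independently.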
First things first, I'm aware of this question:
Gearman: Sending data from a background worker to the client
What I want to know, is it still the case with Gearman? I'm planning on sending a batch of image URLs from a PHP web application to the gearman worker (also written in PHP; let's call it "The Main Worker") for processing asynchronously. This worker will then submit a separate task for each image to lower-tier workers (via addTask()), call runTasks() and wait for the tasks to finish, while listening to exceptions, accumulating error messages and updating the overall job status.
While I'm perfectly ok with retrieving the overall status from the Main Worker using jobStatus() calls, then just say that all of the images were processed when [false, false, 0, 0] is returned, I definitely need to be able to inform the users that some of the images couldn't be retrieved from their respective URLs or stored on the server.
I suppose I could always just store the custom data in memcache, then retrieve it from the web app, but it just seems "dirtier" to me...
I'm not trying to get any result, because from what I've seen in the manual on php.net, even the exception handling can only be done when the task is submitted synchronously, not to mention the custom data retrieval. I just hoped that there could be something I'm missing.
If I remember correctly, we're using Ubuntu Server 12.04 with libgearman6 (v 0.27) and PHP 5.3.10. The version of the gearman extension is 1.0.2. I think the database is irrelevant here, as I will not be using it in either of the workers. And I think we're not using persistent queues right now.
Since gearman won't keep any task information in memory after a task has finished (it just reports it back for a synchronous task), you won't be able to retrieve it in your web application without storing it in a third-party location. We usually use a simple web service in the application for this, letting the worker call back to the application when a task has completed or an error has occurred. This allows us to keep the business logic about what we'd like to do when such an error happens in the application where it belongs, and lets our workers be more general (we might need image resizing in many apps, but some apps might want to start several sub-tasks that depend on the image resizing being done first).
As you write, you could also let the worker write the state of the task directly to the database or to memcached, but I've found that letting the application itself handle the logic, instead of having to change and special-case the workers, works better. It also fits well with a worker framework, letting you keep the same standardized way of handling callbacks across the actual worker code.
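A rough sketch of that callback pattern from the worker side, using the stock gearman extension and plain curl; the callback URL and payload fields are made up for illustration.

```php
<?php
// Lower-tier worker: processes one image and reports the outcome back to the
// web application instead of keeping state in gearman.

$worker = new GearmanWorker();
$worker->addServer('127.0.0.1', 4730);

$worker->addFunction('resize_image', function (GearmanJob $job) {
    $payload = json_decode($job->workload(), true);

    try {
        // ... fetch and resize the image here ...
        $status = ['url' => $payload['url'], 'ok' => true];
    } catch (Exception $e) {
        $status = ['url' => $payload['url'], 'ok' => false, 'error' => $e->getMessage()];
    }

    // Call back into the application; it decides what to store and what to
    // show the user, so the worker stays generic.
    $ch = curl_init('https://our.domain/internal/task-callback');
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => json_encode($status),
        CURLOPT_HTTPHEADER     => ['Content-Type: application/json'],
        CURLOPT_RETURNTRANSFER => true,
    ]);
    curl_exec($ch);
    curl_close($ch);
});

while ($worker->work());
```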