I am building a Laravel web app that performs some long running queries and utilizes a couple (both internal and external) APIs. I am having a hard time figuring out why I can't handle requests in parallel. To shed some light on my issue, here is the high level overview of my system/problem via an example:
Page loads
An AJAX request fires on page load which GETs a BigQuery result set (long-running query), cleans the data, and executes a Python clustering algorithm that creates an image and returns the image's path to the web app
Long running (~15 seconds)
Will max out the CPU while performing the Python clustering (at times)
A second AJAX request queries an external API for some information and simply displays it
Short running (~1-2 seconds)
The issue is that my AJAX requests are not being handled in parallel. The first one is received and the web app does not begin the other until the first is complete. I've checked the network tab in Chrome dev tools and both requests are being made in parallel but the web server is not handling them in parallel.
I cannot determine whether this is a configuration error with PHP, Artisan, or Laravel, or if I have a whole other problem on my hands. I've done some testing with two simple route closures: one that simply returns a string and another that returns a string after sleep(10). When I call both with AJAX, the instantly returning route does not respond until the long-running request is served (after sleeping).
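Roughly, the test routes look like this (the paths are arbitrary):

```php
// app/Http/routes.php -- the two test closures described above
Route::get('/fast', function () {
    return 'fast'; // returns instantly
});

Route::get('/slow', function () {
    sleep(10);     // simulate the long-running request
    return 'slow';
});
```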
TL;DR: It's clear both AJAX calls are being fired and received in parallel, but how can I have my Laravel web app handle the requests in parallel (concurrently)?
For HTTP requests that might take a while, use Laravel's job structure: dispatch the work as a job and use either the built-in queue or a third-party service provider to process the jobs. A single Laravel request runs synchronously from start to finish, which is why the job system exists.
Your problem is similar to the following thread: handle multiple post requests to same url Laravel 5
API Docs:
https://laravel.com/docs/5.1/queues#configuration
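As a rough sketch of that approach (the job class name and its body are placeholders; the class extends the base Job class that Laravel 5.1 generates in app/Jobs):

```php
// app/Jobs/RunClustering.php -- hypothetical job wrapping the slow work
namespace App\Jobs;

use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class RunClustering extends Job implements ShouldQueue
{
    use InteractsWithQueue, SerializesModels;

    public function handle()
    {
        // BigQuery fetch, data cleaning, Python clustering call, etc.
    }
}
```

Dispatch it from a controller with $this->dispatch(new RunClustering()); and run a worker with php artisan queue:work so the HTTP request can return immediately.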
Related
I am trying to create a project where I make an API request to another server and render an HTML interface with the data I get back. For example, if a single request takes 2 seconds to complete and 5 people request the same page, would the last person wait 2 seconds, or would they wait for the others to finish first, i.e. 10 seconds? I couldn't find any info about this and am not sure if Node is a better option for this project.
Any normal web server + PHP setup will handle several requests in parallel. Each incoming request spawns a new web server thread with an independent PHP instance. There are several different models of how a web server can handle this (workers, threads, events, etc.), but in general that's how it works. There is some limit to how many requests can be handled in parallel, but generally speaking it's significantly more than one.
So, modulo some overhead of running several PHP threads in parallel, each request will be handled in 2 seconds.
A typical pitfall here is session handling: if each PHP instance tries to get a handle on the same session data, they'll block since only one instance at a time can use the normal file-based session store, and subsequent requests will have to wait. To be clear: that's if the same user tries multiple parallel requests; it does not affect different users with different sessions. That goes for any shared resource you may be trying to access.
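With PHP's native file-based sessions, the usual mitigation is to release the session lock as soon as you are done writing session data. A minimal sketch in plain PHP (Laravel's session middleware manages this differently, so this applies to native sessions):

```php
<?php
session_start();

// Read whatever session data you need up front.
$userId = isset($_SESSION['user_id']) ? $_SESSION['user_id'] : null;

// Release the file lock; from here on, parallel requests from the
// same user are no longer serialized on the session file.
session_write_close();

// ...the long-running work continues without holding the lock...
```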
I am trying to set up an API system that synchronously communicates with a number of workers in Laravel. I use Laravel 5.4 and, if possible, would like to use its functionality whenever possible without too many plugins.
What I had in mind are two servers. The first one with a Laravel instance – let’s call it APP – receiving and answering requests from and to a user. The second one runs different workers, each a Laravel instance. This is how I see the workflow:
APP receives a request from user
APP puts request on a queue
Workers look for jobs on the queue, and one eventually picks it up.
Worker resolves the job
Worker responds to APP, OR APP finds out somehow that the job is resolved
APP sends response to user
My first idea was to work with queues and beanstalkd. The problem is that this all seems to work asynchronously. Is there a way for the APP to wait for the result of one of the workers?
After some more research I stumbled upon Guzzle. Would this be a way to go?
EDIT: Some extra info on the project.
I am talking about a Restful API. E.g. a user sends a request in the form of "https://our.domain/article/1" and their API token in the header. What the user receives is a JSON formatted string like {"id":1,"name":"article_name",etc.}
The reason for using two sides is twofold. On the one hand there is the use of different workers. On the other hand, we want to keep all the logic of the API as secure as possible: if a hack attack is made, only the APP side would be compromised.
Perhaps I am making things all too difficult with the queues and all that? If you have a better approach that meets the same ends, that would of course also help.
I know your question was how you could run this synchronously, but I think the problem you are facing is that you are not able to update the first server after the worker is done. You could achieve this with broadcasting.
I have done something similar with uploads in our application. We use a Redis queue, but beanstalkd will do the same job. On top of that we use Pusher, which uses WebSockets that the user can subscribe to, and it looks great.
User loads the web app, connecting to the pusher server
User uploads file (at this point you could show something to tell the user that the file is processing)
Worker sees that there is a file
Worker processes file
Worker triggers an event when done, or on failure
This event is broadcast to the Pusher server
Since the user is listening to the Pusher server, the event is received via JavaScript
You can now show a popup or update the table with JavaScript (this works even if the user has navigated away)
We used Pusher for this, but you could use Redis, beanstalkd, and many other solutions. Read about Event Broadcasting in the Laravel documentation.
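A minimal sketch of what the event side might look like in Laravel 5.4 (class, property, and channel names are invented):

```php
// app/Events/FileProcessed.php -- hypothetical event fired by the worker
namespace App\Events;

use Illuminate\Broadcasting\PrivateChannel;
use Illuminate\Contracts\Broadcasting\ShouldBroadcast;
use Illuminate\Queue\SerializesModels;

class FileProcessed implements ShouldBroadcast
{
    use SerializesModels;

    public $fileId;

    public function __construct($fileId)
    {
        $this->fileId = $fileId;
    }

    // Broadcast on a per-upload channel the browser subscribes to via Pusher.
    public function broadcastOn()
    {
        return new PrivateChannel('uploads.' . $this->fileId);
    }
}
```

The worker fires it with event(new FileProcessed($file->id)); when processing finishes (or fails), and the page's JavaScript listens on that channel.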
I use PHP and Laravel for my web service.
I want to know how Laravel stores and processes requests in these situations:
requests to different controllers from many users;
requests to the same controller from the same user.
Does Laravel store these requests in a queue in the order they arrived?
Does Laravel process requests in parallel for different users, and in sequence for the same user?
For example, say there are two requests from the same user, routed to two methods in the same controller. The first request takes a long time to process on the server side, while the second takes very little. When the user sends the first request and then the second, the server does not process the second request until it has finished the first, even though the second would return quickly on its own.
So I want to know how does laravel store and process the requests?
Laravel does not process requests directly; this is managed by your web server and PHP. Laravel receives a request that your web server has already accepted, because it is only a tool, written in PHP, which processes the data related to a request. So, as long as your web server knows how to execute PHP and calls the proper index.php file, Laravel will be booted and will process the request data it receives from the web server.
So, if your webserver is able to receive 2 different calls (usually they do that in the hundreds), it will try to instantiate 2 PHP (sub)processes, and you should have 2 Laravel instances in memory running in parallel.
So if you have code which depends on other code that may take too long to execute, depending on many other factors, you'll have to deal with that yourself in your Laravel app.
What we usually do is just add data to a database and then get a result back from a calculation done with the data already in the datastore. That way it should not matter in which order the data reached the datastore: the end result is always the same. If you cannot rely on this kind of methodology, you'll have to prepare your app to deal with it.
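As a rough illustration of that pattern (table and column names are invented):

```php
use Carbon\Carbon;
use Illuminate\Support\Facades\DB;

// Each request only appends its own data...
DB::table('measurements')->insert([
    'value'      => $request->input('value'),
    'created_at' => Carbon::now(),
]);

// ...and the result is always computed from everything already in the
// store, so the order in which requests arrived does not change it.
$average = DB::table('measurements')->avg('value');
```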
Every request in PHP starts a separate process. These processes are independent from each other until some shared resource comes into the picture.
In your case, one user is handled by one session, and sessions are file-based by default. The session file is a shared resource across processes, which means only one PHP call at a time can run for one user.
Multiple users can invoke any number of processes at once, depending on your system's capabilities.
According to the official Heroku website, a dyno can handle thousands of requests per second, depending on the language used. I am using PHP. Can you explain how many requests one dyno can handle?
Are all of my applications using the same dyno, or does each use a separate one?
Direct from the Heroku dev center:
A single-threaded, non-concurrent framework like Rails can process one request at a time. For an app that takes 100ms on average to process each request, this translates to about 10 requests per second per dyno.

Load testing your app is the only realistic way to determine request throughput.
If you are using something event-driven, like Node.js, more requests can be handled.
I have a PHP app on Heroku, the app does lots of communication with external APIs, which in turn trigger jobs on the database, the results are then displayed in a Facebook app...
Currently I have 2 worker processes and a web process. The web process triggers the workers and monitors database flags to know when each worker job is complete... I know, this setup isn't great; ideally I'd like to be notified in my web process when each worker process is finished, but that doesn't seem to be possible...
Is there a better way to approach this in Heroku using PHP?
Maybe a PHP app on Heroku isn't the best solution, but I've written lots of PHP that I'd rather not re-write....
Thanks in advance...
I can think of two relatively straightforward things you can do without ditching PHP (though I have to mention that PHP doesn't have much to recommend it, and you would likely be better off with Python/Django, Python/Flask or Ruby/Rails):
One is that you can switch to Redis for managing your workers instead of using your database. The advantage to this is that Redis has a pub/sub system where you can subscribe to signals while you hold a connection open. This means that if a connection is open, for instance from a web process, you will be notified of a change immediately without having to poll.
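A rough sketch of both sides using the phpredis extension (channel name and payload are made up):

```php
<?php
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

// Worker side: announce completion.
$redis->publish('jobs.finished', json_encode(['job_id' => 42]));

// Web side (a separate process): block on the channel instead of
// polling the database. subscribe() blocks until a message arrives.
$redis->subscribe(['jobs.finished'], function ($redis, $channel, $message) {
    echo $message; // unblocks as soon as the worker publishes
});
```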
Two is that you can switch to using ajax so that you don't block the loading of your page while you're waiting. Load your page immediately and then use javascript to hit a separate PHP page to periodically check for updates on the status of your job, and then use javascript to render the results on the page in place when the results are available.
Even better, use ajax long polling. Render your page immediately and then use javascript to send a request back. When your php page receives that second request, register a subscription with Redis and also manually check for updates (if you're not using Redis, just check for updates). If there are none, wait until the subscription receives a message or until 30 seconds have passed, whichever comes first. (To be honest, I've never done Redis subscriptions in PHP so I'm not sure how to implement that -- if you can't do it easily then just poll every couple of seconds instead.) If the 30 second timer expires, return json saying there are no results and have the javascript retry immediately. If you do receive results within that time, return them and have the javascript render them.
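The long-polling endpoint itself can be a simple check-and-sleep loop. A sketch, where check_job_status() is a placeholder for however you look up the job's state (Redis, database, etc.):

```php
<?php
// poll.php -- hold the request open for up to 30 seconds
set_time_limit(40); // headroom over PHP's default execution limit
header('Content-Type: application/json');

$deadline = time() + 30;

while (time() < $deadline) {
    $result = check_job_status(); // placeholder: query Redis or the database
    if ($result !== null) {
        echo json_encode(['done' => true, 'result' => $result]);
        exit;
    }
    sleep(2); // simple fallback polling interval
}

// Timed out with no result: tell the client to retry immediately.
echo json_encode(['done' => false]);
```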