Why does only ASP.NET have an asynchronous programming model? - php

I work with ASP.NET. IMHO, the asynchronous programming support in ASP.NET is beautiful: we can use BeginXXXX/EndXXXX method pairs to improve scalability for resource-intensive tasks.
For example, suppose one operation needs to fetch a huge amount of data from a database and render it on the response page. If this operation is synchronous, the thread handling the request is occupied for the whole page life cycle. Since threads are a limited resource, it is always better to program I/O-bound operations asynchronously. That is, ASP.NET allocates a thread to invoke the BeginXXXX method with a callback function. The thread that invokes BeginXXXX returns immediately and can be scheduled to handle other requests. When the job is done, the callback is triggered and ASP.NET invokes EndXXXX to get the actual response.
This asynchronous programming model takes full advantage of threading resources. Even though the ThreadPool has a limit, it can actually handle many more requests. However, if we program synchronously and each request involves lengthy I/O, the number of concurrent requests cannot exceed the size of the thread pool.
Recently, I have had the chance to explore other web development solutions such as PHP and Ruby on Rails. To my surprise, these solutions don't have a counterpart to this asynchronous programming model. Each request is handled by one thread or process for its whole life cycle. That is, the thread or process is occupied until the last bit of the response is sent.
There is something similar to asynchronous handling (http://netevil.org/blog/2005/may/guru-multiplexing), but the bottom line is that one thread or process is always occupied for the request. This is not like ASP.NET.
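For reference, here is a rough sketch of the kind of multiplexing that article describes (the backend hostnames are made up and error handling is omitted); note that the PHP process itself is still tied up for the whole exchange, which is exactly the limitation I mean:

<?php
// Rough multiplexing sketch: talk to two back ends at once from one PHP process.
$a = stream_socket_client('tcp://backend-one.example:80', $errno, $errstr, 5);
$b = stream_socket_client('tcp://backend-two.example:80', $errno, $errstr, 5);
stream_set_blocking($a, false);
stream_set_blocking($b, false);

fwrite($a, "GET / HTTP/1.0\r\nHost: backend-one.example\r\n\r\n");
fwrite($b, "GET / HTTP/1.0\r\nHost: backend-two.example\r\n\r\n");

$pending = array($a, $b);
while (count($pending) > 0) {
    $read   = $pending;
    $write  = null;
    $except = null;
    // Wait until at least one socket has data, instead of reading them one by one.
    if (stream_select($read, $write, $except, 5) === false) {
        break;
    }
    foreach ($read as $sock) {
        $chunk = fread($sock, 8192);
        if ($chunk === '' || $chunk === false) {
            // Remote side finished; stop watching this socket.
            fclose($sock);
            $pending = array_filter($pending, function ($s) use ($sock) {
                return $s !== $sock;
            });
        } else {
            // ...process $chunk as it arrives...
        }
    }
}
?>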
So, I am wondering: why don't these popular web solutions have an asynchronous programming model like ASP.NET's? Why has only ASP.NET evolved to use the asynchronous approach?
Is it because PHP and Ruby on Rails are mostly deployed on Linux, and Linux doesn't suffer the process/thread performance penalty that Microsoft Windows does?
Or is there actually an asynchronous solution for PHP and Ruby on Rails that I haven't found?
Thanks.

I don't have a definitive answer to your question, but I can make an educated guess.
Systems such as PHP and Ruby are designed to be very platform-independent, whereas ASP.NET is deeply integrated into the Windows platform. In addition, PHP is more like old-style ASP, with a linear, start-to-finish flow.
Full ASP.NET-style async pages require not only threads but also native async I/O to be used to their full effect. Async I/O is an OS-specific capability. Async pages also rely on the concept of a page life cycle, which is anathema to the linear-flow style. Without a page life cycle, it becomes much more difficult to integrate the results of the async calls with the rest of your page.
Just my two cents, YMMV.

Related

Are SSE and PHP better than AJAX for a chat app?

I've been reading a lot on the subject of SSE and PHP, most of which seems to advocate it as a viable solution for all sorts of things, including chat apps. I have seen similar questions on this site but have not found a concise, definitive answer.
Is there something inherent in SSE which makes it way more server-friendly than AJAX short polling? The headers appear to be of very similar size. I am wondering if there is some kind of behind-the-scenes work beyond the headers that a noob like myself can't see, e.g. some sort of connection recognition with each request/response. I know there are other factors where SSE prevails, such as handling disconnections.
In terms of using them in a chat-app scenario, AJAX and SSE appear to be doing the same thing. Neither of them seems to be able to perform long polling effectively with PHP. If I have User A and User B waiting on a PHP script that checks the DB for new messages from the other user and then sleeps for 3 seconds, for say 10 loops, User A's new message cannot be inserted until User B has looped through the entire checking script, rendering it absolutely useless (at least based on everything I've tried in the last 2 weeks!). I can get it working smoothly if I chat with myself and no one else is waiting on the checking script, but I've run out of things to talk about with myself and would really enjoy someone else being able to use it too.
So in a nutshell, given an Apache and PHP environment with WebSockets not an option (due to shared hosting), and based on server burden alone, is short polling with one's choice of either AJAX or SSE the only effective way to write a chat app, or is SSE definitely the superior option?
I would pursue WebSockets if the eventual traffic called for it and justified the web hosting upgrade.
(ALSO, as an aside, is my premise off base regarding the long-polling scenario I described above where User A must wait for User B's loop to finish before he/she/it can perform the insert? I'm confused as to why that should be the case.)
Kind of a long-winded, meandering question, but I'm hoping someone in the same situation can find this question and save themselves a lot of time.
Many Thanks!
Yes, SSE is a better option than AJAX here, because AJAX polling hits the main servers, where most of the normal user traffic already goes. SSE polling, by contrast, can be handled by a separate instance made for it, so there is no extra traffic on the main server. Please check Mercure (https://mercure.rocks/).
EDIT:
What I mean is that using SSE with a platform like Mercure would be a better option than AJAX. AJAX makes its requests to the main server, which increases the request count on that server, whereas with a tool like Mercure we can distribute the network load and still achieve the required functionality.
SSE can be thought of as a thin API wrapper around the AJAX long-poll approach. It brings a standard API to something that was previously a hacky solution.
something inherent in SSE which makes it way more server-friendly than AJAX short polling?
It holds the socket open. The pro of this is less latency (as soon as the server has the new information it sends it to the client, rather than waiting for the next client poll); the con is the extra resource usage (the socket, and the PHP process).
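For context, here is a minimal sketch of the PHP side of an SSE endpoint (fetch_new_messages() is a made-up placeholder for your own DB check); the script, and therefore one PHP process, stays alive for as long as the client is connected:

<?php
// Minimal SSE endpoint sketch: one PHP process stays busy per connected client.
set_time_limit(0);                        // keep the script alive past the default limit
header('Content-Type: text/event-stream');
header('Cache-Control: no-cache');

while (!connection_aborted()) {
    $messages = fetch_new_messages();     // hypothetical helper that queries the DB
    foreach ($messages as $msg) {
        echo "data: " . json_encode($msg) . "\n\n";
    }
    @ob_flush();
    flush();
    sleep(2);                             // pause between DB checks
}
?>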
but I've run out of things to talk about with myself
Surely not. Have you tried starting a chat about if time is an illusion, and what came before?
with WebSockets not an option (due to shared hosting)
SSE and WebSockets both hold a socket open. Shared-hosting ISPs often go around closing sockets that have been open a long time (e.g. over 60s), unless they explicitly say they support SSE. They may also kill long-running PHP processes.
is my premise off base regarding the long-polling scenario I described above where User A must wait for User B's loop to finish before he/she/it can perform the insert?
I think it is off. The "A" in AJAX stands for asynchronous, meaning you can have multiple AJAX/SSE requests running at the same time. And on the server side you will have a distinct PHP process running for each request.

Parallel/Asynchronous PHP

I have been tasked with rebuilding an application (CakePHP 2.0, PHP 5.6) that receives a request, reformats/maps the body to API-specific fields, and makes requests to multiple APIs with the newly mapped body.
As the responses come back, they are decoded and placed in the output array as the application's response.
Currently the decoding (mapping from API-specific fields) happens in sequence, once all the Multicurl requests have returned.
My idea is to process the responses as soon as they arrive, and I am attempting to do so in parallel.
One complexity is that every target API needs 4 very specific mapping functions, so every API object has a map and a reverse map for 2 different operations.
A client requirement is to have the minimum number of dependencies; the solution should preferably be in raw PHP, with no libraries.
The KISS solution has been requested.
I have considered the following approaches, but they all have drawbacks:
1. Multicurl waits for the slowest response before returning all responses. This is the current approach; there is no parallel response processing.
2. pthreads is not compatible with Apache, command line only.
3. Complex objects (the API objects) can't easily be passed via sockets.
4. Too many dependencies and/or too immature:
a) Appserver
b) Kraken
c) RabbitMQ
d) socket.io
I am looking for PHP 7 (nothing else) alternatives for this task.
Any suggestions?
It's worth noting that 'parallel' and 'asynchronous' are separate concepts.
E.g. ReactPHP and its ilk (node.js included) are asynchronous but still single-threaded, relying on event loops, callbacks, and coroutines to allow out-of-order execution of code.
Responding to your assessment of approaches:
Accurate assessment of curl_multi().
However, your stated case is that all of this needs to take place within the context of a single request, so no matter what you do you're going to be stuck waiting on the slowest API response before you can serve your response.
Unless you're fundamentally changing your workflow you should probably just stick with this.
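That said, even within a single request, curl_multi can hand you each response as soon as it completes rather than only after the slowest one finishes, via curl_multi_info_read(), so the decoding/mapping can start while the other APIs are still in flight. A rough sketch (decode_response() stands in for your per-API mapping functions, and $apiUrls for your endpoint list):

<?php
// Sketch: fire all requests at once and decode each one as soon as it completes.
$mh      = curl_multi_init();
$handles = array();
foreach ($apiUrls as $key => $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);
    $handles[(int) $ch] = $key;            // PHP 7 curl handles are resources, so the cast works as an id
}

$results = array();
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);                // wait for activity instead of busy-looping
    while ($info = curl_multi_info_read($mh)) {
        $ch  = $info['handle'];
        $key = $handles[(int) $ch];
        // Decode/map this response now, while the slower APIs are still running.
        $results[$key] = decode_response($key, curl_multi_getcontent($ch));
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
} while ($running > 0);
curl_multi_close($mh);
?>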
It's not that pthreads is incompatible with Apache, it's that it's incompatible with mod_php.
Use an FCGI worker model like FPM and you can use pthreads all you want.
Why not? That's what serialization is for.
So long as you never accept it from users or want to use it outside of PHP, serialize() is one option.
In pretty much all other cases json_encode() is going to be the way to go.
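For example, a short sketch of moving a mapped request or API object across a socket or queue as a string ($mappedBody and $apiObject are stand-ins for your own structures):

<?php
// Sender side: turn the structure into a string payload.
$payload = json_encode($mappedBody);
// ...send $payload over a socket, queue, etc...
// Receiver side: rebuild the data structure.
$data = json_decode($payload, true);

// serialize()/unserialize() also round-trips full PHP objects,
// but only use it for data you generated yourself (never user input).
$blob = serialize($apiObject);
$copy = unserialize($blob);
?>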
If you're just going to write off solutions wholesale like this you're going to have a bad time, particularly if you're trying to do something that's inherently at odds with PHP, like parallel/async processing.

PHP - Is there any way to control process executions?

I have PHP and Node.js installed on my server. I am calling Node.js code that uses node-canvas via PHP, like this:
<?php
exec('node path/to/the/js/file');
?>
Problem:
Each execution consumes around 250 MB of memory because of node-canvas. So if my server has around 8 GB of memory, only about 32 users can use the service at any given point in time, and there is also a risk of crashing the server if the number of users exceeds that.
Is there any scalable solution to this problem?
UPDATE: I have to use canvas server-side because of my business requirements, so I am using node-canvas, but it is consuming memory heavily, which is a huge problem.
Your problem is that you start a new node.js process for each request; that is why the memory footprint is so huge, and it isn't what node.js is built for.
But node.js is built to handle a lot of different requests in a single process, so use that to your advantage.
What I advise you to do is run only one node.js process and find another way to communicate between your PHP process and the node.js process.
There are a lot of different ways to do that, some more performant than others, some harder to build than others. All have pros and cons, but since both are web-related languages, you can be sure both have support for HTTP requests.
So what you should build is a basic node.js/Express server, probably with only one API endpoint, which executes the code you already wrote and returns the result. It is easy enough to do (especially if you use JSON to communicate between them), and while I don't know PHP, I'm pretty sure it is easy to send an HTTP request and interpret the answer.
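On the PHP side, a sketch of what calling that single node.js service might look like (the endpoint URL, port, and payload are invented for illustration):

<?php
// Sketch: ask the long-running node.js service to render the canvas, instead of exec()ing node each time.
$ch = curl_init('http://127.0.0.1:3000/render');            // hypothetical Express endpoint
curl_setopt_array($ch, array(
    CURLOPT_POST           => true,
    CURLOPT_POSTFIELDS     => json_encode(array('width' => 800, 'height' => 600)),
    CURLOPT_HTTPHEADER     => array('Content-Type: application/json'),
    CURLOPT_RETURNTRANSFER => true,
));
$response = curl_exec($ch);
curl_close($ch);

$result = json_decode($response, true);                     // whatever the node.js service returns
?>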
If you are ready to dig into node.js, you could try sockets or a message queue, which should be more performant.
That way, you only have one node.js process, which shouldn't grow in memory and can handle a lot more clients; you will not have to use exec, and you get a first try with Express.

Using php + gearman + node.js

I am considering building a site using PHP, but there are several aspects of it that would perform far, far better if made in node.js. At the same time, large portions of the site need to remain in PHP. This is because a lot of functionality is already developed in PHP, and redeveloping, testing, and so forth would be too large an undertaking, and quite frankly, those parts of the site run perfectly fine in PHP.
I am considering rebuilding in node.js the sections that would benefit most from it, then having PHP pass the request to node.js using Gearman. This way, I can scale out by launching more workers and have Gearman handle the load distribution.
Our site gets a lot of traffic, and I am concerned whether Gearman can handle this load. I want to keep this question productive, so let's focus largely on the following addressable points:
Can Gearman handle all of our expected load, assuming we have the memory (potentially around 3000+ queued jobs at a time, with several thousand being processed per second)?
Would this run better if I just passed the requests to node.js using CURL, and if so, does node.js provide any way to distribute the load over multiple instances of a given script?
Can gearman be configured in a way that there is no single point of failure?
What are some issues that you guys can see arising both in terms of development and scaling?
I am addressing this wide range of points so anyone viewing this post can collect a wide range of information in one place regarding matters that strongly affect each other.
Of course I will test all of this, but I want to collect as much information as possible before potentially undertaking something like this.
Edit: A large reason I am using Gearman is not because of its non-blocking structure, but because of its sheer speed.
I can only speak to your questions on Gearman:
Can gearman handle all of our expected load assuming we have the memory (potentially around 3000+ queued jobs at at time, with several thousand being processed per second)?
Short: Yes
Long: Everything has its limit. If your job payloads are inordinately large you may run into issues. Gearman stores its queue in memory, so if your payloads exceed the amount of memory available to Gearman you'll run into problems.
Can gearman be configured in a way that there is no single point of failure?
Gearman has a plugin/extension/component available to use MySQL as a persistence store. That way, if Gearman or the machine itself goes down you can bring it right back up where it left off. Multiple worker-servers can help keep things going if other workers go down.
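For reference, here is a minimal sketch of the PHP side using the pecl gearman extension (the function name 'process_task', the port, and the payload are made up; the worker could just as well be a node.js process registering the same function name):

<?php
// Client side, e.g. inside a PHP web request: hand the job to Gearman.
$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730);                     // default gearmand port
$result = $client->doNormal('process_task', json_encode(array('id' => 123)));  // blocks until a worker answers
// $client->doBackground('process_task', ...) would queue the job without waiting for the result.

// Worker side: a long-running process that registers the same function name.
$worker = new GearmanWorker();
$worker->addServer('127.0.0.1', 4730);
$worker->addFunction('process_task', function (GearmanJob $job) {
    $data = json_decode($job->workload(), true);
    return json_encode(array('ok' => true, 'id' => $data['id']));   // return value is sent back to the client
});
while ($worker->work()) {
    // handle one job per iteration, forever
}
?>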
Node has a cluster module that can do basic load balancing against n processes. You might find it useful.
A common architecture here in nodejs-land is to have your nodes talk HTTP and then use some form of load balancing such as an HTTP proxy or a service registry. I'm sure it's more or less the same elsewhere. I don't know enough about Gearman to say whether it'll be "good enough", but if this is the general idea then I'd imagine it would be fine. At the least, I'm sure other people would be interested in hearing how it went!
Edit: Remember, number-crunching will block node's event loop! This is somewhat obvious if you think about it, but definitely something to keep in mind.

AJAX long-polling a REST API/Memcached in a PHP application

No, I'm not trying to see how many buzzwords I can throw into a single question title.
I'm making REST requests through cURL in my PHP app to some web services. These requests need to be made fairly often, since much of the application depends on this API. However, there is severe latency with the requests (2-5 seconds), which just makes my app look painfully slow.
While I'm halfway to a solution with a recommendation to cache these requests in Memcached, I'm still not satisfied with that kind of latency ever appearing within the application.
So here was my thought: I can implement AJAX long-polling in the background so that the user never experiences the latency outright. The REST requests/Memcache lookups will all be done through AJAX at a set interval.
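To sketch that idea (a cache-aside wrapper around the slow REST call; the cache key, TTL, and endpoint are made up), the AJAX poll would hit a small PHP script along these lines, so the 2-5 second cURL cost is only paid when the cached copy has expired:

<?php
// Cache-aside sketch: serve from Memcached when possible, refresh via cURL on a miss.
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

$key  = 'api:resource:42';                                  // made-up cache key
$data = $mc->get($key);

if ($data === false && $mc->getResultCode() === Memcached::RES_NOTFOUND) {
    // Cache miss: pay the slow REST call once, then keep the result for 60 seconds.
    $ch = curl_init('https://api.example.com/resource/42'); // hypothetical endpoint
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $data = curl_exec($ch);
    curl_close($ch);
    $mc->set($key, $data, 60);
}

header('Content-Type: application/json');
echo $data;
?>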
But this is all really new to me and I'm not sure if this is the best approach. And if I'm on the right track, I do know that PHP + Apache is not going to handle something like this well. But PHP is the only language I know. I'd ideally like to set up something like Tornado in Python, but I'm just not sure if I'm over-engineering right now or not.
Any thoughts here would be helpful and much appreciated.
This was a pretty quick turnaround, but I went back through and profiled my app by echoing out microtime() throughout the relevant processes. It turns out that I'm not parallelizing my cURL requests, and that's where I take the real hit. It takes approximately 2 seconds to do all of them, which means very long delays while the cURL requests run one after another.
