Real-time Apps with Symfony - What technology to use?

Could someone explain to me how to build a real-time application with Symfony?
I have looked at a lot of documentation with my best friend Google, but I have not found sufficiently detailed articles.
I would prefer something more PHP-oriented and saw that there are technologies like ReactPHP / Ratchet, but I cannot find a tutorial clear enough to integrate them into an existing Symfony project.
Do you have any advice on which technologies to use and why? (If you have tutorial links, I'll take them!)
Thank you in advance for your answers!

Every useful Symfony application does some form of I/O. In traditional applications this is most often blocking I/O. Even if it's non-blocking I/O, it doesn't integrate a global event loop that could schedule other things while waiting for I/O.
If you integrate Symfony into an existing event-loop-based WebSocket server, it will work with blocking I/O as a proof of concept, but you will quickly notice it doesn't run well in production, because any blocking I/O blocks your whole event loop and thus blocks all other connected clients.
One solution is rewriting everything to non-blocking I/O, but then you'd no longer be using Symfony. You might be able to reuse some components, but only those not doing any I/O.
Another solution is to use RPC and push WebSocket requests into a queue. The intermediary can be written using non-blocking I/O only; it doesn't have much to do, since it basically just forwards WebSocket messages as RPC requests to a queue. You then have a set of workers pulling from that queue, doing a normal Symfony kernel dispatch and sending the response into a response queue. Each worker can then continue with the next job.
With the second solution you can totally use blocking I/O and all existing Symfony components. You can spawn as many workers as you need and you can even keep them alive between requests. The difference with a queue in between is that one blocking worker doesn't block the responsiveness of the WebSocket endpoint.
If you want multiple WebSocket processes, you'll need separate response queues for them, so the responses are sent back to the right process where the client is connected.
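To make this more concrete, here is a rough sketch of such a worker (a sketch only, assuming Beanstalkd via the Pheanstalk library; the tube names and JSON payload format are made up for illustration, and the real demo mentioned below differs in detail):

    <?php
    // Hypothetical worker: pull WebSocket messages from a queue, dispatch them
    // through the normal Symfony kernel, and push the result to a reply queue.
    // Tube names and the payload format are assumptions for illustration only.
    require __DIR__ . '/vendor/autoload.php';

    use App\Kernel;
    use Pheanstalk\Pheanstalk;
    use Symfony\Component\HttpFoundation\Request;

    $pheanstalk = Pheanstalk::create('127.0.0.1'); // Pheanstalk v4 API
    $kernel = new Kernel('prod', false);

    $pheanstalk->watch('ws-requests');

    while (true) {
        $job = $pheanstalk->reserve();             // blocking I/O is fine in a worker
        $payload = json_decode($job->getData(), true);

        // Turn the WebSocket message into a normal HTTP request for the kernel.
        $request = Request::create($payload['path'], 'POST', [], [], [], [], $payload['body']);
        $response = $kernel->handle($request);
        $kernel->terminate($request, $response);

        // Push the result to the response tube of the WebSocket process the
        // message came from, so it can be delivered to the right client.
        $pheanstalk->useTube($payload['replyTube'])->put(json_encode([
            'clientId' => $payload['clientId'],
            'body'     => $response->getContent(),
        ]));

        $pheanstalk->delete($job);                 // acknowledge the job
    }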
You can find a working implementation with Beanstalkd as the queue in kelunik/rpc-demo. src/Server.php is just for demo purposes and can be replaced with an HTTP server at any time. To keep the demo simple it uses a single WebSocket process, but that can be changed as outlined above. You can start php bin/server and php bin/worker, then use telnet localhost 2000 to connect and send messages. The workers will respond with the same message, base64-encoded.
The mentioned demo is built on Amp, but the same concepts apply to ReactPHP as well.

In this issue of the official Symfony repository you may find comments and ideas about this: https://github.com/symfony/symfony/issues/17051

Related

Lumen cannot read all packets from MQTT

I am working on an IoT project in which 10K devices send data every 5 seconds. To communicate with the server, I use the Mosquitto MQTT broker. From the broker, a Laravel (Lumen, to be exact) application reads the data, processes it and adds it to the database. To read data from MQTT into PHP, I am using the following package.
https://packagist.org/packages/salmanzafar/laravel-mqtt
This works perfectly at low load, but when high load hits the server, some of the packets go missing. The data reaches MQTT, but never reaches the database through PHP.
Has anyone faced this issue? I have only one topic. Can anyone suggest a better alternative library in PHP?
Our testing with Mosquitto suggests severe degradation around 1000 connections, mainly because it is a single-threaded broker. Of course, there are many variables that affect performance, but if you care about performance, you'll end up with a commercial broker.
E.g. see this blog post https://gambitcomm.blogspot.com/2018/08/video-monitor-end-to-end-latency-of.html which details end-to-end latency testing for 10k publishers.

Implementing WebSockets: the theory and the reality in PHP

This question has as much to do with theory as with real-life programming. I first asked it on cs.stackexchange.com (https://cs.stackexchange.com/questions/81472/question-about-implementing-websockets-theory-and-the-reality-in-php) because it is mostly theory, and I was instructed to ask it here instead.
I have been experimenting with WebSockets and PHP for many years now (some of this code is already in production). First I created a WebSocket (WS) server from scratch with non-blocking I/O and everything worked fine, except that in real life the other methods needed by the app couldn't be non-blocking (e.g. connecting to a DB and running a query). Then I introduced async programming, meaning that the WS server initiated various PHP requests to the server and checked in every loop whether those requests had finished, in order to send the results to the client. That worked well for a few client-side users connected to this WS server; the exact number depended on the operation, but it wouldn't be more than 30 or 50. That is because if you use only one thread and you have many simultaneous requests, you must check each one of them sequentially to see whether there is a finished result.
The next step was to analyze the code of popular approaches claiming that they can hold and process many (some say 10,000) clients at the same time. Maybe they knew something that I didn't (my issue isn't whether they are lying; the issue is whether there is something I am missing, or maybe I am wrong, here). The results were frustrating. Most of them don't use async by default, instead advising you not to use blocking methods (something that is really impossible in real-life programming), but even if you add modules to make them async, the same problem that I had arises.
The question isn't what the solution is, because I implemented PHP pthreads and could make it work, but with no real benefit (e.g. for sharing objects it had to serialize/unserialize everything). I have been writing C++ PHP extensions for some years now, so I am working on a PHP extension that will do this efficiently.
The question here is: am I missing something? How can they claim that they can handle a large number of requests simultaneously, while even with async programming they have to check, in the loop, whether each request has finished?
Thank you in advance for any new knowledge or direction to search that your answer might lead me.
Yes, there are projects that make this possible with PHP. One such project is Amp with its Aerys HTTP and WebSocket server. Yes, you can't just call blocking functions in the same thread. Yes, pthreads won't help; it's mostly like just running another PHP process, because everything in PHP is shared-nothing. But how does it work then?
Use non-blocking implementations where possible. There are libraries that work with non-blocking I/O for database access, such as amphp/mysql.
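For example, a non-blocking query with amphp/mysql could be sketched roughly like this (written against the v2 promise/yield API; the credentials and query are placeholders):

    <?php
    // Minimal sketch of a non-blocking query with amphp/mysql (v2 API).
    // While the query is in flight, the event loop can serve other clients.
    require 'vendor/autoload.php';

    use Amp\Loop;
    use Amp\Mysql;

    Loop::run(function () {
        $config = Mysql\ConnectionConfig::fromString(
            "host=127.0.0.1 user=app password=secret db=chat" // placeholder credentials
        );
        $pool = Mysql\pool($config);

        $statement = yield $pool->prepare("SELECT id, name FROM users WHERE id = ?");
        $result = yield $statement->execute([42]);

        while (yield $result->advance()) {
            $row = $result->getCurrent();
            var_dump($row['name']);
        }
    });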
If there's no such library, ask whether something like that can be implemented if you don't want to / can't implement it yourself.
Another possibility is to use libraries such as amphp/parallel that use persistent workers for blocking tasks. Spawning another worker for each blocking task would be horribly inefficient, so that library makes it easy to use worker pools and keep these workers alive for several tasks each.
One such library that makes use of amphp/parallel is amphp/file, which uses these workers for non-blocking filesystem access when no extensions like uv or eio are available; maybe you want to have a look at its ParallelDriver.
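To illustrate the worker-pool idea, here is a minimal sketch (assuming amphp/parallel v1 and its enqueueCallable() helper; the fetched URL is just an example):

    <?php
    // Minimal sketch: run a blocking function on a pooled worker process so it
    // doesn't stall the event loop (amphp/parallel v1, stated as an assumption).
    require 'vendor/autoload.php';

    use Amp\Loop;
    use function Amp\Parallel\Worker\enqueueCallable;

    Loop::run(function () {
        // file_get_contents() is blocking, but it runs in a separate worker
        // process here, so the event loop keeps serving other clients meanwhile.
        $body = yield enqueueCallable('file_get_contents', 'https://example.com/');
        echo strlen($body), " bytes fetched\n";
    });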
How many connections you will be able to handle concurrently depends a lot on your hardware and what you're doing with these connections. If you constantly stream data to each client, you will be able to keep much fewer connections open than in a situation where most connections are idle and only send / receive something in a small portion of the connected time.
If you want to handle more than ~1000 clients, you probably need an extension or have to recompile PHP because of the FD_SETSIZE limit for stream_select, which is compiled in and restricts stream_select to file descriptors lower than 1024.

Should I use Laravel Queues to manage threads across my application

I am looking to hit multiple 3rd-party APIs to gather information for a user's search query. I am planning to spin off a thread for each API I want to hit, to minimize the response time on my end. I also want to limit the number of threads my application can have running at any one time due to memory/CPU concerns.
Since I am using Laravel as my framework, I was trying to accomplish this using Laravel queues, but it seems that I might have trouble getting the response data from the Job.
Are Laravel queues the correct way to tackle this? If so, how do I listen for the job's status and retrieve the data once the job is complete? I see some things that point towards passing a closure to the job, but something just isn't clicking for me.
It depends. A job queue and worker pool might be appropriate if there are a really huge number of API calls to make, especially if those API calls can be very slow. But, I'd try to avoid all that architecture unless you're really sure you need it.
To start, I'd look at making async requests to the external APIs and try to keep the whole thing in a single process. The Guzzle HTTP client library provides a very programmer-friendly API for this kind of asynchronous request.
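For instance, concurrent requests with Guzzle could be sketched like this (assuming Guzzle 7 and its bundled promises library; the endpoint URLs are placeholders for your third-party APIs):

    <?php
    // Sketch: fire several API requests concurrently and wait for all of them
    // before merging the results. URLs below are placeholders.
    require 'vendor/autoload.php';

    use GuzzleHttp\Client;
    use GuzzleHttp\Promise\Utils;

    $client = new Client(['timeout' => 5]);

    $promises = [
        'flights' => $client->getAsync('https://api.example-flights.test/search?q=NYC'),
        'hotels'  => $client->getAsync('https://api.example-hotels.test/search?q=NYC'),
    ];

    // Utils::unwrap() waits for every promise and throws if any request failed.
    $responses = Utils::unwrap($promises);

    $results = [];
    foreach ($responses as $source => $response) {
        $results[$source] = json_decode((string) $response->getBody(), true);
    }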
If the external requests are really numerous or slow, you might consider using a queue. But in that case, you're looking at implementing a bunch of logic to queue all the jobs, then poll until they're done (giving feedback to your user along the way), and finally return the merged result. That may end up being necessary, but I'd start with the simpler implementation I describe above.

SendGrid for PHP is slow. Are non-blocking requests possible?

We are currently developing a mobile app for iOS and Android. For this, we need stable web services.
Requirements: based on PHP and MySQL, must be blazing fast, must be scalable.
I've created simple custom-coded web services with multiple endpoints to allow passing data from the app to our database, and vice versa.
My Question:
Our average response time with my custom-coded solution is below 100 ms (measured using New Relic) for normal requests (say, updating a DB field, or performing an INSERT INTO). This is without any load, however (below 100 users daily).
When we create outbound requests (specifically, sending e-mail using the SendGrid PHP framework), we see a response time of > 1000 ms. It appears that the request is "waiting" for a response from SendGrid, which is not really ideal. Is it possible to tell the script not to "wait for a response"?
My idea was to store all "pending" requests in a separate table, and then use a cron to run through all "pending" requests and mark them as "completed". Is this a viable solution? And will one cron each minute be enough to process the requests (a possible delay of 1 min for each e-mail)?
As always, any replies or suggestions are very appreciated. Thanks in advance!
To answer the first part of your question: Yes you can make asynchronous requests with PHP, and even ignore the service's response. However, as you correctly say it's not a super great solution.
Asynchronous Requests
This excellent blog post on PHP Asynchronous Requests by Segment.io comes to several conclusions:
You can open a socket and write to it, as described in this Stack Overflow topic (a minimal sketch of this appears after the list) - However, it seems that this is actually blocking and fairly slow (300ms in their tests).
You can write to a log file and then process it in another way (essentially a queue, like you describe) - However, this requires another process to read the log and process it. Using the file system can be slow, and shared files can cause all sorts of problems.
You can fork a cURL request - However, this means you aren't waiting for a response, so if SendGrid (or some other service) responds with an error, you can't catch it and react.
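As an illustration of the first approach, here is a rough fire-and-forget sketch (the SendGrid request is heavily simplified and the API key is a placeholder); note that it still blocks while the connection is established, which is roughly the 300ms measured above:

    <?php
    // Open a socket, write the HTTP request, and close without reading the
    // response. Simplified illustration only; the real SendGrid v3 payload
    // is more involved and the API key below is a placeholder.
    $fp = fsockopen('ssl://api.sendgrid.com', 443, $errno, $errstr, 1);
    if ($fp) {
        $body = json_encode(['to' => 'user@example.com']); // placeholder payload
        $request = "POST /v3/mail/send HTTP/1.1\r\n"
            . "Host: api.sendgrid.com\r\n"
            . "Authorization: Bearer YOUR_API_KEY\r\n"
            . "Content-Type: application/json\r\n"
            . "Content-Length: " . strlen($body) . "\r\n"
            . "Connection: close\r\n\r\n";
        fwrite($fp, $request . $body);
        fclose($fp); // don't wait for SendGrid's reply, so errors go unnoticed
    }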
Opinion Land
We're now entering semi-opinion land, but queues as you describe (such as a MySQL one with a cron job, a text file, or something else) tend to be very scalable, as you can throw workers at the queue if you need it to process faster. These can also live outside your user-facing system (and therefore not share resources).
Queues
With a queue, you'd have a separate service responsible for sending an email with SendGrid (for example). It would pull tasks off a queue (e.g. "send an email to Nick") and then execute them.
There are several ways to implement queues that you can process.
You can write your own - As you seem to want to stay on PHP/MySQL, if you do this you'll need to take into account a bunch of queueing problems and weird edge cases. However, you'll have absolute control, and for a simple application maybe this will work (a minimal sketch of this appears after the list).
You can implement a self-hosted task queue - Celery is meant to be a distributed task queue, and ØMQ (ZeroMQ) and RabbitMQ can also be used as task queues. These are meant to be fast and distributed and have had a lot of thought put into them. You'd need to benchmark them in your system to see if they speed it up. It also means you have to host additional pieces yourself. This, however, is likely to be the fastest solution from a communication standpoint.
You can pass things off to a hosted task queue - IronMQ and Amazon SQS are both cool hosted solutions, which means you wouldn't need to dedicate resources to them; additionally, with IronWorker (for example) you could have the other service taken care of as well. However, since you're trying to optimize a request to an external service, this probably isn't the solution in this scenario.
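For the "write your own" option, a minimal sketch of the pending-emails table you describe might look like this (the table, column names and the sendWithSendGrid() helper are hypothetical):

    <?php
    // Hypothetical cron worker (run every minute): pick up pending emails,
    // send them, and mark them as completed. Names here are assumptions.
    require 'vendor/autoload.php';

    $pdo = new PDO('mysql:host=127.0.0.1;dbname=app', 'app', 'secret');

    $rows = $pdo->query(
        "SELECT id, recipient, subject, body FROM pending_emails WHERE status = 'pending' LIMIT 100"
    )->fetchAll(PDO::FETCH_ASSOC);

    $done = $pdo->prepare("UPDATE pending_emails SET status = 'completed' WHERE id = ?");

    foreach ($rows as $row) {
        // sendWithSendGrid() stands in for your actual SendGrid library call.
        if (sendWithSendGrid($row['recipient'], $row['subject'], $row['body'])) {
            $done->execute([$row['id']]);
        }
        // On failure the row stays 'pending' and is retried on the next cron run.
    }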
Queueing Emails
On the topic of queueing emails specifically, this is something common to email senders. As with everything else, it means you get better reliability (because if a service down the line fails, you can keep the message in the queue and retry).
With email, however, there are some specific services out there for queueing messages: SMTP servers. Theoretically you can set up a server like sendmail and then configure SendGrid as your "smarthost" or relay, and have the server send to SendGrid. It then queues, deals with service interruptions, and sends mail with little additional code. However, SMTP servers are a pain to deal with, even if they're just forwarding messages. Additionally, SMTP is even slower than HTTP at establishing a connection and is therefore probably not what you want, but it's good to know.
Another possible solution, if you control your own server environment, that will speed up both your email sending and your application is to install a mail server such as Postfix locally. You then configure Postfix to use your SendGrid credentials, so any email sent goes from your server to SendGrid.
This is not a PHP solution, but it removes the need to write your own custom solution. If you set Postfix as the default mail server, you can then just use the PHP mail() function to send email.
https://sendgrid.com/docs/Integrate/Mail_Servers/postfix.html
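Once Postfix is relaying through SendGrid, sending from PHP becomes a plain mail() call, which hands the message to the local mail server almost instantly, for example:

    <?php
    // Postfix queues the message locally and forwards it to SendGrid; the
    // addresses here are placeholders.
    mail(
        'user@example.com',                 // recipient
        'Welcome!',                         // subject
        'Thanks for signing up.',           // body
        'From: noreply@yourdomain.example'  // additional headers
    );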

Ajax Long Polling Restrictions

So a friend and I are building a web-based AJAX chat application with a jQuery and PHP core. Up to now, we've been using the standard procedure of calling the server every two seconds or so looking for updates. However, I've come to dislike this method, as it's not fast, nor is it "cost effective" in that there are tons of requests going back and forth from the server, even if no data is returned.
One of our project supporters recommended we look into a technique known as COMET, or more specifically, long polling. However, after reading about it in different articles and blog posts, I've found that it isn't all that practical when used with Apache servers. It seems that most people just say "It isn't a good idea", but don't give many specifics about how many requests Apache can handle at one time.
The whole purpose of PureChat is to provide people with a chat that looks great, goes fast, and works on most servers. As such, I'm assuming that about 96% of our users will be using Apache, and not Lighttpd or Nginx, which are supposedly better suited for long polling.
Getting to the Point:
In your opinion, is it better to continue using setInterval and repeatedly request new data? Or is it better to go with long polling, despite the fact that most users will be using Apache? Also, is it possible to get a more specific rundown of approximately how many people can use the chat before an Apache server rolls over and dies?
As Andrew stated, a socket connection is the ultimate solution for asynchronous communication with a server, although only the most cutting-edge browsers support WebSockets at this point. socket.io is an open-source API you can use which will initiate a WebSocket connection if the browser supports it, but will fall back to a Flash alternative if it does not. This is transparent to the coder using the API, however.
Socket connections basically keep open communication between the browser and the server so that each can send messages to each other at any time. The socket server daemon would keep a list of connected subscribers, and when it receives a message from one of the subscribers, it can immediately send this message back out to all of the subscribers.
For socket connections however, you need a socket server daemon running full time on your server. While this can be done with command line PHP (no Apache needed), it is better suited for something like node.js, a non-blocking server-side JavaScript api.
node.js would also be better for what you are talking about: long polling. Basically, node.js is event-driven and single-threaded. This means you can keep many connections open without having to open as many threads, which would eat up tons of memory (Apache's problem). This allows for high availability. What you have to keep in mind, however, is that even if you were using a non-blocking web server like Nginx, PHP has many blocking network calls. Since it runs on a single thread, each (for instance) MySQL call would basically halt the server until a response for that MySQL call is returned. Nothing else would get done while this is happening, making your non-blocking server useless. If, however, you used a non-blocking language like JavaScript (node.js) for your network calls, this would not be an issue. Instead of waiting for a response from MySQL, it would set a handler function to handle the response whenever it becomes available, allowing the server to handle other requests while it waits.
For long polling, you would basically send a request, the server would wait 50 seconds before responding. It will respond sooner than 50 seconds if it has anything to report, otherwise it waits. If there is nothing to report after 50 seconds, it sends a response anyways so that the browser does not time out. The response would trigger the browser to send another request, and the process starts over again. This allows for fewer requests and snappier responses, but again, not as good as a socket connection.
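A long-polling endpoint in plain PHP could be sketched like this (getMessagesSince() is a hypothetical lookup against whatever storage the chat uses); keep in mind that each waiting client occupies an Apache worker for the whole wait, which is exactly the concern raised above:

    <?php
    // Hypothetical long-polling endpoint: wait up to 50 seconds for new chat
    // messages, then respond (empty if nothing arrived, so the browser re-polls).
    $since   = (int) ($_GET['since'] ?? 0); // last message id the client has seen
    $timeout = 50;                          // seconds, as described above
    $start   = time();

    do {
        $messages = getMessagesSince($since); // e.g. a SELECT against your chat table
        if (!empty($messages)) {
            break;
        }
        usleep(250000); // sleep 250 ms between checks to avoid hammering the database
    } while (time() - $start < $timeout);

    header('Content-Type: application/json');
    echo json_encode($messages);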
