I'm building a chat application with JavaScript, jQuery, MySQL and PHP, and I'm wondering what the best way is for the client to retrieve chat messages from the server. My current candidates are polling, long polling, HTML5 Server-Sent Events (EventSource), and WebSockets. Which of these would be the fastest (instantaneous messages) and most efficient method (explain why too, if possible)? Or if there is a better way to do this, please detail it in the answer.
Additionally, I've also looked at Node.js + Socket.IO, but the documentation and sample code I found for those did not make a drop of sense to me.
Finally, I'm using XAMPP as my local server and MySQL as my database for this application.
Any help would be appreciated.
Coincidentally, your options are listed in order of efficiency, from least to most.
Polling is the least efficient. It sends requests whether or not there are new messages, and it introduces latency between a message being sent and being received by other clients.
Long polling is better; then you can get the message when it's sent, but then there might be a slight delay in reconnecting. During that delay, messages will not be delivered, but otherwise, it's practically instantaneous.
COMET (not mentioned) is better than long polling, but worse than Server-Sent Events. It, too, must occasionally reconnect due to most web servers and browsers having timeouts on connections, but connections need not be re-established whenever a message is sent. Like long polling, there might be a delay while reconnecting, but otherwise, it's usually instantaneous.
Server-Sent Events are similar to COMET, but when not shimmed they have native support from the browser, so they can bypass the timeout limitations and need only one connection over their lifetime (as long as the connection isn't broken). Another advantage is that the browser automatically reconnects if the connection is broken, without any client-side code on your part. Delivery is instantaneous.
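To make the single-connection and auto-reconnect points concrete, here is a minimal sketch (in JavaScript, not taken from the answer above) of how one event is framed on the wire. The field names id, event, retry and data come from the text/event-stream format; the "chat" event name and the message fields are made-up examples.

```javascript
// Build one event in the text/event-stream wire format.
// `id`, `event` and `retry` are optional; `data` may span several lines.
function formatSseEvent({ id, event, retry, data }) {
  const lines = [];
  if (id !== undefined) lines.push(`id: ${id}`);          // sent back as Last-Event-ID when the browser reconnects
  if (event !== undefined) lines.push(`event: ${event}`); // custom event name (the default is "message")
  if (retry !== undefined) lines.push(`retry: ${retry}`); // reconnection delay in milliseconds
  for (const line of String(data).split("\n")) {
    lines.push(`data: ${line}`);                          // one "data:" field per line of payload
  }
  return lines.join("\n") + "\n\n";                       // a blank line terminates the event
}

// Hypothetical chat message, serialised as JSON so it stays on one line.
const frame = formatSseEvent({
  id: 42,
  event: "chat",
  data: JSON.stringify({ from: "alice", text: "hi" }),
});
```

On the client, new EventSource(url) with addEventListener("chat", handler) would receive such an event, provided the server sends the response with Content-Type: text/event-stream. Because the browser remembers the last id it saw, the server can resume from that point after a reconnect.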
WebSockets are by far the best of these options. A WebSocket needs only one connection, and it's duplex: not only can you receive messages through it, you can also send messages through it, rather than making a separate request to the server every time you want to send one. Unlike Server-Sent Events, it does require some more code: it does not automatically reconnect if the connection is broken, and server-side implementations are typically more complex. I'm also not sure if you can use it with Apache/XAMPP. Delivery is instantaneous.
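Since a WebSocket does not reconnect by itself, the extra client code might look something like the sketch below. It is only an illustration under assumptions: the helper names reconnectDelay and connectWithRetry are made up, and the socket factory is injected so the logic can be tried without a browser.

```javascript
// Exponential backoff: 1s, 2s, 4s, ... capped at maxMs.
function reconnectDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Keep a socket alive by reopening it whenever it closes.
// `createSocket` must return an object with onopen/onmessage/onclose
// properties, as the browser's WebSocket does; `schedule` defaults to setTimeout.
function connectWithRetry(createSocket, onMessage, schedule = setTimeout) {
  let attempt = 0;
  function open() {
    const socket = createSocket();
    socket.onopen = () => { attempt = 0; };               // healthy again: reset the backoff
    socket.onmessage = (event) => onMessage(event.data);  // hand the payload to the app
    socket.onclose = () => {
      schedule(open, reconnectDelay(attempt));            // try again later
      attempt += 1;
    };
  }
  open();
}
```

In a browser this could be called as connectWithRetry(() => new WebSocket("wss://example.com/chat"), showMessage), where both the URL and showMessage are placeholders. The backoff keeps a crowd of clients from hammering the server the moment it comes back up.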
Socket.io is a library that supports (almost?) all of these and some more (e.g., Flash sockets) and abstracts it behind a nice API, so you don't have to deal with the idiosyncrasies of browser support for each of them. It is as fast as the transport it chooses to use, which depends on the browser it's running on. It also can cut down on the amount of code you have to write. However, if it's too complex for you and you don't care about older browsers, it is certainly not necessary. Additionally, it really likes to be run on its own. You might be able to get XAMPP to proxy to it, but again, I don't know if Apache can be configured to forward WebSockets to it.
I've been reading a lot on the subject of SSE and PHP, and most of it seems to advocate SSE as a viable solution for all sorts of things, including chat apps. I have seen similar questions on this site but have not found a concise, definitive answer.
Is there something inherent in SSE which makes it way more server-friendly than AJAX short polling? Because the headers appear to be of very similar size. I am wondering if there is some kind of behind-the-scenes stuff beyond the headers that a noob like myself can't see e.g. some sort of connection recognition with each request/response? I know there are other factors involved where SSE prevails such as handling disconnections.
In terms of using it in a chat app scenario, ajax and sse appear to be doing the same thing. Neither of them seems to be able to perform long polling effectively with PHP. If I have User A and User B waiting on a PHP script that checks for new messages from the other user in the DB then sleeps for 3 seconds for say 10 loops, User A's new message cannot be inserted until User B has looped through the entire checking script, thereby rendering it absolutely useless (at least based on everything I've tried in the last 2 weeks!). I can get it working smoothly if I chat to myself and no one else is waiting on the checking script, but I've run out of things to talk about with myself and would really enjoy someone else being able to use it too.
So in a nutshell, given an Apache and PHP environment with WebSockets as not an option (due to shared hosting), is the only effective way to write a chat app, based on server burden alone, by short polling with one's choice of either AJAX or SSE, or is SSE definitely the superior option?
I would pursue WebSockets if the eventual traffic called for it and justified the web hosting upgrade.
(ALSO, as an aside, is my premise off base regarding the long-polling scenario I described above, where User A must wait for User B's loop to finish before he/she/it can perform the insert? I'm confused as to why that should be the case.)
Kind of a long-winded, meandering question but hoping someone in the same situation can find this question and save themselves a lot of time.
Many Thanks!
Yes, SSE is a better option than AJAX polling. AJAX polling hits the main server, where most of the normal user traffic already lands, whereas SSE can be served from a separate instance dedicated to it, so it adds no extra traffic to the main server. Have a look at Mercure (https://mercure.rocks/).
EDIT:
I mean that using SSE with a platform like Mercure would be a better option than AJAX. Every AJAX poll is a request to the main server, which increases its request count, whereas a tool like Mercure lets you distribute that network load while still achieving the required functionality.
SSE can be thought of as a thin API wrapper around the AJAX long-poll approach. It brings a standard API to something that was a hacky solution before.
"...something inherent in SSE which makes it way more server-friendly than AJAX short polling?"
It holds the socket open. The pro of this is less latency (as soon as the server has the new information it sends it to the client, rather than waiting for the next client poll); the con is the extra resource usage (the socket, and the PHP process).
"...but I've run out of things to talk about with myself"
Surely not. Have you tried starting a chat about if time is an illusion, and what came before?
"...with WebSockets as not an option (due to shared hosting)"
SSE and WebSockets both hold a socket open. Shared hosting ISPs often go round closing sockets that have been open a long time (e.g. over 60s), unless they explicitly say they support SSE. They may also kill long-running PHP processes.
"...is my premise off base regarding the long-polling scenario I described above where User A must wait for User B's loop to finish before he/she/it can perform the insert?"
I think it is off. The "A" in Ajax is asynchronous, meaning you can have multiple ajax/sse requests running at the same time. And on the server side you will have a distinct PHP process running for each request.
This question is as much about theory as about real-life programming. I first asked it on cs.stackexchange.com because it is mostly theory, and I was told to ask here instead (https://cs.stackexchange.com/questions/81472/question-about-implementing-websockets-theory-and-the-reality-in-php).
I have been experimenting with WebSockets and PHP for many years now (some of this code is already in production). First I created a WebSocket (WS) server from scratch with non-blocking I/O, and everything worked fine, except that in real life other operations the app needs could not be non-blocking (e.g. connecting to a DB and running a query). Then I introduced async programming, meaning that the WS server initiated various PHP requests to the server and checked in every loop iteration whether those requests had finished, in order to send the results to the client. That worked well for a few client-side users connected to this WS server; the exact number depended on the operation, but it was never more than 30 or 50. That is because, if you use only one thread and have many simultaneous requests, you must check each of them sequentially to see whether a result is ready.
The next step was to analyze the code of popular approaches that claim they can hold and process many (some say 10,000) clients at the same time. Maybe they knew something that I didn't. (My issue isn't whether they are lying; it is whether there is something I am missing, or whether I am wrong.) The results were frustrating. Most of them don't use async by default and advise you not to use blocking methods (something that is practically impossible in real-life programming), and even if you add modules to make them async, the same problem that I had arises.
The question isn't what the solution is: I implemented PHP pthreads and could make it work, but with no real benefit (e.g. to share objects it had to serialize and unserialize everything). I have been writing C++ PHP extensions for some years now, so I am working on a PHP extension that will do this efficiently.
The question here is: am I missing something? How can they claim that they can handle a large number of requests simultaneously when, even with async programming, they have to check in the loop, for each request, whether it has finished?
Thank you in advance for any new knowledge or direction to search in that your answer might lead me to.
Yes, there are projects that make it possible with PHP. One such project is Amp with its Aerys HTTP and WebSocket server. Yes, you can't just call blocking functions in the same thread. Yes, pthreads won't help, it's mostly like just running another PHP process, because everything in PHP is shared nothing. But how does it work then?
Use non-blocking implementations where possible. There are libraries that work with non-blocking I/O for database access, such as amphp/mysql.
If there's no such library, ask whether something like that can be implemented if you don't want to / can't implement it yourself.
Another possibility is to use libraries such as amphp/parallel that use persistent workers for blocking tasks. Spawning another worker for each blocking task would be horribly inefficient, so that library makes it easy to use worker pools and keep these workers alive for several tasks each.
One such library that makes use of amphp/parallel is amphp/file, which uses these workers for non-blocking filesystem access when no extensions like uv or eio are available. You may want to have a look at its ParallelDriver.
How many connections you will be able to handle concurrently depends a lot on your hardware and what you're doing with these connections. If you constantly stream data to each client, you will be able to keep far fewer connections open than in a situation where most connections are idle and only send / receive something in a small portion of the connected time.
If you want to handle more than ~1000 clients, you probably need an extension or a recompiled PHP because of FD_SETSIZE, which is compiled in and limits stream_select to file descriptors lower than 1024.
OK, I am working on something like a chat environment, and I'd like to have near-real-time, if not real-time, conversation. But I know browsers will only give up 2 threads at a time for transactions per domain. So I am trying to figure out a way to make a synchronous chat without really affecting the browser. I also know browsers tend to lock up with synchronous requests.
So what's the best approach to creating a chat-like environment on a site from scratch? Assume the DB and scripting concept is fine; it's the managing of the connection I'm wondering about: how to keep a persistent connection that won't congest the browser and possibly cause it to freeze up.
Anyone have any ideas? I'm not looking for Flash or Java-based solutions. I'd prefer not to poll every second either. But what is Stack's impression? What would you do?
First off, the spec only suggests that two connections are allowed. Most modern browsers actually support up to 6.
There are three main accepted methods for creating a chat system in pure JavaScript:
Polling
The first solution is simple, and just involves polling the server every few seconds (5 is a nice number) to see what the client has missed. It is simple and works well enough, but if you're not careful it can generate a large number of unnecessary requests, which means unnecessary server load.
A better implementation of this involves polling to simply check if anything's happened since the last chat update, and if so, only then go through the process of finding out what's happened. Saves on the server load and bandwidth fronts.
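As a rough illustration of that "check first, fetch only if needed" idea, here is a small JavaScript sketch. Everything in it is hypothetical: MessageLog is an in-memory stand-in for the messages table, and pollOnce stands for one polling round of a client that remembers the last message id it has seen.

```javascript
// In-memory stand-in for the messages table (hypothetical; a real app
// would query the database for rows with an id greater than the last one seen).
class MessageLog {
  constructor() {
    this.messages = [];
    this.lastId = 0;
  }
  add(text) {
    this.lastId += 1;
    this.messages.push({ id: this.lastId, text });
    return this.lastId;
  }
  // Cheap check: has anything happened since the client's last update?
  hasNewSince(id) {
    return this.lastId > id;
  }
  // Full fetch, only done when the cheap check says yes.
  since(id) {
    return this.messages.filter((m) => m.id > id);
  }
}

// One polling round for a client that remembers the last id it has seen.
function pollOnce(log, client) {
  if (!log.hasNewSince(client.lastSeenId)) return [];     // nothing new: skip the expensive part
  const fresh = log.since(client.lastSeenId);
  client.lastSeenId = fresh[fresh.length - 1].id;         // remember where we got to
  return fresh;
}
```

In the browser, the same round would be triggered from a timer, with the client sending its last seen id along with each request so the server can answer "nothing new" as cheaply as possible.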
Waiting
This method's more commonly used, and involves the browser sending a request to the server which is never fulfilled, and instead keeps 'waiting for a response'. When something happens, the server outputs it and fulfills the request, and the client makes another request and the process repeats. This saves on the request front, but can end up with a backlog of ongoing processes on your server.
Websockets
https://developer.mozilla.org/en/WebSockets
This involves creating a direct socket connection to the server, allowing data to be pushed to the client when needed. It's relatively new though, and can have some compatibility issues, especially with older browsers.
Out of these, none of them is specifically the 'best method'; it depends on what you're aiming for, and what matters. If you've got a site designed for up-to-date browsers, then websockets could be your answer, but if you've got a small-ish server, then polling could be better, for example.
My own chat engine checks for new messages every five seconds. That's close enough to instant that nobody knows the difference.
It's as simple as setInterval(updateChat,5000);.
So a friend and I are building a web-based AJAX chat application with a jQuery and PHP core. Up to now, we've been using the standard procedure of calling the server every two seconds or so to look for updates. However, I've come to dislike this method, as it's not fast, nor is it "cost effective", in that there are tons of requests going back and forth even when no data is returned.
One of our project supporters recommended we look into a technique known as COMET, or more specifically, long polling. However, after reading about it in different articles and blog posts, I've found that it isn't all that practical when used with Apache servers. It seems that most people just say "It isn't a good idea", but don't give many specifics about how many requests Apache can handle at one time.
The whole purpose of PureChat is to provide people with a chat that looks great, goes fast, and works on most servers. As such, I'm assuming that about 96% of our users will be using Apache, and not Lighttpd or Nginx, which are supposedly more suited for long polling.
Getting to the Point:
In your opinion, is it better to continue using setInterval and repeatedly request new data? Or is it better to go with long polling, despite the fact that most users will be using Apache? Also, is it possible to get a more specific rundown of approximately how many people can be using the chat before an Apache server rolls over and dies?
As Andrew stated, a socket connection is the ultimate solution for asynchronous communication with a server, although only the most cutting edge browsers support WebSockets at this point. socket.io is an open source API you can use which will initiate a WebSocket connection if the browser supports it, but will fall back to a Flash alternative if the browser does not support it. This would be transparent to the coder using the API however.
Socket connections basically keep open communication between the browser and the server so that each can send messages to each other at any time. The socket server daemon would keep a list of connected subscribers, and when it receives a message from one of the subscribers, it can immediately send this message back out to all of the subscribers.
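A bare-bones version of that subscriber list might look like the JavaScript sketch below. It is an illustration only: ChatHub is a made-up name, and a "subscriber" is assumed to be any object with a name and a send(text) method, which is roughly how a server-side WebSocket connection object behaves.

```javascript
// Minimal publish/subscribe hub. A "subscriber" is anything with a
// send(text) method, such as a WebSocket connection object.
class ChatHub {
  constructor() {
    this.subscribers = new Set();
  }
  join(subscriber) {
    this.subscribers.add(subscriber);
  }
  leave(subscriber) {
    this.subscribers.delete(subscriber);
  }
  // Called when one subscriber sends a message: relay it to everyone,
  // including the sender, as described above.
  receive(sender, text) {
    const payload = JSON.stringify({ from: sender.name, text });
    for (const subscriber of this.subscribers) {
      subscriber.send(payload);
    }
  }
}
```

A real daemon would call join when a connection opens, leave when it closes, and receive from its message handler.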
For socket connections, however, you need a socket server daemon running full time on your server. While this can be done with command-line PHP (no Apache needed), it is better suited to something like node.js, a non-blocking server-side JavaScript runtime.
node.js would also be better for what you are talking about, long polling. Basically node.js is event driven and single threaded. This means you can keep many connections open without having to open as many threads, which would eat up tons of memory (Apache's problem). This allows for high concurrency. What you have to keep in mind, however, is that even if you were using a non-blocking web server like Nginx, PHP has many blocking network calls. Since it is running on a single thread, each (for instance) MySQL call would basically halt the server until a response for that MySQL call is returned. Nothing else would get done while this is happening, making your non-blocking server useless. If, however, you used a non-blocking language like JavaScript (node.js) for your network calls, this would not be an issue. Instead of waiting for a response from MySQL, it would set a handler function to handle the response whenever it becomes available, allowing the server to handle other requests while it is waiting.
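The "set a handler and move on" behaviour can be seen in a few lines of JavaScript. This is only a sketch: fakeQuery is a made-up stand-in for a non-blocking database driver (it uses a timer instead of a real MySQL connection), and the delays are arbitrary.

```javascript
// Hypothetical stand-in for a non-blocking database driver: it returns
// immediately and calls `callback` later, once the "result" is ready.
function fakeQuery(sql, delayMs, callback) {
  setTimeout(() => callback(`rows for ${sql}`), delayMs);
}

const events = [];

// Handle one request: start the query, register a handler, and return.
function handleRequest(name, queryDelayMs, done) {
  events.push(`${name}: query sent`);
  fakeQuery(`SELECT * FROM messages /* ${name} */`, queryDelayMs, () => {
    events.push(`${name}: response ready`);
    done();
  });
}

// Two requests arrive back to back. A's query is slow, B's is fast.
let pending = 2;
const finished = () => {
  pending -= 1;
  if (pending === 0) console.log(events.join("\n"));      // print the order once both are done
};
handleRequest("A", 50, finished);
handleRequest("B", 5, finished);
```

Both queries are sent before either result comes back, and B's response is handled before A's even though A arrived first. With a blocking call, B would not even have started until A's query had returned.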
For long polling, you would basically send a request and the server would wait up to 50 seconds before responding. It responds sooner than 50 seconds if it has anything to report; otherwise it waits. If there is nothing to report after 50 seconds, it sends a response anyway so that the browser does not time out. The response triggers the browser to send another request, and the process starts over again. This allows for fewer requests and snappier responses, but again, it is not as good as a socket connection.
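The server side of that cycle could be sketched in JavaScript roughly as follows. LongPollHub is a made-up name, the respond callback stands in for "write the HTTP response and end it", and the 50-second default simply mirrors the figure used above.

```javascript
// Server-side long-polling sketch: answer at once if there is news,
// otherwise park the request until a message arrives or the timeout expires.
class LongPollHub {
  constructor(timeoutMs = 50000) {
    this.timeoutMs = timeoutMs;
    this.messages = [];        // { id, text }
    this.waiting = new Set();  // parked requests
  }
  // `respond` stands in for "write the HTTP response and end it".
  poll(lastSeenId, respond) {
    const fresh = this.messages.filter((m) => m.id > lastSeenId);
    if (fresh.length > 0) return respond(fresh);          // something to report: answer immediately
    const waiter = { lastSeenId, respond, timer: null };
    waiter.timer = setTimeout(() => {                     // nothing happened: answer anyway
      this.waiting.delete(waiter);
      respond([]);
    }, this.timeoutMs);
    this.waiting.add(waiter);
  }
  publish(text) {
    this.messages.push({ id: this.messages.length + 1, text });
    const parked = [...this.waiting];                     // snapshot, so re-polls are not woken twice
    this.waiting.clear();
    for (const waiter of parked) {
      clearTimeout(waiter.timer);
      waiter.respond(this.messages.filter((m) => m.id > waiter.lastSeenId));
    }
  }
}
```

The client simply issues the next poll as soon as a response comes back, passing the highest id it has seen so far.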
I need to develop HTTP push using PHP: it sends a message back to the browser every 20 seconds, which I need to display on the server.
Is there any way of implementing HTTP push in PHP without using any sockets or libraries?
Sure: don't let your PHP script terminate for as long as you want the connection to remain open. Since you are generating output every 20 sec, the connection is much less likely to timeout.
There are many ways to achieve this including blocking (e.g. sleep() below) or busy waiting, but the simplest solution would be:
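while (true) {
    generate_output(); // placeholder: your own function that echoes the next message
    flush();           // send the output now (call ob_flush() first if output buffering is on)
    sleep(20);
}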
You may need to handle unexpected connection termination on the client side, but this is the gist of it. Check your system configuration to see how many open connections you can handle. (In Apache, see MaxClients, which was renamed MaxRequestWorkers in 2.4.)
As for the client-side, see this answer https://stackoverflow.com/a/7953033/329062
The only way to do this is to use WebSockets, long polling, or an AJAX request every 20 seconds (even if there is nothing to update). Your solution will depend on which browsers you want to support. For example, newer browsers support WebSockets, but older browsers don't. I would recommend researching WebSockets. They are a great solution for this type of situation.