In PHP, my script is only trying to test if a server is online - nothing more. How would I go about creating multiple socket streams that all run at the same time? Doing it one-after-another would take forever if you're testing a bunch of servers.
Usually you would start a pool of threads and the threads would read all the sites that need to be tested from a queue. This would allow each thread to open a connection to a site (supporting concurrency)
or maybe pthreads? I dunno, i've never written threaded code in php
Use select. This takes a list of which sockets you want to read or write to and then tells you when they are ready. When you read/write to these, you know you'll be able to get/send some data. You then process what you need to do on those sockets, and go back and select again waiting for more data.
If you need to do other things as well, set the timeout on select and it will return in that amount of time, even if nothing is ready on any sockets.
edit: also, once you figure out how to use select (not that hard), it's a TON simpler to debug and deal with than dealing with synchronization between threads.
Related
this question has to do with theory as with real life programming I first asked it in (cs.stackexchange.com) because is theory most and I had the instruction to ask here (https://cs.stackexchange.com/questions/81472/question-about-implementing-websockets-theory-and-the-reality-in-php) .
I am experimenting with web sockets and PHP many years now (some of this code is already in production) , first I created from scratch a WebSocket (WS) Server with non blocking IO and everything worked fine , except in real life other methods needed by the app couldn’t be non blocking (e.g. connection to a DB and a query). Then I introduced async programming , meaning that the WS Server initiated various PHP requests to the sever and check in every loop if those requests have finished the results in order to send them to client. That worked well for few client side users connected to this WS server , the number had to do with what the operation was but it wouldn’t be more than 30 or 50. That were because if you use only one thread and you have many simultaneous requests you must check each one of them sequential if there is a finished result.
The next step was to analyze the code of popular approaches claiming that can hold and process many (some say 10000) clients in same time. Maybe they knew something that I didn’t (My issue isn’t if they are lying , the issue is if there is something I am missing (or maybe I am wrong) here). The results were frustrating. Most of them don’t use async by default advising you not to use blocking methods (something that is really impossible in real life programming) , but even if you put modules to them to make them async the same problem that I had arose.
The question isn’t what is the solution , because I implemented PHP pthreads and I could make it work , but with no real benefit (e.g. sharing objects , it had to serialize unserialize everything), I write C++ PHP extensions some years now , so I am working in a PHP extension that will do that efficiently.
The question here is , am I missing something ? How can they claim that the can handle a large amount of request simultaneously while even with async programming they have to check for each request in the loop that has finished ?
Thank you in advance for any new knowledge or direction to search that your answer might lead me.
Yes, there are projects that make it possible with PHP. One such project is Amp with its Aerys HTTP and WebSocket server. Yes, you can't just call blocking functions in the same thread. Yes, pthreads won't help, it's mostly like just running another PHP process, because everything in PHP is shared nothing. But how does it work then?
Use non-blocking implementations where possible. There are libraries that work with non-blocking I/O for database access, such as amphp/mysql.
If there's no such library, ask whether something like that can be implemented if you don't want to / can't implement it yourself.
Another possibility is to use libraries such as amphp/parallel that use persistent workers for blocking tasks. Spawning another worker for each blocking task would be horribly inefficient, so that library makes it easy to use worker pools and keep these workers alive for several tasks each.
One such library that makes use of amphp/parallel is amphp/file, which uses these workers for non-blocking filesystem access when no extensions like uv or eio are available, maybe you want to have a look at its ParallelDriver.
How many connections you will be able to handle concurrently depends a lot on your hardware and what you're doing with these connections. If you constantly stream data to each client, you will be able to keep much fewer connections open than in a situation where most connections are idle and only send / receive something in a small portion of the connected time.
If you want to handle more than ~1000 clients, you probably need an extension or recompile PHP because of the FD_MAXSIZE for stream_select, which is compiled in and limits stream_select to file descriptors lower than 1024.
So a friend and I are building a web based, AJAX chat software with a jQuery and PHP core. Up to now, we've been using the standard procedure of calling the sever every two seconds or so looking for updates. However I've come to dislike this method as it's not fast, nor is it "cost effective" in that there are tons of requests going back and forth from the server, even if no data is returned.
One of our project supporters recommended we look into a technique known as COMET, or more specifically, Long Polling. However after reading about it in different articles and blog posts, I've found that it isn't all that practical when used with Apache servers. It seems that most people just say "It isn't a good idea", but don't give much in the way of specifics in the way of how many requests can Apache handle at one time.
The whole purpose of PureChat is to provide people with a chat that looks great, goes fast, and works on most servers. As such, I'm assuming that about 96% of our users will being using Apache, and not Lighttpd or Nginx, which are supposedly more suited for long polling.
Getting to the Point:
In your opinion, is it better to continue using setInterval and repeatedly request new data? Or is it better to go with Long Polling, despite the fact that most users will be using Apache? Also, it possible to get a more specific rundown on approximately how many people can be using the chat before an Apache server rolls over and dies?
As Andrew stated, a socket connection is the ultimate solution for asynchronous communication with a server, although only the most cutting edge browsers support WebSockets at this point. socket.io is an open source API you can use which will initiate a WebSocket connection if the browser supports it, but will fall back to a Flash alternative if the browser does not support it. This would be transparent to the coder using the API however.
Socket connections basically keep open communication between the browser and the server so that each can send messages to each other at any time. The socket server daemon would keep a list of connected subscribers, and when it receives a message from one of the subscribers, it can immediately send this message back out to all of the subscribers.
For socket connections however, you need a socket server daemon running full time on your server. While this can be done with command line PHP (no Apache needed), it is better suited for something like node.js, a non-blocking server-side JavaScript api.
node.js would also be better for what you are talking about, long polling. Basically node.js is event driven and single threaded. This means you can keep many connections open without having to open as many threads, which would eat up tons of memory (Apaches problem). This allows for high availability. What you have to keep in mind however is that even if you were using a non-blocking file server like Nginx, PHP has many blocking network calls. Since It is running on a single thread, each (for instance) MySQL call would basically halt the server until a response for that MySQL call is returned. Nothing else would get done while this is happening, making your non-blocking server useless. If however you used a non-blocking language like JavaScript (node.js) for your network calls, this would not be an issue. Instead of waiting for a response from MySQL, it would set a handler function to handle the response whenever it becomes available, allowing the server to handle other requests while it is waiting.
For long polling, you would basically send a request, the server would wait 50 seconds before responding. It will respond sooner than 50 seconds if it has anything to report, otherwise it waits. If there is nothing to report after 50 seconds, it sends a response anyways so that the browser does not time out. The response would trigger the browser to send another request, and the process starts over again. This allows for fewer requests and snappier responses, but again, not as good as a socket connection.
I'm a moderate to good PHP programer and have experience with terminal/shell scripts but what I'm trying to wrap my head around is the logics behind background processes and is most certainly not Cron or Cron Jobs but a continual flow of data.
I recently talked to someone who made a little web app that worked with the twitter streaming API and Phirehose to gather tweets and save them to a DB. Now sounds simple but all this happens in the background as a process. What I'm ineptly used to is:
call process -> process finishes -> handle data from process.
What is so different about this is that it happens all the time non stop. I remember there was also so talk of socket connection as well.
So my questions are:
When executing a background process, is it a continual loop of specific function? That's all that I can logically conclude, or does it some how "stay open" and happen?
What does a socket connection do in this equation?
Is the any form of latency inherit from running this type of process?
I know this is not a "code specific" type of question but I can't find much information regarding this type of question.
With PHP, it's most likely that a cronjob is scheduled to execute the scripts once every hour or so. The script doesn't run continuously.
PHP has many ways of connecting to resources, most of these use sockets. If you do file_get_contents() to connect to a webserver, you're using sockets as well, you might just not notice it.
1. When executing a background process, is it a continual loop of specific function? That's all that I can logically conclude, or does it some how "stay open" and happen?
No, there is not requirement of such a continual loop. A background process can just be invoked, run and finish as well. It than does not run any longer like any other process as well. Maybe not useful for a background process, but possible.
2. What does a socket connection do in this equation?
Sockets are sometimes used to allow communication between different processes, also worded IPC - Inter Process Communication.
3. Is the any form of latency inherit from running this type of process?
Yes, every form of indirection comes with a price. Additionally, if you run multiple processes in parallel, there is also some overhead for the computer system to manage these multiple processes (which it does anyway nowadays, but just saying, if there were only one process, there would be nothing to manage).
If you want to take a tutorial on background processes:
http://thedjbway.b0llix.net/daemontools/blabbyd.html - really useful.
Daemontools makes it very easy to maintain backgound processes (daemons).
Short version: I want to connect a Client to a PHP server, but i have a limitation on the server of 10 PHP scripts running at the same time.
Question is: What is the best way to connect a client with PHP script, while staying under the limitation?
Long version:
My previous questions shows what i am really after, but here it is again:
I want to develop a a webchat using Java applet as the client side, and PHP as the back end server. Under normal circumstances I wouldn't ask a question like this just use the first thing that google pops up to my search. but right now i'm not under normal circumstances, but under restrictions: Server usage, as in my hosting isa shared account hosting, and 10 Entry processes(aka the number of PHP scripts running at the same time.) I need to make a server to my chat with these in minds, and lowering the performance as much as i can.
I did develop a Client/server connection using TCP in Delphi, but that was long ago, and i forgot much of it. And Now i try to resurface it, i realize i didn't know much about it.
So I got several questions, based on my researches:
What is a socket?
I did goggle this but i didn't find a really clear answer to this. This is the standard way of two programs communicating with each other right? and this where maybe one of my wrong knowledge is...
Is TCP/UDP protocol by Sockets?
I dont even know how to explain this question of mine...
What is stream exactly?
What i know from my C++ knowledge is its the ability to open files in binary form, and read from it from any point. I might be wrong because my C++ knowledge is old too.
Also i read about PHP sockets, and i found about that its capable to listen to a port with socket_create_listen but my concern is that does this scrip runs actively? like an infinitive loop? I'm asking this because the 10 process limitation.
And if i initiate a TCP Connection with a client does the script runs in an infinite loop again? does it counts on the active processes?
I know UDP doesn't need an active connection, because it just send it en masse and forgets about it terminating the script when it ends, but i don't know about TCP.
Sorry for the long post, and the many questions, and thank you for any help you can offer.
EDIT: I Forgot about GET/POST methods!
As i said that I'm planing a webchat and they need to communicate, but aside from direct connection there is the GET/POST method as well, which the script quickly does and terminates the script, but again the 10 process limit, what happens when 11 process tries to run at the same time?
Also is there a way to limit the simultaneously running processes? or put into a queue and wait till the others finish?
If your server is limited to only 10 concurrent threads, this is a hard limit and you can't do much. What you can do is to make the request as small as possible, and have as less things as possible resolved by php. So the possibility of concurrency would be very small.
Ideally, all your php's will start and exit very quickly, often redirecting the user to static content (html, js, img and css files).
Maybe you can make your whole webapp a lot of html files, and have some ajax.php file for the server communication...
I'm looking for some ideas to do the following. I need a PHP script to perform certain action for quite a long time. This is an extension for a CMS and this can't be anything else but PHP. It also can't be a command line script because it should be used by common people that will have only the standard means of the CMS. One of the options is having a cron job (most simple hostings have it) that will trigger the script often so that instead of working for a long time it could perform the action step by step preserving its state from one launch to the next one. This is not perfect but I can't see of any other solutions. If the script will be redirecting to itself server will interrupt it. What other options can suit?
Thanks everyone in advance!
What you're talking about is a daemon or long running program that waits for calls by client programs, performs and action, provides a response then keeps on waiting for more calls.
You might be familiar w/ these in the form of Apache & MySQL ;) Anyway PHP is generally OK in this regard, it does have the ability to function over raw sockets as well as fork sub-processes to handle multiple requests simultaneously.
Having said that PHP daemons are a tool where YMMV. Some folks will say they work great, other folks like me will say they have issues w/ interprocess communication and leaking memory even amidst plethora unset() calls.
Anyway you likely won't be able to deploy a daemon of any type on a shared hosting environment. You'll need to get a better server package or stick with a Cron based solution.
Here's a link about writing a PHP daemon.
Also, one more note. Daemons do crash from time to time and therefore you may still need to store state about whats going on, just in case someone trips over the power cord to your shared server :)
I would also suggest that you think about making it a daemon but if not then you can simply use
set_time_limit(0);
ignore_user_abort(true);
at the top to tell it not to time out and not to get interrupted by anything. Then call it from the cron to start it every day or whatever. I have this on many long processing daily tasks and it works great for me. However, it won't be able to easily talk to the outside world (other scripts can't query it or anything -- if that is what you want look into php services) so once you get it running make sure it will stop and have it print its progress to a logfile.