Can PHP asynchronously use sockets?

Typical PHP socket functionality is synchronous and halts the thread while waiting for incoming connections and data (e.g. socket_accept and socket_read).
How do I do the same asynchronously, so that I can respond to data in a data-received event instead of polling for data?

Yup, that's what socket_set_nonblock() is for. Your socket interaction code will need to be written differently, taking into account the special meanings that error codes 11, EWOULDBLOCK, and 115, EINPROGRESS, assume.
Here's some somewhat-fictionalized sample code from a PHP sync socket polling loop, as requested:
$buf = '';
$done = false;
do {
    $chunk = socket_read($sock, 4096);
    if($chunk === false) {
        $error = socket_last_error($sock);
        if($error != 11 && $error != 115) {
            my_error_handler(socket_strerror($error), $error);
            $done = true;
        }
        break;
    } elseif($chunk == '') {
        $done = true;
        break;
    } else {
        $buf .= $chunk;
    }
} while(true);

"How do I do the same asynchronously, so that I can respond to data in a data-received event instead of polling for data?"
You will need to run your script and call stream_select to check whether there is any data to receive, then process it and send data back.

The term "asynchronous" is often misused in network programming. For I/O, asynchronous is often just used as another word for non-blocking. This means that the process is able to continue before a call on the network api has completed transmission.
For process execution in general, asynchronous means that multiple instructions are able to be computed at once (concurrently.)
In other words, asynchronous I/O is not truly asynchronous unless multiple threads are used to allow multiple reads/writes/accepts to occur concurrently. Every socket will still have to wait on a synchronous non-blocking call whenever it has data to be read or written (or will otherwise not block), and reading or writing a large file can still take seconds or even minutes if not interrupted. Note that this would require a perfect flow between the client and server, or TCP itself will interrupt the transmission. For example, a server sending faster than a client can download would cause a block on a write.
So from a strict point of view PHP is not able to perform asynchronous networking, only non-blocking. In short, the progression of the process will stop while the network call is able to usefully read/write etc. However, the process will then continue when the call is not able to usefully read/write or would otherwise block. In a truly asynchronous system the process will continue regardless, and the read/write will be done in a different thread. Note that blocking I/O can still be done asynchronously if done in a different thread.
Moreover, PHP is not able to do event driven I/O without installing an extension that supports it. You will otherwise need to do some form of polling in order to do non-blocking I/O in PHP. The code from Chaos would be a functional non-blocking read example if it used socket_select.
With that said, the select function will still allow true non-blocking behavior in PHP. In C, polling services have a performance loss compared with event-driven ones, so I'm sure that it would be the same for PHP. But this loss is in the nanoseconds to microseconds, depending on the number of sockets, whereas the time saved by a non-blocking call is typically milliseconds, or even seconds if the call would otherwise have to wait.
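A minimal sketch of the select-then-read pattern described above, assuming the sockets extension is loaded and a Unix-like system. The helper name readAvailable() and the local socket pair used to demonstrate it are assumptions made for this example, not something from the answers in this thread.

```php
<?php
// Wait (up to a timeout) until the socket is readable, then drain whatever
// has arrived without blocking. readAvailable() is a hypothetical helper name.
function readAvailable($sock, float $timeoutSec = 1.0): string
{
    $buf = '';
    $read = [$sock];
    $write = null;
    $except = null;
    $sec = (int) $timeoutSec;
    $usec = (int) (($timeoutSec - $sec) * 1000000);

    // socket_select() returns the number of ready sockets, 0 on timeout,
    // or false on error.
    $ready = socket_select($read, $write, $except, $sec, $usec);
    if ($ready === false || $ready === 0) {
        return $buf;
    }

    while (true) {
        // On a non-blocking socket this returns false when no more data is
        // waiting; "@" hides the warning PHP raises in that case.
        $chunk = @socket_read($sock, 4096);
        if ($chunk === false || $chunk === '') {
            break;
        }
        $buf .= $chunk;
    }
    return $buf;
}

// Demonstration on a connected local socket pair (an assumption for the demo).
socket_create_pair(AF_UNIX, SOCK_STREAM, 0, $pair);
[$a, $b] = $pair;
socket_set_nonblock($b);
socket_write($a, 'hello');

$received = readAvailable($b);
echo $received, PHP_EOL;
```

In a real server the $read array would hold every client socket, and the loop around socket_select() would be the only place where the script waits.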

AFAIK PHP is strictly single-threaded, which means you can't do this asynchronously, because script execution is always linear.
It's been a while since I have done this, but as far as I recall, you can only open the socket and have the script continue execution upon receiving data.

Related

Passing a live socket between two PHP scripts

I'm writing a websocket server in PHP, that needs to be able to handle a large number of concurrent connections. I'm currently using the socket_select function to allow it to handle them, but this still blocks all other connections when sending a large block of data to a client. Is there a way for the master script to accept the incoming socket, and then start up a second PHP script (in a non-blocking fashion, obviously) and pass the client socket to that script for processing? I know this is possible in C, but the codebase is such that a migration is impossible, sadly.
*The server is running exclusively on a Unix stack, no need for a MS compatible solution.

I'm currently using the socket_select function to allow it to handle them, but this still blocks all other connections when sending a large block of data to a client.
Then don't send all the data at once. If you are going to do
socket_write ($mysocket, $mybuffer, 10000000);
then yeah, you'll have to wait until all 10 million bytes have been sent out. However, you can use the $write array of socket_select to check if you can write to the socket, in combination with non-blocking sockets. Each time socket_select says you have a 'go!' on the socket, write data until socket_write starts to complain (i.e. returns FALSE or less than the specified length). This will keep the socket's send buffer optimally filled.
The downside is that you must keep track of exactly where in your output buffer you are. Also, stop passing the socket in the $write array once you've written all your data, or socket_select will keep on firing because an idle socket is almost always writable (this assumes you want to send multiple large blobs of data).
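Roughly, the write side could look like the sketch below. It assumes the sockets extension on a Unix-like system; writePending() is a hypothetical helper name, and the local socket pair only stands in for a real client connection.

```php
<?php
// Push as much of $data as the socket accepts right now and return the new
// offset into the buffer. writePending() is a hypothetical helper name.
function writePending($sock, string $data, int $offset): int
{
    while ($offset < strlen($data)) {
        $sent = @socket_write($sock, substr($data, $offset, 8192));
        if ($sent === false || $sent === 0) {
            break; // send buffer is full (or an error occurred): retry later
        }
        $offset += $sent;
    }
    return $offset;
}

// Demo: both ends of a local socket pair live in the same script.
socket_create_pair(AF_UNIX, SOCK_STREAM, 0, $pair);
[$out, $in] = $pair;
socket_set_nonblock($out);
socket_set_nonblock($in);

$data = str_repeat('x', 1000000);
$offset = 0;          // how far into $data we have written
$receivedBytes = 0;   // how much the "client" end has read

while ($receivedBytes < strlen($data)) {
    $read = [$in];
    // Only ask about writability while there is still something to send.
    $write = $offset < strlen($data) ? [$out] : [];
    $except = null;
    if (socket_select($read, $write, $except, 1) === false) {
        break;
    }
    if ($write) {
        $offset = writePending($out, $data, $offset);
    }
    if ($read) {
        while (($chunk = @socket_read($in, 65536)) !== false && $chunk !== '') {
            $receivedBytes += strlen($chunk);
        }
    }
}
```

With many clients you would keep one buffer and one offset per socket, so a slow client only delays its own data.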

The answer turns out to be the same answer you'd use in C: fork(). When you fork, the state of all open files and ports is preserved, so a child process can read a port that was opened by its parent (this is the same way that preforking web servers spin off worker processes for incoming client connections). It does require the pcntl (process control) module, which is disabled by default and should be used sparingly, but it works:
if($verbose)
    echo "Connected Client from $remoteaddy to $localaddy\n";
echo "Forking...";
$pid = pcntl_fork(); // you're bringing children into this world, just to kill them in a few seconds. You monster.
if($pid==0){
    $p = posix_getpid(); // PID of the child process
    echo "PID OF CHILD: $p\n";
    //in child process. Send a handshake and wait for the callback from the WebSockets library
    $this->send($client, "Thank you for contacting myAwesomeServer.com! I'm slave #{$p}, and I'll be your host today");
}else if($pid>0){
    //in parent process. Remember the child's PID
    $childWorkers[]=$pid;
    echo "[ OK ]\n";
    $this->disconnect($client->socket, false); //disconnect the client's socket from the master process, so only the child process is talking to the client
}else if($pid==-1){
    echo "[FAIL] unable to create child worker\n";
}
NOTE!! This approach is PURELY ACADEMIC, and should only be used on small, 'pet' projects when you don't have enough time to learn a more appropriate language (personally, I know C well enough to fork(), but my lack of knowledge of its string manipulation functions would no doubt leave a gaping security hole in the server). I'm not sure how the Zend engine is doing this pcntl_fork(), but I'd imagine that the memory image of this monstrosity is going to be many times the size of equivalent C code..

How does Long Polling or Comet Work with PHP?

I am making a notification system for my website. I want logged-in users to be notified immediately when a notification is created. As many people say, there are only a few ways of doing so.
One is writing some JavaScript code that asks the server "Are there any new notifications?" at a given time interval. It's called "polling" (I think).
Another is "long polling" or "Comet". As Wikipedia says, long polling is similar to polling, but instead of the client asking every time for new notifications, the server sends them to the client as soon as they are available.
So how can I use long polling with PHP? (I don't need full source code, just a way of doing it.)
What does its architecture/design really look like?

The basic idea of long-polling is that you send a request which is then NOT responded or terminated by the server until some desired condition. I.e. server-side doesn't "finish" serving the request by sending the response. You can achieve this by keeping the execution in a loop on server-side.
Imagine that in each loop you do a database query or whatever is necessary for you to find out if the condition you need is now true. Only when it IS you break the loop and send the response to the client. When the client receives the response, it immediately re-sends the "long-polling" request so it wouldn't miss a next "notification".
A simplified example of the server-side PHP code for this could be:
// Set the loop to run 28 times, sleeping 2 seconds between each loop.
for($i = 1; $i < 29; $i++) {
    // find out if the condition is satisfied.
    // If YES, break the loop and send response
    sleep(2);
}
// If nothing happened (the condition didn't satisfy) during the 28 loops,
// respond with a special response indicating no results. This helps avoiding
// problems of 'max_execution_time' reached. Still, the client should re-send the
// long-polling request even in this case.
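To make the loop above a bit more concrete, here is a small sketch under stated assumptions: waitForCondition() and its parameters are hypothetical names, and the closure stands in for whatever database query you would really run.

```php
<?php
// Call $check repeatedly until it returns something other than null or the
// time budget runs out. waitForCondition() is a hypothetical helper name.
function waitForCondition(callable $check, float $maxSeconds, int $intervalMicros)
{
    $deadline = microtime(true) + $maxSeconds;
    do {
        $result = $check();
        if ($result !== null) {
            return $result;          // condition satisfied: respond right away
        }
        usleep($intervalMicros);     // wait before checking again
    } while (microtime(true) < $deadline);

    return null;                     // timed out: caller sends "no results"
}

// Stand-in for a database query: a notification shows up on the third check.
$calls = 0;
$check = function () use (&$calls) {
    $calls++;
    return $calls >= 3 ? ['message' => 'hello'] : null;
};

$payload = waitForCondition($check, 2.0, 10000);
$timedOut = waitForCondition(function () { return null; }, 0.05, 10000);

// The client re-sends the request after either kind of response.
echo json_encode($payload ?? ['status' => 'timeout']), PHP_EOL;
```

In a real endpoint the budget would be tens of seconds rather than the short values used here, and should stay below max_execution_time.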

You can use (or study) some existing implementations, like Ratchet. There are a few others.
Essentially, you need to avoid having apache or the web server handle the request. Just like you would with a node.js server, you can start PHP from the command line and use the server socket functions to create a server and use socket_select to handle communications.
It could technically work through the web server by keeping a loop active. However, the memory overhead of keeping a PHP process active per HTTP connection is typically too high. Creating your own server allows you to share the memory between connections.

I used long polling for a chat application recently. After doing some research and playing with it for a while, here are some things I would recommend.
1) Don't long poll for more than about 20 seconds. Some browsers will time out. I normally set my long poll to run for about 20 seconds and send back an empty response at that point. Then you can use JavaScript to restart the long poll.
2) Every once in a while a browser will hang up. To add a second level of error checking, I have a JavaScript timer run for 30 seconds, and if no response has come in by then I abandon the AJAX call and start it up again.
3) If you are using PHP, make sure you call session_write_close() before the wait loop; otherwise the open session keeps its lock (with the default file-based handler) and other requests from the same user block until the long poll finishes.
4) If you are using AJAX with jQuery, you may need to use abort() to cancel the pending request.

You can find your answer here. More detail here. And you should remember to use $.ajaxSetup({ cache: false }); when working with jQuery.

Server-sent events and php - what triggers events on the server?

All,
HTML5 Rocks has a nice beginner tutorial on Server-sent Events (SSE):
http://www.html5rocks.com/en/tutorials/eventsource/basics/
But, I don't understand an important concept - what triggers the event on the server that causes a message to be sent?
In other words - in the HTML5 example - the server simply sends a timestamp once:
<?php
header('Content-Type: text/event-stream');
header('Cache-Control: no-cache'); // recommended to prevent caching of event data.
function sendMsg($id, $msg) {
    echo "id: $id" . PHP_EOL;
    echo "data: $msg" . PHP_EOL;
    echo PHP_EOL;
    ob_flush();
    flush();
}
$serverTime = time();
sendMsg($serverTime, 'server time: ' . date("h:i:s", time()));
If I were building a practical example - e.g., a Facebook-style "wall" or a stock-ticker, in which the server would "push" a new message to the client every time some piece of data changes, how does that work?
In other words... Does the PHP script have a loop that runs continuously, checking for a change in the data, then sending a message every time it finds one? If so - how do you know when to end that process?
Or - does the PHP script simply send the message, then end (as appears to be the case in the HTML5Rocks example)? If so - how do you get continuous updates? Is the browser simply polling the PHP page at regular intervals? If so - how is that a "server-sent event"? How is this different from writing a setInterval function in JavaScript that uses AJAX to call a PHP page at a regular interval?
Sorry - this is probably an incredibly naive question. But none of the examples I've been able to find make this clear.
[UPDATE]
I think my question was poorly worded, so here's some clarification.
Let's say I have a web page that should display the most recent price of Apple's stock.
When the user first opens the page, the page creates an EventSource with the URL of my "stream."
var source = new EventSource('stream.php');
My question is this - how should "stream.php" work?
Like this? (pseudo-code):
<?php
header('Content-Type: text/event-stream');
header('Cache-Control: no-cache'); // recommended to prevent caching of event data.
function sendMsg($msg) {
    echo "data: $msg" . PHP_EOL;
    echo PHP_EOL;
    flush();
}
while (some condition) {
    // check whether Apple's stock price has changed
    // e.g., by querying a database, or calling a web service
    // if it HAS changed, sendMsg with new price to client
    // otherwise, do nothing (until next loop)
    sleep(n); // wait n seconds until checking again
}
?>
In other words - does "stream.php" stay open as long as the client is "connected" to it?
If so - does that mean that you have as many threads running stream.php as you have concurrent users? If so - is that remotely feasible, or an appropriate way to build an application? And how do you know when you can END an instance of stream.php?
My naive impression is that, if this is the case, PHP isn't a suitable technology for this kind of server. But all of the demos I've seen so far imply that PHP is just fine for this, which is why I'm so confused...

"...does "stream.php" stay open as long as the client is "connected"
to it?"
Yes, and your pseudo-code is a reasonable approach.
"And how do you know when you can END an instance of stream.php?"
In the most typical case, this happens when the user leaves your site. (Apache recognizes the closed socket, and kills the PHP instance.) The main time you might close the socket from the server-side is if you know there is going to be no data for a while; the last message you send the client is to tell them to come back at a certain time. E.g. in your stock-streaming case, you could close the connection at 8pm, and tell clients to come back in 8 hours (assuming NASDAQ is open for quotes from 4am to 8pm). Friday evening you tell them to come back Monday morning. (I have an upcoming book on SSE, and dedicate a couple of sections on this subject.)
"...if this is the case, PHP isn't a suitable technology for this kind
of server. But all of the demos I've seen so far imply that PHP is
just fine for this, which is why I'm so confused..."
Well, people argue that PHP isn't a suitable technology for normal web sites, and they are right: you could do it with far less memory and far fewer CPU cycles if you replaced your whole LAMP stack with C++. However, despite this, PHP powers most of the sites out there just fine. It is a very productive language for web work, thanks to a familiar C-like syntax and a large number of libraries, and a comforting one for managers, as there are plenty of PHP programmers to hire, plenty of books and other resources, and some large use-cases (e.g. Facebook and Wikipedia). Those are basically the same reasons you might choose PHP as your streaming technology.
The typical setup is not going to be one connection to NASDAQ per PHP-instance. Instead you are going to have another process with a single connection to the NASDAQ, or perhaps a single connection from each machine in your cluster to the NASDAQ. That then pushes the prices into either a SQL/NoSQL server, or into shared memory. Then PHP just polls that shared memory (or database), and pushes the data out. Or, have a data-gathering server, and each PHP instance opens a socket connection to that server. The data-gathering server pushes out updates to each of its PHP clients, as it receives them, and they in turn push out that data to their client.
The main scalability issue with using Apache+PHP for streaming is the memory for each Apache process. When you reach the memory limit of the hardware, make the business decision to add another machine to the cluster, or cut Apache out of the loop, and write a dedicated HTTP server. The latter can be done in PHP so all your existing knowledge and code can be re-used, or you can rewrite the whole application in another language. The pure developer in me would write a dedicated, streamlined HTTP server in C++. The manager in me would add another box.

Server-sent events are for real-time updates from the server side to the client side. In the first example, the connection to the server isn't kept open, so the client reconnects every 3 seconds, which makes server-sent events no different from AJAX polling.
So, to make the connection persist, you need to wrap your code in a loop and check for updates constantly.
Each open connection ties up a PHP process (or thread), so more connected users will make the server run out of resources. This can be mitigated by limiting the script's execution time and ending the script once it exceeds a certain amount of time (e.g. 10 minutes). The EventSource API will automatically reconnect, so the delay stays within an acceptable range.
Also, check out my PHP library for server-sent events; it can help you understand more about how to do server-sent events in PHP and makes them easier to code.
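As a rough illustration of the loop-with-a-time-limit idea (this is not the library mentioned above), here is a sketch. formatSseEvent() and streamEvents() are hypothetical helper names, and the message source is left to a callback you would supply.

```php
<?php
// Build one SSE frame. Each line of a multi-line payload needs its own
// "data:" field, and a blank line ends the event.
function formatSseEvent(string $id, string $data): string
{
    $frame = "id: $id\n";
    foreach (explode("\n", $data) as $line) {
        $frame .= "data: $line\n";
    }
    return $frame . "\n";
}

// Stream until the time budget is used up or the client goes away.
// $nextMessage is a callback returning a string, or null when nothing is new.
function streamEvents(callable $nextMessage, float $maxSeconds, int $pauseMicros): int
{
    $sent = 0;
    $deadline = microtime(true) + $maxSeconds;
    while (microtime(true) < $deadline && !connection_aborted()) {
        $msg = $nextMessage();
        if ($msg !== null) {
            echo formatSseEvent((string) time(), $msg);
            if (ob_get_level() > 0) {
                ob_flush();          // only flush a buffer that exists
            }
            flush();
            $sent++;
        }
        usleep($pauseMicros);
    }
    return $sent;                    // script ends; EventSource reconnects
}

$frame = formatSseEvent('42', 'price: 101.5');
```

In a real stream.php you would send the Content-Type: text/event-stream and Cache-Control: no-cache headers first, then call streamEvents() with a budget such as 600 seconds.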

I have noticed that the SSE technique sends data to the client every few seconds (something like reversing the polling technique of the client page, e.g. Ajax polling), so to overcome this problem I wrote this sseServer.php page:
<?php
session_start();
header('Content-Type: text/event-stream');
header('Cache-Control: no-cache'); // recommended to prevent caching of event data
require 'sse.php';
// store a newly posted message in the session, together with its timestamp
if (isset($_POST['message']) && $_POST['message'] != "") {
    $_SESSION['message'] = $_POST['message'];
    $_SESSION['serverTime'] = time();
}
sendMsg($_SESSION['serverTime'], $_SESSION['message']);
?>
and sse.php is:
<?php
function sendMsg($id, $msg) {
    echo "id: $id" . PHP_EOL;
    echo "data: $msg" . PHP_EOL;
    echo PHP_EOL;
    ob_flush();
    flush();
}
?>
Notice that in sseServer.php I start a session and use a session variable to overcome the problem.
I also call sseServer.php via Ajax (posting a value for the variable message) every time I want to "update" the message.
Now in the jQuery (JavaScript) code I do something like this:
1st) I declare a global variable: var timeStamp=0;
2nd) I use the following algorithm:
if(typeof(EventSource)!=="undefined"){
    var source=new EventSource("sseServer.php");
    source.onmessage=function(event){
        if ((timeStamp!=event.lastEventId) && (timeStamp!=0)){
            /* this is initialization */
            timeStamp=event.lastEventId;
            $.notify("Please refresh "+event.data, "info");
        } else {
            if (timeStamp==0){
                timeStamp=event.lastEventId;
            }
        } /* fi */
    };
} else {
    document.getElementById("result").innerHTML="Sorry, your browser does not support server-sent events...";
} /* fi */
The line $.notify("Please refresh "+event.data, "info");
is where you can handle the message.
In my case I send a jQuery notify.
You may use POSIX pipes or a DB table to pass the "message" instead of POST, since sseServer.php does something like an "infinite loop".
My problem at the moment is that the above code DOES NOT SEND the "message" to all clients but only to the pair (each client that calls sseServer.php works individually with its own session), so I'll change the technique: the page that I want to trigger the "message" will do a DB update, and sseServer.php will get the message from the DB table instead of via POST.
I hope that I have helped!

This is really a structural question about your application. Real-time events are something that you want to think about from the beginning, so you can design your application around it. If you have written an application that just runs a bunch of random mysql(i)_query methods using string queries and doesn't pass them through any sort of intermediary, then many times you won't have a choice but to either rewrite much of your application, or do constant server-side polling.
If, however, you manage your entities as objects and pass them through some sort of intermediary class, you can hook into that process. Look at this example:
<?php
class MyQueryManager {
    public function find($myObject, $objectId) {
        // Issue a select query against the database to get this object
    }
    public function save($myObject) {
        // Issue a query that saves the object to the database
        // Fire a new "save" event for the type of object passed to this method
    }
    public function delete($myObject) {
        // Fire a "delete" event for the type of object
    }
}
In your application, when you're ready to save:
<?php
$someObject = $queryManager->find("MyObjectName", 1);
$someObject->setDateTimeUpdated(time());
$queryManager->save($someObject);
This is not the most graceful example but it should serve as a decent building block. You can hook into your actual persistence layer to handle triggering these events. Then you get them immediately (as real-time as it can get) without hammering your server (since you have no need to constantly query your database and see if things changed).
You obviously won't catch manual changes to the database this way, but if you're doing anything manually to your database with any frequency, you should either:
1) Fix the problem that requires you to have to make a manual change
2) Build a tool to expedite the process, and fire these events
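A minimal sketch of the "fire an event from the persistence layer" idea, assuming PHP 7.4 or later. EventfulQueryManager, its on() method, and the in-memory array standing in for the database are all assumptions made for this example.

```php
<?php
// A persistence layer that notifies registered listeners on save and delete.
// The class name, on(), and the in-memory $rows store are hypothetical.
class EventfulQueryManager
{
    private array $listeners = [];
    private array $rows = [];

    public function on(string $event, callable $listener): void
    {
        $this->listeners[$event][] = $listener;
    }

    public function save(string $type, int $id, array $data): void
    {
        $this->rows[$type][$id] = $data;   // stands in for the real DB write
        $this->fire('save', $type, $id, $data);
    }

    public function delete(string $type, int $id): void
    {
        unset($this->rows[$type][$id]);    // stands in for the real DB delete
        $this->fire('delete', $type, $id, []);
    }

    private function fire(string $event, string $type, int $id, array $data): void
    {
        foreach ($this->listeners[$event] ?? [] as $listener) {
            $listener($type, $id, $data);
        }
    }
}

$log = [];
$manager = new EventfulQueryManager();
// A listener could push to a queue or shared memory that the SSE script reads.
$manager->on('save', function (string $type, int $id, array $data) use (&$log) {
    $log[] = "saved $type #$id";
});
$manager->save('MyObjectName', 1, ['updated' => 1700000000]);
```

The point of the design is that nothing has to poll the database: the code that changes the data is also the code that announces the change.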

Basically, PHP is not a suitable technology for this sort of thing.
Yes, you can make it work, but it will be a disaster under high load. We run stock servers that send stock-change signals via websockets to tens of thousands of users, and if we used PHP for that... Well, we could, but those homemade loops are just a nightmare. Every single connection spawns a separate process on the server, or you have to handle connections through some sort of database.
Simply use Node.js and socket.io. It will let you start easily and have a running server in a couple of days. Node.js has its own limitations too, but for websocket (and SSE) connections it is currently the most powerful technology.
And also, SSE is not as good as it seems. Its only advantage over websockets is that packets are gzipped natively (ws is not gzipped); the downside is that SSE is a one-way connection. Your user, if he wants to add another stock symbol to his subscription, will have to make an AJAX request (including all the troubles with origin control, and the request will be slow). With websockets, client and server communicate both ways over a single open connection, so if the user sends a trading signal or subscribes to a quote, he just sends a string over the already-open connection. And it's fast.

How to keep checking for a file until it exists, then provide a link to it

I'm calling a Java program with a PHP system call. The Java program takes a while to run but will eventually produce a PDF file with a known filename.
I need to keep checking for this file until it exists and then serve up a link to it. I assume a while loop will be involved but I don't want it to be too resource intensive. What's a good way of doing this?

Basically you got it right
while (!file_exists($filename)) sleep(1);
print 'download PDF';
the sleep gives 1 second between checks so it won't stress your CPU for nothing

This will do the work, but you may want to add a timeout so the loop cannot run forever.
while( !file_exists($pathToFile) )
{
    sleep(1);
}
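One way the timeout could be added, as a sketch: waitForFile() and its parameters are hypothetical names, and the temporary files exist only to exercise both outcomes.

```php
<?php
// Poll for a file until it exists or the timeout expires.
// waitForFile() is a hypothetical helper name.
function waitForFile(string $path, float $timeoutSeconds, int $pollMicros = 1000000): bool
{
    $deadline = microtime(true) + $timeoutSeconds;
    while (!file_exists($path)) {
        if (microtime(true) >= $deadline) {
            return false;            // gave up: the file never appeared
        }
        usleep($pollMicros);
        clearstatcache(true, $path); // make sure the next check is fresh
    }
    return true;
}

// A path that never appears: the call gives up after the timeout.
$missing = sys_get_temp_dir() . '/report-' . uniqid() . '.pdf';
$gaveUp = !waitForFile($missing, 0.2, 50000);

// A file that already exists: the call returns immediately.
$present = tempnam(sys_get_temp_dir(), 'pdf');
$found = waitForFile($present, 0.2, 50000);
unlink($present);
```

For the PDF case you would pick a timeout a little longer than the Java program normally takes and show an error message when the function returns false.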

If you need to send it back to the browser, you should probably investigate using an AJAX call on a setInterval timer and a PHP script that checks for the file's existence. You can do this in two ways:
flush() HTML back to the browser that includes JavaScript that starts a polling process using AJAX on the browser side, with your PHP script providing an AJAX function to process the poll.
If flush() doesn't work, then you should return the HTML of your PHP script BEFORE setting off your Java process. In that code put two AJAX calls. One that starts the actual Java process and one that starts a polling service looking for the file.
Long running scripts may timeout the browser before you can get a response from your Java application, which is why you'll likely need the browser to work asynchronously from your Java process.
On the other hand, if this is a pure PHP script running or the Java process is less than a typical browser timeout, you can just use something like:
$nofileexists = true;
while($nofileexists) { // loop until your file is there
    $nofileexists = !checkFileExists(); //check to see if your file is there
    sleep(5); //sleeps for X seconds, in this case 5 before running the loop again
}
You didn't mention if this would be a high traffic call (for lots of public users) or a reporting type application. If high traffic, I would recommend the AJAX route, but if low traffic, then the code above.

What is a practical use for PHP's sleep()?

I just had a look at the docs on sleep().
Where would you use this function?
Is it there to give the CPU a break in an expensive function?
Any common pitfalls?

One place where it finds use is to create a delay.
Let's say you've built a crawler that uses curl/file_get_contents to get remote pages. Now you don't want to bombard the remote server with too many requests in a short time. So you introduce a delay between consecutive requests.
sleep takes its argument in seconds; its friend usleep takes its argument in microseconds and is more suitable in some cases.

Another example: You're running some sort of batch process that makes heavy use of a resource. Maybe you're walking the database of 9,000,000 book titles and updating about 10% of them. That process has to run in the middle of the day, but there are so many updates to be done that running your batch program drags the database server down to a crawl for other users.
So you modify the batch process to submit, say, 1000 updates, then sleep for 5 seconds to give the database server a chance to finish processing any requests from other users that have backed up.
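The batch-then-pause idea could be sketched like this. processInBatches() is a hypothetical helper name, the closure stands in for the real UPDATE statement, and the pause is shortened to a millisecond so the demo finishes quickly.

```php
<?php
// Apply $apply to every item, pausing between batches so other work can run.
// processInBatches() is a hypothetical helper name.
function processInBatches(array $items, int $batchSize, int $pauseMicros, callable $apply): int
{
    $batches = array_chunk($items, $batchSize);
    foreach ($batches as $i => $batch) {
        foreach ($batch as $item) {
            $apply($item);
        }
        if ($i < count($batches) - 1) {
            usleep($pauseMicros);    // breathing room for other users' queries
        }
    }
    return count($batches);
}

$updated = [];
$batchCount = processInBatches(
    range(1, 2500),                  // stand-in for the IDs that need updating
    1000,                            // updates per batch
    1000,                            // pause in microseconds (5 s would be 5000000)
    function (int $id) use (&$updated) {
        $updated[] = $id;            // stand-in for the real UPDATE query
    }
);
```

Batch size and pause length are the two knobs: smaller batches and longer pauses are gentler on the database but make the whole run take longer.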

Here's a snippet of how I use sleep in one of my projects:
foreach($addresses as $address)
{
    $url = "http://maps.google.com/maps/geo?q={$address}&output=json...etc...";
    $result = file_get_contents($url);
    $geo = json_decode($result, TRUE);
    // Do stuff with $geo
    sleep(1);
}
In this case sleep helps me prevent being blocked by Google maps, because I am sending too many requests to the server.

Old question I know, but another reason for using u/sleep can be when you are writing security/cryptography code, such as an authentication script. A couple of examples:
You may wish to reduce the effectiveness of a potential brute force attack by making your login script purposefully slow, especially after a few failed attempts.
Also you might wish to add an artificial delay during encryption to mitigate against timing attacks. I know that the chances are slim that you're going to be writing such in-depth encryption code in a language like PHP, but still valid I reckon.
EDIT
Using u/sleep against timing attacks is not a good solution. You can still get the important data in a timing attack, you just need more samples to filter out the noise that u/sleep adds.
You can find more information about this topic in: Could a random sleep prevent timing attacks?

Another way to use it: if you want to execute a cronjob more often than every minute. I use the following code for this:
sleep(30);
include 'cronjob.php';
I call both this file and cronjob.php every minute.

This is a bit of an odd case...file transfer throttling.
In a file transfer service we ran a long time ago, the files were served from 10Mbps uplink servers. To prevent the network from bogging down, the download script tracked how many users were downloading at once, and then calculated how many bytes it could send per second per user. It would send part of this amount, then sleep a moment (1/4 second, I think) then send more...etc.
In this way, the servers ran continuously at about 9.5Mbps, without having uplink saturation issues...and always dynamically adjusting speeds of the downloads.
I wouldn't do it this way, or in PHP, now...but it worked great at the time.
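For the curious, the send-a-slice-then-sleep approach might look roughly like this. It is a sketch, not the original service's code: sendThrottled() is a hypothetical helper name and the in-memory streams replace the real file and client connection.

```php
<?php
// Copy from $in to $out in slices, sleeping between slices so the average
// rate stays near $bytesPerSecond. sendThrottled() is a hypothetical name.
function sendThrottled($in, $out, int $bytesPerSecond, float $sliceSeconds = 0.25): int
{
    $sliceBytes = max(1, (int) ($bytesPerSecond * $sliceSeconds));
    $total = 0;
    while (!feof($in)) {
        $chunk = fread($in, $sliceBytes);
        if ($chunk === false || $chunk === '') {
            break;
        }
        $total += fwrite($out, $chunk);
        if (!feof($in)) {
            usleep((int) ($sliceSeconds * 1000000)); // wait out the slice
        }
    }
    return $total;
}

// Demo with in-memory streams instead of a real file and client.
$source = fopen('php://memory', 'w+');
fwrite($source, str_repeat('a', 1000));
rewind($source);
$sink = fopen('php://memory', 'w+');

$sentBytes = sendThrottled($source, $sink, 40000, 0.01);
```

To adjust speeds dynamically as described above, the per-user rate would be recalculated between slices from the current number of active downloads.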

You can use sleep to pause script execution, for example to delay an AJAX call on the server side or to implement an observer. You can also use it to simulate delays.
I also use it to delay sendmail() & co.
Some people use sleep() to prevent DoS and login brute-force attacks. I do not agree, because in that case you also need to add checks to prevent the user from running the script multiple times.
Check out usleep as well.

I had to use it recently when I was utilising Google's Geolocation API. Every address in a loop needed to call Google's server so it needed a bit of time to receive a response. I used usleep(500000) to give everything involved enough time.

I wouldn't typically use it for serving web pages, but it's useful for command line scripts.
$ready = false;
do {
    $ready = some_monitor_function();
    sleep(2);
} while (!$ready);

Super old posts, but I thought I would comment as well.
I recently had to check for a VERY long running process that created some files. So I made a function that iterates over a cURL function. If the file I'm looking for doesn't exist, I sleep the php file, and check again in a bit:
function remoteFileExists() {
    $curl = curl_init('domain.com/file.ext');
    //don't fetch the actual page, you only want to check the connection is ok
    curl_setopt($curl, CURLOPT_NOBODY, true);
    //do request
    $result = curl_exec($curl);
    //read the response code only if the request did not fail
    $statusCode = ($result !== false) ? curl_getinfo($curl, CURLINFO_HTTP_CODE) : null;
    curl_close($curl);
    if ($statusCode == 404) {
        //file is not there yet: wait, then check again
        sleep(7);
        return remoteFileExists();
    }
    //return the message instead of echoing it, so the echo below prints it once
    return ($statusCode === null) ? 'request failed' : 'exists';
}
echo remoteFileExists();

One application: if I send mail by script to 100+ customers, the whole operation takes at most 1-2 seconds, so most mail providers such as Hotmail and Yahoo treat it as spam. To avoid this, we need to add some delay after every mail.

Among others: you are testing a web application that makes asynchronous requests (AJAX calls, lazy image loading, ...).
You are testing it locally so responses are immediate since there is only one user (you) and no network latency.
Using sleep lets you see/test how the web app behaves when load and network cause delay on requests.

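A quick example of where you may not want to get millions of alert emails for a single event but you want your script to keep running (checkSystemCpu() and sendMeAnEmail() are placeholders for your own functions):
if (checkSystemCpu() > 95) {
    sendMeAnEmail();
    sleep(1800); // wait 30 minutes before the script can alert again
}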
