How to deal with streaming data in PHP?

There is a family of methods (birddog, shadow, and follow) in the Twitter API that opens a (mostly) permanent connection and allows you to follow many users. I've run the sample connection code with cURL in bash, and it works nicely: when a user I specify writes a tweet, I get a stream of XML in my console.
My question is: how can I access data with PHP that isn't returned as a direct function call, but is streamed? This data arrives sporadically and unpredictably, and it's not something I've ever dealt with nor do I know where to begin looking for answers. Any advice and descriptions of libraries or pitfalls would be appreciated.

fopen and fgets
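<?php
// Open the remote resource as a read-only stream
$sock = fopen('http://domain.tld/path/to/file', 'r');
if ($sock === false) {
    die('Could not open stream');
}
// fgets() returns false at end of stream or on error
while (($data = fgets($sock)) !== false) {
    echo $data;
}
fclose($sock);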
This is by no means great (or even good) code but it should provide the functionality you need. You will need to add error handling and data parsing among other things.

I'm pretty sure that your script will time out after ~30 seconds of listening for data on the stream. Even if it doesn't, once you get a significant server load, the sheer number of open and listening connections will bring the server to its knees.
I would suggest you take a look at an AJAX solution that makes a call to a script that just stores a Queue of messages. I'm not sure how the Twitter API works exactly though, so I'm not sure if you can have a script run when requested to get all the tweets, or if you have to have some sort of daemon append the tweets to a Queue that PHP can read and pass back via your AJAX call.

There are libraries for this these days that make things much easier (and handle the tricky bits like reconnections, socket handling, TCP backoff, etc.), e.g.:
http://code.google.com/p/phirehose/

I would suggest looking into using AJAX. I'm not a PHP developer, but I would think that you could wire up an AJAX call to the API and update your web page.

Phirehose is definitely the way to go:
http://code.google.com/p/phirehose/

Related

Using Meetup Stream in PHP to import new events

The Meetup API allows you to get pushed data updates. A cURL request to the following URL:
http://stream.meetup.com/2/open_events
will return a constant stream of updated events. This imported stream just keeps updating as new data is pushed to it. I am trying to think up the best way to work with this stream, and I am not sure what is standard. I was thinking of just running a cURL request within a while loop, but that would run forever, and running that from a controller function in my CodeIgniter project seems impractical. What is the proper way to interface with a seemingly infinite stream of events like this? Is there a better way than running a process from a controller function? Would it be more sensible to run a cron job that woke up every so often to continue processing from where it left off before?
UPDATE:
Further investigation has shown me that importing data from this stream uses chunked-transfer encoding. What is confusing to me is that this stream seems to be infinite. It just keeps outputting data non-stop. I still do not understand how to import data from a stream of data like this. A cron job also does not seem to be the answer for importing a stream like this into my mysql database.
This is a link to the resource that documents the Meetup Stream:
http://www.meetup.com/meetup_api/docs/stream/2/open_events/
Thank you for the help.

Without looking closely at the meetup.com docs, it sounds like you would probably want to use JavaScript and Ajax to call your controller (which calls the Meetup web service) at fixed intervals. Your Ajax response would then contain your updated data, and you could update your page accordingly in the Ajax callback function.
Check out the JavaScript setInterval() method.

I'm curious what your solution ended up as. I am just starting, so if there is a solution in place it would help me a great deal.
Also I think persistent applications are the way to go over cron.
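$url = 'http://stream.meetup.com/2/open_events?since_mtime=1326077000';
$fp = fopen($url, 'r');
if ($fp === false) {
    die('Could not open stream');
}
$string = "";
// Keep reading until the server closes the connection
while (!feof($fp)) {
    $data = fread($fp, 45);
    if ($data === false) {
        break; // read error
    }
    $string .= $data;
}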

PHP Background Process on BSD uses 100% CPU

I have a PHP script that runs as a background process. This script simply uses fopen to read from the Twitter Streaming API, which is essentially an HTTP connection that never ends. I can't post the script unfortunately because it is proprietary. The script on Ubuntu runs normally and uses very little CPU. However, on BSD the script always uses nearly 100% CPU. The script is working just fine on both machines and is the exact same script. Can anyone think of something that might point me in the right direction to fix this? This is the first PHP script I have written to consistently run in the background.
The script is an infinite loop, it reads the data out and writes to a json file every minute. The script will write to a MySQL database whenever a reconnect happens, which is usually after days of running. The script does nothing else and is not very long. I have little experience with BSD or writing PHP scripts that run infinite loops. Thanks in advance for any suggestions, let me know if this belongs in another StackExchange. I will try to answer any questions as quickly as possible, because I realize the question is very vague.

Without seeing the script, this is very difficult to give you a definitive answer, however what you need to do is ensure that your script is waiting for data appropriately. What you should absolutely definitely not do is call stream_set_timeout($fp, 0); or stream_set_blocking($fp, 0); on your file pointer.
The basic structure of a script to do something like this that should avoid racing would be something like this:
// Open the file pointer and set blocking mode
$fp = fopen('http://www.domain.tld/somepage.file','r');
stream_set_timeout($fp, 1);
stream_set_blocking($fp, 1);
while (!feof($fp)) { // This should loop until the server closes the connection
    // This line should be pretty much the first line in the loop
    // It will try and fetch a line from $fp, and block for 1 second
    // or until one is available. This should help avoid racing
    // You can also use fread() in the same way if necessary
    if (($str = fgets($fp)) === FALSE) continue;
    // rest of app logic goes here
}
You can use sleep()/usleep() to avoid racing as well, but the better approach is to rely on a blocking function call to do your blocking. If it works on one OS but not on another, try setting the blocking modes/behaviour explicitly, as above.
If you can't get this to work with a call to fopen() passing a HTTP URL, it may be a problem with the HTTP wrapper implementation in PHP. To work around this, you could use fsockopen() and handle the request yourself. This is not too difficult, especially if you only need to send a single request and read a constant stream response.
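One gap in the loop above is that fgets() returns FALSE both when the read times out and when the stream has really ended. A minimal sketch of one way to tell the two apart, assuming an already-open stream; readStream() and its parameters are hypothetical names used only for illustration:

```php
<?php
// Sketch only: readStream() is a hypothetical helper, not part of PHP.
// Reads lines from an open stream, blocking up to $timeoutSec per read.
function readStream($fp, callable $onLine, int $timeoutSec = 1): void
{
    stream_set_blocking($fp, true);
    stream_set_timeout($fp, $timeoutSec);

    while (!feof($fp)) {
        $line = fgets($fp);
        if ($line === false) {
            // false means either "timed out" or "end of stream / error"
            $meta = stream_get_meta_data($fp);
            if ($meta['timed_out']) {
                continue; // nothing arrived in time; block again
            }
            break; // stream ended or failed; the caller may reconnect
        }
        $onLine(rtrim($line, "\r\n"));
    }
}
```

The callback keeps the application logic out of the read loop, so the same function could be pointed at a stream opened with fopen() or fsockopen().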

It sounds to me like one of your functions is blocking briefly on Linux, but not BSD. Without seeing your script it is hard to get specific, but one thing I would suggest is to add a usleep() before the next loop iteration:
usleep(100000); //Sleep for 100ms
You don't need a long sleep... just enough so that you're not using 100% CPU.
Edit: Since you mentioned you don't have a good way to run this in the background right now, I suggest checking out this tutorial for "daemonizing" your script. Included is some handy code for doing this. It can even make a file in init.d for you.

What does the code that does the actual reading look like? Do you just hammer the socket until you get something?
One really effective way to deal with this is to use the libevent extension, but that's not for the feeble minded.

PHP: game loop (threads or the sort)

I am writing PHP code to be a game client. It uses socket; socket_create followed by socket_connect and then socket_read. It works fine, but the issue is that the server can send a packet at any time which means socket_read needs to be happening constantly in a "game loop". So something like this:
<?php
$reply = "";
do {
    // Read up to 1400 bytes; the length must be an integer
    $recv = socket_read($socket, 1400);
    if ($recv != "") {
        $reply .= $recv;
    }
} while ($recv != "");
echo($reply);
?>
Doesn't work because it's stuck in the loop (server doesn't terminate connection until game is quit by client) and the PHP code needs to handle the packet stuff as it comes in.
So PHP doesn't really have threading. What's the best way of handling this?

Basically any software platform is going to butt up against this problem. Most, as you've figured out, solve it with threading. While threading IS possible in PHP, it requires MAJORHAXXX, such as launching a command-line PHP process from within PHP.
It really doesn't end up being ideal.
However, there are other ways to get around this.
But you need to check ALL the marks on this list first:
[] - My game doesn't need to constantly keep checking the server, such as for player locations or complex movements. Anything beyond a chat-room level of data transfer and update rates should leave this box un-checked.
[] - My game doesn't need to be told BY THE SERVER anything. It is perfectly acceptable for the client to ask for anything it needs, perhaps once a second or better off once a minute.
[] - My game doesn't need to keep a constant simulation of a complex world running on the server for longer than it takes to complete a request. Tracking chat is one thing, doing physics and graphics modifications is another.
If you checked all of these boxes, then PHP is STILL IN THE GAME! Otherwise, don't bother.
Basically, what I am saying here is that PHP is great for games that aren't really multiplayer, and that are turn-based or at least not very interactive. But once you have to keep things going without the player, PHP falls on its face.
VOODOO LEVEL
But if you simply MUST do this, there ARE ways to get around it.
A - Create a PHP Daemon that runs your world, pipe all other traffic to either a getter or setter request file that interacts with the database. So, you might request a getting of the game world state, or set a value that the player performed. All other game-world related things can be handled by the daemon and the game itself takes place in the database.
B - Use cron, not a Daemon. (dangerous, but we already established you as a risk taker, right?)
C - TRY only a Daemon and listening to sockets, then sending out threads (via exec()) to respond. Kind of like AndreKR's idea above, only you don't need to sleep. Problem here is you will almost always end up missing stuff or otherwise getting cut off. And the whole thing might explode if the Daemon gets run twice somehow.

If you really want to do this, you have to sleep for some time, check the socket, sleep again, check the socket...
To check the socket without blocking you need to use non-blocking I/O, which you can achieve with socket_set_nonblock(), or with socket_recv(), which accepts the MSG_DONTWAIT flag.

Can be done, but I agree with @Andrey and @DampeS8N, not the best choice. If you are dead set on doing this, check out this book: You want to do WHAT with PHP?

TCP implementations tend to fragment and join messages; there's no telling how much data or how many message fragments a socket receive will return. You need to know where a message ends and a new one begins (which may happen multiple times in data returned by a single read). Some simple solutions:
1. Use some kind of delimiter. End each message with '\0'.
2. Send the message size along with the message. Start each message with "Content-length: 42\n" or two size bytes (0x00 0x42).
3. Use XML. <message> starts and </message> ends a message.
PHP's XML parser doesn't like incomplete XML, though, so the third option is out unless you want to match the start and end tags manually. Use the first option if the protocol is based on ASCII, the second if it's binary, and the third if it's already XML.
Now, remember you can get any number of messages per packet. In the most complex case, you might have the end of an earlier message followed by a number of full messages and the beginning of yet another message in a single packet.
A full solution would be along these lines:
while (connected) {
    while (messages in buffer < 1) {
        read from socket;
        add to buffer;
    }
    while (messages in buffer > 0) {
        extract message from buffer;
        process message;
    }
}
...though this is an asynchronous message loop. I'll leave the "if there's a message available, return it; else, wait for one" synchronous implementation as an exercise. (Hint: You'll need a class to build and buffer messages.)
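As a rough illustration of the buffering step for the delimiter option, here is a minimal sketch. MessageBuffer is a hypothetical class name, not something PHP provides, and the delimiter is assumed to never appear inside a message:

```php
<?php
// Sketch only: MessageBuffer is a hypothetical helper class.
final class MessageBuffer
{
    private $buffer = '';
    private $delimiter;

    public function __construct(string $delimiter = "\0")
    {
        $this->delimiter = $delimiter;
    }

    // Append raw bytes exactly as they came off the socket.
    public function append(string $chunk): void
    {
        $this->buffer .= $chunk;
    }

    // Return every complete message currently buffered. A trailing
    // partial message stays in the buffer until the rest arrives.
    public function extract(): array
    {
        $parts = explode($this->delimiter, $this->buffer);
        $this->buffer = array_pop($parts); // unfinished tail, possibly ''
        return $parts;
    }
}
```

Each socket read would be passed to append(), and extract() would then be called to process whatever whole messages have accumulated, which covers the case where one read contains the end of one message, several full messages, and the start of another.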

PHP has no multithreading, so you should really consider using a more suitable language (like Andrey mentioned in his comment).

All you have to do is to use socket_select() function:
http://php.net/manual/en/function.socket-select.php
It will put your script to sleep and wake it up when there is data on the socket to be read. It's waaay more efficient than periodic sleep/read, cron scripts and all other proposed solutions.
@aib made a valid point. The server might send a complete "game message" divided into several packets. Don't expect to get all your data in a single execution of the code block after socket_select() returns.
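A minimal sketch of waiting on a socket with socket_select(), assuming the sockets extension is loaded and the socket is already connected. waitForData() is a hypothetical helper name, and the 1400-byte read size is taken from the question:

```php
<?php
// Sketch only: waitForData() is a hypothetical helper; requires ext-sockets.
// Sleeps until $socket is readable or $timeoutSec elapses.
// Returns the bytes read, '' if the peer closed, or null on timeout.
function waitForData($socket, int $timeoutSec = 5): ?string
{
    // socket_select() takes its arrays by reference, so use variables
    $read = [$socket];
    $write = null;
    $except = null;

    $ready = socket_select($read, $write, $except, $timeoutSec);
    if ($ready === false) {
        throw new RuntimeException(socket_strerror(socket_last_error()));
    }
    if ($ready === 0) {
        return null; // timed out, nothing to read
    }

    $data = socket_read($socket, 1400);
    if ($data === false) {
        throw new RuntimeException(socket_strerror(socket_last_error($socket)));
    }
    return $data; // may hold a partial message or several messages
}
```

Because a single return value may contain only part of a message, the result would normally be appended to a buffer before any parsing.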

Instead of writing this smelly blocking polling loop, check out some event system based around the reactor pattern like Python Twisted or Ruby EventMachine.
I believe the PHP flavor is called PHP-MIO: http://thethoughtlab.blogspot.com/2007/04/non-blocking-io-with-php-mio.html

PHP and AJAX, sending new PHP info to page for AJAX to receive, is this possible?

I'm searching on how to do this but my searches aren't turning up things that are talking about what I'm trying to do so maybe I'm not searching with the right terms or this isn't possible, but figured I would ask here for help.. this is what I am trying to do..
I have PHP scripts that are called asynchronously: a script is called and just runs, and the calling PHP doesn't wait for a response, so it can go on to do other stuff / free things up so another async PHP process can be run.
I would still like to get back a result from these "zombie" scripts or whatever you want to call them, however the only way I can think of doing it that I know for sure will work is something like make this "zombie" script save its final output to a database and then have my AJAX UI make periodic requests to this database to check if the needed value exists in the place it is supposed to.. which would allow it to get the output from the zombie PHP script..
I am thinking it would be better if somehow this zombie script could do a sort of page refresh to the AJAX ui but the ajax ui would intercept this and just take the received data from PHP and use it as needed (such as display in a DIV for user to see).. basically I'm wondering if you can make PHP force this kind of thing rather than needing to involve a database in this and making AJAX do repeated requests to check for a specific value that way..
Thanks for any advice

No, a background script has no way to influence the client's front-end because it has no connection to it.
Starting a background script, having the script write status data into a shared space - be it a database or a memcache or a similar solution - and polling the status through Ajax is usually indeed the best way to go.
One alternative may be Comet. It's a technique where a connection is kept open over a long time, and updated actively from the server side (instead of frequent client-side Ajax polling). I have no practical experience with this but I imagine it most probably needs server side tweaking to be doable in PHP - it's not the best platform for long-running stuff. See this question for some approaches.
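The shared-space polling approach from this answer could look roughly like the sketch below, which uses a JSON file as the shared space. The function names, the 'state' key and the file path are all hypothetical; a database row or memcache key would serve the same purpose:

```php
<?php
// Sketch only: function names, keys and the status path are hypothetical.

// Called by the background script whenever its state changes.
function writeStatus(string $path, array $status): void
{
    $tmp = $path . '.tmp';
    file_put_contents($tmp, json_encode($status));
    // Swap the file into place so a reader never sees a half-written file
    rename($tmp, $path);
}

// Called by the endpoint that the Ajax poll requests.
function readStatus(string $path): array
{
    if (!is_file($path)) {
        return ['state' => 'pending']; // background script has not reported yet
    }
    $decoded = json_decode(file_get_contents($path), true);
    return is_array($decoded) ? $decoded : ['state' => 'pending'];
}
```

The polled endpoint would then simply echo json_encode(readStatus($path)), and the page would keep requesting it until the state it is waiting for shows up.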

Alternative to header(location: ) php

I have a while loop that constructs a url for an SMS api.
This loop will eventually be sending hundreds of messages, thus being hundreds of urls.
How would I go about doing this?
I know you can use header(location: ) to change the location of the browser, but this isn't going to work, as the PHP page needs to remain running.
Hope this is clear.
Thank you.

You have a few options:
file_get_contents as Trevor noted
curl_* - Use the curl library of commands to make the request
fsock* - Handle the connection at a slightly lower level, by making and managing the socket connection yourself.
All will probably work just fine and you should pick one depending on your overall needs.

After you construct each $url, use file_get_contents($url)
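A minimal sketch of that loop, assuming allow_url_fopen is enabled so that file_get_contents() can fetch URLs. sendAll() is a hypothetical helper, and the optional $fetch parameter exists only so the request function can be swapped out (for cURL, or for a fake in tests):

```php
<?php
// Sketch only: sendAll() is a hypothetical helper name.
// Requests each URL in turn and returns the ones that failed.
function sendAll(array $urls, ?callable $fetch = null): array
{
    // Default to file_get_contents(), which needs allow_url_fopen
    $fetch = $fetch ?? 'file_get_contents';
    $failed = [];
    foreach ($urls as $url) {
        if ($fetch($url) === false) {
            $failed[] = $url; // remember failures so they can be retried
        }
    }
    return $failed;
}
```

The page keeps running through the whole loop because the requests are made from the server, not by redirecting the browser.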

If it is just a case that during the construction of all these URLs you get the error "Maximum Execution Time Exceeded", then just add set_time_limit(10); after the URL generation to give your script an extra 10 seconds to generate the next URL.
I'm not quite sure what you are actually asking in this question - do you want the user to visit the URLs (if so, does the end user's web browser support JavaScript?), just be shown the URLs, for the URLs to be generated and stored, or for the PHP script to fetch each URL (and do you care about the user seeing the result) - but if you clarify the question, the community may be able to provide you with a perfect answer!

Applying a huge amount of guesswork, I infer from your post that you need to dynamically create a URL, and the invoking of that URL causes an SMS message to be sent.
If this is the case, then you should not be trying to invoke the URL from the client but from the server side, using the URL wrappers or cURL.
You should also consider running the loop in a separate process and reporting back to the browser using (e.g.) AJAX.
Have a google for spawning long running processes in PHP - but be warned there is a lot of bad advice on the topic published out there.
C.
