Using Meetup Stream in PHP to import new events

The Meetup API allows you to get pushed data updates. A cURL request to:
http://stream.meetup.com/2/open_events
will return a constant stream of updated events. The stream just keeps updating as new data is pushed to it. I am trying to work out the best way to work with this stream, and I am not sure what is standard. I was thinking of just running a cURL request within a while loop, but that would run forever, and running it from a controller function in my CodeIgniter project seems impractical. What is the proper way to interface with a seemingly infinite stream of events like this? Is there a better way than running a process from a controller function? Would it be more sensible to run a cron job that wakes up every so often and continues processing from where it left off?
UPDATE:
Further investigation has shown me that importing data from this stream uses chunked transfer encoding. What is confusing to me is that the stream seems to be infinite: it just keeps outputting data non-stop. I still do not understand how to import data from a stream like this. A cron job also does not seem to be the answer for importing a stream like this into my MySQL database.
This is a link to the resource that documents the Meetup Stream:
http://www.meetup.com/meetup_api/docs/stream/2/open_events/
Thank you for the help.

Without looking closely at the meetup.com docs, it sounds like you would probably want to use JavaScript and Ajax to call your controller (which calls the Meetup web service) at fixed intervals. Your Ajax response would then contain your updated data, and you could update your page accordingly in the Ajax callback function.
Check out the JavaScript setInterval() method.
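On the PHP side, a minimal sketch of what such a poll endpoint might look like in a CodeIgniter-style controller. The class and method names are made up, and the REST-style Meetup URL and its since_mtime parameter are assumptions based on the stream docs linked above, so check the API reference before relying on them:
<?php
// Hypothetical controller action polled by the JavaScript setInterval() call.
class Events extends CI_Controller
{
    public function latest()
    {
        // Only fetch events newer than the client's last poll.
        $since = (int) $this->input->get('since_mtime');
        $url   = 'https://api.meetup.com/2/open_events?since_mtime=' . $since; // assumed endpoint

        $json = file_get_contents($url);

        $this->output
             ->set_content_type('application/json')
             ->set_output($json !== false ? $json : '{"results":[]}');
    }
}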

I'm curious what your solution ended up being. I am just starting, so if there is a solution in place it would help me a great deal.
Also, I think persistent applications are the way to go over cron.
$url = 'http://stream.meetup.com/2/open_events?since_mtime=1326077000';

// The HTTP stream wrapper handles the chunked transfer encoding for us.
$fp = fopen($url, 'r');

$string = '';
// Read until the server closes the connection, which for this stream may be never.
while (!feof($fp)) {
    $data = fread($fp, 4096);
    if ($data !== false && $data !== '') {
        $string .= $data; // in practice, parse and store each chunk here instead of buffering forever
    }
}
fclose($fp);

Related

PHP multiple requests at once are creating incorrect database entries

What I'm running here is a graphical file manager, akin to OneDrive or OpenCloud or something like that. Files, folders, accounts, and the main server settings are all stored in the database as JSON-encoded objects (yes, I did get rid of columns in favor of JSON). The problem is that if multiple requests use the same object at once, they'll often save back incorrect data because the requests obviously can't communicate their changes to each other.
For example, when someone starts a download, it loads the account object of the owner of that file, increments its bandwidth counter, and then encodes/saves it back to the DB at the end of the download. But say if I have 3 downloads of the same file at once, they'll all load the same account object, change the data as they see fit, and save back their data without regards to the others that overlap. In this case, the 3 downloads would show as 1.
Besides downloads and bandwidth going uncounted, I'm also having a problem with a maintenance function I'm trying to create that loads the server object and doesn't save it back for potentially several minutes. This obviously won't work while downloads are happening and manipulating the server object all the while, because it will all just be overwritten with old data when the maintenance function finishes.
Basically it's a threading issue. I've looked into PHP APC in the hope that I could make objects persist globally between threads, but that doesn't work, since it just serializes/deserializes data for each request rather than actually having each request point to an object in memory.
I have absolutely no idea how to fix this without completely designing a new system that's totally different.... which sucks.
Any ideas on how I should go about this would be awesome.
Thanks!
It's not a threading issue. Your database doesn't conform to any of the standard rules for building databases, not even the first normal form: every cell must contain only one value. When you store JSON blobs in the DB, you cannot write a single SQL statement to make that update atomic. So, yes, you need to put that code in the trash bin.
In case you really need to get that code working, you can use mutexes to synchronize the running PHP scripts. The most common implementation in PHP is a file mutex.
You can try to use flock. I guess you already have a user id before fetching the JSON from the DB.
$lockfile = "/tmp/userlocks/$userid.txt";
$fp = fopen($lockfile, "w+");

if (flock($fp, LOCK_EX)) {   // blocks until the exclusive lock is acquired
    // Do your JSON update
    flock($fp, LOCK_UN);     // unlock
} else {
    // lock exists, someone else is updating
}
fclose($fp);
What you need to figure out is what to do when there is already a lock: maybe wait 0.5 seconds and try to obtain the lock again (a sketch of this follows), or send a message like "Only one simultaneous download allowed", or ...
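A minimal sketch of the retry approach, using a non-blocking lock attempt; the retry count and delay are arbitrary:
$fp = fopen($lockfile, 'w+');
$acquired = false;

// Try up to 10 times, sleeping 0.5 s between non-blocking attempts.
for ($i = 0; $i < 10; $i++) {
    if (flock($fp, LOCK_EX | LOCK_NB)) {
        $acquired = true;
        break;
    }
    usleep(500000); // 0.5 seconds
}

if ($acquired) {
    // Do your JSON update
    flock($fp, LOCK_UN);
} else {
    // Give up and tell the user the resource is busy
}
fclose($fp);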

Getting big amount of data from a very slow external data-source

I need to receive a large amount of data from an external source. The problem is that the external source sends data very slowly. The workflow is like this:
The user initiates some process from the app interface (commonly this is fetching data from a local XML file). This is a quite fast process.
After that we need to load information connected with the fetched data from the external source (basically it is external statistics for the data from the XML). And it is very slow. But the user needs this additional information to continue working. For example, he may perform filtering according to the external data or something else.
So we need to do it asynchronously. The main idea is to show the external data as it becomes available. The question is how we could organise this async process. Maybe some queues or something else? We're using PHP + MySQL on the backend and jQuery on the front-end.
Thanks a lot!
Your two possible strategies are:
Do the streaming on the backend, using a PHP script that curls the large external resource into a database or memcache, and responds to periodic requests for new data by flushing that DB row or cache into the response (a rough sketch of this follows after the list).
Do the streaming on the frontend, using a cross-browser JavaScript technique explained in this answer. In Gecko and WebKit, the XmlHttpRequest.onreadystatechange event fires every time new data is received, making it possible to stream data slowly into the JavaScript runtime. In IE, you need to use an iframe workaround, also explained in the Ajax Patterns article linked in the above SO post.
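Here is a minimal sketch of the backend strategy. It assumes the external resource streams line-delimited records, that a progress table with job_id, id and data columns exists, and that the jQuery side simply polls poll.php with the last id it has seen; all of those names are made up:
<?php
// fetch_worker.php -- long-running CLI script that streams the slow external
// resource into MySQL so the frontend can poll for partial results.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$ins = $pdo->prepare('INSERT INTO progress (job_id, data) VALUES (?, ?)');

$fp = fopen('http://external.example.com/slow-stats', 'r'); // placeholder URL
while (!feof($fp)) {
    $line = fgets($fp);
    if ($line === false || trim($line) === '') {
        continue;
    }
    $ins->execute([$argv[1], trim($line)]); // store each record as it arrives
}
fclose($fp);

<?php
// poll.php -- called periodically by jQuery; returns everything stored since the last poll.
$pdo  = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$stmt = $pdo->prepare('SELECT id, data FROM progress WHERE job_id = ? AND id > ? ORDER BY id');
$stmt->execute([$_GET['job_id'], (int) $_GET['last_id']]);
header('Content-Type: application/json');
echo json_encode($stmt->fetchAll(PDO::FETCH_ASSOC));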
One possible solution would be to make the cURL call using system() with the output redirected to a file. That way PHP would not hang until the call is finished. From the PHP manual for system():
If a program is started with this function, in order for it to continue running in the background, the output of the program must be redirected to a file or another output stream. Failing to do so will cause PHP to hang until the execution of the program ends.
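For example, something along these lines; the URL and output file are placeholders, and the trailing & plus the redirect are what keep PHP from waiting:
// Kick off the slow download in the background; PHP returns immediately.
system('curl -s "http://external.example.com/slow-stats" > /tmp/external_stats.json 2>&1 &');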
This would split the data gathering from the user interface. You could then work with the gathered local data by several means, for example:
employ an iframe in the GUI that refreshes itself at some interval and fetches data from the locally stored file (and possibly stores it in the database or whatever),
use jQuery to make AJAX calls to get the data and manipulate it,
use some CGI script that runs in the background, handles the database writes, and displays the data from the DB directly using one of the above,
dozens more I can't think of now...

PHP Background Process on BSD uses 100% CPU

I have a PHP script that runs as a background process. This script simply uses fopen to read from the Twitter Streaming API, essentially an HTTP connection that never ends. I can't post the script, unfortunately, because it is proprietary. On Ubuntu the script runs normally and uses very little CPU. However, on BSD the script always uses nearly 100% CPU. The script works just fine on both machines and is the exact same script. Can anyone think of something that might point me in the right direction to fix this? This is the first PHP script I have written to run consistently in the background.
The script is an infinite loop; it reads the data and writes it to a JSON file every minute. The script writes to a MySQL database whenever a reconnect happens, which is usually after days of running. The script does nothing else and is not very long. I have little experience with BSD or with writing PHP scripts that run in infinite loops. Thanks in advance for any suggestions, and let me know if this belongs on another Stack Exchange. I will try to answer any questions as quickly as possible, because I realize the question is very vague.
Without seeing the script, it is very difficult to give you a definitive answer; however, what you need to do is ensure that your script waits for data appropriately. What you should absolutely not do is call stream_set_timeout($fp, 0); or stream_set_blocking($fp, 0); on your file pointer.
The basic structure of a script to do something like this that should avoid racing would be something like this:
// Open the file pointer and set blocking mode
$fp = fopen('http://www.domain.tld/somepage.file', 'r');
stream_set_timeout($fp, 1);
stream_set_blocking($fp, 1);

while (!feof($fp)) { // This should loop until the server closes the connection
    // This line should be pretty much the first line in the loop.
    // It will try to fetch a line from $fp, and block for 1 second
    // or until one is available. This should help avoid racing.
    // You can also use fread() in the same way if necessary.
    if (($str = fgets($fp)) === FALSE) continue;

    // rest of app logic goes here
}
You can use sleep()/usleep() to avoid racing as well, but the better approach is to rely on a blocking function call to do your blocking. If it works on one OS but not on another, try setting the blocking modes/behaviour explicitly, as above.
If you can't get this to work with a call to fopen() passing an HTTP URL, it may be a problem with the HTTP wrapper implementation in PHP. To work around this, you could use fsockopen() and handle the request yourself. This is not too difficult, especially if you only need to send a single request and read a constant stream response.
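A rough sketch of the fsockopen() approach; host and path are placeholders, and requesting HTTP/1.0 sidesteps having to de-chunk the response yourself:
$fp = fsockopen('stream.example.com', 80, $errno, $errstr, 30);
if (!$fp) {
    die("Connect failed: $errstr ($errno)");
}
stream_set_blocking($fp, 1);

// Hand-rolled HTTP/1.0 request: no chunked encoding to deal with.
fwrite($fp, "GET /some/stream HTTP/1.0\r\nHost: stream.example.com\r\nConnection: close\r\n\r\n");

// Skip the response headers.
while (($line = fgets($fp)) !== false && trim($line) !== '') {
}

// Read the body as it trickles in, blocking on each line.
while (!feof($fp)) {
    $line = fgets($fp);
    if ($line === false) {
        continue;
    }
    // process $line here
}
fclose($fp);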
It sounds to me like one of your functions is blocking briefly on Linux, but not BSD. Without seeing your script it is hard to get specific, but one thing I would suggest is to add a usleep() before the next loop iteration:
usleep(100000); //Sleep for 100ms
You don't need a long sleep... just enough so that you're not using 100% CPU.
Edit: Since you mentioned you don't have a good way to run this in the background right now, I suggest checking out this tutorial for "daemonizing" your script. It includes some handy code for doing this, and it can even create a file in init.d for you.
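The core of most of those daemonizing recipes is just a fork followed by a new session; a stripped-down sketch, assuming the pcntl and posix extensions are available and leaving out pid files and signal handling:
// Fork and let the parent exit so the shell gets its prompt back.
$pid = pcntl_fork();
if ($pid === -1) {
    die('Could not fork');
} elseif ($pid > 0) {
    exit(0); // parent process ends here
}

// Detach the child from the controlling terminal.
posix_setsid();

// From here on, run the stream-reading loop as usual.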
What does the code that does the actual reading look like? Do you just hammer the socket until you get something?
One really effective way to deal with this is to use the libevent extension, but that's not for the feeble-minded.

PHP zip archive progress bar

I've googled for this but didn't find any solution: is there a way to create a progress bar for adding/extracting files to/from a zip archive in PHP?
Can I get some kind of status message which I can then fetch with an AJAX request and use to update the progress bar?
Thanks.
I am trying to do the same thing at the moment; it is mostly* complete (*see issues section below).
The basic concept I use is to have 2 files/processes:
Scheduler (starts task and can be called to get updates)
Task (actually completes the task of zipping)
The Scheduler will:
Create a unique update token and save to cache (APC)
Call the Task page using curl_multi_exec which is asynchronous, passing update_token
Return the token in JSON format
OR
Return the contents of the APC under the update_token (in my case this is a simple status array) as JSON
The Task will:
Update the APC with status, using the update token
Do the actual work :)
Client-side
You'll need some JavaScript to call the Scheduler, get the token in return, then call the Scheduler again, passing the update_token, to get updates, and then use the returned values to update the HTML. A rough sketch of the two PHP pieces follows.
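This is a stripped-down sketch of the Scheduler and Task, assuming the classic APC functions (apc_store/apc_fetch) are available and that the status is a simple array; the file names, URL and keys are made up:
<?php
// scheduler.php -- starts a job, or reports progress on an existing one.
if (isset($_GET['update_token'])) {
    // Progress poll: return whatever the Task last stored in APC.
    header('Content-Type: application/json');
    echo json_encode(apc_fetch($_GET['update_token']) ?: array('state' => 'pending'));
    exit;
}

// New job: create a token and fire the Task asynchronously.
$token = uniqid('zipjob_', true);
apc_store($token, array('state' => 'queued', 'done' => 0, 'total' => 0));

$mh = curl_multi_init();
$ch = curl_init('http://localhost/task.php?update_token=' . $token); // placeholder URL
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_multi_add_handle($mh, $ch);
curl_multi_exec($mh, $running); // kick the request off without waiting for it to finish

header('Content-Type: application/json');
echo json_encode(array('update_token' => $token));

<?php
// task.php -- does the zipping and keeps the status array in APC up to date.
ignore_user_abort(true); // keep running even though the Scheduler disconnects
$token = $_GET['update_token'];
$files = glob('/path/to/files/*'); // whatever needs archiving
$zip   = new ZipArchive();
$zip->open('/tmp/archive.zip', ZipArchive::CREATE);

foreach ($files as $i => $file) {
    $zip->addFile($file, basename($file));
    apc_store($token, array('state' => 'running', 'done' => $i + 1, 'total' => count($files)));
}
$zip->close(); // note: most of the real work happens here
apc_store($token, array('state' => 'finished', 'done' => count($files), 'total' => count($files)));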
Potential pitfalls
Sessions can be a problem. If the requests share the same session, you will notice that the second request waits for the first one to complete before being served, because PHP locks the session for the duration of each request. This is why I store the status in APC instead.
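If you do need the session, the usual workaround is to release the session lock as soon as you have read what you need, so concurrent requests are no longer serialized:
session_start();
$userId = $_SESSION['user_id']; // read whatever you need from the session

session_write_close(); // release the session lock so parallel requests can proceed

// ... long-running work (zipping, polling, etc.) continues here ...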
Current Issues
The problem with the ZipArchive class is that it appears to do all the grunt work in the ->close() method, whilst the addFile() method appears to take little to no time to complete.
As a workaround you can close and then reopen the archive at specific byte or file intervals. This slows the zipping process down a little, but in my case that is acceptable, as a visual progress bar is better than just waiting with no indication of what is happening.
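A sketch of that close/reopen workaround, flushing the archive every N files so the status reflects work actually written to disk; the interval of 10 is arbitrary:
$zip = new ZipArchive();
$zip->open($archivePath, ZipArchive::CREATE);

foreach ($files as $i => $file) {
    $zip->addFile($file, basename($file));

    // Flush to disk every 10 files so close() doesn't do all the work at the end.
    if (($i + 1) % 10 === 0) {
        $zip->close(); // this is where the actual compression happens
        apc_store($token, array('done' => $i + 1, 'total' => count($files)));
        $zip->open($archivePath); // reopen the existing archive and keep appending
    }
}
$zip->close();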

How to deal with streaming data in PHP?

There is a family of methods (birddog, shadow, and follow) in the Twitter API that opens a (mostly) permanent connection and allows you to follow many users. I've run the sample connection code with cURL in bash, and it works nicely: when a user I specify writes a tweet, I get a stream of XML in my console.
My question is: how can I access data with PHP that isn't returned as a direct function call, but is streamed? This data arrives sporadically and unpredictably, and it's not something I've ever dealt with nor do I know where to begin looking for answers. Any advice and descriptions of libraries or pitfalls would be appreciated.
fopen and fgets
<?php
$sock = fopen('http://domain.tld/path/to/file', 'r');

// fgets() blocks until a full line is available or the connection is closed.
while (($data = fgets($sock)) !== false) {
    echo $data;
}
fclose($sock);
This is by no means great (or even good) code but it should provide the functionality you need. You will need to add error handling and data parsing among other things.
I'm pretty sure that your script will time out after ~30 seconds of listening for data on the stream. Even if it doesn't, once you get a significant server load, the sheer number of open, listening connections will bring the server to its knees.
I would suggest you take a look at an AJAX solution that makes a call to a script that just stores a queue of messages. I'm not sure exactly how the Twitter API works, though, so I'm not sure whether you can have a script run on request to fetch all the tweets, or whether you have to have some sort of daemon append the tweets to a queue that PHP can read and pass back via your AJAX call.
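As a rough sketch of the queue idea: a background daemon (the streaming reader from the other answers) appends tweets to a table, and the AJAX endpoint just returns whatever has accumulated since the client's last poll. The table and column names here are made up:
<?php
// queue_poll.php -- called from the page via AJAX.
$pdo  = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$stmt = $pdo->prepare('SELECT id, payload FROM tweet_queue WHERE id > ? ORDER BY id');
$stmt->execute([isset($_GET['last_id']) ? (int) $_GET['last_id'] : 0]);

header('Content-Type: application/json');
echo json_encode($stmt->fetchAll(PDO::FETCH_ASSOC));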
There are libraries for this these days that make things much easier (and handle the tricky bits like reconnections, socket handling, TCP backoff, etc.), e.g.:
http://code.google.com/p/phirehose/
I would suggest looking into using AJAX. I'm not a PHP developer, but I would think that you could wire up an AJAX call to the API and update your web page.
Phirehose is definitely the way to go:
http://code.google.com/p/phirehose/
