System with two asynchronous processes

System with two asynchronous processes - php

I'm planning to write a system which should accept input from users (from browser), make some calculations and show updated data to all users, currently visiting certain website.
Input can come one time in a hour, but can also come 100 times each second. It is VERY important not to loose any of user inputs, but really register and process ALL of them.
So, the idea was to create two programs. One will receive data (input) from browser and store it somehow in a queue (maybe an array, to be really fast?). Second program should wait until there are new items in the queue (saving resources) and then became active and begin to process the queue items. Both programs should run asynchronously.
I can php, so I would write first program using php. But I'm not sure about second part.. I'm not sure about how to send an event from first to second program. I need some advice at this point. Threads are not possible with php? I need some ideas how to create the system like i discribed.
I would use comet server to communicate feedback to the website the input came from (this part already tested)

As per the comments above, trivially you appear to be describing a message queueing / processing system, however looking at your question in more depth this is probably not the case:
Both programs should run asynchronously.
Having a program which process a request from a browser but does it asynchronously is an oxymoron. While you could handle the enqueueing of a message after dealing with the HTTP request, its still a synchronous process.
It is VERY important not to loose any of user inputs
PHP is not a good language for writing control systems for nuclear reactors (nor, according to Microsoft, is Java). HTTP and TCP/IP are not ideal for real time systems either.
100 times each second
Sorry - I thought you meant there could be a lot of concurrent requests. This is not a huge amount.
You seem to be confusing the objective of using COMET / Ajax with asynchronous processing of the application. Even with very large amounts of data, it should be possible to handle the interaction using a single php script working synchronously.

Related

PHP - How would you make buildings or such things finish etc?

So as you know in some browser-games such as Travian, Tribalwars and etcetera, you can build up a building, it takes X amount of time and it finishes.
So I'm curious how that is done?
Is it a cron-job running every second or what? How are they then doing with troops, they can't have a cron-job running ever millisecond, that wouldn't be resource usage friendly, right?
So I'm really curious about this and I have no idea so I can't really say I have tried. I have however searched around but never found anything helpful.
Thanks.

The easiest way to implement something like this is simple timestamps. The request on the front end generates a timestamp based on the constraints given by the details of the request (what type of building you are building, what level you are, if you have bought the upgrade). Then a timestamp is inserted into the database for when the completion occurs. Then, if you want the browser to refresh when the job is up, you make a script on the js that makes a request for all timestamps in queue and reloads when they come up.

One way: You let the client handle the timer. So the timer will sit on the browser side counting down using javascript. When the time is up it will contact server to see if it's valid (never trust client side code). Server looks up the building and see if it would have been finished by then. Server side doesn't need to keep any timers it just answers requests. Timers are UI side.

Well, PHP is one of the worst things you could use to build a game... In games, everything is controlled by the main loop that controls the game. So basically, in a game, everything is running inside an infinite loop, albeit one that allows for user input without freezeing the computer, obviously. So that loop takes care of the timing as well, and the way to compute timing will depends on the language on which the game is developed. For web-based games, Java, JavaScript and Flash are usual choices.

How to process massive data-sets and provide a live user experience

I am a programmer at an internet marketing company that primaraly makes tools. These tools have certian requirements:
They run in a browser and must work in all of them.
The user either uploads something (.csv) to process or they provide a URL and API calls are made to retrieve information about it.
They are moving around THOUSANDS of lines of data (think large databases). These tools literally run for hours, usually over night.
The user must be able to watch live as their information is processed and is presented to them.
Currently we are writing in PHP, MySQL and Ajax.
My question is how do I process LARGE quantities of data and provide a user experience as the tool is running. Currently I use a custom queue system that sends ajax calls and inserts rows into tables or data into divs.
This method is a huge pain in the ass and couldnt possibly be the correct method. Should I be using a templating system or is there a better way to refresh chunks of the page with A LOT of data. And I really mean a lot of data because we come close to maxing out PHP memory and is something we are always on the look for.
Also I would love to make it so these tools could run on the server by themselves. I mean upload a .csv and close the browser window and then have an email sent to the user when the tool is done.
Does anyone have any methods (programming standards) for me that are better than using .ajax calls? Thank you.
I wanted to update with some notes incase anyone has the same question. I am looking into the following to see which is the best solution:
SlickGrid / DataTables
GearMan
Web Socket
Ratchet
Node.js
These are in no particular order and the one I choose will be based on what works for my issue and what can be used by the rest of my department. I will update when I pick the golden framework.

First of all, you cannot handle big data via Ajax. To make users able to watch the processes live you can do this using web sockets. As you are experienced in PHP, I can suggest you Ratchet which is quite new.
On the other hand, to make calculations and store big data I would use NoSQL instead of MySQL

Since you're kind of pinched for time already, migrating to Node.js may not be time sensitive. It'll also help with the question of notifying users of when the results are ready as it can do browser notification push without polling. As it makes use of Javascript you might find some of your client-side code is reusable.

I think you can run what you need in the background with some kind of Queue manager. I use something similar with CakePHP and it lets me run time intensive processes in the background asynchronously, so the browser does not need to be open.
Another plus side for this is that it's scalable, as it's easy to increase the number of queue workers running.
Basically with PHP, you just need a cron job that runs every once in a while that starts a worker that checks a Queue database for pending tasks. If none are found it keeps running in a loop until one shows up.

How do I make a php script wait for user input

Here is my problem, for those of you who looked at the title and thought "PHP does not wait on user input because it is a server side language and therefore your problem is a client-side one," just hear me out.
I'm making a game. It is mulitplayer game, therefore multiple users. At the start of every game round the users involved in the game are prompted with a chose of things they want to do that round. Course, the round isn't to commence until everyone has made a selection on what they want to do that round.
See, that is my problem. How do I make a script wait for 'all' users to send a request (send input) before continuing execution? The server-side language is PHP. The answer wouldn't be with the client as the client is only responsible for one user and wouldn't know what the other users are doing.
Thanks.

Fundamentally, you have two choices:
Choice one, you have each script check if it has all the data it needs, and then do all the work to calculate the next move (or whatever). That is actually much harder to get right than it sounds, because you run into problems of concurrency.
Basically, that approach leads to more than one "process" - page load - trying to do the same work on the same data, and that opens the door to races where you either don't do the work at all, or where you do it twice.
Choice two, which sounds harder, is where you write another PHP script that checks to see if it has all the moves, calculates the outcome, and updates the database (or whatever) in the background.
Then, run that off a cron job, or something like that, so you only have one instance running at a time. That makes life easier: your "is everything done" script is only running once, and so you don't have to worry about races - but there might be a lag between the last move being submitted and calculating the outcome.
That approach is actually easier in the long run, because while it involves more code and more moving parts, it actually avoids the hard problems (concurrency) in return for a few more easy problems (a bit more code, using cron).
You can improve on both of those, of course, but those are the fundamental models. Locking and other coordination techniques can make "calculate in the last page" work better, but they involve you addressing the races.
Using various "background job" tools can improve the latency of the second approach, by letting you trigger the check instantly rather than just on a timer. You still have some latency, but the user doesn't see as much of it.
Really, though, you get to pick one of those two strategies and go with it.
(Also, I strongly advise that if you can, grab a framework or something where someone else already solved these problems, then use that.)

Since you can't push data out, I'd tackle this as follows:
collect the submitted information on the server in eg DB
run a client-side script to check periodically if all required information is available
on the client evaluate the response: if true, start game - if not "wait" = keep checking
so you need:
script on the server that collects the info
server script that checks if all info is available
client script that checks periodically and evaluates

Inter process pushing of captured events

I have a php based web application that captures certain events in a database table. It also features a visualization of those captured events: a html table listing the events which is controlled by ajax.
I would like to add an optional 'live' feature: after pressing a button ('switch on') all events captured from that moment on will be inserted into the already visible table. Three things have to happen: noticing the event, fetching the events data and inserting it into the table. To keep the server load inside sane limits I do not want to poll for new events with ajax request, instead I would prefer the long polling strategy.
The problem with this is obviously that when doing a long polling ajax call the servers counterpart has to monitor for an event. Since the events are registered by php scripts there is no easy way to notice that event without polling the database for changes again. This is because the capturing action runs in another process than the observing long polling request. I looked around to find a usable mechanism for such inter process communication as I know it from rich clients under linux. Indeed there are php extensions for semaphores, shared memory or even posix. However they all only exist under linux (or unix like) systems. Though not typically the application might be used under MS-Windows systems in rare cases.
So my simple question is: is there any means that is typically available on all (most) systems that can push such events to a php script servicing the long polling ajax request ? Something without polling a file or a database constantly, since I already have an event elsewhere ?

So, the initial caveats: without doing something "special", trying to do long polling with vanilla PHP will eat up resources until you kill your server.
Here is a good basic guide to basic PHP based long polling and some of the challenges associated with going the "simple" road:
How do I implement basic "Long Polling"?
As far as doing this really cross-platform (and simple enough to start), you may need to fall back to some sort of simple internal polling - but the goal should be to ensure that this action is much lower-cost than having the client poll.
One route would be to essentially treat it like you're caching database calls (which you are at this point), and go with some standard caching approaches. Everything from APC, to memcached, to polling a file, will all likely put less load on the server than having the client set up and tear down a connection every second. Have one process place data in the correct keys, and then poll them in your script on a regular basis.
Here is a pretty good overview of a variety of caching options that might be crossplatform enough for you:
http://simas.posterous.com/php-data-caching-techniques
Once you reach the limits of this approach, you'll probably have to move onto a different server architecture anyhow.

How to trigger server-sent event in HTML5 ... OR : Can PHP script be aware if another script has been called?

Now that all of the browsers I like have almost full support for Server Sent Events, I wanted to try implementing it on a site I've been putting off because I hate polling. But I have initial hesitation that I was hoping I could get some help on.
Here is my use case:
User goes to a form, something time-based and competitive, in this case class registration. All things being equal, they have a list of about 30 - 40 classes they are eligible for, and in order to minimize instances of "she logged in first but he hit save first but he didn't mean to hit save but she already chose another class" etc, I want to make the form real-time, so that when someone selects an option, it goes straight into the db and anyone else viewing the form sees that it is filling up. (I'll deal with the stress of people changing their minds later).
So, in a polling scenario, I had to deal with the AJAX calls having to check on the status of 40 spots and update them and setting an interval that could potentially still create collisions.
But with Server Sent Events, I can have the listener get just the spots that need updating, which seems better, but here's where I get stuck:
Is there any risk of the listener getting overloaded? Let's say the script sends 15 messages, back-to-back, about a status change. I see vague mentions of how user agents should handle queued tasks, but it's not clear if that's for establishing a connection or handling server-sent messages
Is this basically just passing the burden of polling from the browser to the server? Does the script have to check the DB every second for changes? Is there any way for the script to be aware or notified when change has occurred? Let's assume that seat requests are sent to requests.php via ajax and that updates.php pushes events back to the browser. Is there a standard and/or clever way for updates to idle until requests has made a commit?
The only solution I can think of is for requests.php to write the committed changes to a flat file (commits.xml perhaps) and updates.php just polls the file size every half-second, thereby keeping the workload to a minimum.
Any better/smarter/more obvious solutions out there?

Polling your database for changes is not a good idea. Instead, you should do inter-process PUB/SUB on the server. To do that, you can use a message queue like RabbitMQ, ZeroMQ or Redis PUB/SUB.

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.