This is THE MOST noobish question ever, and I see it done so often.
I want a PHP page that is constantly running in the background (the backend) to occasionally push updated data to the frontend. YES, this is the way I want to do it.
But the only way I know of querying a page is to request that PHP script again with XHR: I would XHR "index.php?data=newdata", but that would spawn a new process server-side. What do I do?
(Please ask for more info or correct me if there is a better way of doing it)
This is a great SO question/answer to look at:
Using comet with PHP?
The upshot is, you can do it with PHP...
Another way to do this is to create a bridge between your Apache setup and Node. If you read through the guides about Node you will see that it is:
Designed for high loads of networking
Only spawns new threads when it needs to do blocking tasks such as I/O
Extremely simple to use, and built on Google's V8 JavaScript engine
Can handle thousands of concurrent connections
With the above in mind, my road map would be to create a database for your PHP application and open two connections to it:
One connection used in the PHP application
One connection used within Node
The Node side of things would be simple:
Create a simple socket server (~20 lines)
Create an array
Listen for new connections, place the resource into the array.
Attach an event listener for the database
When the event gets fired, pipe the new data to all the clients in the array.
All the clients will receive the data at pretty much the same time. This should be stable, and it's an extremely lightweight solution: 1K connections would use one process with a few I/O threads, and RAM usage would be around 8 MB.
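The Node half of this is covered by the guides linked below; on the PHP side, publishing an update is just a write to the shared database. A minimal sketch, assuming a hypothetical updates table that the Node connection watches (all table, column and credential names are placeholders):

<?php
// notify_node.php - the PHP half of the Apache/Node bridge.
// Assumes a hypothetical `updates` table that the Node process
// watches; table and column names are placeholders.
$pdo = new PDO('mysql:host=localhost;dbname=myapp', 'user', 'pass');

function publishUpdate(PDO $pdo, array $payload) {
    // Insert the new data; the Node side sees the new row and
    // pipes it out to every client socket in its array.
    $stmt = $pdo->prepare(
        'INSERT INTO updates (payload, created_at) VALUES (:payload, NOW())'
    );
    $stmt->execute([':payload' => json_encode($payload)]);
}

publishUpdate($pdo, ['message' => 'new data for the frontend']);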
Your first step would be to set up Node.js on your server. If you google around you will be able to find out how to do that; a simple way under Ubuntu is:
apt-get install nodejs
You should read the following resources:
http://nodejs.org/
http://www.youtube.com/watch?v=jo_B4LTHi3I
http://jeffkreeftmeijer.com/2010/experimenting-with-node-js/
http://www.slideshare.net/kompozer/nodejs-and-websockets-intro
http://remysharp.com/2010/02/14/slicehost-nodejs-websockets/
For more technical assistance you should connect to the #node.js IRC channel on freenode.net; those guys will really help you out over there!
Hope this helps.
COMET may be a way to go; however, having a finalized HTML page and doing AJAX requests to get updates is the usual, and more robust, way to do this. I would investigate ways to implement an Ajax-based approach that is optimized for speed.
You say you are doing complex calculations, which you would have to repeat for each request when going the Ajax way. You may be able to mitigate that, e.g. by employing smart caching: if you write the results of whatever you do as JSON-encoded data into a text file, you can fetch that with almost no overhead at all.
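A minimal sketch of that caching idea; the calculation function is a placeholder for your real work:

<?php
// cache_results.php - run whenever the heavy calculation finishes;
// results.json can then be fetched by Ajax as a static file with
// almost no overhead.

function runComplexCalculations() {
    // Placeholder for your real work.
    return ['total' => 42, 'generated_at' => date('c')];
}

$results = runComplexCalculations();

// Write to a temp file and rename so a client never sees a
// half-written file (rename is atomic on the same filesystem).
file_put_contents('results.json.tmp', json_encode($results));
rename('results.json.tmp', 'results.json');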
I am a programmer at an internet marketing company that primarily makes tools. These tools have certain requirements:
They run in a browser and must work in all of them.
The user either uploads something (.csv) to process or they provide a URL and API calls are made to retrieve information about it.
They move around THOUSANDS of lines of data (think large databases). These tools literally run for hours, usually overnight.
The user must be able to watch live as their information is processed and presented to them.
Currently we are writing in PHP, MySQL and Ajax.
My question is: how do I process LARGE quantities of data and provide a good user experience while the tool is running? Currently I use a custom queue system that sends Ajax calls and inserts rows into tables or data into divs.
This method is a huge pain in the ass and couldn't possibly be the correct method. Should I be using a templating system, or is there a better way to refresh chunks of the page with A LOT of data? And I really mean a lot of data, because we come close to maxing out PHP's memory, which is something we always have to watch out for.
Also, I would love to make it so these tools could run on the server by themselves: upload a .csv, close the browser window, and have an email sent to the user when the tool is done.
Does anyone have any methods (programming standards) for me that are better than using .ajax calls? Thank you.
I wanted to update with some notes in case anyone has the same question. I am looking into the following to see which is the best solution:
SlickGrid / DataTables
GearMan
Web Socket
Ratchet
Node.js
These are in no particular order and the one I choose will be based on what works for my issue and what can be used by the rest of my department. I will update when I pick the golden framework.
First of all, you cannot handle big data via Ajax alone. To let users watch the process live, you can use web sockets. As you are experienced in PHP, I can suggest Ratchet, which is quite new.
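As a taste of what Ratchet looks like, here is a minimal broadcast server close to the hello-world in its documentation; treat it as a sketch, not production code (the class name and port are placeholders):

<?php
// push_server.php - minimal Ratchet broadcast server, run from the CLI.
// Install with: composer require cboden/ratchet
require 'vendor/autoload.php';

use Ratchet\MessageComponentInterface;
use Ratchet\ConnectionInterface;
use Ratchet\Server\IoServer;
use Ratchet\Http\HttpServer;
use Ratchet\WebSocket\WsServer;

class ProgressPusher implements MessageComponentInterface {
    protected $clients;

    public function __construct() {
        $this->clients = new \SplObjectStorage;
    }

    public function onOpen(ConnectionInterface $conn) {
        $this->clients->attach($conn);   // remember every connected browser
    }

    public function onMessage(ConnectionInterface $from, $msg) {
        // Relay each progress update to every connected client.
        foreach ($this->clients as $client) {
            $client->send($msg);
        }
    }

    public function onClose(ConnectionInterface $conn) {
        $this->clients->detach($conn);
    }

    public function onError(ConnectionInterface $conn, \Exception $e) {
        $conn->close();
    }
}

IoServer::factory(new HttpServer(new WsServer(new ProgressPusher)), 8080)->run();

Your long-running tool can then connect as a client and send progress messages, and every watching browser receives them.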
On the other hand, to run the calculations and store big data, I would use NoSQL instead of MySQL.
Since you're kind of pinched for time already, migrating to Node.js may not be feasible right now. It would help with notifying users when their results are ready, though, as it can push browser notifications without polling. And since it uses JavaScript, you might find some of your client-side code is reusable.
I think you can run what you need in the background with some kind of Queue manager. I use something similar with CakePHP and it lets me run time intensive processes in the background asynchronously, so the browser does not need to be open.
Another plus side for this is that it's scalable, as it's easy to increase the number of queue workers running.
Basically, with PHP you just need a cron job that runs every once in a while and starts a worker that checks a queue database for pending tasks. If none are found, it keeps running in a loop until one shows up.
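A minimal sketch of such a worker, assuming a hypothetical jobs table with id, status and payload columns (all names are placeholders):

<?php
// worker.php - started by cron; claims pending jobs and processes them.
// Assumes a hypothetical `jobs` table (id, status, payload).
$pdo = new PDO('mysql:host=localhost;dbname=myapp', 'user', 'pass');

while (true) {
    $stmt = $pdo->query(
        "SELECT id, payload FROM jobs WHERE status = 'pending' LIMIT 1"
    );
    $job = $stmt->fetch(PDO::FETCH_ASSOC);

    if ($job === false) {
        sleep(5);       // nothing to do; wait and check again
        continue;
    }

    // Mark the job as running (a real setup would claim it atomically
    // so two workers can't grab the same job).
    $pdo->prepare("UPDATE jobs SET status = 'running' WHERE id = ?")
        ->execute([$job['id']]);

    processJob($job['payload']);   // the long-running work happens here

    $pdo->prepare("UPDATE jobs SET status = 'done' WHERE id = ?")
        ->execute([$job['id']]);
}

function processJob($payload) {
    // Placeholder: parse the .csv, call the APIs, email the user, etc.
}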
Is there any way you can push data to a page rather than checking for it periodically?
Obviously you can check for it periodically with Ajax, but is there any way you can force the page to reload when a PHP script is executed?
Theoretically you can speed up an Ajax request by having a table just for tracking when the Ajax function is supposed to execute (update a value in that table whenever the Ajax function should retrieve new data from the database). But this still requires a sizeable amount of memory and a MySQL connection, as well as some waiting time while the query executes, even when there is no update and you don't want the Ajax function to fetch database data at all.
Is there any way to make this more efficient than querying a database and checking the table that stores the 'if updated' flag, OR to tell the Ajax function to execute from another page?
I guess node.js or HTML5 WebSocket could be a viable solution as well?
Or you could store the 'if updated' flag in a text file? Any suggestions are welcome.
You're basically talking about notifying the client (i.e. browser) of server-side events. It really comes down to two things:
What web server are you using? (are you limited to a particular language?)
What browsers do you need to support?
Your best option is using WebSockets to do the job; anything beyond WebSockets is a hack. Still, many "hacks" work just fine, so I suggest you try Comet or AJAX long-polling.
There's a project called Atmosphere (and many more) that provides you with a solution suited to the web server you are using and will automatically pick the best option depending on the user's browser.
If you aren't limited by browsers and can pick your web stack, then I suggest SocketIO + Node.js. It's just my preference right now; WebSockets is still in its infancy and things are going to get interesting once it develops more. Sometimes my entire application isn't suited to Node.js, so I'll just offload the data operation to it alone.
Good luck.
Another possibility: if you can store the data in a simple format in a file, update that file with the data and let the web server report its timestamp.
Then the browser can poll with HEAD requests, checking the update time on the file to see if it needs an updated copy.
This avoids making a DB call for anything that doesn't change the data, at the expense of keeping file-system copies of important resources. It might be a good trade-off, though, if you can do this for active data and roll it off after some time. You will need to ensure the file is rewritten on any call that updates the data.
It shares the synchronization risks of any systems with multiple copies of the same data, but it might be worth investigating if the enhanced responsiveness is worth the risks.
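On the PHP side this is just a matter of rewriting the file whenever the data changes; Apache then answers the browser's HEAD requests with a Last-Modified header for free. A sketch, with the path and data as placeholders:

<?php
// Call this from any code path that changes the data. Apache serves
// data/latest.json with a Last-Modified header, so the browser's
// HEAD request is enough to detect whether anything changed.
function refreshSnapshot(array $data) {
    $path = __DIR__ . '/data/latest.json';   // placeholder location
    file_put_contents($path . '.tmp', json_encode($data));
    rename($path . '.tmp', $path);   // atomic swap; readers never see a partial file
}

refreshSnapshot(['status' => 'updated']);   // example call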
There was once a technology called "server push" that kept a Web server process sitting there waiting for more output from your script and forwarding it on to the client when it appeared. This was the hot new technology of 1995 and, while you can probably still do it, nobody does because it's a freakishly terrible idea.
So yeah, you can, but when you get there you'll most likely wish you hadn't.
Well, you can (or will be able to) with HTML5 WebSockets.
This page has some great info about this technology:
http://www.html5rocks.com/en/tutorials/websockets/basics/
After reading through all of the Twitter streaming API and Phirehose PHP documentation, I've come across something I have yet to do: collect and process data separately.
The logic behind it, if I understand correctly, is to prevent a logjam at the processing phase that would back up the collecting process. I've seen examples before, but they basically write straight to a MySQL database right after collection, which seems to go against what Twitter recommends you do.
What I'd like some advice/help on is: what is the best way to handle this, and how? It seems that people recommend writing all the data directly to a text file and then parsing/processing it with a separate function. But with this method, I'd assume it could be a memory hog.
Here's the catch: it's all going to be running as a daemon/background process. So does anyone have any experience with solving a problem like this, or more specifically, with the Twitter Phirehose library? Thanks!
Some notes:
The connection will be through a socket, so my guess is that the file will constantly be appended to? Not sure if anyone has any feedback on that.
The phirehose library comes with an example of how to do this. See:
Collect: https://github.com/fennb/phirehose/blob/master/example/ghetto-queue-collect.php
Consume: https://github.com/fennb/phirehose/blob/master/example/ghetto-queue-consume.php
This uses a flat file, which is very scalable and fast; your average hard disk can write sequentially at 40 MB/s+ and scales linearly (i.e. unlike a database, it doesn't slow down as it gets bigger).
You don't need any database functionality to consume a stream (i.e. you just want the next tweet; there's no "querying" involved).
If you rotate the file fairly often, you will get near-realtime performance (if desired).
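The consume side boils down to reading each rotated queue file line by line and deleting it when finished, so memory use stays flat no matter how much data flows through. A simplified sketch of the pattern those examples use (the directory, file pattern and processing function are placeholders):

<?php
// consume.php - drains the rotated queue files written by the collector.
// A simplified sketch of the ghetto-queue pattern; the directory and
// file pattern are placeholders.
$queueDir = '/tmp/tweet-queue';

while (true) {
    // Only touch files the collector has already rotated out.
    foreach (glob($queueDir . '/*.queue') as $file) {
        $fp = fopen($file, 'r');
        // Read line by line so memory use stays constant no matter
        // how large the file is.
        while (($line = fgets($fp)) !== false) {
            processTweet(json_decode($line, true));
        }
        fclose($fp);
        unlink($file);   // this chunk of the queue is done
    }
    sleep(1);   // nothing ready; check again shortly
}

function processTweet($tweet) {
    // Placeholder: insert into MySQL, update counters, etc.
}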
I am building a "multiplayer world" with jQuery and PHP. Here is a bit how it works:
Users' characters' positions are taken from a database, and each user is plotted accordingly (the position values are CSS values: left and top).
The user is able to move about using the arrow keys on the keyboard, which moves their character using jQuery animations. While this is happening (on each arrow press), the user's position values are inserted into a database and updated.
In order to make this "global", so to speak (so users see each other), the values need to be fetched all at once for each user using AJAX.
The problem I am having is that I need to continuously call a JavaScript function I wrote which connects to the MySQL server and grabs values from a database table, via setInterval(thisFunction, 1000);. However, my host just suspended me for overloading the server's resources, and I think this was because of all my MySQL queries. On top of grabbing values from my database repeatedly, I also had to insert values every few seconds, so I can imagine that would cause a crash over time if enough clients were to log in. How can I reduce the number of queries I am making? Is there another way to do what I need to do? Thank you.
This is essentially the same thing as a chat system with regard to resource usage. Try a search and you'll find many different solutions, including concepts like long polling and memcached. For example, check this thread: Scaling a chat app - short polling vs. long polling (AJAX, PHP)
You should look into long polling: http://en.wikipedia.org/wiki/Push_technology. This method allows you to hold a connection open with your server and update only when you need to. By the sounds of it, though, you have a pretty intensive thing going on if you want to update on every change, so you may want to look into another way of storing this data. If you're wondering how big companies do it: they tend to have massive numbers of servers to handle the load, but they also use techniques similar to long polling.
You could store all the positions in memory using memcached (see http://php.net/manual/fr/book.memcached.php) and save them all to the database at once every few seconds (if needed).
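A sketch of that idea with the pecl memcached extension; key names are placeholders:

<?php
// position_store.php - keep live player positions in memcached and
// only flush them to MySQL every few seconds. Key names are placeholders.
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

// Called on every arrow-key press instead of running an SQL UPDATE.
function savePosition(Memcached $mc, $userId, $left, $top) {
    $mc->set("pos:$userId", ['left' => $left, 'top' => $top]);
}

// Called by the polling Ajax endpoint: one getMulti() against memory
// replaces a MySQL query per client per second.
function loadPositions(Memcached $mc, array $userIds) {
    $keys = array_map(function ($id) { return "pos:$id"; }, $userIds);
    return $mc->getMulti($keys);
}

A cron job can then copy the current positions into MySQL every few seconds, so the database sees a handful of writes instead of one per keypress.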
You could use web sockets to overcome this problem. Check out this nettuts tutorial.
There is another way: emulate or use actual sockets. Instead of constantly pulling the data (refreshing to check if there are new records), you can push the data over WebSockets, which work in Chrome at the moment (at least to my knowledge; I didn't try FF4), or you can use Node.js for leaner long polling. That way the communication between players is bi-directional without needing MySQL to store positions.
Check out Tornado.
From their site:
Tornado is an open source version of the scalable, non-blocking web server and tools that power FriendFeed. The FriendFeed application is written using a web framework that looks a bit like web.py or Google's webapp, but with additional tools and optimizations to take advantage of the underlying non-blocking infrastructure.
The framework is distinct from most mainstream web server frameworks (and certainly most Python frameworks) because it is non-blocking and reasonably fast. Because it is non-blocking and uses epoll, it can handle thousands of simultaneous standing connections, which means it is ideal for real-time web services. We built the web server specifically to handle FriendFeed's real-time features — every active user of FriendFeed maintains an open connection to the FriendFeed servers. (For more information on scaling servers to support thousands of clients, see The C10K problem.)
Any idea how to implement this (http://fluin.com/63) using MySQL + PHP + JavaScript (MooTools)?
In a nutshell, it's a realtime threaded conversational web app.
Update:
This uses http://www.ape-project.org/home.html
Any idea how to implement realtime stuff without AJAX push (ape)?
Install Firefox.
Install Web Development toolbar
Install Firebug
Install HttpFox
Read the docs of the above tools regarding how to use them and what they can do.
Go to http://fluin.com/63. Use above tools to inspect.
Read up on Databases and data models, and MySQL.
Build your own.
Well, this depends on your definition of realtime, which, in its technical meaning, is simply impossible on public IP networks with a traditional TCP stack, since you have no control over timing.
Closer to the topic, though: to get any web page updated without direct user intervention, you'd have to use JavaScript to poll the server for changes since the last successful poll, over certain intervals of time. In calculating those intervals you'll have to consider both network/server load and the delay that is comfortable for the user.
The server, of course, will have to store the new data along with its recency (creation timestamps are one way of doing it), to be able to distinguish content already delivered to the various clients.
As soon as the server reports new content, it is inserted into the DOM via JavaScript and the user sees the response.
This is a bit general, of course, but you should get the idea.
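To make that concrete, the endpoint the JavaScript polls could take the client's last-seen timestamp and return only what is newer (table, column and credential names are placeholders):

<?php
// updates.php - polled by the client with ?since=<last unix timestamp>.
// Returns only rows created after that moment, so repeated polls stay cheap.
$since = isset($_GET['since']) ? (int)$_GET['since'] : 0;

$pdo = new PDO('mysql:host=localhost;dbname=myapp', 'user', 'pass');
$stmt = $pdo->prepare(
    'SELECT id, body, UNIX_TIMESTAMP(created_at) AS created
     FROM messages
     WHERE created_at > FROM_UNIXTIME(:since)
     ORDER BY created_at'
);
$stmt->execute([':since' => $since]);

header('Content-Type: application/json');
echo json_encode($stmt->fetchAll(PDO::FETCH_ASSOC));

The client remembers the largest created value it has seen and sends it as ?since= on the next poll.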
Isn't it like a shoutbox? Here's an example of one.
Doing this properly using PHP only is very hard. With 5 users you could use long polling, but it will definitely not scale when you have, let's say, 1000 users.
Using comet with PHP?
The screencast (link) in my post shows how you could implement it, but it has a couple of flaws:
It touches the disk (disk is very slow compared to memory).
To make matters worse, it also polls the disk frequently (filemtime()).
Maybe phet (PHP) is able to scale. You should try that out.
To make it scale, I think you need at least:
a good implementation of long polling that can handle load (at least long polling; there are better transports). See the sketch below.
keep data in memory (much faster than disk) using something like Redis or memcached.
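A rough long-polling endpoint along those lines, using memcached as the in-memory store (key names are placeholders; note that plain PHP still ties up one process per waiting client, which is exactly why it struggles at 1000 users):

<?php
// longpoll.php - holds the request open until the data version changes
// or ~25 seconds pass, checking memory instead of polling filemtime().
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

$clientVersion = isset($_GET['v']) ? (int)$_GET['v'] : 0;
$deadline = time() + 25;   // stay under typical gateway timeouts

while (time() < $deadline) {
    $version = (int)$mc->get('chat:version');   // bumped by whoever writes new data
    if ($version > $clientVersion) {
        header('Content-Type: application/json');
        echo json_encode([
            'version' => $version,
            'data'    => $mc->get('chat:latest'),
        ]);
        exit;
    }
    usleep(250000);   // 0.25s between in-memory checks; no disk involved
}

// Nothing new before the deadline: tell the client to reconnect immediately.
header('Content-Type: application/json');
echo json_encode(['version' => $clientVersion, 'data' => null]);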
I would use:
node.js with the socket.io module (video).
To keep data in memory I would use node_redis (video).