Any idea how to implement this?

Any idea how to implement this? - php

Any idea how to implement this (http://fluin.com/63) using MySQL+PHP+Javascript(mootools)?
In a nutshell, it's a realtime threaded conversational web app.
Update:
This uses http://www.ape-project.org/home.html
Any idea how to implement realtime stuff without AJAX push (ape)?

Install Firefox.
Install Web Development toolbar
Install Firebug
Install HttpFox
Read docs of above tools re how to use, what they can do.
Go to http://fluin.com/63. Use above tools to inspect.
Read up on Databases and data models, and MySQL.
Build your own.

Well, this depends on your definition of realtime, which, in its technical meaning, is simply impossible with public ip networks and traditional tcp stack, for you have no control over timing.
Closer to the topic though, to get any web page updated without direct user intervention, you'd have to use javascript to poll server for changes since the last successful poll, and do this over certain intervals of time. In calculating these intervals you'll have to consider both network/server load, and the delay that is comfortable for the user.
The server, of course, will have to store the new data and its timely status (creation timestamps are one way of doing it), to be able to distinguish between content already delivered to various clients.
As soon as the server reports new content, it is inserted into a dom page via javascript and the user sees the response.
This is a bit general, of course, but you should get the idea.

Isn't it like a shoutbox ? here an example of one

Doing this properly using PHP only is very hard. When you have 5 users you could use long-polling, but it will definitely not scale when you have let's say 1000 users.
Using comet with PHP?
The screencast(link) in my post shows how you could implement it, but it has a couple of flaws:
It touches the disc(disc is very slow compared to memory).
To make matters worse it also polls the disc frequently(filemtime()).
Maybe phet(PHP) is able to scale. You should try that out.
To make it scale I think you need at least:
a good implementation of long-polling(at least long-polling. You have better transports) that can handle load.
keep data in memory(much faster than dics) using something like redis or memcached.
I would use:
node.js with socket.io(video) module.
to keep data in memory I would use node_redis(video).

Related

How to process massive data-sets and provide a live user experience

I am a programmer at an internet marketing company that primaraly makes tools. These tools have certian requirements:
They run in a browser and must work in all of them.
The user either uploads something (.csv) to process or they provide a URL and API calls are made to retrieve information about it.
They are moving around THOUSANDS of lines of data (think large databases). These tools literally run for hours, usually over night.
The user must be able to watch live as their information is processed and is presented to them.
Currently we are writing in PHP, MySQL and Ajax.
My question is how do I process LARGE quantities of data and provide a user experience as the tool is running. Currently I use a custom queue system that sends ajax calls and inserts rows into tables or data into divs.
This method is a huge pain in the ass and couldnt possibly be the correct method. Should I be using a templating system or is there a better way to refresh chunks of the page with A LOT of data. And I really mean a lot of data because we come close to maxing out PHP memory and is something we are always on the look for.
Also I would love to make it so these tools could run on the server by themselves. I mean upload a .csv and close the browser window and then have an email sent to the user when the tool is done.
Does anyone have any methods (programming standards) for me that are better than using .ajax calls? Thank you.
I wanted to update with some notes incase anyone has the same question. I am looking into the following to see which is the best solution:
SlickGrid / DataTables
GearMan
Web Socket
Ratchet
Node.js
These are in no particular order and the one I choose will be based on what works for my issue and what can be used by the rest of my department. I will update when I pick the golden framework.

First of all, you cannot handle big data via Ajax. To make users able to watch the processes live you can do this using web sockets. As you are experienced in PHP, I can suggest you Ratchet which is quite new.
On the other hand, to make calculations and store big data I would use NoSQL instead of MySQL

Since you're kind of pinched for time already, migrating to Node.js may not be time sensitive. It'll also help with the question of notifying users of when the results are ready as it can do browser notification push without polling. As it makes use of Javascript you might find some of your client-side code is reusable.

I think you can run what you need in the background with some kind of Queue manager. I use something similar with CakePHP and it lets me run time intensive processes in the background asynchronously, so the browser does not need to be open.
Another plus side for this is that it's scalable, as it's easy to increase the number of queue workers running.
Basically with PHP, you just need a cron job that runs every once in a while that starts a worker that checks a Queue database for pending tasks. If none are found it keeps running in a loop until one shows up.

Push data to page without checking periodically for it?

Is there any way you can push data to a page rather than checking for it periodically?
Obviously you can check for it periodically with ajax, but is there any way you can force the page to reload when a php script is executed?
Theoretically you can improve an ajax request's speed by having a table just for when the ajax function is supposed to execute (update a value in the table when the ajax function should retrieve new data from the database) but this still requires a sizable amount of memory and a mysql connection as well as still some waiting time while the query executes even when there isn't an update/you don't want to execute the ajax function that retrieves database data.
Is there any way to either make this even more efficient than querying a database and checking the table that stores the 'if updated' data OR tell the ajax function to execute from another page?
I guess node.js or HTML5 webSocket could be a viable solution as well?
Or you could store 'if updated' data in a text file? Any suggestions are welcome.

You're basically talking about notifying the client (i.e. browser) of server-side events. It really comes down to two things:
What web server are you using? (are you limited to a particular language?)
What browsers do you need to support?
Your best option is using WebSockets to do the job, anything beyond using web-sockets is a hack. Still, many "hacks" work just fine, I suggest you try Comet or AJAX long-polling.
There's a project called Atmosphere (and many more) that provide you with a solution suited towards the web server you are using and then will automatically pick the best option depending on the user's browser.
If you aren't limited by browsers and can pick your web stack then I suggest using SocketIO + nodejs. It's just my preference right now, WebSockets is still in it's infancy and things are going to get interesting once it starts to develop more. Sometimes my entire application isn't suited for nodejs, so I'll just offload the data operation to it alone.
Good luck.

Another possibility, if you can store the data in a simple format in a file, you update a file with the data and use the web server to check its timestamp.
Then the browser can poll, making HEAD requests, which will check the update times on the file to see if it needs an updated copy.
This avoids making a DB call for anything that doesn't change the data, but at the expense of keeping file system copies of important resources. It might be a good trade-off, though, if you can do this for active data, and roll them off after some time. You will need to ensure that you manage to change this on any call that updates the data.
It shares the synchronization risks of any systems with multiple copies of the same data, but it might be worth investigating if the enhanced responsiveness is worth the risks.

There was once a technology called "server push" that kept a Web server process sitting there waiting for more output from your script and forwarding it on to the client when it appeared. This was the hot new technology of 1995 and, while you can probably still do it, nobody does because it's a freakishly terrible idea.
So yeah, you can, but when you get there you'll most likely wish you hadn't.

Well you can (or will) with HTML5 Sockets.
This page has some great info about this technology:
http://www.html5rocks.com/en/tutorials/websockets/basics/

AJAX/PHP Why is HTTP-Polling so laggy?

Why is HTTP-Polling so laggy?
What I have is a button, and whenever a user clicks it a MySQL database field gets updated and the value is displayed to the user. I'm polling every 800 milliseconds and it's very laggy/glitchy. Sometimes when clicking the button it doesn't register it. And I actually need to be polling quite a bit more frequent than every 800 milliseconds.
This is also with just 1 user on the website at a time... When in the end there is going to be many at once.

HTTP-streaming/Long-polling/Websockets instead of polling
When you need real-time information you should avoid polling(frequently). Below I would try to explain why this is wrong. You could compare it to a child in the back of your car screaming every second "are we there yet" while you are replying "we are not there yet" all the time.
Instead you would like to have something like long-polling/HTTP-streaming or websockets. You could compare this to a child in the back of your car telling you to let him know when "we are there" instead of asking us every second. You could imagine this is way more efficient then the previous example.
To be honest I don't think PHP is the right tool for this kind of applications(yet). Some options you have available are:
hosted solutions:
http://pusherapp.com:
Pusher is a hosted API for quickly,
easily and securely adding scalable
realtime functionality via WebSockets
to web and mobile apps.
Our free Sandbox plan includes up to
20 connections and 100,000 messages
per day. Simply upgrade to a paid plan
when you're ready.
http://beaconpush.com/
Beaconpush is a push service for
creating real-time web apps using
HTML5 WebSockets and Comet.
host yourself:
http://socket.io:
Socket.IO aims to make realtime apps
possible in every browser and mobile
device, blurring the differences
between the different transport
mechanisms
When becoming very big the "host yourself" solution is going to less expensive, but on the other hand using something like pusherapp will get you started easier(friendly API) and also is not that expensive. For example pusherapp's "Bootstrap" can have 100 concurrent connections and 200,000 messages per day for $19 per month(but when small beaconpush is cheaper => do the math :)). As a side-note this plan does not include SSL so can not be used for sensitive data. I guess having a dedicated machine(VPS) will cost you about the same amount of money(for a simple website) and you will also have to manage the streaming solution yourself, but when getting bigger this is probably way more attractive.
Memory instead of Disc
whenever a user clicks it a MySQL
database field gets updated and the
value is displayed to the user
When comparing disc I/O(MySQL in standard mode) to memory it is extremely slow. You should be using an in-memory database like for example redis(also has persistent snapshots) or memcached(completely in memory) to speed up the process. I myself really like redis for it's insane speed, simplicity and persistent snapshots. http://redistogo.com/ offers a free plan with 5MB of memory which will probably cover your needs. If not the mini plan of $5 a month will probably cover you, but when getting even bigger a VPS will be cheaper and in my opinion the prefered solution.
Best solution
The best solution(especially if you are getting big) is to host socket.io/redis yourself using a VPS(cost money). If really small I would use redistogo, if not I would host it myself. I would also start using something like beaconpush/pusherapp because of it's simplicity(getting started immediately). Hosting socket.io(advice to play with it on your own machine for when getting big) is pretty simple, but in my opinion more difficult than beaconpush/pusherapp.

Laggy/glitchy? Sounds like a client-side problem. As does the button thing. I'd get your JavaScript in order first.
As for polling, 0.8 sounds a bit time-critical. I don't know about most countries, but here in the third world simple network packets may get delayed for as long a few seconds. (Not to mention connection drops, packet losses and the speed of light.) Is your application ready to cope with all that?
As for an alternate approach, I agree with #Vern in that an interrupt-driven one would be much better. In HTTP terms, it translates to a long-standing HTTP request that does not receive a response until the server has some actual data to send, minimizing delay and bandwidth. (AFAIK) it's an older technique than AJAX, though has been named more recently. Search for "COMET" and you'll end up with both client- and server-side libraries.

there are many things that might cause the lag that you are experiencing. Your server might be able to process the requests fast enough, but if the connection between your client and the server is slow, then you'll see the obvious lag.
The first thing you should try is to ping the server and see what response time you're getting.
Secondly, rather than poll, you might want to consider an interrupt driven approach. This means that only when your server replies, will you send out your next request. This makes sense, so that many clients won't be flooding the server with requests till the point the server cannot cope. This is especially true, then the RTT (Round-Trip-Time) of your request is pretty long.
Hope it helps. Cheers!

A good place to start would be to use a tool like Firebug in Mozilla Firefox that will allow you to watch the requests being sent to the server and look for bottlenecks.
Firebug will break down each part of the request, so you can see if you are having trouble talking to the server or if it is simply taking a long time to come up with a response.

Along with #Vern's answer I would also say that if at all possible I would have the server cache the data ahead of time and then all of the clients will pull from that same cache and not need separate MySQL calls to reach the same data for every update. Then you just have your PHP update the cache whenever the actual DB data changes.
By cache I mean having php write to a file on the sever side, and then clients will simply look at the contents of that one file to see the most updated info. There might be better ways of caching, but being that I have never done this personally before, this is the first solution that popped into my mind.

Reduce MySQL query amount with jQuery and PHP

I am building a "multiplayer world" with jQuery and PHP. Here is a bit how it works:
User's character's positions are taken from a database, user is plotted accordingly (the position values are CSS values - left and top)
User is able to move about using the arrow keys on the keyboard, making their character move using jQuery animations. While this is happening (on each arrow press) the user's position values are inserted into a database and updated.
In order to make this "global" (so users see each other) as you could say, the values need to be updated all at once for each user using AJAX
The problem I am having is I need to continuously call a JavaScript function I wrote which connects to the MySQL server and grabs values from a database table. And this function needs to be called constantly via setInterval(thisFunction, 1000); however my host just suspended me for overloading the server's resources and I think this was because of all my MySQL queries. And even after grabbing values from my database repeatedly, I had to insert values every few seconds as well so I can imagine that would cause a crash over time if enough clients were to login. How can I reduce the amount of queries I am using? Is there another way to do what I need to do? Thank you.

This is essentially the same thing as a chat system with regards to resource usage. Try a search and you'll find many different solution, including concepts like long polling and memcached. For example, check this thread: Scaling a chat app - short polling vs. long polling (AJAX, PHP)

You should look into long polling - http://en.wikipedia.org/wiki/Push_technology. This method allows you to establish a connection with your server, and then update it only when you need to. However by the sounds of it, you have a pretty intensive thing going on if you want to update every time, you may want to look into another way of storing this data, but if your wondering how big companies do it, they tend to have mass amounts of servers to handle it for them, but they will also use a technique similar to long polling.

You could store all the positions in memory using memcached See http://php.net/manual/fr/book.memcached.php and save them all at once every few seconds into the database (if needed).

You could use web sockets to overcome this problem. Check out this nettuts tutorial.

There is another way, it's to emulate or use actual sockets. Instead of constantly pulling the data (refreshing to check if there are new records), you can push the data over WebSockets which works in Chrome at the moment (at least to my knowledge, didn't try it in FF4) or you can use Node.js for leaner long pooling. That way, the communication between players will be bi-directional without the need of MySQL for storing positions.

Checkout Tornado
From their site:
Tornado is an open source version of the scalable, non-blocking web server and tools that power FriendFeed. The FriendFeed application is written using a web framework that looks a bit like web.py or Google's webapp, but with additional tools and optimizations to take advantage of the underlying non-blocking infrastructure.
The framework is distinct from most mainstream web server frameworks (and certainly most Python frameworks) because it is non-blocking and reasonably fast. Because it is non-blocking and uses epoll, it can handle thousands of simultaneous standing connections, which means it is ideal for real-time web services. We built the web server specifically to handle FriendFeed's real-time features — every active user of FriendFeed maintains an open connection to the FriendFeed servers. (For more information on scaling servers to support thousands of clients, see The C10K problem.)

Load testing the UI

I have been working on a site that makes some pretty big use of AJAX and dynamic JavaScript on the front end and it's time to start stress testing. But how do you properly stress test something that requires clicking several links on the front-end? One way I was able to easily hit every page of the site quickly and repeatedly was to point a Google Mini at it. But that's not going to click links and then navigate Modal windows and things like that.
Edit - I should point out that the site is done in PHP5 and the JavaScript library used is jQuery. Not sure if this would make any difference but felt it might be useful to know.

JMeter is great at this. You may record your sessions and tweak them to your liking.
So-called 'ajax load testing' is a recurring subject on this site, and is often confused. So let's get it straight: There is really no difference between load testing a normal web page and load testing with ajax. It all boils down to discrete requests; they just happen to not be full page refreshes.
One thing to keep in mind is there is a distinct difference between load testing the server processing the requests (a load test) and the performance on screen of the UI components being updated (how well your javascript performs.)
Simple load test example:
initial page load
login
navigate?
5-10 'ajax' requests (or whatever may fit your application usage pattern)
logout

There are load testing tools that can support AJAX. For example, WebLoad
http://www.radview.com/solutions/ajax-load-testing.aspx

What you really want is to stress test is the server's ability to handle the ajax requests. Use a load tool that looks at the requests while "recording" the test, and then tune as appropriate. I have only used the vs test edition one, so I can't point you to a low cost one.

I disagree with Nathan and Freddy to some degree. They are correct that "AJAX testing" is really no different in that HTTP requests are made. But it's not that simple. See my article on Ajaxian.com on Why Load Testing Ajax is Hard.
JMeter, Pylot, and The Grinder are all great tools for generating HTTP requests (I personally recommend Pylot). But at their core, they don't act as a browser and process JavaScript, meaning all they do is replay the traffic they saw at record time. If those AJAX requests were unique to that session, they may not be suitable/correct to replay in large volumes.
The fact is that as more logic is pushed down in to the browser, it becomes much more difficult (if not impossible) to properly simulate the traffic using traditional load testing tools.
In my article, I give a simple example of how difficult it becomes to test something like Google's home page when you want to query 1000's of different search terms (an important goal during load testing). To do it with JMeter/Pylot/Grinder you effectively end up re-writing parts of the AJAX code (in your case w/ jQuery) over again in the native language of the tool.
It gets even more complex if your goal is to measure the response time as perceived by the user (which is arguably the most important thing at the end of the day). For really complex applications that use Comet/"Reverse Ajax" (a technique that keeps open sockets for long periods of time), traditional load tools don't work at all.
My company, BrowserMob, provides a load testing service that uses Firefox browsers, powered by Selenium, to drive hundreds or thousands of real browsers, allowing you to measure and time the performance of visual elements as seen in the browser. We also support traditional virtual users (blind HTTP traffic) and a simulated browser (via HtmlUnit).
All that said, usually a mix of a service like BrowserMob plus traditional load testing is the right approach. That is, real browsers are great for a full-fidelity load test, but they will never be as economical as "virtual users", since they require 10-100X more RAM and CPU. See my recent blog post on whether to simulate or not to simulate virtual users.
Hope that helps!

You could use something like openSTA.
This allows a session with a web site to be recorded and then played back via a relatively simple script language.
You can also easily test web services and write your own scripts.
It allows you to put scripts together in a test in any way you want and configure the number of iterations, the number of users in each iteration, the ramp up time to introduce each new user and the delay between each iteration. Tests can also be scheduled in the future.
It's open source and free.
It produces a number of reports which can be saved to a spreadsheet. We then use a pivot table to easily analyse and graph the results.

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.