Looping SQL query with AJAX to update data - PHP

I'm working on a messenger-like app, and I just want to ask:
Is it advisable to loop an AJAX call that queries data from MySQL, so that my messages stay up to date?
I want the page data to be updated every time there is a new message from the sender, so the receiver gets the data without reloading the page. My approach is to loop my AJAX query that fetches all the messages for the receiver or sender from my SQL table.
Q2. Will this affect the performance of my database?
Q3. Is there any other way to do this? I'm currently working with PHP.
Thank you!

For a few users, no problem. If you have more than, say, 10K users sitting in that loop, you might start smelling smoke coming out of the database server.
For a large-scale deployment, I think you need some sort of "push" technology outside the realm of AJAX, PHP, and MySQL.
One AJAX call will lead to:
1. The web server sees the request, and hands it to
2. PHP, which starts up as a "child" of the web server; then PHP
3. connects to MySQL, and
4. performs a query, and
5. PHP replies to
6. the AJAX call, which either puts up the message or sleeps for another second;
7, 8. meanwhile, MySQL and PHP shut down.
That's a lot of steps; I am not sure which part will be the worst. But, assuming everything is running on a single machine, there are limits to how many users it will support.
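To make the cost concrete, here is roughly what every one of those per-user polls executes end to end. This is a minimal sketch, assuming a messages table with recipient_id, id and body columns (all names made up):

<?php
// poll.php - executed in full for every single AJAX poll
$db = new PDO('mysql:host=localhost;dbname=chat', 'user', 'pass'); // connect to MySQL
$stmt = $db->prepare('SELECT id, body FROM messages WHERE recipient_id = ? AND id > ?');
$stmt->execute([(int) $_GET['user'], (int) $_GET['last_id']]);     // perform the query
echo json_encode($stmt->fetchAll(PDO::FETCH_ASSOC));               // reply to AJAX
// ...then PHP exits and the MySQL connection is torn down, ready to repeat in a second

Multiply that by the number of users and the polling frequency, and you can see where the load comes from.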

Related

Looking for MySQL performance advice for database-heavy app

I'm creating a messaging app using jQuery, PHP and MySQL. Every time a user enters a message, I store it in a MySQL table. On the receiving user's end, I basically just added a JavaScript timer to check the database every X number of seconds for new messages.
The system works well, but is this going to be a performance problem? For example, let's say I have 1000 users and I'm hitting a MySQL table every 5 seconds for each user (that's 200 queries per second on average).
Can anyone suggest a better method?
With your current architecture, your DBMS will have a heart attack :)
The solution lies in implementing WebSockets.
On the back end, only one instance of PHP checks whether there is a new update in the database; if there is, PHP can invoke a web service on your WebSocket server (like Node.js), and the server pushes the message to the client.
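A minimal sketch of that single back-end checker. The table layout, the Node.js endpoint URL and the notification route are all assumptions, and the WebSocket server itself is not shown:

<?php
// watcher.php - the one PHP instance that polls the database for everyone
$db = new PDO('mysql:host=localhost;dbname=chat', 'user', 'pass');
$lastId = 0;
while (true) {
    $stmt = $db->prepare('SELECT id, body FROM messages WHERE id > ? ORDER BY id');
    $stmt->execute([$lastId]);
    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $msg) {
        // hand the new message to the WebSocket server, which pushes it to clients
        $ctx = stream_context_create(['http' => [
            'method'  => 'POST',
            'header'  => "Content-Type: application/json\r\n",
            'content' => json_encode($msg),
        ]]);
        file_get_contents('http://localhost:8080/notify', false, $ctx); // assumed endpoint
        $lastId = $msg['id'];
    }
    usleep(500000); // one DB query per half second in total, not one per user
}

The point of the design: the database sees one polling client regardless of how many users are connected.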

How to process massive data-sets and provide a live user experience

I am a programmer at an internet marketing company that primarily makes tools. These tools have certain requirements:
They run in a browser and must work in all of them.
The user either uploads something (.csv) to process or they provide a URL and API calls are made to retrieve information about it.
They are moving around THOUSANDS of lines of data (think large databases). These tools literally run for hours, usually overnight.
The user must be able to watch live as their information is processed and is presented to them.
Currently we are writing in PHP, MySQL and Ajax.
My question is: how do I process LARGE quantities of data and provide a live user experience while the tool is running? Currently I use a custom queue system that sends AJAX calls and inserts rows into tables or data into divs.
This method is a huge pain in the ass and couldn't possibly be the correct method. Should I be using a templating system, or is there a better way to refresh chunks of the page with A LOT of data? And I really mean a lot of data, because we come close to maxing out PHP's memory, which is something we always have to watch out for.
Also, I would love to make it so these tools could run on the server by themselves: upload a .csv, close the browser window, and then have an email sent to the user when the tool is done.
Does anyone have any methods (programming standards) for me that are better than using .ajax calls? Thank you.
I wanted to update with some notes in case anyone has the same question. I am looking into the following to see which is the best solution:
SlickGrid / DataTables
GearMan
Web Socket
Ratchet
Node.js
These are in no particular order and the one I choose will be based on what works for my issue and what can be used by the rest of my department. I will update when I pick the golden framework.
First of all, you cannot handle big data via AJAX alone. To let users watch the process live, you can use WebSockets. As you are experienced in PHP, I can suggest Ratchet, which is quite new.
On the other hand, for the heavy calculations and for storing big data, I would use NoSQL instead of MySQL.
Since you're kind of pinched for time already, migrating to Node.js may not be feasible right now. It would help with notifying users when their results are ready, though, as it can push browser notifications without polling. And since it uses JavaScript, you might find some of your client-side code is reusable.
I think you can run what you need in the background with some kind of queue manager. I use something similar with CakePHP, and it lets me run time-intensive processes in the background asynchronously, so the browser does not need to stay open.
Another plus side for this is that it's scalable, as it's easy to increase the number of queue workers running.
Basically, with PHP you just need a cron job that runs every once in a while and starts a worker that checks a queue database for pending tasks; if none are found, it keeps running in a loop until one shows up, as sketched below.
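A minimal sketch of such a worker, assuming a jobs table with status and payload columns (names made up) and a cron line like * * * * * php worker.php:

<?php
// worker.php - launched by cron; picks pending jobs off the queue table
$db = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
while (true) {
    $job = $db->query("SELECT id, payload FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1")
              ->fetch(PDO::FETCH_ASSOC);
    if (!$job) {
        sleep(5);       // queue is empty; keep looping until a task shows up
        continue;
    }
    $db->prepare("UPDATE jobs SET status = 'running' WHERE id = ?")->execute([$job['id']]);
    run_job($job['payload']);  // hypothetical: your long-running work goes here
    $db->prepare("UPDATE jobs SET status = 'done' WHERE id = ?")->execute([$job['id']]);
}

Scaling up is then just a matter of starting more workers.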

PHP: how to achieve asynchronous effect for server script

I have searched the internet and seen people working their way towards concurrent calls with PHP, even though PHP doesn't have rich concurrency features. I recently wanted to improve one of my server-side scripts, which takes a request from a client, gets some data from the database, returns the data, and then does some other data update.
The problem now is that the client has to wait for the server to get the data, finish the update and everything else before it can finally get the result it asked for. The client, however, doesn't care about the data update the server does and therefore should not waste time waiting for it.
From what I've read, everyone else talks about the client making an asynchronous call to the server without waiting for the result, but I want the server to return data to the calling client in the middle of its own process.
If I do not want to change anything on the client side, is there any workaround that can achieve this effect?
How about some pseudo multi-threading? http://phplens.com/phpeverywhere/?q=node/view/254
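One concrete workaround, different from the pseudo multi-threading in that link: under PHP-FPM, fastcgi_finish_request() flushes the response to the client and lets the script keep running afterwards, which gives exactly the "return data mid-process" effect without touching the client. A sketch, with the data-access steps as placeholders:

<?php
$data = fetch_requested_data();   // placeholder for the database read
echo json_encode($data);
fastcgi_finish_request();         // the client receives its answer here (PHP-FPM only)

// everything below runs after the response has already been delivered
perform_remaining_update();       // placeholder for the update the client doesn't wait for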

PHP infinite loop or jQuery setInterval?

Js:
<script>
function cometConnect() {
    $.ajax({
        cache: false,
        type: "post",
        data: 'ts=' + 1,
        url: 'Controller/chatting',
        async: true,
        success: function (arr1) {
            $(".page-header").append(arr1);
        },
        complete: function () {
            cometConnect(true);
            nerr = false;
        },
        dataType: "text"
    });
}
cometConnect();
</script>
Php:
public function chatting()
{
    // Long-poll: block until a new message shows up in memcache, then return it.
    $memcache = new Memcache();
    $memcache->connect('127.0.0.1', 11211);
    while (true) {
        $message = $memcache->get('new_message'); // the key must be a string
        if ($message) {
            return $message;
        }
        usleep(500000); // sleep() only takes whole seconds; 0.5 s needs usleep()
    }
}
Is this a better solution than a setInterval that calls the PHP method (which returns a message if there is one) every second (with the interval growing by 0.25 s every 5 seconds, say)?
If I used the first solution, I could probably use sleep(0.5) and it would give me messages instantly, because a PHP loop is cheap, isn't it?
So, which solution is better (and more importantly, which takes fewer resources)? Because there are going to be hundreds of chats like this.
Plus, can the first solution cause problems? Let's say I reload the page, or I stop execution every 30 seconds so I don't get a 502 Bad Gateway.
EDIT: I believe the second solution is better, so I am going to reimplement my site, but I am just curious whether this can cause problems for the user. Can something unexpected happen?
The first problem I noticed is that you can't go to another page until there is at least one new message.
A chat is one-to-many communication: each of the many can send messages and will receive messages from everybody else.
These two actions (sending, receiving) happen continuously, so this looks like an endless loop which the user can enter (join the chat) and exit (leave the chat):
enter
send message
receive message
exit
So the loop looks like this (pseudo-code) on the client side:
while (userInChat)
{
    if (userEnteredMessages)
    {
        userSendMessages(userEnteredMessages)
    }
    if (chatNewMessages)
    {
        displayMessages(chatNewMessages)
    }
}
As you already noted in your question, the problem is implementing this kind of chat for a website.
To implement such a "loop" for a website, you first of all face the fact that you don't want an actual loop here. As long as the user is in the chat, it would run and run and run. So you want to distribute the execution of the loop over time.
To do this, you can convert it into a collection of event functions:
ChatClient
{
    function onEnter()
    {
    }
    function onUserInput(messages)
    {
        sendMessages = send(messages)
        display(sendMessages)
    }
    function onReceive(messages)
    {
        display(messages)
    }
    function onExit()
    {
    }
}
It's now possible to trigger events instead of having a loop. All that's left is the implementation that triggers these events over time, but for the moment that is not really interesting, because it depends on how the chat data exchange is actually implemented.
There is always a remote point the chat client is (somehow) connected to, to send its own messages to and to receive new messages from.
This is some sort of stream of chat messages. Again this looks like a loop, but in fact it's a stream. Like in the chat client's loop, at some point in time the client hooks onto the stream and will send input (write) to and receive output (read) from that stream.
This is already visible in the ChatClient pseudo-code above: there is an event when the user inputs one or multiple messages, which will then be sent (written), and read messages will be available in the onReceive event function.
As the stream is ordered data, there needs to be an order. As this is all event-based and multiple clients are involved, this needs some dedicated handling. Order is relative, so it only works within its context. The context could be time (one message came before another), but since a chat client may have a different clock than the server or another client, we can't use the existing clocks as the time source for ordering messages; they normally differ between computers in a WAN.
Instead, you create your own "time" to line up all messages. With a time shared across all clients and servers, an ordered stream can be implemented. This can easily be done by just numbering the messages in a central place. Luckily your chat has a central place: the server.
The message stream starts with the first message and ends with the last one. So you simply give the first message the number 1, and each new message gets the next higher number. Let's call it the message ID.
So, still regardless of which server technology you'll be using, the chat knows two types of messages: messages with an ID and messages without an ID. This also represents the status of a message: either not yet part of the stream, or part of it.
Messages not associated with the stream are those the user has already entered but which have not yet been sent to the server. When the server receives these "free" messages, it can put them into the stream by assigning IDs:
function onUserInput(messages)
{
    sendMessages = send(messages)
    display(sendMessages)
}
As this pseudo-code example shows, this is what happens here: the onUserInput event gets messages that are not part of the stream yet, and the sendMessages routine returns their streamed representations, which are then displayed.
The display routine then is able to display messages in their stream order.
So, still regardless of how the client/server communication is implemented, with such a structure you can roughly handle a message-based chat system and decouple it from the underlying technologies.
The only thing the server needs to do is take the messages, give each message an ID, and return these IDs. The assignment of the ID is normally done when the server stores the messages in its database. A good database takes care of numbering messages properly, so there is not much to do.
The other interaction is reading new messages from the server. To do this efficiently over the network, the client tells the server which message ID it wants to read from. The server will then pass the messages since that point (ID) to the client.
As this shows, the "endless" loop from the beginning has now turned into an event-based system with remote calls. As remote calls are expensive, it is better to make them capable of transferring a lot of data in one connection. Part of that is already in the pseudo-code, as it's possible to send one or multiple messages to the server and to receive zero or more messages from the server at once.
The ideal implementation would be one connection to the server that allows reading and writing messages in full duplex. However, no such technology exists in JavaScript yet. These things are under development with WebSockets, stream APIs and the like, but for the moment let's keep things simple and look at what we have: stateless HTTP requests, some PHP on the server, and a MySQL database.
The message stream can be represented by a database table that has an auto-incrementing unique key for the ID and other fields to store the message.
The write transaction script will just connect to the database, insert the message(s) and return the IDs. That's a very common operation and it should be fast (MySQL has a sort of memcache bridge which should make the store operation even faster and more convenient).
The read transaction script is equally simple: it just reads all messages with an ID higher than the one passed to it and returns them to the client.
Keep these scripts as simple as possible and optimize the read/write time to the store, so they execute fast, and you're basically done, even when chatting over plain HTTP (a minimal sketch of both scripts follows).
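Here is that sketch; all table, column and function names are assumptions:

<?php
// The stream table; AUTO_INCREMENT hands out the message IDs centrally:
//   CREATE TABLE messages (
//     id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
//     body TEXT NOT NULL
//   );

// write script: put "free" messages into the stream, return their new IDs
function writeMessages(PDO $db, array $bodies)
{
    $stmt = $db->prepare('INSERT INTO messages (body) VALUES (?)');
    $ids = array();
    foreach ($bodies as $body) {
        $stmt->execute(array($body));
        $ids[] = (int) $db->lastInsertId();  // the ID assigned by AUTO_INCREMENT
    }
    return $ids;
}

// read script: return every message newer than the ID the client last saw
function readMessagesSince(PDO $db, $lastId)
{
    $stmt = $db->prepare('SELECT id, body FROM messages WHERE id > ? ORDER BY id');
    $stmt->execute(array((int) $lastId));
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}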
Still, your web server and the overall internet connection might not be fast enough (although there is keep-alive).
However, HTTP should be good enough for the moment to test whether your chat system actually works without any loops, neither client- nor server-side.
It's also good to keep servers dead simple, because every client relies on them; they should just do their work and that's it.
You can at any time change the server (or offer different types of servers) that your chat client interacts with, by giving the chat client different implementations of the send and receive functions. E.g. I see in your question that you're using Comet; this should work as well, and it's probably easy to implement the server directly for Comet.
If in the future WebSockets become more accessible (which might never fully happen, because of security considerations), you can offer another type of server for WebSockets as well. As long as the data structure of the stream stays intact, this will work with different types of servers next to each other. The database will take care of the consistency.
Hope this is helpful.
Just as an additional note: HTML5 offers something called Stream Updates with Server-Sent Events, with an online demo and PHP/JS sources. This HTML5 feature already provides an event object in JavaScript, which could be used to create an exemplary chat-client transport implementation.
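A minimal sketch of such an SSE endpoint in PHP; the message source is a placeholder, and the browser consumes it with new EventSource('sse.php'):

<?php
// sse.php - Server-Sent Events endpoint
header('Content-Type: text/event-stream');
header('Cache-Control: no-cache');

// on reconnect the browser re-sends the last ID it saw, so no message is lost
$lastId = isset($_SERVER['HTTP_LAST_EVENT_ID']) ? (int) $_SERVER['HTTP_LAST_EVENT_ID'] : 0;
while (!connection_aborted()) {
    foreach (fetch_messages_since($lastId) as $msg) { // placeholder data source
        echo "id: {$msg['id']}\n";
        echo 'data: ' . json_encode($msg) . "\n\n";
        $lastId = $msg['id'];
    }
    @ob_flush();
    flush();
    sleep(1);
}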
I wrote a blog post about how I had to handle a similar problem (using node.js, but the principles apply).
http://j-query.blogspot.com/2011/11/strategies-for-scaling-real-time-web.html
My suggestion is: if it's going to be big, either a) you need to cache like crazy at your web-server layer, which probably means your AJAX calls need a timestamp on them, or b) use something like Socket.IO, which is built for scaling real-time web apps and has built-in support for channels.
Infinite loops in PHP can and will use 100% of your CPU; sleep functions fix that problem. However, you probably don't want a separate HTTP process running all the time for every client connected to your server, because you'll run out of connections. You could instead have one PHP process that looks at all inbound messages and routes them to the right person as they come in. This process could be launched from a cron job once a minute. I've written this type of thing many times and it works like a charm. Note: make sure you don't run the process if it's already running, or you will run into multiprocessing problems (like double messages). In other words, you need to make the process thread-safe.
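A common way to get that "don't run if already running" guarantee in PHP is an exclusive, non-blocking lock on a lock file; a sketch (the file path is arbitrary):

<?php
// at the top of the cron-launched router process
$lock = fopen('/tmp/message-router.lock', 'c');
if (!flock($lock, LOCK_EX | LOCK_NB)) {
    exit; // a previous instance is still running, so bail out quietly
}
// ...route inbound messages here; the lock is released when this process exits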
If you want real-time chat, you might want to take a look at StreamHub, which opens a full-duplex connection to the client's browser.
It's not a PHP or jQuery task now. Node.js!
There is socket.io, which means WebSockets.
I'll explain why Node.js is better. I had a task to refresh on-page markers every 10 seconds, for example, and I did it with the first method. When the persistent user count reached 200, the HTTP server and PHP were in trouble: there were a lot of unnecessary requests.
What Node.js gives you:
Creating separate rooms for chats (here)
Sends data only to those who have updates (for example, if I do not have any new messages, my refresh is held back until the database selection finds one)
You run 1 query to the DB per 0.5 seconds, no matter how many users there are
Just look into Node.js and Socket.IO. This solution gave me a great boost.
First off, ask yourself whether it's necessary to update the chat frequently. What type of chats will be happening? Is it real-time? Simple Q&A? Tech support? Etc. In all but the real-time case, you will be better off with a long-polling JS-based design, because instantaneous responses are not that important. If this is for real-time chat, then you should consider a Gmail-like design whereby you keep an XHR open and push messages back to the client as they are received. If connection resources are a concern, you can get by with long polling at a very brief interval (e.g. 5-10 seconds).

How would you protect a database of links from being scraped?

I have a large database of links, all sorted in specific ways and attached to other information, which is valuable (to some people).
Currently my setup (which seems to work) simply calls a PHP file like link.php?id=123, which logs the request with a timestamp in the DB. Before it spits out the link, it checks how many requests were made from that IP in the last 5 minutes; if it's greater than x, it redirects you to a captcha page.
That all works fine and dandy, but the site has been getting really popular (as well as being DDoSed for about 6 weeks), so PHP has been getting floored, and I'm trying to minimize how often I have to hit PHP to do something. I wanted to show links in plain text instead of through link.php?id= and have an onclick function simply add 1 to the view count. I'm still hitting PHP, but at least if it lags, it does so in the background, and the user sees the link they requested right away.
Problem is, that makes the site REALLY scrapable. Is there anything I can do to prevent this, while still not relying on PHP to do the check before spitting out the link?
It seems the bottleneck is the database. Each request performs an insert (logging the request), then a select (counting the requests from that IP in the last 5 minutes), and then whatever database operations are necessary for the core function of the application.
Consider maintaining the request-throttling data (IP, request time) in server memory rather than burdening the database. Two options are memcache (http://www.php.net/manual/en/book.memcache.php) and memcached (http://php.net/manual/en/book.memcached.php).
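A minimal sketch of that throttle with the memcached extension; the key prefix, window and threshold are arbitrary, and note this is a fixed 5-minute window rather than a strictly sliding one:

<?php
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

$key = 'hits:' . $_SERVER['REMOTE_ADDR'];
$mc->add($key, 0, 300);               // creates the counter with a 5-minute expiry
$hits = $mc->increment($key);         // atomic, and no database round-trip

if ($hits !== false && $hits > 50) {  // threshold x = 50 is an assumption
    header('Location: /captcha.php');
    exit;
}
// ...look up and output the link as before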
As others have noted, ensure that indexes exist for whatever keys are queried (fields such as the link id). If indexes are in place and the database still suffers from the load, try an HTTP accelerator such as Varnish (http://varnish-cache.org/).
You could do the IP throttling at the web-server level. Maybe a module exists for your web server; as an example, with Apache you can write your own RewriteMap and have it consult a daemon program, so you can do more complex things. Have the daemon program query a memory database; it will be fast. A sketch follows.
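Apache's prg: RewriteMap type keeps one long-lived helper process running: Apache writes one lookup key per line to its stdin and reads one answer line from its stdout. A sketch of such a daemon in PHP, reusing the in-memory counter idea (all names and limits are assumptions):

#!/usr/bin/env php
<?php
// throttle-map.php, wired up in Apache with:
//   RewriteMap throttle "prg:/usr/local/bin/throttle-map.php"
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

while (($ip = fgets(STDIN)) !== false) {
    $key = 'hits:' . trim($ip);
    $mc->add($key, 0, 300);
    $hits = $mc->increment($key);
    // prg: maps must answer exactly one line per request, unbuffered
    echo ($hits !== false && $hits > 50) ? "deny\n" : "ok\n";
    flush();
}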
Check your database. Are you indexing everything properly? A table with this many entries will get big very fast and slow things down. You might also want to run a nightly process that deletes entries older than 1 hour etc.
If none of this works, you are looking at upgrading/load balancing your server. Linking directly to the pages will only buy you so much time before you have to upgrade anyway.
Everything you do on the client side can't be protected. Why not just use AJAX?
Have an onClick event that calls an AJAX function which returns just the link and fills it into a DIV on your page. Because the request and answer are small, it will work fast enough for what you need. Just make sure the function you call checks the timestamp; it is easy to make a script that calls that function many times to steal your links.
You can check out jQuery or other AJAX libraries (I use jQuery and sAjax). I have lots of pages that dynamically change content very fast; the client doesn't even know it's not pure JS.
Most scrapers just analyze static HTML, so encode your links and then decode them dynamically in the client's web browser with JavaScript.
Determined scrapers can still get around this, but they can get around any technique if the data is valuable enough.
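A sketch of that idea: PHP emits base64-encoded hrefs plus a tiny inline decoder, so a scraper parsing the raw HTML never sees a usable URL. The class and attribute names are made up, and the JavaScript is emitted from PHP just to keep the example in one file:

<?php
// emit an encoded link; static-HTML scrapers only see base64 noise
$url = 'http://example.com/some-target';  // placeholder link
echo '<a class="enc" data-href="' . base64_encode($url) . '">view link</a>';

// one small decoder per page; atob() reverses the base64 in the browser
echo <<<'HTML'
<script>
document.addEventListener('DOMContentLoaded', function () {
    var links = document.querySelectorAll('a.enc');
    for (var i = 0; i < links.length; i++) {
        links[i].href = atob(links[i].getAttribute('data-href'));
    }
});
</script>
HTML;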
