I am looking for methods to deliver data to users. The format could be anything from XML to a flat-file DB to a relational DB.
The DB will live on a server and will serve handheld devices, smartphones and tablets on request. My main concerns are the speed of displaying the requested info across millions of rows and how easily the data can be updated.
Before posting this, I did some Google searches that pointed me to SQLite. What is your opinion on this?
You would probably be better off setting up a SOAP/XML-RPC service, working the data on the server, and sending it to the client in pieces, caching as you go. This is potentially the "fastest" way, since you don't have to transmit a large file to a handheld. It may not work for you, but it's worth mentioning.
This is essentially just a file download bracketed by an output to a transport format and an input to the mobile device's datastore. I imagine you're describing something like a dictionary, something that is fairly static.
The approach you're going to want to achieve will need to:
Take into account overall file size; i.e., use as little meta-description as you can get away with while still communicating the data's meaning.
Utilize some type of compression (gzip, for instance) of the data between the server and device. Consider that aggressive compression may take longer on the device to decompress and be counter-productive compared to a less aggressive approach that decompresses significantly faster.
Take into account how the data will get out of your server's storage (Is it in a database? Can you cache the data in the transport form?) and into the device's data storage system (e.g., SQLite). Going out of your way to shrink the transfer significantly may in the end be counter-productive if the device has to work three times longer to insert the data than the time saved in downloading it.
Test. Test. Test.
You could serve it as a CSV or JSON-formatted file; both are minimalistic and don't use a lot of data to describe the data. XML is not the fastest because it is generally very descriptive, but it has other advantages. Another option may be to send a text file of SQL INSERT statements.
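For illustration, a minimal sketch of serving one chunk of rows as gzip-compressed JSON from PHP; the table, columns and paging parameter are invented for the example, and a production version would honor the client's Accept-Encoding header:

<?php // sketch: serve one chunk of rows as gzip-compressed JSON (hypothetical schema)
$pdo  = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$stmt = $pdo->prepare('SELECT id, term, definition FROM entries WHERE id > ? ORDER BY id LIMIT 5000');
$stmt->execute(array((int)(isset($_GET['after']) ? $_GET['after'] : 0))); // piece-at-a-time paging
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

header('Content-Type: application/json');
header('Content-Encoding: gzip');
// Level 6 is a middle ground: aggressive compression costs the device
// more decompression time, as noted in the list above.
echo gzencode(json_encode($rows), 6);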
Another "option", of course, may be to use the update mechanism and push it out as an "update". My thought on this is that many people with mobile devices know updates are time consuming and are often very large downloads, so they connect to their computer when doing so. This is just a thought; I'm not sure if this would actually be any better, but thought I'd throw it out there.
SQLite is an RDBMS that runs on many mobile platforms. It has its limitations, but it may be all you have to work with. It's a good option.
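If the data does end up in SQLite on (or for) the device, the classic trick for keeping insert time down is wrapping the bulk load in a single transaction. A sketch with PHP's PDO SQLite driver; the table, columns and $rows variable (the decoded chunk from the server) are again invented:

<?php // sketch: bulk-load a downloaded chunk into SQLite inside one transaction
$db = new PDO('sqlite:/tmp/app.db');
$db->exec('CREATE TABLE IF NOT EXISTS entries (id INTEGER PRIMARY KEY, term TEXT, definition TEXT)');

$db->beginTransaction(); // one transaction for the whole batch, not one per row
$insert = $db->prepare('INSERT INTO entries (id, term, definition) VALUES (?, ?, ?)');
foreach ($rows as $row) { // $rows: assumed to hold the decoded payload from the server
    $insert->execute(array($row['id'], $row['term'], $row['definition']));
}
$db->commit(); // committing per-row instead can be orders of magnitude slower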
If you want the mobile devices to perform actions on the data (like offline processing), I think SQLite is the way to go.
If the processing is going to happen online anyway, and the mobile device will mostly just display the data in an environment where you are "always" online, then web services for requesting and processing the data, backed by a server-side SQL database, would be the way to go.
In terms of performance it is like this:
A lot of data means a lot of online traffic, so if there really is a lot of data, like the millions of rows you mention, then I think offline is the way to go; otherwise the traffic would just kill the performance.
If you are dealing with millions of rows and you want to display small samples at a time (anything else is hardly possible) on mobile devices, you should definitely do the processing server-side, ideally using an RDBMS. Those are made for handling millions of rows. SQLite is fast and simple, but in the millions-of-rows range you'd better reach for PostgreSQL or one of the commercial brands (Oracle, MS SQL Server, ...).
You can use a web service or direct access to the database, that depends. You can do some processing locally, but the selection of rows has to happen at the server.
I would like your opinion on a project. The way I started it is slowly revealing many gaps and problems that, now or in the future, will create big issues.
The system will have a notification system, a friends system, a (private) message system, and other systems of that kind. All of these I have set up with jQuery, PHP and mysqli.
To avoid wordiness, I'll get right to what the title says: simple PHP code with POST and GET requests works amazingly for 3-4 online users! The thing is, when I have many more users, what can I do to make better use of the server's resources? So I started looking around and found things like socket.io.
I just want someone who knows more than I do to tell me what would be best to look into. Consider how the notification updates work now: a jQuery POST repeated every 3-5 seconds, which is by no means right.
If your goal is to set up a highly scalable notification service, then probably not.
That's not a strict no, because there are other factors than speed to consider, but when it comes to speed, read on.
WebSockets does give the user a consistently open, bi-directional connection that is, by its very nature, very fast. Also, the client doesn't need to request new information; it is sent when either party deems it appropriate to send.
However, the time savings that the connection itself gives is negligible in terms of the costs to generate the content. How many database calls do you make to check for new notifications? How much structured data do you generate to let the client know to change the notification icon? How many times do you read data from disk, or from across the network?
These same costs do not go away when using any WebSocket server; it just makes one mitigation technique more obvious: Keep the user's notification state in memory and update it as notifications change to prevent costly trips to disk, to the database, and across the server's local network.
Known proven techniques to mitigate the time costs of serving quickly changing web content:
Reverse proxy (Varnish-Cache)
Sits on port 80 and acts as a very thin web server. If a request is for something that isn't in the proxy's in-RAM cache, it sends the request on down to a "real" web server. This is especially useful for serving content that very rarely changes, such as your images and scripts, and it supports edge-side includes for content that mostly remains the same but has some small element that can't be cached. For instance, on an e-commerce site, a product's description, image, etc. may all be cached, while the HTML that shows the contents of a user's cart can't be, making it an ideal candidate for an edge-side include.
This will help by greatly reducing the load on your system, since there will be far fewer requests that use disk IO, which is a resource far more limited than memory IO. (A hard drive can't seek for a database resource at the same time it's seeking for a cat jpeg.)
In Memory Key-Value Storage (Memcached)
This will probably give the most bang for your buck, in terms of creating a scalable notification system.
There are other in-memory key-value storage systems out there, but this one has support built right into PHP, not just once, but twice! (In the grand tradition of PHP core development, rather than fixing a broken implementation, they decided to consider the broken version deprecated without actually marking that system as deprecated and throwing the appropriate warnings, etc., that would get people to stop using the broken system. mysql_ v. mysqli_, I'm looking at you...) (Use the memcached version, not memcache.)
Anyways, it's simple: When you make a frequent database, filesystem, or network call, store the results in Memcached. When you update a record, file, or push data across the network, and that data is used in results stored in Memcached, update Memcached.
Then, when you need data, check Memcached first. If it's not there, then make the long, costly trip to disk, to the database, or across the network.
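As a concrete sketch of that check-the-cache-first pattern using PHP's memcached extension; the key scheme, table and query are invented for the example:

<?php // sketch: cache-aside lookup with the memcached extension
$cache = new Memcached();
$cache->addServer('127.0.0.1', 11211);

function getNotificationCount(Memcached $cache, PDO $db, $userId) {
    $key   = 'notif_count:' . $userId;
    $count = $cache->get($key);
    if ($count !== false)           // strict check: a cached 0 is not === false
        return $count;

    // Cache miss: take the slow trip to the database, then prime the cache.
    $stmt = $db->prepare('SELECT COUNT(*) FROM notifications WHERE user_id = ? AND seen = 0');
    $stmt->execute(array($userId));
    $count = (int)$stmt->fetchColumn();
    $cache->set($key, $count, 300); // overwrite/delete this key whenever a notification changes
    return $count;
}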
Keep in mind that Memcached is not a persistent datastore... That is, if you reboot the server, Memcached comes back up completely empty. You still need a persistent datastore, so still use your database, files, and network. Also, Memcached is specifically designed to be a volatile storage, serving only the most accessed and most updated data quickly. If the data gets old, it could be erased to make room for newer data. After all, RAM is fast, but it's not nearly as cheap as disk space, so this is a good tradeoff.
Also, no key-value storage systems are relational databases. There are reasons for relational databases. You do not want to write your own ACID guarantee wrapper around a key-value store. You do not want to enforce referential integrity on a key-value store. A fancy name for a key-value store is a No-SQL database. Keep that in mind: You might get horizontal scalability from the likes of Cassandra, and you might get blazing speed from the likes of Memcached, but you don't get SQL and all the many, many, many decades of evolution that RDBMSs have had.
And, finally:
Don't mix languages
If, after implementing a reverse proxy and an in-memory cache you still want to implement a WebSocket server, then more power to you. Just keep in mind the implications of which server you choose.
If you want to use Socket.io with Node.js, write your entire application in Javascript. Otherwise, choose a WebSocket server that is written in the same language as the rest of your system.
Example of a one-language solution:
<?php // ~/projects/MySocialNetwork/core/users/myuser.php
class MyUser {
    public function getNotificationCount() {
        // Note: Don't go to the DB first if we can help it.
        // 0 is false-ish, so explicitly check for "no result" rather than truthiness.
        $notifications = $this->memcachedWrapper->getNotificationCount($this->userId); // assumed injected cache wrapper
        if ($notifications !== null)
            return $notifications;
        $userModel = new MyUserModel($this->userId);
        return $userModel->getNotificationCount();
    }
}
...
<?php // ~/projects/WebSocketServerForMySocialNetwork/eventhandlers.php
function websocketTickCallback() {
    global $connectedUsers; // maintained by the server's connect/disconnect handlers
    foreach ($connectedUsers as $user) {
        if ($user->getHasChangedNotifications()) {
            $notificationCount = $user->getNotificationCount();
            $contents = json_encode(array('Notification Count' => $notificationCount));
            $message = new WebsocketResponse($user, $contents);
            $message->send();
            $user->resetHasChangedNotifications();
        }
    }
}
If we were using socket.io, we would have to write our MyUser class twice, once in PHP and once in Javascript. Who wants to bet that the classes will implement the same logic in the same ways in both languages? What if two developers are working on the different implementations of the classes? What if a bugfix gets applied to the PHP code, but nobody remembers the Javascript?
I'm developing an online sudoku game with ActionScript 3.
I made the game and asked people to test it. It works, but the website goes down constantly. I'm using 000webhost, and I suspect this is a bandwidth-usage precaution.
My application updates the current puzzle by parsing a JSON string every 2 seconds, and whenever a player enters a number it sends a GET request to update the MySQL database. Do you think this causes a lot of data traffic?
How can I see the bandwidth usage value?
And how should I decrease the data traffic between Flash and MySQL (or PHP, really)?
Thanks!
There isn't enough information for a straight answer, and if there were, it'd probably take more time to figure out. But here's some directions you could look into.
Bandwidth may or may not be an issue. There are many things that could happen, you may very well put too much strain on the HTTP server, run out of worker threads, have your MySQL tables lock up most of the time, etc.
What you're doing indeed sounds like it's putting a lot of strain on the server. Monitoring this from the client side would be ineffective; you need to look at some server-side values, but you generally don't have access to those unless you have at least a VPS.
Transmitting data as JSON is easier to implement and debug, but a more efficient way to send data (binary instead of strings) is AMF: http://en.wikipedia.org/wiki/Action_Message_Format
One PHP implementation for the server side part is AMFPHP: http://www.silexlabs.org/amfphp/
Alternatively, your example is a good use case for the remote shared objects in Flash Media Server (or similar products). A remote shared object is exactly what you're doing with MySQL: it creates a common memory space where you can store whatever data and it keeps that data synchronised with all the clients. Automagically. :)
You can start from here: http://livedocs.adobe.com/flashmediaserver/3.0/hpdocs/help.html?content=00000100.html
I'm working on a chat application which I would love to use a SQL db for.
My problem is, after a few Google searches, I have people on one site telling me that using a DB would be much slower than using a normal file (e.g. a text or JSON file), but then on some other sites people are saying the complete opposite. And I don't know about you guys, but when it comes to creating web apps for users, the users always come first.
So as much as I'd love to use a SQL DB, since 1) I have good experience with it and 2) it allows me to make the application much cooler (more features), if it would slow things down on the users' end (a noticeable lag), then it's a no-no.
Either way, I will be "polling" the server continuously with AJAX and PHP to check the file/DB (for new messages, contact requests, etc.).
Also, in case you're wondering, the application won't be a 1-to-1 chat; it will have "rooms" where multiple users can join and talk with everyone else who joins. Users will also be able to request a "private chat" with another user, where a 1-to-1 connection opens up.
So, MySQL database OR a boring text/JSON/other file, in regards to performance?
Oh, one more thing: I don't want to use any third-party libraries or APIs. I hate relying on other people's work (I've been let down too many times).
If you're looking to implement an IRC clone, I think you've chosen all the wrong tools.
The best way to do this would be to write a custom HTTP server that handles everything in memory. No databases, no constant polling of files. When a message arrives, you simply loop through the correct in-memory list and dispatch the message to other users. For the browser to server connection, I suggest "Comet" (with web sockets for browsers that support them, if you're feeling up to it).
PHP likely isn't the language of choice for this, because pretty much all work done with PHP is based on traditional short, isolated requests. For a long-running process which serves multiple clients in real time, I'd suggest something like Python or Node.js.
You don't really want to be storing chats in files; that can create a management nightmare. I would recommend you go with MySQL and, to make sure it performs properly, go with sockets instead of AJAX polling; sockets will scale really well.
However, there isn't much around about how you can integrate socket-based chats with MySQL.
I have done a few tests and have a basic example working here: https://github.com/andrefigueira/PHP-MySQL-Sockets-Chat
It makes use of Ratchet (http://socketo.me/) for the creation of the chat server in PHP.
And you can send chat messages to the DB by sending the server JSON with the information about who is chatting (if, of course, you have user sessions).
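For reference, the core of a Ratchet-based chat server looks roughly like this. It's adapted from Ratchet's canonical hello-world pattern rather than taken from the repository above, so treat it as a sketch:

<?php // sketch: minimal Ratchet chat server (composer require cboden/ratchet)
require __DIR__ . '/vendor/autoload.php';

use Ratchet\MessageComponentInterface;
use Ratchet\ConnectionInterface;
use Ratchet\Server\IoServer;
use Ratchet\Http\HttpServer;
use Ratchet\WebSocket\WsServer;

class Chat implements MessageComponentInterface {
    protected $clients;

    public function __construct() {
        $this->clients = new \SplObjectStorage();
    }

    public function onOpen(ConnectionInterface $conn) {
        $this->clients->attach($conn);
    }

    public function onMessage(ConnectionInterface $from, $msg) {
        // Fan the message out; this is also the natural place to decode the
        // JSON (who is chatting, which room) and INSERT it into MySQL.
        foreach ($this->clients as $client) {
            if ($client !== $from) {
                $client->send($msg);
            }
        }
    }

    public function onClose(ConnectionInterface $conn) {
        $this->clients->detach($conn);
    }

    public function onError(ConnectionInterface $conn, \Exception $e) {
        $conn->close();
    }
}

IoServer::factory(new HttpServer(new WsServer(new Chat())), 8080)->run();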
Why is HTTP-Polling so laggy?
What I have is a button, and whenever a user clicks it a MySQL database field gets updated and the value is displayed to the user. I'm polling every 800 milliseconds and it's very laggy/glitchy. Sometimes when clicking the button it doesn't register it. And I actually need to be polling quite a bit more frequent than every 800 milliseconds.
This is also with just 1 user on the website at a time... When in the end there is going to be many at once.
HTTP-streaming/Long-polling/Websockets instead of polling
When you need real-time information you should avoid (frequent) polling. Below I'll try to explain why. You could compare it to a child in the back of your car screaming every second, "are we there yet?", while you reply "we are not there yet" over and over.
Instead you would like to have something like long-polling/HTTP-streaming or WebSockets. You could compare this to a child in the back of your car telling you to let him know when "we are there", instead of asking every second. You can imagine this is way more efficient than the previous example.
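To make the difference concrete, a long-poll endpoint holds the request open until there is news. A PHP sketch (even though, as noted just below, PHP isn't ideal for long-running connections); checkForNewData() is a hypothetical helper that would ask an in-memory store, not the disk:

<?php // sketch: long-polling endpoint - "tell me when we're there"
$lastId  = isset($_GET['last_id']) ? (int)$_GET['last_id'] : 0;
$started = time();

while (time() - $started < 25) {      // stop before typical 30s gateway timeouts
    $news = checkForNewData($lastId); // hypothetical: query redis/memcached for newer items
    if ($news !== null) {
        header('Content-Type: application/json');
        echo json_encode($news);
        exit;                         // client immediately re-issues the request
    }
    usleep(250000);                   // re-check 4x/second server-side, no HTTP round trips
}

http_response_code(204);              // nothing new; client just reconnects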
To be honest, I don't think PHP is the right tool for this kind of application (yet). Some options you have available are:
hosted solutions:
http://pusherapp.com:
Pusher is a hosted API for quickly, easily and securely adding scalable realtime functionality via WebSockets to web and mobile apps.
Our free Sandbox plan includes up to 20 connections and 100,000 messages per day. Simply upgrade to a paid plan when you're ready.
http://beaconpush.com/
Beaconpush is a push service for creating real-time web apps using HTML5 WebSockets and Comet.
host yourself:
http://socket.io:
Socket.IO aims to make realtime apps possible in every browser and mobile device, blurring the differences between the different transport mechanisms.
When you become very big, the "host yourself" solution is going to be less expensive; on the other hand, using something like Pusher will get you started more easily (friendly API) and also is not that expensive. For example, Pusher's "Bootstrap" plan allows 100 concurrent connections and 200,000 messages per day for $19 per month (but while small, Beaconpush is cheaper, so do the math :)). As a side note, this plan does not include SSL, so it can not be used for sensitive data. I guess having a dedicated machine (VPS) will cost you about the same amount of money (for a simple website) and you will also have to manage the streaming solution yourself, but when you get bigger this is probably way more attractive.
Memory instead of Disk
whenever a user clicks it a MySQL database field gets updated and the value is displayed to the user
When comparing disk I/O (MySQL in standard mode) to memory, disk is extremely slow. You should be using an in-memory database, for example redis (which also has persistent snapshots) or memcached (completely in memory), to speed up the process. I myself really like redis for its insane speed, simplicity and persistent snapshots. http://redistogo.com/ offers a free plan with 5MB of memory, which will probably cover your needs. If not, the mini plan of $5 a month will probably cover you, but when getting even bigger a VPS will be cheaper and, in my opinion, the preferred solution.
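With the phpredis extension, moving that button counter from a MySQL table into memory is a couple of lines; the key name is invented for the example:

<?php // sketch: the button-click counter in redis instead of on disk
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

$count = $redis->incr('button:clicks'); // atomic, in-memory increment
echo json_encode(array('clicks' => $count));

// Persist back to MySQL from a cron job every so often, and let redis's
// snapshots cover crash recovery in between.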
Best solution
The best solution (especially if you are getting big) is to host socket.io/redis yourself on a VPS (costs money). If really small I would use Redis To Go; if not, I would host it myself. I would also start out with something like Beaconpush/Pusher because of their simplicity (you get going immediately). Hosting socket.io yourself (I'd advise playing with it on your own machine in preparation for getting big) is pretty simple, but in my opinion more difficult than Beaconpush/Pusher.
Laggy/glitchy? Sounds like a client-side problem. As does the button thing. I'd get your JavaScript in order first.
As for polling, 0.8 seconds sounds a bit time-critical. I don't know about most countries, but here in the third world simple network packets may get delayed for as long as a few seconds. (Not to mention connection drops, packet losses and the speed of light.) Is your application ready to cope with all that?
As for an alternate approach, I agree with #Vern in that an interrupt-driven one would be much better. In HTTP terms, it translates to a long-standing HTTP request that does not receive a response until the server has some actual data to send, minimizing delay and bandwidth. (AFAIK) it's an older technique than AJAX, though it has been named more recently. Search for "Comet" and you'll end up with both client- and server-side libraries.
There are many things that might cause the lag you are experiencing. Your server might well be able to process the requests fast enough, but if the connection between your client and the server is slow, you'll still see obvious lag.
The first thing you should try is to ping the server and see what response time you're getting.
Secondly, rather than poll, you might want to consider an interrupt-driven approach. This means that only when your server replies will you send out your next request. This makes sense, so that many clients won't flood the server with requests to the point where it cannot cope. This is especially true when the RTT (round-trip time) of your request is pretty long.
Hope it helps. Cheers!
A good place to start would be to use a tool like Firebug in Mozilla Firefox that will allow you to watch the requests being sent to the server and look for bottlenecks.
Firebug will break down each part of the request, so you can see if you are having trouble talking to the server or if it is simply taking a long time to come up with a response.
Along with #Vern's answer I would also say that, if at all possible, I would have the server cache the data ahead of time, so that all of the clients pull from that same cache instead of making separate MySQL calls for the same data on every update. Then you just have your PHP update the cache whenever the actual DB data changes.
By cache I mean having PHP write to a file on the server side; clients then simply look at the contents of that one file to see the most up-to-date info. There may be better ways of caching, but since I have never done this personally before, this is the first solution that popped into my mind.
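A rough sketch of that, with invented paths and an invented $newValue; the LOCK_EX flag keeps clients from ever reading a half-written file:

<?php // sketch: update.php - runs only when a click actually changes the DB
// ... perform the MySQL UPDATE here, then refresh the cache file:
file_put_contents('/var/www/cache/state.json',
                  json_encode(array('value' => $newValue)), // $newValue: the freshly written value
                  LOCK_EX);
...
<?php // sketch: poll.php - what the 800ms poll hits; no MySQL query at all
header('Content-Type: application/json');
readfile('/var/www/cache/state.json');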
I am building a "multiplayer world" with jQuery and PHP. Here is a bit how it works:
Users' characters' positions are taken from a database, and each user is plotted accordingly (the position values are CSS values: left and top)
User is able to move about using the arrow keys on the keyboard, making their character move using jQuery animations. While this is happening (on each arrow press) the user's position values are inserted into a database and updated.
In order to make this "global", as you could say (so users see each other), the values need to be updated all at once for every user using AJAX
The problem I am having is that I need to continuously call a JavaScript function I wrote, which connects to the MySQL server and grabs values from a database table, via setInterval(thisFunction, 1000);. However, my host just suspended me for overloading the server's resources, and I think this was because of all my MySQL queries. On top of repeatedly grabbing values from my database, I had to insert values every few seconds as well, so I can imagine that would cause a crash over time if enough clients were logged in. How can I reduce the number of queries I am using? Is there another way to do what I need to do? Thank you.
This is essentially the same thing as a chat system with regards to resource usage. Try a search and you'll find many different solutions, including concepts like long polling and memcached. For example, check this thread: Scaling a chat app - short polling vs. long polling (AJAX, PHP)
You should look into long polling: http://en.wikipedia.org/wiki/Push_technology. This method allows you to establish a connection with your server and then update it only when you need to. However, by the sounds of it, you have a pretty intensive thing going on if you want to update on every move, so you may want to look into another way of storing this data. If you're wondering how big companies do it: they tend to have massive numbers of servers to handle it, but they also use techniques similar to long polling.
You could store all the positions in memory using memcached (see http://php.net/manual/fr/book.memcached.php) and save them all at once into the database every few seconds (if needed).
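A sketch of that idea with the memcached extension; the key scheme and the $userId/$left/$top/$activeUserIds variables are invented for the example:

<?php // sketch: keep positions in memcached, flush to MySQL every few seconds
$cache = new Memcached();
$cache->addServer('127.0.0.1', 11211);

// On each arrow-key press: write to memory only (cheap, no disk I/O).
$cache->set('pos:' . $userId, array('left' => $left, 'top' => $top));

// On each AJAX poll: read everyone's position from memory, not MySQL.
$keys = array();
foreach ($activeUserIds as $id) { $keys[] = 'pos:' . $id; }
$positions = $cache->getMulti($keys);

// A cron job (or every Nth request) then persists the positions to MySQL.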
You could use web sockets to overcome this problem. Check out this nettuts tutorial.
There is another way: emulate or use actual sockets. Instead of constantly pulling the data (refreshing to check if there are new records), you can push the data over WebSockets, which work in Chrome at the moment (at least to my knowledge; I didn't try FF4), or you can use Node.js for leaner long polling. That way, the communication between players will be bi-directional without needing MySQL to store positions.
Checkout Tornado
From their site:
Tornado is an open source version of the scalable, non-blocking web server and tools that power FriendFeed. The FriendFeed application is written using a web framework that looks a bit like web.py or Google's webapp, but with additional tools and optimizations to take advantage of the underlying non-blocking infrastructure.
The framework is distinct from most mainstream web server frameworks (and certainly most Python frameworks) because it is non-blocking and reasonably fast. Because it is non-blocking and uses epoll, it can handle thousands of simultaneous standing connections, which means it is ideal for real-time web services. We built the web server specifically to handle FriendFeed's real-time features — every active user of FriendFeed maintains an open connection to the FriendFeed servers. (For more information on scaling servers to support thousands of clients, see The C10K problem.)