Multiple chat rooms, technologies and server load, - php

I am working on a website where I need multiple chat rooms. Each room will be generated and tied to the content/event just generated by the user and will aproximately last 15 minutes and then autodestroyed. One room will contain aprox. 15 users. Many events will co-exist at the same time.
I have been googling around a lot recently looking for the most appropriate but still easily implementable solution. What I came across are three main paths:
using a mysql-php ready made script (ie. phpfreechat, x7 chat)
using irc channels created on the fly through an ajax or flash web frontend embeded as an iframe (lightirc, qwebirc, mibbit, (advice more if u want))
using some lively generated hosted chatrooms (ie. everywherechat.com, gabbly (i really need to know more of those, again, advice if you can).
Still I have many doubts on all this: irc frontends cant be customized as much as I'd need (ie. disable kick or ban functionalities). Php-mysql solutions, on the other hand, make me think there will be a lot of server load concerns with such a big user-base.
I am on a paid but non dedicated hosting plan. Php memory limit set to 64mb kind of stuff. How many users at the same time would that hold? Hosted solutions are the easiest ones, but can you recommend one that allows rooms to be created on the fly by just passing in some parameters? (no registration, just a quick-easy-flying chatroom).
I know many of these questions are not strictly programming related but more technology related. But still a programming related answer could help me sort out the right direction. So your contribution is really appreciated. Thanks ! Sorry for being long

Related

facebook/gmail alike web chatbox - what is a good way for nowadays chatapp to store text message?

I'm currently building a facebook alike chatbox, and I have encounter several considerations and problems along the way.
I had been googling useful resources all the time,like simple chatbox example or tutorial online.
My goal is to build one just like facebook/gmail chatbox and CometChat, I know it's hard and too much thing to scale behind the scene, but all I want to do is building it as simple as possible, and figuring out how facebook/gmail chatbox implement their chat functionality.
Progress:
I have finished facebook-like chatbox structure where I have sidebar at the right displaying online friends i can chat with, and popup chatbox at the bottom, and it is able to expand and minimize it.
I also have finished simple chatting based on MySQL database.
There's a table with 4 columns 'sender', 'receiver', 'message', 'time' for storing conversation.
My chatbox works this way:
1.The user send a message, and my front-end javascript will fetch the message the user type in and send the message to php file on the server via Ajax.
2. backend php file will store this message to MySQL.
3. The front-end will call the update function every 3 seconds to update the chatbox content if receiver send message to the sender, and show it out in frontend's chat.
I'm not sure this is a good way and long way to do, and I'm really concerned about it.
If users grow and grow, I have to think of ways to scale it well or my database and server will explode and frontend users might feel high latency in updating conversation.
Is BigTable a right way to do this if you have millions of users online?
How does facebook store their customer's text message or chat history in the backend well??
How does chat app like Whatapp store their text message?
Is it able to let the users chat directly to another user without storing state in server?
If I want to implement the chat history functionality in my chatbox, what is a good way to do ??
I am thinking server can create .txt file for each conversation in their file system, and it has a database table column to store the file path. Is this a good way and right way to do with chat history, I know its possible to do it this way, but im not sure if its a right way or good way.
I know this could be a huge, detailed application.
I'm asking not a detailed implementation but a big picture, concept of building it!
thank you!.
That's a good question and here's an attempt at answering it.
I believe you are thinking about scalability a bit too early. Your IM app might not reach the projected number of users for it to stop performing well. Consider enhancing your small product and scale as you go as much as is needed.
Disk I/O is one of the issues that you will face scaling your web application. Storing communication directly onto the disk with txt file might not be a reliable solution.
Push your technology stack to its limits before considering changing it or switching to something else. I assume you are using a relational database for your storage (since you mentioned columns and rows, which is not an ultimate indicator but still), there are other options out there that have good benchmarking results at the expense of multiple other compromises. (NoSQL: which you referred to as BigTable) is one option. Relational databases are great, they have been for quite a long time the industry standard but currently there are alternative solutions that are quite promising.
Look into NoSQL document based datastorage solutions such as MongoDB, CoucheDB or even Casandra and there are many others. There is a considerable amount of information about the performance of each, under specific circumstances and situations. Choose what is best for the problem at hand and not what is most fashionable or hipped.
Another option would be to outsource your scalability problems to a 3rd Party provider such as Firebase. In this situation all you have to worry about is your product and not what's happening under the hood.
Store only the data that you need and archive or dismiss what you don't.
With scalability there are generally 2 broad categories: Horizontal and Vertical scaling.
Horizontal: means adding more nodes to your system i.e. adding more server instances to handle the extra load. There are many cloud solution providers out there that make this genre of scaling very cheap and instantaneous.
Vertical: means adding more resources to the node you are currently running your app from in addition to use specific technologies that allow you to take full advantages of your resources. This optimization happens on the level of the instance resources (i.e. CPU, RAM, Disk Space etc...) and your data storage, programming language of choice, algorithms you are using etc... You might realize that php and mysql aren't the tools for this job, but that's arguable.
Read More about it here
Distributed Systems architects / programmers also take advantage of other (faster) programming languages at runtime (such as C, C++ or even Java) to speed up certain tasks. Look into how you can dissect your application into smaller decoupled modules / components that can run independently. (But i'm not sure if you will ever reach this stage with an IM client unless it becomes as popular as Whatsapp or Facebook chat).
I advise you to grab and read a couple of books about scaling web applications and leveraging cloud computing. Study scalable architectures and design your application depending on your business logic based on them.
This is a very broad and complex topic, I'm sure others might have additional interesting insight on the matter.

To have multiple sub-domains or multiple separate domains?

My client has a host of Facebook pages that have become very successful. In order to move away from big brother Facebook my client wishes to create a large dynamic site that incorporates the more successful parts of the Facebook empire.
One of my client's spin off sites has been created and is getting a lot of traffic. I'm not sure exactly how much but it hit 90 Gigs in a month as the space allocated need to be increased.
In any case my client has dreamed up a massive website with its own community looking to put the community under the one banner. However I am concerned that it will get thrashed, bottlenecks, long load time, etc.
My questions:
Will a managed dedicated server be able to handle a potentially large amount of traffic?
Is it going to be better to create various parts of the empire in their own separate hosting and domain (normal hosting or VPS), or is it better to have them all under the one hood (i.e. using sub-domains).
If they were all together would it be better for SEO and easier to manage? Or if they are separate, they may be quicker but would it need some sort of Passport user system so people can log into any of the website with the same user details?
Whats the best way to implement a Passport style user system? Do you remotely connect to databases? Or run a regular a Cron job that updates each individual user details on each domain? Maybe run CURL request to the other site given then any new data?
Any other Pros/Cons to keeping all the section together or separating them?
Large site like Facebook manages to have everything under the one root. Then sites like eBay have separate domain names but you can use the same user login across all of them.
I'm not sure what the best option is and would appreciate any guidance.
It is a very general question but to give some hints:
Measure, measure and measure again. Know what kind of parts are used heavily and which are not.
Fix things and go back to 1.
Really: Without knowing what takes lots of time, what is used most heavily etc. you cannot say anything usefull.
VPS or dedicated servers are not the right question. You start with: What do I have to do for the users. Then: How am I going to do that? (for example: in database, in scripts, in message queue) and then finally you see how much hardware you need.
One or multiple domains doesn't really matter. Though one exception: For static content it might be interesting if you have lots of it to use a CDN like Amazon. Read for example: http://highscalability.com/blog/2011/12/27/plentyoffish-update-6-billion-pageviews-and-32-billion-image.html where you can read some things about the possibilities with a CDN.
In general serving static content from a static domain is useful many other things don't really need that. So there you could just consider all in one domain.

Legal script that scrapes and indexes?

I want to create a website that scrapes certain websites (specified by me) to collect data and pricing and then offer that data as search results on my own site. So basically like a search engine, but for specific sites, indexed in a specific way. I can write this myself, but would like to know:
Is it legal? Can I grab for example, all the items off ebay, put it in a search engine and allow users to search ebay using my site?
What if I make money off this?
Are there any popular PHP scripts that already do this?
The legal aspect has been covered. I found a way around this (well, I got permission from the persons creating the content)... so the only real question is: what can I use to crawl the content, especially keeping in mind, each site will have diffrent rules that I will have to set up? It must also be clever enough to not spider the same content twice?
Is it legal?
Yes. And no. Probably.
There isn't one set of laws covering the entire planet, and SO isn't really for legal advice, you need to find a lawyer in your jurisdiction.
My own thoughts are that you would probably be okay in most jurisdictions as long as you use only the information. So, no eBay logos, no representations that you may be associated with them and so on.
But I am not a lawyer (though I deal a lot with the US sub-species as part of my work), certainly not your lawyer, and this advice (which isn't legal advice) is worth every cent you paid for it, which is ZERO!
What if I make money of this?
Good for you :-) Make mega-bucks. But see above point.
Are there any popular PHP scripts that already do this?
That's the bit I can't answer. My experience with PHP ranges somewhere between zero and nothing.
The legality is a bit shady in this area. You should look for the presence of a robots.txt ( http://www.robotstxt.org/robotstxt.html ) file to first determine if the website welcomes web spiders.
Also, there is a very good PHP search script called sphider ( http://www.sphider.eu/ ), you should have a look at.
EDIT:
I can't see many websites having an issue with you taking snippets of their website and then linking users onto the webpage which the content came from. However, if you plan on just taking all their content and displaying it on your own website in order to make profit, I can only assume many web sites would have an issue as they are the ones who should be profiting off the content.
1) Is it legal? Can I grab for example, all the items off ebay, put it in a search engine and allow users to search ebay using my site?
This is technically feasible. You can build a PHP script that does this quite easily. I would say that it is borderline illegal however, because by scraping content from somebody elses site you will be using their intellectual property, their data without permission.
2) What if I make money off this?
Then the original owners of the data are very likely to come after you, issue a cease and desist notice then sue you. An organization as large as ebay could do this without blinking.
3) Are there any popular PHP scripts that already do this?
Because of the questionable legal nature of your question, I highly doubt there are any scripts that already do this.
The correct technique of getting data from ebay and other large data providers is by using APIs, or application programming interfaces. These are special protocols, languages, designed for programs to communicate with each other. This has the benifit of being significantly more efficient than page-scraping, while also being a known legal way to get data from a provider.
More information about the ebay specific API can be found here; http://developer.ebay.com/common/api/

API implementation for my social network

I know about code development using PHP but not much about modern day web APIs. I want to implement a framework of APIs like Facebbok connect. Myspace connect, Google connect etc for 2 purposes:
1) Users can upload photos to their photo album
2) Other websites can login users using authentication from my site (similar to facebook/Google connect).
So firstly, what is the underline technology / server requirements etc to implement this? Can i use PHP? Then what other schema changes are required? I see facebook has public API keys that other developers use for this. But I am not sure on the implementation.
Probably the best place to start is to read up on http://oauth.net/. I know there are several OAuth implementations in several languages like Java and .NET. I'm confident there is something for PHP (since Facebook is primarily PHP). Just have to hit the google.
I disagree. People use facebook because there is no option out there. I use it too (purely to show off to my friends what a loser I am on weekends when I sit at home and keep posting updates of how awesome my day was). But if i find a better network (not in terms of features but pure trust, a social network that actually respects its users and the information people share, I will switch. Creating an account is no big deal, takes 1 minute. But trust is a life time thing. Once broken it rarely ever comes back. I see it like a relationship. When i created my facebook account 4 yrs ago i was in a relationship with facebook. They betrayed me time after time, year after year and I have no respect for it now. It is like that partner who has cheated on you so many times that today you wish it would just die and fade away. If I find something better and I am out, and all my friends whom I know well enough share my views too. So you will get users, no doubt.
I like your idea of trying to create something, this is how we grow. If everyone thinks like these other people here then there will be no progress in the world. Everyone will be a follower only and not a leader. Google would have said there is a yahoo and microsoft which is huge let's just follow them. But they took their time, fine tuned their model and today they are bigger than these these brands. Of course it is a different story they are a bigger offender of being a big brother than facebook but with power, 99% of the time comes these unethical minds who want to take over the world. If you can fall in the 1% who can have power and remain true to your users, people will follow you in a true sense.

Architectural advice on connecting multiple diverse sites into a single community

I've been given a task to connect multiple sites of the same client into a single network. So i would like to hear an architectural advice on connecting these sites into a single community.
These sites include:
1. Invision Power Board Forum (the most important site)
2. 3 custom made cms-s (changes to code allowable)
3. 1 drupal site
4. 3-4 wordpress blogs
Requirements are as follows:
1. Connecting all users of all sites into a single administrable entity. With permissions changing ability, users banning etc.
2. Later on, based on this implementation I have to implement "facebook like" chat, which will be available to all users regardless of place of login.
I have few ideas on my mind on how to go with this, but would like to hear some people with more experience and expertize than my self.
Cheers!
You're going to have one hell of a time. Each of those site platforms has a very disparate user architecture: there is no way to "connect" them all together fluidly without numerous codebase changes. You're looking at making deep changes to each of those platforms to communicate with a central database, likely modifying thousands (if not tens of thousands) of lines of code.
On top of the obvious (massive) changes to all of the platforms, you're going to have to worry about updates: what happens when a new version of Wordpress is released? You'd likely have to update all of your code manually (since you can't just drop in the changes). You'd also have to make sure that all of the code changes are compatible with your current database. God forbid one of the platforms starts storing user information differently---you'd have to make more massive code changes. This just isn't maintainable.
Your alternative (and best bet) is to have some sort of synchronization job that runs every hour or so: iterate through each user in each database and compare it to see if it both exists and is up-to-date in the other databases. If not, push the changes out. The problem with this is that it will get significantly slower as you get more and more users.
Perhaps another alternative is to simply offer a custom OpenID implementation. I believe that Drupal and Wordpress both have OpenID plugins that you can take advantage of. This way, you could allow your users to sign in with a pseudo-single sign-on service across your sites. The downside is that users could opt not to use it.
Good luck

Categories