I did some Google searches and can't seem to find what I want. I'm designing my web site to use MySQL and PHP web servers. So far I have planned multiple web servers behind load balancers and a MySQL Cluster for scaling. But then I get to images/videos/MP3s. I need a file system that multiple servers can read from and write to, so that one server can initially run MySQL, the networked file system, and the web server, but as the site grows it can be split across multiple servers. Does anyone have any examples, tutorials, or resources to help me with this? The site runs on Ubuntu servers. My original idea was to just store the images in MySQL (I know how to do that and have working examples) so all servers could read/write, but others told me that's a bad idea and that I should use a file system instead. I don't want to use the local one, though, as I don't think it can scale for large sites.
Three systems come to mind: MogileFS, MongoDB GridFS, and a cloud-based storage solution.
MogileFS ("OMG Files!") was developed for LiveJournal and stores metadata in MySQL. It uses that metadata to find the disk holding the appropriate file and streams it out.
MongoDB GridFS is a lot newer, and probably easier to get going, certainly for a smaller system. It uses the 'NoSQL' database MongoDB to store chunks of files across its collections, reassembling them as required. A quick search will turn up plenty of information.
Finally, you could simply avoid the whole issue and upload images into Amazon's S3 or Rackspace Cloud Files. I've done the latter before (though the site was already running inside Rackspace's system), and it's not very difficult; again, there are plenty of examples around.
For S3 there is also a command-line tool, s3cmd, which can be set to sync (or, better, upload and then delete) a directory full of files into an S3 'bucket'.
First, storing images and other large files directly in MySQL is problematic because of the maximum size limitation.
To quote this answer to "Choosing data type for MySQL?":
MySQL is incapable of working with any data that is larger than max_allowed_packet (default: 1M) in size, unless you construct complicated and memory-intensive workarounds at the server side. This further restricts what can be done with TEXT/BLOB-like types, and generally makes the LONGTEXT/LONGBLOB types useless in a default configuration.
As for storage and upgrade compatibility: why not just store the files on a NAS or RAID system that you can keep tacking drives onto, and in your DB store only the path to each file? That is much less DB-intensive and allows for decent scalability.
I am in the process of developing an application (in Go or possibly in PHP) where users needs to upload photos and images.
I have setup up a couple of ZFS (mirror) storage servers on different locations, but I am in doubt about how to let users best upload files. ZFS handles quotas and reservation.
I am running with a replicated Galera database on all servers, both for safety, but also for easy access to user accounts from each server. In other words, each server has a local copy of the database all the time. All users are virtual users only.
So far I have tested the following setup options:
Solution 1
Running SFTP (ProFTPD with a module) or FTPS (PureFTPd with TLS) on the storage servers with virtual users.
This gives people direct access to the storage servers using a client like Filezilla. At the same time users can also upload using our web GUI from our main web server.
One advantage of this setup is that the FTP server can handle virtual users. Our web application will also send files via SFTP or FTPS.
One disadvantage is that FTP is meh: annoying to firewall. Also, I much prefer FTP over SSH (SFTP) to FTP over TLS (FTPS). However, only ProFTPD has a module for SSH, and it has been a real pain to work with compared to PureFTPd (many problems with non-working configuration options and file-permission errors), but PureFTPd only supports TLS.
Running with real SSH/SCP accounts and using PAM is not an option.
Solution 2
Mount the storage servers locally on the web server using NFS or CIFS (Samba is great at automatically resuming in case a box goes down).
In this setup users can only upload via our main web server. The web server application, and the application running on the storage servers, then needs to support resumable uploads. I have been looking into using the tus protocol.
A disadvantage of both setups above is that storage capacity needs to be managed somehow. When storage server 1 reaches its maximum number of users, the application needs to know this and only create virtual users on storage server 2, 3, etc.
I have calculated how many users each storage server can hold and then have the web application check the database with virtual users to see when it needs to move newly created users to the next storage server.
This is rather old school, but it works.
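That "fill server 1, then move new users to server 2" check could look like the following Go sketch. The hostnames, capacities, and the `assignServer` helper are invented; in the real setup the user counts would come from the replicated Galera database:

```go
package main

import (
	"errors"
	"fmt"
)

// storageServer mirrors what the virtual-user table would tell us:
// how many users a box currently holds and how many it can take.
type storageServer struct {
	Host     string
	Users    int
	Capacity int
}

// assignServer returns the first server that still has room, matching
// the "fill server 1, then 2, then 3" scheme described above.
func assignServer(servers []storageServer) (string, error) {
	for _, s := range servers {
		if s.Users < s.Capacity {
			return s.Host, nil
		}
	}
	return "", errors.New("all storage servers are full")
}

func main() {
	servers := []storageServer{
		{Host: "s1.example.com", Users: 5000, Capacity: 5000}, // full
		{Host: "s2.example.com", Users: 1200, Capacity: 5000},
	}
	host, err := assignServer(servers)
	if err != nil {
		panic(err)
	}
	fmt.Println(host) // s2.example.com
}
```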
Solution 3
Same as solution 2 (no FTP), but clone our web application's upload component to each storage server and then redirect users (or provide them with a direct link to their storage server: s1.example.com, s2.example.com, etc.).
The possible advantage of this setup is that users upload directly to the storage server they have been assigned to rather than going through our main web server (preventing it from becoming a bottleneck).
Solution 4
Use GlusterFS on the storage servers and build a cluster that can easily be expanded. I have tested out GlusterFS and it works very well for this purpose.
The advantage with this setup is that I don't really need to care about where files physically go on which storage servers, and I can easily expand storage by adding more servers to the cluster.
However, the disadvantage here is again that our main web server might become a bottleneck.
I have also considered adding a load balancer and then using multiple web servers in case our main web server becomes a bottleneck for uploading files.
In any case, I much prefer to keep it simple! I don't like adding stuff. I want it to be easy to maintain in the long run.
Any ideas, suggestions, and advice will be greatly appreciated.
How do you do it?
A web application should be agnostic about the underlying storage when we are talking about file storage; separation of concerns.
(S)FTP(S), on the other hand, is not a storage method but a communication protocol. It does not preclude you from having shared storage. See above.
ZFS does not come with shared-storage capability built in, so you are basically down to the following choices:
Which underlying filesystem?
Do I want to offer an additional access mode via (S)FTP(S)?
How do I make my filesystem available across multiple servers? GlusterFS, CIFS or NFS?
So, let us walk this through.
Filesystem
I know ZFS is intriguing, but here is the thing: xfs, for example, already has a maximum filesystem size of 8 exbibytes minus one byte. The specialist term for this is "a s...load". To give you a sense of scale: the Library of Congress holds about 20 TB of digital media, and it would fit into that roughly 400k times. Even good ol' ext4 can hold 50k LoCs. And if you hold that much data, your FS is your smallest concern; building the next couple of power plants to keep your stuff running presumably is.
Gist: Nice to think about, but use whatever you feel comfortable with. I personally use xfs (on LVM) for pretty much everything.
Additional access methods
Sure, why not? Aside from the security nightmare (privilege escalation, anyone?). And ProFTPD, with its built-in coffee machine and kitchen sink, is the last FTP server I would use for anything. It has a ginormous code base, which lends itself to accidentally introducing vulnerabilities.
Basically it boils down to the skills present in the project. Can you properly harden a system and an FTP server and monitor it for security incidents? Unless your answer is a confident "Yes, of course, we have plenty of experience with that!", you should minimize the attack surface you present.
Gist: Don't, unless you really know what you are doing. And if you have to ask, you probably do not. No offense intended, just stating facts.
Shared filesystem
Personally, I have had... less than perfect experiences with GlusterFS. Its replication has quite demanding requirements when it comes to network latency. In a nutshell: if we are talking about multiple availability zones, say EMEA, APAC, and NCSA, it is close to impossible. You'd be stuck with geo-replication, which is less than ideal for the use case you describe.
NFS and CIFS, on the other hand, have the problem that there is no replication at all, and all clients need to access the same server instance to reach the data; hardly a good idea if you think you need an underlying ZFS to get along.
Gist: Shared filesystems at a global scale with halfway decent replication lag and access times are very hard to do and can get very expensive.
Haha, Smartypants, so what would you suggest?
Scale. Slowly. In the beginning, you should be able to get along with a simple FS based repository for your files. And then check various other means for large scale shared storage and migrate to it.
Taking the turn towards implementation, I would even go a step further: you should make your storage an interface:
// Storer takes the source and stores its contents under path for further
// reading via Retriever.
type Storer interface {
	StreamTo(path string, source io.Reader) (err error)
}

// Retriever takes a path and streams the file it has stored under path to w.
type Retriever interface {
	StreamFrom(path string, w io.Writer) (err error)
}

// Repository is a composite interface. It requires a
// repository to accept and provide streams of files.
type Repository interface {
	Storer
	Retriever
	Close() error
}
Now, you can implement various storage methods quite easily:
// FileStore represents a filesystem based file Repository.
type FileStore struct {
	basepath string
}

// StreamFrom satisfies the Retriever interface.
func (s *FileStore) StreamFrom(path string, w io.Writer) (err error) {
	f, err := os.Open(filepath.Join(s.basepath, path))
	if err != nil {
		return handleErr(path, err)
	}
	defer f.Close()
	_, err = io.Copy(w, f)
	return err
}
Personally, I think this would be a great use case for GridFS, which, despite its name, is not a filesystem but a feature of MongoDB. As for the reasons:
MongoDB comes with a concept called replica sets to ensure availability with transparent automatic failover between servers
It comes with a rather simple mechanism of automatic data partitioning, called a sharded cluster
It comes with an indefinite number of access gateways, called mongos query routers, to access your sharded data.
For the client, all of this is transparent aside from the connection URL. So it makes almost no difference (aside from read preference and write concern) whether its storage backend consists of a single server or a globally replicated sharded cluster with 600 nodes.
If done properly, there is not a single point of failure; you can replicate across availability zones while keeping the "hot" data close to the respective users.
I have created a repository on GitHub which contains an example of the interface suggestion and implements a filesystem based repository as well as a MongoDB repository. You might want to have a look at it. It lacks caching at the moment. In case you would like to see that implemented, please open an issue there.
I'm planning a system for serving image files from a server cluster with load balancing. I'm wrestling with the architecture and whether to save the actual image files as BLOBs in the database or in the filesystem.
My problem is that a database connection is required anyway, as users need to be authenticated. Different users have access only to content from their friends and items they uploaded themselves. Since the connection is required anyway, maybe the images could be retrieved from there as well?
Images should be stored with no single point of failure. And obviously, the system should be fast.
For database approach:
The database is separate from the rest of my application, so my application's main database won't get bloated by all the images. The database would be easy to scale, as I would just need to add more servers to the cluster. The problem is that I've heard this might be a slow system for a website with millions, even billions, of photos.
For filesystem:
I would be really interested in knowing how one could design a system where the web servers are load-balanced and none of them is too important for the overall system. All the servers should use common storage so they can access the same files in the cluster.
What do you think? Which is the best solution in this case?
What kind of overall architecture and servers would you recommend for an image-serving cluster? Note: this cluster only serves images. Application servers are a whole different story.
I definitely wouldn't store them in the database. If you need to use PHP for authentication, then do that as quickly as possible and use X-SendFile to hand over the actual image serving to your web server.
For the filesystem it sounds like MogileFS would be a good fit.
For the web server I'd suggest nginx. If you can adapt your authentication mechanism to use one of the existing modules, or write your own module for it, you could omit PHP completely (there's already a MogileFS client module).
I am currently working on configuring my CakePHP (1.3) based web app to run in an HA setup. I have four web boxes running the app itself and a MySQL cluster for the database backend. Users upload 12,000-24,000 images a week (35-70 GB). The app then generates two additional files from each original: a thumbnail and a medium-size image for preview. This means a total of 36,000-72,000 files potentially added to the repositories each week.
What I am trying to wrap my head around is how to handle the large number of static file requests coming from users trying to view these images. I mean, I could have multiple web boxes serving only static files, with a load balancer dispatching the requests.
But does anyone on here have any ideas on how to keep all static file servers in sync?
If any of you have any experiences you would like to share, or any useful links for me, it would be very appreciated.
Thanks,
serialk
It's quite a thorny problem.
Technically you can get a highly available shared directory through something like NFS (or SMB if you like), using DRBD and Linux-HA for an active/passive setup. Such a setup will have good availability against single-server loss; however, it is quite wasteful and not easy to scale: you'd have to have the app itself decide which server(s) to go to, configure NFS mounts, etc., and it all gets rather complicated.
So I'd probably push for avoiding keeping the images in a filesystem at all, or at least not the conventional kind. I am assuming that you need this to be flexible enough to add more storage in the future; if you can keep the storage and I/O requirements constant, DRBD with HA NFS is probably a good system.
For storing files in a flexible "cloud", consider either:
Tahoe-LAFS
or perhaps, at a push, Cassandra, which would require a bit more integration but may be better in some ways.
MySQL Cluster is not great for big BLOBs, as it (mostly) keeps the data in RAM; also, the high consistency it provides requires a lot of locking, which makes updates scale (relatively) badly at high workloads.
But you could still consider putting the images in MySQL Cluster anyway, particularly as you have already set it up; it would require no extra operational overhead.
I have a file-host website that's burning through 2 Gbit of bandwidth, so I need to start adding secondary media servers to store the files. What would be the best way to manage a multiple-server setup with a large number of files? Preferably through PHP only.
Currently, I only have around 100 GB of files, so I could get a second server, mirror all content between them, and then round-robin the traffic 50/50, 33/33/33, etc. But once the total amount of files grows beyond the capacity of a single server, this won't work.
The idea I had was to have a list of media servers stored in the DB along with the amount of free space left on each. Once a file is uploaded, PHP chooses which server the file actually goes to, spreading the files evenly among the servers.
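That server-selection step can be sketched as a small function. The example below is in Go rather than PHP purely for illustration, and the hostnames and sizes are invented; the slice of servers stands in for the DB table of free-space figures:

```go
package main

import (
	"errors"
	"fmt"
)

// mediaServer mirrors a row in the proposed DB table of media servers.
type mediaServer struct {
	Host string
	Free int64 // bytes of free space left
}

// pickServer implements "most free space first": the upload goes to the
// server with the largest remaining capacity that still fits the file,
// which naturally spreads files evenly over time.
func pickServer(servers []mediaServer, size int64) (string, error) {
	best := -1
	for i, s := range servers {
		if s.Free >= size && (best == -1 || s.Free > servers[best].Free) {
			best = i
		}
	}
	if best == -1 {
		return "", errors.New("no server has enough space")
	}
	return servers[best].Host, nil
}

func main() {
	servers := []mediaServer{
		{Host: "m1.example.com", Free: 10 << 30},  // 10 GB left
		{Host: "m2.example.com", Free: 250 << 30}, // 250 GB left
		{Host: "m3.example.com", Free: 80 << 30},  // 80 GB left
	}
	host, err := pickServer(servers, 1<<30) // 1 GB upload
	if err != nil {
		panic(err)
	}
	fmt.Println(host) // m2.example.com
}
```

After the upload succeeds, the chosen server's free-space row would be decremented in the same transaction that records the file's location.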
Was hoping to get some more input/inspiration.
I can't use any third-party services like Amazon. The files range from a few bytes to a gigabyte.
Thanks
You could try MogileFS. It is a distributed file system with a good API for PHP. You can create categories and upload a file to a category, and for each category you can define how many servers it should be replicated across. You can use the API to get a URL to that file on a random node.
If you are doing as much data transfer as you say, whatever you are doing is apparently growing quite rapidly.
It might be worth your while to contact your hosting provider and see if they offer any sort of shared-storage solution via iSCSI, NAS, or other means. Ideally the storage would not only start out large enough to store everything you have, but would also be able to grow dynamically beyond your needs. I know my hosting provider offers a solution like this.
If they do not, you might consider colocating your servers somewhere that either offers a service like that or would allow you to install your own storage server (which could be built cheaply from off-the-shelf components and software like FreeNAS or Openfiler).
Once you have a centralized storage platform, you can add web servers to your heart's content and load-balance them based on load, all while accessing the same central storage repository.
Not only is this the correct way to do it, it also offers you much more redundancy and expandability in the future if your endeavor continues to grow at its current pace.
The other solutions offered, using a database to track what is stored where, would work, but they not only add an extra layer of complexity but also an extra layer of processing between your visitors and the data they wish to access.
What if you lost a hard disk? Would you lose a third or half of all your data?
Should the heavy I/O of static content be on the same spindles as the rest of your operating system and application data?
Your best bet is really to get your files into some sort of storage that scales. Storing files locally should only be done with good reason (e.g., they are sensitive or private).
Your best bet is to move your content into the cloud. Mosso's Cloud Files or Amazon's S3 will both allow you to store an almost infinite number of files. All your content is then accessible through an API. If you want, you can then use MySQL to track metadata for easy searching, and let the service handle the actual storage of the files.
I think your own idea is not the worst one. Get a bunch of servers, and for every file store which server(s) it's on. If new files are uploaded, use most-free-space first*. Every server handles its own delivery (instead of piping through the main server).
pros:
use multiple servers for a single file, e.g. for cutekitten.jpg: filepath="server1\cutekitten.jpg;server2\cutekitten.jpg", and then choose the server depending on the server load (or randomly, or alternating, ...)
if you're careful you may be able to move files around automatically depending on the current load. So if your cute-kitten image gets reddited/slashdotted hard, move it to the server with the lowest load and update the entry.
you could do this with a cron job: just log the downloads for the last xx minutes, try some formula like downloads-per-minute * filesize * (product of server loads) for weighting, and pick thresholds for increasing/decreasing the number of servers those files are distributed to.
if you add a new server, it's relatively painless (just add the address to the server pool)
cons:
homebrew solutions are always risky
your load-distribution algorithm must be well tested, otherwise bad things could happen (e.g., everything mirrored everywhere)
constantly moving files around for balancing adds additional server load
* or use a mixed weighting algorithm: free space, server load, file popularity
disclaimer: never been in the situation myself, just guessing.
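Reading the suggested weighting as downloads-per-minute * filesize * (product of server loads), the cron-job logic might be sketched as below. The thresholds and units are invented and would need tuning against real traffic, as the answer itself cautions:

```go
package main

import "fmt"

// popularity implements the suggested weighting: files that are
// downloaded often, are large, and sit on busy servers score highest
// and are candidates for an extra mirror.
func popularity(downloadsPerMin, fileSizeMB float64, serverLoads []float64) float64 {
	loadProduct := 1.0
	for _, l := range serverLoads {
		loadProduct *= l
	}
	return downloadsPerMin * fileSizeMB * loadProduct
}

// replicasFor picks a replica count from fixed thresholds; the numbers
// are made up for illustration.
func replicasFor(score float64) int {
	switch {
	case score > 1000:
		return 3
	case score > 100:
		return 2
	default:
		return 1
	}
}

func main() {
	// A 2.5 MB kitten image downloaded 120 times/min from two busy servers.
	score := popularity(120, 2.5, []float64{0.9, 0.8})
	fmt.Println(replicasFor(score)) // 2
}
```

A cron job would run this over the download log every few minutes and queue copy/delete operations whenever the target replica count changes.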
Consider HDFS, which is part of Apache Hadoop. It will integrate with PHP, but you'll be setting up a second application. It also solves your concerns about balancing among servers and handling things when your file-space usage exceeds one server's capacity. It's not purely PHP, though, but I don't think that's what you meant by "pure" anyway.
See http://hadoop.apache.org/core/docs/current/hdfs_design.html for the idea of it. They cover the whole idea of how it handles large files, many files, replication, etc.
I have a simple question and wish to hear others' experiences regarding the best way to replicate images across multiple hosts.
I have determined that storing images in the database and then using database replication over multiple hosts would result in maximum availability.
The worry I have with the filesystem is the difficulty of synchronising the images (e.g. I don't want five servers all hitting the same server for images!).
Now, the only concerns I have with storing images in the database are the extra queries hitting the database and the extra handling I'd have to put in place in Apache if I wanted 'virtual' image links to point to database entries (e.g. AddHandler).
As far as my understanding goes:
If you have a script serving up the images: each image would require a database call.
If you display the images inline as binary data: this could be done in a single database call.
To provide external/linkable images: you would have to add an AddHandler for the extension you wish to 'fake' and point it to your scripting language (e.g. PHP, ASP).
I might have missed something, but I'm curious if anyone has any better ideas?
Edit:
Tom has suggested using mod_rewrite to avoid using an AddHandler. I have accepted that as a proposed solution to the AddHandler issue; however, I don't yet feel like I have a complete solution, so please keep answering ;)
A few have suggested using lighttpd over Apache. How different are the modules for lighttpd?
If you store images in the database, you take an extra database hit plus you lose the innate caching/file serving optimizations in your web server. Apache will serve a static image much faster than PHP can manage it.
In our large app environments, we use up to four clusters:
App server cluster
Web service/data service cluster
Static resource (image, documents, multi-media) cluster
Database cluster
You'd be surprised how much traffic a static-resource server can handle. Since it's not really computing anything (no app logic), a response can be optimized like crazy. If you go with a separate static-resource cluster, you also leave yourself open to changing just that portion of your architecture. For instance, in some benchmarks lighttpd is even faster at serving static resources than Apache. If you have a separate cluster, you can change your HTTP server there without changing anything else in your app environment.
I'd start with a two-machine static-resource cluster and see how it performs. That's another benefit of separating functions: you can scale out only where you need it. As for synchronizing files, take a look at existing file-synchronization tools before rolling your own. You may find something that does what you need without having to write a line of code.
Serving the images from wherever you decide to store them is a trivial problem; I won't discuss how to solve it.
Deciding where to store them is the real decision you need to make. You need to think about what your goals are:
Redundancy of hardware
Lots of cheap storage
Read-scaling
Write-scaling
The last two are not the same and will definitely cause problems.
If you are confident that the size of this image library will not exceed the disk you're happy to put in your web servers (say, 200 GB at the time of writing, the largest high-speed server-grade disk available; I assume you want to use 1U web servers, so you won't be able to store more than that in RAID 1, depending on your vendor), then you can get very good read-scaling by placing a copy of all the images on every web server.
Of course you might want to keep a master copy somewhere too, and have a daemon or process which syncs them from time to time, and have monitoring to check that they remain in sync and this daemon works, but these are details. Keeping a copy on every web server will make read-scaling pretty much perfect.
But keeping a copy everywhere will ruin write-scalability, as every single web server will have to write every changed / new file. Therefore your total write throughput will be limited to the slowest single web server in the cluster.
"Sharding" your image data between many servers will give good read/write scalability, but is a nontrivial exercise. It may also allow you to use cheap(ish) storage.
Having a single central server (or active/passive pair or something) with expensive IO hardware will give better write-throughput than using "cheap" IO hardware everywhere, but you'll then be limited by read-scalability.
Having your images in a database doesn't necessarily mean a database call for each one; you could cache them separately on each host (e.g. in temporary files) when they are retrieved. The source images would still be in the database and easy to synchronise across servers.
You also don't really need to add Apache handlers to serve an image through a PHP script while maintaining nice URLs: you can make URLs like http://server/image.php/param1/param2/param3.JPG and read the parameters through $_SERVER['PATH_INFO']. You could also remove the 'image.php' portion of the URL (if you needed to) using mod_rewrite.
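The PATH_INFO trick is not PHP-specific; the same URL shape can be handled by splitting the request path. As a hedged sketch in Go (the helper name and parameter layout are invented):

```go
package main

import (
	"fmt"
	"strings"
)

// pathParams extracts the segments after the script name, mimicking
// what $_SERVER['PATH_INFO'] yields for /image.php/param1/param2/x.JPG.
func pathParams(requestPath, script string) []string {
	rest := strings.TrimPrefix(requestPath, script)
	return strings.Split(strings.Trim(rest, "/"), "/")
}

func main() {
	params := pathParams("/image.php/users/42/kitten.JPG", "/image.php")
	fmt.Println(params) // [users 42 kitten.JPG]
}
```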
What you are looking for already exists and is called MogileFS.
The target setup involves mogilefsd, replicated MySQL databases, and lighttpd/Perlbal for serving files. It will give you failover and fine-grained file replication (for example, you can decide to duplicate end-user images on several physical devices but keep only one physical instance of thumbnails). Load balancing can also be achieved quite easily.