Multiple file servers for one website's content


I'm planning to use multiple file servers to host my website's uploaded files. What's the best way to do it? Should I install a web server on the other machines as well, or is there special software for routing files on the network? What would you pros do?
Thanks,
Taher.

Here's one way you could do it...
Create a central routing handler specifically for grabbing files off the network, and give each of your file servers its own subdomain.
When a user clicks on the download link, e.g.
www.example.com/GetDownload.php?id=10
...the GetDownload.php page would look in the database to see where the file has been stored (assuming you're keeping track of file locations in the database; otherwise use whatever convention you have for tracking uploads) and determine the location of the file on your network. It could then simply redirect to the appropriate server and download folder. So GetDownload.php?id=10, upon finding the location of the file, would redirect to the appropriate server/URL:
AFile.doc is on FileServerB, redirect...
FileServerA.Example.com
Here! --> FileServerB.Example.com/A/AFile.doc
FileServerC.Example.com
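As a rough illustration of the redirect handler described above, here is a minimal sketch. The lookup table stands in for the database query, and the file ID, host name, path, and the function name resolveDownloadUrl are all assumptions made for the example.

```php
<?php
// Hypothetical lookup table standing in for the database query.
// The ID, host name and path are illustrative assumptions.
const FILE_LOCATIONS = [
    10 => ['host' => 'FileServerB.Example.com', 'path' => '/A/AFile.doc'],
];

/**
 * Returns the absolute URL of the file on its file server, or null if the ID is unknown.
 */
function resolveDownloadUrl(int $id, array $locations = FILE_LOCATIONS): ?string
{
    if (!isset($locations[$id])) {
        return null;
    }
    $loc = $locations[$id];
    // Encode each path segment so spaces and special characters survive the redirect.
    $segments = array_map('rawurlencode', explode('/', ltrim($loc['path'], '/')));
    return 'http://' . $loc['host'] . '/' . implode('/', $segments);
}

// What GetDownload.php might do when handling a web request.
if (PHP_SAPI !== 'cli' && isset($_GET['id'])) {
    $url = resolveDownloadUrl((int) $_GET['id']);
    if ($url === null) {
        http_response_code(404);
        exit('File not found');
    }
    header('Location: ' . $url, true, 302);
    exit;
}
```

Note that a plain redirect like this exposes the file server URL to the user, so it only suits files that do not need access control.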

You can also set up GlusterFS and mount the Gluster volume on the web server. With a replicated volume you also get a fault-tolerant system.

Related

how to send header:location to out of web root file

I'm making a web application which will only allow registered members to download zip folders from a folders directory.
I really need to know the proper way to secure the folder. Only members stored in my database should be able to access the files, but as things stand, if somebody finds the directory and a file name there's nothing to stop them accessing it.
I've been doing some research and found some approaches but they all have major drawbacks.
1.) put the files outside of the webroot then use readfile to send them the data.
This is how I have it set up currently. The major drawback is that I'm on a shared server where the max execution time for a script is 30 seconds (it can't be changed), so if the file is big or the user's connection is slow, the timeout hits before the download is complete.
2.) htaccess and htpasswd inside a webroot directory.
The problem with this is that I don't want to ask the user to enter a password again, unless there's a way to have PHP send the password and then send a header pointing to the actual zip file to be downloaded.
3.) Keeping the files in webroot but obfuscating the file names so they are hard to guess.
This is just totally lame!
What I really would like to do is keep the files outside of web root then just send a header:location to that document to force a download, obviously as it's not in web root so the browser won't see it. Is there a way around this? Is there a way to redirect to an out-of-web-root file with header:location('/file') to force a download, thus allowing Apache to serve the file rather than PHP with readfile?
Is there some easier way to secure the folders and serve with apache that I am just not coming across? Has anybody experienced this problem before and is there an industry standard way to do this better?
I know this may resemble a repeat question but none of the answers in the other similar question gave any useful information for my needs.

What I really would like to do is keep the files outside of web root then just send a header:location to that document to force a download, obviously as it's not in web root so the browser won't see it.
More to the point, it is outside the web root so it doesn't have a URL that the server can send in the Location header.
is there a way around this. Is there a way to redirect to an out of web root file with header:location('/file') to force a download.
No. Preventing the server from simply handing over the file is the point of putting it outside the web root. If you could redirect to it, then you would just be back in "hard to guess file name" territory with the added security flaw of every file on the server being public over HTTP.
Is there some easier way to secure the folders and serve with apache that I am just not coming across.
Your options (some of which you've expressed already in the form of specific implementations) are:
Use hard to guess URLs
Put the file somewhere that Apache won't serve it by default and write code that will serve it for you
Use Apache's own password protection options
There aren't any other approaches.
Is there some easier way to secure the folders and serve with apache that I am just not coming across.
No, there isn't an easier way (but that said, all three implementations you've described are "very easy").
Another approach, which I consider really dirty but might get around your resource constraints:
Keep the files outside the web root
Configure Apache to follow symlinks
On demand: Create a symlink from under the web root to the file you want to serve
Redirect to the URI of that symlink
Have a cron job running every 5 minutes to delete old symlinks (put a timestamp in the symlink filename to help with this)
It's effectively a combination of the first two options in my previously bulleted list.
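A minimal sketch of the symlink approach above, assuming a Linux host where PHP may create symlinks and Apache is allowed to follow them. The function names, the naming scheme (timestamp, random token, original file name) and the 300-second lifetime are illustrative choices, not anything prescribed by Apache.

```php
<?php
/**
 * Creates a time-stamped symlink to $target inside $publicDir and returns its name.
 */
function createDownloadLink(string $target, string $publicDir, ?int $now = null): string
{
    $now = $now ?? time();
    // Timestamp first (for cleanup), then a random token so the name is hard to guess.
    $name = $now . '_' . bin2hex(random_bytes(8)) . '_' . basename($target);
    if (!symlink($target, $publicDir . '/' . $name)) {
        throw new RuntimeException('Could not create symlink');
    }
    return $name;
}

/**
 * Deletes symlinks in $publicDir older than $maxAge seconds.
 * Returns how many were removed. Intended to be run from cron.
 */
function purgeOldLinks(string $publicDir, int $maxAge = 300, ?int $now = null): int
{
    $now = $now ?? time();
    $removed = 0;
    foreach (scandir($publicDir) as $entry) {
        $path = $publicDir . '/' . $entry;
        // Only touch symlinks whose name starts with a timestamp.
        if (is_link($path) && preg_match('/^(\d+)_/', $entry, $m)
            && $now - (int) $m[1] > $maxAge) {
            unlink($path); // removes the link only, never the target file
            $removed++;
        }
    }
    return $removed;
}
```

The download script would call createDownloadLink() after checking the login and then redirect to the returned name under the public directory; the link stays usable by anyone who learns the URL until the cleanup runs.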

Can I share the /uploads/ folder between multiple WordPress servers?

I have a WordPress blog which is replicated, i.e. 2 servers behind a load balancer serve the same WordPress blog. I pointed both servers to the same database, so I have no problem there. However, when a user is forwarded (by the load balancer) to server-1 and uploads files, they are kept on server-1. The same goes for server-2. Those files are not shared between the 2 servers, so a user who is forwarded to server-2 will not see the files (e.g. images) that were uploaded to server-1.
I read that the upload folder can be changed but "This path can not be absolute. It is always relative to ABSPATH".
What are the best practices to share the upload folder between servers?

Options:
Set something up to replicate files between servers, e.g. rsync in a cron job.
Mount a network share to the uploads folder on both servers.

You are already load balancing, so why not get rid of some of the HTTP load?
Move the uploads to something like S3.
Here is one plugin for it: http://wordpress.org/plugins/wp2cloud-wordpress-to-cloud/
Moving the rest of your static files, e.g. theme and plugin files, would also be good for the server load.

Inter-network File Transfers using PHP with polling

I am designing a web-based file-management system that can be conceptualised as 3 different servers:
The server that hosts the system interface (built in PHP) where users 'upload' and manage files (no actual files are stored here, it's all meta).
A separate staging server where files are placed to be worked on.
A file-store where the files are stored when they are not being worked on.
All 3 servers will be *nix-based on the same internal network. Users, based in Windows, will use a web interface to create an initial entry for a file on Server 1. This file will be 'uploaded' to Server 3 either from the user's local drive (if the file doesn't currently exist anywhere on the network) or another network drive on the internal network.
My question relates to the best programmatic approach to achieve what I want to do, namely:
When a user uploads a file (selecting the source via a web form) from the network, the file is transferred to Server 3 as an inter-network transfer, rather than passing through the user (which I believe is what would happen if it was sent as a standard HTTP form upload). I know I could set up FTP servers on each machine and attempt to FXP files between locations, but is this preferable to PHP executing a command on Server 1 (which will have global network access), to perform a cross-network transfer that way?
The second problem is that these are very large files we're talking about, at least a gigabyte or two each, and so transfers will not be instant. I need some method of polling the status of the transfer, and returning this to the web interface so that the user knows what is going on.
Alternatively this upload could be left to run asynchronously to the user's current view, but I would still need a method to check the status of the transfer to ensure it completes.
So, if using an FXP solution, how could polling be achieved? If using a file move/copy command from the shell, is any form of polling possible? PHP/JQuery solutions would be very acceptable.
My final part to this question relates to Windows network drive mapping. A user may map, and select a file from, an arbitrarily specified mapped drive. Their G:\ may relate to \\server4\some\location\therein, but presumably any path given to the server via a web form will only contain the G:\ file path. Is there a way to determine the 'real path' of mapped network drives?
Any solution would be used to stage files from Server 3 to Server 2 when the files are being worked on - the emphasis being on these giant files not having to pass through the user's local machine first.
Please let me know if you have comments and I will try to make this question more coherent if it is unclear.

As far as I’m aware (and I could be wrong) there is no standard way to determine the UNC path of a mapped drive from a browser.
The only way to do this would be to have some kind of control within the web page. It could be ActiveX or maybe Flash. I’ve seen ActiveX doing this, but not Flash.
In the past when designing web based systems that need to know the UNC path of a user’s mapped drive I’ve had to have a translation of drive to UNC path stored server side. I did have a luxury though of knowing which drive would map to what UNC path. If the user can set arbitrary paths then this obviously won’t work.
Ok, as I’m procrastinating and avoiding real work I’ve given this some thought.
I’ll preface this by saying that I’m in no way a Linux expert and the system I’m about to describe has just been thought up off the top of my head and is not something you’d want to put into any kind of production. However, it might help you down the right path.
So, you have 3 servers: the Interface Server (LAMP stack, I’m assuming?), your Staging Server and your File Store Server. You will also have Client Machines and Network Shares. For the purpose of this design, your Network Shares are hosted on *nix boxes that your File Store can scp from.
You’d create your frontend website that tracks and stores information about files etc. This will also hold the details about which files are being copied, which are in Staging and so on.
You’ll also need some kind of Service running on the File Store Server. I’ll call this the File Copy Service. This will be responsible for copying the files from your servers hosting the network shares.
Now, you’ve still got an issue with how you figure out what path the users file is actually on. If you can stop users from mapping their own drives and force them to use consistent drive letters then you could keep a translation of drive letter to UNC path on the server. If you can’t, well I’ll let you figure that out. If you’re in a windows domain you can force the drive mappings using Group Policies.
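A minimal sketch of the server-side drive-letter translation mentioned above, assuming drive mappings are consistent for all users (for example enforced through Group Policy). The function name toUncPath and the mapping itself are made up for illustration.

```php
<?php
/**
 * Translates a Windows path on a mapped drive to its UNC path using a fixed map.
 * Returns null when the path has no drive letter or the letter is not in the map.
 */
function toUncPath(string $windowsPath, array $driveMap): ?string
{
    // Match a drive letter, a colon, and an optional remainder starting with a backslash.
    if (!preg_match('/^([A-Za-z]):(\\\\.*)?$/', $windowsPath, $m)) {
        return null;
    }
    $drive = strtoupper($m[1]);
    if (!isset($driveMap[$drive])) {
        return null;
    }
    return rtrim($driveMap[$drive], '\\') . ($m[2] ?? '');
}

// Hypothetical mapping: every user's G: drive points at the same share.
$driveMap = ['G' => '\\\\server4\\some\\location\\therein'];
echo toUncPath('G:\\reports\\big.dat', $driveMap), "\n";
```

If users can map arbitrary drives, the lookup simply returns null and the interface would have to ask for the UNC path directly.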
Anyway, the process for the system would work something like this.
User goes to system and selects a file
The Interface Server takes the file path and calls the File Copy Service on the File Store Server
The File Copy Service connects to the server that hosts the file and initiates the copy. If they’re all *nix boxes you could easily use something like SCP. Now, I haven’t actually looked up how to do it, but I’d be very surprised if you can’t get a running total of percentage complete from SCP as it’s copying. With this running total, the File Copy Service will be updating the database on the Interface Server with how the copy is doing, so the user can see this from the Interface Server.
The File Copy Service can also be used to move files from the File Store to the staging server.
As I said, this is very roughly thought out. The above would work, but it all depends a lot on how your systems are set up etc.
Having said all that though, there must be software that would do this out there. Have you looked?
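To make the polling idea a bit more concrete, here is a rough sketch of what the copy step inside such a File Copy Service could look like. It swaps in a plain chunked file copy (which would work when the source is reachable as a mounted path) instead of SCP, and writes progress to a JSON status file instead of the Interface Server's database. Both substitutions, and the function names, are assumptions made to keep the example self-contained.

```php
<?php
/**
 * Copies $source to $dest in chunks, calling $onProgress(bytesCopied, totalBytes)
 * after each chunk. Returns the number of bytes copied.
 * Note: on 32-bit PHP builds filesize() is unreliable for files over 2 GB.
 */
function copyWithProgress(string $source, string $dest, callable $onProgress, int $chunkSize = 1048576): int
{
    $total = filesize($source);
    $in = fopen($source, 'rb');
    $out = fopen($dest, 'wb');
    if ($in === false || $out === false) {
        throw new RuntimeException('Could not open source or destination');
    }
    $copied = 0;
    while (!feof($in)) {
        $chunk = fread($in, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break;
        }
        $written = fwrite($out, $chunk);
        if ($written === false) {
            throw new RuntimeException('Write failed');
        }
        $copied += $written;
        $onProgress($copied, (int) $total);
    }
    fclose($in);
    fclose($out);
    return $copied;
}

/**
 * Hypothetical progress sink: a small JSON file that a PHP endpoint can read
 * and return to the jQuery poller in the web interface.
 */
function writeStatus(string $statusFile, int $copied, int $total): void
{
    $percent = $total > 0 ? (int) floor($copied * 100 / $total) : 100;
    file_put_contents($statusFile, json_encode(
        ['copied' => $copied, 'total' => $total, 'percent' => $percent]
    ));
}
```

The service would run this from the command line or a queue worker, not inside a web request, so that a multi-gigabyte copy is not tied to the web server's execution time limit.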

If I understand correctly, this is the architecture:
[Image: architecture diagram]
1.)
First, let's solve the issue of "inter-server transfer".
I would solve this by mounting the file systems of Servers 2 and 3 on Server 1 via NFS.
https://help.ubuntu.com/8.04/serverguide/network-file-system.html
That way PHP can store files directly on the file system and doesn't need to know which server the files are really on.
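/etc/exports
on Servers 2 and 3 (no space between the client address and the options, otherwise the options apply to every host):
/directory/with/files 192.168.IPofServer.1(rw,sync)
Then reload the exports on each of them:
exportfs -ra
/etc/fstab
on Server 1 (remote export first, then a separate local mount point for each server):
192.168.IPofServer.2:/directory/with/files /var/lib/data/server2 nfs rsize=8192,wsize=8192,timeo=14,intr 0 0
192.168.IPofServer.3:/directory/with/files /var/lib/data/server3 nfs rsize=8192,wsize=8192,timeo=14,intr 0 0
Then mount them:
mount -a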
2.)
Get upload progress for really large files.
Here are some possibilities for showing a progress bar for HTTP uploads.
For a resume function, though, you would have to use a Flash plugin.
http://fineuploader.com/#demo
https://github.com/valums/file-uploader
Or you can build it yourself using the APC extension:
http://www.amwsites.com/blog/2011/01/use-a-combination-of-jquery-php-apc-uploadprogress-to-show-progress-bar-during-an-upload/
3.)
Let the server load files from a network drive.
I would try this with a Java applet that figures out the real network path and sends it to the server, so the server can fetch the file in the background.
But I have never done anything like this before and have no further information.

Allow logged in users to view and download files (some 250+ MB) that would normally be 403 access denied

I'm building a web server out of a spare computer in my house (with Ubuntu Server 11.04), with the goal of using it as a file sharing drive that can also be accessed over the internet. Obviously, I don't want just anyone being able to download some of these files, especially since some would be in the 250-750MB range (video files, archives, etc.). So I'd be implementing a user login system with PHP and MySQL.
I've done some research on here and other sites and I understand that a good method would be to store these files outside the public directory (e.g. /var/private vs. /var/www). Then, when the file is requested by a logged in user, the appropriate headers are given (likely application/octet-stream for automatic downloading), the buffer flushed, and the file is loaded via readfile.
However, while I imagine this would be a piece of cake for smaller files like documents, images, and music files, would this be feasible for the larger files I mentioned?
If there's an alternate method I missed, I'm all ears. I tried setting a folders permissions to 750 and similar, but I could still view the file through normal HTTP in my browser, as if I was considered part of the group (and when I set the permissions so I can't access the file, neither can PHP).
Crap, while I'm at it, any tips for allowing people to upload large files via PHP? Or would that have to be done via FTP?

You want the X-Sendfile header. It will instruct your web server to serve up a specific file from your file system.
Read about it here: Using X-Sendfile with Apache/PHP
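For illustration, a minimal sketch of how a PHP download script might emit that header. It assumes Apache with the third-party mod_xsendfile module installed and enabled (XSendFile On), and that the file lives under a directory allowed by XSendFilePath; the function name and the example path are made up.

```php
<?php
/**
 * Builds the response headers that hand the file off to the web server.
 * Assumes Apache with mod_xsendfile enabled and $path inside an allowed XSendFilePath.
 */
function xSendfileHeaders(string $path, string $downloadName): array
{
    // Strip characters that could break out of the quoted filename.
    $safeName = str_replace(['"', "\r", "\n", '\\'], '', $downloadName);
    return [
        'Content-Type: application/octet-stream',
        'Content-Disposition: attachment; filename="' . $safeName . '"',
        'X-Sendfile: ' . $path,
    ];
}

// Hypothetical usage, after the login check has passed:
// foreach (xSendfileHeaders('/var/private/video.avi', 'video.avi') as $h) {
//     header($h);
// }
// exit;
```

PHP finishes as soon as the headers are sent and Apache streams the file itself, so file size and PHP's execution time limit stop being a concern.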

That could indeed become an issue with large files.
Isn't it possible to just use FTP for this?
HTTP isn't really meant for large files but FTP is.

The solution you mentioned is the best possible when the account system is handled via PHP and MySQL. If you want to keep it away from PHP and let the server do the job, you can password-protect the directory via an .htaccess file. This way the files won't go through PHP, but honestly there's nothing you should be worried about. I recommend you go with your method.

php apache and temporary files

I have a web-based application which serves content to authenticated users by interacting with a SOAP server. The SOAP server has files which the users need to be able to download.
What is the best way to serve these files to users? When a user requests a file, my server will make a soap call to the soap server to pull the file and then it will serve it to the user via referencing the link to it.
The issue is that these temporary files need to be cleaned up at some point, and my first thought was, this being a Linux-based system, to store them in /tmp/ and let the system take care of cleanup.
Is it possible to store these files in /tmp and have Apache serve them to the user?
If Apache cannot access /tmp since it is outside of the web root, potentially I could create a symbolic link to /tmp/filename within the web root? (This would require cleanup of the symbolic links though at some point.)
Suggestions/comments appreciated on the best way to manage these temporary files?
I am aware that I could write a script and have it executed as a cron job at regular intervals, but was wondering if there was a way similar to the one presented above to do this and not have to handle deleting the files?

There's a good chance that Apache can read the tmp directory, but that approach smells bad. My approach would be to have PHP read the file and send it to the user. Basically, you send out the appropriate HTTP headers to indicate what type of content you're sending and what name to use for the file, and then you just spit out the file with echo (for example).
It looks like there's a good discussion of this in another question:
HTTP Headers for File Downloads
An additional benefit of this approach is that it leaves you in full control because there's PHP between a user and the file. This means you can add additional security measures (e.g., time-of-day controls), pull the file from various places to distribute bandwidth usage, and so on.
[additional material]
Sorry for not directly addressing your question. If you're using PHP to serve the files, they need not reside in the Apache web root, just where Apache/PHP has file-system read access to them. Thus, you can indeed simply store them in /tmp and let the OS clean them up for you. You might want to adjust the frequency of those clean-ups, however, to keep volume at the level you want.
If you want to ensure that access is reliably denied after a period of time or a certain number of downloads, you can store tracking information in your database (e.g., a flag on the user to indicate that they've downloaded the file), and then check it with your download script and possibly deny the download. This effectively separates security of access from frequency of cleanup, two things you may want to adjust independently.
Hope that's more helpful....
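As a small illustration of separating access control from cleanup, the check in the download script could look something like this. The record fields ('created_at', 'downloads'), the limits and the function name are hypothetical; in practice the record would come from your database.

```php
<?php
/**
 * Decides whether a download is still allowed, independent of whether the
 * temporary file has been cleaned up yet.
 * $record keys (hypothetical): 'created_at' (Unix time), 'downloads' (count so far).
 */
function mayDownload(array $record, int $now, int $maxAgeSeconds = 3600, int $maxDownloads = 1): bool
{
    $expired = ($now - $record['created_at']) > $maxAgeSeconds;
    $usedUp = $record['downloads'] >= $maxDownloads;
    return !$expired && !$usedUp;
}
```

The script would call this before sending any headers, return a 403 when it yields false, and increment the download count after a successful transfer.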
