I am programming a service that has to force the download of a file.
I know about the possibility of setting the HTTP headers using PHP and then sending the file with the readfile() function. But I think this is not a good way to send larger files, because it would consume a lot of server resources and the maximum execution time of the PHP scripts would be exceeded.
So is it possible to send the HTTP headers using PHP (I have to modify them depending on entries in a MySQL database) and then let Apache send the file body?
I should add that I could also use Perl scripts, but I do not see a way of doing this in a CGI script either.
Thanks.
You can do this strictly with Apache if the location and/or file type of the download is known ahead of time:
<Location /downloads>
SetEnvIf Request_URI "([^/]+\.attachment-extension)$" FILENAME=$1
Header set Content-Disposition "attachment; filename=%{FILENAME}e" env=FILENAME
</Location>
because it would consume a lot of server resources
I don't believe this is the case. As long as sending the file to the client is the last thing the script does before it terminates, the difference in CPU/RAM performance between sending the file from PHP and letting Apache handle it directly should be minimal, if there is one at all.
Unless your server has many gigabits per second of bandwidth and an incredibly fast disk setup, you would run into a bandwidth problem long before you ran into a server resource problem.
Admittedly this discussion is based largely on conjecture (since I know nothing about your hosting setup), so YMMV.
the maximum execution time of the PHP scripts would be exceeded
So just call set_time_limit(0);. That's what it's for.
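For reference, a minimal pure-PHP sketch of a forced download along those lines might look like this (the path and MIME type are placeholders; in the asker's setup they would come from the MySQL lookup):
//Minimal sketch of a pure-PHP forced download; $path is a placeholder.
set_time_limit(0); //sidesteps the max_execution_time concern
$path = '/data/files/example.zip'; //hypothetical location looked up in MySQL
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename="' . basename($path) . '"');
header('Content-Length: ' . filesize($path));
readfile($path); //streams the file to the client in chunks
exit;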
I found this link: http://jasny.net/articles/how-i-php-x-sendfile/. X-Sendfile is a module which seems to allow exactly what I want to do. It offloads sending the body of the HTTP response and lets Apache handle everything. I just installed and tried it, and it works fine.
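For anyone landing here later, a rough sketch of that X-Sendfile approach (assuming mod_xsendfile is installed and enabled; the path and filename are placeholders for values read from the database):
//X-Sendfile sketch: PHP decides the headers, Apache streams the body.
//$path and $name are placeholders for values looked up in MySQL.
$path = '/data/files/example.zip';
$name = 'example.zip';
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename="' . $name . '"');
header('X-Sendfile: ' . $path);
exit;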
Related
I'm confused. I have a PHP 5.6 script that produces a .js JavaScript file. It echoes arrays using PHP loops, includes some plain-text non-PHP sections, and also reads ten smaller JavaScript files and echoes them to the client. The total final file size is 560KB. The server automatically compresses the output and it arrives at the client as 163KB compressed. It takes between 700 and 1400 milliseconds to arrive at the client.
I guess that I shouldn't complain, but it seemed to me that it didn't make sense to keep reconstructing this file via PHP, so I prepared a copy of the final file, gzencoded it with level 9 to a size of 160KB, and tried skipping PHP and loading the file either directly or via a RewriteRule in .htaccess. It's now always taking between 1200 and 1600 milliseconds to arrive, according to Chrome's Network pane.
Is it possible that PHP is so fast that it's bad to cache the file? Or is there something that might need adjustment? This is all via shared hosting, so I don't have full control.
I think that I was missing the point that it's all static data, as pointed out by @Capsule. So by using my new scheme and putting it in a file, I automatically get support for sending back 304 responses later via the ETag (after I set the appropriate Cache-Control header in .htaccess).
The PHP system may be highly optimized, as pointed out by @arkascha, but that only helps the first time the user accesses the file; when max-age expires and they ask for it again, my PHP script doesn't contain a whole scheme for sending back 304, so it has to send the whole file out again.
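For comparison, here is a rough sketch of what such a 304 scheme would look like if it were done by hand in PHP (the file path is a placeholder); serving the static file lets Apache do all of this automatically:
//Rough sketch of manual ETag/304 handling; Apache does this for free for static files.
//$file is a placeholder for the generated script's path.
$file = '/path/to/generated.js';
$etag = '"' . md5_file($file) . '"';
header('ETag: ' . $etag);
header('Cache-Control: max-age=86400'); //example max-age of one day
if (isset($_SERVER['HTTP_IF_NONE_MATCH']) && trim($_SERVER['HTTP_IF_NONE_MATCH']) === $etag) {
    http_response_code(304); //client copy is still valid, send no body
    exit;
}
header('Content-Type: application/javascript');
readfile($file);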
I want to allow uploads of very large files into our PHP application (hundreds of megabytes up to 8 gigabytes). There are a couple of problems with this, however.
Browser:
HTML uploads have crappy feedback; we need to either poll for progress (which is a bit silly) or show no feedback at all
Flash uploader puts entire file into memory before starting the upload
Server:
PHP forces us to set post_max_size, which could result in an easily exploitable DoS attack. I'd prefer not to set this setting globally.
The server also requires some other variables to be there in the POST vars, such as a secret key. We'd like to be able to refuse the request right away, instead of after the entire file is uploaded.
Requirements:
HTTP is a must.
I'm flexible with client-side technology, as long as it works in a browser.
PHP is not a requirement, if there's some other technology that will work well on a linux environment, that's perfectly cool.
upload_max_filesize can be set on a per-directory basis; the same goes for post_max_size
e.g.:
<Directory /uploadpath/>
php_value upload_max_filesize 10G
php_value post_max_size 10G
</Directory>
Python Handler?
Use a Python POST handler instead of PHP. Generate a unique identifier from your PHP app that the client can put in the HTTP headers. With mod_python you can reject or accept the large upload before the entire POST body is transmitted.
I think
http://www.modpython.org/live/current/doc-html/dir-handlers-hph.html
It allows you to check headers and decline the rest of the POST input. I haven't tried it, but it might be the right path?
Looking at the source of mod_python, the buffering of the input via read() seems to allow bit-at-a-time evaluation of the HTTP input. Headers are first.
https://svn.apache.org/repos/asf/quetzalcoatl/mod_python/trunk/src/filterobject.c
It's old, I know, but maybe someone has this problem nowadays, too.
Now you can do this with only Javascript and, say, PHP. No Flash or Java required on client side.
demo: http://dnduploader.filkor.org/
The idea is to slice the files with Javascript's Blob slice() method...
How about a Java applet? That's how we had to do it at a company I previously worked for. I know applets suck, especially in this day and age with all our options available, but they really are the most versatile solution to desktop-like problems encountered in web development. Just something to consider.
You can set post_max_size for just the scripts in one directory. Place your upload script there, and allow only that script to handle large sizes. It's still possible for that script to be attacked with large/useless files, but it avoids setting the limit globally.
Use that with APC and you might be able to work out something good:
IBM Developer works article on APC
Tried all of this... this is by far the best I have used yet...
http://www.uploadify.com/
Take a look at jumploader.com
A good java-applet for uploading.
I've used it for uploading images and it works fine. I haven't tried it with files bigger than 10MB, but it should work for really big files too.
Have you looked into using APC to check the progress and total file size? Here is a good blog post about it. It might help.
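If it helps, here is a rough sketch of the APC upload-progress idea (it relies on the old apc.rfc1867 feature being enabled in php.ini; the polling endpoint and key are hypothetical):
//Rough sketch of APC upload progress; requires apc.rfc1867 = 1 in php.ini.
//The upload form must contain a hidden APC_UPLOAD_PROGRESS field placed
//before the file field; its value is the key polled here.
$key = $_GET['key']; //hypothetical polling endpoint: progress.php?key=...
$status = apc_fetch('upload_' . $key);
if ($status !== false) {
    //'current' and 'total' are byte counts reported by APC during the upload.
    echo json_encode(array(
        'current' => $status['current'],
        'total'   => $status['total'],
        'done'    => $status['done'],
    ));
} else {
    echo json_encode(array('error' => 'no upload in progress for this key'));
}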
Maybe you could use Webdav and Javascript in the browser
AJAX Big file upload, with progress, to WebDAV
http://www.webdavsystem.com/ajax/programming/upload_progress
A simple library
http://debris.demon.nl/projects/davclient.js/doc/README.html
You can then get the JS to redirect the user to a success page. Secret keys and whatnot can be handled in a PHP prelude before handing off to the JS client -> WebDAV.
I would look into FTP, SSH or SCP. This allows you to upload a large file and still have access control over the file as well. It might take a little longer to implement, but it's probably the most secure way I can think of.
I know it sucks to add another dependency, but in my experience most websites that are doing something like this are using Flash on the client side and uploading the large file in chunks.
Adobe has a how-to on Flash file uploads.
I also found this tutorial on codeproject:
Multiple File Upload With Progress Bar Using Flash and ASP.NET
PS - I know you're using PHP and not .NET; I figured the important part was the Flash ;)
I've had success with Uploadify, and I would recommend it. It's a jQuery/Flash script that handles large uploads, and you can pass extra parameters to it (like the secret key). To solve the server-side issues, simply use the following code. The changes take effect just for the script they're called in:
//Check to see if the secret key is there and valid
if(!isset($_POST['secret_key']) || !isValid($_POST['secret_key']))
{
    exit("Invalid request");
}

function isValid($key)
{
    //Put your validation code here.
}

//This line changes the timeout.
//Give it a value in seconds (3600 = 1 hour)
set_time_limit(3600);

//Set these amounts to whatever you need.
//Note: post_max_size and upload_max_filesize are PHP_INI_PERDIR settings,
//so ini_set() at runtime is too late to change them; set them in php.ini
//or via php_value in .htaccess for the upload script's directory instead.
ini_set("post_max_size","8192M");
ini_set("upload_max_filesize","8192M");

//Generally speaking, the memory_limit should be higher
//than your post size. So make sure that's right too.
ini_set("memory_limit","8200M");
EDIT In response to your comment:
Given what you've said, I'm afraid you may not be able to meet your requirements over HTTP. All of the solutions out there are code that adds features to HTTP that it was never designed for.
Like you said yourself, it's a simple protocol. Apart from writing your own client software that runs outside of the browser, using a Java applet, or using a different protocol (like FTP, which was designed for this), you might not get what you want.
I've done the best I could within the given constraints. Sorry I couldn't do better.
Try this: http://www.simple2ftp.com uses a Java based FTP applet from within a clever PHP application wrapper.
I have a problem here, maybe someone has already gone through this before.
A system controller is serving PHP downloads: it reads data from files and sends it to the client as a download. The system works perfectly. The problem is that the speed is always low, always less than 300 kB/s and at times less than 100 kB/s for the user.
The server has a 100 Mbps link and the customer has 6 Mbps free, so the download should run at around 600 kB/s. Something is throttling the output of PHP. I've tried searching around Apache's buffers but found nothing on this issue.
Does anyone have any idea what might be happening?
PHP really isn't built for serving large files. It has to read the file and push it out itself rather than letting the web server serve it directly. It sounds like you're pushing a reasonable amount of traffic through PHP via something like readfile(), which is a bad idea if 100 kB/s - 300 kB/s per user is too slow. Instead, I suggest taking a look at mod_xsendfile (if you're using Apache) or its equivalent for your web server of choice (e.g. I prefer nginx, where X-Accel-Redirect does the same job).
In PHP, then, you can just do this: header('X-Sendfile: ' . $file);. The server intercepts the header and sends that file itself. This gives you the benefits of what you're doing in PHP, plus the speed of the web server reading the file directly.
I want to set up an automated backup via PHP, so that using HTTP/S I can POST a zip file request to another server and send over a large .zip file. Basically, I want to back up an entire site (and its database) and have a cron job periodically transmit the file over HTTP/S with something like
wget http://www.thissite.com/cron_backup.php?dest=www.othersite.com&file=backup.zip
The appropriate authentication security can be added afterwards...
I prefer HTTP/S because this other site has limited use of FTP and is on a Windows box. So I figure the surest way to communicate with it is via HTTP/S. The other end would have a corresponding PHP script that would store the file.
This process needs to be completely programmatic (i.e. Flash uploaders will not work, as they need a browser; this script will run from a shell session).
Are there any generalized PHP libraries or functions that help with this sort of thing? I'm aware of the PHP script timeout issues, but I can typically alter php.ini to minimize this.
I'd personally not use wget, and just run from the shell directly for this.
Your php script would be called from cron like this:
/usr/local/bin/php /your/script/location.php args here if you want
This way you don't have to worry about yet another program (wget) to handle things. If your settings are the same at each run, then just put them in a config file or directly into the PHP script.
The timeout can be handled by this, which makes the PHP script run for an unlimited amount of time:
set_time_limit(0);
Not sure what libraries you're using, but look into cURL to do the POST; it should work fine.
I think the biggest issues that would come up would be more server related and less PHP/script related, i.e. make sure you have the bandwidth for it, and that your PHP script can connect to an outside server.
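If it helps, a minimal cURL sketch of the POST step (the URL, field names and secret key are placeholders; CURLFile needs PHP 5.5+, older versions used the "@/path" syntax):
//Hypothetical sketch: POST a backup zip to a remote PHP endpoint via cURL.
set_time_limit(0); //large uploads can take a while
$ch = curl_init('https://www.othersite.com/receive_backup.php'); //placeholder URL
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, array(
    'secret_key' => 'change-me', //placeholder credential
    'backup'     => new CURLFile('/path/to/backup.zip', 'application/zip'),
));
$response = curl_exec($ch);
if ($response === false) {
    error_log('Upload failed: ' . curl_error($ch));
}
curl_close($ch);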
If it's at all possible I'd stay away from doing large transfers over HTTP. FTP is far from ideal too, but for very different reasons.
Yes, it is possible to do this via FTP, HTTP and HTTPS using cURL, but that does not really solve any of the problems. HTTP is optimized around sending relatively small files in relatively short periods of time; when you stray away from that you end up undermining a lot of the optimization that's applied to web servers (e.g. if you've got a setting for MaxRequestsPerChild you could be artificially extending the life of processes which should have stopped, and there's the interaction between the LimitRequest* settings and max_file_size, not to mention various timeouts and the other limit settings in Apache).
A far more sensible solution is to use rsync over ssh for content/code backups and the appropriate database replication method for the DBMS you are using - e.g. mysql replication.
I have a PHP client that requests an XML file over HTTP (i.e. loads an XML file via URL). As of now, the XML file is only several KB in size. A problem I can foresee is the XML becoming several MB or GB in size. I know that this is a huge question and that there are probably a myriad of solutions, but what ideas do you have for transporting this data to the client?
Thanks!
Based on your use case I'd definitely suggest zipping up the data first. In addition, you may want to md5-hash the file and compare the hashes before initiating the download (no need to update if the file has no changes); this will help with point #2.
Also, would it be possible to just send the segment of XML that has changed, instead of the whole file?
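A rough sketch of that hash-check idea, assuming the server publishes a small companion .md5 file next to the zipped XML (all URLs and paths here are placeholders):
//Sketch of "compare a hash before downloading"; URLs and paths are placeholders.
$localFile = '/var/data/data.xml.gz';
$remoteMd5 = trim(file_get_contents('http://example.com/data.xml.gz.md5'));
$localMd5  = file_exists($localFile) ? md5_file($localFile) : '';
if ($remoteMd5 !== $localMd5) {
    //Only pull the (gzipped) XML when it actually changed.
    file_put_contents($localFile, file_get_contents('http://example.com/data.xml.gz'));
    //...then decompress and import into the database.
}
//Otherwise nothing changed, so skip the import entirely.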
Ignoring how well a browser may or may not handle a GB-sized XML file, the only real concern I can think of off the top of my head is whether the execution time to generate all the XML is greater than any execution time thresholds that are set in your environment:
PHP's max_execution_time setting
PHP's set_time_limit() function
Apache's TimeOut Directive
Given that the XML is created dynamically by your PHP, the simplest thing I can think of is to ensure that the file is gzipped automatically by the webserver, as described here; that covers both a general PHP approach and an Apache httpd-specific solution.
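The general PHP approach is roughly the following sketch (it assumes the zlib extension is available; ob_gzhandler only compresses when the client's Accept-Encoding allows it, and the actual XML generation is elided):
//Sketch: compress the dynamically generated XML on the fly.
ob_start('ob_gzhandler');
header('Content-Type: text/xml; charset=utf-8');
echo '<?xml version="1.0" encoding="UTF-8"?>';
//... echo the rest of the generated XML here ...
ob_end_flush();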
Besides that, having a browser (what else can a PHP client be?) do such a job every night for some data synchronizing sounds like there must be a far simpler solution somewhere else.
And, of course, at some point, transferring "a lot" of data is going to take "a lot" of time...
The problem is that he's syncing up two datasets. The problem is completely misstated.
You need to either a) keep a differential log of changes to dataset A so that you can send that log to dataset B, or b) keep two copies of the dataset (last night's and the current one), and then compare them so you can send the differential log from A to B.
Welcome to the world of replication.
The problem with (a) is that it's potentially invasive to all of your code, though if you're using an RDBMS you could perhaps do some logging via database triggers to keep track of inserts/updates/deletes, write the information into a table, and then export the relevant rows as your differential log. But that can be nasty too.
The problem with (b) is the whole "comparing the database" all at once. Fine for 100 rows. Bad for 10^9 rows. Nasty nasty.
In fact, it can all be nasty. Replication is nasty.
A better plan is to look into a "real" replication system designed for the particular databases that you're running (assuming you're running a database). Something that perhaps sends database log records over for synchronization rather than trying to roll your own.
Most of the modern DBMS systems have replication systems.
Gallery2, which allows you to upload photos over http, makes you set up a couple of php parameters, post_max_size and upload_max_filesize, to allow larger uploads. You might want to look into that.
It seems to me that posting large files has problems with browser time-outs and the like, but on the plus side it works with proxy servers and firewalls better than trying a different file upload protocol.
Thanks for the responses. I failed to mention that transferring the file should be relatively fast (a few minutes max; is this even possible?). The XML that is requested will be parsed and inserted into a database every night. The XML may be the same as the night before, or it may be different. So there are basically two requirements: 1. it has to be relatively fast, and 2. it should minimize the number of writes to the database.
One solution that was proposed is to zip the XML file and then transfer it, but that only satisfies (1).
Any other ideas?
Are there any algorithms that I could apply to compress the XML? How are large files such as MP3s being downloaded in a matter of seconds?
PHP receiving GBs of data will take a long time and adds overhead. It is also more susceptible to failure.
I would dispatch the job to a shell script (wget with simple error catching) that is not bound by PHP's execution time and, on failure, could perhaps even retry on its own.
I'm not experienced with this, but though one could use exec() or the like, these sadly run synchronously.
Calling a script with ./test.sh & makes it run in the background, which solves that problem, I guess. The script could easily let your PHP pick it back up via a wget to http://yoursite.com/continue-xml-stuff.php?id=1049381023&status=0. The id could be a filename, if you don't need to backtrack lost requests. The status would indicate how the script ended up handling the request.
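A minimal sketch of that dispatch idea (the script path, id and callback URL are placeholders):
//Launch the long-running fetch in the background so this request returns at once.
$id = 1049381023; //placeholder id, e.g. a filename
exec(sprintf('nohup sh /path/to/fetch_xml.sh %d > /dev/null 2>&1 &', $id));
//fetch_xml.sh would wget the XML, then call back
//http://yoursite.com/continue-xml-stuff.php?id=...&status=... when it finishes.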
Have you thought about using some sort of version control system to handle this? You could leverage its ability to calculate and send just the differences in the files, plus you get the added benefits of maintaining a version history of your file.
Since I don't know the details of your situation, I'll throw a question out there. Just for the sake of argument, does it have to be HTTP? FTP is much better suited for large data transfer and can be automated easily via PHP or Perl.
If you are using Apache, you might also consider Apache mod_gzip. This should allow you to compress the file automatically and the decompression should also happen automatically, as long as both sides accept gzip compression.