I need to let the user download a file (for example, a PDF). Which will take longer:
sending the file through PHP (with the appropriate headers),
or putting it in a public HTTP folder and giving the user a direct link to download it (without PHP's help)?
In the first case the original file can stay in a private area.
But I'm thinking that sending the file through PHP will take some time.
So how can I measure the time PHP spends sending the file, and how much memory it consumes?
P.S. In the first case, when PHP sends the headers and the browser (if a PDF plugin is installed) tries to open the file inline, is PHP still working, or does it push out the whole file immediately after the headers are sent? And if the plugin is not installed and the browser shows a "Save as" dialog, is PHP still working?
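For reference, this is roughly how I was planning to measure it (a sketch only; I'm assuming readfile() is the way to send the file):

<?php
// download.php - rough sketch of the measurement; readfile() is just an assumption
$file = '/path/to/private/file.pdf';

$start = microtime(true);

header('Content-Type: application/pdf');
header('Content-Disposition: attachment; filename="file.pdf"');
header('Content-Length: ' . filesize($file));
readfile($file);

// log how long the send took and the peak memory used
$elapsed = microtime(true) - $start;
error_log(sprintf('sent %s in %.3f s, peak memory %d bytes',
    $file, $elapsed, memory_get_peak_usage(true)));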
There will be very little difference between the two if you are worried about download speeds.
I guess it comes down to how big your files are, how many downloads you expect to have, whether your documents should be publicly accessible, and the download speed of the client.
Your main issue with PHP is the memory it consumes - each download will tie up a PHP process, which could be maybe 8 MB - 20 MB depending on what your script does, whether you use a framework, etc.
Out of interest, I wrote a symfony application to offer downloads, and to do things like concurrency limiting, bandwidth limiting etc. It's here if you're interested in taking a look at the code. (I've not licensed it per se, but I'm happy to make it GPL3 if you like).
I'm using the following code to manage downloads from my site (the files are behind a captcha): http://www.richnetapps.com/php-download-script-with-resume-option/
Trouble is, when a file is being downloaded, it locks the rest of the site, and it's not possible to download another file simultaneously. ('Locks' as in trying to go to, say, the homepage when a download is in progress results in a long wait. The homepage appears only when the download is finished or cancelled. This is a problem because some of the files are several hundred MB).
I'd like two things to happen: 1- To be able to browse the site while a file is being downloaded, and 2- to be able to download another file (or two, or three, or ten...) simultaneously.
My gut feeling is I need to fork the process, create a new one, or open another socket. But I'm way out of my depth, and even if this was the right approach, I don't know how to do it. Any ideas guys?
Many thanks in advance....
EDIT----
I found it! I added session_write_close() right before setting the headers in the download script. Apparently this behaviour is due to PHP session handling - further info here: php simultaneous file downloads from the same browser and same php script (I searched and searched before asking, but obviously missed this post).
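In case it helps anyone else, this is roughly where the call goes (a sketch; the real script is the one linked above):

// ... captcha check, file path resolution, etc. ...

session_write_close();   // release the session lock so other requests from the same browser aren't blocked

header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename="' . basename($path) . '"');
header('Content-Length: ' . filesize($path));
readfile($path);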
Many thanks....
A Content Delivery Network (CDN) will both offload downloads from your server, leaving it free to process homepage (or other) page requests, and allow many, many simultaneous downloads. It should be cheaper for bandwidth and perhaps faster for most users as well.
The key will be to configure the CDN so the files are only accessible after your Captcha, instead of being freely available as in most CDN setups.
I'm trying to stream MP4 files through Apache / Nginx using a PHP proxy for authentication. I've implemented byte-ranges to stream for iOS as outlined here: http://mobiforge.com/developing/story/content-delivery-mobile-devices. This works perfectly fine in Chrome and Safari but.... the really odd thing is that if I monitor the server requests to the php page, three of them occur per page load in a browser. Here's a screen shot of Chrome's inspector (going directly to the PHP proxy page):
As you can see, the first one gets canceled, the second remains pending, and the third works. Again, the file plays in the browser. I've tried alternate methods of reading the file (readfile, fgets, fread, etc) with the same results. What is causing these three requests and how can I get a single working request?
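For reference, this is a stripped-down version of the range handling in my proxy (simplified from the mobiforge article, so details may differ):

$path = '/path/to/video.mp4';            // after authentication has passed
$size = filesize($path);
$start = 0;
$end = $size - 1;

if (isset($_SERVER['HTTP_RANGE']) &&
    preg_match('/bytes=(\d+)-(\d*)/', $_SERVER['HTTP_RANGE'], $m)) {
    $start = (int)$m[1];
    if ($m[2] !== '') {
        $end = (int)$m[2];
    }
    header('HTTP/1.1 206 Partial Content');
    header("Content-Range: bytes $start-$end/$size");
}

header('Content-Type: video/mp4');
header('Accept-Ranges: bytes');
header('Content-Length: ' . ($end - $start + 1));

$fp = fopen($path, 'rb');
fseek($fp, $start);
$remaining = $end - $start + 1;
while ($remaining > 0 && !feof($fp)) {
    $chunk = min(8192, $remaining);
    echo fread($fp, $chunk);
    $remaining -= $chunk;
    flush();
}
fclose($fp);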
The first request is for the first range of bytes, preloading the file. The browser cancels the request once it has downloaded the specified amount.
The second one I'm not sure about...
The third one happens when you actually start playing the media file: it requests and downloads the full thing.
Not sure whether this answers your question, but serving large binary files with PHP isn't the right thing to do.
It's better to let PHP handle authentication only and pass the file reference to the web server to serve, freeing up resources.
See also: Caching HTTP responses when they are dynamically created by PHP
It describes in more detail what I would recommend doing.
I have a problem here, maybe someone has already gone through this before.
A controller in the system serves downloads through PHP: it reads data from files and sends it to the client as a download. The system works perfectly. The problem is that the speed is always low, less than 300 KB/s and at times less than 100 KB/s for the user.
The server has a 100 Mbps link and the customer has 6 Mbps free, so the download should run at around 600 KB/s. Something is holding back PHP's output. I've tried looking into Apache's buffers but found nothing on this issue.
Does anyone have any idea what might be happening?
PHP really isn't built for serving large files. It has to read the entire file and push it out itself. It sounds like you're sending a reasonable amount of traffic through PHP via something like readfile(), which is a bad idea if 100 KB/s - 300 KB/s per user is too slow. Instead, I suggest taking a look at mod_xsendfile (if you're using Apache) or its equivalent for your web server of choice (e.g. I prefer nginx, and would use its X-Accel-Redirect equivalent for this).
In PHP then, you can just do this: header('X-Sendfile: ' . $file);. The server intercepts the header and sends that file itself. It gives you the benefits of what you're doing with PHP, plus the speed of the web server reading the file directly.
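A minimal sketch of what that could look like with an authentication check in front (assuming mod_xsendfile is installed; the path and session field are placeholders):

<?php
session_start();
if (empty($_SESSION['user_id'])) {         // hypothetical auth check
    header('HTTP/1.1 403 Forbidden');
    exit;
}

$file = '/var/private/files/report.pdf';   // stored outside the web root
header('Content-Type: application/pdf');
header('Content-Disposition: attachment; filename="report.pdf"');
header('X-Sendfile: ' . $file);            // Apache/mod_xsendfile takes over from here
// on nginx you would map the file to an internal location and use
// header('X-Accel-Redirect: /protected/report.pdf'); instead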
I would like to create a caching system that will bypass some mechanisms in order to improve the performance.
I have some examples:
1-) I have a dynamic PHP page that is updated every hour. The page content is the same for every user. So in this case I can either:
a) create an HTML page that is regenerated every hour. In this case I would like to bypass PHP, so there should be a static page, and if the database is updated, a new HTML file is generated. How can I do this? I can create a crontab script that generates the HTML file, but it does not seem like an elegant way.
b) cache the output in memory, so the web server updates the content every hour. I guess I need a memory cache module for the web server. There is an unofficial memcache module for lighttpd, but it does not seem stable; I have also heard of a memcache module for nginx but I don't know whether this is possible or not. This way seems more elegant and possible, but how? Any ideas? (Again, I would like to bypass PHP in this case.)
2-) Another example: I have a dynamic PHP page that is updated every hour, but in that page only the user details part is fully dynamic (so a user logs in or out and sees his/her status in that section).
Again, how can I create a caching system for this page? I think, if I can find a solution for the first example, then I can use AJAX in that part with the same solution. Am I correct?
edit: I guess I did not make this clear. I would like to bypass PHP completely. The PHP script would run once an hour; after that, no PHP call would be made. I would like to remove its overhead.
Thanks in advance,
Go with static HTML. Every hour simply update a static HTML file with your output. You'll want to use an hourly cron to run a PHP script to fopen() and fwrite() to the file. There's no need to hit PHP to retrieve the page whatsoever. Simply make a .htaccess mod_rewrite redirection rule for that particular page to maintain your current URL naming.
Although not very elegant, static HTML with gzip compression to me is more efficient and would use less bandwidth.
An example of using cron to run a PHP script hourly:
# run this command in your console to open the editor
crontab -e
Enter these values:
01 * * * * php -f /path/to/staticHtmlCreater.php > /dev/null
The last portion ensures you will not have any output. This cron would run on the first minute of every hour.
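The script itself only needs to render the page and write it out; a rough sketch (all paths and names are placeholders):

<?php
// staticHtmlCreater.php - regenerate the static page, run hourly via cron
ob_start();
include '/path/to/dynamicPage.php';          // whatever currently builds the page
$html = ob_get_clean();

file_put_contents('/path/to/docroot/page.html', $html, LOCK_EX);

The mod_rewrite rule mentioned above could then be as simple as something like RewriteRule ^page/?$ /page.html [L], so the public URL does not change.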
UPDATE
Either I missed the section regarding your dynamic user profile information or it was added after my initial comment. If you are only using a single server, I would suggest you make a switch to APC which provides both opcode caching and a caching mechanism faster than memcached (for a single server application). If the user's profile data is below the fold (below the user's window view), you could potentially wait to make the AJAX request until the user scrolls down to a specified point. You can see this functionality used on the facebook status page.
If this is just a single web server, you could just use PHP's APC module to cache the contents of the page. It's not really designed to cache entire pages, but it should do in a pinch.
Edit: I forgot to mention that APC isn't (yet) shipped with PHP, but can be installed from PECL. It will be shipped as part of PHP 6.
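A rough sketch of that approach (the cache key, TTL, and the included script are arbitrary examples):

<?php
$html = apc_fetch('homepage_html');
if ($html === false) {                    // cache miss: rebuild and store for an hour
    ob_start();
    include 'buildPage.php';              // hypothetical script that builds the page
    $html = ob_get_clean();
    apc_store('homepage_html', $html, 3600);
}
echo $html;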
A nice way to do it is to have the static content stored in a file. Things should work like this:
your PHP script is called
if your content file has been modified more than 1 hour ago (with filemtime($yourFile))
re-generate content + store it in the file + send it back to the client
else
send the file content as is (with readfile($yourFile), or echo file_get_contents($yourFile))
Works great in every case, even under heavy load.
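In PHP that could look roughly like this (one-hour lifetime; the path and the generator function are examples):

<?php
$cacheFile = '/var/cache/page.cache.html';

if (!file_exists($cacheFile) || filemtime($cacheFile) < time() - 3600) {
    // stale or missing: regenerate the content, store it, and send it
    $content = buildPage();                        // hypothetical generator
    file_put_contents($cacheFile, $content, LOCK_EX);
    echo $content;
} else {
    // fresh enough: serve the cached copy as is
    echo file_get_contents($cacheFile);
}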
I need to upload potentially big (as in, 10's to 100's of megabytes) files from a desktop application to a server. The server code is written in PHP, the desktop application in C++/MFC. I want to be able to resume file uploads when the upload fails halfway through, because this software will be used over unreliable connections.

What are my options? I've found a number of HTTP upload components for C++, such as http://www.chilkatsoft.com/refdoc/vcCkUploadRef.html which looks excellent, but it doesn't seem to handle resuming half-done uploads (I assume this is because HTTP 1.1 doesn't support it). I've also looked at the BITS service, but for uploads it requires an IIS server.

So far my only option seems to be to cut up the file I want to upload into smaller pieces (say 1 meg each), upload them all to the server, reassemble them with PHP and run a checksum to see if everything went ok. To resume, I'd need to have some form of 'handshake' at the beginning of the upload to find out which pieces are already on the server. Will I have to code this by hand, or does anyone know of a library that does all this for me, or maybe even a completely different solution? I'd rather not switch to another protocol that supports resume natively, for maintenance reasons (potential problems with firewalls etc.).
I'm eight months late, but I just stumbled upon this question and was surprised that WebDAV wasn't mentioned. You could use the HTTP PUT method to upload, and include a Content-Range header to handle resuming and such. A HEAD request would tell you if the file already exists and how big it is. So perhaps something like this:
1) HEAD the remote file
2) If it exists and size == local size, upload is already done
3) If size < local size, add a Content-Range header to the request and seek to the appropriate location in the local file.
4) Make PUT request to upload the file (or portion of the file, if resuming)
5) If connection fails during PUT request, start over with step 1
You can also list (PROPFIND) and rename (MOVE) files, and create directories (MKCOL) with DAV.
I believe both Apache and lighttpd have DAV extensions.
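On the PHP side (if you don't want a full DAV module), handling such a resumable PUT could look roughly like this; header parsing is simplified and the target path is an example:

<?php
$target = '/var/uploads/' . basename($_SERVER['REQUEST_URI']);
$offset = 0;

// e.g. "Content-Range: bytes 1048576-2097151/78643200"
if (isset($_SERVER['HTTP_CONTENT_RANGE']) &&
    preg_match('/bytes (\d+)-(\d+)\/(\d+)/', $_SERVER['HTTP_CONTENT_RANGE'], $m)) {
    $offset = (int)$m[1];
}

$in  = fopen('php://input', 'rb');
$out = fopen($target, 'cb');          // 'c' opens without truncating and creates if missing
fseek($out, $offset);
stream_copy_to_stream($in, $out);
fclose($out);
fclose($in);

header('HTTP/1.1 204 No Content');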
You need a standard chunk size (say 256 KB). If the file "abc.txt", uploaded by user X, is 78.3 MB, it would be 313 full chunks and one smaller chunk.
1) You send a request to upload, stating the filename and size, as well as the number of initial threads.
2) Your PHP code creates a temp folder named after the IP address and filename.
3) Your app then uses MULTIPLE connections to send the data in different threads, so you could be sending chunks 1, 111, 212, 313 at the same time (with separate checksums).
4) Your PHP code saves them to separate files and confirms reception after validating the checksum, replying with the number of the next chunk to send, or telling that thread to stop.
5) After all threads are finished, you ask PHP to join all the files; if something is missing, go back to step 3.
You could increase or decrease the number of threads at will, since the app is controlling the sending.
You can easily show a progress indicator, either a simple progress bar or something close to DownThemAll's detailed view of chunks.
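The PHP side of that could be sketched like this (folder layout, parameter names, and checksum choice are just one possibility):

<?php
// saveChunk.php?file=abc.txt&chunk=212 - store one chunk and return its checksum
$dir = '/var/chunks/' . $_SERVER['REMOTE_ADDR'] . '_' . basename($_GET['file']);
if (!is_dir($dir)) {
    mkdir($dir, 0777, true);
}

$part = $dir . '/' . (int)$_GET['chunk'] . '.part';
file_put_contents($part, file_get_contents('php://input'));
echo md5_file($part);    // the client compares this with its own checksum before moving on

Joining is then just a loop over the numbered .part files, appending each one to the final file; any index that turns out to be missing is reported back so the client can resend it.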
libcurl (C API) could be a viable option:
-C/--continue-at
Continue/Resume a previous file transfer at the given offset. The given offset is the exact number of bytes that will be skipped, counting from the beginning of the source file before it is transferred to the destination. If used with uploads, the FTP server command SIZE will not be used by curl.
Use "-C -" to tell curl to automatically find out where/how to resume the transfer. It then uses the given output/input files to figure that out.
If this option is used several times, the last one will be used.
Google have created a Resumable HTTP Upload protocol. See https://developers.google.com/gdata/docs/resumable_upload
Is reversing the whole process an option? I mean, instead of pushing the file to the server, make the server pull the file using a standard HTTP GET with all the bells and whistles (like Accept-Ranges, etc.).
Maybe the easiest method would be to create an upload page that accepts the filename and range as parameters, such as http://yourpage/.../upload.php?file=myfile&from=123456, and handle resumes in the client (maybe you could add a function to inspect which ranges the server has received).
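A sketch of such an upload.php (the parameter names follow the example URL; everything else is an assumption):

<?php
// upload.php?file=myfile&from=123456 - write the request body starting at byte offset 'from'
$target = '/var/uploads/' . basename($_GET['file']);
$from   = isset($_GET['from']) ? (int)$_GET['from'] : 0;

$in  = fopen('php://input', 'rb');
$out = fopen($target, 'cb');       // do not truncate; create if it does not exist
fseek($out, $from);
stream_copy_to_stream($in, $out);
fclose($out);
fclose($in);

// report how much we now have, so the client knows where to resume next time
clearstatcache();
echo filesize($target);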
# Anton Gogolev
Lol, I was just thinking about the same thing - reversing the whole thing, making the server a client and the client a server. Thanks to Roel, it's clearer to me now why it wouldn't work.
# Roel
I would suggest implementing a Java uploader [JumpLoader is good, with its JScript interface and even sample PHP server-side code]. Flash uploaders suffer badly when it comes to BIIIGGG files :) - on a gigabyte scale, that is.
F*EX can upload files up to the TB range via HTTP and is able to resume after link failures.
It does not exactly meet your needs, because it is written in Perl and needs a UNIX-based server, but the clients can be on any operating system. Maybe it is helpful for you nevertheless:
http://fex.rus.uni-stuttgart.de/
There is a protocol called tus for resumable uploads, with implementations in PHP and C++.