Currently the MP3 files exceed the upload limit set in WordPress. Although I am going to raise that limit using some information I found on the topic, the MP3s are still rather large. If the site were just for me, I'd simply compress them myself, so I need to find a way to compress them automatically on the server. I assume the script needs to save the large file, transcode it to a smaller one, and then delete the original. Any ideas?
I assume it needs to save that large file, to transcode it to a smaller file, and then delete the old one
This is correct. You always have to store the content first. Encoding should not be done in PHP itself, though; there are more efficient tools for that. For example, you can run the LAME MP3 encoder on the command line via a system() call from PHP. Be aware that encoding might fail and that it can take quite a long time, so you should clean up the big files with a separate cron script rather than trying to delete them in the upload script itself.
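A minimal sketch of that shell-out, assuming the lame binary is installed and on the PATH; the paths, the 96 kbps bitrate, and the cleanup step are placeholders rather than anything from the original setup:

```php
<?php
// Sketch: re-encode an uploaded MP3 with the LAME CLI. Paths and bitrate are
// illustrative; run this from a worker/cron context, not inside the upload request.
$source = '/var/www/uploads/original.mp3';   // the large uploaded file (placeholder path)
$target = '/var/www/uploads/compressed.mp3'; // the smaller re-encoded file (placeholder path)

$cmd = sprintf(
    'lame --mp3input -b 96 %s %s 2>&1',
    escapeshellarg($source),
    escapeshellarg($target)
);
exec($cmd, $output, $exitCode);

if ($exitCode === 0 && is_file($target) && filesize($target) > 0) {
    unlink($source); // encoding succeeded, drop the original
} else {
    error_log('LAME encoding failed: ' . implode("\n", $output));
    // leave the original for the cron cleaner mentioned above
}
```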
I'm trying to convert a website to use S3 storage instead of local (expensive) disk storage. I solved the download problem using a stream wrapper interface on the S3Client. The upload problem is harder.
It seems to me that when I post to a PHP endpoint, the $_FILES object is already populated and copied to /tmp/ before I can even intercept it!
On top of that, the S3Client->upload() expects a file on the disk already!
Seems like a double-whammy against what I'm trying to do, and most advice I've found uses NodeJS or Java streaming so I don't know how to translate.
It would be better if I could intercept the code that populates $_FILES and then send up 5MB chunks from memory with the S3\ObjectUploader, but how do you crack open the PHP multipart handler?
Thoughts?
EDIT: It is a very low volume of files, 0-20 per day, mostly 1-5 MB, sometimes hitting 40-70 MB. Periodically (once every few weeks) a 1-2 GB file will be uploaded. Hence the desire to move off an EC2 instance and onto a Heroku/Beanstalk-type PaaS, where I won't have much /tmp/ space.
It's hard to comment on your specific situation without knowing the performance requirements of the application and the volume of users who need to access it, so I'll answer assuming a basic web app uploading profile avatars.
There are some good reasons for this: the file is streamed to disk for multiple purposes, one of which is to conserve memory. If your file is not on disk then it is in memory (think disk usage is expensive? bump up your memory usage and see how expensive that gets), which is fine for a single user uploading a small file, but not so great for a bunch of users uploading small files or, worse, large files. You'll likely see the best performance if you use the defaults in these libraries and let them stream to and from the disk.
But again I don't know your use case and you may actually need to avoid the disk at all costs for some unknown reason.
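If you do accept the temporary file PHP writes for you, here is a rough sketch of handing it to the SDK as a stream rather than a string in memory. The bucket, key, region, and form field names are placeholders, and this assumes the AWS SDK for PHP v3 installed via Composer:

```php
<?php
// Sketch: take the upload PHP already streamed to /tmp and pass it to S3 as a
// stream, so the file body never has to fit in PHP's memory. Names are placeholders.
require 'vendor/autoload.php';

use Aws\S3\S3Client;
use Aws\S3\ObjectUploader;

$s3 = new S3Client(['region' => 'us-east-1', 'version' => 'latest']);

$tmpPath = $_FILES['upload']['tmp_name'];
if (!is_uploaded_file($tmpPath)) {
    http_response_code(400);
    exit;
}

$body     = fopen($tmpPath, 'rb'); // a stream, not file_get_contents()
$uploader = new ObjectUploader($s3, 'my-bucket', basename($_FILES['upload']['name']), $body);
$result   = $uploader->upload();

// PHP deletes the /tmp copy automatically when the request finishes.
```

ObjectUploader falls back to a multipart upload once the body crosses the SDK's size threshold, which should cover the occasional 1-2 GB file without it ever being held in memory; the /tmp disk space requirement, however, remains.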
What is a good way to upload multiple large files in PHP?
Note: in my case I don't really need very large files; I need something like 40-50 MB or maybe 100 MB uploads. My purpose and goal is mainly document upload on websites: these documents (PDF, DOC, etc.) can sometimes be 20-30 MB and rarely 100+ MB, where normally the PHP upload_max_filesize is 10-20 MB. I think that having a huge upload_max_filesize is very bad, and allowing really large file uploads in PHP (like 1 GB+) is a bad idea anyway.
I have been reading about different solutions, even here on SO, like plupload and some others (HTTP Upload, bigUpload, etc.), and I am not sure which approach to choose.
As a principle, I'd like to find something with maintained code (not an abandoned library) and possibly following coding standards (PSR).
I think that writing it all from scratch would be a huge amount of work, but maybe I am wrong; if someone has done that, I would like to hear about your experience. Of course, if I can't find something that gives me what I need, I'll have to modify existing libraries or write one of my own.
I will give plupload a try, but my biggest concern is that it might not be well maintained. There is a v3 release, but the stable version is still v2. If I understand correctly, the last updates of this library on GitHub were made in 2017, and this doesn't really reassure me. PHP, and everything in general, changes really quickly.
Problem 1: Server knows the size of the file only when the upload is finished
I think this is the first problem. I can't be sure about the file size until the upload is completed, so if I have to raise an error in PHP because the file is too big, I'd need to upload the file first. I could guess the size in JavaScript, but I think that would be too easy to manipulate. The same goes for HTML5. All client-side checks can be "hacked" or manipulated; even if it requires work, it is still doable.
Problem 2: File chunking, is it the best solution?
I have read about file chunking, which is very interesting, but that is only client-side, right? Because to chunk a file on the server side, you go back to problem 1: you need to upload the whole file before chunking it. Is client-side chunking safe? Are there known vulnerabilities with it? (For example, could something be exploited while the file is being uploaded?) I will read more about this later.
Also, what about the time required to upload? Let's say you are using your phone on 4G to upload two PDFs or JPGs in a form; you have to upload 20-30 MB with a bad signal at that moment, and it takes you 30 seconds or a minute to upload them. Will the server stop listening after a while?
In any case, with file chunking, do I need a large upload_max_filesize? Is the request usually sent (as a POST) with the entire file, which could return an error because the size exceeds the PHP limit? I would like to keep a normal PHP file size limit.
Problem 3: stopping file upload and deleting interrupted upload
What happens if I stop or I want to stop my upload? I guess that the server will find itself with a temporary file which is only partially uploaded.
Then, in general, is the best approach to have a cron job checking the /tmp (or whatever) folder and deleting incomplete files?
Problem 4: overwriting interrupted uploads
What if the user uploads a new file that should overwrite the old one, but the old one was not completed? Let's say the user uploads a 10 MB document but realizes it's the wrong one. He will probably reload the page, or maybe click "Browse" again and upload the new one. So all the temporary files should have unique names. If I remember correctly, PHP already gives them random names in /tmp/. Is this enough? Would it be better to give them a random name manually, maybe based on a timestamp?
In code words, something like:
$fileName = time() . '-' . uniqid() . '.tmp';
Problem 5: what about multiple files upload?
Let's say the user uploads 3 documents, 10 MB each. Should the server receive them all at the same time or one by one? Or does it not matter? At first glance, I might think that multiple uploads at the same time could cause more problems with server load. Maybe this is not that important, though.
Problem 6: accessibility
Would all this be easily accessible? Do you think that a user using a screen-reader would be able to upload multiple (and possibly large) files without problems?
Conclusion
In conclusion, I will take a look at existing libraries and do some testing to see if I can find a good solution for my needs. In general, I would like to read comments or experiences that could help me understand the difficulties and problems I might encounter.
I will try to help you with your questions:
Problem 1: Server knows the size of the file only when the upload is finished
You need to configure upload_max_filesize and post_max_size to allow the maximum file upload you want to be able to receive in a single request.
And yes, you will only know the size once the file has been uploaded, as your script is executed only after the upload has completed.
You can add some checks in your JavaScript to improve the UI for the user.
Also, if the file exceeds the maximum size, it will not be available to your server script.
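To make that concrete, here is a small sketch of the server-side check, using a hypothetical form field named document. If the whole POST body exceeded post_max_size, PHP discards it and $_FILES arrives empty; otherwise the per-file error code tells you what happened:

```php
<?php
// Sketch: detect an over-limit upload on the server. 'document' is a hypothetical field name.
if (empty($_FILES)) {
    // The entire request exceeded post_max_size, so PHP dropped the body.
    http_response_code(413);
    exit('Upload too large.');
}

switch ($_FILES['document']['error']) {
    case UPLOAD_ERR_OK:
        $size = $_FILES['document']['size']; // only now is the size known and trustworthy
        break;
    case UPLOAD_ERR_INI_SIZE:  // exceeded upload_max_filesize
    case UPLOAD_ERR_FORM_SIZE: // exceeded the form's MAX_FILE_SIZE field
        http_response_code(413);
        exit('File exceeds the allowed size.');
    default:
        http_response_code(400);
        exit('Upload failed.');
}
```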
Problem 2: File chunking, is it the best solution?
I don't think this would be any better than uploading the whole file.
The speed will not increase: although you can parallelize the upload, the upload speed is determined by the client's bandwidth. You will end up uploading 1 file at 10 Mb/s or 10 files at 1 Mb/s; in the end it is the same.
Also, you will end up with very complex code on both the client and the server to handle it, and you will have to deal with new error scenarios.
It is not worth it.
Problem 3: stopping file upload and deleting interrupted upload
If the client stops the upload, your server code will not be executed as the request has not been completed.
Also, you won't have any trouble with duplicated tmp files: the file is removed from the tmp folder when you move it using move_uploaded_file(), and if you don't move it, it will be deleted when your script has finished.
https://www.php.net/manual/en/features.file-upload.post-method.php
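A minimal sketch of that flow, with a hypothetical field name and destination directory:

```php
<?php
// Sketch: persist a completed upload; whatever is left in tmp is cleaned up by PHP itself.
$tmp  = $_FILES['document']['tmp_name'];                          // hypothetical field name
$dest = __DIR__ . '/uploads/' . time() . '-' . uniqid() . '.pdf'; // placeholder naming scheme

if (is_uploaded_file($tmp) && move_uploaded_file($tmp, $dest)) {
    echo 'Stored as ' . basename($dest);
} else {
    // If the file is never moved, PHP deletes the tmp copy when the script ends.
    http_response_code(500);
    echo 'Could not store the upload.';
}
```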
Problem 5: what about multiple files upload?
With multiple file uploads, I prefer to use a separate request via JavaScript for each file; then you can parallelize, or start uploading the files one by one, and show the user the progress of each upload.
With JavaScript you can see the upload percentage of each request, which is very useful when uploading big files to give the user feedback that the application is working properly.
Problem 6: accessibility
You will face the same accessibility issues as when uploading a small file.
Conclusion
The server-side code is independent of the size of the file you are trying to upload; it just does some checks and moves the file to the proper location.
You will have to write some JavaScript to improve the UI and show the progress to the user, and if you want to add the ability to cancel the upload, that is also easy, as it is just an HTTP request with a listener that updates the progress (https://stackoverflow.com/a/47638378/1445024).
We have files that are hosted on RapidShare which we would like to serve through our own website. Basically, when a user requests http://site.com/download.php?file=whatever.txt, the script should stream the file from RapidShare to the user.
The only thing I'm having trouble getting my head around is how to properly stream it. I'd like to use cURL, but I'm not sure if I can read the download from RapidShare in chunks and then echo them to the user. The best way I've thought of so far is a combination of fopen, fread, echoing each chunk of the file to the user, flushing, and repeating that process until the entire file is transferred.
I'm aware of the PHP readfile() function as well, but would that be the best option? Bear in mind that these files can be several GB in size, and although we have servers with 16 GB of RAM, I want to keep memory usage as low as possible.
Thank you for any advice.
HTTP has a header called Range which basically allows you to fetch any chunk of a file (provided you already know the file size), but since PHP isn't multi-threaded, I don't see any benefit to using it here.
As far as I know, if you don't want to consume all your RAM, the only way to go is a two-step approach.
First, stream the remote file using fopen()/fread() (or any PHP functions that let you work with streams), split the read into small chunks (2048 bytes may be enough), write/append the result to a temporary file (e.g. tmpfile()), then echo it back to your user by reading the temporary file.
That way, even a 2 TB file would basically consume only 2048 bytes of memory at a time, since only the current chunk and the file handle are kept in memory.
You may also write some kind of proxy manager to cache already downloaded files, so you can skip the remote read when a file is heavily downloaded (and keep it locally for a given time).
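A rough sketch of that two-step loop; the remote URL, filename, and the 8 KB chunk size are placeholders, and the temporary buffer uses PHP's tmpfile():

```php
<?php
// Sketch: relay a remote file to the client in small chunks so memory use stays flat.
// URL, filename, and chunk size are placeholders.
$remote = 'https://example.com/files/whatever.txt';

$in  = fopen($remote, 'rb');
$tmp = tmpfile(); // step 1: buffer to disk so a flaky remote read doesn't
                  // leave the client with half a response

while (!feof($in)) {
    fwrite($tmp, fread($in, 8192)); // only this chunk is ever held in memory
}
fclose($in);

// Step 2: replay the buffered copy to the client.
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename="whatever.txt"');

rewind($tmp);
while (!feof($tmp)) {
    echo fread($tmp, 8192);
    flush();
}
fclose($tmp); // tmpfile() handles are removed automatically on close
```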
Maybe it would be best to start by describing the scenario.
We have a Debian server with ffmpeg that we use to convert various video files into FLV.
The files are supplied by a number of different people via FTP and are kept in the "uploads" folder.
I need to write a PHP script that goes through all the files in the uploads folder, selects the ones that are complete (i.e. not currently being uploaded and without any upload errors), and then converts them to FLV using ffmpeg.
I can do the conversion and everything else, but how do I determine whether a file is complete and fully uploaded?
Many thanks!
As far as I know, you can't just tell whether a file is still being uploaded. You could run a cron job every minute that records the file sizes in a database or file. Then, when the cron job runs the next time and a file's size is unchanged, convert it; if not, wait another minute and try again.
As far as I know, there is no size stored with a file that says how big it should be once the upload is done.
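A sketch of that size-comparison check, meant to run from cron every minute; the uploads directory, the JSON state file, and the hand-off function are all placeholders:

```php
<?php
// Sketch: treat a file as complete only when its size is unchanged between two cron runs.
$dir       = '/var/ftp/uploads';           // placeholder upload directory
$stateFile = '/var/ftp/upload-sizes.json'; // placeholder state file

$previous = is_file($stateFile) ? json_decode(file_get_contents($stateFile), true) : [];
$current  = [];

foreach (glob($dir . '/*') as $path) {
    $current[$path] = filesize($path);
    if (isset($previous[$path]) && $previous[$path] === $current[$path]) {
        // Size stable across two runs: assume the upload has finished.
        // queue_for_ffmpeg($path); // hypothetical hand-off to the conversion step
    }
}

file_put_contents($stateFile, json_encode($current));
```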
There is another way to go about this, which we have done for years.
Most FTP servers (ProFTPD does) can output a log that tells you when a file upload has completed successfully. You can point this logging at a Unix named pipe / FIFO and then have a daemonized script read it to determine which files to process. This works great and only processes files after they have been uploaded completely and successfully.
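A rough sketch of the reader side, assuming ProFTPD's transfer log has been pointed at a named pipe; the FIFO path and the xferlog field positions are assumptions to verify against your own configuration:

```php
<?php
// Sketch: block on a named pipe carrying the FTP transfer log and react to
// completed uploads. Path and field positions are assumptions; check the
// xferlog documentation for your server's exact format.
$fifo = '/var/log/proftpd/xfer.fifo';

$fh = fopen($fifo, 'r'); // blocks until the FTP server writes a line
while (($line = fgets($fh)) !== false) {
    $fields = preg_split('/\s+/', trim($line));
    if (count($fields) < 14) {
        continue; // not a standard xferlog line
    }
    $filename = $fields[8];                  // assumed position of the file name
    $status   = $fields[count($fields) - 1]; // 'c' = complete, 'i' = incomplete
    if ($status === 'c') {
        // convert_to_flv($filename); // hypothetical hand-off to ffmpeg
    }
}
fclose($fh);
```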
I would like to create an upload script that isn't subject to the PHP upload limit.
There might be an occasion where I need to upload a 2 GB or larger file, and I don't want to have to raise the whole server's limits above 32 MB.
Is there a way to write directly to disk from PHP?
What method would you propose to accomplish this? I have read around Stack Overflow but haven't quite found what I am looking for.
The simple answer is that you can't, due to the way Apache handles POST data.
If you're adamant about having larger file uploads and still want to use PHP for the backend, you could write a simple file upload receiver using the PHP sockets API and run it as a standalone service. Some good details can be found at http://devzone.zend.com/article/1086#Heading8
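For illustration, a bare-bones sketch of such a standalone receiver built on PHP's stream sockets, run from the CLI so the web server's POST limits never apply; the port, output path, and the complete lack of authentication and framing are placeholders for a real protocol:

```php
<?php
// Sketch: a standalone PHP upload receiver (run as `php receiver.php` from the CLI).
// Port and target directory are placeholders; a real service needs auth and framing.
$server = stream_socket_server('tcp://0.0.0.0:9000', $errno, $errstr);
if ($server === false) {
    die("Could not listen: $errstr\n");
}

while ($conn = stream_socket_accept($server, -1)) { // wait indefinitely for a client
    $out = fopen('/var/uploads/' . uniqid('upload_', true) . '.bin', 'wb');
    while (!feof($conn)) {
        fwrite($out, fread($conn, 8192)); // stream straight to disk, 8 KB at a time
    }
    fclose($out);
    fclose($conn);
}
```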
Though this is an old post, it comes up easily via Google when looking for a way to handle big file uploads with PHP.
I'm still not sure whether file uploads larger than the memory limit are possible, but I think there is a good chance they are. While looking for a solution to this problem, I found contradicting sources. The PHP manual states:
post_max_size: Sets max size of post data allowed. This setting also affects file upload. To upload large files, this value must be larger than upload_max_filesize. If memory limit is enabled by your configure script, memory_limit also affects file uploading. Generally speaking, memory_limit should be larger than post_max_size. (http://php.net/manual/en/ini.core.php)
...which implies that your memory limit should be larger than the file you want to upload. However, another user (ragtime at alice-dsl dot com) at php.net states:
I don't believe the myth that 'memory_size' should be the size of the uploaded file. The files are definitely not kept in memory... instead uploaded chunks of 1MB each are stored under /var/tmp and later on rebuild under /tmp before moving to the web/user space. I'm running a linux-box with only 64MB RAM, setting the memory_limit to 16MB and uploading files of sizes about 100MB is no problem at all! (http://php.net/manual/en/features.file-upload.php)
He reports some other related problems with the garbage collector but also states how they can be solved. If that is true, the uploaded file size may well exceed the memory limit. (Note, however, that processing the uploaded file is a different matter; for that you might have to load it into memory.)
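For reference, the relevant php.ini directives would look roughly like this; the numbers are only one example of a consistent combination, with memory_limit deliberately kept below the allowed upload size:

```ini
; Example values only: allow ~200 MB uploads while keeping memory_limit small,
; relying on PHP streaming uploads to temporary files rather than into memory.
upload_max_filesize = 200M
post_max_size = 210M     ; must be at least upload_max_filesize plus other form fields
memory_limit = 64M       ; only needs raising if the script later loads the file into memory
max_input_time = 300     ; time PHP may spend receiving and parsing the request body
```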
I'm writing this before having tried handling large file uploads with PHP myself, since I'm evaluating whether to use PHP or Python for this task.
You can do some interesting things based around PHP's sockets. Have you considered writing an applet in Java to upload the file to a listening PHP daemon? This probably won't work on most professional hosting providers, but if you're running your own server, you could make it work. Consider the following sequence:
Applet starts up, sends a request to PHP to open a listening socket
(You'll probably have to write a basic web browser in Java to make this work)
Java Applet reads the file from the file system and uploads it to PHP through the socket that was created in step 1.
Not the cleanest way to do it, but if you disable the PHP script timeout in your php.ini file, then you could make something work.
It isn't possible to upload a file larger than PHP's allowed limits with PHP alone; it's that simple.
Possible workarounds include using a client-side technology (like Java; I'm not sure whether Flash or JavaScript can do this) to split the original file into smaller chunks.
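If a client-side splitter is used, the receiving endpoint can stay within a small post_max_size because each request only carries one piece. A hedged sketch of the server side, where the parameter names (uploadId, chunk, last) are made up and there is no authentication, ordering, or integrity check:

```php
<?php
// Sketch: append one chunk per request so each POST stays far below post_max_size.
// 'uploadId', 'chunk', and 'last' are hypothetical parameter names; chunks are
// assumed to arrive in order, and a real implementation needs auth and a checksum.
$uploadId = preg_replace('/[^a-f0-9]/i', '', $_POST['uploadId'] ?? '');
$partial  = sys_get_temp_dir() . "/chunked-$uploadId.part";

if ($uploadId === '' || !isset($_FILES['chunk']) || $_FILES['chunk']['error'] !== UPLOAD_ERR_OK) {
    http_response_code(400);
    exit;
}

// Append this chunk to the growing partial file.
file_put_contents($partial, file_get_contents($_FILES['chunk']['tmp_name']), FILE_APPEND);

if (!empty($_POST['last'])) {
    // Final chunk received: move the assembled file out of the temp area.
    rename($partial, sys_get_temp_dir() . "/upload-$uploadId.complete");
    echo 'done';
}
```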