I have a CodeIgniter web app that uploads many tiny files to Amazon S3 every hour, which is causing my S3 request charges to rise very quickly. One way to overcome this would be to zip up the files, upload the zip file to S3, and then unzip it once it is on S3.
Can this be done using EC2? Or is there a better method to achieve this? Thank you!!
EDIT: If I were to use EC2, would I use PHP to trigger the creation of an EC2 instance, upload the PHP file required to unzip the zipped files, copy the uncompressed files to S3, then destroy the EC2 instance?
If you have an EC2 machine in the same region, I would suggest you upload the zip there and then push the unzipped files to S3 from the instance. S3 cannot unzip it on its own, as it is just static storage.
There are no data transfer charges between EC2 and S3 in the same region, so EC2 can handle the unzipping and then write the files out into your S3 bucket without additional transfer charges.
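A minimal sketch of that unzip-on-EC2 flow, assuming the AWS SDK for PHP; the bucket, keys and local paths are placeholders:

require 'vendor/autoload.php';
use Aws\S3\S3Client;

$s3 = new S3Client(['region' => 'us-east-1', 'version' => 'latest']);

// 1. Pull the zipped batch from S3 down to the EC2 instance.
$s3->getObject([
    'Bucket' => 'my-bucket',
    'Key'    => 'incoming/batch.zip',
    'SaveAs' => '/tmp/batch.zip',
]);

// 2. Extract it locally.
$zip = new ZipArchive();
$zip->open('/tmp/batch.zip');
$zip->extractTo('/tmp/batch');
$zip->close();

// 3. Push each extracted file back to S3 (no transfer charge in the same
//    region, but note that every PutObject is still a billable request).
foreach (glob('/tmp/batch/*') as $file) {
    $s3->putObject([
        'Bucket'     => 'my-bucket',
        'Key'        => 'extracted/' . basename($file),
        'SourceFile' => $file,
    ]);
}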
You can write code in a Lambda function to unzip a file from an S3 bucket; AWS Lambda will do this for you.
Reference:
https://github.com/carloscarcamo/aws-lambda-unzip-py/blob/master/unzip.py
https://github.com/mehmetboraezer/aws-lambda-unzip
S3 is just storage. Whatever file you upload is the file that is stored. You cannot upload a zip file and then extract it once it's in S3. If you wrote the application, the best thing I could suggest is to try to redesign how you store the files. S3 requests are pretty cheap... you must be making a lot of requests.
I have been using this service to unzip archives full of thousands of tiny image files; each zip I upload is about 4 GB and costs around $1 to unzip using http://www.cloudzipinc.com/service/s3_unzip. Maybe that might help someone.
Having said that, you might find it easier to use Python with the Boto library. That will work far more efficiently than PHP.
We have a zip file containing a large JSON file that we need to unzip. We currently fetch the zip in a Lambda using Laravel, copy and unzip it locally (in the Lambda), and then upload the JSON file back to S3 for processing.
However, the Lambda seems unable to handle large files (around 180 MB) when moving the unzipped JSON back to S3. We could keep increasing the Lambda's resources (it is deployed via Laravel Vapor), but we're looking for an in-memory option that could perhaps provide a streaming way of unzipping.
Is there such a solution for PHP?
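One option that might fit, sketched with ZipArchive and the AWS SDK for PHP's MultipartUploader: keep the zip on the Lambda's /tmp, but stream the JSON entry straight out of the archive to S3 in parts, so the unzipped JSON never has to sit fully in memory or on disk. The bucket, keys and the entry name data.json are placeholders.

require 'vendor/autoload.php';
use Aws\S3\S3Client;
use Aws\S3\MultipartUploader;

$s3 = new S3Client(['region' => 'us-east-1', 'version' => 'latest']);

// 1. Fetch the zip into /tmp (as the Lambda already does).
$s3->getObject([
    'Bucket' => 'my-bucket',
    'Key'    => 'incoming/archive.zip',
    'SaveAs' => '/tmp/archive.zip',
]);

// 2. Open a read stream on the JSON entry without extracting it to disk.
$zip = new ZipArchive();
$zip->open('/tmp/archive.zip');
$stream = $zip->getStream('data.json');

// 3. Pipe that stream to S3 in parts; only one part is buffered at a time.
$uploader = new MultipartUploader($s3, $stream, [
    'bucket' => 'my-bucket',
    'key'    => 'processed/data.json',
]);
$uploader->upload();

fclose($stream);
$zip->close();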
I want users to upload large files (like HD videos and big PDFs), and I am using this line of code to upload them:
Storage::disk('s3')->putFile('uploads', new File($request->file('file_upload')));
The problem is that even though I have a good internet connection, it takes a very long time to upload a file. What can I do to make the upload faster?
There are actually two network calls involved in the process.
From the client side, the file gets uploaded to your server.
Via a server-to-server call, the file gets uploaded from your server to S3.
The only way to reduce the delay is to upload the files directly from the client to S3, securely, using client-side SDKs. That way the files are stored straight into the S3 bucket.
Once the files are uploaded to S3 via the AWS S3 client-side SDKs, you can post the file's attributes along with its download URL to Laravel and save them to the DB.
A plus point of this approach is that it lets you show the actual upload progress on the client side.
This can be done via the AWS Amplify library, which provides great integration with S3: https://docs.amplify.aws/start
The other options:
JS: https://softwareontheroad.com/aws-s3-secure-direct-upload/
Android: https://grokonez.com/android/uploaddownload-files-images-amazon-s3-android
iOS: https://aws.amazon.com/blogs/mobile/amazon-s3-transfer-utility-for-ios/
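If you would rather not pull in a client SDK at all, another way to let the browser talk to S3 directly is to have Laravel hand out a short-lived pre-signed PUT URL. A minimal sketch, assuming the AWS SDK for PHP; the bucket name, key and expiry are placeholders:

use Aws\S3\S3Client;

$s3 = new S3Client(['region' => 'us-east-1', 'version' => 'latest']);

// Build the PutObject request the client will be allowed to perform.
$command = $s3->getCommand('PutObject', [
    'Bucket' => 'my-bucket',
    'Key'    => 'uploads/' . uniqid() . '.mp4',
]);

// The URL is valid for 20 minutes; the client PUTs the raw file body to it.
$request   = $s3->createPresignedRequest($command, '+20 minutes');
$uploadUrl = (string) $request->getUri();

// Return $uploadUrl (and the key) to the client; once the PUT succeeds, the
// client posts the key back to Laravel so it can be recorded in the DB.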
Please use this:
$file_name = $request->file('name');
$disk = Storage::disk('s3');
// passing a stream lets the upload be streamed instead of loading the whole file into memory
$disk->put($filePath, fopen($file_name, 'r+'));
Instead of:
Storage::disk('s3')->put($filePath, file_get_contents($file_name));
And also increase these php.ini settings:
post_max_size, upload_max_filesize, max_execution_time, memory_limit
I want to transcode videos to 360p, 480p and 720p and then upload them to Amazon S3.
Currently we are using the PHP library FFMpeg.
I have successfully transcoded video on my server, but I don't understand how to achieve the same with Amazon S3.
Do I need to first upload the original video to S3, then fetch it, transcode it into the different formats, and send those back to Amazon S3? Is that possible?
Or if there is any other way, please suggest it.
S3 is not a block file system; it is an object store. The difference is that, normally, you can't mount an S3 bucket like a standard Unix FS and work on files with fopen(), fwrite(), etc. Some tricks exist to treat S3 like any other FS, but I would suggest another option:
Transcode the video on a locally mounted FS (like AWS EFS, or a local file system), then "push" (upload) the whole transcoded video to the S3 bucket. Of course, you can improve this process in many ways (remove temp files, do work in parallel, use the Lambda service, or run tasks in containers...). You should avoid doing many uploads/downloads to or from S3, because it is time- and cost-consuming. Use local storage as much as possible, then push the resulting data to S3 once it is ready.
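A minimal sketch of that transcode-locally-then-push flow, assuming the php-ffmpeg library you already use and the AWS SDK for PHP; paths, bucket name and dimensions are placeholders:

require 'vendor/autoload.php';
use Aws\S3\S3Client;
use FFMpeg\FFMpeg;
use FFMpeg\Coordinate\Dimension;
use FFMpeg\Format\Video\X264;

$s3     = new S3Client(['region' => 'us-east-1', 'version' => 'latest']);
$ffmpeg = FFMpeg::create();

// Transcode to 360p on local (or EFS-mounted) storage first.
$video = $ffmpeg->open('/data/source.mp4');
$video->filters()->resize(new Dimension(640, 360))->synchronize();
$video->save(new X264('aac'), '/data/source_360p.mp4');

// Then push the finished file to S3 in a single upload.
$s3->putObject([
    'Bucket'     => 'my-bucket',
    'Key'        => 'videos/source_360p.mp4',
    'SourceFile' => '/data/source_360p.mp4',
]);

Repeat the same save/putObject step for the 480p and 720p renditions.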
AWS also has a service for video transcoding: https://aws.amazon.com/en/elastictranscoder/
1) I have an upload form.
2) It uploads the file to my local storage with move_uploaded_file.
3) It uses the Zend putObject function to move the file to an S3 object.
Everything works OK up to file sizes of around 30 MB to 40 MB. The problem is that when I try uploading larger files, like 80 MB or 100 MB, moving the file to S3 takes ages to complete. My code is something like this:
$orginalPath = APPLICATION_PATH."/../storage/".$fileName;
move_uploaded_file($data['files']['tmp_name'], "$orginalPath");
$s3 = new Zend_Service_Amazon_S3($accessKey, $secretKey);
$s3->putObject($path, file_get_contents($orginalPath),
array(Zend_Service_Amazon_S3::S3_ACL_HEADER =>Zend_Service_Amazon_S3::S3_ACL_PUBLIC_READ));
Can you help me move large files to S3 quickly? I tried using the stream wrapper like this:
$s3->registerStreamWrapper("s3");
file_put_contents("s3://my-bucket-name/orginal/$fileName", file_get_contents($orginalPath));
But no luck; it takes just as long to move the file.
Hence, is there an efficient way to move the file to the S3 bucket quickly?
The answer is a worker process. You can start a PHP worker script via the PHP CLI on server boot, perhaps with the GearmanClient PHP extension and a Gearman server running on your box. You then queue a background job to upload the file to S3: your main site PHP code returns success after issuing the job, and the file happily uploads in the background while your foreground site continues on its merry way. Another way of doing this is to have another server handle the whole task, so your main site stays free of this load. I am doing this now. It works well.
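A rough sketch of that setup, assuming the pecl Gearman extension and the AWS SDK for PHP; the function name upload_to_s3, bucket and paths are placeholders:

// worker.php — started once at boot via the PHP CLI
require 'vendor/autoload.php';
use Aws\S3\S3Client;

$s3 = new S3Client(['region' => 'us-east-1', 'version' => 'latest']);

$worker = new GearmanWorker();
$worker->addServer('127.0.0.1', 4730);
$worker->addFunction('upload_to_s3', function (GearmanJob $job) use ($s3) {
    $task = json_decode($job->workload(), true);
    $s3->putObject([
        'Bucket'     => 'my-bucket-name',
        'Key'        => $task['key'],
        'SourceFile' => $task['local'],
        'ACL'        => 'public-read',
    ]);
    unlink($task['local']); // optionally clean up the local copy afterwards
});
while ($worker->work());

// in the upload request: queue the job and return straight away
$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730);
$client->doBackground('upload_to_s3', json_encode([
    'local' => $orginalPath,
    'key'   => "orginal/$fileName",
]));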
You could consider using the more direct POST to S3 feature. The AWS SDK for PHP has a class to help generate the data for the form.
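For reference, a minimal sketch of that form-based POST using the SDK's PostObjectV4 helper; the bucket name, key prefix and expiry are placeholders, and the generated attributes and hidden inputs go into the HTML form the browser submits to S3:

require 'vendor/autoload.php';
use Aws\S3\S3Client;
use Aws\S3\PostObjectV4;

$s3 = new S3Client(['region' => 'us-east-1', 'version' => 'latest']);

$formInputs = ['acl' => 'public-read', 'key' => 'orginal/${filename}'];
$conditions = [
    ['acl' => 'public-read'],
    ['bucket' => 'my-bucket-name'],
    ['starts-with', '$key', 'orginal/'],
];

$postObject = new PostObjectV4($s3, 'my-bucket-name', $formInputs, $conditions, '+1 hours');

$formAttributes = $postObject->getFormAttributes(); // action, method, enctype for the <form> tag
$formFields     = $postObject->getFormInputs();     // hidden fields, including policy and signature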
When uploading an image, PHP stores the temp file in a local directory on the server.
Is it possible to change this temp location so it's off the local server?
Reason: I'm using load balancing without sticky sessions, and I don't want files to be uploaded to one server and then not be available on another. Note: I don't necessarily complete the file upload and work on the file in one go.
The preferred temp location would be AWS S3; I'm also just interested to know whether this is possible.
If it's not possible, I could make the file upload a single complete process that also puts the finished file in its final location.
I'm just interested to know whether the PHP temp image/file location can be off the local server.
Thank you!
You can mount the S3 bucket with s3fs on your instances behind the ELB, so that all your uploads are shared between application servers. As for /tmp, don't touch it; since the destination is S3 and it is shared, you don't have to worry.
If you have a lot of uploads, S3 might become a bottleneck. In that case, I suggest setting up a NAS. Personally, I use GlusterFS because it scales well and is very easy to set up. It has replication issues, but if you don't use replicated volumes at all, you're fine.
Other alternatives are Ceph, Sector/Sphere, XtreemFS, Tahoe-LAFS, POHMELFS and many others...
You can directly upload a file from a client to S3 with some newer technologies as detailed in this post:
http://www.ioncannon.net/programming/1539/direct-browser-uploading-amazon-s3-cors-fileapi-xhr2-and-signed-puts/
Otherwise, I personally would suggest using each server's tmp folder for exactly that: temporary storage. Once the file is on your server, you can always upload it to S3, where it will be accessible across all of your load-balanced servers.