It seems like the images read from Amazon S3 load really slowly. I had the images on the same server as the website and they loaded super fast. Is it slow because it has to access them from S3 now?
Is there anything I can really do about it?
Using this to read the image files:
$secure_link = gs_prepareS3URL("myAmazon/thumb/thumb_" . $id, $bucket);
readfile($secure_link);
The function is from: http://www.richardpeacock.com/blog/2010/07/amazon-aws-s3-query-string-authentication-php
If you're embedding the images, you should serve them through Amazon CloudFront (Amazon's CDN service). CloudFront loads the image/file from S3 (or a custom origin) and then caches it on its edge servers.
CloudFront Tutorial - http://www.hongkiat.com/blog/amazon-cloudfront-how-to-setup-cloudfront-to-work-with-s3/
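If you are already doing something like the readfile() snippet above, the change is mostly a URL swap. A minimal sketch, assuming a CloudFront distribution (the domain d1234abcd.cloudfront.net is hypothetical) with the bucket as its origin:

// Hypothetical CloudFront distribution domain whose origin is the S3 bucket.
$cdnDomain = 'd1234abcd.cloudfront.net';

// Instead of proxying the bytes through PHP with readfile(), emit the CDN URL
// so the browser fetches the image from the nearest edge location.
$imageUrl = 'https://' . $cdnDomain . '/thumb/thumb_' . rawurlencode($id);
echo '<img src="' . htmlspecialchars($imageUrl) . '" alt="thumbnail">';

This also removes a whole round trip: with readfile(), every image travels S3 -> your server -> browser on every request.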
S3 is just static storage; by default it is not optimized for delivery performance, though there are ways to make it better.
Before you configure CloudFront, you should try enabling Transfer Acceleration on the S3 bucket.
Source: https://docs.aws.amazon.com/AmazonS3/latest/userguide/transfer-acceleration.html
It helps when:
Your customers upload to a centralized bucket from all over the world.
You transfer gigabytes to terabytes of data on a regular basis across continents.
You can't use all of your available bandwidth over the internet when uploading to Amazon S3.
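A minimal sketch of enabling and using it with the AWS SDK for PHP v3 (bucket name and region are assumptions; putBucketAccelerateConfiguration and the use_accelerate_endpoint client option are the relevant pieces):

require 'vendor/autoload.php';

use Aws\S3\S3Client;

$s3 = new S3Client(['version' => 'latest', 'region' => 'us-east-1']);

// One-time switch: turn acceleration on for the bucket.
$s3->putBucketAccelerateConfiguration([
    'Bucket'                  => 'my-bucket', // hypothetical bucket name
    'AccelerateConfiguration' => ['Status' => 'Enabled'],
]);

// Transfers then go through bucketname.s3-accelerate.amazonaws.com
// when the client is built with the accelerate endpoint enabled.
$s3Accelerated = new S3Client([
    'version'                 => 'latest',
    'region'                  => 'us-east-1',
    'use_accelerate_endpoint' => true,
]);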
This comes at a price (https://aws.amazon.com/about-aws/whats-new/2016/04/transfer-files-into-amazon-s3-up-to-300-percent-faster/):
Pricing for Amazon S3 Transfer Acceleration is simple, with no upfront costs or long-term commitments. You simply pay a low, per-GB rate for data transferred through the service. The pricing is designed to be risk free: if Amazon S3 Transfer Acceleration isn't likely to make a difference in the speed of an upload (like when you upload data over the short distance from a client in Tokyo to an S3 bucket in Japan), you won't be charged anything extra for that upload. For more information on pricing, see Amazon S3 pricing.
I want a little guidance from you all. I have a multimedia-heavy site hosted on a traditional Linux-based LAMP plan. As the site is mostly image/video content, there are around 30,000+ posts; the database is only around 20-25 MB, but file-system usage is about 10 GB, and around 800-900 GB of the allowed 1 TB of bandwidth gets used every month.
Now, after a little brainstorming and looking at my alternatives here and there, I have come up with two options:
Increase / Get a bigger hosting plan.
Get my static content stored on Amazon S3.
While the first is the simpler option, I am actually leaning toward the second, i.e. storing my static content on Amazon S3. The website I have is totally custom-coded, based on PHP+MySQL. I went through http://undesigned.org.za/2007/10/22/amazon-s3-php-class/ and it gave me a fair idea.
I would love to know the pros/cons of hosting static content on S3.
Please give your inputs.
Increase / Get a bigger hosting plan.
I would not do that. The reason is that storage is cheap, while the other components of a "bigger hosting plan" will cost you dearly without providing an immediate benefit (more memory is expensive if you don't need it).
Get my static content stored on Amazon S3.
This is the way to go. S3 is very inexpensive; it is a no-brainer. Having said that, since we are talking video here, I would recommend a third option:
3. Store video on S3 and serve it through CloudFront. It is still rather inexpensive by comparison, given the spectacular bandwidth and global distribution. CloudFront is Amazon's CDN for blazing-fast delivery to any location.
If you want to save on bandwidth, you may also consider using Amazon Elastic Transcoder for high-quality compression (to minimize your bandwidth usage).
Traditional hosting is way too expensive for this.
Bigger Hosting Plan
Going for a bigger hosting plan is not a permanent solution, because:
Static content (images/videos) always grows in size. This time your need is 1 TB; next time it will be more, so you will end up in the same situation again.
With the growth of users and static content, your bandwidth usage will also increase and cost you more.
Your database is not big, and we can assume you are not using much CPU power or memory, so you would only be consuming more disk space while paying for a larger CPU and memory allocation you don't use.
Technically it is not good to serve all your requests from a single server; browsers limit the number of simultaneous requests per domain.
S3 / Cloud Storage for Static Content
S3 or other cloud storage is a good option for static content. The benefits:
You don't need to worry about storage space; it scales automatically and is available in abundance.
If your site is accessed from different locations worldwide, you can put a CDN in front to speed up delivery by serving content from the nearest location.
The bandwidth is very cheap compared to traditional hosting.
Serving your files from S3 also takes load off your own server.
These are some of the benefits of S3 over traditional hosting; S3 is built specifically to serve static content. The decision is yours :)
If you're looking at the long term, at some point you might not be able to afford a server that will hold all of your data. I think S3 is a good option for a case like yours for the following reasons:
You don't need to worry about large file uploads tying up your server. With Cross-Origin Resource Sharing, you can upload files directly from the client to your S3 bucket (see the sketch after this list).
Modern browsers will often load parallel requests when a webpage requests content from different domains. If you have your pictures coming from yourbucket.s3.amazonaws.com and the rest of your website loading from yourdomain.com, your users might experience a shorter load time since these requests will be run in parallel.
At some point, you might want to use a Content Distribution Network (CDN) to serve your media. When this happens, you could use Amazon's cloudfront with out of the box support for S3, or you can use another CDN - most popular CDNs these days do support serving content from S3 buckets.
It's a problem you'll never have to worry about. Amazon takes care of redundancy, availability, backups, failovers, etc. That's a big load off your shoulders leaving you with other things to take care of knowing your media is being stored in a way that's scalable and future-proof (at least the foreseeable future).
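On the direct-upload point, a sketch with the AWS SDK for PHP v3's PostObjectV4 helper (bucket name, key prefix, and expiry are assumptions): the server only signs an upload policy, and the browser POSTs the file straight to S3.

require 'vendor/autoload.php';

use Aws\S3\S3Client;
use Aws\S3\PostObjectV4;

$s3 = new S3Client(['version' => 'latest', 'region' => 'us-east-1']);

// Policy: uploads must be private and land under uploads/ in the bucket.
$formInputs = ['acl' => 'private', 'key' => 'uploads/${filename}'];
$options = [
    ['acl' => 'private'],
    ['bucket' => 'my-bucket'],           // hypothetical bucket name
    ['starts-with', '$key', 'uploads/'],
];

$postObject = new PostObjectV4($s3, 'my-bucket', $formInputs, $options, '+1 hours');
$attributes = $postObject->getFormAttributes(); // action, method, enctype for the <form>
$inputs     = $postObject->getFormInputs();     // hidden fields, including the signature

Render $attributes and $inputs into an HTML form with a file input, and the upload bypasses your server entirely.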
I'm working on a project that is being hosted on Amazon Web Services. The server setup consists of two EC2 instances, one Elastic Load Balancer and an extra Elastic Block Store on which the web application resides. The project is supposed to use S3 for storage of files that users upload. For the sake of this question, I'll call the S3 bucket static.example.com
I have tried using s3fs (https://code.google.com/p/s3fs/wiki/FuseOverAmazon), RioFS (https://github.com/skoobe/riofs) and s3ql (https://code.google.com/p/s3ql/). s3fs will mount the filesystem but won't let me write to the bucket (I asked this question on SO: How can I mount an S3 volume with proper permissions using FUSE). RioFS will mount the filesystem and will let me write to the bucket from the shell, but files that are saved using PHP don't appear in the bucket (I opened an issue with the project on GitHub). s3ql will mount the bucket, but none of the files that are already in the bucket appear in the filesystem.
These are the mount commands I used:
s3fs static.example.com -ouse_cache=/tmp,allow_other /mnt/static.example.com
riofs -o allow_other http://s3.amazonaws.com static.example.com /mnt/static.example.com
s3ql mount.s3ql s3://static.example.com /mnt/static.example.com
I've also tried using this S3 class: https://github.com/tpyo/amazon-s3-php-class/ and this FuelPHP specific S3 package: https://github.com/tomschlick/fuel-s3. I was able to get the FuelPHP package to list the available buckets and files, but saving files to the bucket failed (but did not error).
Have you ever mounted an S3 bucket on a local linux filesystem and used PHP to write a file to the bucket successfully? What tool(s) did you use? If you used one of the above mentioned tools, what version did you use?
EDIT
I have been informed that the issue I opened with RioFS on GitHub has been resolved. Although I decided to use the S3 REST API rather than attempting to mount a bucket as a volume, it seems that RioFS may be a viable option these days.
Have you ever mounted an S3 bucket on a local linux filesystem?
No. It's fun for testing, but I wouldn't let it near a production system. It's much better to use a library to communicate with S3. Here's why:
It won't hide errors. A filesystem only has a few error codes it can send you to indicate a problem. An S3 library will give you the exact error message from Amazon so you understand what's going on, log it, handle corner cases, etc.
A library will use less memory. Filesystem layers cache lots of random stuff that you may never use again. A library puts you in control to decide what to cache and what not to cache.
Expansion. If you ever need to do anything fancy (set an ACL on a file, generate a signed link, versioning, lifecycle, change durability, etc), then you'll have to dump your filesystem abstraction and use a library anyway.
Timing and retries. Some fraction of requests randomly error out and can be retried. Sometimes you may want to retry a lot, sometimes you would rather error out quickly. A filesystem doesn't give you granular control, but a library will.
The bottom line is that S3 under FUSE is a leaky abstraction. S3 doesn't have (or need) directories. Filesystems weren't built for billions of files. Their permissions models are incompatible. You are wasting a lot of the power of S3 by trying to shoehorn it into a filesystem.
Two random PHP libraries for talking to S3:
https://github.com/KnpLabs/Gaufrette
https://aws.amazon.com/sdkforphp/ - this one is useful if you expand beyond just using S3, or if you need to do any of the fancy requests mentioned above.
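For a feel of the difference, a minimal sketch with the AWS SDK for PHP v3 (bucket and key names are assumptions): exact error messages on upload, plus one of the "fancy requests" mentioned above, a signed link.

require 'vendor/autoload.php';

use Aws\S3\S3Client;
use Aws\Exception\AwsException;

$s3 = new S3Client(['version' => 'latest', 'region' => 'us-east-1']);

// Upload: failures surface as exceptions carrying Amazon's exact error
// message, not a generic filesystem errno.
try {
    $s3->putObject([
        'Bucket'     => 'my-bucket',       // hypothetical bucket name
        'Key'        => 'docs/report.pdf',
        'SourceFile' => '/tmp/report.pdf',
    ]);
} catch (AwsException $e) {
    error_log($e->getAwsErrorMessage());
}

// A signed link valid for 10 minutes -- impossible to express through a mount.
$cmd = $s3->getCommand('GetObject', ['Bucket' => 'my-bucket', 'Key' => 'docs/report.pdf']);
$url = (string) $s3->createPresignedRequest($cmd, '+10 minutes')->getUri();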
Quite often, it is advantageous to write files to the EBS volume, then force subsequent public requests for the file(s) to route through CloudFront CDN.
In that way, if the app must do any transformations to the file, it's much easier to do on the local drive & system, then force requests for the transformed files to pull from the origin via CloudFront.
E.g., if your user is uploading an image for an avatar, and the avatar image needs several iterations for size and crop, your app can create these on the local volume, but all public requests for the file take place through a CloudFront origin-pull request. That way you have maximum flexibility to keep the original file (or an optimized version of it), and any subsequent user request either pulls an existing version from a CloudFront edge, or CloudFront routes the request back to the app to create any necessary iterations.
An elementary example of the above would be WordPress, which creates multiple sized/cropped versions of any graphic image uploaded, in addition to keeping the original (subject to file size restrictions, and/or plugin transformations). CDN-capable WordPress plugins such as W3 Total Cache rewrite requests to pull through CDN, so the app only needs to create unique first-request iterations. Adding browser caching URL versioning (http://domain.tld/file.php?x123) further refines and leverages CDN functionality.
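A minimal sketch of that avatar flow with PHP's GD extension (paths, sizes, and the CloudFront domain are assumptions):

$original = '/var/www/uploads/avatar_' . (int) $userId . '.jpg';
$src = imagecreatefromjpeg($original);

// Create the size iterations on the local EBS volume, next to the original.
foreach ([32, 64, 128] as $size) {
    $resized = imagescale($src, $size, $size);
    imagejpeg($resized, "/var/www/uploads/avatar_{$userId}_{$size}.jpg", 85);
    imagedestroy($resized);
}
imagedestroy($src);

// Public requests go through the CDN; CloudFront pulls from the app (origin)
// on the first request and serves from its edge caches thereafter.
$avatarUrl = "https://d1234abcd.cloudfront.net/uploads/avatar_{$userId}_64.jpg";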
If you are concerned about rapid expansion of EBS volume file size or inodes, you can automate a pruning process for seldom-requested files, or aged files.
I have several gigabytes of documents which I wish to store online somewhere in a system I can access from multiple servers using HTTP requests. These are mostly 5-200kB text documents (very few in binary formats) that are not read very often, and need to be stored in a way that all servers can access them. Cost is a big factor.
These documents do not have additional attributes, so if the files were larger I would use S3 for sure, but since they are so small I'm not sure which service would be easier to work with.
Has anyone used either of these services for this type of thing?
I'm pretty sure the maximum row size in SimpleDB is lower than 200kb, so you'd have to use S3.
S3 has the massive advantage that you can access files very simply over HTTP, and it supports REST operations for creating/updating/deleting files. This makes it incredibly easy to talk to.
Combine that with the fact that with S3 you only pay for storage (with SimpleDB, you also pay for machine hours), and I'd say S3 was the best solution in this case.
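To illustrate how simple that access is, a public object is one HTTP GET away, no SDK needed (bucket and key are hypothetical):

// Plain HTTP read of a public S3 object; REST PUT/DELETE work the same way.
$doc = file_get_contents('https://my-bucket.s3.amazonaws.com/docs/example.txt');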
I currently have a document management system running across several servers which stores all documents in S3. The documents/files range from 1KB to 2GB. So far I've found s3 to be brilliant, very easy to communicate with in almost any language and as a bonus offers AES encryption.
If Storage Size & Cost are your major deciding factors - SimpleDB's sizing model includes indexes and therefore costs more than S3 byte-for-byte. Starting at $0.25 to $0.34 per GB-month, it is a waste to load up SimpleDB with data and not use complex queries.
The attribute value size limit of 1k may impact your designs and require you to chunk values. Great for javascript/terrible for html.
SimpleDB is fantastic as an index to your content hosted on S3 and pre/post processed on EC2.
You get in-network bandwidth (between EC2, S3, and SimpleDB in the same region) for free.
S3 has the cheapest per-GB-month storage.
Actively Reduce/Compress your output via EC2 proxy and you are effectively paying only thumbnail bandwidth costs.
Adding a caching proxy keeps the SimpleDB cpu costs to a minimum.
I will be launching a web application soon which will require users to upload pictures for others to view. I would like to use Amazon S3 to store the images, as it scales and is cheap. The user will use a form to upload their file; it will be processed with PHP and saved to the S3 mount that's attached to the web server.
I am anticipating and hoping tens or hundreds of thousands of images will eventually be uploaded.
My first question is whether an S3 bucket mount is robust and fast enough for such an application, or would I be better off using Amazon EBS. Although I'd like to have my own dedicated box rather than use an EC2 instance.
Also, I am at this point unfamiliar with S3, but when I do upload files, is it appropriate to put them in a single bucket rather than a cascade of directories? It seems it might be ok since each 'bucket' is virtual anyway.
One of the things you can do is have your users upload to the S3 bucket directly, unless you want to do some processing first. You can use POST to upload the files to S3, or one of the third-party components such as http://flajaxian.com/. This way you can significantly offload your server.
As for your second question, it is really up to you how you design your app; there are no strong pros and cons either way.
I will be launching an application in the very near future which will, in part, require users to upload files (images) to be viewed by other members. I like the idea of S3 as it is relatively cheap and scales automatically.
My problem is how I will have users upload their images to S3. It seems there are a few options.
1- Use the PHP REST API. The only problem is that I can't get it to work for uploading variously scaled versions (i.e. thumbnails) of the same image simultaneously and directly to S3 (it works for just one image at a time this way). Overall, it just seems less flexible.
http://net.tutsplus.com/tutorials/php/how-to-use-amazon-s3-php-to-dynamically-store-and-manage-files-with-ease/
2- The other option would be to mount an S3 bucket with s3fs. Then just programmatically move my images into the bucket like I would with NFS. From what I've read, it seems some people are dubious of the reliability of mounting S3. Is this true?
http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=fuse+over+amazon
Which method would be better for maximum reliability and speed?
Would EBS be something to consider? I would really like to have a dedicated box rather than use an EC2 instance, though...
For your use case I recommend using the S3 API directly rather than s3fs, for performance reasons. Remember that s3fs is just another layer on top of S3's API, and its usage of that API is not always the best fit for your application.
To handle the creation of thumbnails, I recommend decoupling it from the main upload process using Amazon Simple Queue Service. That way your users receive a response as soon as a file is uploaded, without having to wait for it to be processed, resulting in shorter response times.
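A sketch of that decoupling with the AWS SDK for PHP v3 (queue URL, bucket, and key are assumptions): the upload handler just enqueues a job and returns.

require 'vendor/autoload.php';

use Aws\Sqs\SqsClient;

$sqs = new SqsClient(['version' => 'latest', 'region' => 'us-east-1']);

// Enqueue the thumbnail job right after the original lands in S3.
$sqs->sendMessage([
    'QueueUrl'    => 'https://sqs.us-east-1.amazonaws.com/123456789012/thumbnails',
    'MessageBody' => json_encode(['bucket' => 'my-bucket', 'key' => 'uploads/photo.jpg']),
]);
// A separate worker polls the queue, fetches the original from S3, renders the
// thumbnails, and uploads them back -- the user never waits on any of that.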
As for using EBS, that is a different scenario. EBS is just persistent storage for an Amazon EC2 instance, and its reliability doesn't compare with S3's.
It's also important to remember that S3 only offers "eventual consistency", as opposed to a physical HDD on your machine or an EBS volume on EC2, so you need to code your app to handle that correctly.
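A minimal sketch of the kind of defensive read-after-write check that implies (assumes an S3Client instance $s3 as in the snippets above; bucket and key are hypothetical):

// Don't assume a key you just wrote is immediately readable everywhere:
// poll briefly before handing its URL to the user.
$exists = false;
for ($i = 0; $i < 5 && !$exists; $i++) {
    $exists = $s3->doesObjectExist('my-bucket', 'uploads/photo.jpg');
    if (!$exists) {
        usleep(200000); // back off 200 ms between checks
    }
}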