Using the putObject command with an aliased bucket - PHP

I've got a script that uploads files into buckets perfectly fine. However, one particular bucket has been given a CNAME so that it can be accessed directly; apparently this was set up using CloudFront.
I'm no expert in this field, but basically, instead of accessing the bucket via:
http://mybucket.mysite.com.s3.amazonaws.com/thing.txt, it allows you to access it via:
http://mybucket.mysite.com/thing.txt
The put itself appears to go through fine: when I dump the response in the callback it says it's all done, but the last element in the array (the ObjectURL) swaps the bucket and the endpoint around, so it looks like this: https://s3.amazonaws.com/mybucket.mysite.com/thing.txt
However, when I use any other bucket it uploads correctly and returns the correct ObjectURL.
Having had a search around Google and this site, I can't seem to find a solution, so any help would be magic.
I'm using an older version of the AWS SDK for PHP 2, currently 2.2.1.
Edit: even stranger still, when I pass the bucket through the isValidBucketName method, it returns true.

Just in case anyone else ever encounters this issue: the problem was that when executing the put, the SDK automatically assumes that you are trying to connect to a bucket in the US Standard region. I needed to specify the region the bucket is actually in, in my case EU_WEST_1. So when you set up your config array, be sure to provide this value, e.g.
$config = array(
    'key'    => 'your-key',
    'secret' => 'your-secret',
    'region' => Region::EU_WEST_1,
);
Be sure to import Aws\Common\Enum\Region (use Aws\Common\Enum\Region;) in the class as well.
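For reference, here is a minimal sketch of the full setup along those lines; the bucket and file names below are placeholders, not the exact code from the question:

require 'vendor/autoload.php';

use Aws\S3\S3Client;
use Aws\Common\Enum\Region;

// Point the client at the region the bucket actually lives in; without this
// the SDK falls back to the US Standard endpoint and builds the wrong URL.
$s3 = S3Client::factory(array(
    'key'    => 'your-key',
    'secret' => 'your-secret',
    'region' => Region::EU_WEST_1,
));

$result = $s3->putObject(array(
    'Bucket' => 'mybucket.mysite.com',
    'Key'    => 'thing.txt',
    'Body'   => fopen('/path/to/thing.txt', 'r'),
));

echo $result['ObjectURL'];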

Related

S3 DeleteObject - DeleteMarker always returns empty

I am using the AWS SDK for PHP, version 2.4.7, installed via Composer. After deleting a file from an S3 bucket, the DeleteMarker key in the response object is always empty, even though the file has actually been deleted from S3. The documentation states that DeleteMarker should be true if the operation was successful and false otherwise.
My delete call is:
// delete S3 object
$result = $s3->deleteObject(array(
    'Bucket' => $this->_bucket,
    'Key'    => $object_key,
));
and the response is:
Guzzle\Service\Resource\Model Object
(
    [structure:protected] =>
    [data:protected] => Array
        (
            [DeleteMarker] =>
            [VersionId] =>
            [RequestId] => 2CC3EC60C4294CB5
        )
)
If I then do:
// check if was deleted
$is_deleted = (bool) $result->get('DeleteMarker');
$is_deleted is always false. How can it be that there is no value returned against the DeleteMarker key even though the delete operation was actually successful and the file was removed from S3?
UPDATE:
If I add a slash to the start of my key, I get false back, even though the file is still removed from S3.
Key "path/to/my/image.jpg" results in DeleteMarker having an empty value
Key "/path/to/my/image.jpg" results in DeleteMarker being false
But in both cases the image is removed from the S3 bucket.
In converting from SDK v1.x to v2.x, I too ran into the problem of not knowing whether the file had been deleted (there used to be an ->isOK() method on just about everything that would let me know whether the file had been deleted or not).
I finally stumbled upon this response from the Guzzle creator: https://forums.aws.amazon.com/thread.jspa?messageID=455154
Basically, there is no longer any 'did delete' flag of any kind. What Michael (the Guzzle creator) suggests is this: if you want to know whether a file was deleted, call ->deleteObject() and then run ->doesObjectExist() to see if the delete was successful.
The rationale for the change is this: the new approach lets you fire off tons of delete requests without having to wait for replies, etc.
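A minimal sketch of that pattern, reusing the client and variables from the question's delete call:

// The delete response alone won't tell you whether anything was removed.
$s3->deleteObject(array(
    'Bucket' => $this->_bucket,
    'Key'    => $object_key,
));

// Confirm by checking whether the object is still there.
$is_deleted = !$s3->doesObjectExist($this->_bucket, $object_key);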
For what it's worth.
David
I am having the same issue with the JavaScript SDK. The call to deleteObject returns fine (HTTP 204) regardless of whether the file exists or not! This makes it impossible to tell from the response code whether the file was deleted. Furthermore, it seems that the response only includes DeleteMarker if the bucket has versioning enabled (see also this thread on DeleteMarker).
I see two possibilities to work around this issue.
As a first option, you can enable versioning and use DELETE Object versionId to permanently delete your objects (see the AWS documentation). This requires you to either store the versionId in your database or query it prior to deletion using listObjectVersions.
As a second option, you can use listObjects to check whether the file exists, delete it with deleteObject, and then check listObjects again to make sure the file was definitely deleted.
I am not satisfied with either solution, but they do the job for now.
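A rough sketch of the first option, shown in PHP with a 2.x S3Client for consistency with the rest of the thread (the JavaScript calls are analogous), assuming versioning is already enabled on the bucket:

// Look up the version ID of the object (or store it at upload time).
// A real implementation should match the returned Key exactly, since
// Prefix can also match other keys.
$versions = $s3->listObjectVersions(array(
    'Bucket' => $bucket,
    'Prefix' => $object_key,
));
$versionId = $versions['Versions'][0]['VersionId'];

// Deleting a specific version permanently removes it.
$result = $s3->deleteObject(array(
    'Bucket'    => $bucket,
    'Key'       => $object_key,
    'VersionId' => $versionId,
));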

php - simpledb - can't get consistent read to work

I'm using AWS SimpleDB for my site; however, if I update an attribute with something completely different, searching that property with either the new value or the old value returns the same record.
Let's say the 'login' property's current value is 'dev'. I then change that value to 'myvar'.
$response = $this->simpledb->select(vsprintf("select * from mydomain where login='%s'", array('myvar')), array('ConsistentRead' => 'true'));
# returns the newly updated row
$response = $this->simpledb->select(vsprintf("select * from mydomain where login='%s'", array('dev')), array('ConsistentRead' => 'true'));
# returns the same row even though 'login' has changed
Am I doing something wrong with the consistent read argument? I have no clue why this is happening. Also, it's been about half an hour and this issue is still happening; I highly doubt it takes AWS that long to propagate changes across servers.
Anyone have any ideas?
I did not realize this at the time, but I was using v1 of the SDK. After updating to v2, all consistency issues were solved.
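For what it's worth, a minimal sketch of the same query against the v2 SDK's SimpleDbClient; the credentials, region and domain name here are placeholders:

use Aws\SimpleDb\SimpleDbClient;

$simpledb = SimpleDbClient::factory(array(
    'key'    => 'your-key',
    'secret' => 'your-secret',
    'region' => 'us-east-1',
));

// ConsistentRead is a real boolean here, not the string 'true'.
$response = $simpledb->select(array(
    'SelectExpression' => vsprintf("select * from mydomain where login='%s'", array('myvar')),
    'ConsistentRead'   => true,
));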

How can I efficiently get a list of matching Amazon S3 files?

Assume 200,000 images in a flat Amazon S3 bucket.
The bucket looks something like this:
000000-1.jpg
000000-2.jpg
000000-3.jpg
000000-4.jpg
000001-1.jpg
000001-2.jpg
000002-1.jpg
...
ZZZZZZ-9.jpg
ZZZZZZ-10.jpg
(a 6 digit hash followed by a count, followed by the extension)
If I need all files matching 000001-*.jpg, what's the most efficient way to get that?
In PHP I'd use rglob($path,'{000001-*.jpg}',GLOB_BRACE) to get an array of matches, but I don't think that works remotely.
I can get a list of all files in the bucket, then find matches in the array, but that seems like an expensive request.
What do you recommend?
Amazon provides a way to do this directly using the S3 api.
You can use the prefix option when listing S3 objects to return only objects whose keys begin with that prefix, e.g. using the AWS SDK for PHP:
// Instantiate the class
$s3 = new AmazonS3();

$response = $s3->list_objects('my-bucket', array(
    'prefix' => '000001-'
));

// Success?
var_dump($response->isOK());
var_dump(count($response->body->Contents));
You might also find the delimiter option useful - you could use that to get a list of all the unique 6 digit hashes.
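If you are on the 2.x SDK instead, the equivalent call (a sketch, assuming an already-configured S3Client in $s3) uses the Prefix parameter of listObjects:

// Note: a single listObjects call returns at most 1000 keys; the SDK's
// getIterator('ListObjects', ...) will page through larger result sets.
$result = $s3->listObjects(array(
    'Bucket' => 'my-bucket',
    'Prefix' => '000001-',
));

foreach ($result['Contents'] as $object) {
    echo $object['Key'], "\n";
}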

How do I display protected Amazon S3 images on my secure site using PHP?

I am trying to move images for my site from my host to Amazon S3 cloud hosting. These images are of client work sites and cannot be publicly available. I would like them to be displayed on my site preferably by using the PHP SDK available from Amazon.
So far I have been able to script the conversion so that I look up records in my database, grab the file path, name it appropriately, and send it to Amazon.
//upload to s3
$s3->create_object($bucket, $folder.$file_name_new, array(
    'fileUpload' => $file_temp,
    'acl' => AmazonS3::ACL_PRIVATE, //access denied, grantee only owner
    //'acl' => AmazonS3::ACL_PUBLIC, //image displayed
    //'acl' => AmazonS3::ACL_OPEN, //image displayed, grantee everyone has open permission
    //'acl' => AmazonS3::ACL_AUTH_READ, //image not displayed, grantee auth users have open permissions
    //'acl' => AmazonS3::ACL_OWNER_READ, //image not displayed, grantee only ryan
    //'acl' => AmazonS3::ACL_OWNER_FULL_CONTROL, //image not displayed, grantee only ryan
    'storage' => AmazonS3::STORAGE_REDUCED
));
Before I copy everything over, I have created a simple form to test uploading and displaying an image. If I upload an image using ACL_PRIVATE, I can either grab the public URL and have no access, or grab the URL with a temporary key and display the image.
<?php
//display the image link
$temp_link = $s3->get_object_url($bucket, $folder.$file_name_new, '1 minute');
?>
<a href='<?php echo $temp_link; ?>'><?php echo $temp_link; ?></a><br />
<img src='<?php echo $temp_link; ?>' alt='finding image' /><br />
Using this method, how will my caching work? I'm guessing that every time I refresh the page, or modify one of my records, I will be pulling that image again, increasing my GET requests.
I have also considered using bucket policies to only allow image retrieval from certain referrers. Do I understand correctly that Amazon is supposed to only fetch requests from pages or domains I specify?
I referenced:
https://forums.aws.amazon.com/thread.jspa?messageID=188183&#188183 to set that up, but I am then confused as to which security I need on my objects. It seemed like if I made them private they still would not display unless I used the temp link mentioned previously. If I made them public, I could navigate to them directly, regardless of referrer.
Am I way off from what I'm trying to do here? Is this not really supported by S3, or am I missing something simple? I have gone through the SDK documentation and done lots of searching, and I feel like this should be a little more clearly documented, so hopefully any input here can help others in this situation. I've read about others who name the file with a unique ID, creating security through obscurity, but that won't cut it in my situation, and it's probably not best practice for anyone trying to be secure.
The best way to serve your images is to generate a url using the PHP SDK. That way the downloads go directly from S3 to your users.
You don't need to download via your servers as @mfonda suggested - you can set any caching headers you like on S3 objects - and if you did, you would lose some of the major benefits of using S3.
However, as you pointed out in your question, the URL will always be changing (actually the querystring), so browsers won't cache the file. The easy workaround is simply to always use the same expiry date so that the same querystring is always generated. Or better still, 'cache' the URL yourself (e.g. in the database) and reuse it every time.
You'll obviously have to set the expiry time somewhere far in the future, but you can regenerate these URLs every so often if you prefer. E.g. in your database you would store the generated URL and the expiry date (you could parse that from the URL too). Then you either use the existing URL or, if the expiry date has passed, generate a new one, and so on.
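A rough sketch of that approach, using the same SDK v1 get_object_url() call as the upload code in the question; the database caching itself is left as a hypothetical:

// Always sign with the same (far-future) expiry so the querystring,
// and therefore the browser's cache key, stays stable.
$expires = strtotime('2030-01-01');
$temp_link = $s3->get_object_url($bucket, $folder.$file_name_new, $expires);

// Hypothetical: store $temp_link and $expires alongside the record, reuse
// the stored link on each page view, and regenerate once $expires has passed.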
You can use bucket policies in your Amazon bucket to allow your application's domain to access the file. In fact, you can even add your local dev domain (ex: mylocaldomain.local) to the access list and you will be able to get your images. Amazon provides sample bucket policies here: http://docs.aws.amazon.com/AmazonS3/latest/dev/AccessPolicyLanguage_UseCases_s3_a.html. This was very helpful to help me serve my images.
The policy below solved the problem that brought me to this SO topic:
{
    "Version": "2008-10-17",
    "Id": "http referer policy example",
    "Statement": [
        {
            "Sid": "Allow get requests originated from www.example.com and example.com",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::examplebucket/*",
            "Condition": {
                "StringLike": {
                    "aws:Referer": [
                        "http://www.example.com/*",
                        "http://example.com/*"
                    ]
                }
            }
        }
    ]
}
When you talk about security and protecting data from unauthorized users, one thing is clear: every time the resource is accessed, you have to check that the requester is entitled to it.
That rules out generating a URL that anyone who obtains it can use (it might be difficult to obtain, but still...). The only solution is an image proxy, and you can do that with a PHP script.
There is a fine article on Amazon's blog that suggests using readfile: http://blogs.aws.amazon.com/php/post/Tx2C4WJBMSMW68A/Streaming-Amazon-S3-Objects-From-a-Web-Server
readfile('s3://my-bucket/my-images/php.gif');
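For that to work, the SDK's s3:// stream wrapper has to be registered first. A minimal sketch with a recent 2.x release of the SDK might look like this (bucket and key names are placeholders):

use Aws\S3\S3Client;

$s3 = S3Client::factory(array(
    'key'    => 'your-key',
    'secret' => 'your-secret',
));

// Register the s3:// protocol so plain PHP filesystem calls can read from S3.
$s3->registerStreamWrapper();

header('Content-Type: image/gif');
readfile('s3://my-bucket/my-images/php.gif');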
You can download the contents from S3 (in a PHP script), then serve them using the correct headers.
As a rough example, say you had the following in image.php:
$s3 = new AmazonS3();
$response = $s3->get_object($bucket, $image_name);
if (!$response->isOK()) {
    throw new Exception('Error downloading file from S3');
}
header("Content-Type: image/jpeg");
header("Content-Length: " . strlen($response->body));
die($response->body);
Then in your HTML code, you can do
<img src="image.php">

PHP/Amazon S3: Query string authentication sometimes fails

I created a simple file browser in PHP that links to files through generated expiring query URLs. So for each access to a directory, a link to each file is generated that is valid for, say, 900 seconds.
I now have the problem that the generated signatures seem to fail sometimes. Which is strange, since I intentionally used external S3 libraries for generating the URLs and signatures.
In fact, I tried the following libraries to generate the signatures:
CloudFusion
S3 generator
Amazon S3 PHP class
The libraries internally use hash_hmac('sha256', ... or hash_hmac('sha1', ... - I also don't understand why different hash algorithms are used.
Since the problem is the same with all libraries, it could as well be in my URL generation code, which is straightforward though:
$bucket = "myBucket";
$filename = $object->Key;
$linksValidForSeconds = 900;
$url = $s3->get_object_url($bucket, $filename, $linksValidForSeconds);
So $bucket and $linksValidForSeconds are constant, and $filename is e.g. "Media/Pictures/My Picture.png". But even for the same variables, it sometimes works and sometimes doesn't.
Any ideas?
Edit: Typo/Wrong constant variable name fixed (thanks)
I found the problem, and it had nothing to do with the code I mentioned. The generated URL is urlencode()'d and sent to another PHP script, where I use the URL to display an image from S3. I used urldecode() there to undo the changes, but apparently that is not necessary.
So each time the signature contained certain characters, the urldecode() would change them and corrupt the signature.
Sorry for omitting the actual problem code.
The code the asker is using above is from the CloudFusion AWS PHP SDK. Here's the documentation for get_object_url(): get_object_url ( $bucket, $filename, [ $preauth = 0 ], [ $opt = null ] )
The problem in your code above is your $linksValidForSeconds variable.
Where: $preauth - integer | string (Optional) Specifies that a presigned URL for this request should be returned. May be passed as a number of seconds since UNIX Epoch, or any string compatible with strtotime().
In other words, you are setting an expiry time of 900 seconds after the UNIX epoch. I am honestly not sure how any links have worked using that library with your client code. If you are using the CloudFusion SDK, what you want to do is take the current UNIX time and add 900 seconds to it when passing in that parameter.
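A quick sketch against the CloudFusion get_object_url() signature quoted above, using the same variables as your client code:

// Pass an absolute expiry: the current time plus 900 seconds, not just 900.
$url = $s3->get_object_url($bucket, $filename, time() + 900);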
You seem to be confusing this with the Amazon S3 PHP class's getAuthenticatedURL method, which takes an integer $lifetime parameter in seconds, as you've used in your client code.
Be careful when using multiple libraries and swapping between them freely. Things tend to break that way.
The current version of CloudFusion is the AWS SDK for PHP, plus some other stuff. Amazon forked CloudFusion as the basis for their PHP SDK, then when the official SDK went live, CloudFusion backported the changes.
It's kind of a KHTML/WebKit thing. http://en.wikipedia.org/wiki/WebKit#History