I have a script that does a backup of my system every day. At the end I get 3 zip files each day, and they get stored to S3. What I'd like to do is always keep a week or 10 days' worth of these backups. After the 10th day, delete the oldest 3 zip files from S3. Any ideas on how I could tackle this? I don't see a way I could query the date modified to find the oldest.
I'm using the S3 PHP SDK. For reference, here is what I do to create the objects.
<?php
require_once 'AWSSDKforPHP/sdk.class.php';

define('BACKUPDIR', '/backups/');

$backup1 = "backup1_" . time() . ".zip";
$backup2 = "backup2_" . time() . ".zip";
$backup3 = "backup3_" . time() . ".zip";

$s3 = new AmazonS3();

$s3->create_object('mybucket', 'backups/' . $backup1, array(
    'fileUpload' => BACKUPDIR . $backup1,
    'acl' => $s3::ACL_PRIVATE
));
$s3->create_object('mybucket', 'backups/' . $backup2, array(
    'fileUpload' => BACKUPDIR . $backup2,
    'acl' => $s3::ACL_PRIVATE
));
$s3->create_object('mybucket', 'backups/' . $backup3, array(
    'fileUpload' => BACKUPDIR . $backup3,
    'acl' => $s3::ACL_PRIVATE
));
?>
Using list_objects requires a lot more work to parse, so instead I used object expiration. Nothing to do here other than let S3 handle it; all I have to do is add the expiry:
$s3->create_object('mybucket', $backup1, array(
    'fileUpload' => BACKUPDIR . $backup1,
    'acl' => $s3::ACL_PRIVATE,
    'headers' => array(
        'Expires' => gmdate(CFUtilities::DATE_FORMAT_RFC2616, strtotime('+10 days'))
    )
));
Now S3 will clean this up automatically after 10 days. Perfect for me when handling backup files.
Use a GET Bucket (list objects) API call, as documented here:
http://docs.amazonwebservices.com/AmazonS3/latest/API/RESTBucketGET.html
This returns a list of all items in the bucket along with some metadata for each item, including the date on which the item was last modified. You can then use PHP to figure out which of these files you want to delete and use a DELETE Object API call to delete them :)
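As a starting point, here is a minimal sketch using the same SDK 1.x client as in the question. It assumes the backups live under the backups/ prefix and that the bucket holds fewer than 1000 keys (a single list_objects call); treat the exact retention math as an example rather than tested code.
<?php
require_once 'AWSSDKforPHP/sdk.class.php';

$s3 = new AmazonS3();

// List everything under the backups/ prefix.
$response = $s3->list_objects('mybucket', array('prefix' => 'backups/'));

// Collect key => last-modified timestamp pairs from the XML body.
$objects = array();
foreach ($response->body->Contents as $item) {
    $objects[(string) $item->Key] = strtotime((string) $item->LastModified);
}

// Sort oldest first.
asort($objects);

// Keep the newest 30 objects (10 days x 3 zips per day), delete the rest.
$keys = array_keys($objects);
$excess = count($keys) - 30;
if ($excess > 0) {
    foreach (array_slice($keys, 0, $excess) as $key) {
        $s3->delete_object('mybucket', $key);
    }
}
?>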
I am going to answer my own question here. It took me hours to figure this out, as there is no info on this anywhere, so I thought I should post it somewhere I would look first.
I was using the AWS PHP SDK to send a PUT request to add a lifecycle policy to my DigitalOcean Space, and it would not take, since the request requires a Content-MD5 header. There are two problems here. The first is that the SDK URL-encodes the path/key, which is a problem with /?lifecycle, /?location, and /?acl since they become "/%3Flifecycle" -- skip this paragraph if that isn't part of your request path. To temporarily stop this so you can add or update a bucket policy, you have to find the file RestSerializer.php in the SDK files; if you added the SDK with Composer it will be in a path like /vendor/aws/aws-sdk-php/src/Api/Serializer/RestSerializer.php under your Composer/website root, which will likely be in /var/www. In RestSerializer.php, find the two rawurlencode function calls and remove them but leave the value/argument, so "rawurlencode($varspecs[$k])" becomes "$varspecs[$k]".
The second problem is generating the ContentMD5 value. Now that the request is going to the correct URL, you need a little PHP depending on what you're doing: if you have put the XML text for your policy in a file, use md5_file(PATH_TO_FILE_HERE, true); if you are using a string, use md5(STRING_HERE, true). Then wrap that in base64_encode() so it looks something like base64_encode(md5_file('/path/file.xml', true)). Finally, add that to your putObject array with 'ContentMD5' => base64_encode(md5_file('/path/file.xml', true)).
PHP Example with File:
// $spaceS3Client is a new S3Client object.
// Since it's a file, I need to open the file first.
$xmlfile = fopen('/spaces.xml', 'r+');

$request = $spaceS3Client->putObject([
    'Bucket'      => 'myspacename',
    'Key'         => '?lifecycle',
    'Body'        => $xmlfile,
    'ContentType' => 'application/xml',
    'ContentMD5'  => base64_encode(md5_file('/spaces.xml', true))
]);

// Close the file.
fclose($xmlfile);

// If you are having trouble connecting to your Space in the first place with an S3Client object
// (it is set up for AWS, not DO), you need to add an 'endpoint' to the array passed to new S3Client,
// like 'endpoint' => 'https://'.$myspace.'.'.$myspaceregion.'.digitaloceanspaces.com'.
// You also need to add 'bucket_endpoint' => true.
// To check that the rules have been set, make a getObject request for the same
// '?lifecycle' key and then use the code below to parse the response.
header('Content-type: text/xml');
$request = $request->toArray()["Body"];
echo $request;
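For reference, here is a minimal sketch of the client setup described in the endpoint comment above. The key, secret, Space name, and region are placeholders, and the region value is largely ignored by DO but still required by the SDK, so adjust as needed:
// Hypothetical values; substitute your own key, secret, Space name, and region.
$myspace       = 'myspacename';
$myspaceregion = 'nyc3';

$spaceS3Client = new Aws\S3\S3Client([
    'version'         => 'latest',
    'region'          => 'us-east-1', // placeholder; DO Spaces ignores it, but the SDK requires one
    'endpoint'        => 'https://' . $myspace . '.' . $myspaceregion . '.digitaloceanspaces.com',
    'bucket_endpoint' => true,
    'credentials'     => [
        'key'    => 'YOUR_SPACES_KEY',
        'secret' => 'YOUR_SPACES_SECRET',
    ],
]);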
I am storing files in an AWS S3 bucket, and we have set an option to move files from the S3 bucket to Glacier after a specific time period (e.g. 6 months) via the AWS console.
Files are successfully moving from S3 to Glacier.
Now I want to retrieve the files moved to Glacier, but I couldn't find any working method to do so.
I have already tried referring to the AWS Glacier documentation, but no luck.
Note: We are trying to do this via the PHP SDK or any other way using PHP.
As the documentation says:
Objects in the Amazon Glacier storage class are not immediately accessible: you must first restore a temporary copy of the object to its bucket before it is available.
You need to initiate a restore operation on your archived (S3 Glacier) object, which may take a few hours (typically three to five) to produce a temporary copy of the object. If you want the objects permanently back in the S3 bucket, you can create a copy within your S3 bucket after the restore is done.
To initiate a restore job, you can use:
the S3 Management Console, see here.
the AWS CLI, see here.
the S3 REST API - POST Restore Object, see here.
the AWS SDK - for PHP, see here.
To determine programmatically when a restore job is complete, you can:
call the S3 REST API - HEAD Object, see here.
use the AWS SDK - for PHP, see here.
After the restore job is done, you can retrieve the object from the S3 bucket for the period you set in the job.
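To do that check from PHP, here is a hedged sketch using the SDK v3 S3Client (bucket and key are placeholders); the x-amz-restore header comes back in the Restore field of the headObject result:
// Assumes $s3Client is an Aws\S3\S3Client (SDK v3).
$result = $s3Client->headObject([
    'Bucket' => 'my-bucket',
    'Key'    => 'path/to/archived-file.zip',
]);

// For a Glacier-class object, Restore looks like:
//   ongoing-request="true"                       -> restore still in progress
//   ongoing-request="false", expiry-date="..."   -> temporary copy is ready
if (isset($result['Restore']) && strpos($result['Restore'], 'ongoing-request="false"') !== false) {
    echo "Restore finished; the object is readable until the expiry date.\n";
} else {
    echo "Restore not finished yet (or never requested).\n";
}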
If you are using the PHP SDK, you can use this:
$objects = $s3Client->restoreObject(array(
    'Bucket' => 'Bucket name',
    'Key' => 'File key, i.e. the file path in the S3 bucket',
    //'RequestPayer' => 'requester',
    'RestoreRequest' => array(
        'Days' => 10,
        'GlacierJobParameters' => array(
            //'Tier' => 'Standard|Bulk|Expedited', // REQUIRED
            'Tier' => 'Expedited', // REQUIRED
        ),
    ),
));
Here is the general template for this
$result = $client->restoreObject([
    'Bucket' => '<string>', // REQUIRED
    'Key' => '<string>', // REQUIRED
    'RequestPayer' => 'requester',
    'RestoreRequest' => [
        'Days' => <integer>, // REQUIRED
        'GlacierJobParameters' => [
            'Tier' => 'Standard|Bulk|Expedited', // REQUIRED
        ],
    ],
    'VersionId' => '<string>',
]);
After playing a bit and uploading some small test files, I wanted to upload a bigger file, around 200 MB, but I always get a timeout exception. Then I tried to upload a 30 MB file and the same happens.
I think the timeout is 30 seconds. Is it possible to tell the Glacier client to wait until the upload is done?
This is the code I use:
$glacier->uploadArchive(array(
    'vaultName' => $vaultName,
    'archiveDescription' => $desc,
    'body' => $body
));
I have tested with other files and the same happens. Then I tried with a small file of 4 MB and the operation was successful, so I thought of splitting the file and uploading the pieces one by one, but then again around the third one a timeout exception comes out.
I also tried the multipart upload with the following code:
$glacier = GlacierClient::factory(array(
    'key' => 'key',
    'secret' => 'secret',
    'region' => Region::US_WEST_2
));

$multiupload = $glacier->initiateMultipartUpload(array(
    'vaultName' => 'vaultName',
    'partSize' => '4194304'
));

// An array for the suffixes of the tar file
foreach ($suffixes as $suffix) {
    $contents = file_get_contents('file.tar.gz' . $suffix);
    $glacier->uploadMultipartPart(array(
        'vaultName' => 'vaultName',
        'uploadId' => $multiupload->get('uploadId'),
        'body' => $contents
    ));
}

$result = $glacier->completeMultipartUpload(array(
    'vaultName' => 'vaultName',
    'uploadId' => $multiupload->get('uploadId'),
));

echo $result->get('archiveId');
It is missing the range parameter. I don't think I fully understand how this multipart upload works, but I think I will get the same timeout exception. So my question is, as I said before:
Is it possible to tell the Glacier client to wait until the upload is done?
The timeout sounds like a script timeout, like Jimzie said.
As for using the Glacier client, you should check out this blog post from the official AWS PHP Developer Blog, which shows how to do multipart uploads to Glacier using the UploadPartGenerator object. If you are doing the part uploads in different requests/processes, you should also keep in mind that the UploadPartGenerator class can be serialized.
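To give a rough idea of what that looks like (adapted from that post, so treat the exact method names as assumptions to verify against your SDK version), the flow is roughly:
use Aws\Glacier\Model\MultipartUpload\UploadPartGenerator;

$partSize = 4 * 1024 * 1024; // 4 MB parts

// Pre-compute ranges and tree-hash checksums for every part.
$archive = fopen('file.tar.gz', 'r');
$parts   = UploadPartGenerator::factory($archive, $partSize);

$upload = $glacier->initiateMultipartUpload(array(
    'vaultName' => 'vaultName',
    'partSize'  => $partSize,
));
$uploadId = $upload->get('uploadId');

// Upload each part with its range and checksum taken from the generator.
foreach ($parts as $part) {
    fseek($archive, $part->getOffset());
    $glacier->uploadMultipartPart(array(
        'vaultName'     => 'vaultName',
        'uploadId'      => $uploadId,
        'body'          => fread($archive, $part->getSize()),
        'range'         => $part->getFormattedRange(),
        'checksum'      => $part->getChecksum(),
        'contentSHA256' => $part->getContentHash(),
    ));
}

// Complete the upload with the aggregate size and root checksum.
$result = $glacier->completeMultipartUpload(array(
    'vaultName'   => 'vaultName',
    'uploadId'    => $uploadId,
    'archiveSize' => $parts->getArchiveSize(),
    'checksum'    => $parts->getRootChecksum(),
));
echo $result->get('archiveId');

fclose($archive);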
This sounds suspiciously like a script timeout. Try
set_time_limit(120);
just inside the foreach loop. This will give you a two-minute PHP sanity timer for each of your multipart files.
$facebook = new Facebook(array(
    'appId' => '<key>',
    'secret' => '<secret code>',
    'cookie' => true
));
print_r($facebook);die;
Output of this is
Facebook Object
(
    [appId:protected] => <key>
    [apiSecret:protected] => <secret code>
    [session:protected] =>
    [signedRequest:protected] =>
    [sessionLoaded:protected] =>
    [cookieSupport:protected] => 1
    [baseDomain:protected] =>
    [fileUploadSupport:protected] =>
)
This problem only started occurring at the end of October; before that, it always printed the session information. Then I call the link https://api.facebook.com/method/photos.getAlbums?uid='.$session['uid'].'&access_token='.$session['access_token'] and used to get the list of albums.
This worked fine for more than 8 months and suddenly last month it stopped working.
We had some trouble with our Facebook API integration a couple of months ago too.
Facebook has deprecated the REST API, which you are using. It is very possible the feature you're trying to access has changed and is no longer supported.
Source: http://developers.facebook.com/blog/post/616/
Switch to the new OAuth 2.0 API to restore your features and future-proof your application for a while.
Here's the link to the new API documentation:
http://developers.facebook.com/docs/reference/api
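For example, the Graph API equivalent of photos.getAlbums is the /{user-id}/albums edge. A hedged sketch (assuming you already have a valid OAuth 2.0 access token from the new login flow) looks something like this:
// $uid and $accessToken come from your OAuth 2.0 login flow.
$url = 'https://graph.facebook.com/' . $uid . '/albums?access_token=' . urlencode($accessToken);

$response = file_get_contents($url);
$albums = json_decode($response, true);

foreach ($albums['data'] as $album) {
    echo $album['name'] . "\n";
}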
Oh, and in the future... Be sure to remove any API keys and secret codes from your posts. These would potentially allow someone else to use your credentials mischievously.
I have an AWS S3 account which contains 3 buckets. I need to be able to generate access codes for a new user so that they can access the buckets and add/delete files (preferably only their own, but not a deal breaker).
I have managed to get as far as granting access to new users using IAM. However, when I read the metadata of uploaded objects (in PHP using the AWS SDK) the owner comes back as the main AWS account.
I've read pages of documentation but can't seem to find anything relating to determining who the owner (or uploader) of the file was.
Any advice or direction massively appreciated!
Thanks.
If your only problem is finding out the owner of an uploaded file, you can pass the owner info as metadata on the uploaded file.
Check http://docs.amazonwebservices.com/AmazonS3/latest/dev/UsingMetadata.html
In PHP code, while uploading:
// Instantiate the class.
$s3 = new AmazonS3();

$response = $s3->create_object(
    $bucket,
    $keyname2,
    array(
        'fileUpload' => $filePath,
        'acl' => AmazonS3::ACL_PUBLIC,
        'contentType' => 'text/plain',
        'storage' => AmazonS3::STORAGE_REDUCED,
        'headers' => array( // raw headers
            'Cache-Control' => 'max-age',
            'Content-Encoding' => 'gzip',
            'Content-Language' => 'en-US',
            'Expires' => 'Thu, 01 Dec 1994 16:00:00 GMT',
        ),
        'meta' => array(
            'uploadedBy' => 'user1',
        )
    )
);

print_r($response);
Check the PHP API docs for more info.
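To read that value back later, here is a hedged sketch using the same SDK 1.x client (the bucket and key are the same as above); custom metadata comes back as x-amz-meta-* response headers on a HEAD request:
// Assumes $s3 is the same AmazonS3 client and the object was uploaded as above.
$response = $s3->get_object_headers($bucket, $keyname2);

// Custom metadata is returned as lower-cased x-amz-meta-* headers.
$uploadedBy = $response->header['x-amz-meta-uploadedby'];
echo 'Uploaded by: ' . $uploadedBy;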