After uploading some small test files, I wanted to upload a bigger file of around 200 MB, but I always get a timeout exception. I then tried a 30 MB file and the same thing happens.
I think the timeout is 30 seconds. Is it possible to tell the Glacier client to wait until the upload is done?
This is the code I use:
$glacier->uploadArchive(array(
'vaultName' => $vaultName,
'archiveDescription' => $desc,
'body' => $body
));
I tested other files and the same thing happens. A small 4 MB file uploaded successfully, so I thought of splitting the files and uploading the pieces one by one, but then a timeout exception is thrown again around the third piece.
I also tried a multipart upload with the following code:
$glacier = GlacierClient::factory(array(
'key' => 'key',
'secret' => 'secret',
'region' => Region::US_WEST_2
));
$multiupload = $glacier->initiateMultipartUpload(array(
'vaultName' => 'vaultName',
'partSize' => '4194304'
));
// An array for the suffixes of the tar file
foreach($suffixes as $suffix){
$contents = file_get_contents('file.tar.gz'.$suffix);
$glacier->uploadMultipartPart(array(
'vaultName' => 'vaultName',
'uploadId' => $multiupload->get('uploadId'),
'body' => $contents
));
}
$result=$glacier->completeMultipartUpload(array(
'vaultName' => 'vaultName',
'uploadId' => $multiupload->get('uploadId'),
));
echo $result->get('archiveId');
The code is missing the range parameter. I don't think I fully understand how multipart upload works, but I suspect I would get the same timeout exception anyway. So my question is the same as before:
Is it possible to tell the Glacier client to wait until the upload is done?
The timeout sounds like a script timeout like Jimzie said.
As for using the Glacier client, you should check out this blog post from the official AWS PHP Developer Blog, which shows how to do multipart uploads to Glacier using the UploadPartGenerator object. If you are doing the part uploads in different requests or processes, keep in mind that the UploadPartGenerator class can be serialized.
This sounds suspiciously like a script timeout. Try
set_time_limit (120);
just inside of the foreach loop. This will give you a two minute PHP sanity timer for each of your multi-part files.
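On the missing range parameter: each part of a Glacier multipart upload has to say which bytes of the final archive it covers, in the form "bytes start-end/*". Below is a minimal sketch of computing those strings, assuming every part except the last is exactly the part size given to initiateMultipartUpload. glacierPartRanges is a hypothetical helper name, not part of the SDK.

```php
<?php
/**
 * Hypothetical helper (not part of the AWS SDK): returns the range string
 * for each part of an archive of $totalSize bytes that is split into
 * $partSize-byte parts. Only the last part may be shorter.
 */
function glacierPartRanges(int $totalSize, int $partSize): array
{
    if ($totalSize <= 0 || $partSize <= 0) {
        throw new InvalidArgumentException('Sizes must be positive.');
    }

    $ranges = [];
    for ($start = 0; $start < $totalSize; $start += $partSize) {
        // The end offset is inclusive, hence the -1.
        $end = min($start + $partSize, $totalSize) - 1;
        $ranges[] = sprintf('bytes %d-%d/*', $start, $end);
    }

    return $ranges;
}

// Example: a 10 MiB archive uploaded in 4 MiB parts.
foreach (glacierPartRanges(10 * 1024 * 1024, 4 * 1024 * 1024) as $range) {
    echo $range, PHP_EOL;
}
```

As far as I understand the client, each string would go into the 'range' value of the matching uploadMultipartPart call. The UploadPartGenerator mentioned above is meant to take care of this bookkeeping for you.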
I am storing some customer PDFs in S3 for multiple parties to either view in the browser or download. The trouble is I can only get a single file in S3 to either always download or always view in the browser.
I could just upload the same file twice with each having its own ContentDisposition, but that seems wasteful when ideally it could be as simple as adding something like ?ContentDisposition=inline to the public bucket URL.
My Question: How can I dynamically set a ContentDisposition for a single S3 file?
For context, my current code looks something like this:
$s3_object = array(
'ContentDisposition' => sprintf('attachment; filename="%s"', addslashes($basename)),
'ACL' => 'public-read',
'ContentType' => 'application/pdf',
'StorageClass' => 'REDUCED_REDUNDANCY',
'Bucket' => 'sample',
'Key' => static::build_file_path($path, $filename, $extension),
'Body' => $binary_content,
);
$result = $s3_client->putObject($s3_object);
Also, I did try to search for this elsewhere in SO, but most people seem to just be looking for one or the other, so I didn't find any SO answers that showed how to do this.
I ended up stumbling across the definitive answer for this today (over a month later) while looking at other S3 documentation. Going to the GetObject docs for the S3 API and under the section labeled "Overriding Response Header Values" we find the following:
Note: You must sign the request, either using an Authorization header or a presigned URL, when using these parameters. They cannot be used with an unsigned (anonymous) request.
response-content-language
response-expires
response-cache-control
response-content-disposition
response-content-encoding
This answers how to dynamically change any S3 object's content-disposition in the URL. However, at least for me, it is an imperfect solution: my intended use case was to store the URL for years as part of an invoicing archive, but signed URLs are only valid for a maximum of 1 week.
I could technically also try to find a way to make the Authorization header work for me or just query the S3 API to get a new signed URL every time I want to link to it, but that has other security, performance, and ROI implications for me that make it not worth it.
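For illustration, the override is an ordinary query parameter whose value is a complete Content-Disposition header. Below is a minimal sketch of building that parameter. dispositionQuery is a hypothetical helper name, and its output is only useful as part of a signed request, since these parameters cannot be used anonymously, as quoted above.

```php
<?php
/**
 * Hypothetical helper: builds the response-content-disposition query
 * parameter for a GetObject URL. The result still has to be included
 * in a signed request before S3 will honor it.
 */
function dispositionQuery(string $type, string $filename): string
{
    if (!in_array($type, ['inline', 'attachment'], true)) {
        throw new InvalidArgumentException('Type must be inline or attachment.');
    }

    // Escape backslashes first, then double quotes, inside the quoted filename.
    $safeName = str_replace(['\\', '"'], ['\\\\', '\\"'], $filename);
    $header = sprintf('%s; filename="%s"', $type, $safeName);

    // RFC 3986 encoding turns spaces into %20 rather than "+".
    return http_build_query(
        ['response-content-disposition' => $header],
        '',
        '&',
        PHP_QUERY_RFC3986
    );
}

echo dispositionQuery('inline', 'invoice 42.pdf'), PHP_EOL;
// response-content-disposition=inline%3B%20filename%3D%22invoice%2042.pdf%22
```

With the SDK, my understanding is that the same override can be passed as the ResponseContentDisposition parameter of GetObject when creating a presigned request, so that the signature covers it.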
I am storing files in an AWS S3 bucket, and via the AWS console we have set an option to move files from the bucket to Glacier after a specific time period (e.g. 6 months).
Files are successfully moving from S3 to Glacier.
Now I want to retrieve the files that were moved to Glacier, but I couldn't find any working method to do so.
I have already tried following the AWS Glacier documentation, but with no luck.
Note: we are trying to do this via the PHP SDK, or any other way using PHP.
As the documentation says:
Objects in the Amazon Glacier storage class are not immediately accessible: you must first restore a temporary copy of the object to its bucket before it is available.
You need to initiate a restore operation on your archived (S3-Glacier) object, which may take a few hours (typically three to five hours) to be restored as a temporary object. If you want them permanently in S3 bucket, you can create a copy within your S3 bucket after the restore is done.
To initiate a restore job, you can use:
The S3 Management Console, see here.
The AWS CLI, see here.
The S3 REST API (POST Restore Object), see here.
The AWS SDK; for PHP, see here.
To determine programmatically when a restore job is complete, you can use:
The S3 REST API (HEAD Object), see here.
The AWS SDK; for PHP, see here.
After the restore job is done, you can retrieve the object in S3 bucket for a certain period that you set in the job.
If you are using the PHP SDK, you can use this:
$objects = $s3Client->restoreObject(array(
'Bucket' => 'Bucket name',
'Key' => 'File key, i.e. the file path in the S3 bucket',
//'RequestPayer' => 'requester',
'RestoreRequest' => [
'Days' => 10,
'GlacierJobParameters' => [
// One of 'Standard', 'Bulk' or 'Expedited' (required)
'Tier' => 'Expedited',
]
],
));
Here is the general template for this call:
$result = $client->restoreObject([
'Bucket' => '<string>', // REQUIRED
'Key' => '<string>', // REQUIRED
'RequestPayer' => 'requester',
'RestoreRequest' => [
'Days' => <integer>, // REQUIRED
'GlacierJobParameters' => [
'Tier' => 'Standard|Bulk|Expedited', // REQUIRED
],
],
'VersionId' => '<string>',
]);
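Once a restore has been requested, HEAD Object reports its state in a restore field whose value looks like ongoing-request="true" while the job is running, and ongoing-request="false", expiry-date="..." once the temporary copy is available. Below is a minimal sketch of interpreting that value. restoreState is a hypothetical helper name, and the three state labels are my own.

```php
<?php
/**
 * Hypothetical helper: interprets the restore value returned by HEAD Object.
 * Returns 'not-requested', 'in-progress' or 'restored' (labels chosen here
 * for illustration only).
 */
function restoreState(?string $restoreHeader): string
{
    // The field is absent when no restore has been requested for the object.
    if ($restoreHeader === null || $restoreHeader === '') {
        return 'not-requested';
    }

    if (preg_match('/ongoing-request="(true|false)"/', $restoreHeader, $m) !== 1) {
        throw new UnexpectedValueException('Unrecognized restore value.');
    }

    return $m[1] === 'true' ? 'in-progress' : 'restored';
}

echo restoreState('ongoing-request="true"'), PHP_EOL; // in-progress
```

With the PHP SDK, I would expect the value to be available as the 'Restore' entry of the headObject result, so polling headObject and passing that entry (or null when it is missing) to the helper should tell you when the object can be downloaded.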
My EC2 servers are currently hosting a website that logs each registered user's activity under their own separate log file on the local EC2 instance, say username.log. I'm trying to figure out a way to push log events for these to CloudWatch using the PHP SDK without slowing the application down, AND while still being able to maintain a separate log file for each registered member of my website.
I can't for the life of me figure this out:
OPTION 1: How can I log to CloudWatch asynchronously using the CloudWatch SDK? My PHP application is behaving VERY sluggishly, since each log line takes roughly 100ms to push directly to CloudWatch. Code sample is below.
OPTION 2: Alternatively, how could I configure an installed CloudWatch Agent on EC2 to simply OBSERVE all of my log files and upload them asynchronously to CloudWatch for me in a separate process? The CloudWatch EC2 Logging Agent requires a static "configuration file" (AWS documentation) on your server which, to my knowledge, needs to list all of your log files ("log streams") in advance, and I can't predict those at server startup. Is there any way around this (i.e., simply observe ALL log files in a directory)? Config file sample is below.
All ideas are welcome here, but I don't want my solution to simply be "throw all your logs into a single file, so that your log names are always predictable".
Thanks in advance!!!
OPTION 1: Logging via SDK (takes ~100ms / logEvent):
// Configuration to use for the CloudWatch client
$sharedConfig = [
'region' => 'us-east-1',
'version' => 'latest',
'http' => [
'verify' => false
]
];
// Create a CloudWatch client
$cwClient = new Aws\CloudWatchLogs\CloudWatchLogsClient($sharedConfig);
// DESCRIBE ANY EXISTING LOG STREAMS / FILES
$create_new_stream = true;
$next_sequence_id = "0";
$result = $cwClient->describeLogStreams([
'descending' => true,
'logGroupName' => 'user_logs',
'logStreamNamePrefix' => $stream,
]);
// Iterate through the results, looking for a stream that already exists with the intended name
// This is so that we can get the next sequence id ('uploadSequenceToken'), so we can add a line to an existing log file
foreach ($result->get("logStreams") as $stream_temp) {
if ($stream_temp['logStreamName'] == $stream) {
$create_new_stream = false;
if (array_key_exists('uploadSequenceToken', $stream_temp)) {
$next_sequence_id = $stream_temp['uploadSequenceToken'];
}
break;
}
}
// CREATE A NEW LOG STREAM / FILE IF NECESSARY
if ($create_new_stream) {
$result = $cwClient->createLogStream([
'logGroupName' => 'user_logs',
'logStreamName' => $stream,
]);
}
// PUSH A LINE TO THE LOG *** This step ALONE takes 70-100ms!!! ***
$result = $cwClient->putLogEvents([
'logGroupName' => 'user_logs',
'logStreamName' => $stream,
'logEvents' => [
[
'timestamp' => round(microtime(true) * 1000),
'message' => $msg,
],
],
'sequenceToken' => $next_sequence_id
]);
OPTION 2: Logging via CloudWatch Installed Agent (note that the config file below only allows hardcoded, predetermined log names as far as I know):
[general]
state_file = /var/awslogs/state/agent-state
[applog]
file = /var/www/html/logs/applog.log
log_group_name = PP
log_stream_name = applog.log
datetime_format = %Y-%m-%d %H:%M:%S
Looks like we have some good news now... not sure if it's too late!
CloudWatch Log Configuration
So, to answer the question:
Is there any way around this (ie, simply observe ALL log files in a directory)?
Yes. Log files and file paths can be specified using wildcards, which gives you some flexibility in configuring where the logs are fetched from before they are pushed to the log streams.
I have an image upload system in my application written in PHP. The file browser opens, user picks an image, I upload it to my server, I crop, I resize, I apply a watermark to it. Bottom line is the images are in my server. At some point, the user clicks a button and then I move those files to my S3 bucket. Naturally, I need a progress bar because, ze client wants a progress bar.
Now uploading the files is quite easy:
$result = $this->awsS3Client->putObject(array(
'Bucket' => 'bad-dum-tss-bucket',
'Key' => $destinationFilePath,
'SourceFile' => $sourceFilePath,
'ContentType' => $mimeType,
'ACL' => 'public-read',
));
I can even go multi-part
$uploader = UploadBuilder::newInstance()
->setClient($this->awsS3Client)
->setSource($sourceFilePath)
->setBucket( 'bad-dum-tss-bucket')
->setKey($destinationFilePath)
->build();
try {
$uploader->upload();
} catch (MultipartUploadException $e) {
$uploader->abort();
}
No problem there until I realize my client needs a freaking progress bar. Now I've searched a lot and all I can see are links to uploaders such as http://fineuploader.com/ that assumes that the upload will happen directly from the browser (i.e. not from my server). So PHP-progress bar-S3, anybody?
If you're still interested, I found a way to track progress in PHP with AWS SDK v3.
$client = new S3Client(/* config */);
$result = $client->putObject([
'Bucket' => 'bucket-name',
'Key' => 'bucket-name/file.ext',
'SourceFile' => 'local-file.ext',
'ContentType' => 'application/pdf',
'@http' => [
'progress' => function ($downloadTotalSize, $downloadSizeSoFar, $uploadTotalSize, $uploadSizeSoFar) {
printf(
"%s of %s downloaded, %s of %s uploaded.\n",
$downloadSizeSoFar,
$downloadTotalSize,
$uploadSizeSoFar,
$uploadTotalSize
);
}
]
]);
This is explained in the AWS docs - S3 Config section. It works by exposing GuzzleHttp's progress property-callable, as explained in this SO answer.
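To turn those raw byte counts into something a progress bar can use, the callback can compute a percentage. Below is a minimal sketch. uploadPercent is a hypothetical helper name, and the guard for a zero total is there on the assumption that the callback may fire before the upload size is known.

```php
<?php
/**
 * Hypothetical helper: converts the upload byte counts passed to the
 * progress callback into a whole-number percentage between 0 and 100.
 */
function uploadPercent(float $uploadTotalSize, float $uploadSizeSoFar): int
{
    if ($uploadTotalSize <= 0) {
        return 0; // total not known yet, avoid dividing by zero
    }

    return (int) min(100, floor($uploadSizeSoFar * 100 / $uploadTotalSize));
}

// A callback with the same four-argument signature as the one shown above.
$progress = function ($downloadTotalSize, $downloadSizeSoFar, $uploadTotalSize, $uploadSizeSoFar) {
    printf("%d%%\n", uploadPercent($uploadTotalSize, $uploadSizeSoFar));
};

// Simulate a single invocation: 50 of 200 bytes uploaded.
$progress(0, 0, 200, 50);
```

In a real request the closure would be passed as the 'progress' option. To show the value in a browser it would also have to be stored somewhere the polling request can read it.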
I got this to work by firing concurrent XHRs to the server to poll the upload progress and saving it to a session variable. See: Why are my XHR calls waiting for each other to return a response where I asked another question related to XHR polling and session blocking in order to accomplish this.
In the end though, I decided to drop all of this altogether. My production server was an EC2 instance, so any upload to S3 involved very little network overhead (I should have realized this sooner). I could transfer a couple of MBs of images (all that I will ever need) in less than 3 seconds, so I decided not to display a progress bar at all, as it doesn't justify the cost of adding nasty session calls in various parts of my code.
I'm having an issue while trying to support cancelling file uploads. I would like to know the best practice for determining whether an upload is still cancellable. How can you tell when the file has finished uploading, as opposed to the server generating and returning a response? I understand this is possible by tracking the file progress in HTML5, but since I have to support IE9, I am running out of ideas.
The end result is that if you attempt to cancel a file upload that is nearly complete and issue the abort request, you end up aborting the response while the file is happily sitting on the server.
I am using jquery to submit the request, and am cancelling via the abort() method. I see in the browser console that the request was successfully aborted.
Am I missing something trivial?
PHP now has the ability to track the upload progress. See "Session Upload Progress".
Use this feature to write a short script: checkUpload.php, and use AJAX to return the status back to your IE9 page.
<?php
$_SESSION["upload_progress_123"] = array(
"start_time" => 1234567890, // The request time
"content_length" => 57343257, // POST content length
"bytes_processed" => 453489, // Amount of bytes received and processed
"done" => false, // true when the POST handler has finished, successfully or not
"files" => array(
0 => array(
"field_name" => "file1", // Name of the <input/> field
// The following 3 elements equals those in $_FILES
"name" => "foo.avi",
"tmp_name" => "/tmp/phpxxxxxx",
"error" => 0,
"done" => true, // True when the POST handler has finished handling this file
"start_time" => 1234567890, // When this file has started to be processed
"bytes_processed" => 57343250, // Number of bytes received and processed for this file
),
// An other file, not finished uploading, in the same request
1 => array(
"field_name" => "file2",
"name" => "bar.avi",
"tmp_name" => NULL,
"error" => 0,
"done" => false,
"start_time" => 1234567899,
"bytes_processed" => 54554,
),
)
);
Using PHP, it is also now possible to cancel the upload process. From that same manual page referenced above, comes the following text:
It is also possible to cancel the currently in-progress file upload, by setting the
$_SESSION[$key]["cancel_upload"] key to TRUE. When uploading multiple files in the same
request, this will only cancel the currently in-progress file upload, and pending file
uploads, but will not remove successfully completed uploads. When an upload is cancelled like
this, the error key in $_FILES array will be set to UPLOAD_ERR_EXTENSION.
The only thing not covered in the PHP documentation is how to get the total file size. There is a very good review of this process here: "PHP Master | Tracking Upload Progress"
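As a rough idea of what checkUpload.php could return, here is a minimal sketch that reduces one such session entry to the few values the page needs. uploadStatus is a hypothetical helper name, and treating "not done" as "cancellable" is my assumption about what the question is after.

```php
<?php
/**
 * Hypothetical helper: summarizes one upload-progress session entry
 * (structure as in the manual excerpt above) for an AJAX status response.
 */
function uploadStatus(?array $progress): array
{
    if ($progress === null) {
        // No entry: the upload has not started yet, or PHP already
        // cleaned the entry up after the request finished.
        return ['percent' => null, 'done' => null, 'cancellable' => false];
    }

    $total = (int) $progress['content_length'];
    $received = (int) $progress['bytes_processed'];
    $percent = $total > 0 ? (int) min(100, floor($received * 100 / $total)) : 0;
    $done = (bool) $progress['done'];

    return ['percent' => $percent, 'done' => $done, 'cancellable' => !$done];
}

// Example with made-up numbers: 250 of 1000 bytes received so far.
echo json_encode(uploadStatus([
    'content_length' => 1000,
    'bytes_processed' => 250,
    'done' => false,
])), PHP_EOL;
// {"percent":25,"done":false,"cancellable":true}
```

In the real script the argument would come from $_SESSION after session_start(), under the key formed from the session.upload_progress.prefix setting plus the value of the hidden form field named by session.upload_progress.name ("upload_progress_123" in the excerpt above).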