Seeing if object exists in S3 using PHP

I am using PHP with the S3 API to upload a file, but I want to make sure that this exact filename doesn't already exist in the bucket before uploading.
I have found a few examples online that use "file_get_contents", but doesn't that mean you would have to download the entire file first? These files are usually about 10 MB, so ideally I'd rather not do that.
Is there perhaps a way to use "file_get_contents" without downloading the file?
Or better yet, perhaps I could use an API request to see if the filename exists?
It's not important to me whether or not the content, or filesize, is the same, just the filename.

The AWS SDK for PHP (v1) provides AmazonS3::doesObjectExist(), which gets whether or not the specified Amazon S3 object exists in the specified bucket:
$s3 = new AmazonS3();
$bucket = 'my-bucket' . strtolower($s3->key);
$response = $s3->doesObjectExist($bucket, 'test1.txt');
// Success? (Boolean, not a CFResponse object)
var_dump($response);

Try the code below:
$s3 = new S3();
$info = $s3->getObjectInfo($bucket, $filename);
if ($info)
{
    echo 'File exists';
}
else
{
    echo 'File does not exist';
}
Download the S3 SDK for PHP from Amazon. It contains a class called S3; create an instance of it, and that object will let you call the getObjectInfo() method. Pass your S3 bucket name and the file name (the file name is often referred to as the key). getObjectInfo() returns information about the file if it exists, otherwise it returns FALSE.

Please note that the other suggestions are based on version 1 of the AWS SDK for PHP. For version 2, you'll want to be familiar with the latest guide found here:
http://docs.aws.amazon.com/aws-sdk-php/guide/latest/index.html
The "Getting Started" section in the link above will help you get the SDK installed and setup, so be sure to take your time reading through those docs if you haven't done so already. When you're done with the setup, you'll want to be familiar with the stream wrapper method found here:
http://docs.aws.amazon.com/aws-sdk-php/guide/latest/feature-s3-stream-wrapper.html
Finally, below is a brief, real-life example of how you could use it in the flow of your code.
require('vendor/autoload.php');
// your filename
$filename = 'my_file_01.jpg';
// this will use AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from env vars
$s3 = Aws\S3\S3Client::factory();
// S3_BUCKET must also be defined in env vars
$bucket = getenv('S3_BUCKET') ?: die('No "S3_BUCKET" config var found in env!');
// register stream wrapper method
$s3->registerStreamWrapper();
// does file exist
$keyExists = file_exists("s3://".$bucket."/".$filename);
if ($keyExists) {
    echo 'File exists!';
}

If you have, or are able to install, the PECL extension HTTP, then you can use http_head to make a HEAD request easily and check whether the response code was 200 or 404.
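If installing the PECL extension isn't an option, the same check can be done with a plain cURL HEAD request. A minimal sketch, assuming the object is publicly readable or you already have a presigned URL (the URL below is hypothetical):
$url = 'https://my-bucket.s3.amazonaws.com/test1.txt'; // hypothetical public or presigned URL

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD request: headers only, no body download
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_exec($ch);
$status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

echo $status === 200 ? 'File exists' : "File does not exist (HTTP $status)";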

Updated version for anyone looking for v3 and up...
$s3Client = new \Aws\S3\S3Client([
    'version' => 'latest',
    'region' => getenv('AWS_REGION'),
    'credentials' => [
        'key' => getenv('AWS_KEY'),
        'secret' => getenv('AWS_SECRET')
    ]
]);
$response = $s3Client->doesObjectExist(getenv('AWS_S3_BUCKET'), 'somefolder/somefile.ext');
if ($response) {
    echo "Yay, it exists :)";
} else {
    echo "Boo, nothing there :(";
}

Google Cloud Storage get temp filename (using fopen('php://temp'))

Similar question asked here a few years ago but with no answer:
Get path of temp file created via fopen('php://temp')
I am using Google Cloud Storage to download a number of large files in parallel and then upload them to another service. Essentially transferring from A to C, via my server B.
Under the hood, Google's StorageObject -> downloadAsStream() uses Guzzle to get the file via fopen('php://temp','r+').
I am running into a disk space issue because Google's Cloud Storage library is not cleaning up the temp files if there is an exception thrown during the transfer. (This is expected behaviour per the docs). Every retry of the script dumps another huge file in my tmp dir which isn't cleaned up.
If Guzzle used tmpfile() I would be able to use stream_get_meta_data()['uri'] to get the file path, but because it uses php://temp, this option seems to be blocked off:
[
"wrapper_type" => "PHP",
"stream_type" => "TEMP",
"mode" => "w+b",
"unread_bytes" => 0,
"seekable" => true,
"uri" => "php://temp", // <<<<<<<< grr.
]
So: does anyone know of a way to get the temporary file name created by fopen('php://temp') such that I can perform a manual clean-up?
UPDATE:
It appears this isn't possible. Hopefully GCS will update their library to change the way the temp file is generated. Until then I am using the following clean-up code:
public function cleanTempDir(int $timeout = 7200) {
    foreach (glob(sys_get_temp_dir() . "/php*") as $f) {
        if (is_writable($f) && filemtime($f) < (time() - $timeout)) {
            unlink($f);
        }
    }
}
UPDATE 2
It is possible, see accepted answer below.
Something like the following should do the trick:
use Google\Cloud\Storage\StorageClient;

$client = new StorageClient;
$tempStream = tmpfile();
$tempFile = stream_get_meta_data($tempStream)['uri'];

try {
    $stream = $client->bucket('my-bucket')
        ->object('my-big-ol-file')
        ->downloadAsStream([
            'restOptions' => [
                'sink' => $tempStream
            ]
        ]);
} catch (\Exception $ex) {
    unlink($tempFile);
}
The restOptions option allows you to proxy through commands to the underlying HTTP 1.1 transport (Guzzle, by default). My apologies this isn't clearly documented, but hope it helps!
Google Cloud Platform Support here!
At the moment, using the PHP Cloud Storage library, it is not possible to get the temporary file name created when using the downloadAsStream() method. Therefore I have created a Feature Request on your behalf; you can follow it here.
As a workaround, you may be able to remove the file by hand. You can get the temp file name using the following command:
$filename = shell_exec("ls -lt | awk 'NR==2' | cut -d: -f2 | cut -d' ' -f2");
After that, $filename will contain the most recently modified file name, which will be the one that failed and you wish to remove. With the filename you can now proceed to remove it.
Note that you will have to be in the temporary directory used by php://temp before executing the command.
It will most likely be the system-configured temporary directory, which you can get with sys_get_temp_dir().
Note that php://temp only saves to a file when needed; smaller streams can reside entirely in memory.
https://www.php.net/manual/en/wrappers.php.php
Edit: OK, so the file does get created. Then you can probably use stream_get_meta_data() on the stream handle to get that information from the stream.
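As a quick sketch of the distinction being discussed: tmpfile() is always backed by a real file whose path shows up in the stream metadata, while php://temp stays in memory (up to its threshold) and never reports a real path:
$memStream  = fopen('php://temp', 'r+'); // memory-backed until the data exceeds the threshold
$fileStream = tmpfile();                 // always a real file in sys_get_temp_dir()

var_dump(stream_get_meta_data($memStream)['uri']);  // "php://temp" - nothing to unlink
var_dump(stream_get_meta_data($fileStream)['uri']); // e.g. "/tmp/phpAbC123" - a path you can clean up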

Flysystem S3 remote file download always corrupted

I recently started using Flysystem in an existing application with the intention of abstracting the local and remote (specifically, S3) filesystems. Everything was working ok on my development environment, on which I successfully configured the LocalAdapter. However, I cannot get S3 file downloads to work. I'd like to point out that file uploads are working perfectly, given that I can successfully download the file by manually browsing the S3 bucket in the AWS management console. That being said, I will skip the code that initializes the $filesystem variable.
My application is using a PSR-7 approach. That is, the code below is inside a function that is passed an object of type Psr\Http\Message\ServerRequestInterface as first argument and an object of type Psr\Http\Message\ResponseInterface as the second. Given that the local filesystem works fine, I think it is safe to assume that the problem doesn't lie there.
This is the code:
<?php
$stream = new \Zend\Diactoros\Stream($filesystem->readStream($filename));
$filesize = $stream->getSize();

return $response
    ->withHeader('Content-Type', 'application/pdf')
    ->withHeader('Content-Transfer-Encoding', 'Binary')
    ->withHeader('Content-Description', 'File Transfer')
    ->withHeader('Pragma', 'public')
    ->withHeader('Expires', '0')
    ->withHeader('Cache-Control', 'must-revalidate')
    ->withHeader('Content-Length', "{$filesize}")
    ->withBody($stream);
When I dump the $stream variable and the $filesize variable the results are as expected. The remote file contents are successfully printed. However, the file download is always corrupted and the file size is always of 0 bytes.
I am assuming that Flysystem takes care of everything behind the scenes and that I don't have to manually download the file to a temp folder first, before serving it to the client.
Any clue to what could be the problem?
Update 1
I have also tried with the following code, without any luck. However, it continues to work locally:
use Zend\Diactoros\CallbackStream;
$stream = new CallbackStream(function () use ($filesystem, $filename) {
    $resource = $filesystem->readStream($filename);
    while (!feof($resource)) {
        echo fread($resource, 1024);
    }
    fclose($resource);
    return '';
});
and
use Zend\Diactoros\CallbackStream;
$stream = new CallbackStream(function () use ($filesystem, $filename) {
    $resource = $filesystem->readStream($filename);
    fpassthru($resource);
    return '';
});
Removing the Content-Length header seems to solve the problem.
See https://github.com/thephpleague/flysystem/issues/543 for more details.
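For reference, a minimal sketch of the same response with the problematic header dropped; if a length is still needed, it can be taken from Flysystem itself rather than the non-seekable remote stream (getSize() on the Filesystem object is a Flysystem v1 method):
$stream = new \Zend\Diactoros\Stream($filesystem->readStream($filename));
// $filesize = $filesystem->getSize($filename); // optional: ask Flysystem, not the stream

return $response
    ->withHeader('Content-Type', 'application/pdf')
    ->withHeader('Content-Description', 'File Transfer')
    ->withHeader('Cache-Control', 'must-revalidate')
    ->withBody($stream);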

Use file_get_contents to download multiple files at the same time in PHP?

UPDATE: THIS IS AN S3 BUCKET QUESTION (SEE ANSWER)
I am looking to update some code that reads files from an S3 bucket using the file_get_contents command, downloading one file at a time.
Start:
file_get_contents('s3://file1.json');
Wait until finished, then start the next download:
file_get_contents('s3://file2.json');
Instead, I want them all to start at once to save time, like this:
Start both at the same time:
file_get_contents('s3://file1.json');
file_get_contents('s3://file2.json');
Wait for them both to finish at the same time.
I have seen multi curl requests, but nothing for file_get_contents on this topic. Is it possible?
EDIT: Currently the code I am looking at uses s3:// which doesn't seem to work with curl. This is a way of getting to Amazon's S3 bucket.
EDIT2: Sample of current code :
function get_json_file( $filename = false ){
    if (!$filename) return false;
    // builds s3://somefile.on.amazon.com/file.json
    $path = $this->get_json_filename( $filename );
    if (!$filename || !file_exists($path)) {
        return false;
    } elseif (file_exists($path)) {
        $data = file_get_contents($path);
    } else {
        $data = false;
    }
    return ( empty( $data ) ? false : json_decode( $data, true ) );
}
ANSWER: S3 SECURED, REQUIRES SPECIAL URI
Thank you guys for responding in comments. I was WAY off base with this earlier today when I asked this question.
This is the deal: the version of PHP we are using does not allow for threading. Hence, to request multiple URLs you need to use the multi curl option. The s3:// prefix somehow worked with a class file for S3 that I had missed before; hence the weird naming.
ALSO, IMPORTANT: if you don't care about protecting the data on S3, you can just make it public and NOT have this issue. In my case the data needs to be somewhat protected, so that requires a bunch of things in the URI.
You can use Amazon's S3 class to generate a secure link from the URI and the bucket name on S3. This will return the proper URL to use with your bucket. The S3 class can be downloaded manually or installed via Composer (in Laravel, for example). It requires that you create a user with an access key in the AWS console.
$bucketName = "my-amazon-bucket";
$uri = "/somefile.txt"
$secure_url = $s3->getObjectUrl($bucketName, $uri, '+5 minutes');
This will generate a valid URL to access the file on S3 which can be used with curl etc..
https://domain.s3.amazonaws.com/bucket/filename.zip?AWSAccessKeyId=myaccesskey&Expires=1305311393&Signature=mysignature
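Putting it together, a rough sketch (not the exact code used) of fetching two such presigned URLs in parallel with curl_multi; the file names and the $s3/$bucketName variables are assumptions carried over from the snippet above:
$urls = [
    $s3->getObjectUrl($bucketName, '/file1.json', '+5 minutes'),
    $s3->getObjectUrl($bucketName, '/file2.json', '+5 minutes'),
];

$mh = curl_multi_init();
$handles = [];
foreach ($urls as $i => $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);
    $handles[$i] = $ch;
}

// run all transfers at the same time
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);
} while ($running > 0);

$results = [];
foreach ($handles as $i => $ch) {
    $results[$i] = json_decode(curl_multi_getcontent($ch), true);
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);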

How to rename or move a file in Google Cloud Storage (PHP API)

I am currently trying to rename and/or move a Cloud Storage file to another name/position, but I can't get it to work. I am using https://github.com/google/google-api-php-client as the client; uploads work fine with:
...
$storageService = new \Google_Service_Storage( $client );
$file = new \Google_Service_Storage_StorageObject();
$file->setName( 'test.txt' );
$storageService->objects->insert(
    $bucketName,
    $file,
    array(
        'name' => $filename,
        'data' => file_get_contents( $somefile )
    )
);
...
So I have tried to change a filename with the $storageService->objects->update() method, but I cannot find any documentation on this. I used $storageService->objects->get( $bucketName, $fileName ) to get the specific file I wanted to rename (with $file->setName()), but it seems I just cannot pass the file to the objects->update() function. Am I doing it wrong?
OK, it seems I cannot directly rename a file (please correct me if I'm wrong); I could only update the metadata. I managed to get it to work by copying the file to a new filename/destination and then deleting the old file. I successfully used $storageService->objects->copy and $storageService->objects->delete for this. It doesn't feel right, but at least it works.
As this is not very well documented with google, here a basic example:
//RENAME FILE ON GOOGLE CLOUD STORAGE (GCS)
//Get client and auth token (might vary depending on the way you connect to gcs – here with laravel framework facade)
//DOC: https://cloud.google.com/storage/docs/json_api/v1/json-api-php-samples
//DOC: https://developers.google.com/api-client-library/php/auth/service-accounts
//Laravel Client: https://github.com/pulkitjalan/google-apiclient
//Get google client
$gc = \Google::getClient();
//Get auth token if it is not valid/not there yet
if ($gc->isAccessTokenExpired())
    $gc->getAuth()->refreshTokenWithAssertion();
//Get google cloud storage service with the client
$gcStorageO = new \Google_Service_Storage($gc);
//GET object at old position ($path)
//DOC: https://cloud.google.com/storage/docs/json_api/v1/objects/get
$oldObj = $gcStorageO->objects->get($bucket, $path);
//COPY desired object from old position ($path) to new position ($newpath)
//DOC: https://cloud.google.com/storage/docs/json_api/v1/objects/copy
$gcStorageO->objects->copy(
    $bucket, $path,
    $bucket, $newpath,
    $oldObj
);
//DELETE old object ($path)
//DOC: https://cloud.google.com/storage/docs/json_api/v1/objects/delete
$gcStorageO->objects->delete($bucket, $path);
I found that when using gcutils in conjunction with PHP, you can execute pretty much every PHP file command on App Engine: copy, delete, check if a file exists.
if(file_exists("gs://$bucket/{$folder}/$old_temp_file")){
$old_path = "gs://$bucket/{$folder}/$old_temp_file";
$new_permanent_path = "gs://$bucket/{$folder}/$new_permanent_file";
copy($old_path, $new_permanent_path);
unlink($old_path);
}

How to copy an image to Amazon S3?

I have a problem copying an image to Amazon S3.
I am using the PHP copy function to copy the image from one server to another. It works on a GoDaddy host server, but it doesn't work for S3. Here is the code that is not working:
$strSource = 'http://img.youtube.com/vi/got6nXcpLGA/hqdefault.jpg';
copy($strSource, $dest);
$dest is my bucket URL, with the folder to upload images to already present.
I am not sure you could copy an image to AWS just like that. I would suggest using a library which talks to the AWS server and then running your commands.
Check this - http://undesigned.org.za/2007/10/22/amazon-s3-php-class
It provides a REST implementation for AWS.
For example, if you want to copy your image, you can do:
$s3 = new S3($awsAccessKey, $awsSecretKey);
$s3->copyObject($srcBucket, $srcName, $bucketName, $saveName, $metaHeaders = array(), $requestHeaders = array());
$awsAccessKey and $awsSecretKey are the access keys for your AWS account.
Check it out and hope it helps.
Not sure if you have used the AWS PHP SDK, but the AWS SDKs can come in handy in situations like this. The SDK can be used in conjunction with IAM roles to grant access to your S3 bucket. These are the steps:
Modify your code to use the PHP SDK to upload the files (if needed).
Create an IAM Role and grant the role permission to the needed S3 buckets.
When you start your EC2 instance, specify that you want to use the role.
Then your code will automatically use the permissions that you grant that role. IAM gives the instance temporary credentials that the SDK uses. These credentials are automatically rotated for you by IAM and EC2.
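As a rough sketch with the AWS SDK for PHP v3: when the client is built without explicit credentials on an EC2 instance that has a role attached, the SDK picks up the role's temporary credentials automatically (the bucket name, region, and key below are assumptions, not values from the question):
require 'vendor/autoload.php';

// no 'credentials' key: on EC2 the SDK falls back to the instance profile (IAM role)
$s3 = new \Aws\S3\S3Client([
    'version' => 'latest',
    'region'  => 'us-east-1', // assumed region
]);

// fetch the remote image and push it into the bucket
$imageData = file_get_contents('http://img.youtube.com/vi/got6nXcpLGA/hqdefault.jpg');

$s3->putObject([
    'Bucket'      => 'my-amazon-bucket',     // assumed bucket name
    'Key'         => 'images/hqdefault.jpg', // assumed key
    'Body'        => $imageData,
    'ContentType' => 'image/jpeg',
]);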
Here is my example, based on the documentation, of copying an object in an S3 bucket:
public function copyObject($sSourceKey, $sDestKey)
{
    $this->checkKey($sSourceKey);
    $this->checkKey($sDestKey);
    $bRet = false;
    // http://docs.aws.amazon.com/aws-sdk-php-2/latest/class-Aws.S3.S3Client.html#_copyObject
    try {
        $response = $this->_oS3Client->copyObject(
            array(
                'Bucket' => $this->getBucketName(),
                'Key' => $sDestKey,
                'CopySource' => urlencode($this->getBucketName() . '/' . $sSourceKey),
            )
        );
        if (isset($response['LastModified'])) {
            $bRet = true;
        }
    } catch (Exception $e) {
        $GLOBALS['error'] = 1;
        $GLOBALS["info_msg"][] = __METHOD__ . ' ' . $e->getMessage();
        $bRet = false;
    }
    return $bRet;
}
