Google Cloud Storage get temp filename (using fopen('php://temp')) - php

Similar question asked here a few years ago but with no answer:
Get path of temp file created via fopen('php://temp')
I am using Google Cloud Storage to download a number of large files in parallel and then upload them to another service. Essentially transferring from A to C, via my server B.
Under the hood, Google's StorageObject -> downloadAsStream() uses Guzzle to get the file via fopen('php://temp','r+').
I am running into a disk space issue because Google's Cloud Storage library is not cleaning up the temp files if there is an exception thrown during the transfer. (This is expected behaviour per the docs). Every retry of the script dumps another huge file in my tmp dir which isn't cleaned up.
If Guzzle used tmpfile() I would be able to use stream_get_meta_data()['uri'] to get the file path, but because it uses php://temp, this option seems to be blocked off:
[
"wrapper_type" => "PHP",
"stream_type" => "TEMP",
"mode" => "w+b",
"unread_bytes" => 0,
"seekable" => true,
"uri" => "php://temp", // <<<<<<<< grr.
]
So: does anyone know of a way to get the temporary file name created by fopen('php://temp') such that I can perform a manual clean-up?
UPDATE:
It appears this isn't possible. Hopefully GCS will update their library to change the way the temp file is generated. Until then I am using the following clean-up code:
public function cleanTempDir(int $timeout = 7200) {
    foreach (glob(sys_get_temp_dir()."/php*") as $f) {
        if (is_writable($f) && filemtime($f) < (time() - $timeout)) {
            unlink($f);
        }
    }
}
UPDATE 2
It is possible, see accepted answer below.

Something like the following should do the trick:
use Google\Cloud\Storage\StorageClient;

$client = new StorageClient;

$tempStream = tmpfile();
$tempFile = stream_get_meta_data($tempStream)['uri'];

try {
    $stream = $client->bucket('my-bucket')
        ->object('my-big-ol-file')
        ->downloadAsStream([
            'restOptions' => [
                'sink' => $tempStream
            ]
        ]);
} catch (\Exception $ex) {
    unlink($tempFile);
}
The restOptions option allows you to pass options through to the underlying HTTP/1.1 transport (Guzzle, by default). My apologies that this isn't clearly documented, but I hope it helps!
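As a follow-up, if the temp file should also disappear after a successful transfer, a try/finally variant works too. This is only a minimal sketch reusing the same hypothetical bucket and object names from above; the upload-to-destination step is indicated by a comment:

$tempStream = tmpfile();
$tempFile   = stream_get_meta_data($tempStream)['uri'];

try {
    $stream = $client->bucket('my-bucket')
        ->object('my-big-ol-file')
        ->downloadAsStream([
            'restOptions' => [
                'sink' => $tempStream
            ]
        ]);

    // ... stream the file on to the destination service here ...
} finally {
    // Runs on success and on failure, so the temp file never lingers.
    if (is_resource($tempStream)) {
        fclose($tempStream); // tmpfile() handles are removed on close
    }
    if (file_exists($tempFile)) {
        unlink($tempFile);
    }
}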

Google Cloud Platform Support here!
At the moment, using the PHP Cloud Storage library it is not possible to get the temporary file name created when using the method downloadAsStream(). Therefore I have created a Feature Request on your behalf; you can follow it here.
As a workaround, you may be able to remove the file by hand. You can get the temp file name using the following command:
$filename = shell_exec("ls -lt | awk 'NR==2' | cut -d: -f2 | cut -d ' ' -f2");
After that, $filename will contain the name of the last modified file, which will be the one that failed and that you wish to remove. With the filename you can now proceed to remove it.
Notice that you will have to be in the temporary directory backing php://temp (typically sys_get_temp_dir()) before executing the command.
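If shelling out is not an option, a rough PHP-only equivalent (my own suggestion, not part of the official library) is to pick the most recently modified php* file in the system temp directory:

// Hedged PHP-only alternative: grab the most recently modified "php*"
// file from the system temp directory and remove it by hand.
$candidates = glob(sys_get_temp_dir() . '/php*');
usort($candidates, function ($a, $b) {
    return filemtime($b) - filemtime($a);   // newest first
});

$filename = $candidates[0] ?? null;

if ($filename !== null && is_writable($filename)) {
    unlink($filename);
}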

It will most likely be the system-configured temporary directory, which you can get with sys_get_temp_dir().
Note that php://temp only saves to a file if needed; smaller payloads can reside entirely in memory.
https://www.php.net/manual/en/wrappers.php.php
Edit: OK, so the file does get created. Then you can probably use stream_get_meta_data() on the stream handle to get that information from the stream.
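For illustration, the in-memory behaviour can be seen with the documented php://temp/maxmemory: variant, which sets the threshold at which PHP spills the stream to a real temp file:

// php://temp keeps data in memory until it exceeds a threshold
// (2 MB by default); only then is a backing temp file created on disk.
$fp = fopen('php://temp/maxmemory:1048576', 'r+'); // spill to disk after 1 MB

fwrite($fp, str_repeat('x', 2 * 1048576));         // large enough to force the spill
var_dump(stream_get_meta_data($fp)['uri']);        // still reports the php:// URI,
                                                   // not the backing file path
fclose($fp);                                       // backing file is removed on close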

Related

Resolve relative urls of youtube using PHP

This question has been asked before, but none of the answers worked for me.
I use the following code to directly copy a file from a remote server to my server,
<?php
set_time_limit(0); // Unlimited max execution time

$remote_file_url = $_GET['url'];
$ext  = pathinfo($remote_file_url, PATHINFO_EXTENSION);
$name = basename($remote_file_url);

if (!empty($ext)) {
    $local_file = 'download/'.$name.'.'.$ext;
} else {
    $local_file = 'download/'.$name;
}

$copy = copy($remote_file_url, $local_file);

if (!$copy) {
    echo "Doh! Failed to copy $remote_file_url...\n";
} else {
    echo "WOOT! Successfully copied $remote_file_url...\n";
}
?>
It works well but it doesn't copy the files I get from Youtube. I use 1-Click Youtube Video Downloader extension for Firefox which gives me direct link to youtube videos. I can use these direct links in browser and Internet Download Manager as well.
For example the direct url of
https://www.youtube.com/watch?v=xPXrJwQ5lqQ
is
https://r6---sn-ab5l6nzy.googlevideo.com/videoplayback?ipbits=0&requiressl=yes&sparams=dur,ei,expire,id,initcwndbps,ip,ipbits,ipbypass,itag,lmt,mime,mip,mm,mn,ms,mv,pl,ratebypass,requiressl,source&ei=3DNOWfq4CImGc9rxvcgO&signature=3D188D073D872381433A45462E84928383D10D02.4E0AF7D777E76AA19A576D42983A81F4E62EF84D&lmt=1472135086539955&mime=video%2Fmp4&ratebypass=yes&id=o-ABaoUEn3pBt5SLXdWXlrzCdteMLfLPizrRTPoakDoLSX&expire=1498318908&source=youtube&dur=119.211&itag=22&pl=20&ip=162.217.31.128&key=cms1&redirect_counter=1&req_id=ce038b9993a9a3ee&cms_redirect=yes&ipbypass=yes&mip=159.203.89.210&mm=31&mn=sn-ab5l6nzy&ms=au&mt=1498297234&mv=m
The problem is that my code can't copy this file to my server. I would like to know if there is any way to resolve such URLs.
The error is
failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden in /home/...
Thanks in advance.
Well, I have no idea why that happened (could the link have expired? I hope not). I just tried another link for the above video (copied using right click) in your code as the $remote_file_url and it worked as expected.
How did I get that link?
I used the underlying library of the 1-Click Youtube Video Downloader extension: YouTube-Downloader. This way you have more control over the process. After hosting the files on your web server, simply run index.php, and when you use it you'll get something like:
Then you can automate this last part to suit your needs.
That doesn't mean that every video can be smoothly downloaded with this method, because some videos have signature issues or were only recently uploaded; here's the list of issues of YouTube-Downloader.
For that there is a fix that is somewhat involved: youtube-dl-php. It is based on a sound principle: there is a very good command line utility for downloading YouTube videos called youtube-dl; here is the download page.
Basically, you just call it from PHP. Note that youtube-dl needs to be installed (and on the PATH) for the following to work.
After you install Composer, go to your web project folder
and run composer require norkunas/youtube-dl-php as explained in the Github page
When running its example, I got an error:
proc_open() 267 CreateProcess failed
I've never dealt with Symfony before, and I found it particularly interesting to play with YoutubeDl.php: redefining the $arguments passed to createProcess, commenting out much of the less useful configuration to get rid of that error, and giving it more time to run with
ini_set('max_execution_time', 300);
And, yikes, the video downloaded.
You don't have to follow this unless you can't figure out a better way; it is just supposed to give you an idea of where the problem lies, if you haven't figured it out already and if you have this problem in the first place.
private function createProcess(array $arguments = [])
{
    array_unshift($arguments, $this->binPath ?: 'youtube-dl');

    // Hack: ignore $arguments and hard-code the youtube-dl invocation.
    $process = new Process("youtube-dl https://www.youtube.com/watch?v=nDMwW41AlSI");

    /*
    $process->setEnv(['LANG' => 'en_US.UTF-8']);
    $process->setTimeout($this->timeout);
    $process->setOptions($this->processOptions);
    if ($this->moveWithPhp) {
        $cwd = sys_get_temp_dir();
    } else {
        $cwd = $this->downloadPath ?: sys_get_temp_dir();
    }
    $process->setWorkingDirectory($cwd);
    */

    return $process;
}
Or you can just write your own code that calls youtube-dl, good luck!
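For the "write your own code" route, here is a minimal, hedged sketch of shelling out to youtube-dl directly (it assumes the binary is installed and on the PATH; the function name and output template are my own, not part of any library):

// Minimal sketch: call the youtube-dl CLI from PHP and throw on failure.
function downloadVideo(string $url, string $dir = '/tmp'): string
{
    $cmd = sprintf(
        'youtube-dl -o %s %s 2>&1',
        escapeshellarg($dir . '/%(title)s.%(ext)s'), // youtube-dl output template
        escapeshellarg($url)
    );

    exec($cmd, $output, $exitCode);

    if ($exitCode !== 0) {
        throw new RuntimeException("youtube-dl failed:\n" . implode("\n", $output));
    }

    return implode("\n", $output);
}

// Usage:
// echo downloadVideo('https://www.youtube.com/watch?v=xPXrJwQ5lqQ');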

php - unlink throws error: Resource temporarily unavailable

Here is the piece of code:
public function uploadPhoto(){
    $filename = '../storage/temp/image.jpg';
    file_put_contents($filename, file_get_contents('http://example.com/image.jpg'));
    $photoService->uploadPhoto($filename);
    echo("If file exists: ".file_exists($filename));
    unlink($filename);
}
I am trying to do the following things:
Get a photo from a URL and save it in a temp folder on my server. This works fine: the image file is created, and echo("If file exists: ".file_exists('../storage/temp/image.jpg')); prints If file exists: 1.
Pass that file to another function that handles uploading the file to an Amazon S3 bucket. The file gets stored in my S3 bucket.
Delete the photo stored in the temp folder. This doesn't work! I get an error saying:
unlink(../storage/temp/image.jpg): Resource temporarily unavailable
If I use rename($filename,'../storage/temp/renimage.jpg'); instead of unlink($filename); i get an error:
rename(../storage/temp/image.jpg,../storage/temp/renimage.jpg): The process cannot access the file because it is being used by another process. (code: 32)
If I remove the function call $photoService->uploadPhoto($filename);, everything works perfectly fine.
If the file is being used by another process, how do I unlink it after the process has been completed and the file is no longer being used by any process? I do not want to use timers.
Please help! Thanks in advance.
Simplest solution:
gc_collect_cycles();
unlink($file);
Does it for me!
Straight after uploading a file to Amazon S3, it allows me to delete the file on my server.
See here: https://github.com/aws/aws-sdk-php/issues/841
The GuzzleHttp\Stream object holds onto a resource handle until its __destruct method is called. Normally, this means that resources are freed as soon as a stream falls out of scope, but sometimes, depending on the PHP version and whether a script has yet filled the garbage collector's buffer, garbage collection can be deferred. gc_collect_cycles will force the collector to run and call __destruct on all unreachable stream objects.
:)
Just had to deal with a similar Error.
It seems your $photoService is holding on to the image for some reason...
Since you didn't share the code of $photoService, my suggestion would be to do something like this (assuming you don't need $photoService anymore):
[...]
echo("If file exists: ".file_exists($filename));
unset($photoService);
unlink($filename);
}
unset() will destroy the given variable/object, so it can't "use" (or whatever it does with) any files.
I sat over this problem for an hour or two, and finally realized that "temporarily unavailable" really means "temporarily".
In my case, concurrent PHP scripts access the file, either writing or reading. And when the unlink() call had poor timing, the whole thing failed.
The solution was quite simple: use the (generally not very advisable) @ to prevent the error being shown to the user (sure, one could also stop errors from being printed), and then have another try:
$gone = false;
for ($trial = 0; $trial < 10; $trial++) {
    if ($gone = @unlink($filename)) {
        break;
    }
    // Wait a short time
    usleep(250000);
    // Maybe a concurrent script has deleted the file in the meantime
    clearstatcache();
    if (!file_exists($filename)) {
        $gone = true;
        break;
    }
}
if (!$gone) {
    trigger_error('Warning: Could not delete file '.htmlspecialchars($filename), E_USER_WARNING);
}
After solving this issue and pushing my luck further, I could also trigger the "Resource temporarily unavailable" issue with file_put_contents(). Same solution, now everything works fine.
If I'm wise enough, and/or unlinking fails in the future, I'll replace the @ with ob_start(), so the error message can tell me the exact error.
I had the same problem. The S3 client doesn't seem to release the file before unlink() is executed. If you extract the contents into a variable and set it as the 'Body' in the putObject array:
$fileContent = file_get_contents($filepath);

$result = $s3->putObject(array(
    'Bucket'      => $bucket,
    'Key'         => $folderPath,
    'Body'        => $fileContent,
    //'SourceFile' => $filepath,
    'ContentType' => 'text/csv',
    'ACL'         => 'public-read'
));
See this answer: How to unlock the file after AWS S3 Helper uploading file?
The unlink() method returns a bool value, so you can build a loop with a short wait and a retry limit to wait for your processes to complete.
Additionally, put "@" on the unlink to hide the access error.
Throw another error/exception if the retry count is reached.
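A compact sketch of that retry idea (the function name and timings are my own, not from any library):

// Try unlink() a few times, wait briefly between attempts, and
// raise an exception if the file still exists after the last retry.
function unlinkWithRetries(string $filename, int $maxRetries = 10): void
{
    for ($i = 0; $i < $maxRetries; $i++) {
        if (@unlink($filename)) {
            return;                  // deleted successfully
        }
        clearstatcache();
        if (!file_exists($filename)) {
            return;                  // someone else removed it in the meantime
        }
        usleep(250000);              // wait 250 ms before the next attempt
    }
    throw new RuntimeException("Could not delete $filename after $maxRetries attempts");
}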

Use file_get_contents to to download multiple files at same time in php?

UPDATE: THIS IS AN S3 BUCKET QUESTION (SEE ANSWER)
I am working on some code that reads files from an S3 bucket and uses file_get_contents to download the files one at a time.
Start
file_get_contents(s3://file1.json)
Wait until finished, then start the next download:
file_get_contents(s3://file2.json)
And instead I want them all to start at once to save time. Like this:
Start both at same time:
file_get_contents(s3://file1.json)
file_get_contents(s3://file2.json)
Wait for them both at same time to finish.
I have seen multi curl requests, but nothing for file_get_contents on this topic. Is it possible?
EDIT: Currently the code I am looking at uses s3:// which doesn't seem to work with curl. This is a way of getting to Amazon's S3 bucket.
EDIT2: Sample of current code :
function get_json_file( $filename = false ){
    if (!$filename) return false;

    // builds s3://somefile.on.amazon.com/file.json
    $path = $this->get_json_filename( $filename );

    if (!$filename || !file_exists($path)) {
        return false;
    } elseif (file_exists($path)) {
        $data = file_get_contents($path);
    } else {
        $data = false;
    }

    return ( empty( $data ) ? false : json_decode( $data, true ) );
}
ANSWER: S3 SECURED, REQUIRES SPECIAL URI
Thank you guys for responding in comments. I was WAY off base with this earlier today when I asked this question.
This is the deal: the version of PHP we are using does not allow for threading. Hence, to request multiple URLs you need to use the multi curl option. The s3:// scheme somehow worked with a class file for S3 that I had missed before, hence the weird naming.
ALSO, IMPORTANT, if you don't care about protecting the data on S3, you can just make it public and NOT have this issue. In my case the data needs to be somewhat protected so that requires a bunch of things in the URI.
You can use Amazon's S3 class to generate a secure link from the URI and the bucket name on S3. This will return the proper URL to use with your bucket. The S3 class can be downloaded manually or installed via Composer (in Laravel, for example). It requires that you create a user with an access key in the AWS console.
$bucketName = "my-amazon-bucket";
$uri = "/somefile.txt";

$secure_url = $s3->getObjectUrl($bucketName, $uri, '+5 minutes');
This will generate a valid URL to access the file on S3 which can be used with curl etc..
https://domain.s3.amazonaws.com/bucket/filename.zip?AWSAccessKeyId=myaccesskey&Expires=1305311393&Signature=mysignature
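Once each object has a presigned URL, a curl_multi loop can fetch them in parallel. This is only a hedged sketch (the helper name is mine; $urls is assumed to be an array of URLs produced by getObjectUrl()):

// Fetch several presigned S3 URLs concurrently with curl_multi.
function fetchAll(array $urls): array
{
    $mh = curl_multi_init();
    $handles = [];

    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until every handle has finished.
    do {
        $status = curl_multi_exec($mh, $active);
        if ($active) {
            curl_multi_select($mh);
        }
    } while ($active && $status === CURLM_OK);

    $results = [];
    foreach ($handles as $key => $ch) {
        $results[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);

    return $results;
}

// Usage:
// $bodies = fetchAll([$secure_url_1, $secure_url_2]);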

Concurrent writing to zip file

I have a process written in PHP. This process gets a file from the internet and puts it inside a zip file. The target zip file is chosen by an algorithm: there are 4096 zip files, and the target is determined by a hash of the processed URL.
I have another program that launches HTTP requests, so I can run the script concurrently (around 110 processes).
My question is simple. Since the scheduling is pseudorandom, two threads can easily try to add files to the same zip file at the same moment.
Is that possible? Will the file get corrupted if two processes try to add files at the same time?
Locking the file, or something like that, would be a possible solution.
I was thinking of using semaphores but, from what I've read, PHP semaphores don't work under Windows.
I have seen this possible solution:
if ( !function_exists('sem_get') ) {
    function sem_get($key) { return fopen(__FILE__.'.sem.'.$key, 'w+'); }
    function sem_acquire($sem_id) { return flock($sem_id, LOCK_EX); }
    function sem_release($sem_id) { return flock($sem_id, LOCK_UN); }
}
Anyway, the question is whether it is safe to add files to a zip file from two or more different PHP processes at the same time.
Short answer: No! The zip algorithm analyses and compresses one stream at a time.
This is tough under Windows. It's far from easy in Linux! I would be tempted to create a db table with a unique index, and use that index number to determine a filename, or at least flag that a file is being written to.
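For completeness, here is a hedged sketch of the locking approach the question hints at: serialise writers with an exclusive flock() on a sidecar lock file, so only one process modifies a given archive at a time (the function and lock-file names are mine, not from any library):

// Only one process at a time may modify a given zip; others block on the lock.
function addToZipLocked(string $zipPath, string $entryName, string $contents): void
{
    $lock = fopen($zipPath . '.lock', 'c');     // sidecar lock file
    if (!$lock || !flock($lock, LOCK_EX)) {     // blocks until the lock is free
        throw new RuntimeException("Could not lock $zipPath");
    }

    try {
        $zip = new ZipArchive();
        if ($zip->open($zipPath, ZipArchive::CREATE) !== true) {
            throw new RuntimeException("Could not open $zipPath");
        }
        $zip->addFromString($entryName, $contents);
        $zip->close();
    } finally {
        flock($lock, LOCK_UN);
        fclose($lock);
    }
}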

Creating files on a time (hourly) basis

I'm experimenting with the Twitter streaming API.
I use Phirehose to connect to Twitter and fetch the data, but I'm having problems storing it in files for further processing.
Basically what I want to do is to create a file named
date("YmdH").".txt"
for every hour of connection.
Here is what my code looks like right now (not handling the hourly change of files):
public function enqueueStatus($status)
{
    $data = json_decode($status, true);
    if (isset($data['text']) /* more conditions here */) {
        $fp = fopen("/tmp/$time.txt", "w");
        fwrite($fp, $status);
        fclose($fp);
    }
}
Help is as always much appreciated :)
You want the 'append' mode in fopen - this will either append to a file or create it.
if (isset($data['text']) /* more conditions here */) {
    $fp = fopen("/tmp/" . date("YmdH") . ".txt", "a");
    fwrite($fp, $status);
    fclose($fp);
}
From the Phirehose googlecode wiki:
As of Phirehose version 0.2.2 there is an example of a simple "ghetto queue" included in the tarball (see files: ghetto-queue-collect.php and ghetto-queue-consume.php) that shows how statuses could be easily collected on to the filesystem for processing and then picked up by a separate process (consume).
This is a complete working sample of doing what you want to do. The rotation time interval is configurable too. Additionally there's another script to consume and process the written files too.
Now if only I could find a way to stop the whole script; my log keeps filling up (the script continues executing) even if I close the browser tab :P
