gzread() fails when continuing reads in separate PHP requests - php

I have a gz compressed file and I'm uncompressing it to a normal file, so I use fwrite(). It works fine when the whole uncompression happens in a single PHP request.
Because the compressed files are large, I have to mind PHP timeouts: I uncompress for up to 30 seconds, stop the process, store the current offset of the gz file using gztell(), and then continue from where I left off in the next PHP request.
I call gzseek() with the stored offset and continue the uncompression, but gzread() keeps returning an empty string.
function gz_uncompress_file($source, $offset = 0) {
    $dest = str_replace('.gz', '', $source);
    $fp_in = gzopen($source, 'rb');
    if (empty($fp_in)) {
        return 'Cannot open gzfile to uncompress sql';
    }
    $fp_out = fopen($dest, 'ab');
    if (empty($fp_out)) {
        return 'Cannot open temp file to uncompress sql';
    }
    gzseek($fp_in, $offset);
    $break = false;
    while (!gzeof($fp_in)) {
        $chunk_data = gzread($fp_in, 1024 * 512);
        if (empty($chunk_data)) {
            return 'empty string so stop the process';
        }
        fwrite($fp_out, $chunk_data);
        // Clearing to save memory
        unset($chunk_data);
        if (!is_time_out()) {
            continue;
        }
        $break = true;
        $offset = gztell($fp_in);
        break;
    }
    fclose($fp_out);
    gzclose($fp_in);
    if ($break) {
        // Saving offset
        $this->set_option('sql_gz_uncompression_offset', $offset);
        return 'continue_from_call';
    }
    echo "Uncompression done";
}
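For reference, here is a minimal sketch of the offset round-trip this code relies on (dump.sql.gz is a made-up name; note that gzseek() returns -1 on failure, which the function above never checks, and that on read-only gz streams a seek is emulated by decompressing from the start of the file):
// Request 1: consume some data and remember the uncompressed offset.
$gz = gzopen('dump.sql.gz', 'rb');
gzread($gz, 1024 * 512);
$offset = gztell($gz);
gzclose($gz);

// Request 2: reopen and resume from the stored offset.
$gz = gzopen('dump.sql.gz', 'rb');
if (gzseek($gz, $offset) === -1) {
    die('gzseek failed');
}
$chunk = gzread($gz, 1024 * 512); // should continue where request 1 stopped
gzclose($gz);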

Related

How to cache remote video

I want to play video from a remote server, so I wrote this function.
$remoteFile = 'blabla.com/video_5GB.mp4';
play($remoteFile);

function play($url) {
    ini_set('memory_limit', '1024M');
    set_time_limit(3600);
    ob_start();
    if (isset($_SERVER['HTTP_RANGE'])) {
        $opts['http']['header'] = "Range: " . $_SERVER['HTTP_RANGE'];
    }
    $opts['http']['method'] = "HEAD";
    $conh = stream_context_create($opts);
    $opts['http']['method'] = "GET";
    $cong = stream_context_create($opts);
    $out[] = file_get_contents($url, false, $conh);
    $out[] = $http_response_header;
    ob_end_clean();
    array_map("header", $http_response_header);
    readfile($url, false, $cong);
}
The above function works very well for playing videos, but I don't want to burden the remote server.
My question is: how can I cache the video files on my server, refreshed every 5 hours? If possible, the cache folder should contain small files (5 MB / 10 MB) cut from the remote video.
As mentioned in my comment, the following code has been tested only on a small selection of MP4 files. It could probably do with some more work, but it fills your immediate needs as it is.
It uses exec() to spawn a separate process that generates the cache files when they are needed, i.e. on the first request or after 5 hours. Each video must have its own cache folder because the cached chunks are simply called 1, 2, 3, etc. Please see the additional comments in the code.
play.php - This is the script that will be called by the users from the browser
<?php
ini_set('memory_limit', '1024M');
set_time_limit(3600);

$remoteFile = 'blabla.com/video_5GB.mp4';
play($remoteFile);

/**
 * @param string $url
 *
 * This will serve the video from the remote url
 */
function playFromRemote($url)
{
    ob_start();
    $opts = array();
    if(isset($_SERVER['HTTP_RANGE']))
    {
        $opts['http']['header'] = "Range: ".$_SERVER['HTTP_RANGE'];
    }
    $opts['http']['method'] = "HEAD";
    $conh = stream_context_create($opts);
    $opts['http']['method'] = "GET";
    $cong = stream_context_create($opts);
    $out[] = file_get_contents($url, false, $conh);
    $out[] = $http_response_header;
    ob_end_clean();
    $fh = fopen('response.log', 'a');
    if($fh !== false)
    {
        fwrite($fh, print_r($http_response_header, true)."\n\n\n\n");
        fclose($fh);
    }
    array_map("header", $http_response_header);
    readfile($url, false, $cong);
}
/**
 * @param string $cacheFolder Directory in which to find the cached chunk files
 * @param string $url
 *
 * This will serve the video from the cache. It uses a "completed.log" file which holds the byte ranges of each chunk;
 * this makes it easier to locate the first chunk of a range request. The file is generated by the cache script.
 */
function playFromCache($cacheFolder, $url)
{
    $bytesFrom = 0;
    $bytesTo = 0;
    if(isset($_SERVER['HTTP_RANGE']))
    {
        //the client asked for a specific range, extract those from the http_range server var
        //can take the form "bytes=123-567" or just a "from" such as "bytes=123-"
        $matches = array();
        if(preg_match('/^bytes=(\d+)-(\d+)?$/', $_SERVER['HTTP_RANGE'], $matches))
        {
            $bytesFrom = intval($matches[1]);
            if(!empty($matches[2]))
            {
                $bytesTo = intval($matches[2]);
            }
        }
    }
    //completed log is a json_encoded file containing an array of byte ranges that directly
    //correspond with the chunk files generated by the cache script
    $log = json_decode(file_get_contents($cacheFolder.DIRECTORY_SEPARATOR.'completed.log'));
    $totalBytes = 0;
    $chunk = 0;
    foreach($log as $ind => $bytes)
    {
        //find the first chunk file we need to open
        if($bytes[0] <= $bytesFrom && $bytes[1] > $bytesFrom)
        {
            $chunk = $ind + 1;
        }
        //and while we are at it save the last byte range "to" which is the total number of bytes of all the chunk files
        $totalBytes = $bytes[1];
    }
    if($bytesTo === 0)
    {
        if($totalBytes === 0)
        {
            //if we get here then something is wrong with the cache, revert to serving from the remote
            playFromRemote($url);
            return;
        }
        $bytesTo = $totalBytes - 1;
    }
    //calculate how many bytes will be returned in this request
    $contentLength = $bytesTo - $bytesFrom + 1;
    //send some headers - I have hardcoded MP4 here because that is all I have developed with
    //if you are using different video formats then testing and changes will no doubt be required
    header('Content-Type: video/mp4');
    header('Content-Length: '.$contentLength);
    header('Accept-Ranges: bytes');
    //Send a header so we can recognise that the content was indeed served by the cache
    header('X-Cached-Date: '.(date('Y-m-d H:i:s', filemtime($cacheFolder.DIRECTORY_SEPARATOR.'completed.log'))));
    if($bytesFrom > 0)
    {
        //We are sending back a range so it needs a header and the http response must be 206: Partial Content
        header(sprintf('content-range: bytes %s-%s/%s', $bytesFrom, $bytesTo, $totalBytes));
        http_response_code(206);
    }
    $bytesSent = 0;
    while(is_file($cacheFolder.DIRECTORY_SEPARATOR.$chunk) && $bytesSent < $contentLength)
    {
        $cfh = fopen($cacheFolder.DIRECTORY_SEPARATOR.$chunk, 'rb');
        if($cfh !== false)
        {
            //if we are fetching a range then we might need to seek the correct starting point in the first chunk we look at
            //this check will be performed on all chunks but only the first one should need seeking so no harm done
            if($log[$chunk - 1][0] < $bytesFrom)
            {
                fseek($cfh, $bytesFrom - $log[$chunk - 1][0]);
            }
            //read and send data until the end of the file or we have sent what was requested
            while(!feof($cfh) && $bytesSent < $contentLength)
            {
                $data = fread($cfh, 1024);
                //check we are not going to be sending too much back and if we are then truncate the data to the correct length
                if($bytesSent + strlen($data) > $contentLength)
                {
                    $data = substr($data, 0, $contentLength - $bytesSent);
                }
                $bytesSent += strlen($data);
                echo $data;
            }
            fclose($cfh);
        }
        //move to the next chunk
        $chunk++;
    }
}
function play($url)
{
    //I have chosen a simple way to make a folder name, this can be improved any way you need
    //IMPORTANT: Each video must have its own cache folder
    $cacheFolder = sha1($url);
    if(!is_dir($cacheFolder))
    {
        mkdir($cacheFolder, 0755, true);
    }
    //First check if we are currently in the process of generating the cache and so just play from remote
    if(is_file($cacheFolder.DIRECTORY_SEPARATOR.'caching.log'))
    {
        playFromRemote($url);
    }
    //Otherwise check if we have never completed the cache or it was completed 5 hours ago and if so spawn a process to generate the cache
    elseif(!is_file($cacheFolder.DIRECTORY_SEPARATOR.'completed.log') || filemtime($cacheFolder.DIRECTORY_SEPARATOR.'completed.log') + (5 * 60 * 60) < time())
    {
        //fork the caching to a separate process - the & echo $! at the end causes the process to run as a background task
        //and print the process ID, returning immediately
        //The cache script can be anywhere, pass the location to sprintf in the first position
        //A base64 encoded url is passed in as argument 1, sprintf second position
        $cmd = sprintf('php %scache.php %s & echo $!', __DIR__.DIRECTORY_SEPARATOR, base64_encode($url));
        $pid = exec($cmd);
        //with that started we need to serve the request from the remote url
        playFromRemote($url);
    }
    else
    {
        //if we got this far then we have a completed cache so serve from there
        playFromCache($cacheFolder, $url);
    }
}
cache.php - This script will be called by play.php via exec()
<?php
//This script expects as argument 1 a base64 encoded url
if(count($argv) !== 2)
{
    die('Invalid Request!');
}
$url = base64_decode($argv[1]);
//make sure to use the same method of obtaining the cache folder name as the main play script
//or change the code to pass it in as an argument
$cacheFolder = sha1($url);
if(!is_dir($cacheFolder))
{
    die('Invalid Arguments!');
}
//double check it is not already running
if(is_file($cacheFolder.DIRECTORY_SEPARATOR.'caching.log'))
{
    die('Already Running');
}
//create a file so we know this has started, the file will be removed at the end of the script
file_put_contents($cacheFolder.DIRECTORY_SEPARATOR.'caching.log', date('d/m/Y H:i:s'));
//get rid of the old completed log
if(is_file($cacheFolder.DIRECTORY_SEPARATOR.'completed.log'))
{
    unlink($cacheFolder.DIRECTORY_SEPARATOR.'completed.log');
}
$bytesFrom = 0;
$bytesWritten = 0;
$totalBytes = 0;
//this is the size of the chunk files, currently 10MB
$maxSizeInBytes = 10 * 1024 * 1024;
$chunk = 1;
//open the url for binary reading and the first chunk for binary writing
$fh = fopen($url, 'rb');
$cfh = fopen($cacheFolder.DIRECTORY_SEPARATOR.$chunk, 'wb');
if($fh !== false && $cfh !== false)
{
    $log = array();
    while(!feof($fh))
    {
        $data = fread($fh, 1024);
        fwrite($cfh, $data);
        $totalBytes += strlen($data); //use actual length here
        $bytesWritten += strlen($data);
        //if we are at or past the chunk size then close the chunk and open a new one
        //keeping a log of the byte range of the chunk
        if($bytesWritten >= $maxSizeInBytes)
        {
            $log[$chunk - 1] = array($bytesFrom, $totalBytes);
            $bytesFrom = $totalBytes;
            fclose($cfh);
            $chunk++;
            $bytesWritten = 0;
            $cfh = fopen($cacheFolder.DIRECTORY_SEPARATOR.$chunk, 'wb');
        }
    }
    fclose($fh);
    $log[$chunk - 1] = array($bytesFrom, $totalBytes);
    fclose($cfh);
    //write the completed log. This is a json encoded string of the chunk byte ranges and will be used
    //by the play script to quickly locate the starting chunk of a range request
    file_put_contents($cacheFolder.DIRECTORY_SEPARATOR.'completed.log', json_encode($log));
    //finally remove the caching log so the play script doesn't think the process is still running
    unlink($cacheFolder.DIRECTORY_SEPARATOR.'caching.log');
}

Why does PHP delete the content of the file on input?

I have one file: configuration.txt.
This file gets read by PHP, then written by the same PHP script, while a C++ program reads the content of the same file at a regular interval.
PHP:
$closeFlag = false;
$arrayInputs = new SplFixedArray(3);
$arrayInputs[0] = "URL not entered";
$arrayInputs[1] = "3";
$arrayInputs[2] = "50";
$configFilePath = "/var/www/html/configuration.txt";

$currentSettingsFile = fopen($configFilePath, "r");
if(flock($currentSettingsFile, LOCK_SH)) {
    $arrayInputs = explode(PHP_EOL, fread($currentSettingsFile, filesize($configFilePath)));
    flock($currentSettingsFile, LOCK_UN);
    $closeFlag = fclose($currentSettingsFile);
}

if(isset($_POST['save_values'])) {
    if(!empty($_POST['getURL'])) {
        $arrayInputs[0] = $_POST['getURL'];
    }
    if(!empty($_POST['getURR'])) {
        $arrayInputs[1] = $_POST['getURR'];
    }
    if(!empty($_POST['getBrightness'])) {
        $arrayInputs[2] = $_POST['getBrightness'];
    }
}

if(!$closeFlag) fclose($currentSettingsFile);

$currentSettingsFile = fopen($configFilePath, "w");
if(flock($currentSettingsFile, LOCK_SH)) {
    foreach ($arrayInputs as $key => $value) {
        if($value != '')
            fwrite($currentSettingsFile, $value.PHP_EOL);
    }
    flock($currentSettingsFile, LOCK_UN);
    fclose($currentSettingsFile);
}
?>
C++
char configFilePath[] = "/var/www/html/configuration.txt";
std::fstream configFile;
configFile.open(configFilePath, std::fstream::in);
if(configFile.is_open()){
    // do stuff
} else {
    std::cout << "Error! Could not open Configuration file to read" << std::endl;
}
The C++ program has returned no error so far; it can open the file. PHP, however, returns Warning: fread(): Length parameter must be greater than 0 because the file is empty.
I believe that PHP is deleting the file's content.
When locking a file in PHP, you lock a LOCK file, not the main file. Example:
$myfile = 'myfile.txt';
$lockfile = 'myfile.lock';
$lock = fopen($lockfile, 'a');
if(flock($lock, LOCK_EX)) // The lock file is locked in exclusive mode - so I can write to it.
{
    $fp = fopen($myfile, 'w');
    fputs($fp, "I am writing safely!");
    fclose($fp);
    flock($lock, LOCK_UN); // Always unlock it!
}
fclose($lock);
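For completeness, a minimal sketch of the matching reader side, assuming the same myfile.lock: readers take a shared lock, so they can overlap each other but never overlap the writer.
$lock = fopen('myfile.lock', 'a');
if(flock($lock, LOCK_SH)) // Shared mode: many readers may hold it, but not while the writer does.
{
    $data = file_get_contents('myfile.txt');
    flock($lock, LOCK_UN);
}
fclose($lock);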
You work similarly in C++, because PHP is not locking the actual file; it is locking a lock file. The exact syntax depends heavily on your version of C/C++ and the operating system, so I will use minimal syntax.
FILE *lock = fopen(lockfile, "r+");
if(flock(fileno(lock), LOCK_EX) == 0) // flock() returns 0 on success
{
    // Locked. You can open a stream to ANOTHER file and play with it.
    flock(fileno(lock), LOCK_UN);
}
fclose(lock);

Video upload stopped working - YouTube API

I use a PHP script to upload some daily videos to a Youtube channel (based on this code sample: https://developers.google.com/youtube/v3/code_samples/php#resumable_uploads)
The problem is after this loop:
// Read the media file and upload it chunk by chunk.
$status = false;
$handle = fopen($videoPath, "rb");
while (!$status && !feof($handle)) {
    $chunk = fread($handle, $chunkSizeBytes);
    $status = $media->nextChunk($chunk);
}
Normally the $status variable holds the video id ($status['id']) after the upload is complete, but since mid-January all the uploads have failed with one of these errors:
The $status variable stays false.
Inside the catch, the Google_Service_Exception message is like "A service error occurred: Error calling PUT https://www.googleapis.com/upload/youtube/v3/videos?part=status%2Csnippet&uploadType=resumable&upload_id=xxx: (400) Invalid request. The number of bytes uploaded is required to be equal or greater than 262144, except for the final request (it's recommended to be the exact multiple of 262144). The received request contained nnn bytes, which does not meet this requirement.", where nnn is less than 262144 and seems to be the last request.
When I access the YouTube channel I can see the new videos with a status of "Preparing upload" or stuck at a fixed percentage.
My code has not changed for months, but now I can't upload any new video.
Can anyone please help me figure out what's wrong? Thanks in advance!
The solution proposed by @pom (thanks by the way) doesn't really solve this issue if you need to implement a progress indicator.
I'm facing the same problem as @astor; after calling ->nextChunk for the final chunk I get:
The number of bytes uploaded is required to be equal or greater than 262144, except for the final request (it's recommended to be the exact multiple of 262144). The received request contained 38099 bytes, which does not meet this requirement.
See this log file:
The code is copy-pasted from the google/google-api-php-client doc.
The log file shows the size of the first chunk being slightly larger than the others (except the last one), which I couldn't explain. Another "strange" thing is that its size changes from test to test.
The size of the last chunk seems correct, and at the end all bytes should have been uploaded. However, uploading the remaining bytes in the last chunk with ->nextChunk($chunk) throws this error.
One important precision is that my source file is on AWS S3. File operations (filesize, fread, fopen) are done with the Amazon S3 Stream Wrapper. It may add some headers or something else that causes the problem.
EDIT: I don't have this problem with local files.
Has anyone run into the same problem?
...
$chunkSizeBytes = 5 * 1024 * 1024;
$client->setDefer(true);
$request = $service->files->create($file);
$media = new Google_Http_MediaFileUpload(
    $client,
    $request,
    'text/plain',
    null,
    true,
    $chunkSizeBytes
);
$media->setFileSize(filesize(TESTFILE));

// Upload the various chunks. $status will be false until the process is
// complete.
$status = false;
$handle = fopen(TESTFILE, "rb");
while (!$status && !feof($handle)) {
    // read until you get $chunkSizeBytes from TESTFILE
    // fread will never return more than 8192 bytes if the stream is read buffered and it does not represent a plain file
    // An example of a read buffered file is when reading from a URL
    $chunk = readVideoChunk($handle, $chunkSizeBytes);
    $status = $media->nextChunk($chunk);
}
// The final value of $status will be the data from the API for the object
// that has been uploaded.
$result = false;
if ($status != false) {
    $result = $status;
}
fclose($handle);
}

function readVideoChunk($handle, $chunkSize)
{
    $byteCount = 0;
    $giantChunk = "";
    while (!feof($handle)) {
        // fread will never return more than 8192 bytes if the stream is read buffered and it does not represent a plain file
        $chunk = fread($handle, 8192);
        $byteCount += strlen($chunk);
        $giantChunk .= $chunk;
        if ($byteCount >= $chunkSize) {
            return $giantChunk;
        }
    }
    return $giantChunk;
}
Maybe you can try it like that, and then let MediaFileUpload cut the chunks itself:
$media = new \Google_Http_MediaFileUpload(
    $client,
    $insertRequest,
    'video/*',
    file_get_contents($pathToFile), // put the file content here instead of null
    true,
    $chunkSizeBytes
);
$media->setFileSize($size);

$status = false;
while (!$status) {
    $status = $media->nextChunk();
}
In the previous answer from eveyrat everything is okay except one thing: readVideoChunk() doesn't always return the exact number of bytes required by the chunk upload docs. Moreover, Google accepts only as many bytes as there were in the first chunk request. I fixed that issue by custom buffering of the overlap bytes:
(consider the following piece of code as a method of a class; the class has to have a $remaining property)
function readVideoChunk($handle, $chunkSize)
{
    $byteCount = 0;
    $giantChunk = "";
    while (!feof($handle)) {
        // fread will never return more than 8192 bytes if the stream is read buffered and it does not represent a plain file
        $chunk = fread($handle, 8192);
        $byteCount += strlen($chunk);
        $giantChunk .= $chunk;
        if ($byteCount > $chunkSize) {
            break;
        }
    }
    // prepend whatever was buffered by the previous call, keep exactly one
    // chunk, and buffer the rest for the next call
    $chunks = str_split(($this->remaining . $giantChunk), $chunkSize);
    if (count($chunks) > 1) {
        $giantChunk = $chunks[0];
        // rejoin the remainder in case str_split produced more than two pieces
        $this->remaining = implode('', array_slice($chunks, 1));
    } else {
        $giantChunk = $chunks[0];
        $this->remaining = '';
    }
    return $giantChunk;
}

How can I read XMP data from a JPG with PHP?

PHP has built-in support for reading EXIF and IPTC metadata, but I can't find any way to read XMP?
XMP data is literally embedded into the image file, so you can extract it with PHP's string functions from the image file itself.
The following demonstrates this procedure (I'm using SimpleXML, but every other XML API or even simple and clever string parsing may give you equal results):
$content = file_get_contents($image);
$xmp_data_start = strpos($content, '<x:xmpmeta');
$xmp_data_end = strpos($content, '</x:xmpmeta>');
$xmp_length = $xmp_data_end - $xmp_data_start;
$xmp_data = substr($content, $xmp_data_start, $xmp_length + 12); // +12 keeps the closing </x:xmpmeta> tag
$xmp = simplexml_load_string($xmp_data);
Just two remarks:
XMP makes heavy use of XML namespaces, so you'll have to keep an eye on that when parsing the XMP data with XML tools (see the sketch after these remarks).
Considering the possible size of image files, you'll perhaps not be able to use file_get_contents(), as this function loads the whole image into memory. Using fopen() to open a file stream resource and checking chunks of data for the key sequences <x:xmpmeta and </x:xmpmeta> will significantly reduce the memory footprint.
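To illustrate the first remark, here is a minimal namespace-aware sketch (the rdf and dc URIs are the standard RDF and Dublin Core namespaces; the dc:title query is just an example, adjust it to whatever properties your files actually carry):
$xmp = simplexml_load_string($xmp_data);
// Register the namespaces used in the XPath query on this element.
$xmp->registerXPathNamespace('rdf', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#');
$xmp->registerXPathNamespace('dc', 'http://purl.org/dc/elements/1.1/');
// A Dublin Core title is stored as an rdf:li inside dc:title.
$titles = $xmp->xpath('//dc:title//rdf:li');
if (!empty($titles)) {
    echo (string) $titles[0];
}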
I'm only replying to this after so much time because this seems to be the best result when searching Google for how to parse XMP data. I've seen this nearly identical snippet used in code a few times and it's a terrible waste of memory. Here is an example of the fopen() method Stefan mentions after his example.
<?php
function getXmpData($filename, $chunkSize)
{
    if (!is_int($chunkSize)) {
        throw new RuntimeException('Expected integer value for argument #2 (chunkSize)');
    }
    if ($chunkSize < 12) {
        throw new RuntimeException('Chunk size cannot be less than 12 argument #2 (chunkSize)');
    }
    if (($file_pointer = fopen($filename, 'r')) === FALSE) {
        throw new RuntimeException('Could not open file for reading');
    }

    $startTag = '<x:xmpmeta';
    $endTag = '</x:xmpmeta>';
    $buffer = NULL;
    $hasXmp = FALSE;

    while (($chunk = fread($file_pointer, $chunkSize)) !== FALSE) {
        if ($chunk === "") {
            break;
        }
        $buffer .= $chunk;
        $startPosition = strpos($buffer, $startTag);
        $endPosition = strpos($buffer, $endTag);
        if ($startPosition !== FALSE && $endPosition !== FALSE) {
            $buffer = substr($buffer, $startPosition, $endPosition - $startPosition + 12);
            $hasXmp = TRUE;
            break;
        } elseif ($startPosition !== FALSE) {
            $buffer = substr($buffer, $startPosition);
            $hasXmp = TRUE;
        } elseif (strlen($buffer) > (strlen($startTag) * 2)) {
            $buffer = substr($buffer, strlen($startTag));
        }
    }

    fclose($file_pointer);
    return ($hasXmp) ? $buffer : NULL;
}
A simple way on Linux is to call the exiv2 program, available in an eponymous package on Debian.
$ exiv2 -e X extract image.jpg
will produce image.xmp containing the embedded XMP, which is now yours to parse.
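If you want to drive exiv2 from PHP, a hedged sketch along these lines should work (it assumes exiv2 is on the PATH and that the web user may execute it; the helper name is made up):
function xmpViaExiv2($jpegPath)
{
    // exiv2 writes the sidecar next to the image, swapping .jpg/.jpeg for .xmp
    $xmpPath = preg_replace('/\.jpe?g$/i', '.xmp', $jpegPath);
    shell_exec('exiv2 -e X extract ' . escapeshellarg($jpegPath));
    return is_file($xmpPath) ? simplexml_load_file($xmpPath) : false;
}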
I know... this is kind of an old thread, but it was helpful to me when I was looking for a way to do this, so I figured this might be helpful to someone else.
I took this basic solution and modified it so it handles the case where the tag is split between chunks. This allows the chunk size to be as large or small as you want.
<?php
function getXmpData($filename, $chunk_size = 1024)
{
    if (!is_int($chunk_size)) {
        throw new RuntimeException('Expected integer value for argument #2 (chunk_size)');
    }
    if ($chunk_size < 12) {
        throw new RuntimeException('Chunk size cannot be less than 12 argument #2 (chunk_size)');
    }
    if (($file_pointer = fopen($filename, 'rb')) === FALSE) {
        throw new RuntimeException('Could not open file for reading');
    }

    $tag = '<x:xmpmeta';
    $buffer = false;

    // find open tag
    while ($buffer === false && ($chunk = fread($file_pointer, $chunk_size)) !== false) {
        if (strlen($chunk) <= 10) {
            break;
        }
        if (($position = strpos($chunk, $tag)) === false) {
            // if open tag not found, back up just in case the open tag is on the split.
            fseek($file_pointer, -10, SEEK_CUR);
        } else {
            $buffer = substr($chunk, $position);
        }
    }

    if ($buffer === false) {
        fclose($file_pointer);
        return false;
    }

    $tag = '</x:xmpmeta>';
    $offset = 0;
    while (($position = strpos($buffer, $tag, $offset)) === false && ($chunk = fread($file_pointer, $chunk_size)) !== FALSE && !empty($chunk)) {
        $offset = strlen($buffer) - 12; // subtract the tag size just in case it's split between chunks.
        $buffer .= $chunk;
    }
    fclose($file_pointer);
    if ($position === false) {
        // this would mean the open tag was found, but the close tag was not. Maybe file corruption?
        throw new RuntimeException('No close tag found. Possibly corrupted file.');
    } else {
        $buffer = substr($buffer, 0, $position + 12);
    }
    return $buffer;
}
?>
Bryan's solution was the best one so far, but it had a few issues, so I modified it to simplify it and remove some functionality.
There were three issues I found with his solution:
A) If the chunk extracted falls right in between one of the strings we're searching for, it won't find it. Small chunk sizes are more likely to cause this issue.
B) If the chunk contains both the start AND the end, it won't find it. This is an easy one to fix with an extra if statement to recheck the chunk the start is found in, to see if the end is also found.
C) The else statement added to the end to break the while loop if it doesn't find the XMP data has a side effect: if the start element isn't found on the first pass, it will not check any more chunks. This is likely easy to fix too, but with the first issue it's not worth it.
My solution below isn't as powerful, but it's more robust. It will only check one chunk and extract the data from that, so it will only work if the start and end are in that chunk, and the chunk size needs to be large enough to ensure it always captures that data. From my experience with Adobe Photoshop/Lightroom exported files, the XMP data typically starts at around 20 kB and ends at around 45 kB. My chunk size of 50k seems to work nicely for my images; it would be much less if you strip some of that data on export, such as the CRS block that holds a lot of develop settings.
function getXmpData($filename)
{
    $chunk_size = 50000;
    $buffer = NULL;
    if (($file_pointer = fopen($filename, 'r')) === FALSE) {
        throw new RuntimeException('Could not open file for reading');
    }
    $chunk = fread($file_pointer, $chunk_size);
    if (($posStart = strpos($chunk, '<x:xmpmeta')) !== FALSE) {
        $buffer = substr($chunk, $posStart);
        $posEnd = strpos($buffer, '</x:xmpmeta>');
        $buffer = substr($buffer, 0, $posEnd + 12);
    }
    fclose($file_pointer);
    return $buffer;
}
Thank you Sebastien B. for that shortened version :). If you want to avoid the problem where chunk_size is just too small for some files, just add recursion:
function getXmpData($filename, $chunk_size = 50000)
{
    $buffer = NULL;
    if (($file_pointer = fopen($filename, 'r')) === FALSE) {
        throw new RuntimeException('Could not open file for reading');
    }
    $chunk = fread($file_pointer, $chunk_size);
    if (($posStart = strpos($chunk, '<x:xmpmeta')) !== FALSE) {
        $buffer = substr($chunk, $posStart);
        $posEnd = strpos($buffer, '</x:xmpmeta>');
        $buffer = substr($buffer, 0, $posEnd + 12);
    }
    fclose($file_pointer);
    // recursion here: retry with a doubled chunk size until the close tag is captured
    if (!strpos($buffer, '</x:xmpmeta>')) {
        $buffer = getXmpData($filename, $chunk_size * 2);
    }
    return $buffer;
}
I've developed the XMP PHP Toolkit extension: it's a PHP 5 extension based on the Adobe XMP Toolkit, which provides the main classes and methods to read/write/parse XMP metadata from JPEG, PSD, PDF, video, audio... This extension is under the GPL license. A new release will be available soon for PHP 5.3 (currently it is only compatible with PHP 5.2.x) and should become available on Windows and macOS (currently only for FreeBSD and Linux systems).
http://xmpphptoolkit.sourceforge.net/
If you have ExifTool available (a very useful tool) and can run external commands, you can use its options to extract XMP data (-xmp:all) and output it in JSON format (-json), which you can then easily convert to a PHP object:
$command = 'exiftool -g -json -struct -xmp:all "'.$image_path.'"';
exec($command, $output, $return_var);
$metadata = implode('', $output);
$metadata = json_decode($metadata);
There is now also a GitHub repo you can add via Composer that can read XMP data:
https://github.com/jeroendesloovere/xmp-metadata-extractor
composer require jeroendesloovere/xmp-metadata-extractor

Unpack large files with gzip in PHP

I'm using a simple unzip function (as seen below) for my files so I don't have to unzip files manually before they are processed further.
function uncompress($srcName, $dstName) {
    $string = implode("", gzfile($srcName));
    $fp = fopen($dstName, "w");
    fwrite($fp, $string, strlen($string));
    fclose($fp);
}
The problem is that if the gzip file is large (e.g. 50 MB), the unzipping takes a large amount of RAM to process.
The question: can I parse a gzipped file in chunks and still get the correct result? Or is there a better way to handle the issue of extracting large gzip files (even if it takes a few seconds more)?
gzfile() is a convenience function that wraps gzopen, gzread, and gzclose.
So, yes, you can do the gzopen manually and gzread the file in chunks.
This will uncompress the file in 4kB chunks:
function uncompress($srcName, $dstName) {
    $sfp = gzopen($srcName, "rb");
    $fp = fopen($dstName, "w");
    while (!gzeof($sfp)) {
        $string = gzread($sfp, 4096);
        fwrite($fp, $string, strlen($string));
    }
    gzclose($sfp);
    fclose($fp);
}
Try with:
function uncompress($srcName, $dstName) {
    $fp = fopen($dstName, "w");
    fwrite($fp, implode("", gzfile($srcName)));
    fclose($fp);
}
The $length parameter of fwrite() is optional.
If you are on a Linux host, have the required privileges to run commands, and the gzip command is installed, you could try calling it with something like shell_exec.
Something a bit like this, I guess, would do:
shell_exec('gzip -d your_file.gz');
This way, the file wouldn't be unzipped by PHP.
As a sidenote:
Take care where the command is run from (or use a switch to tell gzip to decompress to a specific directory).
You might want to take a look at escapeshellarg too ;-)
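Building on that sidenote, here is a minimal sketch, assuming a Linux host with gzip on the PATH (the helper name is made up); -c writes the decompressed stream to stdout, so the original .gz file is kept and you choose the destination path yourself:
function gunzipViaShell($srcName, $dstName)
{
    // -d decompresses, -c writes to stdout; escapeshellarg() guards both paths
    $cmd = sprintf('gzip -dc %s > %s', escapeshellarg($srcName), escapeshellarg($dstName));
    shell_exec($cmd);
    return is_file($dstName);
}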
As maliayas mentioned, it may lead to a bug. I experienced an unexpected drop out of the while loop, although the gz file had been decompressed successfully. The whole code looks like this and works better for me:
function gzDecompressFile($srcName, $dstName) {
    $error = false;
    if ($file = gzopen($srcName, 'rb')) { // open gz file
        $out_file = fopen($dstName, 'wb'); // open destination file
        while (($string = gzread($file, 4096)) != '') { // read 4kb at a time
            if (!fwrite($out_file, $string)) { // check if writing was successful
                $error = true;
            }
        }
        // close files
        fclose($out_file);
        gzclose($file);
    } else {
        $error = true;
    }
    if ($error) {
        return false;
    } else {
        return true;
    }
}
