How to save the image fetched from DOCX by PHP using ZipArchive - php

Brief description: I have a DOCX file. I wrote a simple PHP script that extracts the images from that file and displays them on the page.
What I want to achieve: I want these images to be saved beside my PHP file with the same name and format.
My folder contains sample.docx (which has images), extract.php (which extracts the images from the DOCX) and display.php.
Below is the code of extract.php:
<?php
/*Name of the document file*/
$document = 'sample.docx';
/*Function to extract images*/
function readZippedImages($filename) {
/*Create a new ZIP archive object*/
$zip = new ZipArchive;
/*Open the received archive file*/
if (true === $zip->open($filename)) {
for ($i=0; $i<$zip->numFiles;$i++) {
/*Loop via all the files to check for image files*/
$zip_element = $zip->statIndex($i);
/*Check for images*/
if(preg_match("([^\s]+(\.(?i)(jpg|jpeg|png|gif|bmp))$)",$zip_element['name'])) {
/*Display images if present by using display.php*/
echo "<image src='display.php?filename=".$filename."&index=".$i."' /><hr />";
}
}
}
}
readZippedImages($document);
?>
Below is the code of display.php:
<?php
/*Tell the browser that we want to display an image*/
header('Content-Type: image/jpeg');
/*Create a new ZIP archive object*/
$zip = new ZipArchive;
/*Open the received archive file*/
if (true === $zip->open($_GET['filename'])) {
/*Get the content of the specified index of ZIP archive*/
echo $zip->getFromIndex($_GET['index']);
}
$zip->close();
?>
How can I do that?

I'm not sure that you need to open the zip archive multiple times like this, especially when another instance is already open, but I'd be tempted to try something like the following. I should stress that it is totally untested, though.
Updated after testing:
There is no need to use display.php if you do it like this; it seems to work OK on different .docx files. $zip->getFromIndex returns the raw image data (so I discovered), so passing that in a query string is not possible due to its length. I tried to avoid opening and closing the zip archive unnecessarily, hence the approach below, which adds the base64-encoded image data to the output array; each image is then saved to disk and displayed inline from that base64 data.
<?php
#extract.php
$document = 'sample.docx';
function readZippedImages($filename) {
$paths=[];
$zip = new ZipArchive;
if( true === $zip->open( $filename ) ) {
for( $i=0; $i < $zip->numFiles;$i++ ) {
$zip_element = $zip->statIndex( $i );
if( preg_match( "([^\s]+(\.(?i)(jpg|jpeg|png|gif|bmp))$)", $zip_element['name'] ) ) {
$paths[ $zip_element['name'] ]=base64_encode( $zip->getFromIndex( $i ) );
}
}
$zip->close();
}
return $paths;
}
$paths=readZippedImages( $document );
/* to display & save the images */
foreach( $paths as $name => $data ){
$filepath=__DIR__ . '/' . $name;
$dirpath=pathinfo( $filepath, PATHINFO_DIRNAME );
$ext=pathinfo( $name, PATHINFO_EXTENSION );
if( !file_exists( $dirpath ) )mkdir( $dirpath,0777, true );
if( !file_exists( $filepath ) )file_put_contents( $filepath, base64_decode( $data ) );
printf('<img src="data:image/%s;base64,%s" />', $ext, $data );
}
?>
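If the goal is only to save the images beside the script under their own file names (without recreating the word/media/ folder structure), the same loop can write each entry straight to disk. This is a minimal sketch under that assumption; saveDocxImages() is a hypothetical helper name, not part of ZipArchive, and same-named images from different folders in the archive would overwrite each other.

```php
<?php
// Hypothetical helper: writes every image found inside a .docx into
// $targetDir, keeping only the base file name (e.g. image1.png).
function saveDocxImages(string $docx, string $targetDir): array
{
    $saved = [];
    $zip = new ZipArchive();
    if ($zip->open($docx) !== true) {
        return $saved; // the archive could not be opened
    }
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $name = $zip->getNameIndex($i);
        if (!preg_match('/\.(jpe?g|png|gif|bmp)$/i', $name)) {
            continue; // not an image entry
        }
        $target = rtrim($targetDir, '/') . '/' . basename($name);
        // getFromIndex() returns the raw bytes of the entry
        if (file_put_contents($target, $zip->getFromIndex($i)) !== false) {
            $saved[] = $target;
        }
    }
    $zip->close();
    return $saved;
}

// Example call, assuming sample.docx sits beside this script:
// $files = saveDocxImages(__DIR__ . '/sample.docx', __DIR__);
```

The function returns the list of paths it wrote, so the caller can still print <img> tags that point at the saved files.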

Related

PHP | Get file that is most similar to string

Currently I have a zip archive containing files whose names I do not know. The only thing I know is that one filename is very similar to a string I have. It is literally one character off.
What I am trying to do right now is to extract only the file that is the most similar to the string I have. To extract only one file from a zip I use the following code that works:
$zip = new ZipArchive;
if ($zip->open('directory/to/zipfile') === TRUE)
{
$zip->extractTo('directory/where/to/extract', array('the/filename/that/is/most/similair/most/go/here'));
$zip->close();
echo 'ok';
}
else
{
echo 'failed';
}
I know that to check the similarity of strings I can use the following code:
$var_1 = 'PHP IS GREAT';
$var_2 = 'WITH MYSQL';
similar_text($var_1, $var_2, $percent);
And based on the percentage I can tell which file is most similar to the string I have. The only thing I am worried about is that ZipArchive doesn't have a function to retrieve files from a zip without knowing the exact filename.
So I was wondering if there is a way to retrieve a single file from a zip based on a string that is most similar to the filename.
This comment in the docs mentions how to list the files in a zip archive, so all you have to do is loop through all the file names, find the one that most closely matches the string you have, and then extract it.
$search = 'Closefilename.doc';
$za = new ZipArchive();
$za->open('theZip.zip');
$similarity = 0;
for( $i = 0; $i < $za->numFiles; $i++ ){
$stat = $za->statIndex( $i );
similar_text($stat['name'], $search, $sim);
if ($sim > $similarity) {
$similarity = $sim;
$filename = $stat['name'];
}
}
// Now extract $filename;
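The final extraction step could look roughly like this. It is a sketch, not a tested drop-in: findClosestEntry() and extractClosest() are hypothetical helper names, and the paths in the usage comment are placeholders.

```php
<?php
// Hypothetical helper: returns the entry name most similar to $search,
// or null when the archive is empty.
function findClosestEntry(ZipArchive $zip, string $search): ?string
{
    $best = null;
    $bestPercent = -1.0;
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $name = $zip->getNameIndex($i);
        similar_text($name, $search, $percent); // $percent is set by reference
        if ($percent > $bestPercent) {
            $bestPercent = $percent;
            $best = $name;
        }
    }
    return $best;
}

// Hypothetical helper: extracts only the closest match into $targetDir.
function extractClosest(string $zipPath, string $search, string $targetDir): ?string
{
    $zip = new ZipArchive();
    if ($zip->open($zipPath) !== true) {
        return null;
    }
    $name = findClosestEntry($zip, $search);
    if ($name !== null) {
        $zip->extractTo($targetDir, [$name]);
    }
    $zip->close();
    return $name;
}

// Usage with placeholder paths:
// echo extractClosest('theZip.zip', 'Closefilename.doc', 'directory/where/to/extract');
```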
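Try this code (note that the procedural zip_* functions are deprecated as of PHP 8.0):
// Your zip file path
$zip = zip_open( $fileName );
if ( is_resource( $zip ) ) {
    while( $zip_entry = zip_read( $zip ) ) {
        $zip_entry_name = zip_entry_name( $zip_entry );
        // Compare $zip_entry_name with your string here using similar_text
        $zip_entry_string = zip_entry_read( $zip_entry, zip_entry_filesize( $zip_entry ) );
        // On success you can write this string to a file
    }
    zip_close( $zip );
}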

Cannot extract zip file in php, no feedback or error

I am retrieving my google map in a kmz format like this:
file_put_contents($_SERVER['DOCUMENT_ROOT'].'/temp/map.kmz', file_get_contents('https://mapsengine.google.com/map/kml?mid=zLucZBnh_ipg.kS906psI1W9k') );
$zip = new ZipArchive;
$res = $zip->open($_SERVER['DOCUMENT_ROOT'].'/temp/map.kmz');
if ($res === true)
{
trace("Number of files: $res->numFiles".PHP_EOL);
for( $i = 0; $i < $res->numFiles; $i++ )
{
$stat = $res->statIndex( $i );
print_r( basename( $stat['name'] ) . PHP_EOL );
}
}
But no files are showing and $zip->extractTo() is not working either. The file is downloaded on the server and I can extract it manually though. I have tried renaming the file to .zip or .kmz, still not working. I have opened the map.kmz file in Winrar and it does indeed say that it is a zip file format.
Any idea why it's not working? Do I need some special permissions to read the number of files or extract?
Check your file extensions, .mkz versus .kmz: the file is written as map.mkz but opened as map.kmz.
file_put_contents($_SERVER['DOCUMENT_ROOT'].'/temp/map.mkz', file_get_contents('https://mapsengine.google.com/map/kml?mid=zLucZBnh_ipg.kS906psI1W9k') );
$zip = new ZipArchive;
$res = $zip->open($_SERVER['DOCUMENT_ROOT'].'/temp/map.kmz');
Got tired of the damn class not working, tried this method instead and it works:
$data = file_get_contents("https://mapsengine.google.com/map/kml?mid=zLucZBnh_ipg.kS906psI1W9k");
file_put_contents($_SERVER['DOCUMENT_ROOT'].'/temp/kmz_temp', $data);
ob_start();
passthru("unzip -p {$_SERVER['DOCUMENT_ROOT']}/temp/kmz_temp");
$xml_data = ob_get_clean();
header("Content-type: text/xml");
echo $xml_data;
exit();
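One thing that may be worth checking in the question's snippet: ZipArchive::open() returns true on success or an integer error code on failure, so numFiles and statIndex() have to be read from the $zip object, not from $res. A minimal sketch of the listing under that assumption (listZipEntries() is a hypothetical helper name):

```php
<?php
// Hypothetical helper: lists the entry names of a zip-format file (e.g. a .kmz).
function listZipEntries(string $path): array
{
    $zip = new ZipArchive();
    $res = $zip->open($path);
    if ($res !== true) {
        // On failure open() hands back an error code, not an object.
        throw new RuntimeException("Could not open $path (ZipArchive error code $res)");
    }
    $names = [];
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $names[] = $zip->statIndex($i)['name'];
    }
    $zip->close();
    return $names;
}

// Usage with the path from the question:
// print_r(listZipEntries($_SERVER['DOCUMENT_ROOT'] . '/temp/map.kmz'));
```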

Find images with certain extensions recursively

I am currently trying to make a script that will find images with *.jpg / *.png extensions in directories and subdirectories.
If some picture with one of these extensions is found, then save it to an array with path, name, size, height and width.
So far I have this piece of code, which will find all files, but I don't know how to get only jpg / png images.
class ImageCheck {
public static function getDirectory( $path = '.', $level = 0 ){
$ignore = array( 'cgi-bin', '.', '..' );
// Directories to ignore when listing output.
$dh = @opendir( $path );
// Open the directory to the handle $dh
while( false !== ( $file = readdir( $dh ) ) ){
// Loop through the directory
if( !in_array( $file, $ignore ) ){
// Check that this file is not to be ignored
$spaces = str_repeat( ' ', ( $level * 4 ) );
// Just to add spacing to the list, to better
// show the directory tree.
if( is_dir( "$path/$file" ) ){
// Its a directory, so we need to keep reading down...
echo "<strong>$spaces $file</strong><br />";
ImageCheck::getDirectory( "$path/$file", ($level+1) );
// Re-call this same function but on a new directory.
// this is what makes function recursive.
} else {
echo "$spaces $file<br />";
// Just print out the filename
}
}
}
closedir( $dh );
// Close the directory handle
}
}
I call this function in my template like this
ImageCheck::getDirectory($dir);
Save a lot of headache and just use PHP's built-in recursive iterators with a regular expression:
<?php
$Directory = new RecursiveDirectoryIterator('path/to/project/');
$Iterator = new RecursiveIteratorIterator($Directory);
$Regex = new RegexIterator($Iterator, '/^.+\.(jpe?g|png)$/i', RecursiveRegexIterator::GET_MATCH);
?>
In case you are not familiar with working with objects, here is how to iterate over the result:
<?php
foreach($Regex as $name => $match){
    echo "$name\n";
}
?>
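To get the array the question asks for (path, name, size, height and width), the iterator can be combined with getimagesize(). This is a sketch only; collectImages() and the array keys are hypothetical names chosen for illustration.

```php
<?php
// Hypothetical helper: walks $root recursively and describes each jpg/png found.
function collectImages(string $root): array
{
    $images = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if (!$file->isFile() || !preg_match('/\.(jpe?g|png)$/i', $file->getFilename())) {
            continue;
        }
        $size = @getimagesize($file->getPathname());
        if ($size === false) {
            continue; // has an image extension but is not a readable image
        }
        $images[] = [
            'path'   => $file->getPath(),     // directory containing the file
            'name'   => $file->getFilename(),
            'size'   => $file->getSize(),     // bytes
            'width'  => $size[0],
            'height' => $size[1],
        ];
    }
    return $images;
}

// Usage with a placeholder directory:
// print_r(collectImages('path/to/project'));
```

Files that carry a .jpg or .png extension but cannot be parsed as images are skipped rather than reported with empty dimensions.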

PHP dynamic zip file creation crashes with non-image filetypes. (wp)

I have a function that takes uploaded files (WordPress) and adds them to a (newly created) zip file.
Every new file is added to the zip (if the zip does not exist yet, the first file creates it) and also to a comment holding the list of files.
function Ob99_generate_zip_file($meta) {
// we always need post_id , right ?
if( isset($_GET['post_id']) ) {
$post_id = $_GET['post_id'];
} elseif( isset($_POST['post_id']) ) {
$post_id = $_POST['post_id'];
}
//setting some more variables.
$file = wp_upload_dir();// take array
$file2 = wp_upload_dir();//debug
$zipname = $file['path'].'file.zip'; // set zip file name
$file = trailingslashit($file['basedir']).$meta['file'];// construct real path
// Without this next condition the function dies. WHY ??
list($orig_w, $orig_h, $orig_type) = @getimagesize($file); // not help to comment
if (!$orig_type == IMAGETYPE_GIF || !$orig_type == IMAGETYPE_PNG|| !$orig_type == IMAGETYPE_JPEG) {
//why other filetypes not working ??
return ;
}
$zip = new ZipArchive; // instantiate class
$zip->open($zipname , ZipArchive::CREATE); // open buffer
$new_filename = substr($file,strrpos($file,'/') + 1); //we do not need nested folders
$zip->addFile($file,$sitename.'/'.$new_filename); // we add the file to the zip
if (file_exists($zipname)){
$comment = $zip->getArchiveComment(); // if the file already exist read the comment
}
else { // if not - let´s give it a cool retro header
$comment_head = '*********************************'. PHP_EOL ;
$comment_head .= '****** <<< FILE CONTENT >>> *****'. PHP_EOL ;
$comment_head .= '*********************************'. PHP_EOL ;
}
$comment = $comment_head . $comment ;// add header before comment
$comment = $comment . PHP_EOL . PHP_EOL . $meta['file'] ; // add new file name
$zip->setArchiveComment($comment); // and comment
$zip->addFromString('filelist.txt', $comment); // create txt file with the same list
$zip->close()or die('can not create zip file'.$file.print_r($meta).'---- DEBUG SEPERATOR ---- '.print_r($file2)); // FINISHED or DIE with debug
}
My problem: if I try to upload any file other than an image, the function will DIE.
I have added a condition that checks the image type, but I would like to know why it is crashing and how to make it work without that condition.
Does the zip function have any problems with PDF, DOC or other file types? Or is this a WordPress problem?
The problem section seems to be where you're asking PDFs, etc. their image size. Why don't you try:
$image_size = getimagesize($file);
if($image_size===false)
{
// This is not an image
// Do what you want to PDFs, etc.
}
else
{
// This is an image
// Find out image type, dimensions, etc.
}
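Put together, the zip step does not need to depend on the image check at all. Here is a rough sketch, assuming the file should go into the archive whatever its type and the image test is only used for labelling; addToZip() is a hypothetical helper name and not part of WordPress or ZipArchive.

```php
<?php
// Hypothetical helper: appends one file of any type to a zip archive and
// records its name in the archive comment.
function addToZip(string $zipPath, string $file): bool
{
    if (!is_file($file)) {
        return false;
    }
    $zip = new ZipArchive();
    if ($zip->open($zipPath, ZipArchive::CREATE) !== true) {
        return false;
    }
    // getimagesize() returns false for non-images; here it only labels the entry.
    $isImage = @getimagesize($file) !== false;
    $zip->addFile($file, basename($file));
    $comment = (string) $zip->getArchiveComment(); // empty for a new archive
    $zip->setArchiveComment($comment . basename($file) . ($isImage ? ' (image)' : '') . PHP_EOL);
    return $zip->close(); // the archive is written to disk here
}

// Usage with placeholder paths:
// addToZip('/path/to/uploads/file.zip', '/path/to/uploads/report.pdf');
```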

problem with folder handling with php

Friends,
I have a problem, please help me.
I am getting image URLs from my client and I want to store those images in a local folder.
If there were only a few images I would save them manually, but there are more than 5000.
Please give me some code to download all the images with PHP.
You could try file_get_contents for this. Just loop over the array of URLs, use file_get_contents('url') to read each file into a string, and then file_put_contents('new file name', $contents) to write it out again.
You can download a file using PHP's file_get_contents() function and then write it to your local machine, for example with fwrite().
The only open question is where to get the list of files to download; you did not specify that in your question.
Code draft:
$filesList = array(); // obtain the list of URLs somehow
$targetDir = '';      // specify the target dir (with a trailing slash)
foreach ($filesList as $fileUrl) {
    $urlParts = explode("/", $fileUrl);
    $name = $urlParts[count($urlParts) - 1];
    $contents = file_get_contents($fileUrl);
    $handle = fopen($targetDir.$name, 'w');
    fwrite($handle, $contents);
    fclose($handle);
}
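With around 5000 URLs, some downloads are likely to fail, so it may help to collect the failures instead of stopping. A minimal sketch along the lines of the draft above, assuming allow_url_fopen is enabled for remote URLs; downloadAll() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: downloads each URL into $targetDir and returns the
// URLs that could not be fetched or saved.
function downloadAll(array $urls, string $targetDir): array
{
    $failed = [];
    if (!is_dir($targetDir) && !mkdir($targetDir, 0755, true)) {
        return $urls; // nothing can be saved without a target directory
    }
    foreach ($urls as $url) {
        // Name the local file after the last path segment of the URL.
        $name = basename((string) parse_url($url, PHP_URL_PATH));
        $data = @file_get_contents($url);
        if ($name === '' || $data === false
            || file_put_contents($targetDir . '/' . $name, $data) === false) {
            $failed[] = $url;
        }
    }
    return $failed;
}

// Usage with placeholder values:
// $failed = downloadAll(['http://example.com/images/a.jpg'], __DIR__ . '/images');
```

Two URLs that end in the same file name would overwrite each other in this sketch; prefixing the name with a counter would avoid that.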
I'm not sure that this is what you want. Given a source folder pattern (one that PHP has permission to read) and a folder you want to write to, this will copy all of the files:
function copyFilesLocally( $source, $target_folder, $index = 5000 )
{
    // $source is a glob pattern such as '/path/to/images/*'
    copyFiles( glob( $source ), $target_folder, $index );
}
function copyFiles( array $files, $target_folder, $index )
{
    // As written, the files are copied only when there are more than $index of them.
    if( count( $files ) > $index )
    {
        foreach( $files as $file )
        {
            copy( $file, rtrim( $target_folder, '/' ) . '/' . basename( $file ) );
        }
    }
}
If the files are on a remote server, try this:
function copyRemoteFiles( $directory, $target_folder, $exclusionFunction, $index = 5000)
{
    $dom = new DOMDocument();
    $dom->loadHTML( file_get_contents( $directory ) );
    // This is a list of all links, which is what Apache serves up
    // when listing a directory without an index.
    $list = $dom->getElementsByTagName( "a" );
    $images = array();
    foreach( $list as $item )
    {
        $curr = $item->attributes->getNamedItem( "href" )->nodeValue;
        if( $exclusionFunction( $curr ) )
            $images[] = "$directory/$curr";
    }
    copyFiles( $images, $target_folder, $index );
}
function exclude_non_dots( $curr )
{
    return strpos( $curr, "." ) != FALSE;
}
copyRemoteFiles( "http://example.com", "/var/www/images", "exclude_non_dots" );
