Force opening and reading zip files from php - php

This may be a simple question or a pretty complex one; I'll let you be the judges.
Using PHP to open a zip file, extract its files to a directory, and close the zip file is not a complicated class to write.
But let's say the file is not a zip, yet WinRAR is still able to read it; examples of such files are EXEs, SFX archives, etc.
What do all these files have in common that allows WinRAR to browse their contents?
Another example is antivirus software, which scans the individual files within an EXE.
So, an example:
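$handle = fopen("an_unknown_file.abc", "rb");
while (!feof($handle))
{
    // Read the next chunk so that the loop eventually reaches end-of-file.
    $chunk = fread($handle, 8192);
    // What generic code could I use to determine whether the file can be extracted?
}
fclose($handle);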
Regards.

Zip's specifications allow the actual "zip" file portion to be embedded ANYWHERE within a file. It doesn't necessarily have to start at position '0' in the file. This is how self-extracting zips work. It's a small .exe stub program which has a larger .zip file appended to the end of it.
Finding a zip is mostly a matter of scanning for a zip file's "magic number" within a file, then doing a few heuristics to determine if it's really a zip file, or just something random that happens to contain a zip's magic number.
A .docx file is really just a .zip that contains various XML files representing a Word file's contents, just like a .jar is a zip file that contains various chunks of Java code.
WinRAR has a bunch of extra code within it to scan through a file and look for any identifiable "this is a compressed archive" type signatures, one of which happens to be that of a zip file.
There's nothing too magical about it. It's just a matter of scanning through a file and looking for signatures.
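As a rough illustration of the signature scan described above, here is a minimal PHP sketch. It looks for the zip end-of-central-directory record (signature "PK" followed by the bytes 0x05 0x06) in the tail of a file and applies one simple heuristic: the comment length stored in the record must account for exactly the bytes that follow it. The function name findZipEocdOffset() is made up for this example, and a real scanner would validate more than this (for instance, by following the central directory offsets).

```php
<?php
// Hypothetical helper, not part of PHP: returns the byte offset of a plausible
// zip end-of-central-directory (EOCD) record, or null if none is found.
function findZipEocdOffset(string $path): ?int
{
    clearstatcache(true, $path); // make sure filesize() is not a stale cached value
    $size = @filesize($path);
    if ($size === false || $size < 22) {
        return null; // an EOCD record is at least 22 bytes long
    }

    // The EOCD record is followed only by an optional comment of at most
    // 65535 bytes, so only the tail of the file needs to be searched.
    $tailLength = min($size, 22 + 65535);
    $tail = file_get_contents($path, false, null, $size - $tailLength, $tailLength);
    if ($tail === false) {
        return null;
    }

    // Walk backwards looking for the EOCD signature "PK\x05\x06".
    for ($i = strlen($tail) - 22; $i >= 0; $i--) {
        if (substr($tail, $i, 4) !== "PK\x05\x06") {
            continue;
        }
        // Heuristic: the 16-bit little-endian comment length at offset 20
        // must match the number of bytes remaining after the record.
        $commentLength = unpack('v', substr($tail, $i + 20, 2))[1];
        if ($i + 22 + $commentLength === strlen($tail)) {
            return $size - $tailLength + $i;
        }
    }

    return null;
}

$offset = findZipEocdOffset('an_unknown_file.abc');
echo $offset === null
    ? "No zip signature found\n"
    : "Zip directory record found at byte $offset\n";
```

A non-null result only means the file probably ends with zip data; it does not guarantee that the archive is intact or extractable.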

Not sure what exactly your question is, but I think you are confusing something here... A file extension is just a convenient way for humans and computers to associate a file with its type and with the programs that work with it. WinRAR (or any other program) reads what the file contains and, if it can understand it, works with it. The only important thing is that the file format (the data in the file) is valid and that the program you are using can work with this file format.
So, if a file is in any format that WinRAR can work with (.rar, .zip, .gz, etc.), its extension could be .txt or .whatever and WinRAR would still be able to work with it. The extension is just for convenience.
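To make the "content, not extension" point concrete, here is a small PHP sketch that guesses the archive type from the first few bytes of a file and never looks at its name. The function name sniffArchiveType() and the short signature table are assumptions made for this example; real tools know many more formats and, as discussed above, also look for signatures that do not start at byte 0.

```php
<?php
// Hypothetical helper: guesses an archive format from the first bytes of a
// file, ignoring the file name and extension entirely.
function sniffArchiveType(string $path): ?string
{
    // Well-known leading signatures ("magic numbers") of a few formats.
    $signatures = [
        'zip'  => "PK\x03\x04",
        'rar'  => "Rar!\x1A\x07",
        'gzip' => "\x1F\x8B",
        '7z'   => "7z\xBC\xAF\x27\x1C",
    ];

    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return null;
    }
    $head = (string) fread($handle, 8);
    fclose($handle);

    foreach ($signatures as $type => $magic) {
        if (strncmp($head, $magic, strlen($magic)) === 0) {
            return $type;
        }
    }

    return null; // unknown format, or the archive does not start at byte 0
}
```

With this, a gzip file that has been renamed to something.txt would still be reported as 'gzip'.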

Related

Hide string in zip format without decompressing zip file

We would like to protect our zip files by registering the user into the installer at download time.
Is there any place in a zipped package where a script could add such hidden info?
The note field is not good as most zip software displays it.
A zip archiver will read a zip file from the end and find entries using offsets. If you put bytes at the beginning of the zip file (and make sure the offsets are still correct!), you will get a valid zip file, and I doubt archivers will show the additional content. That's how self-extracting zip files work: the file starts with the unzip exe and ends with the actual zip content.
In your case, you could prepare archives with a fixed-length blob at the beginning and overwrite it when a user downloads the file.
Note that this won't "protect" the zip file; it just marks the zip file as downloaded by a given user (and the mark can be removed easily).
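The "overwrite a fixed-length blob" step could look roughly like the sketch below. It assumes the archive has already been prepared with a placeholder of exactly $blobLength bytes at the start and with its internal offsets adjusted accordingly; the function name stampArchive() and its parameters are made up for this example.

```php
<?php
// Hypothetical helper: overwrites a fixed-length placeholder at the very
// start of a prepared archive with a per-user marker.
function stampArchive(string $path, string $marker, int $blobLength): bool
{
    if (strlen($marker) > $blobLength) {
        return false; // the marker must fit inside the reserved blob
    }

    // 'r+b' opens for reading and writing without truncating the file.
    $handle = fopen($path, 'r+b');
    if ($handle === false) {
        return false;
    }

    // Pad with NUL bytes so the blob length, and therefore every offset
    // inside the zip data that follows, stays unchanged.
    $written = fwrite($handle, str_pad($marker, $blobLength, "\0"));
    fclose($handle);

    return $written === $blobLength;
}
```

In practice you would probably stamp a per-download copy rather than the master archive, so that concurrent downloads do not overwrite each other's marker.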

How to get name of files which are not copied?

I have copied a number of PDF files from one directory to another. Unfortunately, some files have not been copied, and I want to identify those files.
For this purpose I want to write a PHP program which finds those files. What I have so far is a program which compares file names, but I want to check for file size. How can I accomplish this?
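One possible approach, sketched here under the assumption that both directories are flat (no subdirectories) and that a copied file keeps its name: walk the source directory and report every PDF that is either missing in the target or has a different size there. The function name findUncopiedFiles() is made up for this example.

```php
<?php
// Hypothetical helper: lists PDF files from $sourceDir that are missing in
// $targetDir or whose size differs there.
function findUncopiedFiles(string $sourceDir, string $targetDir): array
{
    $names = scandir($sourceDir);
    if ($names === false) {
        return []; // source directory could not be read
    }

    $problems = [];
    foreach ($names as $name) {
        $source = $sourceDir . DIRECTORY_SEPARATOR . $name;
        $isPdf = strtolower(pathinfo($name, PATHINFO_EXTENSION)) === 'pdf';
        if (!$isPdf || !is_file($source)) {
            continue;
        }

        $target = $targetDir . DIRECTORY_SEPARATOR . $name;
        if (!is_file($target)) {
            $problems[$name] = 'missing';
        } elseif (filesize($source) !== filesize($target)) {
            $problems[$name] = 'size differs';
        }
    }

    return $problems;
}
```

Equal sizes do not prove the contents are identical; comparing hashes (for example with md5_file()) would be a stricter check.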

How should the temporary file be saved when creating and downloading a zip file using PHP?

I wish to zip a directory on the server and download it to the user. I've found great information on this site on how to zip the directory.
The part that has me stumped is where to store the temporary file. From what I've read, ZipArchive requires that the zip file be saved on the server, and I can't download it without saving it. Correct?
I've seen solutions where they recommend saving it as "somename.zip", and then using readfile() along with the appropriate headers to download, and then using unlink() after the download to get rid of it. But where to store "somename.zip" and what if there are collisions?
So, then I started thinking that I should create a file with a random name in some temporary directory. Or maybe I should use something like tmpfile(), but that returns a file handle and I don't know how that will work with ZipArchive.
For my application, the size of the directory is fairly small, and I would like to not even write it to disk, but to memory instead. Maybe something like $tmp_handle = fopen('php://temp', 'r+');, but again this returns a file handle and I don't know how that will work with ZipArchive.
How should the zip file created by ZipArchive be temporarily saved?
As ZipArchive class works with stacked streams, the $filename parameter cannot be php://temp or php://memory. It has to be a 'real' filepath on the filesystem.
Some lines of thought:
As said in a comment, the tempnam() function can help you avoid temporary file name collisions. I think storing the file temporarily and sending it with readfile() is the most common approach, and not a bad option.
If you absolutely want the temporary file to be stored in memory, you can mount a tmpfs directory on your Linux system. PHP will see it as a 'normal' directory, but it will be stored in RAM.
If you can use the shell_exec() function, you can zip the directory on the command line and read the output in PHP: it will be stored in memory.
In my opinion, the first option (your first idea) is better in most cases, because:
allowing the use of shell_exec() can be dangerous;
if one day you have big files to zip, they could fill the memory, resulting in very poor performance.
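The first option (tempnam(), then readfile(), then unlink()) can be sketched roughly as follows. The function names createTempZip() and sendAndDelete() are made up for this example, only the files directly inside the directory are added (no recursion), and the sketch assumes the zip extension is installed.

```php
<?php
// Hypothetical helper: zips the regular files directly inside $dir into a
// uniquely named temporary file and returns that file's path.
function createTempZip(string $dir): string
{
    // tempnam() creates a uniquely named empty file, which avoids collisions.
    $tmpPath = tempnam(sys_get_temp_dir(), 'zip');
    if ($tmpPath === false) {
        throw new RuntimeException('Could not create a temporary file');
    }

    $zip = new ZipArchive();
    // OVERWRITE discards the empty placeholder that tempnam() created.
    if ($zip->open($tmpPath, ZipArchive::CREATE | ZipArchive::OVERWRITE) !== true) {
        unlink($tmpPath);
        throw new RuntimeException('Could not open the temporary zip file');
    }

    foreach (scandir($dir) as $name) {
        $path = $dir . DIRECTORY_SEPARATOR . $name;
        if (is_file($path)) {
            $zip->addFile($path, $name);
        }
    }
    $zip->close();

    return $tmpPath;
}

// Hypothetical helper: sends the file as a download, then deletes it.
function sendAndDelete(string $path, string $downloadName): void
{
    header('Content-Type: application/zip');
    header('Content-Disposition: attachment; filename="' . $downloadName . '"');
    header('Content-Length: ' . filesize($path));
    readfile($path);
    unlink($path);
}
```

A request handler would then call something like sendAndDelete(createTempZip($someDirectory), 'files.zip'), where both arguments are placeholders for your own values.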

PHP Importing XML feed packed in a zip file

I want to automate the following:
Once a day my cronjob starts a PHP script that obtains a zipped XML file from a URL.
What would be the best way to handle this? Is there any way to directly read the XML file from within the zip file?
Right now, I'm just downloading the zipped file to the server and manually unpacking it later that day.
Any ideas? All suggestions are very much welcome.
You can use PHP's ZipArchive coupled with cURL to download and read the zip file.
Also, the ZipArchive class has a method called getStream which allows you to then use fread to access the contents without explicitly extracting the file.
The only problem I see is that the zip does have to be saved somewhere for PHP's library to read it. But, given you're already doing this, it shouldn't be an issue.
If you need an example, leave me a comment and I can write one up.
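A rough sketch of that approach, assuming the curl and zip extensions are available: download the archive to a temporary file with cURL, then read one entry through getStream() and fread() without extracting it. The function names, the URL, and the entry name feed.xml used below are all placeholders invented for this example.

```php
<?php
// Hypothetical helper: downloads $url into a temporary file via cURL and
// returns the temporary file's path.
function downloadToTempFile(string $url): string
{
    $tmpPath = tempnam(sys_get_temp_dir(), 'feed');
    $out = fopen($tmpPath, 'wb');

    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_FILE, $out);           // write the body to $out
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($curl, CURLOPT_FAILONERROR, true);    // fail on HTTP errors (status >= 400)
    $ok = curl_exec($curl);
    fclose($out);

    if ($ok === false) {
        unlink($tmpPath);
        throw new RuntimeException('Download failed');
    }

    return $tmpPath;
}

// Hypothetical helper: reads one entry from a zip via getStream(),
// without extracting it to disk.
function readZipEntry(string $zipPath, string $entryName): string
{
    $zip = new ZipArchive();
    if ($zip->open($zipPath) !== true) {
        throw new RuntimeException('Not a readable zip file');
    }

    $stream = $zip->getStream($entryName);
    if ($stream === false) {
        $zip->close();
        throw new RuntimeException("Entry $entryName not found");
    }

    $contents = '';
    while (!feof($stream)) {
        $contents .= fread($stream, 8192);
    }
    fclose($stream);
    $zip->close();

    return $contents;
}

// Example usage (placeholder URL and entry name):
// $zipPath = downloadToTempFile('https://example.com/feed.zip');
// $xml = simplexml_load_string(readZipEntry($zipPath, 'feed.xml'));
// unlink($zipPath);
```

For a large feed you might prefer to hand the stream to a streaming XML parser instead of building the whole string in memory.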
There is a collection of zip-related functions that can be used in PHP.
The problem with these is that they require the compressed file to exist on the server (not just loaded from an external server somewhere using, for example, $file = file($url);).
If you were to save the file to your server then you could use $zip = zip_open($filename) and zip_read($zip) to process the zip file. Note that these procedural zip_* functions are deprecated as of PHP 8.0 in favor of the ZipArchive class.

PHP: Zipping files as xlsx causes file to become invalid

I have to replace variables inside a user-submitted xlsx file and am doing it this way:
Rename the .xlsx to .zip
Unzip to a temp-folder
Make the necessary changes
Zip the files
Rename the .zip to .xlsx
I am using plain ZipArchive in PHP. When I try to open the generated .xlsx in Excel, it fails with a message saying that the format or extension is invalid. When I compress the temporary files with WinRAR (as zip) and rename the resulting file to .xlsx, it works. The zip files generated with both methods contain the same data structure and files, but the WinRAR file is slightly bigger (10.2K vs. 10.3K with normal compression).
When viewing the garbled code of the files, I could see that the files appear in a different order, but don't know if that's the cause. Any clue would be greatly appreciated.
I got it to work with another component, PclZip (phpconcept.net/pclzip). It is a class that apparently uses gzip, and its output, once renamed to .xlsx, opens fine in Excel 2007.
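For comparison, here is a sketch of a different technique from the one described in the question: ZipArchive can open the .xlsx directly (no renaming needed) and replace a single entry in place, so the rest of the archive is never unpacked and re-zipped. Whether this avoids the Excel error described above is not confirmed here. The function name replaceInZipEntry() is made up, and the entry name xl/sharedStrings.xml and the {{name}} placeholder in the usage note are assumptions about a particular workbook.

```php
<?php
// Hypothetical helper: replaces $search with $replace inside one entry of a
// zip-based file (such as an .xlsx), without unpacking it to a folder.
function replaceInZipEntry(string $path, string $entryName, string $search, string $replace): bool
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        return false;
    }

    $contents = $zip->getFromName($entryName);
    if ($contents === false) {
        $zip->close();
        return false; // the entry does not exist in this archive
    }

    // Adding an entry under an existing name replaces that entry.
    $zip->addFromString($entryName, str_replace($search, $replace, $contents));

    // close() writes the changes back to the file.
    return $zip->close();
}
```

A call might look like replaceInZipEntry('report.xlsx', 'xl/sharedStrings.xml', '{{name}}', 'Alice'). Replacement values should be XML-escaped first (for example with htmlspecialchars()), since they end up inside an XML document.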
