I'm working on a command-line PHP project and want to be able to recreate the PHAR file that is my deployment artifact. The challenge is that I can't create two PHARs that have identical sha1sums if they were created more than one second apart. I would like to be able to exactly recreate my PHAR file whenever the input files are the same (i.e. come from the same git commit).
The following code snippet demonstrates the problem:
#!/usr/bin/php
<?php
$hashes = array();
$file_names = array('file1.phar','file2.phar');
foreach ($file_names as $name) {
    if (file_exists($name)) {
        unlink($name);
    }
    $phar = new Phar($name);
    $phar->addFromString('cli.php', "cli\n");
    $hashes[] = sha1_file($name);
    // remove the sleep and the PHARs are identical.
    sleep(1);
}
if ($hashes[0] == $hashes[1]) {
    echo "match\n";
} else {
    echo "do not match\n";
}
As far as I can tell, the "modification time" field for each file in the PHAR manifest is always set to the current time, and there seems to be no way of overriding it. Even touch("phar://file1.phar/cli.php", 1413387555) gives the error:
touch(): Can not call touch() for a non-standard stream
I ran the above code with PHP 5.5.9 on Ubuntu Trusty and PHP 5.3 on RHEL 5; both versions behave the same way and fail to create identical PHAR files.
I'm trying to do this in order to follow the advice in the book Continuous Delivery by Jez Humble and David Farley.
Any help is appreciated.
The Phar class currently does not allow users to alter or even access the modification time. I thought of storing your string in a temporary file and using touch to alter the mtime, but that does not seem to have any effect. So you'll have to manually change the timestamps in the created files and then regenerate the archive signature. Here's how to do it with current PHP versions:
<?php
$filename = "file1.phar";
$archive = file_get_contents($filename);
# Search for the start of the archive header
# See http://php.net/manual/de/phar.fileformat.phar.php
# This isn't the only valid way to write a PHAR archive, but it is what the Phar class
# currently does, so you should be fine (The docs say that the end-of-PHP-tag is optional)
$magic = "__HALT_COMPILER(); ?" . ">";
$end_of_code = strpos($archive, $magic) + strlen($magic);
$data_pos = $end_of_code;
# Skip that header
$data = unpack("Vmanifest_length/Vnumber_of_files/vapi_version/Vglobal_flags/Valias_length", substr($archive, $end_of_code, 18));
$data_pos += 18 + $data["alias_length"];
$metadata = unpack("Vlength", substr($archive, $data_pos, 4));
$data_pos += 4 + $metadata["length"];
for ($i = 0; $i < $data["number_of_files"]; $i++) {
    # Now $data_pos points to the current file entry
    # Files are explained here: http://php.net/manual/de/phar.fileformat.manifestfile.php
    $filename_data = unpack("Vfilename_length", substr($archive, $data_pos, 4));
    $data_pos += 4 + $filename_data["filename_length"];
    $file_data = unpack("Vuncompressed_size/Vtimestamp/Vcompressed_size/VCRC32/Vflags/Vmetadata_length", substr($archive, $data_pos, 24));
    # Zero out the timestamp (you can also write some other fixed time here, using pack("V", $timestamp) instead of the zeros)
    $archive = substr($archive, 0, $data_pos + 4) . "\0\0\0\0" . substr($archive, $data_pos + 8);
    # Skip to the next file entry (it's _all_ the headers first, then the file data)
    $data_pos += 24 + $file_data["metadata_length"];
}
# Regenerate the file's signature
$sig_data = unpack("Vsigflags/C4magic", substr($archive, strlen($archive) - 8));
if ($sig_data["magic1"] == ord("G") && $sig_data["magic2"] == ord("B") && $sig_data["magic3"] == ord("M") && $sig_data["magic4"] == ord("B")) {
    if ($sig_data["sigflags"] == 1) {
        # MD5
        $sig_pos = strlen($archive) - 8 - 16;
        $archive = substr($archive, 0, $sig_pos) . pack("H32", md5(substr($archive, 0, $sig_pos))) . substr($archive, $sig_pos + 16);
    } else {
        # SHA-1
        $sig_pos = strlen($archive) - 8 - 20;
        $archive = substr($archive, 0, $sig_pos) . pack("H40", sha1(substr($archive, 0, $sig_pos))) . substr($archive, $sig_pos + 20);
    }
    # Note: The manual mentions SHA-256/SHA-512 support, but the corresponding flags aren't documented yet.
    # Phar currently signs with SHA-1 by default, so there's nothing to worry about; you still might have to add those sometime.
}
file_put_contents($filename, $archive);
I've written this ad hoc for my local PHP 5.5.9 version and your example above. The script will work for files created like your example code does. The documentation hints at some valid deviations from this format; there are comments at the corresponding lines in the code, and you might have to add something there if you want to support arbitrary Phar files.
Related
I have been struggling to get the correct filesize of a file that is >= 2 GB in PHP.
Example
Here I am checking the filesize of a file that is 3,827,394,560 bytes large with the filesize() function:
echo "The file is " . filesize('C:\MyFile.rar') . " bytes.";
Result
This is what it returns:
The file is -467572736 bytes.
Background
PHP uses signed integers, which on 32-bit builds means the largest number it can represent is 2,147,483,647 (roughly 2 GB).
This is where filesize() hits its limit.
The solution I tried, and which apparently works, is to use the "Size" property of the COM FileSystemObject file object. I am not entirely sure what type it uses.
This is my code:
function real_filesize($file_path)
{
    $fs = new COM("Scripting.FileSystemObject");
    return $fs->GetFile($file_path)->Size;
}
It's simply called as following:
$file = 'C:\MyFile.rar';
$size = real_filesize($file);
echo "The size of the file is: $size";
Result
The size of the file is: 3,827,394,560 bytes
http://us.php.net/manual/en/function.filesize.php#102135 gives a complete and correct means for finding the size of a file larger than 2GB in PHP, without relying on OS-specific interfaces.
The gist of it is that you first use filesize to get the "low" bits, then open and seek through the file to determine how many multiples of 2 GB it contains (the "high" bits).
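One way to sketch that idea in PHP (this variant binary-searches for the end of the file with fseek()/fgetc() rather than combining filesize()'s low bits; the function name and the 1 GiB starting step are my choices, not from the linked comment):

```php
<?php
// Locate EOF by seeking: hop forward in large steps, halving the step
// whenever we overshoot, so no single fseek() offset ever exceeds 1 GiB.
// The accumulated position is promoted to float past PHP_INT_MAX on
// 32-bit builds, so the result stays correct for files over 2 GB.
function true_filesize(string $path)
{
    $fp   = fopen($path, 'rb');
    $pos  = 0;
    $step = 1073741824; // 1 GiB hops to start with
    while ($step > 1) {
        fseek($fp, $step, SEEK_CUR);
        if (fgetc($fp) === false) {       // overshot EOF: back up, halve the hop
            fseek($fp, -$step, SEEK_CUR);
            $step = (int)($step / 2);
        } else {                          // still inside the file
            fseek($fp, -1, SEEK_CUR);     // undo the byte read by fgetc()
            $pos += $step;
        }
    }
    while (fgetc($fp) !== false) {        // count the last few bytes linearly
        $pos++;
    }
    fclose($fp);
    return $pos;
}
```

Reading one byte after each hop is what tells us whether the hop stayed inside the file, since fseek() itself happily seeks past EOF.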
I was using a different approach that saves precious server resources;
have a look at my GitHub repository: github.com/eladkarako/download.eladkarako.com.
It is a plain, complete download dashboard that overcomes the (rare) cases of filesize() error by using a client-side HEAD request. Granted, the size will not be embedded in the page's HTML source, but rendered (fixed) some time later, so it is more suitable for, let's say, relaxed scenarios.
To make this solution work, an Apache .htaccess rule (or a header sent from PHP) should be added to allow client-side access to the Content-Length value.
Essentially you can slim the .htaccess down to exposing just Content-Length, removing the other CORS rules, making the website more secure.
No jQuery was used, and the whole thing was written in my Samsung text editor and uploaded by FTP from my smartphone during a 1.5-hour train ride to my miluim (reserve duty); and yet, still impeccable ;)
I have one "hacky" solution that works well.
Please look at THIS function to see how I do it; you also need to include this class for the function to work, or change it to your needs.
Example:
include_once 'class.os.php';
include_once 'function.filesize.32bit.php';
// Must be the real path to the file
$file = "/home/username/some-folder/yourfile.zip";
echo get_filesize($file);
This function is not an ideal solution, but here is how it works:
First it checks whether shell_exec is enabled in PHP. If it is, it determines the real filesize via a shell command.
If the shell fails and the OS is 64-bit, it returns the normal filesize() information.
If the OS is 32-bit, it falls back to a "chunking" method and calculates the filesize by reading bytes.
NOTE!
After reading, keep the result in string format so you can calculate with it easily; PHP can do arithmetic on numeric strings, but if you cast a result over 2 GB to an integer you will have the same problem as before.
WARNING!
Chunking is really slow, and if you loop over it you may leak memory, or the script can take minutes to finish reading all the files. On a server where shell_exec is enabled, reading is super fast.
P.S.
If you have ideas for changes or improvements, feel free to commit.
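The note above about keeping big sizes out of PHP's integer type can be illustrated directly (a small standalone example, not part of the linked function):

```php
<?php
// PHP promotes overflowing integer arithmetic to float instead of wrapping,
// so holding a >2 GB size as a numeric string (or float) preserves the value,
// while forcing it into a 32-bit int would wrap it negative.
$size  = "3827394560";     // larger than 2^31, kept as a string
$bytes = $size + 0;        // numeric-string arithmetic: no 32-bit truncation
echo $bytes, "\n";
echo number_format($bytes), " bytes\n";
```

On a 32-bit build, `(int)$size` would give the familiar -467572736, while the string/float path above keeps the true value.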
For anyone who happens to be on a Linux host, the easiest solution I found is to use:
exec("stat --format=\"%s\" \"$file\"");
This assumes there are no quotation marks or newlines in the file name, and it technically returns a string instead of a number, but it works well with this method for producing a human-readable file size.
The largest file I tested this with was about 3.6 GB.
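If file names may contain spaces or quotes, the same idea can be hardened with escapeshellarg (a sketch, not from the answer above; it assumes GNU stat, so Linux only, and the helper name is mine):

```php
<?php
// Return the byte size of a file as a string, using GNU stat.
// escapeshellarg() quotes the name so the shell sees one safe argument.
function stat_size(string $file): string
{
    return trim(shell_exec('stat --format=%s ' . escapeshellarg($file)));
}
```

Returning the size as a string sidesteps 32-bit integer wraparound entirely, in line with the earlier answers.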
I know this is an oldie, but I'm using PHP x64 5.5.38 (for now) and don't want to upgrade to the latest 7.x version yet.
I read all these posts about finding file sizes larger than 2 GB, but all the solutions were very slow for large numbers of files.
So, yesterday I created a C/C++ PHP extension, "php_filesize.dll", which uses the power of C/C++ to find file sizes via a few methods I found; it's also UTF-8 compatible and very fast.
You can try it:
http://www.jobnik.net/files/PHP/php_filesize.zip
Usage:
methods:
0 - using GetFileAttributesEx
1 - using CreateFile
2 - using FindFirstFile
-1 - using stat64 (default and optional)
$fsize = php_filesize("filepath", $method_optional);
Returns the file size as a string, supporting sizes up to 9 petabytes.
Credits:
FileSize methods: Check the file-size without opening file in C++?
UTF-8 support: https://github.com/kenjiuno/php-wfio
To get the correct file size I often use this piece of code, written by myself some months ago. It uses exec/COM/stat where available. I know its limits, but it's a good starting point. The best option is using filesize() on a 64-bit architecture.
<?php
######################################################################
# Human size for files smaller or bigger than 2 GB on 32 bit Systems #
# size.php - 1.3 - 21.09.2015 - Alessandro Marinuzzi - www.alecos.it #
######################################################################
function showsize($file) {
    if (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN') {
        if (class_exists("COM")) {
            $fsobj = new COM('Scripting.FileSystemObject');
            $f = $fsobj->GetFile(realpath($file));
            $size = $f->Size;
        } else {
            $size = trim(@exec("for %F in (\"" . $file . "\") do @echo %~zF"));
        }
    } elseif (PHP_OS == 'Darwin') {
        $size = trim(@exec("stat -f %z " . escapeshellarg($file)));
    } else {
        $size = trim(@exec("stat -c %s " . escapeshellarg($file)));
    }
    if ((!is_numeric($size)) || ($size < 0)) {
        $size = filesize($file);
    }
    if ($size < 1024) {
        echo $size . ' Byte';
    } elseif ($size < 1048576) {
        echo number_format(round($size / 1024, 2), 2) . ' KB';
    } elseif ($size < 1073741824) {
        echo number_format(round($size / 1048576, 2), 2) . ' MB';
    } elseif ($size < 1099511627776) {
        echo number_format(round($size / 1073741824, 2), 2) . ' GB';
    } elseif ($size < 1125899906842624) {
        echo number_format(round($size / 1099511627776, 2), 2) . ' TB';
    } elseif ($size < 1152921504606846976) {
        echo number_format(round($size / 1125899906842624, 2), 2) . ' PB';
    } elseif ($size < 1180591620717411303424) {
        echo number_format(round($size / 1152921504606846976, 2), 2) . ' EB';
    } elseif ($size < 1208925819614629174706176) {
        echo number_format(round($size / 1180591620717411303424, 2), 2) . ' ZB';
    } else {
        echo number_format(round($size / 1208925819614629174706176, 2), 2) . ' YB';
    }
}
?>
<?php include("php/size.php"); ?>
<?php showsize("files/VeryBigFile.tar"); ?>
I hope this helps.
Is there a maximum file size the XMLReader can handle?
I'm trying to process an XML feed about 3 GB large. There are certainly no PHP errors, as the script runs fine and successfully loads to the database after it's been run.
The script also runs fine with smaller test feeds of 1 GB and below. However, when processing larger feeds the script stops reading the XML file after about 1 GB and continues running the rest of the script.
Has anybody experienced a similar problem? and if so how did you work around it?
Thanks in advance.
I had the same kind of problem recently, and I thought to share my experience.
It seems that the problem lies in how PHP was compiled: whether it was compiled with support for 64-bit file sizes/offsets or only 32-bit ones.
With 32 bits you can only address 4 GB of data. You can find a somewhat confusing but good explanation here: http://blog.mayflower.de/archives/131-Handling-large-files-without-PHP.html
I had to split my files with the Perl utility xml_split, which you can find here: http://search.cpan.org/~mirod/XML-Twig/tools/xml_split/xml_split
I used it to split my huge XML file into manageable chunks. The good thing about the tool is that it splits XML files over whole elements. Unfortunately it's not very fast.
I needed to do this only once and it suited my needs, but I wouldn't recommend it for repetitive use. After splitting, I used XMLReader on the smaller files of about 1 GB each.
Splitting up the file will definitely help. Other things to try...
adjust the memory_limit setting in php.ini. http://php.net/manual/en/ini.core.php
rewrite your parser using SAX -- http://php.net/manual/en/book.xml.php . This is a stream-oriented parser that doesn't need to parse the whole tree. Much more memory-efficient but slightly harder to program.
Depending on your OS, there might also be a 2 GB limit on the RAM chunk that you can allocate. Very possible if you're running on a 32-bit OS.
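The SAX suggestion above can be sketched like this (the function name and the 8 KiB chunk size are illustrative choices; the point is that only one chunk is in memory at a time):

```php
<?php
// Stream-parse an XML file with the SAX (expat) extension: feed it to the
// parser in small chunks, counting start tags, so memory use stays flat
// no matter how large the file is.
function count_elements(string $path): int
{
    $count  = 0;
    $parser = xml_parser_create();
    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$count) { $count++; }, // open tag
        function ($p, $name) {}                                   // close tag
    );
    $fp = fopen($path, 'rb');
    while (!feof($fp)) {
        // The third argument flags the final chunk so the parser can finish.
        xml_parse($parser, fread($fp, 8192), feof($fp));
    }
    fclose($fp);
    xml_parser_free($parser);
    return $count;
}
```

The same handler structure can collect data instead of just counting; the essential part is that xml_parse() accepts the document piecewise.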
It should be noted that PHP in general has a maximum file size. PHP does not allow for unsigned integers or long integers, meaning you're capped at 2^31 (or 2^63 on 64-bit systems) for integers. This is important because PHP uses an integer for the file pointer (your position in the file as you read through it), meaning it cannot process a file larger than 2^31 bytes in size.
However, this should be more than 1 gigabyte. I ran into issues with two gigabytes (as expected, since 2^31 is roughly 2 billion).
I've run into a similar issue when parsing large documents. What I wound up doing is breaking the feed into smaller chunks using filesystem functions, then parsing those smaller chunks. So if you have a bunch of <record> tags that you are parsing, parse them out with string functions as a stream, and when you get a full record in the buffer, parse that using the XML functions. It sucks, but it works quite well (and is very memory-efficient, since you have at most one record in memory at any one time).
Do you get any errors with
libxml_use_internal_errors(true);
libxml_clear_errors();
// your parser stuff here....
$r = new XMLReader(...);
// ....
foreach( libxml_get_errors() as $err ) {
printf(". %d %s\n", $err->code, $err->message);
}
when the parser stops prematurely?
Using Windows XP, NTFS as the filesystem, and PHP 5.3.2, there was no problem with this test script:
<?php
define('SOURCEPATH', 'd:/test.xml');

if ( 0 ) {
    build();
} else {
    echo 'filesize: ', number_format(filesize(SOURCEPATH)), "\n";
    timing('read');
}

function timing($fn) {
    $start = new DateTime();
    echo 'start: ', $start->format('Y-m-d H:i:s'), "\n";
    $fn();
    $end = new DateTime();
    echo 'end: ', $end->format('Y-m-d H:i:s'), "\n";
    echo 'diff: ', $end->diff($start)->format('%I:%S'), "\n";
}

function read() {
    $cnt = 0;
    $r = new XMLReader;
    $r->open(SOURCEPATH);
    while( $r->read() ) {
        if ( XMLReader::ELEMENT === $r->nodeType ) {
            if ( 0 === ++$cnt % 500000 ) {
                echo '.';
            }
        }
    }
    echo "\n#elements: ", $cnt, "\n";
}

function build() {
    $fp = fopen(SOURCEPATH, 'wb');
    $s = '<catalogue>';
    //for($i = 0; $i < 500000; $i++) {
    for($i = 0; $i < 60000000; $i++) {
        $s .= sprintf('<item>%010d</item>', $i);
        if ( 0 === $i % 100000 ) {
            fwrite($fp, $s);
            $s = '';
            echo $i/100000, ' ';
        }
    }
    $s .= '</catalogue>';
    fwrite($fp, $s);
    fflush($fp); // fflush(), not flush(): flush() takes no file-pointer argument
    fclose($fp);
}
output:
filesize: 1,380,000,023
start: 2010-08-07 09:43:31
........................................................................................................................
#elements: 60000001
end: 2010-08-07 09:43:31
diff: 07:31
(as you can see I screwed up the output of the end-time but I don't want to run this script another 7+ minutes ;-))
Does this also work on your system?
As a side note: the corresponding C# test application took only 41 seconds instead of 7.5 minutes. And my slow hard drive might have been the (or one) limiting factor in this case.
filesize: 1.380.000.023
start: 2010-08-07 09:55:24
........................................................................................................................
#elements: 60000001
end: 2010-08-07 09:56:05
diff: 00:41
and the source:
using System;
using System.IO;
using System.Xml;

namespace ConsoleApplication1
{
    class SOTest
    {
        delegate void Foo();
        const string sourcepath = @"d:\test.xml";

        static void timing(Foo bar)
        {
            DateTime dtStart = DateTime.Now;
            System.Console.WriteLine("start: " + dtStart.ToString("yyyy-MM-dd HH:mm:ss"));
            bar();
            DateTime dtEnd = DateTime.Now;
            System.Console.WriteLine("end: " + dtEnd.ToString("yyyy-MM-dd HH:mm:ss"));
            TimeSpan s = dtEnd.Subtract(dtStart);
            System.Console.WriteLine("diff: {0:00}:{1:00}", s.Minutes, s.Seconds);
        }

        static void readTest()
        {
            XmlTextReader reader = new XmlTextReader(sourcepath);
            int cnt = 0;
            while (reader.Read())
            {
                if (XmlNodeType.Element == reader.NodeType)
                {
                    if (0 == ++cnt % 500000)
                    {
                        System.Console.Write('.');
                    }
                }
            }
            System.Console.WriteLine("\n#elements: " + cnt + "\n");
        }

        static void Main()
        {
            FileInfo f = new FileInfo(sourcepath);
            System.Console.WriteLine("filesize: {0:N0}", f.Length);
            timing(readTest);
            return;
        }
    }
}
I am trying to create a PHP script to get the app version from an Android APK file.
Extracting the XML file from the APK (zip) file and then parsing the XML is one way, but I guess it should be simpler. Something like example #3 in the PHP manual.
Any ideas how to create the script?
If you have the Android SDK installed on the server, you can use PHP's exec (or similar) to execute the aapt tool (in $ANDROID_HOME/platforms/android-X/tools).
$ aapt dump badging myapp.apk
And the output should include:
package: name='com.example.myapp' versionCode='1530' versionName='1.5.3'
If you can't install the Android SDK, for whatever reason, then you will need to parse Android's binary XML format. The AndroidManifest.xml file inside the APK zip structure is not plain text.
You would need to port a utility like AXMLParser from Java to PHP.
I've created a set of PHP functions that will find just the version code of an APK. This is based on the fact that the AndroidManifest.xml file contains the version code as the first tag, and on the AXML (binary Android XML) format as described here:
<?php
$APKLocation = "PATH TO APK GOES HERE";
$versionCode = getVersionCodeFromAPK($APKLocation);
echo $versionCode;
//Based on the fact that the Version Code is the first tag in the AndroidManifest.xml file, this will return its value
//PHP implementation based on the AXML format described here: https://stackoverflow.com/questions/2097813/how-to-parse-the-androidmanifest-xml-file-inside-an-apk-package/14814245#14814245
function getVersionCodeFromAPK($APKLocation) {
    $versionCode = "N/A";
    //AXML LEW 32-bit word (hex) for a start tag
    $XMLStartTag = "00100102";
    //APK is essentially a zip file, so open it
    $zip = zip_open($APKLocation);
    if ($zip) {
        while ($zip_entry = zip_read($zip)) {
            //Look for the AndroidManifest.xml file in the APK root directory
            if (zip_entry_name($zip_entry) == "AndroidManifest.xml") {
                //Get the contents of the file in hex format
                $axml = getHex($zip, $zip_entry);
                //Convert AXML hex file into an array of 32-bit words
                $axmlArr = convert2wordArray($axml);
                //Convert AXML 32-bit word array into Little Endian format 32-bit word array
                $axmlArr = convert2LEWwordArray($axmlArr);
                //Get first AXML open tag word index
                $firstStartTagword = findWord($axmlArr, $XMLStartTag);
                //The version code is 13 words after the first open tag word
                $versionCode = intval($axmlArr[$firstStartTagword + 13], 16);
                break;
            }
        }
    }
    zip_close($zip);
    return $versionCode;
}

//Get the contents of the file in hex format
function getHex($zip, $zip_entry) {
    if (zip_entry_open($zip, $zip_entry, 'r')) {
        $buf = zip_entry_read($zip_entry, zip_entry_filesize($zip_entry));
        $hex = unpack("H*", $buf);
        return current($hex);
    }
}

//Given a hex byte stream, return an array of words
function convert2wordArray($hex) {
    $wordArr = array();
    $numwords = strlen($hex) / 8;
    for ($i = 0; $i < $numwords; $i++)
        $wordArr[] = substr($hex, $i * 8, 8);
    return $wordArr;
}

//Given an array of words, convert them to Little Endian format (LSB first)
function convert2LEWwordArray($wordArr) {
    $LEWArr = array();
    foreach ($wordArr as $word) {
        $LEWword = "";
        for ($i = 0; $i < strlen($word) / 2; $i++)
            $LEWword .= substr($word, (strlen($word) - ($i * 2) - 2), 2);
        $LEWArr[] = $LEWword;
    }
    return $LEWArr;
}

//Find a word in the word array and return its index value
function findWord($wordArr, $wordToFind) {
    $currentword = 0;
    foreach ($wordArr as $word) {
        if ($word == $wordToFind)
            return $currentword;
        else
            $currentword++;
    }
}
?>
Use this in the CLI:
apktool if 1.apk
aapt dump badging 1.apk
You can use these commands in PHP using exec or shell_exec.
aapt dump badging ./apkfile.apk | grep sdkVersion -i
You will get a human readable form.
sdkVersion:'14'
targetSdkVersion:'14'
Just look for aapt in your system if you have Android SDK installed.
Mine is in:
<SDKPATH>/build-tools/19.0.3/aapt
The dump format is a little odd and not the easiest to work with. Just to expand on some of the other answers, this is a shell script that I am using to parse out name and version from APK files.
aapt d badging PACKAGE | gawk $'match($0, /^application-label:\'([^\']*)\'/, a) { n = a[1] }
match($0, /versionName=\'([^\']*)\'/, b) { v=b[1] }
END { if ( length(n)>0 && length(v)>0 ) { print n, v } }'
If you just want the version then obviously it can be much simpler.
aapt d badging PACKAGE | gawk $'match($0, /versionName=\'([^\']*)\'/, v) { print v[1] }'
Here are variations suitable for both gawk and mawk (a little less durable in case the dump format changes but should be fine):
aapt d badging PACKAGE | mawk -F\' '$1 ~ /^application-label:$/ { n=$2 }
$5 ~ /^ versionName=$/ { v=$6 }
END{ if ( length(n)>0 && length(v)>0 ) { print n, v } }'
aapt d badging PACKAGE | mawk -F\' '$5 ~ /^ versionName=$/ { print $6 }'
Which is faster between glob() and opendir(), for reading around 1-2K file(s)?
http://code2design.com/forums/glob_vs_opendir
Obviously opendir() should be (and is) quicker, as it opens the directory handle and lets you iterate over entries yourself. Because glob() has to parse its first argument, it's going to take some more time (plus glob() handles recursive patterns, so it'll scan subdirectories, which adds to the execution time).
glob and opendir do different things. glob finds pathnames matching a pattern and returns them in an array, while opendir returns a directory handle only. To get the same results as with glob you have to call additional functions, which you have to take into account when benchmarking, especially if this includes pattern matching.
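For instance, reproducing glob("$dir/*.txt") on top of opendir() means adding the matching and the sorting yourself (a sketch; the helper name is mine):

```php
<?php
// Rebuild glob()'s behaviour from opendir(): read the entries, filter them
// with fnmatch(), prepend the directory, and sort (glob() sorts by default).
function glob_via_opendir(string $dir, string $pattern): array
{
    $matches = [];
    $dh = opendir($dir);
    while (($entry = readdir($dh)) !== false) {
        if (fnmatch($pattern, $entry)) {
            $matches[] = $dir . '/' . $entry;
        }
    }
    closedir($dh);
    sort($matches);
    return $matches;
}
```

This is the extra work a fair benchmark has to charge to the opendir() side.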
Bill Karwin has written an article about this recently. See:
http://www.phparch.com/2010/04/28/putting-glob-to-the-test/
Not sure whether that is a perfect comparison, but glob() allows you to use shell-like patterns, whereas opendir() reads a directory directly, which makes it faster.
Another question that can be answered with a bit of testing. I had a convenient folder with 412 things in it, but the results shouldn't vary much, I imagine:
igor47@whisker ~/test $ ls /media/music | wc -l
412
igor47@whisker ~/test $ time php opendir.php
414 files total
real 0m0.023s
user 0m0.000s
sys 0m0.020s
igor47@whisker ~/test $ time php glob.php
411 files total
real 0m0.023s
user 0m0.010s
sys 0m0.010s
Okay,
Long story short:
if you want full filenames+paths, sorted, glob is practically unbeatable.
if you want full filenames+paths unsorted, use glob with GLOB_NOSORT.
if you want only the names, and no sorting, use opendir + loop.
That's it.
Some more thoughts:
You can do tests to compose the exact same result with different methods only to find they have approximately the same time cost. Merely for fetching the information you'll have no real winner. However, consider these:
Dealing with a huge file list, glob will sort faster - it uses the filesystem's sort method which will always be superior. (It knows what it sorts while PHP doesn't, PHP sorts a hashed array of arbitrary strings, it's simply not fair to compare them.)
You'll probably want to filter your list by some extensions or filename masks for which glob is really efficient. You have fnmatch() of course, but calling it every time will never be faster than a system-level filter trained for this very job.
On the other hand, glob returns a significantly bigger amount of text (each name with full path) so with a lot of files you may run into memory allocation limits. For a zillion files, glob is not your friend.
opendir is faster...
<?php
$path = "/var/Upload/gallery/TEST/";
$filenm = "IMG20200706075415";
function microtime_float()
{
    list($usec, $sec) = explode(" ", microtime());
    return ((float)$usec + (float)$sec);
}
echo "<br> <i>T1:</i>".$t1 = microtime_float();
echo "<br><br> <b><i>Glob :</i></b>";
foreach (glob($path.$filenm.".*") as $file)
{
    echo "<br>".$file;
}
echo "<br> <i>T2:</i> ".$t2 = microtime_float();
echo "<br><br> <b><i>OpenDir :</b></i>";
function resolve($name)
{
    // read information about the path
    $info = pathinfo($name);
    if (!empty($info['extension'])) {
        // if the file already contains an extension, return it
        return $name;
    }
    $filename = $info['filename'];
    $len = strlen($filename);
    // open the folder
    $dh = opendir($info['dirname']);
    if (!$dh) {
        return false;
    }
    // scan each file in the folder
    while (($file = readdir($dh)) !== false) {
        if (strncmp($file, $filename, $len) === 0) {
            if (strlen($name) > $len) {
                // if name contains a directory part
                $name = substr($name, 0, strlen($name) - $len) . $file;
            } else {
                // if the name is at the path root
                $name = $file;
            }
            closedir($dh);
            return $name;
        }
    }
    // file not found
    closedir($dh);
    return false;
}
$file = resolve($path.$filenm);
echo "<br>".$file;
echo "<br> <i>T3:</i> ".$t3 = microtime_float();
echo "<br><br> <b>glob time:</b> ". $gt= ($t2 - $t1) ."<br><b>opendir time:</b>". $ot = ($t3 - $t2) ;
echo "<u>". (( $ot < $gt ) ? "<br><br>OpenDir is ".($gt-$ot)." more Faster" : "<br><br>Glob is ".($ot-$gt)." moreFaster ") . "</u>";
?>
Output:
T1:1620133029.7558
Glob :
/var/Upload/gallery/TEST/IMG20200706075415.jpg
T2: 1620133029.7929
OpenDir :
/var/Upload/gallery/TEST/IMG20200706075415.jpg
T3: 1620133029.793
glob time:0.037137985229492
opendir time:5.9843063354492E-5
OpenDir is 0.037078142166138 more Faster
Does anyone know of one? I need to test some upload/download scripts and need some really large files generated. I was going to integrate the test utility with my debug script.
To start you could try something like this:
function generate_file($file_name, $size_in_bytes)
{
$data = str_repeat(rand(0,9), $size_in_bytes);
file_put_contents($file_name, $data); //writes $data in a file
}
This creates a file filled with a single random digit (0-9) repeated for the whole length.
generate_file() from "Marco Demaio" is not memory-friendly, so I created file_rand().
function file_rand($filename, $filesize) {
    if ($h = fopen($filename, 'w')) {
        if ($filesize > 1024) {
            for ($i = 0; $i < floor($filesize / 1024); $i++) {
                fwrite($h, bin2hex(openssl_random_pseudo_bytes(511)) . PHP_EOL);
            }
            $filesize = $filesize - (1024 * $i);
        }
        $mod = $filesize % 2;
        fwrite($h, bin2hex(openssl_random_pseudo_bytes(($filesize - $mod) / 2)));
        if ($mod) {
            fwrite($h, substr(uniqid(), 0, 1));
        }
        fclose($h);
        umask(0000);
        chmod($filename, 0644);
    }
}
As you can see, line breaks are added every 1024 bytes to avoid problems with functions that are limited to 1024-9999 bytes, e.g. fgets() with PHP <= 4.3. And it makes it easier to open the file in a text editor, which can have the same issue with super-long lines.
Do you really need so much variation in filesize that you need a PHP script? I'd just create test files of varying sizes via the command line and use them in my unit tests. Unless the filesize itself is likely to cause a bug, it would seem you're over-engineering here...
To create a file in Windows:
fsutil file createnew d:\filepath\filename.txt 1048576
in Linux:
dd if=/dev/zero of=filepath/filename.txt bs=10000000 count=1
if is the input file (in this case a stream of zero bytes), of is the output file, bs is the block size, and count defines how many blocks to copy, so the resulting file size is bs × count (10 MB here).
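The same effect is available from inside PHP via ftruncate(), which sets a file to an exact size without writing it byte by byte (a sketch; the function name is my own):

```php
<?php
// Create a file of exactly $bytes bytes. Extending with ftruncate() pads
// the file with NUL bytes, much like dd if=/dev/zero, and is effectively
// instant on most filesystems.
function make_dummy_file(string $path, int $bytes): void
{
    $fp = fopen($path, 'wb');
    ftruncate($fp, $bytes);
    fclose($fp);
}
```

On filesystems with sparse-file support the blocks may not even be allocated until read, which is worth knowing if the test needs real on-disk data.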
generate_file() from @Marco Demaio caused the warning below when generating a 4 GB file:
Warning: str_repeat(): Result is too big, maximum 2147483647 allowed
in /home/xxx/test_suite/handler.php on line 38
I found the function below on php.net and it works like a charm.
I have tested it up to
17.6 TB (see update below)
in less than 3 seconds.
function CreatFileDummy($file_name, $size = 90294967296) {
    // 32 bits: 4 294 967 296 bytes max size
    $f = fopen('dummy/' . $file_name, 'wb');
    if ($size >= 1000000000) {
        $z = ($size / 1000000000);
        if (is_float($z)) {
            $z = round($z, 0);
            fseek($f, ($size - ($z * 1000000000) - 1), SEEK_END);
            fwrite($f, "\0");
        }
        while (--$z > -1) {
            fseek($f, 999999999, SEEK_END);
            fwrite($f, "\0");
        }
    } else {
        fseek($f, $size - 1, SEEK_END);
        fwrite($f, "\0");
    }
    fclose($f);
    return true;
}
Update:
I was trying to hit 120 TB, 1200 TB and more, but the file size was limited to 17.6 TB. After some googling I found that this is the max_volume_size of the ReiserFS filesystem, which was on my server.
Maybe PHP can handle 1200 TB in just a few seconds, too. :)
Why not have a script that streams out random data? The script can take parameters for file size, type etc.
This way you can simulate many scenarios, for example bandwidth throttling, premature file end etc. etc.
Does the file really need to be random? If so, just read from /dev/urandom on a Linux system:
dd if=/dev/urandom of=yourfile bs=4096 count=1024 # for a 4MB file.
If it doesn't really need to be random, just find some files you have lying around that are the appropriate size, or (alternatively) use tar and make some tarballs of various sizes.
There's no reason this needs to be done in a PHP script: ordinary shell tools are perfectly sufficient to generate the files you need.
If you want really random data you might want to try this:
$data = '';
while ($byteSize-- > 0) {
    $data .= chr(rand(0, 255));
}
Might take a while, though, if you want large file sizes (as with any random data).
I would suggest using a library like Faker to generate test data.
I took the answer of mgutt and shortened it a bit. Also, his answer has a little bug which I wanted to avoid.
function createRandomFile(string $filename, int $filesize): void
{
    $h = fopen($filename, 'w');
    if (!$h) return;
    for ($i = 0; $i < intdiv($filesize, 1024); $i++) {
        fwrite($h, bin2hex(random_bytes(511)) . PHP_EOL);
    }
    fwrite($h, substr(bin2hex(random_bytes(512)), 0, $filesize % 1024));
    fclose($h);
    chmod($filename, 0644);
}
Note: This works only with PHP >= 7. If you really want to run it on lower versions, use openssl_random_pseudo_bytes instead of random_bytes and floor($filesize / 1024) instead of intdiv($filesize, 1024).