PHP filesize() On Files > 2 GB

I have been struggling to get the correct filesize of a file that is >= 2 GB in PHP.
Example
Here I am checking the size of a file that is 3,827,394,560 bytes large with the filesize() function:
echo "The file is " . filesize('C:\MyFile.rar') . " bytes.";
Result
This is what it returns:
The file is -467572736 bytes.
Background
On 32-bit builds, PHP uses signed 32-bit integers, which means the maximum number it can represent is 2,147,483,647 (roughly 2 GB); anything above that wraps around.
This is where filesize() is limited.
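As a sanity check, the reported negative number is exactly the true size wrapped around 2^32: 3,827,394,560 - 4,294,967,296 = -467,572,736. A common workaround on 32-bit builds (not part of the solution below, and only valid for files under 4 GB) is to undo the wrap by reinterpreting the result as unsigned:
// On a 32-bit build, %u reinterprets the wrapped value as unsigned 32-bit:
$size = sprintf('%u', filesize('C:\MyFile.rar')); // "3827394560", as a string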

The solution I tried, which apparently works, is to use the "Size" property of the file object returned by the COM Scripting.FileSystemObject. I am not entirely sure what numeric type it uses.
This is my code:
function real_filesize($file_path)
{
    $fs = new COM("Scripting.FileSystemObject");
    return $fs->GetFile($file_path)->Size;
}
It's called as follows:
$file = 'C:\MyFile.rar';
$size = real_filesize($file);
echo "The size of the file is: $size";
Result
The size of the file is: 3,827,394,560 bytes

http://us.php.net/manual/en/function.filesize.php#102135 gives a complete and correct means for finding the size of a file larger than 2GB in PHP, without relying on OS-specific interfaces.
The gist of it is that you first use filesize to get the "low" bits, then open+seek the file to determine how many multiples of 2GB it contains (the "high" bits).
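For reference, here is a minimal sketch of that idea (my own paraphrase, not the verbatim php.net code). It counts the high part in 4 GB wraps, since filesize() reports the size modulo 2^32; it assumes a build with large-file support, i.e. that fseek() can move past 2 GB and past EOF on regular files, and it uses float arithmetic, which is exact up to 2^53 bytes:
function big_filesize($path)
{
    // On builds with 64-bit integers, filesize() is already correct.
    if (PHP_INT_SIZE >= 8) {
        return (float) filesize($path);
    }
    // Low 32 bits of the size, reinterpreted as unsigned (kept as a float).
    $low = (float) sprintf('%u', filesize($path));

    // Count how many full 4 GB "wraps" the file contains by hopping forward
    // 2^32 bytes at a time and probing whether a byte still exists there.
    $fp = fopen($path, 'rb');
    if ($fp === false) {
        return false;
    }
    $wraps = 0;
    fseek($fp, 0, SEEK_SET);
    while (true) {
        fseek($fp, 0x7FFFFFFF, SEEK_CUR);
        fseek($fp, 0x7FFFFFFF, SEEK_CUR);
        fseek($fp, 2, SEEK_CUR);      // 2 * 0x7FFFFFFF + 2 = 2^32
        if (fgetc($fp) === false) {
            break;                    // seeked past the real end of the file
        }
        fseek($fp, -1, SEEK_CUR);     // undo the byte fgetc consumed
        $wraps++;
    }
    // Edge case: a size that is an exact multiple of 4 GB reports $low == 0,
    // just like an empty file; probe the byte before the failed position.
    if ($low == 0.0) {
        fseek($fp, -1, SEEK_CUR);
        if (fgetc($fp) !== false) {
            $wraps++;
        }
    }
    fclose($fp);
    return $wraps * 4294967296.0 + $low;
}
The edge case at the end matters because a file that is an exact multiple of 4 GB has the same low 32 bits as an empty file.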

I was using a different approach that saves precious server resources;
have a look at my GitHub repository github.com/eladkarako/download.eladkarako.com.
It is a plain, complete download dashboard that overcomes the (rare) cases of filesize error by using a client-side HEAD request. Granted, the size will not be embedded into the page's HTML source, but rendered in (fixed up) some time later, so it is better suited to, let's say, relaxed scenarios.
To make this solution work, an Apache .htaccess (or a header sent from PHP) should be added allowing client-side access to the Content-Length value.
Essentially you can slim the .htaccess down to just exposing Content-Length and remove the other CORS rules, making the website more secure.
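For example, the PHP variant might look like this (a hedged sketch, not taken from the repository; these are the standard CORS headers that let client-side JavaScript read Content-Length from a HEAD response):
// Send before any output; '*' is only for illustration, prefer a specific origin.
header('Access-Control-Allow-Origin: *');
header('Access-Control-Expose-Headers: Content-Length');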
No jQuery was used, and the whole thing was written in my Samsung text editor and uploaded by FTP from my smartphone during a 1.5-hour train ride to my army reserve duty (miluim), and yet it is still impeccable ;)

I have one "hacky" solution that works well.
Please look at THIS function to see how I do it; you will also need to include this class for the function to work, or change it to suit your needs.
Example:
include_once 'class.os.php';
include_once 'function.filesize.32bit.php';
// Must be the real path to the file
$file = "/home/username/some-folder/yourfile.zip";
echo get_filesize($file);
This function is not an ideal solution, but here is how it works:
First it checks whether shell_exec is enabled in PHP. If it is enabled, it will get the real filesize via a shell command.
If the shell fails and the OS is 64-bit, it will return the normal filesize() information.
If the OS is 32-bit, it will fall back to the "chunking" method and calculate the filesize by reading bytes.
NOTE!
After reading, keep the results in string format so you can calculate with them easily: PHP can do arithmetic on numeric strings, but if you cast a result over 2 GB to an integer you will have the same problem as before.
WARNING!
Chunking is really slow, and if you loop over it the script can consume a lot of memory or take minutes to finish reading all the files. On a server where shell_exec is enabled, this function reads sizes very fast.
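For reference, here is a hedged sketch of that strategy (the actual class.os.php / function.filesize.32bit.php files are not reproduced here; the sketch assumes GNU stat on the shell path and the bcmath extension for the chunking fallback):
function get_filesize_sketch($file) {
    // 1) Prefer the shell, if available (GNU stat shown; adapt per OS).
    if (function_exists('shell_exec')) {
        $out = shell_exec('stat -c %s ' . escapeshellarg($file));
        if (is_string($out) && ctype_digit(trim($out))) {
            return trim($out);               // keep as a string, per the note
        }
    }
    // 2) On 64-bit builds, filesize() is safe.
    if (PHP_INT_SIZE >= 8) {
        return (string) filesize($file);
    }
    // 3) 32-bit fallback: read through the file in chunks (slow, see warning).
    $fp = fopen($file, 'rb');
    $size = '0';
    while (!feof($fp)) {
        $size = bcadd($size, (string) strlen(fread($fp, 1 << 20))); // 1 MiB
    }
    fclose($fp);
    return $size;
}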
P.S.
If you have any ideas for changes or improvements here, feel free to commit.

For anyone who happens to be on a Linux host, the easiest solution I found is to use:
exec("stat --format=\"%s\" \"$file\"");
This assumes no quotation marks or newlines in the file name, and technically it returns a string instead of a number, but it works well with this method to get a human-readable file size.
The largest file I tested this with was about 3.6 GB.
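A slightly hardened variant of the same idea (my addition, not from the original answer) escapes the path to handle the quoting caveat:
$size = exec('stat --format=%s ' . escapeshellarg($file)); // GNU coreutils stat
// $size is a numeric string; avoid casting it to int on 32-bit builds.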

I know this is an oldie, but I'm using PHP x64 5.5.38 (for now) and don't want to upgrade to the latest 7.x version yet.
I read all these posts about finding file sizes larger than 2 GB, but all the solutions were very slow for a large number of files.
So, yesterday I created the C/C++ PHP extension "php_filesize.dll", which uses the power of C/C++ to find file sizes with a few methods I found; it's also UTF-8 compatible and very fast.
You can try it:
http://www.jobnik.net/files/PHP/php_filesize.zip
Usage:
$fsize = php_filesize("filepath", $method_optional);
Methods:
0 - using GetFileAttributesEx
1 - using CreateFile
2 - using FindFirstFile
-1 - using stat64 (default, optional parameter)
Returns the file size as a string, for sizes up to 9 petabytes.
Credits:
FileSize methods: Check the file-size without opening file in C++?
UTF-8 support: https://github.com/kenjiuno/php-wfio

To get the correct file size I often use this piece of code, which I wrote some months ago. It uses exec/COM/stat where available. I know its limits, but it's a good starting point. The best idea is to use filesize() on a 64-bit architecture.
<?php
######################################################################
# Human size for files smaller or bigger than 2 GB on 32 bit Systems #
# size.php - 1.3 - 21.09.2015 - Alessandro Marinuzzi - www.alecos.it #
######################################################################
function showsize($file) {
    if (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN') {
        if (class_exists("COM")) {
            $fsobj = new COM('Scripting.FileSystemObject');
            $f = $fsobj->GetFile(realpath($file));
            $size = $f->Size;
        } else {
            $size = trim(@exec("for %F in (\"" . $file . "\") do @echo %~zF"));
        }
    } elseif (PHP_OS == 'Darwin') {
        $size = trim(@exec("stat -f %z " . $file));
    } else {
        $size = trim(@exec("stat -c %s " . $file));
    }
    if ((!is_numeric($size)) || ($size < 0)) {
        $size = filesize($file);
    }
    if ($size < 1024) {
        echo $size . ' Byte';
    } elseif ($size < 1048576) {
        echo number_format(round($size / 1024, 2), 2) . ' KB';
    } elseif ($size < 1073741824) {
        echo number_format(round($size / 1048576, 2), 2) . ' MB';
    } elseif ($size < 1099511627776) {
        echo number_format(round($size / 1073741824, 2), 2) . ' GB';
    } elseif ($size < 1125899906842624) {
        echo number_format(round($size / 1099511627776, 2), 2) . ' TB';
    } elseif ($size < 1152921504606846976) {
        echo number_format(round($size / 1125899906842624, 2), 2) . ' PB';
    } elseif ($size < 1180591620717411303424) {
        echo number_format(round($size / 1152921504606846976, 2), 2) . ' EB';
    } elseif ($size < 1208925819614629174706176) {
        echo number_format(round($size / 1180591620717411303424, 2), 2) . ' ZB';
    } else {
        echo number_format(round($size / 1208925819614629174706176, 2), 2) . ' YB';
    }
}
?>
<?php include("php/size.php"); ?>
<?php showsize("files/VeryBigFile.tar"); ?>
I hope this helps.

Related

How to recreate PHAR files with identical sha1sums at different times?

I'm working on a command-line PHP project and want to be able to recreate the PHAR file that is my deployment artifact. The challenge is that I can't create two PHARs with identical sha1sums if they were created more than 1 second apart. I would like to be able to exactly recreate my PHAR file if the input files are the same (i.e. came from the same git commit).
The following code snippet demonstrates the problem:
#!/usr/bin/php
<?php
$hashes = array();
$file_names = array('file1.phar', 'file2.phar');
foreach ($file_names as $name) {
    if (file_exists($name)) {
        unlink($name);
    }
    $phar = new Phar($name);
    $phar->addFromString('cli.php', "cli\n");
    $hashes[] = sha1_file($name);
    // remove the sleep and the PHARs are identical.
    sleep(1);
}
if ($hashes[0] == $hashes[1]) {
    echo "match\n";
} else {
    echo "do not match\n";
}
As far as I can tell, the "modification time" field for each file in the PHAR manifest is always set to the current time, and there seems to be no way of overriding that. Even touch("phar://file1.phar/cli.php", 1413387555) gives the error:
touch(): Can not call touch() for a non-standard stream
I ran the above code in PHP 5.5.9 on Ubuntu Trusty and PHP 5.3 on RHEL 5; both versions behave the same way and fail to create identical PHAR files.
I'm trying to do this in order to follow the advice in the book Continuous Delivery by Jez Humble and David Farley.
Any help is appreciated.
The Phar class currently does not allow users to alter or even access the modification time. I thought of storing your string in a temporary file and using touch to alter the mtime, but that does not seem to have any effect. So you'll have to manually change the timestamps in the created files and then regenerate the archive signature. Here's how to do it with current PHP versions:
<?php
$filename = "file1.phar";
$archive = file_get_contents($filename);
# Search for the start of the archive header
# See http://php.net/manual/de/phar.fileformat.phar.php
# This isn't the only valid way to write a PHAR archive, but it is what the Phar class
# currently does, so you should be fine (The docs say that the end-of-PHP-tag is optional)
$magic = "__HALT_COMPILER(); ?" . ">";
$end_of_code = strpos($archive, $magic) + strlen($magic);
$data_pos = $end_of_code;
# Skip that header
$data = unpack("Vmanifest_length/Vnumber_of_files/vapi_version/Vglobal_flags/Valias_length", substr($archive, $end_of_code, 18));
$data_pos += 18 + $data["alias_length"];
$metadata = unpack("Vlength", substr($archive, $data_pos, 4));
$data_pos += 4 + $metadata["length"];
for ($i = 0; $i < $data["number_of_files"]; $i++) {
    # Now $data_pos points to the first file
    # Files are explained here: http://php.net/manual/de/phar.fileformat.manifestfile.php
    $filename_data = unpack("Vfilename_length", substr($archive, $data_pos, 4));
    $data_pos += 4 + $filename_data["filename_length"];
    $file_data = unpack("Vuncompressed_size/Vtimestamp/Vcompressed_size/VCRC32/Vflags/Vmetadata_length", substr($archive, $data_pos, 24));
    # Change the timestamp to zeros (You can also use some other time here using pack("V", time()) instead of the zeros)
    $archive = substr($archive, 0, $data_pos + 4) . "\0\0\0\0" . substr($archive, $data_pos + 8);
    # Skip to the next file (it's _all_ the headers first, then file data)
    $data_pos += 24 + $file_data["metadata_length"];
}
# Regenerate the file's signature
$sig_data = unpack("Vsigflags/C4magic", substr($archive, strlen($archive) - 8));
if ($sig_data["magic1"] == ord("G") && $sig_data["magic2"] == ord("B") && $sig_data["magic3"] == ord("M") && $sig_data["magic4"] == ord("B")) {
    if ($sig_data["sigflags"] == 1) {
        # MD5
        $sig_pos = strlen($archive) - 8 - 16;
        $archive = substr($archive, 0, $sig_pos) . pack("H32", md5(substr($archive, 0, $sig_pos))) . substr($archive, $sig_pos + 16);
    }
    else {
        # SHA1
        $sig_pos = strlen($archive) - 8 - 20;
        $archive = substr($archive, 0, $sig_pos) . pack("H40", sha1(substr($archive, 0, $sig_pos))) . substr($archive, $sig_pos + 20);
    }
    # Note: The manual talks about SHA256/SHA512 support, but the according flags aren't documented yet. Currently,
    # PHAR uses SHA1 by default, so there's nothing to worry about. You still might have to add those sometime.
}
file_put_contents($filename, $archive);
I've written this ad hoc for my local PHP 5.5.9 version and your example above. The script will work for files created similarly to your example code. The documentation hints at some valid deviations from this format; there are comments at the according lines in the code, and you might have to add something there if you want to support general Phar files.
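As a quick check (assuming both archives were built from the same inputs apart from the timestamps), running the script above over both files should make their hashes agree:
// After normalizing both archives with the script above ($filename set to each):
var_dump(sha1_file('file1.phar') === sha1_file('file2.phar')); // expect bool(true)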

PHP x64 still returns wrong filesize

I am trying to get the file size of files > 2 GB, but PHP seems to have problems regardless of 64- or 32-bit versions. With the 64-bit version of PHP running on a 64-bit processor on a 64-bit OS, filesize() still returns the wrong file size. Before, I have encountered it returning a negative number, or nothing at all; the number changes if the file size changes, but it is not accurate on files > 2 GB. I could understand a number less than the actual size, zero, or even negative numbers if PHP were using 32-bit integers, but from what I read, 64-bit PHP is supposed to support file sizes > 2 GB.
I have also tried using fseek to end and ftell like:
$a = fopen("c:\big.txt", 'r');
fseek($a,0,SEEK_END);
$fs = ftell($a);
fclose($a);
echo $fs;
But that just gives 0...
Why not just use the filesize function?
echo filesize('c:\big.txt');
This function may not work properly on 32-bit systems, but should give you the correct size on a true 64-bit build:
Note: Because PHP's integer type is signed and many platforms use 32bit integers, some filesystem functions may return unexpected results for files which are larger than 2GB.
The manual essentially says "this may be broken on your platform; too bad." Note in particular that PHP builds for Windows used 32-bit integers even on x64 until PHP 7.0, which would explain the behaviour you are seeing.
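A quick way to check what your build actually gives you (standard PHP constants, nothing specific to filesize()):
var_dump(PHP_INT_SIZE); // 4 = 32-bit integers (filesize() unreliable above 2 GB), 8 = 64-bit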
This comment may help: http://us3.php.net/manual/en/function.filesize.php#113457 . It's similar to what you tried, but it seems to assume SEEK_END doesn't work either.
function RealFileSize($fp)
{
    $pos = 0;
    $size = 1073741824; // start probing with 1 GB jumps
    fseek($fp, 0, SEEK_SET);
    while ($size > 1)
    {
        fseek($fp, $size, SEEK_CUR);
        if (fgetc($fp) === false)
        {
            // Jumped past the end: undo the jump and halve the probe size.
            fseek($fp, -$size, SEEK_CUR);
            $size = (int)($size / 2);
        }
        else
        {
            // Still inside the file: keep the jump (minus the byte fgetc consumed).
            fseek($fp, -1, SEEK_CUR);
            $pos += $size;
        }
    }
    // Walk the last few bytes one at a time.
    while (fgetc($fp) !== false) $pos++;
    return $pos;
}
Disclaimer: I didn't try this.

The best way to get the size of a file greater than 2GB in PHP?

I want to check the size of files on local drives on Windows, but the native PHP function filesize() only works when the file is smaller than 2GB. For a file greater than 2GB it returns the wrong number. So, is there another way to get the size of a file greater than 2GB?
Thank you very much!!
You can always use the system's file size command.
For Windows:
Windows command for file size only?
@echo off
echo %~z1
For Linux:
stat -c %s filename
You would run these through PHP's exec command.
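For instance (a hedged sketch; adjust paths and quoting for your environment):
$file = 'D:\big.rar';
if (strtoupper(substr(PHP_OS, 0, 3)) === 'WIN') {
    $size = trim(exec('for %F in ("' . $file . '") do @echo %~zF'));
} else {
    $size = trim(exec('stat -c %s ' . escapeshellarg($file)));
}
echo $size; // keep it as a string: casting to int re-introduces the 2 GB limit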
PHP function to get the file size of a local file with insignificant memory usage:
function get_file_size($file) {
    $fp = @fopen($file, "r");
    @fseek($fp, 0, SEEK_END);
    $filesize = @ftell($fp);
    fclose($fp);
    return $filesize;
}
In the first line of code, $file is opened in read-only mode and attached to the $fp handle.
In the second line, the pointer is moved with fseek() to the end of $file.
Lastly, ftell() returns the byte position of the pointer in $file, which is now the end of it.
The fopen() function is binary-safe and appropriate even for very large files, and the code above is also very fast. Note, however, that ftell() returns a PHP integer, so on 32-bit builds this approach is subject to the same 2 GB limit as filesize().
This function works for any size:
function fsize($file) {
    // filesize will only return the lower 32 bits of
    // the file's size! Make it unsigned.
    $fmod = filesize($file);
    if ($fmod < 0) $fmod += 2.0 * (PHP_INT_MAX + 1);
    // find the upper 32 bits
    $i = 0;
    $myfile = fopen($file, "r");
    // feof has undefined behaviour for big files.
    // after we hit the eof with fseek,
    // fread may not be able to detect the eof,
    // but it also can't read bytes, so use it as an
    // indicator.
    while (strlen(fread($myfile, 1)) === 1) {
        fseek($myfile, PHP_INT_MAX, SEEK_CUR);
        $i++;
    }
    fclose($myfile);
    // $i is a multiplier for PHP_INT_MAX byte blocks.
    // return to the last multiple of 4, as filesize has modulo of 4 GB (lower 32 bits)
    if ($i % 2 == 1) $i--;
    // add the lower 32 bit to our PHP_INT_MAX multiplier
    return ((float)($i) * (PHP_INT_MAX + 1)) + $fmod;
}
Note: this function may be a little slow for files > 2 GB.
(taken from the php.net comments)
If you're running a Linux server, use the system command.
$last_line = system('ls');
is an example of how it is used. If you replace 'ls' with:
du <filename>
it will return the file size in the variable $last_line. For example:
472 myProgram.exe
means it's 472 KB, since du reports in 1 KiB blocks by default (GNU du also accepts -b to report bytes directly). You can use regular expressions to obtain just the number. I haven't used the du command that much, so you'd want to play around with it and look at the output for files > 2 GB.
http://php.net/manual/en/function.system.php
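For example (a hedged sketch; GNU du only, where -b reports the apparent size in bytes and avoids the regex step):
$line = exec('du -b ' . escapeshellarg($file)); // e.g. "3827394560\t/path/file"
$size = strtok($line, "\t");                    // the leading number, as a string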
<?php
$files = `find / -type f -size +2097152`;
?>
(Note that this lists files larger than 1 GiB, since find's default -size unit is 512-byte blocks; it does not report an individual file's size.)
This function returns the size for files > 2GB and is quite fast.
function file_get_size($file) {
    //open file
    $fh = fopen($file, "r");
    //declare some variables
    $size = "0";
    $char = "";
    //set file pointer to 0; I'm a little bit paranoid, you can remove this
    fseek($fh, 0, SEEK_SET);
    //set multiplicator to zero
    $count = 0;
    while (true) {
        //jump 1 MB forward in file
        fseek($fh, 1048576, SEEK_CUR);
        //check if we actually left the file
        if (($char = fgetc($fh)) !== false) {
            //if not, go on
            $count++;
        } else {
            //else jump back where we were before leaving and exit loop
            fseek($fh, -1048576, SEEK_CUR);
            break;
        }
    }
    //we could make $count jumps, so the file is at least $count * 1.000001 MB large
    //1048577 because we jump 1 MB and fgetc goes 1 B forward too
    $size = bcmul("1048577", $count);
    //now count the last few bytes; they're always less than 1048576 so it's quite fast
    $fine = 0;
    while (false !== ($char = fgetc($fh))) {
        $fine++;
    }
    //and add them
    $size = bcadd($size, $fine);
    fclose($fh);
    return $size;
}
To riff on joshhendo's answer, if you're on a Unix-like OS (Linux, OSX, macOS, etc) you can cheat a little using ls:
$fileSize = trim(shell_exec("ls -nl " . escapeshellarg($fullPathToFile) . " | awk '{print $5}'"));
trim() is there to remove the trailing newline. What's left is a string containing the full size of the file on disk, regardless of size or stat cache status, with no human formatting such as commas.
Just be careful where the data in $fullPathToFile comes from...when making system calls you don't want to trust user-supplied data. The escapeshellarg will probably protect you, but better safe than sorry.

Problem reading files greater than 1GB with XMLReader

Is there a maximum file size the XMLReader can handle?
I'm trying to process an XML feed about 3 GB large. There are certainly no PHP errors; the script runs fine and successfully loads to the database after it's been run.
The script also runs fine with smaller test feeds of 1 GB and below. However, when processing larger feeds the script stops reading the XML file after about 1 GB and continues running the rest of the script.
Has anybody experienced a similar problem? and if so how did you work around it?
Thanks in advance.
I had the same kind of problem recently and thought I'd share my experience.
It seems that the problem lies in the way PHP was compiled: whether it was compiled with support for 64-bit file sizes/offsets or only with 32-bit.
With 32 bits you can only address 4 GB of data. You can find a somewhat confusing but good explanation here: http://blog.mayflower.de/archives/131-Handling-large-files-without-PHP.html
I had to split my files with the Perl utility xml_split, which you can find here: http://search.cpan.org/~mirod/XML-Twig/tools/xml_split/xml_split
I used it to split my huge XML file into manageable chunks. The good thing about the tool is that it splits XML files over whole elements. Unfortunately it's not very fast.
I needed to do this only once and it suited my needs, but I wouldn't recommend it for repetitive use. After splitting, I used XMLReader on smaller files of about 1 GB in size.
Splitting up the file will definitely help. Other things to try...
adjust the memory_limit variable in php.ini. http://php.net/manual/en/ini.core.php
rewrite your parser using SAX -- http://php.net/manual/en/book.xml.php . This is a stream-oriented parser that doesn't need to parse the whole tree. Much more memory-efficient but slightly harder to program.
Depending on your OS, there might also be a 2gb limit on the RAM chunk that you can allocate. Very possible if you're running on a 32-bit OS.
It should be noted that PHP in general has a maximum file size it can address. PHP does not have unsigned or guaranteed-64-bit integers, meaning you're capped at 2^31 - 1 (or 2^63 - 1 on 64-bit systems) for integers. This is important because PHP uses an integer for the file pointer (your position in the file as you read through), meaning it cannot process a file larger than 2^31 bytes in size.
However, this should be more than 1 gigabyte. I ran into issues at two gigabytes (as expected, since 2^31 is roughly 2 billion).
I've run into a similar issue when parsing large documents. What I wound up doing is breaking the feed into smaller chunks using filesystem functions, then parsing those smaller chunks... So if you have a bunch of <record> tags that you are parsing, parse them out with string functions as a stream, and when you get a full record in the buffer, parse that using the xml functions... It sucks, but it works quite well (and is very memory efficient, since you only have at most 1 record in memory at any one time)...
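A hedged sketch of that chunking idea (assuming flat <record>...</record> elements with no nesting and no CDATA containing the closing tag; the tag name is illustrative):
$fp = fopen('feed.xml', 'rb');
$buf = '';
while (!feof($fp)) {
    $buf .= fread($fp, 1 << 20);                       // pull in 1 MiB at a time
    while (($end = strpos($buf, '</record>')) !== false) {
        $start = strpos($buf, '<record');
        $xml = simplexml_load_string(substr($buf, $start, $end + 9 - $start));
        // ... process the single $xml record here ...
        $buf = substr($buf, $end + 9);                 // drop the consumed record
    }
}
fclose($fp);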
Do you get any errors with
libxml_use_internal_errors(true);
libxml_clear_errors();
// your parser stuff here....
$r = new XMLReader(...);
// ....
foreach (libxml_get_errors() as $err) {
    printf(". %d %s\n", $err->code, $err->message);
}
when the parser stops prematurely?
Using Windows XP, NTFS as the filesystem, and PHP 5.3.2, there was no problem with this test script:
<?php
define('SOURCEPATH', 'd:/test.xml');

if ( 0 ) {
    build();
}
else {
    echo 'filesize: ', number_format(filesize(SOURCEPATH)), "\n";
    timing('read');
}

function timing($fn) {
    $start = new DateTime();
    echo 'start: ', $start->format('Y-m-d H:i:s'), "\n";
    $fn();
    $end = new DateTime();
    echo 'end: ', $end->format('Y-m-d H:i:s'), "\n"; // was $start->format(), see note below
    echo 'diff: ', $end->diff($start)->format('%I:%S'), "\n";
}

function read() {
    $cnt = 0;
    $r = new XMLReader;
    $r->open(SOURCEPATH);
    while ($r->read()) {
        if (XMLReader::ELEMENT === $r->nodeType) {
            if (0 === ++$cnt % 500000) {
                echo '.';
            }
        }
    }
    echo "\n#elements: ", $cnt, "\n";
}

function build() {
    $fp = fopen(SOURCEPATH, 'wb');
    $s = '<catalogue>';
    //for($i = 0; $i < 500000; $i++) {
    for ($i = 0; $i < 60000000; $i++) {
        $s .= sprintf('<item>%010d</item>', $i);
        if (0 === $i % 100000) {
            fwrite($fp, $s);
            $s = '';
            echo $i / 100000, ' ';
        }
    }
    $s .= '</catalogue>';
    fwrite($fp, $s);
    fflush($fp); // fflush(), not flush(): flush() takes no file handle
    fclose($fp);
}
output:
filesize: 1,380,000,023
start: 2010-08-07 09:43:31
........................................................................................................................
#elements: 60000001
end: 2010-08-07 09:43:31
diff: 07:31
(As you can see, I screwed up the output of the end time in this run (fixed in the listing above), but I don't want to run this script for another 7+ minutes ;-))
Does this also work on your system?
As a side note: the corresponding C# test application took only 41 seconds instead of 7.5 minutes, and my slow hard drive might have been the (or at least a) limiting factor in this case.
filesize: 1.380.000.023
start: 2010-08-07 09:55:24
........................................................................................................................
#elements: 60000001
end: 2010-08-07 09:56:05
diff: 00:41
and the source:
using System;
using System.IO;
using System.Xml;

namespace ConsoleApplication1
{
    class SOTest
    {
        delegate void Foo();
        const string sourcepath = @"d:\test.xml";

        static void timing(Foo bar)
        {
            DateTime dtStart = DateTime.Now;
            System.Console.WriteLine("start: " + dtStart.ToString("yyyy-MM-dd HH:mm:ss"));
            bar();
            DateTime dtEnd = DateTime.Now;
            System.Console.WriteLine("end: " + dtEnd.ToString("yyyy-MM-dd HH:mm:ss"));
            TimeSpan s = dtEnd.Subtract(dtStart);
            System.Console.WriteLine("diff: {0:00}:{1:00}", s.Minutes, s.Seconds);
        }

        static void readTest()
        {
            XmlTextReader reader = new XmlTextReader(sourcepath);
            int cnt = 0;
            while (reader.Read())
            {
                if (XmlNodeType.Element == reader.NodeType)
                {
                    if (0 == ++cnt % 500000)
                    {
                        System.Console.Write('.');
                    }
                }
            }
            System.Console.WriteLine("\n#elements: " + cnt + "\n");
        }

        static void Main()
        {
            FileInfo f = new FileInfo(sourcepath);
            System.Console.WriteLine("filesize: {0:N0}", f.Length);
            timing(readTest);
            return;
        }
    }
}

PHP script to generate a file with random data of given name and size?

Does anyone know of one? I need to test some upload/download scripts and need some really large files generated. I was going to integrate the test utility with my debug script.
To start you could try something like this:
function generate_file($file_name, $size_in_bytes)
{
    $data = str_repeat(rand(0, 9), $size_in_bytes);
    file_put_contents($file_name, $data); // writes $data to a file
}
This creates a file filled with a single random digit (0-9), repeated.
generate_file() from "Marco Demaio" is not memory friendly so I created file_rand().
function file_rand($filename, $filesize) {
    if ($h = fopen($filename, 'w')) {
        if ($filesize > 1024) {
            for ($i = 0; $i < floor($filesize / 1024); $i++) {
                fwrite($h, bin2hex(openssl_random_pseudo_bytes(511)) . PHP_EOL);
            }
            $filesize = $filesize - (1024 * $i);
        }
        $mod = $filesize % 2;
        fwrite($h, bin2hex(openssl_random_pseudo_bytes(($filesize - $mod) / 2)));
        if ($mod) {
            fwrite($h, substr(uniqid(), 0, 1));
        }
        fclose($h);
        umask(0000);
        chmod($filename, 0644);
    }
}
As you can see, line breaks are added every 1024 bytes to avoid problems with functions that are limited to 1024-9999 bytes per line, e.g. fgets() with PHP <= 4.3. It also makes it easier to open the file in a text editor that has the same issue with super-long lines.
Do you really need so much variation in filesize that you need a PHP script? I'd just create test files of varying sizes via the command line and use them in my unit tests. Unless the filesize itself is likely to cause a bug, it would seem you're over-engineering here...
To create a file in Windows:
fsutil file createnew d:\filepath\filename.txt 1048576
In Linux:
dd if=/dev/zero of=filepath/filename.txt bs=10000000 count=1
if is the input file (here /dev/zero, a stream of zero bytes), of is the output file, bs is the block size (with count=1, this is also the final filesize), and count defines how many blocks to copy.
generate_file() from @Marco Demaio caused the warning below when generating a 4GB file:
Warning: str_repeat(): Result is too big, maximum 2147483647 allowed
in /home/xxx/test_suite/handler.php on line 38
I found the function below on php.net and it's working like a charm.
I have tested it up to
17.6 TB (see update below)
in less than 3 seconds. It is fast because it seeks past the end and writes a single byte, so the filesystem allocates a sparse file instead of writing real data.
function CreatFileDummy($file_name, $size = 90294967296) {
    // 32bits 4 294 967 296 bytes MAX Size
    $f = fopen('dummy/' . $file_name, 'wb');
    if ($size >= 1000000000) {
        $z = ($size / 1000000000);
        if (is_float($z)) {
            $z = round($z, 0);
            fseek($f, ($size - ($z * 1000000000) - 1), SEEK_END);
            fwrite($f, "\0");
        }
        while (--$z > -1) {
            fseek($f, 999999999, SEEK_END);
            fwrite($f, "\0");
        }
    }
    else {
        fseek($f, $size - 1, SEEK_END);
        fwrite($f, "\0");
    }
    fclose($f);
    return true;
}
Update:
I was trying to hit 120 TB, 1200 TB and more, but the filesize was limited to 17.6 TB. After some googling I found that this is the max_volume_size of the ReiserFS file system, which was on my server.
Maybe PHP could handle 1200 TB in just a few seconds as well. :)
Why not have a script that streams out random data? The script can take parameters for file size, type etc.
This way you can simulate many scenarios, for example bandwidth throttling, premature file end etc. etc.
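Something along these lines, perhaps (a hedged sketch; the size parameter and chunk size are my own choices):
// gen.php?size=3221225472 would stream 3 GB of pseudo-random bytes.
$size = isset($_GET['size']) ? floor((float) $_GET['size']) : 1048576.0;
header('Content-Type: application/octet-stream');
header('Content-Length: ' . sprintf('%.0f', $size));
while ($size > 0) {
    $n = (int) min(1 << 16, $size);  // 64 KiB per write
    echo random_bytes($n);           // PHP >= 7; openssl_random_pseudo_bytes() on PHP 5
    $size -= $n;
    flush();
}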
Does the file really need to be random? If so, just read from /dev/urandom on a Linux system:
dd if=/dev/urandom of=yourfile bs=4096 count=1024 # for a 4MB file.
If it doesn't really need to be random, just find some files you have lying around that are the appropriate size, or (alternatively) use tar and make some tarballs of various sizes.
There's no reason this needs to be done in a PHP script: ordinary shell tools are perfectly sufficient to generate the files you need.
If you want really random data you might want to try this:
$data = '';
while ($byteSize-- > 0) {
    $data .= chr(rand(0, 255));
}
Might take a while, though, if you want large file sizes (as with any random data).
I would suggest using a library like Faker to generate test data.
I took the answer of mgutt and shortened it a bit. Also, his answer has a little bug which I wanted to avoid.
function createRandomFile(string $filename, int $filesize): void
{
    $h = fopen($filename, 'w');
    if (!$h) return;
    for ($i = 0; $i < intdiv($filesize, 1024); $i++) {
        fwrite($h, bin2hex(random_bytes(511)) . PHP_EOL);
    }
    fwrite($h, substr(bin2hex(random_bytes(512)), 0, $filesize % 1024));
    fclose($h);
    chmod($filename, 0644);
}
Note: This works only with PHP >= 7. If you really want to run it on lower versions, use openssl_random_pseudo_bytes instead of random_bytes and floor($filesize / 1024) instead of intdiv($filesize, 1024).
