I want to remove all functions ending with _example from my code. I am processing the code using token_get_all. The code I currently have, shown below, changes the opening tags and strips out the comments.
foreach ($files as $file) {
$content = file_get_contents($file);
$tokens = token_get_all($content);
$output = '';
foreach($tokens as $token) {
if (is_array($token)) {
list($index, $code, $line) = $token;
switch($index) {
case T_OPEN_TAG_WITH_ECHO:
$output .= '<?php echo';
break;
case T_COMMENT:
case T_DOC_COMMENT:
$output .= '';
break;
case T_FUNCTION:
// ???
default:
$output .= $code;
break;
}
} else {
$output .= $token;
}
}
file_put_contents($file, $output);
}
I just can't figure out how I can modify it to strip entire functions based on their names.
OK, I wrote new code for your problem:
First, it finds every function and its declaration in your source code.
Second, it checks whether the function name ends with "_example" and, if so, removes its code.
$source = file_get_contents($filename); // Obtain source from filename $filename
$tokens = token_get_all($source); // Get php tokens
// Init variables
$in_fnc = false;
$fnc_name = null;
$functions = array();
// Loop $tokens to locate functions
foreach ($tokens as $token){
$t_array = is_array($token);
if ($t_array){
list($t, $c) = $token;
if (!$in_fnc && $t == T_FUNCTION){ // "function": we register one function start
$in_fnc = true;
$fnc_name = null;
$nb_opened = $nb_closed = 0;
continue;
}
else if ($in_fnc && null === $fnc_name && $t == T_STRING){ // we check and store the name of function if exists
if (preg_match('`function\s+'.preg_quote($c).'\s*\(`sU', $source)){ // "function function_name ("
$fnc_name = $c;
continue;
}
}
}
else {
$c = $token; // single char: content is $token
}
if ($in_fnc && null !== $fnc_name){ // we are in declaration of function
$nb_closed += substr_count($c, '}'); // we count number of } to extract later complete code of this function
if (!$t_array){
$nb_opened += substr_count($c, '{') - substr_count($c, '}'); // we count number of { not closed (num "{" - num "}")
if ($nb_closed > 0 && $nb_opened == 0){ // once "}" parsed and all "{" are closed by "}"
if (preg_match('`function\s+'.preg_quote($fnc_name).'\s*\((.*\}){'.$nb_closed.'}`sU', $source, $m)){
$functions[$fnc_name] = $m[0]; // we store entire code of this function in $functions
}
$in_fnc = false; // we declare that function is finished
}
}
}
}
// Ok, now $functions contains all functions found in $filename
$source_changed = false; // Prevents re-write $filename with the original content
foreach ($functions as $f_name => $f_code){
if (preg_match('`_example$`', $f_name)){
$source = str_replace($f_code, '', $source); // remove the function if its name ends with "_example"
$source_changed = true;
}
}
if ($source_changed){
file_put_contents($filename, $source); // replace $filename file contents
}
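A different route, sketched here rather than taken from the answer above: instead of re-matching the source with regular expressions, walk the token stream once and skip every token from a matching function keyword through the closing brace of its body. The helper name strip_example_functions is hypothetical. The sketch assumes plain named functions (it does not handle by-reference declarations such as function &name) and it leaves the surrounding whitespace in place.

```php
<?php
// Hypothetical helper: returns $source without functions named *_example.
function strip_example_functions(string $source): string
{
    $tokens = token_get_all($source);
    $count  = count($tokens);
    $output = '';

    for ($i = 0; $i < $count; $i++) {
        $token = $tokens[$i];

        if (is_array($token) && $token[0] === T_FUNCTION) {
            // Step over whitespace to reach the token after "function".
            $j = $i + 1;
            while ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
                $j++;
            }
            // A T_STRING here is the function name; closures have "(" instead.
            $named = $j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_STRING;

            if ($named && substr($tokens[$j][1], -8) === '_example') {
                $depth = 0;
                for ($k = $j + 1; $k < $count; $k++) {
                    $t = $tokens[$k];
                    // "{" in code, plus "{$" and "${" inside interpolated strings.
                    $opens = $t === '{' || (is_array($t)
                        && ($t[0] === T_CURLY_OPEN || $t[0] === T_DOLLAR_OPEN_CURLY_BRACES));
                    if ($opens) {
                        $depth++;
                    } elseif ($t === '}') {
                        if (--$depth === 0) {
                            break; // matching brace of the function body
                        }
                    } elseif ($t === ';' && $depth === 0) {
                        break; // declaration without a body
                    }
                }
                $i = $k; // resume after the skipped function
                continue;
            }
        }

        $output .= is_array($token) ? $token[1] : $token;
    }

    return $output;
}
```

Because braces are counted on tokens rather than on raw text, a brace inside a string literal or comment does not throw the count off, which is the main weakness of counting characters with substr_count.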
I would like to rename all variables within the file to random name.
For example this:
$example = "some $string";
function ($variable2) {
echo $variable2;
}
foreach ($variable3 as $key => $var3val) {
echo $var3val . "somestring";
}
Will become this:
$frk43r = "some $string";
function ($izi34ee) {
echo $izi34ee;
}
foreach ($erew7er as $iure7 => $er3k2) {
echo $er3k2 . "somestring";
}
It doesn't look like an easy task, so any suggestions would be helpful.
I would use token_get_all to parse the document and map a registered random string replacement on all interesting tokens.
To obfuscate all the variable names, replace T_VARIABLE in one pass, ignoring all the superglobals.
Additionally, for the bounty's requisite function names, replace all the T_FUNCTION declarations in the first pass. Then a second pass is needed to replace all the T_STRING invocations because PHP allows you to use a function before it's declared.
For this example, I generated all lowercase letters to avoid case-insensitive clashes to function names, but you can obviously use whatever characters you want and add an extra conditional check for increased complexity. Just remember that they can't start with a number.
I also registered all the internal function names with get_defined_functions to protect against the extremely off-chance possibility that a randomly generated string would match one of those function names. Keep in mind this won't protect against special extensions installed on the machine running the obfuscated script that are not present on the server obfuscating the script. The chances of that are astronomical, but you can always ratchet up the length of the randomly generated string to diminish those odds even more.
<?php
$tokens = token_get_all(file_get_contents('example.php'));
$globals = array(
'$GLOBALS',
'$_SERVER',
'$_GET',
'$_POST',
'$_FILES',
'$_COOKIE',
'$_SESSION',
'$_REQUEST',
'$_ENV',
);
// prevent name clashes with randomly generated strings and native functions
$registry = get_defined_functions();
$registry = $registry['internal'];
// first pass to change all the variable names and function name declarations
foreach($tokens as $key => $element){
// make sure it's an interesting token
if(!is_array($element)){
continue;
}
switch ($element[0]) {
case T_FUNCTION:
$prefix = '';
// this jumps over the whitespace to get the function name
$index = $key + 2;
break;
case T_VARIABLE:
// ignore the superglobals
if(in_array($element[1], $globals)){
continue 2;
}
$prefix = '$';
$index = $key;
break;
default:
continue 2;
}
// check to see if we've already registered it
if(!isset($registry[$tokens[$index][1]])){
// make sure our random string hasn't already been generated
// or just so crazily happens to be the same name as an internal function
do {
$replacement = $prefix.random_str(16);
} while(in_array($replacement, $registry));
// map the original and register the replacement
$registry[$tokens[$index][1]] = $replacement;
}
// rename the variable
$tokens[$index][1] = $registry[$tokens[$index][1]];
}
// second pass to rename all the function invocations
$tokens = array_map(function($element) use ($registry){
// check to see if it's a function identifier
if(is_array($element) && $element[0] === T_STRING){
// make sure it's one of our registered function names
if(isset($registry[$element[1]])){
// rename the variable
$element[1] = $registry[$element[1]];
}
}
return $element;
},$tokens);
// dump the tokens back out to rebuild the page with obfuscated names
foreach($tokens as $token){
echo $token[1] ?? $token;
}
/**
* https://stackoverflow.com/a/31107425/4233593
* Generate a random string, using a cryptographically secure
* pseudorandom number generator (random_int)
*
* For PHP 7, random_int is a PHP core function
* For PHP 5.x, depends on https://github.com/paragonie/random_compat
*
* @param int $length How many characters do we want?
* @param string $keyspace A string of all possible characters
* to select from
* @return string
*/
function random_str($length, $keyspace = 'abcdefghijklmnopqrstuvwxyz')
{
$str = '';
$max = mb_strlen($keyspace, '8bit') - 1;
for ($i = 0; $i < $length; ++$i) {
$str .= $keyspace[random_int(0, $max)];
}
return $str;
}
Given this example.php
<?php
$example = 'some $string';
if(isset($_POST['something'])){
echo $_POST['something'];
}
function exampleFunction($variable2){
echo $variable2;
}
exampleFunction($example);
$variable3 = array('example','another');
foreach($variable3 as $key => $var3val){
echo $var3val."somestring";
}
Produces this output:
<?php
$vsodjbobqokkaabv = 'some $string';
if(isset($_POST['something'])){
echo $_POST['something'];
}
function gkfadicwputpvroj($zwnjrxupprkbudlr){
echo $zwnjrxupprkbudlr;
}
gkfadicwputpvroj($vsodjbobqokkaabv);
$vfjzehtvmzzurxor = array('example','another');
foreach($vfjzehtvmzzurxor as $riuqtlravsenpspv => $mkdgtnpxaqziqkgo){
echo $mkdgtnpxaqziqkgo."somestring";
}
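The variable-renaming pass on its own can be condensed as follows. This is a minimal sketch, not the answer's code: rename_variables and the $makeName callback are hypothetical names, and the callback is injected so that the naming scheme (random or not) stays separate from the token walk. It assumes static properties such as Foo::$bar and variable variables do not occur, since both would need extra handling.

```php
<?php
// Hypothetical helper: renames every T_VARIABLE except $this and the superglobals.
// $makeName receives the number of names issued so far and returns a new bare name.
function rename_variables(string $source, callable $makeName): string
{
    $skip = ['$this', '$GLOBALS', '$_SERVER', '$_GET', '$_POST', '$_FILES',
             '$_COOKIE', '$_SESSION', '$_REQUEST', '$_ENV'];
    $map = []; // original name => replacement
    $out = '';

    foreach (token_get_all($source) as $token) {
        if (!is_array($token)) {
            $out .= $token; // single-character tokens such as ";" or "("
            continue;
        }
        $text = $token[1];
        if ($token[0] === T_VARIABLE && !in_array($text, $skip, true)) {
            if (!isset($map[$text])) {
                $map[$text] = '$' . $makeName(count($map));
            }
            $text = $map[$text];
        }
        $out .= $text;
    }

    return $out;
}
```

A single-quoted string such as '$a' is one T_CONSTANT_ENCAPSED_STRING token, so it is left alone, while a variable interpolated into a double-quoted string is a real T_VARIABLE token and gets renamed.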
EDIT 4.12.2016 - please see below! (after first answer)
I've just tried to find a solution which can handle both cases: your given case and this example from Elias Van Ootegem.
Of course it should be improved, as mentioned in one of my comments, but it works for your example:
$source = file_get_contents("source.php");
// this should get all Variables BUT isn't right at the moment if a variable is followed by an ' or " !!
preg_match_all('/\$[\$a-zA-Z0-9\[\'.*\'\]]*/', $source, $matches);
$matches = array_unique($matches[0]);
// this array saves all old and new variable names to track all replacements
$replacements = array();
$obfuscated_source = $source;
foreach($matches as $varName)
{
do // generates random string and tests if it already is used by an earlier replaced variable name
{
// generate a random string -> should be improved.
$randomName = substr(md5(rand()), 0, 7);
// ensure that first part of variable name is a character.
// there could also be a random character...
$randomName = "a" . $randomName;
}
while(in_array("$" . $randomName, $replacements));
if(substr($varName, 0,8) == '$GLOBALS')
{
// this handles the case of GLOBALS variables
$delimiter = substr($varName, 9, 1);
if($delimiter == '$') $delimiter = '';
$newName = '$GLOBALS[' .$delimiter . $randomName . $delimiter . ']';
}
else if(substr($varName, 0,8) == '$_SERVER')
{
// this handles the case of SERVER variables
$delimiter = substr($varName, 9, 1);
if($delimiter == '$') $delimiter = '';
$newName = '$_SERVER[' .$delimiter . $randomName . $delimiter . ']';
}
else if(substr($varName, 0,5) == '$_GET')
{
// this handles the case of GET variables
$delimiter = substr($varName, 6, 1);
if($delimiter == '$') $delimiter = '';
$newName = '$_GET[' .$delimiter . $randomName . $delimiter . ']';
}
else if(substr($varName, 0,6) == '$_POST')
{
// this handles the case of POST variables
$delimiter = substr($varName, 7, 1);
if($delimiter == '$') $delimiter = '';
$newName = '$_POST[' .$delimiter . $randomName . $delimiter . ']';
}
else if(substr($varName, 0,7) == '$_FILES')
{
// this handles the case of FILES variables
$delimiter = substr($varName, 8, 1);
if($delimiter == '$') $delimiter = '';
$newName = '$_FILES[' .$delimiter . $randomName . $delimiter . ']';
}
else if(substr($varName, 0,9) == '$_REQUEST')
{
// this handles the case of REQUEST variables
$delimiter = substr($varName, 10, 1);
if($delimiter == '$') $delimiter = '';
$newName = '$_REQUEST[' .$delimiter . $randomName . $delimiter . ']';
}
else if(substr($varName, 0,9) == '$_SESSION')
{
// this handles the case of SESSION variables
$delimiter = substr($varName, 10, 1);
if($delimiter == '$') $delimiter = '';
$newName = '$_SESSION[' .$delimiter . $randomName . $delimiter . ']';
}
else if(substr($varName, 0,5) == '$_ENV')
{
// this handles the case of ENV variables
$delimiter = substr($varName, 6, 1);
if($delimiter == '$') $delimiter = '';
$newName = '$_ENV[' .$delimiter . $randomName . $delimiter . ']';
}
else if(substr($varName, 0,8) == '$_COOKIE')
{
// this handles the case of COOKIE variables
$delimiter = substr($varName, 9, 1);
if($delimiter == '$') $delimiter = '';
$newName = '$_COOKIE[' .$delimiter . $randomName . $delimiter . ']';
}
else if(substr($varName, 1, 1) == '$')
{
// this handles the case of variable variables
$name = substr($varName, 2, strlen($varName)-2);
$pattern = '/(?=\$)\$' . $name . '.*;/';
preg_match_all($pattern, $source, $varDeclaration);
$varDeclaration = $varDeclaration[0][0];
preg_match('/\s*=\s*["\'](?:\\.|[^"\\]])*["\']/', $varDeclaration, $varContent);
$varContent = $varContent[0];
preg_match('/["\'](?:\\.|[^"\\]])*["\']/', $varContent, $varContentDetail);
$varContentDetail = substr($varContentDetail[0], 1, strlen($varContentDetail[0])-2);
$replacementDetail = str_replace($varContent, substr($replacements["$" . $varContentDetail], 1, strlen($replacements["$" . $varContentDetail])-1), $varContent);
$explode = explode($varContentDetail, $varContent);
$replacement = $explode[0] . $replacementDetail . $explode[1];
$obfuscated_source = str_replace($varContent, $replacement, $obfuscated_source);
}
else
{
$newName = '$' . $randomName;
}
$obfuscated_source = str_replace($varName, $newName, $obfuscated_source);
$replacements[$varName] = $newName;
}
// this part may be useful to change hard-coded returns of functions.
// it changes all remaining words in the document which are like the previous changed variable names to the new variable names
// attention: if the variables in the document have common names it could also change text you don't like to change...
foreach($replacements as $before => $after)
{
$name_before = str_replace("$", "", $before);
$name_after = str_replace("$", "", $after);
$obfuscated_source = str_replace($name_before, $name_after, $obfuscated_source);
}
// here you can place code to write back the obfuscated code to the same or to a new file, e.g:
$file = fopen("result.php", "w");
fwrite($file, $obfuscated_source);
fclose($file);
EDIT there are still some cases left which require some effort.
At least some kinds of variable declarations may not be handled correctly!
Also the first regex is not perfect, my current status is like:
'/\$\$?[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*/'
but this does not get the index values of predefined variables... But I think it has some potential. If you use it like here you get all 18 involved variables... The next step could be to determine whether a [..] follows the variable name. If so, any predefined variable AND cases like $g = $GLOBALS; and any further use of such a $g would be covered...
EDIT 4.12.2016
Due to LSerni and several comments on both the original question and some solutions, I also wrote a parsing solution, which you can find below.
It handles an extended example file which was my aim. If you find any other challenge, please tell me!
new solution:
$variable_names_before = array();
$variable_names_after = array();
$function_names_before = array();
$function_names_after = array();
$forbidden_variables = array(
'$GLOBALS',
'$_SERVER',
'$_GET',
'$_POST',
'$_FILES',
'$_COOKIE',
'$_SESSION',
'$_REQUEST',
'$_ENV',
);
$forbidden_functions = array(
'unlink'
);
// read file
$data = file_get_contents("example.php");
$lock = false;
$lock_quote = '';
for($i = 0; $i < strlen($data); $i++)
{
// check if there are quotation marks
if(($data[$i] == "'" || $data[$i] == '"'))
{
// if first quote
if($lock_quote == '')
{
// remember quotation mark
$lock_quote = $data[$i];
$lock = true;
}
else if($data[$i] == $lock_quote)
{
$lock_quote = '';
$lock = false;
}
}
// detect variables
if(!$lock && $data[$i] == '$')
{
$start = $i;
// detect variable variable names
if($data[$i+1] == '$')
{
$start++;
// increment $i to avoid second detection of variable variable as "normal variable"
$i++;
}
$end = 1;
// find end of variable name
while(ctype_alpha($data[$start+$end]) || is_numeric($data[$start+$end]) || $data[$start+$end] == "_")
{
$end++;
}
// extract variable name
$variable_name = substr($data, $start, $end);
if($variable_name == '$')
{
continue;
}
// check if variable name is allowed
if(in_array($variable_name, $forbidden_variables))
{
// forbidden variable detected, do whatever you want!
}
else
{
// check if variable name already has been detected
if(!in_array($variable_name, $variable_names_before))
{
$variable_names_before[] = $variable_name;
// generate random name for variable
$new_variable_name = "";
do
{
$new_variable_name = random_str(rand(5, 20));
}
while(in_array($new_variable_name, $variable_names_after));
$variable_names_after[] = $new_variable_name;
}
//var_dump("variable: " . $variable_name);
}
}
// detect function-definitions
// the third condition checks if the symbol before 'function' is neither a character nor a number
if(!$lock && strtolower(substr($data, $i, 8)) == 'function' && (!ctype_alpha($data[$i-1]) && !is_numeric($data[$i-1])))
{
// find end of function name
$end = strpos($data, '(', $i);
// extract function name and remove possible spaces on the right side
$function_name = rtrim(substr($data, ($i+9), $end-$i-9));
// check if function name is allowed
if(in_array($function_name, $forbidden_functions))
{
// forbidden function detected, do whatever you want!
}
else
{
// check if function name already has been detected
if(!in_array($function_name, $function_names_before))
{
$function_names_before[] = $function_name;
// generate random name for function
$new_function_name = "";
do
{
$new_function_name = random_str(rand(5, 20));
}
while(in_array($new_function_name, $function_names_after));
$function_names_after[] = $new_function_name;
}
//var_dump("function: " . $function_name);
}
}
}
// this array contains prefixes and suffixes for string literals which
// may contain variable names.
// if string literals as a return of functions should not be changed
// remove the last two inner arrays of $possible_pre_suffixes
// this will enable correct handling of situations like
// - $func = 'getNewName'; echo $func();
// but it will break variable variable names like
// - ${getNewName()}
$possible_pre_suffixes = array(
array(
"prefix" => "= '",
"suffix" => "'"
),
array(
"prefix" => '= "',
"suffix" => '"'
),
array(
"prefix" => "='",
"suffix" => "'"
),
array(
"prefix" => '="',
"suffix" => '"'
),
array(
"prefix" => 'rn "', // return " ";
"suffix" => '"'
),
array(
"prefix" => "rn '", // return ' ';
"suffix" => "'"
)
);
// replace variable names
for($i = 0; $i < count($variable_names_before); $i++)
{
$data = str_replace($variable_names_before[$i], '$' . $variable_names_after[$i], $data);
// try to find strings which equals variable names
// this is an attempt to handle situations like:
// $a = "123";
// $b = "a"; <--
// $$b = "321"; <--
// and also
// function getName() { return "a"; }
// echo ${getName()};
$name = substr($variable_names_before[$i], 1);
for($j = 0; $j < count($possible_pre_suffixes); $j++)
{
$data = str_replace($possible_pre_suffixes[$j]["prefix"] . $name . $possible_pre_suffixes[$j]["suffix"],
$possible_pre_suffixes[$j]["prefix"] . $variable_names_after[$i] . $possible_pre_suffixes[$j]["suffix"],
$data);
}
}
// replace function names
for($i = 0; $i < count($function_names_before); $i++)
{
$data = str_replace($function_names_before[$i], $function_names_after[$i], $data);
}
/**
* https://stackoverflow.com/a/31107425/4233593
* Generate a random string, using a cryptographically secure
* pseudorandom number generator (random_int)
*
* For PHP 7, random_int is a PHP core function
* For PHP 5.x, depends on https://github.com/paragonie/random_compat
*
* @param int $length How many characters do we want?
* @param string $keyspace A string of all possible characters
* to select from
* @return string
*/
function random_str($length, $keyspace = 'abcdefghijklmnopqrstuvwxyz')
{
$str = '';
$max = mb_strlen($keyspace, '8bit') - 1;
for ($i = 0; $i < $length; ++$i)
{
$str .= $keyspace[random_int(0, $max)];
}
return $str;
}
example input file:
$example = 'some $string';
$test = '$abc 123' . $example . '$hello here I "$am"';
if(isset($_POST['something'])){
echo $_POST['something'];
}
function exampleFunction($variable2){
echo $variable2;
}
exampleFunction($example);
$variable3 = array('example','another');
foreach($variable3 as $key => $var3val){
echo $var3val."somestring";
}
$test = "example";
$$test = 'hello';
exampleFunction($example);
exampleFunction($$test);
function getNewName()
{
return "test";
}
exampleFunction(${getNewName()});
output of my function:
$fesvffyn = 'some $string';
$zimskk = '$abc 123' . $fesvffyn . '$hello here I "$am"';
if(isset($_POST['something'])){
echo $_POST['something'];
}
function kainbtqpybl($yxjvlvmyfskwqcevo){
echo $yxjvlvmyfskwqcevo;
}
kainbtqpybl($fesvffyn);
$lmiphctfgjfdnonjpia = array('example','another');
foreach($lmiphctfgjfdnonjpia as $qypdfcpcla => $gwlpcpnvnhbvbyflr){
echo $gwlpcpnvnhbvbyflr."somestring";
}
$zimskk = "fesvffyn";
$$zimskk = 'hello';
kainbtqpybl($fesvffyn);
kainbtqpybl($$zimskk);
function tauevjkk()
{
return "zimskk";
}
kainbtqpybl(${tauevjkk()});
I know there are some cases left, where you can find an issue with variable variable names, but then you may have to expand the $possible_pre_suffixes array...
Maybe you also want to differentiate between global variables and "forbidden variables"...
Well, you can try to write your own, but the number of strange things you have to handle is likely to overwhelm you, and I presume you are more interested in using such a tool than in writing and maintaining one yourself. (There are lots of broken PHP obfuscators out there, where people have tried to do this.)
If you want one that is reliable, you do have to base it on a parser, or your tool will mis-parse the text and handle it wrong (this is the first "strange thing"). Regexes simply won't do the trick.
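To make the point about regexes concrete, here is a small comparison, offered as an illustration only. The two function names are hypothetical, and PHP's built-in tokenizer stands in for "a parser" here, which is weaker than the full parser this answer has in mind. The regex is the identifier pattern quoted earlier in this thread.

```php
<?php
// Hypothetical helper: collects everything that merely looks like a variable.
function variables_by_regex(string $source): array
{
    preg_match_all('/\$[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*/', $source, $m);
    return array_values(array_unique($m[0]));
}

// Hypothetical helper: collects only what the tokenizer classifies as T_VARIABLE.
function variables_by_tokens(string $source): array
{
    $names = [];
    foreach (token_get_all($source) as $token) {
        if (is_array($token) && $token[0] === T_VARIABLE) {
            $names[] = $token[1];
        }
    }
    return array_values(array_unique($names));
}
```

For a source such as $price = 'costs $amount'; the regex version reports both $price and $amount, while the token version reports only $price, because the single-quoted literal is one string token and not a variable reference.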
The Semantic Designs PHP Obfuscator (from my company), taken out of the box, took this slightly modified version of Elias Van Ootegem's example:
<?php
//non-obfuscated
function getVarname()
{//the return value has to change
return (('foobar'));
}
$format = '%s = %d';
$foobar = 123;
$variableVar = (('format'));//you need to change this string
printf($$variableVar, $variableVar = getVarname(), $$variableVar);
echo PHP_EOL;
var_dump($GLOBALS[(('foobar'))]);//note the key == the var
and produced this:
<?php function l0() { return (('O0')); } $l1="%\163 = %d"; $O1=0173; $l2=(('O2')); printf($$l2,$l2=l0(),$$l2); echo PHP_EOL; var_dump($GLOBALS[(('O0'))]);
The key issue in Elias's example are strings that actually contain variable names. In general, there is no way for a tool to know that "x" is a variable name, and not just the string containing the letter x. But, the programmers know. We insist that such strings be marked [by enclosing them in ((..)) ] and then the obfuscator can obfuscate their content properly.
Sometimes the string contains variable names and other things; in that case, the programmer has to break up the string into "variable name" content and everything else. This is pretty easy to do in practice, and is the "slight change" I made to his supplied code.
Other strings, not being marked, are left alone. You only have to do this
once to the source file. [You can say this is cheating, but no other practical answer will work; the tool cannot know reliably. Halting Problem, if you insist.].
The next thing to get right is reliable obfuscation across multiple files. You can't do this one file at a time. This obfuscator has been used on very big PHP applications (thousands of PHP script files).
Yes, it does use a full PHP parser. Not nikic's.
I ended up with this simple code:
$tokens = token_get_all($src);
$skip = array('$this','$_GET','$_POST','$_REQUEST','$_SERVER','$_COOKIE','$_SESSION');
function renameVars($tokens,$content,$skip){
$vars = array();
foreach($tokens as $token) {
if ($token[0] == T_VARIABLE && !in_array($token[1],$skip))
$vars[generateRandomString()]= $token[1];
}
$vars = array_unique($vars);
$vars2 = $vars;
foreach($vars as $new => $old){
foreach($vars2 as $var){
if($old!=$var && strpos($var,$old)!==false){
continue 2;
}
}
$content = str_replace($old,'${"'.$new.'"}',$content);
//function(${"example"}) will trigger error. This is why we need this:
$content = str_replace('(${"'.$new.'"}','($'.$new,$content);
$content = str_replace(',${"'.$new.'"}',',$'.$new,$content);
$chars = array('a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z');
//for things like function deleteExpired(Varien_Event_Observer $fz5eDWIt1si), Exception,
foreach($chars as $char){
$content = str_replace($char.' ${"'.$new.'"}',$char.' $'.$new,$content);
}
}
return $content;
}
It works for me because the code is simple. I guess it won't work in all scenarios.
I have it working now but there may still be some vulnerabilities because PHP allows functions names and variables names to be generated dynamically.
The first function replaces $_SESSION, $_POST etc. with functions:
function replaceArrayVariable($str, $arr, $function)
{
$str = str_replace($arr, $function, $str);
$lastPos = 0;
while (($lastPos = strpos($str, $function, $lastPos)) !== false)
{
$lastPos = $lastPos + strlen($function);
$currentPos = $lastPos;
$openSqrBrackets = 1;
while ($openSqrBrackets > 0)
{
if ($str[$currentPos] === '[')
$openSqrBrackets++;
elseif ($str[$currentPos] === ']')
$openSqrBrackets--;
$currentPos++;
}
$str[$currentPos - 1] = ')';
}
return $str;
}
The second renames functions ignoring whitelisted keywords:
function renameFunctions($str)
{
preg_match_all('/[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*/', $str, $matches, PREG_OFFSET_CAPTURE);
$totalMatches = count($matches[0]);
$offset = 0;
for ($i = 0; $i < $totalMatches; $i++)
{
$matchIndex = $matches[0][$i][1] + $offset;
if ($matchIndex === 0 || $str[$matchIndex - 1] !== '$')
{
$keyword = $matches[0][$i][0];
if ($keyword !== 'true' && $keyword !== 'false' && $keyword !== 'if' && $keyword !== 'else' && $keyword !== 'getPost' && $keyword !== 'getSession')
{
$str = substr_replace($str, 'qq', $matchIndex, 0);
$offset += 2;
}
}
}
return $str;
}
Then to rename functions, variables, and non-whitelisted keywords, I use this code:
$str = replaceArrayVariable($str, '$_POST[', 'getPost(');
$str = replaceArrayVariable($str, '$_SESSION[', 'getSession(');
preg_match_all('/\'(?:\\\\.|[^\\\\\'])*\'|.[^\']+/', $str, $matches);
$str = '';
foreach ($matches[0] as $match)
{
if ($match[0] != "'")
{
$match = preg_replace('!\s+!', ' ', $match);
$match = renameFunctions($match);
$match = str_replace('$', '$qq', $match);
}
$str .= $match;
}
I have a folder that contains files named standard_xx.jpg (xx being a number)
I would like to find the highest number so that I can get the filename ready to rename the next file being uploaded.
Eg. if the highest number is standard_12.jpg
$newfilename = standard_13.jpg
I have created a method to do it by just exploding the file name but it isn't very elegant
$files = glob($uploaddir.'test-xrays-del/standard_*.JPG');
$maxfile = $files[count($files)-1];
$explode = explode('_',$maxfile);
$filename = $explode[1];
$explode2 = explode('.',$filename);
$number = $explode2[0];
$newnumber = $number + 1;
$newfile = 'test-xrays-del/standard_'.$newnumber.'.JPG';
echo $newfile;
Is there a more efficient or elegant way of doing this?
I'd do it like this myself:
<?php
$files = glob($uploaddir.'test-xrays-del/standard_*.JPG');
natsort($files);
preg_match('!standard_(\d+)!', end($files), $matches);
$newfile = 'standard_' . ($matches[1] + 1) . '.JPG';
echo $newfile;
You can make use of sscanf:
$success = sscanf(basename($maxfile), 'standard_%d.JPG', $number);
It lets you not only pick out the number (and only the number) but also check whether this worked ($success holds the number of values assigned, so it should be 1). Note that glob() returns full paths, hence the basename() call.
Additionally, you could take a look at natsort to sort the array you get back, so that the highest natural number comes last.
A complete code-example making use of these:
$mask = 'standard_%s.JPG';
$prefix = 'test-xrays-del';
$glob = sprintf("%s%s/%s", $uploaddir, $prefix, sprintf($mask, '*'));
$files = glob($glob);
if (!$files) {
throw new RuntimeException('No files found or error with ' . $glob);
}
natsort($files);
$maxfile = end($files);
// glob() returns full paths, so match the mask against the file name only
$success = sscanf(basename($maxfile), sprintf($mask, '%d'), $number);
if ($success !== 1) {
throw new RuntimeException('Unable to obtain number from: ' . $maxfile);
}
$newnumber = $number + 1;
$newfile = sprintf("%s/%s", $prefix, sprintf($mask, $newnumber));
Try with:
$files = glob($uploaddir.'test-xrays-del/standard_*.JPG');
natsort($files);
$highest = array_pop($files);
Then get its number with a regex and increase the value.
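The regex step could look like the sketch below. It is an assumption-laden variant, not this answer's code: next_standard_name is a hypothetical helper, it takes the list of paths as an argument instead of calling glob() itself, and it tracks the maximum number directly rather than relying on natsort and array_pop.

```php
<?php
// Hypothetical helper: given paths like ".../standard_12.JPG", build the next file name.
function next_standard_name(array $files): string
{
    $max = 0;
    foreach ($files as $file) {
        // Match on the file name only, so underscores or digits in the directory are ignored.
        if (preg_match('/^standard_(\d+)\.jpe?g$/i', basename($file), $m)) {
            $max = max($max, (int) $m[1]);
        }
    }
    return 'standard_' . ($max + 1) . '.JPG';
}
```

With an empty directory this yields standard_1.JPG. Two uploads arriving at the same moment could still compute the same name, so the result is a candidate rather than a guarantee.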
Something like this:
function getMaxFileID($path) {
$files = new DirectoryIterator($path);
$filtered = new RegexIterator($files, '/^.+\.jpg$/i');
$maxFileID = 0;
foreach ($filtered as $fileInfo) {
$thisFileID = (int)preg_replace('/.*?_/', '', $fileInfo->getFilename());
if($thisFileID > $maxFileID) { $maxFileID = $thisFileID;}
}
return $maxFileID;
}
I have a PHP file which contains only one class. How can I find out which class is in it, knowing only the filename? I know I can do something with regexp matching, but is there a standard PHP way? (The file is already included in the page that is trying to figure out the class name.)
There are multiple possible solutions to this problem, each with its advantages and disadvantages. Here they are; it's up to you to decide which one you want.
Tokenizer
This method uses the tokenizer and reads parts of the file until it finds a class definition.
Advantages
Do not have to parse the file entirely
Fast (reads the beginning of the file only)
Little to no chance of false positives
Disadvantages
Longest solution
Code
$fp = fopen($file, 'r');
$class = $buffer = '';
$i = 0;
while (!$class) {
if (feof($fp)) break;
$buffer .= fread($fp, 512);
$tokens = token_get_all($buffer);
if (strpos($buffer, '{') === false) continue;
for (;$i<count($tokens);$i++) {
if ($tokens[$i][0] === T_CLASS) {
for ($j=$i+1;$j<count($tokens);$j++) {
if ($tokens[$j] === '{') {
$class = $tokens[$i+2][1];
}
}
}
}
}
Regular expressions
Use regular expressions to parse the beginning of the file, until a class definition is found.
Advantages
Do not have to parse the file entirely
Fast (reads the beginning of the file only)
Disadvantages
High chances of false positives (e.g.: echo "class Foo {";)
Code
$fp = fopen($file, 'r');
$class = $buffer = '';
$i = 0;
while (!$class) {
if (feof($fp)) break;
$buffer .= fread($fp, 512);
if (preg_match('/class\s+(\w+)(.*)?\{/', $buffer, $matches)) {
$class = $matches[1];
break;
}
}
Note: The regex can probably be improved, but no regex alone can do this perfectly.
Get list of declared classes
This method uses get_declared_classes() and looks for the first class defined after an include.
Advantages
Shortest solution
No chance of false positive
Disadvantages
Have to load the entire file
Have to load the entire list of classes in memory twice
Have to load the class definition in memory
Code
$classes = get_declared_classes();
include 'test2.php';
$diff = array_diff(get_declared_classes(), $classes);
$class = reset($diff);
Note: You cannot simply do end() as others suggested. If the class includes another class, you will get a wrong result.
This is the Tokenizer solution, modified to include a $namespace variable containing the class namespace, if applicable:
$fp = fopen($file, 'r');
$class = $namespace = $buffer = '';
$i = 0;
while (!$class) {
if (feof($fp)) break;
$buffer .= fread($fp, 512);
$tokens = token_get_all($buffer);
if (strpos($buffer, '{') === false) continue;
for (;$i<count($tokens);$i++) {
if ($tokens[$i][0] === T_NAMESPACE) {
for ($j=$i+1;$j<count($tokens); $j++) {
if ($tokens[$j][0] === T_STRING) {
$namespace .= '\\'.$tokens[$j][1];
} else if ($tokens[$j] === '{' || $tokens[$j] === ';') {
break;
}
}
}
if ($tokens[$i][0] === T_CLASS) {
for ($j=$i+1;$j<count($tokens);$j++) {
if ($tokens[$j] === '{') {
$class = $tokens[$i+2][1];
}
}
}
}
}
Say you have this class:
namespace foo\bar {
class hello { }
}
...or the alternative syntax:
namespace foo\bar;
class hello { }
You should have the following result:
var_dump($namespace); // \foo\bar
var_dump($class); // hello
You could also use the above to detect the namespace a file declares, regardless of it containing a class or not.
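One caveat worth sketching: on PHP 8 a qualified name such as foo\bar is emitted as a single T_NAME_QUALIFIED token rather than as T_STRING and T_NS_SEPARATOR pieces, so a check for T_STRING alone leaves the namespace empty there. The variant below accepts both forms. It is a sketch under assumptions: find_first_class is a hypothetical name, it reads the whole source string instead of 512-byte chunks, and it returns the name without a leading backslash.

```php
<?php
// Hypothetical helper: returns "namespace\Class" for the first named class, or null.
function find_first_class(string $source): ?string
{
    $tokens = token_get_all($source);
    $count = count($tokens);
    $namespace = '';

    // PHP 7 splits a namespace name into T_STRING / T_NS_SEPARATOR tokens;
    // PHP 8 emits one T_NAME_QUALIFIED token. Accept whichever exists.
    $nameIds = [T_STRING, T_NS_SEPARATOR];
    if (defined('T_NAME_QUALIFIED')) {
        $nameIds[] = T_NAME_QUALIFIED;
    }

    for ($i = 0; $i < $count; $i++) {
        if (!is_array($tokens[$i])) {
            continue;
        }
        if ($tokens[$i][0] === T_NAMESPACE) {
            $namespace = '';
            for ($j = $i + 1; $j < $count; $j++) {
                if (is_array($tokens[$j]) && in_array($tokens[$j][0], $nameIds, true)) {
                    $namespace .= $tokens[$j][1];
                } elseif ($tokens[$j] === '{' || $tokens[$j] === ';') {
                    break;
                }
            }
        } elseif ($tokens[$i][0] === T_CLASS) {
            // The class name is the first non-whitespace token after "class".
            // Foo::class and anonymous classes have something else there and are skipped.
            for ($j = $i + 1; $j < $count; $j++) {
                if (is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
                    continue;
                }
                if (is_array($tokens[$j]) && $tokens[$j][0] === T_STRING) {
                    return ($namespace === '' ? '' : $namespace . '\\') . $tokens[$j][1];
                }
                break;
            }
        }
    }

    return null;
}
```

For either namespace syntax shown above, find_first_class(file_get_contents($file)) should give foo\bar\hello.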
You can make PHP do the work by just including the file and get the last declared class:
$file = 'class.php'; # contains class Foo
include($file);
$classes = get_declared_classes();
$class = end($classes);
echo $class; # Foo
If you need to isolate that, wrap it into a commandline script and execute it via shell_exec:
$file = 'class.php'; # contains class Foo
$class = shell_exec("php -r \"include('$file'); echo end(get_declared_classes());\"");
echo $class; # Foo
If you dislike commandline scripts, you can do it like in this question, however that code does not reflect namespaces.
I modified Nette\Reflection\AnnotationsParser so that it returns an array of the namespace+classname entries defined in the file:
$parser = new PhpParser();
$parser->extractPhpClasses('src/Path/To/File.php');
class PhpParser
{
public function extractPhpClasses(string $path)
{
$code = file_get_contents($path);
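$tokens = @token_get_all($code);
$namespace = $class = $classLevel = $level = NULL;
$classes = [];
// each() was removed in PHP 8; current() plus next() walks the array the same way
while (($token = current($tokens)) !== false) {
next($tokens);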
switch (is_array($token) ? $token[0] : $token) {
case T_NAMESPACE:
$namespace = ltrim($this->fetch($tokens, [T_STRING, T_NS_SEPARATOR]) . '\\', '\\');
break;
case T_CLASS:
case T_INTERFACE:
if ($name = $this->fetch($tokens, T_STRING)) {
$classes[] = $namespace . $name;
}
break;
}
}
return $classes;
}
private function fetch(&$tokens, $take)
{
$res = NULL;
while ($token = current($tokens)) {
list($token, $s) = is_array($token) ? $token : [$token, $token];
if (in_array($token, (array) $take, TRUE)) {
$res .= $s;
} elseif (!in_array($token, [T_DOC_COMMENT, T_WHITESPACE, T_COMMENT], TRUE)) {
break;
}
next($tokens);
}
return $res;
}
}
$st = get_declared_classes();
include "classes.php"; //one or more classes in file, contains class class1, class2, etc...
$res = array_values(array_diff_key(get_declared_classes(),$st));
print_r($res); # Array ([0] => class1 [1] => class2 [2] ...)
Thanks to some people from Stackoverflow and Github, I was able to write this amazing fully working solution:
/**
* get the full name (name \ namespace) of a class from its file path
* result example: (string) "I\Am\The\Namespace\Of\This\Class"
*
* @param $filePathName
*
* @return string
*/
public function getClassFullNameFromFile($filePathName)
{
return $this->getClassNamespaceFromFile($filePathName) . '\\' . $this->getClassNameFromFile($filePathName);
}
/**
* build and return an object of a class from its file path
*
* @param $filePathName
*
* @return mixed
*/
public function getClassObjectFromFile($filePathName)
{
$classString = $this->getClassFullNameFromFile($filePathName);
$object = new $classString;
return $object;
}
/**
* get the class namespace from the file path using tokens
*
* @param $filePathName
*
* @return null|string
*/
protected function getClassNamespaceFromFile($filePathName)
{
$src = file_get_contents($filePathName);
$tokens = token_get_all($src);
$count = count($tokens);
$i = 0;
$namespace = '';
$namespace_ok = false;
while ($i < $count) {
$token = $tokens[$i];
if (is_array($token) && $token[0] === T_NAMESPACE) {
// Found namespace declaration
while (++$i < $count) {
if ($tokens[$i] === ';') {
$namespace_ok = true;
$namespace = trim($namespace);
break;
}
$namespace .= is_array($tokens[$i]) ? $tokens[$i][1] : $tokens[$i];
}
break;
}
$i++;
}
if (!$namespace_ok) {
return null;
} else {
return $namespace;
}
}
/**
* get the class name from the file path using tokens
*
* @param $filePathName
*
* @return mixed
*/
protected function getClassNameFromFile($filePathName)
{
$php_code = file_get_contents($filePathName);
$classes = array();
$tokens = token_get_all($php_code);
$count = count($tokens);
for ($i = 2; $i < $count; $i++) {
if ($tokens[$i - 2][0] == T_CLASS
&& $tokens[$i - 1][0] == T_WHITESPACE
&& $tokens[$i][0] == T_STRING
) {
$class_name = $tokens[$i][1];
$classes[] = $class_name;
}
}
return $classes[0];
}
You can do this in two ways:
Complex solution: open the file and extract the class name with a regex (like /class ([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)/).
Simple solution: name all your PHP files after the class they contain (e.g. the class TestFoo in the file TestFoo.php or TestFoo.class.php).
You could get all declared classes before you include the file using get_declared_classes. Do the same thing after you have included it and compare the two with something like array_diff and you have your newly added class.
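A minimal sketch of that before/after comparison. The helper name classesDeclaredBy and the temporary file are made up for illustration; note that classes the file pulls in through its own includes or autoloading would show up in the difference as well.

```php
<?php
// Hypothetical helper: returns the classes that including $file adds.
function classesDeclaredBy(string $file): array
{
    $before = get_declared_classes();
    include_once $file;
    $after = get_declared_classes();

    // array_values() re-indexes the difference from 0.
    return array_values(array_diff($after, $before));
}

// Demo with a throwaway file (assumed content, for illustration only).
$tmp = tempnam(sys_get_temp_dir(), 'cls');
file_put_contents($tmp, '<?php class DiffDemoFoo {} class DiffDemoBar {}');

$found = classesDeclaredBy($tmp);
print_r($found);

unlink($tmp);
```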
This sample returns all classes. If you're looking for a class that is derived from a specific one, use is_subclass_of:
$php_code = file_get_contents ( $file );
$classes = array ();
$namespace="";
$tokens = token_get_all ( $php_code );
$count = count ( $tokens );
for($i = 0; $i < $count; $i ++)
{
if ($tokens[$i][0]===T_NAMESPACE)
{
for ($j=$i+1;$j<$count;++$j)
{
if ($tokens[$j][0]===T_STRING)
$namespace.="\\".$tokens[$j][1];
elseif ($tokens[$j]==='{' or $tokens[$j]===';')
break;
}
}
if ($tokens[$i][0]===T_CLASS)
{
for ($j=$i+1;$j<$count;++$j)
if ($tokens[$j]==='{')
{
$classes[]=$namespace."\\".$tokens[$i+2][1];
}
}
}
return $classes;
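To narrow such a list down to classes derived from a specific parent, something along these lines could work. The class names are placeholders, and is_subclass_of needs the classes to be loaded (or autoloadable), so this assumes the scanned file has already been included.

```php
<?php
// Placeholder classes standing in for whatever the scanned file declares.
abstract class BasePlugin {}
class MailPlugin extends BasePlugin {}
class Unrelated {}

// Names as the token scan above would return them (leading backslash included).
$classes = ['\\MailPlugin', '\\Unrelated', '\\BasePlugin'];

$derived = array_values(array_filter(
    $classes,
    // is_subclass_of() accepts a class name string and returns false
    // for the parent class itself.
    fn (string $name): bool => is_subclass_of(ltrim($name, '\\'), BasePlugin::class)
));

print_r($derived);
```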
I spent a lot of productive time looking for a way around this.
From @netcoder's solution it's obvious that all the solutions so far have plenty of drawbacks.
So I decided to do this instead.
Since most PHP classes have the same name as their file, we can get the class name from the filename. Depending on your project, you could also rely on a naming convention.
NB: This assumes the class does not have a namespace.
<?php
$path = '/path/to/a/class/file.php';
include $path;
/*get filename without extension which is the classname*/
$classname = pathinfo(basename($path), PATHINFO_FILENAME);
/* you can do all this*/
$classObj = new $classname();
/* though the file name is the class name, you can still */
get_class($classObj); // still returns the class name
Let me add a PHP 8 compatible solution as well. This will scan a file and collect all FQCNs:
$file = 'whatever.php';
$classes = [];
$namespace = '';
$tokens = PhpToken::tokenize(file_get_contents($file));
for ($i = 0; $i < count($tokens); $i++) {
if ($tokens[$i]->getTokenName() === 'T_NAMESPACE') {
for ($j = $i + 1; $j < count($tokens); $j++) {
if ($tokens[$j]->getTokenName() === 'T_NAME_QUALIFIED') {
$namespace = $tokens[$j]->text;
break;
}
}
}
if ($tokens[$i]->getTokenName() === 'T_CLASS') {
for ($j = $i + 1; $j < count($tokens); $j++) {
if ($tokens[$j]->getTokenName() === 'T_WHITESPACE') {
continue;
}
if ($tokens[$j]->getTokenName() === 'T_STRING') {
$classes[] = $namespace . '\\' . $tokens[$j]->text;
} else {
break;
}
}
}
}
// Contains all FQCNs found in a file.
$classes;
You may be able to use the autoload function.
function __autoload($class_name) {
include "special_directory/" .$class_name . '.php';
}
Then you can echo $class_name. But that requires a directory with a single file.
Still, it is standard practice in PHP to have one class per file, so Main.class.php will contain the Main class. You may be able to use that standard if you are the one writing the code.
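Note that __autoload() was deprecated in PHP 7.2 and removed in PHP 8.0, so on current versions the same idea would go through spl_autoload_register(). A sketch, assuming the one-class-per-file convention described above; the directory and class names are invented for the demo, and namespaced class names (which contain backslashes) are not handled.

```php
<?php
// Demo setup: a throwaway directory holding one file named after its class.
$dir = sys_get_temp_dir() . '/autoload_demo_' . uniqid();
mkdir($dir);
file_put_contents($dir . '/AutoloadDemoMain.php', '<?php class AutoloadDemoMain {}');

$loaded = [];
spl_autoload_register(function (string $className) use ($dir, &$loaded): void {
    $file = $dir . '/' . $className . '.php';
    if (is_file($file)) {
        $loaded[] = $className; // the class name PHP asked for
        require $file;
    }
});

// The first use of the class triggers the autoloader.
$obj = new AutoloadDemoMain();
echo implode(', ', $loaded), PHP_EOL;

unlink($dir . '/AutoloadDemoMain.php');
rmdir($dir);
```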
function foldersize($path) {
$total_size = 0;
$files = scandir($path);
foreach($files as $t) {
if (is_dir(rtrim($path, '/') . '/' . $t)) {
if ($t<>"." && $t<>"..") {
$size = foldersize(rtrim($path, '/') . '/' . $t);
$total_size += $size;
}
} else {
$size = filesize(rtrim($path, '/') . '/' . $t);
$total_size += $size;
}
}
return $total_size;
}
function format_size($size) {
$mod = 1024;
$units = explode(' ','B KB MB GB TB PB');
for ($i = 0; $size > $mod; $i++) {
$size /= $mod;
}
return round($size, 2) . ' ' . $units[$i];
}
$SIZE_LIMIT = 5368709120; // 5 GB
$sql="select * from users order by id";
$result=mysql_query($sql);
while($row=mysql_fetch_array($result)) {
$disk_used = foldersize("C:/xampp/htdocs/freehosting/".$row['name']);
$disk_remaining = $SIZE_LIMIT - $disk_used;
print 'Name: ' . $row['name'] . '<br>';
print 'diskspace used: ' . format_size($disk_used) . '<br>';
print 'diskspace left: ' . format_size($disk_remaining) . '<br><hr>';
}
php disk_total_space
Any idea why the processor usage shoots up to 100% until the script finishes executing? Can anything be done to optimize it? Or is there an alternative way to check the size of a folder and the folders inside it?
function GetDirectorySize($path){
$bytestotal = 0;
$path = realpath($path);
if($path!==false && $path!='' && file_exists($path)){
foreach(new RecursiveIteratorIterator(new RecursiveDirectoryIterator($path, FilesystemIterator::SKIP_DOTS)) as $object){
$bytestotal += $object->getSize();
}
}
return $bytestotal;
}
The same idea as Janith Chinthana suggested.
With a few fixes:
Converts $path to realpath
Performs iteration only if path is valid and folder exists
Skips . and .. files
Optimized for performance
The following are other solutions offered elsewhere:
If on a Windows Host:
<?
$f = 'f:/www/docs';
$obj = new COM ( 'scripting.filesystemobject' );
if ( is_object ( $obj ) )
{
$ref = $obj->getfolder ( $f );
echo 'Directory: ' . $f . ' => Size: ' . $ref->size;
$obj = null;
}
else
{
echo 'can not create object';
}
?>
Else, if on a Linux Host:
<?
$f = './path/directory';
$io = popen ( '/usr/bin/du -sk ' . $f, 'r' );
$size = fgets ( $io, 4096);
$size = substr ( $size, 0, strpos ( $size, "\t" ) );
pclose ( $io );
echo 'Directory: ' . $f . ' => Size: ' . $size;
?>
Directory size using PHP's filesize and RecursiveIteratorIterator.
This works on any platform running PHP 5 or higher.
/**
* Get the directory size
* @param string $directory
* @return integer
*/
function dirSize($directory) {
$size = 0;
foreach(new RecursiveIteratorIterator(new RecursiveDirectoryIterator($directory)) as $file){
$size+=$file->getSize();
}
return $size;
}
A pure php example.
<?php
$units = explode(' ', 'B KB MB GB TB PB');
$SIZE_LIMIT = 5368709120; // 5 GB
$disk_used = foldersize("/webData/users/vdbuilder@yahoo.com");
$disk_remaining = $SIZE_LIMIT - $disk_used;
echo("<html><body>");
echo('diskspace used: ' . format_size($disk_used) . '<br>');
echo( 'diskspace left: ' . format_size($disk_remaining) . '<br><hr>');
echo("</body></html>");
function foldersize($path) {
$total_size = 0;
$files = scandir($path);
$cleanPath = rtrim($path, '/'). '/';
foreach($files as $t) {
if ($t<>"." && $t<>"..") {
$currentFile = $cleanPath . $t;
if (is_dir($currentFile)) {
$size = foldersize($currentFile);
$total_size += $size;
}
else {
$size = filesize($currentFile);
$total_size += $size;
}
}
}
return $total_size;
}
function format_size($size) {
global $units;
$mod = 1024;
for ($i = 0; $size > $mod; $i++) {
$size /= $mod;
}
// round() instead of substr(): cutting at the "." position truncated sizes without a decimal point
return round($size, 2) . ' ' . $units[$i];
}
?>
function get_dir_size($directory){
$size = 0;
$files = glob($directory.'/*');
foreach($files as $path){
is_file($path) && $size += filesize($path);
is_dir($path) && $size += get_dir_size($path);
}
return $size;
}
Thanks to Jonathan Sampson, Adam Pierce and Janith Chinthana, I wrote this one after checking for the most performant way to get the directory size. It should work on Windows and Linux hosts.
static function getTotalSize($dir)
{
$dir = rtrim(str_replace('\\', '/', $dir), '/');
if (is_dir($dir) === true) {
$totalSize = 0;
$os = strtoupper(substr(PHP_OS, 0, 3));
// If on a Unix Host (Linux, Mac OS)
if ($os !== 'WIN') {
$io = popen('/usr/bin/du -sb ' . escapeshellarg($dir), 'r'); // escape the path before handing it to the shell
if ($io !== false) {
$totalSize = intval(fgets($io, 80));
pclose($io);
return $totalSize;
}
}
// If on a Windows Host (WIN32, WINNT, Windows)
if ($os === 'WIN' && extension_loaded('com_dotnet')) {
$obj = new \COM('scripting.filesystemobject');
if (is_object($obj)) {
$ref = $obj->getfolder($dir);
$totalSize = $ref->size;
$obj = null;
return $totalSize;
}
}
// If system calls didn't work, use the slower pure-PHP (PHP 5+) approach
$files = new \RecursiveIteratorIterator(new \RecursiveDirectoryIterator($dir));
foreach ($files as $file) {
$totalSize += $file->getSize();
}
return $totalSize;
} else if (is_file($dir) === true) {
return filesize($dir);
}
}
Even though there are already many many answers to this post, I feel I have to add another option for unix hosts that only returns the sum of all file sizes in the directory (recursively).
If you look at Jonathan's answer he uses the du command. This command will return the total directory size but the pure PHP solutions posted by others here will return the sum of all file sizes. Big difference!
What to look out for
When running du on a newly created directory, it may return 4K instead of 0. This may get even more confusing after deleting files from the directory in question: du then reports a total directory size that does not correspond to the sum of the sizes of the files within it. Why? The command du returns a report based on some file settings, as Hermann Ingjaldsson commented on this post.
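The gap between the two numbers can be seen from PHP as well. In this sketch filesize() gives the byte length of the content, while the blocks field of stat() reflects what is allocated on disk. The 512-byte block unit is an assumption that holds on typical Linux systems; on Windows blocks is reported as -1, so the second figure is skipped there.

```php
<?php
$tmp = tempnam(sys_get_temp_dir(), 'du');
file_put_contents($tmp, 'x'); // one byte of content

clearstatcache();
$apparent = filesize($tmp); // the "sum of file sizes" view
$info     = stat($tmp);

echo "apparent size: {$apparent} bytes", PHP_EOL;
if ($info['blocks'] >= 0) {
    // Assumes the blocks field counts 512-byte units.
    echo 'allocated on disk: ', $info['blocks'] * 512, ' bytes', PHP_EOL;
}

unlink($tmp);
```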
The solution
To form a solution that behaves like some of the PHP-only scripts posted here, you can use ls command and pipe it to awk like this:
ls -ltrR /path/to/dir | awk '{print $5}' | awk 'BEGIN{sum=0} {sum=sum+$1} END {print sum}'
As a PHP function you could use something like this:
function getDirectorySize( $path )
{
if( !is_dir( $path ) ) {
return 0;
}
$path = strval( $path );
$io = popen( "ls -ltrR {$path} |awk '{print \$5}'|awk 'BEGIN{sum=0} {sum=sum+\$1} END {print sum}'", 'r' );
$size = intval( fgets( $io, 80 ) );
pclose( $io );
return $size;
}
I found this approach to be shorter and more compatible. The Mac OS X version of "du" doesn't support the -b (or --bytes) option for some reason, so this sticks to the more-compatible -k option.
$file_directory = './directory/path';
$output = exec('du -sk ' . $file_directory);
$filesize = trim(str_replace($file_directory, '', $output)) * 1024;
Returns the $filesize in bytes.
Johnathan Sampson's Linux example didn't work so well for me. Here's an improved version:
function getDirSize($path)
{
$io = popen('/usr/bin/du -sb '.$path, 'r');
$size = intval(fgets($io,80));
pclose($io);
return $size;
}
This works perfectly fine:
public static function folderSize($dir)
{
$size = 0;
foreach (glob(rtrim($dir, '/') . '/*', GLOB_NOSORT) as $each) {
$func_name = __FUNCTION__;
$size += is_file($each) ? filesize($each) : static::$func_name($each);
}
return $size;
}
There are several things you could do to optimise the script, but at best that would make it IO-bound rather than CPU-bound:
1) Calculate rtrim($path, '/') outside the loop.
2) Make if ($t<>"." && $t<>"..") the outer test; it doesn't need to stat the path.
3) Calculate rtrim($path, '/') . '/' . $t once per loop, inside 2) and taking 1) into account.
4) Calculate explode(' ','B KB MB GB TB PB'); once rather than on each call.
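Applied to the question's code, those points might look like the following sketch. It keeps the original function names, but the demo directory at the end is invented for illustration.

```php
<?php
function foldersize(string $path): int
{
    $base  = rtrim($path, '/') . '/'; // trimmed once, outside the loop
    $total = 0;
    foreach (scandir($path) as $entry) {
        if ($entry === '.' || $entry === '..') {
            continue; // cheap string test first, no stat needed
        }
        $full = $base . $entry; // built once per iteration
        $total += is_dir($full) ? foldersize($full) : filesize($full);
    }
    return $total;
}

function format_size(float $size): string
{
    static $units = ['B', 'KB', 'MB', 'GB', 'TB', 'PB']; // created once, not per call
    for ($i = 0; $size > 1024 && $i < count($units) - 1; $i++) {
        $size /= 1024;
    }
    return round($size, 2) . ' ' . $units[$i];
}

// Demo on a throwaway directory.
$dir = sys_get_temp_dir() . '/foldersize_demo_' . uniqid();
mkdir($dir . '/sub', 0777, true);
file_put_contents($dir . '/a.txt', str_repeat('a', 1500));
file_put_contents($dir . '/sub/b.txt', str_repeat('b', 548));

$bytes = foldersize($dir);
echo format_size($bytes), PHP_EOL;

unlink($dir . '/sub/b.txt');
unlink($dir . '/a.txt');
rmdir($dir . '/sub');
rmdir($dir);
```

This still touches every file once, so on large trees the run time is dominated by disk access rather than by PHP itself.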
PHP get directory size (with FTP access)
After hard work, this code works great, and I want to share it with the community (by MundialSYS):
function dirFTPSize($ftpStream, $dir) {
$size = 0;
$files = ftp_nlist($ftpStream, $dir);
foreach ($files as $remoteFile) {
if(preg_match('/.*\/\.\.$/', $remoteFile) || preg_match('/.*\/\.$/', $remoteFile)){
continue;
}
$sizeTemp = ftp_size($ftpStream, $remoteFile);
if ($sizeTemp > 0) {
$size += $sizeTemp;
}elseif($sizeTemp == -1){ // directory
$size += dirFTPSize($ftpStream, $remoteFile);
}
}
return $size;
}
$hostname = '127.0.0.1'; // or 'ftp.domain.com'
$username = 'username';
$password = 'password';
$startdir = '/public_html'; // absolute path
$files = array();
$ftpStream = ftp_connect($hostname);
if (!$ftpStream) {
echo 'Wrong server!';
exit;
}
// Log in only after the connection has succeeded
$login = ftp_login($ftpStream, $username, $password);
if (!$login) {
echo 'Wrong username/password!';
exit;
}
$size = dirFTPSize($ftpStream, $startdir);
echo number_format(($size / 1024 / 1024), 2, '.', '') . ' MB';
ftp_close($ftpStream);
Good code!
Fernando
Object-oriented approach:
/**
* Returns a directory size
*
* @param string $directory
*
* @return int $size directory size in bytes
*
*/
function dir_size($directory)
{
$size = 0;
foreach(new RecursiveIteratorIterator(new RecursiveDirectoryIterator($directory)) as $file)
{
$size += $file->getSize();
}
return $size;
}
Fast and furious approach:
function dir_size2($dir)
{
$line = exec('du -sh ' . $dir);
$line = trim(str_replace($dir, '', $line));
return $line;
}
Code adjusted to access main directory and all sub folders within it. This would return the full directory size.
function get_dir_size($directory){
$size = 0;
$files= glob($directory.'/*');
foreach($files as $path){
is_file($path) && $size += filesize($path);
if (is_dir($path))
{
$size += get_dir_size($path);
}
}
return $size;
}
If you are hosted on Linux:
passthru('du -h -s ' . $DIRECTORY_PATH);
It's better than looping over the files with foreach.
Regarding Johnathan Sampson's Linux example: watch out when calling intval on the output of du. If the size is over 2 GB (on a 32-bit PHP build), it will keep showing 2 GB.
Replace:
$totalSize = intval(fgets($io, 80));
with:
$totalSize = strtok(fgets($io, 80), " ");
This assumes your du output separates the size from the directory/file name with a space.
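A sketch of parsing that line without squeezing it through intval(). GNU du separates the size from the path with a tab, so both tab and space are used as delimiters here; the helper name and the sample line are made up. Casting to float keeps sizes above 2 GB intact on 32-bit builds, where integers stop at 2147483647.

```php
<?php
// Hypothetical helper: extracts the leading size field from one line of du output.
function parseDuSize(string $line): float
{
    $field = strtok($line, " \t"); // first token before a space or tab
    return $field === false ? 0.0 : (float) $field;
}

$sample = "5368709120\t/var/www/site\n"; // made-up "du -sb" style output
var_dump(parseDuSize($sample));
```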
Just another function using native php functions.
function dirSize($dir)
{
$dirSize = 0;
if(!is_dir($dir)){return false;};
$files = scandir($dir);if(!$files){return false;}
$files = array_diff($files, array('.','..'));
foreach ($files as $file) {
if(is_dir("$dir/$file")){
$dirSize += dirSize("$dir/$file");
}else{
$dirSize += filesize("$dir/$file");
}
}
return $dirSize;
}
NOTE: this function returns the sum of the file sizes, NOT the size on disk.
Building on Nate Haug's answer, I created a short function for my project:
function uf_getDirSize($dir, $unit = 'm')
{
$dir = rtrim($dir, '/'); // rtrim, so the leading slash of an absolute path is kept
if (!is_dir($dir)) {
trigger_error("{$dir} not a folder/dir/path.", E_USER_WARNING);
return false;
}
if (!function_exists('exec')) {
trigger_error('The function exec() is not available.', E_USER_WARNING);
return false;
}
$output = exec('du -sb ' . $dir);
$filesize = (int) trim(str_replace($dir, '', $output));
switch ($unit) {
case 'g': $filesize = round($filesize / 1073741824, 3); break; // giga
case 'm': $filesize = round($filesize / 1048576, 1); break; // mega
case 'k': $filesize = round($filesize / 1024, 0); break; // kilo
case 'b': break; // byte
}
// round() keeps a number; number_format() returns a string with thousands separators
return $filesize;
}
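A one-liner solution. The result is in bytes. Note that it only counts files directly inside $dir whose names contain a dot; it does not recurse into subfolders.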
$size=array_sum(array_map('filesize', glob("{$dir}/*.*")));
Added bonus: you can simply change the file mask to whatever you like, and count only certain files (eg by extension).
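For instance, restricting the sum to one extension could look like this. The directory and file names are invented for the demo.

```php
<?php
$dir = sys_get_temp_dir() . '/glob_demo_' . uniqid();
mkdir($dir);
file_put_contents("{$dir}/a.jpg", str_repeat('j', 300));
file_put_contents("{$dir}/b.jpg", str_repeat('j', 200));
file_put_contents("{$dir}/notes.txt", str_repeat('t', 999));

// Same one-liner, with the mask narrowed to *.jpg.
$jpgBytes = array_sum(array_map('filesize', glob("{$dir}/*.jpg")));
echo $jpgBytes, PHP_EOL;

// Clean up the demo files.
array_map('unlink', glob("{$dir}/*"));
rmdir($dir);
```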
This is the simplest possible algorithm to find out directory size irrespective of the programming language you are using.
For a PHP-specific implementation, go to: Calculate Directory Size in PHP | Explained with Algorithm | Working Code