I use Scrutinizer to analyse my code, and one of my functions is reported under:
Worst rated PHP Operations
This is the function:
/**
* Insert Empty Fighters in an homogeneous way.
*
* @param Collection $fighters
* @param Collection $byeGroup
*
* @return Collection
*/
private function insertByes(Collection $fighters, Collection $byeGroup)
{
$bye = count($byeGroup) > 0 ? $byeGroup[0] : [];
$sizeFighters = count($fighters);
$sizeByeGroup = count($byeGroup);
$frequency = $sizeByeGroup != 0
? (int)floor($sizeFighters / $sizeByeGroup)
: -1;
// Create Copy of $competitors
$newFighters = new Collection();
$count = 0;
$byeCount = 0;
foreach ($fighters as $fighter) {
if ($frequency != -1 && $count % $frequency == 0 && $byeCount < $sizeByeGroup) {
$newFighters->push($bye);
$byeCount++;
}
$newFighters->push($fighter);
$count++;
}
return $newFighters;
}
This function tries to insert empty fighters (byes) at regular intervals, i.e. in a homogeneous way.
To me this method looks fine, so what am I not seeing?
Is there a better way to achieve this?
Misleading name (probably not picked up by Scrutinizer). At no point is the actual $byeGroup collection necessary: the method only uses its first element and its size.
private function insertByes(Collection $fighters, Collection $byeGroup)
A conditional that is only used to pull out something that should have been one of the method's parameters.
$bye = count($byeGroup) > 0 ? $byeGroup[0] : [];
$sizeFighters = count($fighters);
$sizeByeGroup = count($byeGroup);
Another if statement that adds to complexity. Also uses weak comparison.
$frequency = $sizeByeGroup != 0
? (int)floor($sizeFighters / $sizeByeGroup)
: -1;
// Create Copy of $competitors
$newFighters = new Collection();
$count = 0;
$byeCount = 0;
Content of this foreach should most likely go in a separate method.
foreach ($fighters as $fighter) {
And that complex condition in yet another if statement (which also contains weak comparisons) would be better off in a well-named private method.
if ($frequency != -1 && $count % $frequency == 0 && $byeCount < $sizeByeGroup) {
Since $bye can be an empty array, pushing it as if it were a fighter makes little sense.
$newFighters->push($bye);
$byeCount++;
}
$newFighters->push($fighter);
$count++;
}
return $newFighters;
}
TBH, I have no idea what this method does, and it would also be really hard to write any unit test for it.
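One way the points above could be addressed is sketched below. It is only an illustration under assumptions: plain PHP arrays stand in for Collection so that the snippet runs on its own, the method takes a single $bye value and a $byeCount instead of $byeGroup, and both the helper name shouldInsertBye and the max(1, ...) guard are hypothetical additions that are not in the original code.

```php
<?php
declare(strict_types=1);

/**
 * Decide whether a bye goes in front of the fighter at $position.
 * (Hypothetical helper name; it wraps the condition from the original loop.)
 */
function shouldInsertBye(int $position, int $frequency, int $inserted, int $byeCount): bool
{
    return $position % $frequency === 0 && $inserted < $byeCount;
}

/**
 * Spread up to $byeCount copies of $bye evenly between the fighters.
 * Assumption: plain arrays stand in for the Collection class.
 */
function insertByes(array $fighters, $bye, int $byeCount): array
{
    if ($byeCount === 0 || count($fighters) === 0) {
        return array_values($fighters);
    }
    // max(1, ...) is an added guard: with more byes than fighters the
    // original frequency would be 0 and the modulo would fail.
    $frequency = max(1, intdiv(count($fighters), $byeCount));

    $result = [];
    $inserted = 0;
    foreach (array_values($fighters) as $position => $fighter) {
        if (shouldInsertBye($position, $frequency, $inserted, $byeCount)) {
            $result[] = $bye;
            $inserted++;
        }
        $result[] = $fighter;
    }
    return $result;
}

print_r(insertByes(['a', 'b', 'c', 'd'], null, 2));
```

With the condition in its own function and no Collection dependency, each piece can be unit tested with a handful of small arrays.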
I am trying to run a very simple task with a very large number of iterations. I choose 7 random serial numbers from an array of 324000 serial numbers, place them in another array, search that array to see if a particular number is within it, execute another script, and fwrite out how many times the number I am looking for is in the array.
This runs fairly fast in a single thread. But when I put it in pthreads, even a single pthread is 100x slower than the single-threaded version. The workers do not share any resources (i.e. they read all data from their own folders and write to their own folders), so fwrite bottlenecks are not the problem. The problem is with the arrays, as I note below. Am I running into a cache-line problem, where the arrays, although they are separate variables, still share the same cache line? I would much appreciate your help in figuring out why the arrays slow it to a crawl.
<?php
class WorkerThreads extends Thread
{
private $workerId;
private $linesId;
private $linesId2;
private $c2_result;
private $traceId;
public function __construct($id,$newlines,$newlines2,$xxtrace)
{
$this->workerId = $id;
$this->linesId = (array) $newlines;
$this->linesId2 = (array) $newlines2;
$this->traceId = $xxtrace;
$this->c2_result= (array) array();
}
public function run()
{
for($h=0; $h<90; $h++) {
$fp42=fopen("/folder/".$this->workerId."/count.txt","w");
for($master=0; $master <200; $master++) {
// *******PROBLEM IS IN THE <3000 loop -very slow***********
$b=0;
for($a=0; $a<3000; $a++) {
$zex=0;
while($zex != 1) {
$this->c2_result[0]=$this->linesId[rand(0,324631)];
$this->c2_result[1]=$this->linesId[rand(0,324631)];
$this->c2_result[2]=$this->linesId[rand(0,324631)];
$this->c2_result[3]=$this->linesId[rand(0,324631)];
$this->c2_result[4]=$this->linesId[rand(0,324631)];
$this->c2_result[5]=$this->linesId[rand(0,324631)];
$this->c2_result[6]=$this->linesId[rand(0,324631)];
if(count(array_flip($this->c2_result)) != count($this->c2_result)) { //echo "duplicates\n";
$zex=0;
} else { //echo "no duplicates\n";
$zex=1;
//exit;
}
}
// *********PROBLEM here too !in_array statement, slowing down******
if(!in_array($this->linesId2[$this->traceId],$this->c2_result)) {
//fwrite($fp4,"nothere\n");
$b++;
}
}
fwrite($fp42,$b."\n");
}
fclose($fp42);
$mainfile3="/folder/".$this->workerId."/count_pthread.php";
$command="php $mainfile3 $this->workerId";
exec($command);
}
}
}
$xxTrack=0;
$lines = range(0, 324631);
for($x=0; $x<56; $x++) {
$workers = [];
// Initialize and start the threads
foreach (range(0, 8) as $i) {
$workers[$i] = new WorkerThreads($i,$lines,$lines2,$xxTrack);
$workers[$i]->start();
$xxTrack++;
}
// Let the threads come back
foreach (range(0, 8) as $i) {
$workers[$i]->join();
}
unset($workers);
}
UPDATED CODE
I was able to speed up the original code 6x with help from @tpunt's suggestions. Most importantly, I learned that the code is being slowed down by the calls to rand(). If I could get rid of those, it would be 100x faster. array_rand(), mt_rand() and shuffle() are even slower. Here is the new code:
class WorkerThreads extends Thread
{
private $workerId;
private $c2_result;
private $traceId;
private $myArray;
private $myArray2;
public function __construct($id,$xxtrace)
{
$this->workerId = $id;
$this->traceId = $xxtrace;
$c2_result=array();
}
public function run()
{
////////////////////THE WORK TO BE DONE/////////////////////////
$lines = file("/fold/considers.txt",FILE_IGNORE_NEW_LINES);
$lines2= file("/fold/considers.txt",FILE_IGNORE_NEW_LINES);
shuffle($lines2);
$fp42=fopen("/fold/".$this->workerId."/count.txt","w");
for($h=0; $h<90; $h++) {
fseek($fp42, 0);
for($master=0; $master <200; $master++) {
$b=0;
for($a=0; $a<3000; $a++) {
$zex=0;
$myArray = [];
$myArray[rand(0,324631)] = true;
$myArray[rand(0,324631)] = true;
$myArray[rand(0,324631)] = true;
$myArray[rand(0,324631)] = true;
$myArray[rand(0,324631)] = true;
$myArray[rand(0,324631)] = true;
$myArray[rand(0,324631)] = true;
while (count($myArray) !== 7) {
$myArray[rand(0,324631)] = true;
}
if (!isset($myArray[$lines2[$this->traceId]])) {
$b++;
}
}
fwrite($fp42,$b."\n");
}
$mainfile3="/newfolder/".$this->workerId."/pthread.php";
$command="php $mainfile3 $this->workerId";
exec($command);
}//END OF H LOOP
fclose($fp42);
}
}
$xxTrack=0;
$p = new Pool(5);
for($b=0; $b<56; $b++) {
$tasks[$b]= new WorkerThreads($b,$xxTrack);
$xxTrack++;
}
// Add tasks to pool queue
foreach ($tasks as $task) {
$p->submit($task);
}
// shutdown will wait for current queue to be completed
$p->shutdown();
Your code is just incredibly inefficient. There are also a number of problems with it - I've made a quick breakdown of some of these things below.
Firstly, you are spinning up over 500 threads (9 * 56 = 504). This is going to be very slow because threading in PHP requires a shared-nothing architecture. This means that a new instance of PHP's interpreter will need to be created for each thread you create, where all classes, interfaces, traits, functions, etc, will need to be copied over to the new interpreter instance.
Perhaps more to the point, though, is that your 3 nested for loops are performing 54 million iterations (90 * 200 * 3000). Multiply this by the 504 threads being created, and you can soon see why things are becoming sluggish. Instead, use a thread pool (see pthreads' Pool class) with a more modest amount of threads (try 8, and go from there), and cut down on the iterations being performed per thread.
Secondly, you are opening a file 90 times per thread (so 90 * 504 = 45360 times in total). You only need one file handle per thread.
Thirdly, utilising actual PHP arrays inside of Threaded objects makes them read-only. So with respect to the $this->c2_result property, the code inside of your nested while loop should not even work. Not to mention that the following check does not look for duplicates:
if(count(array_flip($this->c2_result)) != count($this->c2_result))
If you avoid casting the $this->c2_result property to an array (therefore making it a Volatile object), then the following code could instead replace your while loop:
$keys = array_rand($this->linesId, 7);
for ($i = 0; $i < 7; ++$i) {
$this->c2_result[$this->linesId[$keys[$i]]] = true;
}
By setting the values as the keys in $this->c2_result we can remove the subsequent in_array function call to search through the $this->c2_result. This is done by utilising a PHP array as a hash table, where the lookup time for a key is constant time (O(1)), rather than linear time required when searching for values (with in_array). This enables us to replace the following slow check:
if(!in_array($this->linesId2[$this->traceId],$this->c2_result))
with the following fast check:
if (!isset($this->c2_result[$this->linesId2[$this->traceId]]))
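The difference between the two checks can be seen in isolation, outside of any thread. The snippet below is a minimal sketch with made-up sample numbers ($picked and $needle are hypothetical); it only shows that storing the values as keys lets isset() answer the same question as in_array().

```php
<?php
declare(strict_types=1);

// Hypothetical sample data standing in for the picked serial numbers.
$picked = [1042, 77, 250001, 13, 9000, 421, 300123];
$needle = 9000;

// Linear search: in_array() walks the values one by one.
$foundLinear = in_array($needle, $picked, true);

// Hash lookup: turn the values into keys once, then isset() is a single key lookup.
$pickedSet = array_fill_keys($picked, true);
$foundHashed = isset($pickedSet[$needle]);

var_dump($foundLinear, $foundHashed);
```

With only seven elements the gain per lookup is likely small; it matters here because the check sits inside the innermost loop.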
But with that said, you don't seem to be using the $this->c2_result property anywhere else. So (assuming you haven't purposefully redacted code that uses it), you could remove it altogether and simply replace the while loop and the check after it with the following:
$found = false;
foreach (array_rand($this->linesId, 7) as $key) {
if ($this->linesId[$key] === $this->linesId2[$this->traceId]) {
$found = true;
break;
}
}
if (!$found) {
++$b;
}
Beyond the above, you could also look at storing the data you're collecting in-memory (as some property on the Threaded object), to prevent expensive disk writes. The results could be aggregated at the end, before shutting down the pool.
Update based on your update
You've said that the rand function is causing major slowdown. Whilst it may be part of the problem, I believe it is actually all of the code inside of your third nested for loop. The code inside there is very hot code, because it gets executed 54 million times. I suggested above that you replace the following code:
$zex=0;
while($zex != 1) {
$c2_result[0]=$lines[rand(0,324631)];
$c2_result[1]=$lines[rand(0,324631)];
$c2_result[2]=$lines[rand(0,324631)];
$c2_result[3]=$lines[rand(0,324631)];
$c2_result[4]=$lines[rand(0,324631)];
$c2_result[5]=$lines[rand(0,324631)];
$c2_result[6]=$lines[rand(0,324631)];
$myArray = (array) $c2_result;
$myArray2 = (array) $c2_result;
$myArray=array_flip($myArray);
if(count($myArray) != count($c2_result)) {//echo "duplicates\n";
$zex=0;
} else {//echo "no duplicates\n";
$zex=1;
//exit;
}
}
if(!in_array($lines2[$this->traceId],$myArray2)) {
$b++;
}
with a combination of array_rand and foreach. Upon some initial tests, it turns out that array_rand really is outstandingly slow. But my hash table solution to replace the in_array invocation still holds true. By leveraging a PHP array as a hash table (basically, store values as keys), we get a constant time lookup performance (O(1)), as opposed to a linear time lookup (O(n)).
Try replacing the above code with the following:
$myArray = [];
$myArray[rand(0,324631)] = true;
$myArray[rand(0,324631)] = true;
$myArray[rand(0,324631)] = true;
$myArray[rand(0,324631)] = true;
$myArray[rand(0,324631)] = true;
$myArray[rand(0,324631)] = true;
$myArray[rand(0,324631)] = true;
while (count($myArray) !== 7) {
$myArray[rand(0,324631)] = true;
}
if (!isset($myArray[$lines2[$this->traceId]])) {
$b++;
}
For me, this resulted in a 120% speedup.
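The same idea can be wrapped in a small function, which also removes the seven repeated lines. This is a sketch only: pickDistinct is a hypothetical name, and the range check is an added safeguard so the loop cannot run forever.

```php
<?php
declare(strict_types=1);

/**
 * Draw $count distinct integers from 0..$max by using them as array keys.
 * (pickDistinct is a hypothetical helper name.)
 */
function pickDistinct(int $count, int $max): array
{
    if ($count > $max + 1) {
        throw new InvalidArgumentException('Not enough distinct values in range.');
    }
    $set = [];
    while (count($set) < $count) {
        // A repeated number just overwrites its own key, so duplicates vanish.
        $set[rand(0, $max)] = true;
    }
    return $set;
}

$picked = pickDistinct(7, 324631);
var_dump(count($picked));        // always int(7)
var_dump(isset($picked[12345])); // true only if 12345 happened to be drawn
```

Whether the function call overhead is acceptable inside a loop this hot would need measuring; inlining the loop body is an alternative.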
As for further performance, you can (as mentioned above, again) store the results in-memory (as a simple property) and perform a write of all results at the end of the run method.
Also, the garbage collector for pthreads is not deterministic. It should therefore not be used to retrieve data. Instead, a Threaded object should be injected into the worker thread, and the data to be collected should be saved to this object. Lastly, you should shut down the pool after garbage collection (which, again, should not be used in your case).
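The suggestion to keep results in memory and write them once can be sketched without pthreads at all. In the snippet below a temporary file stands in for the per-worker count.txt, and the value of $b is a placeholder for the real counting work; both are assumptions made for illustration.

```php
<?php
declare(strict_types=1);

// Assumption: a temporary file stands in for the per-worker count.txt.
$path = tempnam(sys_get_temp_dir(), 'count');

$results = [];
for ($master = 0; $master < 200; $master++) {
    $b = $master % 3;   // placeholder for the real counting work
    $results[] = $b;    // keep the value in memory, no I/O here
}

// One write for the whole batch instead of 200 separate fwrite() calls.
file_put_contents($path, implode("\n", $results) . "\n");

$lines = file($path, FILE_IGNORE_NEW_LINES);
var_dump(count($lines)); // int(200)
unlink($path);
```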
It is unclear from your code what $newlines and $newlines2 are, so I am just guessing here.
Something like this?
The idea is to avoid fopen and fwrite inside your loops as much as possible:
1 - open the file only once, in the constructor.
2 - concatenate the output into a string inside the loop.
3 - write it only once, after the loop.
class WorkerThreads extends Thread {
private $workerId;
private $linesId;
private $linesId2;
private $c2_result;
private $traceId;
private $fp42;
private $mainfile3;
public function __construct($id, $newlines, $newlines2, $xxtrace) {
$this->workerId = $id;
$this->linesId = (array) $newlines;
$this->linesId2 = (array) $newlines2;
$this->traceId = $xxtrace;
$this->c2_result = array();
$this->fp42 = fopen("/folder/" . $id . "/count.txt", "w");
$this->mainfile3 = "/folder/" . $id . "/count_pthread.php";
}
public function run() {
for ($h = 0; $h < 90; $h++) {
$globalf42='';
for ($master = 0; $master < 200; $master++) {//<200
$b = 0;
for ($a = 0; $a < 3000; $a++) {
$zex = 0;
while ($zex != 1) { // retry until all seven picks are distinct
for ($ii = 0; $ii < 7; $ii++) { // seven picks, as in the question
$this->c2_result[$ii] = $this->linesId[rand(0, 324631)];
}
$zex = (count(array_flip($this->c2_result)) != count($this->c2_result)) ? 0 : 1;
}
if (!in_array($this->linesId2[$this->traceId], $this->c2_result)) {
$b++;
}
}
$globalf42 .= $b . "\n";
}
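fwrite($this->fp42, $globalf42);
$command = "php $this->mainfile3 $this->workerId";
exec($command);
}
// Close the handle only after the last pass; closing it inside the loop would break every later fwrite.
fclose($this->fp42);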
}
}
I have a list of entities:
array = (0 => entity1, 1=> entity2,..., n=>entityn);
Every entity has a property called:
$initializedLevel = new ArrayCollection();
//key for this ArrayCollection is a 'date', value is an integer (= the level)
Every entity has a method called
getInitializedLevel($date)
// returns the level (= integer) for a specific date.
How do I get the highest level for a given date?
I was looking for a similar method and ended up doing it manually:
$date = new \DateTime("now");
$max = $array[0]->getInitializedLevel($date);
foreach ($array as $entity) {
$current = $entity->getInitializedLevel($date);
$max = $current > $max ? $current : $max;
}
The value you are looking for is now in the $max variable.
It could also be done with:
public function getMaxLevelForAGivenDate($date)
{
$levels = $this->initializedLevel->map(function ($initialize) use ($date) {
if($date == $initialize->getDate()) {
return $initialize->getLevel();
}
});
return max($levels->toArray());
}
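For the question as asked (the highest level across a list of entities), a self-contained sketch could look like the following. The Entity class here is a hypothetical stand-in that stores levels in a plain array keyed by a date string; the real entity uses an ArrayCollection, and maxLevelForDate is an illustrative name.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the entity from the question.
final class Entity
{
    /** @var array<string,int> levels keyed by date string */
    private $levels;

    public function __construct(array $levels)
    {
        $this->levels = $levels;
    }

    public function getInitializedLevel(string $date): ?int
    {
        return $this->levels[$date] ?? null;
    }
}

/** Highest level among $entities on $date, or null if none has one. */
function maxLevelForDate(array $entities, string $date): ?int
{
    $levels = [];
    foreach ($entities as $entity) {
        $level = $entity->getInitializedLevel($date);
        if ($level !== null) {
            $levels[] = $level;
        }
    }
    // max() must not be called on an empty array.
    return $levels === [] ? null : max($levels);
}

$entities = [
    new Entity(['2015-01-01' => 3]),
    new Entity(['2015-01-01' => 7, '2015-02-01' => 1]),
    new Entity(['2015-02-01' => 9]),
];
var_dump(maxLevelForDate($entities, '2015-01-01')); // int(7)
var_dump(maxLevelForDate($entities, '2016-01-01')); // NULL
```

Skipping null levels avoids comparing missing values with real ones, which the manual loop above does not guard against.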
I use this to check if an object has properties,
function objectHasProperty($input){
return (is_object($input) && (count(get_object_vars($input)) > 0)) ? true : false;
}
But then I want to check further to make sure all properties have values, for instance,
stdClass Object
(
[package] =>
[structure] =>
[app] =>
[style] =>
[js] =>
)
Then I want to return false if all the properties have empty values. Is it possible? Any hint and ideas?
There are several ways of doing this, all the way up to using PHP's reflection API, but to simply check if all public properties of an object are empty, you could do this:
$properties = array_filter(get_object_vars($object));
return !empty($properties);
(The temporary variable $properties is required because you're using PHP 5.4.)
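A short sketch of how this behaves, wrapped in a function with a hypothetical name (hasNonEmptyProperty). One caveat worth knowing: array_filter() without a callback removes every falsy value, so a property holding 0 or "0" also counts as empty.

```php
<?php
declare(strict_types=1);

/** True if at least one public property holds a non-falsy value. */
function hasNonEmptyProperty($object): bool // hypothetical name
{
    $properties = array_filter(get_object_vars($object));
    return !empty($properties);
}

$empty = new stdClass();
$empty->package = '';
$empty->app = null;

$filled = new stdClass();
$filled->package = 'core';

$zero = new stdClass();
$zero->count = 0; // array_filter() without a callback drops 0 and "0" too

var_dump(hasNonEmptyProperty($empty));  // bool(false)
var_dump(hasNonEmptyProperty($filled)); // bool(true)
var_dump(hasNonEmptyProperty($zero));   // bool(false)
```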
For deep inspection and a more advanced handling I'd go for something like the following which is easily extended. Consider it a follow up answer to dev-null-dweller's answer (which is perfectly valid and a great solution).
/**
* Deep inspection of <var>$input</var> object.
*
* @param mixed $input
* The variable to inspect.
* @param int $visibility [optional]
* The visibility of the properties that should be inspected, defaults to <code>ReflectionProperty::IS_PUBLIC</code>.
* @return boolean
* <code>TRUE</code> if <var>$input</var> is an object and any of its properties has a value other than
* <code>NULL</code>, <code>""</code>, or <code>[]</code>; <code>FALSE</code> otherwise.
*/
function object_has_properties($input, $visibility = ReflectionProperty::IS_PUBLIC) {
set_error_handler(function(){}, E_WARNING);
if (is_object($input)) {
$properties = (new ReflectionClass($input))->getProperties($visibility);
$c = count($properties);
for ($i = 0; $i < $c; ++$i) {
$properties[$i]->setAccessible(true);
// Might trigger a warning!
$value = $properties[$i]->getValue($input);
if (isset($value) && $value !== "" && $value !== []) {
restore_error_handler();
return true;
}
}
}
restore_error_handler();
return false;
}
// Some tests
// The bad boy that emits a E_WARNING
var_dump(object_has_properties(new \mysqli())); // boolean(true)
var_dump(object_has_properties(new \stdClass())); // boolean(false)
var_dump(object_has_properties("")); // boolean(false)
class Foo {
public $prop1;
public $prop2;
}
var_dump(object_has_properties(new Foo())); // boolean(false)
$foo = new Foo();
$foo->prop1 = "bar";
var_dump(object_has_properties($foo)); // boolean(true)
Depending on what you consider an 'empty value', you may have to adjust the callback function that removes unwanted values:
function objectHasProperty($input){
return (
is_object($input)
&&
array_filter(
get_object_vars($input),
function($val){
// keep everything except empty strings and null values
return $val !== null && $val !== '';
}
)
) ? true : false;
}
$y = new stdClass;
$y->zero = 0;
$n = new stdClass;
$n->notset = null;
var_dump(objectHasProperty($y),objectHasProperty($n));//true,false
Recently I had a job interview. I had two tasks:
1) To refactor some JavaScript code
// The library 'jsUtil' has a public function that compares 2 arrays, returning true if
// they're the same. Refactor it so it's more robust, performs better and is easier to maintain.
/**
@name jsUtil.arraysSame
@description Compares 2 arrays, returns true if both are comprised of the same items, in the same order
@param {Object[]} a Array to compare with
@param {Object[]} b Array to compare to
@returns {boolean} true if both contain the same items, otherwise false
@example
if ( jsUtil.arraysSame( [1, 2, 3], [1, 2, 3] ) ) {
alert('Arrays are the same!');
}
*/
// assume jsUtil is an object
jsUtil.arraysSame = function(a, b) {
var r = 1;
for (i in a) if ( a[i] != b[i] ) r = 0;
else continue;
return r;
}
2) To refactor a PHP function that checks for a leap year
<?php
/*
The class 'DateUtil' defines a method that takes a date in the format DD/MM/YYYY, extracts the year
and works out if it is a leap year. The code is poorly written. Refactor it so that it is more robust
and easier to maintain in the future.
Hint: a year is a leap year if it is evenly divisible by 4, unless it is also evenly
divisible by 100 and not by 400.
*/
class DateUtil {
function notLeapYear ($var) {
$var = substr($var, 6, 4);
if (! ($var % 100) && $var % 400) {
return 1;
}
return $var % 4;
}
}
$testDates = array('03/12/2000', '01/04/2001', '28/01/2004', '29/11/2200');
/* the expected result is
* 03/12/2000 falls in a leap year
* 01/04/2001 does not fall in a leap year
* 28/01/2004 falls in a leap year
* 29/11/2200 does not fall in a leap year
*/
?>
<? $dateUtil = new DateUtil(); ?>
<ul>
<? foreach ($testDates as $date) { ?>
<li><?= $date ?> <?= ($dateUtil->notLeapYear($date) ? 'does not fall' : 'falls') ?> in a leap year</li>
<? } ?>
</ul>
I think I coped with the tasks, but I am not quite sure; I still don't have an answer from them and it's been about a week. Could you give an example of your approach to these tasks? I'd really appreciate it. Later I can post my solutions/code.
OK here are my answers to the questions.
<?php // Always use full/long opening tags, not short tags
$start = microtime(true);
class DateUtil {
/**
* The date could be used in other
* class methods in the future.
* Use just internally.
**/
var $_date;
/**
* The constructor of the class takes
* 1 argument, date, as a string and sets
* the object parameter _date to be used
* internally. This is compatible only with PHP 5;
* for PHP 4 it should be replaced with function DateUtil(...)
*/
public function __construct( $date = '' ) {
$this->_date = $date;
}
/**
* Setter for the date. Currently not used.
* We could also use the __set magic method
* in PHP 5.
**/
public function setDate( $date = '' ) {
$this->_date = $date;
}
/**
* Getter for the date. Currently not used.
* We could also use the __get magic method
* in PHP 5.
**/
public function getDate() {
return $this->_date;
}
public function isLeapYear( $year = '' ) {
// all leap years are divisible by 4
if (($year % 4) != 0) {
return false;
}
// years divisible by 400 are always leap years
if ($year % 400 == 0) {
return true;
} else if ($year % 100 == 0) {
return false;
}
return true;
}
}
$dates = array('03/12/2000', '01/04/2001', '30/01/2004', '29/11/2200');
$dateUtil = new DateUtil();
foreach($dates as $date) {
/**
* This processing is not done in the class
* because the date format could be different in
* other cases, so we cannot assume this will always
* be the format of the date.
*
* The PHP function strtotime() was not used because of
* the Year 2038 problem: dates after 2038 are not
* parsed correctly where the UNIX timestamp is a
* 32-bit integer.
* If the years we use are later than 1970 and earlier
* than 2038 we can use date('L', strtotime($date));
**/
$year = explode('/', $date);
$year = $year[2];
$isLeap = $dateUtil->isLeapYear($year);
echo '<pre>' . $date . ' - ';
echo ($isLeap)? 'Is leap year': 'Is not leap year';
echo '</pre>';
}
echo 'The execution took: ' . (microtime(true) - $start) . ' sec';
?>
JavaScript
/***************************************************/
jsUtil = new Object();
jsUtil.arraysSame = function(a, b) {
if( typeof(a) != 'object') {
// Check if the type of 'a' is object
return false;
} else if(typeof(a) != typeof(b)) {
// Check if the types of 'a' and 'b' are the same
return false;
} else if(a.length != b.length) {
// Check if arrays have different length if yes return false
return false;
}
for(var i in a) {
// We compare each element not only by value
// but also by type so 3 != '3'
if(a[i] !== b[i]) {
return false;
}
}
return true;
}
// It will work with associative arrays too
var a = {a:1, b:2, c:3};
var b = {a:1, b:2, c:3}; // true
var c = {a:1, b:2, g:3}; // false
var d = {a:1, b:2, c:'3'}; // false
var output = '';
output += 'Arrays a==b is: ' + jsUtil.arraysSame( a, b );
output += '\n';
output += 'Arrays a==c is: ' + jsUtil.arraysSame( a, c );
output += '\n';
output += 'Arrays a==d is: ' + jsUtil.arraysSame( a, d );
alert(output);
Iterate arrays using a for loop rather than for...in. If the arrays are different, you want to return as quickly as possible, so start with a length comparison and return immediately you come across an element that differs between the two arrays. Compare them using the strict inequality operator !==. Iterate backwards through the array for speed and to minimise the number of variables required by assigning a's length to i and reusing i as the iteration variable.
This code assumes that the parameters a and b are both supplied and are both Array objects. This seems to be implied by the question.
var jsUtil = jsUtil || {};
jsUtil.arraysSame = function(a, b) {
var i = a.length;
if (i != b.length) return false;
while (i--) {
if (a[i] !== b[i]) return false;
}
return true;
};
For the PHP version:
class DateUtil {
function LeapYear ($var) {
// createFromFormat() takes the format first, then the date string
$date = DateTime::createFromFormat('d/m/Y', $var);
return $date->format('L'); // returns "1" for a leap year, "0" otherwise
}
function notLeapYear($var) {
return !$this->LeapYear($var);
}
}
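If no date library is wanted, the rule from the task's hint can also be written directly as arithmetic. The sketch below is one possible reading of the task, not the interviewer's expected answer; isLeapYear and yearOf are illustrative names, and the input check in yearOf is an added assumption about what "more robust" means.

```php
<?php
declare(strict_types=1);

/** Leap-year rule from the task's hint, applied to a plain year number. */
function isLeapYear(int $year): bool
{
    return $year % 4 === 0 && ($year % 100 !== 0 || $year % 400 === 0);
}

/** Extract the year from a DD/MM/YYYY string (hypothetical helper). */
function yearOf(string $date): int
{
    $parts = explode('/', $date);
    if (count($parts) !== 3 || !ctype_digit($parts[2])) {
        throw new InvalidArgumentException("Expected DD/MM/YYYY, got '$date'");
    }
    return (int) $parts[2];
}

foreach (['03/12/2000', '01/04/2001', '28/01/2004', '29/11/2200'] as $date) {
    echo $date, isLeapYear(yearOf($date)) ? ' falls' : ' does not fall', " in a leap year\n";
}
```

Keeping the parsing separate from the rule means the rule can be tested with bare integers and the parser can be swapped if the date format changes.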
For the first problem maybe I can help you with this:
var jsUtil = jsUtil || {};
jsUtil.arraysSame = function(a, b){
//both must be arrays
if (!(a instanceof Array) || !(b instanceof Array)) {
return false;
}
//both must have the same size
if (a.length !== b.length) {
return false;
}
var isEquals = true;
for (var i = 0, j = a.length; i < j; i++) {
if (a[i] !== b[i]) {
isEquals = false;
i = j; //don't use break
}
}
return isEquals;
}
I included type checking and made things clearer.
In my opinion using built-in predefined functions is always your best bet.
1) Use a function that converts the arrays into strings. There are many of these available and depending on which library you are already using you may want to use different ones. You can find one at Json.org
jsUtil.arraysSame = function(a, b) {
return JSON.stringify(a) == JSON.stringify(b);
}
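2) Use PHP's built-in date function and strtotime
class DateUtil {
function notLeapYear ($var) {
// strtotime() reads dates with slashes as m/d/y, so switch to dashes to get d-m-y
return (date( 'L', strtotime( str_replace('/', '-', $var))) == "0");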
}
}
- Check inputs (type, range; keep in mind that very old dates used a different calendar). You might use PHP date functions to parse the date (more flexibility on one hand, limited to relatively recent dates on the other).
- Never iterate with in in JavaScript; it will fail horribly when prototypes of the standard types have been extended.
- You should clarify what the functions should do; e.g. should the array comparison be recursive? Should it use strict equivalence?
- You can stop iterating the arrays when the first difference is found. Also, you might want to check if the two refer to the same object before starting to iterate.
- Write unit tests.
I'm writing a text tag parser and I'm currently using this recursive method to create tags of n words. Is there a way that it can be done non-recursively or at least be optimized? Assume that $this->dataArray could be a very large array.
/**
* A recursive function to add phrases to the tagTracker array
* @param string $data
* @param int $currentIndex
* @param int $depth
*/
protected function compilePhrase($data, $currentIndex, $depth){
if (!empty($data)){
if ($depth >= $this->phraseStart){
$this->addDataCount($data, $depth);
}
if ($depth < $this->phraseDepth){
$currentIndex = $currentIndex + 1;
//$this->dataArray is an array containing all words in the text
$data .= ' '.$this->dataArray[$currentIndex];
$depth += 1;
$this->compilePhrase($data, $currentIndex, $depth);
}
}
}
See if you can use tail recursion rather than call-based recursion. Some rewriting may be required, but a cursory look suggests it is fine to do here.
Tail recursion works well for a subset of recursive functions, and it is good practice to spot when a loop can replace recursion and how to rewrite it.
That said, I don't know what the overhead of the call is inside PHP. It might just be a return-pointer type setup rather than a real stack wind.
It turns out to be about the same. Does PHP optimize tail-recursive calls away itself?
Here is my rewrite, but beware, my brain is currently sleep deprived!
protected function compilePhrase($data, $currentIndex, $depth){
/* Loop until break condition */
while(true) {
if (!empty($data)){
if ($depth >= $this->phraseStart){
$this->addDataCount($data, $depth);
}
if ($depth < $this->phraseDepth){
$currentIndex = $currentIndex + 1;
// A test here might be better than the !empty($data)
// in the IF condition. Check array bounds, assuming
// array is not threaded or anything
$data .= ' '.$this->dataArray[$currentIndex];
$depth += 1;
}else{
break;
}
}else{
break;
}
}
/* Finish up here */
return($this->dataArray);
}
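A fully iterative alternative is to walk the word array with two nested loops, which also makes the array-bounds check explicit. This is a sketch under assumptions: countPhrases is a hypothetical stand-alone function that replaces both compilePhrase and addDataCount, it takes the minimum and maximum phrase length as word counts, and it returns the counts instead of storing them on the object.

```php
<?php
declare(strict_types=1);

/**
 * Count every phrase of $minWords..$maxWords consecutive words.
 * (countPhrases is a hypothetical stand-in for compilePhrase + addDataCount.)
 */
function countPhrases(array $words, int $minWords, int $maxWords): array
{
    $counts = [];
    $total = count($words);
    for ($start = 0; $start < $total; $start++) {
        $phrase = '';
        // Grow the phrase one word at a time, stopping at the array end.
        for ($len = 1; $len <= $maxWords && $start + $len <= $total; $len++) {
            $word = $words[$start + $len - 1];
            $phrase = $len === 1 ? $word : $phrase . ' ' . $word;
            if ($len >= $minWords) {
                $counts[$phrase] = ($counts[$phrase] ?? 0) + 1;
            }
        }
    }
    return $counts;
}

print_r(countPhrases(['to', 'be', 'or', 'not', 'to', 'be'], 1, 2));
```

The work done is the same as in the recursive version (each start position grows at most $maxWords phrases), so any gain would come from avoiding the function calls rather than from a better complexity.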