Detecting infinite array recursion in PHP?

i've just reworked my recursion detection algorithm in my pet project dump_r()
https://github.com/leeoniya/dump_r.php
detecting object recursion is not too difficult - you use spl_object_hash() to get the unique internal id of the object instance, store it in a dict and compare against it while dumping other nodes.
for array recursion detection, i'm a bit puzzled, i have not found anything helpful. php itself is able to identify recursion, though it seems to do it one cycle too late. EDIT: nvm, it occurs where it needs to :)
$arr = array();
$arr[] = array(&$arr);
print_r($arr);
does it have to resort to keeping track of everything in the recursion stack and do shallow comparisons against every other array element?
any help would be appreciated,
thanks!
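For reference, the object side described in the question could look roughly like the sketch below. has_object_cycle is a made-up name, not dump_r's actual code; it only follows public properties (that is all get_object_vars() sees from outside a class) and it assumes the arrays it meets are not themselves recursive.

```php
<?php
// Hypothetical helper: true if an object is reachable from itself.
// $seen holds the spl_object_hash() of every object on the current path.
function has_object_cycle($value, array $seen = [])
{
    if (is_object($value)) {
        $id = spl_object_hash($value);
        if (isset($seen[$id])) {
            return true;            // already on the path: cycle
        }
        $seen[$id] = true;
        $value = get_object_vars($value); // public properties only
    }
    if (is_array($value)) {
        foreach ($value as $child) {
            if (has_object_cycle($child, $seen)) {
                return true;
            }
        }
    }
    return false;
}

$a = new stdClass;
$b = new stdClass;
$a->b = $b;
$b->a = $a;          // $a -> $b -> $a
$cyclic = has_object_cycle($a);

$c = new stdClass;
$c->list = [1, 2];
$d = new stdClass;
$d->left = $c;
$d->right = $c;      // shared, but not cyclic
$shared = has_object_cycle($d);
```

Because $seen is passed by value, it only tracks the current path, so an object that is merely referenced twice is not reported as recursion.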

Because of PHP's call-by-value mechanism, the only solution I see here is to iterate the array by reference and set an arbitrary marker value in it, which you later check for to find out whether you have been there before:
function iterate_array(&$arr){
if(!is_array($arr)){
print $arr;
return;
}
// if this key is present, it means you already walked this array
if(isset($arr['__been_here'])){
print 'RECURSION';
return;
}
$arr['__been_here'] = true;
foreach($arr as $key => &$value){
// skip the marker; iterate_array() prints scalars itself and recurses into arrays
if($key !== '__been_here'){
iterate_array($value);
}
}
// break the reference left behind by the by-reference foreach
unset($value);
// you need to unset it when done because you're working with a reference...
unset($arr['__been_here']);
}
You could wrap this function into another function that accepts values instead of references, but then you would get the RECURSION notice from the 2nd level on. I think print_r does the same too.
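A non-printing variant of the same marker technique, as a sketch only: array_has_cycle is a hypothetical name, and the code assumes the data never uses '__been_here' as a real key.

```php
<?php
// Hypothetical helper: true if the array contains a reference back to
// itself (directly or through nested arrays). Works on the caller's array
// by reference and removes its marker again before returning.
function array_has_cycle(array &$arr)
{
    if (isset($arr['__been_here'])) {
        return true;                 // we are already inside this array
    }
    $arr['__been_here'] = true;
    $found = false;
    foreach ($arr as $key => &$value) {
        if ($key !== '__been_here' && is_array($value) && array_has_cycle($value)) {
            $found = true;
            break;
        }
    }
    unset($value);                   // drop the foreach reference
    unset($arr['__been_here']);      // leave the array as we found it
    return $found;
}

$recursive = array();
$recursive[] = array(&$recursive);
$plain = array(1, array(2, 3));

$r1 = array_has_cycle($recursive);
$r2 = array_has_cycle($plain);
```

The marker has to live in the array itself because PHP offers no identity for arrays comparable to spl_object_hash() for objects.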

Someone will correct me if I am wrong, but PHP is actually detecting recursion at the right moment. Your assignment simply creates the additional cycle. The example should be:
$arr = array();
$arr = array(&$arr);
Which will result in
array(1) { [0]=> &array(1) { [0]=> *RECURSION* } }
As expected.
Well, I got a bit curious myself how to detect recursion and I started to Google. I found this article http://noteslog.com/post/detecting-recursive-dependencies-in-php-composite-values/ and this solution:
function hasRecursiveDependency($value)
{
//if PHP detects recursion in a $value, then a printed $value
//will contain at least one match for the pattern /\*RECURSION\*/
$printed = print_r($value, true);
$recursionMetaUser = preg_match_all('#\*RECURSION\*#', $printed, $matches);
if ($recursionMetaUser == 0)
{
return false;
}
//if PHP detects recursion in a $value, then a serialized $value
//will contain matches for the pattern /\*RECURSION\*/ never because
//of metadata of the serialized $value, but only because of user data
$serialized = serialize($value);
$recursionUser = preg_match_all('#\*RECURSION\*#', $serialized, $matches);
//all the matches that are user data instead of metadata of the
//printed $value must be ignored
$result = $recursionMetaUser > $recursionUser;
return $result;
}

Related

When is foreach with a parameter by reference dangerous?

I know that it can be dangerous to iterate over the items by reference in foreach.
In particular, one must not reuse the variable that was passed by reference, because it affects the $array, like in this example:
$array = ['test'];
foreach ($array as &$item){
$item = $item;
}
$item = 'modified';
var_dump($array);
array(1) {
[0]=>
&string(8) "modified"
}
Now this one bit me: the content of the array gets modified inside the function should_not_modify, even though I pass $array by value, not by reference.
function should_not_modify($array){
foreach($array as &$item){
$item = 'modified';
}
}
$array = ['test'];
foreach ($array as &$item){
$item = (string)$item;
}
should_not_modify($array);
var_dump($array);
array(1) {
[0]=>
&string(8) "modified"
}
I'm tempted to go through my whole codebase and insert unset($item); after each foreach($array as &$item).
But, since this is a big task and introduces a potentially useless line, I would like to know if there is a simple rule to know when foreach($array as &$item) is safe without an unset($item); after it, and when not.
Edit for clarification
I think I understand what happens and why. I also know the best defence against it: foreach($array as &$item){...};unset($item);
I know that this is dangerous after foreach($array as &$item):
- reusing the variable $item
- passing the array to a function
My question is: Are there other cases that are dangerous, and can we build an exhaustive list of what is dangerous. Or the other way round: is it possible to describe when it is not dangerous.
About foreach
First of all, some (maybe obvious) clarifications about two behaviors of PHP:
foreach($array as $item) does not unset $item after the loop: the variable keeps the value of the last element. If the variable is a reference, as in foreach($array as &$item), it still "points" to the last element of the array after the loop.
When a variable is a reference, an assignment such as $item = 'foo'; changes whatever the reference is pointing to, not the variable ($item) itself. This is also true for a subsequent foreach($array2 as $item), which treats $item as a reference if it has been created as such and therefore modifies whatever the reference is pointing to (the last element of the array used in the previous foreach in this case).
Obviously this is very error prone, and that is why you should always unset the reference used in a foreach to ensure that following writes do not modify the last element (as in example #10 of the doc for the type array).
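The two behaviors above can be seen side by side in a short sketch (the variable names $a and $b are only for illustration):

```php
<?php
// Without unset(): the second loop writes every value THROUGH the stale
// reference, i.e. into the last element of $a.
$a = [1, 2, 3];
foreach ($a as &$item) {
}
foreach ($a as $item) {
}
unset($item);   // clean up before the second experiment

// With unset(): $item is an ordinary variable again, $b is left alone.
$b = [1, 2, 3];
foreach ($b as &$item) {
}
unset($item);
foreach ($b as $item) {
}

var_dump($a);   // last element has been overwritten: 1, 2, 2
var_dump($b);   // unchanged: 1, 2, 3
```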
About the function that modifies the array
It's worth noting that - as pointed out in a comment by @iainn - the behavior in your example has nothing to do with foreach. The mere existence of a reference to an element of the array will allow this element to be modified. Example:
function should_not_modify($array){
$array[0] = 'modified';
$array[1] = 'modified2';
}
$array = ['test', 'test2'];
$item = & $array[0];
should_not_modify($array);
var_dump($array);
Will output:
array(2) {
[0] =>
string(8) "modified"
[1] =>
string(5) "test2"
}
This is admittedly very surprising, but it is explained in the PHP documentation, "What References Do":
Note, however, that references inside arrays are potentially dangerous. Doing a normal (not by reference) assignment with a reference on the right side does not turn the left side into a reference, but references inside arrays are preserved in these normal assignments. This also applies to function calls where the array is passed by value. [...] In other words, the reference behavior of arrays is defined in an element-by-element basis; the reference behavior of individual elements is dissociated from the reference status of the array container.
With the following example (copy/pasted):
/* Assignment of array variables */
$arr = array(1);
$a =& $arr[0]; //$a and $arr[0] are in the same reference set
$arr2 = $arr; //not an assignment-by-reference!
$arr2[0]++;
/* $a == 2, $arr == array(2) */
/* The contents of $arr are changed even though it's not a reference! */
It's important to understand that when creating a reference, for example $a = &$b, $a and $b are equal peers. $a is not pointing to $b or vice versa: $a and $b are pointing to the same place.
So when you do $item = & $array[0]; you actually make $array[0] point to the same place as $item. Since $item is a global variable, and references inside arrays are preserved, modifying $array[0] from anywhere (even from within the function) modifies it globally.
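A small sketch of that point, and of how the array behaves like a plain value again once the outside reference is gone. overwrite_first is a hypothetical function used only for this illustration:

```php
<?php
// Takes its argument BY VALUE and writes to its local copy.
function overwrite_first(array $copy)
{
    $copy[0] = 'modified';
}

$array = ['test'];
$item = &$array[0];          // $array[0] and $item now share one reference

overwrite_first($array);
$while_referenced = $array[0];   // the write leaked through the reference

unset($item);                // only the array holds the element now
$array[0] = 'test';
overwrite_first($array);
$after_unset = $array[0];    // the by-value copy no longer shares the element
```

The difference between the two calls is only whether a second variable still participates in the reference set of $array[0].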
Conclusion
Are there other cases that are dangerous, and can we build an exhaustive list of what is dangerous. Or the other way round: is it possible to describe when it is not dangerous.
I'm going to repeat the quote from the PHP doc again: "references inside arrays are potentially dangerous".
So no, it's not possible to describe when it is not dangerous, because it is never not dangerous. It's too easy to forget that $item has been created as a reference (or that a global reference has been created and not destroyed), and reuse it elsewhere in your code and corrupt the array. This has long been a topic of debate (in this bug for example), and people call it either a bug or a feature...
The accepted answer is the best, but I'd like to give a complement: When is unset($item); not necessary after a foreach($array as &$item) ?
$item: if it is never reused after, it cannot harm.
$array: the last element is a reference. This is always dangerous, for all the reasons already stated.
So what changes that element from being a reference back to a value?
the most cited: unset($item);
when $item falls out of scope: if the array is returned from a function, it becomes 'normal' once the function has returned.
function test(){
$array = [1];
foreach($array as &$item){
$item = $item;
}
var_dump($array);
return $array;
}
$a = test();
var_dump($a);
array(1) {
[0]=>
&int(1)
}
array(1) {
[0]=>
int(1)
}
But beware: if you do anything else before returning, it can bite !
You can break the reference by "json decode/encode"
function should_not_modify($array){
$array = json_decode(json_encode($array),false);
foreach($array as &$item){
$item = 'modified';
}
}
$array = ['test'];
foreach ($array as &$item){
$item = (string)$item;
}
should_not_modify($array);
var_dump($array);
The question is purely academic, and this is a bit of a hack. But, it's sort of fun, in a stupid programming way.
And of course it outputs:
array(1) {
[0]=>string(4) "test"
}
As an aside, the same thing works in JavaScript, which can also give you some wonkiness from references.
I wish I had a good example, because I've had some "weird" stuff happen, I mean like some quantum entanglement stuff. This one time at a PHP camp, I had a recursive function ( pass by reference ) with a foreach ( pass by reference ) and well it sort of ripped a hole in the space time continuum.

Remove first element from simple array in loop

This question has been asked a thousand times, but each question I find talks about associative arrays where one can delete (unset) an item by using the key as an identifier. But how do you do this if you have a simple array, and no key-value pairs?
Input code
$bananas = array('big_banana', 'small_banana', 'ripe_banana', 'yellow_banana', 'green_banana', 'brown_banana', 'peeled_banana');
foreach ($bananas as $banana) {
// do stuff
// remove current item
}
In Perl I would work with for and indices instead, but I am not sure that's the (safest?) way to go - even though from what I hear PHP is less strict in these things.
Note that after foreach has run, I expected var_dump($bananas) to return an empty array (or null, but preferably an empty array).
1st method (delete by value comparison):
$bananas = array('big_banana', 'small_banana', 'ripe_banana', 'yellow_banana', 'green_banana', 'brown_banana', 'peeled_banana');
foreach ($bananas as $key=>$banana) {
if($banana=='big_banana')
unset($bananas[$key]);
}
2nd method (delete by key):
$bananas = array('big_banana', 'small_banana', 'ripe_banana', 'yellow_banana', 'green_banana', 'brown_banana', 'peeled_banana');
unset($bananas[0]); //removes the first value
array_pop($bananas); //removes the last value (count($bananas)-1 is no longer the last key once keys have gaps)
//unset($bananas[n-1]); removes the nth value
Finally, if you want to reset the keys after the deletion process:
$bananas = array_values($bananas);
If you want to empty the array completely:
unset($bananas);
$bananas= array();
A simple array still has (numeric) indexes:
foreach ($bananas as $key => $banana) {
// do stuff
unset($bananas[$key]);
}
// cache the length: count($bananas) shrinks with every unset()
for($i=0, $n=count($bananas); $i<$n; $i++)
{
//doStuff
unset($bananas[$i]);
}
This will delete every element after its use, so you will eventually end up with an empty array.
If for some reason you need to reindex after deleting, you can use array_values.
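A short sketch of why reindexing can matter: unset() removes the element but leaves the remaining keys as they were.

```php
<?php
$bananas = ['big_banana', 'small_banana', 'ripe_banana'];

unset($bananas[0]);
$keys_after_unset = array_keys($bananas);    // keys start at 1 now

$bananas = array_values($bananas);           // renumber from 0
$keys_after_reindex = array_keys($bananas);
```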
How about a while loop with array_shift?
while (($item = array_shift($bananas)) !== null)
{
//
}
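One caveat with the !== null test: the loop also stops early if the array happens to contain a null element. A variant that tests the array itself instead, as a sketch (the null entry and the $processed list are only there to show the difference):

```php
<?php
$bananas = ['big_banana', null, 'ripe_banana'];
$processed = [];

while ($bananas) {                    // an empty array is falsy
    $banana = array_shift($bananas);  // removes and returns the first item
    $processed[] = $banana;           // do stuff
}
```

After the loop $bananas is an empty array and all three items, including the null, have been handled.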
Your note: "Note that after foreach has run, I expected var_dump($bananas) to return an empty array (or null, but preferably an empty array)."
Simply use unset.
foreach ($bananas as $key => $banana) {
// do stuff
// remove current item
unset($bananas[$key]);
}
print_r($bananas);
Result
Array
(
)
This question is old but I will post my idea using array_slice for new visitors.
while(!empty($bananas)) {
// ... do something with $bananas[0] like
echo $bananas[0].'<br>';
$bananas = array_slice($bananas, 1);
}

PHP Add elements to array

The line below commented GOAL creates an error. The error is not displayed (just get a white screen) and I do not have access to php.ini to change the settings. I'm quite sure the error is something along the lines of "can not use [] for reading".
How can I get around this? The keys must be preserved and that doesn't seem possible with array_push.
foreach ($invention_values as $value)
{
if( array_key_exists($value->field_name, $array) )
{
//GOAL but creates error: $array[$value->field_name][] = $value->field_value;
//works but only with numeric keys
array_push($array, $value);
}
else $array[$value->field_name] = $value;
}
EDIT: code
EDIT2: Actually I think the error is because I'm dealing with an object and not an array. What is the object equivalent of
$array[$value->field_name][] = $value ?
Your $array[$value->field_name] is empty, so you can't use [] on it. To initialize it as an array, you have to do the following:
if(!array_key_exists($value->field_name, $array) ){
$array[$value->field_name] = array();
}
$array[$value->field_name][] = $value->field_value;
This contradicts what you have in your last line, so you have to decide whether you want $array[$value->field_name] to be an array or a scalar value.
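If every key should hold a list, the whole loop reduces to one append per row. A sketch with made-up sample rows; the field_name and field_value properties come from the question, the data does not:

```php
<?php
// Hypothetical stand-in for $invention_values.
$invention_values = [
    (object) ['field_name' => 'color', 'field_value' => 'red'],
    (object) ['field_name' => 'size',  'field_value' => 'L'],
    (object) ['field_name' => 'color', 'field_value' => 'blue'],
];

$array = [];
foreach ($invention_values as $value) {
    // [] on a key that does not exist yet creates the inner array
    $array[$value->field_name][] = $value->field_value;
}
```

Every entry of $array is then an array, even when a field name occurs only once, which avoids mixing scalars and arrays under the same structure.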

How do you replicate an array whilst keeping the same keys?

Ok. I've written a simple(ish) function to take an argument and return the same argument with the dangerous HTML characters replaced with their character entities.
The function can take as an argument either a string, an array or a 2D array - 3d arrays or more are not supported.
The function is as follows:
public function html_safe($input)
{
if(is_array($input)) //array was passed
{
$escaped_array = array();
foreach($input as $in)
{
if(is_array($in)) //another array inside the initial array found
{
$inner_array = array();
foreach($in as $i)
{
$inner_array[] = htmlspecialchars($i);
}
$escaped_array[] = $inner_array;
}
else
$escaped_array[] = htmlspecialchars($in);
}
return $escaped_array;
}
else // string
return htmlspecialchars($input);
}
This function does work, but the problem is that I need to maintain the array keys of the original array.
The purpose of this function was to make it so we could literally pass a result set from a database query and get back all the values with the HTML characters made safe. Obviously therefore, the keys in the array will be the names of database fields and my function at the moment is replacing these with numeric values.
So yeah, I need to get back the same argument passed to the function with array keys still intact (if an array was passed).
Hope that makes sense, suggestions appreciated.
You can use recursion rather than nesting loads of foreaches:
function html_safe($input) {
if (is_array($input)) {
return array_map('html_safe', $input);
} else {
return htmlspecialchars($input);
}
}
Ok I think I've figured this one out myself...
my foreach loops didn't have any keys specified for example they were:
foreach($array_val as $val)
instead of:
foreach($array_val as $key => $val)
in which case I could have preserved the array keys in the output arrays.
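Combining both ideas (recursion and $key => $val) gives a version that keeps keys at any depth. This is a sketch written as a plain function rather than a class method, and html_safe_keep_keys is a hypothetical name:

```php
<?php
function html_safe_keep_keys($input)
{
    if (is_array($input)) {
        $escaped = array();
        foreach ($input as $key => $val) {
            // same key, escaped (or recursively processed) value
            $escaped[$key] = html_safe_keep_keys($val);
        }
        return $escaped;
    }
    return htmlspecialchars((string) $input);
}

$row = array(
    'name' => '<b>Bob</b>',
    'tags' => array('first' => 'x&y'),
);
$safe = html_safe_keep_keys($row);
```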

A $_GET input parameter that is an Array

I'm trying to pass 3 parameter to a script, where the 3rd parameter $_GET['value3'] is supposed to be an array
$_GET['value1']
$_GET['value2']
$_GET['value3'] //an array of items
I'm calling the script like this: (notice my syntax for value3, I'm not sure it's correct)
http://localhost/test.php?value1=test1&value2=test2&value3=[the, array, values]
I then use a foreach to hopefully loop through the third parameter value3 which is the array
//process the first input $_GET['value1']
//process the second input $_GET['value2']
//process the third input $_GET['value3'] which is the array
foreach($_GET['value3'] as $arrayitem){
echo $arrayitem;
}
but I get the error Invalid argument supplied for foreach()
I'm not sure if my methodology is correct. Can someone clarify how you'd go about doing this sort of thing?
There is no such thing as "passing an array as a URL parameter" (or a form value, for that matter, because this is the same thing). These are strings, and anything that happens to them beyond that is magic that has been built into your application server, and therefore it is non-portable.
PHP happens to support the &value3[]=the&value3[]=array&value3[]=values notation to automagically create $_GET['value3'] as an array for you, but this is special to PHP and does not necessarily work elsewhere.
You can also be straight-forward and go for a cleaner URL, like this: value3=the,array,values, and then use explode(',', $_GET['value3']) in your PHP script to create an array. Of course this implies that your separator char cannot be part of the value.
To unambiguously transport structured data over HTTP, use a format that has been made for the purpose (namely: JSON) and then use json_decode() on the PHP side.
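A sketch of the JSON route: parse_str() stands in for what PHP does when it fills $_GET, so the round trip can be run without a web server. The parameter names are the ones from the question.

```php
<?php
$items = array('the', 'array', 'values');

// Sender side: http_build_query() takes care of the URL encoding.
$query = http_build_query(array(
    'value1' => 'test1',
    'value2' => 'test2',
    'value3' => json_encode($items),
));
$url = 'http://localhost/test.php?' . $query;

// Receiver side: $params plays the role of $_GET here.
parse_str($query, $params);
$decoded = json_decode($params['value3'], true);

// Guard the loop: json_decode() returns null on malformed input.
if (is_array($decoded)) {
    foreach ($decoded as $arrayitem) {
        echo $arrayitem, "\n";
    }
}
```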
try
http://localhost/test.php?value1=test1&value2=test2&value3[]=the&value3[]=array&value3[]=values
For arrays you need to pass the query parameters as
value3[]=abc&value3[]=pqr&value3[]=xyz
You can name the index in the query string too:
?value1[a]=test1a&value1[b]=test1b&value2[c][]=test3a&value2[c][]=test3b
would be
$_GET['value1']['a'] = 'test1a';
$_GET['value1']['b'] = 'test1b';
$_GET['value2']['c'] = array( 'test3a', 'test3b' );
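The bracket notation can be checked without a web server, since parse_str() parses a query string the same way PHP does for $_GET. A small sketch ($get is just a local variable here):

```php
<?php
$query = 'value1[a]=test1a&value1[b]=test1b&value2[c][]=test3a&value2[c][]=test3b';
parse_str($query, $get);

print_r($get);
```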
http://php.net/manual/en/reserved.variables.get.php
Check out the above link..
You will see how the GET method is implemented.
What happens is that the URL is taken, it is delimited using '&' and then they are added as a key-value pair.
public function fixGet($args) {
if(count($_GET) > 0) {
if(!empty($args)) {
$lastkey = "";
$pairs = explode("&",$args);
foreach($pairs as $pair) {
if(strpos($pair,":") !== false) {
list($key,$value) = explode(":",$pair);
unset($_GET[$key]);
$lastkey = "&$key$value";
} elseif(strpos($pair,"=") === false)
unset($_GET[$pair]);
else {
list($key, $value) = explode("=",$pair);
$_GET[$key] = $value;
}
}
}
}
return "?".((count($_GET) > 0)?http_build_query($_GET).$lastkey:"");
}
Since they are added as key-value pairs of strings, you can't pass an array as such in a GET request...
The following would also work:
http://localhost/test.php?value3[]=the&value3[]=array&value3[]=values
A more advanced approach would be to serialize the PHP array and print it in your link:
http://localhost/test.php?value3=a:3:{i:0;s:3:"the";i:1;s:5:"array";i:2;s:6:"values";}
would, essentially, also work, provided the value is URL-encoded in the link and the script runs it through unserialize().
