Does PHP's in_array really go through the whole array? - php

I stumbled over a few articles (e.g. this one) and other sources that suggest PHP's in_array() goes through the whole array.
Now there is a possible duplicate of this question here: How does PHP's in_array function work? but the OP was obviously satisfied with the copy/paste of the C language function definition and no further description...
My question however is:
Does PHP's in_array() really go through the whole array?
I tried to look further and go after the ZEND_HASH_FOREACH_KEY_VAL, but then it got a bit confusing:
the C-language definition of php_search_array() ... AKA in_array() in PHP
Codes of ZEND_HASH_FOREACH_KEY_VAL and ZEND_HASH_FOREAC
Only thing I am sure of is that since the ??iteration?? happens on the "C-level" it should be faster than "manual" foreach...

Does PHP's in_array really go through the whole array?
TL;DR: No, it doesn't. It stops at the first match and only visits every element when there is none.
The way I read the C implementation:
ZEND_HASH_FOREACH_KEY_VAL or rather ZEND_HASH_FOREACH iterates over the array data bucket with a pointer to the current element.
The element pointer is assigned to the variable entry in void php_search_array for each iteration.
When a match is found, the engine returns either PHP true (for in_array) or the key of the matching element (for array_search), depending on the behavior argument given to the function.
To answer your question:
php_search_array either invokes Zend RETURN_TRUE (impl: https://github.com/php/php-src/blob/master/Zend/zend_API.h) or sets the return value with a RETVAL_* macro and then performs a C return;. In both cases, C execution breaks out of the iteration over the array as soon as a match is found.
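The early exit described above can be modelled in a few lines of Python. This is only a sketch of the control flow, not the engine's code: search_array and its visited counter are hypothetical names, and Python's == stands in for PHP's comparison rules.

```python
def search_array(needle, haystack, behavior=0):
    """Model of the loop: behavior 0 mimics in_array, 1 mimics array_search.

    Returns (result, visited) so the number of inspected elements is visible.
    """
    visited = 0
    for key, entry in haystack.items():
        visited += 1
        if entry == needle:
            # Returning here leaves the loop at once; later elements are never read.
            return (True if behavior == 0 else key), visited
    # No match: every element was inspected.
    return False, visited


haystack = {0: "a", 1: "b", 2: "c"}
print(search_array("b", haystack))      # (True, 2)
print(search_array("c", haystack, 1))   # (2, 3)
print(search_array("z", haystack))      # (False, 3)
```

Only the no-match case walks the full array, which is the worst case of a linear search.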

Related

How can I write this in Python?

$class->categories[$cat->category_parent_id][]=$cat;
I have developed a script where I want to write this php code in python.
How can I create this categories array in Python?
So far I have done this:
categories_c.insert(row["category_parent_id"], row)
But I am not sure if this is the correct implementation.
Let's deconstruct:
$class->categories
This accesses the categories property of the $class object. In Python that's:
klass.categories # (class is a reserved keyword, using klass instead)
We'll assume that categories is a dict here, because you want to do this:
categories[$cat->category_parent_id]
This accesses a particular key of an array, in Python that would be the key of a dict:
categories[cat.category_parent_id]
Now the tricky part:
...[] = $cat
This pushes $cat into the array. In Python that means appending a value to a list. PHP will implicitly create new sub-arrays as necessary if they don't exist. Python doesn't. This is the only part that needs a bit of additional consideration. What you need to know is whether categories[cat.category_parent_id] already exists in your dict or whether you want to create it in the process.
If you know it exists, if categories is a dict of lists in which all cat.category_parent_id keys are already pre-populated, then it's as simple as this:
klass.categories[cat.category_parent_id].append(cat)
However, if the keys don't exist, you must ensure they're created and that their value is set to a list the first time they're accessed. The most compact notation for that in Python is the dict.setdefault method:
klass.categories.setdefault(cat.category_parent_id, []).append(cat)
If categories[cat.category_parent_id] exists, it is returned and you can append to it. If it doesn't exist, it is created and initialised to [] and then returned.
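Put together, a minimal runnable sketch of the setdefault approach could look like this. The group_by_parent helper and the sample categories are made up for illustration; SimpleNamespace merely stands in for whatever objects or rows the script really uses.

```python
from types import SimpleNamespace


def group_by_parent(cats):
    """Group category objects into a dict of lists keyed by category_parent_id."""
    categories = {}
    for cat in cats:
        # setdefault returns the existing list, or stores and returns a new [].
        categories.setdefault(cat.category_parent_id, []).append(cat)
    return categories


# Hypothetical sample data.
cats = [
    SimpleNamespace(name="fruit", category_parent_id=0),
    SimpleNamespace(name="apple", category_parent_id=1),
    SimpleNamespace(name="pear", category_parent_id=1),
]
grouped = group_by_parent(cats)
print({parent: [c.name for c in children] for parent, children in grouped.items()})
# {0: ['fruit'], 1: ['apple', 'pear']}
```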

Why would one want to pass primitive-type parameters by reference in PHP?

One thing that's always bugged me (and everyone else, ever) about PHP is its inconsistency in function naming and parameters. Another more recent annoyance is its tendency to ask for function parameters by reference rather than by value.
I did a quick browse through the PHP manual, and found the function sort() as an example. If I was implementing that function I'd take an array by value, sort it into a new array, and return the new value. In PHP, sort() returns a boolean, and modifies the existing array.
How I'd like to call sort():
$array = array('c','a','b');
$sorted_array = sort($array);
How PHP wants me to call sort():
$array = array('c','a','b');
sort($array);
$sorted_array = $array;
And additionally, the following throws a fatal error: Fatal error: Only variables can be passed by reference
sort(array('c','a','b'));
I'd imagine that part of this could be a legacy of PHP's old days, but there must have been a reason things were done this way. I can see the value in passing an object by reference ID like PHP 5+ does (which I guess is sort of in between pass by reference and pass by value), but not in the case of strings, arrays, integers and such.
I'm not an expert in the field of Computer Science, so as you can probably gather I'm trying to grasp some of these concepts still, and I'm curious as to whether there's a reason things are set up this way, or whether it's just a leftover.
The main reason is that PHP was developed by C programmers, and this is very much a C-programming paradigm. In C, it makes sense to pass a pointer to a data structure you want changed. In PHP, not so much (Among other things, because references are not the same as a pointer).
I believe this is done for speed reasons.
Most of the time you need the array you are working on to be sorted, not a copy.
If sort() returned a new copy of the array, then every time you called sort() the PHP engine would have to copy the array into a new one (costing time and memory), and you would have no way to control this behaviour.
If you need the original array to remain unsorted (and this doesn't happen so often) then just do:
$copy = $yourArray;
sort($yourArray);

$data = array() vs unset($array)

this is my first question.
I am doing some optimizations on a php script, improving its speed of execution...
Between :
$datas = array();
$datas['file_import'] = $file_name_reporting;
And :
unset($datas);
$datas['file_import'] = $file_name_reporting;
Can someone tell me which one is faster ?
Thank you
Your second example causes a warning, because $datas is null at that point and you are treating it as an array, so you have to declare it as an empty array first.
So just follow your first example - assign an empty array and then put into it some data.
array() will create an array whereas unset() will destroy a variable.
I think the first method is just an overwrite, but the second one involves deleting, checking existence, triggering a warning and creating a new array.
It's ridiculous to claim that either form is "faster" than the other. Both versions will execute so fast that you would need to run them millions of times inside a loop to perhaps notice a difference. Do you actually do that inside your script? If not, forget about "optimization" here (actually, it would be a good idea to forget about all optimization "by eye", as any experienced developer can tell you).
On top of that, the two versions actually do different things, in that unset will remove the name $datas from the symbol table (and give you a notice in the next line when you attempt to add a value to an array).
Just use what feels right, and look inside heavy loops to find something to optimize.
In both cases, a new array will be constructed. Unsetting a variable in PHP will set its value to null, only to call the array constructor on the next line. Although I agree with knittl, my suggestion would be:
$datas = array('file_import' => $file_name_reporting);
By creating a new array, you automatically 'unset' the variable, and by passing values to the array constructor, you can fill your array with whatever values you want while you're at it.
Obviously the first code will work faster because you do only two operations: explicitly create an array and add a portion of data. The second example will cause a warning because you destroy a variable and then try to use it again.
Additionally, unset will not release the used memory; it only releases the variable's pointer to it. The memory is released when the garbage collector runs. To release memory, use $datas = null; instead.

What to use as array_import/var_import for sort-of exported array?

I have a string. It's a user submitted string. (And you should never ever trust user submitted anything.)
If certain (not unsafe) characters exist in the string, it's supposed to become a multi dimensional array/tree. First I tried splits, regex and loops. Too difficult. I've found a very easy solution with a few simple str_replace's and the result is a string that looks like an array definition. Eg:
array('body', array('div', array('x'), array(), array('')), array(array('oele')))
It's a silly array, but it's very easily created. Now that string has to become that array. I'm using eval() for that and I don't like it. Since it's user submitted (and must be able to contain just about anything), there could be any sort of function calls in that string.
So the million dollar question: is there some kind of var_import, or array_import that creates an array from a string and does nothing else (like mysterious, dangerous calls to exec etc)?
Yes, I have tried php.net and neither of the above _import functions exist.
What I'm looking for is the exact opposite of var_export, because the string I have as input looks exactly like the string var_export would output.
Any other suggestions to make it safer than eval are also welcome! But I'm not abandoning the current method (it's just too simple).
Using
array('body', array('div', array('x'), array(), array('')), array(array('oele')))
as input, I replaced some chars to make it a valid JSON string and imported that via json_decode.
Works perfectly. If some illegal chars are present, json_decode will trip over them (and not execute any dangerous code).
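The answer does not say which characters were replaced, so the following is only a guess at the idea, sketched in Python with json.loads standing in for PHP's json_decode. The parse_array_literal name and the three replacements are assumptions, and the naive replacement only works while the payload strings themselves contain no quotes or parentheses.

```python
import json


def parse_array_literal(text):
    """Turn an array('a', array('b')) style string into nested lists via JSON."""
    # Assumed replacements: array( -> [, ) -> ], single quotes -> double quotes.
    as_json = text.replace("array(", "[").replace(")", "]").replace("'", '"')
    # json.loads only parses data; anything that is not valid JSON raises ValueError.
    return json.loads(as_json)


source = "array('body', array('div', array('x'), array(), array('')), array(array('oele')))"
print(parse_array_literal(source))
# ['body', ['div', ['x'], [], ['']], [['oele']]]

try:
    parse_array_literal("array(exec('rm -rf /'))")
except ValueError as err:
    print("rejected:", type(err).__name__)
```

Because the parser never evaluates its input, a smuggled function call ends up as a syntax error instead of being executed.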

Does foreach always create a copy on a non-reference in PHP?

I'm wondering if PHP has this optimization built in. Normally when you call foreach without using a reference it copies the passed array and operates on it. What happens if the reference count to that array is only 1?
Say for example if getData returns some array of data.
foreach(getData() as $data)
echo $data;
Since the array returned by getData() only has one reference shouldn't it just be used by reference and not copied first or does php not have this optimization?
This seems like a simple optimization that could help a lot of badly written code.
I can't say for certain, but PHP normally uses "copy on write", so everything is a reference until you try to write to it, at which time a copy is made and you write to the copy.
