How can "[" be an operator in the PHP language specification?

On the http://php.net/manual/en/language.operators.precedence.php webpage, the second highest precedence level contains a left-associative operator called [.
I don't understand that. Is it the [ used to access/modify array entries, as in $myArray[23] ? I cannot imagine any code snippet where we would need to know the "precedence" of it wrt other operators, or where the "associativity" of [ would be useful.

This is a very valid question.
1. Precedence in between [...]
First, there is never any ambiguity about what PHP should evaluate first on the
right side of the [, since the bracket requires a closing one to go with it, and so
every operator in between has precedence over the opening bracket.
Example:
$a[1+2]
The + has precedence, i.e. first 1+2 has to be evaluated before PHP can determine which
element to retrieve from $a.
But the operator precedence list is not about this.
2. Associativity
Secondly there is an order of evaluating consecutive pairs of [], like here:
$b[1][2]
PHP will first evaluate $b[1] and then apply [2] to the result. This is left-to-right
evaluation, which is what left associativity means here.
But the question at hand is not so much about associativity as about precedence with regard to other operators.
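As a small illustration of points 1 and 2, the following sketch (the array contents are made up for this example) shows that the expression between the brackets is evaluated before the lookup, and that consecutive brackets are applied from left to right:

```php
<?php
// Hypothetical sample data, not taken from the question.
$a = array(3 => 'three');
$b = array(1 => array(2 => 'nested'));

// 1 + 2 is evaluated first, so this reads element 3.
var_dump($a[1 + 2] === $a[3]);

// Left associativity: $b[1][2] behaves like taking $b[1] first, then [2] of that.
$inner = $b[1];
var_dump($b[1][2] === $inner[2]);
```

Both comparisons should come out true; neither tells us anything yet about how [ ranks against new or clone.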
3. Precedence over operators on the left side
The list states that clone and new operators have precedence over [, and this is not easy to test.
First of all, most of the constructs where you would combine new with square brackets are considered invalid
syntax. For example, both of these statements:
$a = new myClass()[0];
$a = new myClass[0];
will give a parsing error:
syntax error, unexpected '['
PHP requires you to add parentheses to make the syntax valid. So there is no way we can test
the precedence rules like this.
But there is another way, by using a variable containing a class name:
$a = new $test[0];
This is valid syntax, but now the challenge is to make a class that creates something
that acts like an array.
This is not trivial to do, as an object property is referenced like this: obj->prop, not
like obj["prop"]. One can however use the ArrayObject class which can deal with square brackets. The idea is to extend this class and redefine the offsetGet method to make sure a freshly made object of that class has array elements to return.
To make objects printable, I ended up using the magical method __toString, which is executed when an object needs to be cast to a string.
So I came up with this set-up, defining two similar classes:
class T extends ArrayObject {
public function __toString() {
return "I am a T object";
}
public function offsetGet ($offset) {
return "I am a T object's array element";
}
}
class TestClass extends ArrayObject {
public function __toString() {
return "I am a TestClass object";
}
public function offsetGet ($offset) {
return "I am a TestClass object's array element";
}
}
$test = "TestClass";
With this set-up we can test a few things.
Test 1
echo new $test;
This statement creates a new TestClass instance, which then needs to be converted to
string, so the __toString method is called on that new instance, which returns:
I am a TestClass object
This is as expected.
Test 2
echo (new $test)[0];
Here we start with the same actions, as the parentheses force the new operation to be executed first. This time PHP does not convert the created object to string, but requests array element 0 from it. This request is answered by the offsetGet method, and so the above statement outputs:
I am a TestClass object's array element
Test 3
echo new ($test[0]);
The idea is to force the opposite order of execution. Sadly enough, PHP does not allow this syntax, so we will have to break the statement into two in order to get the intended evaluation order:
$name = $test[0];
echo new $name;
So now the [ is executed first, taking the first character of the value of
$test, i.e. "T", and then new is applied to that. That's why I
defined also a T class. The echo calls __toString on that instance, which yields:
I am a T object
Now comes the final test to see which is the order when no parentheses are present:
Test 4
echo new $test[0];
This is valid syntax, and...
4. Conclusion
The output is:
I am a T object
So in fact, PHP applied the [ before the new operator, despite what is stated in the
operator precedence table!
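The same check can be reduced to a few lines without ArrayObject. This is only a sketch: the two empty classes are stand-ins defined for the example, and get_class() is used instead of __toString to see which class was instantiated.

```php
<?php
// Stand-in classes; only their names matter here.
class T {}
class TestClass {}

$test = "TestClass";

// No parentheses: which operator is applied first?
$obj = new $test[0];

// If [ is applied first, $test[0] is "T" and a T object is created.
echo get_class($obj), "\n";
```

In line with the result above, this should report T rather than TestClass.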
5. Comparing clone with new
The clone operator has similar behaviour in combination with [. Strangely enough, clone and new are not completely equal in terms of syntax rules. Repeating test 2 with clone:
echo (clone $test)[0];
yields a parsing error:
syntax error, unexpected '['
But test 4 repeated with clone shows that [ has precedence over it.
@bishop pointed out that this reproduces the long-standing documentation bug #61513: "clone operator precedence is wrong".

It just means that the array variable ($first, on the left) will be evaluated before the array key ($second, on the right):
$first[$second]
This makes a lot of sense when the array has multiple dimensions:
$first[$second][$third][$fourth]

In PHP you can initialize an empty array with [], so the precedence of [ determines how the array is initialized relative to the operators around it.
Since arrays are part of the syntax structure, this is done before any math; [ simply has a higher precedence than the arithmetic operators for that reason.
var_dump([5, 6, 7] + [1, 2, 3, 4]); # elements 5, 6, 7, 4 - the arrays must be known before applying the operator
However, in all honesty, I don't really understand the question. In most programming languages [ and ] are associated with arrays, which are part of the base syntax and always have a high priority (if not the highest).
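For reference, + applied to two arrays is the array union operator: entries from the left operand are kept, and only keys missing on the left are taken from the right. A minimal sketch using the arrays from the line above:

```php
<?php
$result = [5, 6, 7] + [1, 2, 3, 4];

// Keys 0..2 come from the left array; only key 3 is taken from the right.
print_r($result);
```

Both array literals are built before + is applied, which is the point the answer is making.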

Related

Avoid a "PHP Strict standards" warning with parentheses? [duplicate]

It was noted in another question that wrapping the result of a PHP function call in parentheses can somehow convert the result into a fully-fledged expression, such that the following works:
<?php
error_reporting(E_ALL | E_STRICT);
function get_array() {
return array();
}
function foo() {
// return reset(get_array());
// ^ error: "Only variables should be passed by reference"
return reset((get_array()));
// ^ OK
}
foo();
I'm trying to find anything in the documentation to explicitly and unambiguously explain what is happening here. Unlike in C++, I don't know enough about the PHP grammar and its treatment of statements/expressions to derive it myself.
Is there anything hidden in the documentation regarding this behaviour? If not, can somebody else explain it without resorting to supposition?
Update
I first found this EBNF purporting to represent the PHP grammar, and tried to decode my scripts myself, but eventually gave up.
Then, using phc to generate a .dot file of the two foo() variants, I produced AST images for both scripts using the following commands:
$ yum install phc graphviz
$ phc --dump-ast-dot test1.php > test1.dot
$ dot -Tpng test1.dot > test1.png
$ phc --dump-ast-dot test2.php > test2.dot
$ dot -Tpng test2.dot > test2.png
In both cases the result was exactly the same:
This behavior could be classified as a bug, so you should definitely not rely on it.
The (simplified) conditions for the message not to be thrown on a function call are as follows (see the definition of the opcode ZEND_SEND_VAR_NO_REF):
the argument is not a function call (or if it is, it returns by reference), and
the argument is either a reference or it has reference count 1 (if it has reference count 1, it's turned into a reference).
Let's analyze these in more detail.
First point is true (not a function call)
Due to the additional parentheses, PHP no longer detects that the argument is a function call.
When parsing a non empty function argument list there are three possibilities for PHP:
An expr_without_variable
A variable
(A & followed by a variable, for the removed call-time pass by reference feature)
When writing just get_array() PHP sees this as a variable.
(get_array()) on the other hand does not qualify as a variable. It is an expr_without_variable.
This ultimately affects the way the code compiles, namely the extended value of the opcode SEND_VAR_NO_REF will no longer include the flag ZEND_ARG_SEND_FUNCTION, which is the way the function call is detected in the opcode implementation.
Second point is true (the reference count is 1)
At several points, the Zend Engine allows non-references with reference count 1 where references are expected. These details should not be exposed to the user, but unfortunately they are here.
In your example you're returning an array that's not referenced from anywhere else. If it were, you would still get the message, i.e. this second point would not be true.
So the following very similar example does not work:
<?php
$a = array();
function get_array() {
return $GLOBALS['a'];
}
return reset((get_array()));
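Since the parentheses trick should not be relied on, a more robust pattern is to store the result in a variable before passing it on. A minimal sketch (the array contents and the $tmp name are made up for illustration):

```php
<?php
function get_array() {
    return array('first', 'second');
}

function foo() {
    // reset() takes its argument by reference, so give it a real variable.
    $tmp = get_array();
    return reset($tmp);
}

var_dump(foo());
```

This does not depend on the reference count of the returned array or on how the argument is parsed.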
A) To understand what's happening here, one needs to understand PHP's handling of values/variables and references (PDF, 1.2MB). As stated throughout the documentation: "references are not pointers"; and you can only return variables by reference from a function - nothing else.
In my opinion, that means, any function in PHP will return a reference. But some functions (built in PHP) require values/variables as arguments. Now, if you are nesting function-calls, the inner one returns a reference, while the outer one expects a value. This leads to the 'famous' E_STRICT-error "Only variables should be passed by reference".
$fileName = 'example.txt';
$fileExtension = array_pop(explode('.', $fileName));
// will result in Error 2048: Only variables should be passed by reference in…
B) I found a line in the PHP-syntax description linked in the question.
expr_without_variable = "(" expr ")"
In combination with this sentence from the documentation: "In PHP, almost anything you write is an expression. The simplest yet most accurate way to define an expression is 'anything that has a value'.", this leads me to the conclusion that even (5) is an expression in PHP, which evaluates to an integer with the value 5.
(As $a = 5 is not only an assignment but also an expression, which evaluates to 5.)
Conclusion
If you pass a reference to the expression (...), this expression will return a value, which then may be passed as argument to the outer function. If that (my line of thought) is true, the following two lines should work equivalently:
// what I've used over years: (spaces only added for readability)
$fileExtension = array_pop( ( explode('.', $fileName) ) );
// vs
$fileExtension = array_pop( $tmp = explode('.', $fileName) );
See also PHP 5.0.5: Fatal error: Only variables can be passed by reference; 13.09.2005
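For the file-extension case specifically, the problem can be sidestepped without either trick. A sketch, assuming the same example.txt file name; the $parts and $viaPathinfo names are introduced here for illustration only:

```php
<?php
$fileName = 'example.txt';

// Option 1: explicit temporary variable, then pop the last part.
$parts = explode('.', $fileName);
$fileExtension = array_pop($parts);

// Option 2: let pathinfo() extract the extension directly.
$viaPathinfo = pathinfo($fileName, PATHINFO_EXTENSION);

var_dump($fileExtension, $viaPathinfo);
```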

Why do I have to use the reference operator (&) in a function call?

Setup
I am borrowing a function from an open source CMS that I frequently use for a custom project.
Its purpose is not important to this question but, if you want to know, it's a simple static cache designed to reduce database queries. I can call getObject 10 times in one page load and not have to worry about hitting the database 10 times.
Code
A simplified version of the function looks like this:
function &staticStorage($name, $default_value = NULL)
{
static $data = array();
if (isset($data[$name]))
{
return $data[$name];
}
$data[$name] = $default_value;
return $data[$name];
}
This function would be called in something like this:
function getObject($object_id)
{
$object = &staticStorage('object_' . $object_id);
if ($object)
{
return $object;
}
// This query isn't accurate but that's ok it's not important to the question.
$object = databaseQuery('SELECT * FROM Objects WHERE id = #object_id',
array('#object_id' => $object_id));
return $object;
}
The idea is that once I call staticStorage, any change to the returned value also updates the static storage.
The problem
My interest is in the line $object = &staticStorage('object_' . $object_id); Notice the & in front of the function. The staticStorage function returns a reference already so I initially did not include the reference operator preceding the function call. However, without the reference preceding the function call it does not work correctly.
My understanding of pointers is that if I return a pointer, PHP will automatically treat the receiving variable as a pointer, just as $a = &$b will cause $a to point to the value of $b.
The question
Why? If the function returns a reference why do I have to use the reference operator to precede the function call?
From the PHP docs
Note: Unlike parameter passing, here you have to use & in both places - to indicate that you want to return by reference, not a copy, and to indicate that reference binding, rather than usual assignment, should be done for $myValue.
http://php.net/manual/en/language.references.return.php
Basically, it's to help the PHP interpreter. The first use, in the function definition, is to return the reference, and the second is to bind by reference instead of by value in the assignment.
By putting the & in the function declaration, the function will return a memory address of the return value. The assignment, when getting this memory address would interpret the value as an int unless explicitly told otherwise, this is why the second & is needed for the assignment operator.
EDIT: As pointed out by @ringø below, it does not return a memory address, but rather an object that will be treated like a copy (technically copy-on-write).
The PHP doc explains how to use, and why, functions that return references.
In your code, the getObject() function needs also a & (and the call as well) otherwise the reference is lost and the data, while usable, is based on PHP copy-on-write (returned data and source data point both to the same actual data until there is a change in one of them => two blocks of data having a distinct life)
This wouldn't work (syntax error)
$a = array(1, 2, 3);
return &$a;
this doesn't work as intended (no reference returned)
$a = array(1, 2, 3);
$ref = &$a;
return $ref;
and without adding the & to the function call as you said, no reference returned either.
To the question as to why... There doesn't seem to be a consistent answer.
If one of the & is missing, PHP treats the data as if it isn't a reference (like returning an array, for instance), with no warning whatsoever.
Here is some strangeness associated with functions returning references.
PHP evolved during the years but still inherits some of the initial poor design choices. This seems to be one of them (this syntax is error prone as one may easily miss one &... and no warning ahead... ; also why not directly return a reference like return &$var;?). PHP made some progress but still, traces of poor design subsist.
You may also be interested in this chapter of the doc linked above
Do not use return-by-reference to increase performance. The engine will automatically optimize this on its own. Only return references when you have a valid technical reason to do so.
Finally, it's better not to look too much for equivalences between the pointers in C and the PHP references (Perl is closer than PHP in this regard). PHP adds a layer between the actual pointer to data and variables and references point rather to that layer than the actual data. But a reference is not a pointer. If $a is an array and $b is a reference to $a, using either $a or $b to access the array is equivalent. There is no dereference syntax, a *$b for instance like in C. $b should be seen as an alias of $a. This is also the reason a function can only return a reference to a variable.
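The need for & in both places can be seen in a small sketch. The function is a simplified variant of the staticStorage function from the question; the keys 'a' and 'b' and the variable names are made up for the example:

```php
<?php
// Simplified storage function, modelled on the one in the question.
function &staticStorage($name, $default_value = null)
{
    static $data = array();
    if (!isset($data[$name])) {
        $data[$name] = $default_value;
    }
    return $data[$name];
}

// Without & at the call site: ordinary assignment, $copy is a separate value.
$copy = staticStorage('a');
$copy = 'changed via copy';
var_dump(staticStorage('a'));   // still the default value

// With & at the call site: $ref is bound to the stored element.
$ref = &staticStorage('b');
$ref = 'changed via reference';
var_dump(staticStorage('b'));   // now reflects the assignment to $ref
```

Only the second variant behaves like the alias described above; the first silently works on a copy.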

PHP array syntax/operator?

When writing the syntax for an associative array in PHP we do the following
$a = array('foo' => 'bar');
I am curious of the relationship of the => syntax, or possibly operator. Does this relate to some kind of reference used in the hash table in ZE, or some kind of subsequent right shift or reference used in C? I guess I am just wondering the true underlying purpose of this syntax, how it relates to ZE and/or php extensions used to handle arrays, how it possibly relates to the written function in C before compiled, or If I just have no idea what I am talking about :)
The => symbol a.k.a. T_DOUBLE_ARROW is just a parser token like class, || or ::.
See: The list of php parser tokens
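To see this for oneself, the tokenizer extension (enabled in default PHP builds) can list the tokens of a snippet. A minimal sketch; the sample source string and the $names variable are made up for the example:

```php
<?php
$source = '<?php $a = array("foo" => "bar");';

$names = array();
foreach (token_get_all($source) as $token) {
    // Named tokens come back as arrays: [id, text, line].
    // Single characters such as "(" or ";" come back as plain strings.
    if (is_array($token)) {
        $names[] = token_name($token[0]);
    }
}

echo implode(' ', $names), "\n";
```

The printed list should include T_DOUBLE_ARROW alongside tokens such as T_VARIABLE and T_ARRAY.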
It's nothing special apart from that fact that "it looks like an arrow" and it is used for "array stuff".
Of course the exact usage is more complicated than that but "array stuff" is the short inaccurate description that should do it.
It's used to represent key => (points to) value
The answer to that is no simpler than "It looks like an arrow".
It's not exactly the assignment operator per se, because that would mean a variable-like assignment (like for the array itself). This is an array-internals specific assignment operator.
Webdevelopers are cool like that :P

Initializing PHP class property declarations with simple expressions yields syntax error

According to the PHP docs, one can initialize properties in classes with the following restriction:
"This declaration may include an initialization, but this initialization must be a constant value--that is, it must be able to be evaluated at compile time and must not depend on run-time information in order to be evaluated."
I'm trying to initialize an array and having some issues. While this works fine:
public $var = array(
1 => 4,
2 => 5,
);
This creates a syntax error:
public $var = array(
1 => 4,
2 => (4+1),
);
Even this isn't accepted:
public $var = 4+1;
which suggests it's not a limitation of the array() language construct.
Now, the last time I checked, "4+1" equated to a constant value that not only should be accepted, but should in fact be optimized away. In any case, it's certainly able to be evaluated at compile-time.
So what's going on here? Is the limitation really along the lines of "cannot be any calculated expression at all", versus any expression "able to be evaluated at compile time"? The use of "evaluated" in the doc's language suggests that simple calculations are permitted, but alas....
If this is a bug in PHP, does anyone have a bug ID? I tried to find one but didn't have any luck.
PHP doesn't do such operations at compile-time; you cannot assign calculated values to constants, even if all operands are constants themselves. Default values of class members are treated the exact same way. I encountered this behaviour as I tried to assign powers of two to constants:
class User {
const IS_ADMIN = 1;
const IS_MODERATOR1 = 1 << 1; // Won't work
const IS_MODERATOR2 = 0x02; // works
}
This limitation no longer exists as of PHP 5.6
The new feature that enables the previously-disallowed syntax is called constant scalar expressions:
It is now possible to provide a scalar expression involving numeric
and string literals and/or constants in contexts where PHP previously
expected a static value, such as constant and property declarations
and default function arguments.
<?php
const ONE = 1;
const TWO = ONE * 2;
class C {
const THREE = TWO + 1;
const ONE_THIRD = ONE / self::THREE;
const SENTENCE = 'The value of THREE is '.self::THREE;
public function f($a = ONE + self::THREE) {
return $a;
}
}
echo (new C)->f()."\n";
echo C::SENTENCE;
?>
The above example will output:
4
The value of THREE is 3
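Under the same rule, the declarations from the question and from the User example above should be accepted on PHP 5.6 or later. A minimal sketch, assuming such a version; the class name Flags and its members are made up for illustration:

```php
<?php
class Flags {
    const IS_ADMIN     = 1;
    const IS_MODERATOR = 1 << 1;   // constant expression in a class constant

    public $sum = 4 + 1;           // arithmetic in a property default
    public $var = array(
        1 => 4,
        2 => (4 + 1),              // arithmetic inside an array initializer
    );
}

$f = new Flags();
var_dump(Flags::IS_MODERATOR, $f->sum, $f->var[2]);
```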
Before you throw your arms up at php for this, think about the execution model. In the environment that php is typically used for(and, in fact, designed for), everything is built up, executed, and then thrown away...until the next http request comes in. It doesn't make a lot of sense to waste time doing computations during the parsing/compilation phase. The engine needs to be very swift here in the general case.
But, you're right, that quote from the manual does say "evaluate". Maybe you should open a documentation ticket.
Edit, March 2014:
It looks like PHP will support Constant Scalar Expressions as of PHP 5.6.

What is the "->" PHP operator called? [closed]

What do you call this arrow looking -> operator found in PHP?
It's either a minus sign, dash or hyphen followed by a greater than sign (or right chevron).
How do you pronounce it when reading code out loud?
The official name is "object operator" - T_OBJECT_OPERATOR.
I call it "dart"; as in $Foo->bar() : "Foo dart bar"
Since many languages use "dot" as in Foo.bar(); I wanted a one-syllable word to use. "Arrow" is just too long-winded! ;)
Since PHP uses . "dot" for concatenation (why?) I can't safely say "dot" -- it could confuse.
Discussing with a co-worker a while back, we decided on "dart" as a word similar enough to "dot" to flow comfortably, but distinct enough (at least when we say it) to not be mistaken for a concatenating "dot".
When reading PHP code aloud, I don't pronounce the "->" operator. For $db->prepare($query); I mostly say "Db [short pause] prepare query." So I guess I speak it like a comma in a regular sentence.
The same goes for the Paamayim Nekudotayim ("::").
When reading the code to myself, I think of it like a "possessive thing".
For example:
x->value = y->value
would read "x's value equals y's value"
Most often, I use some variation on @Tor Valamo's method ("the B method of A" or "A's B method"), but I sometimes say "dot". E.g. "call A dot B()".
Property operator.
When reading $a->b() or $a->b aloud I just say "call b on the $a obj" or "get b from/in/of $a"
I personally like to be verbose in expressing my code verbally.
e.g.:
$foo = new Foo();
echo $foo->bar
would read as such:
echo(/print) the bar property of object foo.
It's verbose and more time consuming, but I find if there is a reason for me to be expressing my code verbally, then I probably need to be clear as to what I'm communicating exactly.
Harkening back to the Cobol 'in' where you would say "Move 5 to b in a." Most languages today qualify things the other direction.
Yet I would still read $a->b(); as "Call b in a".
$a->b
I call as "param b of $a".
$a->b()
I call as "function b of $a".
The single arrow can easily be referred verbally as what it means for PHP OOP: Member. So, for $a->var you would say "Object a's member var".
When reading code aloud, it does help to read ahead (lookahead reference, sorry), and know what you may actually be referring to. For instance, let's have the following bit of code:
<?php
Class MyClass {
$foo = 0;
public function bar() {
return $this->foo;
}
}
$myobject = new MyClass();
$myobject->bar();
?>
So, if I were to read aloud this code block as a demonstration of code, I would say this:
"Define Class 'MyClass', open-brace. Variable 'foo' equals zero, terminate line. Define public function 'bar' open-close parentheses, open-brace. Return member variable 'foo' of object 'this', terminate line. Close-brace, close-brace. Instantiate new instance of 'MyClass' (no arguments) as variable object 'myobject', terminate line. Call member method 'bar' of object 'myobject', terminate line."
However, if I were reading the code aloud for other students or programmers to copy without a visual, then I would say:
"PHP open tag (full), newline, indent, Class MyClass open-brace, newline. Indent, ['dollar-sign' | 'Object-marker'] foo equals 0, semicolon, newline, newline. public function bar open-close parentheses, open-brace, newline. Indent, return ['dollar-sign' | 'Object-marker'] this arrow foo, semicolon, newline. Reverse-indent, close-brace, newline, reverse-indent, close-brace, newline, newline. ['dollar-sign' | 'Object-marker'] myobject equals new MyClass open-close parentheses, semicolon, newline. ['dollar-sign' | 'Object-marker'] myobject arrow bar open-close parentheses, semicolon, newline. Reverse-indent, PHP close tag."
Where you see ['dollar-sign' | 'Object-marker'], just choose whichever you tend to speak for '$', I switch between the two frequently, just depends on my audience.
So, to sum it up, in PHP, -> refers to a member of an object, be it either a variable, another object, or a method.
The senior PHP developer where I work says "arrow".
$A->B;
When he's telling me to type the above, he'll say, "Dollar A arrow B" or for
$A->B();
"Dollar A arrow B parens."
I would do it like this:
//Instantiated object variable with the name mono call property name
$mono->name;
//Instantiated object variable with the name mono call method ship
$mono->ship();
In my book (PHP Master by Lorna Mitchell) it's called the object operator.
user187291 has answered the non-subjective question, as for the subjective one, I say "sub". For example, I would pronounce $x->y as "x sub y" - "sub" in this context is short for "subscript". This is more appropriate for array notation; $x['y'] I also pronounce "x sub y". Typically when reading code out loud, I am using my words to identify a line the other developer can see, so the ambiguity has yet to become a problem. I believe the cause is that I view structs/objects and arrays as "collections", and elements thereof are subscripted as in mathematical notation.
Japanese has a convenient particle for this, "no" (の). It is the possessive particle, meaning roughly "relating to the preceding term".
"Fred の badger" means "Fred's badger".
Which is a long way round of saying that at least mentally, and also amongst those who understand Japanese, I tend to differentiate them by reading them as:
$A->B(); // "Call $A's B."
C::D(); // "Call C の D."
But reading both as a possessive "'s" works too, for non-Japanese speakers. You can also flip them and use "of", so "D of C"... but that's kinda awkward, and hard to do if you're reading rapidly because it breaks your linear flow.
If I were dictating, though, perhaps when pair coding and suggesting what to type, I'd refer to the specific characters as "dash-greater-than arrow" and "double-colon" (if I know the person I'm talking to is a PHPophile, I might call that a "paamayim nekudotayim double-colon" the first time, partly because it's a cool in-joke, partly to prevent them from "correcting" me).
Object. So $a->$b->$c would be 'A object B object c'.
