PHP parser: braces around variables - php

I was wondering how exactly the semantics of braces are defined
inside PHP strings. For instance, suppose we have defined:
$a = "foo";
then what are the differences among:
echo "${a}";
echo "{$a}";
that is, are there any circumstances where the placement of the
dollar sign outside the braces as opposed to within braces makes
a difference or is the result always the same (with braces used
to group just about anything)?

There are a lot of possibilities for braces (such as omitting them), and things get even more complicated when dealing with objects or arrays.
I prefer interpolation to concatenation, and I prefer to omit braces when not necessary. Sometimes, they are.
You cannot use object operators with ${} syntax. You must use {$...} when calling methods, or chaining operators (if you have only one operator such as to get a member, the braces may be omitted).
The ${} syntax can be used for variable variables:
$y = 'x';
$x = 'hello';
echo "${$y}"; //hello
The $$ syntax does not interpolate in a string, making ${} necessary for interpolation. You can also use strings (${'y'}) and even concatenate within a ${} block. However, variable variables can probably be considered a bad thing.
For arrays, either will work ${foo['bar']} vs. {$foo['bar']}. I prefer just $foo[bar] (for interpolation only -- outside of a string bar will be treated as a constant in that context).
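The rules above can be sketched as follows. This is only an illustration: the Product class, its members and the $prices array are hypothetical names that do not come from the question.

```php
<?php
// Hypothetical class, used only to illustrate the interpolation rules.
class Product
{
    public $name = 'Widget';
    public $tags = array('new', 'sale');

    public function label()
    {
        return strtoupper($this->name);
    }
}

$p = new Product();
$prices = array('widget' => 5);

$simple  = "Name: $p->name";             // one property access: braces optional
$method  = "Label: {$p->label()}";       // method call: needs {$...}
$chained = "Tag: {$p->tags[1]}";         // property followed by an index: needs {$...}
$quoted  = "Price: {$prices['widget']}"; // quoted key inside braces
$bare    = "Price: $prices[widget]";     // unquoted key, simple syntax only

echo $simple, "\n", $method, "\n", $chained, "\n", $quoted, "\n", $bare, "\n";
```

The sketch sticks to {$...} on purpose: the ${...} form inside strings is deprecated as of PHP 8.2.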

The braces delimit where the variable name ends; this example should speak for itself.
$a = "hi!";
echo "$afoo"; //$afoo is undefined
echo "${a}foo"; //hi!foo
echo "{$a}foo"; //hi!foo
Also, outside of a string, ${a} should spit out a warning; there you should use
${'a'}
Otherwise PHP will assume a is a constant.

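Also, before PHP 8.0 you could use braces to get the character at position $i of string $text (this syntax was deprecated in PHP 7.4 and removed in PHP 8.0; use $text[$i] instead):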
$i=2;
$text="safi";
echo $text{$i}; // f

Related

Array and string offset access syntax with curly braces is no longer supported [duplicate]

In PHP you can access characters of strings in a few different ways, one of which is substr(). You can also access the Nth character in a string with curly or square braces, like so:
$string = 'hello';
echo $string{0}; // h
echo $string[0]; // h
My question is, is there a benefit of one over the other? What's the difference between {} and []?
Thanks.
Use $string[0]; the other method (braces) has been removed in PHP 8.0.
For strings:
Accessing characters within string literals using the {} syntax has been deprecated in PHP 7.4. This has been removed in PHP 8.0.
And for arrays:
Prior to PHP 8.0.0, square brackets and curly braces could be used interchangeably for accessing array elements (e.g. $array[42] and $array{42} would both do the same thing in the example above). The curly brace syntax was deprecated as of PHP 7.4.0 and no longer supported as of PHP 8.0.0.
There is no difference. Owen's answer is outdated; the latest version of the PHP Manual no longer states that it is deprecated:
Characters within strings may be accessed and modified by specifying
the zero-based offset of the desired character after the string using
square array brackets, as in $str[42]. Think of a string as an array
of characters for this purpose. [...]
Note: Strings may also be accessed using braces, as in $str{42}, for
the same purpose.
However it seems that more people/projects use [], and that many people don't even know {} is possible. If you need to share your code publicly or with people who don't know the curly brace syntax, it may be beneficial to use [].
UPDATE: accessing string characters with {} is deprecated; use [] instead.
Yes, there's no difference. This language quirk has some history...
Originally, the curly brace syntax was intended to replace the square bracket syntax which was going to be deprecated:
http://web.archive.org/web/20010614144731/http://www.php.net/manual/en/language.types.string.php#language.types.string.substr.
Later that policy was reversed, and the square brackets syntax was preferred instead:
http://web.archive.org/web/20060702080821/http://php.net/manual/en/language.types.string.php#language.types.string.substr
and even later, the curly braces one was going to be deprecated:
http://web.archive.org/web/20080612153808/http://www.php.net/manual/en/language.types.string.php#language.types.string.substr
As of this writing, it seems that the deprecation has been withdrawn as well and they are just considered two alternative syntaxes:
http://web.archive.org/web/20160607224929/http://php.net/manual/en/language.types.string.php#language.types.string.substr
Curly brace access was deprecated in PHP 7.4
Array and string offset access using curly braces
The array and string offset access syntax using curly braces is
deprecated. Use $var[$idx] instead of $var{$idx}.
PHP 7.4 Deprecated Features, PHP Core
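As a minimal sketch of what the migration looks like in practice, the following uses only the square-bracket syntax, which works both before and after PHP 8.0. The variable names are illustrative.

```php
<?php
$string = 'hello';

$first = $string[0];   // first byte: "h"
$last  = $string[-1];  // negative offsets count from the end (PHP 7.1+): "o"

$copy = $string;
$copy[0] = 'j';        // writing through an offset replaces a single byte

echo $first, $last, ' ', $copy, "\n"; // ho jello
```

Replacing every $var{$idx} with $var[$idx] is a purely mechanical change; the two forms behaved identically while both were supported.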

How to replace below preg_replace with preg_replace_callback?

I am having difficulty converting below preg_replace() function call
preg_replace("/\{(.*?)\}/e", '$\1', $data)
to using preg_replace_callback() (because of the removed e modifier in PHP 7.0).
I have tried this but I have no idea how to fully handle '$\1':
preg_replace_callback('/\{(.*?)\}/', function ($matches) {
return $matches[0];
}, $data);
Any help would be highly appreciated.
I'd like to suggest the following code as an answer to the concrete question:
$vars = get_defined_vars();
$result = preg_replace_callback('/{(.*?)}/', function ($matches) use ($vars) {
return $vars[$matches[1]];
}, $data);
unset($vars);
The remaining part of the answer provides more information and references, mainly for two things:
1. Show how this can be solved with divide and conquer, which also leads to a step-by-step guide on how to port such code.
2. Add more context, as there can be differences depending on how and where that code is to be ported, including error handling and PHP version compatibility requirements.
This should make the answer more applicable to similar migration questions about preg_replace() with the e modifier, where variable variables are built from backreferences.
The e (PREG_REPLACE_EVAL) Modifier
This feature was DEPRECATED in PHP 5.5.0, and REMOVED as of PHP 7.0.0.
It was only used by preg_replace() and was ignored by other PCRE functions.
From a previous PHP manual description revision:
If this deprecated modifier is set, preg_replace() does normal substitution of backreferences in the replacement string, evaluates it as PHP code, and uses the result for replacing the search string. Single quotes, double quotes, backslashes (\) and NULL chars will be escaped by backslashes in substituted backreferences.
Rationale and context why it was deprecated/removed can be found in RFC: Remove preg_replace /e modifier, mainly three issue classes:
1. Security issues
2. Overescaping of quotes
3. Use as obfuscation in exploit scripts
The PHP RFC Wiki page has more details, and the information is a good addition to the answer as a port at least crosses 1. and 2. for the removed PHP code evaluation.
The '$\1' Replacement
As per the e modifier's description, '$\1' will be evaluated after the backreference \1 (first matching group) is replaced.
In the question's example that is the contents of the curly braces {...}:
'/\{(.*?)\}/'
    ~~~~~
      1    : first matching group
For example when the subject string is "Hello {name}", the contents of the first matching group is "name". Resolving it leads to the following PHP code that then is evaluated:
$name
That is a variable named "name". The evaluation is done within the scope where preg_replace() is called.
So far the description of the replacement pattern.
How to make compatible with PHP 7.0.0 (and earlier/later)?
A common way to start changing away from the e modifier is to make use of preg_replace_callback() instead of preg_replace(), which is done by replacing it and using an anonymous function (or any other callback method, however anonymous functions are normally the preferable way in most cases).
This is also (thankfully) outlined on the reference question. In the following I'll first leave backslash escaping of the substituted backreferences out to simplify the solution (and address it later).
An example of what has been done so far (with only a slight correction on the $matches index - it needs to be 1 not 0):
preg_replace_callback('/\{(.*?)\}/', function ($matches) {
return $matches[1];
}, $data);
The \1 backreference to the first matching group is obtained via $matches[1] here. It will contain the contents of the curly braces {...}, e.g. "name" from the previous example.
(compare: Changing preg_replace to preg_replace_callback)
For the specific $\1 replacement used here, this is obviously incomplete, as it would only replace with the name of the variable and not (yet) its contents.
What is still missing is connecting the name with the original scope, which requires a little more work.
Obtain Variables in preg_replace() Scope
To obtain all variables defined in the same scope as the preg_replace_callback() (previously preg_replace()) call, the get_defined_vars() function is an option:
This function returns a multidimensional array containing a list of all defined variables, be them environment, server or user-defined variables, within the scope that get_defined_vars() is called.
Using that array within the anonymous callback function then makes it possible to obtain the value of a variable by using its name as the array key:
$vars = get_defined_vars(); # <1>
preg_replace_callback('/\{(.*?)\}/', function ($matches) use ($vars) { # <2>
$name = $matches[1];
return $vars[$name]; # <3>
}, $data);
<1> Obtain variables from preg_replace scope.
<2> Use variables with the anonymous function (the use language construct).
<3> Access variables value by name and return.
This was the missing part in the question: turning the backreference, used as a variable name, into the actual value.
As so often, there are similar ways to achieve the same, some of them more depending on context. Truly get_defined_vars() is a pretty generic way to create a "variable table" and map names to their value. But there can be circumstances for which an array is already available and there might be no need to call that function.
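To see the scope aspect in action, here is a small self-contained sketch. The function greet() and its parameters are hypothetical, chosen only to show that get_defined_vars() picks up the local variables of the function in which it is called.

```php
<?php
// Hypothetical helper: the placeholders refer to this function's local variables.
function greet($name, $city)
{
    $data = 'Hello {name} from {city}';

    // Snapshot of the local scope at this point: name, city and data.
    $vars = get_defined_vars();

    return preg_replace_callback('/\{(.*?)\}/', function ($matches) use ($vars) {
        return $vars[$matches[1]];
    }, $data);
}

echo greet('Ada', 'London'), "\n"; // Hello Ada from London
```

Because the array is a copy taken before the callback runs, later changes to the local variables are not reflected in the replacement.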
Alternative to get_defined_vars(): Use of $GLOBALS array
This approach has been chosen by Wiktor Stribiżew in his answer:
Given the scope is the global scope (likely not, but if), then the $GLOBALS superglobal can be used instead:
$result = preg_replace_callback('/\{(.*?)\}/', function ($matches) {
$name = $matches[1];
return $GLOBALS[$name];
}, $data);
There is no need to call get_defined_vars(), nor to unset the $vars array after the call (or otherwise care about it). But this binds the code to global variable state (which may or may not be an issue for the application).
Alternative to get_defined_vars(): Re-Use of another array (if available)
Given variables were previously imported from an array into the scope where preg_replace() with the e modifier was running, the import is redundant and the array itself can be used with the callback function's use clause. An example:
function replace_variables(string $data, array $vars) {
# previously here: extract($vars);
$result = preg_replace_callback('/\{(.*?)\}/', function ($matches) use ($vars) {
$name = $matches[1];
return $vars[$name];
}, $data);
# ...
}
As extract() comes with side effects you normally want to avoid, this would kill two birds with one stone: the variables array was already available, and get_defined_vars() need not be called. Additionally, an unsafe extract operation can be dropped, as it is no longer necessary to create variables in the scope of the earlier preg_replace().
This should leave enough food for thought to connect the name in the backreference to the value. The PHP manual has more about variable scope in case there is a more specific context. Normally get_defined_vars() should address most issues if an array is not yet available.
Notes for the '/\{(.*?)\}/' Regular Expression Pattern
This pattern comes with some caveats, therefore I'm leaving some notes for additional information and to open up on error handling and changes of it due to porting, which will address more issues.
The backslashes "\" are redundant:
Just a minor thing to get it out of the way:
ok.....: '/\{(.*?)\}/'
correct: '/{(.*?)}/'
This change can always be made: the curly braces here do not form a quantifier, so the backslashes in front of them are redundant.
This improves readability of the pattern.
Change in Regular Expression Pattern PHP Error Behaviour
A second point worth noting about the search pattern is a potential incompatibility:
The pattern allows a zero-length match, that is, the empty pair of curly braces {} does match, leading to a zero-length (variable) name. It could be used to present a default value (e.g. null), but you may prefer that it does not match at all, or you may want to add error handling.
w/ empty.: '/{(.*?)}/'
w/ length: '/{(.+?)}/'
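The difference between the two patterns can be checked with a short sketch. The subject strings are made up for illustration, and the last pattern (a negated character class) is an alternative added here, not part of the original answer.

```php
<?php
$subject = 'Empty: {}';

// .*? accepts the empty pair: one match, with a zero-length name.
$withEmpty = preg_match('/{(.*?)}/', $subject, $m1);

// .+? needs at least one character between the braces: no match here.
$withLength = preg_match('/{(.+?)}/', $subject, $m2);

// Caveat: when a later "}" exists, .+? can still run across an empty pair.
preg_match('/{(.+?)}/', 'a {} b {x}', $m3);

// A negated character class is stricter (assumption: names never contain braces).
preg_match('/{([^{}]+)}/', 'a {} b {x}', $m4);

var_dump($withEmpty, $m1[1], $withLength, $m3[1], $m4[1]);
```

So switching from * to + only rejects an empty pair reliably when no further closing brace follows in the subject.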
Which brings up a related point: Undefined variable/index warnings.
To prevent undefined index warnings these could resolve to null silently (or you may want to add error handling). This has been done in the upfront code porting suggestion at the very beginning of the answer.
Note though that these errors were harsher with the previous preg_replace() call with the e modifier, as the empty name resulted in a parse error when evaluated, followed by a fatal error. Example:
PHP Parse error: syntax error, unexpected ';', expecting variable (T_VARIABLE) or '$' in ... : regexp code on line 1
PHP Fatal error: preg_replace(): Failed evaluating code:
$
To define such errors out of existence as of a PHP 7.0.0 (and above/below) compatible port:
$vars = get_defined_vars();
$result = preg_replace_callback('/{(.*?)}/', function ($matches) use ($vars) {
$name = $matches[1];
return isset($vars[$name]) ? $vars[$name] : null;
}, $data);
unset($vars);
Alternatively, it is possible to mimic the old error behaviour (a bit) by throwing (e.g. on an empty name), as an uncaught exception triggers a fatal error:
$vars = get_defined_vars();
$result = preg_replace_callback('/{(.*?)}/', function ($matches) use ($vars) {
$name = $matches[1];
if ('' === $name) {
throw new \RuntimeException('preg_replace_callback(): callback: Expected variable name, got zero-length string.');
}
return isset($vars[$name]) ? $vars[$name] : null;
}, $data);
unset($vars);
(if backwards compatibility below PHP 7.0.0 is not an issue, throwing an \Error is a more matching alternative for PHP 7.0.0 and above. Alternatively use trigger_error() instead to include versions below PHP 7.0.0 as well)
However, I'd suggest looking into how the overall process can be made more robust against errors. Even though this depends heavily on the context of the original code and requires a closer look, it lets you benefit from the changes. The following discussion/example will show even more.
Changes in Replacement Pattern (previous Backslash Escapes for Backreferences)
Removing the e (PREG_REPLACE_EVAL) modifier does not only require to have a callback function but also comes with another change: Backslash escapes were added earlier but will not any longer with the callback function.
This has been kept out so far. To complete the answer, it should get some attention. First as a reminder, from the (now removed) e modifier documentation what this is about:
Single quotes, double quotes, backslashes (\) and NULL chars will be escaped by backslashes in substituted backreferences.
This can lead to code that contains one or more calls to stripslashes() within the replacement pattern. This is not the case for this question so the consequences are that backslash escapes aren't added any longer.
As mario writes in an answer to the reference question:
[...] stripslashes() often becomes redundant in literal expressions.
In this question, it is a little different: As stripslashes() is not within the replacement pattern, there is nothing to be redundant / remove in "$\1".
To demonstrate the changes with a double and single quote within a "variable name" in the absence of the escaping for preg_replace_callback() compared to using the e modifier:
Data           e Modifier         Callback
{abc}          $abc               $abc
{a"bc}         $a\"bc (E)         $a"bc (I)
${${'abc'}}    $${\'abc\' (E)     $${'abc' (I)
(E): PHP Parse error
(I): Invalid variable name (informative only)
This once more highlights that the original replacement pattern has issues with the name stored as backreference to the first matching group: as discussed above for a zero-length variable name, it is lax and allows invalid names (which could previously have led to PHP parse errors when the replacement was evaluated).
The backslash escaping added to that. As the regular expression pattern does a lazy match (.*? - the question mark after the asterisk), it was at least not completely free-form.
The port therefore has less such issues but only on a finer difference.
Therefore, porting itself does not address this issue much. Actually what was a PHP fatal error earlier now turns into an undefined index PHP warning with the consequence that the script continues to run where it stopped earlier.
This could be seen as an argument for (or against) failing early with the port - it depends.
It could be done by checking for invalid variable names (assuming those would have caused a fatal parse error during evaluation - not an undefined variable warning).
A PCRE regular expression pattern for variable names in PHP is ^[a-zA-Z_\x80-\xff][a-zA-Z0-9_\x80-\xff]*$. One idea that came to mind was to use it to classify whether a name is a valid PHP variable name or not.
Additionally, the next example is an opportunity to show how the backslash escapes and the error/warning behaviour can be preserved:
$vars = get_defined_vars();
$result = preg_replace_callback('/{(.*?)}/', function ($matches) use ($vars) {
$name = addcslashes($matches[1], "'\"\\\0"); # <1>
if (!preg_match('(^[a-zA-Z_\x80-\xff][a-zA-Z0-9_\x80-\xff]*$)D', $name)) { # <2>
trigger_error("Not a variable name: $name", E_USER_ERROR);
}
if (!array_key_exists($name, $vars)) { # <3>
trigger_error("Undefined variable: $name");
}
return isset($vars[$name]) ? $vars[$name] : null; # <4>
}, $data);
unset($vars);
<1> Backslash escape single quotes, double quotes, backslashes (\) and NULL chars for backreference \1 (as the e modifier did so).
<2> Trigger fatal error on invalid variable names as those with the e modifier would result in a parse error followed by a fatal error evaluating code with syntax error(s).
<3> Trigger warning on undefined variable.
<4> Undefined variables (now not isset() indexes) result in null values.
This more verbose example mimics even more of the original behaviour and could therefore be seen as a more complete port. However, it contradicts many of the reasons why the e modifier was deprecated and removed in the first place. Therefore, do not apply it blindly; it is an additional example to highlight the differences between the e modifier's eval and the callback version.
This is also the reason I've kept this out of the foremost answer.
PHP Version Compatibility
The port as outlined above is done with an anonymous function and therefore it is compatible with PHP 5.3 or later.
If backwards compatibility is not necessary or as an outlook for a future migration, some comments on more recent PHP versions:
Since PHP 7.4, arrow functions can be used. They have the benefit that the scope is automatically inherited ("closed"), so the use clause becomes redundant. However, variable variables cannot be used with arrow functions, which makes the array as "variable table" still necessary, like before. It can condense the code though, especially if the error conditions from the pattern (see discussion above) had already been removed (not the case in the following example code, which still uses the original pattern):
$vars = get_defined_vars();
$result = preg_replace_callback('/{(.*?)}/', fn($matches) => $vars[$matches[1]] ?? null, $data);
unset($vars);
Since PHP 8.0, as throw is an expression, throwing a new \Error could be another option. For my taste it is not of much benefit, as control is not fine-grained and readability is degraded. Your mileage may vary though; it is an option since PHP 8.0:
$vars = get_defined_vars();
$result = preg_replace_callback(
'/{(.*?)}/',
fn($matches) =>
$vars[$matches[1]]
?? throw new \Error(sprintf('Expected existing variable name, got "%s" which is undefined', $matches[1])),
$data
);
unset($vars);
You can access global variables using $GLOBALS Superglobal array:
preg_replace_callback('/\{(.*?)\}/', function ($matches) {
return $GLOBALS[$matches[1]];
}, $data);
See the PHP demo:
$data = 'Some {abc} here';
$abc = "Word";
echo preg_replace_callback('/\{(.*?)\}/', function ($matches) {
return $GLOBALS[$matches[1]];
}, $data);
Output:
Some Word here
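One limitation is worth illustrating: $GLOBALS only sees variables of the global scope. The following sketch assumes it runs at the top level of a script; the function render_with_globals() and the variables $local and $site are hypothetical names.

```php
<?php
// Hypothetical helper showing that function-local variables are not in $GLOBALS.
function render_with_globals($template)
{
    $local = 'Local'; // local to this function: never visible through $GLOBALS

    return preg_replace_callback('/\{(.*?)\}/', function ($matches) {
        $name = $matches[1];
        return isset($GLOBALS[$name]) ? $GLOBALS[$name] : '(undefined)';
    }, $template);
}

$site = 'Global'; // defined at the top level, hence part of $GLOBALS

echo render_with_globals('{local} vs {site}'), "\n"; // (undefined) vs Global
```

If the placeholders refer to local variables, an array passed through the use clause is the better fit.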

'Greedy Token Parsing' in PHP

What is 'Greedy token parsing' in PHP ?
I found this in Codeigniter guideline:
"Always use single quoted strings unless you need variables parsed, and in cases where you do need variables parsed, use braces to prevent greedy token parsing."
"My string {$foo}"
An answer with good explanation will help.
Thanks !!
Greedy token parsing refers to something like this:
$fruit = "apple";
$amount = 3;
$string = "I have $amount $fruits";
Possible expected output: "I have 3 apples"
Actual output: "I have 3 "
Of course, this is a beginner mistake, but even experts make mistakes sometimes!
Personally, I don't like to interpolate variables at all, braces or not. I find my code much more readable like this:
$string = "I have ".$amount." ".$fruit."s";
Note that code editors such as Notepad++ have an easier job colour-coding this line.
Then again, some people might prefer letting the engine do the interpolation:
$string = sprintf("I have %d %ss",$amount,$fruit);
It's all up to personal preference really, but the point made in the guideline you quoted is to be careful of what you are writing.
"Greedy" is a general term in parsing referring to "getting as much as you can". The opposite would be "ungreedy" or "get only as much as you need".
The difference in variable interpolation is, for example:
$foo = 'bar';
echo "$foos";
The parser here will greedily parse as much as makes sense and try to interpolate the variable "$foos", instead of the actually existing variable "$foo".
Another example in regular expressions:
preg_match('/.+\s/', 'foo bar baz');
This greedily grabs "foo bar " (including the trailing space), because it's the longest string that fits the pattern .+\s. On the other hand:
preg_match('/.+?\s/', 'foo bar baz');
This ungreedy +? only grabs "foo " (again including the space), which is the minimum needed to match the pattern.
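Both kinds of greediness discussed in the answers above can be tried out with a short sketch. The variable names follow the earlier fruit example; everything else is illustrative.

```php
<?php
$fruit = 'apple';
$amount = 3;

// "$fruits" would be read as the undefined variable $fruits,
// so the braces mark exactly where each name ends.
$braces = "I have {$amount} {$fruit}s";
$concat = 'I have ' . $amount . ' ' . $fruit . 's';
$format = sprintf('I have %d %ss', $amount, $fruit);

// The same greedy/lazy distinction in a regular expression:
preg_match('/.+\s/', 'foo bar baz', $greedy); // longest possible match
preg_match('/.+?\s/', 'foo bar baz', $lazy);  // shortest possible match

var_dump($braces, $greedy[0], $lazy[0]);
```

All three string variants produce the same text; which one reads best is the matter of taste the answer describes.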

PHP: What do the curly braces in $variable{0} do?

I was going through a codebase and came across a line I had a question about. It's something I haven't seen before and I was wondering if someone could explain it for me. Here's the code:
$variableName = $array[1];
$variableName{0} = strtolower($variableName{0});
$this->property = $variableName;
What are the curly braces being used for? I've used curly braces to define variables as variable names before, but is this the same thing? I can't seem to find any resources online that explain it, but I'm not sure if I'm searching for the right thing.
It accesses the single byte at that index: {0} is the first character (in a non-UTF-8 string).
You could simply test it with:
$var='hello';
echo $var{0};
It's setting the first character of the string to lower case. It's a string shortcut operator, functioning the same as this:
<?php
$variableName = strtolower(substr($variableName, 0, 1)) . substr($variableName, 1);
Curly braces {} work the same as square brackets [] for array or string indexing (before PHP 8.0). I'm guessing it is borrowed from Perl, in which square brackets are used for arrays and braces are used for hashes. But in PHP arrays and hashes are the same thing.
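For code that has to run on PHP 8.0 or later, the line from the question can be rewritten as in the following sketch. The sample value 'FooBar' is made up; lcfirst() is a built-in function that does the same job.

```php
<?php
$variableName = 'FooBar';

// Square brackets replace the removed {0} syntax.
$viaOffset = $variableName;
$viaOffset[0] = strtolower($viaOffset[0]);

// Built-in equivalent; it also copes with an empty string.
$viaLcfirst = lcfirst($variableName);

echo $viaOffset, ' ', $viaLcfirst, "\n"; // fooBar fooBar
```

The offset variant assumes a non-empty string, since reading $viaOffset[0] from an empty string raises a warning.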

Separate variable from string

I've got 2 vars - $j and $r
In the string "$jx$r" PHP sees $jx as a variable, but "x" is meant to be literal text.
You need to reformat your string a little.
This might be easier to read:
echo $j."x".$r;
But if you want it in one string:
echo "{$j}x$r";
See the manual page for double-quoted strings; it also explains how the curly braces work.
