PHP equivalent of Perl's 'use strict' (to require variables to be initialized before use) - php

Python's convention is that variables are created by first assignment, and trying to read their value before one has been assigned raises an exception. PHP by contrast implicitly creates a variable when it is read, with a null value. This means it is easy to do this in PHP:
function mymodule_important_calculation() {
    $result = /* ... long and complex calculation ... */;
    return $resukt;
}
This function always returns null, and if null is a valid value for the function then the bug might go undetected for some time. The Python equivalent would complain that the variable resukt is being used before it is assigned.
So... is there a way to configure PHP to be stricter with variable assignments?

PHP doesn't do much forward checking of things at parse time.
The best you can do is crank up the warning level to report your mistakes, but by the time you get an E_NOTICE, it's too late, and it's not yet possible to force E_NOTICEs to occur in advance.
A lot of people are touting the "error_reporting E_STRICT" flag, but it's still a retroactive warning, and won't protect you from bad code mistakes like the one you posted.
This gem turned up on the php-dev mailing-list this week and I think it's just the tool you want. It's more of a lint-checker, but it adds scope to the current lint checking PHP does.
PHP-Initialized Google Project
There's the hope that with a bit of attention we can get this behaviour implemented in PHP itself. So put your 2-cents on the PHP mailing list / bug system / feature requests and see if we can encourage its integration.

There is no way to make it fail as far as I know, but with E_NOTICE in your error_reporting settings you can make it raise a warning (well, a notice :-) but still a string you can search for).
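For what it's worth, one way to make PHP actually fail here is to install an error handler that turns every reported notice or warning into an exception. This is only a minimal sketch, assuming PHP 7 or later; the handler body is an illustration of the idea, not something taken from the answers here:

```php
<?php
// Turn every reported PHP error (notices and warnings included) into an exception.
set_error_handler(function (int $severity, string $message, string $file, int $line): bool {
    // Skip levels that are masked out by the current error_reporting setting.
    if (!(error_reporting() & $severity)) {
        return false;
    }
    throw new ErrorException($message, 0, $severity, $file, $line);
});
error_reporting(E_ALL);

function mymodule_important_calculation() {
    $result = 6 * 7;   // stand-in for the long and complex calculation
    return $resukt;    // the typo now throws instead of silently returning null
}

$caught = null;
try {
    mymodule_important_calculation();
} catch (ErrorException $e) {
    $caught = $e->getMessage();   // the message names the undefined variable
}
```

This is still a runtime check, so the typo is only caught when that line actually executes.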

Check out error reporting, http://php.net/manual/en/function.error-reporting.php
What you want is probably E_STRICT. Just bear in mind that PHP has no namespaces, and error reporting becomes global. Kind of sucks to be you if you use a 3rd party library from developers that did not have error reporting switched on.

I'm pretty sure that it generates an error if the variable wasn't previously declared. If your installation isn't showing such errors, check the error_reporting level in your php.ini file.

You can try to play with the error reporting level as indicated here: http://us3.php.net/error_reporting but I'm not sure it mentions the use of uninitialized variables, even with E_STRICT.

There is something similar: in PHP you can change the error reporting level. It's a best practice to set it to maximum in a dev environment. To do so:
Add in your php.ini:
error_reporting = E_ALL
Or you can just add this at the top of the file you are working on:
error_reporting(E_ALL);
This won't prevent your code from running, but a missing variable assignment will display a very clear error message in your browser.

If you use the "Analyze Code" on files, or your project in Zend Studio it will warn you about any uninitialized variables (this actually helped find a ton of misspelled variables lurking in seldom used portions of the code just waiting to cause very difficult to detect errors). Perhaps someone could add that functionality in the PHP lint function (php -l), which currently only checks for syntax errors.

Related

PHP: Is there any benefit to writing strict code?

When I set error_reporting(E_ALL | E_STRICT);, my code produces Undefined variable errors. I can solve them, but I am wondering whether there is any difference in speed or memory usage between writing code that passes strict checks, and just turning E_STRICT off?
There is no mechanical benefit. You are, however, protected from doing really common, really dumb things like not always initializing a variable before using it - because with E_STRICT on, PHP will generate an error instead of allowing functions to break in potentially-catastrophic, and probably-invisible ways.
For example, it's completely conceivable that a database-backed application uses a variable that isn't initialized by all possible execution paths:
// Adds an allergy to the user's records
public function Add($AllergyID) {
    $Patient = $this->Patient->Load();
    if ($Patient->Insurance->StartDate < now()) {
        $Allergies = $Patient->Allergies->Get();
        $Allergies[] = $AllergyID;
    }
    $Patient->Allergies->Set($Allergies);
}
Eventually it doesn't get initialized, and somebody's medical records table is silently truncated.
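The fix for this kind of bug is to give the variable a value on every execution path before it is used. Here is a stripped-down sketch of that idea; the function name and parameters are hypothetical stand-ins for the Patient objects in the example, not part of the original code:

```php
<?php
// Hypothetical, simplified stand-in for the Add() method in the example.
function addAllergy(array $existingAllergies, int $allergyId, bool $insuranceActive): array
{
    // Initialised on every path, so the caller never writes back an undefined value.
    $allergies = $existingAllergies;
    if ($insuranceActive) {
        $allergies[] = $allergyId;
    }
    return $allergies;
}

$unchanged = addAllergy([3, 7], 12, false);
$extended  = addAllergy([3, 7], 12, true);
```

When the condition is false the existing records come back untouched instead of being replaced by nothing.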
In short, you should always develop with all warnings: it's your first line of defense. When it comes time to move your code into production, though, you absolutely want error reporting off. You don't want malicious users gaining insight into the inner workings of your application, or - worse - your database.
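One common reading of "off" in production is to stop displaying errors to visitors while still writing them to the log. A rough sketch under that assumption; the helper name and the 'development' / 'production' labels are made up for illustration:

```php
<?php
// Hypothetical helper: the name and the environment labels are assumptions.
function configure_error_output(string $environment): void
{
    // Keep reporting everything, so nothing is lost in either environment.
    error_reporting(E_ALL);
    if ($environment === 'development') {
        ini_set('display_errors', '1');   // show problems right in the output
    } else {
        ini_set('display_errors', '0');   // never show internals to visitors
        ini_set('log_errors', '1');       // but keep a record in the error log
    }
}

configure_error_output('production');
```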
There is NO speed benefit, but while using a PHP version before 5.4.0 you should use E_ALL | E_STRICT for development purposes.
From PHP 5.4.0 onwards, E_STRICT is included in E_ALL itself.
Or you can use error_reporting(-1); which will always include everything, even levels that are not part of E_ALL.
See the following Stack Overflow question for further reference:
What is the recommended error_reporting() setting for development? What about E_STRICT?
fewer errors result in better speed;
maintainability will be increased;
memory enhancement maybe too, because of log won't be flus

E_NOTICE: How useful is it REALLY to fix every one?

First off I know this question has gone around more than once here:
Why should I fix E_NOTICE errors?
Why should I fix E_NOTICE errors? Pros and cons
But the more that I fix all E_NOTICEs (as people say you should) the more I notice that:
I am micro-optimising
I am actually making more code and making my code harder to maintain and slower
Take an example:
Say you're using the MongoDB PHP driver and you have a MongoDate object in a class var named ts within a class that represents a single row in a collection in your database. Now you access this var like: $obj->ts->sec but PHP throws a fit (E_NOTICE) because ts in this case is not defined as an object in itself because this particular row does not have a ts field. So you think this is OK, this is desired behaviour, if it's not set return null and I will take care of it myself outside of the interpreter's own robotic workings (since you wrap this in a date() function that just returns 1970 if the var is null or a non-object).
But now I have to fix that E_NOTICE, as another developer really wants me to, since having ANY E_NOTICEs is terribad and supposedly leaving them in makes the code slower. So I make a new function in the $obj class called getTs and I give it 3 lines, literally to do nothing but check if the ts var is a MongoDate object and return it if it is...
WHY? Can't PHP do this perfectly fine for me within its much faster interpreter than having to do it within the runtime of the app itself? I mean everywhere I am having to add useless bumph to my code, pretty much empty functions to detect variables that I actually just handle with PHP's own ability to return null or checking their instanceof when I really need to (when it is vital to the operation and behaviour of the said function) and don't get me started on the isset()s. I have added about 300 lines of isset()s, it's getting out of hand. I have of course got to make these getTs functions because you can't do:
class obj {
    public $ts = new MongoDate();
}
I would either have to set the ts within the constructor (__construct), which I am not too happy about either since I am using a lot of magics as it is, or use a function to detect if it's set (which I do now).
I mean I understand why I should fix:
Undefined vars
Assigning properties of unset vars (null vars)
constant errors etc
But if you have tested your code and you know it is safe and will only work the way you desire, what is the point in fixing all of the undefined index or non-object errors? Isn't adding a bunch of isset()s and 2-line functions to your code actually micro-optimisation?
I have noticed after making half my site E_NOTICE compliant that actually it uses more CPU, memory and time now...so really what's the point of dealing with every E_NOTICE error and not just the ones that ARE errors?
Thanks for your thoughts,
You certainly do get better performance by using isset(). I did some benchmarks, not too long ago, and just hiding errors came out to be about 10x slower.
http://garrettbluma.com/2011/11/14/php-isset-performance/
That said, performance usually isn't a critical factor in PHP. What does personally drive me crazy is silent errors.
An interpreter that chooses not to flag something as an error (which could lead to instability) is a huge problem. PHP in particular has a tendency to
warn about things that should error (e.g. failure to connect to database) and
issue notices about things that ought to warn (e.g. attempting to access a member of a null object).
Perhaps I'm just overly opinionated about this kind of stuff but I've been bitten before by these silent errors. I recommend always including E_NOTICE in error reporting.
Whether or not you should fix them is certainly debatable, and will just depend on the return in your situation; eg, it's more important if the code will have a longer life-span, more devs, etc.
In general, assuming that your functions will be used (and mis-used) by someone else is the best practice, so you should do isset/!empty/is_object checks to account for this. Often, your code will find its way into uses and situations you never intended it for.
As far as performance goes, every time any kind of error is thrown (E_NOTICE included) the interpreter spins up the error handler, builds a stack trace, and formats the error. The point is that, whether or not you have them reporting, errors always slow execution; therefore, 2-3 function calls to avoid an E_NOTICE will still improve your performance.
Edit:
Alternatives for the above example
I wouldn't necessarily create extra objects to avoid the errors; you can gracefully avoid them without. Here are a couple of options:
1) Function that handles missing ts:
class SpecialClass {
    public $ts;

    function getTs() {
        return !empty($this->ts) ? $this->ts->sec : false;
    }
}
2) Deal with missing ts in template/procedure:
if (!empty($obj->ts->sec)) {
//do something
}
I particularly like empty() because you can use it in place of (isset($var) && ($var or 0 != $var //etc)), saving multiple calls/comparisons, and empty never throws notices for the target var or attribute. It will throw an error if you're calling it on a property/member of a non-existent variable.
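On PHP 7 and later, the null coalescing operator (??) is another way to read a possibly missing property chain without a notice, since it behaves like isset() on its left-hand side. A small sketch, using a plain stdClass as a stand-in for the row object (the variable names are only for illustration):

```php
<?php
// stdClass stands in for a row that has no ts field at all.
$row = new stdClass();
$missingSec = $row->ts->sec ?? null;   // no notice is raised, the result is null

// The same expression once the field exists.
$row->ts = new stdClass();
$row->ts->sec = 1321228800;
$presentSec = $row->ts->sec ?? null;
```

Unlike empty(), this keeps legitimate falsy values such as 0 or an empty string.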

Is there a way to use preg_match_all without declaring variable first?

Is there some fancy syntax I can use within the preg_match_all function to establish the new $matches variable at that time, rather than doing so beforehand as I have done below?
$matches = '';
preg_match_all('/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}/', file_get_contents($eFetchURL), $matches);
Thanks in advance for your help!
Yes, namely this:
preg_match_all('/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}/', file_get_contents($eFetchURL), $matches);
Taking the reference of a non-existent variable in PHP is not an error. Rather, PHP automatically declares the variable for you and defines it as NULL.
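To see this in isolation, here is a small self-contained sketch that uses a literal string instead of file_get_contents($eFetchURL); the sample text and addresses are made up:

```php
<?php
// Made-up input in place of the fetched page.
$text = 'Contact alice@example.com or bob@example.org.';

// $matches does not exist before this call; passing it by reference creates it.
$count = preg_match_all(
    '/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}/',
    $text,
    $matches
);

// $matches[0] holds the full-pattern matches.
$found = $matches[0];
```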
Reading an undeclared variable raises an E_NOTICE. Depending on php.ini or on the runtime configuration set with the error_reporting function, that notice may or may not be reported.
Good practice is to have E_STRICT mode enabled in development environment.
Note:
Enabling E_NOTICE during development has some benefits. For debugging purposes: NOTICE messages will warn you about possible bugs in your code. For example, use of unassigned values is warned. It is extremely useful to find typos and to save time for debugging. NOTICE messages will warn you about bad style. For example, $arr[item] is better to be written as $arr['item'] since PHP tries to treat "item" as constant. If it is not a constant, PHP assumes it is a string index for the array.
Note:
In PHP 5 a new error level E_STRICT is available. As E_STRICT is not included within E_ALL you have to explicitly enable this kind of error level. Enabling E_STRICT during development has some benefits. STRICT messages will help you to use the latest and greatest suggested method of coding, for example warn you about using deprecated functions.
You can find more information in
http://php.net/manual/en/errorfunc.configuration.php

Are there any essential reasons to use isset() over @ in php

So I'm working on cleanup of a horrible codebase, and I'm slowly moving to full error reporting.
It's an arduous process, with hundreds of notices along the lines of:
Notice: Undefined index: incoming in /path/to/code/somescript.php on line 18
due to uses of variables assuming undefined variables will just process as false, like:
if($_SESSION['incoming']){
// do something
}
The goal is to be able to know when an incorrectly undefined variable is introduced, that is, to be able to use strict error/notice checking, as the first stage in a refactoring process that -will- eventually include rewriting the spots of code that rely on standard input arrays in this way. There are two ways that I know of to replace a variable that may or may not be defined in a way that suppresses notices if it isn't yet defined.
It is rather clean to just replace instances of a variable like $_REQUEST['incoming'] that are only looking for truthy values with
@$_REQUEST['incoming'].
It is quite dirty to replace instances of a variable like $_REQUEST['incoming'] with the "standard" test, which is
(isset($_REQUEST['incoming'])? $_REQUEST['incoming'] : null)
And you're adding a ternary/inline if, which is problematic because you can actually nest parens differently in complex code and totally change the behavior.
So.... ...is there any unacceptable aspect to use of the @ error suppression symbol compared to using (isset($something)? $something : null) ?
Edit: To be as clear as possible, I'm not comparing "rewriting the code to be good" to "@", that's a stage later in this process due to the added complexity of real refactoring. I'm only comparing the two ways (there may be others) that I know of to replace $undefined_variable with a non-notice-throwing version, for now.
Another option, which seems to work well with lame code that uses "superglobals" all over the place, is to wrap the globals in dedicated array objects, with more or less sensible [] behaviour:
class _myArray implements ArrayAccess, Countable, IteratorAggregate
{
    private $a;

    function __construct($a) {
        $this->a = $a;
    }
    // do your SPL homework here: offsetExists, offsetSet etc
    function offsetGet($k) {
        return isset($this->a[$k]) ? $this->a[$k] : null;
        // and maybe log it or whatever
    }
}
and then
$_REQUEST = new _myArray($_REQUEST);
This way you get back control over $_REQUEST and friends, and can watch how the rest of the code uses them.
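For reference, a filled-in version of such a wrapper might look roughly like this. It is a sketch that assumes PHP 8.0 or later (for the mixed type and the interface return types); the class name LoggingArray and the $missing list are hypothetical additions for illustration:

```php
<?php
// Hypothetical wrapper: returns null for missing keys and remembers which keys were missed.
final class LoggingArray implements ArrayAccess, Countable, IteratorAggregate
{
    private array $data;
    public array $missing = [];   // keys that were read but not present

    public function __construct(array $data)
    {
        $this->data = $data;
    }

    public function offsetExists(mixed $offset): bool
    {
        return isset($this->data[$offset]);
    }

    public function offsetGet(mixed $offset): mixed
    {
        if (!array_key_exists($offset, $this->data)) {
            $this->missing[] = (string) $offset;   // or write to a log instead
            return null;
        }
        return $this->data[$offset];
    }

    public function offsetSet(mixed $offset, mixed $value): void
    {
        if ($offset === null) {
            $this->data[] = $value;                // supports $wrapped[] = $value
        } else {
            $this->data[$offset] = $value;
        }
    }

    public function offsetUnset(mixed $offset): void
    {
        unset($this->data[$offset]);
    }

    public function count(): int
    {
        return count($this->data);
    }

    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->data);
    }
}

// A plain array stands in for $_REQUEST here.
$request  = new LoggingArray(['incoming' => '1']);
$incoming = $request['incoming'];
$outgoing = $request['outgoing'];   // missing: no notice, just null and a recorded key
```

One thing to keep in mind is that the wrapper is an object, so code that passes the superglobal to array functions such as array_keys() would need adjusting.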
You need to decide on your own whether you rate the @ usage as acceptable or not. This is hard to rate as a third party, as one needs to know the code for that.
However, it already looks like you don't want any error suppression, so that the code is more accessible to you as the programmer who needs to work with it.
You can write that down as a rule for the re-factoring of the code-base you're referring to and then apply it to the code-base.
It's your decision, use the language as a tool.
You can disable the error suppression operator as well by using your own callback function for errors and warnings, by using the scream extension, or via Xdebug's xdebug.scream setting.
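The callback route works because a handler installed with set_error_handler() is still called for errors that were suppressed with @. A minimal sketch of that behaviour; the $seen list is just an illustrative way to collect the messages:

```php
<?php
$seen = [];

// The handler runs even when the failing expression is prefixed with @.
set_error_handler(function (int $severity, string $message) use (&$seen): bool {
    $seen[] = $message;   // record (or log) the suppressed message
    return true;          // tell PHP the error has been handled
});

$input = [];
$value = @$input['incoming'];   // suppressed for display, but still seen by the handler

restore_error_handler();
```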
You answered your question yourself. It suppresses the error, it does not debug it.
In my opinion you should be using the isset() method to check your variables properly.
Suppressing the error does not make it go away, it just stops it from being displayed because it essentially says "set error_reporting(0) for this line", and if I remember correctly it would be slower than checking isset() too.
And if you don't like the ternary operator then you should use the full if else statement.
It might make your code longer but it is more readable.
I would never suppress errors on a development server, but I would naturally suppress errors on a live server. If you're developing on a live server, well, you shouldn't. That means to me that the @ symbol is always unacceptable. There is no reason to suppress an error in development. You should see all errors including notices.
@ also slows things down a bit, but I'm not sure if isset() is faster or slower.
If it is a pain to you to write isset() so many times, I'd just write a function like
function request($arg, $default = null) {
    return isset($_REQUEST[$arg]) ? trim($_REQUEST[$arg]) : $default;
}
And just use request('var') instead.
Most so-called "PHP programmers" do not understand the whole idea of assigning variables at all,
simply because they lack any programming education or background.
Well, it isn't a big deal with the usual PHP script, which is coded with considerable effort and consists of some HTML/MySQL spaghetti and very few variables.
Bigger code is another matter: writing it is relatively easy, but debugging turns into a nightmare. And you learn to value EVERY bloody error message as you come to understand that error messages are your FRIENDS, not irritating and disturbing things that are better gagged.
So, with this understanding, you learn to leave no intentional errors in your code.
And to define all your variables as well.
And thus you make error messages your friends, telling you that something has gone wrong and helping to hunt down some hard-to-spot error caused by an uninitialized variable.
Another funny consequence of the lack of education is that 9 out of 10 "PHP programmers" cannot distinguish error suppression from turning the display of errors off, and use the former in place of the latter.
I've actually discovered another caveat of the @ beyond the ones mentioned here that I'll have to consider, which is that when dealing with functions, or object method calls, the @ could suppress the error message even though the error kills the script, as per here:
http://us3.php.net/manual/en/language.operators.errorcontrol.php
Which is a pretty powerful argument for avoiding it, given the rare situation where an attempt to suppress a variable notice suppresses an undefined-function error instead (and perhaps that potential to spill over into more serious errors is another unvoiced reason that people dislike @?).

Notice: Undefined index: XXX - Does it really matter?

Since I changed my error reporting level to error_reporting(E_ALL | E_STRICT); I am facing this error. I can avoid this error using isset() but the code looks so ugly!
So my question is: What if I go back to my normal error reporting settings? Does it really matter to know that something is not already defined? Because it works properly without the Notice error.
Because I have 10+ inputs and I get them like this:
$username = $_POST['username'];
I also tried to pre-define the variables using this in the top on the file.
$username = null; and $username = 0; but they don't work.
Thanks.
It does matter. Errors slow down PHP and you really should design your application not to throw errors. Many other languages will completely die in situations where PHP happily continues script execution.
When developing, your script should not throw any errors (even an E_NOTICE).
I would suggest creating a simple function to grab the $_POST values and do the checking for you.
e.g.
<?php
function getPost($key)
{
    return isset($_POST[$key]) ? $_POST[$key] : null;
}
Edit:
Apparently it wasn't clear to the OP how to use this:
$username = getPost('username');
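With ten or more inputs, another option is to declare the expected fields and their defaults once and merge the submitted values over them. A rough sketch; the helper name pickInputs and the field names are hypothetical, and a plain array stands in for $_POST:

```php
<?php
// Hypothetical helper: keeps only the expected keys and fills in defaults for missing ones.
function pickInputs(array $source, array $defaults): array
{
    // array_intersect_key() drops unexpected keys; the + operator adds missing defaults.
    return array_intersect_key($source, $defaults) + $defaults;
}

// Stand-in for $_POST with one expected field and one unexpected field.
$submitted = ['username' => 'bob', 'extra' => '1'];

$inputs = pickInputs($submitted, ['username' => null, 'email' => null]);
$username = $inputs['username'];
$email    = $inputs['email'];      // missing in the submission, so null without a notice
```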
It means there is no key 'username' in the POST array.
Generally, it is a good idea to check and correct these things, as they may ripple to other parts in your application that do depend on the missing value.
It does matter -- when I get strange behaviour in a php application the error log is the first place I look and nine times out of ten an "UNDEFINED INDEX" message leads me straight to the root cause.
Notices do have a purpose: they're a tool to detect potential errors in your code. If you write code that triggers notices for trivial operations and you are not willing to change it, you'll have to disable notice reporting and thus reject a helpful tool on purpose and make your work harder than needed.
Historically, PHP was designed with extreme simplicity in mind (in old versions you'd just have a $username available with zero lines of code) but this approach proved highly inadequate as the web evolved: it only led to code that was insecure and hard to maintain.
All errors should be addressed, no matter the level, for portability.
If you build your application not addressing strict errors, and your application is deployed on a server that does have strict error reporting, your application is going to fall over pretty quickly.
Your best bet is to check the existence of $_POST['username'] and then act on that return value. Using isset(), your return value will be either true or false.
I'm guessing $_POST['username'] is for use in an authentication system of some description? Therefore, if your isset() function returns false you could then display an error detailing to the user that username is required.
