So, I don't come from a huge PHP background—and I was wondering if in well formed code, one should use the 'superglobals' directly, e.g. in the middle of some function say $_SESSION['x'] = 'y'; or if, like I'd normally do with variables, it's better to send them as arguments that can be used from there, e.g:
class Doer {
private $sess;
public function __construct(&$sess) {
$this->sess =& $sess;
}
}
$doer = new Doer($_SESSION);
and then use the Doer->sess version from within Doer and such. (The advantage of this method is that it makes clear that Doer uses $_SESSION.)
What's the accepted PHP design approach for this problem?
I do like to wrap $_SESSION, $_POST, $_GET, and $_COOKIE into OOP structures.
I use this method to centralize code that handles sanitation and validation, all of the necessary isset () checks, nonces, setcookie parameters, etc. It also allows client code to be more readable (and gives me the illusion that it's more maintainable).
It may be difficult to enforce use of this kind of structure, especially if there are multiple coders. With $_GET, $_POST, and $_COOKIE (I believe), your initialization code can copy the data, then destroy the superglobal. Maybe a clever destructor could make this possible with $_SESSION (wipe $_SESSION on load, write it back in the destructor), though I haven't tried.
I don't usually use any of these enforcement techniques, though. After getting used to it, seeing $_SESSION in code outside the session class just looks strange, and I mostly work solo.
EDIT
Here's some sample client code, in case it helps somebody. I'm sure looking at any of the major frameworks would give you better ideas...
$post = Post::load ();
$post->numeric ('member_age');
$post->email ('member_email');
$post->match ('/regex/','member_field');
$post->required ('member_first_name','member_email');
$post->inSet ('member_status',array('unemployed','retired','part-time','full-time'));
$post->money ('member_salary');
$post->register ('member_last_name'); // no specific requirements, but we want access
if ($post->isValid())
{
// do good stuff
$firstName = $post->member_first_name;
}
else
{
// do error stuff
}
Post and its friends all derive from a base class that implements the core validation code, adding their own specific functionality like form tokens, session cookie configuration, whatever.
Internally, the class holds a collection of valid data that's extracted from $_POST as the validation methods are called, then returns them as properties using a magic __get method. Failed fields can't be accessed this way. My validation methods (except required) don't fail on empty fields, and many of them use func_get_args to allow them to operate on multiple fields at once. Some of the methods (like money) automatically translate the data into custom value types.
In the error case, I have a way to transform the data into a format that can be saved in the session and used to pre-populate the form and highlight errors after redirecting to the original form.
One way to improve on this would be to store the validation info in a Form class that's used to render the form and power client-side validation, as well as cleaning the data after submission.
Modifying the contents of the superglobals is considered poor practice. While there's nothing really wrong with it, especially if the code is 100% under your control, it can lead to unexpected side effects, especially when you consider mixed-source code. For instance, if you do something like this:
$_POST['someval'] = mysql_real_escape_string($_POST['someval']);
you might expect that everywhere PHP makes that 'someval' available would also get changed, but this is not the case. The copy in $_REQUEST['someval'] will be unchanged and still the original "unsafe" version. This could lead to an unintentional injection vulnerability if you do all your escaping on $_POST, but a later library uses $_REQUEST and assumes it's been escaped already.
As such, even if you can modify them, it's best to treat the superglobals as read-only. If you have to mess with the values, maintain your own parallel copies and do whatever wrappers/access methods required to maintain that copy.
I know this question is old but I'd like to add an answer.
mario's classes to handle the inputs is awesome.
I much prefer wrapping the superglobals in some way. It can make your code MUCH easier to read and lead to better maintainability.
For example, there is some code at my current job the I hate! The session variables are used so heavily that you can't realistically change the implementation without drastically affecting the whole site.
For example,
Let's say you created a Session class specific to your application.
class Session
{
//some nice code
}
You could write something like the following
$session = new Session();
if( $session->isLoggedIn() )
{
//do some stuff
}
As opposed to this
if( $_SESSION['logged'] == true )
{
//do some stuff
}
This seems a little trivial but it's a big deal to me. Say that sometime in the future I decide that I want to change the name of the index from 'logged' to 'loggedIn'.
I now have to go to every place in the app that the session variable is used to change this. Or, I can leave it and find someway to maintain both variables.
Or what if I want to check that that user is an admin user and is logged in? I might end up checking two different variables in the session for this. But, instead I could encapsulate it into one method and shorten my code.
This helps other programmers looking at your code because it becomes easier to read and they don't have to 'think' about it as much when they look at the code. They can go to the method and see that there is only ONE way to have a logged in user. It helps you too because if you wanted to make the 'logged' in check more complex you only have to go to one place to change it instead of trying to do global finds with your IDE and trying to change it that way.
Again, this is a trivial example but depending on how you use the session this route of using methods and classes to protect access could make your life much easier to live.
I would not recommend at all passing superglobal by reference. In your class, it's unclear that what you are modifying is a session variable. Also, keep in mind that $_SESSION is available everywhere outside your class. It's so wrong from a object oriented point of view to be able to modify a variable inside a class from outside that class by modifying a variable that is not related to the class. Having public attribute is consider to be a bad practice, this is even worst.
I found my way here while researching for my new PHP framework.
Validating input is REALLY important. But still, I do often find myself falling back to code like this:
function get( $key, $default=FALSE ){
return (isset($_GET[$key]) ? $_GET[$key]:$default);
}
function post( $key, $default=FALSE ){
return (isset($_POST[$key]) ? $_POST[$key]:$default);
}
function session( $key, $default=FALSE ){
return (isset($_SESSION[$key]) ? $_SESSION[$key]:$default);
}
Which I then use like this:
$page = get('p', 'start');
$first_name = post('first_name');
$last_name = post('last_name');
$age = post('age', -1);
I have found that since I have wildly different requirements for validation for different projects, any class to handle all cases would have to be incredibly large and complex. So instead I write validation code with vanilla PHP.
This is not good use of PHP.
get $_SESSION variables directly:
$id = $_SESSION['id'];
$hash = $_SESSION['hash'];
etc.
Related
I came across an interesting piece of PHP code which has me a bit stumped as to why the author has chosen to do this.
function do_something($db, $post_vars){
foreach($post_vars as $key => $value{
$vars[$key] = mysqli_real_escape_string($db, $value);
}
return $vars;
}
$db = mysqli_connect("myhost","myuser","mypassw","mybd") or die("Error " . mysqli_error($link));
do_something($db, $_POST);
It got me thinking about why someone would want to pass $_POST as a variable and just not access it directly inside the function? The only benefit I could think of (and this was a bit of a long shot) was if we were to append other information to $_POST before calling the function (such as):
function do_something($db, $post_vars){
foreach($post_vars as $key => $value{
$vars[$key] = mysqli_real_escape_string($db, $value);
}
return $vars;
}
$db = mysqli_connect("myhost","myuser","mypassw","mybd") or die("Error " . mysqli_error($link));
foreach($_POST as $post_key => $post_value){
$post[$post_key] = $post_value;
}
$post['my_custom_var'] = "a";
do_something($db, $post);
There is, however, no evidence of this practise anywhere in the code. Just calls to do_something() with $_POST being passed as an arugment.
My question then is, is there any benefit in doing it like this that I've missed or did the author simply not understand that $_POST is a global variable?
A complete long shot: Is there perhaps even any well intended "later additions" they could make to this (such as my example) that would almost justify this practise or is this just a case of misunderstanding. Or perhaps is there a security implication that could justify the practise?
IMHO it's a practice of abstraction, and there are benefits:
Generality: by receiving $_POST as a parameter, the function becomes less tightly coupled to $_POST. The function may serve more scenarios & possibly be more reusable.
Inversion of control: because the function's dependency($_POST) is injected from outside, you have more control over the function. It is a bit unlikely but let's suppose your form has been updated and now you need to submit via GET method. Without modifying the function's body, passing in $_GET on the caller's side, is enough to reflect the change.
Test fixture isolation: To mock FORM inputs to test a certain code path in the function, it is better access global states (such as $_POST) in an abstracted way, so that the test itself does not bring side effects to other part of the system.
In general, there is a very good reason to pass $_POST or $_GET or whatever array you want as argument instead of accessing it through the auto-global variables. It's called "dependency injection" and it's a key feature for testable code.
Another reason to not use auto-globals (or global variables, in general) is the readability of the code.
Compare:
function do_something($db, array $post_vars) {
// many lines of code here, you don't want to read them
}
do_something($db, $_POST);
with
function do_something_else($db) {
// many lines of code here that use $_POST, you don't want to read them either
}
do_something_else($db);
In the first case its clear that function do_something() operates over the values of $_POST (or $_GET or whatever other array full of data you pass it as argument). You don't have to read the function's code to know that; all the relevant information is displayed in the function header and in the example usage.
In the second case, the behaviour of function do_something_else() depends on the content of $_POST but one cannot know this without looking at its code.
Back to the code you posted, it doesn't look like it was written having testability in mind.
Take a look only at:
foreach($_POST as $post_key => $post_value){
$post[$post_key] = $post_value;
}
It basically does $post = $_POST in more words.
It's a good idea to sanitize the $_POST Superglobal before passing it as an argument of a function, then escape it inside the function that has your query logic.
What he does is escaping every POST variable through mysqli_real_escape_string function, which makes sure you avoid SQL Injection attacks.Then he returns the array (with escaped values) by calling : return $vars;
He might have used a function to do this because he intends to repeat the steps several times throughout the application and he might just call that function and passing the variables instead of writing the code over and over again.
MAKE SURE YOU ALWAYS ESCAPE USER INPUT BEFORE MAKING ANY DATABASE CALLS.
I wonder what would be the professional way to handle insert/delete requests of a website?
Right now, what I have is I inserted two <input type = "hidden"/> on each form where one hidden's value correspond to a function it needs and the other is the parameter of this function. So when the form submits, I have a post.php file that handles ALL insert/delete requests that simply invokes the value of the hiddens via call_user_func() in PHP. Like so:
<input name = "arg" value = "{$id}" type = "hidden"/>
<input name = "call" value = "delete_member" type = "hidden"/>
call_user_func($_POST['call'], $_POST['arg']);
I'm having doubts on how sensible this solution is because I found out that the hiddens aren't actually hidden in the source on the client-side.
My first solution was to basically have a lot of conditionals checking for which function to invoke but I really hated that a lot so I changed it with this solution.
I wonder what are the better ways I can do this, or maybe how the professionals do it? Handling insert/delete queries.
I would consider this a very bad way to call actions from the client side.
For one this data is put into the HTML which will always be viewable and editable by the client. As such this means you cannot trust the data you receive from the client and as such you cannot trust the function they are calling.
Another point to re-inforce my previous idea. You say you run validation to make sure it is a function and all that, but you have a problem. Closures return true on these functions (since they are functions and methods and they exist). So a user can put a anon function as the value of your hiden field and actually run whatever they want on your server.
As others say I would recommend looking into MVC. Look into how Yii/CodeIgniter/Zend/Lithium/Kohana/etc do this and how they route.
An example of how routing for actions such as deletion is done by my favourite framework, Yii:
class UserController extends CController{
public function actionDelete($id = null){
if($id===null){ return false; }
}
}
Then the form/link calls /user/delete?id=2 which makes index.php route to the userController and use the actionDelete function inside the user controller, running it's code. Of course this is a very simplified version and it gets a lot more complex to stop vulnerabilities.
You may also wish to look into CSRF validation.
The most common way is to just call a function that takes care of one form at the time like this example for submitting a blog message:
blog.php
if(isset($_POST['submit'])) {
save_message();
} else {
display_form();
}
function display_form()
{
?>
<form action="blog.php"> etc etc....
<?php
}
function save_message()
{
//security checks and inserts etc
$_SESSION['message'] = 'Form saved succesfully';
header('location: blog_overview.php');
}
This is according to me a practish, but you might want to checkout frameworks like Codeigniter and Kohana since the above code is functional (and to me outdated). Read some tutorials about OOP (Object Oriented Programming) and MVC (Model View Controller). It might seem alot of work, but if you really want it to do it right it is worth the time and effort.
This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Are global variables in PHP considered bad practice? If so, why?
global in functions
Edit: question answered in link above.
No, "global" in php is not the same thing as global in other languages, and while it does not introduce any security problems it can make the code less comprehensible to others.
OP:
Project summary - I am writing a web CMS to get my feet wet with PHP / MySQL. In order to break up the code I have a concept of these basic tiers / modules:
Data
- MySQL Tables
- PHP Variables
Function
- SQL - Get / Set / etc
- Frontend - Showing pages
- Backend - Manager
Presentation
- HTML Templates
- Page Content
- CSS Stylesheets
The goal is common enough. Use MySQL to hold site settings and page content, use php to get / manipulate content data for a page being served, then insert into an html template and echo to browser. Coming from OO languages like C# the first problem I ran into was variable scope issues when using includes and functions.
From the beginning I had been writing function-only php files and including them where needed in other files that had existing variable array definitions. For example, ignoring the data tier for a moment, a simple page might generically look like this:
File 1 (page)
$DATA_PAGE = Array
(
'temp' = null,
'title' = null,
[...]
);
include 'functions.php';
get_data ( 'page.php' );
[...]
run_html ();
File 2 (functions)
function get_data ( $page_name )
{
global $DATA_PAGE;
$DATA_PAGE [ 'temp' ] = 'template';
$DATA_PAGE [ 'title' ] = 'test page';
[...]
}
function run_html ()
{
global $DATA_PAGE;
echo '<html>';
echo '<head>';
echo '<title>' . $DATA_PAGE [ 'title' ] . '</title>';
[...]
}
I chose this method for a few reasons:
The data in these arrays after sql fetch might be used anywhere, including page content
I didnt want to have a dozen function arguments, or pass entire arrays
The code runs great. But in every article I find on the subject, the "global" calls in my functions are called bad practice, even though the articles never state why? I thought all that meant was "use parent scope". Am in introducing a security hole into my app? Is there a better method? Thanks in advance.
I think a top reason for avoiding this is that it hides dependencies.
Your functions get_data and run_html do not advertise in any way that they share data, and yet they do, in a big way. And there is no way (short of reading the code) to know that run_html will be useless if get_data has not been called.
As the complexity of your codebase grows, this kind of lurking dependency will make your code fragile and hard to reason about.
Because global variables can be modified by the process without other parts of your code knowing it producing unexpected results.
You should always try to keep variables scoped as locally as possible -- not only will this help you with debugging, but it will keep your code easier and cleaner to read and go back and modify.
If you are looking to share data across multiple functions, you might look into making a class for your data and then defining methods to operate on the data encapsulated in your object. (Object-Oriented programming)
For many reasons, for example:
Hard to support code with global variables. You don't know where global variables can affect your logic and don't control access to them
Security - if you have complex system (especially with plugins), someone can compromise all system with global variables.
Settings in a global variable are fine, but putting data that can be modified into a global variable can, as tkone said, have unexpected results.
That said, I don't agree with the notion that global variables should be avoided at all costs - just try wrapping them into, say, a singleton settings class.
It is bad practice to use variables from a different scope to your own, because it makes your code less portable. In other words, if I want to use the get_data() function somewhere else, in another project, I have to define the $DATA_PAGE variable before I can use it.
AFAIK this is the only reason this should be avoided - but it's a pretty good reason.
I would pass by reference instead.
function get_data (&$data_page, $page_name )
{
$data_page['temp'] = 'template';
$data_page['title'] = 'test page';
[...]
}
get_data($DATA_PAGE);
To check If a user is logged in I need to pull off a pretty long if-statement and then redirect the user depending if the user is logged in or not. I think a custom function like
if (logged_in()) { redirect }
Would be more appropriative. But building a library for one function seems unnecessary to me. What should I do?
I need to pull off a pretty long if-statement, but building a library for one function seems unnecessary
It's not at all "unnecessary", neither is it strictly "necessary", but it's probably a good idea to create a library/class for this.
If you have a lot of logic you need to work with, "a pretty long if-statement" for example, using a class can help you break this down into smaller pieces and make the logic more manageable. If you only need to call one public method of the class, like $this->auth->is_logged_in(), there's nothing wrong with that, then you can create a small helper file or wrapper function to call the method, and put the redirect logic there instead of the class. Something like this perhaps:
// Make sure your "auth" library is autoloaded or load it here
function logged_in($redirect = TRUE)
{
$CI =& get_instance();
$logged_in = $CI->auth->is_logged_in();
// Redirect the user...
if ( ! $logged_in AND $redirect)
{
redirect('somewhere/else/');
}
// Or just check if they are logged in
return $logged_in;
}
Using a class/library has many benefits, and with something as complicated as user authorization you will benefit greatly from taking advantage of it, especially once your project starts to expand and you need more utility.
Although helpers are usually preserved for decoupled functions that have nothing to do with your app, I think in this case they are appropriate. Simply create a helper function called is_logged_in.
To learn more about helpers, visit the Docs:
http://codeigniter.com/user_guide/general/helpers.html
im re-factoring php on zend code and all the code is full of $_GET["this"] and $_POST["that"]. I have always used the more phpish $this->_request->getPost('this') and $this->_request->getQuery('that') (this one being not so much logical with the getquery insteado of getGet).
So i was wondering if my method was safer/better/easier to mantain. I read in the Zend Framework documentation that you must validate your own input since the request object wont do it.
That leaves me with 2 questions:
What is best of this two? (or if theres another better way)
What is the best practice for validating php input with this methods?
Thanks!
I usually use $this->_request->getParams(); to retrieve either the post or the URL parameters. Then I use the Zend_Filter_Input to do validation and filtering. The getParams() does not do validation.
Using the Zend_Filter_Input you can do application level validation, using the Zend Validators (or you can write your own too). For example, you can make sure the 'months' field is a number:
$data = $this->_request->getParams();
$validators = array(
'month' => 'Digits',
);
$input = new Zend_Filter_Input($filters, $validators, $data);
Extending Brian's answer.
As you noted you can also check out $this->_request->getPost() and $this->_request->getQuery(). If you generalize on getParams(), it's sort of like using the $_REQUEST superglobal and I don't think that's acceptable in terms of security.
Additional to Zend_Filter, you may also use simple PHP to cast the required.
E.g.:
$id = (int) $this->_request->getQuery('id');
For other values, it gets more complicated, so make sure to e.g. quote in your DB queries (Zend_Db, see quoting identifiers, $db->quoteIdentifier()) and in views use $this->escape($var); to escape content.
You can't write a one-size-fits-all validation function for get/post data. As in some cases you require a field to be a integer and in others a date for instance. That's why there is no input validation in the zend framework.
You will have to write the validation code at the place where you need it. You can of course write some helper methods, but you can't expect the getPost() to validate something for you all by itself...
And it isn't even getPost/getQuery's place to validate anything, it's job is to get you the data you wan't, what happens to it from there on should not be it's concern.
$dataGet = $this->getRequest()->getParam('id',null);
$valid = new Zend_Validate_Digits();
if( isset($dataGet) && $valid->isValid($dataGet) ){
// do some...
} else{
// not set
}
I have always used the more phpish $this->_request->getPost('this') and $this->_request->getQuery('that') (this one being not so much logical with the getquery insteado of getGet).
What is best of this two? (or if theres another better way)
Just a quick explanation on the choice of getQuery(). The wording choice comes from what kind of data it is, not how it got there. GET and POST are just request methods, carrying all sorts of information, including, in the case of a POST request, a section known as "post data". A GET request has no such block, any variable data it carries is part of the query string of the url (the part after the ?).
So, while getPost() gets the data from the post data section of a POST request, getQuery() retrieves data from the query string of either a GET or POST request (as well as other HTTP Request methods).
(Note that GET Requests should not be used for anything that might produce a side effect, like altering a DB row)
So, in answer to your first question, use the getPost() and getQuery() methods, this way, you can be sure of where the data source (if you don't care, getParams() also works, but may include additional data).
What is the best practice for validating php input with this methods?
The best place to validate input is where you first use it. That is to say, when you pull it from getParams(), getPost(), or getQuery(). This way, your data is always correct for where you need it, and if you pass it off, you know it is safe. Keep in mind, if you pass it to another Controller (or Controller Action), you should probably check it again there, just to be safe. How you do this depends on your application, but it still needs to be checked.
not directly related to the topic, but
to insure that you get an number in your input, one could also use $var+0
(however if $var is a float it stays a float)
you may use in most cases
$id = $this->_request->getQuery('id')+0;