A PHP script (for booking an event) exits to an HTML page showing a 'Thank you for your booking' message, using the usual:
header("Location: http://www.thankyou.com/");
at the head of the script.
The script now also needs to exit to other HTML pages (‘Sorry – no spaces available’ etc) where the page URIs are only determined later in the script. It could issue more header function calls to replace the original header but this makes it vulnerable to accidentally inserted spaces.
This must be a common requirement; what please is the best way to achieve it?
As long as no output has been sent, you can set headers (including the Location header for a redirect) wherever you want in your script. There is no better solution. Of course you would typically do only the operations needed to determine the redirect destination, then redirect and exit the script immediately, so redirects tend to sit near the beginning of your script.
You should also place redirects in some predictable place, e.g. in your controller object or request validation function. It is not a good idea to have redirects scattered over many unrelated places, like inside many nested function calls etc.
If you have problems with accidental spaces, you should repair your scripts, or use the output buffering functions to hold back any accidental output and flush it only after all necessary headers have been sent.
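A minimal sketch of that advice, assuming the booking logic produces an outcome string: one function picks the destination and one function is the single place that redirects. The names bookingRedirectUrl() and redirectAndExit() are made up for illustration, and only the thank-you URL comes from the question; the other URLs are placeholders.

```php
<?php
// Hypothetical mapping from booking outcome to destination page.
// Only the thank-you URL comes from the question; the others are placeholders.
function bookingRedirectUrl(string $outcome): string
{
    $pages = [
        'booked' => 'http://www.thankyou.com/',
        'full'   => 'http://www.example.com/sorry-no-spaces.html',
    ];
    // Fall back to a generic page for outcomes that were not anticipated.
    return $pages[$outcome] ?? 'http://www.example.com/error.html';
}

// Single exit point: discard any buffered accidental output, redirect, stop.
function redirectAndExit(string $url): void
{
    for ($i = ob_get_level(); $i > 0; $i--) {
        ob_end_clean();
    }
    header('Location: ' . $url);
    exit;
}
```

The script would call ob_start() as its first statement, run the booking logic, and finish with redirectAndExit(bookingRedirectUrl($outcome)), so stray whitespace from included files never reaches the client before the header.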
Related
I'm writing a little PHP authentication library (please hold off on the "don't write your own"; I've done this before). As part of my library I want to be able to "abort" execution in some cases.
For instance, take this example:
<html>
<head></head>
<body>
<?php
$auth=...
$auth->RequiresInGroup("admin");
?>
</body>
</html>
What I want RequiresInGroup to do is basically check whether a user is logged in (by looking at the cookies and such), and if they are not logged in or not in that group, send back a 401 error and do a server-side redirect to a "not authorized" page.
I know that I could move my <?php ... ?> block above <html> to make this work, but I'm trying to cover all possible use cases (including poor code design).
Is there a way in PHP to basically hold off on sending content to the client until the end of the request or some similar way to send different HTTP headers in the middle of execution and then exit out of the script?
You need to read the "control output buffering section" of the manual http://www.php.net/manual/en/ref.outcontrol.php
Headers themselves are not sent until your first bit of output; then the response code is sent, followed by any custom headers you have set. You can check whether they have been sent with the headers_sent() function. Until then you can still replace or remove a header you have set (see header_remove()), but once headers have been sent you can't take them back.
So buffer up what you need to send, then spit out the buffer at the end of the application.
Edit: another way of doing this (and the way I prefer, although it relies on a global or singleton class, that can both be frowned upon) is to create your own "output" class. Set that as a global (or singleton) and append output to that. Then you can send headers willy-nilly, and pump your output at the end of execution. Not as "unit test friendly" as ob_functions, but easier to debug and control.
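As a rough sketch of the buffering approach applied to the question's RequiresInGroup idea: the page buffers its output from the first byte, and the check throws the buffer away before sending the error. The function names and the shape of the $user array are assumptions for illustration, not part of any real library. Note that a response carries only one status code, so this sketch sends a 401 with a short body rather than a 401 plus a Location redirect.

```php
<?php
// Hypothetical membership check; a real library would consult cookies or a session.
function userInGroup(array $user, string $group): bool
{
    return in_array($group, $user['groups'] ?? [], true);
}

// Abort the request with a 401 unless the user is in the group.
function requireGroup(array $user, string $group): void
{
    if (userInGroup($user, $group)) {
        return;
    }
    // Drop everything buffered so far (the markup the page already "printed").
    for ($i = ob_get_level(); $i > 0; $i--) {
        ob_end_clean();
    }
    http_response_code(401);
    echo 'Not authorized.';
    exit;
}
```

For this to work, the calling page has to call ob_start() before its first byte of output (or run with output_buffering enabled in php.ini); otherwise the headers will already have gone out by the time requireGroup() runs.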
So I know the general rule of thumb is after doing a header redirect in PHP, you should call exit() to avoid having extra code running, but I want to know if you put code after the redirect header, if it will always run?
I was doing some research on various ways of tracking referrals in Google Analytics and came across this post: Google Analytics Tips & Tricks – Tracking 301 Redirects in Google Analytics
It recommends doing something like this:
<?
Header( "HTTP/1.1 301 Moved Permanently" );
Header( "Location: http://www.new-url.com" );
?>
<script type="text/javascript">
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
try {
var pageTracker = _gat._getTracker("UA-YOURPROFILE-ID");
pageTracker._trackPageview();
} catch(err) {}</script>
From the way I've always understood the header() function, it's up to the browser and it can run the redirect whenever it wants to. So there's no guarantee the JavaScript would actually begin or finish executing prior to the redirect occurring.
PHP's documentation for the header() function indicates the reason for exiting after a redirect is to "make sure that code below does not get executed when we redirect." That doesn't sound like they guarantee all following code will run, just that it could happen.
Regardless, I found a different way to actually manage the tracking, but I wanted to see if I could find out how exactly header() worked in this situation..
Thanks for your help.
Using the header function in PHP only adds to the headers of the response returned by the server. It does not immediately send any data and does not immediately terminate the connection. Any code after the header call will be executed.
In particular, it's a good idea to add a response body even after doing a 301 redirect, so that clients that do not support the redirect also get some descriptive response. In fact, according to Section 10.3.2 of the HTTP 1.1 specification:
Unless the request method was HEAD, the entity of the response SHOULD
contain a short hypertext note with a hyperlink to the new URI(s). If
the 301 status code is received in response to a request other than
GET or HEAD, the user agent MUST NOT automatically redirect the
request unless it can be confirmed by the user, since this might
change the conditions under which the request was issued.
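A small sketch of a redirect that also carries the short hypertext note the specification asks for. The function names are invented for this example; header() is called with its optional replace and response-code arguments so the 301 status is set together with the Location header.

```php
<?php
// Build the short note with a hyperlink to the new URI.
function redirectBody(string $url): string
{
    $safe = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
    return '<p>This page has moved to <a href="' . $safe . '">' . $safe . '</a>.</p>';
}

// Send the 301, the note for clients that do not follow it, and stop.
function movedPermanently(string $url): void
{
    header('Location: ' . $url, true, 301);
    echo redirectBody($url);
    exit;
}
```

The body is escaped because the URL ends up inside HTML; the exit at the end keeps the rest of the script from running once the response is complete.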
It's a race condition. Once the redirect header is sent to the browser, the browser will close the current connection and open a new one for the redirect URL. Until that original connection is closed and Apache shuts down the script, your code will continue to execute as before.
In theory, if there was a sufficiently fast connection between the client/server, and there was no buffering anywhere in the pipeline, issuing the header would cause the script to be terminated immediately. In reality, it can be anywhere between "now" and "never" for the shutdown to be initiated.
The HTML after your Location line doesn't run inside PHP; it would run in the browser. It's up to the browser whether or not to execute the Javascript that you've included on that page; PHP has nothing to do with it.
To me, the PHP docs imply that any PHP below the header() when you send a redirect will continue to run. But it 'runs' in the PHP interpreter, dumping JS to the browser. There's no relation between what it says in the PHP docs and whether or not the JS gets run by the browser.
EDIT:
Well, as Anupam Jain pointed out, it looks like browsers do not terminate the connection without getting the response body, which sounds sensible. So I rethought my answer.
That doesn't sound like they guarantee all following code will run
Exactly. More likely it's a warning in case there is some sensitive code that shouldn't be executed, private page contents for example. So besides sending the header you also have to make sure that no sensitive content is sent, and exit looks like quite a robust solution. So I'd phrase it as "make sure that sensitive code below does not get executed when we redirect."
So there's no guarantee the JavaScript would actually begin or finish executing prior to the redirect occurring.
Exactly. It seems it has nothing to do with script execution, but rather with the browser's willingness to execute anything after getting a 3xx response. I think I'm going to test it, but you can test it as well.
I have noticed that the code does still execute, and multiple headers set from different if statements can cause a "redirect loop" error. I've made it a habit to add die("Redirecting..."); after every header redirect and have not seen the problem since.
header("profil.php?id=" . $show["id"]);
That's what I tried to do, but headers are already sent at the top, so how can I redirect the user? Should I use window.location.replace("URL"); (JavaScript) instead?
If you can't control the very beginning of the script, where headers would be sent, then yes, your only method is to use JavaScript.
Also, the proper syntax is header('Location: profil.php?id=' . $show['id']);
You need the Location: part so the browser knows what header it's receiving. Also, don't forget to do an exit() or die() right after the redirect.
Someone correct me if I'm wrong, but I think you can call ob_start() at the beginning of your page. Output is then buffered rather than sent, so the headers have not gone out yet and you can still redirect via PHP even after the page has produced output.
You should redesign your application, to make it more sensible.
It should start output only when it necessary, not just every time this file is called.
You have to modify all your code by dividing every script into 2 parts. The first part will contain all data manipulation and the second will contain output only. It is better to put the latter into a separate file, called a template. Thus your profiles PHP file will look like this:
include 'dbc.php';
//some code that sends headers, gets data etc
//after it's all done, call your template files
include 'top.php';
include 'profiles.tpl.php';
include 'bottom.php';
there can be some variations, but the main idea would be the same: separate your data manipulation from data presentation.
From the header documentation:
Remember that header() must be called before any actual output is sent, either by normal HTML tags, blank lines in a file, or from PHP. It is a very common error to read code with include(), or require(), functions, or another file access function, and have spaces or empty lines that are output before header() is called. The same problem exists when using a single PHP/HTML file.
The headers are being sent before your call to header() due to output from the script. You just need to track down where the output is coming from.
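One way to track the output down, sketched under the assumption that you can add a few lines just before the failing header() call: headers_sent() accepts two optional by-reference arguments that PHP fills with the file and line where output started. The helper name whereOutputStarted() is made up for this example.

```php
<?php
// Returns a description of where output began, or null if headers can still be sent.
function whereOutputStarted(): ?string
{
    $file = '';
    $line = 0;
    if (headers_sent($file, $line)) {
        return sprintf('Output started at %s, line %d', $file, $line);
    }
    return null;
}

// Hypothetical use just before a redirect:
//   if (($where = whereOutputStarted()) !== null) { error_log($where); }
```

The reported location is usually an included file with a blank line or space outside its PHP tags, which is exactly the case the documentation warns about.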
I see it that you have two options
1) You try to ensure that your headers are not set until after you have executed your code. Your headers being set before you have even determined what you are sending back to the user suggests your code is a little messy, or you are constrained in some way.
2) You can use your javascript solution. However, I would consider this as a hack, rather than an appropriate solution. Try to figure out the answer to why you can't use approach 1.
EDIT: A code example added
Your code should look something like this
<?php
// perform logic to determine if you need to do the redirect or not.
// if you do need to redirect, set the following
$iNeedToRedirect = true;
// if you do not need to redirect, set the following instead
// $iNeedToRedirect = false;
if ($iNeedToRedirect) {
header("Location: profil.php?id=" . $show["id"]);
die();
}
// if code gets here, carry on as normal
include("dbc.php");
include("top.php");
// ... etc etc etc
?>
Quick question: I noticed that on some of my header redirects I was getting some lag while the header was processed. Is using return standard after sending headers? Also, if you use a header on pages you don't want accessed directly, such as processing pages, will return; stop that processing even if the page is not accessed directly? If return is a good idea, would it be better to use exit()?
header("Location: ......"); exit; is a fairly common pattern.
You do not need to put return; after calling header(), but some people follow the convention of putting exit; after the header() call to ensure the code below it does not execute during a redirect.
Keep in mind you can use header() for other things besides Location: redirects:
header("Content-type: image/jpeg"); // for example
The reason you would exit after a header redirect is, any content output after a header() redirect, will (most likely) not be seen by the browser.
More importantly you wouldn't want any code to be executed after a header() redirect, so calling exit() after a redirect is good practice.
When you send the header, it is merely an advisory to the client (the browser) that you think it should request another URL instead. However, nothing can stop it from not following your recommendation. It can continue reading more data from the current URL, if your server keeps feeding it. This is why you generally see PHP code that calls exit() after sending a redirect header: if you stop outputting data, there is nothing more for the client to read.
Aside from keeping clients from reading unintended data, there are other reasons:
Maybe it's just plain senseless for the rest of the script to continue executing, wasting resources.
Maybe runtime errors would occur if the script were to continue (e.g. there were missing variables, or a db connection failed).
Maybe logic errors would occur if the script were to continue (e.g. user input validation/authentication failed).
It's up to the client to determine what to do after an header("Location: ...").
Any code after header() will be executed regardless. Putting an exit(); just after the header is a safeguard and is required for securing your site.
If you have some candy after header("Location: ..."), the only thing the client has to do is ignore the redirect. Then it'll all be clear as day. With exit(); you stop execution of the page, and hopefully there are no other attack vectors into your app!
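A minimal sketch of the pattern these answers describe, for a page that must not be reachable without logging in. Here $session stands in for $_SESSION, and the 'user_id' key and '/login.php' URL are placeholders chosen for the example.

```php
<?php
// True when the request carries no logged-in user.
function needsLogin(array $session): bool
{
    return empty($session['user_id']);
}

// Redirect anonymous visitors and stop; logged-in users fall through.
function requireLogin(array $session): void
{
    if (needsLogin($session)) {
        header('Location: /login.php');
        // Without this exit, everything after the call would still run, and
        // its output would reach any client that ignores the redirect.
        exit;
    }
}
```

A protected page would call requireLogin($_SESSION) before doing anything else, so the private part of the script is never executed for an anonymous request.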
I read somewhere that ob_start() should be placed at the top of the page,
somewhere else that session_start() should be placed at the top of the page,
somewhere else that header() should be placed at the top of the page,
and somewhere else that include() or require() should be placed at the top of the page.
I'm getting confused about what should be written at the top and in which order. And what does "at the top" mean? Is it
before <html>, or
after <html> or before <head>, or
after <head>?
Please tell me the real order of all these functions.
In the same manner, where do we have to put ob_end_flush(); and other functions: at the bottom of the page after </html>, or after </body>? And what is the order of the functions that come at the bottom of the page?
In order to understand the value of the statements you have written you need to have some basic understanding of the operations of the functions you mention. I'll try to break them down here.
Let's start with session_start() and header() calls:
The first function does exactly what the name implies; it starts a session.
Due to the stateless nature of the HTTP protocol, there is a need for some mechanism that can remember state between page requests. This can be achieved with sessions. Although sessions in the early days of PHP were sometimes propagated by passing along the session ID in links ( someurl?sessionId=someSessionHash ), this is nowadays considered bad practice.
Nowadays, sessions are predominantly tracked with a cookie (cookies were widely used in the early days too, don't get me wrong). This session cookie (which, contrary to popular belief, is nothing more than a normal cookie holding just the session ID, and which usually expires when you close your browser) is set by the server and then sent back by the browser with each subsequent page request. And here is the catch: the server sets a cookie with a header of the response (meaning before the actual body), like so:
// I've left out a lot of other headers for brevity
HTTP/1.x 200 OK
Date: Sun, 31 Jan 2010 09:37:35 GMT
Set-Cookie: SESSION=DRwHHwAAACpes38Ql6LlhGr2t70df // here is your cookie header
// after all response headers come the actual content:
// the response body, for instance:
<html>
<head>
</head>
<body>
</body>
</html>
Now, because response headers must be sent before the response body, you need to put the calls to session_start() and header() before any body content is output. Here's why: if you output any response body content (it could be something as simple as a whitespace character) before a call to session_start() or header(), PHP will automatically send the response headers, because an HTTP response must have its headers sent out before the body. And it is exactly this that often leads to the infamous "headers already sent" warning in PHP. In other words: once PHP has sent out the headers, because it had to send body data too, it cannot add any more headers.
So, now that you understand this about the HTTP protocol, there are some measures you can take to prevent this from happening. And this is where we come to the next function(s):
ob_start, ob_flush, etc...:
In a default setup PHP usually outputs anything immediately. Therefore, if you output any response body content, headers are automatically sent first.
But PHP offers mechanisms of buffering output. This is the ob_* family of functions. With ob_start you tell PHP to start buffering. And with ob_flush you tell PHP to flush the buffer; in other words output the current content of the buffer to the standard output.
With these buffering mechanisms you can still add headers to the response, after you have output body data, because you haven't actually sent body data yet, you have simply buffered it, to be output later with a call to ob_flush or ob_end_flush and what have you.
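The mechanism can be sketched like this: generate the body into a buffer first, add headers afterwards, and only then send the body. The helper captureOutput() and the X-Example header are invented for illustration.

```php
<?php
// Run $render with buffering on and return what it printed instead of sending it.
function captureOutput(callable $render): string
{
    ob_start();
    $render();
    return (string) ob_get_clean();
}

// "Output" the page body; it lands in the buffer, not on the wire.
$body = captureOutput(function (): void {
    echo '<html><body>Hello</body></html>';
});

// Nothing has reached the client yet, so a header can still be added here.
if (!headers_sent()) {
    header('X-Example: set-after-body-was-generated');
}

// Finally send the body; PHP sends the headers just before it.
echo $body;
```

ob_get_clean() returns the buffered text and closes the buffer in one step, which is why the body has to be echoed explicitly at the end.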
Keep in mind though that using the ob_* functions is, more often than not, a code smell. In other words (and this is why it is important to do certain things at the top), they are then used to make up for poor design. Somebody forgot to set up their order of operations properly and resorts to output buffering to circumvent this header and session drama.
Having said all this, you can easily see why the outputting of HTML and/or other body content should come last. Apart from that, I strongly recommend separating PHP code from output code anyway, because it is much easier to read and understand. A good way to start doing that is to have the actual HTML come after the main <?php ?> code block. There are other ways as well, but they are beyond this question's scope.
Then lastly, about the include and require calls. Having these at the top of your PHP files is usually meant for clarity: it keeps the calls nicely in one place. But keep in mind that if one of these files outputs anything before you call session_start() or header(), and you are not using output buffering, you're screwed again.
"Top of the page" means before any output. "Bottom of the page" means after all output.
It simply means before any other character, where "other" means "non PHP code".
All code between <?php and ?> is not sent to the browser, so it doesn't count. Thus usually "top of the page" means before the <html> start tag. Be careful because if you have an empty line or even just one single whitespace before that tag (or even before the PHP opening tag), that counts as output.