What is the definitive solution for avoiding any kind of caching of HTTP data? We can modify the client as well as the server, so I think we can split the task between them.
The client can append a random parameter to each request, e.g. http://URL/path?rand=6372637263. My feeling is that this alone does not work 100% of the time; there might be intelligent proxies that can detect it. On the other hand, I think that if the URL differs from the previous one, a proxy cannot simply decide to send back a cached response.
On the server we can control a bunch of HTTP headers:
Expires: Tue, 03 Jul 2001 06:00:00 GMT
Last-Modified: {now} GMT
Cache-Control: no-store, no-cache, must-revalidate, max-age=0
Cache-Control: post-check=0, pre-check=0
Pragma: no-cache
Any comments on this? What is the best approach?
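For reference, the client-side cache-busting described above might look like this in PHP (a hypothetical sketch; example.com and the parameter name are placeholders):
// Hypothetical sketch: a random query parameter makes every request
// URL unique, so it can never match an entry already stored in a cache.
$url = 'http://example.com/path?rand=' . mt_rand();
$response = file_get_contents($url);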
Server-side cache control headers should look like:
Expires: Tue, 03 Jul 2001 06:00:00 GMT
Last-Modified: {now} GMT
Cache-Control: max-age=0, no-cache, must-revalidate, proxy-revalidate
Avoid rewriting URLs on the client, because it pollutes caches and causes other weird semantic issues. Furthermore:
Use one Cache-Control header (see RFC 2616), because behaviour with multiple entries is undefined. Also, the MSIE-specific entries in the second Cache-Control line are at best redundant.
no-store is about data security. (It only means "don't write this to disk"; caches are still allowed to store the response in memory.)
Pragma: no-cache is meaningless in a server response; it is a request header meaning that any caches receiving the request must forward it to the origin.
Using both Expires (HTTP/1.0) and Cache-Control (HTTP/1.1) is not redundant, since proxies exist that only speak HTTP/1.0 or will downgrade the protocol.
Technically, the Last-Modified header is redundant in light of no-cache, but it's a good idea to leave it in there.
Some browsers will ignore subsequent directives in a Cache-Control header after they come across one they don't recognise, so put the important stuff first.
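Putting that advice together in PHP, for example, the response side might look like this (a minimal sketch):
// Expires targets HTTP/1.0 caches; Cache-Control targets HTTP/1.1.
// Important directives come first, in a single Cache-Control header.
header('Expires: Tue, 03 Jul 2001 06:00:00 GMT');
header('Last-Modified: ' . gmdate('D, d M Y H:i:s') . ' GMT');
header('Cache-Control: max-age=0, no-cache, must-revalidate, proxy-revalidate');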
Adding the header
Cache-Control: private
guarantees that a gateway cache won't cache such a response.
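In PHP that could be set like this (a minimal sketch):
// private: shared (gateway/proxy) caches must not store the response,
// though the user's own browser cache still may.
header('Cache-Control: private');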
I'd like to recommend Fabien Potencier's lecture about caching: http://www.slideshare.net/fabpot/caching-on-the-edge
To disable the cache, you should use
Expires: 0
or
Cache-Control: no-store
If you use one, you should not use the other.
I realise that this might be a VERY obscure question, but it's driving me mad: I have 5 extra characters being inserted into the URL while navigating between the pages on my site (e.g. http://track.chhs.nsw.edu.au/UXTWP/userAccount.php?). The UXTWP is being added, I'm not sure from where, and it is breaking the navigation randomly.
The site is hosted on GoDaddy.
It contains HTML, CSS, PHP, JavaScript and MySQL.
Everything was working well until I added a "fix" in PHP to stop a potential 'hack' that would use an id passed in the URL to switch the viewed content.
I'm not sure this was the problem, but it was the most recent change before the errors started occurring.
This is the site. I also looked at putting the code up on phpfiddle, but I'm not sure what to add.
if (isset($_GET['a'])) {
    // Only proceed if the requested student id belongs to this user.
    if (strpos($userRow['sID'], $_GET['a']) !== false) {
        $_SESSION['student'] = $_GET['a'];
        // Note: built by string concatenation; a parameterised query would be safer.
        $tempArray = db_select("SELECT * FROM student WHERE sID ='" . $_SESSION['student'] . "'");
        $studentRow = array_shift($tempArray);
        $_SESSION['impactTool'] = $studentRow['impactAssToolID'];
        $SName     = $studentRow['sName'];
        $SDOB      = $studentRow['dob'];
        $SFormDate = $studentRow['formDate'];
        $prevInf   = $studentRow['prevInfo'];
        $famInf    = $studentRow['famInfo'];
        $contInf   = $studentRow['contextInfo'];
        $impactIDMsg = "?z=" . $_SESSION['impactTool'];
        $btnFlag = true;
    } else {
        header("Location: logout.php");
        exit; // stop execution after the redirect
    }
}
The intention is to dump the user back to the login screen via logout if they attempt to access a student's detail that doesn't belong to them.
Thanks in advance for any help provided.
OK, this time I think it is fixed!! Thank you so much @Progrock for your persistent testing and ideas.
The fix:
I have included a blank .htaccess file in the root of the site.
Now I can navigate through the different pages using the onsite navigation and the browser navigation and I can't create the error anymore.
I'm hoping that this is a permanent fix. My best guess is that the browser/server was looking for the .htaccess file on particular triggers and, when not finding it, falling back to the server's generic .htaccess file.
Hope this post helps someone in the future experiencing a similar problem.
Not an answer, but an observation:
I finally experienced the bug when using curl to view headers:
curl -I http://track.chhs.nsw.edu.au
Output:
HTTP/1.1 302 Found
Connection: close
Pragma: no-cache
cache-control: no-cache
Location: /TSXbZ/
Then shortly after, the same curl call resulted in the desired page without the redirect. So the bug is inconsistent, as you have said.
If I do a header location redirect in PHP code, or use a .htaccess rule to do something similar, a return header reads something like this:
Server: Apache/2.2.22 (Foo)
The absence of an Apache Server header (for some of your responses) makes me suspicious that a proxy or caching layer may sit in front of your webserver and PHP code.
Reading your code, I can't see any obvious reasons for the character insertions.
Notice subsequent differences with the following responses (return headers):
3:21% curl -I http://track.chhs.nsw.edu.au
HTTP/1.1 302 Found
Connection: close
Pragma: no-cache
cache-control: no-cache
Location: /XRjRZ/
3:23% curl -I http://track.chhs.nsw.edu.au
HTTP/1.1 302 Found
Connection: close
Pragma: no-cache
cache-control: no-cache
Location: /
3:24% curl -I http://track.chhs.nsw.edu.au
HTTP/1.1 302 Found
Connection: close
Pragma: no-cache
cache-control: no-cache
Location: /
3:24% curl -I http://track.chhs.nsw.edu.au/index.php
HTTP/1.1 302 Found
Connection: close
Pragma: no-cache
cache-control: no-cache
Location: /index.php
3:24% curl -I http://track.chhs.nsw.edu.au/index.php
HTTP/1.1 200 OK
Date: Fri, 02 Dec 2016 15:24:56 GMT
Server: Apache/2.4.23
X-Powered-By: PHP/5.4.45
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: PHPSESSID=60d307bdc288bf1371dc5e0c8c397cdf; path=/
Vary: User-Agent
Content-Type: text/html
Have you got an esoteric .htaccess, or server config?
I am building a very simple page here: http://www.wordjackpot.com
My problem appears in Google Chrome only: when I reload the page, the images are reloaded each time as if there were no cache. I'm not sure if the problem comes from my code or from Chrome, because on stackoverflow.com, for example, images return HTTP code 304 when I reload the page.
So my question is: what am I doing wrong?
Thanks.
These are your return headers... you are explicitly telling browsers not to cache.
This will be an Apache (web server) setting.
Accept-Ranges:bytes
Cache-Control:no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Connection:keep-alive
Content-Length:4026
Content-Type:image/png
Date:Tue, 03 Feb 2015 14:33:44 GMT
Pragma:no-cache
Server:Apache
Set-Cookie:300gp=R3396092545; path=/; expires=Tue, 03-Feb-2015 15:46:10 GMT
X-Cacheable:Not cacheable: no-cache
X-Geo:varn34.rbx5
X-Geo-Port:1011
X-Pad:avoid browser bug
Look at your HTTP headers; you have no-cache all over them.
I am having difficulty with the header function in PHP.
The call to the function is initiated on a secure HTTPS page. Every time I call the header function with http://, something somewhere is changing the protocol to HTTPS.
In my program, this example:
header("Location: http://www.google.com");
takes me to https://www.google.com instead.
My environment is IIS 7.5 on Windows 2008 64-bit, with PHP 5.5.12 via FastCGI.
Is there something that I have accidentally enabled either in IIS or php.ini that would automatically force http to https?
This does not happen when launching the code from an HTTP page: http to http works, http to https works, and https to https works. However, https to http is failing.
I've been searching and most results keep reversing my question by showing me ways to force http to https. I need the opposite.
Thanks in advance for any assistance!
EDIT: Google was an example URL. Sorry.
header("Location: http://www.systronicsinc.com/");
is my actual URL that is failing. This keeps redirecting to https://www.systronicsinc.com/.
This is a raw header from Fiddler.
HTTP/1.1 303 See Other
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Content-Type: text/html; charset=UTF-8
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Location: https://www.systronicsinc.com/
Server: Microsoft-IIS/7.5
X-Powered-By: PHP/5.5.12
Set-Cookie: PHPSESSID=va1hh3ff8h0buus689kf86eoc1; path=/
Date: Fri, 24 Oct 2014 17:39:34 GMT
Content-Length: 156
<head><title>Document Moved</title></head>
<body><h1>Object Moved</h1>This document may be found here</body>
I find it interesting that the link in the body retained the original http protocol as initially set, but the Location field in the header is modified to https. I've been hunting through IIS and my php.ini file and cannot see anything that would dictate this behavior. Maybe this additional information will spark a thought with someone. Thanks!
Google uses SSL, so https://, for its websites.
See: https://support.google.com/websearch/answer/173733?hl=en
and: https://www.seroundtable.com/google-ssl-drops-query-data-14188.html
No, Google redirects you to a secure page.
They probably use a function that does something like my https() function below. Feel free to use it.
function https() {
    // Redirect to the HTTPS version of the current page if not already on it.
    $sv = $_SERVER;
    if (empty($sv['HTTPS']) || $sv['HTTPS'] === 'off') { // IIS sets HTTPS to 'off' for plain HTTP
        header("Location: https://{$sv['SERVER_NAME']}{$sv['PHP_SELF']}");
        die;
    }
}
function http() {
    // Redirect to the HTTP version of the current page if currently on HTTPS.
    $sv = $_SERVER;
    if (!empty($sv['HTTPS']) && $sv['HTTPS'] !== 'off') {
        header("Location: http://{$sv['SERVER_NAME']}{$sv['PHP_SELF']}");
        die;
    }
}
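Usage would presumably be a call at the top of the page, before any output is sent (a sketch, assuming the functions above):
// On a page that must be served over plain HTTP:
http();
// ...rest of the page follows...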
I am going a little crazy here trying to find a solution to something that is probably pretty straightforward.
I have a group of reports on an intranet (not accesible to the outside world) and each report has an input form that has a bunch of HTML inputs that vary the report data.
The problem is that when you hit Back to get from the report to the form, the form is reset to its original state. I want it to cache (remember the HTML input variables), and all I can find is how to turn caching off; I want it on! I would prefer not to do this by storing the values in $_SESSION and $_COOKIE, as I have 120 reports with roughly 10 or so inputs each, so it would take forever to store every one of them and re-load the variables on refresh.
I am not the server administrator, but I believe we are running the Apache 2.2 web server. These are all PHP/HTML based pages. Any advice would be great!
It is not to do with my browser, as other forms are being cached. I am more looking into what modules on the server need to be activated to allow caching, and what I should put in the headers of the forms to allow caching. The intranet runs through a proxy, so I am thinking I will need Cache-Control to be public.
EDIT:
When I run the form page, the HTTP headers show me this, which I feel should be changed:
(under Response Headers)
X-Powered-By: PHP/5.3.3
Via: *[REMOVED]*
Server: Apache/2.2.3 (Red Hat)
Proxy-Connection: Keep-Alive
Pragma: no-cache
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Date: Wed, 13 Feb 2013 23:33:32 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 5191
Connection: Keep-Alive
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
I have a feeling I need to change the Cache-Control and Pragma values. Does anyone know how to achieve this?
Try adding these headers at the top of the page:
header("Cache-Control: private, max-age=10800, pre-check=10800");
header("Pragma: private");
header("Expires: " . date(DATE_RFC822,strtotime("+2 day")));
NOTE: if the form submits and posts data to a second page, you may want to put this at the top of both pages. Also, make sure the code comes after any session_start() if you are using sessions.
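As a side note, the Expires: Thu, 19 Nov 1981 header and that exact Cache-Control line in the question are what PHP's session handling sends by default, so the session cache limiter is a likely culprit here. A minimal sketch of changing it (assuming PHP sessions are in use):
// Must run before session_start(): replaces PHP's default "nocache"
// session headers with cache-friendly ones.
session_cache_limiter('private');
session_cache_expire(180); // cache lifetime in minutes
session_start();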
Try setting the autocomplete attribute of the inputs to on.
<input name="myinput" autocomplete="on" type="text">
I've noticed my sites are not ranking as well as they did before, and when I checked Webmaster Tools I see that Googlebot cannot crawl pages that I can crawl perfectly with my browser; it is getting a 500 error.
The websites are not WordPress and use PHP.
What can be causing this problem?
This is the actual error in WMT
HTTP/1.1 500 Internal Server Error
Date: Tue, 06 Nov 2012 21:04:38 GMT
Server: Apache
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: PHPSESSID=blkss9toirna36p2mjl44htv01; path=/
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 3840
Connection: close
Content-Type: text/html
You may be blocking Googlebot with .htaccess, robots.txt or by some other means (maybe firewall settings?)
a. This is not good.
b. You should use WMT to get Crawl Stats / Crawl Errors reports and use these to get a better understanding of the issue (at what URLs, and how often, it occurs).
Also, try looking at your last Google cache date (search directly for the domain and click the Cache link in the preview window).
This may be a temporary, downtime-related issue that will solve itself, or a site-wide blocking rule that you'll need to change.
GL
If you're still having a problem with Googlebot receiving a 500 error code, I suggest you register with Google Webmaster Tools (not Analytics). Choose Health, then Fetch as Google. You should see what Googlebot receives and what the error is.
I had the same problem and discovered that it was one of the plugins causing it. Basically I disabled every plugin, then re-enabled one, tested, re-enabled the next...
It took about an hour to find the culprit, but now all is good.