best practice to cache dynamic content at client side - php

Sorry for this, but I really have concluded that it's better to ask directly than to browse tons of pages in vain.
I've already looked through plenty of resources, but haven't found a decent explanation that could satisfy my curiosity about the simplest of questions.
Assume there's a URI at http://example.com/example (backed by a PHP script with queries to the database).
Let's imagine I've loaded it in my browser, clicked some link, and hit "back" to return to http://example.com/example.
As far as I understand, what happens behind the scenes looks something like this:
After "back" is clicked, the browser checks its cache for http://example.com/example, which matches the requested URL exactly, finds that it hasn't changed within the short period of time since it was first loaded, and returns it from its cache.
Wait!!!!
The page is produced by server-side scripts, database queries and so forth.
So the browser should reach the web server again, which would request the same data from MySQL and output it anew.
So what's the best strategy for caching dynamic content, client-side versus server-side?
In which cases is it useful to cache content server-side, and which practice is best?
Can someone please point me to resources covering this subject that can be grasped by dummies like me, and confirm or correct the scheme above of what actually happens?
While browsing the issue I ran into one service I liked very much: http://gtmetrix.com/
It mentioned something about making AJAX requests cacheable. May I assume this can be used for client-side caching of dynamic content retrieved from a database? Can someone please confirm or refute this?
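To make the question concrete, here is a minimal sketch (the table and column names are invented for the example) of the kind of header-based approach I keep reading about, where the PHP script lets the browser revalidate its cached copy instead of refetching the whole page:

<?php
// cacheable.php - minimal sketch: let the browser revalidate instead of refetch.
// How freshness is tracked is up to the application; here it is faked with a
// hypothetical MAX(updated_at) query.
$pdo = new PDO('mysql:host=localhost;dbname=example', 'user', 'pass');

$lastModified = (int) $pdo->query('SELECT UNIX_TIMESTAMP(MAX(updated_at)) FROM articles')
                          ->fetchColumn();
$etag = '"' . md5('articles-page-' . $lastModified) . '"';

header('Cache-Control: private, must-revalidate, max-age=0');
header('ETag: ' . $etag);
header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $lastModified) . ' GMT');

// If the browser already holds a fresh copy, answer 304 and send no body at all.
if (isset($_SERVER['HTTP_IF_NONE_MATCH']) && $_SERVER['HTTP_IF_NONE_MATCH'] === $etag) {
    header('HTTP/1.1 304 Not Modified');
    exit;
}

// Otherwise generate the dynamic page as usual.
foreach ($pdo->query('SELECT title FROM articles ORDER BY updated_at DESC') as $row) {
    echo '<h2>', htmlspecialchars($row['title']), "</h2>\n";
}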

Related

How can I cache (cache on my server) a site like Stackoverflow written in PHP

I have tried to make my own SO for practice and have basically finished the basics. I have never cached anything more than JavaScript and CSS before. I have tried searching SO and Google but I can't get clear on the following questions.
I also want to know what pages should be cached on the server, i.e. should I cache a page like questions/45/title-goes-here?
How do I cache the header part if the username is different for everyone?
Do I dump the whole file to a text file for every single question? That doesn't seem very practical.
How do I set up a cached page to be used by the back button?
Sorry if the answers are obvious, but I have researched and just don't get it.
Thanks
Your caching system is basically a set of tools to do quick lookups on things that are "expensive" to generate and don't change much.
To determine what should be cached, you need to study your work so far and figure out what parts are taking the most CPU or database time. And then cache those.
For caching StackOverflow, perhaps one strategy might be to generate a cache object for the HTML of each question, including placeholder elements (such as empty <div>s) that could be populated afterwards using JavaScript. The process of looking up the question and tags might be more time-consuming than just looking up the single cache entry that includes both.
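As a rough illustration, here is a minimal file-based sketch of that idea; renderQuestion() is a hypothetical stand-in for your real question-plus-tags lookup and templating code:

<?php
// Minimal sketch: cache the rendered HTML of one question on disk.
function getQuestionHtml($questionId)
{
    $cacheFile = sys_get_temp_dir() . '/question-' . (int) $questionId . '.html';

    // Serve the cached copy if it exists and is reasonably fresh (here: 5 minutes).
    if (is_file($cacheFile) && filemtime($cacheFile) > time() - 300) {
        return file_get_contents($cacheFile);
    }

    // Expensive path: query the database, render, then store for next time.
    $html = renderQuestion($questionId); // hypothetical render function
    file_put_contents($cacheFile, $html, LOCK_EX);
    return $html;
}

Deleting the cache file whenever the question is edited is enough to invalidate it.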
For your header, that's also a candidate for JavaScript, as long as you don't mind skipping graceful degradation of the user interface. The cached username section of the header might look something like:
<div id="username"></div>
Then JavaScript, generated by something not cached, would "fill in the blanks" with personalized content.
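For instance, the uncached part could be a tiny script endpoint like this minimal sketch (the session key and element id are illustrative); the cached page would include it with <script src="/header_user.php"></script> after the empty div:

<?php
// header_user.php - sketch of the uncached personalisation endpoint.
session_start();
header('Content-Type: application/javascript');
header('Cache-Control: no-store'); // never cache the personalised part

$name = isset($_SESSION['username']) ? $_SESSION['username'] : 'guest';
?>
document.getElementById('username').textContent = <?php echo json_encode($name); ?>;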
Hope this helps.

AJAX exposing the API of a web app

A couple of years ago, before I knew about Stack Overflow, I was working in an office with a lot of competition between the programmers. There, I had to code a web page in PHP with Drupal that needed to get data from another site via RSS. The problem was that there was no way to get the data beforehand: the data depended on the content of the page, which itself was dynamic, so the page stopped loading for a couple of seconds while PHP went to get the RSS data. That was bad. The page depended on a couple of parameters out of a huge list, so fetching all possible combinations in advance was out of the question. It was some sort of search page that included the results of a sister site, I think.
The first thing I did to improve that was to set up a caching system. When the page was loaded, it launched a JavaScript method that saved the RSS data back into the database for this specific page, using AJAX. That meant that if the same page was requested again, the old data would be sent immediately, and the AJAX script would get the cache updated with the new data if needed. The JavaScript pretty much opened a hidden page on the site with a GET instruction that matched the current page's parameters. It was only a couple of days later that I realised I could have cached the data without the AJAX. (Trust me, it's easier to spot in hindsight.) But that's not the issue I'm asking about.
But I was told not to do any caching at all. I was told that my AJAX page "exposed the API". That a malicious user could hit the hidden page again and again to do a Denial of Service attack. I thought my AJAX was a temporary solution anyway, but that caching was needed. But mostly: wasn't the DoS argument true of ANY page on the site? Did the fact that my hidden page did not appear in the menus and returned no content make it worse?
As I said, there was a lot of competition between programmers, so the people around me, who were unanimous, might have been right, or they may have tried to stop me from doing something that was bad because they were not the ones doing it. (It happened a lot.) But I'm still curious. I was fully aware that my AJAX thing was a hack. I wanted to change that system as soon as I found something better, but I thought that no caching at all was even worse. Which was true? Doesn't, by that logic, ALL AJAX expose the API? If we look past the fact that my AJAX was an ugly hack, was it really that dangerous?
I'll admit again and again that it was an ugly, temporary fix, but my question is about having a "hidden" page that returns no content that makes the server do something. How horrible is that?
Both sides are right. Yes, it does "expose" the API, but AJAX requests can only access publicly accessible documents/scripts in the first place, so yes, all AJAX requests "expose" their target script in the same way. DoS attacks are not script-specific, they are server-specific, so one can perform a DoS using anything pointing to the server, not just the script your AJAX calls. I would tell your buddies their argument is weak and grasping at straws, and don't be jealous :P
If I read your post correctly, it seems as if the AJAX-requested version of the page would know to invalidate the cache each time?
If that's the case, then I suppose your co-worker might have been saying that the hidden page would be susceptible to a DDoS attack in a way that the full page load wasn't, i.e. the full page load would get a cached version on each request after the first, whereas the AJAX version would get fresh content each time. If that's the case, then s/he's right.
By "expose the API", your co-worker was saying that you were exposing the URL of a page that was doing work that should be done in the background. The outside world should not know about a URL whose sole purpose is to do some heavy lifting task. As you even said, you found a backend solution that didn't require the user's browser knowing about your worker process at all.
Yes, having no cache at all when the page relies on heavy content is worse than having an ajax version of the page do the caching, but I think the warning from your coworker was that no page, EVEN if it's AJAX, should have the power to break the cache in a way you didn't expect or intend.
The only way this would be a problem is if said "hidden page that returns no content that makes the server do something" had a different authentication scheme or permissions from the rest of the pages, or if what it made the back-end do were inordinately heavy compared to any other page on the site that posted something.
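One cheap mitigation, if the worker URL has to stay public, is to tie it to the visitor's session and rate-limit it, so hammering the URL cannot trigger the expensive fetch every time. A minimal sketch, where refreshRssCache() and the parameter name are hypothetical:

<?php
// refresh_cache.php - sketch: harden a "hidden" cache-refresh endpoint.
session_start();

$last = isset($_SESSION['lastRefresh']) ? $_SESSION['lastRefresh'] : 0;

// At most one real refresh per minute per session; otherwise answer cheaply.
if (time() - $last < 60) {
    header('HTTP/1.1 429 Too Many Requests');
    exit;
}
$_SESSION['lastRefresh'] = time();

refreshRssCache(isset($_GET['page']) ? $_GET['page'] : ''); // hypothetical expensive worker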

"How the sausage is made" tour of apache/php/mysql interaction

I am having trouble understanding how the Apache/PHP/MySQL stack works at a low level (including interaction with the browser). Is there a good description somewhere (a book, a website, etc.) that will walk me through the whole path, explaining how, starting with a browser requesting a URL, the HTTP request is sent, how Apache talks to PHP, how PHP talks to MySQL (persistent and non-persistent connections), etc.? I want to understand what waits for what in this chain, where timeouts are handled, and how long sockets are opened and closed. A book, an article maybe? There is a lot of documentation on each individual component, but I can't find a "walkthrough".
The explanations I've seen so far are very high-level: look, here's a happy cow, it goes to Bovine University, look - it's all shrink-wrapped on the supermarket shelf. What I need is the sausage farm/slaughterhouse/truck/factory tour, starting with cow insemination :)
[update] To this day I have not found a better way to learn about these things other than reading the source.
PHP and MySQL by example has a pretty basic picture of the process, which I think you probably already understand.
Getting more in-depth than that picture though is a pretty long discussion. Ironically, you can read the book I just linked for a pretty good description. If you have more specific questions, I recommend opening new questions for them. Enjoy!
I have found a site that has, at least in part, contents from the book Advanced PHP Programming by George Schlossnagle.
The site is located at: http://php.find-info.ru/php/016/toc.html. Specifically, the section on The PHP Request Life Cycle contains a lot of the nitty-gritty details, including some source code and diagrams.
DISCLAIMER: IANAL, but considering that the book is still listed on Amazon, it's possible the content linked above breaks all sorts of codes, rules and/or laws. It's not my intention to proliferate or condone illegal or pirated materials, so if that is the case, please remove said links.
You are correct in the fact that there are entire books written on how this all fits together. Here is a link to a "walkthrough" that touches on the main parts:
http://computer.howstuffworks.com/web-server.htm
Hope it helps
The best course of action would be to get a good book about the LAMP stack.
A quick response (ask for more if you feel you need it)
The browser contacts the web server through the HTTP protocol.
The server generates an HTML result (let's leave how for the moment) and sends it back.
Each browser understands only the HTTP protocol (for the sake of this analysis).
Now, items such as icons, images, JavaScript etc. are just read from the Apache server and "copied" to the browser. The same goes for plain HTML files.
The difference is in PHP files (I am oversimplifying here). These are passed to the PHP module, and the response (of the module) is sent back to the browser.
The PHP module is what understands PHP.
Are we together here? If yes:
The PHP script may (or may not) require data from a MySQL server; if it does, it has to connect, fetch the data, manipulate it, etc.
Summarising: each of these operations is done individually, at a different level of the process. That's what makes it "simple".
Ask for more information if you want something more specific.
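As a concrete illustration of the PHP-to-MySQL step above, here is a minimal mysqli sketch of non-persistent versus persistent connections (host and credentials are placeholders):

<?php
// Non-persistent: opened for this request, torn down when the script ends.
$db = new mysqli('localhost', 'user', 'pass', 'example');

// Persistent: the "p:" prefix asks PHP to reuse an already-open connection
// held by this PHP worker process, skipping the TCP and authentication
// handshake on later requests that hit the same worker.
$pdb = new mysqli('p:localhost', 'user', 'pass', 'example');

$result = $pdb->query('SELECT NOW() AS now');
$row = $result->fetch_assoc();
echo $row['now'], "\n";

Note that the "pool" of persistent connections lives per Apache/PHP worker process, so each worker keeps its own.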
As far as I understand it, Apache receives the request and works out what to do with it based on your .htaccess or config options. It then passes the request to PHP for parsing, if needed. PHP does two passes over the code: the first is a pre-parse (compile) step, which picks up obvious flaws and registers top-level functions (ignoring any inside if statements, loops, includes, evals or lambda-based definitions), before executing the page for real. Anything done with echo is, I believe, written to the standard output stream and returned to Apache. If Apache times the page out, it sends a kill signal to PHP, which closes objects and prints error messages if needed before exiting. Once the page exits, Apache attends to the headers and returns the page.
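A minimal sketch of those two passes, if I have it right:

<?php
// The whole file is compiled before execution, so a top-level function can be
// called before the line that defines it, while a conditionally defined one
// only exists after its branch has actually run.
echo topLevel();      // works: bound during the compile pass

// echo conditional(); // would be a fatal "undefined function" error up here

function topLevel() { return "ok\n"; }

if (true) {
    function conditional() { return "also ok\n"; }
}

echo conditional();   // fine now: the if branch above has executed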
I would love to know more about this though, so if anyone can explain it better or has a correction/expansion on my answer, I'd love to hear it.

When is it appropriate to use AJAX?

When is it appropriate to use AJAX?
What are the pros and cons of using AJAX?
In response to my last question: some people seemed very adamant that I should only use AJAX if the situation was appropriate:
Should I add AJAX logic to my PHP classes/scripts?
In response to Chad Birch's answer:
Yes, I'm referring to developing a "standard" site that would employ AJAX for its benefits, and wouldn't be crippled by its application. Using AJAX in a way that would kill search rankings would not be acceptable. So if "keeping the site intact" requires more work, then that would be a "con".
It's a pretty large subject, but you should be using AJAX to enhance the user experience, without making the site totally dependent on it. Remember that search engines and some other visitors won't be able to execute the AJAX, so if you rely on it to load your content, that will not work in your favor.
For example, you might think that it would be nice to have users visit your blog, and then have the page dynamically load the newest article(s) with AJAX once they're already there. However, when Google tries to index your blog, it's just going to get the blank site.
A good search term to find resources related to this subject is "progressive enhancement". There's plenty of good stuff out there, spend some time following the links around. Here's one to start you off:
http://www.alistapart.com/articles/progressiveenhancementwithjavascript/
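To make the idea concrete, here is a minimal sketch (file name and header check are illustrative; jQuery and most libraries send the X-Requested-With header shown) of one page that serves both audiences:

<?php
// articles.php - progressive-enhancement sketch: the same URL serves a full
// page to browsers and crawlers, and a bare fragment to AJAX requests.
$articles = array('First post', 'Second post'); // stand-in for a database query

$isAjax = isset($_SERVER['HTTP_X_REQUESTED_WITH'])
       && $_SERVER['HTTP_X_REQUESTED_WITH'] === 'XMLHttpRequest';

if (!$isAjax) {
    // Full page: the content is present even with JavaScript disabled.
    echo "<html><body><div id=\"articles\">\n";
}
foreach ($articles as $title) {
    echo '<p>', htmlspecialchars($title), "</p>\n";
}
if (!$isAjax) {
    echo "</div></body></html>\n";
}

Script-enabled clients can refresh the #articles area by requesting the same URL with that header, while Google always gets the full page.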
When you are only updating part of a page or perhaps performing an action that doesn't update the page at all AJAX can be a very good tool. It's much more lightweight than an entire page refresh for something like this. Conversely, if your entire page reloads or you change to a different view, you really should just link (or post) to the new page rather than download it via AJAX and replace the entire contents.
One downside to using AJAX is that it requires JavaScript to be working, or requires you to construct your view in such a way that the UI still works without it. This is more complicated than doing it via normal links/posts.
AJAX is usually used to perform an HTTP request while the page is already loaded (without loading another page).
The most common use is to update part of the view. Note that this does not include refreshing the whole view since you could just navigate to a new page.
Another common use is to submit forms. In all cases, but especially for forms, it is important to have good ways of handling browsers that do not have javascript or where it is disabled.
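For the form case, a minimal sketch of a handler that degrades gracefully (all names illustrative): a plain <form action="comment.php" method="post"> works without JavaScript, and script-enabled clients can post the same field via AJAX.

<?php
// comment.php - one handler for both the scripted and the non-scripted path.
$text = isset($_POST['text']) ? trim($_POST['text']) : '';
$ok   = $text !== ''; // stand-in for real validation and the INSERT

$isAjax = isset($_SERVER['HTTP_X_REQUESTED_WITH'])
       && $_SERVER['HTTP_X_REQUESTED_WITH'] === 'XMLHttpRequest';

if ($isAjax) {
    // AJAX path: return a tiny JSON status the page script can act on.
    header('Content-Type: application/json');
    echo json_encode(array('ok' => $ok));
} else {
    // No JavaScript: fall back to a normal redirect-after-POST.
    header('Location: /thread.php?saved=' . ($ok ? '1' : '0'));
}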
I think the advantage of using AJAX technologies isn't only in creating better user experiences; the ability to make server calls for only specific data is a huge performance benefit.
Imagine a huge bandwidth-hungry site (like StackOverflow): most of the navigation done by users happens through full page reloads, with data continuously re-sent over HTTP.
Of course caching and other techniques help with this bandwidth overhead problem, but personally I think that sending huge chunks of HTML every time is really a waste.
Cons are SEO (which doesn't work well with heavily AJAX-based sites) and people who have JavaScript disabled.
When your application (or your users) demand a richer user experience than a traditional webpage is able to provide.
Ajax gives you two big things:
Responsiveness - you can update only parts of a web page at a time if need be (saving the time to re-load a page). It also makes it easier to page data that is presented in a table for instance.
User Experience - This goes along with responsiveness. With AJAX you can add animations, cooler popups and special effects to give your web pages a newer, cleaner and cooler look and feel. If no one thinks this is important, then look to the iPhone. User experience draws people into an application and makes them want to use it, one of the key steps in ensuring an application's success.
For a good case study, look at this site. AJAX effects like animating your new Answer when posted, popups to tell you you can't do certain things and hints that new answers have been posted since you started your own answer are all part of drawing people into this site and making it successful.
Javascript should always just be an addition to the functionality of your website. You should be able to use and navigate the site without any Javascript involved. You can use Javascript as an addition to existing functionality, for example to avoid full-page reloads. This is an important factor for accessibility. Javascript should never be used as the only possibility to reach or complete a request on your site.
As AJAX makes use of Javascript, the same applies here.
Ajax is primarily used when you want to reload part of a page without reposting all the information to the server.
Cons:
More complicated than doing a normal post (working with different browsers, writing server-side code to handle partial postbacks).
Introduces potential security vulnerabilities: you are introducing additional code that interacts with the server, which can be a problem on both the client and the server. On the client, you need ways of sending and receiving responses; it's another way of interacting with the browser, which means there is another point of entry that has to be guarded - executing arbitrary code, posting data to a non-intended source, etc. Several exploits for AJAX apps have been plugged over time, but there will always be more.
Pros:
It looks flashier to end users
Allows a lot of information to be displayed on the page without having to load it all at the same time.
The page is more interactive.

Adding some custom session variables to a JavaScript object

I currently have a custom session handler class which simply builds on php's session functionality (and ties in some mySQL tables).
I have a wide variety of session variables that best suit my application (primarily kept on the server side). I am also using jQuery to improve the usability of the front-end, and I was wondering if feeding some of the session variables (some basics and some browse-preference IDs) to a JS object would be a bad way to go.
Currently, if I need to access any of this information at the front-end, I do an AJAX request to a PHP page specifically written to provide the appropriate response, although I am unsure if this is the best practice (actually I'm pretty sure this just creates an excess number of AJAX requests).
Has anyone got any comments on this? Would this be the best way to have this sort of information available to the client side?
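For reference, the kind of endpoint I mean looks roughly like this minimal sketch (session keys illustrative):

<?php
// session_info.php - AJAX-only page returning selected session values as JSON.
session_start();
header('Content-Type: application/json');
header('Cache-Control: no-store'); // per-user data should not be cached

echo json_encode(array(
    'username'   => isset($_SESSION['username'])   ? $_SESSION['username']   : null,
    'browsePref' => isset($_SESSION['browsePref']) ? $_SESSION['browsePref'] : null,
));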
I really guess it depends on many factors. I always have "premature optimization ..." in the back of my head.
In earlier years I rushed every little idea that came to my mind into the app. That often led to "I made it cool, but I didn't take the time to fully grasp the problem I'm trying to solve; was there a problem anyway?"
Nowadays I use the obvious approach (like yours), which is fast (without sacrificing performance completely on the first try), and then analyze whether I'm getting into problems or not.
In other words:
1. How often do you need to access this information from different kinds of loaded pages (because if you load the information once without the user reloading, there's probably not much point in re-fetching it anyway), multiplied by the number of concurrent clients?
2. If you write the information into a client-side cookie for fast JS access, can harm be done to your application if it is abused (modified without the application's consent)? Replace "JS" and "cookie" with any kind of offline storage like the WHATWG proposes, if #1 applies.
The "fast" approach suits me, because often there's no big investment in prior-development research. If you've done that carefully ... but then you would probably know the answer already ;)
3. You could always push the HTML to your client already including the data you need in JS; maybe that can work in your case (see the sketch below). It will be interesting to see what other suggestions come!
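A minimal sketch of that third option, embedding a whitelisted subset of the session directly in the page (session keys illustrative):

<?php
// page.php - expose a safe subset of the PHP session to client-side JS.
session_start();

// Only copy what the client is allowed to see; never dump $_SESSION wholesale.
$clientSession = array(
    'username'   => isset($_SESSION['username'])   ? $_SESSION['username']   : null,
    'browsePref' => isset($_SESSION['browsePref']) ? $_SESSION['browsePref'] : null,
);
?>
<script>
// JSON_HEX_TAG escapes < and > so the literal is safe inside a <script> block;
// no extra AJAX round trip is needed, the data rides along with the page.
var sessionData = <?php echo json_encode($clientSession, JSON_HEX_TAG); ?>;
alert(sessionData.username);
</script>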
As a side note: I've had PHP sessions stored in the DB too, until I moved them over to memcached (alert: it's a cache and not a persistent store, so it may not be a good idea in your case; I can live with it, I just make sure it's always running), which realized an average drop of 20% in database queries and, through this, a 90% drop in write queries. And I wasn't even using any fancy AJAX yet; it was just the number of concurrent users.
I would say that's definitely overkill for AJAX. Are these sessions private, or is it important not to show them to a visitor? Just to throw it out there: a cookie is the easiest when it comes to both; having the data in a JavaScript object makes it just as easily readable to a visitor, and when it comes down to cookies being enabled or not, without cookies you wouldn't have sessions anyway.
http://www.quirksmode.org/js/cookies.html is a good source about cookie handling in JS and includes two functions for reading and writing cookies.
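A minimal sketch of the trade-off (the preference name is illustrative): only non-sensitive values belong in a cookie, since the visitor can read and edit it.

<?php
// Share a non-sensitive preference with client-side JS via a cookie; keep
// secrets and trusted state in the server-side session instead.
session_start();
$pref = isset($_SESSION['browsePref']) ? $_SESSION['browsePref'] : 'default';

// The final "false" leaves httponly off so client-side JS can read the value.
setcookie('browsePref', $pref, time() + 86400, '/', '', false, false);
?>
<script>
// document.cookie is one big string; a simple lookup for our single cookie:
var match = document.cookie.match(/(?:^|; )browsePref=([^;]*)/);
alert(match ? decodeURIComponent(match[1]) : null);
</script>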
