I had a thought...
Dunno if it's a good one or a bad one.
I am working on an image-less/responsive theme for an SMF fork. I was thinking: since it's written in PHP, would it be valid to PHP-include a "style.php" in the header, containing all the styles for the pages?
I was thinking this would give me two major benefits. One, I could use variables in the css file. Two, it would be one less HTTP request. I know that PageSpeed and YSlow would bitch about the css being included inside the page between <style> tags, but the browser is none the wiser, correct?
As far as I can tell, I see a lot of benefits in doing it this way regardless of what PageSpeed/YSlow thinks. I could even do this with javascript, maybe...
I wonder if IE's maximum of 4095 CSS selectors per stylesheet would still apply?
I am a PHP ultra-noob, but have a good amount of experience in web design. I can't seem to find a reason "not" to do it. Any experts willing to share their thoughts on this idea?
I don't think it's a good idea. If you want to use variables in your CSS, look at SASS or LESS. Regarding the additional request, CSS is static, so if you do your job on the server side, the browser will retrieve the CSS only once, and subsequent requests will use the cached copy.
I don't think this can be harmful, however it's quite a divergence from standard development, so it's not a good idea just for this. Also, since nobody does it, it must not be such a smart invention.
A separate css file is generally better for speed, because it is requested only once and then cached for a long time. Compared to sending the same css over and over again in the head tags (which makes your actual pages load slower), it's one extra request for the whole browsing session, and none at all if it's already in their cache. All in all, after a few requests a separate (cachable) file usually wins out, provided you set it to be cachable for a long time. Don't worry about people not seeing css changes: if you change your css, just add some query parameter like /styles.css?rev=1. You don't use that parameter, you just increase it whenever your css changes, thus making the client request a fresh copy.
That doesn't mean you can't use PHP (or nodejs/less for that matter) to create or serve your CSS file; variables are indeed nice to have. If going the less route, DO convert it to css once on your own server instead of bothering clients with heavy javascript that converts it again and again.
You can actually include anything as a CSS file if it's valid CSS (and actually even if it's not, I suppose):
<link rel="stylesheet" type="text/css" href="/style.php">

<?php
// style.php
header('Content-type: text/css');
$style = 'bold';
?>
strong {
    font-weight: <?php echo $style; ?>;
}
My main goal is to make the loading of several pages as fast as possible. For this I want to take advantage of both the cache and one "special technique" that, as a fallback, relies on standard caching.
Structure
On the backend I have the following structure. There's a main page in public_html and several subpages, each with specific css rules different from each other. The creation of all the minified files is done by a script, so no extra complexity there. For simplicity, let's assume this is the structure, although it's more complex:
/public_html
    /index.php
    /style.css   ~50kb
    /min.css    ~100kb
    /subjects
        /index.php
        /style.css  ~20kb
        /min.css    ~10kb
    /books
        /index.php
        /style.css  ~20kb
        /min.css    ~10kb
    ...
First request
So when the user enters first time on a subpage, they will receive this html code:
<!DOCTYPE html>
<html>
<head>
    <link href="/subjects/min.css" rel="stylesheet" type="text/css">
</head>
<body>
    All the body here
    <link href="/min.css" rel="stylesheet" type="text/css">
</body>
</html>
As you can see, the user loads all the css code needed for that page in the header, in a small file. Note that /subjects/min.css is MUCH smaller than /min.css, which makes this first request load faster. Then, after the full html and css have correctly loaded, /min.css will start loading. This file contains all of the subpages' styles.
Note that putting the <link> within the <body> tag may not be strictly appropriate, but even if it didn't work, there would be no problem, since the page-specific style is already loaded. Why am I loading it here? Keep reading:
Following requests
For the second and subsequent requests on that session, the user will receive this html code:
<!DOCTYPE html>
<html>
<head>
    <link href="/min.css" rel="stylesheet" type="text/css">
</head>
<body>
    All the body here
</body>
</html>
The /min.css should already be cached from the first request. However, if for any reason it's not, it will now load the full minified style, as on any normal website. This is the fallback case.
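For concreteness, here is a minimal sketch of how the server could decide which variant to send; the session flag name is an assumption, not part of the actual scheme:

<?php
// Sketch: the first hit of the session gets the small page-specific css in
// <head> plus the full /min.css at the end of <body>; later hits get only
// /min.css. $_SESSION['full_css_sent'] is a made-up flag name.
session_start();
$first_visit = empty($_SESSION['full_css_sent']);
$_SESSION['full_css_sent'] = true;
?>
<!DOCTYPE html>
<html>
<head>
<?php if ($first_visit): ?>
    <link href="/subjects/min.css" rel="stylesheet" type="text/css">
<?php else: ?>
    <link href="/min.css" rel="stylesheet" type="text/css">
<?php endif; ?>
</head>
<body>
    All the body here
<?php if ($first_visit): ?>
    <link href="/min.css" rel="stylesheet" type="text/css">
<?php endif; ?>
</body>
</html>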
Is this a valid scheme? Why haven't I seen anything like this before? Does it contain any logic error?
These are the main problems I can see, though none seem strong enough compared to the benefits:
It adds some extra complexity to the code.
An extra request needs to be made after everything has already loaded. This adds a slight overhead on the server, though it's for a static file.
Notes about the comments:
"The browser will make fewer requests." This is true; with my scheme the browser makes one extra request. However, it happens after the html and css have loaded, so it will not greatly affect the page.
"Cache." Yes, I'm doing my best to cache the file. A point could be made against caching of the <link> if it's inside the <body>, though; I don't know if it behaves differently with regard to the cache, I only assumed it does in the question.
UPDATE:
Please mind that the answer which the questioner marked as accepted cannot be recommended -
don't ever do this!
Any kind of "pre-loading" of CSS files doesn't make any sense, as you should never split up your CSS into several files!
My original answer:
So what is your real question in the end?
In my humble opinion you're doing it all wrong - sorry!
Usually an author intends to
give a site a consistent look/appearance
keep the maintainability as easy as possible
avoid FOUC (flash of unstyled content)
minimize the number of (HTTP) requests
support cache mechanisms to reduce bandwidth/data volume
just to mention some of the most important aspects.
All of them are disregarded by your approach.
As you are using the link element within the body element, I assume you are using HTML5, because in other HTML versions this would be invalid.
But also in HTML5 I would not rely on this. Have a look at the 2 versions:
http://www.w3.org/html/wg/drafts/html/master/document-metadata.html#the-link-element
http://www.w3.org/html/wg/drafts/html/CR/document-metadata.html#the-link-element
Compare the section (at the top) "Contexts in which this element can be used:".
As the information from the CSS is most needed by the browser to render a page, it should be one of the first things loaded.
Have a look at the article "How Browsers Work: Behind the scenes of modern web browsers", and especially at the section "Rendering engines".
So loading another style sheet will force the browser to redo all that work, besides the additional HTTP request, which in particular on GSM connections may cause "trouble" because of the greater latency.
And if each page of your site really has such an amount of individual style rules then I would say it is a "design flaw".
One of the "design principles" is: As much as necessary - as little as possible!
Another (big) advantage of using just one style sheet is that it is cached by the browser after the first load. And as the CSS of a site normally doesn't change too often, this is a great advantage which by far outweighs the disadvantage of a few more KB to load on the first page visit (btw, independent of the entry/landing page)!
Conclusion:
I really cannot recommend using your approach!
Put all your styles (normalize, basic, media queries, print) in one single file which you load via <link> in the <head> of your document.
That's the best you can do.
Yes, what you are doing is perfectly valid and common
CSS is perhaps a bad example, but the same principle applies (load the later file in via ajax, btw).
Like say, images.
We are on page 1 of our website and we know 99.999% of the time our visitors are going to click to page 2, and we know that on page 2 we have some large images to serve. Yes, then we may load them silently AFTER page 1 has loaded - getting ready, so the site 'feels' fast as they navigate. A common trick in mobile web applications/sites.
So yes:
It is the same principle for ANY type of file that you may want to 'pre cache' for subsequent requests.
Load the page
while the visitor is 'reading' the loaded page, pre-fetch files/data that you expect they may request next (images, page 2 of result data, javascript, and css). These are loaded via ajax so as not to hold up the page 'onload' event firing - a key difference from your example.
However, to answer your goal - to allow for the loading of the pages to be as fast as possible:
Doing this, or any kind of 'pre-emptive loading' technique, contributes little to 'speed of delivery' if we are not serving static files from a static server, a cookieless domain, and ultimately a Content Delivery Network.
Achieving the goal of making the pages load as fast as possible comes from serving static files differently from your dynamic content (php-rendered and so on):
1) Create a subdomain for these resources ( css, js, images/media ) - static.yourdomain.com
2) Turn off cookies and unnecessary headers, and tune cache headers specifically for this subdomain.
3) Look into using a service like http://cdnify.com/ or www.akamai.com.
These are the performance and speed steps for serving static content. (Hope I'm not teaching anyone to suck eggs - this is just directly related to the question, in case anyone is unfamiliar with it.)
The 'pre-emptive loading' techniques are still great, but they are now more related to pre-loading data for usability than to raw speed.
Edit/Update:
To clarify 'speed' and 'usability speed'.
Speed is often judged by software as the moment the page 'onload' event fires (that is why it is important to load these 'pre-emptive' resources via ajax).
Perceived speed (usability) is how quickly a user can see and interact with the content (even though the page load event may not have fired).
Edit/update
In a few areas of the post and in the comments, the loading of these additional 'pre-emptive' resources via javascript/ajax was mentioned.
The reason is to not delay the page 'onload' event firing.
Many website speed test tools (yslow, google...) use this 'onload' event to judge page speed.
Here we delay the page 'onload' event.
<body>
    ... page content
    <link rel="stylesheet" href="/nextpage.css" />
</body>
Here we load via javascript (or in some cases ajax, for page data) without preventing the page load event:
<body>
    .. page content
    <script>
        window.onload = function () {
            var style = document.createElement('link');
            style.rel = 'stylesheet';
            style.type = 'text/css';
            style.href = '/nextpage.css';
            document.getElementsByTagName('head')[0].appendChild(style);
        };
    </script>
</body>
( this, as a bonus, also gets around the compatibility problems with having a <link> tag within the <body> as discussed in your other threads )
Since min.css contains all styles properly minified, just use that.
Why?
1. The browser will make fewer requests.
2. The file will be cached by the browser after being fetched some two or three times. Tremendous decrease in page load time!
3. The browser doesn't have to go through the specific page's css, which in turn decreases the time needed for a page to render.
4. Easy maintainability of code. If you want to update the css, just append a query variable so that the browser fetches the updated css (a small sketch of this follows below).
I think the above reasons are enough for you to use just the min.css.
Also, don't forget to set a really long cache expiry date if you do as I've recommended.
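For example, the query-variable trick from point 4 might look like this; CSS_REV is just an assumed constant you bump on each css deploy:

<?php
// Hypothetical version constant: increment whenever min.css changes, so
// browsers request a fresh copy despite the long expiry date.
define('CSS_REV', 3);
?>
<link rel="stylesheet" href="/min.css?rev=<?php echo CSS_REV; ?>">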
Edit:
As the OP didn't understand point 2, I'm gonna make myself and the point clear.
The browser will not cache the css file on its first encounter, because it thinks: 'Hey, let's not cache this immediately. What if it changes? I'll see to it that the same css gets reloaded at least 2 times, so as to reap the benefit of caching.'
There's no point in caching the css when it is first loaded, because if the browser did that, there would be a huge amount of cache on the user's system. So browsers are clever enough to cache only the files that are frequently loaded and unchanged.
What you're describing is a pre-fetch/lazy-load pattern with resources loaded in anticipation of becoming relevant in the future - for instance, a basic login page with minimal styling that starts loading site css in the background.
This has been done before, among other things, in the PageSpeed Module. In fact, it's more aggressive, yet requires less development effort! A vanilla landing page (like a login screen) utilizing only a small subset of styles could take advantage of prioritize_critical_css, which inlines the relevant rules into the html and loads the css at the bottom of the page! Unlike your original scenario, where two consecutive requests have to be performed, the render-blocking effects of not having the stylesheet in the head are offset. This improvement is well-perceived by first-time visitors on mobile devices, who are subject to higher network latency and a smaller number of allowed simultaneous http requests.
A natural progression of this would be to lazy-load sprites, webfonts and other static cacheable content. However, I'm inclined to speculate that the benefits of having well-structured separate css are probably superficial, and you would generally do well with minifying styles into one file. The difference in loading time between a 5 and a 50 kilobyte file is not tenfold, it's negligible, since website performance does not depend on bandwidth anymore. As a side note, you'll never have to worry about dependency management (i.e. remembering to include rules relevant to specific elements on your page), which is not easily automated for html+css and gets quite hairy for big projects.
If you focus on the cardinal rule of static resources - aggressive caching - and remember to fingerprint your assets so that deployments don't get messy, you're doing great! And if you address perceived performance with a well-placed throbber here and there...
Why don't people make .php files for their CSS and JavaScript files?
Adding <?php header("Content-Type: text/javascript; charset=UTF-8"); ?> to the file makes it readable by browsers, and you can do the same thing for css files by setting the Content-Type to text/css.
It lets you use all of PHP's variables and methods inside those other languages - letting you, for example, change the theme's main colors in css depending on user preferences, or preload data that your javascript can use on document load.
Are there bad sides to using this technique?
People do it more often than you think. You just don't get to see it, because usually this technique is used in combination with URL rewriting, which means the browser can't tell the difference between a statically-served .css file and a dynamic stylesheet generated by a PHP script.
However, there are a few strong reasons not to do it:
In a default configuration, Apache treats PHP script output as 'subject to change at any given time', and sets appropriate headers to prevent caching (otherwise, dynamic content wouldn't really work). This, however, means that the browser won't cache your CSS and javascript, which is bad - they'll be reloaded over the network for every single page load. If you have a few hundred page loads per second, this absolutely matters, and even if you don't, the page's responsiveness suffers considerably.
CSS and Javascript, once deployed, rarely change, and reasons to make them dynamic are really rare.
Running a PHP script (even if it's just to start up the interpreter) is more expensive than just serving a static file, so you should avoid it unless absolutely necessary.
It's pretty damn hard to make sure the Javascript you output is correct and secure; escaping dynamic values for Javascript isn't as trivial as you'd think, and if those values are user-supplied, you are asking for trouble (a small illustration follows this list).
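To illustrate that last point, a small sketch (the variable is made up): naive interpolation into a script block is fragile, while json_encode() produces a correctly escaped JS string literal - it even escapes slashes, so </script> can't break out of the tag.

<?php
$name = "O'Reilly </script><script>alert(1)</script>"; // user-supplied value

// Dangerous: the quote and the closing script tag break the output.
echo "<script>var name = '$name';</script>";

// Safer: json_encode() yields a valid, escaped JavaScript string literal.
echo '<script>var name = ' . json_encode($name) . ';</script>';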
And there are a few alternatives that are easier to set up:
Write a few stylesheets and select the right one dynamically.
Make stylesheet rules based on class names, and set those dynamically in your HTML.
For javascript, define the dynamic parts inside the parent document before including the static script. The most typical scenario is setting a few global variables inside the document and referencing them in the static script.
Compile dynamic scripts into static files as part of the build / deployment process. This way, you get the comfort of PHP inside your CSS, but you still get to serve static files.
If you want to use PHP to generate CSS dynamically after all:
Override the caching headers to allow browsers and proxies to cache them. You can even set the cache expiration to 'never' and add a bogus query string parameter (e.g. <link rel="stylesheet" type="text/css" href="http://example.com/stylesheet.css?dummy=121748283923">), changing it whenever the script changes: browsers will interpret this as a different URL and skip the cached version. (A minimal sketch follows after this list.)
Set up URL rewriting so that the script's URL has a .css extension: some browsers (IE) are notorious for getting the MIME type wrong under some circumstances when the extension doesn't match, despite correct Content-Type headers.
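As a minimal sketch of the caching advice above (the header values are illustrative, not prescriptive):

<?php
// style.php - dynamic CSS served with cache-friendly headers.
header('Content-Type: text/css; charset=UTF-8');
header('Cache-Control: public, max-age=31536000');                  // one year
header('Expires: ' . gmdate('D, d M Y H:i:s', time() + 31536000) . ' GMT');

$accent = '#336699'; // imagine this value came from a database or user prefs
?>
a:hover { color: <?php echo $accent; ?>; }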
Some do; the better thing to do is generate your JS/CSS scripts in PHP and cache them to a file.
If you serve all of your CSS/JS files using PHP, then you have to invoke PHP more often, which incurs more overhead (cpu and memory) that is unnecessary when serving static files. Better to just let the web server (Apache/nginx/lighttpd/IIS etc.) do its job and serve those files for you without the need for PHP.
Running the PHP engine does not have a zero cost, in either time or CPU. And since CSS and JavaScript files usually rarely change, having them run through the engine to do absolutely nothing is pointless; better to let the browser cache them when appropriate instead.
Here’s one method I’ve used: The HTML page contains a reference to /path/12345.stylesheet.css. That file does not exist. So .htaccess routes the request to /path/index.php. That file (a) does a database request, (b) creates the CSS, (c) saves the file for next time, (d) serves the CSS to the browser. That means that the very next time there’s a request for /path/12345.stylesheet.css, there actually is a physical static file there to be served by Apache as normal.
Oh, and whenever the styles rules are edited (a) the static file is deleted, and (b) the reference ID is changed, so that the HTML page will in future contain a reference to /path/10995.stylesheet.css, or whatever. (Actually, I use a UNIX timestamp.)
I use a similar method to create image thumbnails: create the file on first request, and save a static file in the same place for future requests. I’ve never had occasion to do the same for javascript, but there’s no fundamental reason why not.
This also means that I don’t need to worry about caching headers in PHP: only the first invocation of each CSS file (or image thumbnail) goes through PHP, and if that is served with anti-caching headers, that’s no great problem.
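A rough sketch of that flow, with buildCssFromDatabase() as a hypothetical stand-in for steps (a) and (b):

<?php
// index.php - assumes .htaccess routes missing /path/*.stylesheet.css here.
function buildCssFromDatabase() {
    // hypothetical stand-in for the real database query
    return "body { background: #fafafa; }\n";
}

$path = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
$file = __DIR__ . '/' . basename($path);    // e.g. 12345.stylesheet.css

$css = buildCssFromDatabase();              // (a) + (b): build the CSS
file_put_contents($file, $css);             // (c): save a real static file

header('Content-Type: text/css');           // (d): serve this first request
echo $css;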
Sometimes you might have to dynamically create javascript or styles.
The issue is that web servers are optimized to serve static content. Dynamically generating content with php can be a huge performance hit, because it needs to be generated on each request.
It's not a bad idea, or all that uncommon, but there are disadvantages. Caching is an important consideration - you need to let browsers cache when the content is the same, but refresh when it will vary (e.g. when someone else logs in). Any query string will immediately stop some browsers caching, so you'll need some rewrite rules as well as HTTP headers.
Any processing that takes noticeable time, or requires a lock on something (e.g. session_start) will hold up the browser while it waits for the asset.
Finally, and quite importantly, mixing languages can make editing code harder - syntax highlighting and structure browsers may not cope, and overlapping syntax can lead to ugly things like multiple backslash escapes.
In javascript, it can be useful to convert some PHP data into (JSON) variables and then proceed with static JS code. There is also a performance benefit to concatenating multiple JS files so the browser downloads them all in one go.
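For instance, a minimal sketch (the names are made up):

<?php
// Inline the dynamic part as a JSON variable, then hand over to static JS.
$prefs = array('theme' => 'dark', 'fontSize' => 14);
?>
<script>
    // The static /js/app.js below reads window.userPrefs at startup.
    window.userPrefs = <?php echo json_encode($prefs); ?>;
</script>
<script src="/js/app.js"></script>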
For CSS, there are specific languages such as Less which are more suited to the purpose. Using LessPHP (http://leafo.net/lessphp/) you can easily initialize a Less template with variables and callbacks from your PHP script.
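A hedged example with LessPHP (method names follow the lessphp documentation; treat them as assumptions):

<?php
// Compile Less to static CSS once on the server, so clients get plain,
// cacheable CSS with no client-side conversion.
require 'lessc.inc.php';

$less = new lessc();
$less->setVariables(array('accent' => '#336699'));  // inject PHP values
// checkedCompile() recompiles only when the .less file is newer than the .css.
$less->checkedCompile('style.less', 'style.css');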
PHP is often used as a processor to generate dynamic content. It takes time to process a page and then send it. For the sake of efficiency (both for the server and time spent in programming) dynamic JS or CSS files are only created if there isn't a possible way for the static file to successfully accomplish its intended goal.
I recommend only doing this if you absolutely require the assistance of a dynamic, database-driven processor.
The bad sides: plenty, but to name just a few:
It'll be dead slow: constructing custom stylesheets for each request puts a huge load on the server, and that's not something you want.
Designers create CSS files, programmers shouldn't (in some cases shouldn't be allowed to). It's not their job/their speciality.
Mixing JS and PHP is, IMHO, one of the greatest mistakes one can make. With jQuery being a very popular lib that uses the $ sign, it can be a huge source of bugs and syntax errors. Besides that, JS is a completely different language from virtually any other programming language. Very few people know how to get the most out of it, and letting PHP developers write vast JS scripts often ends in tears. JavaScript is a functional, prototypal OO language, and people who don't fully understand these crucial differences write bad code as a result. I know, because I've written tons of terrible JS code.
Why would you want to do this, actually? PHP allows you to change all elements' classes while generating the page; just make sure the classes have corresponding style rules in your css files and the colours will change as you want them, without having to send various files, mess with headers, and deal with all the headaches that come with this practice.
If you want more reasons why you shouldn't do this, I can think of at least another few dozens. That said: I can only think of 1 reason why you would think of doing this: it makes issues caused by client-side cached scripts less of an issue. Not that it should be an issue in the first place, but hey...
I've just been messing around with file_get_contents() at school and have noticed that it allows me to open websites that are blacklisted at school.
Only a few issues:
No images load
Clicking a link on the website just takes me back to the original blocked page.
I think I know a way of fixing the linking issue, but haven't really thought it through...
I could do a str_replace on the content from file_get_contents() to replace any link with another file_get_contents() call, on that link... right?
Would it make things easier if I used cURL instead?
Is what I'm trying to do even possible, or am I just wasting my valuable time?
I know this isn't a good way to go about something like this, but it is just a thought that's made me curious.
This is not a trivial task. It is possible, but you would need to parse the returned document(s) and replace everything that refers to external content so that they are also relayed through your proxy, and that is the hard part.
Keep in mind that you would need to be able to deal with (for a start, this is not a complete list):
Relative and absolute paths that may or may not fetch external content
Anchors, forms, images and any number of other HTML elements that can refer to external content, and may or may not explicitly specify the content they refer to.
CSS and JS code that refers to external content, including JS that modifies the DOM to create elements with click events that act as links, to name but one challenge.
This is a fairly mammoth task. Personally I would suggest that you don't bother - you probably are wasting your valuable time.
Especially since some nice people have already done the bulk of the work for you:
http://sourceforge.net/projects/php-proxy/
http://sourceforge.net/projects/knproxy/
;-)
Your "problem" comes from the fact that HTTP is a stateless protocol and different resources like css, js, images, etc have their own URL, so you need a request for each. If you want to do it yourself, and not use php-proxy or similar, it's "quite trivial": you have to clean up the html and normalize it with tidy to xml (xhtml), then process it with DOMDocument and XPath.
You could learn a lot of things from this - it's not overly complicated, but it involves a few interesting "technologies".
What you'll end up with is called a crawler or screen scraper.
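A minimal sketch of the DOMDocument/XPath link-rewriting step (proxy.php?url=... is an assumed URL scheme, and relative-URL resolution is left out):

<?php
// proxy.php - fetch a page and re-point its links back through this script.
$url  = $_GET['url'];
$html = file_get_contents($url);

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // tolerate real-world tag soup
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);
foreach ($xpath->query('//a[@href]') as $a) {
    // Relative URLs would still need resolving against $url first.
    $a->setAttribute('href', 'proxy.php?url=' . urlencode($a->getAttribute('href')));
}

header('Content-Type: text/html; charset=UTF-8');
echo $doc->saveHTML();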
Is there an easy way to do this without parsing the entire resource pointed to by the URL and finding out the different content types (images, javascript files, etc.) linked to inside that URL?
Just some quick thoughts for you.
You should be aware that caching, and the differences in the way browsers obey and disobey caching directives, can lead to different resource requests being generated for the same page by different browsers at different times - worth considering.
If the purpose of your project is simply to measure this metric and you have control over the website in question, you can pass every resource through a php proxy which counts the requests, i.e. you can follow this pattern for ssi, scripts, styles, fonts, anything.
If point 2 is not possible due to the nature of your website, but you have access, then how about parsing the HTTP log? I would imagine this is simple compared with trying to parse a html/php file, but it could be very slow.
If you don't have access to the website source / http logs, then I doubt you could do this with any real accuracy - there's a huge amount of work involved - but you could use curl to fetch the initial HTML and then parse it as per the instructions by DaveRandom.
I hope something in this is helpful for you.
EDIT
This is easily possible using PhantomJS, which is a lot closer to the right tool for the job than PHP.
Original Answer (slightly modified)
To do this effectively would take so much work I doubt it's worth the bother.
The way I see it, you would have to use something like DOMDocument::loadHTML() to parse an HTML document, and look for all the src= and href= attributes and parse them. Sounds relatively simple, I know, but there are several thousand potential tripping points. Here are a few off the top of my head:
Firstly, you will have to check that the initial requested resource actually is an HTML document. This might be as simple as looking at the Content-Type: header of the response, but if the server doesn't behave correctly in this respect, you could get the wrong answer.
You would have to check for duplicated resources (like repeated images etc) that may not be specified in the same manner - e.g. if the document you are reading from example.com is at /dir1/dir2/doc.html and it uses an image /dir1/dir3/img.gif, in some places in the document this might be referred to as /dir1/dir3/img.gif, in some places as http://www.example.com/dir1/dir3/img.gif and in some places as ../dir3/img.gif - you would have to recognise that this is one resource that results in only one request.
You would have to watch out for browser specific stuff (like <!--[if IE]) and decide whether you wanted to include resources included in these blocks in the total count. This would also present a new problem with using the XML parser, since <!--[if IE] blocks are technically valid SGML comments and would be ignored.
You would have to parse any CSS docs and look for resources that are included with CSS declarations (like background-image:, for example). These resources would also have to be checked against the src/hrefs in the initial document for duplication.
Here is the really difficult one - you would have to look for resources dynamically added to the document on load via Javascript. For example, one of the ways you can use Google AdWords is with a neat little bit of JS that dynamically adds a new <script> element to the document, in order to get the actual script from Google. In order to do this, you would have to effectively evaluate and execute the Javascript on the page to see if it generates any new requests.
So you see, this would not be easy. I suspect it may actually be easier to go get the source of a browser and modify it. If you want to try and come up with a PHP based solution that comes up with an accurate answer be my guest (you might even be able to sell something as complicated as that) but honestly, ask yourself this - do I really have that much time on my hands?
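For a flavour of how the naive first step might look - and why the tripping points above make it insufficient - here is a sketch that only counts src/href attributes:

<?php
// Collect and de-duplicate src/href URLs from one page. This deliberately
// ignores CSS url() references, conditional comments and JS-added resources -
// exactly the tripping points listed above.
$base = 'http://www.example.com/dir1/dir2/doc.html'; // assumed entry point

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML(file_get_contents($base));

$resources = array();
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//*[@src] | //link[@href]') as $node) {
    $ref = $node->hasAttribute('src') ? $node->getAttribute('src')
                                      : $node->getAttribute('href');
    // Normalising ../dir3/img.gif against $base vs. the absolute URL is the
    // hard part; a real URL resolver would be needed here.
    $resources[$ref] = true;
}
echo count($resources) . " candidate resources found\n";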
I have created a css page called style.php and included this at the top:
<?php header("Content-type: text/css"); ?>
Does this make you cringe? Is it a terrible idea? I am doing this because I have created a CMS that allows the admin to control colors on pages (so the style.php script queries the database and grabs the hex values).
Any thoughts?
It's not a bad idea (subject to the notes about caching + content-type), but think about the cost of firing up a PHP instance (mod_php) or passing the script to an already running php (fastcgi style). Do you really want that overhead?
You might be better off writing a "cached" version of your CSS page to a static file, and serving that (or if you need per-page flexibility, selecting which style sheet to include; I assume your main page is PHP already)
This is a fine solution, just make sure that you are serving up the appropriate headers. See my blogpost about a related topic (search for "The important headers are" to get to the right section).
One more thing:
With the caching you might get into the situation where the user changes the color she wants to see, but (because it is cached at the client) the page doesn't update. To invalidate the cache, append an ?id=N at the end of the URL, where N is a number that is stored for the user (for example in the session) and is incremented every time she changes the color scheme.
Example:
At first the user has a stylesheet of http://example.com/style.php?id=0
When she changes the colors, she will get the url of http://example.com/style.php?id=1 and so on.
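Put together, a hedged sketch (the session key and form field are made-up names):

<?php
// Bump the per-user stylesheet id whenever the colors change, so the cached
// style.php?id=N URL is invalidated on the next page load.
session_start();
if (isset($_POST['new_colors'])) {
    // ... save the new colors to the database here ...
    $_SESSION['style_id'] = isset($_SESSION['style_id']) ? $_SESSION['style_id'] + 1 : 1;
}
$id = isset($_SESSION['style_id']) ? $_SESSION['style_id'] : 0;
?>
<link rel="stylesheet" type="text/css" href="http://example.com/style.php?id=<?php echo $id; ?>">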
Assuming you use appropriate caching, as I imagine the CMS-driven values will probably not change very often, there's no specific reason to avoid creating a CSS include on the fly.
This is not a bad idea. This is a creative idea with numerous benefits:
your users can define values w/o you needing to worry about security (parsing css is hard)
you can enforce a more visually consistent set of skins (some flexibility is better than total flexibility)
simple to code