I'm creating a website from scratch. I was really into this in the late 90s, but the web has changed a lot since then! I'm more of a designer, so when I started putting this site together I basically built a system of PHP includes to make the site more "dynamic".
When you first visit the site, you're presented with a login screen if you're not already logged in (cookies). If you're not logged in, a page called access.php is shown.
I thought I'd preload the heaviest images at this point, so that when the user is done logging in, the images are already cached. And this is working as I want. But I still notice that the biggest image isn't rendered immediately anyway, so it seems kind of pointless.
All of this has made me rethink how the site is structured and how scripts and CSS files are loaded. Using Firebug and YSlow with Firefox I see a few pointers, like Expires headers and reducing the size of each script. But is this really the culprit?
For example, would this be really, really stupid in the main index.php? The entire site is basically structured like this:
<?php
require("dbconnect.php");
?>
<?php
include ("head.php");
?>
And below this is basically just the body and the content of the site.
head.php, however, consists of the doctype, the head section, links to two CSS stylesheets, the jQuery library, the jQuery validation engine, Cufon and its font file, and then the small Cufon.replace snippet.
The rest of the body comes with the index.php file, but at the bottom of it is another include of a file called "footer.php", which basically loads a couple of jsLoader scripts, a slide panel, and then a JS function.
All of this makes the end page source look like a typical complete webpage, but I'm wondering if any of you can see immediately that "this is really really stupid" and "don't do that, do this instead" etc. :) Are includes a bad way to go?
This site is also pretty image-intensive, and I can probably do a little more optimization.
But I don't think that's the primary culprit. YSlow gives me a report of what takes up the most space:
doc(1) - 5.8K
js(5) - 198.7K
css(2) - 5.6K
cssimage(8) - 634.7K
image(6) - 110.8K
I know it looks like cssimage(8) weighs the most, but I've already preloaded these images beforehand and it doesn't really affect the rendering.
To speed things up a little, you could assemble all your images into one image sprite, so that a single request downloads all of them. But that requires you to fine-tune your CSS to display just the relevant portion of the sprite.
For a better explanation, check out: http://css-tricks.com/css-sprites/
Another answer that may seem a little simplistic, but it's something I like to keep in mind when I make a website: just keep it simple. Does all your JS add real value? Are all those images really needed? Could you display less and make a lighter design? I'm not criticizing your work at all, just a suggestion...
I used the following approach on an extranet project:
Using jQuery and an array of file names, I AJAX in all the images, .js and .css files so that they are preloaded in the cache. As I iterate through the array, I update a progress bar on the screen indicating that the site is loading, much like a Flash preloader.
It worked well. A rough sketch of the idea is shown below.
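Here is a minimal sketch of that approach, assuming jQuery is available; the file names and the #progress element are made up for illustration and this is not the original extranet code:
// Preload a list of assets and update a hypothetical <div id="progress"> bar.
var assets = ['img/hero.jpg', 'img/panel.png', 'css/site.css', 'js/app.js']; // example names
var loaded = 0;
$.each(assets, function (i, url) {
  // Count each asset as it arrives (or fails) and widen the progress bar.
  var done = function () {
    loaded++;
    $('#progress').css('width', Math.round(loaded / assets.length * 100) + '%');
  };
  if (/\.(png|jpe?g|gif)$/i.test(url)) {
    // Images are warmed up via <img> elements so they land in the browser cache.
    $('<img>').on('load error', done).attr('src', url);
  } else {
    // CSS / JS files are simply fetched once so later pages get cache hits.
    $.get(url).always(done);
  }
});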
What I would do is show a loading page built with pure CSS and HTML by default, then wait for jQuery to load and preload the images with ImageLoader. Once that is done, redirect to the normal website; since the images are already in the cache, they won't be downloaded again.
Another optimization you can do is to minify all JS files and combine all of them except jquery.js. Put jquery.js first in your HTML so it loads first. Also put your SCRIPT tags at the bottom of the HTML.
It sounds like you have pretty much nailed preloading: if you have loaded a resource once and the expiry header is set correctly, it is preloaded, no matter what kind of content it is.
File combination can be key to a quick website. Each extra file adds load time; in the worst cases of network and server lag it might add up to a second extra per separate file. More commonly it will be around 100 - 200 milliseconds per file.
If the scripts are not already minified, minify them and combine them into the same file; just remember to keep the order. I have no idea why Ivo Sabev wouldn't include jQuery.
Same thing with the CSS files.
How much have you done about testing image compression? There can be a real gain from trying out different compression settings and comparing size vs. quality. For PNG images, IrfanView with PNGOUT can often make files 25% smaller than other programs. On top of that, a very big reduction can be achieved by reducing the image to 8-bit colour; with a lot of graphic elements you simply can't tell the difference. Right here on Stack Overflow there is a great example of well-compressed and stacked images in the editor control buttons: http://sstatic.net/so/Img/wmd-buttons.png
I am looking for a proper way to implement lazy loading of images without harming printability and accessibility, and without introducing layout shift (content jump), preferably using the native loading=lazy attribute and a fallback for older browsers. Answers to the question How lazy loading images using JavaScript works? included various solutions, none of which completely satisfies all of these requirements.
An elegant solution should be based on valid and complete HTML markup, i.e. using the <img> element's src, srcset, sizes, width, height, and loading attributes instead of putting the data into data- attributes, as the popular JavaScript libraries lazysizes and vanilla-lazyload do. There should be no need for <noscript> elements either.
Due to a bug in Chrome, the first browser to support native lazy loading, images that have not yet been loaded are missing from the printed page.
Both JavaScript libraries mentioned above require either invalid markup without any src attribute at all, or an empty or low-quality image placeholder (LQIP), while the src data is put into data-src and the srcset data into data-srcset, all of which only works with JavaScript. Is this considered an acceptable or even best practice in 2020, and does it harm the site's accessibility, cross-device compatibility, or search engine optimization?
Update:
I tried a workaround for the printing bug using only HTML and CSS @media print background images in this codepen. Even if this worked as intended, it would require a CSS directive for each and every image, which is neither elegant nor generic. Unfortunately there is no way to use media queries inside the <picture> element either.
There is another workaround by Houssein Djirdeh at lazy-load-with-print-ctl1l4wu1.now.sh that uses JavaScript to change loading=lazy to loading=eager when a "print" button is clicked. The same function could also be used onbeforeprint.
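A minimal sketch of that onbeforeprint variant, assuming native loading=lazy is used on the images (this is not Houssein Djirdeh's code), could look like this:
// Switch natively lazy images to eager loading just before the print dialog renders the page.
window.addEventListener('beforeprint', function () {
  document.querySelectorAll('img[loading="lazy"]').forEach(function (img) {
    img.setAttribute('loading', 'eager');
    // Re-assigning src nudges browsers that have already deferred the request.
    img.src = img.src;
  });
});
Whether the images actually finish downloading before the print preview is captured is exactly the open problem discussed in this question.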
I made a codepen using lazysizes.
I made another codepen using vanilla-lazyload.
I thought about forking a JavaScript solution to make it work using src and srcset, but this has probably been tried before; the trade-off would be that by the time the lazy-loading script starts to act on the image elements, the browser may already have started downloading the source files.
Just show me your hideous code, I don't want to read!
If you don't want to read my ramblings the final section "Demo" contains a fiddle you can investigate (commented reasonably well in the code) with instructions.
Or there is a link to the demo on a domain I control here that is easier to test against if you want to use that.
There is also a version that nearly works in IE here; for some reason the "preparing for print" screen doesn't disappear before printing, but all other functionality works (surprisingly)!
Things to try:
Try it at different browser sizes to see the dynamic image requesting.
Try it on a slower connection and check the network tab to see the lazy loading in action, and how the lazy-loading behaviour changes depending on connection speed.
Try pressing CTRL + P when the network connection is slow (without scrolling the page) to see how images not yet in the DOM are loaded before printing.
Try loading the page on a slow network connection and then using FILE > PRINT to see how images that have not yet loaded are handled in that scenario.
Version 0.1, proof of concept
So there is still a long way to go, but I thought I would share my solution so far.
It is complex (and flawed) but it is about 90% of what you asked for and potentially a better solution than current image lazy loading.
Also, I am awful at writing clean JS when prototyping an idea, so I can only apologise to anyone brave enough to try to understand my code at this stage!
It is only tested in Chrome, so as you can imagine it might not work in other browsers, especially as grabbing the content of a <noscript> tag is notoriously inconsistent. However, I hope this will eventually become a production-ready solution.
Finally, it was too much work to build an API at this stage, so for the image resizing I used https://placehold.it; there are a few lines of redundant code to be removed because of that.
Key features / Benefits
No wasted image bytes
This solution calculates the actual size of the image to be requested. So instead of adding breakpoints in something like a <picture> element, we literally say we want an image that is 427px wide (for example). A rough sketch of that width calculation is shown at the end of this section.
This obviously requires a server-side image-resizing solution (which is beyond the scope of a Stack Overflow answer), but the benefits are massive.
First of all, if you change all of your breakpoints on the site it doesn't matter, so there is no updating of picture elements everywhere.
Secondly, the difference in kB between a 320px- and a 400px-wide image is over 40%, so picking a "similarly sized" image is not ideal (which is basically what the <picture> element does).
Thirdly, if people (like me) have massive 4K monitors and a decent connection speed, then you can actually serve them a 4K image (although connection-speed detection is an improvement I need to make in version 0.2).
Fourthly, what if an image is 50% of the width of its parent container at one screen size and 25% at another, but the container itself is 60% of the screen width at one screen size and 80% at another?
Trying to get this right in a <picture> element can be frustrating at best. It is even worse if you then decide to change the layout, as you have to recalculate all of the width percentages etc.
Finally, this saves time when crafting pages and would work well with a CMS, as you don't need to teach someone how to set breakpoints on an image (I have yet to see a CMS handle this better than simply setting the breakpoints as if every image is full width on the screen).
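As promised, here is a rough sketch of the width calculation. This is not the exact demo code; the aspect-ratio parameter and the placehold.it URL pattern are stand-ins for a real resizing API:
// Work out the pixel width to request for an image based on its rendered size,
// rather than on fixed breakpoints.
function requestedWidth(img) {
  var cssWidth = img.parentElement.clientWidth;   // width the image will occupy
  var dpr = window.devicePixelRatio || 1;         // account for hi-DPI screens
  return Math.ceil(cssWidth * dpr);
}
// With a real resizing API you would pass this width as a parameter; the demo
// fakes it with placehold.it, roughly like this:
function placeholderUrl(img, aspectRatio) {
  var w = requestedWidth(img);
  var h = Math.round(w * aspectRatio);
  return 'https://placehold.it/' + w + 'x' + h;
}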
Minimal Markup (and semantically correct markup)
Although you wanted to avoid <noscript> and data attributes, I needed to use both.
However, the markup you write / generate is literally an <img> element, written as you normally would, wrapped in a <noscript> tag.
Once an image has fully loaded, all clutter is removed, so your DOM is left with just an <img> element.
If you ever want to replace the solution (if browser technology improves etc.), then a simple replace on the <noscript> tags would get you back to standard HTML markup, ready for improving.
WebP
Of course this solution requests WebP images if supported (it's all about performance!). On the server side you would need to process these accordingly (for example, if an image is a PNG with transparency, you send that back even if a WebP image is requested).
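One common quick client-side check for WebP support, as a stand-in for whatever detection the demo actually uses, is the canvas test:
// Browsers that can encode WebP via canvas.toDataURL() can also decode it.
var supportsWebP = (function () {
  var canvas = document.createElement('canvas');
  canvas.width = canvas.height = 1;
  return canvas.toDataURL('image/webp').indexOf('data:image/webp') === 0;
})();
// The requested format parameter can then be chosen accordingly.
var format = supportsWebP ? 'webp' : 'jpg';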
Printing
Oh this was a fun one!
There is nothing we can do if we send a document to print and an image has not loaded yet. I tried all sorts of hacks (such as setting background images), but it just isn't possible (or I am not clever enough to work it out... more likely!).
So what I have done is think of real world scenarios and cover them as gracefully as possible.
If the user is on a fast connection we lazy load the images, but we don't wait for scroll to do this. This could mean a bit more load on our servers, but I am treating printing as highly important (second only to speed).
If the user is on a slow connection then we use traditional lazy loading.
If they press CTRL + P we intercept the print command and display a message while the images are loading. This concept is taken from the example the OP gave by Houssein Djirdeh, but using our lazy-loading mechanism.
If a user prints using FILE > PRINT then we instead display a placeholder for images that have not yet loaded, explaining that they need to scroll the page to display the image (the placeholders are approximately the same size as the image will be).
This is the best compromise I could think of for now.
No layout shifts (assuming content to be lazy loaded is off-screen on page load).
Not a 100% perfect solution for this but as "above the fold" content shouldn't be lazy loaded and 95% of page visits start at the top of the page it is a reasonable compromise.
We use a blank SVG (created at the correct proportions "on the fly") as a data URI placeholder for the image and then swap in the real src when we need to load the image (sketched below). This avoids extra network requests and ensures that when the image loads there is no layout shift.
This also means the page is semantically correct at all times, with no empty src attributes etc.
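The placeholder technique, in outline (a simplified sketch rather than the demo's exact code; the data-real-src attribute name is illustrative):
// Build an empty SVG data URI at the image's proportions so the slot keeps its size.
function svgPlaceholder(width, height) {
  var svg = '<svg xmlns="http://www.w3.org/2000/svg" width="' + width +
            '" height="' + height + '"></svg>';
  return 'data:image/svg+xml,' + encodeURIComponent(svg);
}
function createLazyImg(realSrc, width, height, alt) {
  var img = document.createElement('img');
  img.width = width;
  img.height = height;
  img.alt = alt;
  img.src = svgPlaceholder(width, height);  // no network request, no layout shift
  img.dataset.realSrc = realSrc;            // picked up later by the lazy-load logic
  return img;
}
// Later, when the image is due to load:
// img.src = img.dataset.realSrc;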
The layout shifts occur if a user has already scrolled the page and then reloads. This is because the <img> elements are created via JavaScript (unless JavaScript is disabled in which case the image displays from the <noscript> version of the image). So they don't exist in the DOM as it is parsed.
This is avoidable but requires compromises elsewhere so I have taken this as an acceptable hit for now.
Works without JavaScript and clean markup
The original markup is simply an image inside a <noscript> tag. No custom markup or data-attributes etc.
The markup I have gone with is:
<noscript class="lazy">
<img src="https://placehold.it/1500x500" alt="an image" width="1500" height="500"/>
</noscript>
It doesn't get much more standard and clean than that; it doesn't even need the class="lazy" if you don't use <noscript> tags elsewhere, it is purely to avoid collisions.
You could even omit the width and height attributes if you didn't care about Layout Shift but as Cumulative Layout Shift (CLS) is a Core Web Vital I wouldn't recommend it.
Accessibility
The images are just standard images and alt attributes are carried over.
I even added an additional check: if an alt attribute is empty or missing, a big red border is added to the image via a CSS class.
Issues / compromises
Layout Shift if page already scrolled
As mentioned previously if a page is already scrolled then there will be massive layout shifts similar to if a standard image was added to a page without width and height attributes.
Accessibility
Although the image solution itself is accessible the screen that appears when pressing CTRL + P is not. This is pure laziness on my part and easy to resolve once a more final solution exists.
The lack of Internet Explorer support (see below) however is a big accessibility issue.
IE
UPDATE
There is a version that nearly works in IE11 here. I am investigating if I can get this to work all the way back to IE9.
Also tested in Firefox, Edge and Safari (mobile), seems to work there.
ORIGINAL
Although this isn't tested in Firefox, Safari etc. it is easy enough to get to work there if there are issues.
However accessing the content of <noscript> tags is notoriously difficult (and impossible in some versions) in IE and other older browsers and as such this solution will probably never work in IE.
This is important when it comes to accessibility as a lot of screen reader users rely on IE as it works well with JAWS.
The solution I have in mind is to use User Agent sniffing on the server and serve different markup and JavaScript, but that is complex and very niche so I am not going to do that within this answer.
Checking Latency
I am using a rather crude way of checking latency (to try to guess whether someone is on a 3G / 4G connection): downloading a tiny image twice and measuring the load time.
Two unneeded network requests are not ideal when trying to go for maximum performance (not because of the 100 bytes downloaded, but because of the delay on high-latency connections before initialising things).
This needs a complete rethink but it will do for now while I work on other bits.
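For reference, the crude check amounts to something like the following sketch (the URL, sample count and threshold are illustrative, not the demo's actual values):
// Time two requests for a tiny cache-busted image and average the result.
function estimateLatency(callback) {
  var url = '/tiny.gif', samples = [];
  function run(remaining) {
    var start = performance.now();
    var img = new Image();
    img.onload = img.onerror = function () {
      samples.push(performance.now() - start);
      if (remaining > 1) {
        run(remaining - 1);
      } else {
        callback(samples.reduce(function (a, b) { return a + b; }, 0) / samples.length);
      }
    };
    img.src = url + '?cb=' + Math.random(); // bypass the cache
  }
  run(2);
}
estimateLatency(function (avgMs) {
  var slowConnection = avgMs > 200; // arbitrary example threshold
  // pick the lazy-loading strategy based on this
});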
Demo
Couldn't use an inline fiddle due to character count limitation of 30,000 characters!
So here is the current JS Fiddle - https://jsfiddle.net/9d5qs6ba/.
Alternatively as mentioned previously the demo can be viewed and tested more easily on a domain I control at https://inhu.co/so/image-concept.php.
I know it isn't the "done thing" linking to your own domains but it is difficult to test printing on a jsfiddle etc.
The proper solution for printable lazy loading in 2022 is using the native loading attribute.
<img loading=lazy>
The recommendation to use a custom print button has been obsoleted as chromium issue 875403 got fixed.
Prior recommendations included adding a custom print button (which did not fix the problem when using the native browser print functionality) or using JavaScript to load images onbeforeprint; the latter was not considered a good solution, because loading=lazy, as a "DOM-only" solution, should not have to rely on JavaScript.
Beware that, even after the bug fix, some of your users might still visit your site with a buggy browser version.
@Ingo Steinke Before one delves into answers for the concerns that you have raised, one has to go back and think about why lazy loading came about and what problem it solved when it was first conceived as a framework of thought. Keyword: framework of thought. It is not a solution, and I would go out on a limb and say it has never been a solution, only a framework of thought.
Why we wanted it:
Minimise unnecessary file fetching from the server; this is bandwidth-critical if one is running a large user base. It was the internet version of just-in-time delivery, as in industrial production.
In legacy browser versions, and before async and defer were popularised in JS/HTML, interactivity with the browser window remained hampered until all content was loaded.
Broadband as we know it has only been around, in any real sense of penetration, for the last six or seven years. We wanted lazy loading because we didn't want to run into point 2 on low bandwidth. To be honest, there was and still is a growing concern with, and ideology of, minifying and zipping JS and CSS files, all because the round trip to the server and back should be minimised so that the next item in the list can be fetched. Keep in mind that browsers tend to limit simultaneous download connections to around six at a time per window or active window. There is a reason why Google popularised the three-second rule. If the above were left to run unchecked, the three-second rule would fall on its head as if it had no legs.
So along came thought frameworks.
Image as CSS background: this approach came about because it did not mess up the visual layout of the page. Everything remained in its place and then suddenly became colourful. It was the time when web pages seemed to have an elastic fit, i.e. like a bag which, once filled with air, suddenly popped into a bouncy castle. This was becoming an increasingly bad experience from a front-end developer's point of view. So fixing the height and width of the container and then loading the image as a background helped, and the HTML5 background alignment properties improved accordingly. There was even a variant, still in use, with multiple backgrounds: one being a loading spinner or a low-quality blurred version of the image, on top of which the actual intended image was fetched. Since the lower-level background would be fetched once and populated everywhere in a single download, it created a more pleasing visual and the user knew what to expect. It also worked in printing, even if the intended image did not download.
Then came the JS version of it, hijacking the DOM either through data-src, invalid image tags with the src removed, and what not, and only triggering the change when the content is scrolled to. Obviously there would be lag, but that was countered either by the CSS approach implemented in JS or by calculating scroll points and triggering the event a couple of pixels ahead. They all still work on the same premise.
There is one question that begs to be asked, and you have touched on it in your pretext: none of this controls or alters native browser functionality. The browser may well go and fetch the item before your script has had anything to do with anything.
This is the main issue here. The BOM does not care, and does not even want to care, about what your script is asking it to do; all it knows is that if there is a src property, it fetches the content. None of the solutions have changed that. If we could change that behaviour, the thought framework would become a solution.
I still believe browsers should not change that just for the sake of it, and thus it has never gained traction in debates. What browsers have done instead is pre-fetching, known as the speculative or look-ahead pre-parser; it is the single biggest improvement in browsers and it deserves the credit. As we type a URL in the address bar, on every change of the string the browser is pre-fetching the content even though the whole URL has not been typed. I once built a programme that grabbed everything that arrived at the server from these look-ahead pre-parsers. It takes less than a second most of the time to get a response, and the browser begins to process it all, including images and JS. This countered the jerky, delayed, elastic-prone display discussed in points 1 and 2. It did not reduce the server hits, however, which is the reason we do lazy loading in the first place. But some JS workarounds gained traction because, with no src property, the pre-parser did not fetch the image; it was only fetched when the user was actually sent to the page and the events were triggered. Some browsers toyed with the idea of lazy loading themselves, but let go of it because it did not assume universal consistency as a standard.
The universal standard is simple: if there is a src property, the browser will fetch the item, no ifs and buts. Imagine if that were not the case; hell would break loose on the poor front-end developer.
So, deep down, what you are raising in this debate is a question about BOM functionality, as discussed above. There is no workaround for it, in your case for both the screen and the print version of the display. How do you make sure images are loaded when the print command is sent? The answer is simple: for the BOM, print is after the fact, the fact being the screen display, and everything before that happens at the BOM/DOM propagation level. Again, you cannot change that.
So you have to make a trade-off. The trade-off comes in the form of another thought framework: rather than assuming everything is print-ready, make it print-ready on the user's command. A div pops up and shows a print version of the document, and printing happens from there. The UI could be anything; it would only take a second or so, as the majority of the content will already be loaded anyway and the rest will take a short amount of time. CSS rules for print can be mighty handy in this respect. You can see this in action in many places on the internet.
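A hedged sketch of that "print-ready on user command" idea (the #print-button element is made up for illustration, and the popup div with @media print rules is omitted for brevity):
// Force any still-lazy images to load, wait for them, then open the print dialog.
document.querySelector('#print-button').addEventListener('click', function () {
  var pending = Array.from(document.querySelectorAll('img[loading="lazy"]')).map(function (img) {
    img.setAttribute('loading', 'eager');
    return img.decode().catch(function () { /* ignore images that fail to load */ });
  });
  Promise.all(pending).then(function () {
    window.print();
  });
});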
Conclusion: as things stand today with BOM functionality, bundling the screen display and the print display with lazy loading is not what lazy loading was intended for, and thus does not provide any better solution than mere hacks. So you have to create your UI based on your context, separating the two, to make it work properly.
I'm using CakePHP to build my site (if that matters). I have a TON of elements/modules, each with its own file and (in some cases) fairly complicated CSS.
Currently the CSS is in one massive CSS file, but for sanity's sake (and the details mentioned below) I would like to keep the CSS for each module in its own respective file, e.g. css/modules/rotator.css. But with plain CSS, that would mean requesting a TON of CSS files.
So I started looking into SASS or LESS, per recommendation. But it seems these are supposed to be compiled and then uploaded. In my case, however, each page is editable via the CMS, so a page might have 10 modules one minute, then after a CMS change it could have 20, or 5, etc. And I don't want to have to compile the CSS for every module if the page isn't going to use it.
Is there a way I can have a ton of CSS files that all compile on the fly?
Side note: I'd also like to allow the user to edit their own CSS for a page and/or module, which would then load after the default CSS. Is this possible with SASS and/or LESS?
I don't need a complete walkthrough (though that would be awesome), but so far my searches have returned either things that are over my head related to Ruby on Rails (never used) or generic tutorials on each respective CSS language.
Any other recommendations welcome. I'm a complete SASS/LESS noob.
Clarified question:
How do I dynamically (server-side) combine multiple CSS files using LESS? (even a link to a resource that would get me on the right track is plenty!)
If you want to reduce the number of CSS files and you have one huge CSS file that contains all the component CSS, just link to it on all pages and make sure you set the cache headers properly.
Browsers will load the file once and use it everywhere. The one pitfall is initial page-load time; if that's not an issue, go with this solution. If it is an issue, consider breaking your compiled CSS down into a few main chunks (e.g. default.css, authoring.css, components.css).
Don't bother trying to make a custom CSS file for each collection of components; you will actually be shooting yourself in the foot by forcing users to re-download the same CSS reorganized in different ways.
Check out lessphp (http://leafo.net/lessphp/). It's a PHP implementation of LESS and can recompile changed files by comparing timestamps.
Assuming that 'on the fly' means 'on pageload', that would likely be even slower than sending multiple files. What I would recommend is recompiling the stylesheets whenever a module is saved.
The issue of requiring only the necessary modules should be solved by the CMS. It has nothing to do with SASS or LESS.
If your CMS is aware of which modules the current page has, do not run a SASS/LESS compilation on the fly (it will be painfully slow unless you implement caching, which is not a trivial task). Instead, adjust your CMS's logic so that it includes each module's CSS file.
Advanced CMSs like Drupal not only automatically fetch only the necessary CSS files, but also assemble them into a single file and compress it.
And if your CMS is not aware of which modules the current page has (e.g. "modules" are simply HTML code saved into the post body), then you can't really do anything.
UPD: As sequoia mcdowell says in his answer, making users download one large CSS file once is better than making them download a number of smaller CSS files that contain duplicate code. The cumulative size of all those smaller CSS files will turn out to be larger than the size of the single full CSS file.
So, I'm trying to cache everything on my website, http://apolloinvest.hu.
I send gzipped, optimized images, JS and CSS, the whole site is gzipped, and the JS files are loaded deferred with LAB, so everything should be fantastic. I also set up browser caching. But my site still takes about 1 second to load any page instead of doing it instantly.
Could you please help me figure out why?
My REDbot answer is: http://redbot.org/?uri=http%3A%2F%2Fapolloinvest.hu%2F
The Google PageSpeed rank is 99/100 (because I don't want to remove the comments from jQuery UI).
The answer for CSS files: http://redbot.org/?uri=http%3A%2F%2Fapolloinvest.hu%2Fda232d78aa810382f2dcdceae308ff8e.css
For JS files: http://redbot.org/?uri=http%3A%2F%2Fapolloinvest.hu%2F5ec01c6d8ca5258bf9dcef1fc6bfb38c.js
So to tell the truth, I don't know what the matter is, with my caching or with my JS. Thanks for the help, guys.
Répás
The site is pretty fast as it is, but here are a few possible improvements:
Directly render the HTML page instead of using JavaScript to do so. Put all the <script> elements at the bottom of the HTML document (just before </body>) so that the browser can render the page even before the JavaScript code is downloaded.
You can concatenate all the JavaScript files into one. Currently, http://apolloinvest.hu/475a641fc1d70f7c92efa3488e27568f.js is just empty.
If possible, serve static content such as JavaScript files and styles with Cache-Control and Expires headers far in the future.
A couple of unrelated notes:
The site is not valid HTML. The additional overhead caused by the browser transforming it to valid HTML does not matter, but the readability (and compatibility) does.
Your stylesheet is restricted to screen. When printed out (or viewed on another non-screen device), it looks ugly.
The site breaks for users without JavaScript. It's just showing a loading bar, forever.
I send gzipped, optimized images, JS and CSS, the whole site is gzipped, and the JS files are loaded deferred with LAB
THAT IS exactly your problem.
Instead of doing all that fancy stuff, you should have profiled your application first, identified the actual bottleneck, and then optimized the exact part that is causing the slowness.
Let me suggest you start with the "Net" tab of Firebug, where you can watch the actual response times of the requests. It is very likely that your code runs FAST but some JS-based web counter prevents the page from displaying immediately.
If it takes 1 second for the PHP code to execute, it's time to profile it. Xdebug or simple microtime(true)-based manual profiling can tell you where the problem is. Once you find it, you'll be able to ask a more specific question here.
If JavaScript and CSS files were included inside pages, it would cut down the number of HTTP requests and therefore make the page load faster. I feel like I am missing something, because it seems like any organization interested in lightning-quick pages would do this. However, I don't recall seeing sites with tons of CSS and JavaScript inlined in their pages when I look at the source code.
Questions:
What errors are in my statements above?
What are the drawbacks of this approach (shown in the title via pseudocode)?
If the data is in an external file it can be cached and reused on other pages (or the same page, revisited) without having to fetch it over the network again.
You get a minor performance penalty on the first page in exchange for a major performance enhancement on subsequent pages.
Modularity is a major concern:
I can pick and choose which JavaScript and CSS files I want per page; otherwise I'd have a ton of CSS and JavaScript files covering all the different configurations (which is just messy).
I can also cache a file and hand it to someone else faster.
Where you will find an example of this happening is when sites chuck their images together into one PNG file and then use CSS to slice out the bits they want for buttons etc.
Another aspect applies not only to inline CSS and JavaScript: when I write code I hate to repeat myself. Repetition leads to errors, is difficult to maintain (update/edit), and wastes time and space. Writing CSS or JavaScript once in a file that gets downloaded once is less error-prone, easier to maintain, and wastes less time and space.
I need to write a text file viewer (not the directory tree, but the actual file contents) for use in a browser. It will be used to view large files. I want to give the user the ability to actually, umm, browse the file, i.e. previous-page and next-page buttons, while each page shows only a portion of the file.
Two questions:
Is there any way to pass the file descriptor through POST (or something) so that on each page I can keep reading from an already-open file, and not start all over again (again, these are huge files)?
Is there a way to read the file backwards? It would be very useful for browsing back in a file.
Any other implementation ideas are very welcome. Thanks
Keeping the file open between requests is not a good idea; you don't have to "start all over again", just maintain an offset and use fseek() to jump to that offset. That way you can also implement the "backwards jumping".
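On the client side, the same idea is just a matter of tracking the offset between page turns. A rough sketch, assuming a hypothetical viewer.php endpoint that takes offset and length parameters and uses fseek() internally, plus made-up #viewer, #prev and #next elements:
var CHUNK = 64 * 1024;   // bytes per "page", illustrative value
var offset = 0;
function loadChunk() {
  // Ask the server for just the slice starting at the current offset.
  fetch('viewer.php?offset=' + offset + '&length=' + CHUNK)
    .then(function (res) { return res.text(); })
    .then(function (text) {
      document.querySelector('#viewer').textContent = text;
    });
}
document.querySelector('#next').addEventListener('click', function () {
  offset += CHUNK;
  loadChunk();
});
document.querySelector('#prev').addEventListener('click', function () {
  offset = Math.max(0, offset - CHUNK);   // "backwards jumping" is just subtraction
  loadChunk();
});
loadChunk();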
Cut your huge files into smaller files once, and then serve the small files to the user.
You should consider pagination. If you're concerned about the user being frustrated by needing to click "next" too often, you could make each chunk reasonably large (so a normal reader pages every 20min).
Another option is chunked transfer encoding: Wikipedia entry. This would allow your server to respond quickly and give the user something to read while it streams the rest of the file over the network (rather than the server needing to read in the whole file and send it all at once). This could dramatically improve the perceived performance compared to serving the files normally, but it still consumes a lot of bandwidth on your server.
You might be able to simulate a large document with JavaScript and AJAX, sending only pieces at a time for better performance.
Consider sending a few pages' worth of your document and attaching listeners to the browser's scroll event. Over time, or as the user scrolls down, you AJAX in more chunks; a rough sketch follows the caveats below. This creates a few annoying UX edge cases, like:
Scroll bar indicates a much smaller document than there actually is
You might be able to avoid this by filling in the bottom of your document with many page breaks, but it'll be difficult to make the length perfect.
Scrolling past the point of currently-available content will show a blank page.
You could detect this using JavaScript and display a "loading" icon to let the user know what's going on.
Built-in "find" feature doesn't work
Hard to avoid this without the user downloading the entire document, but you could provide your own search feature for them to use instead (not as good but perhaps adequate).
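As mentioned above, a rough sketch of the scroll-triggered loading might look like this (the /chunk endpoint and the #content element are made up for illustration):
var nextChunk = 0, loading = false, done = false;
function appendNextChunk() {
  if (loading || done) return;
  loading = true;
  fetch('/chunk?n=' + nextChunk)
    .then(function (res) { return res.text(); })
    .then(function (html) {
      if (!html) { done = true; return; }   // no more content to load
      document.querySelector('#content').insertAdjacentHTML('beforeend', html);
      nextChunk++;
    })
    .finally(function () { loading = false; });
}
window.addEventListener('scroll', function () {
  // Within roughly two viewports of the bottom: fetch more before the user hits a blank page.
  if (window.innerHeight * 2 + window.scrollY >= document.body.offsetHeight) {
    appendNextChunk();
  }
});
appendNextChunk(); // load the first chunk up front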
Really though, you're probably best off with pagination and medium-sized pages. It's a very well-understood design pattern that's relatively easy (compared to the other options, at least) to implement and make fast.
Hope that helps!