I'm more or less building a new design into some software, and to retain the functionality of some of the page features I need to keep some of the JavaScript files the system uses. However, it appears the software uses a global header to include all of the .js files, and to cut down on HTTP requests I only want to include them when the page actually needs them.
Without actually poring through the code of each page, is there a quicker method to test whether a page actually needs a certain .js file included or not?
There is no reliable way to test that. A better solution would be to pack all the JavaScript files into a single file, so only a single request is needed.
During that packing you can minify them as well (remove comments and unneeded whitespace). You can cache the packed file so it doesn't have to be generated on each request, and if you send the proper headers, the browser will cache it too, saving you bandwidth and speeding up the page.
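For illustration, a minimal sketch of such a packing script in PHP (the js/ directory layout, cache path, and the combine.php name are assumptions; minification is left to a library such as JSMin):

    <?php
    // combine.php - hypothetical endpoint that serves all .js files as one
    // response. Assumes the scripts live in a js/ directory; adjust to taste.
    $files = glob('js/*.js');
    $cacheFile = 'cache/combined.js';

    // Rebuild the combined file only when a source file is newer than the cache.
    $latest = max(array_map('filemtime', $files));
    if (!file_exists($cacheFile) || filemtime($cacheFile) < $latest) {
        $combined = '';
        foreach ($files as $file) {
            // The trailing ";" guards against files that omit their final semicolon.
            $combined .= file_get_contents($file) . ";\n";
        }
        file_put_contents($cacheFile, $combined);
    }

    // Proper headers let the browser cache the result (here: one day).
    header('Content-Type: application/javascript');
    header('Cache-Control: public, max-age=86400');
    readfile($cacheFile);

Pages would then reference a single <script src="combine.php"></script> tag instead of a dozen individual includes.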
TL;DR
No. Due to the dynamic nature of JavaScript it's impossible to tell if a library is used until it's actually run.
Long Answer
If running the application is an option (i.e. it has a defined set of things to test to verify all functionality), then it's possible to write a proxy for the libraries you're considering removing that logs their actual use. That's a bit of an overkill solution, though; it would probably be easier to just remove the libraries one by one and test the pages to see if they still work. It's grunt work, but the direct method is probably the fastest.
Related
You could probably speed things up by combining and minifying the includes. If they're all in one place, that should be easy.
Say I have a categories section on my website.
I notice some websites are built so that categories live under categories.php,
and some are built so that categories live under a folder.
Example:
www.example.com/categories.php
vs
www.example.com/categories/
Is there a benefit or drawback to doing it either of these two ways? Is having each page of a site in a separate folder just used for keeping things better organized?
What are the pros and cons of both?
In your first example you are specifically asking for a file. In the second you are requesting a route, which may or may not be a folder; it could just as well be a file. It depends on how the server handles your request. In my opinion, the second style is better, because you:
Obfuscate your tech stack
Save yourself work if your resource changes: you won't need to edit your links, you just make a back-end change instead of changing both the front and back end
Since you will most likely be dealing with dynamic content, you will find that under the hood most apps are actually configured to use a file such as categories.php anyway.
In other words - requests to /categories are routed to /categories.php behind the scenes.
I am deliberately leaving some things out here so as not to over-complicate matters, but I hope that makes sense so far.
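Before getting to the reasons: as a rough sketch (not how any particular framework actually does it), the routing can be as little as a single entry point that the web server rewrites every request to:

    <?php
    // index.php - hypothetical front controller. The web server rewrites all
    // requests here, and we map the "folder" path back to the real script.
    $path = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);

    $routes = array(
        '/'            => 'home.php',
        '/categories/' => 'categories.php',
    );

    if (isset($routes[$path])) {
        require $routes[$path];   // the visitor only ever sees /categories/
    } else {
        http_response_code(404);
    }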
The most common reasons for this are that by emulating a folder structure you...
1 - ...hide details about how you built the site.
So in your case you obscure the fact that you are using PHP. This is good for security purposes, because a potential attacker no longer knows what technology you are using, making it harder to successfully mount an attack.
This is also good for your normal users, who won't know, care or understand what this .php extension is (or what an extension is, for that matter). By hiding it you also make your URL more readable, which leads me to the next point.
2 - ...make your URLs easier to read.
Consider this example:
/categories.php?action=edit&id=43
vs
/categories/shoes/edit
As you can see, the version that has things in "folders" is much easier to read, and even my Gran could make an educated guess as to what that page might do.
So doing this is great for your users, but also for search engines.
By formatting URLs in this way, services like Google better understand what you are doing and thus end up indexing your resources more precisely.
Hope this helps.
Your clients would not want to see web page extensions like .html, .php and so on. Moreover, think of a user typing out the full URL to reach a page; the extension is a bother for them too.
And think about it: for now you are serving your site with PHP, but what about when you decide to serve it with some other language? If you change your links, the older links will be broken; if you don't change them, you are stuck with the ".php" extension, which makes no sense once you are no longer using PHP.
With this question, I aim to understand the inner workings of PHP a little better.
Assume that you have a 50K library, loaded with a bunch of handy functions that you use here and there. Also assume that these functions are needed by, say, 10% of the pages of your site, but that your home page definitely needs them.
Now, the question is... should you use a global include that points to this library, across the board, so that ALL the pages (including the 90% that do not need the library) get it, or should you selectively add that include reference only on the pages that need it?
Before answering, let me point out why I ask this question...
When you include that reference, PHP may be caching it, so the performance hit I worry about may be a one-time deal, as opposed to one paid every time. Once that one time is out of the way, subsequent loads may not be as bad as one might think. That's all because of the smart caching mechanisms PHP deploys, which I do not have deep knowledge of; hence the question...
Since the front page needs that library anyway, the argument could be why not keep that library warm and fresh in the memory and get it served across the board?
When answering this question, please approach the matter strictly from a caching/performance point of view, not from a convenience point of view, so that the discussion doesn't shift to programming style and do's and don'ts.
Thank you
Measure it, then you know.
The caching benefit is likely marginal after the first hit, since the OS will cache the file as well, but that only saves the I/O hit (granted, this is not nothing). However, you will still incur the processing hit. If you include your 50K of code into a "Hello World" page, you will still pay the CPU and memory penalty to load and parse that 50K of source, even if you never execute any of it. That part of the processing will most likely not be cached in any way.
In general, CPU is extremely cheap today, so it may not be "worth saving". But that's why you need to actually measure it, so you can decide yourself.
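A crude way to measure it (a sketch; big_library.php stands in for your 50K file) is to time the include itself on a page that doesn't otherwise use it:

    <?php
    // Hypothetical micro-benchmark: what does including the library cost here?
    $start = microtime(true);
    require 'big_library.php';   // the 50K library in question

    $ms = (microtime(true) - $start) * 1000;
    $kb = memory_get_peak_usage() / 1024;
    error_log(sprintf('include: %.2f ms, peak memory: %.0f KB', $ms, $kb));

Run it with and without an opcode cache enabled, and the difference you actually care about falls out directly.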
The caching I think you are referring to would be opcode caching, from something like APC? All that does is prevent PHP from needing to re-interpret the source each time; you still take some hit for each include or require you use. One paradigm is to scrap the procedural functions and use classes loaded via __autoload(). That makes for a simple use-on-demand strategy in large apps. I also agree with Will that you should measure this if you are concerned; premature optimization never helps.
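A minimal sketch of that use-on-demand idea (the class and path names are made up; newer code would register the callback with spl_autoload_register() instead, but the principle is identical):

    <?php
    // Nothing is loaded or parsed until a class is first referenced.
    function __autoload($className) {
        require 'classes/' . $className . '.php';  // hypothetical layout
    }

    $helper = new TextHelper();  // classes/TextHelper.php is included only now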
I very much appreciate your concerns about performance.
The short answer is that, for best performance, I'd probably conditionally include the file on only the pages that need it.
PHP's opcode caches will maintain both include files in a cached form, so you don't have to worry about keeping the cache "warm" as you might when using other types of caches. The cache will remain until there are memory limitations (not an issue with your 50K script), the source file is updated, you manually clear the cache, or the server is restarted.
That said, opcode (PHP bytecode) caching is only one part of the PHP parsing process. Every time a script is run, the bytecode is then processed to build up the functions, classes, objects, and other instance variables that are defined and optionally used within the script. This all adds up.
In this case, a simple change can lead to a significant improvement in performance. Be green, every cycle counts :)
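Concretely, the "simple change" amounts to moving the require out of the global header and into the pages that use it; a sketch (the paths are made up):

    <?php
    // A page in the ~10% that actually needs the library opts in explicitly:
    require_once 'header.php';           // global header no longer pulls in the library
    require_once 'lib/big_library.php';  // hypothetical path to the 50K file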
I had an idea today (that millions of others have probably already had) of putting all the site's scripts into a single file, instead of having multiple separate ones. When submitting a form, there would also be a hidden field called something like 'action' which would indicate which function in the file should handle it.
I know that things like CodeIgniter and CakePHP exist which help separate/organise the code.
Is this a good or bad idea in terms of security, speed and maintenance?
Do things like this already exist that I am not aware of?
What's the point? It's just going to make maintenance more difficult. If you're having a hard time managing multiple files, you should invest the time into finding a better text editor / IDE and stop using Notepad or whatever is making it so difficult in the first place!
Many PHP frameworks rely on the Front Controller design: a single small PHP script serves as the landing point for all requests. Based on request arguments, the front controller invokes code in other PHP scripts.
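A stripped-down sketch of that design, including the hidden 'action' field from the question (the handler names are made up; the whitelist stops a request from calling arbitrary functions):

    <?php
    // Front controller: dispatch on the submitted 'action' field.
    function handleSaveComment() { /* ... save the comment ... */ }
    function handleDeletePost()  { /* ... delete the post ... */ }

    $handlers = array(
        'save_comment' => 'handleSaveComment',
        'delete_post'  => 'handleDeletePost',
    );

    $action = isset($_POST['action']) ? $_POST['action'] : '';
    if (isset($handlers[$action])) {
        $handlers[$action]();     // call the mapped handler
    } else {
        http_response_code(400);  // unknown or missing action
    }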
But storing all code for your site in a single file is not practical, as other people have commented.
There are many forums that do this. Personally, I don't like it, mainly because if you make an error in the file, the entire site is broken until you fix it.
I like separation of each part, but I guess it has its plusses.
It's likely bad for maintenance, as you can't easily disable a section of your site for an update.
Speed: I'm not sure to be honest.
Security: You could accomplish the exact same security settings by adding a security check to a file and then including that file in all your pages.
If you're not caching your scripts, everything in a single file means less disk I/O, and since disk I/O is generally an expensive operation, this can be a significant benefit.
The thing is, by the time you're getting enough traffic for this to matter, you're probably better off going with caching anyway. I suppose it might make some limited sense, though, in special cases where you're stuck on a shared hosting environment where bandwidth isn't an issue.
Maintenance and security: composing software out of small, integral pieces of code that a programmer can fit inside their head (and a computer can manage neatly in memory) is almost always a better idea than one huge ol' file. Though if you wanted to make it hell for other devs to tinker with your code, the huge ol' file might serve well enough as part of an obfuscation scheme. ;)
If for some reason you were using the single-file approach to try to squeeze out extra disk I/O, then what you'd want to do is create a build process where you do your actual development work in a series of broken-out, discrete files, and issue a make- or ant-like command to generate your single file.
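For instance (a sketch; the paths and the build.php name are made up), the build step could be as dumb as concatenating the source files:

    <?php
    // build.php - run "php build.php" to emit the single file production serves.
    $sources = glob('src/*.php');
    sort($sources);  // deterministic order

    $out = "<?php\n";
    foreach ($sources as $file) {
        $code = file_get_contents($file);
        $out .= "// --- " . $file . " ---\n";
        $out .= preg_replace('/^<\?php\s*/', '', $code) . "\n";  // drop opening tags
    }
    file_put_contents('build/site.php', $out);
    echo "Wrote build/site.php\n";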
I currently have a custom session handler class which simply builds on php's session functionality (and ties in some mySQL tables).
I have a wide variety of session variables that best suit my application (primarily kept on the server side). I am also using jQuery to improve the usability of the front end, and I was wondering if feeding some of the session variables (some basics and some browse-preference IDs) to a JS object would be a bad way to go.
Currently, if I need to access any of this information at the front end, I do an Ajax request to a PHP page specifically written to provide the appropriate response, although I am unsure if this is best practice (actually, I'm pretty sure this just creates an excess number of Ajax requests).
Has anyone got any comments on this? Would this be the best way to have this sort of information available to the client side?
I really guess it depends on many factors; I always have "premature optimization..." in the back of my head.
In earlier years I rushed every little idea that came to my mind into the app. That often led to "I made it cool, but I didn't take the time to fully grasp the problem I'm trying to solve; was there a problem anyway?"
Nowadays I use the obvious approach (like yours), which is fast (without sacrificing performance completely on the first try), and then analyze whether I'm running into problems or not.
In other words:
How often do you need to access this information from different kinds of loaded pages (because if you load the information once and the user doesn't reload, there's probably not much point in re-fetching it anyway), multiplied by the number of concurrent clients?
If you write the information into a client-side cookie for fast JS access, can harm be done to your application if it is abused (modified without the application's consent)? Replace "JS" and "cookie" with any kind of offline storage like the WHATWG proposes, if #1 applies.
The "fast" approach suits me, because often there isn't a big investment in prior research during development. If you've done that research carefully... but then you would probably know the answer already ;)
As a third option, you could always push the HTML to your client with the data you need already included in JS; maybe that can work in your case. It will be interesting to see what other suggestions come in!
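As a footnote to that third option, a sketch of what it could look like (the session keys are made up; only expose values that are safe for the client to see):

    <?php
    session_start();

    // Whitelist the values the front end is allowed to know about.
    $clientData = array(
        'username' => $_SESSION['username'],
        'themeId'  => $_SESSION['theme_id'],
    );
    ?>
    <script type="text/javascript">
        // Available to your jQuery code with zero extra Ajax round trips.
        var sessionData = <?php echo json_encode($clientData); ?>;
    </script>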
As a side note: I had PHP sessions stored in the DB too, until I moved them over to memcached (alert: it's a cache, not a persistent store, so it may not be a good idea in your case; I can live with it, I just make sure it's always running), and saw an average drop of 20% in database queries and, through that, a 90% drop in write queries. And I wasn't even using any fancy Ajax yet; that was just from the number of concurrent users.
I would say that's definitely an overkill use of Ajax. Are these sessions private, or important not to show to a visitor? Just to throw it out there: a cookie is the easiest when it comes to both. Having the data in a JavaScript object makes it just as easily readable to a visitor, and when it comes down to cookies being enabled or not, without cookies you wouldn't have sessions anyway.
http://www.quirksmode.org/js/cookies.html is a good source about cookie handling in JS and includes two functions for reading and writing cookies.
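And on the PHP side, mirroring a harmless preference into a cookie is nearly a one-liner (a sketch; never do this with private session data):

    <?php
    session_start();
    // Make a non-sensitive preference readable by client-side script for a day.
    setcookie('theme_id', $_SESSION['theme_id'], time() + 86400, '/');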
I want to implement a two-pass cache system:
The first pass generates a PHP file with all of the common stuff (e.g. news items) hardcoded. The database then has a cache table to link these with the pages (e.g. "index.php page=1 style=default"); it also stores an up-to-date flag which, if false, causes the first pass to rerun the next time the page is viewed.
The second pass fills in the minor details, such as how long ago something was posted, and mutable items like "You are logged in as...".
However, I'm not sure of an efficient implementation that supports both cached and non-cached (e.g. search) pages without a lot of code and several queries.
Right now, each time the page is loaded, the PHP script runs and regenerates the page. For pages like search this is fine, because most searches are different, but other pages such as the index are virtually the same for each hit, yet generate a large number of queries and run quite a long script.
The problem is that some parts of the page do change on a per-user basis, such as the "You are logged in as..." section, so simply saving the generated pages would still result in tens of thousands of nearly identical pages.
The main concern is reducing the load on the server, since I'm on shared hosting and at this point can't afford to upgrade, but the site is using a sizeable portion of the server's CPU and putting a fair load on the MySQL server.
So basically, minimising how much has to be done for each page request, and not regenerating things like the news items on the index all the time, seems a good start, compared to, say, search, which is a far less static page.
I did consider hardcoding the news items as plain HTML, but that would mean maintaining them in several places (since they may be used in searches, and the comments live on a page dedicated to each news item (i.e. news.php), etc.).
I second Ken's recommendation of PEAR's Cache_Lite library; you can use it to easily cache either parts of pages or entire pages.
If you're running your own server(s), I'd strongly recommend memcached instead. It's much faster since it runs entirely in memory and is used extensively by a lot of high-volume sites. It's a very easy, stable, trouble-free daemon to run. In terms of your PHP code, you'd use it much the same way as Cache_Lite, to cache various page sections or full pages (or other arbitrary blobs of data), and it's very easy to use since PHP has a memcache interface built in.
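For flavour, using it from PHP looks roughly like this (a sketch with the pecl/memcache extension; the key, TTL, and build_news_section() are made up):

    <?php
    // Cache an expensive page fragment in memcached for five minutes.
    function build_news_section() { /* ...query MySQL, render HTML... */ return '...'; }

    $memcache = new Memcache();
    $memcache->connect('127.0.0.1', 11211);

    $html = $memcache->get('homepage_news');
    if ($html === false) {
        $html = build_news_section();
        $memcache->set('homepage_news', $html, 0, 300);  // flags = 0, TTL = 300s
    }
    echo $html;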
For super-high-traffic full-page caching, take a look at Varnish or Squid as a caching reverse proxy server. (Pages served by Varnish will easily come out 100x faster than anything that hits the PHP interpreter.)
Keep in mind that with caching, you really only need to cache things that are frequently accessed. It can be a trap to develop a really sophisticated caching strategy when you don't need one. For a page like your home page that's getting hit several times a second, you definitely want to optimize it for speed; for a page that gets maybe a few hits an hour, like a month-old blog post, caching is a bad idea: you only waste your time and make things more complicated and bug-prone.
I recommend not reinventing the wheel... there are template engines that support caching, like Smarty.
For server-side caching, use something like Cache_Lite (and let someone else worry about file locking, expiry dates, and file corruption).
You want to save the results to a file and use logic like this to pull them back out:
    <?php
    $cacheFile = 'cache/page.html';           // hypothetical cache location
    if (file_exists($cacheFile)) {
        include $cacheFile;                   // filename exists: serve the saved copy
    } else {
        ob_start();                           // capture output while we...
        generate_results();                   // ...generate results (your page code)
        $html = ob_get_clean();               // render to HTML (as string)
        file_put_contents($cacheFile, $html); // write to file
        echo $html;                           // output the string
    }
To be clear, you don't need two passes because you can save parts of the page and leave the rest dynamic.
As always with this type of question, my response is:
Why do you need the caching?
Is your application consuming too much IO on your database?
What metrics have you run?
You are talking about adding an extra level of complexity to your app, so you need to be very sure that you actually need it.
You might actually benefit from using the built-in MySQL query cache, if the database is the contention point in your system. The other option is to use Memcache.
I would recommend using an existing caching mechanism. Depending on what you really need, you might be looking at APC, memcached, or various template-caching libs... It's easier/faster to tune written and tested code to suit your needs than to write everything from scratch. (Usually; there might be situations where you don't have a choice.)