create html page from php - php

I am the webmaster of a dynamic website, and because of the many complicated queries that I have to run on the front page and some other pages, the server sometimes suffers from overload when the number of visitors is high.
So I got the idea of periodically (every 2 minutes) generating a static HTML snapshot of these pages. That would put the load on the server only once per 2 minutes, for a single request.
My question is: is this a good idea? I plan to generalize it over many other pages, and I don't want to be surprised and have to go back and redo everything.
If it isn't, are there any good ideas for avoiding this load?
Thank you in advance.
PS: I may publish the method I use to do this, to see if there is a better way.
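For illustration, here is a minimal sketch of the kind of snapshot wrapper described above; the file names are invented, and a cron job requesting the same script every two minutes would work equally well.

    <?php
    // snapshot.php - hypothetical wrapper around a heavy page (front.php).
    // Serves a static HTML copy and rebuilds it at most once every 2 minutes.
    $cacheFile = __DIR__ . '/cache/front.html';   // assumed writable directory
    $ttl       = 120;                             // 2 minutes

    if (is_file($cacheFile) && time() - filemtime($cacheFile) < $ttl) {
        readfile($cacheFile);                     // fresh snapshot: no queries run
        exit;
    }

    // Stale or missing: rebuild the snapshot from the real, query-heavy page.
    ob_start();
    include __DIR__ . '/front.php';
    $html = ob_get_clean();

    // Write atomically so no visitor ever sees a half-written file.
    file_put_contents($cacheFile . '.tmp', $html);
    rename($cacheFile . '.tmp', $cacheFile);
    echo $html;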

I don't think it's a bad idea, but you should use an existing caching solution rather than implementing your own. Why not use memcached? I think that's what you are looking for; just use it for the parts of your code that take a long time.
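A rough sketch of what that might look like with the PHP Memcached extension, assuming a memcached server on localhost and a made-up getFrontPageStats() standing in for the slow queries:

    <?php
    $mc = new Memcached();
    $mc->addServer('127.0.0.1', 11211);          // assumed local memcached server

    $stats = $mc->get('front_page_stats');
    if ($stats === false) {
        // Cache miss: run the expensive queries once, keep the result 2 minutes.
        $stats = getFrontPageStats();            // hypothetical heavy query code
        $mc->set('front_page_stats', $stats, 120);
    }
    // ... render the page using $stats ...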

Caching is a good idea to protect your server from overloading. Many CMSs (Content Management Systems) use this technique.

Sure, it's called caching :)
However, most sites cache just parts of their content. You can't cache a whole page if it contains user-specific content, for example the name of a logged-in user. But you can cache the heavy parts of your site and combine them with a dynamic page.
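As a sketch of that split (the cache file and query function are invented), the heavy fragment is cached while the greeting stays dynamic:

    <?php
    session_start();

    // Dynamic, per-user part: never cached.
    $name = isset($_SESSION['username']) ? $_SESSION['username'] : 'guest';
    echo 'Hello, ' . htmlspecialchars($name);

    // Heavy, shared part: cached to a file for 5 minutes (path is assumed).
    $fragment = __DIR__ . '/cache/popular_articles.html';
    if (!is_file($fragment) || time() - filemtime($fragment) > 300) {
        file_put_contents($fragment, renderPopularArticles()); // hypothetical helper
    }
    readfile($fragment);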

Your idea is really good, and many big websites use this concept. You can also use caching techniques: if you want to avoid database hits, caching will serve you well. You can use Memcached: http://memcached.org/.

Related

PHP - detecting changes in external database-driven site

For a homework project, I'm creating a PHP-driven website whose main function is aggregating news about various university courses.
The main problem is this: (almost) every course has its own website. These are usually plain HTML or built with some simple free CMS.
As a student participating in 6-7 courses, almost every day you go through 6-7 websites checking whether there is any news. The idea behind the project is that you don't have to do that; instead, you just check the aggregation site.
My idea is the following: each time a student logs in, go through his course list. For every course, get its website (recursively, like with wget) and create a hash value of it. If the hash differs from the one stored in the database, we know the site has changed, and we notify the student.
So, what do you think, is this a reasonable way to achieve the functionality?
And if yes, what is (technically) the best way to go about this? I was looking at php_curl, but I don't know if it can fetch a website recursively.
Furthermore, there's a slight problem: I have somewhat limited resources, only a few MB of quota on the public (university) server. However, if that's a big problem, I could use a separate hosting solution.
Thanks :)
Just use file_get_contents, or cURL if you absolutely have to (in case you need cookies).
You can use your hashing trick to check for modifications, but it's not very elegant. What you really want to know is when the page was last changed. I doubt this information is on the website itself, but maybe they offer an RSS feed or some web service or API you can use for this purpose.
Don't worry about doing recursive requests. Just make a new request each time.
"When all else fails, build a scraper"

How to track my visitors? [best performance]

I've been asked to create a custom 'tracker' in PHP, to know where users are coming from and where they are going on the site.
I'm thinking of writing a simple script, which connects to a database, writes the IP, browser, and time of the visit, then closes the DB link.
Is this the right way to do it?
I've found a few similar questions on stackoverflow, but none mentioned performance.
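For reference, the simplest version of what you describe is a single prepared INSERT per request; the table name and credentials below are placeholders:

    <?php
    // track.php - included at the top of each tracked page.
    $pdo = new PDO('mysql:host=localhost;dbname=site', 'user', 'pass');
    $stmt = $pdo->prepare(
        'INSERT INTO visits (ip, user_agent, referer, visited_at)
         VALUES (?, ?, ?, NOW())');
    $stmt->execute(array(
        $_SERVER['REMOTE_ADDR'],
        isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '',
        isset($_SERVER['HTTP_REFERER'])    ? $_SERVER['HTTP_REFERER']    : '',
    ));
    // The connection is closed automatically when the script ends.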
Is there a reason you can't use a solution such as Google Analytics? It's free and has some nice features, such as heat maps that show traffic flow.
The main disadvantage is that it requires you to embed some JavaScript on all the pages, which means the tracking is client-side.
I suppose it's another question of the kind "I want superior performance, but I have no particular reason for it."
In fact, any solution will be fast enough, since writing a log entry is not a heavy operation.
The only thing to keep in mind is not to use any indexes if an SQL database is used.
That's all.
So, let's put that performance concern aside.
The only complete solution would be analyzing the web-server logs.
Any other method will not give you the complete picture. Say an image is hotlinked on other sites and causes heavy load because of that; you would never notice it if you log only requests to PHP scripts.
So you can run a crontab-based script every night that parses the access logs and produces comprehensive information about all user and bot activity.
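A bare-bones sketch of such a nightly script, assuming the standard Apache combined log format and a log path that will differ per server:

    <?php
    // parse_access_log.php - run from cron, e.g. once a night.
    // Counts hits per client IP in an Apache combined-format access log.
    $logFile = '/var/log/apache2/access.log';     // path is an assumption
    $hits = array();

    foreach (new SplFileObject($logFile) as $line) {
        // Combined format starts with: IP - - [date] "request" status size ...
        if (preg_match('/^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3}/', $line, $m)) {
            $ip = $m[1];
            $hits[$ip] = isset($hits[$ip]) ? $hits[$ip] + 1 : 1;
        }
    }

    arsort($hits);
    foreach (array_slice($hits, 0, 20, true) as $ip => $count) {
        echo "$ip\t$count\n";                     // top 20 clients by request count
    }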
Check out Piwik or New Relic; if you need more customization, you should take a look at Webalyzer and Visitors.
N.B.: You can customize Piwik by creating plugins: http://geekmonkey.org/articles/34-how-to-write-a-piwik-plugin
Perhaps you need some special software like Webalyzer? (it's free and quite powerful)
Performance is easy to ask for but much harder to define. It depends on a zillion circumstances, and while I might say "this is the best performance I can get", you might say "hey, what's this?"
Personally, I recommend Google Analytics. It does almost everything you need (and plenty of things you don't). You might get a small 'performance' boost by hosting its script locally, but there's a good chance it's already cached in the user's browser anyway.
Or, if you prefer open-source solutions, give Piwik a shot.
Piwik does just that, and it does it very well. There is also a Tracking API that you can use to track a lot of things about your visitors, using PHP or any other language (REST API). See more information on http://piwik.org/docs/tracking-api/
It is also very modular and fast; don't reinvent the wheel :)

Simultaneous visits to website

Will there be any problem if my website is visited by more than one person simultaneously? If the answer is yes, can you say how I can overcome that?
Do I have to incorporate sessions? Will that alone work? Please explain with a small example.
Going off the lack of information:
Your website will not have any problems if multiple people visit at the same time. More than likely, the software you are using is built specifically for this purpose.
There will generally be no problem. A normal PHP/Apache server setup is designed to handle multiple requests at once. Sessions are always on a per-client basis.
For more specific info, you would have to provide more information about your setup, but if you're just starting out building a web site, it is safe to say that you don't need to worry about this for the time being.
Sessions are not necessarily needed, BUT you should really keep in mind that your data can be read and modified by multiple users concurrently.
Look at the ACID definition for the engineering part of it :)
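To make the concurrency point concrete, here is a sketch of the classic read-modify-write hazard handled with a transaction and a row lock; the table, column, and credentials are invented:

    <?php
    // Two visitors buying the last item at the same moment is the typical hazard;
    // a transaction plus SELECT ... FOR UPDATE serialises the critical section.
    $pdo = new PDO('mysql:host=localhost;dbname=shop', 'user', 'pass');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    $pdo->beginTransaction();
    try {
        $stmt = $pdo->prepare('SELECT stock FROM items WHERE id = ? FOR UPDATE');
        $stmt->execute(array(42));
        $stock = (int) $stmt->fetchColumn();

        if ($stock > 0) {
            $pdo->prepare('UPDATE items SET stock = stock - 1 WHERE id = ?')
                ->execute(array(42));
        }
        $pdo->commit();
    } catch (Exception $e) {
        $pdo->rollBack();
        throw $e;
    }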

First site going live real soon. Last minute questions

I am really close to finishing up on a project that I've been working on. I have done websites before, but never on my own and never a site that involved user generated data.
I have been reading up on things that should be considered before you go live and I have some questions.
1) Staging... (deploying updates without affecting users). I'm not really sure what this would entail, since I'm sure any type of update would affect users in some way. Does this mean some type of temporary downtime for every update? Can somebody please explain this, and a solution for it as well?
2) Limits... I'm using the Kohana framework and I'm using the Auth module for logging users in. I was wondering if this already has some type of limit (on login attempts) built in, and if not, what would be the best way to implement this (save attempts in the database, a cookie, etc.). If this is not what's meant by limits, can somebody elaborate?
Edit: I think a good way to do this would be to freeze logins for a period of time (say 15 minutes), or display a captcha, after a handful (10 or so) of unsuccessful login attempts.
3) Caching... Like I said, this is my first site built around user content. Considering that, should I cache it?
4) Backups... How often should I back up my (MySQL) database, and how should I back it up (MySQL export?).
The site is currently up, yet not finished, if anybody wants to look at it and see if something pops out to you that should be looked at/fixed. Clashing Thoughts.
If there is anything else I overlooked that's not already in the list linked to above, please let me know.
Edit: If anybody has any advice on getting the word out (marketing), I'd appreciate that too.
Thanks.
EDIT: I've made the changes, and the site is now live.
1) Most sites that incorporate frequent updates, or that have a massive update that will take some time, use a beta domain such as beta.example.com that is restricted to staff until the changes are released to the main site for the public.
2) If you use cookies, users can just disable cookies and have infinite login attempts, so your efforts would go to waste. So yes, use the database instead. How you want it to keep track is up to you (a rough sketch follows this answer).
3) It depends on what type of content it is and how much there is. If you have a lot of different variables, keep only the key variables that identify the data in the database and keep the additional data in a cache, so that database queries run faster. You will be able to quickly find the results you want and then just open the cache file associated with them.
4) It's up to you, it really depends on traffic. If you're only getting 2 or 3 new pieces of data per day, you probably don't want to waste the time and space backing it up every day. P.S. MySQL exports work just fine, I find them much easier to import and work with.
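On point 2, a rough sketch of the database approach, with a made-up login_attempts table and an assumed limit of 10 failures per 15 minutes:

    <?php
    // Assumed schema: login_attempts(username, ip, attempted_at).

    // Call before verifying the password.
    function tooManyAttempts(PDO $pdo, $username) {
        $stmt = $pdo->prepare(
            'SELECT COUNT(*) FROM login_attempts
             WHERE username = ? AND attempted_at > NOW() - INTERVAL 15 MINUTE');
        $stmt->execute(array($username));
        return $stmt->fetchColumn() >= 10;   // lock out, or show a captcha
    }

    // Call after a failed password check.
    function recordFailure(PDO $pdo, $username) {
        $pdo->prepare('INSERT INTO login_attempts (username, ip, attempted_at)
                       VALUES (?, ?, NOW())')
            ->execute(array($username, $_SERVER['REMOTE_ADDR']));
    }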
1) You will want to keep taking your site down for updates to a minimum. I tend to let jobs build up, and then do a big update at the end of the month.
2) In terms of limiting login attempts: cookies would be simple to implement but are not fool-proof; they will stop the majority of users but can be easily circumvented, so it would be best to choose another way. Using a database would be better, though a bit more complicated to implement, and it adds a little strain on the database.
3) Caching depends greatly on how often content is updated or changes. If content changes a lot, it may not be worth caching; but if it is largely static, then something like memcache or APC will be of use.
4) You should always make regular backups. I do one daily via a cron job to my home server, although a weekly one would suffice (a sketch of such a script follows below).
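For point 4, a minimal sketch of a backup script that a daily cron entry could call; mysqldump availability, credentials, and paths are all assumptions:

    <?php
    // backup.php - e.g. crontab entry: 0 3 * * * php /path/to/backup.php
    // All values below are placeholders.
    $db   = 'mydatabase';
    $user = 'backup_user';
    $pass = 'secret';
    $file = sprintf('/backups/%s-%s.sql.gz', $db, date('Y-m-d'));

    $cmd = sprintf('mysqldump -u%s -p%s %s | gzip > %s',
        escapeshellarg($user),
        escapeshellarg($pass),
        escapeshellarg($db),
        escapeshellarg($file));
    exec($cmd, $output, $status);

    if ($status !== 0) {
        // A real script would mail or log the failure in more detail.
        error_log("Backup of $db failed with exit code $status");
    }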
Side notes: YSlow indicates that:
you are not serving expires headers on your CSS or images (this makes pages load more slowly and costs you more bandwidth)
you have CSS files that are not served with gzip compression (same issues)
also consider moving your static content (CSS, images, etc.) to a separate domain (CDN) for faster load times

Improve performance of website

I have designed a new website and hosted it online. I want it to perform as well as possible and load pages fast.
The website is built in PHP 5.0+ using CodeIgniter, with MySQL as the database. It has images, and I am using the Nitobi grid for displaying sets of records on a page. The rest is all normal page controls.
As I am not very experienced with website performance factors, I would like to get suggestions and details on factors that can improve the performance of a website. Please let me know how I can improve performance.
Also, please let me know if there are any ways to measure the performance of a website, and any websites or tools to help test it.
To start, get Firefox and Firebug and then install YSlow. YSlow gives great information about the client-side performance of the website in question. Here's a User Guide.
For the server-side performance, have a look at Apache JMeter.
Have you looked into opcode caching, APC, memcache etc? As another has said, you need to time the loading of your pages and try to find potential SQL bottlenecks and/or scripts that can be refactored. You may also want to look at getting something like webgrind installed so you can see what happens on a page load and how long each process takes.
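As an example of the APC side of that advice, its user cache works much like memcache but lives in the web server's own memory; the key and query function below are made up:

    <?php
    // Cache an expensive result in APC's user cache for 5 minutes.
    $menu = apc_fetch('site_menu', $hit);
    if (!$hit) {
        $menu = buildMenuFromDatabase();   // hypothetical expensive query
        apc_store('site_menu', $menu, 300);
    }
    echo $menu;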
You can see the loading times of the main page and the components it contains with the Net tab in the already mentioned Firebug addon for Firefox. There you can see whether a page is slow because it has a lot of external content (like user-added images) or because of the page itself.
In the first case there is not much you can do except remove the content that takes the most time; in the second case you will need to look at your PHP code, bearing in mind that most of the time performance issues in PHP applications come down to imperfect database interaction (badly written queries, repeated queries where one would suffice, etc.).
Profiling is the key word in the world of performance optimization.
To profile your site you have to measure two different areas: PHP script running time and whole-page load time (including pictures, JavaScript, style sheets, etc.). Measuring PHP scripts is quite easy. The easiest way is to place this line at the top of your page:
$TIMER['start'] = microtime(true);
and this line at the bottom:
echo "Time taken: " . round(microtime(true) - $TIMER['start'], 3);
If it stays below 0.1 sec, it's OK. Now for whole-page loading: I don't know offhand of an HTTP sniffer with response-time recording.
Edit: it looks like Firebug's Net tab mentioned above is the right tool for this.
Like Kevin said, I suggest trying opcode caching with PHP. I'm not sure which is currently the best, but when I looked it up a year ago I decided to go with [eAccelerator][1], and it works great. I've also used APC on another server, but I prefer eAccelerator.
You should probably go with Col. Shrapnel's advice and do some profiling as well.
[1]: http://en.wikipedia.org/wiki/EAccelerator eAccelerator
From the server perspective:
- as others wrote, use a PHP accelerator (I use APC, which is supposed to become standard in PHP)
- take care of the database; the number of queries, the complexity of queries, the amount of data in the result set, etc. can have a big impact
- cache dynamic pages
And from the browser perspective:
- minimize the number of JS and CSS files (one of each is ideal); put CSS in the head, JS at the bottom
- avoid calling 3rd-party JavaScript (analytics, widgets, ...)
- check the size of your images (I use smush.it)
The impact of these can be huge, cf. the tests I ran on my (WordPress-based) site.
If you have time to play, try HipHop, developed and used by Facebook.
Page generated in 0.0074 secs.
DB runtime 0.0006 secs (7.87 %) using 1 DB queries, 7 DB cache fetches, 3 RSS cache fetches and 61.88 K memory.
http://i42.tinypic.com/2m31frp.jpg
ouch !!
don't bump - this is his benchmark ;)
This site will measure an integrated performance mark for your site, as well as give you some relevant advice. All you have to do is type in the URL.
I would suggest giving Clicktale a try. I've been using it for 2 months and it is neat to watch what your users do; I learned a lot.
