I don't know where to begin and need some guidance...
Looking for a simple page-hit counter for a directory website. Each page of the directory (100+ pages) will have its own publicly visible hit counter. Thanks in advance!
The question is a bit vague as it stands. It is not clear whether you are looking for an existing hit counter solution (and possibly help implementing it on your site) or whether you want to code your own.
In either case you should try to get some things sorted out before you start:
What are the requirements for the hit counter? Or to put it differently: What features should it offer?
Should it just count the overall hits of every page or should it provide a finer resolution, e.g. showing the hits per day?
Should it provide an option to not show or not count hits on some special pages?
etc.
Hit counters usually use some kind of database to store the hit counts for the various pages. What kinds of databases are available on your server that could be used for that purpose?
Once these questions are answered, you could either look for an existing solution that meets your requirements or start working on your own implementation. It is usually easier to work towards an end if you have a clear goal in mind. (Maybe you already have that, but then the question does not show it.)
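Just to make the database-backed option concrete, here is a minimal sketch, assuming PHP with PDO and SQLite 3.24+ (the file name, table name, and schema are placeholder choices):
<?php
// Minimal per-page hit counter sketch: one row per directory page.
$db = new PDO('sqlite:' . __DIR__ . '/hits.db');
$db->exec('CREATE TABLE IF NOT EXISTS page_hits (page TEXT PRIMARY KEY, hits INTEGER NOT NULL DEFAULT 0)');

// Identify the current page.
$page = $_SERVER['SCRIPT_NAME'];

// Upsert: create the row on the first hit, increment it afterwards
// (ON CONFLICT needs SQLite 3.24+).
$stmt = $db->prepare('INSERT INTO page_hits (page, hits) VALUES (:page, 1)
    ON CONFLICT(page) DO UPDATE SET hits = hits + 1');
$stmt->execute([':page' => $page]);

// Read back and show the publicly visible counter.
$stmt = $db->prepare('SELECT hits FROM page_hits WHERE page = :page');
$stmt->execute([':page' => $page]);
echo 'Page hits: ' . (int)$stmt->fetchColumn();
?>
Whether something this simple is enough comes back to the requirement questions above.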
OK, I didn't know how to phrase this question as a title, so here is a quick description.
Let's say I have a site with some promotional stuff to give away for free (or not :)).
When I have something to give away, I announce it on Facebook, Twitter, etc., and people can come to the website and fill in a quick form: a couple of quick questions and, of course, their name and address.
But the problem is that I have, for example, only 20 pieces of this thing to give away.
When you submit the form, the entry goes automatically into a database table.
I know how to display the current status of the offer with some PHP (like: "there are only 12 items left, hurry up!"), and there is also no problem with refreshing this every couple of seconds with AJAX. But the problem I see is when, let's say, this becomes more popular and I have many offers running within a short time.
I don't want the database to be overloaded with queries from hundreds of people every two seconds.
Is there any way to send just one query every two seconds (somewhere on the server?) and somehow push the updated value from that query to every browser currently visiting the website?
I'm not sure if this is a clear question, but what I'm asking is: what would be the best practice for this kind of situation?
Is my concern about overloading the database even reasonable?
And an extra problem...
In this particular situation, with a limit on the number of people who can participate, is there any risk of strange behavior when two people submit the form at exactly the same time and there is only one item left?
I would love to see any pointers on this subject. Even general ones will do :)
PS: No, English is not my first language :)
Thx
Couldn't you have one script load the count from the database and save it somewhere that isn't the database (preferably a file), and then serve it from there? This would involve a cronjob running that script every 5 seconds.
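A sketch of that flow, which also covers the "two people, one item left" race from the question; the database credentials, table, and file paths are all assumptions:
<?php
// submit.php -- the claim path. Assumes a MySQL table offers(id, remaining).
// The "remaining > 0" condition makes the decrement atomic, so two people
// submitting at the same instant cannot both take the last item.
$db = new PDO('mysql:host=localhost;dbname=promo', 'user', 'pass');
$stmt = $db->prepare('UPDATE offers SET remaining = remaining - 1
    WHERE id = :id AND remaining > 0');
$stmt->execute([':id' => 1]);
if ($stmt->rowCount() === 1) {
    // Claim succeeded: store the form data.
} else {
    // Nothing left: tell the user the offer is gone.
}
?>

<?php
// cache.php -- run periodically (cron's floor is one minute; for a finer
// interval, loop and sleep inside a long-running script instead).
$db = new PDO('mysql:host=localhost;dbname=promo', 'user', 'pass');
$remaining = $db->query('SELECT remaining FROM offers WHERE id = 1')->fetchColumn();
file_put_contents('/var/www/cache/offer_1.txt', (string)$remaining, LOCK_EX);
?>

<?php
// status.php -- the endpoint the browsers poll every two seconds.
// It only reads a flat file, so the polling never touches the database.
echo file_get_contents('/var/www/cache/offer_1.txt');
?>
This way the database sees one SELECT per interval, no matter how many visitors are polling.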
I'm developing a website that is sensitive to page visits. For instance, it has sections that will show users which parts of the website (which items) have been visited the most. To implement this feature, two strategies come to mind:
Create a page hit counter, sort the pages by the number of visits and pick the highest ones.
Create a Google Analytics account and use its info.
If I choose the first strategy, I would need a very fast and accurate hit counter with the ability to distinguish unique IPs (or users). I believe that using MySQL wouldn't be a good choice, since a lot of page visits means a lot of DB locks and performance problems. I think a fast logging class would be a better fit.
The second option seems very interesting once all the problems of the first one emerge, but I don't know whether Google Analytics offers a way (like an API) to access the information I want. And if it does, is it fast enough?
Which approach (or even an alternative approach) do you suggest I take? Which one is faster? Performance is my top priority. Thanks.
UPDATE:
Thank you. It's interesting to see the different answers. These answers reminded me of an important factor. My website updates the "most visited" items every 8 minutes, so I don't need the data in real time, but I do need it to be accurate enough every 8 minutes or so. What I had in mind was this:
Log every page visit to a simple text log file
Send a cookie to the user to separate unique users
Every 8 minutes, load the log file, collect the info, and update the MySQL tables (roughly as sketched below).
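Roughly like this (the log path, cookie name, and tab-separated format are just placeholder choices):
<?php
// On every page view: one cheap append, no database involved.
if (empty($_COOKIE['uid'])) {
    $uid = bin2hex(random_bytes(8)); // PHP 7+; crude unique-visitor id
    setcookie('uid', $uid, time() + 86400 * 365);
} else {
    $uid = $_COOKIE['uid'];
}
file_put_contents('/var/log/app/hits.log',
    time() . "\t" . $uid . "\t" . $_SERVER['REQUEST_URI'] . "\n",
    FILE_APPEND | LOCK_EX);
?>

<?php
// aggregate.php -- cron, every 8 minutes: rotate the log, count unique
// users per page, then fold the totals into the MySQL tables.
rename('/var/log/app/hits.log', '/var/log/app/hits.processing');
$unique = [];
foreach (file('/var/log/app/hits.processing') as $line) {
    list($ts, $uid, $page) = explode("\t", trim($line));
    $unique[$page][$uid] = true;
}
foreach ($unique as $page => $users) {
    // ... UPDATE the "most visited" table with count($users) for $page ...
}
unlink('/var/log/app/hits.processing');
?>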
That said, I wouldn't want to reinvent the wheel. If a 3rd party service can meet my requirements, I would be happy to use it.
Given you are planning to use the page hit data to determine what to display on your site, I'd suggest logging the page hit info yourself. You don't want to be reliant upon some 3rd-party service that you'd have to interrogate in order to build your page. This is especially true if you are loading that data in real time, as you'd have to interrogate that service for every incoming request to your site.
I'd be inclined to save the data yourself in a database. If you're really concerned about the performance of the inserts, you could investigate intercepting requests (I'm not sure how you go about this in PHP, but I'm assuming it's possible) and passing the request data off to a separate thread to store it. By having a separate thread handle the logging, you won't delay your response to the end user.
Also, given you are planning to use the data collected to "... show the users which parts of the website (which items) have been visited the most", you'll need to think about accessing this data to build your dynamic page. It may be best to store a consolidated count for each resource: for example, rather than having 30000 rows showing that index.php was requested, have one row showing that index.php was requested 30000 times. This would certainly be quicker to reference than performing queries on what could become quite a large table.
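Concretely, a sketch of that consolidated-count idea, assuming MySQL (the table and column names are made up):
<?php
// One row per resource, incremented atomically on each request.
// Assumed schema: CREATE TABLE page_counts (page VARCHAR(255) PRIMARY KEY,
//                                           hits INT NOT NULL DEFAULT 0);
$db = new PDO('mysql:host=localhost;dbname=site', 'user', 'pass');
$stmt = $db->prepare('INSERT INTO page_counts (page, hits) VALUES (:page, 1)
    ON DUPLICATE KEY UPDATE hits = hits + 1');
$stmt->execute([':page' => $_SERVER['SCRIPT_NAME']]);

// Building the "most visited" section is then one cheap, small query.
$top = $db->query('SELECT page, hits FROM page_counts ORDER BY hits DESC LIMIT 10')->fetchAll();
?>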
Google Analytics has latency, and it samples some of the data returned by the API, so that's out.
You could try the API from Clicky. Bear in mind that:
Free accounts are limited to the last 30 days of history, and 100 results per request.
There are many examples of hit counters out there, but it sounds like you didn't find one that met your needs.
I'm assuming you don't need real-time data. If that's the case, I'd probably just read the data out of the web server log files.
Your web server can distinguish IP addresses, but there's no fully reliable way to distinguish users. I live in a university town; half the dormitory students share the same university IP address. I think Google Analytics relies on cookies to identify users, but shared computers make that somewhat less than 100% reliable. (That might not be a big deal, though.)
"Visited the most" is also a little fuzzy. The easy way out is to count every hit on a particular page as a visit, but a "visit" of 300 milliseconds is of questionable worth. (The visitor probably realized they clicked the wrong link and hit the "back" button before the page even rendered.)
Unless there are requirements I don't know about, I'd probably start by using awk to extract timestamp, ip address, and page name into a CSV file, then load the CSV file into a database.
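Assuming the Apache combined log format (adjust the field numbers for your own format), that extraction can be a one-liner:
awk '{ gsub(/^\[/, "", $4); print $4 "," $1 "," $7 }' access.log > hits.csv
That emits timestamp, IP address, and page name as one CSV row per hit, ready to bulk-load into the database.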
I have a WordPress plug-in, and users are requesting a view counter feature.
I know only a few approaches to building a view counter, and the problem is that I want to optimize for performance and memory.
I have done a small amount of research, and it seems mod_log_mysql may be a great approach, but I have no prior knowledge of how this module works, nor any idea how to connect it to the WordPress plug-in.
Or I could use a database: when the page is viewed, an UPDATE or an INSERT (which is said to be faster than an UPDATE) occurs.
Thus my options are:
1. UPDATE/INSERT on the server side when the page is called for.
2. Research mod_log_mysql further and find a way to connect it to the plug-in.
3. Find a premade view counter.
If there is a better approach, I would like to hear it, in the hope that it will solve my problem.
It really depends on what you want to achieve and how much time you want to spend on it.
If you need more than a simple view counter per page/event, go for a premade one.
If you need something simple, I would go for option #1.
If you're worried about performance, use a MEMORY table for staging the counts, and then have a PHP script periodically move them into a regular table (e.g. via a cronjob). I wouldn't expect updating a view counter in a MEMORY table to have any significant performance impact.
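A sketch of that staging pattern, assuming MySQL (the table and column names are invented; note that MEMORY tables are volatile, so counts not yet flushed are lost on a restart):
<?php
// Assumed one-time setup:
//   CREATE TABLE view_staging (post_id INT PRIMARY KEY, views INT NOT NULL) ENGINE=MEMORY;
//   CREATE TABLE view_counts  (post_id INT PRIMARY KEY, views INT NOT NULL) ENGINE=InnoDB;
$db = new PDO('mysql:host=localhost;dbname=wp', 'user', 'pass');

// On each page view: a cheap in-memory upsert.
$postId = get_the_ID(); // WordPress helper for the current post
$db->prepare('INSERT INTO view_staging (post_id, views) VALUES (:id, 1)
    ON DUPLICATE KEY UPDATE views = views + 1')
   ->execute([':id' => $postId]);

// In the cronjob: fold the staged counts into the durable table, then reset.
$db->exec('INSERT INTO view_counts (post_id, views)
    SELECT post_id, views FROM view_staging
    ON DUPLICATE KEY UPDATE view_counts.views = view_counts.views + VALUES(views)');
$db->exec('DELETE FROM view_staging');
?>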
Option #2 could easily fall into the premature optimization category.
Option #1 seems by far the easiest and probably the most efficient. There is little overhead in making a single extra call to a database whose connection is already open from other operations on the page.
I'm struggling with a conceptual question. When you have a forum with thousands of posts and/or threads, how do you retrieve all those posts to display on your site? Do you connect to your database every time someone visits your page, capture every post in an array, and display it? Surely that would be very taxing on your server and would cause a whole bunch of unnecessary database reads. Can anyone shine some light on this topic?
Thanks.
You never retrieve all those posts at once. In most cases, forums show a page of X threads/posts, and you just get those X threads/posts from the database each time a page is served. An RDBMS is pretty good at this. A forum is (or should be) quite dynamic, so it does generate a fair load on the database, but this is what databases are made for: storing and retrieving data.
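For illustration, a typical thread listing fetches just one page per request (the table and column names here are assumed):
<?php
// Fetch page N of a forum's threads; the database only ever returns 25 rows.
$perPage = 25;
$page    = max(1, (int)(isset($_GET['page']) ? $_GET['page'] : 1));
$offset  = ($page - 1) * $perPage;

$db = new PDO('mysql:host=localhost;dbname=forum', 'user', 'pass');
$db->setAttribute(PDO::ATTR_EMULATE_PREPARES, false); // so LIMIT params bind as integers

$stmt = $db->prepare('SELECT id, title, last_post_at FROM threads
    WHERE forum_id = :forum
    ORDER BY last_post_at DESC
    LIMIT :limit OFFSET :offset');
$stmt->bindValue(':forum', 1, PDO::PARAM_INT);
$stmt->bindValue(':limit', $perPage, PDO::PARAM_INT);
$stmt->bindValue(':offset', $offset, PDO::PARAM_INT);
$stmt->execute();
$threads = $stmt->fetchAll();
?>
With an index on (forum_id, last_post_at), this stays cheap for the pages people actually read.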
One new(ish) way of doing this is to use a document-oriented database like CouchDB, where everything about an individual post is stored in the same document, and that document gets loaded on request.
It seems a document-oriented database would work very well for a forum or blog-type site.
As far as relational databases go, I'm pretty sure the database gets hit every time the page loads unless some sort of caching is implemented (though then you'd have to worry about data getting stale, which brings up a whole new mess of problems).
Don't worry a lot about stale data. Facebook doesn't... their database is only "eventually consistent". The idea is this: making sure the comments are always 100% up to date is very expensive, and it does put a large load on your DB. As Serty says, that's what the DB is made for, but whether your physical box is sufficient for the load is another matter.
Facebook and Digg, to name a few, took a different approach... Is it really all that important that every load of every page be 100% accurate? How many page loads actually result in every single comment being read by the end user anyway? It's a lot cheaper to get the comments right "most" of the time, where "most" is something you get to decide. Is a 10% chance of a page with missing comments OK? Is a 1% chance? How many nodes need to have the right data right now? When I write a new comment, how many nodes have to confirm they got the update for it to count as successful?
I like the idea behind Cassandra, which is, in summary: "How much are we willing to spend to get Aunt Martha's comment about her nephew's baptism picture 100% correct?"
That's a fine question for a free website, but it wouldn't work so well for a business application.
I was searching the web for this but found no satisfying answer.
I am not talking about the time it takes the browser to render and display.
Only the part where the HTML is generated on the server itself.
<?php
$script_start = microtime(true); // microtime(true) returns the current time as a float
#CODE
echo (microtime(true) - $script_start); // elapsed server-side generation time in seconds
?>
What is the accepted/normal generation time for web pages?
Let's say the page has a calendar, a poll, content, menus (with submenus), and some other modules.
Is it okay if it is less than 0.05 seconds?
What do you think, what is the highest normal/accepted time it should take?
I've got this bit of string; how long should it be?
Your page will take as long as it needs to, based on what you're trying to do, how you're trying to do it, what platform you're running on, whether you're marshalling data from third-parties and a thousand and one other unknowable variables.
There will be an upper limit on what your users find acceptable, and if you find yourself frequently breaching that bound, then you could try some workarounds, e.g. caching data, lowsrc, asynchronous elements, etc.
But as it stands, there's no specific answer to this general question.
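That said, if caching turns out to be the workaround you need, even something as blunt as a whole-page cache can buy a lot; a sketch (the cache path and the 60-second lifetime are arbitrary choices):
<?php
// Serve a saved copy of the page if it is fresh enough; otherwise
// generate it, send it, and save it for the next visitor.
$cacheFile = __DIR__ . '/cache/' . md5($_SERVER['REQUEST_URI']) . '.html';
if (is_file($cacheFile) && time() - filemtime($cacheFile) < 60) {
    readfile($cacheFile);
    exit;
}
ob_start();
// ... expensive page generation goes here ...
$html = ob_get_flush(); // sends the output and returns a copy of it
file_put_contents($cacheFile, $html, LOCK_EX);
?>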
You should read this story about Google's measurements on this very topic.
I think there is no such thing as a highest accepted time. As @Johannes points out, it depends on how many users you have. Execution speed matters for Facebook - they even wrote a compiler for it :) There are some nice benchmarks at http://www.phpbench.com/ and some optimization tips at http://phplens.com/lens/php-book/optimizing-debugging-php.php
There's no correct answer to this, satisfying or otherwise. You should obviously aim to render the HTML in as little time as possible, but you can't put an arbitrary figure on how long that should be.
Having said that, if your pages are rendering in less than 0.05 seconds I don't think you've got anything to worry about!
That's one of the so-called "non-functional requirements". Too often they're forgotten. Others are "how often may my page crash?", "what's the desired uptime?", and "should the page look different when printed?"...
You should take a look at how your PHP will be used: is it going to be called from other web pages, or is it a stand-alone app? Will the user be bothered if the HTML generation becomes the bigger part of the latency?...
It's usually more productive to observe the following:
How long database queries take
How long it takes to get data from off site requests
... which individually add up to the single page load time. There's no sense in measuring how long a page takes to load if you can't narrow down the bottlenecks.
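A crude but effective way to get that breakdown (the query, the URL, and an already-open PDO connection $db are assumed):
<?php
// Time each suspect section separately instead of the whole page.
$timings = [];

$t = microtime(true);
$rows = $db->query('SELECT id, title FROM posts LIMIT 10')->fetchAll(); // a database query
$timings['database'] = microtime(true) - $t;

$t = microtime(true);
$feed = file_get_contents('https://example.com/feed'); // an off-site request
$timings['offsite'] = microtime(true) - $t;

// Log the breakdown; the biggest number is your bottleneck.
error_log(json_encode($timings));
?>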
At more than a second or two, someone is likely to start fiddling with their back or refresh buttons, or just close the browser tab. Again, that's subjective and based on my idea of how a "typical someone" expects things to work.