I have a table article with many articles, and I want to track the number of viewers for each one. Here's how I plan to do it.
I have a table viewers with columns: id, ip, date, article_id.
article_id is a foreign key referring to the id of the article. So, when a user opens an article, his/her IP address is stored in the table.
Is this a decent approach? It's for a personal site, not a very big website.
EDIT: I want to print the number of views on each article page.
It depends on how frequently you need to display the number of viewers. Your general query will be:
select count(*) from viewers
where article_id='10'
Over time, your viewers table will grow. Say it has a million records after a year or two. If you are showing the number of viewers on each article page, or listing the articles with the most viewers, it will start to impact performance even though the foreign key is indexed. However, that will only happen after you have added hundreds of articles, each with thousands of viewers.
A better-optimized solution is to keep the number of viewers in the article table and use that to display results. The existing viewers table is still necessary to prevent duplicate entries (the same user reading an article ten times must be counted as one view, not ten).
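A minimal sketch of that denormalization; the viewer_count column name is an assumption:

```sql
-- cached counter on the article table
ALTER TABLE article ADD COLUMN viewer_count INT NOT NULL DEFAULT 0;

-- on each new (non-duplicate) row in viewers, bump the cache
UPDATE article SET viewer_count = viewer_count + 1 WHERE id = 10;

-- displaying the count is now a single-row lookup, no COUNT(*) scan
SELECT viewer_count FROM article WHERE id = 10;
```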
Use a tool like Google Analytics. It does the job in a much more elaborate way and you're up and running in minutes; there's more to unique visitors than IP addresses!
If you want an on-premise solution, look at Piwik, a PHP framework for exactly this purpose.
There is one problem with this design: if the same user opens the article again and again, you either have to check before inserting the entry, or you will insert the same IP address multiple times with different timestamps.
Most popular sites count one IP address as one view, even if that user opens the article several times.
I can think of two solutions:
your approach with a single check: if the same client has opened it before, don't insert it;
or deduplicate by IP (e.g. COUNT(DISTINCT ip)) when you retrieve the counter.
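A minimal sketch of the insert-time check, assuming a viewers table as described (the unique key name is an assumption):

```sql
-- one row per (article, ip); duplicate inserts are silently skipped
ALTER TABLE viewers ADD UNIQUE KEY uniq_article_ip (article_id, ip);

INSERT IGNORE INTO viewers (article_id, ip, date)
VALUES (10, '203.0.113.7', NOW());

-- the count is then already duplicate-free
SELECT COUNT(*) FROM viewers WHERE article_id = 10;
```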
It depends on what you want to store in your database. If you want to know exactly how many unique users visited this particular article (including date and IP), this is a reasonable way to do it. But if you only need a number to show, you can alter the article table to include a new visit_counter column and store a cookie to prevent incrementing the counter for the same user.
Try something like this (assumes an open mysqli connection in $db and a UNIQUE key on ip, so REPLACE overwrites duplicates):
// insert
mysqli_query($db, "REPLACE INTO viewers (ip) VALUES (" . ip2long($_SERVER['REMOTE_ADDR']) . ")");
// retrieve
list($pageviews) = mysqli_fetch_row(mysqli_query($db, "SELECT COUNT(ip) FROM viewers"));
echo $pageviews;
Read: REPLACE INTO
Yes, this is a good approach if you create some kind of cache for how many views each article has had. It's not optimal to count the views every time a user opens the page.
You can approximate a materialized view in MySQL with a tool like Flexviews ( https://code.google.com/p/flexviews/ )
select article_id, count(*) as views from viewers group by article_id
Or you can cache the counts in files and refresh them every X minutes.
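A minimal sketch of the cached-counts idea, using a summary table refreshed by a cron job (table and column names are assumptions):

```sql
-- refreshed on a schedule, e.g. every X minutes
CREATE TABLE article_view_cache (
  article_id INT PRIMARY KEY,
  views      INT NOT NULL
);

-- recompute in one pass; REPLACE overwrites stale rows
REPLACE INTO article_view_cache (article_id, views)
SELECT article_id, COUNT(*) FROM viewers GROUP BY article_id;
```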
To record users who viewed an article, I suggest using AJAX. When a user opens the page, a separate 'thread' calls the server to add them as a viewer. Even if your db is slow, it will not slow down page loading, and web spiders will not be counted.
I have a large database of three million articles in a specific category. I'm launching a few sites that draw on this database, but my budget is low, so the best option for me is a shared host. The problem is that the hardware power a shared host gives each user is weak, so when I have to fetch a new post for a site, one that has not already been posted there, I'm in trouble. I used the following method to get new content from the database, but now, with the growing number of records, displaying the information in time takes more power than a shared host can give.
My previous method:
I have a table for the content,
and a statistics table that records which entries have already been posted to each site.
My query is included below:
SELECT * FROM postlink WHERE `source`='$mysource' AND NOT EXISTS (SELECT sign FROM `state` WHERE postlink.sign = state.sign AND `cite`='$mycite') ORDER BY `postlink`.`id` ASC LIMIT 5
I use MySQL.
I've tested different queries but did not get a good result; showing even a few more posts was very time-consuming.
I'd like you to offer a solution that lets me, with this number of posts on an ordinary shared host, show new content to a requesting site in the shortest possible time.
The problem appears when the posted-stats table grows too large, but if I empty it I'll have problems with sending duplicate content, so I have no choice but to keep the statistics table.
The statistics table currently has 500 thousand records for 10 sites.
Thanks all in advance.
Are you seriously calling 3 million articles a large database? PostgreSQL would not even start TOASTing values at that point.
Consider migrating to a more serious database where you can use partial indexes, table partitioning, materialized views, etc.
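Even before migrating, the NOT EXISTS query above can often be rewritten as an anti-join, which MySQL's optimizer tends to handle better when backed by a composite index (the index name is an assumption):

```sql
-- composite index so the anti-join probe is satisfied from the index
CREATE INDEX idx_state_sign_cite ON state (sign, cite);

SELECT p.*
FROM postlink p
LEFT JOIN state s
  ON s.sign = p.sign AND s.cite = '$mycite'
WHERE p.source = '$mysource'
  AND s.sign IS NULL        -- keep only rows with no match in state
ORDER BY p.id ASC
LIMIT 5;
```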
I have to create a "like" system (the name won't be "like"; Facebook owns it).
I imagined two ways to store these likes in my database, and I want to know which is better for a very high-traffic site.
Create a table comment_likes with "id", "comment_id", "user_id" columns. In the comments table store a "like_count" so I don't need to count likes when I want to display them. But likes are easy to give, so people will create lots of them, and if I need to list a specific comment's likes I have to search the whole comment_likes table for all the user_ids. This could be millions of rows in the future. If 1000 users do it at the same time, my system will die.
My second thought was to store likes in the comments table: create a column named "likes" holding a list of user_ids, like this: 1#34#21#56#...
When somebody likes/unlikes a comment, just CONCAT or REPLACE his/her id in this column, delimited by #. When I need to list a comment's likes, just explode the list on #.
I think the second could be faster and smarter, but what do you think?
The first option is much better, because you get the benefits of a relational setup. For example: what if you want to fetch the comments that user x has liked? With the first setup this is a fast and simple query. In the second case you would have to use LIKE, which is much slower and inaccurate. (Imagine the user id is 1 and the likes field of a comment contains #10 - LIKE '%1%' would match that comment too.)
And even for a high traffic site; just using an index on commentId would make this a fast operation.
So go for the first option.
If you really doubt the speed of the first option, you could add a "cache" field to the comments table holding the number of likes, so you don't have to run a subquery to select the like count.
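A minimal sketch of the relational option with the cached count (table and index names are assumptions):

```sql
CREATE TABLE comment_likes (
  comment_id INT NOT NULL,
  user_id    INT NOT NULL,
  PRIMARY KEY (comment_id, user_id),  -- one like per user, indexed lookup
  KEY idx_user (user_id)              -- for "what did user x like?"
);

-- both directions are index scans, even with millions of rows
SELECT user_id    FROM comment_likes WHERE comment_id = 42;
SELECT comment_id FROM comment_likes WHERE user_id = 7;

-- cached count on the comments table, bumped on each like
UPDATE comments SET like_count = like_count + 1 WHERE id = 42;
```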
I have a page on my site that shows a user if the column 'active' in my table is not null; otherwise it shouldn't bother.
Before I go further, though: say I have 100 users; will having this query on my page slow things down a lot? Basically the query runs on my index page, so whenever users visit, it pulls all of the data from my table...
while($r = $q->fetch(PDO::FETCH_LAZY)){
echo '<div class="user">'.$r["user"].'</div>';
}
The short answer is no: you should not see a significant delay for a reasonable number of users. However, if you are anticipating thousands of users, you might be better off using any number of practices to optimize performance, like:
Paging: Using a LIMIT clause with your queries and an assortment of links to provide access to all user records. This way, no matter how much your table may grow, performance will stay constantly high.
A simple text search form for the user name. Consider having to browse through a couple hundred users just to reach a particular one, versus having the server do that for you. If you go that way, don't forget to index the fields the search would apply to to ensure optimal performance.
Index the fields you're using as criteria in your query; this is covered by ndefontenay and I'm just mentioning it for completeness.
EDIT: Regarding your question, if what you want is to get all users where the field active is not null, you can do it like this:
SELECT * FROM User WHERE active IS NOT NULL
A small note: In terms of performance, it is best to specify the fields you want to retrieve instead of collecting all fields. If, for example, you wanted to retrieve the user name and email for each user, you could do it like:
SELECT username, email FROM User WHERE active IS NOT NULL
It looks more like a SQL question rather than a PHP question.
Your query looks like this:
select username, fullname, whatever_else from usertable where active = 1;
This should be a really fast query if you create an index on your active column:
create index idx_active
on usertable
(active desc);
I made the index descending so that the value 1 sorts first (in my example, 1 is active, 0 is inactive, and the active column is an integer).
I also assume that in the long run you will have far fewer active people than inactive ones, so you will always be querying a small subset of your usertable.
Hope this helps.
I see on sites that they sometimes have a statistic showing how many views an article or downloads a file had over the last week. I think download.com does something like this. I was wondering how they go about doing this. Are they actually keeping track of every days downloads or am I missing something really basic?
Are they doing something like having three rows called total_downloads, last_week_downloads, this_week_downloads. Then every week, copying the value of this_week_downloads to last_week_downloads and then resetting this_week_downloads to 0?
There are a couple of ways to do it, depending on what you're trying to get out of the stats.
One way is to include a visits column on your table, then just increase that number by 1 each time that article's page is loaded.
This however isn't very good for giving the past week's number of views. You can fix that in two ways:
1) another column in your table doing the same as visits, but run a cron job to put it back to 0 every week.
2) create another table holding article_id, ip_address and timestamp; you would insert a record each time someone visits the article, storing their IP address (allowing you to roughly distinguish page views from unique page views), and the timestamp allows you to query only a subset of those records. Note: this method lets you store more information for stats, but it requires a lot more server resources.
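A minimal sketch of option 2's weekly query (the table name article_views is an assumption):

```sql
-- views and unique views in the last 7 days for article 123
SELECT COUNT(*)                   AS views,
       COUNT(DISTINCT ip_address) AS unique_views
FROM article_views
WHERE article_id = 123
  AND timestamp >= NOW() - INTERVAL 7 DAY;
```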
The most basic way to do this is to add a MySQL field alongside your article in the database and just increment it.
Assuming you were retrieving article 123 from your database, you would have something like this in your code:
<?php
// this increments the number of views (assumes an open mysqli connection in $db)
$sql = "UPDATE table SET count_field = count_field + 1 WHERE id = 123";
mysqli_query($db, $sql);
...
?>
Platform: PHP5, MySQL
I have a web site that displays articles sorted into categories. I would like to track the number of views for each article, then, in the sidebar, display the top 5 most-viewed articles for the current day, the past week, and the last month. What do you think would be the best way to do that? One row in the database for each view (article_id, timestamp)? What would be the least amount of work for the server?
Thanks, joe
This can become a tricky problem. If you just store raw hits, your table will grow rapidly and crunching the numbers becomes more time consuming. So, one way to deal with this is to create aggregate tables and crunch the numbers using a cron job.
For example, you could have the following tables
hit_count: article_id, timestamp
hit_count_daily: day, year, article_id, hit_count
hit_count_weekly: week, year, article_id, hit_count
hit_count_monthly: month, year, article_id, hit_count
hit_count_yearly: year, article_id, hit_count
You then process the data in the hit_count table, add it to the aggregate tables, and then remove the data from the hit_count table.
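A minimal sketch of the daily rollup step of that cron job, assuming day stores the day of year and (day, year, article_id) has a unique key:

```sql
-- roll raw hits up into the daily table...
INSERT INTO hit_count_daily (day, year, article_id, hit_count)
SELECT DAYOFYEAR(timestamp), YEAR(timestamp), article_id, COUNT(*)
FROM hit_count
GROUP BY DAYOFYEAR(timestamp), YEAR(timestamp), article_id
ON DUPLICATE KEY UPDATE hit_count = hit_count + VALUES(hit_count);

-- ...then clear the raw rows (ignores hits arriving mid-rollup; fine for a sketch)
DELETE FROM hit_count;
```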
You also need to think about what happens if someone refreshes the page or if Google crawls the article. Do you want to count those as hits?
To keep crawlers from triggering hits, you could use some Javascript on the page to communicate back to your server and register the hit. This way, a normal browser will trigger the hit but a crawler will not.
You could also offload this task to another service, like Chartbeat or Clicky.
How about:
Add this to php.ini
auto_append_file = /server_root/footer.php
footer.php contains a silent routine to write the SQL data.