I need to count page views (from any URL on my site, including search pages) and show them on my site, but I can't manage to make it work. I wanted to show the number of times a page is loaded daily, but at this point I don't really care whether I get page views, unique visitors, or any other kind of visit count, as long as I have some kind of counter.
Is there an easy way to do it?
Thanks
Yeah. The easiest way is to use Google Analytics.
I would suggest one of the free web statistics programs out there to just analyze your web logs. They'll be more fully featured than just counting visits, and there will be no overhead of database transactions just because someone is visiting a page.
http://awstats.sourceforge.net/
First, I have to say it: displaying the number of views is SO 2000.
Now, on to the actual question: you'll have to identify each page and work out how many forms its URL can take:
/?p=1
/?p=1&q=2
/?p=1&s=1
Those paths might all refer to the same object, so you'll have to grab the URL and normalize it where necessary. Then just save it to a table in your database and increase the counter each time a new view comes in.
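For illustration, here's a minimal sketch of that approach in PHP. The table name, the column layout, and the choice of p as the identifying parameter are all assumptions; adjust them to your own schema and URL structure.

```php
<?php
// Rough sketch only. Assumes a table created roughly like:
//   CREATE TABLE page_views (
//       path  VARCHAR(255) NOT NULL,
//       day   DATE NOT NULL,
//       views INT NOT NULL DEFAULT 0,
//       PRIMARY KEY (path, day)
//   );
$pdo = new PDO('mysql:host=localhost;dbname=mysite', 'user', 'password');

// Normalize the request so /?p=1&q=2 and /?p=1&s=1 count as the same page (if that's what you want).
$parts = parse_url($_SERVER['REQUEST_URI']);
parse_str($parts['query'] ?? '', $query);
$pageId = $query['p'] ?? ($parts['path'] ?? '/');

// Insert a row for today or bump the existing counter.
$stmt = $pdo->prepare(
    'INSERT INTO page_views (path, day, views) VALUES (?, CURDATE(), 1)
     ON DUPLICATE KEY UPDATE views = views + 1'
);
$stmt->execute([$pageId]);

// Read the counter back to display it on the page.
$stmt = $pdo->prepare('SELECT views FROM page_views WHERE path = ? AND day = CURDATE()');
$stmt->execute([$pageId]);
echo 'Views today: ' . (int) $stmt->fetchColumn();
```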
Following Visonary Software Solutions' track, I would use a Google Analytics-based solution too; perhaps you will be using Analytics on your site anyway. I did a quick search and found a tutorial on how to create the kind of counter you want, displaying Analytics data. It doesn't look too complicated.
http://www.webresourcesdepot.com/feedcount-like-google-analytics-counter/
As far as I can tell, there are quite a lot of extensions for this purpose for the popular CMSs:
For Drupal: http://drupal.org/project/google_analytics_counter
For WordPress: http://analytics.blogspot.com/2009/05/share-your-google-analytics-data-with.html
I am making a site which is designed to contain a large number of links to other sites, much like a search engine. I have seen two different approaches with regard to linking to external sites.
Simply link directly to the external content from the links on my own site
Redirect to the content via an internal link, such as www.site.com/r/myref123 -> www.internet.com/hello.php
Would anyone be able to tell me what the advantage is with each approach? I am stuck at a crossroads here and can't find much information on which approach I should be using.
This is a bit opinion-based, so I wouldn't be surprised if the question gets closed, but I think the second option is better by FAR.
The reason is that you are then able to track who clicks through to what. You are also able to run some fancy code that the user will never see, such as internally ranking sites that generate a lot of click-through traffic when presented in a list of choices.
Lastly, and most importantly, if you are ever going to include links that generate some sort of income, you need to be able to track those clicks. If you simply present them and do nothing more, you will have no way to bill your advertisers.
You may want to track when, by whom, and from where links are clicked. If you put one of your own pages between the original page and the linked one, you will be able to do so.
If you use the 1st option, every click goes straight to the referenced page and you won't have any way to track it.
If you use the 2nd option, you will be able to track visitors' clicks through a script located at www.site.com/r/myref....
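A minimal sketch of such an in-between script, assuming a table that maps reference codes to target URLs and another that stores the clicks (all names here are placeholders):

```php
<?php
// Hypothetical handler behind a URL like www.site.com/r/myref123 (e.g. via a rewrite rule).
$pdo = new PDO('mysql:host=localhost;dbname=mysite', 'user', 'password');

$ref = $_GET['ref'] ?? '';

// Look up the external URL this reference code points to.
$stmt = $pdo->prepare('SELECT target_url FROM outbound_links WHERE ref_code = ?');
$stmt->execute([$ref]);
$target = $stmt->fetchColumn();

if ($target === false) {
    header('HTTP/1.1 404 Not Found');
    exit;
}

// Record the click before sending the visitor on their way.
$stmt = $pdo->prepare(
    'INSERT INTO link_clicks (ref_code, ip, referer, clicked_at) VALUES (?, ?, ?, NOW())'
);
$stmt->execute([$ref, $_SERVER['REMOTE_ADDR'], $_SERVER['HTTP_REFERER'] ?? '']);

// Send the visitor on to the external site.
header('Location: ' . $target, true, 302);
exit;
```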
I want to extract data from a PHP forum based on keywords I entered.
Is there something ready-made that can do this?
Just to give an example:
Kadinlarkulubu.com/forum.php
Keywords: ios, android
For each matching message I want to get the date, time, message text, URL of the message, the keyword found in the message, and the nick of the member who wrote it.
I need to work with different forums, so I need one or more tools that will work on the big platforms like vBulletin.
You need to create your own web crawler. If you want it to work on various platforms, you will have to create variations of that crawler.
To start, pick your favourite forum and give it a seed page (the page where it starts crawling). Tread carefully, since you may need to be logged in to see posts, and if that's the case it may not be easy (making a crawler that logs in and breaks a captcha, for example). You can also make use of the search functionality (since many forums have search URLs similar to ?q=your_tag&p=1), which could make things a lot easier.
Just check that you stay on the same domain and that you don't go into an infinite loop; other than that, you should be fine.
Expect this to be a long term project :)
The alternative would be using an API, if the forum provides one, but I doubt you will be so lucky.
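To make the search-URL idea concrete, here's a rough sketch in PHP. The search URL format and the XPath expression are pure placeholders; every forum skin needs its own parsing rules, and on a real run you should respect robots.txt and throttle your requests.

```php
<?php
// Hypothetical example: fetch one page of search results for a keyword and list the post links.
$keyword = 'android';
$page    = 1;
$url     = 'http://example-forum.com/search.php?q=' . urlencode($keyword) . '&page=' . $page;

$html = file_get_contents($url);
if ($html === false) {
    die('Could not fetch ' . $url);
}

// Parse the HTML with DOMDocument rather than regular expressions.
$doc = new DOMDocument();
@$doc->loadHTML($html);              // @ silences warnings about sloppy forum markup
$xpath = new DOMXPath($doc);

// Placeholder XPath: adjust the class name to whatever the target forum actually uses.
foreach ($xpath->query('//a[contains(@class, "searchresult")]') as $link) {
    $href  = $link->getAttribute('href');
    $title = trim($link->textContent);
    echo $title . ' -> ' . $href . PHP_EOL;
    // From each post page you would then extract date, author and message body the same way.
}
```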
There are 2 ways.
The easy way is only possible if the owner of the forum gives you access to the forum's API (if it has one) or to the database.
The extremely hard way is to make a grabber that reads the forum page by page and parses the information you want into something you can use.
For a homework project, I'm creating a PHP-driven website whose main function is aggregating news about various university courses.
The main problem is this: (almost) each course has its own website. These are usually just plain HTML or built using some simple free CMS.
As a student participating in 6-7 courses, almost every day you go through 6-7 websites checking whether there is any news. The idea behind the project is that you don't have to do that; instead, you just check the aggregation site.
My idea is the following: each time a student logs in, go through his course list. For every course, fetch its website (recursively, like with wget) and create a hash value of it. If the hash differs from the one stored in the database, we know the site has changed, and we notify the student.
So, what do you think, is this a reasonable way to achieve the functionality?
And if so, what is (technically) the best way to go about it? I was looking at php_curl, but I don't know whether it can fetch a website recursively.
Furthermore, there's a slight problem: I have somewhat limited resources, only a few MB of quota on a public (university) server. However, if that's a big problem, I could use a separate hosting solution.
Thanks :)
Just use file_get_contents, or cURL if you absolutely have to (in case you need COOKIES).
You can use your hashing trick to check for modifications, but it's not very elegant. What you really want to know is when the site was last changed. I doubt that information is on the website itself, but maybe they offer an RSS feed or some web service or API you can use for this purpose.
Don't worry about doing recursive requests. Just make a new request each time.
"When all else fails, build a scraper"
I've been asked to create a custom 'tracker' in PHP, to know where users are coming from and where they are going on the site.
I'm thinking of writing a simple script which connects to a database, writes the IP, browser, and time of the visit, then closes the DB link (rough sketch below).
Is this the right way to do it ?
I've found a few similar questions on stackoverflow, but none mentioned performance.
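Roughly what I have in mind (the visits table and its columns are just placeholders):

```php
<?php
// Included at the top of every page. Assumed schema, just for illustration:
//   CREATE TABLE visits (
//       id         INT AUTO_INCREMENT PRIMARY KEY,
//       ip         VARCHAR(45),
//       user_agent VARCHAR(255),
//       visited_at DATETIME
//   );
$pdo = new PDO('mysql:host=localhost;dbname=mysite', 'user', 'password');

$stmt = $pdo->prepare(
    'INSERT INTO visits (ip, user_agent, visited_at) VALUES (?, ?, NOW())'
);
$stmt->execute([
    $_SERVER['REMOTE_ADDR'],
    $_SERVER['HTTP_USER_AGENT'] ?? '',
]);

// PDO closes the connection when the script ends, so no explicit close is needed.
```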
Is there a reason you can't use a solution such as Google Analytics? It's free and has some nice features, such as heat maps which show traffic flow.
The main disadvantage is that it requires you to embed some JavaScript on all the pages, which means that it's client-side.
I suppose it's another question of the kind "I want superior performance, but I have no particular reason for it".
In fact, any solution will be fast enough, as writing a log entry is not a heavy operation.
The only thing to keep in mind is not to use any indexes if an SQL database is used.
That's all.
So, let's put the performance stuff aside.
The only complete solution would be analyzing web-server logs.
Any other method will not give you the complete picture. Say some image of yours is hotlinked on other sites and causes heavy load because of that: you'd never notice it if you log only requests to PHP scripts.
So, you can run a crontab-based script every night that parses the access logs and gives you comprehensive information on all user and bot activity.
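A rough sketch of what such a nightly parser could look like, assuming Apache's combined log format and a placeholder log path:

```php
<?php
// Hypothetical nightly cron job: parse the access log and count hits per IP.
// Extend the captures to also pull out URLs, referers, user agents, and so on.
$logFile = '/var/log/apache2/access.log'; // placeholder path

$pattern = '/^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\S+) "([^"]*)" "([^"]*)"/';

$hitsPerIp = [];

$handle = fopen($logFile, 'r');
if ($handle === false) {
    exit('Cannot open ' . $logFile);
}

while (($line = fgets($handle)) !== false) {
    if (preg_match($pattern, $line, $m)) {
        // $m[1]=ip, $m[2]=time, $m[3]=method, $m[4]=url, $m[5]=status, $m[7]=referer, $m[8]=user agent
        $hitsPerIp[$m[1]] = ($hitsPerIp[$m[1]] ?? 0) + 1;
    }
}
fclose($handle);

// Print the 20 busiest clients (users and bots alike).
arsort($hitsPerIp);
foreach (array_slice($hitsPerIp, 0, 20, true) as $ip => $hits) {
    echo $ip . "\t" . $hits . PHP_EOL;
}
```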
Check out Piwik or New Relic; if you need more customization, you should take a look at Webalizer and Visitors.
N.B.: You can customize Piwik by creating plugins: http://geekmonkey.org/articles/34-how-to-write-a-piwik-plugin
Perhaps you need some special software like Webalizer? (It's free and quite powerful.)
Performance is easy to say but much harder to define. It depends on a zillion circumstances, and while I may say "this is the best performance I can get", you might say "hey, what's this?".
Personally, I recommend Google Analytics. It does almost everything you need (plus plenty of things you don't). Maybe you can get a small 'performance' boost by hosting its source locally, but there's a good chance it's already cached in your users' browsers anyway.
Or, if you prefer open-source solutions, give Piwik a shot.
Piwik does just that, and it does it very well. There is also a Tracking API that you can use to track a lot of things about your visitors, using PHP or any other language (REST API). See more information at http://piwik.org/docs/tracking-api/
Also, it is very modular and fast. Don't reinvent the wheel :)
I have a community site which has around 10,000 listings at the moment. I am adopting a new URL strategy, something like
example.com/products/category/some-product-name
As part of the strategy, I am implementing a sitemap. Google already has a good index of my site, but the URLs will change. I use a PHP framework which accesses the DB for each product listing.
I am concerned about the performance effects of supplying 10,000 new URLs to Google; should I be?
A possible solution I'm looking at is rendering my PHP-outputted pages to static HTML pages. I already have this functionality elsewhere on the site. That way, Google would index 10,000 HTML pages. The beauty of this system is that if a user arrives via Google at one of those HTML pages, as soon as they start navigating around the site, they jump straight back into the PHP version.
My problem with this method is that I would have to append .html onto my nice clean URLs...
example.com/products/category/some-product-name.html
Am I going about this the wrong way?
Edit 1:
I want to cut down on PHP and MySQL overhead. Creating the HTML pages is just a method of caching in preparation for a load spike when the search engines crawl those pages. Are there better ways?
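For what it's worth, a minimal sketch of that kind of static caching using output buffering. The cache directory, the lifetime, and keying on the request URI are assumptions, and it isn't tied to any particular framework:

```php
<?php
// At the very top of the product page script: serve a cached copy if it is fresh enough.
$cacheDir  = __DIR__ . '/cache';   // placeholder cache directory (must exist and be writable)
$cacheFile = $cacheDir . '/' . md5($_SERVER['REQUEST_URI']) . '.html';
$maxAge    = 3600;                 // one hour, adjust as needed

if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAge) {
    readfile($cacheFile);          // serve the pre-rendered HTML, no PHP/MySQL work at all
    exit;
}

ob_start();

// ... the normal PHP/MySQL page generation goes here ...

// At the very bottom: write the rendered page to the cache, then send it to the client.
file_put_contents($cacheFile, ob_get_contents(), LOCK_EX);
ob_end_flush();
```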
Unless I'm missing something, I think you don't need to worry about it. I'm assuming that your list of product names doesn't change all that often -- on a scale of a day or so, not every second. The Google site-map should be read in a second or less, and the crawler isn't going to crawl you instantly after you update. I'd try it without any complications and measure the effect before you break your neck optimizing.
You shouldn't be worried about 10,000 new links, but you might want to analyze your current Google traffic to see how fast Google would crawl them. Caching is always a good idea (see Memcache, or even generating static files).
For example, I currently get about 5 requests/second from Googlebot, which means Google would crawl those 10,000 pages in a good half hour. But consider the following:
Redirect all existing links to new locations
By doing this, you ensure that links already indexed by Google and other search engines are rewritten almost immediately. The current Google rank is migrated to the new link (brand-new links start with a score of 0); see the sketch below.
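A minimal sketch of such a redirect in PHP (the old entry point and the lookup are placeholders; in practice you would resolve the old ID to the new URL via your database, or do it with mod_rewrite rules instead):

```php
<?php
// Hypothetical legacy entry point, e.g. /product.php?id=123
$productId = (int) ($_GET['id'] ?? 0);

// Placeholder lookup: in a real setup, resolve $productId to its new clean URL here.
$newUrl = 'http://example.com/products/category/some-product-name';

// Permanent redirect so search engines transfer the old URL's ranking to the new one.
header('HTTP/1.1 301 Moved Permanently');
header('Location: ' . $newUrl);
exit;
```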
Google Analytics
We have noticed that Google uses Analytics data to crawl pages it usually wouldn't find with normal crawling (JavaScript redirects, links in logged-in user content). Chances are Google would pick up on your URL change very quickly, but see point 1.
Sitemap
The rule of thumb for sitemap files in our case is to keep them updated only with the latest content. Keeping 10,000 links, or even all of your links, in there is pretty pointless. How would you keep such a file up to date?
It's a love-hate relationship between me and the Google crawler these days: the links users hit most are pretty well cached, but the things the Google crawler requests usually are not. That's why Google causes 6x the load with 1/6th of the requests.
Not an answer to your main question.
You don't have to append .html. You can leave the URLs as they are. If you can't find a better way to map the clean URL to the HTML file (which does not have to have an .html suffix), you can output it via PHP with readfile.
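A minimal sketch of that idea (the cache path convention and the front-controller setup are assumptions):

```php
<?php
// Hypothetical front controller: /products/category/some-product-name stays the public URL,
// while the pre-rendered file lives under a private cache directory without any .html suffix.
$path      = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
$cacheFile = __DIR__ . '/static-cache/' . md5($path);

if (is_file($cacheFile)) {
    header('Content-Type: text/html; charset=utf-8');
    readfile($cacheFile);   // stream the cached HTML straight to the client
    exit;
}

// Otherwise fall through to the normal PHP/MySQL rendering path.
```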
I am concerned about the performance effects of supplying 10,000 new URLs to Google, should I be?
Performance effects on Google's servers? I wouldn't worry about it.
Performance effects on your own servers? I also wouldn't worry about it. I doubt you'll get much more traffic than you used to; you'll just get it sent to different URLs.