How to track my visitors? [best performance] - php

I've been asked to create a custom 'tracker' in PHP, to know where users are coming from and where they are going on the site.
I'm thinking of writing a simple script which connects to a database, writes the IP, browser, and time of the visit, then closes the DB link.
Is this the right way to do it?
I've found a few similar questions on Stack Overflow, but none mentioned performance.
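For reference, a minimal sketch of the approach described in the question might look like this. The `visits` table and its columns (ip, user_agent, referer, visited_at) are purely illustrative, not anything from the question:

    <?php
    // Minimal sketch of the tracker described above, not production code.
    // The `visits` table and connection details are hypothetical placeholders.
    $pdo = new PDO('mysql:host=localhost;dbname=tracking', 'user', 'password');

    $stmt = $pdo->prepare(
        'INSERT INTO visits (ip, user_agent, referer, visited_at) VALUES (?, ?, ?, NOW())'
    );
    $stmt->execute(array(
        $_SERVER['REMOTE_ADDR'],
        isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '',
        isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '',
    ));
    // The DB link is closed automatically when the script finishes.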

Is there a reason you can't use a solution such as Google Analytics? It's free and has some nice features such as heat maps which show traffic flow.
The main disadvantage is that it requires you to embed some JavaScript on all the pages - which means that it's client-side.

I suppose it's another question of the kind "I want superior performance, however I have no particular reason for it".
In fact, any solution will be fast enough, as writing logs is not a heavy operation.
The only thing one has to keep in mind is not to put any indexes on the log table (they slow down inserts) if an SQL database is used.
That's all.
So, let's put aside that performance stuff.
The only complete solution would be analyzing web-server logs.
Any other method will not give you the complete picture. Say, if some image is hotlinked on other sites and causes heavy load because of that, you'd never notice it if you log only requests to PHP scripts.
So, you can run crontab-based script running every night parsing access logs and getting comprehensive information of all users and bots activity.
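A rough sketch of such a nightly job, assuming the default Apache "combined" log format and a typical log path (both are assumptions; adjust them to your server's setup):

    <?php
    // Rough sketch of a nightly cron job parsing an Apache "combined" access log.
    // The log path and regex are assumptions; adapt them to your configuration.
    $logFile = '/var/log/apache2/access.log';
    $pattern = '/^(\S+) \S+ \S+ \[[^\]]+\] "(?:\S+) (\S+) [^"]*" (\d{3})/';

    $hits = array();
    foreach (file($logFile) as $line) {          // fine for a nightly batch job
        if (preg_match($pattern, $line, $m)) {
            $url = $m[2];                        // every request, images and bots included
            $hits[$url] = isset($hits[$url]) ? $hits[$url] + 1 : 1;
        }
    }
    arsort($hits);
    print_r(array_slice($hits, 0, 20, true));    // top 20 requested resources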

Check Piwik or New Relic; if you need more customization, you should take a look at Webalizer and Visitors.
N.B.: You can customize Piwik by creating plugins: http://geekmonkey.org/articles/34-how-to-write-a-piwik-plugin

Perhaps you need some special software like Webalizer? (It's free and quite powerful.)

Performance is easy to say but much harder to define. It depends on a zillion circumstances, and while I say "this is the best performance I can get", you might say "hey, what's this?".
Personally I recommend Google Analytics. It does almost everything you need (and plenty of things you don't). Maybe you can get a small 'performance' boost by hosting its source locally, but there's a chance it's already cached in the user's browser anyway.
Or, if you prefer open source solutions, give Piwik a shot.

Piwik does just that, and it does it very well. There is also a Tracking API that you can use to track a lot of things about your visitors, using PHP or any other language (REST API). See more information at http://piwik.org/docs/tracking-api/
It is also very modular and fast; don't reinvent the wheel :)
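For illustration, a server-side call with Piwik's PHP tracking client might look roughly like this. The file name, site ID, host URL and method names below are from memory of the PiwikTracker client; treat them as assumptions and check the Tracking API docs linked above:

    <?php
    // Sketch only: server-side tracking via Piwik's PHP client (PiwikTracker).
    // Site ID, host and file name are placeholders; see the Tracking API docs.
    require_once 'PiwikTracker.php';

    PiwikTracker::$URL = 'http://your-piwik-host/';
    $tracker = new PiwikTracker($idSite = 1);
    $tracker->doTrackPageView('Homepage');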

Related

Create HTML page from PHP

I am the webmaster of a dynamic website, and because of the many complicated queries that I have to run on the front page and some other pages, the server sometimes suffers from overload when the number of visitors to our website is high.
So, I got the idea to periodically (every 2 minutes) generate a static HTML snapshot of these pages. This would load the server just once per 2 minutes, and by just one user.
My question is: is this a good idea? Because I plan to generalize it over many other pages, and I don't want to be surprised and have to go back again.
If it isn't, are there any good ideas to avoid this load?
Thank you in advance
PS: I would maybe publish the method I use to do this, to see if there is a better way.
I don't think it's a bad idea, but you should use an existing caching solution rather than implementing your own. Why not use memcached? I think that's what you are looking for; just use it for those parts of your code that are taking a long time.
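A minimal sketch of that idea with PHP's Memcached extension; the cache key and the loadFrontPageData() function are hypothetical placeholders for your expensive queries:

    <?php
    // Sketch: cache the result of a heavy query for 120 seconds with Memcached.
    // loadFrontPageData() stands in for whatever expensive work you do now.
    $cache = new Memcached();
    $cache->addServer('127.0.0.1', 11211);

    $data = $cache->get('front_page_data');
    if ($data === false) {            // cache miss (or a stored false value)
        $data = loadFrontPageData();  // the expensive part, hit at most every 2 minutes
        $cache->set('front_page_data', $data, 120);
    }
    // ... render the page using $data ...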
Caching is a good idea to protect your server from overload. Many CMSes (content management systems) use this technique.
Sure, it's called caching :)
However, most sites cache just parts of their content. You can't cache a whole page if you are using user-specific content, for example the name of a logged-in user. But you can cache the heavy parts of your site and combine them with a dynamic page.
Your idea is really good, and many big websites use this concept. You can also use caching techniques: if you want to avoid database hits, caching will serve you well. For example, you can use Memcached: http://memcached.org/
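If you prefer to keep the "static snapshot" approach from the question rather than Memcached, a file-based version might look roughly like this. The cache path and the renderFrontPage() function are illustrative; the 2-minute window is taken from the question:

    <?php
    // Sketch: serve a cached HTML snapshot if it is younger than 2 minutes,
    // otherwise rebuild it. renderFrontPage() is a hypothetical placeholder.
    $cacheFile = __DIR__ . '/cache/front_page.html';

    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < 120) {
        readfile($cacheFile);          // fast path: no database work at all
        exit;
    }

    $html = renderFrontPage();         // the expensive, query-heavy rendering
    file_put_contents($cacheFile, $html, LOCK_EX);
    echo $html;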

Analytics, statistics or logging information for a PHP script

I have a WordPress plugin, which checks for an updated version of itself every hour with my website. On my website, I have a script running which listens for such update requests and responds with data.
What I want to implement is some basic analytics for this script, which can give me information like the number of requests per day, the number of unique requests per day/week/month, etc.
What is the best way to go about this?
Use some existing analytics script which can do the job for me
Log this information in a file on the server and process that file on my computer to get the information out
Log this information in a database on the server and use queries to fetch the information
Also there will be about 4000 to 5000 requests every hour, so whatever approach I take should not be too heavy on the server.
I know this is a very open ended question, but I couldn't find anything useful that can get me started in a particular direction.
Wow. I'm surprised this doesn't have any answers yet. Anyways, here goes:
1. Using an existing script / framework
Obviously, Google Analytics won't work for you since it is JavaScript based. I'm sure there exist PHP analytics frameworks out there. Whether you use them or not is really a matter of personal choice. Do these existing frameworks record everything you need? If not, do they lend themselves to being easily modified? You could use a good existing framework and choose not to reinvent the wheel. Personally, I would write my own just for the learning experience.
I don't know any such frameworks off the top of my head because I've never needed one. I could do a Google search and paste the first few results here, but then so could you.
2. Log in a file or MySQL
There is absolutely NO GOOD REASON to log to a file. You'd first log it to a file, then write a script to parse this file. Tomorrow you decide you want to capture some additional information. You now need to modify your parsing script. This will get messy. What I'm getting at is: you do not need to use a file as an intermediate store before the database. 4-5k write requests an hour (I don't think there will be a lot of read requests apart from when you query the DB) is a breeze for MySQL. Furthermore, since this DB won't be used to serve up data to users, you don't care if it is slightly un-optimized. As I see it, you're the only one who'll be querying the database.
EDIT:
When you talked about using a file, I assumed you meant to use it as a temporary store only until you process the file and transfer the contents to a DB. If you did not mean that, and instead meant to store the information permanently in files - that would be a nightmare. Imagine trying to query for certain information that is scattered across files. Not only would you have to write a script that can parse the files, you'd have to write a non-trivial script that can query them without loading all the contents into memory. That would get nasty very, very fast and tremendously impair your ability to spot trends in the data etc.
Once again - 4-5K might seem like a lot of requests, but a well-optimized DB can handle it. Querying a reasonably optimized DB will be orders of magnitude faster than parsing and querying numerous files.
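A sketch of what the database approach might look like; the table, column names and database credentials are illustrative, not anything from the question:

    <?php
    // Sketch: log every update-check request, then aggregate with plain SQL.
    // A possible (illustrative) table:
    //   CREATE TABLE update_requests (
    //       id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    //       ip VARCHAR(45) NOT NULL,
    //       requested_at DATETIME NOT NULL
    //   );
    $pdo = new PDO('mysql:host=localhost;dbname=plugin_stats', 'user', 'password');

    // On every request from the plugin:
    $stmt = $pdo->prepare('INSERT INTO update_requests (ip, requested_at) VALUES (?, NOW())');
    $stmt->execute(array($_SERVER['REMOTE_ADDR']));

    // Later, to pull the numbers out:
    //   requests per day: SELECT DATE(requested_at), COUNT(*)           FROM update_requests GROUP BY 1;
    //   uniques per day:  SELECT DATE(requested_at), COUNT(DISTINCT ip) FROM update_requests GROUP BY 1;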
I would recommend using an existing script or framework. It is always a good idea to use a specialized tool into which people have invested a lot of time and ideas. Since you are using PHP, Piwik seems to be one way to go. From the webpage:
Piwik is a downloadable, Free/Libre (GPLv3 licensed) real time web analytics software program. It provides you with detailed reports on your website visitors: the search engines and keywords they used, the language they speak, your popular pages…
Piwik provides a Tracking API and you can track custom variables. The DB schema seems highly optimized; have a look at their testimonials page.

Tricky website idea (not your average idea, and not a "will you program this for me?" request)

Again, to reiterate: this is not a request to program anything for me. I am looking for more experienced web developers to tell me if my idea is really doable, as it involves some pretty tough issues (at least, I think so). Please, if this post is to be closed, could I at least get a little advice on where I should be posting instead first?
Imagine: you visit a website (say malonssite.com). You sign in, and you get a double-paned window. The left side is a chat list (think FB buddy list). The right side is a "browser".
The chat list is populated by other people who have signed into malonssite.com AND are visiting the same page as you using the "embedded" browser.
Each user has the ability to "allow followers", at which point whatever site they visit, all their followers "follow".
(Image sketch omitted.)
My abilities:
PHP
MySQL
JavaScript (Node.js included, but that's more server-side, I guess)
I've done long polling and AJAX, but this gets complicated. I am thinking something like this might be best done in Flash? Or maybe an old-school Java applet? I am just not sure.
I am pretty confident I can make this thing on my own, I am just not sure what technology to use. I usually hit stumbling blocks in each area, normally along the lines of the same-origin policy. I know that JSONP can get around the SOP; however, is it powerful enough to do what I want? I am not familiar enough with it.
Sockets in general (websockets, flash sockets, etc) and node.js are pretty new to me, and I think they somehow hold the answer, I am just looking for some verification.
Thanks!
As I see it, you'll just need an iframe with a bit of JavaScript that reads its src and sends it to the server. So basically the user will stay on your own domain, browsing other websites in the iframe, and you will have no cross-origin-request issues.
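For what it's worth, the receiving end of that idea could be a tiny PHP endpoint like this sketch. The presence table, the session-based user id, and the posted url parameter are all illustrative assumptions:

    <?php
    // Sketch: the page (or a polling script) POSTs the URL currently shown in
    // the iframe, and we remember which user is on which page.
    // Assumes the user is already authenticated and their id is in the session.
    session_start();

    $pdo = new PDO('mysql:host=localhost;dbname=malonssite', 'user', 'password');
    $url = isset($_POST['url']) ? $_POST['url'] : '';

    // Upsert "this user is currently looking at $url"; presence table is assumed
    // to have user_id as its primary key.
    $stmt = $pdo->prepare(
        'REPLACE INTO presence (user_id, current_url, seen_at) VALUES (?, ?, NOW())'
    );
    $stmt->execute(array($_SESSION['user_id'], $url));

    // The chat list can then be built from users sharing the same current_url.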
You could use the APE engine for the server side, which is exactly meant for this sort of thing.
It is very possible.
Simple? No. But possible.
HTML/CSS/JS will easily take care of the front-end layout; that should be elementary.
Node.js is a good option, and would be best suited if you know that traffic will be heavy.
If traffic won't be heavy, I guess PHP is OK.
And you will also need a backend database... again, it depends on how many users you think you'll have. NoSQL ones would suit well, although Oracle just claimed they 'exponentially' improved MySQL performance.
But think about this idea carefully. The concept of allowing users to communicate if they're on the same page is neat - but they'd have to browse a site within your site... furthermore, you have to account for when the user presses the next/back button in the browser.
Perhaps you could make a fork of Firefox and implement this as software.
Did you mean something like talkita,
or any other solution from a Google search for "chat with others on same page"?
Some of them also allow followers (subscribers), etc.
Have a look, maybe you'll get an idea.
Please forget about Flash and Java applets...
I think this is a great idea and I hope you can get it to work.
I would really use Node.js + (Socket.IO | SockJS) for the server side and realtime communication; all your SOP problems will be gone.
As for the client side, just take care of cross-browser compatibility in the JavaScript and CSS.
For data persistence, some kind of NoSQL implementation: MongoDB or CouchDB, for example.

What might be the best way to benchmark a user's PC, PHP or JS?

PHP - Apache with CodeIgniter
JS - typical, with jQuery and an in-house lib
The Problem: Determining (without forcing a download) a user's PC ability and/or virus issues.
The Why: We put out software that is mostly used in clinics but can be used from home; however, we need to know, before they go to our main site, whether their PC can handle the demands of our web-based, browser-served software.
Progress: So far, we've come up with a decent way to test download speed, but that's about it.
What we've done: In PHP we create about a 2.5Gb array of data to send to the user in a view; from there the view calculates the time it took to get the data and then subtracts the PHP benchmark from this time in order to get a point of reference for upload/download time. This is not enough.
Some of our (local) users have been found to have "crappy" PCs or are virus-infected, and this can lead to two problems: (1) they crash in the middle of performing a task in our program, or (2) their viruses could be trying to inject into our JS, thus creating a bad experience that may make us look bad to the average (uneducated on how this stuff works) user, thus hurting "our" integrity.
I've done some googling around, but most plug-ins or advice forums/blogs I've found simply give ways to benchmark the speed of your JS, and that is simply not enough. I need a simple bit of code (no visual interface included - another problem I found with one nice piece of JS lib that did this, but it would take days to remove all of the author's personal visual code) that will allow me to test the following 3 things:
The user's data transfer rate (I think we have this covered, but if a better method is presented I won't rule it out)
The user's processing speed; how fast the computer is in general
A possible test for infection by malware, adware, or whatever may be harmful to the user's experience
What we are not looking to do: repair their PC! We don't care if they have problems, we just don't want to lead them into our site if they have too many problems. If they can't do it from home, then they will be recommended to go to their nearest local office to use this software "in house", so to speak.
Further Explanation
We know you can't test the user-side stuff with PHP; we're not that stupid. PHP is mentioned because it can still be useful, either in determining connection speed or in delivering a script that may do what we want. Also, this is not software for just anyone on the net to sign up for and use: if you find it online, unless you are affiliated with a specific clinic and have a login name and whatnot, you're not meant to use the site, and if you get in otherwise, it's illegal. I can't really reveal a whole lot of information yet, as the site is not live yet. What I can say is that it is mostly used by clinics/offices for customers to perform a certain task. If they don't have the time/transport/or otherwise and need to do it from home, then the option is available. However, if their home PC is not "up to snuff", it will be nothing but a problem for them and will turn the 2-hour task they are meant to perform into a 4-6 hour nightmare. Thus the reason I'm at one of my favorite question sites, asking if anyone may have had experience with this before and may know a good way to test the user's PC, so they can have the best possible resolution: either do it from home (as their PC is suitable) or be told they need to go to their local office. Hopefully this clears things up enough that we can refrain from the "sillier" answers. I need a REAL viable solution and/or suggestions, please.
PHP has (virtually) no access to information about the client's computer. Data transfer can just as easily be limited by network speed as computer speed. Though if you don't care which is the limiter, it might work.
JavaScript can reliably check how quickly a set of operations are run, and send them back to the server... but that's about it. It has no access to the file system, for security reasons.
EDIT: Okay, with that revision, I think I can offer a real suggestion - basically, compromise. You are not going to be able to gather enough information to absolutely guarantee one way or another that the user's computer and connection are adequate, but you can get a general idea.
As someone suggested, use a 10MB-20MB file and several smaller ones to test actual transfer rate; this will give you a reasonable estimate. Then, use JavaScript to test their system speed. But don't just stick with one test, because that can be heavily dependent on browser. Do the research on what tests will best give an accurate representation of capability across browsers; things like looping over arrays, manipulating (invisible) elements, and complex math. If there is a significant discrepancy between browsers, then use different thresholds; PHP does know what browser they're using, so you can give the system different "good enough" ratings depending on that. Limiting by version (like, completely rejecting IE6) may help in that.
Finally... inform the user. Gently. First let them know, "Hey, this is going to run a test to see if your network connection and computer are fast enough to use our system." And if it fails, tell them which part, and give them a warning. "Hey, this really isn't as fast as we recommend. You really ought to go down to the local clinic to perform this task; if you choose to proceed, it may take a lot longer than intended." Hopefully, at that point, the user will realize that any issues are on them, not on you.
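On the PHP side, the transfer-rate part can be served with something as simple as this sketch (the 10 MB figure follows the suggestion above; the actual timing would still have to happen in client-side JavaScript):

    <?php
    // Sketch: emit a fixed-size payload for a client-side download-speed test.
    // The 10 MB size is just a suggestion; tune it as needed.
    $bytes = 10 * 1024 * 1024;

    header('Content-Type: application/octet-stream');
    header('Content-Length: ' . $bytes);
    header('Cache-Control: no-store');   // make sure the browser actually downloads it

    $chunk = str_repeat('a', 8192);
    for ($sent = 0; $sent < $bytes; $sent += strlen($chunk)) {
        echo $chunk;
    }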
What you've heard is correct; there's no way to effectively benchmark a machine based on JavaScript - especially because the JavaScript engine mostly depends on the actual browser the user is using, amongst numerous other variables - and there are no file system permissions, etc. A computer is hardly going to let a browser's sub-process stress itself anyway; the browser would simply crash first. PHP is obviously out, as it's server-side.
Sites like System Requirements Lab have the user download a Java applet to run in its own scope.

PHP - detecting changes in external database-driven site

For a homework project, I'm creating a PHP driven website which main function is aggregating news about various university courses.
The main problem is this: (almost) each course has its own website. These are usually just plain HTML or built using some simple free CMS system.
As a student participating in 6-7 courses, almost every day you go through 6-7 websites checking whether there is any news. The idea behind the project is that you don't have to do that; instead, you just check the aggregation site.
My idea is the following: each time a student logs in, go through his course list. For every course, fetch its website (recursively, like with wget), and create a hash value of it. If the hash is different from the one stored in the database, we know that the site has changed, and we notify the student.
So, what do you think, is this reasonable way to achieve the functionality?
And if yes, what is (technically) the best way to go about this? I was checking php_curl, but I don't know if it can fetch a website recursively.
Furthermore, there's a slight problem: I have somewhat limited resources, only a few MB of quota on the public (university) server. However, if that's a big problem, I could use a separate hosting solution.
Thanks :)
Just use file_get_contents, or cURL if you absolutely have to (in case you need cookies).
You can use your hashing trick to check for modifications, but it's not very elegant. What you really want to know is when it was last changed. I doubt this information is on the website, but maybe they offer an RSS feed or some web service or API you can use for this purpose.
Don't worry about doing recursive requests. Just make a new request each time.
"When all else fails, build a scraper"
