I want to record the number of hits to my pages, and I'm considering either plain file-based storage or SQLite.
File-based option
Each page has its own file with a unique name, and the file contains only an integer that is incremented on every visit. If I open a file in write mode I can update it, but is it possible to avoid closing it, to save the cost of opening and closing the file on every request?
SQLite option
A simple table with columns PageName and Hits. Same question again: is it possible to avoid closing the connection, to save the cost of opening and closing the db on every request?
Google Analytics. Unless you need to do it in-house.
Rolling your own solution in PHP can be as simple or as complicated as you like. You can create a table that stores the IP address (not always reliable), the page location, and a date. That lets you track unique hits for each day. You may want to schedule a task that reduces the records to a single row of date, numOfUnique.
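For example, a minimal sketch of that table and the per-visit insert, assuming a PDO connection and made-up names (page_hits, daily_uniques):

<?php
// CREATE TABLE page_hits (
//     id       INT AUTO_INCREMENT PRIMARY KEY,
//     ip       VARCHAR(45) NOT NULL,   -- 45 chars also fits IPv6
//     page     VARCHAR(255) NOT NULL,
//     hit_date DATE NOT NULL
// );

$pdo = new PDO('mysql:host=localhost;dbname=stats', 'user', 'pass');

$stmt = $pdo->prepare('INSERT INTO page_hits (ip, page, hit_date) VALUES (?, ?, CURDATE())');
$stmt->execute([$_SERVER['REMOTE_ADDR'], $_SERVER['REQUEST_URI']]);

// The scheduled "reduce" task could then collapse each day into one row:
// INSERT INTO daily_uniques (hit_date, num_unique)
//     SELECT hit_date, COUNT(DISTINCT ip) FROM page_hits GROUP BY hit_date;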
Another method is parsing your log files. You could do this every 24 hours or so as well.
If you really have to do this in-house, you should go with the SQLite method. While there's a little overhead (from opening the database file), there are also notable benefits to storing structured data.
You could for example add a date field, and then get daily/hourly/monthly/whatever data for each page.
You could also add the IP address for each visitor, and then extract visits data. That way you could easily extract data about your site users' behaviours.
You could also store your visitors' user-agent and OS, so you know what browsers you should target or not.
All in all, inserting that kind of data into a database is trivial, and you can learn a lot from it if you take some time to study it. For that reason, databases are usually the way to go, since they're easy to manipulate.
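As a rough sketch of how little code that takes with PDO's SQLite driver (table and column names are made up):

<?php
$db = new PDO('sqlite:' . __DIR__ . '/hits.sqlite');
$db->exec('CREATE TABLE IF NOT EXISTS hits (
    page       TEXT NOT NULL,
    ip         TEXT,
    user_agent TEXT,
    visited_at DATETIME DEFAULT CURRENT_TIMESTAMP
)');

$stmt = $db->prepare('INSERT INTO hits (page, ip, user_agent) VALUES (?, ?, ?)');
$stmt->execute([
    $_SERVER['REQUEST_URI'],
    $_SERVER['REMOTE_ADDR'],
    isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : null,
]);

// Later you can slice the data however you like, e.g. daily totals per page:
// SELECT page, DATE(visited_at) AS day, COUNT(*) FROM hits GROUP BY page, day;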
It's not possible in either of your cases. A PHP application runs when a user requests something from it, generates the result, and then shuts down. So even if you don't close the DB connection or the file, they will be closed automatically anyway. But why would opening a DB connection or a file for writing be a problem in the first place?
It's difficult to give particularly useful answers without knowing how much traffic you're expecting (other than Jonathan Sampson's comment that you might be better off using Google Analytics).
File-based option:
I don't think it's possible to keep the file open. Also, you'll probably bump into concurrent write problems unless you employ some kind of locking mechanism.
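If you do go the file route, something like the following flock()-based sketch avoids two requests clobbering each other (the file name scheme is just an example):

<?php
$file = __DIR__ . '/counters/' . md5($_SERVER['REQUEST_URI']) . '.txt';

$fp = fopen($file, 'c+');               // create if missing, don't truncate
if ($fp !== false) {
    if (flock($fp, LOCK_EX)) {          // exclusive lock against concurrent writers
        $count = (int) stream_get_contents($fp);
        ftruncate($fp, 0);
        rewind($fp);
        fwrite($fp, (string) ($count + 1));
        flock($fp, LOCK_UN);
    }
    fclose($fp);
}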
SQLite option:
I think this is probably the way to go, if you've not already got a database open. I doubt that opening/closing the db each time will be a bottleneck - try it and profile.
Related
I am creating a web-based app for Android and I've reached the point of building the account system. Previously I stored all the data for a person in a text file located at users/<name>.txt. Now I'm thinking of doing it in a database (as you probably should), but wouldn't that take longer to load, since it has to look for the row where the name equals the input?
So, my question is: is it faster to read the data from a text file, which is easy to open because its location is known, or would it be faster to get the information from a database, even though it would have to scan line by line until it reaches the one with the correct name?
I don't care about security; I know the first option isn't safe at all. It doesn't really matter in this case.
Thanks,
Merijn
In any question about performance, the first answer is usually: Try it out and see.
In your case, you are reading a file line-by-line to find a particular name. If you have only a few names, then the file is probably faster. With more lines, you could be reading for a while.
A database can optimize this using an index. Do note that the index will not have much effect until you have a fair amount of data (tens of thousands of bytes). The reason is that the database reads the records in units called data pages. So, it doesn't read one record at a time, it reads a page's worth of records. If you have hundreds of thousands of names, a database will be faster.
Perhaps the main performance advantage of a database is that after the first time you read the data, it will reside in the page cache. Subsequent access will use the cache and just read it from memory -- automatically, I might add, with no effort on your part.
The real advantage of a database is that it gives you the flexibility to easily add more data, to log interactions, and to store other types of data that might be relevant to your application. On the narrow question of just searching for a particular name, if you have at most a few dozen, the file is probably fast enough. The database is more useful for a large volume of data and because it gives you additional capabilities.
A bit of googling turned up this question: https://dba.stackexchange.com/questions/23124/whats-better-faster-mysql-or-filesystem
I think the answer suits this one as well.
The file system is useful if you are looking for a particular file, as operating systems maintain a sort of index. However, the contents of a txt file won't be indexed, which is one of the main advantages of a database. Another is understanding the relational model, so that data doesn't need to be repeated over and over. Another is understanding types. If you have a txt file, you'll need to parse numbers, dates, etc.
So - the file system might work for you in some cases, but certainly not all.
That's where database indexes come in.
You may wish to take a look at How does database indexing work? :)
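For a concrete (hypothetical) example of what an index buys you with a users table and a name column:

<?php
$db = new PDO('sqlite:' . __DIR__ . '/users.sqlite');

// Create the index once; after that, lookups by name use the index
// instead of scanning every row (or every line of a text file).
$db->exec('CREATE INDEX IF NOT EXISTS idx_users_name ON users (name)');

$stmt = $db->prepare('SELECT * FROM users WHERE name = ?');
$stmt->execute(['merijn']);
$user = $stmt->fetch(PDO::FETCH_ASSOC);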
It is quite a simple decision: use a database.
Not because it's faster or slower, but because it has mechanisms to prevent data loss and corruption.
A failed write to the text file can happen, and you will lose a user's profile info.
With a database engine it's much more difficult to lose data like that.
EDIT:
Also, a big question: is this about the server side or the app side?
Because on the app side you realistically won't have more than 100 users per smartphone... More likely you will have 1-5 users who share the phone and thus need their own profiles, and in the majority of cases you will have a single user.
I need to create a visitor counter for my websites and I'm wondering if it is better to store and read the information from a txt file located somewhere in my host or directly from the database.
Using a database would mean a DB entry is created for every single visitor who accesses the site, and honestly I don't think that would be OK.
File counter: when you just need a count.
DB counter: when you need visit tracking, dependencies, analysis, aggregation.
Reading a file is really fast when the file is small. Still, there can be race conditions when the site is heavily loaded, and it's hard to work with linked data if you ever need to. For those needs there is a great solution: a database management system.
A database (with a good design) lets you avoid race conditions. It's also a better solution for large amounts of linked, structured data, and a better fit when you need to log visits, referrers, etc...
DB suggestions: you might store the counter in one row of a global_settings table and update it on each page visit, or you might derive it by registering each visit in a visit table (with additional data such as IP, date/time, user ID, etc.) and running SELECT COUNT(*) FROM visit;.
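A rough sketch of both suggestions, assuming an existing PDO connection in $db (table and column names such as global_settings and visit are just examples):

<?php
// 1) Single global counter row, bumped on every page view.
$db->exec('UPDATE global_settings SET counter = counter + 1');

// 2) One row per visit, counted on demand.
$stmt = $db->prepare('INSERT INTO visit (ip, page, visited_at) VALUES (?, ?, NOW())');
$stmt->execute([$_SERVER['REMOTE_ADDR'], $_SERVER['REQUEST_URI']]);

$total = $db->query('SELECT COUNT(*) FROM visit')->fetchColumn();

The second form costs an extra row per visit but keeps the per-visit detail, which is what makes the later analysis possible.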
There is another related topic here.
Loading anything from text files is pretty bad practice. Using a database is the better solution. Databases are meant to store large amounts of data, so it is perfectly acceptable.
I've got the following user data for my site:
1) Hit IP, location
2) Login name, attempt info
3) Download attempt info
Is it better to keep this information in individual files, per day, or in a database?
Well there is a trade-off.
With files, they are easy to move around, easy to export to other systems and simple to parse.
With a database you have easy searching, reporting, and security.
It is up to you to balance those priorities.
Files are the optimal store for logging data. You don't need to access it daily, nor do you usually want to run queries on it (that would be the deciding factor). Typical log data is textual and informative, not structured, and the file API is best suited to appending log entries. (It's also faster, but you shouldn't base the decision on performance alone.)
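A minimal append-style logging sketch; the path and line format here are placeholders:

<?php
// One line per event; FILE_APPEND adds to the end of the file,
// LOCK_EX guards against concurrent writers clobbering each other.
$line = sprintf("%s %s %s\n", date('c'), $_SERVER['REMOTE_ADDR'], $_SERVER['REQUEST_URI']);
file_put_contents('/var/log/myapp/access-' . date('Y-m-d') . '.log', $line, FILE_APPEND | LOCK_EX);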
Keep them in the database; it's easier and more efficient, especially if you're reading the data back for statistics or anything else. A database also avoids file write locks. There is a limited set of things that should be stored in files: mostly large amounts of binary data, e.g. images.
A side note: make sure the login name points at the (or a) primary key of the users table (be that an auto-increment integer column or the username itself; whatever you set it to). This will improve performance.
Files are much preferred: on top of what previous users said, you usually log at the beginning of the script, so if your database takes too long it will affect the performance of the script; logging to a file will be faster in many cases.
And if fast and convenient reporting is needed, you can always load the data from the file into a database when it's needed, or as a cron job.
As the title of the question suggests, my question is simple: which one is better in terms of performance, given that I'm on Linux shared hosting (SiteGround)? I'm capable of coding both; I actually coded one that updates the DB, but from reading around, some people suggested inserting rather than updating. Any feedback is much appreciated.
Thank you.
Use a database! Since you will have multiple people accessing your site, writing to one file will either mean blocking or having the count overwritten.
By using a database and inserting, you don't have to wait for other clients and you safely allow concurrent access. You just get the count with SELECT COUNT(*) FROM countTbl.
What are you storing in the database? If it's just that one number (the page counter), I would not use a database, but if you are storing data for each visitor, a database is the way to go.
First of all, the website I run is hosted and I don't have access to be able to install anything interesting like memcached.
I have several web pages displaying HTML tables. The data for these HTML tables are generated using expensive and complex MySQL queries. I've optimized the queries as far as I can, and put indexes in place to improve performance. The problem is if I have high traffic to my site the MySQL server gets hammered, and struggles.
Interestingly - the data within the MySQL tables doesn't change very often. In fact it changes only after a certain 'event' that takes place every few weeks.
So what I have done now is this:
Save the HTML table once generated to a file
When the URL is accessed check the saved file if it exists
If the file is older than 1 hour, run the query and save a new file; if not, output the file
This ensures that for the vast majority of requests the page loads very fast, and the data is at most 1 hour old. For my purposes this isn't too bad.
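In code, that check might look roughly like this (the cache path and the function that builds the table are placeholders):

<?php
$cacheFile = __DIR__ . '/cache/report.html';
$maxAge    = 3600; // one hour, in seconds

if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAge) {
    readfile($cacheFile);                   // fresh enough: serve the saved table
} else {
    $html = buildTableFromExpensiveQuery(); // hypothetical: runs the slow MySQL query
    file_put_contents($cacheFile, $html);
    echo $html;
}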
What I would really like is to guarantee that if any data changes in the database, the cache file is deleted. This could be done by finding all scripts that do any change queries on the table and adding code to remove the cache file, but it's flimsy as all future changes need to also take care of this mechanism.
Is there an elegant way to do this?
I don't have anything but vanilla PHP and MySQL (recent versions) - I'd like to play with memcached, but I can't.
Ok - serious answer.
If you have any sort of database abstraction layer (hopefully you will), you could maintain a field in the database for the last time anything was updated, and manage that from a single point in your abstraction layer.
e.g. (pseudocode): On any update set last_updated.value = Time.now()
Then compare this to the time of the cached file at runtime to see if you need to re-query.
If you don't have an abstraction layer, create a wrapper function to any SQL update call that does this, and always use the wrapper function for any future functionality.
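A minimal sketch of such a wrapper, assuming PDO and an invented meta table that holds a last_updated row:

<?php
// Wrapper for any data-changing query: runs it, then bumps the marker.
// (The PDO handle, the "meta" table and its columns are all assumptions.)
function run_update(PDO $db, $sql, array $params = array())
{
    $db->prepare($sql)->execute($params);
    // Single point of truth for "the data changed at ...".
    $db->exec("REPLACE INTO meta (k, v) VALUES ('last_updated', NOW())");
}

// At render time, compare that timestamp with the cache file's mtime:
$db          = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$cacheFile   = __DIR__ . '/cache/report.html';
$lastUpdated = strtotime($db->query("SELECT v FROM meta WHERE k = 'last_updated'")->fetchColumn());
$stale       = !is_file($cacheFile) || filemtime($cacheFile) < $lastUpdated;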
There are only two hard things in Computer Science: cache invalidation and naming things.
—Phil Karlton
Sorry, doesn't help much, but it is sooooo true.
You have most of the ends covered, but a last_modified field and cron job might help.
There's no way of deleting files from MySQL; Postgres would give you that facility, but MySQL can't.
You can cache your output to a string using PHP's output buffering functions. Google it and you'll find a nice collection of websites explaining how this is done.
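In outline, and assuming the surrounding page code is a function you can call, that approach looks something like this (names are placeholders):

<?php
$cacheFile = __DIR__ . '/cache/page.html';

if (is_file($cacheFile) && time() - filemtime($cacheFile) < 3600) {
    readfile($cacheFile);   // serve the cached copy and stop
    exit;
}

ob_start();                              // start capturing everything echoed below
render_page_with_expensive_queries();    // hypothetical: your existing page/table code
$html = ob_get_clean();                  // stop capturing and grab the output as a string

file_put_contents($cacheFile, $html);
echo $html;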
I'm wondering, however: how do you know that the data expires after an hour? Or are you assuming the data won't change dramatically enough within 60 minutes to warrant constant page regeneration?