Database vs XML File for minimal content - php

I am building a website for a client. The landing page has 4 areas of customizable content. The content is minimal, it's mainly just a reference to an image, an associated link, and an order...so 3 fields.
I am already using a lot of MySQL tables for the other CMS-related aspects, but for this one use I am wondering if a database table really is the best option. The table would have only 4 records and 3 columns. It's not going to be written to very often, just read as the landing page loads.
Would I be better off sticking to a MySQL table for storing this minimal amount of information, since it will fit into the [programming] workflow easily enough? Or would using an XML file that stores the information be a better way to go?
UPDATE: The end user (who knows nothing about databases) will be going through a web interface I create to choose one of the 4 items they want to update, then uploading an image from their computer, then selecting the link from a list of pages on the site (or offsite). The XML file or Database table will store the location of the image on the server and the link to wrap it in.

A database is the correct solution for storing dynamic data. However I agree it sounds like MySQL is overkill for this situation. It also means an entire other thing to administer and manage. But using a flat-file like XML is a bad idea too. Luckily SQLite is just the thing.
Let these snippets from the SQLite site encourage you:
Rather than using fopen() to write XML or some proprietary format into disk files used by your application, use an SQLite database instead. You'll avoid having to write and troubleshoot a parser, your data will be more easily accessible and cross-platform, and your updates will be transactional.
Because it requires no configuration and stores information in ordinary disk files, SQLite is a popular choice as the database to back small to medium-sized websites.
Read more here.
As for PHP code to interface with the database, use either PDO or the specific SQLite extension.
PDO is more flexible so I'd go with that and it should work out of the box - "PDO and the PDO_SQLITE driver is enabled by default as of PHP 5.1.0." (reference) Finding references and tutorials is super easy and just a search away.
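As a rough sketch of what the PDO route could look like for the four landing-page slots: the table name landing_slots, its column names, and the helper functions saveSlot() and loadSlots() are made up for illustration and are not from the question. It uses an in-memory database so it runs as-is; a real site would point the DSN at a file instead.

```php
<?php
// Hypothetical schema: one row per landing-page slot (order, image, link).
$pdo = new PDO('sqlite::memory:'); // e.g. 'sqlite:/path/to/site.db' for a real file
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE IF NOT EXISTS landing_slots (
    sort_order INTEGER PRIMARY KEY,
    image_path TEXT NOT NULL,
    link_url   TEXT NOT NULL
)');

// Insert or overwrite one slot inside a transaction.
function saveSlot(PDO $pdo, int $order, string $image, string $link): void
{
    $pdo->beginTransaction();
    try {
        $stmt = $pdo->prepare(
            'INSERT OR REPLACE INTO landing_slots (sort_order, image_path, link_url)
             VALUES (:o, :i, :l)'
        );
        $stmt->execute([':o' => $order, ':i' => $image, ':l' => $link]);
        $pdo->commit();
    } catch (Throwable $e) {
        $pdo->rollBack();
        throw $e;
    }
}

// Read all slots in display order.
function loadSlots(PDO $pdo): array
{
    return $pdo
        ->query('SELECT sort_order, image_path, link_url FROM landing_slots ORDER BY sort_order')
        ->fetchAll(PDO::FETCH_ASSOC);
}

saveSlot($pdo, 2, '/img/b.jpg', '/about');
saveSlot($pdo, 1, '/img/a.jpg', 'https://example.com/');
saveSlot($pdo, 2, '/img/b2.jpg', '/contact'); // overwrites slot 2
$slots = loadSlots($pdo);
```

The admin form would call something like saveSlot() after the upload, and the landing page would only ever call loadSlots().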

If you use PHP, the SimpleXML library could give you what you need, and using XML in this fashion for something like you describe might be the simplest way to go. You don't have to worry about any MySQL configuration or about the client messing something up. Restoring and maintaining an XML file might be easier too. The worry might be future scalability if you plan to grow the site much. That's my 2 cents anyway.
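For comparison, here is a minimal sketch of reading the same data with SimpleXML. The element and attribute names (slots, slot, order, image, link) and the readSlots() helper are assumptions for illustration; the XML is inlined so the example runs on its own, whereas a real site would load it from a file.

```php
<?php
// Hypothetical layout: <slots><slot order="N"><image>..</image><link>..</link></slot></slots>
$xmlText = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<slots>
  <slot order="2"><image>/img/b.jpg</image><link>/about</link></slot>
  <slot order="1"><image>/img/a.jpg</image><link>https://example.com/</link></slot>
</slots>
XML;

function readSlots(string $xmlText): array
{
    $xml = simplexml_load_string($xmlText);
    if ($xml === false) {
        throw new RuntimeException('Could not parse slots XML');
    }
    $slots = [];
    foreach ($xml->slot as $slot) {
        $slots[] = [
            'order' => (int) $slot['order'],   // attribute
            'image' => (string) $slot->image,  // child element text
            'link'  => (string) $slot->link,
        ];
    }
    // Sort by the order attribute so file order does not matter.
    usort($slots, fn (array $a, array $b): int => $a['order'] <=> $b['order']);
    return $slots;
}

$slots = readSlots($xmlText);
```

Note that the sorting, which a database would do with ORDER BY, has to be done by hand here.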

Related

Storing post bodies in database or files?

I'm learning web-centric programming by writing myself a blog, using PHP with a MySQL database backend. This should replace my current (Drupal based) blog.
I've decided that a post should contain some data: id, userID, title, content, time-posted. That makes a nice schema for a database table. I'm having issues deciding how I want to organize the storage of content, though.
I could either:
Use a file-based system. The database table content would then be a URL to a locally-located file, which I'd then read, format, and display.
Store the entire contents of the post in content, ie put it into the database.
If I went with (1), searching the contents of posts would be slightly problematic - I'd be limited to metadata searching, or I'd have to read the contents of each file when searching (although I don't know how much of a problem that'd be - grep -ir "string" . isn't too slow...). However, images (if any) would be referenced by a URL, so referencing content would at least be an internally consistent methodology, and I'd be easily able to reuse the content, as text files are ridiculously easy to work with, compared to an SQL database file.
Going with (2), though, I could use a longtext. The content would then need to be sanitised before I tried to put it into the tuple, and I'm limited by size (although, it's unlikely that I'd write a 4GB blog post ;). Searching would be easy.
I don't (currently) see which way would be (a) easier to implement, (b) easier to live with.
Which way should I go / how is this normally done? Any further pros / cons for either (1) or (2) would be appreciated.
For the 'current generation', implementing a database is pretty much your safest bet. As you mentioned, it's pretty standard, and you outlined all of the fun stuff. Most SQL instances have a fairly powerful FULLTEXT (or equivalent) search.
You'll probably have just as much architecture to write between the two you outlined, especially if you want one to have the feature-parity of the other.
The up-and-coming technology is a key/value store, commonly referred to as NoSQL. With this, you can store your content and metadata into separate individual documents, but in a structured way that makes searching and retrieval quite fast. Some common NoSQL engines are mongo, CouchDB, and redis (among others).
Ultimately this comes down to personal preference, along with a few use-case considerations. You didn't really outline what is important to you as far as conveniences and your application. Any one of these would be just fine for a personal or development blog. Building an entire platform with multiple contributors is a different conversation.
13 years ago I tried your option 1 (having external files for text content) - not with a blog, but with a CMS. I ended up shoveling it all back into the database for easier handling. It's much easier to do global replaces in the database than at the text file level. With large numbers of posts you run into trouble with directory sizes and access speed, or you have to manage subdirectory schemes, etc. Stick to the database-only approach.
There are some tools to make your life easier with text files than the built-in mysql functions, but with a command line client like mysql and mysqldump you can easily extract any texts to the file system level, work on them with standard tools and re-load them into the database. What mysql really lacks is built-in support for regex search/replace, but even for that you'll find a patch if you're willing to recompile mysql.

Why do forums store posts in a database?

From looking at the way some forum software stores data in a database (e.g. phpBB uses MySQL databases for storing just about everything), I started to wonder why they do it that way. Couldn't it be just as fast and efficient to use... maybe xsl with xslt to store forum topics and posts? Or to at least store the posts in a topic?
There are loads of reasons why they use databases and not flat files. Here are a few off the top of my head.
Referential integrity
Indexes and efficient searching
SQL Joins
Here are a couple more posts you can look at for more information :
If i can store my data in text files and easily can deal with these files, why should i use database like Mysql, oracle etc
Why use MySQL over flatfiles?
Why use SQL database?
But this is exactly what databases have been designed and optimized for, storage and retrieval of data. Using a database allows the forum designer to focus on their problem and not worry about implementing storage as well. It wouldn't make sense to ignore all the work that has been done in the database world and instead implement your own solution. It would take more time, be more buggy, and not run as quickly.
Database engines handle all the problems of concurrency. Imagine that two users try to write to your forum at the same time. If you store the posts in files, the first attempt will lock the file, so the second has to wait for the first to finish.
Also, if you want to search, it's much faster to do it in a database than by scanning all the files.
So basically, it's not a good idea to store data which can be modified by users simultaneously in files, and searching is much more efficient in a database.
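To make the locking scenario concrete, here is a small sketch of what a flat-file forum would have to do for every write. The file name, the one-JSON-object-per-line format, and the appendPost() helper are assumptions for illustration only.

```php
<?php
// Append one post to a shared flat file while holding an exclusive lock.
function appendPost(string $path, string $author, string $body): void
{
    $fh = fopen($path, 'ab');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        // flock() blocks until no other process holds the lock: this is
        // the wait the second writer experiences.
        if (!flock($fh, LOCK_EX)) {
            throw new RuntimeException("Cannot lock $path");
        }
        fwrite($fh, json_encode(['author' => $author, 'body' => $body]) . "\n");
        fflush($fh);          // make sure the data is written before unlocking
        flock($fh, LOCK_UN);
    } finally {
        fclose($fh);
    }
}

$path = sys_get_temp_dir() . '/posts_' . uniqid() . '.jsonl';
appendPost($path, 'alice', 'First post');
appendPost($path, 'bob', 'Second post');
$lines = file($path, FILE_IGNORE_NEW_LINES);
unlink($path); // clean up the demo file
```

This only serializes writers on a single file; anything finer-grained (per-row locking, readers that never block) is exactly the work a database engine already does.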
Simply, easy access to data. It's a lot easier to find posts between a date, created by a user, or with certain keywords. You could do all of the above with flat file storage, but this would be IO intensive and slow. If you had the idea of storing each post in its own file, you'd then have the problem of running out of disk space, not because of lack of capacity, but because you'd have consumed all the available inodes.
Software such as this usually has a static caching feature - pages that don't change are written out to static HTML files, and those are served instead of hitting the database.
Mixing static caching with relational DB storage provides the best of both worlds.
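A minimal sketch of that static caching idea, assuming a hypothetical cachedPage() helper and a $render callback that stands in for whatever builds the page from the database. Real forum software is more elaborate (it invalidates on new posts rather than relying on a time limit alone).

```php
<?php
// Serve a cached HTML file if it is fresh; otherwise render and store it.
function cachedPage(string $cacheFile, int $ttlSeconds, callable $render): string
{
    clearstatcache(true, $cacheFile); // avoid stale file metadata
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttlSeconds) {
        return file_get_contents($cacheFile);
    }
    $html = $render();
    // Write to a temp file and rename, so readers never see a half-written page.
    $tmp = $cacheFile . '.' . uniqid('', true) . '.tmp';
    file_put_contents($tmp, $html);
    rename($tmp, $cacheFile);
    return $html;
}

$calls = 0;
$render = function () use (&$calls): string {
    $calls++; // counts how often the "database" is actually hit
    return '<h1>Topic 42</h1>';
};

$cacheFile = sys_get_temp_dir() . '/topic42_' . uniqid() . '.html';
$first  = cachedPage($cacheFile, 300, $render); // renders and writes the cache
$second = cachedPage($cacheFile, 300, $render); // served from the file
unlink($cacheFile); // clean up the demo file
```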

If I have the choice, should webpage contents be saved on the file system or in MySQL?

I am in the planning stages of writing a CMS for my company. I find myself having to make the choice between saving page contents in a database or in folders on a file system. I have learned that PHP performs admirably well reading and writing to file systems, way better in fact than running SQL queries. But when it comes to saving pages and their data on a file system, there'll be a lot more involved than just reading and writing. Since pages will be drawn using a PHP class, the data for each page will be just data, no HTML. Therefore a parser for the files would have to be written. Also I doubt that all the data from a page will be saved in just one file, it would rather be saved in one directory, with content boxes and data in separated files.
All this would be done so much easier with MySQL, so what I want to ask you experts:
Will all the extra dilly dally with file system saving outweigh its speed and resource advantage over MySQL?
Thanks for your time.
Go for MySQL. I'd say the only time you should think about using the file system is when you are storing files (BLOBs) of several megabytes; databases (at least the ones you typically use with a PHP website) are generally less performant when storing that kind of data. For the rest I'd say: always use a relational database. (Assuming you are dealing with data that has relations of course; if it is random data there is not much benefit in using a relational database ;-)
Addition: If you define your own file-structure, and even your own way of cross referencing files you've already started building a 'database' yourself, that is not bad in itself -- it might be loads of fun! -- but you probably will not get the performance benefits you're looking for unless your situation is radically different than the other 80% of 'standard' websites on the web (a couple of pages with text and images on them). (If you are building google/youtube/flickr/facebook ... you've got a different situation and developing your own unique storage solution starts making sense)
Things to consider:
A race condition on file writes if two users edit the same piece of content.
If the CMS grows and files are distributed across multiple servers, replication latency will cause data integrity problems.
Search performance: grep over files in multiple directories will be very slow.
Too many files in the same directory will hurt server performance, especially on Windows.
Assuming you have a low-traffic, single-server environment here…
If you expect to ever have to manage those entries outside of the CMS, my opinion is that it's much, much easier to do so with existing tools than with database access tools.
For example, there's huge value in being able to use awk, grep, sed, sort, uniq, etc. on textual data. Proxying that through a database makes this hard but not impossible.
Of course, this is just opinion based on experience.
Storing Data on the filesystem may be faster for large blobs that are always accessed as one piece of information. When implementing a CMS, you typically don't only have to deal with such blobs but also with structured information that has internal references (like content fields belonging to a certain page that has links to other pages...). SQL-Databases provide an easy way to access structured information, files on your filesystem do not (except of course simple hierarchical structures that can be represented with folders).
So if you wanted to store the structured data of your cms in files, you'd have to use a file format that allows you to save the internal references of your data, e.g. XML. But that means that you would have to parse those files, which is not only a lot of work but also makes the process of accessing the data slow again.
In short, use MySQL
Use a database and you have lots of important properties from the beginning "for free" without inventing them in some suboptimal ways if you go the filesystem way. If you don't want to be constrained to MySQL only you can make use of e.g. the database abstraction layer of the doctrine project.
Additionally you have tools like phpMyAdmin for easy lookup or manipulation of your data versus the texteditor.
Keep in mind that the result of your database queries can almost always be cached in memory or even in the filesystem so you have the benefit of easier management with well known tools and similar performance.
When it comes to minor modifications of website contents (e.g. fixing a typo or updating external links), I find it much easier to connect to the server using SSH and use various tools (text editors, grep etc.) on files, rather than having to use the CMS interface to update each file manually (our CMS has such an interface).
Yet there are several questions to analyze and answer, mentioned above - do you plan for scalability, concurrent modification of data etc.
No, it will not be worth it.
And there is no advantage to using the filesystem over a database unless you are the only user on the system (in which case the advantage would be lost anyway). As soon as the transactions start rolling in and updates cascade to multiple pages and multiple files, you will regret that you didn't use the database from the beginning :)
If you are set on using caching, experiment with some of the existing frameworks first. You will learn a lot from it. Maybe you can steal an idea or two for your CMS?

Is PHP serialization a good choice for storing data of a small website modified by a single person

I'm planning a PHP website architecture. It will be a small website with few visitors and small set of data. The data is modified exclusively by a single user (administrator).
To make things easier, I don't want to bother with a real database or XML data. I think about storing all data through PHP serialization into several files. So for example if there are several categories, I will store an array containing Category class instances for each category.
Are there any pitfalls using PHP serialization in those circumstances?
Use databases -- it is not that difficult, and any extra time will be well spent learning to use a database.
The pitfalls I see are as Yehonatan mentioned:
1. Maintenance and adding functionality.
2. No easy way to query or look at data.
3. Very insecure -- take a look at "hackthissite.org". A lot of the beginning examples have to do with hacking where someone put the data hard coded in files.
4. Serialization will work for one array, meaning one table. If you have to do anything like have parent categories that have to match up to other data, not going to work so well.
The pitfalls come with maintenance and adding functionality.
It is a very good way to learn, but you will appreciate databases more after the lessons.
I tried to implement PHP serialization to store website data. For those who want to do the same thing, here's some feedback from the project, started a few months ago and heavily modified since:
Pros:
It was very easy to load and save data. I don't have to write SQL queries, optimize them, etc. The code is shorter (with parametrized SQL queries, it may grow a lot).
The deployment does not require additional effort. We don't care about what is supported on the web server: if there is just PHP with no additional extensions, database servers, etc., the website will still work. Sqlite is a good thing, but it is not possible to install it on some servers, and it also requires a PHP extension.
We don't have to care about updating a database server, nor about the database server to use (thus avoiding the scenario where the customer wants to migrate from Microsoft SQL Server to Oracle, etc.).
We can add more properties to the objects without having to break everything (just like we can add other columns to the database).
Cons:
Like Kerry said in his answer, there is "no easy way to query or look at data". It means that any business intelligence/statistics cases are impossible or require a huge amount of work. Even some basic scenarios become extremely complicated. Let's say we store products and we want to know how many products there are. Instead of just writing select count(1) from Products, in my case it requires creating a PHP file just for that, loading all the data, then counting the number of items, sometimes by adding stuff manually.
Some changes required to implement data migration, which was painful and required more work than just executing an SQL query.
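The querying drawback above can be sketched side by side. The product data is made up, and the SQLite in-memory database is used only so the comparison runs in one file; it is not what the project described here used.

```php
<?php
// Products stored as a serialized array (the approach under review).
$products = [
    ['name' => 'Lamp',  'price' => 30],
    ['name' => 'Chair', 'price' => 80],
    ['name' => 'Desk',  'price' => 200],
];
$blob = serialize($products);

// Serialized-file way: load everything into memory, then count or filter in PHP.
$all = unserialize($blob, ['allowed_classes' => false]);
$countFromFile = count($all);
$expensiveFromFile = count(
    array_filter($all, fn (array $p): bool => $p['price'] > 50)
);

// Database way: let the engine answer the question.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE Products (name TEXT, price INTEGER)');
$insert = $pdo->prepare('INSERT INTO Products (name, price) VALUES (?, ?)');
foreach ($products as $p) {
    $insert->execute([$p['name'], $p['price']]);
}
$countFromDb = (int) $pdo->query('SELECT COUNT(*) FROM Products')->fetchColumn();
$expensiveFromDb = (int) $pdo
    ->query('SELECT COUNT(*) FROM Products WHERE price > 50')
    ->fetchColumn();
```

With three rows the difference is cosmetic; the cost of the first approach shows up when every new question needs its own loop and the whole data set has to be loaded each time.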
To conclude, I would recommend using PHP serialization for storing data of a small website modified by a single person only if all the following conditions are true:
The deployment context is unknown and there are chances to have a server which supports only basic PHP with no extensions,
Nobody cares about business intelligence or similar usages of the information,
There will be no changes to the requirements with large impact on the data structure.
I would say use a small database like sqlite if you don't want to go through setting up a full db server. However I will also say that serializing an array and storing that in a text file is pretty dang fast. I've had to serialize an array with a few thousand records (a dump from a database) and used that as a temp database when our DB server was being rebuilt for a few days.
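For reference, a minimal sketch of the serialize-to-file approach with plain arrays. The saveData() and loadData() helpers and the file name are hypothetical. Note that it deliberately refuses to unserialize objects, so it would not work unchanged with the Category class instances mentioned in the question.

```php
<?php
// Save a plain array to disk with serialize().
function saveData(string $path, array $data): void
{
    // LOCK_EX keeps two writers from interleaving their output.
    if (file_put_contents($path, serialize($data), LOCK_EX) === false) {
        throw new RuntimeException("Cannot write $path");
    }
}

// Load the array back, rejecting anything that is not a plain array.
function loadData(string $path): array
{
    $raw = file_get_contents($path);
    if ($raw === false) {
        throw new RuntimeException("Cannot read $path");
    }
    // allowed_classes => false: never instantiate objects from file contents.
    $data = unserialize($raw, ['allowed_classes' => false]);
    if (!is_array($data)) {
        throw new RuntimeException("Corrupt data in $path");
    }
    return $data;
}

$path = sys_get_temp_dir() . '/categories_' . uniqid() . '.ser';
saveData($path, [
    ['id' => 1, 'title' => 'News'],
    ['id' => 2, 'title' => 'Tutorials'],
]);
$categories = loadData($path);
unlink($path); // clean up the demo file
```

Keeping the data file outside the web root (or otherwise unreadable over HTTP) addresses part of the security concern raised above.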

flat-file database php application

I'm creating an app that will rely on a database, and I have every intention of using a flat file db. Are there any serious reasons to stay away from this?
I'm using mimesis (http://mimesis.110mb.com)
it's simpler than using mySQL, which I have to admit I have little experience with.
I'm wondering about the security of the db. but the files are stored as php and it seems to be a solid database solution.
I really like the ease of backing up and transporting the databases, which I have found harder with mySQL. I see that everyone seems to prefer the mySQL way - and it likely is faster when it comes to queries but other than that is there any reason to stay away from flat-file dbs and (finally) properly learn mysql ?
edit
Just to let people know,
I ended up going with mySQL, and am using the CodeIgniter framework. Still like the flat file db, but have now realized that it's way more complex for this project than necessary.
Use SQLite, you get a database with many SQL features and yet it's only a single file.
Greetings, I'm the creator of Mimesis. Relational databases and SQL are important in situations where you have massive amounts of data that needs to be handled. Are flat files superior to relational databases? Well, you could ask Google, as their entire archiving system works with flat files, and it's the most popular search engine on Earth. Does Mimesis compare to their system? Likely not.
Mimesis was created to solve a particular niche problem. I only use free websites for my online endeavors. Plenty of free sites offer the ability to use PHP. However, they don't provide free SQL database access. Therefore, I needed to create a database that would store data, implement locking, and work around file permissions. These were the primary design parameters of Mimesis, and it succeeds on all of those.
If you need an idea of Mimesis's speed, if you navigate to the first page it will tell you what country you're viewing the site from. This free database is taken from the site ip2nation.com and ported into a Mimesis ffdb. It has hundreds if not thousands of entries.
Furthermore, the hit counter on the main page has already tracked over 7000 visitors. These are UNIQUE visits, which means that the script has to search the database to see if the IP address that's visiting already exists, and also performs a count of the total IPs.
As you may have noticed, the main page loads up pretty quickly, and it has two fairly intensive Mimesis database scripts running on the backend. The way Mimesis stores data is designed to speed up read and write procedures and also translation procedures. Most ffdb example scripts or other ffdb scripts out there use a simple CSV file or some other such structure for storing data. Mimesis actually interprets binary data at some levels to augment its functionality. Mimesis is somewhat of a hybrid between a flat file database and a relational database.
Most other ffdb scripts involve rewriting the COMPLETE file every time an update is made. Mimesis does not do this, it rewrites only the structural file and updates the actual row contents. So that even if an error does occur you only lose new data that's added, not any of the older data. Mimesis also maintains its history. Unless the table is refreshed the data that rows had previously is still contained within.
I could keep going on about all the features, but this isn't intended as a "Mimesis is the greatest database ever" rant. Rather, it's intended to open people's eyes to the fact that SQL isn't the ONLY technology available, and that flat files, when given proper development paradigms, are superior to a relational database, taking into account they are more specialized.
Long live flat files and the coders who brave the headaches that follow.
The answer is "Fine" if you only NEED a flat-file structure. One test: Would a single simple spreadsheet handle all needs? If not, you need a relational structure, not a flat file.
If you're not sure, perhaps you can start flat-file. SQLite is a great app for getting started.
It's not good to learn you made the wrong choice, if you figure it out too far along in the process. But if you understand the importance of a relational structure, and upsize early on if needed, then you are fine.
"I really like the ease of backing up and transporting the databases, which I have found harder with mySQL."
Use SQLite, as mentioned in another answer: there is only one file to back up. Alternatively, set up periodic dumps of the MySQL databases to SQL files, which is relatively simple to do.
"I see that everyone seems to prefer the mySQL way - and it likely is faster when it comes to queries"
Speed is definitely a consideration. Databases tend to be a lot faster, because the data is organized better.
"other than that is there any reason to stay away from flat-file dbs and (finally) properly learn mysql ?"
There sure are plenty of reasons to use a database solution, but there are arguments to be made for flat files. It is always good to learn things other than what you "usually" use.
Most decisions depend on the application. How many concurrent users are you going to have? Do you need transaction support?
Wanted to inform that Mimesis has moved from the original URL to http://mimesis.site11.com/
Furthermore, I am shifting the focus of Mimesis from an ffdb to a key-value store. It's more sensible given the types of information I'm storing and the methods I use to retrieve it. There was also a grave error present in the coding of Mimesis (which I've since fixed). However, I'm still in the testing phase of the new key-value store type. I've also been side-tracked by other things. Locking has also been changed from the use of file creation to directory creation as the mutex mechanism.
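For readers unfamiliar with directory creation as a mutex, here is a generic sketch of the idea. This is not Mimesis code; the withDirLock() helper, the retry count, and the sleep interval are assumptions for illustration. It relies on mkdir() either creating the directory or failing, so only one process can hold the lock at a time.

```php
<?php
// Run $critical while holding a lock represented by a directory.
function withDirLock(string $lockDir, callable $critical, int $retries = 50)
{
    for ($i = 0; $i < $retries; $i++) {
        // mkdir() fails if the directory already exists, i.e. the lock is taken.
        if (@mkdir($lockDir)) {
            try {
                return $critical();
            } finally {
                rmdir($lockDir); // release the lock
            }
        }
        usleep(100000); // wait 0.1 s before trying again
    }
    throw new RuntimeException("Could not acquire lock $lockDir");
}

$lockDir = sys_get_temp_dir() . '/kv_' . uniqid() . '.lock';
$heldInside = withDirLock($lockDir, function () use ($lockDir): bool {
    // Inside the critical section the lock directory exists.
    return is_dir($lockDir);
});
clearstatcache();
$released = !is_dir($lockDir);
```

One known weakness of this scheme: if the process dies inside the critical section, the directory stays behind and has to be cleaned up by some other means.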
Interoperability. MySQL can be interfaced by basically any language that counts. Mimesis is unlikely to be usable outside PHP.
This becomes significant the moment you try to use profilers, or modify data from the outside.
You might also look at http://lukeplant.me.uk/resources/flatfile/ for the PHP Flatfile Package.
The issue with going flatfile is that in order to adjust the situation for further development you have to alter a significant amount of code in order to improve the foundation of the system. Whereas if it was a pure SQL system it would require little to no modification to proceed in the future.
