I currently have 100 stores, each with its own separate database. I want to develop a web portal that displays all 100 stores, so that someone who searches gets products from all 100 stores on the portal rather than going to each store's website. I was using XML for this purpose, but parsing the XML files and filtering each store's records against the search keyword was taking too long. I was generating XML whenever a store added or edited a product record, and on the portal website I was simply parsing these generated XML files (using PHP).
Please guide me if there is a better solution than XML parsing. To be clear, all of these stores and the portal are hosted on the same server, with a subdomain for each store.
Thanks in advance.
Use a single database. Distinguish between stores with a store table that you reference with a foreign key column in any table where it is relevant (which is likely only going to be the stock table).
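A minimal sketch of that layout, assuming invented table and column names (stores, products, products.store_id) and a PDO connection; one query can then search every store's products at once:

<?php
// Hypothetical schema for the shared database:
//   CREATE TABLE stores   (id INT PRIMARY KEY, name VARCHAR(100), subdomain VARCHAR(50));
//   CREATE TABLE products (id INT PRIMARY KEY, store_id INT, name VARCHAR(200),
//                          price DECIMAL(10,2),
//                          FOREIGN KEY (store_id) REFERENCES stores(id));

$pdo = new PDO('mysql:host=localhost;dbname=portal;charset=utf8', 'user', 'pass');

// One query searches the products of all 100 stores.
$stmt = $pdo->prepare(
    'SELECT p.name, p.price, s.name AS store, s.subdomain
       FROM products p
       JOIN stores s ON s.id = p.store_id
      WHERE p.name LIKE :kw'
);
$stmt->execute([':kw' => '%' . $_GET['q'] . '%']);
$results = $stmt->fetchAll(PDO::FETCH_ASSOC);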
My first piece of advice to you is this: HIRE SOMEONE! If you have 100 stores, you certainly have the cashflow to devote to some sort of development project that is managed and planned by a professional.
Secondly, you are looking for someone with extensive database skill and experience. Please take the time to hire the right person, then get out of their way and let them do the job. If there is one thing I have learned in my time in the business world, it's that the single greatest hindrance to a job well done by IT is a "boss" who doesn't recognize when he needs to hand over the reins to someone else who knows better.
The best way to think of it is this... if you know nothing about databases, and you were to start TODAY to learn everything you'd need to learn to do it RIGHT, you could expend the equivalent of a few working years in doing that. Is it worth the loss of productivity for your business? Probably not. So pay a guy $60,000 to $80,000 depending on his capabilities, and have him do it for you. You get a final product that's far more certain to be done right and work well, and you get it sooner, so you can get a faster ROI.
As far as what technology to use? I'm not even going to try answering that... it's not really for you to know or decide. Hire the right person, and let them tell you what you need.
I think Sphinx search fits your requirement.
Sphinx is an open-source, full-text search server which accepts input from various sources such as MySQL and XML.
In your case you can use XML or MySQL as the input source for the indexes.
The key point with Sphinx is that once your indexer is ready, your search responses will be very quick. You can also update your indexes in real time (for new products added to the system).
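For reference, a rough sketch of querying a Sphinx index from PHP over SphinxQL (Sphinx speaks the MySQL wire protocol, on port 9306 by default); the index name products_index is only an assumption:

<?php
// Connect to searchd via SphinxQL; no real MySQL server is involved here.
$sphinx  = new PDO('mysql:host=127.0.0.1;port=9306');
$keyword = $sphinx->quote($_GET['q']);

// Full-text match against the hypothetical products_index.
$hits = $sphinx->query(
    "SELECT id, store_id FROM products_index WHERE MATCH($keyword) LIMIT 50"
)->fetchAll(PDO::FETCH_ASSOC);

foreach ($hits as $hit) {
    // Fetch the full product rows from your own database(s) using $hit['id'].
}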
Hope this helps.
~K
I've been working on a new site of mine for a couple of days now which will be retrieving almost all of its most used content from a MySQL database. Seeing as the database and website are still under development, the tables are really small at the moment and speed is of no concern yet.
But you know what they say, a little bit of hard work now saves you a headache later on.
Now I'm only 17, the only database I've ever been taught was through Microsoft Access, and we were practically given the database completed - we learned up to 3NF, but that was about it.
I remember reading once, when I was looking into pulling data (randomly) out of a database, that large databases could take several seconds or even minutes to complete a single query, so this got me thinking. In a fraction of a second I can submit a search to Google, Google processes the query and returns the result, and then my browser renders it - all done in the blink of an eye. And Google has billions of records to search through. And they're also doing this for millions of users simultaneously.
I'm thinking, how do they do it? I know that they have huge data centers, but still.
I realize that it probably comes down to the design of the database, how it's been optimized, and obviously the configuration. And I guess that's my question really. Could someone please tell me how to design high performance databases for millions/billions of rows (yes, I'm being optimistic), and possibly point me towards some good reading material to help me learn further?
Also, all my queries are done via PHP, if that's at all relevant to any answers.
The blog http://highscalability.com/ has some good articles and pointers to how companies handle large problems.
Specifically related to MySQL, you can Google for craigslist.org's use of MySQL.
http://www.slideshare.net/jzawodn/mysql-and-search-at-craigslist
First the good news... MySQL scales well (depending on the hardware) to at least hundreds of millions of rows.
Once you get to a certain point, a single database server will have trouble managing the load. That's when you get into the realm of partitioning or sharding: spreading the load across multiple database servers using any one of a number of schemes (e.g. putting unrelated tables on different servers, or spreading a single table across multiple servers using the ID or a date range as the partitioning key).
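As a toy illustration (the hostnames and shard count below are made up), routing a row to one of N database servers by hashing its ID might look like this:

<?php
// Pick a shard by taking the ID modulo the number of servers.
$shards = [
    'db1.example.com',
    'db2.example.com',
    'db3.example.com',
    'db4.example.com',
];

function shard_for(int $userId, array $shards): string
{
    return $shards[$userId % count($shards)];
}

$host = shard_for(1234567, $shards);  // 1234567 % 4 = 3, so "db4.example.com"
// Connect to $host and run the query there.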
SQL databases can be sharded, but they are not fundamentally designed to shard well. There's a whole category of storage alternatives collectively referred to as NoSQL that are designed to solve that very problem (MongoDB, Cassandra, and HBase are a few).
When you use SQL at very large scale, you run into any number of issues such as making data model changes across a DB server farm, trouble keeping up with data backups, etc. That's a very complex topic, and people that solve it well are rare. For a glimpse at the issues, have a look at http://gigaom.com/cloud/facebook-trapped-in-mysql-fate-worse-than-death/
When selecting a database platform for a specific project, benchmark the solution early and often to understand whether or not it will meet the performance requirements that you envision. Having a framework to do that will help you learn about scalability, and will help you decide whether to invest effort in improving the data storage part of your solution, and will help you know where best to invest your time.
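A minimal timing harness in that spirit might look like the sketch below; the connection details, table, and query are placeholders:

<?php
// Run the same query many times and report the average latency, so candidate
// storage back-ends can be compared under roughly realistic data volumes.
$pdo  = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
$runs = 1000;

$start = microtime(true);
for ($i = 0; $i < $runs; $i++) {
    $pdo->query('SELECT COUNT(*) FROM posts WHERE author_id = 42')->fetchColumn();
}
$elapsed = microtime(true) - $start;

printf("%d runs, %.2f ms average\n", $runs, ($elapsed / $runs) * 1000);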
No one can tell you how to design databases. It comes after much reading and many hours of working on them; a good design is the product of many years of doing them. As you've only seen Access, you have little database knowledge to build on yet. Search through Amazon.com and you'll get tons of titles. For someone who's starting out, any of them will do.
I mean no disrespect. I've been there, and I also tutor some people learning programming/database design. I do know that there's no silver bullet or shortcut for the work you have ahead.
If you intend to work with high-performance databases, you should keep something in mind: their design is per application. A good design depends on learning more and more about how the app's users interact with the system, the usage patterns, etc. The things you'll learn from books will give you options; using them well will depend heavily on the scenario.
Good luck!
It doesn't all come down to the design of the database, though that is indeed a big part of it. The guys who made Google are geniuses, and if I'm not completely wrong about Google, you won't be able to find out exactly how they do what they do. Also, I know that years back they had more than 10,000 computers processing queries, and today they probably have many more. I also suspect they cache most of the recent/popular keywords. And all the websites have been indexed and analyzed using an unknown algorithm which makes sure the computers don't have to look through all the words on every page.
In fact, Google crawls the entire internet around every 14 days, so when you do a search you do not search the entire internet. Your search gets broken down into keywords and then these keywords are used to narrow the number of relevant pages - and I'm pretty sure all pages have already been analyzed for important and/or relevant keywords before you even thought of visiting google.com.
Have a look at this question.
Have a look into Sphinx server.
http://sphinxsearch.com/
Craigslist uses that for their search engine. Basically, you give it a source and it indexes whatever you want (mysql database/table, text files, etc.). If it works for craigslist, it should work for you.
I have some site metadata I'd like to be changeable... for example, in my application, if the sysadmin didn't want to use the "Inventory" portion of the site, he/she could turn it off, and it would disappear from the main site.
So I was thinking, maybe I could make a table in my database called "meta", and insert values (or tuples) there! Then, if a module got turned off, the script would update the row, and set "module x" to 0, and I'd be done with it, right?
Except it seems like an awful lot of overhead (creating an entire table, and maintaining it, etc.) just for a set of values... Basically, my solution sounds like shoving a square peg into a round hole.
A cursory glance over the drupal database yielded nothing, and I'm guessing they use a configuration file on the server itself? If that's the case, I don't know exactly how saved values in a .cfg file (for example) could be read by the web app, nor do I know how such an app could save information to the file. I'd appreciate your insight, if you've tackled this problem before.
I use primarily PHP, by the way.
Thanks in advance!
I've often seen this accomplished using a config array:
$config["admin_email"] = "admin#mydomain.com";
$config["site_name"] = "Bob's Trinket Store";
$config["excluded_modules"] = array("inventory", "user_chat");
Then later you can check:
if (!in_array("inventory", $config["excluded_modules"])) {
    // include inventory logic
}
Granted, this is a bit backwards. In reality, it would be smarter to explicitly declare the included modules rather than the excluded ones. You would then reference this config.php in your project to load up and act in response to different configurations.
You could implement this as a database table too, making at least two fields:
Option
Value
Where the option may be "excluded_modules" and its corresponding value would be "inventory,user_chat". In all honesty though, this method is a bit sloppy, and may cause you some frustration in the future.
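If you do go the table route anyway, loading such a hypothetical settings table into an array once per request is straightforward; a sketch (table and column names invented):

<?php
$pdo = new PDO('mysql:host=localhost;dbname=mysite', 'user', 'pass');

// Pull every option/value pair into a plain array.
$config = [];
foreach ($pdo->query('SELECT option_name, option_value FROM settings') as $row) {
    $config[$row['option_name']] = $row['option_value'];
}

// e.g. $config['excluded_modules'] === "inventory,user_chat"
$excluded = explode(',', $config['excluded_modules']);
if (!in_array('inventory', $excluded)) {
    // include inventory logic
}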
I know your question is "how do I read/write to a separate file on the server from a web app", but I figured I'd address one of the assumptions you made. There's nothing (too) wrong with storing your config in the DB.
I've seen projects (with lots of traffic, and good uptime - and a ton of IT keeping it that way =P) that stored configuration in the database, more or less as you described. If it's a single table, and you don't have a whole crazy fail-over/partitioning scheme on it, then it's not really THAT much overhead.
The DB has lots of features, besides storing data, and a lot of infrastructure around it. If you use the DB for your config, you get to use whatever mechanism you have for DB deployment/backup with little extra cost. You also can take advantage of the built in permissions mechanism, and any undo features that are available.
Edit:
If you access that config on every page display, though, you might hit a bottleneck :) It's all about your design. One solution: if you have a persistent web service, have it re-scan the config every X seconds.
You have two choices basically - either put it in a DB table, or in a flat config file (probably PHP, perhaps XML). With the latter, to make it editable from a page, you will have to (1) deal with messy OS-specific file access issues, (2) apply proper file permissions each time you set up a site, and (3) parse and generate PHP/XML code. With a database, all you need is a simple query, so I'd definitely go with that.
As for large projects using this approach, I know phpBB does store most of its config in a database (except for passwords, last time I checked).
I prefer to keep configuration in .ini files that sit above the public_html folder. I think that gives me a lot of flexibility: I can group variables and, if necessary, create a separate .ini file per module, etc.
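As a rough sketch (the path and keys are invented for illustration), such a file can be read with PHP's built-in parse_ini_file():

<?php
// Example file kept outside the web root, e.g. /home/site/config/app.ini:
//   [site]
//   name = "Bob's Trinket Store"
//   [modules]
//   excluded = "inventory,user_chat"

$config = parse_ini_file('/home/site/config/app.ini', true);  // true = keep sections

$excluded = explode(',', $config['modules']['excluded']);
if (!in_array('inventory', $excluded)) {
    // include inventory logic
}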
I'm creating an app that will rely on a database, and I have every intention of using a flat-file DB. Are there any serious reasons to stay away from this?
I'm using Mimesis (http://mimesis.110mb.com)
It's simpler than using MySQL, which I have to admit I have little experience with.
I'm wondering about the security of the DB, but the files are stored as PHP and it seems to be a solid database solution.
I really like the ease of backing up and transporting the databases, which I have found harder with MySQL. I see that everyone seems to prefer the MySQL way - and it likely is faster when it comes to queries - but other than that, is there any reason to stay away from flat-file DBs and (finally) properly learn MySQL?
edit
Just to let people know,
I ended up going with MySQL, and am using the CodeIgniter framework. I still like the flat-file DB, but have now realized that it's way more complex for this project than necessary.
Use SQLite, you get a database with many SQL features and yet it's only a single file.
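A quick sketch of SQLite through PDO (assuming the pdo_sqlite extension is available; the file path and schema are examples only). Backing up or moving the database is just copying that one file:

<?php
$db = new PDO('sqlite:' . __DIR__ . '/data/app.sqlite');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// The schema lives in the same single file as the data.
$db->exec('CREATE TABLE IF NOT EXISTS posts (id INTEGER PRIMARY KEY, title TEXT, body TEXT)');

$stmt = $db->prepare('INSERT INTO posts (title, body) VALUES (:t, :b)');
$stmt->execute([':t' => 'Hello', ':b' => 'First post']);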
Greetings, I'm the creator of Mimesis. Relational databases and SQL are important in situations where you have massive amounts of data that need to be handled. Are flat files superior to relational databases? Well, you could ask Google, as their entire archiving system works with flat files, and it's the most popular search engine on Earth. Does Mimesis compare to their system? Likely not.
Mimesis was created to solve a particular niche problem. I only use free websites for my online endeavors. Plenty of free sites offer the ability to use PHP. However, they don't provide free SQL database access. Therefore, I needed to create a database that would store data, implement locking, and work around file permissions. These were the primary design parameters of Mimesis, and it succeeds on all of those.
If you need an idea of Mimesis's speed: navigate to the first page and it will tell you what country you're viewing the site from. This free database comes from ip2nation.com, ported into a Mimesis ffdb, and it has hundreds if not thousands of entries.
Furthermore, the hit counter on the main page has already tracked over 7000 visitors. These are UNIQUE visits, which means that the script has to search the database to see if the IP address that's visiting already exists, and also performs a count of the total IPs.
If you've noticed, the main page loads up pretty quickly, and it has two fairly intensive Mimesis database scripts running on the backend. The way Mimesis stores data is designed to speed up read, write, and translation procedures. Most ffdb example scripts, or other ffdb scripts out there, use a simple CSV file or some other such structure for storing data. Mimesis actually interprets binary data at some levels to augment its functionality. Mimesis is somewhat of a hybrid between a flat-file database and a relational database.
Most other ffdb scripts involve rewriting the COMPLETE file every time an update is made. Mimesis does not do this: it rewrites only the structural file and updates the actual row contents, so even if an error does occur you only lose newly added data, not any of the older data. Mimesis also maintains its history: unless the table is refreshed, the data that rows held previously is still contained within.
I could keep going on about all the features, but this isn't intended as a "Mimesis is the greatest database ever" rant. Rather, it's intended to open people's eyes to the fact that SQL isn't the ONLY technology available, and that flat files, given proper development paradigms, can be superior to a relational database, taking into account that they are more specialized.
Long live flat files and the coders who brave the headaches that follow.
The answer is "Fine" if you only NEED a flat-file structure. One test: Would a single simple spreadsheet handle all needs? If not, you need a relational structure, not a flat file.
If you're not sure, perhaps you can start flat-file. SQLite is a great app for getting started.
It's not good to learn you made the wrong choice only when you're too far along in the process. But if you understand the importance of a relational structure, and upsize early on if needed, then you are fine.
"I really like the ease of backing up and transporting the databases, which I have found harder with mySQL."
Use SQLite as mentioned in another answer. There is only one file to backup, or set up periodic dumps of the MySQL databases to SQL files. This is a relatively simple thing to do.
"I see that everyone seems to prefer the mySQL way - and it likely is faster when it comes to queries"
Speed is definitely a consideration. Databases tend to be a lot faster, because the data is organized better.
"other than that is there any reason to stay away from flat-file dbs and (finally) properly learn mysql?"
There sure are plenty of reasons to use a database solution, but there are arguments to be made for flat files. It is always good to learn things other than what you "usually" use.
Most decisions depend on the application. How many concurrent users are you going to have? Do you need transaction support?
Wanted to inform that Mimesis has moved from the original URL to http://mimesis.site11.com/
Furthermore, I am shifting the focus of Mimesis from an ffdb to a key-value store. It's more sensible given the types of information I'm storing and the methods I use to retrieve it. There was also a grave error present in the coding of Mimesis (which I've since fixed). However, I'm still in the testing phase of the new key-value store type. I've also been side-tracked by other things. Locking has also been changed from the use of file creation to directory creation as the mutex mechanism.
Interoperability. MySQL can be interfaced by basically any language that counts. Mimesis is unlikely to be usable outside PHP.
This becomes significant the moment you try to use profilers, or modify data from the outside.
You might also look at http://lukeplant.me.uk/resources/flatfile/ for the PHP Flatfile Package.
The issue with going flat-file is that in order to adapt the system for further development you have to alter a significant amount of code to improve its foundation, whereas a pure SQL system would require little to no modification to move forward in the future.
So I'm going to be working on a home made blog system in PHP and I was wondering which way of storing data is the fastest. I could go in the MySQL direction, or I could go with my own little way of doing it which is storing all of the information (encoded in JSON) in files.
Which way would be the fastest, MySQL or JSON files?
For a small, single user 'database', a file system would likely be quicker - as the size and complexity grows, a database server like MySQL or SQL Server is hard to beat.
I would definitely choose a DB option (as you need to be able to search and index stuff). But that does not mean you need a fully realized separate DB service.
MySQL is definitely the more scalable solution.
But the downside is you need to set up and maintain a separate service.
On the other hand there are DBs that are file-based and still give you access with standard SQL - SQLite (sqlite.org) jumps to mind. You get the advantages of SQL, but you do not need to maintain a separate service. The disadvantage is that they are not as scalable.
I would choose a MySQL database - simply because it's easier to manage.
JSON is not really a format for storage, it's for sending data to JavaScript. If you want to store data in files, look into XML or serialized PHP (which I suspect is what you are after, rather than JSON).
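For comparison, a quick sketch of writing the same record as serialized PHP and as JSON (the file names are placeholders):

<?php
$post = ['title' => 'Hello', 'body' => 'First post', 'tags' => ['intro']];

// Serialized PHP
file_put_contents('post1.ser', serialize($post));
$fromSer = unserialize(file_get_contents('post1.ser'));

// JSON
file_put_contents('post1.json', json_encode($post));
$fromJson = json_decode(file_get_contents('post1.json'), true);  // true = assoc array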
Forgive me if this doesn't answer your question very directly, but since it is a home-cooked blog system, is it really worth spending time right now thinking about which storage backend is faster?
You're not going to be looking at 10,000 concurrent users from day 1, and it doesn't sound like it will need to scale to any meaningful degree in the foreseeable future.
Why not just stick with MySQL as a sensible choice rather than a fast one? If you really want some sense that you designed for speed maybe bolt sqlite on instead.
Since you are thinking you may not have the need for a complex relational structure, this might be a fun opportunity to try something more down the middle.
Check out CouchDB, it is a document-based, schema free database (yet still indexable). The database is made of documents that contain named fields (think key-value pairs).
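A rough sketch of storing a blog post as a CouchDB document over plain HTTP with curl; it assumes a local CouchDB listening on localhost:5984 with an existing database named blog:

<?php
$doc = json_encode([
    'title' => 'Hello world',
    'body'  => 'First post',
    'tags'  => ['intro'],
]);

// PUT /blog/first-post creates (or updates) the document with that ID.
$ch = curl_init('http://localhost:5984/blog/first-post');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'PUT');
curl_setopt($ch, CURLOPT_POSTFIELDS, $doc);
curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);   // e.g. {"ok":true,"id":"first-post","rev":"1-..."}
curl_close($ch);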
Have fun....
Though I don't know for certain, it seems to me that a MySQL database would be a lot faster, especially as the amount of data gets larger and larger.
Also, using MySQL with PHP is super easy, especially if you use an abstraction class like ezSQL. ezSQL makes working with a database really simple and I think you'd be creating more unnecessary work for yourself by going the home-brewed JSON direction.
I've done both. I like files for very simple problems and databases for complicated problems.
For file solutions, note these problems as the number of files increases:
1) Much more disk space is used than you might expect, because even tiny files use up a whole block. Blocks are fairly large on filesystems which support large drives.
2) Most filesystems get very slow when the number of files in a directory gets very large. My solution to this (assuming the names of the files are reasonably spread out across the alphabet) is to create a directory consisting of the first two letters of the filename. Thus, the file "animal.txt" would be found at an/animal.txt. This works surprisingly well. If your filenames are not reasonably well-distributed across the alphabet, use some sort of hashing function to create the directories (see the sketch after this list). Sounds a little crazy, but this can work very, very well, and I've used it for very fast solutions with tens of thousands of files.
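A sketch of that bucketing scheme, with a hash fallback for names that aren't evenly spread across the alphabet (the function name and base directory are made up):

<?php
function path_for(string $name, string $base = 'data'): string
{
    $bucket = strtolower(substr($name, 0, 2));        // "animal" -> "an"
    if (!preg_match('/^[a-z]{2}$/', $bucket)) {
        $bucket = substr(md5($name), 0, 2);           // hash fallback
    }
    $dir = $base . '/' . $bucket;
    if (!is_dir($dir)) {
        mkdir($dir, 0755, true);
    }
    return $dir . '/' . $name . '.txt';
}

file_put_contents(path_for('animal'), 'some record'); // writes data/an/animal.txt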
But the file solutions really only fit sometimes. Unless you have a great reason to go with files, use a database.
This is really cool: a PHP class that controls a flat-file database with queries: http://www.fsql.org/index.php
For blogs, I recommend caching the pages, because blogs usually have mostly static content. That way the queries only run when a page's cache is generated, and you can regenerate the cached pages when a new blog post is added.
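A bare-bones sketch of that kind of page cache; the cache directory and the invalidation strategy (delete the files when a post changes) are just one possible choice:

<?php
$cacheFile = __DIR__ . '/cache/' . md5($_SERVER['REQUEST_URI']) . '.html';

if (is_file($cacheFile)) {
    readfile($cacheFile);   // cache hit: no database queries at all
    exit;
}

ob_start();
// ... run the queries and render the blog page here ...
$html = ob_get_contents();
ob_end_flush();

file_put_contents($cacheFile, $html);   // delete this file when a post is added or edited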