I am trying to build a database in NoSQL for learning purposes.
It's a simple notice management application in PHP (add/edit/delete notices on a notice board).
I have Memcached (Membase actually) where I can store data as key value pair.
For adding a notice, I am generating a unique id (using the uniqid() function) and storing the notice details under it. But the problem is:
1. How do I list all the notices?
I also want to add a serial key to notices. To do that, I need to know the serial key of the last inserted data.
2. How do I find out the last inserted notice?
If you find this question inappropriate because this is a somewhat relational data model (or, you may say, it should be implemented in a relational database), please let me know of any use case scenario where I can use NoSQL to learn more about it.
Natural connections between entities are relational, so every data model can be designed well using a relational schema. Almost every NoSQL schema could be represented in a relational data model.
You use NoSQL where the standard relational model is not comfortable (for example, when foreign components need to add their own data and you don't know it in advance) or when you need better performance and scaling; then you denormalize your data in a NoSQL schema.
MongoDB (http://www.mongodb.org/) is a good starting point in NoSQL because it allows you to mix a denormalized schema with (almost) relational design.
A nice use case is implementing a data model for storing custom form data, where the number and types of fields aren't known up front.
And about your questions:
I don't know Membase well, but if it's a simple key-value store, the only solution is to create another key under which you store a list of all ids. Concurrent updates are a big concern here, though.
You can also store the last insert id somewhere else (under another key); here concurrent updates are easier to handle.
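A minimal sketch of those two ideas, written in Python rather than PHP so that it runs on its own. `TinyKV` is a hypothetical in-memory stand-in for a key-value store, and the key names (`notice:index`, `notice:last_serial`) are assumptions made for illustration. It is not the Membase API; a real store would need its own atomic increment and compare-and-swap style operations to get the same guarantees.

```python
import threading
import uuid


class TinyKV:
    """Hypothetical in-memory stand-in for a key-value store."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def incr(self, key):
        # Atomic counter: the read and the write happen under one lock,
        # so two writers can never receive the same serial number.
        with self._lock:
            self._data[key] = self._data.get(key, 0) + 1
            return self._data[key]

    def append_to_list(self, key, item):
        # Atomic read-modify-write of the index list. This is the part
        # where concurrent updates are the bigger concern.
        with self._lock:
            self._data[key] = self._data.get(key, []) + [item]


def add_notice(store, text):
    notice_id = uuid.uuid4().hex                      # unique key for the notice
    serial = store.incr("notice:last_serial")         # "last inserted" counter
    store.set("notice:" + notice_id, {"serial": serial, "text": text})
    store.append_to_list("notice:index", notice_id)   # key holding all ids
    return notice_id


def list_notices(store):
    return [store.get("notice:" + nid) for nid in store.get("notice:index", [])]


store = TinyKV()
add_notice(store, "Office closed on Friday")
add_notice(store, "New parking rules")
notices = list_notices(store)
last_serial = store.get("notice:last_serial")
```

The counter key only needs an atomic increment, which is why it is the easier of the two to get right. The index key needs a full read-modify-write cycle, so it depends on whatever locking or compare-and-swap mechanism the real store offers.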
The first thing to learn about NoSQL is that there are a lot of NoSQL solutions out there, with different capabilities. You need to pick the one that is most appropriate. In this case Redis will make your life a lot easier with the design that you've chosen.
The heart of the issue is the CAP theorem. Many NoSQL solutions deliberately choose to not guarantee consistency. Once you have thrown that away, you can't guarantee that the same ID is not handed out twice. Therefore it makes sense to either use timestamps, or use something else (like Redis) to generate the unique ids which you can store wherever you want.
Related
I need to create an application with an editable database structure, where you can add, delete, and modify tables, fields, views, and the structure of the database, all in production and in real time.
The purpose is that the application can be adapted to the needs of the company. Allowing you to store the information that is needed, where it is needed.
I use Laravel 5 and MySQL, but my question is not about my software. My questions are:
Is there a methodology, or a set of steps to follow, to achieve this functionality?
And if it exists, is there any package to apply it to Laravel?
The entity-attribute-value (EAV) model lets you have a DB with something like a "dynamic schema" and still run indexed queries on its tables (though the "tables" become different from what you would have with the normal approach). With it you can add and remove fields and have the values indexed (unlike in a document-oriented NoSQL DB). Downsides: a lot of joins, so performance might suffer; however, I've seen pretty large systems get away with it. I don't know if and how it can be applied in a Laravel context, but googling gives at least some results.
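To make the idea concrete, here is a minimal EAV sketch in Python with SQLite, used only so the example is runnable; the same table layout would apply to MySQL. The table and column names (`entity`, `attribute`, `entity_value`) and the helper `set_value` are assumptions chosen for illustration, not a standard.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (id INTEGER PRIMARY KEY, kind TEXT NOT NULL);
CREATE TABLE attribute (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE entity_value (
    entity_id    INTEGER NOT NULL REFERENCES entity(id),
    attribute_id INTEGER NOT NULL REFERENCES attribute(id),
    value        TEXT,
    PRIMARY KEY (entity_id, attribute_id)
);
-- The index that makes "find entities by field value" an indexed query.
CREATE INDEX idx_value_lookup ON entity_value (attribute_id, value);
""")


def set_value(entity_id, name, value):
    """Create the attribute on first use, then store the value for the entity."""
    conn.execute("INSERT OR IGNORE INTO attribute (name) VALUES (?)", (name,))
    attr_id = conn.execute(
        "SELECT id FROM attribute WHERE name = ?", (name,)
    ).fetchone()[0]
    conn.execute(
        "INSERT OR REPLACE INTO entity_value VALUES (?, ?, ?)",
        (entity_id, attr_id, value),
    )


conn.execute("INSERT INTO entity (id, kind) VALUES (1, 'customer'), (2, 'customer')")
set_value(1, "city", "Madrid")
set_value(2, "city", "Lisbon")
set_value(2, "vat_number", "PT123")  # a new "field", added without ALTER TABLE

# Searching by a field value costs two joins, the downside mentioned above.
rows = conn.execute("""
    SELECT e.id
    FROM entity e
    JOIN entity_value v ON v.entity_id = e.id
    JOIN attribute a ON a.id = v.attribute_id
    WHERE a.name = 'city' AND v.value = 'Lisbon'
""").fetchall()
```

Every value is stored as text here, which is a simplification; larger EAV designs often keep one value table per data type.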
You could use business intelligence and reporting tools, which may cover your needs. They run against any DB and adjust themselves no matter which fields you add or remove. One example is:
https://github.com/getredash/redash
I came across an interesting comment on php.net about serializing data in order to save it into the DB.
It says the following:
Please! please! please! DO NOT serialize data and place it into your
database. Serialize can be used that way, but that's missing the point
of a relational database and the datatypes inherent in your database
engine. Doing this makes data in your database non-portable, difficult
to read, and can complicate queries. If you want your application to
be portable to other languages, like let's say you find that you want
to use Java for some portion of your app that it makes sense to use
Java in, serialization will become a pain in the buttocks. You should
always be able to query and modify data in the database without using
a third party intermediary tool to manipulate data to be inserted.
I've encountered this too many times in my career, it makes for
difficult to maintain code, code with portability issues, and data
that is more difficult to migrate to other RDBMS systems, new
schema, etc. It also has the added disadvantage of making it messy to
search your database based on one of the fields that you've
serialized.
That's not to say serialize() is useless. It's not... A good place to
use it may be a cache file that contains the result of a data
intensive operation, for instance. There are tons of others... Just
don't abuse serialize because the next guy who comes along will have a
maintenance or migration nightmare.
I would like to know whether this is a standard view on serializing data for DB purposes, meaning whether it is good practice to use it sometimes or whether it should be avoided.
For example, I was instructed to use serialize myself recently.
In this case the data we had to save into a MySQL table was the following:
Car brand.
Car model.
Car version.
Car info.
Car info was an array representing all the properties of a version, so it held a large, variable number of properties (under 100). This array was the one to be serialized.
The main reason I was given in order to use serialize was the following:
Being a large number of fields, it is better to serialize the data in
order to improve performance instead of creating a field for each property
or multiple tables.
Personally I agree more with the comment on php.net than with this last assertion, but I would like to hear more qualified opinions than mine about this.
Being a large number of fields, it is better to serialize the data in
order to improve performance instead of creating a field for each
property or multiple tables.
I would consider this highly dependent on the use case. What if there is a class Customer that wants information about all cars that run on diesel, or any other specific data for the car (fuel seems the easiest example)? You would need to get all the cars from the database, unserialize each one, check for the property, and keep a list of all cars relevant for the customer.
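The difference can be sketched as follows, in Python with SQLite so that it is runnable. JSON stands in for PHP's serialize() output, and the table names `car_blob` and `car_cols` are made up for the comparison.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car_blob (id INTEGER PRIMARY KEY, model TEXT, info TEXT)")
conn.execute("CREATE TABLE car_cols (id INTEGER PRIMARY KEY, model TEXT, fuel TEXT)")
conn.execute("CREATE INDEX idx_car_fuel ON car_cols (fuel)")

cars = [("A3", "diesel"), ("Clio", "petrol"), ("Golf", "diesel")]
for model, fuel in cars:
    conn.execute(
        "INSERT INTO car_blob (model, info) VALUES (?, ?)",
        (model, json.dumps({"fuel": fuel, "doors": 5})),
    )
    conn.execute("INSERT INTO car_cols (model, fuel) VALUES (?, ?)", (model, fuel))

# Serialized column: every row is fetched and decoded in the application,
# because the database cannot see inside the string.
diesel_from_blob = [
    model
    for model, info in conn.execute("SELECT model, info FROM car_blob ORDER BY id")
    if json.loads(info)["fuel"] == "diesel"
]

# Real column: the database filters (and can use the index on fuel).
diesel_from_cols = [
    row[0]
    for row in conn.execute(
        "SELECT model FROM car_cols WHERE fuel = ? ORDER BY id", ("diesel",)
    )
]
```

Both lists contain the same models, but the first approach transfers and decodes the whole table to get them.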
Example: We had to move some person-related data from an old customer CMS to a new one. Instead of having each attribute nicely mapped in the database, the whole information was a single string in the old database. So instead of relying on a proper database structure, we had to do lots of regex-fu to turn the data into a proper structure again. Of course, this was an expensive task, both in money and in workload. In this case the problem was not that huge, since the amount of data was manageable. But imagine the same scenario with millions of rows and more than just a single string...
The comment you posted is only talking about data structures IMO. And I agree, storing these serialized is neither good nor efficient. It becomes much easier to introduce a typo somewhere or to add a new property that other parts of the application are not aware of. This WILL lead to problems sooner or later.
On the other hand, storing some configs that are more easily ported might be an OK case for serializing data. You could argue that external settings files are better suited to such a case, but this will be highly dependent on the case/philosophy/customer/...
TL;DR
In most cases, using a proper schema will sooner or later benefit the whole development, both speed-wise and complexity-wise (since I prefer reading many table descriptions to reading one huge, cryptic string). There might be some use cases where serializing data is acceptable, so giving a definite answer on whether this is good or bad practice is not that easy and depends heavily on the situation.
I'm starting to build a system for working with native languages, tags, and similar data in Yii Framework.
I have already chosen MongoDB for storing my data, as I think it fits nicely and will give better performance at lower cost (the database will have huge amounts of data).
My question regards user authentication, payments, etc. These are sensitive bits of information and areas where I think the data is relational.
So:
1. Would you use two different DB systems? Do I need them, or am I overcomplicating this?
2. If you recommend the two db approach how would I achieve that in Yii?
Thanks for your time!
PS: I do not intend this question to be another endless discussion between the relational vs non-relational folks. Having said that, I think my data fits Mongo, but if you have something to say about that, go ahead ;)
You might be interested in this presentation on OpenSky's infrastructure, where MongoDB is used alongside MySQL. Mongo was utilized mainly for CMS-type data where a flexible schema was useful, and they relied upon MySQL for transactions (e.g. customer orders, payments). If you end up using the Doctrine library, you'll find that the ORM (for SQL databases) and MongoDB ODM share a similar API, which should make the experimentation process easier.
I wouldn't shy away from using MongoDB to store user data, though, as that's often a record that can benefit from embedded document storage (e.g. storing multiple billing/shipping addresses within a single user document). If anything, Mongo should be flexible enough to enable you to develop your application without worrying about schema changes due to evolving product requirements. As those requirements become more clear, you'll be able to make a decision based on the app's performance needs and types of database queries you end up needing.
There is no harm in using multiple databases if you really need them. Many big websites use multiple databases, so go ahead and start your project.
I have a problem: I am creating quite a complex form. Some parts of the form are created dynamically. Let's say that if you select a certain option from a drop-down, extra fields get injected into the form.
What approach would be best for storing that data? I would like to try to get away without using multiple tables, because that makes the whole application so much more complex.
I was thinking of initializing all possible values as "0" in my model, then overwriting them with the POST data and storing the whole array in the table. Does anyone see any problems with this approach?
The necessity of using multiple tables in your model doesn't depend on how much data (how many fields) you have to store; it depends on the logic of your model. So if there is a logical reason to use relationships in your model (e.g. 1:n, n:m), JUST DO IT!!!
If you don't follow the basic rules when creating your model and try, for example, to store all the data in one table although it should be divided into many tables, you will very soon regret it. Any future change in your code will cost you much more work, and at some point you will not understand your own code and will have to write it again, this time following the rules ;)
And don't worry if developing the right model costs a lot of work (lately I invested over two weeks in developing my model). It really makes sense, because afterwards you can work much faster and more effectively with a well developed and planned model.
On the other hand, there are situations where storing 100 or more fields in one table makes sense; it depends on the logic. So if you provide an example, maybe someone can say whether you should work with one table or more.
A lot depends on what you want to do with the form data later, and how often.
Serialized Single Field
In the simplest use cases you could base64_encode(serialize($data)) all the data and put that into a single column in the database.
Simple
Fast to insert
Easy to add/change input fields
Difficult AND Slow to search for values (particularly at scale)
Difficult to programmatically update should you need to make systematic changes to the data
Perfect if you always pull all of the data out of the db and never narrow your sql queries by data in the serialized string.
Metadata Table
Adding a second metadata table could offer a little more flexibility. The second table would have a foreign key reference to the main form submission, a metadata name, and the value. This allows a very flexible many-to-one relationship that you can easily store, search, and manipulate. You can see examples of this in WordPress.
2 tables, but still simple
Easy to add/change input fields
Much better searching via sql
Much easier to systematically update
Perfect if you don't always get all the data or have to narrow searches by the form data
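A rough sketch of the metadata-table layout, in Python with SQLite so it can run as-is. The names `submission`, `submission_meta`, `save_submission`, and `load_submission` are hypothetical, and every value is stored as text for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE submission (id INTEGER PRIMARY KEY, created_at TEXT);
CREATE TABLE submission_meta (
    submission_id INTEGER NOT NULL REFERENCES submission(id),
    meta_key      TEXT NOT NULL,
    meta_value    TEXT
);
CREATE INDEX idx_meta_key_value ON submission_meta (meta_key, meta_value);
""")


def save_submission(fields):
    """Store one form submission; each field becomes one metadata row."""
    cur = conn.execute("INSERT INTO submission (created_at) VALUES (datetime('now'))")
    submission_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO submission_meta VALUES (?, ?, ?)",
        [(submission_id, key, str(value)) for key, value in fields.items()],
    )
    return submission_id


def load_submission(submission_id):
    """Rebuild the field dictionary for one submission."""
    rows = conn.execute(
        "SELECT meta_key, meta_value FROM submission_meta WHERE submission_id = ?",
        (submission_id,),
    )
    return dict(rows.fetchall())


first = save_submission({"plan": "basic", "country": "DE"})
second = save_submission({"plan": "pro", "country": "DE", "company": "Acme"})

# Systematic update across all submissions, done in plain SQL.
conn.execute(
    "UPDATE submission_meta SET meta_value = 'Germany' "
    "WHERE meta_key = 'country' AND meta_value = 'DE'"
)

# Narrowing a search by form data, also in plain SQL.
pro_ids = [
    row[0]
    for row in conn.execute(
        "SELECT submission_id FROM submission_meta "
        "WHERE meta_key = 'plan' AND meta_value = 'pro'"
    )
]
```

The second submission carries an extra `company` field without any schema change, which is the flexibility this layout is meant to provide.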
And a different direction: you may also consider looking at document-based databases like MongoDB or CouchDB if you find yourself dealing with a lot of this type of data.
I'm working on an old web application my company uses to create surveys. I looked at the database schema through the mysql command prompt and thought the tables looked pretty solid. Though I'm not a DB guru I'm well versed in the theory behind it (having taken a few database design courses in my software engineering program).
That being said, I dumped the create statements into an SQL file and imported them in MySQL Workbench and saw that they make no use of any "actual" foreign keys. They'll store another table's primary key like you would with a FK but they don't declare it as one.
So, seeing that their DB is designed the way I would design it with what I know (minus the FK issue), I'm left wondering whether there is a reason behind it. Is this a case of lazy programming, or could you get some performance gains by doing all the error checking programmatically?
In case you'd like an example, they basically have Surveys, and a survey has a series of Questions. A question is part of a survey, so it holds its PK in a column. That's pretty much it, but they use it everywhere.
I'd appreciate any insight :) (I understand that this question might not have a right/wrong answer but I'm looking more for some information on why they would do this as this system has been pretty solid ever since we started using it so I'm led to believe that these guys knew what they were doing)
The original developers might have opted to use MyISAM or any other storage engine that does not support foreign key constraints.
MySQL only supports the defining of actual foreign key relationships on InnoDB tables, maybe yours are MyISAM, or something else?
More important is that the proper columns have indices defined on them (so the ones holding the PK of another table should be indexed). This is also possible in MyISAM.
As a general point, keys speed up reads (if they are applicable to the read taking place, they help the optimizer) and slow down writes (because they add overhead to the tables).
In the vast majority of cases, the improvement in read speed and the maintenance of referential integrity outweigh the minor overhead they add to writes.
This distinction has been blurred by caching, mirroring, etc., as so many reads on the very big sites don't actually hit the 'live' database. But this is not very relevant unless you are working for Amazon, Twitter, or the like.
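What "maintenance of referential integrity" buys can be sketched with the Surveys/Questions example from the question. SQLite is used here only to keep the snippet runnable (it needs `PRAGMA foreign_keys = ON` to enforce constraints); the table names `question_fk` and `question_plain` are made up to compare a declared foreign key with a plain indexed column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE survey (id INTEGER PRIMARY KEY, title TEXT);

-- Declared foreign key with cascading delete.
CREATE TABLE question_fk (
    id        INTEGER PRIMARY KEY,
    survey_id INTEGER NOT NULL REFERENCES survey(id) ON DELETE CASCADE,
    body      TEXT
);

-- Same column, indexed, but not declared as a foreign key.
CREATE TABLE question_plain (
    id        INTEGER PRIMARY KEY,
    survey_id INTEGER NOT NULL,
    body      TEXT
);
CREATE INDEX idx_question_plain_survey ON question_plain (survey_id);
""")

conn.execute("INSERT INTO survey (id, title) VALUES (1, 'Staff survey')")

# Without the constraint, a question pointing at a missing survey is accepted.
conn.execute("INSERT INTO question_plain (survey_id, body) VALUES (99, 'orphan')")

# With the constraint, the same insert is rejected by the database.
try:
    conn.execute("INSERT INTO question_fk (survey_id, body) VALUES (99, 'orphan')")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

conn.execute("INSERT INTO question_fk (survey_id, body) VALUES (1, 'How are you?')")
conn.execute("INSERT INTO question_plain (survey_id, body) VALUES (1, 'How are you?')")

# Deleting the survey cascades only where the foreign key is declared.
conn.execute("DELETE FROM survey WHERE id = 1")
fk_left = conn.execute("SELECT COUNT(*) FROM question_fk").fetchone()[0]
plain_left = conn.execute("SELECT COUNT(*) FROM question_plain").fetchone()[0]
```

In the undeclared variant the application has to perform both checks itself, which is presumably what the original developers of the survey system did.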
On extremely large databases (the type that Teradata supports) you find that they don't use foreign keys. The reason is performance. Every time you write to the database, which happens often enough in a data warehouse, you have the added overhead of checking all the FKs on a table. If you already know the data to be valid, what's the point?
Good design on a small db would just mean you put them in, but there are performance gains to be had by leaving them out.
You don't really have to use foreign keys.
If you don't have them, data might become inconsistent and you won't be able to use cascading deletes and updates.
If you have them, you might lose some of the users' data due to a bug in your SQL statements that appears because of schema changes.
Some prefer to have them, some prefer life without them. There is no clear-cut advantage either way.
Here is a real life instance where I'm not using a foreign key.
I needed a way to store a parent child relationship where the child may not exist, and the child is an abstract class. Since the child could be of a few types, I use one field to name the type of the child and one field to list the id of the child. The application handles most of the logic.
I'm not sure if this was the best design decision, but it was the best I could come up with under the deadline. It's been working well so far!