How to search children nodes in SQL - php

Let's say I have a table called Tags with an id column, a name column, and a parent_id column. Many tags are nested using the parent_id column. How would I efficiently check whether Tag A has Tag B as a non-direct child?
Previously I selected all tags that have a parent_id of the current tag, then repeated the query for each child element in the result.
How would I do this more efficiently, to get all tags that match a search and are a direct or non-direct child?
Thanks for the help,
Jason

Rather than discuss in the comments... here is what I would recommend:
If you want to stick with MySQL but can play with the structure of your database then absolutely this "Closure Table" pattern suggested by Bill Karwin is the way to go. It allows you to keep your data in a flat table design while abstracting the multi-level tree structure into a separate table for easy data extraction.
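A minimal sketch of the closure table idea, assuming a second table (here called tag_paths, a made-up name) that stores one row per ancestor/descendant pair. It uses Python's built-in sqlite3 module in place of MySQL so it can be run as-is; the SQL statements are plain enough that they should carry over, but treat the table names, column names and sample tags as illustrative assumptions.

```python
import sqlite3

# SQLite stands in for MySQL here; the schema and names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tags (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- One row per (ancestor, descendant) pair, including each tag paired with itself.
CREATE TABLE tag_paths (
    ancestor   INTEGER NOT NULL REFERENCES tags(id),
    descendant INTEGER NOT NULL REFERENCES tags(id),
    depth      INTEGER NOT NULL,
    PRIMARY KEY (ancestor, descendant)
);
""")


def add_tag(tag_id, name, parent_id=None):
    """Insert a tag plus a path row from itself and from every ancestor."""
    conn.execute("INSERT INTO tags (id, name) VALUES (?, ?)", (tag_id, name))
    # Self-reference row at depth 0.
    conn.execute(
        "INSERT INTO tag_paths (ancestor, descendant, depth) VALUES (?, ?, 0)",
        (tag_id, tag_id),
    )
    if parent_id is not None:
        # Copy every path that ends at the parent, extended by one level.
        conn.execute(
            """
            INSERT INTO tag_paths (ancestor, descendant, depth)
            SELECT ancestor, ?, depth + 1
            FROM tag_paths
            WHERE descendant = ?
            """,
            (tag_id, parent_id),
        )


def is_descendant(ancestor_id, descendant_id):
    """True if descendant_id sits anywhere below ancestor_id."""
    row = conn.execute(
        "SELECT 1 FROM tag_paths"
        " WHERE ancestor = ? AND descendant = ? AND depth > 0",
        (ancestor_id, descendant_id),
    ).fetchone()
    return row is not None


def search_descendants(ancestor_id, pattern):
    """Names of direct or indirect children matching a LIKE pattern."""
    rows = conn.execute(
        """
        SELECT t.name
        FROM tag_paths AS p
        JOIN tags AS t ON t.id = p.descendant
        WHERE p.ancestor = ? AND p.depth > 0 AND t.name LIKE ?
        ORDER BY t.name
        """,
        (ancestor_id, pattern),
    ).fetchall()
    return [name for (name,) in rows]


# Sample data (made up for illustration).
add_tag(1, "Clothing")
add_tag(2, "Shoes", parent_id=1)
add_tag(3, "Running shoes", parent_id=2)
add_tag(4, "Books")
```

With this layout, "is Tag B somewhere under Tag A" is a single indexed lookup, and a search restricted to a subtree is one join, regardless of how deep the nesting goes. The cost is extra rows and a little more work on insert, move and delete.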
If you want to try a different relational database system, you might try SQL Server Express, which is free from Microsoft. In full disclosure, I don't use it, so I don't know what functionality is excluded (and I'm sure something is, otherwise you wouldn't get it for free). So please do some research to make sure recursive Common Table Expressions (CTEs) are available. If they are, you can follow Pinal Dave's blog post on the recursive SQL technique using CTEs.
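For reference, here is roughly what a recursive CTE over the original id / name / parent_id table looks like. This is a sketch run against SQLite through Python's sqlite3 module, not SQL Server; the WITH RECURSIVE spelling shown is also what MySQL accepts from version 8.0 onwards, while SQL Server omits the RECURSIVE keyword. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tags (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    parent_id INTEGER REFERENCES tags(id)
);
INSERT INTO tags (id, name, parent_id) VALUES
    (1, 'Clothing', NULL),
    (2, 'Shoes', 1),
    (3, 'Running shoes', 2),
    (4, 'Books', NULL);
""")

DESCENDANTS_SQL = """
WITH RECURSIVE subtree (id, name, depth) AS (
    -- Anchor: direct children of the starting tag.
    SELECT id, name, 1 FROM tags WHERE parent_id = :root
    UNION ALL
    -- Recursive step: children of the rows found so far.
    SELECT t.id, t.name, s.depth + 1
    FROM tags AS t
    JOIN subtree AS s ON t.parent_id = s.id
)
SELECT id, name, depth FROM subtree ORDER BY depth, id
"""


def descendants(root_id):
    """All direct and indirect children of root_id, nearest first."""
    return conn.execute(DESCENDANTS_SQL, {"root": root_id}).fetchall()


clothing_subtree = descendants(1)
books_subtree = descendants(4)
```

The database walks the tree in one statement, so the application no longer issues one query per level.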
Otherwise, if you think you will only ever have a handful of levels to work with, you can use the original suggestion and hardcode the number of levels.

Related

Nested Set Database Design vs Search Tags

Currently I categorise my items by giving them a tag that is attached to the item itself. For example, Men,Accessories,Watch.
My queries are mainly search-based; if I need to scale, I plan to look more at Elasticsearch.
I'm considering trying Nested Set to categorise my items.
If I use Nested Set, does it mean I need three tables: an Item table, a Link table (mapping each item to a Nested Set ID), and a Nested Set table?
In terms of scalability, is it stupid to stick with my current "tag search"? Would there be a big difference between tag search and a proper Nested Set?
I tried searching the web but can't seem to find how people are using Nested Set, especially the middle table joining them up.
I need some advice here. Which way should I go, and why? I personally prefer tags, mainly because it's already working and I have no idea how to go about Nested Set with Laravel's "packages".
I have this nested set table:
id
parentid
left
right
depth
I have another ITEM table. How do I "connect them" up?
1. No. You need just the left and right fields. 2. You can try closure tables and materialized paths. I don't know much about any of them, but the adjacency list model isn't so bad. Maybe there is a bottleneck in the hardware?
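To make the "connect them up" part concrete: if each item belongs to exactly one category, a foreign key on the item table is enough, and a separate link table only becomes necessary when an item can sit in several categories. The sketch below assumes the single-category case. It runs on SQLite via Python's sqlite3 module; the columns are named lft and rgt because LEFT and RIGHT are SQL keywords, and the category and item rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    lft  INTEGER NOT NULL,   -- "left" bound of the nested set
    rgt  INTEGER NOT NULL    -- "right" bound of the nested set
);
CREATE TABLE items (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    category_id INTEGER NOT NULL REFERENCES categories(id)
);
-- Men (1..6) contains Accessories (2..5), which contains Watch (3..4).
INSERT INTO categories (id, name, lft, rgt) VALUES
    (1, 'Men', 1, 6),
    (2, 'Accessories', 2, 5),
    (3, 'Watch', 3, 4),
    (4, 'Women', 7, 8);
INSERT INTO items (id, name, category_id) VALUES
    (1, 'Steel watch', 3),
    (2, 'Leather belt', 2),
    (3, 'Silk scarf', 4);
""")

ITEMS_UNDER_SQL = """
SELECT i.name
FROM categories AS root
JOIN categories AS c ON c.lft BETWEEN root.lft AND root.rgt
JOIN items AS i ON i.category_id = c.id
WHERE root.name = ?
ORDER BY i.name
"""


def items_under(category_name):
    """Items filed in the category itself or in any category nested inside it."""
    rows = conn.execute(ITEMS_UNDER_SQL, (category_name,)).fetchall()
    return [name for (name,) in rows]


men_items = items_under("Men")
watch_items = items_under("Watch")
```

A category's whole subtree is the set of rows whose lft falls between its own lft and rgt, which is why no recursion is needed for reads.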

What is proper SQL(MySQL) data structure for holding hierarchical tag-cloud / categories with shared sub-categories?

According to this article, Managing hierarchical data in MySQL, the most suitable solution for holding regular hierarchical data is the Nested Set Model, which I totally like, but unfortunately my task is slightly more difficult. I need to manage a hierarchical model where some of the sub-categories may have multiple parents, something like crossing sets. Similar problems are described here.
To be exact, I have a structure of categories in which each item can belong to multiple categories (and along the way I'll also need to provide some means of category inheritance: if an item belongs to TVs then it also belongs to Home_electronics, so a regular tag cloud won't do here).
tl;dr: I need a simple way / approach (maybe complex to implement, but simple to manage: delete, add and find path) to manage a model of categories with M:M relations.
Sadly I'm limited to MySQL only, but if this task can't be solved with SQL alone, I'll move on to implementing this functionality in PHP (and so I'd be glad to hear of any out-of-the-box solutions to this problem, libraries or just sources, but that's for the worst-case scenario).
It looks like the thing I'm looking for is called a Directed Acyclic Graph (which was pretty obvious, but I was probably too dumb to think of it :)).
It would be good to see an implementation of it that is easy to manage.
And by the way, the regular ID, ParentID, Data approach is not an option, because MySQL doesn't have recursion and thus can't retrieve the data in one query (well, it can if you make PHP build a query with 1000 JOINs and pass it to MySQL, but that's absurd).
PS: Using only MySQL is not my decision, it's simply a given; I know that a NoSQL DBMS would be more suitable.
Shouldn't an M-to-N relationship table in MySQL work?
tbl_Category
cat_id, name
tbl_Cat_parent
cat_id, parent_cat_id
where parent_cat_id refers to Category->cat_id
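As a rough illustration of how the two tables above can answer the inheritance question ("an item in TVs also belongs to Home_electronics"), the sketch below collects every ancestor of a category, following all of its parents. It relies on a recursive CTE, which MySQL supports from version 8.0 onwards (later than the question's "MySQL doesn't have recursion" remark); here it runs on SQLite through Python's sqlite3 module. The sample categories other than TVs and Home_electronics are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_Category (
    cat_id INTEGER PRIMARY KEY,
    name   TEXT NOT NULL
);
-- One row per parent link; a category may appear with several parents.
CREATE TABLE tbl_Cat_parent (
    cat_id        INTEGER NOT NULL REFERENCES tbl_Category(cat_id),
    parent_cat_id INTEGER NOT NULL REFERENCES tbl_Category(cat_id),
    PRIMARY KEY (cat_id, parent_cat_id)
);
INSERT INTO tbl_Category (cat_id, name) VALUES
    (1, 'Home_electronics'),
    (2, 'Living_room'),
    (3, 'TVs'),
    (4, 'Kitchen'),
    (5, 'OLED_TVs'),
    (6, 'Home');
INSERT INTO tbl_Cat_parent (cat_id, parent_cat_id) VALUES
    (1, 6),   -- Home_electronics is under Home
    (2, 6),   -- Living_room is under Home
    (3, 1),   -- TVs is under Home_electronics ...
    (3, 2),   -- ... and also under Living_room
    (5, 3);   -- OLED_TVs is under TVs
""")

ANCESTORS_SQL = """
WITH RECURSIVE ancestors (cat_id) AS (
    -- Anchor: the direct parents of the starting category.
    SELECT parent_cat_id FROM tbl_Cat_parent WHERE cat_id = :start
    UNION  -- UNION (not UNION ALL) drops ancestors reached by several paths
    SELECT p.parent_cat_id
    FROM tbl_Cat_parent AS p
    JOIN ancestors AS a ON p.cat_id = a.cat_id
)
SELECT c.name
FROM ancestors AS a
JOIN tbl_Category AS c ON c.cat_id = a.cat_id
ORDER BY c.name
"""


def inherited_categories(cat_id):
    """Every category that an item filed under cat_id also belongs to."""
    rows = conn.execute(ANCESTORS_SQL, {"start": cat_id}).fetchall()
    return [name for (name,) in rows]


oled_ancestors = inherited_categories(5)
```

Adding or removing a parent is a single row insert or delete in tbl_Cat_parent. Keeping the graph acyclic is not enforced by this schema and would have to be checked by the application.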

Right mysql table design/relations in this scenario

I have a situation where I need suggestions on database table design.
BACKGROUND
I am developing an application in PHP (CakePHP, to be precise) where we upload an XML file; it parses the file and saves the data in the database. The XML sources can be files or URL feeds, and they are purchased from various data suppliers. The aim is to collect venue data from the source URLs; venues can be anything, such as hotels, cinemas, schools, restaurants, etc.
Problem
The initial table structure for these venues is below. The table is designed to store generic information initially.
id
Address
Postcode
Lat
Long
SourceURL
Source
Type
Phone
Email
Website
With more data coming from different sources, I realized that different types of venues have many different attributes.
For example,
a hotel can have attributes like
price_for_one_day, types_of_accommodation, Number_of_rooms etc.,
whereas schools will not have them but will have a different set of attributes. Restaurants will have other attributes again.
My first idea is to create two tables called venue_attribute_names and venue_attributes
##table venue_attribute_names
_____________________________
id
name
##table venue_attributes
________________________
id
venue_id
venue_attribute_name_id
value
So if I detect a new attribute, I want to create it and then store its value in the attributes table with a relation. But I doubt this is the correct approach. I believe there could be another approach for this? Besides, if the table grows huge there could be performance issues because of the increase in joins and SQL queries.
Is creating the widest possible table, with all possible attributes as columns, the right approach? Please let me know. If there are any links I could refer to, I will follow them. Thanks
This is a surprisingly common problem.
The design you describe is commonly known as "Entity/Attribute/Value" or EAV. It has the benefit of allowing you to store all kinds of data without knowing in advance what the schema for that data is. It has the drawback of being hard to query - imagine finding all hotels in a given location, where the daily room rate is between $100 and $150, whose name starts with "Waldorf". Writing queries against all the attributes and applying boolean logic quickly becomes harder than you'd want it to be. You also can't easily apply database-level consistency checks like "hotel_name must not be null", or "daily_room_rate must be a number".
If neither of those concerns worries you, maybe your design works.
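To give a feel for the querying cost, here is a sketch of the "Waldorf" search against the two attribute tables from the question (location filter left out to keep it short). It runs on SQLite through Python's sqlite3 module; the attribute names "name" and "daily_room_rate", the trimmed-down venues table, and the sample rows are assumptions made for the example. Every attribute that takes part in the filter needs its own pair of joins, and because value is stored as text the rate has to be cast before it can be compared numerically (in MySQL that would be something like CAST(... AS DECIMAL(10,2))).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE venues (
    id   INTEGER PRIMARY KEY,
    type TEXT NOT NULL
);
CREATE TABLE venue_attribute_names (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE venue_attributes (
    id                      INTEGER PRIMARY KEY,
    venue_id                INTEGER NOT NULL REFERENCES venues(id),
    venue_attribute_name_id INTEGER NOT NULL REFERENCES venue_attribute_names(id),
    value                   TEXT   -- every value is text, whatever it means
);
INSERT INTO venues (id, type) VALUES
    (1, 'hotel'), (2, 'hotel'), (3, 'hotel'), (4, 'school');
INSERT INTO venue_attribute_names (id, name) VALUES
    (1, 'name'), (2, 'daily_room_rate');
INSERT INTO venue_attributes (venue_id, venue_attribute_name_id, value) VALUES
    (1, 1, 'Waldorf Astoria'), (1, 2, '120'),
    (2, 1, 'Waldorf Budget'),  (2, 2, '90'),
    (3, 1, 'Grand Plaza'),     (3, 2, '130'),
    (4, 1, 'Waldorf School');
""")

# Two joins per filtered attribute: one for the value row, one for its name.
WALDORF_SQL = """
SELECT v.id, name_attr.value, CAST(rate_attr.value AS REAL)
FROM venues AS v
JOIN venue_attributes AS name_attr
    ON name_attr.venue_id = v.id
JOIN venue_attribute_names AS name_def
    ON name_def.id = name_attr.venue_attribute_name_id
   AND name_def.name = 'name'
JOIN venue_attributes AS rate_attr
    ON rate_attr.venue_id = v.id
JOIN venue_attribute_names AS rate_def
    ON rate_def.id = rate_attr.venue_attribute_name_id
   AND rate_def.name = 'daily_room_rate'
WHERE v.type = 'hotel'
  AND name_attr.value LIKE 'Waldorf%'
  AND CAST(rate_attr.value AS REAL) BETWEEN 100 AND 150
ORDER BY v.id
"""

waldorf_hotels = conn.execute(WALDORF_SQL).fetchall()
```

Nothing in the schema stops someone from storing 'cheap' as a daily_room_rate, which is the consistency problem mentioned above.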
The second option is to store the "common" fields in a traditional relational structure, but to store the variant data in some kind of document - MySQL supports XML, for instance. That allows you to define an XML schema, and query using XPath etc.
This approach gives you better data integrity than EAV, because you can apply schema constraints. It does mean that you have to create a schema for each type of data you're dealing with. That might be okay for you - I'm guessing that the business doesn't add dozens of new venue types every week.
Performance with XML querying can be tricky, and general tooling and the development approach will make it harder to build than "just SQL".
The final option if you want to stick with a relational database is to simply bite the bullet and use "pure" SQL. You can create a "master" table with the common attributes, and a "restaurant" table with the restaurant-specific attributes, a "hotel" table with the hotel attributes. This works as long as you have a manageable number of venue types, and they don't crop up unpredictably.
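A small sketch of that "master table plus one table per venue type" layout, again using SQLite through Python's sqlite3 module in place of MySQL. The hotel columns come from the question; the cuisine column on the restaurant table and all sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Master table: attributes every venue has.
CREATE TABLE venues (
    id       INTEGER PRIMARY KEY,
    type     TEXT NOT NULL,
    address  TEXT NOT NULL,
    postcode TEXT
);
-- One table per venue type, sharing the master table's key.
CREATE TABLE hotels (
    venue_id          INTEGER PRIMARY KEY REFERENCES venues(id),
    price_for_one_day REAL    NOT NULL,
    number_of_rooms   INTEGER NOT NULL
);
CREATE TABLE restaurants (
    venue_id INTEGER PRIMARY KEY REFERENCES venues(id),
    cuisine  TEXT NOT NULL        -- hypothetical restaurant-only attribute
);
INSERT INTO venues (id, type, address, postcode) VALUES
    (1, 'hotel', '1 Park Lane', 'W1K 1AA'),
    (2, 'restaurant', '5 High Street', 'SW1A 1AA');
INSERT INTO hotels (venue_id, price_for_one_day, number_of_rooms)
    VALUES (1, 120.0, 200);
INSERT INTO restaurants (venue_id, cuisine) VALUES (2, 'Italian');
""")

# Typed columns make range filters a plain comparison.
mid_priced_hotels = conn.execute(
    """
    SELECT v.id, v.address, h.price_for_one_day
    FROM venues AS v
    JOIN hotels AS h ON h.venue_id = v.id
    WHERE h.price_for_one_day BETWEEN 100 AND 150
    """
).fetchall()

# The database itself can now reject incomplete rows.
try:
    conn.execute(
        "INSERT INTO hotels (venue_id, price_for_one_day, number_of_rooms)"
        " VALUES (2, NULL, 10)"
    )
    null_price_rejected = False
except sqlite3.IntegrityError:
    null_price_rejected = True
```

Each new venue type means a new table and a schema change, which is the trade-off described above.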
Finally, you could look at NoSQL options.
If you are sticking with a relational database, that's it. The options you listed are pretty much what it can give you.
For your situation, MongoDB (or another document-oriented NoSQL system) could be a good option. These database systems are very good if you have a lot of records with different attributes.

How to query for multiple types of an object (using a key-value table) and grouping the results together as a complete object

I hope I asked the question properly. I have a table of objects grouped by object_id. They are stored as a key / value. I thought this would be simple but I cannot find a solution anywhere. I'm trying to get the most efficient method of querying against this table to return a full object based on multiple meta_name values. Here's the table structure:
Here's the code I have so far, which works great to query one value:
SELECT data2.object_id, data2.object, data2.meta_name, data2.value_string, data2.value_text
FROM meta_data AS data1
LEFT JOIN meta_data AS data2 ON (data1.object_id = data2.object_id)
WHERE data1.object = "domain"
AND data1.meta_name = "category"
AND data1.value_string = "programmer"
This gives me the following results. This is great for a single taxonomy (domain in category programmer).
The problem comes when I want to query for all domains with category programmer AND color red AND possibly other meta_name = value_strings. I can find no solution for this outside of making multiple queries from PHP (which I want to avoid for obvious performance reasons).
I need to point out that objects will be created on the fly, and without a specific schema (which is the point of having this structure to begin with) so I cannot hard code and assume anything about an object (Objects may have more meta properties defined to them from the admin panel at any given time).
Again, I hope I am asking this question right, since I have been completely unlucky in finding a solution by searching online for the last 3 days.
Thank you so much ahead of time to the MySQL pro that can help me with this!
In situations like this, solutions typically query all records to avoid multiple queries, then stitch the rows together into data objects in the desired format. You can then develop simple find() methods on those objects to filter the results further (e.g. using array functions).
If you're interested in an exact implementation, I encourage you to look at WordPress, since you mentioned taxonomies. It is an open source project, so you can review its code for an example of how this is done. Take a look at the Taxonomies API as well as the Meta API.
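As a complement to the stitch-in-code approach, here is one possible way to do the "category programmer AND color red" filter in a single query and then stitch the matching rows into objects. It is a sketch, not the WordPress implementation: the meta_data columns are inferred from the query in the question (the original table listing is not shown), value_text is left out, the sample rows are invented, and SQLite via Python's sqlite3 module stands in for MySQL. The inner query keeps only object_ids that match every requested pair, by counting the distinct meta names matched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE meta_data (
    object_id    INTEGER NOT NULL,
    object       TEXT NOT NULL,
    meta_name    TEXT NOT NULL,
    value_string TEXT
);
INSERT INTO meta_data (object_id, object, meta_name, value_string) VALUES
    (1, 'domain', 'category', 'programmer'),
    (1, 'domain', 'color', 'red'),
    (1, 'domain', 'name', 'alpha.example'),
    (2, 'domain', 'category', 'programmer'),
    (2, 'domain', 'color', 'blue'),
    (3, 'domain', 'category', 'designer'),
    (3, 'domain', 'color', 'red');
""")


def find_objects(object_type, criteria):
    """Return {object_id: {meta_name: value}} for objects matching ALL criteria.

    criteria maps meta_name -> required value_string.
    """
    if not criteria:
        raise ValueError("at least one meta_name/value pair is required")
    # One "(meta_name = ? AND value_string = ?)" group per requested pair.
    pair_clause = " OR ".join(
        ["(meta_name = ? AND value_string = ?)"] * len(criteria)
    )
    params = [object_type]
    for meta_name, value in criteria.items():
        params += [meta_name, value]
    params.append(len(criteria))
    sql = f"""
        SELECT d.object_id, d.meta_name, d.value_string
        FROM meta_data AS d
        WHERE d.object_id IN (
            SELECT object_id
            FROM meta_data
            WHERE object = ? AND ({pair_clause})
            GROUP BY object_id
            -- An object qualifies only if every requested name matched.
            HAVING COUNT(DISTINCT meta_name) = ?
        )
        ORDER BY d.object_id
    """
    # Stitch the key/value rows back into one dict per object.
    objects = {}
    for object_id, meta_name, value in conn.execute(sql, params):
        objects.setdefault(object_id, {})[meta_name] = value
    return objects


red_programmers = find_objects("domain", {"category": "programmer", "color": "red"})
```

Only the placeholders are generated dynamically; the values are always passed as bound parameters, so new meta properties can be searched without hard-coding them.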

PHP/MySQL database design for various/variable content - modular system

I'm trying to build (right now just thinking/planning/drawing relations :] ) a little modular system for building basic websites (mostly to simplify common tasks we as web designers do routinely).
I got a little stuck with the database design / the whole idea of storing content.
1., What is most painful on most websites (in my experience) are pages with almost the same layout/skeleton but different information, e.g. a title, a picture, and a set of information. Making special templates / special modules in a CMS tends to cost more energy than editing the page as text. However, we then lose some operational potential: we can't get "only titles", because the CMS/system sees the whole content as one text field.
So, I would like to have two tables: one to hold information about what structure the content has (e.g. just a variable number of photos, from 1 to 500 :], or title & text & photo (large) & gallery), the HOW, and another table with all contents, modules and parts of "collections" (my working name for variously structured information), the WHAT.
table module_descriptors (HOW)
id int
structure - *???*
table modules (WHAT)
id int
module_type - #link to module_descriptors id
content - *???*
2., What I like about this is that I don't need many tables. I don't like databases with 6810 tables, one for each module, for its description, for misc. number-to-text relations, ... and I also don't like tables with 60 columns, like content_us, content_it, category_id, parent_id.
I'm thinking I could hold the structure description and the content itself (marked with ??? above) as either XML or CSV, but maybe I'm trying to reinvent the wheel and the answer is hidden in some design pattern I haven't looked into.
I hope I make some sense and will get some replies: give me your opinion, pros, cons... or send me to hell. Thank you
EDIT: My question is also this: Does this approach make sense? Is it edit-friendly? Isn't there something better? Is it moral? Do kittens die when I do this? Isn't it too much for the server if I want to read & compare 30 XMLs pulled from the DB (e.g. I want to compare something)? The technical part, how to do it, is just one part of the question :)
The design pattern you're hinting at is called Serialized LOB. You can store some data in the conventional way (as columns) for attributes that are the same for every entry. For attributes that are variable, format them as XML or Markdown or whatever you want, and store the result in a TEXT or BLOB column.
Of course you lose the ability to use SQL expressions to query individual elements within the BLOB. Anything you need to use in searching or sorting should be in conventional columns.
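A minimal sketch of the Serialized LOB idea, assuming a modules table where title is a conventional column (so "only titles" stays a plain query) and everything variable goes into one XML text column. It uses Python's sqlite3 and xml.etree.ElementTree modules; the field names, the sample module, and storing module_type as text rather than as a link to module_descriptors are simplifications made for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE modules (
    id          INTEGER PRIMARY KEY,
    module_type TEXT NOT NULL,   -- simplified; the question links this to module_descriptors
    title       TEXT NOT NULL,   -- searchable/sortable, so a real column
    content     TEXT NOT NULL    -- variable part, serialized as XML
);
""")


def save_module(module_type, title, fields):
    """Serialize the variable fields to XML and store them in one column."""
    root = ET.Element("content")
    for key, value in fields.items():
        ET.SubElement(root, key).text = str(value)
    xml_blob = ET.tostring(root, encoding="unicode")
    cursor = conn.execute(
        "INSERT INTO modules (module_type, title, content) VALUES (?, ?, ?)",
        (module_type, title, xml_blob),
    )
    return cursor.lastrowid


def load_module(module_id):
    """Return (title, fields); the XML is parsed in the application, not in SQL."""
    title, xml_blob = conn.execute(
        "SELECT title, content FROM modules WHERE id = ?", (module_id,)
    ).fetchone()
    fields = {child.tag: child.text for child in ET.fromstring(xml_blob)}
    return title, fields


gallery_id = save_module(
    "gallery", "Summer trip", {"photo_count": 12, "text": "Photos from July"}
)
article_id = save_module("article", "About us", {"text": "We build websites."})

# "Only titles" is an ordinary query because title is a real column.
all_titles = [t for (t,) in conn.execute("SELECT title FROM modules ORDER BY title")]
```

Note that values come back from the XML as strings (photo_count returns as "12"), so any typing has to be restored by the application.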
Re comment: If your text blob is in XML format, you could search it with XML functions supported by MySQL 5.1 and later. But this cannot benefit from an index, so it's going to result in very slow searches.
The same is true if you try to use LIKE or RLIKE with wildcards. Without using an index, searches will result in full table-scans.
You could also try to use a MySQL FULLTEXT index, but this isn't a good solution for searching XML data, because it won't be able to tell the difference between text content and XML tag names and XML attributes.
So just use conventional columns for any fields you want to search or sort by. You'll be happier that way.
Re question: If your documents really require variable structure, you have few choices. When used properly, SQL assumes that every row has the same structure (that is, columns). Your alternatives are:
Single Table Inheritance or Concrete Table Inheritance or Class Table Inheritance
Serialized LOB
Non-relational databases
Some people resort to an antipattern called Entity-Attribute-Value (EAV) to store variable attributes, but honestly, don't go there. For a story about how badly this can go wrong, read this article: Bad CaRMa.
