I'm having real trouble coming up with a way to build a recursive hierarchy in MySQL while summing the results as I go. Here's a simplified structure:
----------------------
id | name | parent_id
----------------------
1 | A | 0
2 | B | 1
3 | C | 1
4 | D | 2
5 | E | 2
6 | F | 3
7 | G | 3
I can build this menu recursively, either with a PHP loop or in MySQL:
A
-B
--D
--E
-C
--F
--G
However, these IDs reference another table (contacts); they are types of contacts. The issue is that only the leaves are assigned to contacts, but I need to roll the totals up to each level. So far I can get to:
A=0
-B=0
--D=100
--E=100
-C=0
--F=200
--G=200
But what I need is to roll up each subsection and add its sum to the parent (without a lot of queries). In reality, this tree has several hundred elements. This is just a simplified version, but I can't figure out how to walk back up and end up with:
A=600
-B=200
--D=100
--E=100
-C=400
--F=200
--G=200
I'd be happy with a MySQL or PHP implementation. Really just anything to get me headed in the right direction would be much appreciated.
If the number of elements in the tree is small enough, you can use ranged IDs.
For example, give the topmost parent the ID range 100000-199999, its first child the range 100000-109999, its second child 110000-119999, and so on. That way you know that every descendant of a node has an ID within that node's range.
When you want the count for a particular node, you just check whether each ID falls in that range. I hope this helps.
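For the rollup itself, here is a minimal sketch in Python (the same idea ports to a recursive PHP function). It assumes the whole type table has been fetched in one query as (id, name, parent_id) rows and that the per-leaf contact counts come from a single GROUP BY query. The names `rollup_totals`, `NODES`, and `LEAF_TOTALS` are made up for illustration.

```python
from collections import defaultdict

# Rows from the example table: (id, name, parent_id); parent_id 0 means "root".
NODES = [
    (1, "A", 0), (2, "B", 1), (3, "C", 1),
    (4, "D", 2), (5, "E", 2), (6, "F", 3), (7, "G", 3),
]

# Contact counts per leaf type, e.g. from one GROUP BY query on contacts.
LEAF_TOTALS = {4: 100, 5: 100, 6: 200, 7: 200}


def rollup_totals(nodes, leaf_totals, root_parent=0):
    """Return {id: total}, where total is the node's own count plus all descendants'."""
    children = defaultdict(list)
    for node_id, _name, parent_id in nodes:
        children[parent_id].append(node_id)

    totals = {}

    def visit(node_id):
        # Post-order walk: children are summed before their parent is stored.
        total = leaf_totals.get(node_id, 0)
        for child_id in children.get(node_id, []):
            total += visit(child_id)
        totals[node_id] = total
        return total

    for root_id in children.get(root_parent, []):
        visit(root_id)
    return totals


totals = rollup_totals(NODES, LEAF_TOTALS)
for node_id, name, _parent in NODES:
    print(name, totals[node_id])
```

This needs two queries in total (one for the tree, one for the leaf counts), and the summing happens in memory, which should be fine for a few hundred nodes.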
I'm building a REST API, so the answer can't rely on Google Maps or JavaScript.
In our app, we have a table containing posts that looks like this:
ID | latitude | longitude | other_stuff
1 | 50.4371243 | 5.9681102 | ...
2 | 50.3305477 | 6.9420498 | ...
3 | -33.4510148 | 149.5519662 | ...
We have a view with a map that shows all the posts around the world.
Hopefully we will have a lot of posts, and it would be ridiculous to show thousands and thousands of markers on the map. So we want to group them by proximity, ending up with something like 2-3 markers per continent.
To be clear, we need something like this:
(Image from https://github.com/googlemaps/js-marker-clusterer)
I've done some research and found that k-means seems to be part of the solution.
As I am really, really bad at math, I tried a couple of PHP libraries such as https://github.com/bdelespierre/php-kmeans, which seems to do a decent job.
However, there is a drawback: I have to process the whole table each time the map is loaded. Performance-wise, it's awful.
So I would like to know whether someone has already dealt with this problem, or whether there is a better solution.
I kept searching and found an alternative to k-means: geohash.
Wikipedia explains what it is better than I can (see its Geohash article).
To summarize: the world map is divided into a grid of 32 cells, each identified by an alphanumeric character.
Each cell is in turn divided into 32 cells, and so on for 12 levels.
So if I do a GROUP BY on the first character of the hash, I get my clusters for the lowest zoom level; if I want more precision, I just group by the first N characters of the hash.
So all I did was add one field to my table and generate the hash corresponding to my coordinates:
ID | latitude | longitude | geohash | other_stuff
1 | 50.4371243 | 5.9681102 | csyqm73ymkh2 | ...
2 | 50.3305477 | 6.9420498 | p24k1mmh98eu | ...
3 | -33.4510148 | 149.5519662 | 8x2s9674nd57 | ...
Now, if I want to get my clusters, I just have to run a simple query:
SELECT SUBSTRING(geohash,1,2) AS cell, COUNT(*) AS nb_markers FROM mtable GROUP BY SUBSTRING(geohash,1,2);
In the SUBSTRING call, 2 is the level of precision; it must be between 1 and 12.
PS : Lib I used to generate my hash
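To illustrate how the hash and the prefix grouping fit together, here is a minimal sketch in Python of the standard geohash encoding followed by an in-memory equivalent of the GROUP BY. It is not the library mentioned above; `geohash_encode`, `cluster_counts`, and `POSTS` are names I made up, and the coordinates are the three rows from the example table.

```python
from collections import Counter

# Geohash base-32 alphabet (digits plus letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=12):
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def cluster_counts(points, precision):
    """Count points per geohash cell, like GROUP BY SUBSTRING(geohash, 1, precision)."""
    return Counter(geohash_encode(lat, lon, precision) for lat, lon in points)


POSTS = [(50.4371243, 5.9681102), (50.3305477, 6.9420498), (-33.4510148, 149.5519662)]
for cell, nb_markers in cluster_counts(POSTS, 1).items():
    print(cell, nb_markers)
```

In the real setup the hash would be computed once when a post is saved, so the grouping stays in SQL and nothing is recomputed per map load.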
I need to store and retrieve items of a course plan in sequence. I also need to be able to add or remove items at any point.
The data looks like this:
-- chapter 1
--- section 1
----- lesson a
----- lesson b
----- drill b
...
I need to be able to identify the sequence so that when the student completes lesson a, I know that he needs to move to lesson b. I also need to be able to insert items in the sequence, like say drill a, and of course now the student goes from lesson a to drill a instead of going to lesson b.
I understand relational databases are not designed for sequences. Originally, I thought about using a simple autoincrement column to handle the sequence, but the insert requirement makes that unworkable.
I have seen this question and the first answer is interesting:
items table
item_id | item
1 | section 1
2 | lesson a
3 | lesson b
4 | drill a
sequence table
item_id | sequence
1 | 1
2 | 2
3 | 4
4 | 3
That way, I would keep adding items in the items table with whatever id and work out the sequence in the sequence table. The only problem with that system is that I need to change the sequence numbers for all items in the sequence table after an insertion. For instance, if I want to insert quiz a before drill a I need to update the sequence numbers.
Not a huge deal, but the solution seems a little overcomplicated. Is there an easier, smarter way to handle this?
Just relate records to the parent and use a sequence flag. You will still need to update all the records when you insert in the middle but I can't really think of a simple way around that without leaving yourself space to begin with.
items table:
id | name | parent_id | sequence
--------------------------------------
1 | chapter 1 | null | 1
2 | section 1 | 1 | 2
3 | lesson a | 2 | 3
4 | lesson b | 2 | 5
5 | drill a | 2 | 4
When you need to insert a record in the middle, queries like these will work:
UPDATE items SET sequence = sequence + 1 WHERE sequence > 3;
INSERT INTO items (name, parent_id, sequence) VALUES ('quiz a', 2, 4);
To select the data in order, your query will look like:
SELECT * FROM items ORDER BY sequence;
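As a rough sketch of how the two statements above could be wrapped so they always run together, here is a Python version using the built-in sqlite3 module as a stand-in for MySQL. The helper name `insert_after` is hypothetical; the table and rows are the ones from this answer.

```python
import sqlite3


def insert_after(conn, name, parent_id, after_sequence):
    """Insert a new item directly after the item whose sequence is `after_sequence`."""
    with conn:  # both statements commit together, or roll back together on error
        # Make room: shift everything that comes later down by one slot.
        conn.execute(
            "UPDATE items SET sequence = sequence + 1 WHERE sequence > ?",
            (after_sequence,),
        )
        conn.execute(
            "INSERT INTO items (name, parent_id, sequence) VALUES (?, ?, ?)",
            (name, parent_id, after_sequence + 1),
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    "id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER, sequence INTEGER)"
)
conn.executemany(
    "INSERT INTO items (name, parent_id, sequence) VALUES (?, ?, ?)",
    [
        ("chapter 1", None, 1),
        ("section 1", 1, 2),
        ("lesson a", 2, 3),
        ("lesson b", 2, 5),
        ("drill a", 2, 4),
    ],
)
conn.commit()

# Insert "quiz a" right after "lesson a" (sequence 3), i.e. before "drill a".
insert_after(conn, "quiz a", 2, 3)
ordered = [row[0] for row in conn.execute("SELECT name FROM items ORDER BY sequence")]
print(ordered)
```

Running the shift and the insert in one transaction means a reader never sees two rows with the same sequence number.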
I have this problem. I have a table (below) of groups. It's a recursive sort of table because each new group can have a parent group in the same table. So effectively we have a group > subgroup > subgroup > subgroup kinda model.
id | label | parent_id
1 | Ceiling | 0
2 | Window | 0
3 | Wall | 0
4 | Small | 2
5 | Large | 2
6 | Large | 1
7 | Paint | 4
So this would give something that looks like this:
Window > Small window > Paint
I've created the forms and the table for creating the groups, but it's the database query and the loops for actually getting the data into the above format that I'm having trouble with. Bit too much for my brain to handle :(
I'm doing it in this format because I want there to be complete control over the groups and the depth of the subgroups.
I don't really have code to give an example because it's more the problem solving I'm after.
** UPDATE **
A bit more specific: I want to list each parent group (that is, a group with parent_id set to 0) and its immediate subgroups, then each of those groups' immediate subgroups (if it has any), etc.
If you want to do it like this, you will always have to fetch the whole table into PHP and then do the traversal in PHP.
However, there is a similar method for managing such a structure, and it is very well described here:
http://www.sitepoint.com/hierarchical-data-database-2/
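As a rough illustration of the "fetch the whole table, then walk it in code" approach, here is a minimal sketch in Python (the logic translates directly to a recursive PHP function). It assumes the rows arrive as (id, label, parent_id) tuples; `list_group_paths` and `GROUPS` are names invented for this example.

```python
from collections import defaultdict

# Rows from the example table: (id, label, parent_id); parent_id 0 means top level.
GROUPS = [
    (1, "Ceiling", 0), (2, "Window", 0), (3, "Wall", 0),
    (4, "Small", 2), (5, "Large", 2), (6, "Large", 1), (7, "Paint", 4),
]


def list_group_paths(rows, separator=" > "):
    """Return one breadcrumb string per group, parents listed before their subgroups."""
    children = defaultdict(list)
    for group_id, label, parent_id in rows:
        children[parent_id].append((group_id, label))

    paths = []

    def walk(parent_id, trail):
        # Depth-first: emit a group, then everything below it, then its siblings.
        for group_id, label in children.get(parent_id, []):
            current = trail + [label]
            paths.append(separator.join(current))
            walk(group_id, current)

    walk(0, [])
    return paths


paths = list_group_paths(GROUPS)
for path in paths:
    print(path)
```

Because the children are grouped by parent_id up front, the table is read once and the recursion only touches in-memory lists, whatever the depth of the subgroups.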
What issues are associated with maintaining multiple trees within a single table?
The motivation for having multiple trees is to avoid excessive updates to all nodes when inserting a node at the start. Each tree is a completely separate entity.
Example Table:
tree_id | id | lft | rgt | parent_id | various fields . . .
---------------------------------------------------------------------
1 | 1 | 1 | 4 | NULL | ...
1 | 2 | 2 | 3 | 1 | ...
2 | 3 | 1 | 4 | NULL | ...
2 | 4 | 2 | 3 | 3 | ...
It's very common to store multiple trees in one table. You just have to make sure the values that make up each tree are stored correctly; otherwise you'll get data integrity issues such as nonsensical tree constructions.
Suppose we have a binary tree (like the one in your example). A full binary tree of depth n has 2^n - 1 nodes, so a tree of depth 5 has 2^5 - 1 = 31 nodes, or 31 rows in the database, which is trivial. Even at depth 10 (1,023 rows) it's still a small number of rows, though it would be a rather ginormous tree. With X trees in one table you'd have X(2^n - 1) rows, which isn't bad: a hundred trees of depth 10 would be only about 100k rows, which is relatively small.
Additionally, suppose every new tree were stored in its own table. The database would quickly fill up with as many tables as there are trees. Creating extra tables that aren't needed doesn't seem like a good idea, and it adds needless complexity on the code side, which would have to access all of those tables.
Looking at your table in detail, it doesn't quite look right in terms of columns, but I'm sure that table example is just something thrown up quickly to show us what you mean.
tree_id, node_id, left_node_id, right_node_id, various_fields...
Um, be sure to index those _id fields.
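To show why the tree_id column matters once several trees share a table, here is a small Python sketch of a nested-set descendant lookup over the example rows. The function name `descendants` and the dict layout are assumptions for illustration; the SQL in the comment uses a hypothetical table name `nodes`.

```python
# Rows from the example table; lft/rgt values repeat across trees.
ROWS = [
    {"tree_id": 1, "id": 1, "lft": 1, "rgt": 4, "parent_id": None},
    {"tree_id": 1, "id": 2, "lft": 2, "rgt": 3, "parent_id": 1},
    {"tree_id": 2, "id": 3, "lft": 1, "rgt": 4, "parent_id": None},
    {"tree_id": 2, "id": 4, "lft": 2, "rgt": 3, "parent_id": 3},
]


def descendants(rows, tree_id, node_id):
    """Return ids of all nodes nested inside `node_id`, within the same tree only.

    Roughly equivalent SQL (table name `nodes` is hypothetical):
        SELECT id FROM nodes
        WHERE tree_id = :tree_id AND lft > :node_lft AND rgt < :node_rgt
    """
    node = next(r for r in rows if r["tree_id"] == tree_id and r["id"] == node_id)
    return [
        r["id"]
        for r in rows
        if r["tree_id"] == tree_id  # without this, other trees' nodes would match too
        and r["lft"] > node["lft"]
        and r["rgt"] < node["rgt"]
    ]


print(descendants(ROWS, 1, 1))
print(descendants(ROWS, 2, 3))
```

Since every tree restarts its lft/rgt numbering at 1, each query has to filter on tree_id, which is also why a composite index starting with tree_id is likely to help.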
I'm storing categories using a hierarchical model like so:
CATEGORIES
id | parent_id | name
---------------------
1 | 0 | Cars
2 | 0 | Planes
3 | 1 | Hatchbacks
4 | 1 | Convertibles
5 | 2 | Jets
6 | 3 | Peugeot
7 | 3 | BMW
8 | 6 | 206
9 | 6 | 306
I then store actual data with one of these category ids like so:
CARS
vehicle_id | category_id | name
-------------------------------
1 | 8 | Really fast silver Peugeot 206
2 | 9 | Really fast silver Peugeot 306
3 | 5 | Really fast Boeing 747
4 | 3 | Another Peugeot but only in Hatchbacks category
When searching for any of this data, I would like to find all child / grandchild / great grandchild etc. etc. nodes. So if someone wants to see all "Cars", they see everything with a parent_id of "Hatchbacks", and so everything with a parent_id of "Peugeot", and so on, to an arbitrary level.
So if I list a "really fast Peugeot 206" with a category_id of either 1, 3, 6, or 8, my query should be able to "travel up" the tree and find any higher categories which are parents/grandparents of that child category. E.g. a user searching for Peugeots in category "8" should find any Peugeots listed with categories 6, 3, or 1, all of which are category 8's ancestors.
E.g. using the above data, searching for "Peugeot" in category 3 should actually find vehicles 1, 2 and 4, because vehicles 1 and 2 have a category ancestor trail which leads back up to category 3. See?
Sorry if I haven't explained this well. It's difficult! Thank you, though.
Note: I have read the MySQL dev article on hierarchies.
Normalized models are great, but not when you actually have to query them.
Just store the "path" to each category in the category table, like this: path = /1/3/4. Then query your database with something like "select .... where path like '/1/3/%'". It will be much simpler and faster than multiple hierarchical queries...
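Here is a minimal sketch of that path idea, using Python's built-in sqlite3 module as a stand-in for MySQL and the tables from the question. One assumption differs slightly from the example above: each path is stored with a trailing slash (e.g. /1/3/), so that a LIKE on '/1/3/%' also matches category 3 itself and cannot match a sibling such as /1/30/. The helper names `build_paths` and `vehicles_under` are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE categories (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT, path TEXT)"
)
conn.execute(
    "CREATE TABLE cars (vehicle_id INTEGER PRIMARY KEY, category_id INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO categories (id, parent_id, name) VALUES (?, ?, ?)",
    [
        (1, 0, "Cars"), (2, 0, "Planes"), (3, 1, "Hatchbacks"),
        (4, 1, "Convertibles"), (5, 2, "Jets"), (6, 3, "Peugeot"),
        (7, 3, "BMW"), (8, 6, "206"), (9, 6, "306"),
    ],
)
conn.executemany(
    "INSERT INTO cars (vehicle_id, category_id, name) VALUES (?, ?, ?)",
    [
        (1, 8, "Really fast silver Peugeot 206"),
        (2, 9, "Really fast silver Peugeot 306"),
        (3, 5, "Really fast Boeing 747"),
        (4, 3, "Another Peugeot but only in Hatchbacks category"),
    ],
)


def build_paths(conn):
    """Fill categories.path with the ancestor trail, e.g. '/1/3/6/' for Peugeot."""
    parent_of = dict(conn.execute("SELECT id, parent_id FROM categories").fetchall())
    for category_id in parent_of:
        trail = []
        current = category_id
        while current != 0:  # parent_id 0 marks a top-level category
            trail.append(str(current))
            current = parent_of[current]
        path = "/" + "/".join(reversed(trail)) + "/"
        conn.execute("UPDATE categories SET path = ? WHERE id = ?", (path, category_id))
    conn.commit()


def vehicles_under(conn, category_id):
    """Return vehicle ids listed in the category or in any of its descendants."""
    (path,) = conn.execute(
        "SELECT path FROM categories WHERE id = ?", (category_id,)
    ).fetchone()
    rows = conn.execute(
        "SELECT v.vehicle_id FROM cars v "
        "JOIN categories c ON c.id = v.category_id "
        "WHERE c.path LIKE ? ORDER BY v.vehicle_id",
        (path + "%",),
    )
    return [row[0] for row in rows]


build_paths(conn)
print(vehicles_under(conn, 3))
```

The paths only need rebuilding when a category is moved or added, so the extra work lands on the rare writes rather than on every search.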
This article can help you http://www.phpro.org/tutorials/Managing-Hierarchical-Data-with-PHP-and-MySQL.html
I like the explanation provided by SitePoint. It gives you code and explains the theory behind it.
http://blogs.sitepoint.com/hierarchical-data-database/
Note: this method is better for reads than for writes. If you're constantly writing to the tree, I'd use a different algorithm. This method is optimized for reads (lookups).
You've represented your data as an Adjacency List model, whose querying in MySQL is best done using session variables. Now, this is not the only way you can represent a hierarchy in a relational database. For your particular problem, I would probably use a materialized path approach instead, where you do away with the actual categories table and instead have a column on your cars table that looks like Cars/Hatchbacks/Peugeot on a per-record basis, and use LIKE queries. Unfortunately that would be slow as the number of records grew. Now, if you know the maximum depth of your hierarchy (e.g. four levels), you could break that out into separate columns instead, which would allow you to take advantage of indexing.