Why?
I am trying to dynamically find where foreign keys point. For this I search in information_schema.KEY_COLUMN_USAGE. It works fine for tables, but not for views.
Views are listed in information_schema.VIEWS, where the view_definition field exposes the query.
I think this is the only place where I will find information about where view fields come from, right?
Then, I would search for my field name between the SELECT and the FROM. If it is an alias, get the table field name and the table name (and resolve the table if it is an alias).
One last complication: the view can refer to another view, so the code will have to be recursive.
Let's take an example (view name is vw_mandates_articles):
select ma.*, a.id_articles_unit, a.id_articles_category from mandates_articles ma
left join articles a on ma.id_article = a.id
The way it is stored in the VIEWS table is:
select `ma`.`id` AS `id`,
`ma`.`id_mandate` AS `id_mandate`,
`ma`.`id_article` AS `id_article`,
`ma`.`unit_price` AS `unit_price`,
`ma`.`description` AS `description`,
`a`.`id_articles_unit` AS `id_articles_unit`,
`a`.`id_articles_category` AS `id_articles_category`
from (`ste`.`mandates_articles` `ma`
left join `ste`.`articles` `a` on((`ma`.`id_article` = `a`.`id`)))
My inputs are:
- the view name (vw_mandates_articles)
- the field name (id_articles_category)
The expected output:
- the field table (ste.articles)
- the field name (id_articles_category) //could be same as input but not necessarily
I am not asking someone to write it for me, I just want to validate the approach before digging.
Any thoughts? Good/bad approach, alternatives?
Thanks in advance for your insights.
Yes. For a view, the only place the column origins are recorded is the query stored in information_schema.VIEWS.
No, there is no better way than parsing (exploding, etc.) that query.
I wouldn't recommend views built on other views. They are likely to be slow: MySQL may have to materialize intermediate results in temporary tables, possibly on disk, which hurts performance.
Even if it isn't best practice, I'd tend to accept more redundancy and fetch the data with one single query (with at most one subselect).
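A minimal sketch of the parsing approach from the question, in Python, assuming view_definition has exactly the back-quoted shape shown above. The function name resolve_view_column and both regular expressions are illustrative only. Expressions, subqueries, and views built on other views (which would need the recursive step) are not handled.

```python
import re

# The stored definition from the question, as it appears in information_schema.VIEWS.
VIEW_DEFINITION = """select `ma`.`id` AS `id`,
`ma`.`id_mandate` AS `id_mandate`,
`ma`.`id_article` AS `id_article`,
`ma`.`unit_price` AS `unit_price`,
`ma`.`description` AS `description`,
`a`.`id_articles_unit` AS `id_articles_unit`,
`a`.`id_articles_category` AS `id_articles_category`
from (`ste`.`mandates_articles` `ma`
left join `ste`.`articles` `a` on((`ma`.`id_article` = `a`.`id`)))"""


def resolve_view_column(view_definition, field_name):
    """Return (schema.table, column) behind a view field, or None if not found."""
    # Table aliases look like: `schema`.`table` `alias`
    aliases = {
        alias: f"{schema}.{table}"
        for schema, table, alias in re.findall(
            r"`(\w+)`\.`(\w+)` `(\w+)`", view_definition
        )
    }
    # Select-list entries look like: `alias`.`column` AS `field_name`
    match = re.search(
        r"`(\w+)`\.`(\w+)` AS `" + re.escape(field_name) + r"`",
        view_definition,
    )
    if match is None:
        return None
    alias, column = match.groups()
    if alias not in aliases:
        return None
    return aliases[alias], column


print(resolve_view_column(VIEW_DEFINITION, "id_articles_category"))
# ('ste.articles', 'id_articles_category')
```

If the resolved table turns out to be another view, the same function could be applied again to that view's definition.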
Related
I'm working on an existing application that uses some JOIN statements to create "immutable" objects (i.e. the results are always JOINed to create a processable object - results from only one table will be meaningless).
For example:
SELECT r.*,u.user_username,u.user_pic FROM articles r INNER JOIN users u ON u.user_id=r.article_author WHERE ...
will yield a result of type, let's say, ArticleWithUser that is necessary to display an article with the author details (like a blog post).
Now, I need to make a table featured_items which contains the columns item_type (article, file, comment, etc.) and item_id (the article's, file's or comment's id), and query it to get a list of the featured items of some type.
Assuming tables other than articles contain whole objects that do not need JOINing with other tables, I can simply pull them with a dynamically generated query like
SELECT some_table.* FROM featured_items RIGHT JOIN some_table ON some_table.id = featured_items.item_id WHERE featured_items.item_type = X
But what if I need to get a featured item from the aforementioned type ArticleWithUser? I cannot use the dynamically generated query because the syntax will not suit two JOINs.
So, my question is: is there a better practice to retrieve results that are always combined together? Maybe do the second JOIN on the application end?
Or do I have to write special code for each of those combined results types?
Thank you!
A view can be thought of as a table for the faint of heart.
https://dev.mysql.com/doc/refman/5.0/en/create-view.html
Views can incorporate joins and other views. Keep in mind that, upon creation, they take a snapshot of the columns that exist at that time on the underlying tables, so ALTER TABLE statements that add columns to those tables are not picked up by SELECT * in the view.
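As a rough sketch of how a view could help with the featured_items question: the JOIN that defines an "ArticleWithUser" is stored once in a view, and the dynamically generated query then joins against a single name. The example uses Python's built-in sqlite3 as a stand-in for MySQL. The view name articles_with_user, the SOURCES mapping, and the column names not taken from the question are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_username TEXT, user_pic TEXT);
CREATE TABLE articles (article_id INTEGER PRIMARY KEY, article_title TEXT, article_author INTEGER);
CREATE TABLE featured_items (item_type TEXT, item_id INTEGER);

-- The JOIN that defines an "ArticleWithUser" lives in one place.
CREATE VIEW articles_with_user AS
SELECT r.article_id AS id, r.article_title, u.user_username, u.user_pic
FROM articles r
INNER JOIN users u ON u.user_id = r.article_author;

INSERT INTO users VALUES (1, 'alice', 'alice.png');
INSERT INTO articles VALUES (10, 'First post', 1), (11, 'Second post', 1);
INSERT INTO featured_items VALUES ('article', 11);
""")

# Maps each item_type to the table or view that yields the complete object.
SOURCES = {"article": "articles_with_user"}


def featured(item_type):
    source = SOURCES[item_type]  # fixed names only, never raw user input
    sql = (
        f"SELECT s.* FROM featured_items f "
        f"JOIN {source} s ON s.id = f.item_id "
        f"WHERE f.item_type = ?"
    )
    return conn.execute(sql, (item_type,)).fetchall()


featured_articles = featured("article")
print(featured_articles)  # [(11, 'Second post', 'alice', 'alice.png')]
```

Types that need no JOIN could map straight to their table name in the same dictionary, so the query template stays identical for every type.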
An old article which I consider required reading on the subject of MySQL Views:
By Peter Zaitsev
To answer your question as to whether they are widely used, they are a major part of the database developer's toolkit, and in some situations offer significant benefits, which have more to do with indexing than with the nature of views, per se.
I am trying to create a Class-Inheritance design for products.
There is a base table that contains all the common fields. Then, for each product type, there is a separate table containing the fields that apply to that product type only.
So in order to get all the data for a product I need to JOIN the base table with whatever table that correlates to the product_type listed in the base table. Is there a way to make this query join on the table dynamically?
Here is a query to try to illustrate what I am trying to do:
SELECT * FROM product_base b
INNER JOIN <value of b.product_type> t
ON b.product_base_id = t.product_base_id
WHERE b.product_base_id = :base_id
Is there a way to do this?
No, there's no way to do this. The table name must be known at the time of parsing the query, so the parser can tell if the table exists, and that it contains the columns you reference. Also the optimizer needs to know the table and its indexes, so it can come up with a plan of what indexes to use.
What you're asking for is for the table to be determined during execution, based on data found row-by-row. There's no way for the RDBMS to know at parse-time that all the data values correspond to real tables.
There's no reason you would do this to implement Class Table Inheritance. CTI supports true references between tables.
You're instead describing the antipattern of Polymorphic Associations.
Make 2 queries:
First select < value of b.product_type > and then use it in the second one (the one that you have, but replace < value of b.product_type > with the result from the first one).
No. There would be little point even if it were possible, as the query optimiser would not be able to make a plan without knowing anything about the right-hand side of the join.
You need to construct the query using concatenation or similar, but make sure that you only use a valid table name to avoid injection attacks.
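The two-query idea combined with a table-name whitelist could look roughly like this. It is a sketch in Python with sqlite3 standing in for MySQL. The subtype tables product_book and product_shirt, their columns, and the PRODUCT_TABLES mapping are hypothetical.

```python
import sqlite3

# Hypothetical subtype tables; only these names may ever reach the SQL text.
PRODUCT_TABLES = {"book": "product_book", "shirt": "product_shirt"}

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_base (product_base_id INTEGER PRIMARY KEY, product_type TEXT, name TEXT);
CREATE TABLE product_book (product_base_id INTEGER PRIMARY KEY, isbn TEXT);
CREATE TABLE product_shirt (product_base_id INTEGER PRIMARY KEY, size TEXT);
INSERT INTO product_base VALUES (1, 'book', 'Some book'), (2, 'shirt', 'Plain tee');
INSERT INTO product_book VALUES (1, 'isbn-placeholder');
INSERT INTO product_shirt VALUES (2, 'M');
""")


def load_product(base_id):
    # Query 1: look up the product type stored in the base table.
    row = conn.execute(
        "SELECT product_type FROM product_base WHERE product_base_id = ?",
        (base_id,),
    ).fetchone()
    if row is None:
        return None
    table = PRODUCT_TABLES.get(row[0])
    if table is None:
        raise ValueError(f"unknown product type: {row[0]!r}")
    # Query 2: the table name comes from the whitelist, the id is a bound parameter.
    sql = (
        f"SELECT * FROM product_base b "
        f"INNER JOIN {table} t ON b.product_base_id = t.product_base_id "
        f"WHERE b.product_base_id = ?"
    )
    return conn.execute(sql, (base_id,)).fetchone()


book = load_product(1)
print(book)  # (1, 'book', 'Some book', 1, 'isbn-placeholder')
```

Only values from the fixed mapping are ever interpolated into the SQL text, so a tampered product_type cannot inject anything.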
You can create a procedure that takes the table name as an argument and constructs a dynamic-SQL query. But it's probably easier to do this in your server-side code (PHP). But rather than make it a variable (and as suggested vulnerable to injection attacks), create separate classes for the different join combinations. Use another class (like a dispatcher) to determine the correct class to instantiate.
I know I am writing my queries wrong, and when we get a lot of traffic, our database gets hit HARD and the page slows to a grind...
I think I need to write queries based on CREATE VIEW for the last 30 days from CURDATE()? But I'm not sure where to begin, or whether this would be a MORE efficient query for the database.
Anyways, here is a sample query I have written..
$query_Recordset6 = "SELECT `date`, title, category, url, comments
FROM cute_news
WHERE category LIKE '%45%'
ORDER BY `date` DESC";
Any help or suggestions would be great! I have about 11 queries like this, but I am confident if I could get help on one of these, then I can implement them to the rest!!
Putting a wildcard on the left side of a value comparison:
LIKE '%xyz'
...means that an index cannot be used, even if one exists. You might want to consider using Full Text Searching (FTS), which means adding full text indexing.
Normalizing the data would be another step to consider - categories should likely be in a separate table.
SELECT `date`, title, category, url, comments
FROM cute_news
WHERE category LIKE '%45%'
ORDER BY `date` DESC
The LIKE '%45%' means a full table scan will need to be performed. Are you perhaps storing a list of categories in the column? If so creating a new table storing category and news_article_id will allow an index to be used to retrieve the matching records much more efficiently.
OK, time for psychic debugging.
In my mind's eye, I see that query performance would be improved considerably through database normalization, specifically by splitting the multi-valued category column into a separate table that has two columns: the primary key for cute_news and the category ID.
This would also allow you to directly link said table to the categories table without having to parse it first.
Or, as Chris Date said: "Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else)."
Anything with LIKE '%XXX%' is going to be slow. It's a slow operation.
For something like categories, you might want to separate categories out into another table and use a foreign key in the cute_news table. That way you can have category_id, and use that in the query which will be MUCH faster.
Also, I'm not quite sure why you're talking about using CREATE VIEW. Views will not really help you with speed, unless it's a materialized view, which MySQL doesn't support natively.
If your database is getting hit hard, the solution isn't to make a view (the view is still basically the same amount of work for the database to do), the solution is to cache the results.
This is especially applicable since, from what it sounds like, your data only needs to be refreshed once every 30 days.
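A very small sketch of what caching the results could mean on the application side, written in Python for illustration (the original application is PHP). The helper cached_query, its ttl_seconds parameter, and fake_query are hypothetical names. A real setup might use a shared cache instead of a process-local dictionary.

```python
import time

_cache = {}  # key -> (expiry timestamp, cached rows)


def cached_query(key, run_query, ttl_seconds=300):
    """Return cached rows for key, calling run_query() only when they are missing or stale."""
    now = time.monotonic()
    entry = _cache.get(key)
    if entry is not None and entry[0] > now:
        return entry[1]
    rows = run_query()
    _cache[key] = (now + ttl_seconds, rows)
    return rows


calls = []


def fake_query():
    # Stands in for the expensive database query.
    calls.append(1)
    return [("2010-01-01", "title")]


first = cached_query("category-45", fake_query)
second = cached_query("category-45", fake_query)
print(len(calls))  # 1: the second call was served from the cache
```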
I'd guess that your category column is a list of category values like "12,34,45,78" ?
This is not good relational database design. One reason it's not good is as you've discovered: it's incredibly slow to search for a substring that might appear in the middle of that list.
Some people have suggested using fulltext search instead of the LIKE predicate with wildcards, but in this case it's simpler to create another table so you can list one category value per row, with a reference back to your cute_news table:
CREATE TABLE cute_news_category (
news_id INT NOT NULL,
category INT NOT NULL,
PRIMARY KEY (news_id, category),
FOREIGN KEY (news_id) REFERENCES cute_news(news_id)
) ENGINE=InnoDB;
Then you can query and it'll go a lot faster:
SELECT n.`date`, n.title, c.category, n.url, n.comments
FROM cute_news n
JOIN cute_news_category c ON (n.news_id = c.news_id)
WHERE c.category = 45
ORDER BY n.`date` DESC
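If the guess about a comma-separated category column is right, the existing values have to be split into one row per category before the query above can be used. A hedged Python sketch of that step follows. The helper name split_categories is hypothetical, and the resulting tuples would still need to be inserted into cute_news_category.

```python
def split_categories(news_id, category_list):
    """Turn one list-valued category string into (news_id, category) rows."""
    return [
        (news_id, int(part))
        for part in category_list.split(",")
        if part.strip()
    ]


rows = split_categories(7, "12,34,45,78")
print(rows)  # [(7, 12), (7, 34), (7, 45), (7, 78)]

# The substring test that LIKE '%45%' performs also hits unrelated categories.
false_hit = "45" in "12,145,78"
print(false_hit)  # True, although category 45 is not in the list
```

The second print illustrates a correctness problem on top of the speed problem: a substring match on 45 also matches 145 or 450.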
Any answer is a guess until you show:
- the relevant SHOW CREATE TABLE outputs
- the EXPLAIN output from your common queries.
And Bill Karwin's comment certainly applies.
After all this and optimizing, sampling the data into a table with only the last 30 days could still be desirable, in which case you're better off running a daily cron job to do just that.
Maybe it's a little dumb, but I'm just not sure which is better.
If I have to check more than 10k rows in the db for existence, what should I do?
#1 - one query
select id from table1 where name in (smth1,smth2...{till 30k})
#2 - many queries
select id from table1 where name=smth1
Although performance is not the goal, I don't want to bring MySQL down either ;)
Maybe, any other solutions will be more suitable...
Thanks.
upd: The task is to fetch a list of domains, save the new ones (those not in the db yet) and delete those that disappeared from the list. Hope it helps a little...
What you should do is create a temp table, insert all of the names, and (using one query) join against this table for your select.
select id
from table1 t1
inner join temptable tt on t1.name = tt.name
The single query will most likely perform better as the second will give a lot of round-trip delays. But if you have a lot of names like in your example the first method might cause you to hit an internal limit.
In this case it might be better to store the list of names in a temporary table and join with it.
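The temporary-table approach, applied to the domain-sync task from the update, might be sketched like this in Python with sqlite3 standing in for MySQL. The table names table1 and temptable come from the posts above. The sample domain names and the three result lists are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
INSERT INTO table1 (name) VALUES ('a.example'), ('b.example'), ('c.example');
CREATE TEMP TABLE temptable (name TEXT PRIMARY KEY);
""")

# The freshly fetched list goes into the temporary table in one batch.
fetched = ["b.example", "c.example", "d.example"]
conn.executemany("INSERT INTO temptable (name) VALUES (?)", [(n,) for n in fetched])

# Names already stored: the join from the answer.
existing = [r[0] for r in conn.execute(
    "SELECT t1.name FROM table1 t1 "
    "INNER JOIN temptable tt ON t1.name = tt.name ORDER BY t1.name")]

# Names in the fetched list that are not stored yet.
new = [r[0] for r in conn.execute(
    "SELECT tt.name FROM temptable tt "
    "LEFT JOIN table1 t1 ON t1.name = tt.name "
    "WHERE t1.id IS NULL ORDER BY tt.name")]

# Names stored but missing from the fetched list.
gone = [r[0] for r in conn.execute(
    "SELECT t1.name FROM table1 t1 "
    "LEFT JOIN temptable tt ON t1.name = tt.name "
    "WHERE tt.name IS NULL ORDER BY t1.name")]

print(existing, new, gone)
# ['b.example', 'c.example'] ['d.example'] ['a.example']
```

Three set-based queries replace tens of thousands of single-row lookups, and no query text grows with the size of the list.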
Depending on your future needs to do similar things, you might want to add a function 'strlist_to_table' to the database. Let the function take a text in which your input values are separated by a delimiter character (possibly also passed to the function), and split it on the delimiter to create an on-the-fly table. Then you can use
where in strlist_to_table('smth1|smth2', '|')
and also get protection from sql injection (maybe little Bobby Tables appears in the input).
Just my 2 cents...
I'm not sure how flexible your application design is, but it might be worth looking into removing the delimited list altogether and simply making a permanent third table to represent the many-to-many relationship, then joining the tables on each query.
A friend told me that I should include the table name in the field name of the same table, and I'm wondering why? And should it be like this?
Example:
(Table) Users
(Fields) user_id, username, password, last_login_time
I see that the prefix 'user_' is meaningless since I know it's already for a user. But I'd like to hear from you too.
note: I'm programming in php, mysql.
I agree with you. The only place I am tempted to put the table name or a shortened form of it is on primary and foreign keys or if the "natural" name is a keyword.
Users: id or user_id, username, password, last_login_time
Post: id or post_id, user_id, post_date, content
I generally use 'id' as the primary key field name, but in this case I think user_id and post_id are perfectly OK too. Note that the post date was called 'post_date' because 'date' is a keyword.
At least that's my convention. Your mileage may vary.
I see no reason to include the table name, it's superfluous. In the queries you can refer to the fields as <table name>.<field name> anyway (eg. "user.id").
With generic fields like 'id' and 'name', it's good to put the table name in.
The reason is it can be confusing when writing joins across multiple tables.
It's personal preference, really, but that is the reasoning behind it (and I always do it this way).
Whatever method you choose, make sure it is consistent within the project.
Personally, I don't add table names to field names in the main table, but when a field is used as a foreign key in another table, I prefix it with the name of the source table. E.g. the id field on the users table will be called id, but on the comments table, where comments are linked to the user who posted them, it will be user_id.
This I picked up from CakePHP's naming scheme and I think it's pretty neat.
Prefixing the column name with the table name is a way of guaranteeing unique column names, which makes joining easier.
But it is a tiresome practice, especially when we have long table names. It's generally easier to just use aliases when appropriate. Besides, it doesn't help when we are self-joining.
As a data modeller I do find it hard to be consistent all the time. With ID columns I theoretically prefer to have just ID but I usually find I have tables with columns called USER_ID, ORDER_ID, etc.
There are scenarios where it can be positively beneficial to use a common column name across multiple tables. For instance, when a logical super-type/sub-type relationship has been rendered as just the child tables it is useful to retain the super-type's column on all the sub-type tables (e.g. ITEM_STATUS) instead of renaming it for each sub-type (ORDER_ITEM_STATUS, INVOICE_ITEM_STATUS, etc). This is particularly true when they are enums with a common set of values.
For example, if your database has tables that store information about the Sales and Human Resources departments, you could name all the tables related to the Sales department as shown below:
SL_NewLeads
SL_Territories
SL_TerritoriesManagers
You could name all the tables related to the Human Resources department as shown below:
HR_Candidates
HR_PremierInstitutes
HR_InterviewSchedules
This kind of naming convention makes sure all the related tables are grouped together when you list your tables in alphabetical order. However, if your database deals with only one logical group of tables, you need not use this naming convention.
Note that you sometimes end up vertically partitioning a table into two or more tables, though these partitions effectively represent the same entity. In this case, append a word that best identifies the partition to the entity name.
Actually, there is a reason for that kind of naming, especially when it comes to fields you're likely to join on. In MySQL at least, you can use the USING keyword instead of ON, so users u JOIN posts p ON p.user_id = u.id becomes users u JOIN posts p USING(user_id), which is cleaner IMO.
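A quick way to see the effect of USING is sketched below, in Python with sqlite3 standing in for MySQL (both accept this syntax). The users and posts tables and their sample rows are made up, and the primary key is named user_id on both sides so that USING can apply.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE posts (post_id INTEGER PRIMARY KEY, user_id INTEGER, content TEXT);
INSERT INTO users VALUES (1, 'alice');
INSERT INTO posts VALUES (100, 1, 'hello');
""")

# The same join written with ON and with USING.
with_on = conn.execute(
    "SELECT u.username, p.content FROM users u "
    "JOIN posts p ON p.user_id = u.user_id"
).fetchall()
with_using = conn.execute(
    "SELECT u.username, p.content FROM users u "
    "JOIN posts p USING (user_id)"
).fetchall()

# With USING, SELECT * lists the shared column once instead of twice.
star_columns = [d[0] for d in conn.execute(
    "SELECT * FROM users u JOIN posts p USING (user_id)").description]

print(with_on == with_using)  # True
print(star_columns.count("user_id"))  # 1
```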
Regarding other types of fields, you may benefit when selecting *, because you wouldn't have to spell out the list of fields you need and could still be sure which field comes from which table. But SELECT * is generally discouraged on performance and maintenance grounds, so I consider prefixing such fields with the table name a bad practice, although it may differ from application to application.
Sounds like the conclusion is:
If the field name is unique across tables - prefix with table name. If the field name has the potential to be duplicated in other tables, name it unique.
I found field names such as "img, address, phone, year" since different tables may include different images, addresses, phone numbers, and years.
We should define primary keys with the table name as a prefix.
We should use user_id instead of id, and post_id instead of just id.
Benefits:-
1) Easily Readable
2) Easily differentiate in join queries. We can minimize the use of alias in query.
user table : user_id(PK)
post table : post_id(PK) user_id(FK) here user table PK and post table FK are same
3) This way we can get the benefit of NATURAL JOIN and JOIN with USING
As per the documentation:
Natural joins and joins with USING, including outer join variants, are
processed according to the SQL:2003 standard. The goal was to align
the syntax and semantics of MySQL with respect to NATURAL JOIN and
JOIN ... USING according to SQL:2003. However, these changes in join
processing can result in different output columns for some joins.
Also, some queries that appeared to work correctly in older versions
(prior to 5.0.12) must be rewritten to comply with the standard.
These changes have five main aspects:
1) The way that MySQL determines the result columns of NATURAL or USING join operations (and thus the result of the entire FROM clause).
2) Expansion of SELECT * and SELECT tbl_name.* into a list of selected columns.
3) Resolution of column names in NATURAL or USING joins.
4) Transformation of NATURAL or USING joins into JOIN ... ON.
5) Resolution of column names in the ON condition of a JOIN ... ON.
Examples:-
SELECT * FROM user NATURAL LEFT JOIN post;
SELECT * FROM user NATURAL JOIN post;
SELECT * FROM user JOIN post USING (user_id);