I have a table that looks like this
+----+------------+-----+
| id | restaurant | ... |
+----+------------+-----+
| 1  | one        | ... |
| 2  | two        | ... |
+----+------------+-----+
Now I want to add hours of operation for each restaurant. From what I have read, it would be bad to add a new column called hours as a varchar holding something like
"9:00-22:00,10:00-20:00,10:00-21:00"
and then split it at the commas into an array when I pull the data into my app later. I'm not 100% sure why this is bad, but I know I'm not supposed to do that, right?
So I was thinking of making a new table called "Restaurant_Hours" that looks like this:
+----+------------+------------+-------------+
| id | restaurant | mon        | tue         | etc...
+----+------------+------------+-------------+
| 1  | one        | 9:00-22:00 | 10:00-22:00 |
| 2  | two        | etc.       | etc.        |
Is this strategy of making a new table laid out this way the best approach, or is it also incorrect? The restaurant would be unique in each row, so I could look up the hours that way.
The base of what I'm thinking is something like this:
CREATE TABLE `restaurant_hours` (
  `restaurant_hour_id` INT NOT NULL AUTO_INCREMENT,
  `restaurant_id` INT NOT NULL,
  `day_of_week` TINYINT NULL,
  `opens_at` TIME NOT NULL,
  `closes_at` TIME NOT NULL,
  `hours_desc` CHAR(16) NOT NULL DEFAULT '',
  PRIMARY KEY (`restaurant_hour_id`)
);
Of course, restaurant_id should be a FOREIGN KEY to restaurants.id, and you might want a UNIQUE constraint on (restaurant_id, day_of_week, hours_desc). If they have special hours for holidays, you might want to use day_of_week = 0 as a "flag".
... or if you're feeling really ambitious, have it also reference some sort of "day_descriptions" table, where 1-7 correspond to Sunday-Saturday, and >=8 can be used to signal things that may need to be calculated by year (specific holidays).
Edit: hours_desc is intended as things like "Breakfast", "Lunch", "Dinner", etc...
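Expressed as DDL, those suggestions would look something like this (the constraint names here are just placeholders, and restaurants.id is assumed from the question):

```sql
ALTER TABLE `restaurant_hours`
  ADD CONSTRAINT `fk_rh_restaurant`
    FOREIGN KEY (`restaurant_id`) REFERENCES `restaurants` (`id`),
  ADD CONSTRAINT `uq_rh_day_desc`
    UNIQUE (`restaurant_id`, `day_of_week`, `hours_desc`);
```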
Even without that, a query to find out "what's open when" would go something like this:
SELECT r.restaurant
FROM restaurant_hours AS rh
INNER JOIN restaurants AS r ON rh.restaurant_id = r.id
WHERE rh.day_of_week = DAYOFWEEK(#theWhen)
  AND rh.opens_at < TIME(#theWhen)
  AND rh.closes_at > TIME(#theWhen)
;
Related
I have some queries that are taking over 30 minutes to execute. I am not a database expert, so I really don't know what to do here; I need someone to suggest a better query for:
select count(*),substring(tdate,1,7)
from bills
where amt='30'
group by substring(tdate,1,7)
order by substring(tdate,1,7) desc
SELECT count(*)
FROM `bills`
where amt='30'
and date(tdate)=date('$date')
and stat='RENEW'
and x1 in (select `id` from sub);
here I pass the value of $date in the following format 'Y-m-d 00:00:00'
select count(*),substring(tdate,1,7)
from bills
where amt='30'
group by substring(tdate,1,7)
order by substring(tdate,1,7) desc
Table structures:
MariaDB [talksport]> desc bills;
+-------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+----------------+
| bid | int(11) | NO | PRI | NULL | auto_increment |
| num | varchar(500) | NO | | NULL | |
| stat | varchar(500) | NO | | NULL | |
| tdate | varchar(500) | NO | | NULL | |
| x1 | varchar(500) | NO | | NULL | |
| amt | varchar(500) | NO | | 30 | |
+-------+--------------+------+-----+---------+----------------+
Any and all help is welcome.
Michael
Your three queries are really two (the first and third are the same). Here they are, reformatted so they are readable:
select count(*), left(tdate, 7)
from bills
where amt = '30'
group by left(tdate, 7)
order by left(tdate, 7) desc;
select count(*)
from `bills`
where amt = '30' and date(tdate) = date('$date') and stat = 'RENEW' and
x1 in (select `id` from sub);
First, you want an index on bills(amt, tdate) for the first query. The second is more problematic. In some versions of MySQL, in can be an issue. Also, date arithmetic is problematic. So, if you are storing tdate as YYYY-MM-DD, then pass in $date in the same format (better yet, use parameters; better still, use the right types). So, I would write this as:
select count(*)
from `bills` b
where amt = '30' and tdate = '$date' and stat = 'RENEW' and
exists (select 1 from sub s where b.x1 = s.id);
Then you want an index on bills(amt, stat, tdate, x1).
The right indexes should speed your queries.
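Roughly, the suggested indexes would be created like this (the index names are placeholders; note that with varchar(500) columns you may hit MySQL's index key-length limit, so prefix lengths, or better yet proper types, may be needed):

```sql
CREATE INDEX idx_bills_amt_tdate ON bills (amt, tdate);
CREATE INDEX idx_bills_amt_stat_tdate_x1 ON bills (amt, stat, tdate, x1);
```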
In addition to the answer above, another optimisation that may be possible is replacing COUNT(*) with COUNT(bid).
When you're counting all the rows and each row has a unique identifier (bid being the PRIMARY KEY, which is already indexed), you would get the same COUNT by counting only the ids. The query then has to touch fewer columns, and the one it sifts through is already indexed, which makes the searches and aggregations faster.
It's always good to try and use specific column names instead of * in SELECT queries. Likewise, always review the columns used in SELECT as well as the columns participating in WHERE and GROUP BY clauses to identify the potential candidates for indexing.
Please note:
Creating several indexes shouldn't be assumed to be the only way to optimise, as it can slow down bulk INSERTs / UPDATEs while speeding up SELECTs. Besides, you could end up creating indexes that turn out to be superfluous or redundant. Therefore, take a holistic view of the application's purpose to attain an optimal balance, depending on whether operations are concentrated more on INSERT / UPDATE or on SELECT.
This is a general question, one that I've been scratching my head on for a while now. My company's database handles about 2k rows a day. 99.9% of the time, we have no problem with the values that are returned in the different SELECT statements that are set up. However, on a very rare occasion, our database will "glitch" and return the value for a completely different row than what was requested.
This is a very basic example:
+--------+-------------+
| row_id | columnvalue |
+--------+-------------+
| 1      | 10          |
| 2      | 20          |
| 3      | 30          |
| 4      | 40          |
+--------+-------------+
SELECT columnvalue FROM table_name WHERE row_id = 1 LIMIT 1
Returns: 10
But on the very rare occasion, it may return: 20, or 30, etc.
I am completely baffled as to why it does this sometimes and would appreciate some insight on what appears to be a programming phenomenon.
More specific information:
SELECT
USERID, CONCAT( LAST, ', ', FIRST ) AS NAME, COMPANYID
FROM users, companies
WHERE users.COMPANYCODE = companies.COMPANYCODE
AND USERID = 9739 LIMIT 1
mysql> DESCRIBE users;
+------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+----------------+
| USERID | int(10) | NO | PRI | NULL | auto_increment |
| COMPANYCODE| varchar(255)| NO | MUL | | |
| FIRST | varchar(255)| NO | MUL | | |
| LAST | varchar(255)| NO | MUL | | |
+------------+-------------+------+-----+---------+----------------+
mysql> DESCRIBE companies;
+------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+----------------+
| COMPANYID | int(10) | NO | PRI | NULL | auto_increment |
| COMPANYCODE| varchar(255)| NO | MUL | | |
| COMPANYNAME| varchar(255)| NO | | | |
+------------+-------------+------+-----+---------+----------------+
What the results were suppose to be: 9739, "L----, E----", 2197
What the results were instead: 9739, "L----, E----", 3288
Basically, it returned the wrong company id based off the join with companycode. Given the nature of our company, I can't share any more information than that.
I have run this query 5k times and have made every modification to the code imaginable in order to generate the second set of results, and I have not been able to duplicate it. I'm not quick to blame MySQL -- this has been happening (though rarely) for over 8 years, and I have exhausted all other possible causes. I suspected the results were manually changed after the query was run, but the timestamps state otherwise.
I'm just scratching my head as to why this can run perfectly 499k out of 500k times.
Now that we have a more realistic query, I notice right away that you are joining the tables not on the primary key, but on the company code. Are we certain that the company code is enforced as a unique index on companies? The LIMIT 1 would hide a second row if one existed.
From a design perspective, I would make the join on the primary key to avoid even the possibility of duplicate keys and put company code in as a unique indexed field for display and lookup only.
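A sketch of both suggestions, using the names from the DESCRIBE output above (run the uniqueness fix only after resolving any duplicate codes):

```sql
-- Enforce uniqueness of the company code so a second matching row can't exist
ALTER TABLE companies ADD UNIQUE INDEX uq_companycode (COMPANYCODE);

-- Better: give users a COMPANYID column and join on the primary key instead
ALTER TABLE users ADD COLUMN COMPANYID int(10);
UPDATE users u
JOIN companies c ON u.COMPANYCODE = c.COMPANYCODE
SET u.COMPANYID = c.COMPANYID;
```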
This behavior is either due to an incredibly unlikely SERIOUS bug in MySQL, -or- MySQL is returning a result that is valid at the time the statement is run, and some other software is garbling the displayed result.
One possibility to consider is that the row had been modified (by some other statement) at the time your SQL statement executed, and then the row was changed again later. (That's the most likely explanation we'd have for MySQL returning an unexpected result.)
The use of the LIMIT 1 clause is curious, because if the predicate uniquely identifies a row, there should be no need for the LIMIT 1, since the query is guaranteed to return no more than one row.
This leads me to suspect that row_id is not unique, and that the query actually returns more than one row. With the LIMIT clause, there is no guarantee as to which of the rows will get returned (absent an ORDER BY clause.)
Otherwise, the most likely culprit is outdated cache contents, or other problems in the code.
UPDATE
The previous answer was based on the example query given; I purposefully omitted the possibility that EMP was a view that was doing a JOIN, since the question originally said it was a table, and the example query showed just the one table.
Based on the new information in the question, I suggest that you OMIT the LIMIT 1 clause from the query. That will identify that the query is returning more than one row.
From the table definitions, we see that the database isn't enforcing a UNIQUE constraint on the COMPANYCODE column in the COMPANY table.
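A quick way to verify this is to look for duplicated codes directly:

```sql
SELECT COMPANYCODE, COUNT(*) AS occurrences
FROM companies
GROUP BY COMPANYCODE
HAVING COUNT(*) > 1;
```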
We also know there isn't a foreign key defined, given the mismatch between the datatypes.
Normally, the foreign key would be defined referencing the PRIMARY KEY of the target table.
We'd expect the users table to have a company_id column, referencing the COMPANYID (primary key) column in the companies table.
(We note that the join condition matches on the COMPANYCODE column, a varchar, rather than on the integer primary key of the companies table, which is very odd.)
There are several reasons this could happen. I suggest you look at the assumptions you're making. For example:
If you're using GROUP BY and one of the columns isn't an aggregate or the grouping expression, you're going to get an unpredictable value in that column. Make sure you use an appropriate aggregation (such as MAX or MIN) to get a predictable result on each column.
If you're assuming a row order without making it explicit, and using LIMIT to get only the first row, the actual returned order of rows differs depending on that result's execution plan, which is going to differ in large resultsets based on the statistics available to the optimiser. Make sure you use ORDER BY in such situations.
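For example, using the table from the question, an explicit ORDER BY makes the row returned by LIMIT predictable:

```sql
SELECT columnvalue
FROM table_name
WHERE row_id = 1
ORDER BY row_id   -- without this, "first row" is whatever the plan emits first
LIMIT 1;
```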
I'm attempting to build a database that stores messages for multiple users. Each user will be able to send/receive 5 different message "types" (strictly a label, actual data types will be the same). My initial thought was to create multiple tables for each user, representing the 5 different message types. I quickly learned this is not such a good idea. My next thought was to create 1 table per message type with a users column, but I'm not sure that's the best method either from a performance perspective. What happens if user 1 sends 100 message type 1's, while user 3 only sends 10? The remaining fields would be null values, and I'm really not sure if that makes a difference or not. Thoughts? Suggestions and/or suggested reading? Thank you in advance!
No, that (the idea given in the subject of this question) would be tremendously inefficient. You'd need to introduce a new table each time a new user is created, and querying them all at once would be a nightmare.
It's far easier to do this with a single table for storing information about messages. Each row in this table will correspond to one, and only one, message.
Besides, this table should probably have three 'referential' columns: two linking a specific message to its sender and receiver, and one storing its type, which can be assigned only a limited set of values.
For example:
MSG_ID | SENDER_ID | RECEIVER_ID | MSG_TYPE | MSG_TEXT
------------------------------------------------------
1 | 1 | 2 | 1 | .......
2 | 2 | 1 | 1 | #######
3 | 1 | 3 | 2 | $$$$$$$
4 | 3 | 1 | 2 | %%%%%%%
...
It'll be quite easy to get all the messages sent by someone (with a WHERE sender_id = %someone_id% clause), sent to someone (WHERE receiver_id = %someone_id%), or of some specific type (WHERE msg_type = %some_type%). Best of all, these clauses can easily be combined to set up more sophisticated filters.
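For instance, assuming the table is called messages, combining the clauses to find all type-2 messages user 1 sent to user 3 (ids from the sample rows above):

```sql
SELECT msg_id, msg_text
FROM messages
WHERE sender_id = 1
  AND receiver_id = 3
  AND msg_type = 2;
```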
What you initially thought of, it seems, looks like this:
IS_MSG_TYPE1 | IS_MSG_TYPE2 | IS_MSG_TYPE3 | IS_MSG_TYPE4
---------------------------------------------------------
1 | 0 | 0 | 0
0 | 1 | 0 | 0
0 | 0 | 1 | 0
It could be NULLs instead of 0s; the core is still the same. And it's broken. Yes, you can still get all the messages of a single type with a WHERE is_msg_type_1 = 1 clause. But even such an easy task as getting the type of a specific message becomes, well, not so easy: you'll have to check each of these 5 columns until you find the one with a truthy value.
Similar difficulties await anyone who tries to count the number of messages of each type (which is almost trivial with the structure given above: COUNT(msg_id) ... GROUP BY msg_type).
So please, don't do this. Unless you have a very strong reason not to, try to structure your tables so that, as time passes, they grow in height, not in width.
The remaining fields would be null values
If you design your database vertically, there will be no remaining fields:
user int
msgid int
msg text
create table `tv_ge_main`.`Users` (
  `USER_ID` bigint NOT NULL AUTO_INCREMENT,
  `USER_NAME` varchar(128),
  PRIMARY KEY (`USER_ID`)
);

create table `tv_ge_main`.`Message_Types` (
  `MESSAGE_TYPE_ID` bigint NOT NULL AUTO_INCREMENT,
  `MESSAGE_TYPE` varchar(128),
  PRIMARY KEY (`MESSAGE_TYPE_ID`)
);

create table `tv_ge_main`.`Messages` (
  `MESSAGE_ID` bigint NOT NULL AUTO_INCREMENT,
  `USER_ID` bigint,
  `MESSAGE_TYPE_ID` bigint,
  `MESSAGE_TEXT` varchar(255),
  PRIMARY KEY (`MESSAGE_ID`)
);
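With that schema, pulling messages together with their user and type labels is a plain join (a sketch against the tables above):

```sql
SELECT m.MESSAGE_ID, u.USER_NAME, t.MESSAGE_TYPE, m.MESSAGE_TEXT
FROM `tv_ge_main`.`Messages` m
JOIN `tv_ge_main`.`Users` u ON u.USER_ID = m.USER_ID
JOIN `tv_ge_main`.`Message_Types` t ON t.MESSAGE_TYPE_ID = m.MESSAGE_TYPE_ID;
```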
I'm developing a QA web-app which will have some points to be evaluated, each assigned to one of the following categories:
Call management
Technical skills
Ticket management
As these aren't likely to change, it's not worth making them dynamic; the problem is that the points themselves are likely to change.
First I had a 'quality' table with a column for each point, but then the requisites changed and I'm kinda blocked.
I have to store "evaluations" that have all points with their values but maybe, in the future, those points will change.
I thought that in the quality table I could store some kind of string like this:
1=1|2=1|3=2
where each pair is a point ID and the punctuation given for that point.
Can someone point me to a better method to do that?
As mentioned many times here on SO, NEVER PUT MORE THAN ONE VALUE INTO A DB FIELD IF YOU WANT TO ACCESS THEM SEPARATELY.
So I suggest to have 2 additional tables:
CREATE TABLE categories (id int AUTO_INCREMENT PRIMARY KEY, name VARCHAR(50) NOT NULL);
INSERT INTO categories VALUES (1,"Call management"),(2,"Technical skills"),(3,"Ticket management");
and
CREATE TABLE qualities (id int AUTO_INCREMENT PRIMARY KEY, category int NOT NULL, punctuation int NOT NULL);
then store and query your data accordingly.
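For instance, storing the 1=1|2=1|3=2 example from the question as rows and reading it back with the category names:

```sql
INSERT INTO qualities (category, punctuation) VALUES (1, 1), (2, 1), (3, 2);

SELECT c.name, q.punctuation
FROM qualities q
JOIN categories c ON c.id = q.category;
```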
This table is not normalized. It violates 1st Normal Form (1NF):
Evaluation
----------------------------------------
EvaluationId | List Of point=punctuation
1 | 1=1|2=1|3=2
2 | 1=5|2=6|3=7
You can read more about Database Normalization basics.
The table could be normalized as:
Evaluation
-------------
EvaluationId
1
2
Quality
---------------------------------------
EvaluationId | Point | Punctuation
1 | 1 | 1
1 | 2 | 1
1 | 3 | 2
2 | 1 | 5
2 | 2 | 6
2 | 3 | 7
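With the normalized Quality table, summaries fall out of ordinary GROUP BY queries, e.g. the total punctuation per evaluation:

```sql
SELECT EvaluationId, SUM(Punctuation) AS total_punctuation
FROM Quality
GROUP BY EvaluationId;
```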
I'm trying to create a table like this:
lives_with_owner_no | from | until | under_the_name
--------------------+------+-------+---------------
1                   | 1998 | 2002  | 1
3                   | 2002 | NULL  | 1
2                   | 1997 | NULL  | 2
3                   | 1850 | NULL  | 3
3                   | 1999 | NULL  | 4
2                   | 2002 | 2002  | 4
3                   | 2002 | NULL  | 5
It's the Nermalization example, which I guess is pretty popular.
Anyway, I think I am just supposed to set up a dependency within MySQL for the from column, pending a change to the lives_with table or the cat_name table, and then set up a dependency between the until and from columns. I figure the owner might want to come and update the cat's info, though, and override the from column, so I have to use PHP? Is there any special way I should timestamp the override (for example, $date = date("Y-m-d H:i:s");)? How do I set up the dependency within MySQL?
I also have a column that can be generated by adding other columns together. I guess using the cat example, it would look like:
combined_family_age | family_name
--------------------+------------
75                  | Alley
230                 | Koneko
132                 | Furrdenand
1,004               | Whiskers
Should I add via PHP and then input the values with a query, or should I use MySQL to manage the addition? Should I use a special engine for this, like MemoryAll?
I disagree with the nermalization example on two counts.
There is no cat entity in the end. Instead, there is a relation (cat_name_no, cat_name), which in your example has the immediate consequence that you can't tell how many cats named Lara exist. This is an anomaly that can easily be avoided.
The table crams two relations, lives_with_owner and under_the_name, into one table. That's not a good idea, especially if the data is temporal, as it creates all kinds of nasty anomalies. Instead, you should use a table for each.
I would design this database as follows:
create table owner (id integer not null primary key, name varchar(255));
create table cat (id integer not null primary key, current_name varchar(255));
create table cat_lives_with (
cat_id integer references cat(id),
owner_id integer references owner(id),
valid_from date,
valid_to date);
create table cat_has_name (
cat_id integer references cat(id),
name varchar(255),
valid_from date,
valid_to date);
So you would have data like:
id | name
1 | Andrea
2 | Sarah
3 | Louise
id | current_name
1 | Ada
2 | Shelley
cat_id | owner_id | valid_from | valid_to
1 | 1 | 1998-02-15 | 2002-08-11
1 | 3 | 2002-08-12 | 9999-12-31
2      | 2        | 2002-01-08 | 2002-10-23
2 | 3 | 2002-10-24 | 9999-12-31
cat_id | name | valid_from | valid_to
1 | Ada | 1998-02-15 | 9999-12-31
2      | Shelley  | 2002-01-08 | 2002-10-23
2 | Callisto | 2002-10-24 | 9999-12-31
I would use a finer-grained date type than just a year (in the nermalization example, having 2002-2002 as a range can really lead to messy query syntax), so that you can ask queries like select cat_id from cat_lives_with where '2000-06-02' between valid_from and valid_to.
As for the question of how to deal with temporal data in the general case: there's an excellent book on the subject, "Developing Time-Oriented Database Applications in SQL" by Richard Snodgrass, whose free full-text PDF is legally distributed by the author.
Your other question: you can handle combined_family_age either in SQL or externally, or, if that column is needed often, with a view. You shouldn't manage the content manually, though; let the database calculate it for you.
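A minimal sketch of such a view, assuming a hypothetical cat table that also carries age and family_name columns (neither appears in the schema above):

```sql
CREATE VIEW combined_family_age AS
SELECT family_name, SUM(age) AS combined_family_age
FROM cat
GROUP BY family_name;
```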