I have a MySQL server that keeps events in a DB. The table looks like this:
id | epoch_time | type | event_text | ....
---|------------|------|-------------|-----
01 | 1487671205 | 0 | user-login | ....
02 | 1487671284 | 0 | user-logout | ....
03 | 1487671356 | 1 | sys_error | ....
04 | 1487671379 | 0 | user-logout | ....
05 | 1487671389 | 2 | user_error | ....
06 | 1487671397 | 1 | sys_error | ....
On the web UI, there is a summary of the last 24 hours of events by type. Since the DB keeps a one-year backlog of data, there are over 1M records at the moment, which makes the site load very slowly (for obvious reasons).
The SQL query is simple,
SELECT COUNT(id) as total FROM `eventLog` WHERE `epoch_time` >= (UNIX_TIMESTAMP() - 86400)
My question is: is there a way to "tell" MySQL that the epoch_time column is sorted, so that once it hits a row that has:
epoch_time < (UNIX_TIMESTAMP() - 86400)
The query will end.
Thanks
[UPDATE]
Thank you all for your help. I tried adding the index, but the performance is still bad (about 7 to 12 seconds to load the page).
Does it make sense to keep separate statistical information just for that?
You can add an index on epoch_time using:
ALTER TABLE `eventLog` ADD INDEX epoch_time (`epoch_time`)
That will make your query run much faster.
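To make the index suggestion concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for MySQL (an assumption made so the example is runnable), a hypothetical index name `idx_epoch_time`, and one invented older row so that the 24-hour cutoff has something to exclude. The cutoff is computed in the application and passed as a parameter, and the rows are grouped by type, as the UI summary requires.

```python
import sqlite3

DAY_SECONDS = 86400


def build_event_log(rows):
    """Create an in-memory eventLog table with an index on epoch_time."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE eventLog ("
        "id INTEGER PRIMARY KEY, epoch_time INTEGER, type INTEGER, event_text TEXT)"
    )
    conn.executemany(
        "INSERT INTO eventLog (id, epoch_time, type, event_text) VALUES (?, ?, ?, ?)",
        rows,
    )
    # Hypothetical index name; mirrors the ALTER TABLE ... ADD INDEX suggestion.
    conn.execute("CREATE INDEX idx_epoch_time ON eventLog (epoch_time)")
    return conn


def counts_by_type(conn, now):
    """Count events per type whose epoch_time falls within the last 24 hours."""
    cur = conn.execute(
        "SELECT type, COUNT(*) FROM eventLog "
        "WHERE epoch_time >= ? GROUP BY type ORDER BY type",
        (now - DAY_SECONDS,),
    )
    return dict(cur.fetchall())


# Rows 1-6 come from the question; row 7 is an invented older event.
sample_rows = [
    (1, 1487671205, 0, "user-login"),
    (2, 1487671284, 0, "user-logout"),
    (3, 1487671356, 1, "sys_error"),
    (4, 1487671379, 0, "user-logout"),
    (5, 1487671389, 2, "user_error"),
    (6, 1487671397, 1, "sys_error"),
    (7, 1487500000, 0, "user-login"),
]

conn = build_event_log(sample_rows)
summary = counts_by_type(conn, now=1487671400)
```

With an index on epoch_time the engine can seek to the cutoff and read only the recent entries instead of scanning the whole year of data. Whether that is enough for a particular page depends on what else the page queries.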
Hello, I am having a hard time trying to implement this task. The problem is that I am not sure how to proceed, and I couldn't find tutorials or information about this type of task.
I have 2 tables and one connecting table between them. With a regular query, the table header is a known set of columns, followed by the data. In my case I have to build the table both horizontally and vertically, since the header values are not known in advance.
Here is example of the DB
Clients:
+----+--------+
| ID | client |
+----+--------+
| 1  | Sony   |
| 2  | Dell   |
+----+--------+
Users:
+----+---------+------------+
| ID | name    | department |
+----+---------+------------+
| 1  | John    | 1          |
| 2  | Dave    | 2          |
| 3  | Michael | 1          |
| 4  | Rich    | 3          |
+----+---------+------------+
Time:
+----+------+----------+----------+------------+
| ID | user | clientid | time     | date       |
+----+------+----------+----------+------------+
| 1  | 1    | 1        | 01:00:00 | 2017-01-02 |
| 2  | 2    | 2        | 02:00:00 | 2017-01-02 |
| 3  | 1    | 2        | 04:00:00 | 2017-02-02 | -> Result Not Selected since date is different
| 4  | 4    | 1        | 02:00:00 | 2017-01-02 |
| 5  | 1    | 1        | 02:00:00 | 2017-01-02 |
+----+------+----------+----------+------------+
Result Table
+------------+--------+-----------+---------+----------+
| Client | John | Michael | Rich | Dave |
+------------+--------+-----------+---------+----------+
| Sony |3:00:00 | 0 | 2:00:00 | 0 |
+------------+--------+-----------+---------+----------+
| Dell | 0 | 0 | 0 | 2:00:00 |
+------------+--------+-----------+---------+----------+
The first table, Clients, contains information about clients.
The second table, Users, contains information about users.
The third table, Time, contains rows of time that each user dedicated to different clients from the Clients table.
So my goal is to write a SQL query which will produce the Result table. In other words, it will select the sum of hours which every user has completed for a certain client. The number of clients and users is unknown. So the first thing to do is select all users, no matter whether they have hours completed or not. After that, I have to select each client and the sum of hours for each client that was logged by each individual user.
The problem is I don't know how to approach this situation. Do I first have to make one query selecting all users, loop over them to build the table header, and then run a second query selecting the hours and loop over it to build the body content? Or can this be done with a single query which will render the whole table?
The filters for select command are:
WHERE MONTH(`date`) = '$month'
AND YEAR(`date`) ='$year'
AND u.department = '$department'
Selecting a single row for the time SUM is:
(SELECT SUM( TIME_TO_SEC( `time` ) ) FROM Time tm
WHERE tm.clientid = c.id AND MONTH(`date`) = '$month' AND YEAR(`date`) ='$year')
This is the query to select the times for a user. By my logic it could be transformed with GROUP BY c.id (client id), but the problem is that it has to contain another WHERE clause specifying the user, which is unknown. If the number of users were known, for example 5, it would be no problem to write 5 subqueries, one per user, with WHERE u.id = 1, 2, 3, etc.
So here are the 2 major problems: how to display the users header in the same query, and then how to select the sum of hours for each client corresponding to each user.
Check out the result table hope to make the things clear.
Any suggestion or answer which can come to resolve this situation will be very helpful.
Thank you!
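One way to approach this, sketched here in Python purely for illustration, is the two-step route the question itself considers: fetch the users for the header, fetch the per-client, per-user sums, and assemble the grid in application code. The in-memory lists below stand in for the three tables, the function names are hypothetical, and the department filter is left out to keep the sketch short.

```python
from collections import defaultdict

# Stand-ins for the Clients, Users and Time tables from the question.
clients = {1: "Sony", 2: "Dell"}
users = {1: "John", 2: "Dave", 3: "Michael", 4: "Rich"}
# (user_id, client_id, "HH:MM:SS", "YYYY-MM-DD")
time_rows = [
    (1, 1, "01:00:00", "2017-01-02"),
    (2, 2, "02:00:00", "2017-01-02"),
    (1, 2, "04:00:00", "2017-02-02"),
    (4, 1, "02:00:00", "2017-01-02"),
    (1, 1, "02:00:00", "2017-01-02"),
]


def to_seconds(hms):
    """Convert 'HH:MM:SS' to seconds (the role TIME_TO_SEC plays in MySQL)."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def format_hms(total_seconds):
    """Format a number of seconds as H:MM:SS."""
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


def pivot(year, month):
    """Return (header, body) for the client-by-user table of one month."""
    prefix = f"{year:04d}-{month:02d}-"
    totals = defaultdict(int)
    for user_id, client_id, hms, day in time_rows:
        if day.startswith(prefix):
            totals[(client_id, user_id)] += to_seconds(hms)
    # Every user becomes a column, whether or not they logged any time.
    user_ids = sorted(users)
    header = ["Client"] + [users[u] for u in user_ids]
    body = []
    for client_id in sorted(clients):
        row = [clients[client_id]]
        for user_id in user_ids:
            secs = totals.get((client_id, user_id), 0)
            row.append(format_hms(secs) if secs else "0")
        body.append(row)
    return header, body


header, body = pivot(2017, 1)
```

The first loop corresponds to one grouped query (sum per client and user), and the nested loops correspond to rendering the header and body. The column order here follows the user IDs rather than the order shown in the Result table.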
I am trying to select data, when inserting the data it has an auto insert of the date when submitting. So when data is inserted it inserts the current date.
However, in my table I have week beginnings, so I am trying to select the data inside of that week:
mysql> select * from week;
+---------+------+------------+
| week_id | week | date |
+---------+------+------------+
| 1 | 1 | 2014-12-29 |
| 2 | 2 | 2015-01-05 |
| 3 | 3 | 2015-01-12 |
| 4 | 4 | 2015-01-19 |
| 5 | 5 | 2015-01-26 |
| 6 | 6 | 2015-02-02 |
| 7 | 7 | 2015-02-09 |
| 8 | 8 | 2015-02-16 |
| 9 | 9 | 2015-02-23 |
| 10 | 10 | 2015-03-02 |
| 11 | 11 | 2015-03-09 |
| 12 | 12 | 2015-03-16 |
| 13 | 13 | 2015-03-23 |
| 14 | 14 | 2015-03-30 |
| 15 | 15 | 2015-04-06 |
| 16 | 16 | 2015-04-13 |
| 17 | 17 | 2015-04-20 |
e.g.
select * from table where date='2015-04-06';
However the data will not be selected and presented because the inserted date was 2015-04-10. The only way to retrieve that data is by doing this:
select * from table where date='2015-04-10'; < when the data was inserted
So my question is, is it possible to select that data from that week beginning?
So if I select data from 2015-04-06 it should show data from the range of 2015-04-06 to 2015-04-12, is that possible?
Hopefully I have explained correctly, been a bit tricky to explain let alone try to implement. I can add any more info if needed.
NOTE: I am trying to use this inside of PHP so where the date is I would just use a variable, just thought I would say.
As the week will always end 6 days from the beginning you can use the between operator and the date_add function like this:
(for your specific example):
select *
from table
where date between '2015-04-06' and date_add('2015-04-06', interval 6 day)
And using a php variable:
select *
from table
where date between '$name_of_dt_var' and date_add('$name_of_dt_var', interval 6 day)
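As a runnable illustration of the same range query, here is a small Python sketch using the standard-library sqlite3 module in place of MySQL. The table name `entries` and its rows are hypothetical, and SQLite's `date(?, '+6 days')` stands in for MySQL's `date_add(..., interval 6 day)`. The week start is passed as a bound parameter instead of being interpolated into the SQL string.

```python
import sqlite3


def rows_in_week(conn, week_start):
    """Return rows dated from week_start through week_start + 6 days."""
    cur = conn.execute(
        "SELECT id, date FROM entries "
        "WHERE date BETWEEN ? AND date(?, '+6 days') "
        "ORDER BY id",
        (week_start, week_start),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, date TEXT)")
conn.executemany(
    "INSERT INTO entries (id, date) VALUES (?, ?)",
    [
        (1, "2015-04-05"),  # day before the week starts
        (2, "2015-04-10"),  # the insert date from the question
        (3, "2015-04-12"),  # last day of the week
        (4, "2015-04-13"),  # start of the next week
    ],
)

week_rows = rows_in_week(conn, "2015-04-06")
```

BETWEEN is inclusive on both ends, so the range covers exactly seven days when the column holds plain dates. If the column held date-and-time values, the last day would need a half-open comparison instead.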
You could also compare the week of the date the data was entered with the weeks in the week table using WEEK() function.
Assuming that week is the same value as week(), then:
select t.*
from table t
where week = week('2015-04-10');
Even if the numbers do not match, you presumably have some base date (such as 2015-01-01), and simple arithmetic would accomplish something very similar.
I have found that the most robust way to do this sort of week processing is to truncate each date in the table (in your example 2015-04-10) to the preceding Monday at midnight. That way you can compute the week of each item by assigning it to the first day of that week.
This little formula returns the preceding Monday given any DATE or DATETIME value.
FROM_DAYS(TO_DAYS(datestamp) -MOD(TO_DAYS(datestamp) -2, 7))
For example,
SET @datestamp := '2015-04-10';
SELECT FROM_DAYS(TO_DAYS(@datestamp) -MOD(TO_DAYS(@datestamp) -2, 7));
yields the value 2015-04-06.
So, if you have a table called sale you can add up sales by week like this:
SELECT SUM(amount) weekly_amount,
FROM_DAYS(TO_DAYS(datestamp) -MOD(TO_DAYS(datestamp) -2, 7)) week_beginning
FROM sale
GROUP BY FROM_DAYS(TO_DAYS(datestamp) -MOD(TO_DAYS(datestamp) -2, 7))
This is a very convenient way to handle things, because it's robust over end-of-year transitions. The WEEK() function doesn't work quite as well.
If your business rules say that your weeks begin on Sunday rather than Monday, use -1 rather than -2, as follows.
FROM_DAYS(TO_DAYS(datestamp) -MOD(TO_DAYS(datestamp) -1, 7))
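To check the arithmetic outside the database, the formula can be mirrored in Python. This is only a sketch: the function name is hypothetical, and it relies on MySQL's TO_DAYS value being Python's `date.toordinal()` plus 365 (for example, TO_DAYS('1970-01-01') is 719528 and the Python ordinal of that date is 719163).

```python
from datetime import date

# TO_DAYS(d) in MySQL equals d.toordinal() + 365 in Python.
MYSQL_TO_DAYS_OFFSET = 365


def week_beginning(d, first_day_offset=2):
    """Mirror FROM_DAYS(TO_DAYS(d) - MOD(TO_DAYS(d) - offset, 7)).

    first_day_offset=2 gives the preceding Monday, 1 the preceding Sunday.
    """
    to_days = d.toordinal() + MYSQL_TO_DAYS_OFFSET
    start = to_days - (to_days - first_day_offset) % 7
    return date.fromordinal(start - MYSQL_TO_DAYS_OFFSET)


monday = week_beginning(date(2015, 4, 10))
sunday = week_beginning(date(2015, 4, 10), first_day_offset=1)
new_year = week_beginning(date(2015, 1, 1))
```

The last call shows the end-of-year behaviour: 2015-01-01 is assigned to the week beginning 2014-12-29, which matches the first row of the week table in the question.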
I'm displaying a record set using Datatables pulling records from two tables.
Table A
sno | item_id | start_date | end_date | created_on |
===========================================================
10523563 | 2 | 2013-10-24 | 2013-10-27 | 2013-01-22 |
10535677 | 25 | 2013-11-18 | 2013-11-29 | 2013-01-22 |
10587723 | 11 | 2013-05-04 | 2013-05-24 | 2013-01-22 |
10598734 | 5 | 2013-06-14 | 2013-06-22 | 2013-01-22 |
Table B
id | item_name |
=====================================
2 | Timesheet testing |
25 | Vigour |
11 | Fabwash |
5 | Cruise |
Now since the number of records returned is going to turn into a big number in near future, I want the processing to be done serverside. I've successfully managed to achieve that but it came at a cost. I'm running into a problem while dealing with filters.
From the figure above, (1) is the column whose value will be in int (item_id), but using some small modifications inside the while loop of the mysql resource, I'm displaying the corresponding string using Table B.
Now if I use the filter (2), it is working fine since those values come from Table A
The Problem
When I try to filter from the field (3), if I enter a string value such as fab it says no record found. But if I enter an int such as 11 I get a single row which contains Fabwash as the item name.
So while filtering I'm required to use the direct value used in Table A and not its corresponding string value stored in Table B. I hope the point that I'm putting across is understandable because it is hard to explain it in words.
I'm clueless on how to solve the issue.
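One possible direction, sketched here under assumptions, is to join Table B in the server-side query so that the filter is applied to the item name and not to the numeric item_id. The example uses Python with the standard-library sqlite3 module instead of the PHP/MySQL setup described above, and the table names `table_a` and `table_b` as well as the function name are hypothetical.

```python
import sqlite3


def search_items(conn, term):
    """Filter rows of table_a by the item name stored in table_b."""
    cur = conn.execute(
        "SELECT a.sno, b.item_name, a.start_date, a.end_date "
        "FROM table_a AS a "
        "JOIN table_b AS b ON b.id = a.item_id "
        "WHERE b.item_name LIKE ? "
        "ORDER BY a.sno",
        (f"%{term}%",),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_a (sno INTEGER PRIMARY KEY, item_id INTEGER, "
    "start_date TEXT, end_date TEXT, created_on TEXT)"
)
conn.execute("CREATE TABLE table_b (id INTEGER PRIMARY KEY, item_name TEXT)")
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?, ?, ?, ?)",
    [
        (10523563, 2, "2013-10-24", "2013-10-27", "2013-01-22"),
        (10535677, 25, "2013-11-18", "2013-11-29", "2013-01-22"),
        (10587723, 11, "2013-05-04", "2013-05-24", "2013-01-22"),
        (10598734, 5, "2013-06-14", "2013-06-22", "2013-01-22"),
    ],
)
conn.executemany(
    "INSERT INTO table_b VALUES (?, ?)",
    [(2, "Timesheet testing"), (25, "Vigour"), (11, "Fabwash"), (5, "Cruise")],
)

matches = search_items(conn, "fab")
```

Because the name comes back from the query itself, the lookup inside the while loop would no longer be needed, and the string typed into the filter is compared against the same value that is displayed.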
The system is as such. Tutors provide their availability (Monday - Sunday) and the time frame they are available on that day (0700 - 1400) (ie: 7am - 2pm).
I am trying to figure out the best way to store and search through this information to find available tutors. Searching only needs to be done on a daily system (ie: day of the week - mon, tues, wed, etc).
My planned infrastructure:
//Tutor Availability
---------------------------------------------------------------------------
tutorID | monday | tuesday | wednesday | thursday | friday |
---------------------------------------------------------------------------
27 | 0700-1200 | NULL | 1400-1800 | NULL | NULL |
---------------------------------------------------------------------------
35 | NULL | 1400-1600 | NULL | NULL | 1100-1900 |
//Scheduled tutor sessions
------------------------------------
tutorID | day | time |
------------------------------------
27 | monday | 0700-0900 |
------------------------------------
35 | friday | 1300-1500 |
Query: SELECT tutorid FROM tutoravailability WHERE 'monday'=... is available between 0900-1100 and is not in scheduled tutor session.
I have been searching forever about how I can search through (and store) these time intervals in MySQL. Is this the best way to store the time intervals of a 24 hours day? Will it even be possible to search between these intervals? Am I approaching this from the wrong way? Any insight is appreciated.
Updated Infrastructure
//Tutor Availability
-----------------------------------------------------
tutorID | day | start_time | end_time | PK |
-----------------------------------------------------
27 | mon | 0700 | 1200 | 1 |
-----------------------------------------------------
27 | fri | 1400 | 1800 | 2 |
-----------------------------------------------------
35 | tue | 1100 | 1600 | 3 |
//Scheduled tutor sessions
--------------------------------------------------------
tutorID | day | start_time | end_time | PK |
--------------------------------------------------------
27 | mon | 0800 | 1000 | 1 |
--------------------------------------------------------
27 | fri | 1600 | 1800 | 2 |
So with this system it will be much simpler to search for available times. However I am still at a loss as to how to compare the availability against the scheduled lessons to ensure no overlap.
SELECT tutorID
FROM tutoravailability WHERE day = 'fri'
AND start_time <= '1400'
AND end_time >= '1530'
Now I don't understand how I would compare this query against the Scheduled tutor sessions table to avoid duplicate bookings.
Final Update
To ensure there is no overlap between the Scheduled Tutor sessions, I will use the MySQL BETWEEN clause to search for the start and end time.
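For comparison, here is a sketch of a different technique from the BETWEEN approach: a NOT EXISTS subquery that rejects any tutor who has a session overlapping the requested interval. It is written in Python with the standard-library sqlite3 module standing in for MySQL, and the table name `scheduledsessions` and the function name are hypothetical. Times are kept as zero-padded four-character strings, so string comparison orders them correctly.

```python
import sqlite3


def available_tutors(conn, day, start, end):
    """Tutors whose availability covers [start, end) with no overlapping session."""
    cur = conn.execute(
        "SELECT a.tutorID FROM tutoravailability AS a "
        "WHERE a.day = ? AND a.start_time <= ? AND a.end_time >= ? "
        "AND NOT EXISTS ("
        "  SELECT 1 FROM scheduledsessions AS s "
        "  WHERE s.tutorID = a.tutorID AND s.day = a.day "
        # Two intervals overlap when each one starts before the other ends.
        "  AND s.start_time < ? AND s.end_time > ?"
        ") ORDER BY a.tutorID",
        (day, start, end, end, start),
    )
    return [row[0] for row in cur.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tutoravailability "
    "(tutorID INTEGER, day TEXT, start_time TEXT, end_time TEXT)"
)
conn.execute(
    "CREATE TABLE scheduledsessions "
    "(tutorID INTEGER, day TEXT, start_time TEXT, end_time TEXT)"
)
conn.executemany(
    "INSERT INTO tutoravailability VALUES (?, ?, ?, ?)",
    [(27, "mon", "0700", "1200"), (27, "fri", "1400", "1800"), (35, "tue", "1100", "1600")],
)
conn.executemany(
    "INSERT INTO scheduledsessions VALUES (?, ?, ?, ?)",
    [(27, "mon", "0800", "1000"), (27, "fri", "1600", "1800")],
)

friday_early = available_tutors(conn, "fri", "1400", "1530")
friday_late = available_tutors(conn, "fri", "1500", "1700")
```

The strict comparisons mean back-to-back sessions do not count as overlapping, so a request for Monday 1000-1200 is still accepted even though a session ends at 1000.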
If you store the time interval using two columns it will be much easier for you to perform a search using sql query.
i.e. tutorID, day, startTime, endTime
You can use bit flags to indicate the availability and the scheduled time: 24 bits per day, one bit per hour, for the available hours, and another 24 bits for the scheduled hours.
In the Tutor Availability table, let's say '1' stands for available and '0' stands for unavailable. In the Scheduled table, '0' stands for scheduled and '1' stands for unscheduled.
So the available interval 0900-1100 can be stored as POW(2,9) | POW(2,10) | POW(2,11); the scheduled 1000-1200 can be stored as ~(POW(2,10) | POW(2,11) | POW(2,12)), using the bitwise inversion operator ~ (in MySQL, ^ is XOR).
Then the following query can give you the availability of a tutor on Monday between 9 am and 11 am:
SELECT ta.tutorid FROM tutoravailability ta, tutorscheduled ts
WHERE ta.tutorid = ts.tutorid AND ts.day = 'monday'
AND (ta.monday & ts.time & (POW(2,9) | POW(2,10) | POW(2,11))) = (POW(2,9) | POW(2,10) | POW(2,11))
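The bit arithmetic behind this query can be tried out in a few lines of Python. This is a sketch with hypothetical helper names. It follows the conventions above: one bit per hour, an interval such as 0900-1100 sets the bits for hours 9, 10 and 11, a 1 in the availability mask means available, and a 1 in the scheduled mask means unscheduled.

```python
# One bit per hour of the day.
HOURS_MASK = (1 << 24) - 1


def hours_mask(first_hour, last_hour):
    """Set one bit per hour from first_hour through last_hour (inclusive)."""
    mask = 0
    for hour in range(first_hour, last_hour + 1):
        mask |= 1 << hour
    return mask


def unscheduled_mask(booked):
    """Scheduled-table convention: 1 = unscheduled, so invert the booked hours."""
    return ~booked & HOURS_MASK


def is_free(available, unscheduled, wanted):
    """True when every wanted hour is both available and unscheduled."""
    return (available & unscheduled & wanted) == wanted


available = hours_mask(9, 11)          # available 0900-1100
nothing_booked = unscheduled_mask(0)   # no sessions at all
booked_10_to_12 = unscheduled_mask(hours_mask(10, 12))

free_without_bookings = is_free(available, nothing_booked, hours_mask(9, 11))
free_with_booking = is_free(available, booked_10_to_12, hours_mask(9, 11))
```

Masking the inverted value with HOURS_MASK keeps it within 24 bits, which is the Python counterpart of storing the flags in a fixed-width column. Note that this scheme only works at whole-hour granularity.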
For a table with 100% reading (no writing), which structure is better and why?
[My table has many columns, but I've made an example here with 4 columns for simplicity]
Option 1: One table with multiple columns
ID | Length | Width | Height
-----------------------------------------
1 | 10 | 20 | 30
2 | 100 | 200 | 300
Option 2: Two tables; one storing column headers, and other storing values
Table 1:
ID | Object_ID | Attribute_ID | Attribute_Value
------------------------------------------
1 | 1 | 1 | 10
2 | 1 | 2 | 20
3 | 1 | 3 | 30
4 | 2 | 1 | 100
5 | 2 | 2 | 200
6 | 2 | 3 | 300
Table 2:
ID | Name
-------------------
1 | Length
2 | Width
3 | Height
Your second option is an under-optimized implementation of the EAV anti-pattern:
Entity-Attribute-Value Model
Why it's bad has already been argued to death on this site and elsewhere.
You'll get much better results from the first.
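To make the difference tangible, here is a small sketch that reads the same object from both layouts. It uses Python's standard-library sqlite3 module as a stand-in for whichever engine is actually in use, and the table names `objects_wide`, `attribute_values` and `attributes` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Option 1: one table with one column per attribute.
    CREATE TABLE objects_wide (
        ID INTEGER PRIMARY KEY, Length INTEGER, Width INTEGER, Height INTEGER
    );
    INSERT INTO objects_wide VALUES (1, 10, 20, 30), (2, 100, 200, 300);

    -- Option 2: attribute names in one table, values in another.
    CREATE TABLE attributes (ID INTEGER PRIMARY KEY, Name TEXT);
    INSERT INTO attributes VALUES (1, 'Length'), (2, 'Width'), (3, 'Height');
    CREATE TABLE attribute_values (
        ID INTEGER PRIMARY KEY, Object_ID INTEGER,
        Attribute_ID INTEGER, Attribute_Value INTEGER
    );
    INSERT INTO attribute_values VALUES
        (1, 1, 1, 10), (2, 1, 2, 20), (3, 1, 3, 30),
        (4, 2, 1, 100), (5, 2, 2, 200), (6, 2, 3, 300);
    """
)


def dimensions_wide(object_id):
    """Option 1: a single-row lookup by primary key."""
    return conn.execute(
        "SELECT Length, Width, Height FROM objects_wide WHERE ID = ?",
        (object_id,),
    ).fetchone()


def dimensions_eav(object_id):
    """Option 2: join the two tables and pivot three rows back into one."""
    return conn.execute(
        "SELECT "
        "  MAX(CASE WHEN n.Name = 'Length' THEN v.Attribute_Value END), "
        "  MAX(CASE WHEN n.Name = 'Width' THEN v.Attribute_Value END), "
        "  MAX(CASE WHEN n.Name = 'Height' THEN v.Attribute_Value END) "
        "FROM attribute_values AS v "
        "JOIN attributes AS n ON n.ID = v.Attribute_ID "
        "WHERE v.Object_ID = ?",
        (object_id,),
    ).fetchone()


wide_row = dimensions_wide(1)
eav_row = dimensions_eav(1)
```

Both functions return the same values, but the second layout needs a join and a conditional aggregate per attribute to rebuild what the first layout stores directly, and the query has to be rewritten whenever an attribute is added.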
I will preface this by saying that I'm a relative novice to SQL and database tables; that, however, doesn't mean that I don't know my basics.
Unless your example is heavily oversimplified, you really should use the first example. Not only will it be faster and easier to query, but it simply makes more sense.
In this example, you don't need to split your tables at all; your 'Attribute IDs' are adequately represented by the table headers. Further, these values have no real meaning by themselves, so they really don't need to be in another table.
You would generally break out a new table and reference it as you have if you had another object, existing separately, relating to your object with a one-to-many relationship.
Here's an example (actually from my database on an O'Reilly server) using blog entries and comments on blog entries:
mysql> select * from blog_entries;
+----+--------------+-------------+---------------------+
| id | poster | post | timestamp |
+----+--------------+-------------+---------------------+
| 1 | lunchmeat317 | blah blah | 0000-00-00 00:00:00 |
| 2 | Yongho Shin | yadda yadda | 0000-00-00 00:00:00 |
+----+--------------+-------------+---------------------+
2 rows in set (0.00 sec)
mysql> select id, blog_id, poster, post, timestamp from blog_comments;
+----+---------+--------------+----------------+---------------------+
| id | blog_id | poster | post | timestamp |
+----+---------+--------------+----------------+---------------------+
| 1 | 1 | lunchmeat317 | humina humina | 0000-00-00 00:00:00 |
| 2 | 1 | Joe Blow | huh? | 0000-00-00 00:00:00 |
| 3 | 2 | lunchmeat317 | yakk yakk yakk | 0000-00-00 00:00:00 |
| 4 | 2 | Yongho Shin | lol | 0000-00-00 00:00:00 |
+----+---------+--------------+----------------+---------------------+
4 rows in set (0.00 sec)
mysql>
Think about it from a logical perspective; there's no reason to artificially inject complexity into this design when it doesn't need to be there. In your example, length, width, and height aren't really separate objects, and they're all related to the dimensions of the object you're describing in the table row. Further, length width and height only have one value at a given time.
I hope that made some sense - if I was a bit pedantic in my pedagogy, I apologize. However, if someone else stumbles on this question, hopefully this example will help them.
Good luck.
Edit: I just realized that your question was specifically about performance. That's a little more in-depth, perhaps based on the db engine that you use? Generally, though, I would imagine that querying a table without doing any joins would be slightly faster, considering that denormalization is a commonly-cited method of improving performance.