I am working on a PHP / MySQL stat logging program and am trying to find the best MySQL DB structure for it.
There is a part where visitors will be able to see up-to-date stats (i.e. the latest 20 entries), but they will also be able to see today's overall, yesterday's overall, last 7 days' overall and last 30 days' overall stats.
From the data I'm pulling, the real-time stats will be updated every 60 seconds with at least 10 new entries per update.
Is my logic correct to set up two tables ... one to act as "today's" stats and another to act as the overall archive ... like:
todays_stats
id
from_url
entry_date
overall_stats
id
from_url
entry_date
Then double-insert each new entry, but truncate todays_stats at midnight every night via a cron job?
Or is there a more efficient way of doing this?
It depends on your daily stat row count, whether you need to delete historical data, and how many indexes you have. We need to delete historical data and have 7-8 indexes on a large amount of stat data, so we separate the data into daily tables and write stored procedures to fetch the data (last day, last 7 days, last 30 days, etc.). Dropping a table is much faster than a DELETE FROM table WHERE indexed_column = 6-month-old-data.
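For illustration, a rough sketch of the daily-table scheme (all table names here are hypothetical):

-- One table per day, cloned from a template:
CREATE TABLE stats_20101124 LIKE stats_template;

-- Expiring a six-month-old day is then a fast metadata operation:
DROP TABLE stats_20100524;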
I think the best way is to keep one table that holds the current data set, plus a separate table for the overall stats; at midnight you insert all the data from the current table into the overall table with an
INSERT INTO `overall` SELECT * FROM `current`
query, then truncate the current table after the data has been copied successfully.
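For example, with the table names from the original question, the whole midnight job boils down to two statements (note that TRUNCATE commits implicitly, so make sure the INSERT succeeded before truncating):

INSERT INTO overall_stats (id, from_url, entry_date)
SELECT id, from_url, entry_date FROM todays_stats;

TRUNCATE TABLE todays_stats;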
I've been struggling for days with what I think is a pretty advanced operation that I plan to schedule to run in my database every week.
This is the structure of my table (unit_uptime_daily):
What I need to do is run a script every week that, for every unit_id that exists in that table, gets all the rows for that unit_id whose timestamp falls within the last 7 days (the present day minus 6 days, i.e. the previous week), adds up the total_uptime column of those 7 rows, and inserts the result as a new row into a weekly table.
Effectively, I am grabbing the 7 latest rows for each unit_id, adding up the total_uptime, and then inserting the unit_id, the summed total_uptime and a timestamp into a new table.
Thanks in advance, if this is even possible to do!
Use cron jobs. Most hosting providers offer this facility. Google "cron jobs" to find out more, and here's an answer that could help you:
Run a PHP file in a cron job using CPanel
I have solved this using PHP: I got the list of unique IDs and did all the maths in a loop, inserting each result into the new table. I had already done this, but would have liked it to be possible in SQL.
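For the record, it can be done in a single SQL statement; here is a sketch assuming hypothetical column names unit_id, total_uptime and stamp, and a hypothetical target table unit_uptime_weekly:

INSERT INTO unit_uptime_weekly (unit_id, total_uptime, stamp)
SELECT unit_id,
       SUM(total_uptime),  -- sum of the week's daily rows
       NOW()
FROM unit_uptime_daily
WHERE stamp >= CURDATE() - INTERVAL 6 DAY  -- the previous 7 days including today
GROUP BY unit_id;

Scheduled weekly via cron (or MySQL's event scheduler), this replaces the whole PHP loop with one round trip.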
I want to make a table where the entries expire 24 hours after they have been inserted, using PHP and MySQL.
Ideally I want to run a "deleting process" that removes old entries every time a user interacts with my server. Since this runs frequently, it will never have large amounts of data to delete, so it should only take a few milliseconds.
I have given each entry a date/time added value.
How would I do this?
You could use MySQL's event scheduler either:
to automatically delete such records when they expire:
CREATE EVENT delete_expired_101
ON SCHEDULE AT CURRENT_TIMESTAMP + INTERVAL 24 HOUR DO
DELETE FROM my_table WHERE id = 101;
to run an automatic purge of all expired records on a regular basis:
CREATE EVENT delete_all_expired
ON SCHEDULE EVERY 1 HOUR DO
DELETE FROM my_table WHERE expiry < NOW();
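Note that the event scheduler is switched off by default, so neither event will fire until it is enabled:

SET GLOBAL event_scheduler = ON;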
You shouldn't run a delete process every time a user interacts; it slows things down. Use a cron job instead (every minute or hour).
You'll want to index the added timestamp column and then run DELETE FROM table WHERE added < FROM_UNIXTIME(UNIX_TIMESTAMP()-24*60*60)
Maybe you'll want to check out partitions, which divide the table into separate physical pieces while it still behaves as one table. The advantage is that you don't need to DELETE the entries one by one; you can keep a separate partition for each day and simply drop the oldest one.
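A minimal sketch of daily RANGE partitioning (table and partition names are hypothetical; requires MySQL 5.1+):

CREATE TABLE log (
    added DATETIME NOT NULL,
    from_url VARCHAR(255)
)
PARTITION BY RANGE (TO_DAYS(added)) (
    PARTITION p20101123 VALUES LESS THAN (TO_DAYS('2010-11-24')),
    PARTITION p20101124 VALUES LESS THAN (TO_DAYS('2010-11-25'))
);

-- Expiring a day is then a metadata operation, not a row-by-row DELETE:
ALTER TABLE log DROP PARTITION p20101123;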
I think that YOU think that much data slows down tables. Maybe you should use EXPLAIN (MySQL Manual) and optimize your SELECT queries using indexes (MySQL Manual).
UPDATE: Check out eggyal's answer; it's another approach worth taking a look at.
You can look into using a cron job (http://en.wikipedia.org/wiki/Cron). Make it run once every 24 hours, or whatever matches your requirement.
This will help:
Delete MySQL row after time passes
So, I previously developed an employee scheduling system in PHP. It was VERY inefficient. When I created a new schedule, I generated a row in a table called 'schedules' and, for every employee affected by that schedule, a row in a table called 'schedule_days' that gave their start and stop time for that specific date. Editing the schedules was a wreck too: on the editing page, I pulled every user on the specific schedule from the database and printed them out on the page. It was very logical, but it was very slow.
You can imagine how long it takes to load around 15 employees for a week-long schedule: 1 query for the schedule, 1 query for each user, and 7 queries (one per day) for every user. With 15 users that's 121 queries for a single page, which is far too many. So I'm simply asking: what is someone else's view on the best way to do this?
For rotation-based schedules, you want to use an exclusion-based system. If you know that employee x works in rotation y within date range z, then you can calculate the individual days for that employee on the fly. If they're off sick, on a course, etc., add an exclusion to the employee for that day. This will make the database a lot smaller than tracking each day for each employee.
table employee {EmployeeID}
table employeeRotations {EmployeeRotationID, EmployeeID, RotationID, StartDate, EndDate}
table rotation {RotationID, NumberOfDays, StartDate}
table rotationDay {RotationDayID, RotationID, ScheduledDay, StartTime, EndTime}
table employeeExceptions {EmployeeExceptionID, EmployeeID, ExceptionDate, ExceptionTypeID (or whatever you want here)}
From there, you can write a function that returns On/Off/Exception for any given date or any given week.
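A sketch of that lookup as a single query, with hard-coded placeholders for the employee and date, and assuming ScheduledDay is a 0-based offset into the rotation cycle:

SELECT CASE
           WHEN ex.EmployeeExceptionID IS NOT NULL THEN 'Exception'
           WHEN rd.RotationDayID IS NOT NULL THEN 'On'
           ELSE 'Off'
       END AS status
FROM employeeRotations er
JOIN rotation r ON r.RotationID = er.RotationID
LEFT JOIN rotationDay rd
       ON rd.RotationID = r.RotationID
      -- which day of the cycle does this date fall on?
      AND rd.ScheduledDay = DATEDIFF('2013-01-10', r.StartDate) % r.NumberOfDays
LEFT JOIN employeeExceptions ex
       ON ex.EmployeeID = er.EmployeeID
      AND ex.ExceptionDate = '2013-01-10'
WHERE er.EmployeeID = 42
  AND '2013-01-10' BETWEEN er.StartDate AND er.EndDate;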
Sounds like you need to learn how to do a JOIN rather than doing many round trips to the server for each item.
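For the schema in the question, a sketch of one such query (column names are guesses based on the description) that pulls every employee's days for a schedule in a single round trip:

SELECT d.user_id, d.day_date, d.start_time, d.stop_time
FROM schedules s
JOIN schedule_days d ON d.schedule_id = s.id
WHERE s.id = 42
ORDER BY d.user_id, d.day_date;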
I have a MySQL database, or more specifically, a MySQL table in which I store IP addresses.
This is because I limit the number of messages being sent from my website.
I simply check if the IP is in the table, and if it is, I tell the user to "slow down".
Is there any way to make this MySql table only store a row (a record) for x minutes?
Other solutions are also appreciated...
No, but you can use a TIMESTAMP field to store when the row was inserted / modified and occasionally delete rows that are older than x minutes.
DELETE FROM your_table
WHERE your_timestamp < NOW() - INTERVAL 5 MINUTE
To solve your actual problem, though, I'd suggest a table with one row per user holding the last time they sent a message. Assuming it is indexed correctly and your queries are efficient, you probably won't ever need to delete any rows from this table, except perhaps if you use a foreign key to the user table and delete the corresponding user. When a user sends a message, insert a row if one doesn't already exist; otherwise update the existing row (you can, for example, use the MySQL extension REPLACE for this if you wish).
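A sketch of that one-row-per-user table (names are hypothetical; REPLACE is MySQL-specific):

CREATE TABLE last_message (
    user_id INT NOT NULL PRIMARY KEY,
    sent_at DATETIME NOT NULL
);

-- Upsert on every send:
REPLACE INTO last_message (user_id, sent_at) VALUES (42, NOW());

-- Before accepting a new message, check whether the user is still throttled:
SELECT 1 FROM last_message
WHERE user_id = 42 AND sent_at > NOW() - INTERVAL 5 MINUTE;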
I would recommend adding a time condition to the WHERE clause of the SELECT you use to "simply check if the IP is in the table":
SELECT * FROM table WHERE ip = <whatever> AND timestamp > NOW() - INTERVAL 3 MINUTE
Then maybe empty out that table once every night.
I'd make a column with the timestamp of the last sent message and another with the total number of posts. Before updating the table, check if at least X minutes have passed since the last post. If so, reset the total number of posts to 1; otherwise increment it by 1.
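A sketch of that update against a hypothetical message_limits table keyed by IP; MySQL applies SET assignments left to right, so post_count is computed against the old last_sent:

UPDATE message_limits
SET post_count = IF(last_sent < NOW() - INTERVAL 5 MINUTE, 1, post_count + 1),
    last_sent  = NOW()
WHERE ip = '203.0.113.7';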
One approach that doesn't involve deleting the IP addresses after a certain interval is to store the addresses as "temporal" data, i.e. records that are only valid for a certain period.
Simplest way to do that would be to add a timestamp column to the table and, when entering an IP, capture either the time it was entered into the table, or the time after which it is no longer being "limited".
The code that grabs IPs to be limited then checks the timestamp to see if it's either:
older than a certain threshold (e.g. if you recorded the IP more than an hour ago, ignore it) or
the current time is greater than the expiry date stored there (e.g. if an IP's timestamp says 2010-11-23 23:59:59 and that is in the past, ignore it)
depending on what you choose the timestamp to represent.
The other solutions here using a timestamp and a cron job are probably your best option, but if you insist on MySQL handling this itself, you could use Events. They're like cron jobs, except MySQL handles the scheduling itself. It requires MySQL 5.1+, though.
I'm creating a calendar that displays a timetable of events for a month. Each day has several parameters that determine whether more events can be scheduled for that day (how many staff are available, how many times are available, etc.).
My database is set up using three tables:
Regular Schedule - this is used to create an array for each day of the week that outlines how many staff are available, what hours they are available etc
Schedule Variations - If there are variations for a date, this overrides the information from the regular schedule array.
Events - Existing events, referenced by the date.
At this stage, the code loops through the days in the month and checks two to three things for each day.
Are there any variations in the schedule (public holiday, shorter hours etc)?
What hours/number of staff are available for this day?
(If staff are available) How many events have already been scheduled for this day?
Step 1 and step 3 each require a database query - assuming 30 days a month, that's 60 queries per page view.
I'm worried about how this could scale. For a few users I don't imagine it would be much of a problem, but if 20 people try to load the page at the same time, it jumps to 1,200 queries...
Any ideas or suggestions on how to do this more efficiently would be greatly appreciated!
Thanks!
I can't think of a good reason you'd need to limit each query to one day. Surely you can just select all the values between a pair of dates.
Similarly, you could use a join to get the number of events scheduled for a given day.
Then do the loop (for each day) on the array returned by the database query.
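For instance, two month-wide queries (table and column names are assumptions based on the question) replace the 60 per-day ones:

-- All schedule variations for the month in one go:
SELECT * FROM schedule_variations
WHERE variation_date BETWEEN '2009-03-01' AND '2009-03-31';

-- Event counts per day in one go:
SELECT event_date, COUNT(*) AS events_scheduled
FROM events
WHERE event_date BETWEEN '2009-03-01' AND '2009-03-31'
GROUP BY event_date;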
Create a table:
t_month (day INT)
INSERT
INTO t_month
VALUES
(1),
(2),
...
(31)
Then query:
SELECT *
FROM t_month, t_schedule
WHERE schedule_date = '2009-03-01' + INTERVAL t_month.day - 1 DAY
AND schedule_date < '2009-03-01' + INTERVAL 1 MONTH
AND ...
Instead of 30 queries you get just one with a JOIN.
Other RDBMSs allow you to generate rowsets on the fly, but MySQL doesn't.
You can, though, replace t_month with an ugly
SELECT 1 AS month_day
UNION ALL
SELECT 2
UNION ALL
...
SELECT 31
I faced the same sort of issue with http://rosterus.com, and we just loaded most of the data into arrays at the top of the page, then queried the arrays for the relevant data. Pages loaded 10x faster after that.
So run one or two wide queries that gather all the data you need, choose appropriate keys, and store each result set in an array. Then access the array instead of the database. PHP is very flexible with array indexing; you can use all sorts of things as keys... or several indexes.