The Matrix Part 4 and MySql dilemma - php

Basically I have the following design for an events table:
event:
id
code
date
When a new event is created, I want to do the following:
Check if there are any codes already available. A code is available if the date has already passed.
$code1 = SELECT code FROM event WHERE DATE_ADD(date, INTERVAL 7 DAY) < NOW() AND code NOT IN (SELECT code FROM event WHERE date_start > NOW()) LIMIT 1
If a code is available, get that code and use that for the new event.
INSERT INTO event (code, date) VALUES ($code1, NOW())
If a code is not available, then generate a new code.
The problem is I am afraid that when 2 events are created at the same time, they both get the same code. How can I prevent that?
The goal is to assign each event a code from 1 to 100. Since there are only 100 numbers, I need to recycle the codes of old events for new events, and I never want two different events to end up with the same code.

You ought to lock the table while you work:
LOCK TABLE event WRITE;
SELECT MIN(code) FROM event WHERE DATE_ADD(date_end, INTERVAL 7 DAY) < NOW()
AND code NOT IN (SELECT code FROM event WHERE date_start > NOW());
...
INSERT INTO event ...
UNLOCK TABLES;
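In PHP this could look roughly like the following (a minimal mysqli sketch; the connection details and the generate_new_code() helper are placeholders, and it uses the single date column from the question's schema rather than date_start/date_end). Note that MySQL will not let a locked table appear twice in one query under the same name, so the subquery gets its own locked alias:
<?php
$db = new mysqli('localhost', 'user', 'pass', 'mydb'); // placeholder credentials

// Lock the table plus an alias, because the SELECT refers to `event` twice.
$db->query("LOCK TABLES event WRITE, event AS e2 WRITE");

// Look for a recyclable code (mirrors the query from the question).
$res = $db->query(
    "SELECT MIN(code) AS code FROM event
     WHERE DATE_ADD(date, INTERVAL 7 DAY) < NOW()
       AND code NOT IN (SELECT code FROM event AS e2 WHERE date > NOW())"
);
$row  = $res->fetch_assoc();
$code = $row['code'];

if ($code === null) {
    $code = generate_new_code(); // hypothetical helper: returns an unused 1-100 code
}

$stmt = $db->prepare("INSERT INTO event (code, date) VALUES (?, NOW())");
$stmt->bind_param('i', $code);
$stmt->execute();

$db->query("UNLOCK TABLES");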
You might also keep a table with all active codes (in this case all numbers from 1 to 100). In this case you can do the INSERT with a single statement:
INSERT INTO event ( code, <other fields> )
SELECT MIN(codes.code) AS code, <other values> FROM codes
LEFT JOIN event ON ( codes.code = event.code
AND ( DATE_ADD(event.date_end, INTERVAL 7 DAY) >= NOW()
OR event.date_start >= NOW()) )
WHERE event.code IS NULL;
This selects all codes that are not used in "active" events, and inserts the smallest of them into event (add other fields as needed).
You could also use a subquery ( SELECT DISTINCT code FROM event ) in place of the codes table, but in that case you would only select codes that have already been used at least once; any "new" codes would be ignored.
A side effect of the above logic is that larger codes get reused less often, i.e., if you have twenty active events, chances are that they're using codes from 1 to 20. If you instead want to recycle codes evenly, you can for example SELECT ... ORDER BY RAND() LIMIT 1.

Ok, this is a long shot, and I'm not a database person, so take my suggestion with a grain of salt and do your own research ...
I think what you want is serializable transactions. Basically, you first ask MySQL to make transaction isolation serializable using SET TRANSACTION ISOLATION LEVEL SERIALIZABLE. Then, before your three steps, you begin a transaction by running START TRANSACTION, and after your three steps you run COMMIT.
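As a rough SQL-level sketch of that sequence (using the date column from the question's schema; the literal 42 just stands in for whichever code your PHP picked or generated, and without a SESSION/GLOBAL keyword SET TRANSACTION applies only to the next transaction):
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
START TRANSACTION;

-- step 1: look for a recyclable code
SELECT code FROM event
WHERE DATE_ADD(date, INTERVAL 7 DAY) < NOW()
  AND code NOT IN (SELECT code FROM event WHERE date > NOW())
LIMIT 1;

-- steps 2/3: insert the new event with the found (or freshly generated) code
INSERT INTO event (code, date) VALUES (42, NOW());

COMMIT;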
Some further reading:
http://en.wikipedia.org/wiki/Database_transaction
http://en.wikipedia.org/wiki/Serializability
http://dev.mysql.com/doc/refman/5.0/en/sql-syntax-transactions.html
I am not sure how much overhead serializability incurs. Hopefully someone more familiar with databases can chip in.
If you can work around your problem, I'd personally rather do that than use transactions.

Related

How to insert only if entries of the same date is not more than 4?

I am creating a module wherein customers (all customers collectively, not each individual customer) should only be able to book a schedule for the same date a maximum of 5 times across 2 tables, namely acceptance_bookings and turnover_bookings.
For example, 5 customers booked a schedule for the same date (07-17-2019), 3 of which are acceptance_bookings and 2 turnover_bookings. If any other customer attempts to book on that date, they should be rejected.
Currently my implementation would look like the following:
** START OF TRANSACTION **
** CODE FOR INSERTING ACCEPTANCE OR TURNOVER BOOKING **
$acceptance_bookings = AcceptanceBooking::selectRaw("DATE(date) as date")
->havingRaw("DATE(date) = '$dateString'")
->get()
->count();
$turnover_bookings = TurnoverBooking::selectRaw("DATE(date) as date")
->havingRaw("DATE(date) = '$dateString'")
->get()
->count();
$total_count = $acceptance_bookings + $turnover_bookings;
if($total_count > 5)
** ROLLBACK THE DB TRANSACTION
** END OF TRANSACTION
I am worried that in race conditions, there is a chance that both inserts will get past the validation check at the end of the code which checks if there are more than 5 entries to that date.
In my implementation (code above), I insert the booking before I check if it actually is already past limit, and roll back the transaction if it is past 5 entries. Placing the condition before the insert would probably yield the same result in a race condition.
What is the best way to handle this scenario to avoid race conditions? The strict requirements I want to maintain is that there should never be more than 5 entries of the same booking schedules across the 2 tables.
Note: I cannot change the tables, like trying to create a booking table and have a reference ID to determine if it is an acceptance or turnover booking as I am integrating an already working system.
Your code has the right idea; what is the problem? BEGIN at the start and COMMIT at the end (if there is no failure).
One thing more: the SELECT needs to be SELECT ... FOR UPDATE. This "locks" the rows that you are counting, which avoids the race problem.
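A sketch of what that could look like with Laravel's query builder, checking before the insert so the race is resolved by the row locks rather than by a rollback ($data and the exception message are placeholders):
use Illuminate\Support\Facades\DB;

DB::transaction(function () use ($dateString, $data) {
    // Lock the rows being counted so concurrent requests wait for each other.
    $acceptance = AcceptanceBooking::whereRaw("DATE(date) = ?", [$dateString])
        ->lockForUpdate()
        ->count();
    $turnover = TurnoverBooking::whereRaw("DATE(date) = ?", [$dateString])
        ->lockForUpdate()
        ->count();

    if ($acceptance + $turnover >= 5) {
        // Throwing inside the closure rolls the whole transaction back.
        throw new \RuntimeException('No more slots available for this date.');
    }

    AcceptanceBooking::create($data); // or TurnoverBooking::create($data)
});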

Selecting random rows from a table automatically

I'm working on a project that requires a back-end service. I am using MySQL and PHP scripts to communicate with the server side. I would like to add a new feature on the back-end: the ability to automatically generate a table with 3 'lucky' members from a table_members every day. In other words, I would like MySQL to pick 3 random rows from a table and add these rows to another table (if that is possible). I understand that I can achieve this by manually calling the RAND() function on that table, but ... that would be painful!
Is there any way to achieve the above?
UPDATE:
Here is my solution on this after comments/suggestions from other users
CREATE EVENT `draw` ON SCHEDULE EVERY 1 DAY STARTS '2013-02-13 10:00:00' ON COMPLETION NOT PRESERVE ENABLE DO
INSERT INTO tbl_lucky(`field_1`)
SELECT u_name
FROM tbl_members
ORDER BY RAND()
LIMIT 3
I hope this is helpful to others as well.
You can use INSERT ... SELECT and select 3 rows with ORDER BY RAND() and LIMIT 3.
For more information, see the MySQL manual on the INSERT ... SELECT statement.
It's also possible to automate this daily job with MySQL Events (available since MySQL 5.1.6).
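One small caveat worth remembering: the event scheduler has to be switched on for the event to fire, e.g. (requires sufficient privileges):
SET GLOBAL event_scheduler = ON;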

Trending SQL Query

So what I am trying to do is make a trending algorithm; I need help with the SQL code as I can't get it to work.
There are three aspects to the algorithm: (I am completely open to ideas on a better trend algorithm)
1. Plays during the last 24h / total plays of the song
2. Plays during the last 7 days / total plays of the song
3. Plays during the last 24h / the play count of the most played item over the last 24h (whatever item leads the play count over 24h)
Each aspect is worth 0.33, for a maximum possible value of 1.0.
The third aspect is necessary because newly uploaded items would automatically sit at the top unless there was a way to drop them down.
The table is called aud_plays and the columns are:
PlayID: Just an auto-incrementing ID for the table
AID: The id of the song
IP: ip address of the user listening
time: UNIX time code
I have tried a few SQL queries, but I'm pretty stuck and unable to get this to work.
In your aud_songs table (the one the AID points to) add the following columns:
Last24hrPlays INT -- use BIGINT if you plan on getting billion+
Last7dPlays INT
TotalPlays INT
In your aud_plays table create an AFTER INSERT trigger that will increment aud_song.TotalPlays.
UPDATE aud_song SET TotalPlays = TotalPlays + 1 WHERE id = INSERTED.aid
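In MySQL terms (the question's platform), such a trigger might look like this sketch, assuming the song table is named aud_songs with primary key id as in the UPDATE below:
DELIMITER //
CREATE TRIGGER trg_aud_plays_after_insert
AFTER INSERT ON aud_plays
FOR EACH ROW
BEGIN
  -- NEW.aid is the song that was just played
  UPDATE aud_songs SET TotalPlays = TotalPlays + 1 WHERE id = NEW.aid;
END//
DELIMITER ;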
Calculating your trending score in real time for every request would be taxing on your server, so it's best to run a job that refreshes the data every ~5 minutes. Create a SQL Agent Job (or the equivalent scheduled job on your platform) that runs every X minutes and updates Last7dPlays and Last24hrPlays.
UPDATE aud_songs SET Last7dPlays = (SELECT COUNT(*) FROM aud_plays WHERE aud_plays.aid = aud_songs.id AND aud_plays.time BETWEEN GetDate()-7 AND GetDate()),
Last24hrPlays = (SELECT COUNT(*) FROM aud_plays WHERE aud_plays.aid = aud_songs.id AND aud_plays.time BETWEEN GetDate()-1 AND GetDate())
I would also recommend removing old records from aud_plays (possibly those older than 7 days), since you will have the TotalPlays trigger.
It should be easy to figure out how to calculate your 1 and 2 (from the question). Here's the SQL for 3.
SELECT cast(Last24hrPlays as float) / (SELECT MAX(Last24hrPlays) FROM aud_songs) FROM aud_songs WHERE aud_songs.id = #ID
NOTE I made the T-SQL pretty generic and unoptimized to illustrate how the process works.
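Putting the three aspects together, a MySQL version of the per-song score might look like this sketch (NULLIF guards against division by zero; 123 is a placeholder song id):
SELECT
    0.33 * Last24hrPlays / NULLIF(TotalPlays, 0)
  + 0.33 * Last7dPlays   / NULLIF(TotalPlays, 0)
  + 0.33 * Last24hrPlays / NULLIF((SELECT MAX(Last24hrPlays) FROM aud_songs), 0)
    AS trend_score
FROM aud_songs
WHERE id = 123;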

Daily/Weekly/Monthly Highscores

I have an online highscores list made with PHP + MySQL, but it currently only shows the all-time highscores. I want to add daily/weekly/monthly highscores to that, and I was wondering what would be the best way to do that?
My current thought is to add 3 new tables, insert the data into each of them, and then have a cron job run at the appropriate times to delete the data from each of the tables.
Is there any better way I could do this?
Another thing: I want to have it so the page would be highscores.php?t=all, t=daily, etc. How would I make it so that the page changes the query depending on that value?
Thanks.
Use one table and add a column with the date of the highscore. Then have three different queries for each timespan, e.g.
SELECT ... FROM highscores WHERE date > '2011-12-05';
If you want to have a generic version without the need to have a fixed date, use this one:
SELECT ...
FROM highscores
WHERE date >= curdate() - INTERVAL DAYOFWEEK(curdate())+6 DAY;
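For the highscores.php?t=... part of the question, one simple sketch is to whitelist the allowed values of t and map each one to a WHERE clause (the player/score column names are only illustrative):
<?php
$ranges = array(
    'daily'   => "date >= CURDATE()",
    'weekly'  => "date >= CURDATE() - INTERVAL 7 DAY",
    'monthly' => "date >= CURDATE() - INTERVAL 1 MONTH",
    'all'     => "1 = 1",
);

$t     = isset($_GET['t']) ? $_GET['t'] : 'all';
$where = isset($ranges[$t]) ? $ranges[$t] : $ranges['all'];

// $where comes from the whitelist above, never directly from user input.
$sql = "SELECT player, score FROM highscores WHERE $where ORDER BY score DESC LIMIT 10";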

Is there a better way to get old data?

Say you've got a database like this:
books
-----
id
name
And you wanted to get the total number of books in the database, easiest possible SQL:
"select count(id) from books"
But now you want to get the total number of books last month...
Edit: but some of the books have been deleted from the table since last month.
Well, obviously you can't total for a month that's already past - the "books" table is always current and some of the records have already been deleted.
My approach was to run a cron job (or scheduled task) at the end of the month and store the total in another table, called report_data, but this seems clunky. Any better ideas?
Add a default column that has the value GETDATE(), call it "DateAdded". Then you can query between any two dates to find out how many books there were during that date period or you can just specify one date to find out how many books there were before a certain date (all the way into history).
Per comment: You should not delete, you should soft delete.
I agree with JP, do a soft delete/logical delete. For the one extra AND statement per query it makes everything a lot easier. Plus, you never lose data.
Granted, if extreme size becomes an issue, then yeah, you'll potentially have to start physically moving/removing rows.
My approach was to run a cron job (or scheduled task) at the end of the month and store the total in another table, called report_data, but this seems clunky.
I have used this method to collect and store historical data. It was simpler than a soft-delete solution because:
The "report_data" table is very easy to generate reports/graphs from
You don't have to implement special soft-delete code for anything that needs to delete a book
You don't have to add "and active = 1" to the end of every query that selects from the books table
Because the code to do the historical reporting is isolated from everything else that uses books, this was actually the less clunky solution.
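If you go this route, the scheduled job can be as small as a single INSERT, for example as a MySQL event (the report_data columns here are only illustrative):
CREATE EVENT monthly_book_count
ON SCHEDULE EVERY 1 MONTH STARTS '2009-07-01 00:00:00'
DO
  INSERT INTO report_data (snapshot_date, book_count)
  SELECT CURDATE(), COUNT(id) FROM books;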
If you needed data from the previous month then you should not have deleted the old data. Instead you can have a "logical delete."
I would add a status field and some dates to the table.
books
_____
id
bookname
date_added
date_deleted
status (active/deleted)
From there you would be able to query:
SELECT count(id) FROM books WHERE date_added <= '06/30/2009' AND status = 'active'
NOTE: It may not be the best schema, but you get the idea... ;)
If changing the schema of the tables is too much work, I would add triggers that track the changes. With this approach you can track all kinds of things, like date added, date deleted, etc.
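For instance, a delete-tracking trigger might look like this sketch, logging each removal into a hypothetical books_history table:
CREATE TRIGGER trg_books_before_delete
BEFORE DELETE ON books
FOR EACH ROW
  INSERT INTO books_history (book_id, name, deleted_at)
  VALUES (OLD.id, OLD.name, NOW());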
Looking at your problem and the reluctance to change the schema and the code, I would suggest you go with your idea of counting the books at the end of each month and storing the count for the month in another table. You can use a database scheduler to invoke a stored procedure to do this.
You have just taken a baby step down the road of history databases or data warehousing.
A data warehouse typically stores data about the way things were, in a format such that later data will be added to current data instead of superseding current data. There is a lot to learn about data warehousing. If you are headed down that road in a serious way, I suggest a book by Ralph Kimball or Bill Inmon. I prefer Kimball.
Here are the websites: http://www.ralphkimball.com/
http://www.inmoncif.com/home/
If, on the other hand, your first step into this territory is the only step you plan to take, your proposed solution is good enough.
The only way to do what you want is to add a "date_added" column to the books table. Then you could run a query like
select count(id) from books where date_added <= '06/30/2009';
