How should I store weekly poll data? - php

I'm writing a top 10 polling system. Pollsters vote weekly on their top 10. How should I store their poll for each week? That is, how do I control what week the poll is in storage (mySQL) or in my PHP (5.x+) calculations?
(System #1) I've previously done this by having a file "week.txt" on the server that I set at 0 and then ran a cron job weekly to update +1. When I'm storing the poll data in the database, I'd just load the file and know what week it was. I'm looking for something more elegant.
The system must:
1. Be able to start at any time of the year.
2. Be able to skip weeks.
3. Not require shuffling of week numbers during calculations.
4. Be maintenance free by a human, other than a 1-off event (like saying "this is the start date", "this is the end date", once in a blue moon).
5. Use PHP, mySQL, file or other "standard" server items (except other programming languages or databases).
6. Not require other software (e.g. "Install Software X, it does this!").
7. Pollsters are probably non-technical people, so asking them anything other than "Enter your top 10" or "edit your top 10" is not allowed.
8. Be able to go over the end of the calendar year smoothly (e.g. Start in November and end in March).
Other Information:
Pollsters will only be allowed to vote on a single day.
I'll be running multiple polls at once that have no bearing on each other and thus may have different skip weeks.
My system I've used before won't work because in order to skip weeks, it would need interaction and violate #4 and otherwise can't skip weeks and thus violate #2.
I've thought of 2 systems but they have failures of parts of the above:
(System #2) Use PHP's date("W") when the pollster votes. Thus, the first week they all get week #48 (for example), second week #49, so it would be easy to tell which week is which. The problem is that some polls will go over the calendar year, thus I would end up with 48, 49, 50, 51, 52, 1, 2, 3, 4 and violate #3 above. Also, if we skipped weeks, we could end up with 48, 49, 50, 1, 2, 3 which violates #2 and #8 above.
(System #3) Then, I had the idea to just store the date they enter the poll. I would set a date to calculate from the week prior to the first poll, thus, it would just need to calculate the difference between weeks and I'd know the week number. But there's no easy way to skip weeks violating #2 unless we shuffle days which violates #3.
(System #4) I then had the idea that when a pollster first votes, we just record it as their week 1 vote. When they next vote, it's week 2, and so forth. If they wanted to edit their poll (the same day), they'd just use the edit button and we wouldn't record a new poll, because they'd have signaled it's an edit. The only problem is if a pollster forgets a week, meaning I'd have to go in and correct the data (add a blank week or change the week number they voted but violate #4). This handles the skip weeks just fine. Maybe a cron job would solve this? If someone forgot, a cron job that runs after the poll closes would enter in a blank week. Could be programmed to see the max week number entered, if any userid didn't have that week number, just enter in blank data.
If you can adapt any system above to meet all the criteria, that would be fine as well. I'm looking for a simple and elegant and hands-free solution.
Please ask for any other clarifying information.

When working with week numbers, you should keep in mind that 01.01.2012 is in ISO week 52 of 2011 (not week 1 of 2012). The question is whether you want your polls to be fixed on calendar weeks, or on 7-day offsets from the poll start date. Consider a poll that starts on a Friday and ends exactly 7 days later. You'd be crossing the calendar week barrier and thus have 2 "weeks" in which your users may vote.
I'd probably prefer the offset-approach, as strict calendar binding is usually not helpful anyways. Do you want to answer the question "what are the votes in calendar week 34" or "what are the votes in the third week of polling"?
Calculating the offset is quite simple:
// 0-based; subtract first, then divide by the number of seconds in a week
$week_offset = floor((time() - strtotime("2011-11-02")) / (7 * 24 * 60 * 60));
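The same offset can also be computed from two explicit dates, which makes it easy to check against known values. This is only a minimal sketch: the function name weekOffset() is hypothetical, it assumes PHP 5.3+ (for DateTime::diff) and it pins both dates to UTC so that daylight saving changes cannot shift the day count.

```php
<?php
// Hypothetical helper: 0-based number of full weeks between the poll start
// and the moment of voting. Returns null if the vote predates the poll start.
function weekOffset($pollStart, $votedAt)
{
    $utc   = new DateTimeZone('UTC');
    $start = new DateTime($pollStart, $utc);
    $voted = new DateTime($votedAt, $utc);

    $diff = $start->diff($voted);
    if ($diff->invert) {
        return null; // voted before the poll started
    }

    // $diff->days is the total number of whole days between the two dates
    return (int) floor($diff->days / 7);
}

// A poll starting in November and running past the end of the year
// keeps counting upwards: 63 days after the start is offset 9.
var_dump(weekOffset('2011-11-02', '2012-01-04'));
```

Because the offset only depends on the poll's own start date, each poll can carry its own start date and the week numbers never wrap around at the end of the calendar year.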
I don't know your polling algorithm. I'll just demonstrate with a weighted poll (1-3 stars, 3 being best):
| poll_id | user_id | week_offset | vote |
| 7 | 3 | 0 | 1 |
| 7 | 4 | 0 | 3 |
| 7 | 5 | 0 | 2 |
| 7 | 3 | 1 | 2 |
| 7 | 4 | 1 | 2 |
| 7 | 5 | 2 | 3 |
| 7 | 5 | 5 | 1 |
Running a query like
SELECT
poll_id,
week_offset,
SUM(vote) as `value`,
COUNT(user_id) as `count`,
AVG(vote) as `average`
FROM votes_table
WHERE poll_id = 7
GROUP BY poll_id, week_offset
ORDER BY poll_id, week_offset;
would give you something like
| poll_id | week_offset | value | count | average |
| 7 | 0 | 6 | 3 | 2 |
| 7 | 1 | 4 | 2 | 2 |
| 7 | 2 | 3 | 1 | 3 |
| 7 | 5 | 1 | 1 | 1 |
By now you'll probably have noticed the gap 0, 1, 2, [3], [4], 5.
When grabbing that data from MySQL you have to iterate the results anyways. So where's the problem extending that loop for a gap-filler?
<?php
// your database accessor of heart (mine is PDO)
$query = $pdo->query($above_statement);
$results = array();
// last offset already emitted; -1 means "nothing yet", so leading gaps are filled too
$previous_offset = -1;
foreach ($query as $row) {
    // PDO may return numeric columns as strings, so cast before comparing
    $offset = (int) $row['week_offset'];
    // fill every skipped week between the previous row and this one
    for ($missing = $previous_offset + 1; $missing < $offset; $missing++) {
        $results[] = array(
            'value' => 0,
            'count' => 0,
            'average' => 0,
        );
    }
    // add data from db
    $results[] = array(
        'value' => $row['value'],
        'count' => $row['count'],
        'average' => $row['average'],
    );
    // remember where we were
    $previous_offset = $offset;
}
// 0 based list of voting weeks, enjoy
var_dump($results);
You might also be able to do the above right in MySQL using a function.

Related

Mysql GROUP BY and Union based on date condition

Here is my query - it mostly works, but I can see it failing on one condition - explained after the query:
$firstDay = '2020-03-01' ;
$lastDay = '2020-03-31' ;
SELECT * FROM clubEventsCal
WHERE ceFreq!=1
AND (ceDate>='$firstDay' AND ceDate<='$lastDay')
UNION SELECT * FROM clubEventsCal
WHERE ceFreq=1
AND (ceDate>='$firstDay' AND ceDate<='$lastDay')
GROUP BY ceStopDate ORDER BY ceID,ceDate ;
The first select gives me all Event records between the two dates. The second select gives me grouped/summarized Event records between the two dates. The problem, though, is when the value of ceDate spans days across two months, e.g. 2020-03-30 thru 2020-04-02. When I pull the records for March, all is good - the above query pulls the 2020-03-30 record (grouped) as the first instance of the 4 days/records - allowing us to charge for a single 4 day event. But when I pull the records for April it's also going to pull 2020-04-01 as a new grouped Event record for the last two days of the 4 day event and try to charge the customer for a new Event - when in fact those two days were already a part of March's bill.
How can I write the query so that when ceDate starts in Month X but ends in Month Y, pulling records for Month Y does not pull records that actually belong to an Event that started in Month X?
Examples of an Event record would look like this:
rid | ceID | ceActive | ceFreq | ceDate | ceStopDate
------------------------------------------------
1 1108 1 3 2020-03-09 | 2020-03-09
2 1111 1 2 2020-03-15 | 2020-03-15
3 1112 1 2 2020-03-17 | 2020-03-17
4 1117 1 1 2020-03-30 | 2020-04-02
5 1117 1 1 2020-03-31 | 2020-04-02
6 1106 1 3 2020-03-21 | 2020-03-21
7 1110 1 2 2020-03-05 | 2020-03-05
8 1113 1 2 2020-03-24 | 2020-03-24
9 1117 1 1 2020-04-01 | 2020-04-02
10 1117 1 1 2020-04-02 | 2020-04-02
The above query pulls all records where ceFreq != 1, and it pulls a single record for the ceFreq = 1 records (rids: 4 & 5). For March, we don't necessarily care that ceID 1117 spills into April. But when we pull records for April - we need to exclude rid 9 & 10, because the Event (ceID=1117), was already accounted for in March.
SELECT * FROM clubEventsCal
...
GROUP BY ceStopDate
This is gibberish.
MySQL (depending on configuration) allows it without choking - but it's semantically wrong and stands out as an anti-pattern.
There are some edge cases where the values returned might contain significant data, but they are very unusual. Trying to explain a problem with code which does not work is perhaps not a good strategy.
Looking at your code, it's possible that you don't need a union - but there's not enough information in your example records to say if this would actually give the result you expect (it will be significantly faster depending on your indexes):
SELECT IF(cefreq=1, rid, null) AS consolidator
, ceid
, cefreq
, MIN(cedate), MAX(cedate)
, ceStopDate
FROM clubEventsCal
WHERE cID=1001
AND ceActive!=2
AND (ceDate>='$firstDay' AND ceDate<='$lastDay')
GROUP BY IF(cefreq=1, rid, null)
, ceid
, cefreq
, ceStopDate
;
I would have added the ORDER BY - but I don't know where clId came from. Also, this will give different results from what I think you were trying to achieve for any record where cefreq is null (if you really do want to exclude them, add a predicate in the WHERE clause).

Different select query based on the day of week? - MySQL

Is it possible to execute a different select query for each day of the week? I currently have the following columns: id, station_name, week_type and service.
The week_type is an enum value with the following options: 'Mon-Thur', 'Fri', 'Sat', 'Sun', 'Special'.
The service column only has a varchar value of the time of day. It needs to apply as the service operates the same on a weekly schedule depending on the week_type.
+----+--------------+-----------+----------+
| id | station_name | week_type | service  |
+----+--------------+-----------+----------+
| 1  | Station1     | Mon-Thur  | 08:15:00 |
| 2  | Station2     | Sat       | 10:15:00 |
+----+--------------+-----------+----------+
As seen in the table above, when it is Saturday in my timezone and is equal to the week_type, then it should only show Saturday rows. And etc. for the other columns.
Any help would be much appreciated, as I am new to SQL.
I think you really need to rework the table. Why don't you normalize it?
station_services
id|station_name
station_working_days
id|station_id|weekday_id|working_hours
If you don't want week days as a separate table, you can hardcode them as numbers, from 1 for Sunday to 7 for Saturday:
station_working_days
id|station_id|weekday|working_hours
Normalising now also gives you flexibility for future changes.
If a station always has the same working hours on every day it operates, the following layout may serve you better:
station_services
id|station_name|working_hours
station_working_days
id|station_id|weekday_id
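If the original table layout has to stay, one possible approach is to map the current day onto the week_type value in PHP and pass that value to a single parameterised query. The sketch below is only an illustration: weekTypeFor() is a hypothetical helper, the table name station_times in the comment is an assumption, and the 'Special' type is left out because it cannot be derived from the weekday alone.

```php
<?php
// Hypothetical helper: map a date onto the week_type enum used in the question.
function weekTypeFor(DateTime $date)
{
    $dow = (int) $date->format('N'); // ISO-8601: 1 = Monday ... 7 = Sunday

    if ($dow <= 4) {
        return 'Mon-Thur';
    }
    if ($dow === 5) {
        return 'Fri';
    }
    if ($dow === 6) {
        return 'Sat';
    }
    return 'Sun';
}

// Possible usage with PDO (table name is an assumption):
//   $stmt = $pdo->prepare('SELECT * FROM station_times WHERE week_type = ?');
//   $stmt->execute(array(weekTypeFor(new DateTime('now'))));
echo weekTypeFor(new DateTime('2012-04-21')), "\n"; // a Saturday
```

The DateTime object uses the PHP default timezone, so "today" follows whatever date_default_timezone_set() or php.ini specifies.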

Efficient SELECT from table of boolean changes

I'm trying to figure out an efficient query for a project I'm working on.
We're recording a switch state into a table, each time it changes, a row is added with the new value (0 or 1).
Here's a simplified structure of the table:
day | hour | state
-----+------+-------
10 | 1 | 1 # day 10
10 | 6 | 0
10 | 21 | 1
11 | 3 | 0 # day 11
11 | 6 | 1
13 | 13 | 0 # day 13
....
Now we need to make a daily overview, something like this:
Day 11 : Switch was on during 0-3, 6-24
SELECT * FROM log WHERE day = 11 will give us only [3,0] and [6,1]. From those we can guess that it started ON and ended ON, but how about day 12?
SELECT * FROM log WHERE day = 12 gives nothing, obviously - there's no clue to guess from.
What is an efficient and reliable way to get the starting and ending state for a given day? Something like "Select one entry before day 12 and one after day 12"?
SELECT
day,
hour,
state
FROM
log
WHERE
day*100+hour
BETWEEN
(SELECT max(day*100+hour) FROM log WHERE day < 12)
AND
(SELECT min(day*100+hour) FROM log WHERE day > 12)
Will give you everything between (including) the last entry before day 12 and the first entry after day 12.
The second part might be unnecessary if you don't need to know when the state changed, and it's enough to know the state didn't change until at least midnight of the selected day.
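Once those rows are fetched, the daily overview can be built in PHP. The following is a rough sketch rather than a definitive implementation: onIntervals() is a hypothetical name, the rows are assumed to be sorted by day and hour, and the switch is assumed to be off if nothing was logged before the requested day.

```php
<?php
// Hypothetical helper: return the [from, to] hour ranges during which the
// switch was on for $day. $rows must be sorted by day, then hour.
function onIntervals(array $rows, $day)
{
    // State at midnight = state of the last row logged before $day.
    // Assumption: "off" (0) when there is no earlier row at all.
    $state = 0;
    foreach ($rows as $row) {
        if ((int) $row['day'] < $day) {
            $state = (int) $row['state'];
        }
    }

    $intervals = array();
    $onSince = ($state === 1) ? 0 : null;

    foreach ($rows as $row) {
        if ((int) $row['day'] !== $day) {
            continue; // rows before/after the day only provide context
        }
        $hour = (int) $row['hour'];
        if ((int) $row['state'] === 1 && $onSince === null) {
            $onSince = $hour;
        } elseif ((int) $row['state'] === 0 && $onSince !== null) {
            if ($hour > $onSince) {
                $intervals[] = array($onSince, $hour);
            }
            $onSince = null;
        }
    }

    // still on at the end of the day
    if ($onSince !== null) {
        $intervals[] = array($onSince, 24);
    }
    return $intervals;
}

// Sample data from the question
$log = array(
    array('day' => 10, 'hour' => 1,  'state' => 1),
    array('day' => 10, 'hour' => 6,  'state' => 0),
    array('day' => 10, 'hour' => 21, 'state' => 1),
    array('day' => 11, 'hour' => 3,  'state' => 0),
    array('day' => 11, 'hour' => 6,  'state' => 1),
    array('day' => 13, 'hour' => 13, 'state' => 0),
);
print_r(onIntervals($log, 11));
```

For a day without any entries, such as day 12 in the sample data, the state carried over from the previous entry covers the whole day.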

MySQL query to select on-air program from playlist schedule

I am trying to write ONE SQL query which always gives three rows of results. The database is as follows:
uid | program_date | program_time | program_name
------------------------------------------------
1 | 2012-04-16 | 21:00 | Some movie
2 | 2012-04-16 | 23:00 | Program end
3 | 2012-04-17 | 10:00 | Animation
4 | 2012-04-17 | 11:00 | Some other movie
5 | 2012-04-17 | 12:00 | Some show
All I need is to always have three rows: what is on air now, next and upcoming. So if today is 2012-04-16 21:00 it should output Some movie, Program end, Animation.
At 2012-04-17 00:00 it should output Program end, Animation, Some other movie.
The problem is that I need to "navigate" back one day if there are no records WHERE program_date = date("Y-m-d") AND program_time <= date("H:i:s");
There is another problem - the database does not have a Unix timestamp field, only Uid, program_date (date field), program_time (time field) and program_name.
Also, the Uids might not be inserted into the table in sequence, as a program entry might be inserted in between existing entries of the program schedule.
I am trying various approaches, but want to do everything in one SQL query, without looping in PHP.
Can anyone help me here?
As TV people count and show time in a rather strange manner, a MySQL function may be created to handle their non-human ;-) logic more easily:
CREATE FUNCTION TV_DATE(d CHAR(10), t CHAR(5))
RETURNS CHAR(16) DETERMINISTIC
RETURN CONCAT(d, IF (t < "06:00", "N", " "), t);
User-defined functions are declared per database, so this only needs to be done once. DETERMINISTIC states that the function always returns the same result for the same input, and the internal MySQL optimizer may rely on that. N is just a letter which is larger (in string comparison) than whitespace. Consider it a mnemonic for next or night.
Note: hours should always be formatted with 2 digits!
Then, using this function, we can select what we need more simply:
-- HAVING is used because MySQL does not allow a select alias in WHERE;
-- CURDATE() and NOW() stand in for the PHP date() calls
-- what is on air now
(SELECT `program_name`, TV_DATE(`program_date`, `program_time`) AS `tv_time`
FROM `table`
HAVING `tv_time` <= TV_DATE(CURDATE(), DATE_FORMAT(NOW(), '%H:%i'))
ORDER BY `tv_time` DESC
LIMIT 1)
UNION
-- next and upcoming
(SELECT `program_name`, TV_DATE(`program_date`, `program_time`) AS `tv_time`
FROM `table`
HAVING `tv_time` > TV_DATE(CURDATE(), DATE_FORMAT(NOW(), '%H:%i'))
ORDER BY `tv_time` ASC
LIMIT 0, 2)
Keep in mind that if all records in the DB are in the future, you'll get only 2 of them.
The same applies when the next program is the last one in the DB.
You may add different constant values to the two queries in order to distinguish those 2 situations.

Ranking Values in Multidimensional PHP Array

I've set-up a multidimensional PHP array from a SQL query which pulls back parcels despatched per item on a daily basis.
e.g.
Item|Day1|Day2 |Day3|Day4|Day5
1 |100 | 120 | 90 |150 |60
2 |150 | 200 | 80 |90 |100
3 | 1 | 2 | 3 | 4 | 5
I want to be able to assign a ranked value to each day based upon the amount of items sent out each day and put that either into the existing array, or create a new array to which it can reference
e.g. Item 1 Day 1 = 2, Item 2 Day 1 = 1, Item 3 Day 1 = 3, Item 1 Day 2 = 2 etc etc
I'm new to working with arrays, can anyone recommend a way to do this?
Can you not add an ORDER BY to your query to do this for you? That said, this does depend on your database design.
Otherwise, if you must do it on the PHP side, you could build one list per day and, as you read the rows in from the database, place each value where it belongs. This will be slower than letting the database do it.
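As an illustration of the PHP-side approach, the sketch below ranks the items within each day, highest count first. It assumes the array is keyed by item and then by day, as in the question's table; rankByDay() is a hypothetical name, and items with equal counts are given the same rank.

```php
<?php
// Hypothetical helper: $rows[item][day] = count  ->  $ranks[item][day] = rank
function rankByDay(array $rows)
{
    // regroup the counts so that each day has its own item => count list
    $byDay = array();
    foreach ($rows as $item => $days) {
        foreach ($days as $day => $count) {
            $byDay[$day][$item] = $count;
        }
    }

    $ranks = array();
    foreach ($byDay as $day => $counts) {
        arsort($counts); // highest count first, item keys preserved

        $position = 0;
        $rank = 0;
        $previous = null;
        foreach ($counts as $item => $count) {
            $position++;
            if ($count !== $previous) {
                $rank = $position; // equal counts keep the earlier rank
            }
            $ranks[$item][$day] = $rank;
            $previous = $count;
        }
    }
    return $ranks;
}

// Sample data from the question
$parcels = array(
    1 => array('Day1' => 100, 'Day2' => 120, 'Day3' => 90, 'Day4' => 150, 'Day5' => 60),
    2 => array('Day1' => 150, 'Day2' => 200, 'Day3' => 80, 'Day4' => 90,  'Day5' => 100),
    3 => array('Day1' => 1,   'Day2' => 2,   'Day3' => 3,  'Day4' => 4,   'Day5' => 5),
);
$ranks = rankByDay($parcels);
print_r($ranks);
```

The result has the same shape as the input, so $ranks[1]['Day1'] can be looked up alongside $parcels[1]['Day1'].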
