MySQL: Calculating totals for user-selectable year ranges - php

Let me start off by stating that I'm just a self-taught hobbyist at this, so I'm sure I'm doing some things wrong or inefficiently; any feedback is appreciated. If this question is moot because I've made fundamental errors and need to start from scratch, I'd like to know so I can get better.
With that, here's the problem:
I have a database of birth names in MySQL that is intended to let you find the frequency of those names within a given year range. My only table has a lot of columns:
Name    Begins  Popularity  1800  1801  1802
Aaron   A       500         6     7     4
Amy     A       100         10    2     12
Ashley  A       250         2     5     7
...and so forth until 2013.
Right now I've written a PHP page that can call up a list of names based on the start letter over the entire year range (1800-2013). That works, but what I'd like to do is to let the user specify a custom year range from the dropdowns I put on the home page and use that to calculate the frequency of each name for the custom year range only. I'd also like to be able to sort the resulting list based on those frequency values, not the all-time frequency stored in 'Popularity'.
From what I've looked at, I'm thinking part of the solution might lie in using custom views but I just can't seem to put the pieces all together. Or should I somehow pre-calculate all possible combinations?
Here's the working query code I'm using right now:
{
    $query = "SELECT Name
              FROM nametable
              WHERE Gender = '$genselect'
                AND BeginsWith = '$begins'
              ORDER BY $sortcolumn $sortorder";
    goto resultspage;
}
resultspage:
$result = mysqli_query($dbcnx, $query)
    or die("Error in query: $query." . mysqli_error($dbcnx));
$rows = $result->num_rows;
echo "<br>You found $rows names!<br>";
while ($row = mysqli_fetch_assoc($result))
{
    echo '<br>' . $row['Name'];
}

I think you're going to have to consider structuring your data in a different way to make the most of using an RDBMS.
If it were me, I'd be looking at normalising data into different tables in the first instance and disposing of unnecessary fields such as "Begins" and "Popularity". That kind of information can easily be reproduced or sought out in PHP or within a query itself. The advantage here is that you also reduce the number of columns that actually need to be maintained.
I haven't worked out a silver bullet schema but, roughly, I'd start with something along these lines and expand/modify where appropriate:
Names
- id
- name
- genderID
Genders
- id
- code
Years
- id
Frequencies
- id
- nameID
- yearID
- number
So, for example, a segment of your data may take the following shape:
Names (1, Aaron, 1)
Genders (1, Male)
Years (1987)
Frequencies (1, 1, 1987, 6), (1, 1, 1988, 19)
The beauty of having your data separated out like this is that it becomes much easier to query it. So, if you wanted the frequency of occurrences of the name Aaron between 1987 and 1988 you could do something like the following:
SELECT SUM(frequencies.number)
FROM frequencies
WHERE frequencies.yearID BETWEEN 1987 AND 1988
AND frequencies.nameID = 1
Furthermore, doing away with the "Begins" column would mean you can structure a query to use "LIKE"
SELECT * FROM names WHERE name LIKE "A%"
My examples are perhaps a bit contrived but hopefully they illustrate what I'm getting at.
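To tie this back to the original goal (totals per name over a user-chosen year range, sorted by that total), here is a rough sketch of how the two queries above might be combined. It is only an illustration under some assumptions: the table and column names follow the schema proposed above, the function name totalsForRange and the sample rows are made up, and an in-memory SQLite database (via the pdo_sqlite extension) stands in for MySQL so the snippet runs without a server.

```php
<?php
// In-memory SQLite stands in for MySQL here (assumption, for a self-contained demo).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Tables follow the schema sketched in the answer; the rows are made-up sample data.
$pdo->exec('CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT, genderID INTEGER)');
$pdo->exec('CREATE TABLE frequencies (id INTEGER PRIMARY KEY, nameID INTEGER, yearID INTEGER, number INTEGER)');
$pdo->exec("INSERT INTO names (id, name, genderID) VALUES (1, 'Aaron', 1), (2, 'Amy', 2), (3, 'Ben', 1)");
$pdo->exec('INSERT INTO frequencies (nameID, yearID, number) VALUES
    (1, 1987, 6), (1, 1988, 19), (1, 1990, 50),
    (2, 1987, 10), (2, 1988, 20),
    (3, 1987, 99)');

// Hypothetical helper: total frequency per name for a start letter and year range,
// sorted by the range total rather than an all-time popularity column.
function totalsForRange(PDO $pdo, string $letter, int $fromYear, int $toYear): array
{
    $stmt = $pdo->prepare(
        'SELECT n.name, SUM(f.number) AS total
           FROM names n
           JOIN frequencies f ON f.nameID = n.id
          WHERE n.name LIKE :prefix
            AND f.yearID BETWEEN :fromYear AND :toYear
          GROUP BY n.id, n.name
          ORDER BY total DESC'
    );
    $stmt->bindValue(':prefix', $letter . '%', PDO::PARAM_STR);
    $stmt->bindValue(':fromYear', $fromYear, PDO::PARAM_INT);
    $stmt->bindValue(':toYear', $toYear, PDO::PARAM_INT);
    $stmt->execute();

    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$totals = totalsForRange($pdo, 'A', 1987, 1988);
foreach ($totals as $row) {
    echo $row['name'], ': ', $row['total'], "\n";
}
```

Because the year range is just two bound parameters, nothing has to be pre-calculated for every possible combination of years.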
One thing I haven't touched upon is how you might go about physically entering the data. What happens when a new name is added? Does a corresponding entry get made in the frequencies table automatically? Is a check performed in the frequencies table first and, if an entry exists, does it automatically increment the number?
These are important problems to consider but probably best left until after a schema is settled upon.

Related

MySQL sorting date then by part number

Essentially I want these parts (below) grouped, and then the groups placed in order of time, with the latest time at the top of the list.
ID Parts Time
1 SMH_2010 08:59:18
2 JJK_0101 08:59:26
3 FTD_0002 08:59:24
4 JJK_0102 08:59:27
5 FTD_0001 08:59:22
6 SMH_2010 08:59:20
7 FTD_0003 08:59:25
So, the results would look like:
ID Parts Time
1 JJK_0101 08:59:26
2 JJK_0102 08:59:27
3 FTD_0001 08:59:22
4 FTD_0002 08:59:24
5 FTD_0003 08:59:25
6 SMH_2010 08:59:20
7 SMH_2010 08:59:18
Please, I would be grateful for any help.
What you are asking is not sorting in the traditional meaning. Your first attempt orders the result by time, and then by part if multiple timestamps occur at the same time.
What you want sorts the result neither alphabetically by part name nor ascending/descending by timestamp. What you are asking for can't be accomplished by the sort operation in SQL alone. Having the parts in sequence is not ordering.
I finally found a solution to this. Not my ideal solution but, nevertheless, it works.
I added another field called max_date, which defaults to now() as every new part is inserted.
I create a prefix from the current part being inserted, something like "SMH_", as a variable: $prefix = "SMH_";
I have another query that directly follows the insert, which sets max_date to now() again wherever the prefix is like $prefix.
UPDATE parts SET max_date = now() WHERE prefix LIKE '%$prefix%'
To display the results I use something along the lines of:
SELECT * FROM parts ORDER BY parts.max_date DESC, parts.part ASC
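For comparison, the same grouping could also be done in PHP after fetching the rows, without the extra max_date column. This is only a sketch: it assumes the prefix is whatever precedes the underscore in the part name, and the array keys 'part' and 'time' are made up for the example. The rows are the sample data from the question.

```php
<?php
// Sample rows from the question; the 'part' and 'time' keys are assumed names.
$rows = [
    ['part' => 'SMH_2010', 'time' => '08:59:18'],
    ['part' => 'JJK_0101', 'time' => '08:59:26'],
    ['part' => 'FTD_0002', 'time' => '08:59:24'],
    ['part' => 'JJK_0102', 'time' => '08:59:27'],
    ['part' => 'FTD_0001', 'time' => '08:59:22'],
    ['part' => 'SMH_2010', 'time' => '08:59:20'],
    ['part' => 'FTD_0003', 'time' => '08:59:25'],
];

// Latest time seen for each prefix (text before the underscore; an assumption).
$latest = [];
foreach ($rows as $row) {
    $prefix = strstr($row['part'], '_', true);
    if (!isset($latest[$prefix]) || $row['time'] > $latest[$prefix]) {
        $latest[$prefix] = $row['time'];
    }
}

usort($rows, function (array $a, array $b) use ($latest): int {
    $prefixA = strstr($a['part'], '_', true);
    $prefixB = strstr($b['part'], '_', true);

    // Groups with the most recent activity come first.
    if ($latest[$prefixA] !== $latest[$prefixB]) {
        return strcmp($latest[$prefixB], $latest[$prefixA]);
    }
    // Two different groups with the same latest time: keep each group together.
    if ($prefixA !== $prefixB) {
        return strcmp($prefixA, $prefixB);
    }
    // Inside a group: part name ascending, then newest time first.
    return strcmp($a['part'], $b['part']) ?: strcmp($b['time'], $a['time']);
});

foreach ($rows as $row) {
    echo $row['part'], ' ', $row['time'], "\n";
}
```

The HH:MM:SS strings are fixed width, so plain string comparison orders them correctly.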

Advanced statistics in PHP and MySQL

I have a slight problem. I have a dataset, which contains values measured by a weather station, which I want to analyze further using MySQL database and PHP.
Basically, the first column of the db contains the date and the other columns temperature, humidity, pressure etc.
Calculating the mean, standard deviation, max, min etc. is quite simple. However, there are no built-in functions for other parameters that I need, such as kurtosis.
What I need is for example to calculate the skewness, mean, stdev etc. for the individual months, then days etc.
For the built-in functions it is easy; for example, finding some of the parameters for the individual months would be:
SELECT AVG(Temp), STD(Temp), MAX(Temp)
FROM database
GROUP BY YEAR(Date), MONTH(Date)
Obviously I cannot use this for the more advanced parameters. I thought about ways of achieving this and could only think of one solution: I manually wrote a function that processes the values and calculates things such as kurtosis using the particular formulae. But that means I would need to create arrays of data for each month, day, etc., depending on what I am currently calculating. For example, I would first need to take the data and split it into arrays, let's say Jan11, Feb11, Mar11..., where each array contains the data for that month. Then I would apply the function to those arrays and create new variables with the results (let's say kurtosis_jan11, kurtosis_feb11 etc.).
Now to my question. I need help with splitting the data. The problem is that I don't know in advance in which month the data starts and in which it ends, so I cannot set fixed variables for this. The program first has to check the first month and then create a new array for each month, day etc. until it reaches the last record.
That of course would be maybe one solution but if anyone has any other ideas about how to go around this problem I would very much appreciate your help.
You can do more complex queries to achieve this. Here are some examples http://users.drew.edu/skass/sql/ , including Skew
SELECT AVG(Temp), STD(Temp), MAX(Temp)
FROM database
WHERE Date BETWEEN date_from AND date_to
GROUP BY YEAR(Date), MONTH(Date)
I think you want a group of data within a date range.
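The per-month splitting described in the question does not need variables fixed in advance; keying an array by the year-month part of the date lets the groups appear as the data is read. A minimal sketch, assuming rows come back as associative arrays with 'Date' and 'Temp' keys. The sample values are made up, and kurtosis here means the plain fourth standardized moment (m4 / m2^2, not the "excess" variant), which may differ from the formula the asker wrote.

```php
<?php
// Population kurtosis: fourth central moment divided by the squared variance.
// Returns null when it is undefined (no values, or zero variance).
function kurtosis(array $values): ?float
{
    $n = count($values);
    if ($n === 0) {
        return null;
    }
    $mean = array_sum($values) / $n;
    $m2 = 0.0;
    $m4 = 0.0;
    foreach ($values as $value) {
        $deviation = $value - $mean;
        $m2 += $deviation ** 2;
        $m4 += $deviation ** 4;
    }
    $m2 /= $n;
    $m4 /= $n;

    return $m2 > 0 ? $m4 / ($m2 * $m2) : null;
}

// Made-up rows standing in for the result of "SELECT Date, Temp FROM ...".
$rows = [
    ['Date' => '2011-01-03', 'Temp' => 1.0],
    ['Date' => '2011-01-10', 'Temp' => 2.0],
    ['Date' => '2011-01-17', 'Temp' => 3.0],
    ['Date' => '2011-01-24', 'Temp' => 4.0],
    ['Date' => '2011-01-31', 'Temp' => 5.0],
    ['Date' => '2011-02-01', 'Temp' => 0.0],
    ['Date' => '2011-02-08', 'Temp' => 0.0],
    ['Date' => '2011-02-15', 'Temp' => 10.0],
    ['Date' => '2011-02-22', 'Temp' => 10.0],
];

// Group by "YYYY-MM"; new months are created on the fly as they are encountered.
$byMonth = [];
foreach ($rows as $row) {
    $monthKey = substr($row['Date'], 0, 7);
    $byMonth[$monthKey][] = (float) $row['Temp'];
}

// array_map keeps the month keys because only one array is passed.
$kurtosisByMonth = array_map('kurtosis', $byMonth);
print_r($kurtosisByMonth);
```

Grouping by day instead would only change the substr length from 7 to 10.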

PHP / Mysql, N number of timestamps + value, cut down to 24 / 1 per 2 weeks

This might be a weird question but let me try to explain best I can.
I have a table in my database that contains N records. The table is simple and is laid out as follows:
ID, Time, Data
The end goal is to output a graph for a yearly period based on the values in this table. This wouldn't usually be a big deal, but the number of values in the table for a year is unbounded, and there is no pattern to how frequently they will be entered.
In theory the person responsible for updating this table will do it once every 2 weeks, but this cannot be relied upon because I know they won't. So I want to dump all the values from the table, then create an array from the results with only 2 values per month, one for the 14th and one for the 28th, so this will cover all months.
Anyway so I figure,
Select * FROM table
For each
.... take value closest to 14th
.... take value closest to 28th
.... Dump each into new array
But how would you go about doing this in PHP? I can't work out how to get the closest value to each day for that month only and limit it to 2. The hard thing for me is getting my head around what happens if they didn't update it in, say, 4 weeks. Use the last value, I guess.
Has anyone done this before?
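One possible reading of the pseudocode above, as a rough sketch rather than a tested solution: for each month, take the record in that month whose timestamp is nearest to the 14th and to the 28th, and when a month has no records at all, carry the last known value forward. The function name sampleTwicePerMonth, the array keys and the sample records are all made up for illustration.

```php
<?php
/**
 * Pick two samples per month (14th and 28th) from irregularly timed records.
 * Each record is ['time' => 'Y-m-d H:i:s', 'data' => mixed] (assumed layout).
 */
function sampleTwicePerMonth(array $records, int $year): array
{
    // Oldest first, so "last known value" is well defined.
    usort($records, fn(array $a, array $b): int => strcmp($a['time'], $b['time']));

    $samples = [];
    for ($month = 1; $month <= 12; $month++) {
        foreach ([14, 28] as $day) {
            $label = sprintf('%04d-%02d-%02d', $year, $month, $day);
            $target = strtotime($label);
            $monthKey = substr($label, 0, 7);

            $closest = null;     // nearest record inside this month
            $closestGap = null;  // its distance from the target, in seconds
            $lastKnown = null;   // most recent record at or before the target

            foreach ($records as $record) {
                $timestamp = strtotime($record['time']);
                if ($timestamp <= $target) {
                    $lastKnown = $record['data'];
                }
                if (substr($record['time'], 0, 7) === $monthKey) {
                    $gap = abs($timestamp - $target);
                    if ($closestGap === null || $gap < $closestGap) {
                        $closestGap = $gap;
                        $closest = $record['data'];
                    }
                }
            }

            // Fall back to the last known value when the month has no records.
            $samples[$label] = $closestGap !== null ? $closest : $lastKnown;
        }
    }

    return $samples;
}

// Made-up records with a gap in February and nothing after March.
$records = [
    ['time' => '2012-01-20 09:00:00', 'data' => 7],
    ['time' => '2012-01-10 09:00:00', 'data' => 5],
    ['time' => '2012-03-27 12:00:00', 'data' => 9],
];
$samples = sampleTwicePerMonth($records, 2012);
print_r($samples);
```

This always yields 24 entries for the year; a slot stays null only if there is no record in its month and none earlier.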

Storing Date Range in MySQL Solution

I am working on a script which requires giving the admin the ability to insert dates for when he wants a parking lot to be available; the admin inserts dates in a range.
I am having a hard time deciding on the best way to store the dates in MySQL.
Should I store the dates using two columns, AVAILABLE_FROM_DATE and AVAILABLE_UNTIL_DATE?
PLID  AVAILABLE_FROM_DATE  AVAILABLE_UNTIL_DATE
1     2012-04-01           2012-04-03
1     2012-04-05           2012-04-15
2     2012-04-21           2012-04-30
OR should I just use a single column AVAILABLE_DATE and store the range the admin selects as a new row for each date in the range?
[EDIT START]
What I mean above by using a single column is not to join or split the dates into a single column; I actually mean to store one date per row in a single column, like below:
PLID AVAILABLE_DATE
1 2012-04-01
1 2012-04-02
1 2012-04-03
and so on for all the available dates i want to store.
[EDIT END]
Basically, the admin will want to insert a date range the parking lot is available and allow members to choose that slot if the user is looking for a slot within that range.
OR is there some better and simpler way to do this?
I am currently trying to use the first method using separate columns for the range, but having trouble getting the desired results when looking for parking lots within a range.
[EDIT START]
SELECT * FROM `parking_lot_dates`
WHERE (available_from_date BETWEEN '2012-04-22' AND '2012-04-30'
AND (available_until_date BETWEEN '2012-04-22' AND '2012-04-30'))
I run the query above on the rows I have, and it returns an empty result.
I want it to return the last row having the PLID 2.
[EDIT END]
Thank you in advance.
Regarding your EDIT with the query, you have the logic inside out. You need to compare whether each date you are checking is inside the range BETWEEN available_from_date and available_until_date, like this:
SELECT * FROM `parking_lot_dates`
WHERE
(
'2012-04-22' BETWEEN available_from_date AND available_until_date
AND '2012-04-30' BETWEEN available_from_date AND available_until_date
)
Demo: http://www.sqlfiddle.com/#!2/911a3/2
Edit: Although if you'll want to allow partial-range matches, you'll need both types of logic, i.e., the parking lot is available 4-22 to 4-27, and you need it 4-23 to 4-28. You can use it for the dates 4-23 to 4-27, but not 4-28.
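The partial-match case can be sketched in PHP as follows. The function name usableRange is made up for illustration; the idea is that two ranges overlap when each one starts on or before the other ends, and the usable part is the later start through the earlier end. The matching SQL condition is shown as a comment, using the column names from the question.

```php
<?php
// Equivalent SQL filter for "overlaps the wanted range at all" (placeholders are illustrative):
//   WHERE available_from_date <= :want_until AND available_until_date >= :want_from

/**
 * Return the part of the wanted range that falls inside the available range,
 * as [from, until], or null when the two ranges do not overlap.
 * Dates are 'YYYY-MM-DD' strings, which compare correctly as plain strings.
 */
function usableRange(string $availFrom, string $availUntil, string $wantFrom, string $wantUntil): ?array
{
    if ($availFrom > $wantUntil || $availUntil < $wantFrom) {
        return null; // no overlap at all
    }

    return [max($availFrom, $wantFrom), min($availUntil, $wantUntil)];
}

// The example from the answer: available 4-22 to 4-27, needed 4-23 to 4-28.
$partial = usableRange('2012-04-22', '2012-04-27', '2012-04-23', '2012-04-28');
print_r($partial);

// Row 1 from the question against the wanted range 4-22 to 4-30: no overlap.
$none = usableRange('2012-04-01', '2012-04-03', '2012-04-22', '2012-04-30');
var_dump($none);
```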
Why complicate things so much?
SELECT *
FROM `parking_lot_dates`
WHERE available_from_date <= '2012-04-22'
AND available_until_date >= '2012-04-30';
Personally, I have found it better to have 2 columns, a start and an end time. Searching for a specific date, or just looking at the data, seems easier to me that way.
Using 1 column to store those dates is a bad design from a database point of view (not normalized). It's better to have 2 columns because the results can be retrieved easier and extracting the information from a single column would mean having to do some sort of split. It's just not elegant and it doesn't behave well when requirements change.

Using PHP and MySQL, how can I create a simple room availability check?

What is a simple way to check for room availability?
I have been working on a little project for class, just a simple, small hotel registration page that needs contact information and billing information. Bear in mind that unencrypted information is not an issue here.
Here is part of my process.php page:
$sql="INSERT INTO UMBRELLA (firstName, lastName, email, arrivalDate,
departureDate, adultGuests, childGuests, roomReservation, newsletterBool,
comments, creditCardType, creditCardNumber, expirationMonth, expirationYear,
securityCode, phone)
VALUES
('$_POST[firstName]','$_POST[lastName]','$_POST[email]', '$_POST[arrivalDate]',
'$_POST[departureDate]', '$_POST[adultGuests]', '$_POST[childGuests]',
'$_POST[roomReservation]', '$_POST[newsletterBool]', '$_POST[comments]',
'$_POST[creditCardType]', '$_POST[creditCardNumber]', '$_POST[expirationMonth]',
'$_POST[expirationYear]', '$_POST[securityCode]', '$_POST[phone]')";
I would prefer to keep the solution simple, as it's only a minor requirement, so I don't want to create additional tables for a many-to-many relationship (e.g. ROOMS table, BOOKINGS table). The roomReservation value only has three options: 1. Presidential Suite 2. Master Suite 3. Standard Suite. Basically, I would like to create a variable, $maxRooms, that is equal to 10.
$maxRooms=10;
I would then use a SELECT statement to get all useful data for each specific room type (note this is pseudocode so some formatting might be off):
$standard = "SELECT arrivalDate, departureDate, roomReservation FROM UMBRELLA
WHERE roomReservation = 'Standard Suite'";
$master = "SELECT arrivalDate, departureDate, roomReservation FROM UMBRELLA
WHERE roomReservation = 'Master Suite'";
$presidential = "SELECT arrivalDate, departureDate, roomReservation FROM
UMBRELLA WHERE roomReservation = 'Presidential Suite'";
I would then check how many rooms are reserved in a given date range, and if it was >=maxRooms then the page would output an error. This is the part I'm having trouble on. I would appreciate any help or insight! Thank you!
tl;dr What code do I need to compare the $standard, $master, and $presidential variables to a given date range?
So, they will be telling you which type of room they would like through something like a select input?
ie:
<select>
<option value="presidential">Presidential Suite</option>
....</select>
The max rooms: Does that refer to there being 10 master suites, 10 standard suites, and 10 presidential suites for a total of 30 rooms in all? If not, and it's only 10 rooms in the whole 'hotel', how are they distributed across the three types? Also, I'm assuming you would not want the new reservation inserted into the database if the rooms are full.
$maxRooms = 10;
$fetchQuery = "SELECT roomReservation FROM UMBRELLA
WHERE roomReservation = '$_POST[roomReservation]'";
$fetchResult = mysql_query($fetchQuery);
$numberRooms = mysql_num_rows($fetchResult);
if($numberRooms >= $maxRooms)
{
echo "error: All ".$_POST['roomReservation']."s are currently booked.";
}
else
{
$insertQuery = "INSERT into UMBRELLA VALUES('$_POST[firstname]',.....)";
mysql_query($insertQuery) or die(mysql_error());
echo "Registration Complete.";
}
So, when someone clicks "Make Reservation" or whatever your submit button on the form is called, this checks to ensure the room type they have chosen is available. If it's not, they are told that rooms of that type are all booked. If there are open rooms of the type they chose, the room is reserved for them (which adds to the current number of rooms reserved of that type, leaving one less for the next guest). You'd obviously have to write something for checking out as well.
Something like this? Honestly, I haven't tried it etc...but is this the idea you are going for? I hope it helps a bit maybe...
Let's take the Standard Suite query as an example; you can use the same idea for the other two.
To find how many reservations exist for a given date range, this query should do the trick:
$reservations = "SELECT count(*) as count FROM UMBRELLA
WHERE roomReservation = 'Standard Suite' AND arrivalDate BETWEEN '2011-11-20' AND '2011-11-30'";
Not to reiterate, but obviously you do not want to pass the $_POST values directly into your data tier in any case. Also, you should never be saving fields like CreditCardNumber on any server that hasn't undergone a PCI compliance evaluation. I realize this is a theoretical application for class, but these things need to be noted for those who might stumble in.
That said, the object to use for your conditional comparison is the DateTime object in PHP5, which is represented in PHP4 using the date suite of functions.
I realize you mention wanting to avoid joins, but joining across tables storing transaction data and room availability separately might make this even simpler. It may be worth considering.
So that said, here's the kind of query you'll need to be doing, if I have understood your needs correctly.
// We need to check that no more than ten rooms have been reserved in a given date range;
// let's say, for argument's sake, the last 10 days.
$now = new DateTimeImmutable();
$inc = DateInterval::createFromDateString('10 days');
$past = $now->sub($inc);
This sets up objects storing the current date and the date ten days in the past. So your queries are then adjusted to find counts instead, and only in a specific date range.
SELECT COUNT(*) AS rooms FROM UMBRELLA WHERE roomReservation='Standard Suite' AND arrivalDate BETWEEN ? AND ?
You'd then use your sql drivers to pass in the two values of the date time objects, using $past->format('Y-m-d H:i:s'); and $now->format('Y-m-d H:i:s'); for your between values. Repeat this process for your other two room types.
You can then conditionally check the values for these three counts against your max rooms and adjust the behavior of the application to suit.
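Putting those pieces together, a rough sketch of the whole check might look like the following. A few things here are my assumptions rather than anything stated above: the BETWEEN on arrivalDate is swapped for an overlap test (an existing stay counts if it starts before the requested departure and ends after the requested arrival), the helper name roomsBooked is made up, $maxRooms is lowered to 2 so the sample rows can fill it, and an in-memory SQLite database (pdo_sqlite extension) stands in for MySQL so the snippet runs on its own.

```php
<?php
// In-memory SQLite stands in for MySQL here (assumption, for a self-contained demo).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Only the columns needed for the check; the rows are made-up sample bookings.
$pdo->exec('CREATE TABLE UMBRELLA (roomReservation TEXT, arrivalDate TEXT, departureDate TEXT)');
$pdo->exec("INSERT INTO UMBRELLA (roomReservation, arrivalDate, departureDate) VALUES
    ('Standard Suite', '2011-11-20', '2011-11-25'),
    ('Standard Suite', '2011-11-24', '2011-11-30'),
    ('Master Suite',   '2011-11-20', '2011-11-30')");

// Hypothetical helper: how many reservations of this type overlap the requested stay.
function roomsBooked(PDO $pdo, string $roomType, DateTimeInterface $arrival, DateTimeInterface $departure): int
{
    // Two stays overlap when each one starts before the other ends.
    $stmt = $pdo->prepare(
        'SELECT COUNT(*) FROM UMBRELLA
          WHERE roomReservation = ?
            AND arrivalDate < ?
            AND departureDate > ?'
    );
    $stmt->execute([$roomType, $departure->format('Y-m-d'), $arrival->format('Y-m-d')]);

    return (int) $stmt->fetchColumn();
}

$maxRooms = 2; // the question uses 10; kept small so the sample data can fill it
$arrival = new DateTimeImmutable('2011-11-22');
$departure = new DateTimeImmutable('2011-11-26');

$standardBooked = roomsBooked($pdo, 'Standard Suite', $arrival, $departure);
$masterBooked = roomsBooked($pdo, 'Master Suite', $arrival, $departure);

echo $standardBooked >= $maxRooms
    ? "error: All Standard Suites are booked for those dates.\n"
    : "Standard Suite available.\n";
echo $masterBooked >= $maxRooms
    ? "error: All Master Suites are booked for those dates.\n"
    : "Master Suite available.\n";
```

The bound parameters also keep the $_POST values out of the SQL string, which addresses the injection concern raised earlier.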
