I'm looking for some general advice on how to go about finding items 'like' the current one.
In my current example I have three tables like so (omitting unrelated data):
games
-game_id
genres
-genre_id
genres_data
-game_id
-genre_id
Given a row from games, how can I find other games that have genres in common with it, ordered from the ones that share all of its genres down to the ones that share only one (limited, of course, to a few rows)?
What's the preferred method of finding like items?
Try this:
SELECT game_id, COUNT(genre_id) AS genres_in_common
FROM genres_data
WHERE
genre_id IN
(
SELECT genre_id
FROM genres_data
WHERE game_id = <game you're searching with>
)
AND
game_id != <game you're searching with>
GROUP BY game_id
ORDER BY genres_in_common DESC
;
The subquery grabs a list of all genre_ids associated with your game, and the main query uses that to search genres_data for any record that matches one of them. In other words, it searches for any game which is associated with any genre your "search game" is associated with.
Because a game can have multiple genres, this query would return the same game_id multiple times, and if you also reported the genre_id on these records they would each show a different genre_id. What we do to find the ones with the most in common is to group the results by each game_id, and we add the COUNT(genre_id) in the main SELECT to show how many different genre_ids there were for each game_id returned in that query.
From there, it's a simple matter of ordering that count of common genres in descending order, so that the games with the most genres in common will be listed first.
I also added a second criterion to the main query to exclude the game you're searching on from the results, otherwise that game would always have the most matches, for obvious reasons.
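As a rough, runnable sketch of the query above (using Python's sqlite3 module in place of the asker's MySQL setup, which is an assumption made purely for illustration), the placeholders become bound parameters and a LIMIT is added because the question asks for only a few rows. The function name similar_games, the sample rows, and the tie-breaking sort on game_id are all hypothetical.

```python
import sqlite3

def similar_games(conn, game_id, limit=5):
    """Return (game_id, genres_in_common) pairs, best matches first."""
    sql = """
        SELECT game_id, COUNT(genre_id) AS genres_in_common
        FROM genres_data
        WHERE genre_id IN (SELECT genre_id FROM genres_data WHERE game_id = ?)
          AND game_id != ?
        GROUP BY game_id
        ORDER BY genres_in_common DESC, game_id
        LIMIT ?
    """
    return conn.execute(sql, (game_id, game_id, limit)).fetchall()

# Hypothetical sample data: game 1 is the game we search with.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE genres_data (game_id INTEGER, genre_id INTEGER)")
conn.executemany(
    "INSERT INTO genres_data VALUES (?, ?)",
    [(1, 10), (1, 20), (1, 30),   # the search game
     (2, 10), (2, 20), (2, 30),   # shares all three genres
     (3, 10), (3, 40),            # shares one genre
     (4, 50)],                    # shares none, so it is never returned
)
results = similar_games(conn, 1)
print(results)
```

Games that share no genre with the search game never match the IN condition, so they drop out without any extra filtering.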
Hope that helps.
Surely the way to do this for just a single game is to first grab its genres and then loop over them to create a new query:
$query = "SELECT `genre_id` FROM `genres_data` WHERE `game_id` = 'your_game_id_here';";
// mysql_query() runs the query; mysql_result() only reads a cell from a result set
$genre_id_result = mysql_query($query, $dbconn);
$num = mysql_num_rows($genre_id_result);
if ($num > 0) {
// genre_id lives in genres_data, so that is the table to search
$query = "SELECT DISTINCT `game_id` FROM `genres_data` WHERE ";
$WhereSQL = "";
for ($i=0;$i<$num;$i++) {
$genre_id = mysql_result($genre_id_result, $i, "genre_id");
if ($WhereSQL == "") {
$WhereSQL = "genre_id = '$genre_id' ";
} else {
// OR, not AND: a single row can never match two different genre_ids
$WhereSQL .= "OR genre_id = '$genre_id' ";
}
}
$GamesInCommonResult = mysql_query($query . $WhereSQL, $dbconn);
}
You could set up a loop to do this for every game in the database and then collate your results. I can't think of how to do this in a single query at the moment.
I'm also a little unsure about your question: either you're looking for the genres that are the most popular (as games with these genres will likely be returned as having the most other games in common), or you're looking for the game_ids of other games that share genres with one particular game, which might be more useful.
I've read a million complicated questions on sorting arrays. I have something super simple, but I'm just not able to wrap my head around it.
I have a table in my db that has scores for games. The column labels are team1_score and team2_score. I realize that if all the scores were in one column, I could sort them with my SQL query, but they're not.
I need to know how to fetch the results from those columns, sort them highest to lowest, and ideally assign them to variables such as $first_place and $second_place.
I'm sorry I'm a noob and I've done a lot of research before coming on here, so please be gentle.
So, I have something like this...
My program keeps track of scores at a kids camp and there are multiple camps. Each row has id, camp_name, camp_logo and then goes into team1_name, team1_logo, team1_score and so on through 10 teams. Ideally, I'd like to have the query fetch all those scores, and output them in Descending order with something like First Place: xxx points (team name) (team logo)
I can sort the scores with this...
$query = "SELECT team1_score, team2_score FROM camps ";
$scores = mysqli_query($connection,$query);
$scores_array = mysqli_fetch_assoc($scores);
arsort($scores_array);
foreach ($scores_array as $key => $value) {
echo "score - [" . $key . "] = " . $value . "\n";
}
but I don't know how to associate the name and logo with those keys. I hope that makes sense.
This can actually be done in SQL. You could use GREATEST and LEAST to get the top and bottom scores, respectively, and then also sort by them:
SELECT GREATEST(team1_score, team2_score),
LEAST(team1_score, team2_score)
FROM camps
ORDER BY 1 DESC, 2 DESC
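GREATEST and LEAST return only the score values, so the team name and logo still have to be matched up separately. One possible alternative, sketched here in Python rather than the asker's PHP (an assumption for illustration only), is to turn the teamN_name, teamN_logo and teamN_score columns of one fetched row into a list of records and sort that list. The function rank_teams and the sample row are hypothetical; the column naming follows the description in the question.

```python
def rank_teams(row, team_count):
    """Collect the teamN_* columns of one camp row and sort by score, highest first."""
    teams = []
    for n in range(1, team_count + 1):
        teams.append({
            "name": row[f"team{n}_name"],
            "logo": row[f"team{n}_logo"],
            "score": row[f"team{n}_score"],
        })
    return sorted(teams, key=lambda team: team["score"], reverse=True)

# Hypothetical row, shaped like one associative-array result from the camps table.
row = {
    "team1_name": "Red", "team1_logo": "red.png", "team1_score": 40,
    "team2_name": "Blue", "team2_logo": "blue.png", "team2_score": 75,
    "team3_name": "Green", "team3_logo": "green.png", "team3_score": 60,
}
ranking = rank_teams(row, 3)
for place, team in enumerate(ranking, start=1):
    print(f"Place {place}: {team['score']} points ({team['name']}) ({team['logo']})")
```

Because each record carries its name and logo along with the score, the first element of the sorted list is the first-place team with everything needed to display it.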
I am sorry for a very long question; I am just trying to explain in detail. My formatting is not very good, sorry for that as well. I had a PHP/MySQL app that essentially was not truly relational, as I had one large table for all student scores. Among other things, I was able to calculate the average score for each subject, such that the average appeared alongside a student's score. I have since split the table up into a number of tables, which I am successfully querying to create school report cards as before. The hardship is that I can no longer calculate the averages for any subject.
Since I had one table with 5 subjects and each of the subjects had 2 tests, I queried for data and calculated the average as follows:
The one table (Columns):
id date name exam_no term term year eng_mid eng_end mat_mid mat_end phy_mid phy_end bio_mid bio_end che_mid che_end
The one query:
$query = "SELECT * FROM pupils_records2
WHERE grade='$grade' && class='$class' && year = '$year' && term ='$term'";
$result = mysqli_query($dbc, $query);
if (mysqli_num_rows($result) > 0) {
$num_rows=mysqli_num_rows($result);
while($row = mysqli_fetch_array($result)){
//English
$eng_pupils1[$row['fname']] = $row['eng_mid'];
$eng_pupils2[$row['fname']] = $row['eng_end'];
$mid=(array_values($eng_pupils1));
$end=(array_values($eng_pupils2));
$add = function($a, $b) { return $a + $b;};
$eng_total = array_map($add, $mid, $end);
foreach ($eng_total as $key => $value){
if ($value==''){
unset ($eng_total[$key]);
}
}
$eng_no=count($eng_total);
$eng_ave=array_sum($eng_total)/$eng_no;
$eng_ave=round($eng_ave,1);
//Mathematics
$mat_pupils1[$row['fname']] = $row['mat_mid'];
$mat_pupils2[$row['fname']] = $row['mat_end'];
$mid=(array_values($mat_pupils1));
$end=(array_values($mat_pupils2));
$add = function($a, $b) { return $a + $b;};
$mat_total = array_map($add, $mid, $end);
foreach ($mat_total as $key => $value){
if ($value==''){
unset ($mat_total[$key]);
}
}
print_r($mat_total);
$mat_no=count($mat_total);
echo '<br />';
print_r($mat_no);
$mat_ave=array_sum($mat_total)/$mat_no;
$mat_ave=round($mat_ave,1);
}
}
//Biology
etc
I split the table into separate tables and have names in a separate table, which is not needed for calculating averages, so I will not show it here. Each subject table takes the following form:
id date exam_no term year grade class test*
*Test would be eng_mid or eng_end or mat_mid etc.
Because I had only one query which returned 10 rows (5 subjects, each with two tests, e.g. eng_mid (English mid-term exam) and eng_end (English end-of-term test)), I was able to capture all rows in one call, pack each subject into an array, and then work out the class average with the help of array_map. It may not be elegant, but it worked very well. Now I have each test in its own table.
I was trying to write a join so as to get a single result set, but the query fails. The columns are like:
I know that the database design is not anything to be proud of, but coming from a huge single table, this is a massive step (worthy of a pat on the shoulder).
What I wish to do is to be able to query all my data and calculate class averages (about 30 students in each class). I tried to use separate queries but I ran into a wall: previously I would use the WHILE loop shown after the query to pull all rows and create an array from which I could get the desired results. Now, with several queries, I am confused as to how I can achieve the same results, since a join is not working. Also, I am now dealing with a separate $row variable for each query, and that throws me further off balance!
Is it even possible to do averages as I did on my infamous one table (from the dark side) or is my table design so messed up, what I want just isn't humanly possible?
Please any help will be deeply appreciated.
Try using union. It would be something like
select grade, test from math
union all
select grade, test from english
union all
....
Also, in my opinion, a better design would be to have an exams table, something like this (warning, pseudo-DDL):
id int primary key,
student_id int foreign key students
subject_id int foreign key subjects
exam_type_id int foreign key exam_types
grade int(????)
The exam_types table would hold just midterm and final for now, but you'll be able to easily support more types in the future if required.
The subjects table will store all the subjects you have (at this time there will be only five of them: math, eng, phy, etc.).
The averaging query would be as simple as this (yes, you can actually do aggregation in the query itself):
select student_id, avg(grade)
from exams
group by student_id
EDIT: Added the first SQL query.
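To connect the exams design above with the class averages the question asks for, here is a small sketch using Python's sqlite3 module (standing in for MySQL, which is an assumption for illustration). It sums each student's two tests per subject and then averages those totals across the class, mirroring what the original array_map code did. The column types and the sample grades are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE exams (
        id INTEGER PRIMARY KEY,
        student_id INTEGER,
        subject_id INTEGER,
        exam_type_id INTEGER,
        grade INTEGER
    )
""")
conn.executemany(
    "INSERT INTO exams (student_id, subject_id, exam_type_id, grade) VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 60), (1, 1, 2, 70),   # student 1, subject 1: mid-term and end of term
     (2, 1, 1, 80), (2, 1, 2, 90),   # student 2, subject 1
     (1, 2, 1, 50), (1, 2, 2, 40)],  # student 1, subject 2
)

# Inner query: one total per student and subject (mid-term + end of term).
# Outer query: the class average of those totals, per subject.
class_averages = conn.execute("""
    SELECT subject_id, ROUND(AVG(total), 1)
    FROM (SELECT subject_id, student_id, SUM(grade) AS total
          FROM exams
          GROUP BY subject_id, student_id)
    GROUP BY subject_id
    ORDER BY subject_id
""").fetchall()
print(class_averages)
```

Students with no recorded grade for a subject simply have no rows, so they do not pull the average down, which is roughly what the unset() loop in the original code achieved.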
A section of my website has two dropdown menus. All the options in both are populated using SQL queries. Dropdown#1 is a list of class sections (like A1 for example). Once the professor selects a section, Dropdown#2 is populated with the student ID's (like 1234567 for example).
Student information is found in table 1. Among this information is the 'professorName' column. In order to associate the student with a class section, I need to match 'professorName' column with an identical column found in table 2, because class sections are only found in table 2.
Up to here everything works great, because at the end of my query I put ORDER BY StudentID. However, two of the class sections are each associated with two different professors. In order to deal with this issue, I used the following code to loop through each professor name.
$fromsite = $_POST['section'];
$query = $objMSSQL->getTable("SELECT professorName FROM table2 WHERE classSection='$fromsite'");
$NumberofProfessorNames = $objMSSQL->getAffectedRows();
echo $NumberofProfessorNames;
for ($j=0; $j<$NumberofProfessorNames; $j++)
{
$section= $query[$j]['professorName'];
$output = $objMSSQL->getTable("SELECT DISTINCT StudentID from table1 WHERE professorName='$section' ORDER BY StudentID");
for ($i=0; $i<$objMSSQL->getAffectedRows(); $i++)
{
echo "<option value='".$output[$i]['studentID']."'>".$output[$i]['studentID']."</option>";
}
}
The problem is that for the only two sections where this is even necessary (because there are two professorNames), the loop sorts each professor's students separately, so dropdown #2 ends up ordered like this:
1234567
2345678
3456789
4567890
1234123
2345765
3456999
4567000
My limited experience in programming is keeping me from understanding how I can fix this seemingly simple issue.
Thank you for your help.
Rather than loop over the professors and query table1 for each, join table1 and table2 in the second query and only query the database once. For example:
$query = [... FROM Table2...];
$NumberofProfessorNames = $objMSSQL->getAffectedRows();
echo $NumberofProfessorNames;
$output = $objMSSQL->getTable("
SELECT DISTINCT StudentID
from table1
join table2
on ...
WHERE [the same clause you used in $query]
ORDER BY StudentID"
);
for ($i=0; $i<$objMSSQL->getAffectedRows(); $i++)
{
echo "<option value='".$output[$i]['studentID']."'>".$output[$i]['studentID']."</option>";
}
It's more elegant (and almost certainly more efficient) than generating a WHERE IN clause.
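The join described above could look roughly like the following sketch, written with Python's sqlite3 module rather than the asker's PHP and SQL Server wrapper (an assumption for illustration). It assumes, based on the question's description, that professorName is the column shared by table1 and table2; the function name students_in_section and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (StudentID INTEGER, professorName TEXT)")
conn.execute("CREATE TABLE table2 (professorName TEXT, classSection TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(3456789, "Smith"), (1234567, "Smith"),
     (2345765, "Jones"), (1234123, "Jones"),
     (9999999, "Brown")],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [("Smith", "A1"), ("Jones", "A1"), ("Brown", "B2")],  # section A1 has two professors
)

def students_in_section(conn, section):
    """Return all student IDs for a section in one sorted list, across all its professors."""
    rows = conn.execute("""
        SELECT DISTINCT t1.StudentID
        FROM table1 AS t1
        JOIN table2 AS t2 ON t2.professorName = t1.professorName
        WHERE t2.classSection = ?
        ORDER BY t1.StudentID
    """, (section,)).fetchall()
    return [row[0] for row in rows]

a1_students = students_in_section(conn, "A1")
print(a1_students)
```

Because the ORDER BY applies to the combined result, the students of both professors come back as one sorted list instead of two sorted runs.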
You can do it this way:
$section = "('";
for ($j=0; $j<$NumberofProfessorNames; $j++)
{
$section.= $query[$j]['professorName'] . "','";
}
// Drop only the trailing ",'" (two characters) so the last name keeps its closing quote
$section = substr($section, 0, -2) . ')'; //$section contains ('prof1','prof2')
$output = $objMSSQL->getTable("SELECT DISTINCT StudentID from table1 WHERE professorName IN $section ORDER BY StudentID");
for ($i=0; $i<$objMSSQL->getAffectedRows(); $i++)
{
echo "<option value='".$output[$i]['studentID']."'>".$output[$i]['studentID']."</option>";
}
That queries for all your professors in just one SQL statement, using the IN() syntax.
UPDATE: I've just noticed you use SQL Server instead of MySQL, so I've changed the IN() syntax a bit and changed the link to the SQL Server help docs.
It sounds like your tables aren't normalized. Good form would have a sections table, a students table, and a professors table. Information in each table should be specific to the table's topic.
students
student_id
student_last_name
student_first_name
student_address
etc
sections
section_id
section_name - multiple sections can tend to have the same name but differing content
section_description
section_year - sections can change from year to year
faculty
faculty_id
faculty_name - this is not a key field, more than one person can have the same name.
faculty_address
faculty_type - adjunct, fulltime, etc.
You would then have relational tables so you can associate professors with sections and students with sections.
faculty_2_sections
f2s_id
faculty_id
section_id
student_2_sections
s2s_id
student_id
section_id
This makes it super simple because if a student is logged in, then you already have their student id. If it's a professor, you already have their faculty_id
If you're pulling for students, your sql might look like this:
$sql = "select * from students s,sections sc,faculty f,faculty_2_sections f2s,student_2_sections s2s where s.student_id='$student_id' and s2s.student_id=s.student_id and s2s.section_id=sc.section_id and f2s.faculty_id=f.faculty_id and f2s.section_id=s2s.section_id";
If you're pulling for faculty you would do this:
$sql = "select * from students s,sections sc,faculty f,faculty_2_sections f2s,student_2_sections s2s where f.faculty_id='$faculty_id' and f2s.faculty_id=f.faculty_id and f2s.section_id=s2s.section_id and s2s.section_id=sc.section_id and s2s.student_id=s.student_id";
You can then pull a list of sections to populate the section_ids pull-down to only show students or faculty for a specific section.
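For illustration, the faculty lookup can also be written with explicit JOINs against a cut-down version of the tables above. This is only a sketch using Python's sqlite3 module in place of the poster's PHP and database (an assumption); the tables keep the names from the answer but only a few of their columns, and the function students_for_faculty and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, student_last_name TEXT);
    CREATE TABLE sections (section_id INTEGER PRIMARY KEY, section_name TEXT);
    CREATE TABLE faculty (faculty_id INTEGER PRIMARY KEY, faculty_name TEXT);
    CREATE TABLE faculty_2_sections (f2s_id INTEGER PRIMARY KEY, faculty_id INTEGER, section_id INTEGER);
    CREATE TABLE student_2_sections (s2s_id INTEGER PRIMARY KEY, student_id INTEGER, section_id INTEGER);
    INSERT INTO students VALUES (1, 'Adams'), (2, 'Baker'), (3, 'Clark');
    INSERT INTO sections VALUES (10, 'A1'), (20, 'B2');
    INSERT INTO faculty VALUES (100, 'Smith'), (200, 'Jones');
    INSERT INTO faculty_2_sections (faculty_id, section_id) VALUES (100, 10), (200, 10), (200, 20);
    INSERT INTO student_2_sections (student_id, section_id) VALUES (1, 10), (2, 10), (3, 20);
""")

def students_for_faculty(conn, faculty_id):
    """List (student_id, last name, section name) for every section a faculty member teaches."""
    return conn.execute("""
        SELECT DISTINCT s.student_id, s.student_last_name, sc.section_name
        FROM faculty_2_sections AS f2s
        JOIN student_2_sections AS s2s ON s2s.section_id = f2s.section_id
        JOIN students AS s ON s.student_id = s2s.student_id
        JOIN sections AS sc ON sc.section_id = s2s.section_id
        WHERE f2s.faculty_id = ?
        ORDER BY s.student_id
    """, (faculty_id,)).fetchall()

print(students_for_faculty(conn, 200))
```

A section taught by two faculty members is simply two rows in faculty_2_sections, so no looping over professor names is needed.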
I think I don't understand how 'sort' works, so please don't judge me. I really searched all day long.
I have a movies table with a column named "actors". The actors are links separated by a space " ". The order of the links is very important.
I explode the links into an array which looks like [0]-> link0, [1]->link1, ...
I have an actors table where every actor also has its movie links. I don't want to make 20 different SQL searches, so I built a condition with all the links I want, like this: ( WHERE actor_link = link1 OR actor_link = link2 OR .. )
The problem is this: the search will probably find link7 first, and so my ordering is gone. What can I do to keep the order from the movies table? I want to display the actors by popularity in the movie, not in the order of my database.
Can you give me another method to search the actors without making 'x' sql searches and still keeping the order?
$actors[] = explode(" ", $row['actors_link']);
$x=0;
$actors2 = '';
while ($actors[0][$x]) {
$actors2 = $actors2 . "`link_imdb_actor` = " . "'".$actors[0][$x]."' OR ";
$x++;
}
$actors2 = substr($actors2, 0, -3);
$sql = "SELECT * FROM `actors` WHERE $actors2";
$sql_result = mysql_query($sql) or die(" ");
while ($row3 = mysql_fetch_array($sql_result)) {
echo $row3['link_imdb_actor'];
}
So, the movie Hotel Transylvania has Adam Sandler, Andy Samberg and Selena Gomez. My search shows Selena Gomez, Andy Samberg, Adam Sandler because this is the order from my database. How can I sort the sql results by the order of the actors array? Thank you!
To expand on Arjan's comment, if you want to be able to actually use the actor data (e.g. search with it) I would recommend at least two more tables. One called actors with the fields actorID, firstName, and lastName. The other table would be castings with the fields castingID, actorID, movieID, and castingOrder.
Each castingID will then link an actor to a movie - this would make for easy searches of every movie a particular actor has been in or every actor in a particular movie.
The castingOrder field can be used to maintain the order you want.
I need your existing code to really get the gist of what's going on.
I will make one suggestion in your query. Instead of saying WHERE actor_link = a OR actor_link = b OR actor_link = c do this instead:
WHERE actor_link IN (link1, link2, link3)
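An IN list on its own does not guarantee the order of the returned rows. One way to get the original order back, sketched here in Python with sqlite3 instead of the asker's PHP and MySQL (an assumption for illustration), is to run the single IN query and then sort the rows by each link's position in the exploded array. The function name actors_in_billing_order, the name column, and the sample links are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actors (link_imdb_actor TEXT, name TEXT)")
# Rows are stored in a different order than the movie lists them.
conn.executemany(
    "INSERT INTO actors VALUES (?, ?)",
    [("link_gomez", "Selena Gomez"),
     ("link_samberg", "Andy Samberg"),
     ("link_sandler", "Adam Sandler")],
)

def actors_in_billing_order(conn, links):
    """Fetch all actors with one IN query, then restore the order of the links list."""
    placeholders = ", ".join("?" for _ in links)
    rows = conn.execute(
        f"SELECT link_imdb_actor, name FROM actors WHERE link_imdb_actor IN ({placeholders})",
        links,
    ).fetchall()
    position = {link: index for index, link in enumerate(links)}
    return sorted(rows, key=lambda row: position[row[0]])

# Same idea as explode(" ", ...) on the movie's actors column.
movie_links = "link_sandler link_samberg link_gomez".split(" ")
ordered = actors_in_billing_order(conn, movie_links)
print([name for _, name in ordered])
```

The database is still queried only once; the reordering happens in application code after the fetch.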
I've looked around for info on an efficient 'related videos' algorithm, but I'm struggling to get well-ordered, accurate results.
I get given the 'genre' as a pipe-delimited string. eg: |Action|Sci-Fi|Thriller|
$genre = explode("|", $row['genre']);
if (count($genre) == 3) {
$sql = "SELECT title FROM `movie` WHERE genre LIKE '%$genre[1]%' LIMIT 0,8";
} else {
$sql = "SELECT title FROM `movie` WHERE (genre LIKE '%$genre[1]%' AND genre LIKE '%$genre[2]%') UNION SELECT title FROM `movie` WHERE (genre LIKE '%$genre[1]%' OR genre LIKE '%$genre[2]%') LIMIT 0,10";
}
$related = mysql_query($sql);
Then I basically explode it and do a manual, inefficient search for genre matches depending on the genre count. The results are poor, and it returns anything that is semi-related.
This code makes me want to gag! It works, but I hate it and I know it's uber lame. Any tips to improve the SQL and get richer results?
Move the mappings of genres to movies into a new table movie_genres with columns movie and genre.
This allows you to do this:
$genres = explode('|', trim($row['genre'], '|'));
$sql = "SELECT `movie`, COUNT(*) AS hits
FROM `movie_genres`
WHERE `genre` IN ('" . join("', '", $genres) . "')
GROUP BY `movie`
ORDER BY `hits` DESC
LIMIT 8";
You have to make sure to prevent SQL injection, though.
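As one possible way to handle the injection concern, the genre values can be bound as parameters instead of being joined into the SQL string. The sketch below uses Python's sqlite3 module in place of the asker's PHP and MySQL (an assumption for illustration); the function name related_movies, the tie-breaking sort on movie, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie_genres (movie TEXT, genre TEXT)")
conn.executemany(
    "INSERT INTO movie_genres VALUES (?, ?)",
    [("Alien", "Sci-Fi"), ("Alien", "Thriller"),
     ("Die Hard", "Action"), ("Die Hard", "Thriller"),
     ("The Matrix", "Action"), ("The Matrix", "Sci-Fi"), ("The Matrix", "Thriller"),
     ("Notting Hill", "Romance")],
)

def related_movies(conn, genre_field, limit=8):
    """Rank movies by how many genres of a pipe-delimited genre string they match."""
    genres = [g for g in genre_field.strip("|").split("|") if g]
    if not genres:
        return []
    # One "?" per genre: the values are bound by the driver, never pasted into the SQL.
    placeholders = ", ".join("?" for _ in genres)
    sql = f"""
        SELECT movie, COUNT(*) AS hits
        FROM movie_genres
        WHERE genre IN ({placeholders})
        GROUP BY movie
        ORDER BY hits DESC, movie
        LIMIT ?
    """
    return conn.execute(sql, genres + [limit]).fetchall()

matches = related_movies(conn, "|Action|Sci-Fi|Thriller|")
print(matches)
```

Only the number of placeholders depends on the input, so a genre value containing quotes cannot change the query. The sketch does not exclude the movie currently being viewed; that would need an extra condition.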
The extra table is also a good idea because your database schema is not normalized. In particular, it violates Chris Date's fourth condition for first normal form:
Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else).