I have been tasked with creating a search function in which certain fields carry more weight than others.
Here is a simplified example.
cars (table)
year, make, model, color, type (columns)
Let's say someone searches for the following:
Year: 1968
Make: Ford
Model: Mustang
Color: Red
Type: Sports Car
If a car in the table matches none of the searched fields, it should not show up; if a record matches some of the fields but not all of them, it should still show up. Certain fields, however, should be weighted higher than others.
For instance maybe they are weighted like this:
Column - Weight
Year - 30
Make - 100
Model - 85
Color - 10
Type - 50
So if a record matches the search in the "make" and "model" fields, that record would rank above a record that matched in the "year", "color" and "type" fields, because of the weights we placed on each column.
So let's say the query matches at least one field for two records in the database. They should be ordered by relevance, based on the weight:
1971, Ford, Fairlane, Blue, Sports Car (weight = 150)
1968, Dodge, Charger, Red, Sports Car (weight = 90)
I have been racking my brain trying to figure out how to make this work. If anyone has done something like this please give me an idea of how to make it work.
I would like to do as much of the work in MySQL as possible via joins; I think this will bring up the results faster than doing most of the work in PHP. But any solution to this problem would be much appreciated.
Thanks in advance
Bear with me, this is going to be a strange query, but it seems to work on my end.
SELECT SUM(
IF(year = "1968", 30, 0) +
IF(make = "Ford", 100, 0) +
IF(model = "Mustang", 85, 0) +
IF(color = "Red", 10, 0) +
IF(type = "Sports Car", 50, 0)
) AS `weight`, cars.* FROM cars
WHERE year = "1968"
OR make = "Ford"
OR model = "Mustang"
OR color = "Red"
OR type = "Sports Car"
GROUP BY cars.id
ORDER BY `weight` DESC;
Basically, this groups all results by their id (which is necessary for the SUM() function), does some calculations on the different fields and returns the weight as a total value, which is then sorted from highest to lowest. Also, this will only return results where at least one of the columns matches a supplied value.
Since I don't have an exact copy of your database, run some tests with this on your end and let me know if there's anything that needs to be adjusted.
Expected Results:
+============================================================+
| weight | year | make | model | color | type |
|============================================================|
| 130 | 1968 | Ford | Fairlane | Blue | Roadster |
| 100 | 2014 | Ford | Taurus | Silver | Sedan |
| 60 | 2015 | Chevrolet | Corvette | Red | Sports Car |
+============================================================+
So, as you can see, the results list the closest matches first, which in this case are two Ford (+100) vehicles, one of them from 1968 (+30), and a Red Sports Car (10 + 50), using your criteria.
One more thing: if you also want to display the rest of the results (i.e. results with a weight of 0), simply remove the WHERE ... OR ... clause, so the query checks against all records. Cheers!
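If you want to try the weighting idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. It is not the query above verbatim: SQLite has no IF(), so the score is written as "(col = ?) * weight", which relies on a comparison evaluating to 1 or 0. The helper name ranked_matches and the Honda row are made up for the test; the other three rows are the ones from the expected results.

```python
import sqlite3

# Weights and search values taken from the question.
WEIGHTS = {"year": 30, "make": 100, "model": 85, "color": 10, "type": 50}
SEARCH = {"year": 1968, "make": "Ford", "model": "Mustang",
          "color": "Red", "type": "Sports Car"}

def ranked_matches(conn, search, weights):
    """Return (weight, year, make, model, color, type) rows, best match first."""
    columns = list(weights)  # trusted, hard-coded column names
    # A comparison is 1 or 0 in SQLite, so each term adds its weight on a match.
    score = " + ".join(f"({col} = ?) * {int(weights[col])}" for col in columns)
    where = " OR ".join(f"{col} = ?" for col in columns)
    sql = (f"SELECT {score} AS weight, year, make, model, color, type "
           f"FROM cars WHERE {where} ORDER BY weight DESC")
    # The search values are bound twice: once for the score, once for the WHERE.
    params = [search[col] for col in columns] * 2
    return conn.execute(sql, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (id INTEGER PRIMARY KEY, year INTEGER, "
             "make TEXT, model TEXT, color TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO cars (year, make, model, color, type) VALUES (?, ?, ?, ?, ?)",
    [
        (2015, "Chevrolet", "Corvette", "Red", "Sports Car"),
        (1968, "Ford", "Fairlane", "Blue", "Roadster"),
        (2014, "Ford", "Taurus", "Silver", "Sedan"),
        (2010, "Honda", "Civic", "Black", "Sedan"),  # hypothetical non-matching row
    ],
)

results = ranked_matches(conn, SEARCH, WEIGHTS)
for row in results:
    print(row)
```

Because the table has one row per car here, no GROUP BY or SUM() is needed; the weight is just an expression evaluated per row.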
Further to the comments below, checking the weight after a LEFT JOIN on a pivot table:
SELECT SUM(
IF(cars.year = "1968", 30, 0) +
IF(cars.make = "Ford", 100, 0) +
IF(cars.model = "Mustang", 85, 0) +
IF(cars.color = "Red", 10, 0) +
IF(types.name = "Sports Car", 50, 0)
) AS `weight`, cars.*, types.* FROM cars
LEFT JOIN cars_types ON cars_types.car_id = cars.id
LEFT JOIN types ON cars_types.type_id = types.id
WHERE cars.year = "1968"
OR cars.make = "Ford"
OR cars.model = "Mustang"
OR cars.color = "Red"
OR types.name = "Sports Car"
GROUP BY cars.id
ORDER BY `weight` DESC;
Here is a picture of the LEFT JOIN in practice:
As you can see, the Cobalt matches on color (Silver, +10) and model (Cobalt, +85), while the Caliber matches only on type (Sports Car, +50). And yes, I know a Dodge Caliber isn't a Sports Car; this was for example's sake. Hope that helped!
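One thing to watch with the join version, if I read it right: when a car is linked to several types, the join produces several rows for that car, and SUM() then adds the car-column weights once per row. A possible way around that is to score the type with an EXISTS subquery instead of a join. The sketch below uses Python's sqlite3 module rather than MySQL, the sample rows are made up, and the table names follow the pivot layout above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cars (id INTEGER PRIMARY KEY, year INTEGER, make TEXT, model TEXT, color TEXT);
CREATE TABLE types (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE cars_types (car_id INTEGER, type_id INTEGER);
-- Hypothetical sample data: car 1 has two types, car 3 matches nothing.
INSERT INTO cars VALUES (1, 1968, 'Ford', 'Mustang', 'Red'),
                        (2, 2010, 'Dodge', 'Caliber', 'Black'),
                        (3, 2012, 'Honda', 'Civic', 'Blue');
INSERT INTO types VALUES (1, 'Sports Car'), (2, 'Coupe'), (3, 'Sedan');
INSERT INTO cars_types VALUES (1, 1), (1, 2), (2, 1), (3, 3);
""")

# One output row per car: the type contributes 50 at most once,
# no matter how many types the car is linked to.
QUERY = """
SELECT * FROM (
    SELECT cars.id, cars.make, cars.model,
           (cars.year = :year) * 30
         + (cars.make = :make) * 100
         + (cars.model = :model) * 85
         + (cars.color = :color) * 10
         + 50 * EXISTS (
               SELECT 1 FROM cars_types
               JOIN types ON types.id = cars_types.type_id
               WHERE cars_types.car_id = cars.id AND types.name = :type
           ) AS weight
    FROM cars
)
WHERE weight > 0
ORDER BY weight DESC
"""

search = {"year": 1968, "make": "Ford", "model": "Mustang",
          "color": "Red", "type": "Sports Car"}
rows = conn.execute(QUERY, search).fetchall()
for row in rows:
    print(row)
```

With this data the Mustang scores 30 + 100 + 85 + 10 + 50 even though it has two types, and the Caliber scores 50 for its type alone.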
If I understand your logic, you can just do a direct comparison in PHP between the values requested and the values returned.
The query would look like:
SELECT year, make, model, color, type
FROM cars
WHERE year='$postedyear' OR make='$postedmake'
OR model='$postedmodel' OR color='$postedcolor'
OR type='$postedtype'
Then, in PHP, loop over the results:
foreach ($results as $result) {
    $score = 0;
    if ($result['year'] == $postedyear) { $score = $score + 30; }
    // Continue with the other fields using the same logic.
}
At the end of each foreach iteration, $score holds the score of the current row. If you push the score into the $result array, you can also sort by score before displaying the results.
A variation on @lelio-faieta's answer:
In PHP you can have a result array containing an array of values for each item that matches at least one of the search terms, plus an associative array of the values to match and an associative array of weights, both with the same keys. For each item you would then get the array of matching keys (maybe use array_intersect_assoc()), multiply by the weights, sum, and add the total to the original data. You do still have to sort the result array at that point.
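To make the idea concrete, here is a rough sketch of the same approach in Python rather than PHP (dicts standing in for associative arrays). The function names score and rank are made up; the weights, the search and the two sample rows come from the question.

```python
# Weights and search values from the question, keyed by the same field names.
WEIGHTS = {"year": 30, "make": 100, "model": 85, "color": 10, "type": 50}
SEARCH = {"year": 1968, "make": "Ford", "model": "Mustang",
          "color": "Red", "type": "Sports Car"}

def score(row, search, weights=WEIGHTS):
    """Sum the weights of the fields whose value equals the searched value."""
    # Keeping only the key/value pairs that agree is the job
    # array_intersect_assoc() would do in PHP.
    matched = [field for field, wanted in search.items()
               if row.get(field) == wanted]
    return sum(weights[field] for field in matched)

def rank(rows, search):
    """Attach a score to each row, drop non-matches, sort best first."""
    scored = [dict(row, score=score(row, search)) for row in rows]
    matches = [row for row in scored if row["score"] > 0]
    return sorted(matches, key=lambda row: row["score"], reverse=True)

rows = [
    {"year": 1968, "make": "Dodge", "model": "Charger",
     "color": "Red", "type": "Sports Car"},
    {"year": 1971, "make": "Ford", "model": "Fairlane",
     "color": "Blue", "type": "Sports Car"},
]

ranked = rank(rows, SEARCH)
for row in ranked:
    print(row["score"], row["make"], row["model"])
```

Keeping the weights in one dict means a weight can be changed without touching the scoring code.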
There is a solution that does this directly in the MySQL query, but you would end up with an overgrown, resource-hungry query for every single search you perform.
Doing it in PHP is not much different in resource usage, since you are bound to loop over the results several times while processing them.
I've had a very similar project and my best suggestion would be: "use SphinxSearch".
It is very easy to install and has a bit of a learning curve to set up afterwards, but its queries are very similar to MySQL queries. With it you can apply weights to every column match and rank your results afterwards.
Also, it is many times faster than typical MySQL queries.
Hello friends, I have two MySQL tables with a 1:N relationship between category and category_Dates.
Category:
ID Category Frequency
1 Cat A Half-yearly
2 Cat B Quarterly
category_Dates:
ID CatID Date
1 1 01-Jan-15
2 1 01-Jul-15
3 2 01-Jan-15
4 2 01-Apr-15
5 2 01-Jul-15
6 2 01-Oct-15
Based on the category frequency, I automatically enter a number of records in category_Dates. E.g.:
When the category frequency is Quarterly, I enter 4 records in category_Dates with the ID of that category. The dates will be entered later.
I am a little confused about what to do if someone wants to edit the frequency, say from half-yearly to yearly. How should I change the number of records? Please help with your valuable suggestions. I am using the Laravel 4 framework with MySQL.
The best way would be a third table joining dates and categories. If you look carefully, you can see this is actually a many-to-many (N to N) relationship: one category can have multiple dates, and one date may be part of multiple categories. For example, 01-Jan-15 is part of both Category 1 and Category 2.
So use:
category table
id Category Frequency
1 Cat A Half-yearly
2 Cat B Quarterly
date table
id Date
1 01-Jan-15
2 01-Apr-15
3 01-Jul-15
4 01-Oct-15
categories_dates table
ID CatID Date_id
1 1 1
2 1 3
3 2 1
4 2 2
5 2 3
6 2 4
If you change the frequency in the Category table, retrieve the updated category_id,
delete all rows from categories_dates where CatID = category_id, then insert the new entries in categories_dates.
Hope this helps.
I assume your models are Category and CategoryDates.
Let's update category id 1 from Half-yearly to Quarterly:
$query = Category::find(1);
$query -> Frequency = 'Quarterly';
$query -> save();
return $query -> id;
In the CategoryDates model you would delete the rows with CatId = 1 and insert the new data:
$catID = 1;
$query = CategoryDates::where('CatId', $catID) -> delete();
$data = ['CatId' => $catID, 'date' => '01-Jan-15', ....];
CategoryDates::create($data);
Of course, this assumes that you return the newly updated category id to your controller and call a function to do the update in your CategoryDates model.
Hope this helps.
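For illustration only, the update-delete-reinsert sequence can be sketched outside Laravel like this, using Python's sqlite3 module and the two-table layout from the question. The function name change_frequency is made up, and the mapping from frequency to number of rows is an assumption except for Quarterly = 4, which the question states. Dates are inserted as NULL because the question says they are entered later.

```python
import sqlite3

# Rows to create per frequency. Quarterly = 4 is from the question;
# the other two counts are assumed.
RECORDS_PER_FREQUENCY = {"Yearly": 1, "Half-yearly": 2, "Quarterly": 4}

def change_frequency(conn, category_id, new_frequency):
    """Update the frequency and rebuild the placeholder date rows atomically."""
    count = RECORDS_PER_FREQUENCY[new_frequency]
    with conn:  # commits on success, rolls back if any statement fails
        conn.execute("UPDATE category SET Frequency = ? WHERE ID = ?",
                     (new_frequency, category_id))
        conn.execute("DELETE FROM category_dates WHERE CatID = ?",
                     (category_id,))
        conn.executemany(
            "INSERT INTO category_dates (CatID, Date) VALUES (?, NULL)",
            [(category_id,)] * count,
        )

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (ID INTEGER PRIMARY KEY, Category TEXT, Frequency TEXT);
CREATE TABLE category_dates (ID INTEGER PRIMARY KEY, CatID INTEGER, Date TEXT);
INSERT INTO category VALUES (1, 'Cat A', 'Half-yearly'), (2, 'Cat B', 'Quarterly');
INSERT INTO category_dates (CatID, Date) VALUES
    (1, '01-Jan-15'), (1, '01-Jul-15'),
    (2, '01-Jan-15'), (2, '01-Apr-15'), (2, '01-Jul-15'), (2, '01-Oct-15');
""")

change_frequency(conn, 1, "Yearly")

def date_rows(cat_id):
    """Count the date rows currently stored for one category."""
    return conn.execute("SELECT COUNT(*) FROM category_dates WHERE CatID = ?",
                        (cat_id,)).fetchone()[0]

print(date_rows(1), date_rows(2))
```

Wrapping the three statements in one transaction means a failed insert cannot leave a category with its old dates deleted and no new rows. Note that this throws away any dates already entered for the category, which may or may not be what you want.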
I am currently working on a project related to business listings, and I need some help in handling the category structure.
I use a table named Bus_CategoryTbl to maintain the categories; its fields are Cat_ID, Cat_Name, Cat_Slug, Cat_Level, etc. Cat_PID holds the Cat_ID of the parent category.
Here are some example records:
Cat_ID Cat_name Cat_Slug Cat_PID
1 Web Design web-design 0
2 PHP php 1
3 MYSQL mysql 1
4 Hotels hotels 0
5 5 Stars 5-stars 4
6 3 stars 3-stars 4
7 Le Meridian le-meridian 5
8 St Laurn Suites st-laurn-suites 5
9 Niraali Executive niraali 6
A Cat_PID value of 0 indicates a first-level parent category.
The example records above have 3 levels. For example: Hotels (Cat_ID: 4) -> 5 Stars (Cat_ID: 5) -> St Laurn Suites (Cat_ID: 8)
How can I achieve the above result dynamically using PHP/MySQL (the number of levels may increase in the future)? It should not use much CPU time. The current code is written as a 3-dimensional array structure using foreach, but it is a little confusing and takes too much CPU time.
Can someone help me achieve this? TIA.
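Try the following code:
function Fname($parentId = 0) {
    $catgArray = array();
    // Cast to int so the value is safe to put into the query.
    $sql = mysql_query("SELECT * FROM Bus_CategoryTbl WHERE Cat_PID = " . (int) $parentId);
    // Fetch every child row as an object, not just the first row.
    while ($mc = mysql_fetch_object($sql)) {
        array_push($catgArray, $mc);
        // Recurse to collect the children of this category.
        $subCatg = Fname($mc->Cat_ID);
        if (count($subCatg) > 0) {
            array_push($catgArray, $subCatg);
        }
    }
    return $catgArray;
}
In the result, each category is an object and each list of sub-categories is a nested array, so we can detect the sub-category levels with the is_array() function.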
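Since CPU time was a concern, note that a recursive function like this one runs one query per category. A different technique, sketched here in Python rather than PHP, is to fetch all rows with a single query and build the tree in memory by grouping rows under their parent id. The names build_tree and path_to are made up, ROWS is the sample data from the question, and the sketch assumes the data contains no cycles.

```python
from collections import defaultdict

# (Cat_ID, Cat_Name, Cat_Slug, Cat_PID) rows from the question,
# as a single "SELECT *" would return them.
ROWS = [
    (1, "Web Design", "web-design", 0),
    (2, "PHP", "php", 1),
    (3, "MYSQL", "mysql", 1),
    (4, "Hotels", "hotels", 0),
    (5, "5 Stars", "5-stars", 4),
    (6, "3 stars", "3-stars", 4),
    (7, "Le Meridian", "le-meridian", 5),
    (8, "St Laurn Suites", "st-laurn-suites", 5),
    (9, "Niraali Executive", "niraali", 6),
]

def build_tree(rows, root_id=0):
    """Group rows by parent id once, then link children to any depth."""
    children = defaultdict(list)
    for cat_id, name, slug, parent_id in rows:
        children[parent_id].append({"id": cat_id, "name": name, "slug": slug})

    def attach(parent_id):
        nodes = children.get(parent_id, [])
        for node in nodes:
            node["children"] = attach(node["id"])
        return nodes

    return attach(root_id)

def path_to(tree, cat_id, trail=()):
    """Return the names from the top level down to cat_id, or None."""
    for node in tree:
        here = trail + (node["name"],)
        if node["id"] == cat_id:
            return here
        found = path_to(node["children"], cat_id, here)
        if found:
            return found
    return None

tree = build_tree(ROWS)
print(" -> ".join(path_to(tree, 8)))
```

Each row is visited once while grouping and once while linking, so the work grows with the number of categories, not with the number of levels.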
I'm just looking for the best way to go about building the code for a search query.
Say I have a basic search that looks at 6 columns and an advanced search that looks at 14 columns.
I was thinking of compressing the yes/no fields so that the search only has to look at 1 column holding one long number.
So for example, say I have the categories 'city, price, date, type, area' and then about 8 'yes' or 'no' (or 1/0) columns. I am just wondering which search would work better/faster.
SELECT * FROM table WHERE city=$city AND price BETWEEN ($price) AND other_searchables = [3-9][2-9]__1_1_110110_____1_1
With this code the custom fields are packed into one field, each stored as 0=no or 1=yes. The pattern first checks that the first two digits are 3 or more and 2 or more, and _ stands for any single character. So for example the other_searchables field would look like '2300111011011100000101'.
OR should I use WHERE for each column?
SELECT * FROM table WHERE city=$city AND price BETWEEN ($price) AND type = $types AND area = $areas AND customfield = 1 AND customfield = 0 AND customfield = 1 AND customfield = 1 AND customfield = 0 AND customfield = 1 AND customfield = 0 AND customfield = 1
I am going to index city, as this will be searched every time. I'm just looking for any advice on which way to go here, i.e. which query would be better.
Thanks
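For anyone who wants to experiment with the packed-flags idea: the pattern in the first query mixes character classes with _ wildcards, so in MySQL it would have to be matched with REGEXP (with . in place of _) rather than with =, which compares literally. The behaviour of such a pattern can be tried out with a small Python sketch. The helper names are made up, and the first test string is invented to fit the pattern.

```python
import re

def to_regex(pattern):
    """Turn the question's notation into a regular expression.

    "_" means "any single character" in the question; the regular
    expression equivalent is ".". Character classes such as [3-9]
    are already valid regular expression syntax.
    """
    return re.compile(pattern.replace("_", "."))

FLAGS = to_regex("[3-9][2-9]__1_1_110110_____1_1")

def matches(value):
    """True if the whole packed string satisfies the pattern."""
    return FLAGS.fullmatch(value) is not None

# An invented value that fits the pattern, and the sample value from the question.
print(matches("4500101011011000000101"))
print(matches("2300111011011100000101"))
```

The sample value from the question does not satisfy this particular pattern, because its first digit is 2 and the pattern asks for 3 or more. Also keep in mind that a pattern match over a packed string generally cannot use an ordinary index the way an equality test on a separate column can.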
This is the books table on db;
book_ID writer_ID
-------- -----------
1 10
2 10
3 10
4 10
5 10
This is the rates table on the db,
book_ID rate
------- --------
1 4
2 3
2 5
2 1
2 4
3 5
4 2
4 5
4 2
4 4
5 3
Now, I start with the writer_ID, and I have to find all the book_IDs connected to that writer_ID and the average rate of each book_ID from the rates table. Finally, I have to find the greatest average rate and its book_ID.
This is my code:
$query = "SELECT * FROM books WHERE writer_ID = '$id'";
$result = mysql_query($query);
$arr = array();
while ($info = mysql_fetch_array($result)) {
    // getaveragerate() is the function that returns the average of the rates from the rates table
    $arr[] = array('ID' => $info['book_ID'], 'average' => getaveragerate($info['book_ID']));
}
$greatest_average_and_books_id_number = max($arr); // don't know how to get the highest average and its ID together from the array
That is my question. Sorry, English is not my native language; I am trying my best to explain my problem, but sometimes I can't and I just get stuck.
Thanks for understanding.
Or just let the database do it for you:
SELECT max(fieldname) FROM rates WHERE id='34'
If you are limited as to which functions you can perform (i.e. using some CRUD class):
SELECT * FROM rates WHERE id='34' ORDER BY id DESC LIMIT 1
You haven't told us what fields from the database will be returned by your query. It also looks like you're filtering (WHERE clause) on key column, which should only return one record. Therefore you can strip out everything you have there and only put:
$greatest_record = 34;
No need for a query at all!
With a little more information on what you're doing and what fields you're expecting:
$query = "SELECT id, rate FROM rates";
$result = mysql_query($query);
$myarray = array();
$greatest_number = 0;
while ($row = mysql_fetch_array($result)) {
$myarray[] = $row; // Append the row returned into $myarray
if ($row['id'] > $greatest_number) $greatest_number= $row['id'];
}
// Print out all the id's and rates
foreach ($myarray as $row_num => $row) {
print "Row: $row_num - ID: {$row['id']}, Rate: {$row['rate']} <br>";
}
print "Highest ID: $greatest_number";
Note that we maintained what was the greatest number at each row returned from the database, so we didn't have to loop through the $myarray again. Minor optimization that could be a huge optimization if you have tens of thousands of rows or more.
This solution is on the basis that you actually need to use the ID and RATE fields from the database later on, but want to know what the largest ID is now. Anyone, feel free to edit my answer if you think there's a better way of getting the greatest_number from the $myarray after it's generated.
Update:
You're going to need several queries to accomplish your task then.
The first will give you the average rate per book:
SELECT
book_id,
avg(rate) as average_rate
FROM Rates
GROUP BY book_id
The second will give you the book with the highest average rate. An aggregate such as max() cannot be used in a WHERE clause, so sort the averages and keep the top row:
SELECT
averages.book_id,
averages.average_rate
FROM (
SELECT
book_id,
avg(rate) as average_rate
FROM Rates
GROUP BY book_id
)
as averages
ORDER BY averages.average_rate DESC
LIMIT 1
This will give you a list of books for a given writer:
SELECT book_id
FROM Books
WHERE writer_id = $some_id
Don't try to do everything in one query. Mixing all those requirements into one query will not work how you want it to, unless you don't mind many nearly duplicate rows.
I hope you can use this update to answer the question you have. These SQL queries will give you the information you need, but you'll still need to build your data structures in PHP if you need to use this data somehow. I'm sure you can figure out how to do that.
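As a rough illustration of how the separate queries could be stitched together in application code, here is a sketch in Python with the built-in sqlite3 module instead of PHP and MySQL. The function name best_rated_book is made up; the table contents are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books (book_ID INTEGER PRIMARY KEY, writer_ID INTEGER);
CREATE TABLE rates (book_ID INTEGER, rate INTEGER);
INSERT INTO books VALUES (1, 10), (2, 10), (3, 10), (4, 10), (5, 10);
INSERT INTO rates VALUES (1, 4), (2, 3), (2, 5), (2, 1), (2, 4), (3, 5),
                         (4, 2), (4, 5), (4, 2), (4, 4), (5, 3);
""")

def best_rated_book(conn, writer_id):
    """Return (book_ID, average rate) of the writer's best-rated book, or None."""
    # Query 1: the books that belong to this writer.
    book_ids = [row[0] for row in conn.execute(
        "SELECT book_ID FROM books WHERE writer_ID = ?", (writer_id,))]
    # Query 2: the average rate per book, as a {book_ID: average} dict.
    averages = dict(conn.execute(
        "SELECT book_ID, AVG(rate) FROM rates GROUP BY book_ID"))
    # Combine in application code; books without any rates are skipped.
    rated = [(averages[book_id], book_id)
             for book_id in book_ids if book_id in averages]
    if not rated:
        return None
    average, book_id = max(rated)  # a tie goes to the higher book_ID
    return book_id, average

best = best_rated_book(conn, 10)
print(best)
```

Pairing each average with its book_ID before calling max() is what keeps the two values together, which is the part the original PHP code was missing.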