How to Implement A Recommendation System? - php

I have the Collective Intelligence book, but I'm not sure how to apply it in practice.
Let's say I have a PHP website with a MySQL database. Users can insert articles with a title and content into the database. For the sake of simplicity, we just compare the titles.
How to Make Coffee?
15 Things About Coffee.
The Big Question.
How to Sharpen A Pencil?
Guy Getting Hit in Balls
When we open the 'How to Make Coffee?' article, the second and fourth titles share words with it, so they should be displayed in the Related Articles section.
How can I implement this using PHP and MySQL? It's OK if I have to use Python. Thanks in advance.

Store a set of keywords alongside each product, which should essentially be everything in the title besides a set of stop words. When a title is displayed, you find any other products which share keywords with it (with those sharing more keywords given priority).
You could further enhance this by assigning a score to each keyword based on its scarcity (with more scarce words being given a higher score, as a match on 'PHP', for instance, is going to be more relevant than a match on 'programming'), or by tracking the number of times a user navigates manually between a set of products.
Regardless you'd best start off by making it simple, and then enhance it as you go on. Depending on the size of your database more advanced techniques may not be all that fruitful.
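The keyword approach above can be sketched in a few lines of Python, since the question allows it. This is only a minimal illustration, not a definitive implementation: the stop-word list, the function names and the scarcity formula (a logarithmic weight similar to inverse document frequency) are assumptions made for the example, and the titles are the ones from the question.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "about", "how", "in", "the", "things", "to"}

TITLES = [
    "How to Make Coffee?",
    "15 Things About Coffee.",
    "The Big Question.",
    "How to Sharpen A Pencil?",
    "Guy Getting Hit in Balls",
]


def keywords(title):
    """Return the set of lower-cased words in a title, minus stop words."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return {w for w in words if w not in STOP_WORDS}


def related(current, titles, limit=5):
    """Rank other titles by the summed scarcity weight of shared keywords."""
    kw = {t: keywords(t) for t in titles}
    # Number of titles each keyword appears in.
    doc_freq = Counter(w for words in kw.values() for w in words)
    total = len(titles)

    def weight(word):
        # Rarer words get a higher weight.
        return math.log(total / doc_freq[word]) + 1.0

    scored = []
    for title in titles:
        if title == current:
            continue
        shared = kw[current] & kw[title]
        if shared:
            scored.append((sum(weight(w) for w in shared), title))
    # Highest score first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]


print(related("How to Make Coffee?", TITLES))
```

Note that with "how" and "to" treated as stop words, only the second title is returned for the coffee article; the pencil article would only show up if those words were kept as keywords.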

You're best off using a set of tags which are parsed and stored in the db when the title is inserted, and then querying based on that.
If you have to parse the title though, you'd basically be doing a LIKE query:
SELECT * FROM ENTRIES WHERE TITLE LIKE '%<keyword>%';
For a more verbose answer though:
// You need some test to see if the word is valid.
// "is" should not be considered a valid match.
// This is a simple one based on length; a
// "blacklist" would be better, but that's up to you.
function isValidEntry( $word )
{
    return strlen( $word ) >= 4;
}
// To hold all relevant search strings:
$terms = array();
$postTitleWords = explode( ' ', strtolower( 'How to Make Coffee' ) );
foreach( $postTitleWords as $index => $word )
{
    if( isValidEntry( $word ) ) $terms[] = $word;
    else
    {
        // Pair a short word with a short neighbour, e.g. "how to".
        $bef = $postTitleWords[ $index - 1 ] ?? null;
        if( $bef && !isValidEntry( $bef ) ) $terms[] = "$bef $word";
        $aft = $postTitleWords[ $index + 1 ] ?? null;
        if( $aft && !isValidEntry( $aft ) ) $terms[] = "$word $aft";
    }
}
$terms = array_unique( $terms );
if( !count( $terms ) )
{
    // This is a completely unique title!
}
$search = 'SELECT * FROM ENTRIES WHERE lower( TITLE ) LIKE \'%' . implode( '%\' OR lower( TITLE ) LIKE \'%', $terms ) . '%\'';
// Run that through mysqli or PDO, ideally with bound parameters instead of string concatenation.

This can be achieved simply by using wildcards in SQL queries. If you have larger texts and the wildcard seems unable to capture the middle part of the text, then check whether a substring of one matches the other. I hope this helps.
BTW, your question title asks about implementing a recommendation system, while the question description just asks about matching a field among database records. Recommendation systems are a broad topic with many interesting algorithms (e.g., collaborative filtering, content-based methods, matrix factorization, neural networks, etc.). Please feel free to explore these advanced topics if your project is at that scale.

Related

How can I filter WooCommerce shop products (product loop) by their custom product attributes?

I am looking for a simple way to filter all WooCommerce shop products inside the shop loop by their custom product attributes.
The attributes can have a lot of values with special chars (besides normal Latin chars) like ü or ß.
I've done a lot of research inside Stack Overflow, but was unable to find a good solution yet.
When I started my question here, I thought it would be an easy trip – haha. Since I got no answers in time, I've started researching again and found a huge pile of unanswered or non-specific questions on Stack Overflow. A fast and simple solution would have been adding all attributes as global attributes and using the taxonomy query filter to check against each product, like pa_xxxx.
I mean, this solution totally works, but it greatly increases the manual effort depending on your number of products. So it is no option for me (> 4000 products).
During my research, I've found an answer which used a WooCommerce filter hook named woocommerce_product_query_meta_query to check if there is any given value inside the serialized custom attributes stored inside the wp_postmeta table under the _product_attributes key for each product:
add_filter( 'woocommerce_product_query_meta_query', 'filter_woocommerce_product_query_meta_query', 10, 2 );
function filter_woocommerce_product_query_meta_query( array $meta_query ): array {
if ( is_shop() || is_product_category() ) {
$meta_query[] = [
'key' => '_product_attributes',
'value' => 'grün',
'compare' => 'LIKE'
];
}
return $meta_query;
}
(I've added a check for shop or categories page)
At first I was really happy because it worked. But minutes later I realized that this design is useless for my purpose, since it ignores all attribute names and just searches for grün, and grün can be anything from a color to a nice meadow. So it is no option for a precise filter.
Then I remembered that I can use non-greedy RegEx within MySQL. Not simple for me, but possible.
Since the hook gets parsed inside $wpdb later on, I've tried to find out which compare operators are allowed:
array( '=', 'IN', 'BETWEEN', 'LIKE', 'REGEXP', 'RLIKE', '>', '>=', '<', '<=' )
Great, we have REGEXP available. So I've started writing my RegEx which checks inside serialized attributes. Hours later, I'm proud to present you my function:
add_filter( 'woocommerce_product_query_meta_query', 'filter_woocommerce_product_query_meta_query', 10, 2 );
function filter_woocommerce_product_query_meta_query( array $meta_query ): array {
if ( is_shop() || is_product_category() ) {
$quoted_key = preg_quote( 'farbe', '/' );
$quoted_value = preg_quote( 'grün', '/' );
$meta_query[] = [
'key' => '_product_attributes',
'value' => 's:[0-9]+:"' . $quoted_key . '";[a-z]:[0-9]+:\{[a-z]:[0-9]+:"name";[a-z]:[0-9]+:"([a-zA-ZäöüÄÖÜß0-9?!+*-_.:,;=&%$/()#<> ]+)";[a-z]:[0-9]+:"value";[a-z]:[0-9]+:"(' . $quoted_value . '[ ";]|([a-zA-ZäöüÄÖÜß0-9?!+*-_.:,;=&%$/()#<> ]+ \| )+(' . $quoted_value . '[ ";]))',
'compare' => 'REGEXP'
];
}
return $meta_query;
}
To explain my RegEx a bit, I'll show you an example of a serialized attributes string directly out of my DB:
a:2:{s:5:"farbe";a:6:{s:4:"name";s:5:"Farbe";s:5:"value";s:12:"Grün | Gelb";s:8:"position";i:0;s:10:"is_visible";i:1;s:12:"is_variation";i:1;s:11:"is_taxonomy";i:0;}s:7:"groesse";a:6:{s:4:"name";s:7:"Größe";s:5:"value";s:5:"S | M";s:8:"position";i:1;s:10:"is_visible";i:1;s:12:"is_variation";i:1;s:11:"is_taxonomy";i:0;}}
As you can see above, inside my function, we want to search for a key named farbe and check if it contains a value grün. When you check my data example, you can see, that we have a correct dataset matching my search.
Before digging into my RegEx, I want to explain the use of preg_quote(). Since we have a RegEx, it can happen that our strings contain special characters which would be interpreted as part of the RegEx if we don't escape them. So we use the above function to do so.
So let's dig into my RegEx (since we can not use greedy RegEx, we need to build everything until we find our values):
s:[0-9]+:" This part should find the beginning of an attribute like s:5:" which starts with a s: and any number between 0-9 until over by using a + followed by a "
Now we can insert our key ' . $quoted_key . ' e.g. farbe which will directly come after a " sign
Now we want to continue until we reach the real name field (not WooCommerce generated key) by using ";[a-z]:[0-9]+:\{[a-z]:[0-9]+:". We're again matching all possible serialized structure values here. As you can see, we also need to escape the opening { of the attribute set, including the name and the values
At this point we will not pass a name but expect one by using "name";[a-z]:[0-9]+:"([a-zA-ZäöüÄÖÜß0-9?!+*-_.:,;=&%$/()#<> ]+)"; We allow the name to include all signs defined inside []. We also include a pattern to still match the serialized string structure and close everything up with "; again
Finally we have our value checking part [a-z]:[0-9]+:"value";[a-z]:[0-9]+:"(' . $quoted_value . '[ ";]|([a-zA-ZäöüÄÖÜß0-9?!+*-_.:,;=&%$/()#<> ]+ \| )+(' . $quoted_value . '[ ";])). At the beginning, we try to match the serialized string structure again [a-z]:[0-9]+:" followed by the value identifier. Now it gets tricky. As you can see, I pass the quoted value into two alternatives: one to check if the value is placed at the beginning, the other to check within the values, separated by a | sign and ending with "; again
I don't want to describe every single RegEx part, but I think you get the idea. Since I'm not a RegEx professional, the RegEx may well have room for optimization. Also, I'm not covering all available languages or signs yet since I just don't need them at the moment.
You can also put a PHP foreach loop around, which loops over a set of values to add the RegEx multiple times with different values to the $meta_query array. Since I want to keep things simple, I intentionally decided to not use value groups like (grün|rot|gelb) since it seems to have an error potential.
You can find the RegEx inside the editor: https://extendsclass.com/regex/6cc8142
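To see what the pattern does and does not match, here is a small sketch that applies the same expression to the sample string from above. It is written in Python purely so it can be run outside WordPress; re.escape stands in for preg_quote, re.IGNORECASE approximates the case-insensitive matching that MySQL typically applies to non-binary strings, and the helper names are made up for the example. Python's re engine and MySQL's REGEXP are different engines, so treat this as an approximation only.

```python
import re

# Character class copied from the PHP pattern above.
CHARS = r"[a-zA-ZäöüÄÖÜß0-9?!+*-_.:,;=&%$/()#<> ]+"

# Sample serialized attributes from the post.
SAMPLE = (
    'a:2:{s:5:"farbe";a:6:{s:4:"name";s:5:"Farbe";s:5:"value";s:12:"Grün | Gelb";'
    's:8:"position";i:0;s:10:"is_visible";i:1;s:12:"is_variation";i:1;s:11:"is_taxonomy";i:0;}'
    's:7:"groesse";a:6:{s:4:"name";s:7:"Größe";s:5:"value";s:5:"S | M";'
    's:8:"position";i:1;s:10:"is_visible";i:1;s:12:"is_variation";i:1;s:11:"is_taxonomy";i:0;}}'
)


def attribute_pattern(key, value):
    """Build the pattern for one attribute key and one value."""
    k = re.escape(key)
    v = re.escape(value)
    return (
        r's:[0-9]+:"' + k + r'";[a-z]:[0-9]+:\{[a-z]:[0-9]+:"name";[a-z]:[0-9]+:"('
        + CHARS + r')";[a-z]:[0-9]+:"value";[a-z]:[0-9]+:"('
        + v + r'[ ";]|(' + CHARS + r' \| )+(' + v + r'[ ";]))'
    )


def has_attribute_value(serialized, key, value):
    """True if the attribute `key` lists `value` among its values."""
    pattern = attribute_pattern(key, value)
    return re.search(pattern, serialized, re.IGNORECASE) is not None


print(has_attribute_value(SAMPLE, "farbe", "grün"))    # value belongs to this key
print(has_attribute_value(SAMPLE, "groesse", "grün"))  # value belongs to another key
```

The second call is the case a plain LIKE '%grün%' cannot tell apart: the value exists in the string, but not under the requested attribute.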
I hope I can help you a bit with this tutorial and made your day just a bit better. Feel free to post your improvements to my RegEx. Cheers.
Update: 26.10.2022
My above solution works for AND relations inside the DB. This means, for example: WHERE color = red AND blue AND green. In case you want to extend the number of products being displayed by adding another attribute check to the query, you should use an OR relation by nesting the query itself and adding a relation key:
add_filter( 'woocommerce_product_query_meta_query', 'filter_woocommerce_product_query_meta_query', 10, 2 );
function filter_woocommerce_product_query_meta_query( array $meta_query ): array {
if ( is_shop() || is_product_category() ) {
$quoted_key = preg_quote( 'farbe', '/' );
$quoted_value = preg_quote( 'grün', '/' );
$meta_query[] = [
'relation' => 'OR',
[
'key' => '_product_attributes',
'value' => 's:[0-9]+:"' . $quoted_key . '";[a-z]:[0-9]+:\{[a-z]:[0-9]+:"name";[a-z]:[0-9]+:"([a-zA-ZäöüÄÖÜß0-9?!+*-_.:,;=&%$/()#<> ]+)";[a-z]:[0-9]+:"value";[a-z]:[0-9]+:"(' . $quoted_value . '[ ";]|([a-zA-ZäöüÄÖÜß0-9?!+*-_.:,;=&%$/()#<> ]+ \| )+(' . $quoted_value . '[ ";]))',
'compare' => 'REGEXP'
]
// Add multiple arrays like the one above here to extend the check
];
}
return $meta_query;
}
If you want to add another attribute check, you need to add another child array inside the main array after the first child array.

ACF make top 10 based on field value

I've been using ACF for a while now and I thought this would be easier, but I can't figure out how to do this properly...
I'm trying to create some kind of trophy cabinet. Every company has a score that is stored inside an ACF field called "company_score".
For example we have companies called Microsoft, Facebook and Twitter. They all have a score:
Facebook = 200000
Microsoft = 900000
Twitter = 100000
So the top 3 will be like
1) Microsoft
2) Facebook
3) Twitter
I know how to display the value of an ACF field, but how can I compare all the stored scores so that the company with the best score displays a gold medal, the company with the second best score displays a silver medal, and so on?
I had it all figured out in my head, but I'm a bit stuck on how to actually do this.
There could be several approaches to handle this. The first one that comes to mind is very simple.
Try these steps:
Pull all fields
Store them in an array
Sort the array in descending order, so the highest score comes first
Display the results.
I hope you have some understanding of the code, but here is a coded guideline...
<?php
$fields = get_fields();
$company_details = array();
if( $fields ):
    foreach( $fields as $name => $value ):
        $company_details[ $name ] = $value;
    endforeach;
endif;
// Sort by company score, highest first, keeping the names as keys
arsort( $company_details );
// Loop through the array and display results, like
foreach( $company_details as $name => $value ):
    echo "Company name is: " . $name . ". Company Score: " . $value;
endforeach;
PS: The code is not tested because the purpose was to share logic with you. So, ignore if there's any error.
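The ranking and medal logic itself is independent of ACF, so here is a minimal sketch of it in Python. It assumes the scores have already been collected into a name-to-score mapping; how that mapping is filled from the company_score field is not shown, and the function name and medal labels are made up for the example.

```python
def award_medals(scores, medals=("gold", "silver", "bronze")):
    """Rank companies by score (highest first) and attach a medal to the top ranks.

    Returns a list of (rank, name, score, medal) tuples; medal is None
    for every rank beyond the available medals.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    result = []
    for rank, (name, score) in enumerate(ranked, start=1):
        medal = medals[rank - 1] if rank <= len(medals) else None
        result.append((rank, name, score, medal))
    return result


company_scores = {"Facebook": 200000, "Microsoft": 900000, "Twitter": 100000}
for rank, name, score, medal in award_medals(company_scores):
    print(f"{rank}) {name}: {score} ({medal})")
```

For a top 10, the same list can simply be cut off after ten entries; only the first three carry a medal.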

Laravel Eloquent / SQL - search for keywords in db

I'm currently trying to implement a search for keywords/tags in my db.
In my db, I have lines with keywords like:
auto,cabrio,frischluft or
hose,jeans,blaue hose,kleidung
so always some keywords (which can also contain whitespace), separated by a comma (,).
Now I want to be able to find a product in my db that has some keywords entered.
With LIKE I can find the two entries I mentioned with queries like auto,cabrio or also cabrio,frischluft or hose,jeans,blau or hose,kleidung. But what happens if I enter auto,frischluft or something like hose,blaue hose or jeans,kleidung?
Then LIKE won't work any more. Is there a way to do this?
I hope you know what I mean...
So just to make it clear: The code I currently use is:
$searchQuery = "%".$request->input('productSearch')."%";
and $products = Product::where('name', 'LIKE', $searchQuery)->paginate(15);
But as I said, this won't bring me back the article with the keywords auto,cabrio,frischluft if the input productSearch has the keywords auto,frischluft entered...
Any ideas?
Sorry, I know I'm late for the party but this is the first result in Google when I was looking for Eloquent keywords search. I had the same problem and I want to help with my solution.
$q = $request->input('productSearch');
$needles = explode(',', $q);
// In my case, I wanted to split the string when a comma or a whitespace is found:
// $needles = preg_split('/[\s,]+/', $q);
// Match the full input first, then widen the search with each single keyword.
$products = Product::where('name', 'LIKE', "%{$q}%");
foreach ($needles as $needle) {
    $products = $products->orWhere('name', 'LIKE', "%{$needle}%");
}
$products = $products->paginate(15);
If the user input has too many commas, the $needles array could be too large (and the query too huge), so you can limit the search, for example, for only the first 5 elements in the array:
$needles = array_slice($needles, 0, 5);
I hope this can help somebody.
On your reply just now:
If you want it simpler, read this MySQL documentation: https://dev.mysql.com/doc/refman/5.7/en/regexp.html.
Basically, in a file you could grep for [,]?blaue hose[,]? to find: an optional comma, the string 'blaue hose', and an optional comma.
The more solid solution would be my initial answer:
You could actually create a keyword table, depending on your products table, where each keyphrase/keyword
is in one column by itself, and even lay an index on the keyphrase/keyword. I explain the principle here:
Optimising LIKE expressions that start with wildcards
And, to take your example as input - here is how I do that in Vertica. Many databases offer a function that returns
the n-th part/token of a string delimited by a character of your choice. In Vertica, it's SPLIT_PART().
MySQL, unfortunately, has no direct equivalent of that function, and you would have to convert the Common Table Expressions in the WITH clauses below to in-line SELECTs (SELECT ... FROM (SELECT ... ) AS foo(col1,col2,col3) ...). There is also a suggestion here from Daniel Vassallo on how to tackle it:
Split value from one field to two
In Vertica, it would look like this:
WITH
-- input
products(prod_id,keywords) AS (
SELECT 1001,'auto,cabrio,frischluft'
UNION ALL SELECT 1002,'hose,jeans,blaue hose,kleidung'
)
,
-- index to get the n-th part of the comma delimited string
max_keyword_count(idx) AS (
SELECT 1
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5
)
SELECT
prod_id
, idx
, TRIM(SPLIT_PART(keywords,',',idx)) AS keywords
FROM products
CROSS JOIN max_keyword_count
WHERE SPLIT_PART(keywords,',',idx) <> ''
ORDER BY
prod_id
, idx
;
prod_id|idx|keywords
1,001| 1|auto
1,001| 2|cabrio
1,001| 3|frischluft
1,002| 1|hose
1,002| 2|jeans
1,002| 3|blaue hose
1,002| 4|kleidung
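Once the keywords live in their own table, one row per keyword as in the output above, the search becomes an exact match on an indexed column. The following is a minimal sketch of that idea in Python, with SQLite standing in for the real database; the table name product_keywords, the index name and the helper functions are assumptions for the example. It returns only products that carry every entered keyword.

```python
import sqlite3


def demo_connection():
    """In-memory database holding the two sample products from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (prod_id INTEGER PRIMARY KEY, keywords TEXT)")
    conn.executemany(
        "INSERT INTO products VALUES (?, ?)",
        [(1001, "auto,cabrio,frischluft"), (1002, "hose,jeans,blaue hose,kleidung")],
    )
    return conn


def build_keyword_table(conn):
    """Split each comma-separated keyword list into one row per keyword."""
    conn.execute("CREATE TABLE product_keywords (prod_id INTEGER, keyword TEXT)")
    conn.execute("CREATE INDEX idx_product_keywords ON product_keywords (keyword)")
    for prod_id, keywords in conn.execute("SELECT prod_id, keywords FROM products").fetchall():
        for keyword in keywords.split(","):
            keyword = keyword.strip()
            if keyword:
                conn.execute("INSERT INTO product_keywords VALUES (?, ?)", (prod_id, keyword))


def find_products(conn, search):
    """Return ids of products that have ALL comma-separated keywords in `search`."""
    needles = sorted({part.strip() for part in search.split(",") if part.strip()})
    if not needles:
        return []
    placeholders = ",".join("?" * len(needles))
    sql = (
        "SELECT prod_id FROM product_keywords "
        f"WHERE keyword IN ({placeholders}) "
        "GROUP BY prod_id HAVING COUNT(DISTINCT keyword) = ? "
        "ORDER BY prod_id"
    )
    return [row[0] for row in conn.execute(sql, needles + [len(needles)])]


conn = demo_connection()
build_keyword_table(conn)
print(find_products(conn, "auto,frischluft"))
print(find_products(conn, "hose,blaue hose"))
```

Dropping the HAVING clause turns this into an "any keyword" search, which is closer to what the orWhere loop in the earlier answer does.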

Universal MySQL query

I would like to implement a filter in a database-driven webpage. Some of the options may be selected and some may not, and the filter can change between requests. The question is: is there any way to build a general query like:
SELECT * FROM database.table WHERE name='$name' AND author='$author' AND pages='$pages' AND font='$font';
One time the user might just want to choose all books from a particular author, but doesn't care about any other limitations. Another time the user might want to get all books with the same name, etc. The thing is, can I in that case just pass NULL or something like that for pages=$pages if I don't care how many pages it will have, while still using the same query for other possible filters set by the user?
SELECT is the universal MySQL query you're looking for.
As far as I know, you can't disable a specific WHERE condition by passing a null or something else.
What you can do though, is to remove the condition from your query when you don't need it. Meaning you will have to build your query dynamically depending on the filters you'll want.
You'd solve that in PHP by constructing the query dynamically.
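$params = [];
// Only these whitelisted column names ever reach the SQL string.
foreach ( [ 'name', 'author', 'pages', 'font' ] as $p ) {
    if ( ! empty( $_REQUEST[$p] ) ) {
        $params[$p] = $_REQUEST[$p];
    }
}
$sql = "SELECT * FROM database.table";
if ( $params ) {
    // Skip the WHERE clause entirely when no filter is set.
    $sql .= " WHERE " . implode( " AND ", array_map( function($k) { return "$k=?"; }, array_keys($params) ) );
}
$sth = $db->prepare( $sql );
$sth->execute( array_values( $params ) );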
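For comparison, a single fixed query can also be written so that a filter passed as NULL is simply ignored, by pairing every condition with an IS NULL check on its parameter. The sketch below shows the pattern in Python with SQLite standing in for MySQL; the books table, its rows and the function names are hypothetical. Whether this or the dynamically built query is preferable depends on the database, since such catch-all conditions may make less use of indexes.

```python
import sqlite3

# Each filter is skipped when its parameter is NULL.
FILTER_SQL = """
SELECT name FROM books
WHERE (:name   IS NULL OR name   = :name)
  AND (:author IS NULL OR author = :author)
  AND (:pages  IS NULL OR pages  = :pages)
  AND (:font   IS NULL OR font   = :font)
ORDER BY name
"""


def demo_connection():
    """In-memory table with a few hypothetical books."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE books (name TEXT, author TEXT, pages INTEGER, font TEXT)")
    conn.executemany(
        "INSERT INTO books VALUES (?, ?, ?, ?)",
        [
            ("Book A", "Author X", 100, "Serif"),
            ("Book B", "Author X", 250, "Sans"),
            ("Book C", "Author Y", 100, "Serif"),
        ],
    )
    return conn


def find_books(conn, name=None, author=None, pages=None, font=None):
    """Run the same query for any combination of filters; None means 'don't care'."""
    params = {"name": name, "author": author, "pages": pages, "font": font}
    return [row[0] for row in conn.execute(FILTER_SQL, params)]


conn = demo_connection()
print(find_books(conn, author="Author X"))
print(find_books(conn, pages=100, font="Serif"))
```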

Laravel Eloquent: how to filter multiple and/or criteria single table

I am making a real estate related app and I've been having a hard time figuring out how to set up the query so that it would return "Only Apartments or Duplexes within selected areas". I'd like the user to be able to find multiple types of property in multiple selected quadrants of the city.
I have a database with a column "type" which is either "Apartment", "House", "Duplex" or "Mobile".
In another column I have quadrant_main with values: "NW", "SW", "NE", "SE".
My code works when there is only 1 quadrant selected, but when I select multiple quadrants, I seem to get results which include ALL the property types from the second, third or fourth quadrant, instead of only "Apartment" and "Duplex" or whatever types the user selects... Any help will be appreciated! Thanks in advance.
My controller function looks like this:
public function quadrants()
{
$input = \Request::all();
$currentPage = null;
$column = "price";
$order = "desc";
//
// Looks like the input is like 0 => { key: value } ...
// (an Array of key/value pairs)
$q = Listing::where('status','=','Active')->where(function($query) {
$input = \Request::all();
$currentPage = null;
$typeCount = 0;
$quadrantCount = 0;
foreach( $input as $index => $object ) {
$tempObj = json_decode($object);
$key = key((array)$tempObj);
$val = current((array)$tempObj);
if ( $key == "type" ) {
if ( $typeCount > 0 ) {
$query->orWhere('type', '=', $val );
}
else {
$query->where('type', '=', $val );
$typeCount++;
}
}
if ( $key == "quadrant_main" ) {
if ( $quadrantCount > 0 ) {
$query->orWhere('quadrant_main', '=', $val );
}
else {
$query->where('quadrant_main', '=', $val );
$quadrantCount++;
}
}
// else {
// $query->orWhere($key,$val);
// }
}
if( $currentPage ) {
//Force Current Page to Page of Val
Paginator::currentPageResolver(function() use ($currentPage) {
return $currentPage;
});
}
});
$listings = $q->paginate(10);
return $listings;
}
Looking at your question, it's a bit confusing and not much is given to answer definitively. Probable causes of your troubles may be bad data in the database, or maybe corrupted user input.
Disclaimer: Please note that chances are my answer will not work for you at all. In that case please provide more information and we will work things out.
There is one thing that I think you have overlooked and thus you are getting awry results. First let me assume a few things.
I think a sample user input should look like this:
array(
0: '{type: Apartment}',
1: '{type: Duplex}',
2: '{quadrant_main: NW}',
3: '{quadrant_main: SW}',
)
What the user meant was give me any apartment or duplex which belongs in NW or SW region.
So after your loop is over, the final SQL statement should be something like this:
select * from listings where status = 'Active' and (type = 'Apartment' or type = 'Duplex' and quadrant_main = 'NW' or quadrant_main = 'SW');
Oh, and while we are on the topic of SQL, you can also log the actual generated SQL query in Laravel so you can see what the final SQL was. If you can post it here, it would help a lot. Look here.
Because AND binds more tightly than OR, what this query will actually produce is this:
Select any listing which is active and:
1. Type is an apartment, or,
2. Type is a duplex and quadrant is NW, or,
3. Quadrant is SW
So assuming you have a database like this:
id|type|quadrant_main
=====================
1|Apartment|NW
2|Apartment|SW
3|Apartment|NE
4|Apartment|SE
5|Duplex|NW
6|Duplex|SW
7|Duplex|NE
8|Duplex|SE
9|House|NW
10|House|SW
11|House|NE
12|House|SE
You will receive rows 1, 2, 3, 4, 5, 6 and 10 in the result set: every apartment regardless of quadrant, the NW duplex, and everything in SW regardless of type. This result set is obviously wrong, because the AND condition only ties the duplex check to NW instead of combining the two groups.
The correct SQL query would be:
select * from listings where status = 'Active' and (type = 'Apartment' or type = 'Duplex') and (quadrant_main = 'NW' or quadrant_main = 'SW');
So structure your Laravel 5 app such that it produces this kind of SQL query. Instead of trying to cram everything into one loop, have two loops, each inside its own nested where closure so that it gets its own pair of parentheses. One loop should only handle type and another loop should only handle quadrant_main. This way you will have the necessary AND condition in the right places.
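The difference between the two statements can be checked against the sample table above. The following sketch uses Python with an in-memory SQLite database, where AND also takes precedence over OR; the helper names are made up for the example, and filter_listings shows one possible way to build the grouped form with IN lists.

```python
import sqlite3

ROWS = [
    (1, "Apartment", "NW"), (2, "Apartment", "SW"), (3, "Apartment", "NE"), (4, "Apartment", "SE"),
    (5, "Duplex", "NW"), (6, "Duplex", "SW"), (7, "Duplex", "NE"), (8, "Duplex", "SE"),
    (9, "House", "NW"), (10, "House", "SW"), (11, "House", "NE"), (12, "House", "SE"),
]

UNGROUPED = (
    "SELECT id FROM listings WHERE status = 'Active' AND "
    "(type = 'Apartment' OR type = 'Duplex' AND quadrant_main = 'NW' OR quadrant_main = 'SW') "
    "ORDER BY id"
)
GROUPED = (
    "SELECT id FROM listings WHERE status = 'Active' AND "
    "(type = 'Apartment' OR type = 'Duplex') AND (quadrant_main = 'NW' OR quadrant_main = 'SW') "
    "ORDER BY id"
)


def demo_connection():
    """In-memory copy of the sample table; every listing is active."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE listings (id INTEGER, type TEXT, quadrant_main TEXT, status TEXT)")
    conn.executemany("INSERT INTO listings VALUES (?, ?, ?, 'Active')", ROWS)
    return conn


def ids(conn, sql, params=()):
    return [row[0] for row in conn.execute(sql, params)]


def filter_listings(conn, types, quadrants):
    """Build one parenthesised group per filter; an empty list skips that filter."""
    sql, params = "SELECT id FROM listings WHERE status = 'Active'", []
    for column, values in (("type", types), ("quadrant_main", quadrants)):
        if values:
            sql += f" AND {column} IN ({','.join('?' * len(values))})"
            params += list(values)
    return ids(conn, sql + " ORDER BY id", params)


conn = demo_connection()
print(ids(conn, UNGROUPED))  # too many rows
print(ids(conn, GROUPED))    # only apartments and duplexes in NW or SW
print(filter_listings(conn, ["Apartment", "Duplex"], ["NW", "SW"]))
```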
As a side note:
Never use user input directly. Always sanitize it first.
It's not a best practice to put all your logic in the controller. Use the repository pattern. See here.
Multiple where clauses are generally applied via Criteria. Check that out in the above linked repository pattern.
Your code logic is very complicated, and unnecessarily so. Instead of sending JSON objects, simply send the state of the checkboxes. Don't try to generalize the function by looping. Instead handle all checkboxes one by one, i.e. is "Apartments" selected? If yes, add that to your clause; if not, don't.
