Help setting up logic for advanced search parameters in PHP - php

I would like to create an advanced search form much like a job site would have one that would include criteria such as keyword, job type, min pay, max pay, category,sub category etc...
My problem is deciding on how best to set this up so if I have to add categories to the parameters I'm not having to modify a whole bunch of queries and functions etc...
My best guess would be to create some sort of associative array out of all of the potential parameters and reuse this array but for some reason I feel like it's a lot more complex than this. I am using CodeIgniter as an MVC framework if that makes any difference.
Does anybody have a suggestion as how best to set this up?
Keep in mind I will need to be generating links such as index.php?keyword=designer&job_type=2&min_pay=20&max_pay=30
I hope my question is not to vague.

I don't know if it's what you need, but I usually create some search class.
<?php
$search = new Search('people');
$search->minPay(1000);
$search->maxPay(4000);
$search->jobType('IT');
$results = $search->execute();
foreach ($results as $result)
{
//whatever you want
}
?>
You can have all this methods, or have some mapping at __set() between method name and database field. The parameter passed to the constructor is the table where to do the main query. On the methods or mapping in the __set(), you have to take care of any needed join and the fields to join on.

There are much more 'enterprise-level' ways of doing this, but for a small site this should be OK. There are lots more ActiveRecord methods you can use as necessary. CI will chain them for you to make an efficient SQL request.
if($this->input->get('min_pay')) {
$this->db->where('min_pay <', $this->input->get('min_pay'));
}
if($this->input->get('keyword')) {
$this->db->like($this->input->get('keyword'));
}
$query = $this->db->get('table_name');
foreach ($query->result() as $row) {
echo $row->title;
}

To use Search criterias in a nice way you should use Classes and Interfaces.
Let's say for example you define a ICriteria interface. Then you have different subtypes (implementations) of Criteria, TimeCriteria, DateCriteria, listCriteria, TextSearch Criteria, IntRange Criteria, etc.
What your Criteria Interface should provide is some getter and setter for each criteria, you'll have to handle 3 usages for each criteria:
how to show them
how to fill the query with the results
how to save them (in session or database) for later usage
When showing a criteria you will need:
a label
a list of available operators (in, not in, =, >, >=, <, <=, contains, does not contains) -- and each subtypes can decide which part of this list is implemented
an entry zone (a list, a text input, a date input, etc)
Your main code will only handle ICriteria elements, ask them to build themselves, show them, give them user inputs, ask them to be saved or loop on them to add SQL criteria on a sql query based on their current values.
Some of the Criteria implementations will inherits others, some will only have to define the list of operators available, some will extends simple behaviors to add rich UI (let's say that some Date elements should provide a list like 'in the last day', 'in the last week', 'in the last year', 'custom range').
It can be a very good idea to handle the SQL query as an object and not only a string, the way Zend_Db_Select works for example. As each Criteria will add his part on the final query, and some of them could be adding leftJoins or complex query parts.

Search queries can be a pain sometimes, but not as big of a pain as pagination. Luckily, CodeIgniter helps you out a bit with this with their pagination library.
I think you're on the right track. The basic gist, I would say, is:
Grab your GET variables from the URL.
Create your database query (sanitize the GET values).
Generate the results set.
Do pagination.
Now, CodeIgniter destroys the GET variable by default, so make sure you enable http query strings in your config file.
Good luck!

I don't know anything about CodeIgniter, but for the search application I used to support, we had drop-down combo-boxes with category options stored in a database table and would rely on application and database cacheing to avoid round-trips each time the page was displayed (an opportunity for learning in itself ;-). When you update the table of job_type, location, etc. the new values will be displayed in your combo-box.
It depends on
how many categories you intend to have drop-down lists
how often you anticipate having to update the list
how dynamic you need it to be.
And the size of your web-site and overall activity are factors you will have to consider.
I hope this helps.
P.S. as you appear to be a new user, if you get an answer that helps you please remember to mark it as accepted, or give it a + (or -) as a useful answer

A pagination class is a good foundation. Begin by collecting query string variables.
<?php
// ...in Pagination class
$acceptableVars = array('page', 'delete', 'edit', 'sessionId', 'next', 'etc.');
foreach($_GET as $key => $value) {
if(in_array($key, $acceptableVar)) {
$queryStringVars[] = $key . '=' . $value;
}
}
$queryString = '?' . implode('&', $queryStringVars);
$this->nextLink = $_SEVER['filename'] . $queryString;
?>

Duplicate the searchable information into another table. Convert sets of data into columns having two values only like : a search for color=white OR red can become a search on 10 columns in a table each containing one color with value 1 or 0. The results can be grouped after so you get counters for each search filter.
Convert texts to full text searches and use MATCH and many indexes on this search table. Eventually combine text columns into one searchable column. The results of a seach will be IDs which you can then convert into the records with IN() condition in SQL

Agile Toolkit allows to add filters in the following way (just to do a side-by-side comparison with CodeIgniter, perhaps you can take some concepts over):
$g=$this->add('Grid');
$g->addColumn('text','name');
$g->addColumn('text','surname');
$g->setSource('user');
$conditions=array_intersect($_GET, array_flip(
array('keyword','job_type','min_pay'));
$g->dq->where($conditions);
$g->dq is a dynamic query, where() escapes values passed from the $_GET, so it's safe to use. The rest, pagination, column display, connectivity with MVC is up to the framework.

function maybeQuote($v){
return is_numeric($v) ?: "'$v'";
}
function makePair($kv){
+-- 7 lines: $a = explode('=', $kv);
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
}
function makeSql($get_string, $table){
+-- 10 lines: $data = explode('&', $get_string);
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
}
$test = 'lloyd=alive&age=40&weather=hot';
$table = 'foo';
print_r(makeSql($test, $table));

Related

Is it worth to save keyword <-> link relation into "hastable" like structure in mysql?

im working on PHP + MySQL application, which will crawl HDD/shared drive and index all files and directories into database, to provide "fulltext" search on it. So far im doing well, but im stuck on question, if i chosed good way how to store data into database.
On picture below, you can see part schema of my database. Thought is, that i'm saving domain (which represents part of disk which i wana to index) then there are some link(s) (which represents files and folder (with content, filepath, etc) then i have table to store sole (uniq) keywords, which i find in file/folder name or content.
And finaly, i have 16 tables linkkeyword to store relations between links and keywords. I have 16 of them because i thought it might be good to make something like hashtable, because im expecting high number of relations between link <-> keyword. (so far for 15k links and 400k keywords i have about 2.5milion of linkkeyword records). So to avoid storing so much data into one table (and later search above them) i thought that this hastable can be faster. It works like i wana to search for word, i compute it md5 and look at first character of md5 and then i know to which linkkeyword table i should use. So there is only about 150~200k records in each linkkeyword table (against 2.5milions)
So there im curious, if this approach can be of any use, or if will be better to store all linkkeyword information to single table and mysql will take care of it (and to how much link<->keyword it can work?)
So far this was great solution to me, but i crushed hard when i tried to implement regular-expression search. So user can use e.g. "tem*" which can result in temp, temporary, temple etc... In normal way when searching for word, i will conpute in md5 hash and then i know to which linkkeyword table i need to look. But for regular expression i need to get all keywords from keywords table (which matches regular expression) and then process them one by one.
Im also attaching part of code for normal keyword search
private function searchKeywords($selectedDomains) {
$searchValues = $this->searchValue;
$this->resultData = array();
foreach (explode(" ", $searchValues) as $keywordName) {
$keywordName = strtolower($keywordName);
$keywordMd5 = md5($keywordName);
$selection = $this->database->table('link');
$results = $selection->where('domain.id', $selectedDomains)->where('domain.searchable = ?', '1')->where(':linkkeyword' . $keywordMd5[0] . '.keyword.keyword LIKE ?', $keywordName)
->select('link.*,:linkkeyword' . $keywordMd5[0] . '.weight,:linkkeyword' . $keywordMd5[0] . '.keyword.keyword');
foreach ($results as $result) {
$keyExists = array_key_exists($result->linkId, $this->resultData);
if ($keyExists) {
$this->resultData[$result->linkId]->updateWeight($result->weight);
$this->resultData[$result->linkId]->addKeyword($result->keyword);
} else {
$domain = $result->ref('domain');
$linkClass = new search\linkClass($result, $domain);
$linkClass->updateWeight($result->weight);
$linkClass->addKeyword($result->keyword);
$this->resultData[$result->linkId] = $linkClass;
}
}
}
}
and regular expression search function
private function searchRegexp($selectedDomains) {
//get stored search value
$searchValues = $this->searchValue;
//replace astering and exclamation mark (counted as characters for regular expression) and replace them by their mysql equivalent
$searchValues = str_replace("*", "%", $searchValues);
$searchValues = str_replace("!", "_", $searchValues);
// empty result array to prevent previous results to interfere
$this->resultData = array();
//searched phrase can be multiple keywords, so split it by space and get results for each keyword
foreach (explode(" ", $searchValues) as $keywordName) {
//set default link result weight to -1 (default value)
$weight = -1;
//select all keywords, which match searched keyword (or its regular expression)
$keywords = $this->database->table('keyword')->where('keyword LIKE ?', $keywordName);
foreach ($keywords as $keyword) {
//count keyword md5 sum to determine which table should be use to match it links
$md5 = md5($keyword->keyword);
//get all link ids from linkkeyword relation table
$keywordJoinLink = $keyword->related('linkkeyword' . $md5[0])->where('link.domain.searchable','1');
//loop found links
foreach ($keywordJoinLink as $link) {
//store link weight, for later result sort
$weight = $link->weight;
//get link ID
$linkId = $link->linkId;
//check if link already exists in results, to prevent duplicity
$keyExists = array_key_exists($linkId, $this->resultData);
//if link already exists in result set, just update its weight and insert matching keyword for later keyword tag specification
if ($keyExists) {
$this->resultData[$linkId]->updateWeight($weight);
$this->resultData[$linkId]->addKeyword($keyword->keyword);
//if link isnt in result yet, insert it
} else {
//get link reference
$linkData = $link->ref('link', 'linkId');
//get information about domain, to which link belongs (location, flagPath,...)
$domainData = $linkData->ref('domain', 'domainId');
//if is domain searchable and was selected before search, add link to result set. Otherwise ignore it
if ($domainData->searchable == 1 && in_array($domainData->id, $selectedDomains)) {
//create new link instance
$linkClass = new search\linkClass($linkData, $domainData);
//insert matching keyword to links keyword set
$linkClass->addKeyword($keyword->keyword);
//set links weight
$linkClass->updateWeight($weight);
//insert link into result set
$this->resultData[$linkId] = $linkClass;
}
}
}
}
}
}
Your question is mostly one of opinion, so you may want to include the criteria that allow us to answer "worth it' more objectively.
It appears you've re-invented the concept of database sharding (though without distributing your data across multiple servers).
I assume you are trying to optimize search time; if that's the case, I'd suggest that 2.5 million records on a modern hardware is not a particularly big performance challenge, as long as your queries can use an index. If you can't use an index (e.g. because you're doing a regular expression search), sharding will probably not help at all.
My general recommendation with database performance tuning is to start with the simplest possible relational solution, keep tuning that until it breaks your performance goals, then add more hardware, and only once you've done that should you go for "exotic" solutions like sharding.
This doesn't mean using prayer as a strategy. For performance-critical application, I typically build a test database, where I can experiment with solutions. In your case, I'd build a database with your schema without the "sharding" tables, and then populate it with test data (either write your own population routines, or use a tool like DBMonster). Typically, I'd go for at least double the size I expect in production. You can then run and tune queries to prove, one way or another, whether your schema is good enough. It sounds like a lot of work, but it's much less work than your sharding solution is likely to bring along.
There are (as #danFromGermany comments) solutions that are optimized for text serach, and you could use MySQL fulltext search features rather than regular expressions.

php/mysql cascading filtering

I would like to create some chaining filters that a user can use to narrow down a list of data from the database.
For instance, I have 5 filters types:
by date
by location
by type
by topic
by keyword
The user is free to use all, some or none of these filters at once.
My first impression is that I would have to store a variable with all the "AND's" and "OR's" of my WHERE clause stacked one after the other but it seems tedious as I may have to programmatically add/delete some filter types later.
Is there a way not to use the whole mysql data that I filter once with one huge request but rather chain the targeted requests that I can add/withdraw if necessary?
I mean, can I filter one after the other the result of each previous filtered result?
For instance, instead of doing this:
SELECT * FROM mytable WHERE (A=3 AND B=4 AND ((C>date1) OR (D<date2)) AND D LIKE '%sample%' )
Is it possible to do something that would achieve this idea:
request 1 = SELECT * FROM mytable WHERE A=3
request 2 = SELECT * FROM "result of request 1" WHERE B=4
request 3 = SELECT * FROM "result of request 2" WHERE C>date1 OR D<date2
request 4 = SELECT * FROM "result of request 3" WHERE E LIKE '%sample%'
The first approach is very hard to manage if the filters are meant to change over time while the second approach is just a matter of adding/withdrawing the desired filter to the chain.
Not to mention that the more filters you have, the more the first approach becomes unreadable and unmanageable.
This could be obtained in pure php if I decide to filter the data after retrieving the whole mysql table but is seems counter productive and more verbose while Mysql has all I need to filter the data directly and it seems useless to get the whole data from Mysql when I only need part of it at the end.
Thank you for your help.
I like the first scenario better, I assume you're submitting a form to decide what to filter. The way I would go about making it scalable and easier to read, is just name the form fields the same as your DB fields. Then you can either keep an array containing the fields you want to use, or make a db call to check if a field is one you want to use in the filter. a simple example.
$filterFields = array('startDate'=>'>',
'endDate'=>'<',
'location'=>'=',
'keyword'=>'LIKE');
function buildQuery($filterFields, $mysqli)
{
$partCount = 0;
$query = "SELECT * FROM mytable WHERE ";
foreach($_POST as $key=>$value)
{
//is the field one we want to use? Does it contain a value?
if(in_array($key, array_keys($filterFields) && !empty($value))
{
if($partCount > 1)
$query .= ' AND ';
if($filterFields[$key] == 'LIKE')
$query .= " `".$key."` LIKE '%".mysqli->real_escape_string($value)."%' ";
else
$query .= " `".$key."` ".$filterFields[$key]." '".mysqli->real_escape_string($value)."' ";
++$partCount;
}
}
return $query;
}
Hopefully that makes sense. If that's not doing what you want or you need clarification, let me know. Also this is not tested, just wrote it here.

mysql | PHP | Join within own table

i dont know if i am doing right or wrong, please dont judge me...
what i am trying to do is that if a record belongs to parent then it will have parent id assosiated with it.. let me show you my table schema below.
i have two columns
ItemCategoryID &
ItemParentCategoryID
Let Suppose a record on ItemCategoryID =4 belongs to ItemCategoryID =2 then the column ItemParentCategoryID on ID 4 will have the ID of ItemCategoryID.
I mean a loop with in its own table..
but problem is how to run the select query :P
I mean show all the parents and childs respective to their parents..
This is often a lazy design choise. Ideally you want a table for these relations or/and a set number of depths. If a parent_id's parent can have it's own parent_id, this means a potential infinite depth.
MySQL isn't a big fan of infinite nesting depths. But php don't mind. Either run multiple queryies in a loop such as Nil'z's1, or consider fetching all rows and sorting them out in arrays in php. Last solution is nice if you pretty much always get all rows, thus making MySQL filtering obsolete.
Lastly, consider if you could have a more ideal approach to this in your database structure. Don't be afraid to use more than one table for this.
This can be a strong performance thief in the future. An uncontrollable amount of mysql queries each time the page loads can easily get out of hands.
Try this:
function all_categories(){
$data = array();
$first = $this->db->select('itemParentCategoryId')->group_by('itemParentCategoryId')->get('table')->result_array();
if( isset( $first ) && is_array( $first ) && count( $first ) > 0 ){
foreach( $first as $key => $each ){
$second = $this->db->select('itemCategoryId, categoryName')->where_in('itemParentCategoryId', $each['itemParentCategoryId'])->get('table')->result_array();
$data[$key]['itemParentCategoryId'] = $each['itemParentCategoryId'];
$data[$key]['subs'] = $second;
}
}
print_r( $data );
}
I don't think you want/can to do this in your query since you can nest a long way.
You should make a getChilds function that calls itself when you retrieve a category. This way you can nest more than 2 levels.
function getCategory()
{
// Retrieve the category
// Get childs
$childs = $this->getCategoryByParent($categoryId);
}
function getCategorysByParent($parentId)
{
// Get category
// Get childs again.
}
MySQL does not support recursive queries. It is possible to emulate recursive queries through recursive calls to a stored procedure, but this is hackish and sub-optimal.
There are other ways to organise your data, these structures allow very efficient querying.
This question comes up so often I can't even be bothered to complain about your inability to use Google or SO search, or to offer a wordy explanation.
Here - use this library I made: http://codebyjeff.com/blog/2012/10/nested-data-with-mahana-hierarchy-library so you don't bring down your database

Mysql - PHP different values searches

I know how to perform mysql searches using for example the WHERE word. But my problem is that i need to search on different values, but these can vary in number. For example:
I can search for 3 variables Name, LastName, Age
BUT
I in other search, i can look for 2 variables Name, Age.
Is there a way to perform a MYSQL search with the same script, no matter the quantity of values i search.??
Ot it is a better practice to "force" the search of a fixed amount of variables.??
Thanks.!
Roberto
IMHO, it is far better to limit the search to a fixed number of variables. That way you are answering a specific question for a specific reason, not trying to fit a general answer to your specific question. Limiting the search criteria makes the statement(s) easier to debug and benchmark for performance.
Hope this helps.
Just use a variable for your search parameters and inject that into your query. Just ensure that in the function/method you put the variable into the proper format (which will depend on how you select the different values.)
SELECT *
FROM db
$variable;
There will be no WHERE clause seen, unless it is passed your values (meaning you can use this same query for a general search of the db) without fear of having an empty/required $variable.
Your $variable when constructed would need to have to have the WHERE clause in it, then each value you add, insert it (in a loop perhaps) in the proper format.
Hope this makes sense, if not let me know and I will try to clarify. This is the same method most people use when paginating (except they put the variable in the LIMIT instead of the WHERE)
EDIT:
Also make sure to properly sanitize your variable before injection.
Simple example of dynamically building a query:
$conditions = array();
if (...) {
$conditions['age'] = $age;
}
if (...) {
$conditions['name'] = $name;
}
...
if (!$conditions) {
die('No conditions supplied');
}
// if you're still using the mysql_ functions and haven't done so before:
$conditions = array_map('mysql_real_escape_string', $conditions);
foreach ($conditions as $field => &$value) {
$value = "`$field` = '$value'";
}
$query = 'SELECT ... WHERE ' . join(' AND ', $conditions);
It's really not hard to dynamically cobble together the exact query you want to create. Just be careful you don't mess up the SQL syntax or open yourself to more injection vulnerabilities. You may want to look at database abstraction layers, which pretty much allow you to pass a $conditions array into a function which will construct the actual query from it, more or less the way it's done above.

how to implement the an effective search algorithm when using php and a mysql database?

I'm new to web design, especially backend design so I have a few questions about implementing a search function in PHP. I already set up a MySQL connection but I don't know how to access specific rows in the MySQL table. Also is the similar text function implemented correctly considering I want to return results that are nearly the same as the search term? Right now, I can only return results that are the exact same or it gives "no result." For example, if I search "tex" it would return results containing "text"? I realize that there are a lot of mistakes in my coding and logic, so please help if possible. Event is the name of the row I am trying to access.
$input = $_POST["searchevent"];
while ($events = mysql_fetch_row($Event)) {
$eventname = $events[1];
$eventid = $events[0];
$diff = similar_text($input, $event, $hold)
if ($hold == '100') {
echo $eventname;
break;
else
echo "no result";
}
Thank you.
I've noticed some of the comments mentioned more efficient ways of performing the search than with the "similar text" function, if I were to use the LIKE function, how would it be implemented?
A couple of different ways of doing this:
The faster one (performance wise) is:
select * FROM Table where keyword LIKE '%value%'
The trick in this one is the placement of the % which is a wildcard, saying either search everything that ends or begins with this value.
A more flexible but (slightly) slower one could be the REGEXP function:
Select * FROM Table WHERE keyword REGEXP 'value'
This is using the power of regular expressions, so you could get as elaborate as you wanted with it. However, leaving as above gives you a "poor man's Google" of sorts, allowing the search to be bits and pieces of overall fields.
The sticky part comes in if you're trying to search names. For example, either would find the name "smith" if you searched SMI. However, neither would find "Jon Smith" if there was a first and last name field separated. So, you'd have to do some concatenation for the search to find either Jon OR Smith OR Jon Smith OR Smith, Jon. It can really snowball from there.
Of course, if you're doing some sort of advanced search, you'll have to condition your query accordingly. So, for instance, if you wanted to search first, last, address, then your query would have to test for each:
SELECT * FROM table WHERE first LIKE '%value%' OR last LIKE '%value%' OR address LIKE '%value'
Look at below example :
$word2compare = "stupid";
$words = array(
'stupid',
'stu and pid',
'hello',
'foobar',
'stpid',
'upid',
'stuuupid',
'sstuuupiiid',
);
while(list($id, $str) = each($words)){
similar_text($str, $word2compare, $percent);
if($percent > 90) // Change percentage value to 80,70,60 and see changes
print "Comparing '$word2compare' with '$str': ";
}
You can check with $percent parameter for how strong match you want to apply.

Categories