Query using three terms in SQL/PHP - php

I'm using WordPress, but this question is more pertaining to the SQL involved. I'll gladly move it if I need to.
I'm working on http://www.libertyguide.com/jobs and I'm trying to alter the filtering mechanics. Currently it's a global OR query.
Anyways, I have three filtering lists, and I'm storing what's selected into three strings (interests, type, experience) in the following way:
"( $wpdb->terms.slug = 'webdevelopment' OR $wpdb->terms.slug = 'journalism' OR ... ) AND"
It's populated by whatever is selected in my filtering lists.
When it comes down to it, I have this as a basic query (I'm leaving out the LEFT JOINS):
Before:
SELECT * FROM $wpdb->posts WHERE ($wpdb->terms.slug = 'fromlist1'
OR $wpdb->terms.slug = 'fromlist2' OR $wpdb->terms.slug = 'fromlist3')
AND $wpdb->term_taxonomy.taxonomy = 'jobtype'...
After:
SELECT * FROM $wpdb->posts WHERE
($wpdb->terms.slug = 'fromlist1' OR $wpdb->terms.slug = 'fromlist1again')
AND ($wpdb->terms.slug = 'fromlist2' OR $wpdb->terms.slug = 'fromlist2again')
AND ($wpdb->terms.slug = 'fromlist3' OR $wpdb->terms.slug = 'fromlist3again')
AND $wpdb->term_taxonomy.taxonomy = 'jobtype'...
So essentially I want to go from an
OR filter
to
an AND filter with OR filtering inbetween.
My new filtering only works when one item overall is selected, but returns nothing when I select more than one thing (that I know would match up with a few posts).
I've thought through the logic and I don't see anything wrong with it. I know nothing is wrong with anything else, so it has to be the query itself.
Any step in the right direction would be greatly appreciated. Thanks!
UPDATE
From the confusion, basically I have this:
"SELECT ...... WHERE $terms ..."
but I WANT
"SELECT ....... WHERE $interests AND $type AND $experience"
I don't want to have it filter $interest[1] OR $interest[2] OR $type[1] OR $experience[1], but instead want it to filter ($interest[1] OR $interest[2]) AND ($type[1]) AND ($experience[1])
I hope this makes more sense
*UPDATE 2*
Here's and example:
In my interests list, I select for example three things: WebDevelopment, Academia, Journalism.
In my type list, I choose two things: Fulltime, Parttime
In my experience list, I choose three things: Earlycareer, Midcareer, Latecareer.
When I run my query, I want to make sure that each record has AT LEAST one of each of the three lists. Possible Results: (WebDevelopment, Parttime, Midcareer), (Academia, Fulltime, Earlycareer, Midcareer).
NOT A RESULT: (Journalism, Earlycareer) - missing fulltime or parttime
I really hope this clears it up more. I'm willing to give compensation if I can get this working correctly.

Okay, I'll take a shot at this:
SELECT * FROM $wpdb->posts WHERE
(
$wpdb->terms.slug IN ('$interest1', '$interest2') AND
$wpdb->terms.slug IN ('$type1', '$type2') AND
$wpdb->terms.slug IN ('$exp1', '$exp2')
)
AND $wpdb->term_taxonomy.taxonomy = 'jobtype'
The IN keyword will return true if any member of the set matches.

I think you're looking for a WHERE category IN (comma, seperated, list, of, values) that you can generate dynamically from the form. If you combine it with the other categories, you can require them to select something from each with...
WHERE category1 IN (a, comma, seperated, list, of, values)
AND category2 IN (another, list, of, values)
AND ...
Which will only return a value if there is something selected from each category and will return nothing if any of the selection lists are empty; actually it may well kick out an error, so I would also generate the query dynamically if there is any content whatsoever for a given category.
if (!empty($arrayOfCategory1)) {
//sanitize input logic here
$Category[1] = 'category1 IN ('. implode(', ', $arrayOfCategory1) .')';
} else {
$Category[1] = '';
}
You concatenate the resultant string together and build the query with that. The WHERE 1=1 trick is problematic because if nothing is chosen, everything in the database will match, so I strongly recommend going through the process of adding the AND operators properly.
EDIT: it occurs to me that if you build the conditional statements as an array, you can implode those with ' AND ' and get the query in a fairly small number of lines of code.

Sort of confused a bit by what you are saying but if I wanted to build a filter I would be dynamically generating the SQL query based on the submitted filter values. Something like:
$sql = "SELECT * FROM $wpdb->posts WHERE 1=1";
if ( !empty($interest) ) { // they ticked the interested in ??? checkbox
$sql .= " AND $wpdb->terms.slug = $interest"
}
Obviously you will need to filter and escape any values that have been submitted.

Related

Efficient 2 dimensional array search in MySql

I am trying to design an application and part of it is to show users new articles in different categories after the last visit of the user to the webapp. To this I use MySql and have a table that keeps track of last visits and I can query the table to get a php array like below:
$array =[[user1,category1, datetime1],[user1,category2, datetime2],[user1,category3,datetime3]];
Where user is the user id and datetime is the visited datetime and category is the article category.
Having the setup above, I am trying to get new articles from the article table where the publish date is after user last visited to categories.
I can achieve this by multiple OR in a query like below, however it is not really a good and nice looking query, and probable not scalable. Is there any other way of doing this which is simpler and faster?
$multiwhere=[];
foreach($array as $a){
$multiwhere[]="select article_id from articles where category=".$a[1]." and publish_date>".$a[2];
}
And the final query would be like this:
"Select * from articles where article_id in (".implode(" or ".$multiwhere.")";
I deeply appreciate any suggestion to improve the query above.
Your query is almost correct, apart from the fact that you first retrieve all the article_id you want, and then use them to query for those articles. You can do that in one step, like so:
$multiwhere = [];
foreach ($array as $a) {
$multiwhere[] = "(category = " . $a[1] . " AND publish_date >= " . $a[2] .")";
}
$query = "SELECT * FROM articles";
if (count($multiwhere) > 0) {
$query = " WHERE " . implode(" OR ", $multiwhere);
}
One query will do.
I kept the way you use the $array, but it looks weird to me. Especially around publish_date. I cannot change that because I don't know the type of the field. And, of course, $array is quite a bad name. It tells you what the type of the variable is, not what it contains, as it should. A better name would be: $lastCategoryVisits, or something like that. Your loop should look something like this:
foreach ($lastCategoryVisits as $lastCategoryVisit) {
$category = $lastCategoryVisit["category"];
$lastVisit = $lastCategoryVisit["lastVisit"];
$QueryConditions[] = "(category = '$category' AND publish_date >= '$lastVisit')";
}
Don't be afraid to write out what your code actually does. It might be a bit longer, but now you can see what is going on. This will not slow down the execution of your code at all.
Finally, it would be better to always use prepared statements to prevent the possibility of SQL-injection. If you get into the habit of always doing this you don't have to use excuses like: "It is not important in this project.", "I'll to it later when the code works." or "The data for this query doesn't come from an user.".

writing a second query based on results of the first - Not working

*MySQL will be upgraded later.
Preface: Authors can register in two languages and, for various additional reasons, that meant 2 databases. We realize that the setup appears odd in the use of multiple databases but it is more this abbreviated explanation that makes it seem so. So please ignore that oddity.
Situation:
My first query produces a recordset of authors who have cancelled their subscription. It finds them in the first database.
require_once('ConnString/FirstAuth.php');
mysql_select_db($xxxxx, $xxxxxx);
$query_Recordset1 = "SELECT auth_email FROM Authors WHERE Cancel = 'Cancel'";
$Recordset1 = mysql_query($query_Recordset1, $xxxxxx) or die(mysql_error());
$row_Recordset1 = mysql_fetch_assoc($Recordset1);
In the second db where they are also listed, (table and column names are identical) I want to update them because they cancelled. To select their records for updating, I want to take the first recordset, put it into an array, swap out the connStrings, then search using that array.
These also work.
$results = array();
do {
results[] = $row_Recordset1;
} while ($row_Recordset1 = mysql_fetch_assoc($Recordset1));
print_r($results);
gives me an array. Array ( [0] => Array ( [auth_email] => renault#autxxx.com ) [1] => Array ( [auth_email] => rinaldi#autxxx.com ) [2] => Array ( [auth_email] => hemingway#autxxx.com )) ...so I know it is finding the first set of data.
Here's the problem: The query of the second database looks for the author by auth_email if it is 'IN' the $results array, but it is not finding the authors in the 2nd database as I expected. Please note the different connString
require_once('ConnString/SecondAuth.php');
mysql_select_db($xxxxx, $xxxxxx);
$query_Recordset2 = "SELECT auth_email FROM Authors WHERE auth_email IN('$results')";
$Recordset2 = mysql_query($query_Recordset2, $xxxxxx) or die(mysql_error());
$row_Recordset2 = mysql_fetch_assoc($Recordset2);
The var_dump is 0 but I know that there are two records in there that should be found.
I've tried various combinations of IN like {$results}, but when I got to "'$results'", it was time to ask for help. I've checked all the available posts and none resolve my problem though I am now more familiar with the wild goose population.
I thought that since I swapped out the connection string, maybe $result was made null so I re-set it to the original connString and it still didn't find auth_email in $results in the same database where it certainly should have done.
Further, I've swapped out connStrings before with positive results, so... hmmm...
My goal, when working, is to echo the Recordset2 into a form with a do/while loop that will permit me to update their record in the 2nd db. Since the var_dump is 0, obviously this below isn't giving me a list of authors in the second table whose email addresses appear in the $results array, but I include it as example of what I want use to start the form in the page.
do {
$row_Recordset2['auth_email_addr '];
} while($row_Recordset2 = mysql_fetch_assoc($Recordset2));
As always, any pointer you can give are appreciated and correct answers are Accepted.
If you have a db user that has access to both databases and tables, just use a cross database query to do the update
UPDATE
mydb.Authors,
mydb2.Authors
SET
mydb.Authors.somefield = 'somevalue'
WHERE
mydb.Authors.auth_email = mydb2.Authors.auth_email AND
mydb2.Authors.Cancel= 'Cancel'
The IN clause excepts variables formated like this IN(var1,var2,var3)
You should use function to create a string, containing variables from this array.
//the simplest way to go
$string = '';
foreach($results as $r){
foreach($r as $r){
$string .= $r.",";
}
}
$string = substr($string,0,-1); //remove the ',' from the end of string
Its not tested, and obviously not the best way to go, but to show you the idea of your problem and how to handle it is this code quite relevant.
Now use $string instead of $results in query

text input (seperated by comma) mysql input as array

I have a form where I am trying to implement a tag system.
It is just an:
<input type="text"/>
with values separated by commas.
e.g. "John,Mary,Ben,Steven,George"
(The list can be as long as the user wants it to be.)
I want to take that list and insert it into my database as an array (where users can add more tags later if they want). I suppose it doesn't have to be an array, that is just what seems will work best.
So, my question is how to take that list, turn it into an array, echo the array (values separated by commas), add more values later, and make the array searchable for other users. I know this question seems elementary, but no matter how much reading I do, I just can't seem to wrap my brain around how it all works. Once I think I have it figured out, something goes wrong. A simple example would be really appreciated. Thanks!
Here's what I got so far:
$DBCONNECT
$artisttags = $info['artisttags'];
$full_name = $info['full_name'];
$tel = $info['tel'];
$mainint = $info['maininst'];
if(isset($_POST['submit'])) {
$tags = $_POST['tags'];
if($artisttags == NULL) {
$artisttagsarray = array($full_name, $tel, $maininst);
array_push($artisttagsarray,$tags);
mysql_query("UPDATE users SET artisttags='$artisttagsarray' WHERE id='$id'");
print_r($artisttagsarray); //to see if I did it right
die();
} else {
array_push($artisttags,$tags);
mysql_query("UPDATE users SET artisttags='$artisttags' WHERE id='$id'");
echo $tags;
echo " <br/>";
echo $artisttags;
die();
}
}
Create a new table, let's call it "tags":
tags
- userid
- artisttag
Each user may have multiple rows in this table (with one different tag on each row). When querying you use a JOIN operation to combine the two tables. For example:
SELECT username, artisttag
FROM users, tags
WHERE users.userid = tags.userid
AND users.userid = 4711
This will give you all information about the user with id 4711.
Relational database systems are built for this type of work so it will not waste space and performance. In fact, this is the optimal way of doing it if you want to be able to search the tags.

Problem: Writing a MySQL parser to split JOIN's and run them as individual queries (denormalizing the query dynamically)

I am trying to figure out a script to take a MySQL query and turn it into individual queries, i.e. denormalizing the query dynamically.
As a test I have built a simple article system that has 4 tables:
articles
article_id
article_format_id
article_title
article_body
article_date
article_categories
article_id
category_id
categories
category_id
category_title
formats
format_id
format_title
An article can be in more than one category but only have one format. I feel this is a good example of a real-life situation.
On the category page which lists all of the articles (pulling in the format_title as well) this could be easily achieved with the following query:
SELECT articles.*, formats.format_title
FROM articles
INNER JOIN formats ON articles.article_format_id = formats.format_id
INNER JOIN article_categories ON articles.article_id = article_categories.article_id
WHERE article_categories.category_id = 2
ORDER BY articles.article_date DESC
However the script I am trying to build would receive this query, parse it and run the queries individually.
So in this category page example the script would effectively run this (worked out dynamically):
// Select article_categories
$sql = "SELECT * FROM article_categories WHERE category_id = 2";
$query = mysql_query($sql);
while ($row_article_categories = mysql_fetch_array($query, MYSQL_ASSOC)) {
// Select articles
$sql2 = "SELECT * FROM articles WHERE article_id = " . $row_article_categories['article_id'];
$query2 = mysql_query($sql2);
while ($row_articles = mysql_fetch_array($query2, MYSQL_ASSOC)) {
// Select formats
$sql3 = "SELECT * FROM formats WHERE format_id = " . $row_articles['article_format_id'];
$query3 = mysql_query($sql3);
$row_formats = mysql_fetch_array($query3, MYSQL_ASSOC);
// Merge articles and formats
$row_articles = array_merge($row_articles, $row_formats);
// Add to array
$out[] = $row_articles;
}
}
// Sort articles by date
foreach ($out as $key => $row) {
$arr[$key] = $row['article_date'];
}
array_multisort($arr, SORT_DESC, $out);
// Output articles - this would not be part of the script obviously it should just return the $out array
foreach ($out as $row) {
echo '<p>'.$row['article_title'].' <i>('.$row['format_title'].')</i><br />'.$row['article_body'].'<br /><span class="date">'.date("F jS Y", strtotime($row['article_date'])).'</span></p>';
}
The challenges of this are working out the correct queries in the right order, as you can put column names for SELECT and JOIN's in any order in the query (this is what MySQL and other SQL databases translate so well) and working out the information logic in PHP.
I am currently parsing the query using SQL_Parser which works well in splitting up the query into a multi-dimensional array, but working out the stuff mentioned above is the headache.
Any help or suggestions would be much appreciated.
From what I gather you're trying to put a layer between a 3rd-party forum application that you can't modify (obfuscated code perhaps?) and MySQL. This layer will intercept queries, re-write them to be executable individually, and generate PHP code to execute them against the database and return the aggregate result. This is a very bad idea.
It seems strange that you imply the impossibility of adding code and simultaneously suggest generating code to be added. Hopefully you're not planning on using something like funcall to inject code. This is a very bad idea.
The calls from others to avoid your initial approach and focus on the database is very sound advice. I'll add my voice to that hopefully growing chorus.
We'll assume some constraints:
You're running MySQL 5.0 or greater.
The queries cannot change.
The database tables cannot be changed.
You already have appropriate indexes in place for the tables the troublesome queries are referencing.
You have triple-checked the slow queries (and run EXPLAIN) hitting your DB and have attempted to setup indexes that would help them run faster.
The load the inner joins are placing on your MySQL install is unacceptable.
Three possible solutions:
You could deal with this problem easily by investing money into your current database by upgrading the hardware it runs on to something with more cores, more (as much as you can afford) RAM, and faster disks. If you've got the money Fusion-io's products come highly recommended for this sort of thing. This is probably the simpler of the three options I'll offer
Setup a second master MySQL database and pair it with the first. Make sure you have the ability to force AUTO_INCREMENT id alternation (one DB uses even id's, the other odd). This doesn't scale forever, but it does offer you some breathing room for the price of the hardware and rack space. Again, beef up the hardware. You may have already done this, but if not it's worth consideration.
Use something like dbShards. You still need to throw more hardware at this, but you have the added benefit of being able to scale beyond two machines and you can buy lower cost hardware over time.
To improve database performance you typically look for ways to:
Reduce the number of database calls
Making each database call as efficient as possible (via good design)
Reduce the amount of data to be transfered
...and you are doing the exact opposite? Deliberately?
On what grounds?
I'm sorry, you are doing this entirely wrong, and every single problem you encounter down this road will all be consequences of that first decision to implement a database engine outside of the database engine. You will be forced to work around work-arounds all the way to delivery date. (if you get there).
Also, we are talking about a forum? I mean, come on! Even on the most "web-scale-awesome-sauce" forums we're talking about less than what, 100 tps on average? You could do that on your laptop!
My advice is to forget about all this and implement things the most simple possible way. Then cache the aggregates (most recent, popular, statistics, whatever) in the application layer. Everything else in a forum is already primary key lookups.
I agree it sounds like a bad choice, but I can think of some situations where splitting a query could be useful.
I would try something similar to this, relying heavily on regular expressions for parsing the query. It would work in a very limited of cases, but it's support could be expanded progressively when needed.
<?php
/**
* That's a weird problem, but an interesting challenge!
* #link http://stackoverflow.com/questions/5019467/problem-writing-a-mysql-parser-to-split-joins-and-run-them-as-individual-query
*/
// Taken from the given example:
$sql = "SELECT articles.*, formats.format_title
FROM articles
INNER JOIN formats ON articles.article_format_id = formats.format_id
INNER JOIN article_categories ON articles.article_id = article_categories.article_id
WHERE article_categories.category_id = 2
ORDER BY articles.article_date DESC";
// Parse query
// (Limited to the clauses that are present in the example...)
// Edit: Made WHERE optional
if(!preg_match('/^\s*'.
'SELECT\s+(?P<select_rows>.*[^\s])'.
'\s+FROM\s+(?P<from>.*[^\s])'.
'(?:\s+WHERE\s+(?P<where>.*[^\s]))?'.
'(?:\s+ORDER\s+BY\s+(?P<order_by>.*[^\s]))?'.
'(?:\s+(?P<desc>DESC))?'.
'(.*)$/is',$sql,$query)
) {
trigger_error('Error parsing SQL!',E_USER_ERROR);
return false;
}
## Dump matches
#foreach($query as $key => $value) if(!is_int($key)) echo "\"$key\" => \"$value\"<br/>\n";
/* We get the following matches:
"select_rows" => "articles.*, formats.format_title"
"from" => "articles INNER JOIN formats ON articles.article_format_id = formats.format_id INNER JOIN article_categories ON articles.article_id = article_categories.article_id"
"where" => "article_categories.category_id = 2"
"order_by" => "articles.article_date"
"desc" => "DESC"
/**/
// Will only support WHERE conditions separated by AND that are to be
// tested on a single individual table.
if(#$query['where']) // Edit: Made WHERE optional
$where_conditions = preg_split('/\s+AND\s+/is',$query['where']);
// Retrieve individual table information & data
$tables = array();
$from_conditions = array();
$from_tables = preg_split('/\s+INNER\s+JOIN\s+/is',$query['from']);
foreach($from_tables as $from_table) {
if(!preg_match('/^(?P<table_name>[^\s]*)'.
'(?P<on_clause>\s+ON\s+(?P<table_a>.*)\.(?P<column_a>.*)\s*'.
'=\s*(?P<table_b>.*)\.(?P<column_b>.*))?$/im',$from_table,$matches)
) {
trigger_error("Error parsing SQL! Unexpected format in FROM clause: $from_table", E_USER_ERROR);
return false;
}
## Dump matches
#foreach($matches as $key => $value) if(!is_int($key)) echo "\"$key\" => \"$value\"<br/>\n";
// Remember on_clause for later jointure
// We do assume each INNER JOIN's ON clause compares left table to
// right table. Forget about parsing more complex conditions in the
// ON clause...
if(#$matches['on_clause'])
$from_conditions[$matches['table_name']] = array(
'column_a' => $matches['column_a'],
'column_b' => $matches['column_b']
);
// Match applicable WHERE conditions
$where = array();
if(#$query['where']) // Edit: Made WHERE optional
foreach($where_conditions as $where_condition)
if(preg_match("/^$matches[table_name]\.(.*)$/",$where_condition,$matched))
$where[] = $matched[1];
$where_clause = empty($where) ? null : implode(' AND ',$where);
// We simply ignore $query[select_rows] and use '*' everywhere...
$query = "SELECT * FROM $matches[table_name]".($where_clause? " WHERE $where_clause" : '');
echo "$query<br/>\n";
// Retrieve table's data
// Fetching the entire table data right away avoids multiplying MySQL
// queries exponentially...
$table = array();
if($results = mysql_query($table))
while($row = mysql_fetch_array($results, MYSQL_ASSOC))
$table[] = $row;
// Sort table if applicable
if(preg_match("/^$matches[table_name]\.(.*)$/",$query['order_by'],$matched)) {
$sort_key = $matched[1];
// #todo Do your bubble sort here!
if(#$query['desc']) array_reverse($table);
}
$tables[$matches['table_name']] = $table;
}
// From here, all data is fetched.
// All left to do is the actual jointure.
/**
* Equijoin/Theta-join.
* Joins relation $R and $S where $a from $R compares to $b from $S.
* #param array $R A relation (set of tuples).
* #param array $S A relation (set of tuples).
* #param string $a Attribute from $R to compare.
* #param string $b Attribute from $S to compare.
* #return array A relation resulting from the equijoin/theta-join.
*/
function equijoin($R,$S,$a,$b) {
$T = array();
if(empty($R) or empty($S)) return $T;
foreach($R as $tupleR) foreach($S as $tupleS)
if($tupleR[$a] == #$tupleS[$b])
$T[] = array_merge($tupleR,$tupleS);
return $T;
}
$jointure = array_shift($tables);
if(!empty($tables)) foreach($tables as $table_name => $table)
$jointure = equijoin($jointure, $table,
$from_conditions[$table_name]['column_a'],
$from_conditions[$table_name]['column_b']);
return $jointure;
?>
Good night, and Good luck!
In instead of the sql rewriting I think you should create a denormalized articles table and change it at each article insert/delete/update. It will be MUCH simpler and cheaper.
Do the create and populate it:
create table articles_denormalized
...
insert into articles_denormalized
SELECT articles.*, formats.format_title
FROM articles
INNER JOIN formats ON articles.article_format_id = formats.format_id
INNER JOIN article_categories ON articles.article_id = article_categories.article_id
Now issue the appropriate article insert/update/delete against it and you will have a denormalized table always ready to be queried.

Specialized Search Query Refinement for Auto-Complete function

I am doing a query for an autocomplete function on a mysql table that has many instances of similar titles, generally things like different years, such as '2010 Chevrolet Lumina' or 'Chevrolet Lumina 2009', etc.
The query I am currently using is:
$result = mysql_query("SELECT * FROM products WHERE MATCH (name) AGAINST ('$mystring') LIMIT 10", $db);
The $mystring variable gets built as folows:
$queryString = addslashes($_REQUEST['queryString']);
if(strlen($queryString) > 0) {
$array = explode(' ', $queryString);
foreach($array as $var){
$ctr++;
if($ctr == '1'){
$mystring = '"' . $var . '"';
}
else {
$mystring .= ' "' . $var . '"';
}
}
}
What I need to be able to do is somehow group things so only one version of a very similar actually shows in the autosuggest dropdown, leaving room for other products with chevrolet in them as well. Currently it is showing all 10 spots filled with the same product with different years, options, etc.
This one should give some of you brainiacs a good workout :)
I think the best way to do this would be to create a new field on the products table, something like classification. All the models would be entered with the same classification (e.g. "Chevrolet"). You could then still MATCH AGAINST name, but GROUP BY classification. Assuming you are using MySQL you can cheat a little and get away with selecting values and matching against values that you are not grouping by. Technically in SQL this gives undefined results and many SQL engines will not even let you try to do this, but MySQL lets you do it -- and it returns a more-or-less random sample that matches. So, for example, if you did the above query, grouped by classification, only one model (picked pretty much at random) will show up in the auto-completer.

Categories