advanced search with mysql - php

I'm creating a search function for my website where the user can type anything he likes into a text field. It gets matched against anything (name, title, job, car brand, ... you name it).
I initially wrote the query with an INNER JOIN on every table that needed to be searched.
SELECT column1, column2, ... FROM person INNER JOIN person_car ON ... INNER JOIN car ...
This ended up as a query with 6 or 8 INNER JOINs and a whole lot of WHERE ... LIKE '%searchvalue%' conditions.
Now this query seems to cause a timeout in MySQL, and I even got a warning from my hosting provider that the queries are taking up too many resources.
Obviously I'm doing this very wrong, but I was wondering what the correct approach to this kind of search function is.

Use multiple queries or UNION multiple queries so they go into a single resultset.
Additionally, FULLTEXT indexes will most likely speed up your queries, since LIKE '%string%' is extremely slow: with a leading '%', MySQL cannot use an index, so every row has to be checked.
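As a rough sketch of the FULLTEXT suggestion, assuming the `person` table from the question has text columns called `name` and `title` (the column names and the index name are made up for illustration):

```sql
-- Hypothetical columns `name` and `title` on the `person` table.
-- InnoDB supports FULLTEXT indexes from MySQL 5.6; older versions need MyISAM.
ALTER TABLE person
  ADD FULLTEXT INDEX ft_person_name_title (name, title);

-- The column list in MATCH() must be the same as the column list of the index.
SELECT name, title
FROM person
WHERE MATCH(name, title) AGAINST ('searchvalue' IN NATURAL LANGUAGE MODE);
```

Note that a full-text search matches whole words rather than arbitrary substrings, so the results will not be identical to those of LIKE '%searchvalue%'.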

I recommend implementing a search engine such as Lucene or Sphinx; they are much faster and more scalable than SQL queries.

Related

MYSQL Query optimization, comparing 3 tables w/ thousands of records

I have this query:
SELECT L.sku,L.desc1,M.map,T.retail FROM listing L INNER JOIN moto M ON L.sku=M.sku INNER JOIN truck T ON L.sku=T.sku LIMIT 5;
Each table (listing, moto, truck) has ~300,000 rows, and just for testing purposes I've set a LIMIT of 5 results. In the end I will need hundreds, but let's see...
That query takes about 3:26 minutes in the console. I can't imagine how long it will take with PHP, and I need to handle it there.
Any advice/solution to optimize the query? Thanks!
Two things to recommend here:
Indexes
Denormalization
One thing people tend to do when databases get massive is denormalize. This is when you store the data from multiple tables in one table so that no join is needed. It is useful if your application relies on specific reads to power it, and it is a commonly used tactic when scaling.
If denormalization is out of the question, another, simpler way to optimize this query is to make sure you have indexes on the columns you are joining on. So the columns L.sku, M.sku and T.sku need to be indexed. You should immediately notice an increase in performance.
For any other optimizations I would need some more information about the data. Hope it helps!
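A minimal sketch of the indexing advice, using the tables and the `sku` column from the question (the index names are illustrative only):

```sql
-- Index names are made up; the tables and the column come from the query above.
ALTER TABLE listing ADD INDEX idx_listing_sku (sku);
ALTER TABLE moto    ADD INDEX idx_moto_sku (sku);
ALTER TABLE truck   ADD INDEX idx_truck_sku (sku);

-- EXPLAIN shows whether the joins now use the new indexes.
EXPLAIN
SELECT L.sku, L.desc1, M.map, T.retail
FROM listing L
INNER JOIN moto M ON L.sku = M.sku
INNER JOIN truck T ON L.sku = T.sku
LIMIT 5;
```

If `sku` is unique within a table, a UNIQUE index (or primary key) would be the more natural choice. The three `sku` columns should also share the same data type and collation, otherwise MySQL may not be able to use the indexes for the join.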

Joining a count query mysql for performance

I have searched but can't find an answer that suits the exact needs of this MySQL query.
I have the following queries on multiple tables to generate "stats" for an application:
SELECT COUNT(id) as count FROM `mod_**` WHERE `published`='1';
SELECT COUNT(id) as count FROM `mod_***` WHERE `published`='1';
SELECT COUNT(id) as count FROM `mod_****`;
SELECT COUNT(id) as count FROM `mod_*****`;
Pretty simple: they just count the rows, sometimes based on a status.
However, in the pursuit of performance I would love to get this into one query to save resources.
I'm using PHP to fetch this data with a simple mysql_fetch_assoc and retrieving $res[count], if it makes a difference (PDO isn't guaranteed, so plain old mysql here).
The overhead of sending a query and getting a single-row response is very small.
There is nothing to gain here by combining the queries.
If you don't have indexes yet, an INDEX on the published column will greatly speed up the first two queries.
You can use something like
SELECT SUM(published=1)
for some of that. MySQL will take the boolean result of published=1 and translate it to an integer 0 or 1, which can be summed up.
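For a single table, that idea might look like the sketch below. The table name `mod_example` is a placeholder, since the real table names are masked in the question:

```sql
-- `mod_example` is a placeholder table with a `published` column.
-- published = 1 evaluates to 1 or 0 per row, so SUM() counts the published rows.
-- COALESCE guards against SUM() returning NULL when the table is empty.
SELECT COUNT(*) AS total_rows,
       COALESCE(SUM(published = 1), 0) AS published_rows
FROM mod_example;
```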
But it looks like you're dealing with MULTIPLE tables (if that's what the **, *** etc... are), in which case you can't really. You could use a UNION query, e.g.:
SELECT ...
UNION ALL
SELECT ...
UNION ALL
SELECT ...
etc...
That can be fired off as one single query to the DB, but it'll still execute each sub-query as its own query, and simply aggregate the individual result sets into one larger set.
Disagreeing with @Halcyon: I think there is an appreciable difference, especially if the MySQL server is on a different machine, as every single query uses at least one network packet.
I recommend you UNION the queries with a marker field to protect against the unexpected.
As @Halcyon said, there is not much to gain here. You can still use several UNIONs to get all the results in one query.
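A sketch of the marker-field idea applied to the counts. The table names `mod_a`, `mod_b` and `mod_c` are placeholders, because the real names are masked in the question:

```sql
-- Placeholder table names; the literal in the first column marks
-- which table each count belongs to.
SELECT 'mod_a' AS source, COUNT(*) AS total FROM mod_a WHERE published = 1
UNION ALL
SELECT 'mod_b', COUNT(*) FROM mod_b WHERE published = 1
UNION ALL
SELECT 'mod_c', COUNT(*) FROM mod_c;
```

Reading the counts by their `source` value instead of by row position keeps the PHP side correct even if the rows come back in a different order, since UNION ALL without ORDER BY does not guarantee one.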

Combining different table queries in db with PHP and displaying all results on one page

I have been trying to create a database for fun to get a better understanding of databases and of using PHP to query them for a website I'm messing around with. Pretty much, I have one database with 4 tables. When a user enters a search term in a PHP search box, my code searches the database for any entries containing the search term. Now I can easily get my code to search individual tables, but I cannot seem to get it to search all 4 tables and display the results on the same page.
Info: making a database for Skyrim
Table names: classes, powers, skills, shouts
column names: name, information
Here is a snippet of the code I have that works so far:
$raw_results = mysql_query("
SELECT *
FROM `xaviorin_skyrim`.`shouts` , `xaviorin_skyrim`.`classes`
WHERE (CONVERT(`UID` USING utf8) LIKE '%".$query."%' OR
CONVERT(`Name` USING utf8) LIKE '%".$query."%' OR
CONVERT(`Information` USING utf8) LIKE '%".$query."%')
") or die(mysql_error());
Literally all I thought I would need to do is change the table name from "shouts" to say "classes" in a new raw_results line of code but that didn't work. I have attempted unions and joins and either keep screwing them up or just don't understand how to properly format them.
echo "<p><h3>".$results['Name']."</h3>".$results['Information']."</p>";
The code above this text is what displays the results on the page on my website... it works but I don't know how to combine the information from all 4 tables into one page. If I'm going about this in the wrong way and anyone can point me in the right direction I would GREATLY appreciate it... I've been trying to research the problem without finding a proper answer for near a month now.
The problem with your approach is that listing two tables in the FROM clause without a join condition makes the database do a cross join: every matching row from one table is combined with every matching row from the other. When you have 3 matches in the first table and 4 in the second, you will get 3 * 4 = 12 rows in your query result. If you add more tables, you get even more. You want to do a text search in several tables that are totally unrelated, so creating an artificial relation by cross joining them is not useful.
What you actually want to do is a UNION ALL (UNION is slower because it prunes duplicates) of several queries:
SELECT name, information, 'shouts' AS tablename FROM shouts WHERE ...
UNION ALL
SELECT name, information, 'classes' AS tablename FROM classes WHERE ...
This will do search queries on every single table and then place the results in a single result. Also note that I added a third column to each query to ensure that the originating table is not lost after merging the results.
Unless you need to do some sorting afterwards, I would suggest that you do all statements separately. Combining them this way will most likely make the post-processing more complex. And several single queries will also be faster than one big query with UNION statements.
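And as I mentioned in the comments: don't use the mysql_* functions! They are deprecated as of PHP 5.5 and were removed in PHP 7; use mysqli or PDO with prepared statements instead.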
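Extended to all four tables from the question, the UNION ALL approach could look roughly like this. It assumes every table has the `name` and `information` columns listed in the question, and the search term 'dragon' is just an example value:

```sql
-- Example search term 'dragon'; in application code it should be passed
-- as a bound parameter rather than concatenated into the SQL string.
SELECT name, information, 'classes' AS tablename
FROM classes
WHERE name LIKE '%dragon%' OR information LIKE '%dragon%'
UNION ALL
SELECT name, information, 'powers'
FROM powers
WHERE name LIKE '%dragon%' OR information LIKE '%dragon%'
UNION ALL
SELECT name, information, 'skills'
FROM skills
WHERE name LIKE '%dragon%' OR information LIKE '%dragon%'
UNION ALL
SELECT name, information, 'shouts'
FROM shouts
WHERE name LIKE '%dragon%' OR information LIKE '%dragon%'
-- An ORDER BY after the last SELECT applies to the combined result.
ORDER BY tablename, name;
```

Each row then carries its originating table in `tablename`, so the existing display loop can stay as it is and optionally print a heading per table.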

PHP MYSQL refine search multiple queries

I am currently building a site for a car dealership. I would like to allow the user to refine the search results, similar to Amazon or eBay. The ability to narrow down the results with a click would be great. The problem is that, the way I am doing this now, many different queries need to be run, each with a COUNT total.
So the main ways to narrow down the results are:
Vehicle Type
Year
Make
Price Range
New/Used
Currently I am doing 5 queries every time this page is loaded to get the numbers of results while passing in the set values.
Query 1:
SELECT vehicle_type, COUNT(*) AS total FROM inventory
[[ Already Selected Search Parameters]]
GROUP BY vehicle_type
ORDER BY vehicle_type ASC
Query 2:
SELECT make, COUNT(*) AS total FROM inventory
[[ Already Selected Search Parameters]]
GROUP BY make
ORDER BY make ASC
Query 3,4,5...
Is there any way to do this in one query? Is it faster?
Your queries seem reasonable.
You can do it in a single query using UNION ALL:
SELECT 'vehicle_type' AS query_type, vehicle_type, COUNT(*) AS total
FROM inventory
...
GROUP BY vehicle_type
UNION ALL
SELECT 'make', make, COUNT(*) AS total FROM inventory ... GROUP BY make
UNION ALL
SELECT ... etc ...
The performance benefit of this will not be huge.
If you find that you are firing off these queries a lot and the results don't change often, you might want to consider caching the results. Consider using something like memcache.
There are a couple ways to rank data along the lines of data warehousing but what you are trying to accomplish in search terms is called facets. A real search engine (which would be used with the sites you mentioned) performs this.
SEE: Faceted searching and categories in MySQL and Solr
Many sites use the Lucene (Java-based) search engine with Solr to accomplish what you are referring to. There is a newer solution called ElasticSearch that has a RESTful API and offers facets, but you'd need to install Java and ES, and could then make calls to the search engine, which returns native JSON.
SEE: http://www.elasticsearch.org/guide/reference/api/search/facets/
Doing it in MySQL without requiring so many joins might need additional tables and perhaps triggers, and it gets complex. If the car dealership isn't expecting Cars.com traffic (millions/day), then you may be trying to optimize something before it actually needs it. Your current queries might be fast enough, and you haven't reported that there is an actual issue or bottleneck.
Use JOIN syntax:
http://dev.mysql.com/doc/refman/5.6/en/join.html
Or, I think you could write a MySQL function for that, to which you pass your search parameters.
http://dev.mysql.com/doc/refman/5.1/en/create-function.html
To find out which is faster, you should run your own speed tests. That is how I found out that some of my queries are faster without joins.

MySQL headache, should I or should I not?

I have a classifieds website.
I am using Solr for indexing and storing data. Then I also have a MySQL db with some more information about the classified which I don't store or index.
Now, I have a pretty normalized db with 4 tables.
Whenever ads are searched on the website, SOLR does the searching and returns an array of ID_numbers which will then be used to query mysql.
So solr returns id:s, which are then used to get all ads from the mysql db with THOSE id:s.
Now, all the JOINs and relations between my tables give me a headache.
What, except for ease of maintenance, do I get from having a normalized db?
I could, you know, store all the info in one table with some 50 columns.
So instead of this for finding one ad and displaying it:
SELECT
category_option.option_name,
option_values.value
FROM classified, category_option, option_values
WHERE classified.classified_id=?id
AND classified.cat_id=category_option.cat_id
AND option_values.option_id=category_option.option_id
I could use this:
SELECT * FROM table_name WHERE classified_id = $classified_id
Isn't the last one actually faster?
Or does a normalized db perform faster?
Thanks
I would advise against denormalizing in your situation. You'll get better with joins as you use them more and they start to become clearer in your head, and maintenance ease is a good benefit for the future.
Here's a pretty good link about normalization (and denormalization). Here's a question about denormalization. One answer suggests creating a view using joins to get the data you need, and using that like your SELECT * FROM table_name WHERE classified_id = $classified_id query. A normalized DB will likely be slower, but it's unlikely you'll want to denormalize for that reason. I hope this provides some help.
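The view suggestion might be sketched as follows. It reuses the tables and join columns from the question; the view name `classified_details` and the id value 42 are made up for illustration:

```sql
-- Hypothetical view name; tables and join columns are taken from the question.
CREATE VIEW classified_details AS
SELECT classified.classified_id,
       category_option.option_name,
       option_values.value
FROM classified
JOIN category_option USING (cat_id)
JOIN option_values USING (option_id);

-- Reads like the single-table query, but the data stays normalized.
SELECT option_name, value
FROM classified_details
WHERE classified_id = 42;
```

A view like this does not make the query faster, since MySQL still performs the joins underneath. It only hides them from the application code.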
Whenever you do denormalization you usually gain reading speed and lose write speed, because you have to write the same value many times. Additionally, extra care should be taken to maintain data integrity.
How many times will the query be executed?
Is this a high traffic application?
Can you add a cache?
The query using a JOIN is trivial as far as MySQL joins are concerned. I see no need to denormalize this.
I would however suggest rewriting it to not be such a PITA to read:
SELECT
category_option.option_name,
option_values.value
FROM classified
JOIN category_option USING (cat_id)
JOIN option_values USING (option_id)
WHERE classified.classified_id = ?
