Speed Up MySQL (MyISAM) COUNTs with WHERE Clauses - php

We are implementing a system that analyses books. The system is written in PHP, and for each book loops through the words and analyses each of them, setting certain flags (that translate to database fields) from various regular expressions and other tests.
This results in a matches table, similar to the example below:
+------------------------+--------------+------+-----+---------+----------------+
| Field                  | Type         | Null | Key | Default | Extra          |
+------------------------+--------------+------+-----+---------+----------------+
| id                     | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| regex                  | varchar(250) | YES  |     | NULL    |                |
| description            | varchar(250) | NO   |     | NULL    |                |
| phonic_description     | varchar(255) | NO   |     | NULL    |                |
| is_high_frequency      | tinyint(1)   | NO   |     | NULL    |                |
| is_readable            | tinyint(1)   | NO   |     | NULL    |                |
| book_id                | bigint(20)   | YES  |     | NULL    |                |
| matched_regex          | varchar(255) | YES  |     | NULL    |                |
| [...]                  |              |      |     |         |                |
+------------------------+--------------+------+-----+---------+----------------+
Most of the omitted fields are tinyint, either 0 or 1. There are currently 25 fields in the matches table.
There are ~2,000,000 rows in the matches table, the output of analyzing ~500 books.
Currently, there is a "reports" area of the site which queries the matches table like this:
SELECT COUNT(*)
FROM matches
WHERE is_readable = 1
AND other_flag = 0
AND another_flag = 1
However, at present it takes over a minute to fetch the main index report as each query takes about 0.7 seconds. I am caching this at a query level, but it still takes too long for the initial page load.
As I am not very experienced in how to manage datasets such as this, can anyone advise me of a better way to store or query this data? Are there any optimisations I can use with MySQL to improve the performance of these COUNTs, or am I better off using another database or data structure?
We are currently using MySQL with MyISAM tables and a VPS for this, so switching to a new database system altogether isn't out of the question.

You need to use indexes; create them on the columns you filter on in WHERE most frequently.
ALTER TABLE `matches` ADD INDEX ( `is_readable` )
etc.
You can also create indexes on multiple columns, which is useful if you're running the same type of query over and over. phpMyAdmin has the index option at the bottom of the table's structure page.

Add a multi-column index to this table, as you are selecting by more than one field. The index below should help a lot. This type of index works very well for boolean / int columns. For indexes on varchar values, read more here: http://dev.mysql.com/doc/refman/5.0/en/create-index.html
ALTER TABLE `matches` ADD INDEX ( `is_readable`, `other_flag`, `another_flag` )
One more thing is to check your queries with EXPLAIN {YOUR WHOLE SQL STATEMENT} to see which index the DB uses. So in this example, once the index has been added, you should run this query:
EXPLAIN SELECT COUNT(*) FROM `matches` WHERE is_readable = 1 AND other_flag = 0 AND another_flag = 1
More info on EXPLAIN: http://dev.mysql.com/doc/refman/5.0/en/explain.html
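To illustrate how the multi-column index above is likely to behave: as far as I know, MySQL can look rows up through such an index only when the WHERE clause filters on a leftmost prefix of its columns. A rough sketch, assuming the index ( `is_readable`, `other_flag`, `another_flag` ) already exists (other_flag and another_flag are the asker's placeholder column names):

```sql
-- Leftmost prefix (first column only): the index can be used for lookups.
EXPLAIN SELECT COUNT(*) FROM `matches` WHERE is_readable = 1;

-- Leftmost prefix (first two columns): the index can be used for lookups.
EXPLAIN SELECT COUNT(*) FROM `matches` WHERE is_readable = 1 AND other_flag = 0;

-- No leftmost prefix (first column missing): MySQL cannot seek into the index
-- and has to scan instead, either the whole index or the table.
EXPLAIN SELECT COUNT(*) FROM `matches` WHERE another_flag = 1;
```

In each EXPLAIN result, the key column shows which index was chosen and the rows column gives the estimated number of rows examined, so the three statements can be compared directly.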

Related

Which is better for storing meta descriptions of categories: a PHP variable or a MySQL DB?

I am working on an image site where images belong to specific categories.
There are 70+ category names (e.g. books, trees, people, watches, cars, mobiles) as of now, with about 80-90 expected in the near future, and each category has a meta description, meta keywords, and a title specific to that category.
My question is: where should I store this data?
1. In PHP, using 3-4 arrays and then an array lookup to
find and show the meta data specific to that directory?
e.g.
$meta_description = array('books' => 'books meta description', 'trees' => 'trees meta description');
and then doing
if (isset($meta_description[$_GET['category']])) {
    echo $meta_description[$_GET['category']];
}
2. In a MySQL table, and then running a MySQL query on each page load
to get the meta description and title data from the DB?
e.g.
table
+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| cat_name   | varchar(50)  | NO   | MUL |         |                |
| title      | varchar(100) | NO   |     | NULL    |                |
| metakw     | varchar(100) | NO   |     | NULL    |                |
| metadesc   | varchar(100) | NO   |     | NULL    |                |
| record_num | tinyint(2)   | NO   | PRI | NULL    | auto_increment |
+------------+--------------+------+-----+---------+----------------+
and then getting the info with a MySQL query on each page load.
I am looking for high performance, as there are 20+ million records in the main table and lots of MySQL queries are already fired every second, so I am looking to lower the MySQL burden.
Good practice is to keep as much cached as possible, so no. 1 is the much better choice (as long as the categories do not have to be configurable by the user). However, if you need to make them configurable, then you should keep them in the database and cache them using e.g. memcached http://www.memcached.org/

How can I update the records included in another query using SUM and GROUP BY in MySQL?

I have a MySQL table
content_votes_tmp
+------------+------------------+------+-----+---------+----------------+
| Field      | Type             | Null | Key | Default | Extra          |
+------------+------------------+------+-----+---------+----------------+
| up         | int(11)          | NO   | MUL | 0       |                |
| down       | int(11)          | NO   |     | 0       |                |
| ip         | int(10) unsigned | NO   |     | NULL    |                |
| content    | int(11)          | NO   |     | NULL    |                |
| datetime   | datetime         | NO   |     | NULL    |                |
| is_updated | tinyint(2)       | NO   |     | 0       |                |
| record_num | int(11)          | NO   | PRI | NULL    | auto_increment |
+------------+------------------+------+-----+---------+----------------+
Visitors can vote up or down on posts (i.e. content). A record is inserted into the table every time a vote is given, much like a rating, along with other data such as the IP and the content id.
Now I am trying to create a cron job script in PHP that computes SUM(up) and SUM(down) of the votes,
like this:
mysqli_query($con, "SELECT SUM(up) as up_count, SUM(down) as down_count, content FROM `content_votes_tmp` WHERE is_updated = 0 GROUP by content")
and then, using a while loop in PHP, I can update the main table for each content id.
But I would like the records that are part of the SUM to be marked as updated, i.e. SET is_updated = 1, so the same values won't get summed again and again.
How can I achieve this with a MySQL query, working on the same data set, given that records are being inserted into the table every second/millisecond?
Another way I can think of is to fetch all the non-updated records, do the sum in PHP, and then update every record.
The simplest way would probably be a temporary table. Create one with the record_num values you want to select from;
CREATE TEMPORARY TABLE temp_table AS
SELECT record_num FROM `content_votes_tmp` WHERE is_updated = 0;
Then do your calculation using the temp table;
SELECT SUM(up) as up_count, SUM(down) as down_count, content
FROM `content_votes_tmp`
WHERE record_num IN (SELECT record_num FROM temp_table)
GROUP by content
Once you've received your result, you can set is_updated on the values you just calculated over;
UPDATE `content_votes_tmp`
SET is_updated = 1
WHERE record_num IN (SELECT record_num FROM temp_table)
If you want to reuse the connection to do the same thing again, you'll need to drop the temporary table before creating it again, but if you just want to do it a single time in a page, it will disappear automatically when the database is disconnected at the end of the page.
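For what it's worth, here is a sketch of the same sequence written with joins instead of IN (SELECT ...), which older MySQL versions sometimes optimise poorly, plus the DROP statements needed to rerun it on one connection. It assumes the content_votes_tmp schema from the question and the temp_table name used above; treat it as a starting point rather than a tested cron job.

```sql
-- Start clean in case an earlier run on this connection left the table behind.
DROP TEMPORARY TABLE IF EXISTS temp_table;

-- Freeze the set of rows to process; votes inserted after this point are left for the next run.
CREATE TEMPORARY TABLE temp_table AS
SELECT record_num FROM `content_votes_tmp` WHERE is_updated = 0;

-- Sum only the frozen rows (record_num is unique, so the join adds no duplicates).
SELECT v.content, SUM(v.up) AS up_count, SUM(v.down) AS down_count
FROM `content_votes_tmp` v
INNER JOIN temp_table t ON t.record_num = v.record_num
GROUP BY v.content;

-- Mark exactly those rows as processed.
UPDATE `content_votes_tmp` v
INNER JOIN temp_table t ON t.record_num = v.record_num
SET v.is_updated = 1;

-- Remove the temporary table so the next run can recreate it.
DROP TEMPORARY TABLE temp_table;
```

Each statement refers to temp_table only once, which matters because MySQL does not allow a temporary table to be referenced more than once in the same query.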

change values in mysql trigger with php?

I've created a financialTrack table in MySQL to log rows inserted into the financial table, and then created this trigger to do it:
CREATE TRIGGER INS_after_financ
AFTER INSERT ON `financial` FOR EACH ROW
BEGIN
INSERT INTO `financialTrack` (user, changedValue) VALUES (NEW.user, NEW.Value);
END;
This is the structure of my tables:
TABLE NAME: financial
+--------------+--------------+-------+-------+
| Column       | Type         | Null  | AI    |
+--------------+--------------+-------+-------+
| id           | int(10)      | FALSE | TRUE  |
| user         | VARCHAR(40)  | FALSE |       |
| Value        | BIGINT(12)   | FALSE |       |
+--------------+--------------+-------+-------+
TABLE NAME: financialTrack
+--------------+--------------+-------+-----------------+
| Column       | Type         | Null  | Def.Value       |
+--------------+--------------+-------+-----------------+
| user         | VARCHAR(40)  | FALSE |                 |
| changedValue | BIGINT(12)   | FALSE |                 |
| ts           | timestamp    | FALSE |CURRENT_TIMESTAMP|
+--------------+--------------+-------+-----------------+
Do you have any suggestions for filling the user field in the financialTrack table from a PHP script, so that the user column can be removed from the financial table?
There are several ways to approach this task, but this reading will surely help you learn the basics of handling database queries with PHP: http://php.net/manual/en/book.pdo.php.
The PDO extension is currently quite popular and is preferred over the native mysql and mysqli extensions. You will find more useful information by searching for PDO on Stack Overflow.

Search feature for my site

I have a requirement to add a search feature to a site I'm building and was wondering if anyone has done something similar.
I have a sample table that contains the details of cats in this format:
Name, place, type, age, gender and size.
And I only have one search box where users can enter their search terms. My question is, how do I search the table if, for example someone types in "cat in Paris"?
I want to be able to search all the fields and return something if found.
Is there any way to achieve this rather than having lots of boxes for them to select search criteria? Any help or suggestion would be appreciated.
One of the simpler approaches that works very well in this situation is a full-text search in MySQL. You can have it index all of the text columns and do a natural language search.
If you had a MySQL table called cats with the following schema:
mysql> desc cats;
+--------+--------------+------+-----+---------+----------------+
| Field  | Type         | Null | Key | Default | Extra          |
+--------+--------------+------+-----+---------+----------------+
| id     | int(11)      | NO   | PRI | NULL    | auto_increment |
| name   | varchar(100) | YES  | MUL | NULL    |                |
| place  | varchar(100) | YES  |     | NULL    |                |
| type   | varchar(100) | YES  |     | NULL    |                |
| age    | int(11)      | YES  |     | NULL    |                |
| gender | varchar(100) | YES  |     | NULL    |                |
| size   | varchar(100) | YES  |     | NULL    |                |
+--------+--------------+------+-----+---------+----------------+
You can run the following SQL to create the index:
CREATE FULLTEXT INDEX cats_search ON cats (name, type, place, gender);
Then when you get the search string 'male tabby in paris' you can search the table with this SQL:
SELECT *
, MATCH(name, type, place, gender)
AGAINST ('male tabby in paris' IN BOOLEAN MODE) relevance
FROM cats
WHERE MATCH(name, type, place, gender)
AGAINST ('male tabby in paris' IN BOOLEAN MODE)
ORDER BY relevance DESC;
This will return all of the rows that match those terms, in the order MySQL decides is most relevant.
You will have to research MySQL full-text searches to fine-tune the results the way you want, but this should get you off the ground.
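As a starting point for that fine-tuning, here are two variations that may be worth trying against the same cats table and cats_search index. The query above uses boolean mode, where a search string without operators matches rows containing any of the words. The sketch below shows natural language mode, and boolean mode with + operators so that every listed word is required. Very short words and stopwords such as "in" are normally not indexed, as far as I know, so "in" is left out of the second query.

```sql
-- Natural language mode (also what MySQL uses when no modifier is given).
SELECT *,
       MATCH(name, type, place, gender)
       AGAINST ('male tabby in paris' IN NATURAL LANGUAGE MODE) AS relevance
FROM cats
WHERE MATCH(name, type, place, gender)
      AGAINST ('male tabby in paris' IN NATURAL LANGUAGE MODE)
ORDER BY relevance DESC;

-- Boolean mode with +: only rows containing all three words are returned.
SELECT *
FROM cats
WHERE MATCH(name, type, place, gender)
      AGAINST ('+male +tabby +paris' IN BOOLEAN MODE);
```

The column list inside MATCH() has to be the same set of columns the FULLTEXT index was created on, which is why both queries repeat (name, type, place, gender).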

Recursive-ish query for tags?

I have a table of tags that can be linked to other tags and I want to "recursively" select the tags in order of arrangement, so that when a search is made we get the immediate (1-level) results and then carry on down to, say, 5 levels. That way we always have a list of tags, even if there weren't enough exact matches on level 1.
I can manage this fine by making multiple queries until I get enough results, but surely there is a better, optimized way via a one-trip query?
Any tips will be appreciated.
Thanks!
Results:
tagId, tagWord, child, child tagId
'513', 'Slap', 'Hog Slapper', '1518'
'513', 'Slap', 'Corporal Punishment', '147'
'513', 'Slap', 'Impact Play', '1394'
Query:
SELECT t.tagId, t.tagWord as tag, tt.tagWord as child, tt.tagId as childId
FROM platform.tagWords t
INNER JOIN platform.tagsLinks l ON l.parentId = t.tagId
INNER JOIN platform.tagWords tt ON tt.tagId = l.tagId
WHERE t.tagWord = 'slap'
Table Layouts:
mysql> explain tagWords;
+---------+---------------------+------+-----+---------+----------------+
| Field   | Type                | Null | Key | Default | Extra          |
+---------+---------------------+------+-----+---------+----------------+
| tagId   | bigint(20) unsigned | NO   | PRI | NULL    | auto_increment |
| tagWord | varchar(45)         | YES  | UNI | NULL    |                |
+---------+---------------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)
mysql> explain tagsLinks;
+----------+---------------------+------+-----+---------+-------+
| Field    | Type                | Null | Key | Default | Extra |
+----------+---------------------+------+-----+---------+-------+
| tagId    | bigint(20) unsigned | NO   |     | NULL    |       |
| parentId | bigint(20)          | YES  |     | NULL    |       |
+----------+---------------------+------+-----+---------+-------+
2 rows in set (0.00 sec)
AFAIK MySQL doesn't have any mechanism for querying data recursively (recursive CTEs only arrived in MySQL 8.0).
Oracle has the CONNECT BY construct and SQL Server has CTEs (Common Table Expressions).
But for MySQL, read Here and Here.
Here are the options that I consider each time I find myself in a situation when I need to query hierarchical data.
Nested Sets
Path enumeration
Explicit joins (when the maximum level is known)
Vendor Extensions (SQL Server CTE, Oracle Connect by etc)
Stored Procedures
Suck it up
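
To make the "explicit joins" option a bit more concrete, here is a sketch for the first two levels using the tagWords and tagsLinks tables from the question. It assumes tagsLinks.tagId is the child and tagsLinks.parentId the parent, as in the asker's query; the depth column and the LIMIT of 20 are made up for illustration. Further levels would follow the same pattern, with one more tagsLinks join per level.

```sql
-- Level 1: direct children of 'slap'.
(SELECT 1 AS depth, c1.tagId AS tagId, c1.tagWord AS tagWord
 FROM platform.tagWords t
 INNER JOIN platform.tagsLinks l1 ON l1.parentId = t.tagId
 INNER JOIN platform.tagWords c1 ON c1.tagId = l1.tagId
 WHERE t.tagWord = 'slap')
UNION ALL
-- Level 2: children of those children.
(SELECT 2 AS depth, c2.tagId AS tagId, c2.tagWord AS tagWord
 FROM platform.tagWords t
 INNER JOIN platform.tagsLinks l1 ON l1.parentId = t.tagId
 INNER JOIN platform.tagsLinks l2 ON l2.parentId = l1.tagId
 INNER JOIN platform.tagWords c2 ON c2.tagId = l2.tagId
 WHERE t.tagWord = 'slap')
-- Closest matches first; deeper levels only fill the remaining slots.
ORDER BY depth, tagWord
LIMIT 20;
```

Because the result is ordered by depth before the LIMIT is applied, level-1 matches always come first and deeper levels are only used to fill up the list, which is roughly what the question asks for in a single round trip.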
