I'll try to explain this as simply as possible. Consider two database tables (MySQL Server / MariaDB; the database code is written in procedural-style PHP using prepared statements):
1) In the first, I have a column of datatype JSON, whose content looks something like {name1:info,name2:info}.
2) In the second, I have simple non-JSON records, structured like this:
name  | status
------+--------
name1 | statusX
name2 | statusY
My Goal: I need to retrieve the name2 from table 1), but I also need to retrieve the status of the person having that same name (which in this case is statusY). Note that, for the retrieval of name2, I cannot rely on indexes of the json object (name2 may be the first key of the json object).
How I would do it so far:
A) Get the name2 from table 1) in a first query, sanitize it, and
B) use it in the second query which then correctly retrieves the statusY
Both statements A) and B) are parametrized prepared sql statements, triggered by an AJAX Call at a regular interval (AJAX Polling).
Since these queries are executed frequently, I want them to run as fast as possible, ideally by reducing the two queries above to a single one. My problem: I need the result of statement A) to execute statement B), so I cannot combine the two queries into a single prepared statement, as a prepared statement cannot contain multiple SQL statements. Is the best solution to create a stored procedure like:
SET @name = SELECT ..... FROM table_1; SELECT .... FROM table_2;
and then execute it as a parametrized prepared statement?
I'm not at all experienced with stored procedures in MySQL and haven't needed them yet, but they seem to be the only solution if you want to wrap more than one SQL statement into a single prepared statement. Is this assumption, and thus the conclusion that I have to create a stored procedure to achieve what I want, correct?
IMPORTANT NOTE: I don't know the name I need to query. Of the two names in the JSON column of table 1), I only know the other one. In other words, I have the name of a person X, and I want to get the status of every person associated with person X in table 1). The status of each person is stored only in table 2), to avoid duplicating status data in the DB. At the moment, I retrieve the other name of each relation record from table 1) by using a conditional statement saying something like
UPDATE
See added answer below, got it working.
You can query the JSON data type directly in MySQL (version 5.7 or later), so you can do everything in a single query.
Give this a try:
SELECT t1.name1, t1.name2, t2.status
FROM
(
    SELECT JSON_EXTRACT(your_json_column, "$.name1") AS name1,
           JSON_EXTRACT(your_json_column, "$.name2") AS name2
    FROM table1
    WHERE JSON_EXTRACT(your_json_column, "$.name1") = 'info'
) t1
INNER JOIN table2 t2 ON t2.`name` = t1.name2
Replace your_json_column with the name of your column. I also assumed that you want to look up the name2 for a specific name1, hence my WHERE clause; remove it if that assumption was wrong.
Okay got it working, pretty much thanks to the solution proposed by Thomas G and some hints of JNevill (cheers guys!):
SELECT t1.info1, t1.info2, t1.info3, t1.other_name, t2.status FROM (
    SELECT
        field1 AS info1,
        field2 AS info2,
        field3 AS info3,
        CASE
            WHEN JSON_VALUE(JSON_KEYS(json_names_column),"$[0]") = 'name1'
            THEN JSON_VALUE(JSON_KEYS(json_names_column),"$[1]")
            ELSE JSON_VALUE(JSON_KEYS(json_names_column),"$[0]")
        END
        AS other_name
    FROM table1
    WHERE id = 345
) t1 INNER JOIN table2 t2 ON t1.other_name = t2.name;
Note that I used JSON_VALUE(JSON_KEYS()) instead of JSON_EXTRACT so that t1 returns only the needed name, and because I don't know the name to retrieve before running the query, so I cannot use the WHERE clause proposed by Thomas G.
The table reuniao (EN: meeting) has 401,000 records and an index on every column. I'm using XAMPP, and I think the problems come from doing an ORDER BY on a COUNT and from the join, but 50 seconds is too much.
Columns: nome (EN: name) is VARCHAR, presenca (EN: presence) is VARCHAR, partido (EN: political party) is VARCHAR, id_deputado is the INT primary key, and data is DATE.
SELECT D.nome, COUNT(*) as count_dep_faltas
FROM reuniao R, deputado D
WHERE R.partido LIKE '%$_POST[partido]%' AND
R.presenca LIKE '%Injustific%' AND
data BETWEEN '$data_incio' AND '$data_fim' and
R.id_deputado=D.id_deputado
GROUP BY D.nome
ORDER BY count_dep_faltas DESC
LIMIT 5
This is your query, written using proper JOIN syntax:
SELECT D.nome, COUNT(*) as count_dep_faltas
FROM deputado D JOIN
reuniao R
ON R.id_deputado = D.id_deputado
WHERE R.partido LIKE '%$_POST[partido]%' AND
R.presenca LIKE '%Injustific%' AND
R.data BETWEEN '$data_incio' AND '$data_fim'
GROUP BY D.nome
ORDER BY count_dep_faltas DESC
LIMIT 5;
First, you need to learn to use parameters to pass values into queries, rather than munging the query string. This prevents unexpected syntax errors and SQL injection, although it is not related to performance.
This is hard to optimize in MySQL because of the leading wildcards in the LIKE patterns, which prevent an index from being used to match them. You can approach this by creating indexes on reuniao(data, partido, presenca, id_deputado) and deputado(id_deputado, nome). These are covering indexes for the query, so they should give some improvement.
I would also recommend that you consider full-text indexes if you really need wildcard matches on the strings.
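To illustrate the point about parameters, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL/PHP. The tiny schema, the sample rows and the function name top_absences are made up for the example; only the query shape comes from the answer above. The idea carries over to mysqli or PDO prepared statements: the wildcards go into the bound value, not into the SQL text.

```python
import sqlite3

# Hypothetical miniature of the reuniao/deputado schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE deputado (id_deputado INTEGER PRIMARY KEY, nome TEXT);
    CREATE TABLE reuniao (
        id_deputado INTEGER, partido TEXT, presenca TEXT, data TEXT
    );
    INSERT INTO deputado VALUES (1, 'Ana'), (2, 'Rui');
    INSERT INTO reuniao VALUES
        (1, 'PS', 'Falta Injustificada', '2014-01-10'),
        (1, 'PS', 'Falta Injustificada', '2014-02-10'),
        (2, 'PS', 'Falta Injustificada', '2014-01-15'),
        (2, 'PS', 'Presente',            '2014-02-15');
""")

def top_absences(conn, partido, data_inicio, data_fim, limit=5):
    """Count unjustified absences per deputy, binding every user value."""
    sql = """
        SELECT D.nome, COUNT(*) AS count_dep_faltas
        FROM deputado D
        JOIN reuniao R ON R.id_deputado = D.id_deputado
        WHERE R.partido LIKE ?
          AND R.presenca LIKE ?
          AND R.data BETWEEN ? AND ?
        GROUP BY D.nome
        ORDER BY count_dep_faltas DESC
        LIMIT ?
    """
    # The wildcards are part of the bound value, not of the SQL text.
    params = (f"%{partido}%", "%Injustific%", data_inicio, data_fim, limit)
    return conn.execute(sql, params).fetchall()

rows = top_absences(conn, "PS", "2014-01-01", "2014-12-31")
# A hostile input is treated as a plain search string and matches nothing.
hostile = top_absences(conn, "' OR 1=1 --", "2014-01-01", "2014-12-31")
```

Because the user input never becomes part of the statement text, a quote character in partido cannot change the query.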
I am wondering if there is a way (possibly a better way) to order results by the order of the values in an IN() clause.
The problem is that I have 2 queries, one that gets all of the IDs and the second that retrieves all the information. The first creates the order of the IDs which I want the second to order by. The IDs are put in an IN() clause in the correct order.
So it'd be something like (extremely simplified):
SELECT id FROM table1 WHERE ... ORDER BY display_order, name
SELECT name, description, ... WHERE id IN ([id's from first])
The issue is that the second query does not return the results in the same order that the IDs are put into the IN() clause.
One solution I have found is to put all of the IDs into a temp table with an auto incrementing field which is then joined into the second query.
Is there a better option?
Note: As the first query is run "by the user" and the second is run in a background process, there is no way to combine the 2 into 1 query using sub queries.
I am using MySQL, but I'm thinking it might be useful to have it noted what options there are for other DBs as well.
Use MySQL's FIELD() function:
SELECT name, description, ...
FROM ...
WHERE id IN([ids, any order])
ORDER BY FIELD(id, [ids in order])
FIELD() returns the 1-based position of its first argument within the list of remaining arguments, or 0 if it is not found.
FIELD('a', 'a', 'b', 'c')
will return 1
FIELD('a', 'c', 'b', 'a')
will return 3
This will do exactly what you want if you paste the ids into the IN() clause and the FIELD() function in the same order.
See the following for how to get sorted data:
SELECT ...
FROM ...
WHERE zip IN (91709,92886,92807,...,91356)
AND user.status=1
ORDER
BY provider.package_id DESC
, FIELD(zip,91709,92886,92807,...,91356)
LIMIT 10
Two solutions that spring to mind:
order by case id when 123 then 1 when 456 then 2 else null end asc
order by instr(','||id||',',',123,456,') asc
(instr() is from Oracle; maybe you have locate() or charindex() or something like that)
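The CASE variant is the more portable of the two, and it can be generated from the id list. Below is a rough sketch with Python's sqlite3 module as a stand-in database; the helper name fetch_in_order and the two-column table1 are assumptions for the example, not something from the question.

```python
import sqlite3

def fetch_in_order(conn, ids):
    """Return (id, name) rows for ids, in the order the ids were given."""
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    # One "WHEN ? THEN n" branch per id; n is generated here, never user input.
    whens = " ".join(f"WHEN ? THEN {pos}" for pos in range(len(ids)))
    sql = (
        f"SELECT id, name FROM table1 WHERE id IN ({placeholders}) "
        f"ORDER BY CASE id {whens} END"
    )
    # The ids are bound twice: once for IN (...), once for the CASE branches.
    return conn.execute(sql, list(ids) + list(ids)).fetchall()

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)",
                 [(1, "a"), (2, "b"), (3, "c"), (4, "d")])
ordered = fetch_in_order(conn, [3, 1, 4])
```

The statement grows with the number of ids, so this is probably only reasonable for short lists.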
If you want to do arbitrary sorting on a query using values inputted by the query in MS SQL Server 2008+, it can be done by creating a table on the fly and doing a join like so (using nomenclature from OP).
SELECT table1.name, table1.description ...
FROM (VALUES (id1,1), (id2,2), (id3,3) ...) AS orderTbl(orderKey, orderIdx)
LEFT JOIN table1 ON orderTbl.orderKey=table1.id
ORDER BY orderTbl.orderIdx
If you replace the VALUES statement with something else that does the same thing, but in ANSI SQL, then this should work on any SQL database.
Note:
The second column in the created table (orderTbl.orderIdx) is necessary when querying record sets larger than 100 or so. I originally didn't have an orderIdx column, but found that with result sets larger than 100 I had to explicitly sort by that column; in SQL Server Express 2014 anyways.
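The same idea can be tried on a database that accepts a VALUES list inside a common table expression. This is a minimal sketch using SQLite through Python's sqlite3 module, not SQL Server; order_tbl mirrors orderTbl from the answer, while fetch_by_value_list and the sample rows are invented for the example. It uses an inner join rather than the LEFT JOIN above, so ids missing from table1 are simply dropped.

```python
import sqlite3

def fetch_by_value_list(conn, ids):
    """Join table1 against an inline (key, position) list and sort by position."""
    if not ids:
        return []
    values = ", ".join("(?, ?)" for _ in ids)
    sql = (
        f"WITH order_tbl(order_key, order_idx) AS (VALUES {values}) "
        "SELECT t.id, t.name FROM order_tbl "
        "JOIN table1 AS t ON t.id = order_tbl.order_key "
        "ORDER BY order_tbl.order_idx"
    )
    # Flatten [(key, 1), (key, 2), ...] into one positional parameter list.
    params = [x for pos, key in enumerate(ids, start=1) for x in (key, pos)]
    return conn.execute(sql, params).fetchall()

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)",
                 [(1, "a"), (2, "b"), (3, "c")])
by_values = fetch_by_value_list(conn, [2, 3, 2])
```

Sorting explicitly on the position column matches the note above: without an ORDER BY the join gives no ordering guarantee.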
SELECT ORDER_NO, DELIVERY_ADDRESS
from IFSAPP.PURCHASE_ORDER_TAB
where ORDER_NO in ('52000077','52000079','52000167','52000297','52000204','52000409','52000126')
ORDER BY instr('52000077,52000079,52000167,52000297,52000204,52000409,52000126',ORDER_NO)
This worked really well.
Answer to get sorted data:
SELECT ...
FROM ...
ORDER BY FIELD(user_id,5,3,2,...,50) LIMIT 10
The IN clause describes a set of values, and sets do not have order.
Your solution with a join and then ordering on the display_order column is the most nearly correct solution; anything else is probably a DBMS-specific hack (or is doing some stuff with the OLAP functions in standard SQL). Certainly, the join is the most nearly portable solution (though generating the data with the display_order values may be problematic). Note that you may need to select the ordering columns; that used to be a requirement in standard SQL, though I believe it was relaxed as a rule a while ago (maybe as long ago as SQL-92).
Use MySQL's FIND_IN_SET function, which takes the value and a single comma-separated string:
SELECT *
FROM table_name
WHERE id IN (..,..,..,..)
ORDER BY FIND_IN_SET(column_name, '..,..,..,..');
For Oracle, John's solution using the instr() function works. Here's a slightly different solution that worked:
SELECT id
FROM table1
WHERE id IN (1, 20, 45, 60)
ORDER BY instr('1, 20, 45, 60', id)
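One caveat with the plain instr() form: it does substring matching, so an id such as 2 is also found inside 20. A small sketch of the difference, using SQLite's instr() through Python's sqlite3 module (the three-row table is made up, and Oracle itself is not involved here):

```python
import sqlite3

# Hypothetical data chosen so that one id is a substring of another.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO table1 VALUES (?)", [(2,), (20,), (45,)])

wanted = [20, 45, 2]
id_list = ",".join(str(i) for i in wanted)  # '20,45,2'
placeholders = ", ".join("?" for _ in wanted)

# Naive form: the id 2 is found inside '20' at position 1, so it ties with 20.
# The extra ", id" only makes the tie deterministic for this demonstration.
naive = conn.execute(
    f"SELECT id FROM table1 WHERE id IN ({placeholders}) "
    "ORDER BY instr(?, id), id",
    wanted + [id_list],
).fetchall()

# Delimited form: search for ',id,' inside ',20,45,2,' so only whole ids match.
delimited = conn.execute(
    f"SELECT id FROM table1 WHERE id IN ({placeholders}) "
    "ORDER BY instr(?, ',' || id || ',')",
    wanted + ["," + id_list + ","],
).fetchall()
```

Wrapping both the list and the id in delimiters is the same trick as the `','||id||','` expression shown in the earlier answer.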
I just tried to do this in MS SQL Server, where we do not have FIELD():
SELECT table1.id
...
INNER JOIN
(VALUES (10,1),(3,2),(4,3),(5,4),(7,5),(8,6),(9,7),(2,8),(6,9),(5,10)
) AS X(id,sortorder)
ON X.id = table1.id
ORDER BY X.sortorder
Note that I am allowing duplication too.
Give this a shot:
SELECT name, description, ...
WHERE id IN
(SELECT id FROM table1 WHERE...)
ORDER BY
(SELECT display_order FROM table1 WHERE...),
(SELECT name FROM table1 WHERE...)
The WHEREs will probably take a little tweaking to get the correlated subqueries working properly, but the basic principle should be sound.
My first thought was to write a single query, but you said that was not possible because one is run by the user and the other is run in the background. How are you storing the list of ids to pass from the user to the background process? Why not put them in a temporary table with a column to signify the order.
So how about this:
1. The user interface bit runs and inserts values into a new table you create. It would insert the id, the position and some sort of job number identifier.
2. The job number is passed to the background process (instead of all the ids).
3. The background process does a select from the table in step 1 and joins in the other information that you require. It uses the job number in the WHERE clause and orders by the position column.
4. The background process, when finished, deletes from the table based on the job identifier.
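The steps above could look roughly like this. It is only a sketch with Python's sqlite3 module; the table name job_items, its columns and the two helper functions are hypothetical, chosen to match the description.

```python
import sqlite3

# Hypothetical schema: table1 holds the data, job_items holds the ordered ids.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE job_items (job_id INTEGER, position INTEGER, item_id INTEGER);
    INSERT INTO table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
""")

def enqueue(conn, job_id, ordered_ids):
    """User-interface side: store each id with its position and the job number."""
    conn.executemany(
        "INSERT INTO job_items (job_id, position, item_id) VALUES (?, ?, ?)",
        [(job_id, pos, item_id) for pos, item_id in enumerate(ordered_ids)],
    )

def process(conn, job_id):
    """Background side: join, order by position, then clean up the job rows."""
    rows = conn.execute(
        "SELECT t.id, t.name FROM job_items j "
        "JOIN table1 t ON t.id = j.item_id "
        "WHERE j.job_id = ? ORDER BY j.position",
        (job_id,),
    ).fetchall()
    conn.execute("DELETE FROM job_items WHERE job_id = ?", (job_id,))
    return rows

enqueue(conn, 7, [3, 1, 2])
result = process(conn, 7)
remaining = conn.execute("SELECT COUNT(*) FROM job_items").fetchone()[0]
```

Only the job number crosses from the user request to the background process, so the id list never has to be pasted into a query.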
I think you should store your data in a way that lets you simply do a join, so that no hacks or complicated things are needed.
For instance, I have a "Recently played" list of track ids, and on SQLite I simply do:
SELECT * FROM recently NATURAL JOIN tracks;
I have a database that is already in use, and I have to improve the performance of the system that uses it.
There are 2 major queries running about 1000 times in a loop, and these queries have inner joins to 3 other tables each. This makes the system very slow.
I tried removing the queries from the loop, fetching all the data only once and processing it in PHP. But this puts too much load on the memory (RAM), and the system hangs if 2 or more clients try to use it.
There is a lot of data in the tables even after removing the expired data.
I have attached the queries below.
Can anyone help me with this issue?
select * from inventory
where (region_id = 38 or region_id = -1)
and (tour_opp_id = 410 or tour_opp_id = -1)
and room_plan_id = 141 and meal_plan_id = 1 and bed_type_id = 1 and hotel_id = 1059
and FIND_IN_SET(supplier_code, 'QOA,QTE,QM,TEST,TEST1,MQE1,MQE3,PERR,QKT')
and ( ('2014-11-14' between from_date and to_date) )
order by hotel_id desc ,supplier_code desc, region_id desc,tour_opp_id desc,inventory.inventory_id desc
SELECT * ,pinfo.fri as pi_day_fri,pinfoadd.fri as pa_day_fri,pinfochld.fri as pc_day_fri
FROM `profit_markup`
inner join profit_markup_info as pinfo on pinfo.profit_id = profit_markup.profit_markup_id
inner join profit_markup_add_info as pinfoadd on pinfoadd.profit_id = profit_markup.profit_markup_id
inner join profit_markup_child_info as pinfochld on pinfochld.profit_id = profit_markup.profit_markup_id
where profit_markup.hotel_id = 1059 and (`booking_channel` = 1 or `booking_channel` = 2)
and (`rate_region` = -1 or `rate_region` = 128)
and ( ( period_from <= '2014-11-14' and period_to >= '2014-11-14' ) )
ORDER BY profit_markup.hotel_id DESC,supplier_code desc, rate_region desc,operators_list desc, profit_markup_id DESC
Since we have not seen your SHOW CREATE TABLE output and EXPLAIN EXTENDED plan, it is hard to give you a single answer.
But generally speaking, regarding your first query (which I rewrote below):
SELECT
    hotel_id, supplier_code, region_id, tour_opp_id, inventory_id
FROM
    inventory
WHERE
    region_id IN (38, -1)
    AND tour_opp_id IN (410, -1)
    AND room_plan_id = 141
    AND meal_plan_id = 1
    AND bed_type_id = 1
    AND hotel_id = 1059
    AND supplier_code IN ('QOA', 'QTE', 'QM', 'TEST', 'TEST1', 'MQE1', 'MQE3', 'PERR', 'QKT')
    AND '2014-11-14' BETWEEN from_date AND to_date
ORDER BY
    hotel_id DESC, supplier_code DESC, region_id DESC, tour_opp_id DESC, inventory_id DESC
Do not use * to get all the columns. List only the columns that you really need; using * is just a lazy way of writing a query, and limiting the columns limits the amount of data being selected.
How often are the records in the inventory table updated, inserted or deleted? If not too often, you can consider using SQL_CACHE. However, caching a query will cause you problems if the inventory table is updated very often. In addition, to use the query cache you must check the value of query_cache_type on your server with SHOW GLOBAL VARIABLES LIKE 'query_cache_type';. If it is set to 0, the cache feature is disabled and SQL_CACHE will be ignored. If it is set to 1, the server will cache all queries unless you tell it not to by using SQL_NO_CACHE. If it is set to 2, MySQL will cache a query only where the SQL_CACHE clause is used. See the documentation on query_cache_type. Note that the query cache was removed in MySQL 8.0, so this only applies to older MySQL versions and to MariaDB.
An index on the following columns, in this order, will help you: (hotel_id, supplier_code, region_id, tour_opp_id, inventory_id)
ALTER TABLE inventory
ADD INDEX (hotel_id, supplier_code, region_id, tour_opp_id, inventory_id);
If possible, increase sort_buffer_size on your server, as most likely your issue here is that you are doing too much sorting.
As for the second query (which I rewrote below):
SELECT
*, pinfo.fri as pi_day_fri,
pinfoadd.fri as pa_day_fri,
pinfochld.fri as pc_day_fri
FROM
profit_markup
INNER JOIN
profit_markup_info AS pinfo ON pinfo.profit_id = profit_markup.profit_markup_id
INNER JOIN
profit_markup_add_info AS pinfoadd ON pinfoadd.profit_id = profit_markup.profit_markup_id
INNER JOIN
profit_markup_child_info AS pinfochld ON pinfochld.profit_id = profit_markup.profit_markup_id
WHERE
profit_markup.hotel_id = 1059
AND booking_channel IN (1, 2)
AND rate_region IN (-1, 128)
AND period_from <= '2014-11-14'
AND period_to >= '2014-11-14'
ORDER BY
profit_markup.hotel_id DESC, supplier_code DESC, rate_region DESC,
operators_list DESC, profit_markup_id DESC
Again, eliminate the use of * from your query.
Make sure that the following columns have the same type, collation and size: pinfo.profit_id, profit_markup.profit_markup_id, pinfoadd.profit_id and pinfochld.profit_id. Each of them must also be indexed in its table. If the columns have different types, MySQL has to convert the data every time it joins the records, which is slower even with an index. Also, if those columns are character types (e.g. VARCHAR()), make them CHAR() with a collation of latin1_general_ci, as this is faster for finding IDs; if you are using INT(), even better.
Use the 3rd and 4th tricks I listed for the previous query.
Try using STRAIGHT_JOIN (you must know what you're doing here or it will bite you!). There is a good thread about this: When to use STRAIGHT_JOIN with MySQL.
I hope this helps.
For the first query, I am not sure if you can do much (assuming you have already indexed the fields you are ordering by) apart from replacing the * with column names (Don't expect this to increase the performance drastically).
For the second query, before you go through the loop and put in selection arguments, you could create a view with all the tables joined and ordered then make a prepared statement to select from the view and bind arguments in the loop.
Also, if your php server and the database server are in two different places, it is better if you did the selection through a stored procedure in the database.
(If nothing works out, then memcache is the way to go... Although I have personally never done this)
Here you need to improve query performance, not database performance.
For both queries, first check that indexes are available on the columns used in the WHERE and ON (join) clauses; if an index is missing, add it to improve query performance.
Check the EXPLAIN plan before creating an index.
If possible, show me the EXPLAIN plan of both queries; that will help us.
I have to run this MySQL query on my website to fetch a huge amount of data (3 tables, each with 100,000+ records):
SELECT on_resume.*, on_users.subscribed, on_users.user_avatar, on_resume_page.*
FROM on_resume
LEFT JOIN on_users ON (on_resume.resume_userid = on_users.user_id )
LEFT JOIN on_resume_page ON ( on_resume.resume_userid = on_resume_page.resume_userid)
WHERE on_resume.active= '1'
GROUP BY on_resume.rid
ORDER BY on_resume.rid DESC
LIMIT 0,18
When I run this in the phpMyAdmin SQL section, the whole mysqld service goes down and needs to be restarted.
While testing this query, I found that if I don't use the GROUP BY and ORDER BY clauses, the query runs fine.
SELECT on_resume.*, on_users.subscribed, on_users.user_avatar, on_resume_page.*
FROM on_resume
LEFT JOIN on_users ON (on_resume.resume_userid = on_users.user_id )
LEFT JOIN on_resume_page ON ( on_resume.resume_userid = on_resume_page.resume_userid)
WHERE on_resume.active= '1'
LIMIT 0,18
Showing rows 0 - 17 ( 18 total, Query took 0.4248 sec)
Why is it like this, and how can I fix it?
NOTE: I have tested the SQL query with GROUP BY or ORDER BY alone; even with just one of them, the query still fails and hangs the server.
EDIT: This problem was solved by indexing the column on_resume_page.resume_userid.
This is what I was told; it took a while to figure out:
Look at @jer in Chicago's comment
Remember, when there is a GROUP BY clause, there are certain rules that apply to grouping columns. One of those rules is "The Single-Value Rule": every column named in the SELECT list must also be a grouping column unless it is an argument to one of the set functions. MySQL extends standard SQL by allowing you to use columns or calculations in a SELECT list that don't appear in the GROUP BY clause. However, we are warned not to use this feature if the columns you omit from the GROUP BY clause are not unique within the group, because you will get unpredictable results.