INNER JOIN too slow. How can it be quicker?

My code:
SELECT * FROM andmed3 INNER JOIN test ON andmed3.isik like concat('%', test.isik, '%')
In andmed3 I have 130,000 rows and in test I have 10,000 rows, and the query won't finish.
When I limit it to 0,500, the query takes about 2-3 minutes.
How can I make it faster?
andmed3 table
id  name  number  isik     link  stat  else
-----------------------------------------------
1   john  15      1233213  none  11    5
                  8455666
                  7884555
test table
id isik
-----------
45 8455666
So I need all the rows from andmed3 whose isik contains a number that occurs in test.

The problem is that the engine will need to evaluate the LIKE expression for each pair of rows in the join (130,000 × 10,000, about 1.3 billion evaluations).
Indexes are also useless in this scenario, because the expression has to be evaluated for every pair in order to perform the join (and you cannot put that expression inside an index).
Maybe the real problem is your architecture/schema: no one anticipated the need to join two tables on a string expression.
Possible solution:
(It's a wild guess)
It's hard to tell for sure from your example, but if andmed3.isik contains all the possible values to be used in the join, you can try putting them in another table, like this:
Andmed3Id isik
--------- -------
1 1233213
1 8455666
1 7884555
Of course, you will need a strategy to populate this table. Possible ones are: on every insert/update, or in a batch job at some late hour.
If this suits you, you just need to add one more join to your query.
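The extra join described above can be sketched as follows. This is only an illustration under assumptions: it uses Python's built-in sqlite3 as a stand-in for MySQL, the link table name andmed3_isik and its index are hypothetical, and the second andmed3 row ("mary") is made-up sample data. The point is that the join becomes a plain equality on an indexed column instead of a LIKE with wildcards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Original tables, reduced to the columns that matter for the join.
cur.execute("CREATE TABLE andmed3 (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, isik TEXT)")

# Hypothetical link table: one row per (andmed3 row, isik value).
cur.execute("CREATE TABLE andmed3_isik (andmed3_id INTEGER, isik TEXT)")
cur.execute("CREATE INDEX idx_andmed3_isik ON andmed3_isik (isik)")

cur.executemany("INSERT INTO andmed3 VALUES (?, ?)", [(1, "john"), (2, "mary")])
cur.executemany(
    "INSERT INTO andmed3_isik VALUES (?, ?)",
    [(1, "1233213"), (1, "8455666"), (1, "7884555"), (2, "5550000")],
)
cur.execute("INSERT INTO test VALUES (45, '8455666')")

# Equality join through the link table; no LIKE '%...%' involved.
rows = cur.execute(
    """
    SELECT DISTINCT a.id, a.name
    FROM andmed3 a
    JOIN andmed3_isik ai ON ai.andmed3_id = a.id
    JOIN test t ON t.isik = ai.isik
    ORDER BY a.id
    """
).fetchall()
print(rows)  # [(1, 'john')]
```

DISTINCT is there because one andmed3 row could match several test rows once the data is split out.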

Related

How to count each item in array with sql

I have a user table which contains a membergroupids column, and it looks like this:
userid membergroupids
1 1,2
2 2,3
3 2,3,4
I want to use SQL to output a result like this:
membergroupid count
1 1
2 3
3 2
4 1
I tried using SELECT membergroupids FROM user and then looping through the result in PHP to get the counts. That works with a small user table, but I have a really big one, and the select query itself takes more than 1 minute to finish. Is there a better way to do this?

There is a much better way to do it. Your tables need to be normalized:
Instead of
userid membergroupids
1 1,2
2 2,3
3 2,3,4
It needs to be
userid membergroupids
1 1
1 2
2 2
2 3
3 2
3 3
3 4
From here, it's a simple query to get the counts (assuming this table is called your_table):
select membergroupids, count(userid) as numberofusers
from your_table
group by membergroupids
order by membergroupids
The real problem, then, is getting your tables normalized. If you have at most 9 membergroupids (single digits only), you could use like '%1%' to find all userids with membergroupid 1. With 10 or more, that pattern can't distinguish between 1 and 10. And sadly, you can't count on the commas to help you, because the first and last number in each list has a comma on only one side.
Unless...
Create a new field with the group ids encapsulated by commas
You could create a new field and populate it with membergroupids surrounded by commas, using concat (check your database's docs). Something along these lines:
update your_table set temp=concat(',', membergroupids, ',');
This could give you a table structure like so:
userid membergroupids temp
1 1,2 ,1,2,
2 2,3 ,2,3,
3 2,3,4 ,2,3,4,
Now you can match individual member group ids in the new field, i.e. where temp like '%,1,%' finds the userids with membergroupid 1, because every id is now encapsulated by commas. With that, you can build your new normalized table, which I'll call user_member.
Insert membergroupid 1:
insert into user_member (userid,membergroupid) select userid,'1' from your_table where temp like '%,1,%';
You could write a PHP script that loops through all the membergroupids.
Keep in mind that like '%...%' is not very efficient, so don't even think about relying on it to do your count. It'll work, but it's not scalable. It is much better to use it once to build the normalized table.
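A rough sketch of that one-off migration, under assumptions: Python's sqlite3 stands in for MySQL and the PHP loop, SQLite's || operator replaces CONCAT, the comma-wrapped value is computed inline rather than stored in a temp column, and the list of group ids (1 to 4) is taken from the sample data in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE your_table (userid INTEGER, membergroupids TEXT)")
cur.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(1, "1,2"), (2, "2,3"), (3, "2,3,4")],
)
cur.execute("CREATE TABLE user_member (userid INTEGER, membergroupid INTEGER)")

# Assumption: the set of possible group ids is known up front.
for group_id in (1, 2, 3, 4):
    # Wrap the list in commas so ',1,' cannot match inside ',10,'.
    cur.execute(
        """
        INSERT INTO user_member (userid, membergroupid)
        SELECT userid, ? FROM your_table
        WHERE ',' || membergroupids || ',' LIKE ?
        """,
        (group_id, f"%,{group_id},%"),
    )

# With the normalized table in place, the count is a plain GROUP BY.
counts = cur.execute(
    """
    SELECT membergroupid, COUNT(*) FROM user_member
    GROUP BY membergroupid ORDER BY membergroupid
    """
).fetchall()
print(counts)  # [(1, 1), (2, 3), (3, 2), (4, 1)]
```

The slow LIKE scan runs once per group id during the migration; after that, only user_member is queried.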

This is easy to do IF the data is structured with one membergroupid per row:
SELECT `membergroupids`, COUNT(`membergroupids`) AS CountOfMembergroupids
FROM `TBL_TEST01`
GROUP BY `membergroupids`
ORDER BY `membergroupids`
Since you mention that you have to process a large amount of data, I'd strongly suggest revising your table structure that way.

SQL finding specific character in table

I have a table like this
d_id | d_name | d_desc | sid
1 |flu | .... |4,13,19
Here sid is a VARCHAR. What I want is that when the user enters 4 or 13 or 19, it displays flu. However, my query only works when the user selects all of those values. Here is my query:
SELECT * FROM diseases where sid LIKE '%sid1++%'
I work with PHP and use a for loop to put the sid values inside the LIKE pattern; I just wrote sid++ above to keep it simple. My query only works when all of the values are present. If, say, the user selects 4 and 19, the pattern becomes '%4,19%' and nothing is displayed. Thanks all.

If you must do what you ask for, you can try to use FIND_IN_SET().
SELECT d_id, d_name, d_desc
FROM diseases
WHERE FIND_IN_SET(13,sid)<>0
But this query will not be sargable, so it will be outrageously slow if your table contains more than a few dozen rows. And the ICD10 list of disease codes contains almost 92,000 rows. You don't want your patient to die or get well before you finish looking up her disease. :-)
So, you should create a separate table. Let's call it diseases_sid.
It will contain two columns. For your example the contents will be
d_id sid
1 4
1 13
1 19
If you want to find a row from your diseases table by sid, do this.
SELECT d.d_id, d.d_name, d.d_desc
FROM diseases d
JOIN diseases_sid ds ON d.d_id = ds.d_id
WHERE ds.sid = 13
That's what my colleagues are talking about in the comments when they mention normalization.
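For the case in the question where the user selects several values (say 4 and 19), the lookup against the separate table could look roughly like this. It is a sketch under assumptions: Python's sqlite3 stands in for MySQL and PHP, find_diseases is a hypothetical helper name, and the index on diseases_sid.sid is my addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE diseases (d_id INTEGER PRIMARY KEY, d_name TEXT)")
cur.execute("CREATE TABLE diseases_sid (d_id INTEGER, sid INTEGER)")
cur.execute("CREATE INDEX idx_diseases_sid ON diseases_sid (sid)")
cur.execute("INSERT INTO diseases VALUES (1, 'flu')")
cur.executemany("INSERT INTO diseases_sid VALUES (?, ?)", [(1, 4), (1, 13), (1, 19)])


def find_diseases(selected_sids):
    """Return diseases linked to at least one of the selected sids."""
    if not selected_sids:
        return []
    # One bound parameter per selected value; values are never pasted into the SQL.
    placeholders = ", ".join("?" for _ in selected_sids)
    sql = f"""
        SELECT DISTINCT d.d_id, d.d_name
        FROM diseases d
        JOIN diseases_sid ds ON d.d_id = ds.d_id
        WHERE ds.sid IN ({placeholders})
    """
    return cur.execute(sql, list(selected_sids)).fetchall()


print(find_diseases([4, 19]))  # [(1, 'flu')]
print(find_diseases([7]))      # []
```

DISTINCT keeps flu from appearing twice when two of the selected sids both point to it.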

Longest Prefix between two MySQL Tables

I have a MySQL database with 2 tables:
Table A:
Number
Location
Table B:
Calling Code
Area Code
Location
Initially, I have about 60,000 entries in table A, whose Location column is empty at the beginning. In table B I have 250,000+ entries with a lot of area codes, calling codes (1, 011) and their respective locations in the world. What I want is a FAST way of populating table A's Location column with the location of each number.
So, for example, if the first entry in table A is (17324765600, null), I want to read through table B and get the location for that number. Right now I am getting the location of a number with this query:
SELECT b.location
FROM
tableB b
LEFT JOIN tableA a
ON a.number LIKE CONCAT(b.calling_code, b.code, '%')
ORDER BY CHAR_LENGTH(b.code) DESC
LIMIT 1;
That gives me the proper location (although I suspect it can fail in some cases). The problem is that performance-wise this method is a no-go. If I loop over all the 50k number
Update 1
Allow me to put some sample data with the expected output:
Sample Table A:
number location
17324765600 NULL
01134933638950 NULL
0114008203800 NULL
…60k Records + at the moment..
Sample Table B:
calling_code code location
1 7324765 US-NJ
011 34933 Spain
011 400820 China
…250,000+ records at the moment
Expected output after the processing:
Table A:
number location
17324765600 US-NJ
01134933638950 Spain
0114008203800 China
The best I’ve come up with is the following update statement:
UPDATE tableA a JOIN tableB b ON a.number LIKE CONCAT(b.calling_code, b.code, '%') SET a.location = b.location
Of course, I am not sure this will always pick the longest matching prefix. For example, suppose the tables above also contained another code starting with 73247XX, say for Iowa (just as an example). I am not sure the query would always use the longest code, so I would also need help with that.
Let me know if the samples help.
Update 2:
I am thinking of doing this the following way:
Before inserting the data into table A, I would export table B to a CSV sorted by area code. That way I can keep two pointers, one into the array of entries for table A and one into the CSV, both sorted by area code, and do a kind of parallel search that fills in each entry's location in PHP instead of in MySQL.
Let me know if this approach seems like a better option. If so, I will test it out and publish the answer.

If you want all locations, then you need to remove the LIMIT:
SELECT b.location
FROM
tableB b
LEFT JOIN tableA a
ON a.number LIKE CONCAT(b.calling_code, b.code, '%')
ORDER BY CHAR_LENGTH(b.code);
If you don't want the same location name to appear twice, then you need to use GROUP BY:
SELECT b.location
FROM
tableB b
LEFT JOIN tableA a
ON a.number LIKE CONCAT(b.calling_code, b.code, '%')
GROUP BY b.location ORDER BY CHAR_LENGTH(b.code) ;

You only have one join over 250,000 records, which is not that demanding. Index the columns you search on and tune your MySQL server; good indexing and well-chosen server variables should help a lot. Problems generally arise when there are many joins and many string comparisons.
I think you need a query like this:
UPDATE a SET a.location = (
SELECT location from b
WHERE a.number LIKE CONCAT(b.calling_code, b.area_code, '%')
ORDER BY LENGTH(CONCAT(b.calling_code, b.area_code, '%')) desc
limit 1
);
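To check that a correlated subquery of this shape really picks the longest prefix, here is a small sketch run against the sample data from the question, plus the hypothetical shorter "Iowa" prefix (1 / 73247) that the question mentions. Assumptions: Python's sqlite3 stands in for MySQL, || replaces CONCAT, the table and column names follow the question's tableA/tableB, and numbers are stored as text so leading zeros survive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE tableA (number TEXT, location TEXT)")
cur.execute("CREATE TABLE tableB (calling_code TEXT, code TEXT, location TEXT)")
cur.executemany(
    "INSERT INTO tableA VALUES (?, NULL)",
    [("17324765600",), ("01134933638950",), ("0114008203800",)],
)
cur.executemany(
    "INSERT INTO tableB VALUES (?, ?, ?)",
    [
        ("1", "7324765", "US-NJ"),
        ("1", "73247", "Iowa"),  # shorter competing prefix (hypothetical)
        ("011", "34933", "Spain"),
        ("011", "400820", "China"),
    ],
)

# For every number, take the matching prefix with the most characters.
cur.execute(
    """
    UPDATE tableA
    SET location = (
        SELECT b.location
        FROM tableB b
        WHERE tableA.number LIKE b.calling_code || b.code || '%'
        ORDER BY LENGTH(b.calling_code || b.code) DESC
        LIMIT 1
    )
    """
)
result = dict(cur.execute("SELECT number, location FROM tableA").fetchall())
print(result["17324765600"])  # US-NJ
```

This only demonstrates correctness on a handful of rows; it still compares every number against every prefix, so it says nothing about speed at 60k × 250k rows.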

I decided to take the approach below, since I did not receive any clear response:
Prior to the process I prepared 2 new tables, a table for country codes and a table for state codes (since I also need to know the state when the number is within the US). Both tables have: country, state, calling_code, code …
For these 2 tables I broke down all the numbers with their prefixes and grouped them by area code. Instead of needing the full 6 digits to identify a country/state, I grouped them by the first 3 digits and by whether the code is within the USA or not, hence the 2 tables.
With these modifications I was able to shrink the 250,000+ row table to only about 300 rows (per table).
After this I follow these steps:
I get the list of phone numbers.
I first execute a query very similar to the one I posted, to update all the numbers that match the country_code table.
I then update the rows that still have no location assigned, using the state_code table.
I had to set up a cron job to run this every x amount of time, so that a huge backlog of phone numbers never builds up.
This may not be the best approach, but for the 50k numbers in place at the moment I was able to get it down to about 10 seconds (executing the queries manually one by one, with some more polishing). Running it every x amount of time, which keeps each run under 10k numbers, should make this go smoothly.
I will mark this as the answer, but if someone else magically comes up with a better answer I will make sure to update this.
Divide and conquer!

Repeated Insert copies on ID

We have records with a count field on a unique id.
The columns are:
mainId = unique
mainIdCount = 1320 (this 'views' field gets a + 1 when the page is visited)
How can you insert all these mainIdCounts as separate records in another table IN ANOTHER DATABASE in one query?
Yes, I do mean 1320 inserts with the same mainId! :-)
We actually have records where an id goes over 10,000. It just has to be like this.
This is a weird one, but we do need a copy for every one of these counts.

The most straightforward way to do this is with a JOIN operation between your table and another row source that provides a set of integers. We'd match each row from our original table to as many rows from the set of integers as needed to produce the desired result.
As a brief example of the pattern:
INSERT INTO newtable (mainId, n)
SELECT t.mainId
     , r.n
  FROM mytable t
  JOIN ( SELECT 1 AS n
         UNION ALL SELECT 2
         UNION ALL SELECT 3
         UNION ALL SELECT 4
         UNION ALL SELECT 5
       ) r
 WHERE r.n <= t.mainIdCount
If mytable contains the row mainId=5, mainIdCount=4, we'd get back the rows (5,1),(5,2),(5,3),(5,4).
Obviously, the row source r needs to be of sufficient size. The inline view I've demonstrated here returns a maximum of five rows. For larger sets, it would be beneficial to use a table rather than an inline view.
This leads to the follow-up question, "How do I generate a set of integers in MySQL?",
e.g. "Generating a range of numbers in MySQL".
Getting that done is a bit tedious. We're looking forward to an eventual feature in MySQL that will make it much easier to return a bounded set of integer values; until then, having a pre-populated table is the most efficient approach.
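The pre-populated table variant could be sketched like this. Assumptions: Python's sqlite3 stands in for MySQL, the helper table name numbers is hypothetical, both tables live in one in-memory database for brevity, and the second source row (6, 2) is made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE mytable (mainId INTEGER PRIMARY KEY, mainIdCount INTEGER)")
cur.executemany("INSERT INTO mytable VALUES (?, ?)", [(5, 4), (6, 2)])
cur.execute("CREATE TABLE newtable (mainId INTEGER, n INTEGER)")

# Hypothetical helper table holding 1..N, where N covers the largest count.
cur.execute("CREATE TABLE numbers (n INTEGER PRIMARY KEY)")
max_count = cur.execute("SELECT MAX(mainIdCount) FROM mytable").fetchone()[0]
cur.executemany(
    "INSERT INTO numbers VALUES (?)", [(i,) for i in range(1, max_count + 1)]
)

# Each source row is matched to as many integers as its count.
cur.execute(
    """
    INSERT INTO newtable (mainId, n)
    SELECT t.mainId, r.n
    FROM mytable t
    JOIN numbers r ON r.n <= t.mainIdCount
    """
)
copies = cur.execute("SELECT mainId, n FROM newtable ORDER BY mainId, n").fetchall()
print(copies)  # [(5, 1), (5, 2), (5, 3), (5, 4), (6, 1), (6, 2)]
```

The numbers table is filled once and can be reused for every run, as long as it holds at least as many rows as the largest mainIdCount.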

MYSQL UNION across 3 tables with ORDER BY

We have this statement:
(SELECT res_bev.bev_id, property_value.name AS priority
FROM res_bev, bev_property, property_value
WHERE res_bev.res_id='$resIn'
AND bev_property.bev_id=res_bev.bev_id
AND bev_property.type_id='23'
AND property_value.id=bev_property.val_id)
UNION
(SELECT res_bev.bev_id, property_value.name as priority
FROM res_bev, bev_property, property_value
WHERE res_bev.res_id='$resIn'
AND bev_property.bev_id=res_bev.bev_id
AND bev_property.type_id='22'
AND property_value.id=bev_property.val_id)
We have three tables:
Res_bev
res_id | bev_id | id
Bev_property
type_id | val_id | bev_id | id
Property_value
name | id
What I am looking for is the results ordered by glass price (type_id='23') and then bottle price (type_id='22'). However, the union seems to include duplicates, because the first select returns, say, 3456 | 7.5 and the second returns 3456 | 55, since the price per glass is 7.5 and the price per bottle is 55. How can I eliminate these duplicates from the second SQL statement and return an ordered table?
I also played with building a pseudo-table via left joins, of the form bev_id | price/Glass | price/Bottle. However, since this should be able to expand to multiple price types, I figured a UNION would be more efficient. Just a push in the right direction would be helpful.

You can do it in one query by matching bev_property.type_id against an IN() clause with both values inside.
To avoid returning the same row twice, use SELECT DISTINCT. Note that DISTINCT applies to the whole selected row, so a bev_id with two different prices is still returned once per price.
To order them, just add an appropriate descending ORDER BY clause. This puts the type_id 23 rows ahead of the type_id 22 rows. (Databases never return anything in a specific order unless you tell them to. Some might have an internal convention, or it might appear that they do, but this is never guaranteed to be repeatable unless you specify an ORDER BY clause in your SELECT statement.)
SELECT DISTINCT res_bev.bev_id, property_value.name AS priority
FROM res_bev, bev_property, property_value
WHERE res_bev.res_id='$resIn'
AND bev_property.bev_id=res_bev.bev_id
AND bev_property.type_id IN ('23','22')
AND property_value.id=bev_property.val_id
ORDER BY bev_property.type_id DESC;
A UNION won't really be faster, since you'd have to do the whole lookup twice. If this field isn't indexed, you'll do two full table traversals that each match against 1 element, as opposed to 1 table traversal that matches against 2 elements. (Walking over a whole table is what's generally slow, not matching simple elements against each other.)
When properly indexed, I think you might have a tiny overhead from executing a second select query and running the query analyzer again, but I don't know for sure. It'll probably be smart enough to recognise the similarities between the queries, so it won't matter.
It never hurts to try on your specific database, though. Whenever you try query optimisation with different statements, run them with EXPLAIN; this will show you what the query will be doing and whether it'll go over whole tables, sort data on file, etc.

Unless I'm missing something, or something is missing from your question:
SELECT res_bev.bev_id,
property_value.name AS priority
FROM res_bev, bev_property, property_value
WHERE res_bev.res_id='$resIn'
AND bev_property.bev_id=res_bev.bev_id
AND (bev_property.type_id='23' OR bev_property.type_id='22')
AND property_value.id=bev_property.val_id
ORDER BY bev_property.type_id DESC
PS: if you want to order a union,
try something along the lines of
Select * from
(
select ...
Union
select ...
) somenameforqryinparentheses
Order by Somecolumn1, somecolumn2
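If the goal is one row per bev_id, with the glass price preferred and the bottle price used only as a fallback, one possible shape is a NOT EXISTS filter on the bottle rows. This is a different technique from the queries above, and it rests on my reading of the question. Further assumptions: Python's sqlite3 stands in for MySQL, res_id 1 and bev_id 3457 with price '40' are made-up sample data, and prices are kept as text in property_value.name as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript(
    """
    CREATE TABLE res_bev (id INTEGER PRIMARY KEY, res_id INTEGER, bev_id INTEGER);
    CREATE TABLE bev_property (id INTEGER PRIMARY KEY, type_id INTEGER,
                               val_id INTEGER, bev_id INTEGER);
    CREATE TABLE property_value (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO res_bev (res_id, bev_id) VALUES (1, 3456), (1, 3457);
    INSERT INTO property_value (id, name) VALUES (1, '7.5'), (2, '55'), (3, '40');
    -- 3456 has a glass (23) and a bottle (22) price; 3457 only a bottle price.
    INSERT INTO bev_property (type_id, val_id, bev_id) VALUES
        (23, 1, 3456), (22, 2, 3456), (22, 3, 3457);
    """
)

# Keep glass rows; keep a bottle row only when that bev_id has no glass row.
rows = cur.execute(
    """
    SELECT rb.bev_id, pv.name AS priority
    FROM res_bev rb
    JOIN bev_property bp ON bp.bev_id = rb.bev_id
    JOIN property_value pv ON pv.id = bp.val_id
    WHERE rb.res_id = ?
      AND (bp.type_id = 23
           OR (bp.type_id = 22
               AND NOT EXISTS (SELECT 1 FROM bev_property g
                               WHERE g.bev_id = rb.bev_id
                                 AND g.type_id = 23)))
    ORDER BY bp.type_id DESC, rb.bev_id
    """,
    (1,),
).fetchall()
print(rows)  # [(3456, '7.5'), (3457, '40')]
```

The res_id is passed as a bound parameter here rather than interpolated into the SQL string the way '$resIn' is in the question.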
