Key press autocomplete queries to a 3 million rows table - php

I have a table fs_city with 3 million cities from all around the world, as well as a table fs_country.
When the user visits the website, it detects their country code. The user is then required to select their country and city from a box that looks up cities in fs_city on key press. For example, if the user is from the USA and types "Ne", they get a drop-down list containing "New York". This is how I implement the autocomplete input.
The problem is that a query is sent on every key press; in my example, typing "Ne" sends two queries to the table fs_city.
Also, even a single query takes 6 seconds to return a response from that table... My table has primary keys.
This is my SQL query:
SELECT
ci.city_id,
ci.country_code,
ci.city,
ci.region,
co.country_name
FROM
fs_city as ci,
fs_country as co
WHERE
ci.city LIKE "Ne%"
AND co.country = ci.country_code
AND ci.country_code = :country
ORDER BY
ci.city ASC
LIMIT 0,20
How can I create an autocomplete feature (with keypress), and how to speed up fs_city table queries?
Updated :
fs_city primary key : city_id
fs_country primary key : country_id
Engine : InnoDB

jQuery UI has a nice Autocomplete widget that does exactly what you are looking for. Its minLength option makes it start querying the database only after a set number of characters has been typed, and its delay option waits a short time after the last keystroke before sending the request, so you send fewer queries and get smaller result sets. That should help with your speed problem and shorten your JavaScript. http://jqueryui.com/autocomplete/
EDIT: MySQL also has a LIMIT clause to cap the number of rows returned, which might also help.
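On the database side, a composite index is the usual way to speed up a query shaped like the one in the question. The following is only a sketch: the column names come from the question, the index names are made up, and 'US' stands in for the :country parameter.

```sql
-- Assumption: index names are hypothetical; columns are the ones used in the question.
-- Equality on country_code followed by a prefix match on city lets a B-tree index
-- narrow the search to a small range instead of scanning all 3 million rows.
ALTER TABLE fs_city ADD INDEX idx_country_city (country_code, city);

-- The join looks up fs_country by co.country, which the question does not list as indexed.
ALTER TABLE fs_country ADD INDEX idx_country (country);

-- Check the plan: ideally the fs_city row shows idx_country_city in the "key" column.
EXPLAIN
SELECT ci.city_id, ci.country_code, ci.city, ci.region, co.country_name
FROM fs_city AS ci
JOIN fs_country AS co ON co.country = ci.country_code
WHERE ci.country_code = 'US'
  AND ci.city LIKE 'Ne%'
ORDER BY ci.city ASC
LIMIT 0, 20;
```

Because the pattern 'Ne%' has no leading wildcard, the index can serve both the filter and the ORDER BY; a pattern such as '%Ne%' could not use it this way.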

Related

Speed-up/Optimise MySQL statement - finding a new row that hasn't been selected before

First a bit of background about the tables & DB.
I have a MySQL db with a few tables in:
films:
Contains all film/series info with netflixid as a unique primary key.
users:
Contains user info; "ratingid" is a unique primary key.
rating:
Contains ALL user rating info: netflixid, and a unique primary key that is a compound of "netflixid-userid".
This statement works:
SELECT *
FROM films
WHERE
INSTR(countrylist, 'GB')
AND films.netflixid NOT IN (SELECT netflixid FROM rating WHERE rating.userid = 1)
LIMIT 1
but it takes longer and longer to retrieve a new film record that you haven't rated. (currently at 6.8 seconds for around 2400 user ratings on an 8000 row film table)
First I thought it was the INSTR(countrylist, 'GB'), so I split them out into their own tinyint columns - made no difference.
I have tried NOT EXISTS as well, but the times are similar.
Any thoughts/ideas on how to select a new "unrated" row from films quickly?
Thanks!
Try just joining?
SELECT *
FROM films
LEFT JOIN rating on rating.ratingid=CONCAT(films.netflixid,'-',1)
WHERE
INSTR(countrylist, 'GB')
AND rating.ratingid IS NULL
LIMIT 1
Or doing the equivalent NOT EXISTS.
I would recommend not exists:
select *
from films f
where
instr(countrylist, 'GB')
and not exists (
select 1 from rating r where r.userid = 1 and f.netflixid = r.netflixid
)
If the primary key of the rating table is a true composite key on (netflixid, userid), the subquery can use it and executes quickly. If it is a single concatenated string column, the subquery needs a separate index on (userid, netflixid) instead.
That said, the instr() function in the outer query is also a bottleneck. The database cannot take advantage of an index here because of the function call: it has to apply the computation to every row of the table before it can filter. To avoid this, you would probably need to review your design: that is, have a separate table to represent the relationship between movies and countries, with each tuple on a separate row; then you could use another exists subquery to filter on the country.
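As a rough sketch of that redesign (the table name film_country, the column types and the index name are assumptions for illustration, not taken from the question):

```sql
-- Assumption: film_country, its column types and idx_user_film are hypothetical.
-- One row per (film, country) pair replaces the countrylist column.
CREATE TABLE film_country (
    netflixid    INT     NOT NULL,
    country_code CHAR(2) NOT NULL,
    PRIMARY KEY (country_code, netflixid)
);

-- Lets the NOT EXISTS subquery look up one user's rating of one film directly.
ALTER TABLE rating ADD INDEX idx_user_film (userid, netflixid);

SELECT f.*
FROM films AS f
WHERE EXISTS (
        SELECT 1
        FROM film_country AS fc
        WHERE fc.netflixid = f.netflixid
          AND fc.country_code = 'GB'
      )
  AND NOT EXISTS (
        SELECT 1
        FROM rating AS r
        WHERE r.userid = 1
          AND r.netflixid = f.netflixid
      )
LIMIT 1;
```

Both subqueries are now plain equality lookups, so neither needs a function call on an indexed column.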
The INSTR(countrylist, 'GB') could be changed to countrylist = 'GB', or to countrylist LIKE '%GB%' if countrylist contains more than one country.
Also, don't select all columns with '*' if you only need some of them. Depending on the number of columns, the query could be really slow.

MySQL summary from columns

I need to sum columns together on each row, like a leaderboard. This is how it looks:
Name  | country | track 1 | track 2 | track 3 | Total
John  | ENG     | 32      | 56      | 24      |
Peter | POL     | 45      | 43      | 35      |
There are two issues here. I could use
update 'table' set Total = track 1 + track 2 + track 3
BUT it's not always 3 tracks; it can be anywhere from 3 to 20.
Second, if I don't SUM it in MySQL, I cannot sort it when I present the data in HTML/PHP.
Or is there some other smart way to build leaderboards?
You need to redesign your table to have columns for name, country, track number and data. Then, instead of having a wide table with just 3 track numbers, you have a tall, thin table with each row being the data for a given name, country and track.
Then you can summarise using something like
SELECT
country,
name,
sum(data) as total
FROM trackdata
GROUP BY
name,
country
ORDER BY
sum(data) desc
Take a look here where I have made a SQL fiddle showing this working the way you want it
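For reference, a minimal sketch of a table the query above could run against. The column types are guesses; the table and column names follow the query, and the sample rows are the two players from the question.

```sql
-- Assumption: column types are guesses; names follow the summary query above.
CREATE TABLE trackdata (
    name    VARCHAR(50) NOT NULL,
    country CHAR(3)     NOT NULL,
    track   INT         NOT NULL,
    data    INT         NOT NULL,
    PRIMARY KEY (name, country, track)  -- one value per name, country and track
);

-- The two rows from the question, stored as one row per track.
INSERT INTO trackdata (name, country, track, data) VALUES
    ('John',  'ENG', 1, 32), ('John',  'ENG', 2, 56), ('John',  'ENG', 3, 24),
    ('Peter', 'POL', 1, 45), ('Peter', 'POL', 2, 43), ('Peter', 'POL', 3, 35);
```

With these rows the summary query returns Peter with a total of 123 ahead of John with 112, and adding a fourth or twentieth track is just another INSERT.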
Depending upon your expected data, however, you might really be better off having a separate table for Country, where each country name appears only once (and maybe also one for name). For example, if John is always associated with ENG, then you have a repeating group, and it is better to remove that association from the table above, which is really about scores on a track and not about who is in what country, and put it into its own table, which is then joined to the track data.
A full solution might have the following tables
**Athlete**
athlete_id
athlete_name
(other data about athletes)
**Country**
country_id
country_name
(other data about countries)
**Track**
Track_id
Track_number
(other data about tracks)
**country_athlete** (this joining table allows for the one-to-many relationship of one country having many athletes)
country_athlete_id
country_id
athlete_id
**Times**
country_athlete_id <--- this identifies a given combination of athlete and country
track_id <--- this identifies the track
data <--- this is where you store the actual time
It can get more complex depending on your data, eg can the same track number appear in different countries? if so then you need another joining table to join one track number to many countries.
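With tables laid out as in the list above, the leaderboard could be produced with a join along these lines. This is a sketch that assumes the tables and columns are named exactly as listed, which may not match a real schema.

```sql
-- Assumption: table and column names are exactly those in the list above.
SELECT a.athlete_name,
       c.country_name,
       SUM(t.data) AS total
FROM Times AS t
JOIN country_athlete AS ca ON ca.country_athlete_id = t.country_athlete_id
JOIN Athlete AS a ON a.athlete_id = ca.athlete_id
JOIN Country AS c ON c.country_id = ca.country_id
GROUP BY ca.country_athlete_id, a.athlete_name, c.country_name
ORDER BY total DESC;
```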
Alternatively, even with the poor design of my SQL fiddle example, it might be good to make name,country and track a primary key so that you can only ever have one 'data' value for a given combination of name, country and track. However, this decision, and that of normalising your table into multiple joined tables would be based upon the data you expect to get.
But either way as soon as you say 'I don't know how many tracks there will be' then you should start thinking 'each track's data appears in one ROW and not one COLUMN'.
Like others mentioned, you need to redesign your database. You need a one-to-many relationship between your Leaderboard table and a new Tracks table. This means that one user can have many tracks, with each track being represented by a record in the Tracks table.
These two tables should be connected by a foreign key; in this case it could be a user_id field.
The total field in the leaderboard table could be updated every time a new track is inserted or updated, or you could have a query similar to the one you wanted. Here is what such a query could look like:
UPDATE leaderboard SET total = (
SELECT SUM(track) FROM tracks WHERE user_id = leaderboard.user_id
)
I recommend you read about database relationships, here is a link:
https://code.tutsplus.com/articles/sql-for-beginners-part-3-database-relationships--net-8561
I still get a lot of issues with this... I don't think that the issue is the database though; I think it's more the way I present the data on the web.
I'm able to get all the data etc. The only thing is that my table is not filling up the right way.
What I do now is like: "SELECT * FROM `times` NATURAL JOIN `players`"
Then <?php foreach... ?>
<tr>
<td> <?php echo $row['playerID']; ?> </td>
<td> <?php echo $row['Time']; ?> </td>
....
The thing is, it's hard to get sorting, ordering and SUM all at once with this static table solution.
I searched around for leaderboards and I really don't understand how they build theirs with active ordering etc., like https://www.pgatour.com/leaderboard.html
How do they build leaderboards like that? With sorting and everything.
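One common approach is to let the database do the summing and ordering, so the PHP loop only prints rows that are already sorted. A minimal sketch, assuming times has playerID and Time columns (as in the PHP snippet) and players shares the playerID column:

```sql
-- Assumption: only playerID and Time are known from the snippet above;
-- everything else about the schema is a guess.
SELECT p.playerID,
       SUM(t.`Time`) AS total
FROM players AS p
JOIN times AS t ON t.playerID = p.playerID
GROUP BY p.playerID
ORDER BY total ASC;   -- lowest total first; use DESC if a higher score is better
```

Interactive re-sorting in the browser, as on sites like the one linked, is typically done client-side on top of a result like this.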

MySQL performance with large WHERE IN() clause

Let's say we have a table with 4 columns: id (int 11, indexed), title, content, category (varchar 5).
I have a user select a category. Each category can contain up to 999 objects. Using SELECT id FROM table WHERE category = ? I get a list of all objects.
I then have the user select/deselect some of the objects. After which I need to select the content of the remaining selected objects.
Now my question is as follows, should I worry about performance when using SELECT content FROM table WHERE id IN($array)? Would it be better to use SELECT content FROM table WHERE category = ? AND id IN($array). The idea here being I filter it down to 999 objects before performing the IN...
Does this make any sense? Or should I not be using the IN() at all?
It sounds like you always have content showing on the screen?
999 is a long list to put on the screen. Re-think your UI.
When an object is selected/deselected, what happens? Do you gray out the content? If so, that is a UI issue, not a database issue. If you store the subset that is currently "selected", then how/where is that stored? And do you want to store it after each select/deselect, or wait until the user clicks "Submit"?
In other words, I don't see why this is a database question.
Back to the queries in question:
INDEX(category)
SELECT ... FROM tbl WHERE category = ...; -- This is optimal
PRIMARY KEY(id)
SELECT ... FROM tbl WHERE id IN (...); -- optimal for an arbitrary set
INDEX(category, id)
SELECT ... FROM tbl WHERE category = ... AND id IN (...)
-- use this only if both parts are needed for filtering
-- not for optimizing
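If in doubt, the plan for each variant can be checked with EXPLAIN. In this sketch tbl stands for the real table, and the id values and 'abc' are placeholders:

```sql
-- Assumption: tbl, the id list and the category value are placeholders.
EXPLAIN SELECT content FROM tbl WHERE id IN (3, 17, 42);
EXPLAIN SELECT content FROM tbl WHERE category = 'abc' AND id IN (3, 17, 42);
```

The "key" and "rows" columns of the output show which index was chosen and roughly how many rows MySQL expects to examine.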

Select random row where condition applies, effectively

I have a table of names with a structure like this:
id int(11) -Auto Increment PK
name varchar(20)
gender varchar(10)
taken tinyint - bool value
I want to get a random name from a single row where gender is, say, male and taken is false. How can I do that without slowing things down?
What comes to mind is to SELECT all the rows where gender = male and taken = false, then use PHP's rand(1, total_rows) and take the name from that randomly chosen record in the array of results.
Or I can use the following, but ORDER BY RAND() is going to slow down the process (taken from other questions on Stack Overflow):
SELECT * FROM xyz WHERE (`long`='0' AND lat='0') ORDER BY RAND() LIMIT 1
You can take the following approach:
1. Select the list of ids that meet your criteria, like SELECT id FROM table WHERE name=...
2. Choose an id randomly with PHP.
3. Fetch the whole row with that id, like SELECT * FROM table WHERE id=<id>
This approach makes the most of the query cache in MySQL (on versions that still have one; the query cache was removed in MySQL 8.0). The final fetch-by-id query has a good chance of hitting the same id again, in which case the query cache can speed up database access. Furthermore, if a cache like memcached or Redis is used in the future, that fetch can also be served by it without even going to the database.
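Applied to the table from the question, the queries might look like the sketch below. The table name first_names and the index name are made up for illustration, and 42 stands for whichever id PHP picked.

```sql
-- Assumption: first_names and idx_gender_taken are hypothetical names.
ALTER TABLE first_names ADD INDEX idx_gender_taken (gender, taken);

-- Fetch only the ids of the candidates; the index above covers this filter.
SELECT id FROM first_names WHERE gender = 'male' AND taken = 0;

-- PHP then picks one id at random from that list.

-- Fetch the chosen row by primary key (42 is a placeholder).
SELECT name FROM first_names WHERE id = 42;
```

Fetching only ids keeps the first result set small, and the second query is a single primary-key lookup.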

mySQL query to get only display

I wanted to ask one question as my query skills are not that great and I have been learning mySQL for the last week. This attachment I have shows what happens when I run the following query:
SELECT * FROM clothing, sizing WHERE id = "101";
You might notice that it produces the same id number, same name, same type, same brand_id, same price, and a lot of null values. Is there a query I can run that only displays columns which do not have null values?
You can select the rows that don't have null values in given columns, or you can use IFNULL.
IFNULL(yourColumn,0)
This will display 0 instead of NULL, but beware that NULL and 0 are not the same thing.
NULL is "nothing" / undefined; 0 is a numerical value.
Multiplying by NULL gives NULL, so you can do, for instance:
SELECT (numProducts * IFNULL(productPrice,0))
FROM ...
You can also use CASE or IF to select different columns and alias them :-)
External link to docs: https://dev.mysql.com/doc/refman/4.1/en/control-flow-functions.html
Yes, the above solutions will work, but only if the column actually holds NULL when no value is set. If it holds a blank value instead, IFNULL(productPrice,0) will not work and you need to check for the blank, as below:
SELECT (numProducts * IF(productPrice='',0,productPrice))
FROM ...
You are basically asking about two problems that I will address separately in this answer.
1 - More than one record is returned
You should follow mathielo and Olavxxx's comments regarding the use of JOIN.
The query as shown in your question is a cartesian product between your tables clothing and sizing. What the query is basically asking is "I want only the record with id 101 in one of the tables, as well as all the records in the other table".
Judging by the rest of your question, this is not what you want. So I take it there is a relationship between rows in clothing and sizing. I will assume that a clothing item can only have one size, and that this relationship is represented by a foreign key to sizing. Here is the minimum the tables should contain for that to work (I do not reuse your model because from the details in the question I can only guess, not know, what your exact table model is):
clothing:
id: primary key
size_id: foreign key to sizing
sizing:
size_id: primary key
As a consequence, the following query should return all records corresponding to the selected clothing and associated size:
SELECT *
FROM clothing AS c
JOIN sizing AS s ON c.size_id = s.size_id
WHERE c.id = 101
Your relationship between your two tables may actually be different from what I have just modeled. If that is the case, I still hope the above example is enough to get you started in the right direction.
2 - Lots of NULL values
This part of the question needs to be made more precise. Is it that you do not want the records with NULL values for some columns to be returned, or is it that you just do not want to get the content of these columns? Or maybe you want to use a default value?
If it is the records you want to filter out, you should add <column> IS NOT NULL conditions in your WHERE clause. One for each of the columns you are interested in.
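For instance, a sketch that assumes name and price are the columns of clothing you care about:

```sql
-- Assumption: name and price are the columns of interest; adjust to the real schema.
SELECT id, name, price
FROM clothing
WHERE name IS NOT NULL
  AND price IS NOT NULL;
```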
If it is the columns you do not want to get, do not use SELECT * but instead explicitly list the columns you want, for example:
SELECT id, name, price FROM clothing
If it is about using a default value instead, you need to use IF in the SELECT clause as in Supriya's answer. Another example:
SELECT name, size, IF(shoulder IS NULL, 'Default', shoulder)
FROM clothing
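A shorter way to express the same default is COALESCE, which returns its first non-NULL argument. A sketch using the same assumed columns as the query above:

```sql
-- Assumption: name, size and shoulder are columns of clothing, as in the query above.
SELECT name, size, COALESCE(shoulder, 'Default') AS shoulder
FROM clothing;
```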
