Making my query more efficient - php

I'm currently running a query that pulls a string from my DB, but it has to run this check for every new row I import, and a file of 100k rows takes almost 4 hours to import. That's way too long. I'm assuming that the SQL code checking whether the record already exists is what's slowing it down.
I've heard about indexing, but I have no clue what it is or how to use it.
This is the current code I'm using:
$sql2 = $pdo->prepare('SELECT * FROM prospects WHERE :ssn = ssn');
$sql2->execute(array(':ssn' => $ssn));
if($sql2->fetch(PDO::FETCH_NUM) > 0){
So every time the PHP script reads a new row, it does this check. The problem is that I can't handle it with ON DUPLICATE KEY in the SQL; the check has to happen before any SQL runs, because if the result is empty the script should continue doing its thing.
What could I do to make this more time-efficient? Also, if an index is the way to go, could someone show me how it's done, either by posting examples or linking a guide or a php.net page, and how I could use that index to do what my code currently does?

So you have 100k records and no index? Then start by creating one:
CREATE INDEX ssn_index ON prospects (ssn)
Now, each time you select something from the prospects table with a WHERE condition on the ssn column, MySQL will use the index to decide where to look for the matching records. If the column is highly selective (it contains many different values), the query will run fast.
You can check the execution plan by running
EXPLAIN SELECT * FROM prospects WHERE :ssn = ssn
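To tie this back to the import loop in the question, here is a minimal sketch of how the existence check could be used with that index; the $rows variable and the insert step are placeholders for whatever your import script actually does:
// Prepare the existence check once, reuse it for every imported row.
$check = $pdo->prepare('SELECT 1 FROM prospects WHERE ssn = :ssn LIMIT 1');
foreach ($rows as $row) {                      // $rows: hypothetical array of imported records
    $check->execute(array(':ssn' => $row['ssn']));
    if ($check->fetchColumn() !== false) {
        continue;                              // ssn already exists, skip this record
    }
    // ...insert the new prospect here...
}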

Related

performance issue from 5 queries in one page

I am a junior PHP developer, growing day by day, and I'm stuck on the performance problem described here:
I am making a search engine in PHP. My database has one table with 41 columns and millions of rows, so it is obviously a very large dataset. In index.php I have a form for searching data. When the user enters a search keyword and hits submit, the action goes to search.php, which shows the results. The query is like this:
SELECT * FROM TABLE WHERE product_description LIKE '%mobile%' ORDER BY id ASC LIMIT 10
This is the first query. After the results are shown, I have to run 4 other queries like this:
SELECT DISTINCT(weight_u) as weight from TABLE WHERE product_description LIKE '%mobile%'
SELECT DISTINCT(country_unit) as country_unit from TABLE WHERE product_description LIKE '%mobile%'
SELECT DISTINCT(country) as country from TABLE WHERE product_description LIKE '%mobile%'
SELECT DISTINCT(hs_code) as hscode from TABLE WHERE product_description LIKE '%mobile%'
These queries are for the FILTERS. The problem is that when I hit the search button, all of these queries run at once, and the performance cost makes everything very slow.
Is there a faster way to fetch weight, country, country_unit and hs_code, or another way to achieve this?
The same functionality is implemented here, where the filter bar appears after the table is filled with data. How can I achieve that? Please help.
The full functionality is implemented here.
I have tried to explain my full problem; if there is any mistake please let me know and I will improve the question. I am also new to Stack Overflow.
Firstly - are you sure this code is working as you expect? The first query retrieves 10 records matching your search term. Those records might have duplicate weight_u, country_unit, country or hs_code values, and because the next 4 filter queries run against all matching rows rather than just those 10, it's entirely possible that you will get values back which are not in the first query's results, so the filter might not make sense.
If that's true, I would build the filter values in your client code (PHP) - finding the unique values in 10 records is quick and easy, and it reduces the number of database round trips.
Finally, the biggest improvement you can make is to use MySQL's fulltext searching features. The reason your app is slow is because your search terms cannot use an index - you're wild-carding the start as well as the end. It's like searching the phonebook for people whose name contains "ishra" - you have to look at every record to check for a match. Fulltext search indexes are designed for this - they also help with fuzzy matching.
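As a hedged sketch of that fulltext suggestion (your_table stands in for the real table name from the question, and InnoDB FULLTEXT support assumes MySQL 5.6 or later):
-- Add a full-text index on the searched column.
ALTER TABLE your_table ADD FULLTEXT INDEX ft_product_description (product_description);
-- Then replace the leading-wildcard LIKE with a full-text match:
SELECT * FROM your_table
WHERE MATCH(product_description) AGAINST('mobile' IN NATURAL LANGUAGE MODE)
ORDER BY id ASC LIMIT 10;
The same MATCH(...) AGAINST(...) condition can be reused in the four DISTINCT filter queries.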
I'll give you some tips that prove useful in many situations when querying a large dataset, or almost any dataset.
Listing the fields you actually want instead of querying for '*' is better practice. The more columns and rows you have, the more this matters.
Always try to use the PKs to look up the data. The more specific the filter, the less it will cost.
An index would come in pretty handy in this kind of situation, as it will make the search much faster.
LIKE queries are generally slow and resource-heavy, even more so in your situation. So again, the more specific you are, the better it will get.
I'd also add that if you just want to retrieve the same data from these tables again and again, a VIEW might fit nicely.
Those are just some tips that came to my mind to ease your problem.
Hope it helps.
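As a small illustration of the first two tips (the column names are borrowed from the question; adjust them to whatever your results page actually needs):
-- Fetch only the columns the page uses, not all 41, and filter on the primary key where possible:
SELECT id, product_description, weight_u, country, country_unit, hs_code
FROM your_table
WHERE id = 12345;   -- hypothetical id; a primary-key lookup is the cheapest filter there is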

PHP, indexing mysql ID in json. Is it worth it?

I have a database with a table of questions that are displayed randomly on the main site (about 80 for now). I'm reading all the IDs from the database, then randomly selecting one and running a second query to get the rest of the data for that question. I'm curious whether I should leave it like that, or whether it would be better to store all the IDs in a .json file and just update it every time I add a question. Which is better? Thanks for the help.
If you're just interested in a random record from the table, just do it like this:
SELECT * FROM your_table
ORDER BY RAND()
LIMIT 1;
All in one query and you don't have to retrieve a list of IDs first.
And it's almost always a bad idea to maintain two separate data sources.
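For completeness, a minimal PDO sketch of that query ($pdo and the questions table name are placeholders, not from the question):
$stmt = $pdo->query('SELECT * FROM questions ORDER BY RAND() LIMIT 1');
$question = $stmt->fetch(PDO::FETCH_ASSOC);   // one random row, or false if the table is empty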

How to index a query the right way

I am trying to optimize my DB and am just getting started with indexing it, but I'm not sure how to do it right.
I have this query:
$year = date("Y");
$thisYear = $year;
//$nextYear = $thisYear + 1;
$sql = mysql_query("SELECT SUM(points) as userpoints
FROM ".$prefix."_publicpoints
WHERE date BETWEEN '$thisYear" . "-01-01' AND '$thisYear" . "-12-31' AND fk_player_id = $playerid");
$row = mysql_fetch_assoc($sql);
$userPoints = $row['userpoints'];
$sql = mysql_query("SELECT
fk_player_id
FROM ".$prefix."_publicpoints
WHERE date BETWEEN '$thisYear" . "-01-01' AND '$thisYear" . "-12-31'
GROUP BY fk_player_id
HAVING SUM(points) > $userPoints");
$row = mysql_fetch_assoc($sql);
$userWrank = mysql_num_rows($sql)+1;
I am not sure how to index this. I have tried indexing fk_player_id, but it still looks through all the rows (287,937).
I have indexed the date field which gives me this back in EXPLAIN:
id: 1 | select_type: SIMPLE | table: nf_publicpoints | type: range | possible_keys: IDXdate | key: IDXdate | key_len: 3 | ref: NULL | rows: 143969 | Extra: Using where with pushed condition; Using temporary...
I also have 2 calls to the same table... Could that be done in one?
How do I index this and/or could it be done smarter?
You should definitely spend some time reading up on indexing, there's a lot written about it, and it's important to understand what's going on.
Broadly speaking, an index imposes an ordering on the rows of a table.
For simplicity's sake, imagine a table is just a big CSV file. Whenever a row is inserted, it's inserted at the end. So the "natural" ordering of the table is just the order in which rows were inserted.
Imagine you've got that CSV file loaded up in a very rudimentary spreadsheet application. All this spreadsheet does is display the data, and numbers the rows in sequential order.
Now imagine that you need to find all the rows that have some value "M" in the third column. Given what you have available, you have only one option: you scan the table, checking the value of the third column for each row. If you've got a lot of rows, this method (a "table scan") can take a long time!
Now imagine that in addition to this table, you've got an index. This particular index is the index of values in the third column. The index lists all of the values from the third column, in some meaningful order (say, alphabetically) and for each of them, provides a list of row numbers where that value appears.
Now you have a good strategy for finding all the rows where the value of the third column is M! For instance, you can perform a binary search! Whereas the table scan requires you to look at N rows (where N is the number of rows), the binary search only requires that you look at log(N) index entries, in the very worst case. Wow, that's sure a lot easier!
Of course, if you have this index, and you're adding rows to the table (at the end, since that's how our conceptual table works), you need to update the index each and every time. So you do a little more work while you're writing new rows, but you save a ton of time when you're searching for something.
So, in general, indexing creates a tradeoff between read efficiency and write efficiency. With no indexes, inserts can be very fast -- the database engine just adds a row to the table. As you add indexes, the engine must update each index while performing the insert.
On the other hand, reads become a lot faster.
Hopefully that covers your first two questions (as others have answered -- you need to find the right balance).
Your third scenario is a little more complicated. If you're using LIKE, indexing engines will typically help with your read speed up to the first "%". In other words, if you're SELECTing WHERE column LIKE 'foo%bar%', the database will use the index to find all the rows where column starts with "foo", and then need to scan that intermediate rowset to find the subset that contains "bar". SELECT ... WHERE column LIKE '%bar%' can't use the index. I hope you can see why.
Finally, you need to start thinking about indexes on more than one column. The concept is the same, and it behaves similarly to the LIKE stuff - essentially, if you have an index on (a,b,c), the engine will continue using the index from left to right as best it can. So a search on column a might use the (a,b,c) index, as would one on (a,b). However, the engine would need to do a full table scan if you were searching WHERE b=5 AND c=1.
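To make that left-to-right rule concrete, a small hedged sketch with hypothetical table and column names:
-- Composite index: useful for filters on a, on (a,b), or on (a,b,c).
CREATE INDEX idx_abc ON my_table (a, b, c);
-- These can use the index, because the leftmost column a is constrained:
SELECT * FROM my_table WHERE a = 5;
SELECT * FROM my_table WHERE a = 5 AND b = 3;
-- This cannot use it effectively (a is not constrained), so it falls back to a table scan:
SELECT * FROM my_table WHERE b = 5 AND c = 1;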
Hopefully this helps shed a little light, but I must reiterate that you're best off spending a few hours digging around for good articles that explain these things in depth. It's also a good idea to read your particular database server's documentation. The way indices are implemented and used by query planners can vary pretty widely.
For more information and examples, visit: http://blog.sqlauthority.com/category/sql-index/
Try creating an index on the date column; indexing fk_player_id will not help with this query. If that does not work - paste your EXPLAIN output...
For more information about indexes in Mysql look here: http://hackmysql.com/case1
Why not index the date column, seeing how that's the main criterion that will be evaluated in the lookup?
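Going a step beyond what these answers spell out, and borrowing the multi-column index idea from the longer answer above, a hedged sketch for this particular query (the table name is taken from the EXPLAIN output; 42 and the dates are placeholder values):
-- Equality column first, then the range column:
CREATE INDEX IDX_player_date ON nf_publicpoints (fk_player_id, date);
-- Re-check the plan afterwards:
EXPLAIN SELECT SUM(points) AS userpoints
FROM nf_publicpoints
WHERE fk_player_id = 42
  AND date BETWEEN '2012-01-01' AND '2012-12-31';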

PHP MySQL slow query access

I have a table in my database :
I'm using MySQL to insert/update/create info, but the time it takes to execute queries is slow. For example:
First, when a user runs my app, it verifies they can use it by retrieving all IPs from the DB with
SELECT IP FROM user_info
Then, using a while loop, if the IP is in the DB it does another query:
"SELECT Valid FROM user_info WHERE IP='".$_SERVER['REMOTE_ADDR']."'"
If Valid is 1 then they can use it, otherwise they can't; however, if their IP is not found in the DB, it creates a new entry for them using
"INSERT INTO user_info (IP,First) VALUES ('".$_SERVER['REMOTE_ADDR']."','".date("Y/m/d")."')"
Now that this first script has finished, it accesses another one - that one was supposed to update the DB every minute, however I don't think I can accomplish that now; the update query is this:
"UPDATE user_info SET Name='".$Name."', Status='".$Status."', Last='".$Last."', Address='".$Address."' WHERE IP='".$_SERVER['REMOTE_ADDR']."'"
Altogether it takes on average 2.2 seconds, and there's only 1 row in the table at the moment.
My question is: how do I speed up MySQL queries? I've read about indexes and how they can help improve performance, but I do not fully understand how to use them. Any light shed on this topic would help.
Nubcake
Indexes will become very important as your site grows, but they won't help when you have only one row in your table, and a missing index cannot be the reason why your queries take so long. You seem to have some sort of fundamental problem with your database. Maybe it's a latency issue.
Try starting with some simpler queries like SELECT * FROM users LIMIT 1 or even just SELECT 1 and see if you still get bad performance.
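One quick way to measure that (a hedged sketch; it assumes a PDO connection, which the question doesn't show):
$start = microtime(true);
$pdo->query('SELECT 1')->fetch();   // trivial query, so this mostly measures round-trip latency
echo 'Round trip took ', round((microtime(true) - $start) * 1000, 1), " ms\n";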
The fewer the queries, the lower the latency of the system. Try merging queries made against the same table. For example, your second and third queries can be merged and you can execute
INSERT INTO user_info ...... WHERE Valid=1 AND IP= ...
Check the number of rows affected to know if a new row was added or not.
Also, do not open/close your sql connection at any point in between. The overheads of establishing a new connection could be high.
You could make IP the primary key, which also indexes it.
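A hedged sketch combining the last two suggestions (a unique or primary key on IP, then using the affected-row count to tell new visitors from returning ones); it switches to PDO prepared statements and assumes First is a DATE column, so treat it as an illustration rather than a drop-in fix:
// Assumes IP has a UNIQUE or PRIMARY KEY index and $pdo is a PDO connection (hypothetical here).
$insert = $pdo->prepare('INSERT IGNORE INTO user_info (IP, First) VALUES (:ip, CURDATE())');
$insert->execute(array(':ip' => $_SERVER['REMOTE_ADDR']));
if ($insert->rowCount() === 0) {
    // Row already existed: the IP is known, so fetch Valid in a single follow-up query.
    $check = $pdo->prepare('SELECT Valid FROM user_info WHERE IP = :ip');
    $check->execute(array(':ip' => $_SERVER['REMOTE_ADDR']));
    $valid = (int) $check->fetchColumn();
} else {
    // Brand-new visitor: just inserted, not validated yet.
    $valid = 0;
}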

PostgreSQL Query and PHP Assistance

Directly under this small intro here you'll see the layout of the database tables that I'm working with and then you'll see the details of my question. Please provide as much guidance as possible. I am still learning PHP and SQL and I really do appreciate your help as I get the hang of this.
Table One ('bue') --
chp_cd
rgn_no
bgu_cd
work_state
Table Two ('chapterassociation') --
chp_cd
rgn_no
bgu_cd
work_state
Database Type: PostgreSQL
I'm trying to do the following with these two tables, and I think it's a JOIN that I have to do but I'm not all that familiar with it and I'm trying to learn. I've created a query thus far to select a set of data from these tables so that the query isn't run on the entire database. Now with the data selected, I'm trying to do the following...
First and foremost, 'work_state' of table one ('bue') should be checked against 'work_state' of table two ('chapterassociation'). Once a match is found, 'bgu_cd' of table one ('bue') should be matched against 'bgu_cd' of table two ('chapterassociation'). When both matches are found, it will always point to a unique row within the second table ('chapterassociation'). Using that unique row within the second table ('chapterassociation'), the values of 'rgn_no' and 'chp_cd' should be UPDATED within the first table ('bue') to match the values within the second table ('chapterassociation').
I know this is probably asking a lot, but if someone could help me to construct a query to do this, it'd be wonderful! I really do want to learn, as I don't wish to be ignorant to this forever. Though I'm not sure if I completely understand how the JOIN and comparison here would work.
If I'm correct, I'll have to put this into separate queries which will then be run from PHP. So for example, it'll probably be a few IF ELSE statements that end with the final query, which updates the values from table two into table one.
A JOIN will do both levels of matching for you...
SELECT *
FROM bue
INNER JOIN chapterassociation
    ON bue.work_state = chapterassociation.work_state
    AND bue.bgu_cd = chapterassociation.bgu_cd
The actual join algorithm is determined by PostgreSQL. It could be a merge, use hashes, etc., and depends on indexes and other statistics about the data. But you don't need to worry about that directly; SQL abstracts it away for you.
Then you just need a mechanism to write the data from one table to the other...
UPDATE
bue
SET
rgn_no = chapterassociation.rgn_no,
chp_cd = chapterassociation.chp_cd
FROM
chapterassociation
WHERE bue.work_state = chapterassociation.work_state
AND bue.bgu_cd = chapterassociation.bgu_cd
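If you want to run that update from PHP, a minimal PDO sketch (the DSN and credentials are placeholders):
// Hypothetical connection details; adjust to your environment.
$pdo = new PDO('pgsql:host=localhost;dbname=mydb', 'user', 'password');
$sql = 'UPDATE bue
        SET rgn_no = chapterassociation.rgn_no,
            chp_cd = chapterassociation.chp_cd
        FROM chapterassociation
        WHERE bue.work_state = chapterassociation.work_state
          AND bue.bgu_cd = chapterassociation.bgu_cd';
$updated = $pdo->exec($sql);   // number of bue rows that were updated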
