How to optimize this query which is taking a long time? - php

When I run the following query on MySQL, it takes around 10 seconds to execute.
SELECT SQL_CALC_FOUND_ROWS DISTINCT
b.appearance_id,
b.photo_album_id,
b.eventcmmnt_id,
b.id,
b.mem_id,
b.subj,
b.body,
b.image_link as photo_image_uploaded,
b.bottle_id,
b.date,
b.parentid,
b.from_id,
b.visible_to,
pa.photo_big as image_link,
b.post_via,
b.youtubeLink,
b.link_image,
b.link_url,
b.auto_genrate_text,
badges.badge_img,
badges.badge_bottle_img,
b.type,b.share_url_title
FROM bulletin b
INNER JOIN network n
ON (n.mem_id = b.mem_id)
LEFT JOIN badges
ON (b.bottle_id=badges.badge_id)
LEFT JOIN photo_album as pa
ON (pa.photo_id=b.photo_album_id)
JOIN members mem
ON (b.mem_id=mem.mem_id and mem.deleted<>'Y')
WHERE b.parentid = '0'
AND ('$userid' IN (n.frd_id, b.mem_id,b.from_id))
GROUP BY b.id
ORDER BY b.id DESC
LIMIT 0,10
The IN condition has to check many frd_id, mem_id and from_id values, so I think that is why the query executes slowly. Please help me optimize the query above.
Thanks

Check that you have indexes on the fields that are used for the joins.

As has already been hinted, post the EXPLAIN SELECT... plan for the query. Why guess when you can check? (And a SQL Fiddle would be great too).
That will tell whether you need indexes (very, very likely!), which indexes, and how they should be structured (do not go and index everything, since this may even harm performance!).
One suggestion I can already give: if you have an index on members of the form
(mem_id, deleted, ...)
as you should, verify that deleted is indeed a boolean, and specify the condition like this:
on(b.mem_id=mem.mem_id AND NOT mem.deleted)
since this will go much easier on the indexing. If it is instead stored as 'Y'/'N' (ideally as an enum of 'Y' and 'N' rather than a varchar), compare with equality, mem.deleted = 'N', because NOT on such a column does not test for 'N'. If the field can hold more values, for example 'Y', 'N' and 'P' for 'Pending', and you used a varchar field, then the <> has a considerable performance impact on the index - I'd even go out on a limb and say that the index is effectively neutralized. Indexes work best with matches, not with non-matches such as <>.
I suspect you can also re-engineer this expression
AND ('$userid' IN (n.frd_id, b.mem_id,b.from_id))
by moving it, at least partially, into the b JOIN. Or you could consider, depending on your tables' cardinality, using a UNION instead.
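The UNION idea could look roughly like the sketch below. It is only an illustration under assumptions: it uses Python's built-in sqlite3 as a stand-in for MySQL, a cut-down bulletin/network schema with made-up rows, and made-up index names. Each branch tests a single column with =, which an index on that column can serve; the EXISTS checks keep the requirement from the original INNER JOIN that the author has at least one network row.

```python
import sqlite3

# Toy stand-in for the real schema: only the columns the filter touches.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bulletin (id INTEGER PRIMARY KEY, mem_id INT, from_id INT, parentid INT);
CREATE TABLE network (mem_id INT, frd_id INT);
-- Hypothetical index names; one index per column that a branch filters on.
CREATE INDEX idx_bulletin_mem ON bulletin (mem_id);
CREATE INDEX idx_bulletin_from ON bulletin (from_id);
CREATE INDEX idx_network_frd ON network (frd_id, mem_id);
CREATE INDEX idx_network_mem ON network (mem_id);
INSERT INTO bulletin VALUES (1, 7, 7, 0), (2, 8, 9, 0), (3, 9, 7, 0), (4, 8, 8, 5), (5, 5, 5, 0);
INSERT INTO network VALUES (7, 8), (8, 7), (9, 8), (5, 6);
""")

# Shape of the original filter: one value tested against three columns.
ORIGINAL = """
SELECT DISTINCT b.id AS id
FROM bulletin b
JOIN network n ON n.mem_id = b.mem_id
WHERE b.parentid = 0 AND :uid IN (n.frd_id, b.mem_id, b.from_id)
ORDER BY id DESC LIMIT 10
"""

# Same rows, written as three single-column lookups combined with UNION.
REWRITTEN = """
SELECT b.id AS id
FROM bulletin b
JOIN network n ON n.mem_id = b.mem_id
WHERE b.parentid = 0 AND n.frd_id = :uid
UNION
SELECT b.id FROM bulletin b
WHERE b.parentid = 0 AND b.mem_id = :uid
  AND EXISTS (SELECT 1 FROM network n WHERE n.mem_id = b.mem_id)
UNION
SELECT b.id FROM bulletin b
WHERE b.parentid = 0 AND b.from_id = :uid
  AND EXISTS (SELECT 1 FROM network n WHERE n.mem_id = b.mem_id)
ORDER BY id DESC LIMIT 10
"""

def bulletin_ids(sql, user_id):
    # Bind the user id as a parameter instead of interpolating '$userid'.
    return [row[0] for row in conn.execute(sql, {"uid": user_id})]

original_ids = bulletin_ids(ORIGINAL, 7)
rewritten_ids = bulletin_ids(REWRITTEN, 7)
```

Whether this is actually faster on MySQL depends on the real indexes and row counts, so compare the EXPLAIN output of both forms before switching.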

Related

Mysql fetch from last to first - [many records]

I want to fetch records from MySQL starting from the last one to the first, LIMIT 20. My database has over 1M records. I am aware of ORDER BY, but when I use it the query takes forever to load 20 records and I have no idea why. I think MySQL fetches all the records before ordering.
SELECT bookings.created_at, bookings.total_amount,
passengers.name, passengers.id_number, payments.amount,
passengers.ticket_no,bookings.phone,bookings.source,
bookings.destination,bookings.date_of_travel FROM bookings
INNER JOIN passengers ON bookings.booking_id = passengers.booking_id
INNER JOIN payments on payments.booking_id = bookings.booking_id
ORDER BY bookings.booking_id DESC LIMIT 10
I suppose that if you execute the query without the ORDER BY, the time is satisfactory?
You might try to create an index on the column you are ordering by:
create index idx_bookings_booking_id on bookings(booking_id)
You can inspect how the query is executed using
EXPLAIN SELECT bookings.created_at, bookings.total_amount,
passengers.name, passengers.id_number, payments.amount,
passengers.ticket_no,bookings.phone,bookings.source,
bookings.destination,bookings.date_of_travel FROM bookings
INNER JOIN passengers ON bookings.booking_id = passengers.booking_id
INNER JOIN payments on payments.booking_id = bookings.booking_id
ORDER BY bookings.booking_id DESC LIMIT 10
Then check that the proper index has been created on each table:
SHOW INDEX FROM `db_name`.`table_name`;
If an index is not there, create the proper index on each of the tables.
Please add anything that is missing.
The index lookup table needs to be able to reside in memory, if I'm not mistaken (a filesort is much slower than an in-memory lookup).
Use small index / column sizes.
To double the capacity, use UNSIGNED columns if you need no negative values.
Tune sort_buffer_size and read_rnd_buffer_size (maybe better at connection level, not globally).
See https://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html , particularly regarding using EXPLAIN and maybe trying another execution plan strategy.
You seem to need another workaround, like materialized views.
Tell me if this sounds like it:
Create another table like the booking table, e.g. CREATE TABLE booking_short LIKE booking, though you only need the booking_id column.
Then check your code for where exactly you create bookings, i.e. where you first insert into booking. There, run SELECT COUNT(*) FROM booking_short; if the count is >20, delete the first record, then insert the new booking_id.
You can select the IDs from booking_short and join from there to the rest of the tables for more details.
You won't need LIMIT or sorting.
Of course, this needs heavy documentation to avoid maintenance problems.
Either that or https://stackoverflow.com/a/5912827/6288442
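One way to try "select the ID and join from there" without maintaining an extra table is a derived table that picks the last few booking_id values first and only then joins for the details. A rough sketch, assuming Python's sqlite3 in place of MySQL, a reduced column list, made-up rows and index names, booking_id as the primary key, and exactly one passenger and one payment per booking. With several passengers per booking, the LIMIT here would count bookings rather than joined rows, so the result would differ from the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bookings (booking_id INTEGER PRIMARY KEY, phone TEXT);
CREATE TABLE passengers (booking_id INT, name TEXT);
CREATE TABLE payments (booking_id INT, amount INT);
-- Hypothetical index names on the join columns.
CREATE INDEX idx_passengers_booking ON passengers (booking_id);
CREATE INDEX idx_payments_booking ON payments (booking_id);
""")
for i in range(1, 101):
    conn.execute("INSERT INTO bookings VALUES (?, ?)", (i, f"555-{i:04d}"))
    conn.execute("INSERT INTO passengers VALUES (?, ?)", (i, f"passenger {i}"))
    conn.execute("INSERT INTO payments VALUES (?, ?)", (i, 10 * i))

# Straightforward form: join everything, then sort and limit.
straight = conn.execute("""
SELECT b.booking_id, p.name, pay.amount
FROM bookings b
JOIN passengers p ON p.booking_id = b.booking_id
JOIN payments pay ON pay.booking_id = b.booking_id
ORDER BY b.booking_id DESC LIMIT 10
""").fetchall()

# Deferred join: pick the 10 newest ids first, then join only those rows.
deferred = conn.execute("""
SELECT b.booking_id, p.name, pay.amount
FROM (SELECT booking_id FROM bookings ORDER BY booking_id DESC LIMIT 10) AS last10
JOIN bookings b ON b.booking_id = last10.booking_id
JOIN passengers p ON p.booking_id = b.booking_id
JOIN payments pay ON pay.booking_id = b.booking_id
ORDER BY b.booking_id DESC
""").fetchall()
```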

Database Group By error

I've been working with Mysql for a while, but this is the first time I've encountered this problem.
The thing is that I have a select query...
SELECT
transactions.inventoryid,
inventoryName,
inventoryBarcode,
inventoryControlNumber,
users.nombre,
users.apellido,
transactionid,
transactionNumber,
originalQTY,
updateQTY,
finalQTY,
transactionDate,
transactionState,
transactions.observaciones
FROM
transactions
LEFT JOIN
inventory ON inventory.inventoryid = transactions.inventoryid
LEFT JOIN
users ON transactions.userid = users.userid
GROUP BY
transactions.transactionNumber
ORDER BY
transactions.inventoryid
But the GROUP BY is eliminating 2 values from the result.
In this case, when I output:
foreach($inventory->inventory as $values){
$transactionid[] = $values['inventoryid'];
}
It returns:
2,3,5
If I eliminate the GROUP BY Statement it returns
2,3,4,5,6
Which is the output I need for this particular case.
The question is:
Is there a reason for this to happen?
If I'm grouping by a transaction and that was supposed to affect the query, wouldn't it then return only 1 value?
Maybe I'm overthinking this one, or I have been working on the code for so long that I don't see the obvious flaw in my logic. But if someone can lend me a hand I would appreciate it.
In standard SQL you can only SELECT columns which are
contained in the GROUP BY clause
(or) aggregate "columns", like MAX() or COUNT().
MySQL has its own interpretation of columns which are neither contained in the GROUP BY nor aggregated; consult the MySQL documentation, "MySQL Handling of GROUP BY", to find out what happens here.
Do you need more information?
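To make the effect concrete, here is a small sketch. It assumes, and this is only a guess about the poster's data, that some transactions share a transactionNumber; the rows are made up, and Python's sqlite3 stands in for MySQL (like MySQL without ONLY_FULL_GROUP_BY, SQLite accepts a selected column that is neither grouped nor aggregated). GROUP BY returns one row per distinct transactionNumber, so five rows with three distinct numbers come back as three rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (
  transactionid INTEGER PRIMARY KEY, inventoryid INT, transactionNumber TEXT);
INSERT INTO transactions (inventoryid, transactionNumber) VALUES
  (2, 'T-1'), (3, 'T-2'), (4, 'T-2'), (5, 'T-3'), (6, 'T-3');
""")

ungrouped = conn.execute(
    "SELECT inventoryid FROM transactions ORDER BY inventoryid").fetchall()

# inventoryid is neither grouped nor aggregated, so each group reports the
# value from just one of its rows; which row is not something to rely on.
grouped = conn.execute(
    "SELECT inventoryid, transactionNumber FROM transactions "
    "GROUP BY transactionNumber").fetchall()

# Aggregating makes the collapse visible instead of hiding it.
per_transaction = conn.execute(
    "SELECT transactionNumber, COUNT(*), GROUP_CONCAT(inventoryid) "
    "FROM transactions GROUP BY transactionNumber "
    "ORDER BY transactionNumber").fetchall()
```

If every inventoryid is needed, either drop the GROUP BY or group by a column that is unique per row.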

Is it better do to a union in SQL or separate queries and then use php array_merge?

I have a SQL query that has 4 UNIONS and 4 LEFT JOINS. It is laid out as such:
SELECT ... FROM table1
LEFT JOIN other_table1
UNION SELECT ... FROM table2
LEFT JOIN other_table2
UNION SELECT ... FROM table3
LEFT JOIN other_table3
UNION SELECT ... FROM table4
LEFT JOIN other_table4
Would it be better to run 4 separate queries and then merge the results with PHP after the fact? Or should I keep them together? Which would provide the fastest execution?
The most definitive answer is to test each method. However, the UNION is most likely to be faster, as MySQL runs only one query instead of one for each of the 4 parts of the union.
You also remove the overhead of reading the data into memory in PHP and concatenating it. Instead, you can just do a while() or foreach() or whatever on one result.
In this case, it depends on the number of records you are going to get out of the result. Since you are using LEFT JOIN in all the unions, I suggest running separate fetches to avoid a bottleneck in SQL and merging the results in PHP.
When a query is executed from a programming language, the following steps occur:
A connection is created between the application and the database (or an existing connection is taken from the pool)
The query is sent to the database
The database sends the result back
The connection is released to the pool
If you are running N queries, the above steps happen N times, which will slow down the process. So ideally we should keep the number of queries to a minimum.
It makes sense to break a query into multiple parts if a single query becomes complex, difficult to maintain and slow to execute. Even in that case, the better way is to optimize the query itself.
In your case the query is pretty simple, and as someone has pointed out, UNION will also help remove duplicate rows, so the best way is to go for the SQL query rather than PHP code. Try optimization techniques like creating proper indexes on the tables.
The UNION clause can be faster, because it returns distinct records at once (duplicate records won't be returned); otherwise you would need to do that in the application. Also, in this case it may help to reduce traffic.
From the documentation:
The default behavior for UNION is that duplicate rows are removed from
the result. The optional DISTINCT keyword has no effect other than the
default because it also specifies duplicate-row removal. With the
optional ALL keyword, duplicate-row removal does not occur and the
result includes all matching rows from all the SELECT statements.
You can mix UNION ALL and UNION DISTINCT in the same query. Mixed UNION
types are treated such that a DISTINCT union overrides any ALL union
to its left. A DISTINCT union can be produced explicitly by using
UNION DISTINCT or implicitly by using UNION with no following DISTINCT
or ALL keyword.
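The difference described in the quoted documentation is easy to check on a toy example. The sketch below uses Python's sqlite3 as a stand-in for MySQL, with made-up rows in two tables named after the question's placeholders; plain UNION drops the duplicate row, while UNION ALL behaves like concatenating two separately fetched result sets in the application (what array_merge would do in PHP).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (name TEXT);
CREATE TABLE table2 (name TEXT);
INSERT INTO table1 VALUES ('alice'), ('bob');
INSERT INTO table2 VALUES ('bob'), ('carol');
""")

# UNION (implicitly DISTINCT): 'bob' appears once.
union_rows = conn.execute(
    "SELECT name FROM table1 UNION SELECT name FROM table2 ORDER BY name"
).fetchall()

# UNION ALL: no duplicate removal, 'bob' appears twice.
union_all_rows = conn.execute(
    "SELECT name FROM table1 UNION ALL SELECT name FROM table2 ORDER BY name"
).fetchall()

# Two separate queries merged in application code, for comparison.
merged_in_app = sorted(
    conn.execute("SELECT name FROM table1").fetchall()
    + conn.execute("SELECT name FROM table2").fetchall()
)
```

So if duplicates cannot occur or do not matter, UNION ALL avoids the duplicate-removal work; if they do matter, merging in the application means deduplicating there as well.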

MySQL Query Optimization - Random Record

I'm having a terrible time with a MySQL query. I've spent most of my weekend and most of my day today attempting to make this query run a bit faster. I've made it considerably faster, but I know I can make it better.
SELECT m.id,other_fields,C.contacts_count FROM marketingDatabase AS m
LEFT OUTER JOIN
(SELECT COUNT(*) as contacts_count, rid
FROM contacts
WHERE status = 'Active' AND install_id = 'XXXX' GROUP BY rid) as C
ON C.rid = m.id
WHERE (RAND()*2612<50)
AND do_not_call != 'true'
AND `ACTUAL SALES VOLUME` >= '800000'
AND `ACTUAL SALES VOLUME` <= '1200000'
AND status = 'Pending'
AND install_id = 'XXXXX'
ORDER BY RAND()
I have an index on 'install_id', 'category' and 'status' but the EXPLAIN shows it was sorting based on 9100 rows.
My Explain is here:
https://s3.amazonaws.com/jas-so-question/Screen+Shot+2012-03-13+at+12.34.04+AM.png
Anybody have any suggestions on what I can do to make this a bit faster? The entire point of the query is to select a random record from an account's records (install_id) that matches certain criteria like sales volume, status and do_not_call. I'm currently gathering 25 records and caching it (using PHP) so I only have to run this query once every 25 requests, but I'm already dealing with thousands of requests per day. It currently takes 0.2 seconds to run. I realize that by using ORDER BY RAND() I'm already taking a major performance hit, but it's just sorting 25 rows.
Thanks in advance for the help.
**EDIT: I forgot to mention that the 'contact_sort' index is on the 'contacts' table, and indexes install_id, status, and rid (rid references the record ID in marketingDatabase, so it knows which record a contact belongs to).
**EDIT 2: The 2612 number in the query represents the number of rows in marketingDatabase that match the criteria (install_id, status, actual sales volume, etc.)
Since I do not see your index definitions, I am not sure they are correct. The query would benefit from the following indexes:
a composite index (install_id, status, rid) on contacts
a composite index (install_id, status, `ACTUAL SALES VOLUME`) on marketingDatabase
I played around with a few queries, and I don't think you'll ever be able to get an indexed query to work with RAND(), especially when you're using it in both a WHERE clause and an ORDER BY clause. If at all possible, I'd introduce the random element in my PHP logic, and probably look at whether two simple queries made more sense than one fairly complex one. Added to that, you have a LEFT OUTER JOIN on a random result set, which may also considerably increase the amount of work that has to be done.
In summary, my guess would be - rewrite to exclude RAND, see if you can get rid of the LEFT OUTER JOIN. Two straightforward indexed queries with a bit of PHP in between may be a lot better.
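A rough sketch of the "two straightforward queries with the random element in application code" idea follows. It is written in Python with sqlite3 rather than PHP and MySQL so that it is self-contained; the rows are made up, sales_volume is a hypothetical stand-in for the `ACTUAL SALES VOLUME` column, and the index name is invented. The first query only filters on indexed columns and returns ids, the random pick happens in code, and the second query fetches the picked rows by primary key.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE marketingDatabase (
  id INTEGER PRIMARY KEY, install_id TEXT, status TEXT, sales_volume INT);
-- Hypothetical composite index matching the filter columns.
CREATE INDEX idx_install_status_volume
  ON marketingDatabase (install_id, status, sales_volume);
""")
conn.executemany(
    "INSERT INTO marketingDatabase (install_id, status, sales_volume) VALUES (?, ?, ?)",
    [("XXXXX", "Pending" if i % 2 else "Done", 700000 + i * 1000) for i in range(600)],
)

# Query 1: indexed filter only, no RAND(); returns just the candidate ids.
candidate_ids = [row[0] for row in conn.execute(
    "SELECT id FROM marketingDatabase "
    "WHERE install_id = ? AND status = 'Pending' "
    "AND sales_volume BETWEEN 800000 AND 1200000",
    ("XXXXX",),
)]

# The random element lives in application code.
picked_ids = random.sample(candidate_ids, k=min(25, len(candidate_ids)))

# Query 2: fetch the full rows for the picked ids by primary key.
placeholders = ",".join("?" * len(picked_ids))
picked_rows = conn.execute(
    f"SELECT id, sales_volume FROM marketingDatabase WHERE id IN ({placeholders})",
    picked_ids,
).fetchall()
```

The contact counts could then be fetched for just those 25 ids in a further small query instead of joining against the whole derived table.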

Propel equivalent of "exists"

I am new to Propel and have been reading the documentation. But, I have not found a clear equivalent to the EXISTS and NOT EXISTS constructs from SQL. Linq in .NET, for instance, has Any(). Is there an equivalent to the following in "idiomatic" Propel?
SELECT a.column1, a.column2, a.etc
FROM TableA a
WHERE NOT EXISTS (SELECT 1
FROM TableB b
WHERE b.someIdColumn = a.someIdColumn
AND b.aNullableDateColumn IS NULL)
After doing some more digging, I believe I have an answer to my question, or at least as good an answer as is currently available.
What comes after EXISTS or NOT EXISTS is a subquery. While that fact seems obvious, it did not originally occur to me to focus my search for help on subqueries. I found a few resources on the topic. Essentially, the options are to rewrite the query using JOINs (as is the heart of the answer by @Kaltas) or to use Criteria::CUSTOM. I decided I would likely prefer the second option, since it allows me to keep the subquery, potentially helping my database performance.
I did a lot of reading, then, about Criteria::CUSTOM, but the only reading that really helped me was reading the Propel 1.5 source. It's very simple, really. Just put the subquery, verbatim (using the database's table and column names, not Propel's object names) along with EXISTS or NOT EXISTS in the where call, like:
TableAQuery::create()
->where('NOT EXISTS (SELECT 1 FROM TableB WHERE TableA.someIdColumn = TableB.someIdColumn AND TableB.aNullableDateColumn IS NULL)')
->find();
It's that simple. Internally, the where method goes through a few possibilities for interpreting the clause, and finding no matches, it treats the clause as being of Criteria::CUSTOM and inserts it into the SQL query as-is. So, I could not use table aliases, for example.
If I ever have time, maybe I'll work on a more "ORM-ish" way to do this and submit a patch. Someone will probably beat me to it, though.
As of Propel 1.6 you can now use Criteria::IN and Criteria::NOT_IN
Example: select all users that are not in a UserGroup
$users = UserQuery::create()->filterById(UserPerUserGroupQuery::create()->select('user_id')->find(), Criteria::NOT_IN)
->orderByUserName()
->find();
I think you could rewrite the query as:
SELECT
a.column1,
a.column2,
a.etc
FROM
TableA a
WHERE
(SELECT
COUNT(*)
FROM
TableB b
WHERE
b.someIdColumn = a.someIdColumn
AND
b.aNullableDateColumn IS NULL
) = 0
which is easily doable in Propel.
Or even cleaner and easier to accomplish in Propel:
SELECT
a.column1,
a.column2,
a.etc
FROM
TableA a
LEFT JOIN
TableB b ON (b.someIdColumn = a.someIdColumn AND b.aNullableDateColumn IS NULL)
WHERE
b.primaryKeyColumn IS NULL
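For anyone who wants to check such rewrites before porting them to Propel, here is a small sketch comparing three ways of writing the NOT EXISTS condition from the question: the subquery itself, a COUNT(*) = 0 subquery, and a LEFT JOIN anti-join (the extra conditions go in the ON clause, and the WHERE keeps only rows with no match). It assumes Python's sqlite3 as a stand-in database and made-up rows; table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (someIdColumn INTEGER PRIMARY KEY, column1 TEXT);
CREATE TABLE TableB (
  primaryKeyColumn INTEGER PRIMARY KEY, someIdColumn INT, aNullableDateColumn TEXT);
INSERT INTO TableA VALUES
  (1, 'no rows in B'), (2, 'only dated rows'),
  (3, 'one undated row'), (4, 'two undated rows');
INSERT INTO TableB (someIdColumn, aNullableDateColumn) VALUES
  (2, '2012-01-01'), (3, NULL), (3, '2012-02-01'), (4, NULL), (4, NULL);
""")

NOT_EXISTS = """
SELECT a.someIdColumn FROM TableA a
WHERE NOT EXISTS (SELECT 1 FROM TableB b
                  WHERE b.someIdColumn = a.someIdColumn
                    AND b.aNullableDateColumn IS NULL)
ORDER BY a.someIdColumn
"""

COUNT_ZERO = """
SELECT a.someIdColumn FROM TableA a
WHERE (SELECT COUNT(*) FROM TableB b
       WHERE b.someIdColumn = a.someIdColumn
         AND b.aNullableDateColumn IS NULL) = 0
ORDER BY a.someIdColumn
"""

ANTI_JOIN = """
SELECT a.someIdColumn FROM TableA a
LEFT JOIN TableB b
  ON b.someIdColumn = a.someIdColumn AND b.aNullableDateColumn IS NULL
WHERE b.primaryKeyColumn IS NULL
ORDER BY a.someIdColumn
"""

def ids(sql):
    return [row[0] for row in conn.execute(sql)]

not_exists_ids = ids(NOT_EXISTS)
count_zero_ids = ids(COUNT_ZERO)
anti_join_ids = ids(ANTI_JOIN)
```

Note that with > 0 instead of = 0, or with the join conditions moved into WHERE, the queries would express EXISTS rather than NOT EXISTS, and the join form could then return a TableA row once per matching TableB row.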
Propel 2 can do:
TableAQuery::create()
->useTableBNotExistsQuery()
->filterByNullableDateColumn(null)
->endUse()
->find();
or
$nestedB = TableBQuery::create()
->filterByNullableDateColumn(null)
->where('TableB.someIdColumn = TableA.someIdColumn');
TableAQuery::create()->whereExists($nestedB)->find();
