I need to LEFT JOIN a MySQL table against a list of a couple of thousand ids.
It feels like I need to temporarily build a table for the join and then delete it, but that just doesn't sound right.
Currently the task is done in code, but it proves pretty slow, and an SQL query might be faster.
My thought was to use ...WHERE ID IN (".$string_of_values.");
But that cuts off the ids that have no match in the table.
So, how can I tell MySQL to LEFT JOIN a table with a list of ids?
As I understand your task, you need to LEFT JOIN your working table to your ids, i.e. the output must contain all of these ids even when there is no matching row in the working table. Am I correct?
If so, then you must convert your list of ids into a rowset.
You have already tried saving them to a table. That is a useful and safe practice. Some additional points:
If your dataset is used once and may be dropped immediately after the final query executes, then you may create this table as TEMPORARY. Then you do not need to care about the table - it will be deleted automatically when the connection is closed, but it may be reused (including editing its data) within that connection until it is closed. Of course, in that case the queries that create and fill this table and the final query must be executed over the same connection.
If your dataset is small enough (roughly, not more than a few megabytes), then you may create this table with the option ENGINE = MEMORY. In this case only the table definition file (a small text file) is actually written to disk, whereas the table body is stored in memory only, so access to it will be fast.
You may create one or more indexes on such a table and improve the final query's performance.
All these options may be combined.
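For instance, a minimal sketch of the temporary MEMORY table route (working_table and the column names are illustrative, not from the question):

-- The id column type should match the working table's key.
CREATE TEMPORARY TABLE tmp_ids (
    id BIGINT NOT NULL,
    PRIMARY KEY (id)            -- an index here can speed up the final join
) ENGINE = MEMORY;

INSERT INTO tmp_ids (id) VALUES (1), (2), (22) /* , ... rest of the list */;

SELECT t.id, w.*
FROM tmp_ids t
LEFT JOIN working_table w ON w.id = t.id;

DROP TEMPORARY TABLE tmp_ids;   -- optional; dropped automatically when the connection closes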
Another option is to create such a rowset dynamically.
In MySQL 5.x the only option is to build the rowset in a subquery, like:
SELECT ...
FROM ( SELECT 1 AS id UNION SELECT 2 UNION SELECT 22 ... ) AS ids_rowset
LEFT JOIN {working tables}
...
In MySQL 8+ you have additional options.
You may do the same but with a CTE:
WITH ids_rowset AS ( SELECT 1 AS id UNION SELECT 2 UNION SELECT 22 ... )
SELECT ...
FROM ids_rowset
LEFT JOIN {working tables}
...
Alternatively, you may transfer your list of ids in some serialized form and parse it into a rowset within the query (in a recursive CTE, or with a table function such as JSON_TABLE).
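For example, a rough sketch of the JSON_TABLE variant (working_table and its id column are placeholder names):

SELECT ids_rowset.id, w.*
FROM JSON_TABLE(
       '[1, 2, 22]',                           -- the serialized id list, e.g. built in PHP
       '$[*]' COLUMNS (id BIGINT PATH '$')
     ) AS ids_rowset
LEFT JOIN working_table w ON w.id = ids_rowset.id;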
All these methods create a single-use rowset (of course, a CTE can be reused within the query). And this rowset cannot be indexed to improve the query (the server may index the dataset during query execution if it finds that reasonable, but you cannot influence this).
I am converting an Access database to a new format. Currently all the data resides in MySQL.
For the purposes of this question, there are 3 tables: tbl_Bills, tbl_Documents, and tbl_Receipts.
I wrote an outer join query, as some bills have documents and receipts while others don't, and I need a full listing of each set under those circumstances, to be processed by a PHP script later on.
The problem is that the primary identifier, we'll call fld_CommonID, happens to exist in duplicate. For example, 3 bills have the same identifier, with different information. 3 documents and 3 receipts match those 3 bills.
So as you might have guessed, my join query results in 9 indistinct rows (6 duplicates), when there should be 3 (one join from each table). An inner join excludes data that isn't defined in the other table, and so doesn't work for my needs.
SO ... what I'm thinking I want to do is update those 3 records in each table (across all rows that have duplicates) so that they carry a unique counter id (#1, #2, and #3 respectively), letting me perform join queries on them uniquely per row.
Is that possible without running PHP code to select the duplicates ordered by natural table order, followed by updating them with a counter?
Would you advise that I go that route (scripted) instead of some magical SQL query to do such a thing, if such a query can be made?
Or is it possible to outer join based on natural table order (pretty sure that's impossible)?
Writing this answer simply to close the question.
Inner joins would be perfect if there were a way to link duplicate fields in separate tables based on natural order (no primary key). The problem isn't that I lack a query; it's that the database is poorly structured, which is a problem better solved with code than with complex queries.
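For completeness, a hedged sketch of the SQL-only route on MySQL 8+ (the column names row_seq and dup_counter are hypothetical, and the auto-increment numbering only approximates natural table order, which is part of why fixing this in code may still be the safer choice):

-- Assumes tbl_Bills has no primary key yet.
ALTER TABLE tbl_Bills ADD COLUMN row_seq INT AUTO_INCREMENT PRIMARY KEY;
ALTER TABLE tbl_Bills ADD COLUMN dup_counter INT;

-- Number the duplicates per fld_CommonID using the new surrogate key.
UPDATE tbl_Bills b
JOIN (
    SELECT row_seq,
           ROW_NUMBER() OVER (PARTITION BY fld_CommonID ORDER BY row_seq) AS rn
    FROM tbl_Bills
) numbered ON numbered.row_seq = b.row_seq
SET b.dup_counter = numbered.rn;
-- The same steps would have to be repeated for tbl_Documents and tbl_Receipts.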
I am working on converting a prototype web application into something that can be deployed. In some places the prototype has queries that select all the fields from a table even though only one field is needed, or the query is only being used to check whether a record exists. Most of these are single-row queries.
I'm considering changing these queries so that they fetch only what is really relevant, i.e.:
select * from users_table where <some condition>
vs
select name from users_table where <some condition>
I have a few questions:
Is this a worthy optimization in general?
In which kind of queries might this change be particularly good? For example, would this improve queries where joins are involved?
Besides the SQL impact, would this change be good at the PHP level? For example, the returned array will be smaller (a single column vs multiple columns with data).
Thanks for your comments.
If I were to answer all of your three questions in a single word, I would definitely say YES.
You probably wanted more than just "Yes"...
SELECT * is "bad practice": If you read the results into a PHP non-associative array; then add a column; now the array subscripts are possibly changed.
If the WHERE is complex enough, or you have GROUP BY or ORDER BY, and the optimizer decides to build a tmp table, then * may lead to several inefficiencies: having to use MyISAM instead of MEMORY; the tmp table will be bulkier; etc.
EXISTS (SELECT * FROM ...) comes back with 0 or 1 -- even simpler.
You may be able to combine EXISTS (or a suitable equivalent JOIN) with other queries, thereby avoiding an extra round trip to the server.
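A couple of hedged examples (users_table is from the question; the orders table and its columns are made up for illustration):

-- An existence check that returns just 0 or 1 instead of a whole row:
SELECT EXISTS (SELECT * FROM users_table WHERE name = 'alice') AS user_found;

-- The same check folded into another query to avoid a second round trip:
SELECT o.id, o.total,
       EXISTS (SELECT * FROM users_table u WHERE u.id = o.user_id) AS has_user
FROM orders o
WHERE o.created_at >= '2024-01-01';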
I'm working on an existing application that uses some JOIN statements to create "immutable" objects (i.e. the results are always JOINed to create a processable object - results from only one table will be meaningless).
For example:
SELECT r.*,u.user_username,u.user_pic FROM articles r INNER JOIN users u ON u.user_id=r.article_author WHERE ...
will yield a result of type, let's say, ArticleWithUser that is necessary to display an article with the author details (like a blog post).
Now, I need to make a table featured_items which contains the columns item_type (article, file, comment, etc.) and item_id (the article's, file's or comment's id), and query it to get a list of the featured items of some type.
Assuming tables other than articles contain whole objects that do not need JOINing with other tables, I can simply pull them with a dynamically generated query like
SELECT some_table.* FROM featured_items RIGHT JOIN some_table ON some_table.id = featured_items.item_id WHERE featured_items.type = X
But what if I need to get a featured item from the aforementioned type ArticleWithUser? I cannot use the dynamically generated query because the syntax will not suit two JOINs.
So, my question is: is there a better practice to retrieve results that are always combined together? Maybe do the second JOIN on the application end?
Or do I have to write special code for each of those combined results types?
Thank you!
A view can be thought of, roughly, as being like a table.
https://dev.mysql.com/doc/refman/5.0/en/create-view.html
Views can incorporate joins, and other views. Keep in mind that upon creation they take a snapshot of the columns that exist at that time on the underlying tables, so ALTER TABLE statements that later add columns to those tables are not picked up by SELECT *.
An old article by Peter Zaitsev, which I consider required reading on the subject of MySQL views:
To answer your question as to whether they are widely used, they are a major part of the database developer's toolkit, and in some situations offer significant benefits, which have more to do with indexing than with the nature of views, per se.
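As a hedged sketch of how that could look for the ArticleWithUser case (the article_id column name is an assumption, since the question only names article_author):

CREATE VIEW article_with_user AS
SELECT r.*, u.user_username, u.user_pic
FROM articles r
INNER JOIN users u ON u.user_id = r.article_author;

-- The dynamically generated featured_items query then needs only one join,
-- exactly as for the single-table object types:
SELECT a.*
FROM featured_items f
JOIN article_with_user a ON a.article_id = f.item_id
WHERE f.item_type = 'article';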
I have to work with a mssql database that I have no control over, so sadly, I can't change the structure at all. This database is setup so that there are 2 tables Entry and Area. In the Area table, there is a column sArea that I need to look up based on a value ixEntry. In the Entry table, I can do a look up (the variables are PHP variables):
SELECT sTitle,ixCategory,ixArea FROM Entry WHERE ixEntry='$ixEntry'
and then do a second query
SELECT sArea FROM Area WHERE ixArea='{$return['ixArea']}'
Which works just fine, except that, with the way the network is set up, there is considerably more overhead with two queries.
How can I combine these two queries so that I have a result that would be the equivalent of SELECT sTitle,ixCategory,sArea FROM Entry WHERE ixEntry='$ixEntry' as if sArea were in the Entry table, not ixArea?
SELECT e.sTitle, e.ixCategory, a.sArea FROM Entry e
INNER JOIN Area a ON e.ixArea = a.ixArea
WHERE e.ixEntry='$ixEntry'
I have an array of data:
$ary = array(
array("domain"=>"domain1", "username"=>"username1"),
array("domain"=>"domain1", "username"=>"username2"),
array("domain"=>"domain2", "username"=>"username3"),
);
I need to use this data to retrieve data from a MySql database with the following table structure (simplified for illustration).
domain_table
    domain_id
    domain
user_table
    user_id
    user
stuff_table
    stuff_id
    ... details
link_table
    link_id
    user_id            -- The user we are searching for connections on
    connected_user_id  -- from the array above
    stuff_id
I need to fetch every row in the stuff table for a single user_id that also has a connected_user_id from the array.
I'm using PDO
There may be hundreds (possibly thousands) of entries in $ary.
I could generate a very large query by looping through the array and adding loads of joins.
I could perform a single query for each row in $ary.
I could create a temporary table with $ary and use a join.
Something else?
What is the best way - fastest processor time without being too arcane - to achieve this?
Performing a single query for each row is a bad approach because it is slow.
One query with many joins is better than one query per row.
If possible, create a view and use it.
If your entire dataset is HUGE and doesn't fit in memory, joins shouldn't be your choice.
Do sequential selects. Select rows from your link_table and gather the user_id values out of the result in PHP. Then select rows from user_table using "WHERE user_id IN (?)". Handle the grouping of results in PHP.
Even with large tables, selects by key will be fast, and having 2-5 selects instead of 1 is not a problem.
Joins will be fast as long as your DB fits into RAM; after that, problems arise.
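A rough sketch of that sequential-select flow, using the schema from the question (the :target_user_id placeholder and the IN (...) lists are filled in by PHP/PDO between the queries, and matching the connected users back to the (domain, username) pairs from $ary is left to the PHP side):

-- 1) link rows for the user whose connections we are searching on
SELECT connected_user_id, stuff_id
FROM link_table
WHERE user_id = :target_user_id;

-- 2) the connected users, fetched by key with the ids collected from query 1
SELECT user_id, user
FROM user_table
WHERE user_id IN (/* connected_user_id values from query 1 */);

-- 3) the stuff rows, again fetched by key
SELECT *
FROM stuff_table
WHERE stuff_id IN (/* stuff_id values from query 1 */);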