I have a database and an external file. What these two share are reference codes for products.
But in the external file I have all my reference codes saved down, whilst plenty are still missing in the database. Is there a way to make a query so that I can check what values are missing in my database, in a given table?
There's no need to worry about how the XML interfaces with the database. I already have that down through PHP and simplexml. I am mostly struggling with the query to use in this case.
Database     XML File
AJS2S        AJS2S
ABBB2        ABBB2
(missing)    JJI90K
(missing)    JJJJ92
If you have a list of values at hand and you want to check which ones are missing from your table, you can enumerate them in a union all subquery, then use not exists:
select x.product_code
from (
select 'AJS2S' as product_code
union all select 'ABBB2'
union all ...
) x
where not exists (select 1 from mytable t where t.product_code = x.product_code)
Or, in very recent versions of MySQL (8.0.19 or higher), you can use the values() row constructor:
select x.product_code
from (values row('AJS2S'), row('ABBB2'), ...) x(product_code)
where not exists (select 1 from mytable t where t.product_code = x.product_code)
Of course, if you have your xml data already loaded in a table, say xmltable, then you can use that instead of the subquery:
select x.product_code
from xmltable x
where not exists (select 1 from mytable t where t.product_code = x.product_code)
You would use not exists:
select code
from xml
where not exists (select 1 from database d where d.code = xml.code);
This retrieves each code -- so it might have many duplicates. You can summarize using group by:
select code, count(*)
from xml
where not exists (select 1 from database d where d.code = xml.code)
group by code;
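The NOT EXISTS pattern from the answers above can be tried end to end with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL and PHP, and the table names products and external_codes, as well as the helper find_missing_codes, are made up for the illustration. The codes parsed from the XML file are loaded into a temporary table, and the database then reports which of them it does not have.

```python
import sqlite3


def find_missing_codes(conn, external_codes):
    """Return the codes from external_codes that have no row in products."""
    cur = conn.cursor()
    # Hypothetical staging table holding the codes read from the external file.
    cur.execute("CREATE TEMP TABLE IF NOT EXISTS external_codes (product_code TEXT)")
    cur.execute("DELETE FROM external_codes")
    cur.executemany(
        "INSERT INTO external_codes VALUES (?)",
        [(code,) for code in external_codes],
    )
    # Keep only the external codes for which no matching product row exists.
    cur.execute(
        """
        SELECT DISTINCT x.product_code
        FROM external_codes x
        WHERE NOT EXISTS (
            SELECT 1 FROM products t WHERE t.product_code = x.product_code
        )
        ORDER BY x.product_code
        """
    )
    return [row[0] for row in cur.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_code TEXT)")
conn.executemany("INSERT INTO products VALUES (?)", [("AJS2S",), ("ABBB2",)])

missing = find_missing_codes(conn, ["AJS2S", "ABBB2", "JJI90K", "JJJJ92"])
print(missing)
```

With the sample codes from the question, the two codes that exist only in the external list are the ones returned.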
Related
I am building a website as a diagnostic aid for neurological conditions. It is coded in html and communicates with a MySQL database via PHP. The primary table which feeds information to the website is structured as follows:
Image showing table structure with rows representing Neurological Conditions and columns providing information on symptoms associated with these conditions
The table above can be reproduced using the following MySQL code:
CREATE TABLE IF NOT EXISTS my_table (
`Condition` VARCHAR(22) CHARACTER SET utf8,
`Diarrhoea` INT,
`Headache` INT,
`Hyporeflexia` INT,
`Hypoaesthesia_Spinothalamic` INT
);
INSERT INTO my_table VALUES
('Abetalipoproteinaemia',1,NULL,1,NULL),
('Caffeine toxicity',1,1,NULL,NULL),
('Vitamin B12 deficiency',NULL,NULL,1,2);
SELECT * FROM my_table;
Cell values are as follows:
(m,n)=1 if condition and symptom are associated
(m,n)=2 if condition and symptom CANNOT be associated. The presence of this symptom excludes the condition as a possible diagnosis.
(m,n)=null if no information exists or if symptom and condition are not associated
I'm struggling to write an SQL query which will identify all the columns (n) for a specific condition (m) where the value of the cell (m,n) = 2.
So far my reading has highlighted ideas about pivot tables (I can't see how I would be able to use them for this problem) and database normalisation which I don't think is possible because of the other queries I am running on the same table.
An example based on the table above:
Patient presents with hyporeflexia
SQL query identifies this could be caused by either "abetalipoproteinaemia" or "vitamin B12 deficiency" - this all works fine already
I want to establish whether any of the conditions identified (abetalipoproteinaemia and vitamin B12 deficiency) have symptoms that would exclude the diagnosis (any cell in that row = 2) and return the name of any column (symptom) for which this is the case.
A query to the SQL database identifies vitamin B12 deficiency would be excluded as a possible diagnosis if spinothalamic hypoesthesia is present - this will be fed back to the html display.
Any help would be much appreciated - thanks for your time!
I think it would be more usual to arrange the data something like this - apologies for any spelling errors or poor terminology, but if you pay peanuts...
condition                 symptom                        exclusion
Abetalipoproteinaemia     Diarrhoea                      0
Abetalipoproteinaemia     Hyporeflexia                   0
Caffeine toxicity         Diarrhoea                      0
Caffeine toxicity         Headache                       0
Vitamin B12 deficiency    Hyporeflexia                   0
Vitamin B12 deficiency    Hypoaesthesia Spinothalamic    1
You would then take this one or two steps further, and have a table for symptoms, a table for conditions, and a table which says which symptom relates to which condition, and how.
Query pattern would be much more straightforward if the table were designed following normative relational patterns.
Consider the resultset returned by a query of this form:
SELECT v.condition
, v.symptom
, v.associated_or_excluded
FROM ( SELECT t1.`Condition` AS `condition`
, 'Diarrhoea' AS `symptom`
, t1.`Diarrhoea` AS `associated_or_excluded`
FROM mytable t1
UNION ALL
SELECT t2.`Condition`
, 'Headache'
, t2.`Headache`
FROM mytable t2
UNION ALL
SELECT t3.`Condition`
, 'Hyporeflexia'
, t3.`Hyporeflexia`
FROM mytable t3
UNION ALL
SELECT t4.`Condition`
, 'Hypoaesthesia_Spinothalamic'
, t4.`Hypoaesthesia_Spinothalamic`
FROM mytable t4
) v
We could use that query as an inline view (a rowsource) for an outer query, or we could populate a new table with its result using INSERT ... SELECT, which converts the data once.
With that resultset, with the data in standard relational form, we avoid the struggle by writing a simple query like this:
SELECT t.symptom
FROM ( ... ) t
WHERE t.condition = 'Vitamin B12 deficiency'
AND t.associated_or_excluded = 2
That will return the symptoms that exclude a particular condition
(or, to put it in terms of the original question, the columns n where a value of 2 is found at the intersection of m and n).
Note that ( ... ) is replaced with a table name or with an inline view returning the result from query above.
Note that the entirety of the "struggle" is inside the parens, with the inline view query that gets the data represented in a suitable form.
SELECT t.symptom
FROM ( -- inline view query
SELECT t1.`Condition` AS `condition`
, 'Diarrhoea' AS `symptom`
, t1.`Diarrhoea` AS `associated_or_excluded`
FROM mytable t1
UNION ALL
SELECT t2.`Condition`
, 'Headache'
, t2.`Headache`
FROM mytable t2
UNION ALL
SELECT t3.`Condition`
, 'Hyporeflexia'
, t3.`Hyporeflexia`
FROM mytable t3
UNION ALL
SELECT t4.`Condition`
, 'Hypoaesthesia_Spinothalamic'
, t4.`Hypoaesthesia_Spinothalamic`
FROM mytable t4
) t
WHERE t.condition = 'Vitamin B12 deficiency'
AND t.associated_or_excluded = 2
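To see the unpivot idea run against the sample data from the question, here is a minimal sketch. It uses Python's built-in sqlite3 module rather than MySQL and PHP, and the helper name excluding_symptoms, the SYMPTOM_COLUMNS list and the aliases condition_name and flag are invented for the example. The inline view is generated from the list of symptom columns instead of being typed out by hand, which is one way to keep the query manageable as columns are added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_table (
    "Condition" TEXT,
    Diarrhoea INTEGER,
    Headache INTEGER,
    Hyporeflexia INTEGER,
    Hypoaesthesia_Spinothalamic INTEGER
);
INSERT INTO my_table VALUES
    ('Abetalipoproteinaemia', 1, NULL, 1, NULL),
    ('Caffeine toxicity', 1, 1, NULL, NULL),
    ('Vitamin B12 deficiency', NULL, NULL, 1, 2);
""")

# Fixed list of symptom columns; never build this from user input.
SYMPTOM_COLUMNS = [
    "Diarrhoea",
    "Headache",
    "Hyporeflexia",
    "Hypoaesthesia_Spinothalamic",
]


def excluding_symptoms(conn, condition):
    """Return the symptom columns whose cell is 2 for the given condition."""
    template = (
        """SELECT "Condition" AS condition_name, '{col}' AS symptom, """
        """"{col}" AS flag FROM my_table"""
    )
    # One SELECT per symptom column, glued together with UNION ALL.
    inline_view = " UNION ALL ".join(
        template.format(col=col) for col in SYMPTOM_COLUMNS
    )
    query = (
        "SELECT v.symptom FROM (" + inline_view + ") v "
        "WHERE v.condition_name = ? AND v.flag = 2"
    )
    return [row[0] for row in conn.execute(query, (condition,))]


print(excluding_symptoms(conn, "Vitamin B12 deficiency"))
print(excluding_symptoms(conn, "Caffeine toxicity"))
```

The condition name is passed as a bound parameter; only the column names, which come from a fixed list, are interpolated into the SQL text.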
SELECT EXISTS
(SELECT * FROM `table` WHERE deleted_at IS NULL AND the_date = '$the_date' AND company_name = '$company_name' AND purchase_country = '$p_country' AND lot = '$lot_no') AS numofrecords
What is wrong with this mysql query?
It is still allowing duplicate inserts (about 1 in 1000 records). Around 100 users are making entries, so I assume the traffic is not that big. I do not have access to the database metrics, so I cannot be sure.
The EXISTS condition is normally used in a WHERE clause. In your case, the first select doesn't specify the table and the condition.
One example:
SELECT *
FROM customers
WHERE EXISTS (SELECT *
FROM order_details
WHERE customers.customer_id = order_details.customer_id);
Try writing your statement like this, and if it returns duplicated data, just use DISTINCT (SELECT DISTINCT * ...).
Another approach for you:
INSERT INTO your_table SELECT * FROM `table` GROUP BY your_column_to_deduplicate;
The answer from @Nick gave the clues to solve the issue. Running the EXISTS check and the INSERT as separate statements was not the best approach: two users could both get 0 from the check and then both INSERT. A single-statement query with INSERT ... ON DUPLICATE KEY UPDATE ... was the way to go.
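The race described here goes away once the database itself enforces uniqueness and the insert is a single statement. The sketch below illustrates that idea with Python's built-in sqlite3 module, not MySQL: INSERT OR IGNORE is SQLite syntax standing in for MySQL's INSERT ... ON DUPLICATE KEY UPDATE, the table name purchases and the helper insert_once are hypothetical, and the deleted_at soft-delete column from the question is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE purchases (
        the_date TEXT NOT NULL,
        company_name TEXT NOT NULL,
        purchase_country TEXT NOT NULL,
        lot TEXT NOT NULL,
        -- The unique constraint is what actually prevents duplicates.
        UNIQUE (the_date, company_name, purchase_country, lot)
    )
""")


def insert_once(conn, the_date, company_name, purchase_country, lot):
    """Insert the row unless the same key already exists.

    Returns True if a row was inserted, False if it was a duplicate.
    """
    cur = conn.execute(
        "INSERT OR IGNORE INTO purchases VALUES (?, ?, ?, ?)",
        (the_date, company_name, purchase_country, lot),
    )
    conn.commit()
    return cur.rowcount == 1


first = insert_once(conn, "2020-01-01", "Acme", "DE", "L-1")
second = insert_once(conn, "2020-01-01", "Acme", "DE", "L-1")
total = conn.execute("SELECT COUNT(*) FROM purchases").fetchone()[0]
print(first, second, total)
```

Because the check and the write happen inside one statement guarded by the constraint, there is no window in which two clients can both decide the row is absent.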
SELECT id, FIO, parent_id
FROM users
WHERE parent_id =
(
SELECT id
FROM users
WHERE parent_id =
(
SELECT id
FROM users
WHERE id = 16
)
)
So here I am building a hierarchy tree, first selecting the root parent, then its children, and so on down to the 24th level of depth.
The question is: How to select more than one column from the inner queries?
Because I need the other fields of those rows to display info like: name, surname, age
It looks like I can only get those columns of rows in the outer query (the topmost).
P.S.: I don't want to use joins because they generate duplicate fields.
Is there a solution?
You could iterate on the SQL side using MySQL user variables. This returns all children of one parent node, with all their data, without repeating yourself (and thus without imposing a limit on the depth of your tree).
Something like this (500 being the id of the parent to start with):
SELECT
id,
parent_id,
name,
'0' as depth,
@tree_ids := id AS foo
FROM
tree,
(SELECT @tree_ids := '', @depth := -1) vars
WHERE id = 500
UNION
SELECT
id,
parent_id,
name,
@depth := IF(parent_id = 500, 1, @depth + 1) AS depth,
@tree_ids := CONCAT(id, ',', @tree_ids) AS foo
FROM
tree
WHERE FIND_IN_SET(parent_id, @tree_ids) OR parent_id = 500
See a working example at SQLfiddle
Note that this performs really badly on larger datasets, because MySQL will not use your indexes and will do a full table scan instead. (I don't understand why it isn't using indexes; that's just how it is. If someone can explain the indexing issue or has advice, please comment!)
= comparisons work on only a single value. You can use IN to compare against multiple values:
SELECT ...
FROM yourtable
WHERE somefield IN (select somevalue from othertable);
Storing hierarchical data in MySQL and getting it out is not as simple as that.
Look into this: https://stackoverflow.com/a/4346009/9094
You will need more data to work with.
It seems your DB relationship is set up to be MPTT; here is a good blog post explaining how to query MySQL MPTT data:
http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/
Have a look at the "Retrieving a Full Tree" example.
In summary, it can be done with joins.
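As a rough illustration of the join approach for a fixed number of levels, the sketch below joins the users table to itself and gives every selected column an explicit alias, which also avoids the duplicate field names the question was worried about. It runs on Python's built-in sqlite3 module rather than MySQL, and the sample rows and the aliases root, child and grandchild are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, FIO TEXT, parent_id INTEGER);
INSERT INTO users VALUES
    (16, 'Root', NULL),
    (17, 'Child A', 16),
    (18, 'Child B', 16),
    (19, 'Grandchild A1', 17);
""")

# One self-join per tree level; every level can expose as many columns
# as needed, each under its own alias.
rows = conn.execute(
    """
    SELECT root.id       AS root_id,       root.FIO       AS root_name,
           child.id      AS child_id,      child.FIO      AS child_name,
           grandchild.id AS grandchild_id, grandchild.FIO AS grandchild_name
    FROM users AS root
    LEFT JOIN users AS child      ON child.parent_id = root.id
    LEFT JOIN users AS grandchild ON grandchild.parent_id = child.id
    WHERE root.id = ?
    ORDER BY child.id, grandchild.id
    """,
    (16,),
).fetchall()

for row in rows:
    print(row)
```

Each extra level of depth needs one more LEFT JOIN, so this form suits trees with a known, small maximum depth.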
I am not 100% sure if I understood exactly what you mean, but if you want to select all columns separately from the table in a subselect...
col1, col2, col3, col4
you would need a separate subselect for each column, each one matching against the same WHERE condition. Example:
SELECT main_table.*,
(SELECT col1 FROM inner_table WHERE inner_table.some_column = main_table.some_column) AS col1,
(SELECT col2 FROM inner_table WHERE inner_table.some_column = main_table.some_column) AS col2, ...
FROM main_table
I have an SQL query which links 3 tables using UNION:
$sql ="(SELECT Drive.DriveID,Ram.Memory from Drive,Ram where Drive.DriveID = Ram.RamID) UNION
(SELECT Drive.DriveID,External.Memory from Drive, External where Drive.DriveID = External.ExtID)";
Suppose I want to get Ram.Name as well. How do I do this? If I use Ram.Name in the first SELECT statement it would not produce the correct result.
Any method for tackling this? I want to do it using UNION.
In a UNION query, every SELECT must return the same number of columns, in the same order.
Therefore you'd need to have
(SELECT Drive.DriveID,Ram.Memory,Ram.Name
from Drive,Ram
where Drive.DriveID = Ram.RamID)
UNION
(SELECT Drive.DriveID,External.Memory, '' as Name
from Drive, External
where Drive.DriveID = External.ExtID)
Or, if your External table has a Name field, you could include that one instead of an empty string.
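The padding trick can be checked quickly with a throwaway database. This sketch uses Python's built-in sqlite3 module instead of MySQL and PHP; the sample rows are invented, and explicit JOIN syntax is used in place of the comma joins from the question, which is only a stylistic assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Drive (DriveID INTEGER);
CREATE TABLE Ram (RamID INTEGER, Memory TEXT, Name TEXT);
CREATE TABLE "External" (ExtID INTEGER, Memory TEXT);
INSERT INTO Drive VALUES (1), (2);
INSERT INTO Ram VALUES (1, '8GB', 'Kingston');
INSERT INTO "External" VALUES (2, '1TB');
""")

# Both branches return three columns in the same order; the second branch
# pads the missing Name column with an empty string.
rows = conn.execute(
    """
    SELECT d.DriveID, r.Memory, r.Name
    FROM Drive d JOIN Ram r ON d.DriveID = r.RamID
    UNION
    SELECT d.DriveID, e.Memory, '' AS Name
    FROM Drive d JOIN "External" e ON d.DriveID = e.ExtID
    ORDER BY 1, 2
    """
).fetchall()

print(rows)
```

Using NULL instead of the empty string would work the same way if the application prefers to distinguish "no name" from a blank name.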
I'm trying to create a query that will select all dates between two dates
This is my query:
$query = "SELECT DISTINCT * FROM D1,D2
WHERE D1.DATE_ADDED BETWEEN '$date1' AND '$date2' AND D1.D1_ID = D2.D2_ID";
The trouble is, it is not returning anything, but not producing an error either
So I tried inputting it directly into phpMyAdmin like this
SELECT DISTINCT * FROM D1,D2
WHERE D1.DATE_ADDED BETWEEN '2011-01-01' AND '2011-12-12'
AND D1.D1_ID = D2.D2_ID
then like this
SELECT DISTINCT * FROM D1,D2
WHERE D1.DATE_ADDED BETWEEN '2011-01-01' AND '2011-12-12'
and like this
SELECT * FROM D1
WHERE DATE_ADDED BETWEEN '2011-01-01' AND '2011-12-12'
and I just get
MySQL returned an empty result set (i.e. zero rows). ( Query took 0.0003 sec )
Yes, my tables exist, and so do the columns :)
In the first cases, the lack of results could be because of the inner join. For a row to be in the result set, there must be a matching record in both tables, i.e. a record from d1 will not appear unless d2 also has that id in its d2_id column. To resolve this, if it is correct for your business logic, use a left join.
However, the last of your cases (without the join) suggests the reason is a lack of matching records in the first (left) table d1.
Without the full dataset we can't really comment further, since all the code you are running is perfectly valid.
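The difference between the inner join and the left join is easy to see on a tiny dataset. The sketch below is an illustration only: it uses Python's built-in sqlite3 module instead of MySQL, stores the dates as ISO-formatted text (which compares correctly for the YYYY-MM-DD format), and the sample rows and the NOTE column are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE D1 (D1_ID INTEGER, DATE_ADDED TEXT);
CREATE TABLE D2 (D2_ID INTEGER, NOTE TEXT);
INSERT INTO D1 VALUES (1, '2011-03-15'), (2, '2011-06-01'), (3, '2012-01-10');
INSERT INTO D2 VALUES (1, 'has match');
""")

DATE_RANGE = ("2011-01-01", "2011-12-12")

# Inner join: a D1 row is returned only if D2 has the same id.
inner_rows = conn.execute(
    """
    SELECT D1.D1_ID, D2.NOTE
    FROM D1 JOIN D2 ON D1.D1_ID = D2.D2_ID
    WHERE D1.DATE_ADDED BETWEEN ? AND ?
    ORDER BY D1.D1_ID
    """,
    DATE_RANGE,
).fetchall()

# Left join: every D1 row in the date range is returned, matched or not.
left_rows = conn.execute(
    """
    SELECT D1.D1_ID, D2.NOTE
    FROM D1 LEFT JOIN D2 ON D1.D1_ID = D2.D2_ID
    WHERE D1.DATE_ADDED BETWEEN ? AND ?
    ORDER BY D1.D1_ID
    """,
    DATE_RANGE,
).fetchall()

print(inner_rows)
print(left_rows)
```

If both queries come back empty against the real data, the date filter on D1 itself is the place to look, as the answer above suggests.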
If you always want to select an entire year, it is easier to select it like this:
SELECT * FROM D1 WHERE YEAR(DATE_ADDED) = 2011;
Please try the code below:
SELECT DISTINCT * FROM D1,D2
WHERE D1.DATE_ADDED BETWEEN DATE_FORMAT('2011-01-01','%Y-%m-%d')
AND DATE_FORMAT('2011-12-12','%Y-%m-%d')
AND D1.D1_ID = D2.D2_ID