I have a MySQL table:
id INT(10),
property_id INT(10),
value_id INT(10),
..
There's an index 'combination' on property_id + value_id
I have an array containing for example [1 => 68, 4 => 8, 9 => 15, ...]
Instead of this query:
SELECT * FROM table
WHERE (property_id = 1 && value_id = 68)
|| (property_id = 4 && value_id = 8)
|| (property_id = 9 && value_id = 15)
|| ...
I hoped something like this would work:
SELECT * FROM table WHERE combination IN ('1_68', '4_8', '9_15', ...)
I now know this does not work. Is there another way I can accomplish this?
In MySQL you can use tuples for conditions:
SELECT * FROM table
WHERE (property_id, value_id) IN (
(1, 68),
(4, 8),
(9, 15)
);
This is how it should work. Unfortunately, MySQL doesn't use the index properly for this form. We can only hope that it will some day (AFAIK it works in PostgreSQL).
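As a rough sketch of how the tuple condition could be generated from the array in the question, the Python helper below builds the placeholder list and the matching flat parameter list. The function name `build_pair_in_query` and the table name `my_table` are made up for illustration, and the `%s` placeholder style is an assumption that fits drivers such as PyMySQL; adjust it if your driver uses something else.

```python
def build_pair_in_query(pairs, table="my_table"):
    """Build a row-constructor IN query for {property_id: value_id} pairs.

    Returns (sql, params); params is a flat list matching the %s placeholders.
    The table name is interpolated directly, so it must come from trusted code.
    """
    if not pairs:
        raise ValueError("pairs must not be empty: IN () is not valid SQL")
    # One "(%s, %s)" group per (property_id, value_id) pair.
    placeholders = ", ".join(["(%s, %s)"] * len(pairs))
    sql = (
        f"SELECT * FROM {table} "
        f"WHERE (property_id, value_id) IN ({placeholders})"
    )
    # Flatten {1: 68, 4: 8} into [1, 68, 4, 8].
    params = [n for pair in pairs.items() for n in pair]
    return sql, params


sql, params = build_pair_in_query({1: 68, 4: 8, 9: 15})
print(sql)
print(params)
```

Passing the values as parameters instead of concatenating them into the string keeps the query safe even when the pairs come from user input.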
If it is about performance and you need it now, then you might consider using an indexed virtual (generated) column (available since MySQL 5.7.8).
ALTER TABLE `locations`
ADD COLUMN `combination` VARCHAR(21) GENERATED ALWAYS AS (CONCAT(property_id, '_', value_id)),
ADD INDEX `combination` (`combination`);
And now you can use your query
SELECT * FROM table WHERE combination IN ('1_68', '4_8', '9_15', ...)
If you want to save some memory you can combine the two INTs into one BIGINT:
ADD COLUMN `combination` BIGINT GENERATED ALWAYS AS ((property_id << 32) + value_id)
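To make the bit-shift idea concrete, here is a small Python sketch of the same arithmetic, so the lookup keys can be computed on the client side. The names `pack` and `unpack` are invented for this example, and it assumes both ids are non-negative and fit in 32 bits.

```python
def pack(property_id, value_id):
    """Mirror (property_id << 32) + value_id for non-negative 32-bit ids."""
    if not (0 <= property_id < 2**32 and 0 <= value_id < 2**32):
        raise ValueError("both ids must fit in an unsigned 32-bit integer")
    return (property_id << 32) + value_id


def unpack(combination):
    """Split a packed key back into (property_id, value_id)."""
    return combination >> 32, combination & 0xFFFFFFFF


# Keys to put into "WHERE combination IN (...)" for the pairs in the question.
keys = [pack(p, v) for p, v in {1: 68, 4: 8, 9: 15}.items()]
print(keys)
print(unpack(keys[0]))
```

Because the high 32 bits hold property_id and the low 32 bits hold value_id, two different pairs can never collide as long as the stated range assumption holds.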
You can also just use UNION ALL
SELECT * FROM table WHERE (property_id, value_id) = (1,68)
UNION ALL
SELECT * FROM table WHERE (property_id, value_id) = (4,8)
UNION ALL
SELECT * FROM table WHERE (property_id, value_id) = (9,15)
This will be fast. It's a shame that MySQL isn't doing that trivial optimisation.
You could also use this, although wrapping the columns in CONCAT() means MySQL can't use the index on them:
SELECT * FROM table WHERE concat(property_id,'_',value_id) IN ('1_68', '4_8', '9_15', ...)
Related
I have a table with a lot of columns. I want to select all of the columns, but only one row for each distinct value of one of them.
This works for me, but I don't get all the columns:
$result = mysql_query("SELECT DISTINCT company FROM table t $order");
I also tried these, but they don't do anything:
$result = mysql_query("SELECT * DISTINCT company FROM table t $order");
$result = mysql_query("SELECT DISTINCT company * FROM table t $order");
EDIT
My table has a lot of columns let's say it has 5 so an example of my records is this:
company x y price numOfEmpl
one 1 5 1.3 15
one 2 6 1.4 15
two 3 7 1.5 16
three 4 8 1.6 17
So I want to cut the second line and take all the others.
The DISTINCT keyword can be used to return only distinct (different) values within defined columns, like
SELECT DISTINCT column_name,column_name
or you can return the amount of DISTINCT values by
SELECT COUNT(DISTINCT column_name)
See some samples on W3Schools
I think you might need to use a separate SQL query for all the records
Edited previous answer based on extra information
Have a solution for you in MySQL and in SQL Server if you need it
MySQL Example (Using User Variables/Sub-Queries)
CREATE TABLE SomeTable (company VARCHAR(20), x INT, y INT, price FLOAT, numOfEmploy INT);
INSERT INTO SomeTable
(company, x, y, price, numOfEmploy)
VALUES
('one', 1, 5, 1.3, 15),
('one', 1, 6, 1.4, 15),
('two', 1, 7, 1.5, 16),
('three', 1, 8, 1.6, 17);
SET @count = NULL, @value = NULL;
SELECT company, x, y, price, numOfEmploy FROM (
SELECT
company, x, y, price, numOfEmploy,
@count := IF(@value = company, @count + 1, 1) AS rc,
@value := company
FROM SomeTable
) AS grouped_companies WHERE rc = 1
SQL Server Example (Using CTE)
--Create the table
declare @sometable table ( company varchar(10), x int, y int, price float, numOfEmploy int)
--insert the data
insert into @sometable values ('one', 1, 5, 1.3, 15)
insert into @sometable values ('one', 2, 6, 1.4, 15)
insert into @sometable values ('two', 3, 7, 1.5, 16)
insert into @sometable values ('three', 4, 8, 1.6, 17)
--WITH Common Table Expression
;WITH grouped_companies AS (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY company
ORDER BY company) AS rc
FROM @sometable)
SELECT gc.company, gc.x, gc.y, gc.price, gc.numOfEmploy
FROM grouped_companies gc
WHERE gc.rc = 1
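For readers who want to try the "one row per company" idea without a MySQL or SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. Note that it swaps in a different technique from the two answers above: it keeps the row with the smallest SQLite `rowid` in each group instead of using user variables or ROW_NUMBER(). The table name `some_table` is just an illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE some_table "
    "(company TEXT, x INT, y INT, price REAL, numOfEmploy INT)"
)
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?, ?, ?, ?)",
    [
        ("one", 1, 5, 1.3, 15),
        ("one", 2, 6, 1.4, 15),
        ("two", 3, 7, 1.5, 16),
        ("three", 4, 8, 1.6, 17),
    ],
)

# MIN(rowid) per company identifies the first inserted row of each group;
# the outer query then returns every column of exactly those rows.
rows = conn.execute(
    """
    SELECT company, x, y, price, numOfEmploy
    FROM some_table
    WHERE rowid IN (SELECT MIN(rowid) FROM some_table GROUP BY company)
    ORDER BY rowid
    """
).fetchall()

for row in rows:
    print(row)
```

With the sample data from the question, the second "one" row is dropped and the other three rows are returned with all of their columns.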
I want to execute a query that finds one ID in a list of IDs.
table user
id_user | name | id_site
-------------------------
1 | james | 1, 2, 3
1 | brad | 1, 3
1 | suko | 4, 5
and my query (doesn't work)
SELECT * FROM `user` WHERE 3 IN (`id_site`)
This query works (but doesn't do the job)
SELECT * FROM `user` WHERE 3 IN (1, 2, 3, 4, 6)
That's not how IN works: 3 IN (`id_site`) compares 3 with the whole string stored in id_site, not with each number inside it. The docs explain the details.
Try this:
SELECT * FROM `user` WHERE FIND_IN_SET(3,`id_site`)
Note that this requires your data to be stored as 1,2,3 and 1,3 and 4,5 (i.e. no spaces). If this is not an option, try:
SELECT * FROM `user` WHERE FIND_IN_SET(3,REPLACE(`id_site`,' ',''))
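To see why the spaces matter, here is a rough Python model of how FIND_IN_SET compares items. It is only a sketch for illustration (NULL handling is not modelled), and `find_in_set` is a hypothetical helper, not a real driver function.

```python
def find_in_set(needle, strlist):
    """Rough model of MySQL FIND_IN_SET: 1-based position of needle, or 0.

    Items are compared exactly, so ' 3' (with a leading space) is not '3'.
    """
    target = str(needle)
    for position, item in enumerate(strlist.split(","), start=1):
        if item == target:
            return position
    return 0


print(find_in_set(3, "1,2,3"))                      # found
print(find_in_set(3, "1, 2, 3"))                    # not found: items are ' 2', ' 3'
print(find_in_set(3, "1, 2, 3".replace(" ", "")))   # found after stripping spaces
```

The third call mirrors the REPLACE() workaround: the spaces are removed first, so the exact comparison succeeds.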
Alternatively, consider restructuring your database. Namely:
CREATE TABLE `user_site_links` (
`id_user` INT UNSIGNED NOT NULL,
`id_site` INT UNSIGNED NOT NULL,
PRIMARY KEY (`id_user`,`id_site`)
);
INSERT INTO `user_site_links` VALUES
(1,1), (1,2), (1,3),
(2,1), (2,3),
(3,4), (3,5);
SELECT * FROM `user` JOIN `user_site_links` USING (`id_user`) WHERE `id_site` = 3;
Try this: FIND_IN_SET(str,strlist)
NO! Not in a relational database.
Your table doesn't conform to first normal form ("each attribute contains only atomic values, and the value of each attribute contains only a single value from that domain"), because you:
use a string field to contain numbers
store multiple values in one field
To work with a field like this you would have to use FIND_IN_SET(), or store the data like ,1,2,3, (note the commas, or whatever separator you use, at the beginning and at the end) and use LIKE "%,7,%" so that it works in every case. Either way, it's not possible to use indexes[1][2].
Use a relation table to do this:
CREATE TABLE user_on_sites(
user_id INT,
site_id INT,
PRIMARY KEY (user_id, site_id),
INDEX (user_id),
INDEX (site_id)
);
And join tables:
SELECT u.id, u.name, uos.site_id
FROM user_on_sites AS uos
INNER JOIN user AS u ON uos.user_id = u.id
WHERE uos.site_id = 3;
This way you can search efficiently using indexes.
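As a sketch of what the migration to a relation table could look like, the following Python example uses the built-in sqlite3 module: it splits the comma-separated lists into one row per (user, site) pair and then runs the join. The user ids 1, 2, 3 are an assumption (the sample data in the question shows 1 for every row), and the index name `idx_site` is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE "user" (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_on_sites (
        user_id INTEGER NOT NULL,
        site_id INTEGER NOT NULL,
        PRIMARY KEY (user_id, site_id)
    );
    CREATE INDEX idx_site ON user_on_sites (site_id);
    """
)

# Legacy rows in the shape of the question: (id, name, comma-separated sites).
legacy_rows = [(1, "james", "1, 2, 3"), (2, "brad", "1, 3"), (3, "suko", "4, 5")]
for user_id, name, id_site in legacy_rows:
    conn.execute('INSERT INTO "user" (id, name) VALUES (?, ?)', (user_id, name))
    # int() tolerates the surrounding spaces, so "1, 2, 3" splits cleanly.
    links = [(user_id, int(part)) for part in id_site.split(",")]
    conn.executemany("INSERT INTO user_on_sites VALUES (?, ?)", links)

names = [
    row[0]
    for row in conn.execute(
        'SELECT u.name FROM user_on_sites AS uos '
        'JOIN "user" AS u ON uos.user_id = u.id '
        "WHERE uos.site_id = ? ORDER BY u.id",
        (3,),
    )
]
print(names)
```

After the split, the lookup is a plain equality on an integer column, which is exactly the kind of condition an index can serve.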
The problem is that you are searching within several lists.
You need something more like:
SELECT * FROM `user` WHERE id_site LIKE '%3%';
However, that will also select 33, 333 and 345 so you want some more advanced text parsing.
The WHERE IN clause is useful to replace many OR conditions.
For example
SELECT * FROM `user` WHERE id IN (1,2,3,4)
is cleaner than
SELECT * FROM `user` WHERE id=1 OR id=2 OR id=3 OR id=4
You're just trying to use it the wrong way.
Correct way:
WHERE `field` IN (list_item1, list_item2 [, list_itemX])
Let's say I have a table in a database like this table. Let's say that I want to get the entries to the left and right of the entry whose primary key is equal to 5 (or any other primary key). So in our case, I want to get the entries whose primary keys are equal to 4 and 6 respectively. What SQL query will give me that result? Can you also translate the SQL query into a find('all') CakePHP query?
Thank you
NOTE: The ids are not necessarily contiguous, meaning, they do not necessarily follow the 1, 2, 3, 4, 5, 6, 7, 8 sequence. I can have something like 1, 5, 13, 44, 66, 123, etc
Try a UNION like this:
(SELECT * FROM employee WHERE id < 5 ORDER BY id DESC LIMIT 1)
UNION
(SELECT * FROM employee WHERE id > 5 ORDER BY id ASC LIMIT 1)
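Here is a small sketch of the same "previous and next row" idea, using Python's built-in sqlite3 module so it can be run without a MySQL server. It issues the two halves as separate queries rather than as one UNION, and the `neighbors` helper and the sample ids are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(i, f"emp{i}") for i in (1, 5, 13, 44, 66, 123)],  # ids with gaps
)


def neighbors(conn, current_id):
    """Return (previous_id, next_id); either may be None at the edges."""
    prev_row = conn.execute(
        "SELECT id FROM employee WHERE id < ? ORDER BY id DESC LIMIT 1",
        (current_id,),
    ).fetchone()
    next_row = conn.execute(
        "SELECT id FROM employee WHERE id > ? ORDER BY id ASC LIMIT 1",
        (current_id,),
    ).fetchone()
    return (
        prev_row[0] if prev_row else None,
        next_row[0] if next_row else None,
    )


print(neighbors(conn, 13))
print(neighbors(conn, 1))
```

Ordering by id in both directions is what makes this work when the ids are not contiguous: each query picks the closest existing id on its side.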
PHP
$id = 5;
SELECT * FROM Employee where id = $id-1 OR id = $id+1;
MySQL
SET @id = 5;
SELECT * FROM Employee where id = @id-1 OR id = @id+1;
Check out find('neighbors'). It returns the records before and after the one you specify, and your ids can have "holes" in the sequence.
My PostgreSQL database contains a table to store instances of a registered entity. This table is populated via spreadsheet upload. A web interface allows an operator to modify the information presented. However, the original data is not modified. All changes are stored in a separate table changes with the columns unique_id, column_name, value and updated_at.
Once changes are made, they are presented to the operator by first querying the original table and then querying the change table (using instance ID and the latest change date, grouped by column name). The two results are merged in PHP and presented on the web interface. This is a rather rigid way of going about the task and I would like to keep all logic within SQL.
I can easily select the latest changes for the table using the following query:
SELECT fltr_chg.unique_id, fltr_chg.column_name, chg_val.value
FROM changes AS chg_val
JOIN (
SELECT chg_rec.unique_id, chg_rec.column_name, MAX( chg_rec.updated_at )
FROM information_schema.columns AS source
JOIN changes AS chg_rec ON source.table_name = 'instances'
AND source.column_name = chg_rec.column_name
GROUP BY chg_rec.unique_id, chg_rec.column_name
) AS fltr_chg ON fltr_chg.unique_id = chg_val.unique_id
AND fltr_chg.column_name = chg_val.column_name;
And selecting the entries from the instances table is just as easy:
SELECT * FROM instances;
Now, if there was only a way of transforming the former result and substituting the resulting values into the latter, based on the unique_id and column_name, and still retaining the result as a table, the problem would be solved. Is this possible to do?
I am sure that this is not the rarest of problems and that, most likely, some systems do keep track of changes to the data in a similar way. How do they apply them back to the data, if not through one of the approaches described above (current and sought solutions)?
Assuming Postgres 9.1 or later.
I simplified / optimized your basic query to retrieve the latest values:
SELECT DISTINCT ON (1,2)
c.unique_id, a.attname AS col, c.value
FROM pg_attribute a
LEFT JOIN changes c ON c.column_name = a.attname
AND c.table_name = 'instances'
-- AND c.unique_id = 3 -- uncomment to fetch single row
WHERE a.attrelid = 'instances'::regclass -- schema-qualify to be clear?
AND a.attnum > 0 -- no system columns
AND NOT a.attisdropped -- no deleted columns
ORDER BY 1, 2, c.updated_at DESC;
I query the PostgreSQL catalog instead of the standard information schema because that is faster. Note the special cast to ::regclass.
Now, that gives you a table. You want all values for one unique_id in a row.
To achieve that you have basically three options:
One subselect (or join) per column. Expensive and unwieldy. But a valid option for only a few columns.
A big CASE statement.
A pivot function. PostgreSQL provides the crosstab() function in the additional module tablefunc for that.
Basic instructions:
PostgreSQL Crosstab Query
Basic pivot table with crosstab()
I completely rewrote the function:
SELECT *
FROM crosstab(
$x$
SELECT DISTINCT ON (1, 2)
unique_id, column_name, value
FROM changes
WHERE table_name = 'instances'
-- AND unique_id = 3 -- un-comment to fetch single row
ORDER BY 1, 2, updated_at DESC;
$x$,
$y$
SELECT attname
FROM pg_catalog.pg_attribute
WHERE attrelid = 'instances'::regclass -- possibly schema-qualify table name
AND attnum > 0
AND NOT attisdropped
AND attname <> 'unique_id'
ORDER BY attnum
$y$
)
AS tbl (
unique_id integer
-- !!! You have to list all columns in order here !!! --
);
I separated the catalog lookup from the value query, as the crosstab() function with two parameters provides column names separately. Missing values (no entry in changes) are substituted with NULL automatically. A perfect match for this use case!
Assuming that attname matches column_name. Excluding unique_id, which plays a special role.
Full automation
Addressing your comment: There is a way to supply the column definition list automatically. It's not for the faint of heart, though.
I use a number of advanced Postgres features here: crosstab(), plpgsql function with dynamic SQL, composite type handling, advanced dollar quoting, catalog lookup, aggregate function, window function, object identifier type, ...
Test environment:
CREATE TABLE instances (
unique_id int
, col1 text
, col2 text -- two columns are enough for the demo
);
INSERT INTO instances VALUES
(1, 'foo1', 'bar1')
, (2, 'foo2', 'bar2')
, (3, 'foo3', 'bar3')
, (4, 'foo4', 'bar4');
CREATE TABLE changes (
unique_id int
, table_name text
, column_name text
, value text
, updated_at timestamp
);
INSERT INTO changes VALUES
(1, 'instances', 'col1', 'foo11', '2012-04-12 00:01')
, (1, 'instances', 'col1', 'foo12', '2012-04-12 00:02')
, (1, 'instances', 'col1', 'foo1x', '2012-04-12 00:03')
, (1, 'instances', 'col2', 'bar11', '2012-04-12 00:11')
, (1, 'instances', 'col2', 'bar17', '2012-04-12 00:12')
, (1, 'instances', 'col2', 'bar1x', '2012-04-12 00:13')
, (2, 'instances', 'col1', 'foo2x', '2012-04-12 00:01')
, (2, 'instances', 'col2', 'bar2x', '2012-04-12 00:13')
-- NO change for col1 of row 3 - to test NULLs
, (3, 'instances', 'col2', 'bar3x', '2012-04-12 00:13');
-- NO changes at all for row 4 - to test NULLs
Automated function for one table
CREATE OR REPLACE FUNCTION f_curr_instance(int, OUT t public.instances) AS
$func$
BEGIN
EXECUTE $f$
SELECT *
FROM crosstab($x$
SELECT DISTINCT ON (1,2)
unique_id, column_name, value
FROM changes
WHERE table_name = 'instances'
AND unique_id = $f$ || $1 || $f$
ORDER BY 1, 2, updated_at DESC;
$x$
, $y$
SELECT attname
FROM pg_catalog.pg_attribute
WHERE attrelid = 'public.instances'::regclass
AND attnum > 0
AND NOT attisdropped
AND attname <> 'unique_id'
ORDER BY attnum
$y$) AS tbl ($f$
|| (SELECT string_agg(attname || ' ' || atttypid::regtype::text
, ', ' ORDER BY attnum) -- must be in order
FROM pg_catalog.pg_attribute
WHERE attrelid = 'public.instances'::regclass
AND attnum > 0
AND NOT attisdropped)
|| ')'
INTO t;
END
$func$ LANGUAGE plpgsql;
The table instances is hard-coded, schema qualified to be unambiguous. Note the use of the table type as return type. There is a row type registered automatically for every table in PostgreSQL. This is bound to match the return type of the crosstab() function.
This binds the function to the type of the table:
You will get an error message if you try to DROP the table
Your function will fail after an ALTER TABLE. You have to recreate it (without changes). I consider this a bug in 9.1. ALTER TABLE shouldn't silently break the function, but raise an error.
This performs very well.
Call:
SELECT * FROM f_curr_instance(3);
 unique_id | col1   | col2
-----------+--------+-------
         3 | <NULL> | bar3x
Note how col1 is NULL here.
Use in a query to display an instance with its latest values:
SELECT i.unique_id
, COALESCE(c.col1, i.col1)
, COALESCE(c.col2, i.col2)
FROM instances i
LEFT JOIN f_curr_instance(3) c USING (unique_id)
WHERE i.unique_id = 3;
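The overlay logic of that last query (latest change per column, falling back to the original value) can be sketched in plain Python. This is only an illustration of the logic, not of the crosstab() machinery; `current_instance` is a hypothetical helper and the data is a trimmed copy of the test environment above.

```python
from datetime import datetime

# Original rows, keyed by unique_id.
instances = {
    1: {"col1": "foo1", "col2": "bar1"},
    3: {"col1": "foo3", "col2": "bar3"},
    4: {"col1": "foo4", "col2": "bar4"},
}
# (unique_id, column_name, value, updated_at)
changes = [
    (1, "col1", "foo11", "2012-04-12 00:01"),
    (1, "col1", "foo1x", "2012-04-12 00:03"),
    (1, "col2", "bar1x", "2012-04-12 00:13"),
    (3, "col2", "bar3x", "2012-04-12 00:13"),
]


def current_instance(unique_id):
    """Overlay the latest change per column onto the original row."""
    row = dict(instances[unique_id])
    latest = {}  # column_name -> (timestamp, value)
    for uid, column, value, updated_at in changes:
        if uid != unique_id:
            continue
        ts = datetime.strptime(updated_at, "%Y-%m-%d %H:%M")
        if column not in latest or ts > latest[column][0]:
            latest[column] = (ts, value)
    for column, (_, value) in latest.items():
        # Like COALESCE(c.col, i.col): a missing/None change keeps the original.
        if value is not None:
            row[column] = value
    return row


print(current_instance(3))
```

For row 3 only col2 has a change, so col1 keeps its original value, which is the same effect the COALESCE calls have in the SQL query.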
Full automation for any table
(Added 2016. This is dynamite.)
Requires Postgres 9.1 or later. (Could be made to work with pg 8.4, but I didn't bother to backpatch.)
CREATE OR REPLACE FUNCTION f_curr_instance(_id int, INOUT _t ANYELEMENT) AS
$func$
DECLARE
_type text := pg_typeof(_t);
BEGIN
EXECUTE
(
SELECT format
($f$
SELECT *
FROM crosstab(
$x$
SELECT DISTINCT ON (1,2)
unique_id, column_name, value
FROM changes
WHERE table_name = %1$L
AND unique_id = %2$s
ORDER BY 1, 2, updated_at DESC;
$x$
, $y$
SELECT attname
FROM pg_catalog.pg_attribute
WHERE attrelid = %1$L::regclass
AND attnum > 0
AND NOT attisdropped
AND attname <> 'unique_id'
ORDER BY attnum
$y$) AS ct (%3$s)
$f$
, _type, _id
, string_agg(attname || ' ' || atttypid::regtype::text
, ', ' ORDER BY attnum) -- must be in order
)
FROM pg_catalog.pg_attribute
WHERE attrelid = _type::regclass
AND attnum > 0
AND NOT attisdropped
)
INTO _t;
END
$func$ LANGUAGE plpgsql;
Call (providing the table type with NULL::public.instances):
SELECT * FROM f_curr_instance(3, NULL::public.instances);
Related:
Refactor a PL/pgSQL function to return the output of various SELECT queries
How to set value of composite variable field using dynamic SQL
I'm creating a 'similar items' link table.
I have a two-column table. Both columns contain product ids.
CREATE TABLE IF NOT EXISTS `prod_similar` (
`id_a` int(11) NOT NULL,
`id_b` int(11) NOT NULL
);
INSERT INTO `prod_similar` (`id_a`, `id_b`) VALUES
(5, 10),
(5, 15),
(10, 13),
(10, 14),
(14, 5),
(14, 13);
I want to select 3 similar products, favouring products where the id is in the first col, 'id_a'
SELECT * FROM prod_similar WHERE id_a={$id} OR id_b={$id}
ORDER BY column(?)
LIMIT 3
Don't know, maybe this?
SELECT *
FROM similar_items
WHERE col_1={$id} OR col_2={$id}
ORDER BY CASE WHEN col_1={$id} THEN 0 WHEN col_2={$id} THEN 1 END
LIMIT 3
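A quick way to check the ORDER BY CASE idea is to run it against the sample data from the question. The sketch below does that with Python's built-in sqlite3 module; the `similar` helper is invented for the example, and the extra `id_a, id_b` sort keys are an addition to make the result order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prod_similar (id_a INT NOT NULL, id_b INT NOT NULL)")
conn.executemany(
    "INSERT INTO prod_similar VALUES (?, ?)",
    [(5, 10), (5, 15), (10, 13), (10, 14), (14, 5), (14, 13)],
)


def similar(conn, product_id, limit=3):
    """Rows mentioning product_id, with matches in id_a sorted first."""
    return conn.execute(
        """
        SELECT id_a, id_b
        FROM prod_similar
        WHERE id_a = :id OR id_b = :id
        ORDER BY CASE WHEN id_a = :id THEN 0 ELSE 1 END, id_a, id_b
        LIMIT :limit
        """,
        {"id": product_id, "limit": limit},
    ).fetchall()


print(similar(conn, 5))
print(similar(conn, 13))
```

For product 5 the two rows with 5 in id_a come first and the row with 5 in id_b follows; product 13 only appears in id_b, so both of its rows fall into the second group.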
I assume you have other columns as well
(SELECT 1 favouring, id_a id, [other columns]
FROM prod_similar
WHERE id_a = {$id})
UNION
(SELECT 2 favouring, id_b id, [other columns]
FROM prod_similar
WHERE id_b = {$id})
ORDER BY favouring, id
LIMIT 3;
In case you don't mind duplicates or there are none between id_a and id_b you can do UNION ALL instead which is considerably faster.
Unions are an indication of denormalized data. Denormalized data improves the speed of certain queries and reduces the speed of others (such as this one).
An easy way to do this is this:
ORDER BY NULLIF(col_1, {$id}) LIMIT 3
The CASE WHEN works as well, but this is a bit simpler.
I am not sure I get the question. Could you post example data for the source table and also show what the result should look like?
If I got you right, I would try something like
SELECT (CASE
WHEN col_1={$id} THEN col_1
WHEN col_2={$id} THEN col_2
END) AS id FROM similar_items WHERE col_1={$id} OR col_2={$id}
LIMIT 3