Comparing rows in table for differences between fields - php

I have a table (client) with 20+ columns that is mostly historical data.
Something like:
id|clientID|field1|field2|etc...|updateDate
If my data looks like this:
10|12|A|A|...|2009-03-01
11|12|A|B|...|2009-04-01
19|12|C|B|...|2009-05-01
21|14|X|Y|...|2009-06-11
27|14|X|Z|...|2009-07-01
Is there an easy way to compare each row and highlight the differences in the fields?
I need to be able to simply highlight the fields that changed between revisions (except for the key and the date of course)
There may be multiple fields updated in each new row (or just one).
This would be on a client by client basis so I could select on the clientID to filter.
It could be on the server or client side, whichever is easiest.
More details
I should expand my description a little:
I'm looking to just see if there was a difference between the fields (whether one is different in any way). Some of the data is numeric, some is text, and some are dates. A more complete example might be:
10|12|A|A|F|G|H|I|J|...|2009-03-01
11|12|A|B|F|G|H|I|J|...|2009-04-01
19|12|C|B|F|G|Z|I|J|...|2009-05-01 ***
21|14|X|Y|L|M|N|O|P|...|2009-06-11
27|14|X|Z|L|M|N|O|P|...|2009-07-01
I'd want to be able to display each row for clientID 12 and highlight B from row 11 and C & Z from row 19.

Any expression in SQL must reference columns only in one row (barring subqueries).
A JOIN can be used to make two different rows into one row of the result set.
So you can compare values on different rows by doing a self-join. Here's an example that shows joining each row to every other row associated with the same client (excluding a join of a row to itself):
SELECT c1.*, c2.*
FROM client c1
JOIN client c2 ON (c1.clientID = c2.clientID AND c1.id <> c2.id)
Now you can write expressions that compare columns. For example, to restrict the above query to those where field1 differs:
SELECT c1.*, c2.*
FROM client c1
JOIN client c2 ON (c1.clientID = c2.clientID AND c1.id <> c2.id)
WHERE c1.field1 <> c2.field1;
You don't specify what kinds of comparisons you need to make, so I'll leave that to you. The key point is that in general, you can use a self-join to compare rows in a given table.
Re your comments and clarification: Okay, so your "difference" is not simply by value but by ordinal position of the row. Remember that relational databases don't have a concept of row number, they only have order of rows with respect to some order you must specify in an ORDER BY clause. Don't confuse the "id" pseudokey with row number, the numbers are assigned as monotonically increasing only by coincidence of their implementation.
In MySQL, you could take advantage of user-defined variables to achieve the effect you're looking for. Order the query by clientId and then by id, and track values per column in MySQL user variables. When the value in a current row differs from the value in the variable, do whatever highlighting you were going to do. I'll show an example for one field:
SET @clientid = -1, @field1 = '';

SELECT id, clientId, field1, @clientid, @field1,
  IF(@clientid <> clientid,
     ((@clientid := clientid) AND (@field1 := field1)) = NULL,
     IF(@field1 <> field1,
        (@field1 := field1),
        NULL
     )
  ) AS field1_changed
FROM client c
ORDER BY clientId, id;
Note this solution is not really different from just selecting all rows with plain SQL, and tracking the values with application variables as you fetch rows.
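For completeness, here is what that application-side tracking might look like in PHP. This is only a sketch: the PDO connection in $db and the $clientId variable are assumptions, not part of the question.
$stmt = $db->prepare("SELECT * FROM client WHERE clientID = ? ORDER BY id");
$stmt->execute(array($clientId));

$prev = null;
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    $changed = array();
    if ($prev !== null) {
        foreach ($row as $col => $val) {
            // Skip the pseudokey, the client key, and the revision date.
            if (in_array($col, array('id', 'clientID', 'updateDate'))) {
                continue;
            }
            if ($val !== $prev[$col]) {
                $changed[] = $col;  // these are the fields to highlight
            }
        }
    }
    // render $row here, highlighting the columns listed in $changed
    $prev = $row;
}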

storing a bunch of dates on a separate table, but have trouble fetching it properly

I have some tasks that I have to store in my database, and each task has an array of dates on which the task was completed. I've learned that it is better not to store the dates as a serialized array, but instead to make another table. So I did:
taskTable contains columns: taskID, userid, description, name
task_days contains columns: taskID, day
But I'm having trouble with the PHP.
Usually I can easily send my data to the client with:
function getTasks(){
    $app = \Slim\Slim::getInstance();
    $userid = $app->request->params('userid');
    $db = getDB();
    $result = $db->prepare("SELECT * FROM taskTable WHERE userid = ?");
    $result->execute(array($userid));
    $result->setFetchMode(PDO::FETCH_ASSOC);
    echo json_encode($result->fetchAll());
}
I encode it, then the client can easily read it as an array of JSON objects. But now with two tables, I'm not sure how to do it efficiently. I know I can get the required information with this query:
Select * from taskTable as t, task_days as d where t.taskID = d.taskID
But how do I make it so the days end up in an array associated with the correct task?
Do I first run Select * From taskTable where userid = $userid and then, for each task, run a query against task_days? That seems extremely inefficient, though.
So I want something like the following:
[
{taskid: 123, userid: 1, description: "do task", name: "tony", day: ["1998-01-02", "1998-02-03"]},
{taskid: 124, userid: 2, description: "do task2", name: "Ann", day: ["2016-01-02", "2016-02-03", "2016-01-01"]},
...
]
There are a couple of approaches.
1) One approach, as you already outline, is to run a query that returns the columns from just `taskTable`, and then, for each row returned, run another query to get the associated rows from `task_days`. You are right that this is usually not the most efficient approach, but for a reasonably small number of rows, performance should be acceptable as long as appropriate indexes are available.
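As a rough PHP sketch of that first approach (the PDO connection in $db is an assumption; the table and column names follow the question):
$tasks = $db->prepare("SELECT * FROM taskTable WHERE userid = ?");
$tasks->execute(array($userid));
$days = $db->prepare("SELECT day FROM task_days WHERE taskID = ? ORDER BY day");

$result = array();
foreach ($tasks->fetchAll(PDO::FETCH_ASSOC) as $task) {
    // One extra query per task: this is the inefficiency noted above.
    $days->execute(array($task['taskID']));
    $task['day'] = $days->fetchAll(PDO::FETCH_COLUMN);
    $result[] = $task;
}
echo json_encode($result);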
2) Another approach, assuming `taskid` is the primary key of `taskTable`, is to perform a join and use a GROUP BY to collapse the rows. The GROUP_CONCAT aggregate function can convert the multiple values of `day` from the `task_days` table into a single string. For example:
SELECT t.taskid
, t.userid
, t.description
, t.name
, GROUP_CONCAT(d.day ORDER BY d.day) AS `day`
FROM taskTable t
LEFT
JOIN task_days d
ON d.taskid = t.taskid
GROUP BY t.taskid
ORDER BY t.taskid
This would return the day as a string, not an array. If you need an array, your code would need to build it; the PHP explode function is a convenient way to do that.
NOTE: the length of the string returned by GROUP_CONCAT is limited by the group_concat_max_len variable, and also by max_allowed_packet.
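A sketch of the PHP side (the userid filter is carried over from your original function; the $db connection is an assumption):
$stmt = $db->prepare("
    SELECT t.taskID, t.userid, t.description, t.name
         , GROUP_CONCAT(d.day ORDER BY d.day) AS day
      FROM taskTable t
      LEFT JOIN task_days d ON d.taskID = t.taskID
     WHERE t.userid = ?
     GROUP BY t.taskID
     ORDER BY t.taskID");
$stmt->execute(array($userid));

$tasks = array();
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    // GROUP_CONCAT returns NULL for a task with no task_days rows (LEFT JOIN).
    $row['day'] = ($row['day'] === null) ? array() : explode(',', $row['day']);
    $tasks[] = $row;
}
echo json_encode($tasks);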
3) Another way to approach this is to perform a join operation and pull back the "duplicated" task information, ordered by taskid and day:
SELECT t.taskid
, t.userid
, t.description
, t.name
, d.day
FROM taskTable t
LEFT
JOIN task_days d
ON d.taskid = t.taskid
ORDER BY t.taskid, d.day
That would get a result set like this:
taskid userid description name day
------ ------ ----------- ----- ----------
123 1 do task tony 1998-01-02
123 1 do task tony 1998-02-03
124 2 do task2 Ann 2016-01-02
124 2 do task2 Ann 2016-02-03
124 2 do task2 Ann 2016-01-01
Then your code would need to do some rudimentary "control break" processing. Basically, compare the taskid of the current row to the taskid from the previous row. If they match, you are processing just a new `day` value for the same task.
If the taskid of the current row is different than the taskid from the previous row, then you are starting a new task.
Your code would effectively be ignoring the duplicated rows from `taskTable`, basically squinting at the result set and seeing it like this:
taskid userid description name day
------ ------ ----------- ----- ----------
- 123 1 do task tony 1998-01-02
+ 1998-02-03
- 124 2 do task2 Ann 2016-01-02
+ 2016-02-03
+ 2016-01-01
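In PHP, that control-break processing might be sketched like this (again assuming a PDO connection in $db, with the userid filter from your original function):
$stmt = $db->prepare("
    SELECT t.taskID, t.userid, t.description, t.name, d.day
      FROM taskTable t
      LEFT JOIN task_days d ON d.taskID = t.taskID
     WHERE t.userid = ?
     ORDER BY t.taskID, d.day");
$stmt->execute(array($userid));

$tasks = array();
$current = null;
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    if ($current === null || $row['taskID'] !== $current['taskid']) {
        // Control break: the taskID differs from the previous row,
        // so a new task starts here.
        if ($current !== null) {
            $tasks[] = $current;
        }
        $current = array(
            'taskid'      => $row['taskID'],
            'userid'      => $row['userid'],
            'description' => $row['description'],
            'name'        => $row['name'],
            'day'         => array(),
        );
    }
    if ($row['day'] !== null) {  // NULL when the LEFT JOIN found no days
        $current['day'][] = $row['day'];
    }
}
if ($current !== null) {
    $tasks[] = $current;  // don't drop the last task
}
echo json_encode($tasks);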
FOLLOWUP
The second option is closest to your original implementation, a comma separated list of values as a string, in a character column.
As far as storing a comma-separated list goes, that's a SQL anti-pattern, and it's usually best avoided. Multi-valued attributes can be stored in a separate table, like you have done.
The exception would be if you never, ever need the database to see the values in the list as separate values.
If you are storing that "list of dates" as if it were an image, for example like the contents of a jpeg... if you always store the entire value into the column, and always extract the contents of the column as a single value... if you never need to search for an individual date, add a date to an existing list, or remove a date from a list... and if you never need the database to enforce any constraints on the values, or do any validation of the contents...
If all of those conditions are satisfied, only then might it make sense to store a comma separated list as a single column.
My personal preference, if the implementation is targeted only at MySQL, would be the second option, using GROUP_CONCAT. If the length of the string generated by GROUP_CONCAT exceeds group_concat_max_len, the string will be truncated, with no warning or error. (I believe that limit is measured in bytes, not characters.)
The safest coding practice would be to perform a query:
SELECT @@session.group_concat_max_len
and save the value returned. Then, for each value returned by the GROUP_CONCAT expression, compare its length (in bytes) to the saved value to see whether truncation may have occurred. (If the length of the returned string is less than the value of group_concat_max_len, you can be pretty confident that truncation has not occurred.) It's also possible to override the current value of the variable (before you run the statement containing GROUP_CONCAT) with a separate SET statement. Something like this:
SET SESSION group_concat_max_len = 131072 ;
(Just be careful not to exceed max_allowed_packet.)
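A sketch of that defensive check in PHP, assuming $rows holds the fetched result set (the 131072 is just the example value from above):
// Optionally raise the limit for this session first.
$db->exec("SET SESSION group_concat_max_len = 131072");

// Save the effective limit, then compare each GROUP_CONCAT result to it.
$maxLen = (int) $db->query("SELECT @@session.group_concat_max_len")->fetchColumn();

foreach ($rows as $row) {
    if (strlen($row['day']) >= $maxLen) {   // strlen counts bytes in PHP
        // possible truncation; treat as an error rather than trust the list
    }
}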

Repeated Insert copies on ID

We have records with a count field on a unique id.
The columns are:
mainId = unique
mainIdCount = 1320 (this 'views' field gets a + 1 when the page is visited)
How can you insert all these mainIdCount's as separate records into another table IN ANOTHER DATABASE in one query?
Yes, I do mean 1320 times an insert with the same mainId! :-)
We actually have records where the count goes over 10,000 for a single id. It just has to be like this.
This is a weird one, but we do need the copies of all these counts like this.
The most straightforward way to do this is with a JOIN operation between your table and another row source that provides a set of integers. We'd match each row from the original table to as many rows from the set of integers as needed to satisfy the desired result.
As a brief example of the pattern:
INSERT INTO newtable (mainId,n)
SELECT t.mainId
, r.n
FROM mytable t
JOIN ( SELECT 1 AS n
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5
) r
WHERE r.n <= t.mainIdCount
If mytable contains row mainId=5 mainIdCount=4, we'd get back rows (5,1),(5,2),(5,3),(5,4)
Obviously, the row source r needs to be of sufficient size. The inline view I've demonstrated here would return a maximum of five rows. For larger sets, it would be beneficial to use a table rather than an inline view.
This leads to the follow-up question, "How do I generate a set of integers in MySQL?",
e.g. Generating a range of numbers in MySQL
And getting that done is a bit tedious. We're looking forward to an eventual feature in MySQL that will make it much easier to return a bounded set of integer values; until then, having a pre-populated table is the most efficient approach.
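If you go the pre-populated-table route, here is a quick PHP sketch to create and fill one (the table name numbers and the 20,000 ceiling are just examples; $db is an assumed PDO connection). The INSERT ... SELECT above can then join to numbers instead of the inline view.
$db->exec("CREATE TABLE IF NOT EXISTS numbers (n INT PRIMARY KEY)");
$ins = $db->prepare("INSERT IGNORE INTO numbers (n) VALUES (?)");

$db->beginTransaction();  // one transaction makes the loop much faster
for ($n = 1; $n <= 20000; $n++) {
    $ins->execute(array($n));
}
$db->commit();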

Adding a Row into an alphabetically ordered SQL table

I have a SQL table with two columns:
'id' int Auto_Increment
instancename varchar
The current 114 rows are ordered alphabetically by instancename.
Now I want to insert a new row that fits into the order.
So say it starts with a 'B': it would belong at around id 14, and therefore all of the rows after id 14 would have to be 'pushed down'. How do I do this?
An SQL table is not inherently ordered! (It is just a set.) You would simply add the new row and view it using something like:
select instancename
from thetable
order by instancename;
I think you're going about this the wrong way. IDs shouldn't be changed. If you have tables that reference these IDs as foreign keys then the DBMS wouldn't let you change them, anyway.
Instead, if you need results from a specific query to be ordered alphabetically, tell SQL to order it for you:
SELECT * FROM table ORDER BY instancename
As an aside, sometimes you want something that can seemingly be a key (read: needs to be unique for each row) but does have to change from time to time (such as a SKU in a product table). This should not be the primary key, for the same reason (there are undoubtedly other tables that may refer to these entries, each of which would also need to be updated).
Keeping this information distinct will help keep you and everyone else working on the project from going insane.
Try using a window function (OVER) and joining the table to itself.
Update thetable
Set ID = r.ID
From thetable c Join
     ( Select instancename, Row_Number() Over(Order By instancename) As ID
       From thetable) r On c.instancename = r.instancename
This should update the id column to the ordered row number. You may have to disable its identity property first.

Mysql intersect two strings

I have the following tables:
TableFinal
column id, with first row having value 1
column numbers, with first row having value `1,5,6,33,2,12,3,4,9,13,26,41,59,61,10,7,28`
And
TablePick
column id, with first row having value 1
column selected, with first row having value `2,12,26,33`
I want to check whether the numbers from TablePick, column "selected", are contained in the column "numbers" of TableFinal.
I should mention that in TablePick the numbers in column "selected" are ordered ASC, while in TableFinal the numbers in column "numbers" are shuffled.
Usually I would put each of these in an array using PHP, intersect the two arrays, and count the resulting array. But in MySQL it is not that simple, so practically I have no idea where to start.
Maybe I should create an ARRAY_INTERSECT function? Or do we have a simpler solution?
SELECT * FROM TablePick p RIGHT JOIN TableFinal f ON f.id=p.id WHERE ARRAY_INTERSECT(p.selected,f.numbers)
Sorry to say so, but your schema needs some serious maintenance: NEVER EVER store more than one piece of information in one field if you need to access the pieces separately.
You need a pair of join tables, where instead of the first row (1, "1,5,6,33,2,12,3,4,9,13,26,41,59,61,10,7,28") you have the rows
(1,1)
(1,5)
(1,6)
(1,33)
...
and instead of the row (1, "2,12,26,33") you have the rows
(1,2)
(1,12)
(1,26)
(1,33)
Now your query is simply:
SELECT ... FROM TableFinal
INNER JOIN TablePick ON TableFinal.number = TablePick.number
WHERE TableFinal.id = 1
AND TablePick.id = 1
EDIT
Please understand that even if this were possible without abusing MySQL, it would be a performance killer once the number of rows starts to rise: we are talking about n*m array intersects if the tables have n and m rows respectively.
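Once the data is normalized this way, the PHP side reduces to a single query. A sketch, with $finalId and $pickId as placeholders and $db an assumed PDO connection:
$stmt = $db->prepare("
    SELECT COUNT(*)
      FROM TableFinal f
      JOIN TablePick p ON p.number = f.number
     WHERE f.id = ? AND p.id = ?");
$stmt->execute(array($finalId, $pickId));
$matches = (int) $stmt->fetchColumn();  // size of the intersection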

Deleting rows not returning to original numbers

Just working with a database where some tests were done recently to check the integrity of the setup.
As a result, a lot of test entries were added and then deleted. However, when new entries are added, the ID value continues from after the deleted test entries.
What I want:
ID increases by one from where it left off before the additional rows were added:
4203, 4204, 4205, 4206 etc.
What is happening:
ID increases by one from after the additional rows ID:
4203, 4204, 6207, 6208, 6209 etc.
Not sure where to fix this...whether in phpmyadmin or in the PHP code. Any help would be appreciated. Thanks!
I have run into this before and I solve it easily with phpMyAdmin. Select the database, select the table, open the Operations tab, and in the Table Options set AUTO_INCREMENT to 1, then click Go. This forces MySQL to look up the last auto-incremented value and set the counter to the value directly after it. I do this on a manual basis; that way I know that a skipped id came from a deletion rather than from testing, because when I test and delete rows I fix the AI value afterwards.
I don't think there's a way to do this with an auto-incrementing ID key.
You could probably do it by assigning the ID to (select max(id) + 1 from the_table)
You could drop the primary key and then recreate it, but this would reassign all the existing primary keys, so it could cause issues with relationships (although if you don't have any gaps in your primary key you may get away with it).
I would, however, say that you should accept (and your app should reflect) the possibility of missing IDs. For example, in a web app, if someone links to a missing ID you would want a 404 returned, not a different record.
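As a sketch of that in a handler like the Slim examples elsewhere on this page (getDB(), the route, and the table name are all assumptions):
function getRecord() {
    $app = \Slim\Slim::getInstance();
    $id = $app->request->params('id');
    $db = getDB();
    $stmt = $db->prepare("SELECT * FROM mytable WHERE id = ?");
    $stmt->execute(array($id));
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    if ($row === false) {
        // A missing (deleted) id should 404, not map onto some other record.
        $app->halt(404);
    }
    echo json_encode($row);
}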
There should be no need to "reset" the id values; I concur with the other comments concerning this issue.
The behavior you observe with AUTO_INCREMENT is by design; it is described in the MySQL documentation.
With all that said, I will describe an approach you can use to change the id values of those rows "downwards", and make them all contiguous:
As a "stepping stone" first step, we will create a query that gets a list of the id values that we need changed, along with a proposed new id value we are going to change it to. This query makes use of a MySQL user variable.
Assuming that 4203 is the id value you want to leave as is, and you want the next higher id value reset to 4204, the one after that to 4205, and so on:
SELECT s.id
     , @i := @i + 1 AS new_id
  FROM mytable s
  JOIN (SELECT @i := 4203) i
 WHERE s.id > 4203
 ORDER BY s.id
(Note: the constant value 4203 appears twice in the query above.)
Once we're satisfied that this query is working and returning the old and new id values, we can use it as an inline view (MySQL calls it a derived table) in a multi-table UPDATE statement. We just wrap the query in a set of parentheses and assign it an alias, so we can reference it like a regular table. (In an inline view, MySQL actually materializes the resultset returned by the query into a MyISAM table, which probably explains why MySQL refers to it as a "derived table".)
Here's an example UPDATE statement that references the derived table:
UPDATE ( SELECT s.id
              , @i := @i + 1 AS new_id
           FROM mytable s
           JOIN (SELECT @i := 4203) i
          WHERE s.id > 4203
          ORDER BY s.id
       ) n
  JOIN mytable t
    ON t.id = n.id
   SET t.id = n.new_id
Note that the old id value from the inline view is matched to the id value in the existing table (the ON clause), and the "new_id" value generated by the inline view is assigned to the id column (the SET clause.)
Once the id values are assigned, we can reset the AUTO_INCREMENT value on the table:
ALTER TABLE mytable AUTO_INCREMENT = 1;
NOTE: this is just an example, provided with the caveat that it should not be necessary to reassign id values. Ideally, primary key values are immutable, i.e. they should not change once they have been assigned.
