MySQL: Compare two TABLES and list rows that are new? - php

We have a daily data feed. I need to determine what rows are new. (It's a long story, but there are no record numbers for the rows and there aren't going to be any.) We need to be able to identify which rows are new since the previous data feed. The file comes in as JSON and I have been putting it into a MySQL TABLE for other purposes.
How do I take yesterday's TABLE, compare it to today's TABLE, and display those rows which have been added since yesterday? Can all this be done in MySQL, or do I need to do this with the help of PHP?
If I were doing this in PHP, I'm thinking I would search yesterday's TABLE for each row of today's TABLE, and set a flag (an added column) in today's TABLE called NEW to "N" when the row is found. "Y" would be the default, which means the row is new. Then, using MySQL, do a select where new="Y" and this would display the new rows. Is this how to do this? Am I overlooking a better method? Thanks!

If you actually have two separate tables (which is how it sounds from your description, although that is unusual) and aren't comparing rows within a single table, you can select the rows whose key is missing from yesterday's table:
SELECT partnumber FROM Today_table WHERE partnumber NOT IN (SELECT partnumber FROM Yesterday_table)
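Since the question says the rows have no record numbers, one option is to treat the whole row as the key and compare on every column. Below is a minimal sketch of that idea, not a definitive implementation. It uses Python's built-in sqlite3 module as a stand-in for MySQL so it runs on its own; the table names (yesterday_feed, today_feed) and the columns (partnumber, description, price_cents) are assumptions made up for illustration, and it assumes none of the compared columns contain NULL.

```python
import sqlite3


def find_new_rows(conn):
    """Return rows of today_feed that have no identical row in yesterday_feed."""
    query = """
        SELECT t.partnumber, t.description, t.price_cents
        FROM today_feed AS t
        WHERE NOT EXISTS (
            SELECT 1
            FROM yesterday_feed AS y
            WHERE y.partnumber = t.partnumber
              AND y.description = t.description
              AND y.price_cents = t.price_cents
        )
        ORDER BY t.partnumber
    """
    return conn.execute(query).fetchall()


def build_demo():
    """Create two small in-memory feed tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    for name in ("yesterday_feed", "today_feed"):
        conn.execute(
            f"CREATE TABLE {name} "
            "(partnumber TEXT, description TEXT, price_cents INTEGER)"
        )
    yesterday = [("A1", "bolt", 10), ("B2", "nut", 5)]
    today = yesterday + [("C3", "washer", 2)]
    conn.executemany("INSERT INTO yesterday_feed VALUES (?, ?, ?)", yesterday)
    conn.executemany("INSERT INTO today_feed VALUES (?, ?, ?)", today)
    conn.commit()
    return conn
```

The same NOT EXISTS query should carry over to MySQL largely unchanged. If columns can be NULL, the plain = comparison would treat such rows as always new, so a NULL-safe comparison would be needed there.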

Related

mysql database with revolving data

I'm not sure the best way to phrase this question!
I have a mysql database that needs to retrieve and store the past 24 data values, and the data always needs to be the last 24 data values.
I have this fully working, but I am sure there must be a better way to do it!
Just now, my mysql database has columns for id, timestamp, etc, and then 24 data columns:
data_01
data_02
data_03
data_04
data_05
etc
There are multiple rows for different ids.
I run a cron job every hour, which deletes column 'data_24', and then renames all columns:
data_01 -> data_02
data_02 -> data_03
data_03 -> data_04
data_04 -> data_05
data_05 -> data_06
etc
And then adds a new, blank column:
data_01
The new data is then added into this new, blank column.
Does this sound like a sensible way to do this, or is there any better way??
My concern with this method is that the column deleting, renaming and adding has to be done first, before the new data is retrieved, so that the new column exists for adding data.
If the data retrieve fails for any reason, my table then has a column with NULL as a data value.
Renaming columns for something like this is not a good idea.
I'm curious how you insert and update this data, but there must be a better way to do this.
Two things that seem feasible:
Not renaming the column, but moving the data to the next column:
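-- MySQL evaluates these assignments left to right, so shift the oldest columns first
update YourTable
set data24 = data23,
...
data3 = data2,
data2 = data1,
data1 = :newvalue;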
Or by spreading the data over 24 rows instead of having 24 columns. Each data value is a row in your table (or in a new table where your id is a foreign key). Every time you insert a new value, you can also delete the oldest value for that same id. You can do this in one atomic transaction, so once an id has 24 values it always keeps exactly 24 rows.
insert into YourTable(id, data)
values (:id, :newvalue);
-- delete the oldest value for this id (ascending order puts the oldest row first)
delete from YourTable
where id = :id
order by timestamp asc
limit 1;
This will multiply the number of rows (but not the amount of data) by 24, so for 1000 rows (like you mentioned), you're talking about 24000 rows, which is still peanuts if you have the proper indexes.
We got tables in MySQL with over 100 million rows. Manipulating 24000 rows is WAY easier than rewriting a complete table of 1000 rows, which is essentially what you're doing by renaming the columns.
So the second option certainly has my preference. It will provide you with a simple structure, and should you ever decide to not clean up old data, or move that to a separate job, or stick to 100 items instead of 24, then you can easily do that by changing 3 lines of code, instead of completely overhauling your table structure and the application with it.
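The row-per-value approach can be sketched roughly like this. It is only an illustration under assumptions: it uses Python's built-in sqlite3 in place of MySQL so it can run standalone, the table readings and its columns (seq, sensor_id, value) are hypothetical names, and it orders by an auto-increment seq column rather than a timestamp so that two values inserted in the same second cannot tie. The oldest row is only removed once an id has more than 24 values.

```python
import sqlite3

WINDOW = 24  # number of values to keep per id


def make_readings_db():
    """Create an in-memory table with one row per stored value (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings ("
        "seq INTEGER PRIMARY KEY AUTOINCREMENT, "
        "sensor_id INTEGER NOT NULL, "
        "value REAL)"
    )
    conn.execute("CREATE INDEX idx_readings_sensor ON readings (sensor_id, seq)")
    conn.commit()
    return conn


def add_reading(conn, sensor_id, value):
    """Insert a value and trim the oldest one, all in a single transaction."""
    with conn:  # commits on success, rolls back if anything raises
        conn.execute(
            "INSERT INTO readings (sensor_id, value) VALUES (?, ?)",
            (sensor_id, value),
        )
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM readings WHERE sensor_id = ?", (sensor_id,)
        ).fetchone()
        if count > WINDOW:
            (oldest,) = conn.execute(
                "SELECT MIN(seq) FROM readings WHERE sensor_id = ?", (sensor_id,)
            ).fetchone()
            conn.execute("DELETE FROM readings WHERE seq = ?", (oldest,))


def latest_values(conn, sensor_id):
    """Return the stored values for one id, oldest first."""
    rows = conn.execute(
        "SELECT value FROM readings WHERE sensor_id = ? ORDER BY seq", (sensor_id,)
    ).fetchall()
    return [value for (value,) in rows]
```

Because the new value is fetched before the transaction starts, a failed data retrieve simply means add_reading is never called and the previous 24 values stay untouched.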
It doesn't look like a sensible way of doing things, to be honest.
IMHO, having multiple rows instead of having the wide table is much more flexible.
You can define columns (id, entity_id, created). Then you'll be able to write your records in a "log" manner.
When you need to select the data in the same way as it used to be, you can use the MySQL view for that. Something like
CREATE VIEW my_view AS
SELECT data_01, ..., data_24 -- here you should put the aggregated values aliased as data_01 ... data_24
FROM my_table
WHERE my_table.created >= DATE_SUB(NOW(), INTERVAL 1 DAY)
GROUP BY ... -- here you should aggregate the fields by hours
ORDER BY created;

Trigger-Series of tables, update all except last which will update along with old values

I'm putting together a series of identical tables that consist of only dates submitted to DB. The dates mark when a particular job was executed.
The tables work like dominoes in that when a date is submitted in the update table/form the next table assumes the old date through a BEFORE UPDATE trigger, and so on, and so on.
Each of these tables can be linked from the update table, and the point of them is to view a history of work performed, when a particular job was executed. After 20 or so tables the older dates become somewhat irrelevant, but should all be eventually archived in a final, 21st table which absorbs all the dates that keep getting updated.
This last table is updated via trigger but the old dates/values should be kept, maybe separated by comma. In other words while tables 1-20 contain only one date/entry per field, the overwritten date from the previous table, table 21 will list ALL the dates associated with that particular field that have been, or will be passed down, so no OLD values are overwritten.
After extensive research I discovered that INSERT does not overwrite old data, but every attempt at writing a trigger with INSERT to this last table has failed. All tables have the same ID-"1". No new tables are created, this is a simple exercise in storing data, and yet this last method is elusive.
No previous answers on SO really helped. How to do this simple job?
The UPDATE trigger that works, one derived from a previous SO question, for all the other tables, looks something like this:
BEGIN
UPDATE work2 SET
ins1 = OLD.ins1,
insp1 = OLD.insp1,
b1psp = OLD.b1psp,
b1ptp = OLD.b1ptp,
..........................etc
WHERE work2.id = OLD.id;
END
There must be a simple solution, yet I'm not familiar with PHP enough to solve this. I'm using EasyPHP DevServer 14.1.

How do I set the increment value of autoincrement in mysql

I am trying to develop a system to assign room numbers to tenants of a hostel upon registration, using the auto increment feature of sql.
However, it automatically increases by one after every entry. Because the hostel accommodates four people in one room, I want to change this to 4, so that after every 4 entries I get only one id/room number.
How do I go about this? I am using php and sql. If the autoincrement feature is not possible can you please suggest another way to achieve this? Thanks.
You would need:
http://dev.mysql.com/doc/refman/5.1/en/replication-options-master.html#sysvar_auto_increment_increment
It works like this:
mysql> SET @@auto_increment_increment=4;
So when you insert 4 rows into an empty table (with the default auto_increment_offset of 1), the auto increment column will be:
1,5,9,13
To the best of my knowledge, you cannot set the step for a single auto-increment field. I suggest adding another field and writing a trigger to update its value based on the auto-increment field (auto-increment/4).
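The arithmetic behind that suggestion can be sketched as follows. This is just an illustration in Python, assuming the ids start at 1, increase by 1, and are never deleted; room_for and ROOM_CAPACITY are hypothetical names.

```python
ROOM_CAPACITY = 4  # tenants per room, as described in the question


def room_for(tenant_id: int) -> int:
    """Map a 1-based auto-increment tenant id to a 1-based room number."""
    if tenant_id < 1:
        raise ValueError("tenant_id must be 1 or greater")
    # ids 1..4 -> room 1, ids 5..8 -> room 2, and so on
    return (tenant_id - 1) // ROOM_CAPACITY + 1
```

Note that plain integer division of the id by 4 would put ids 1 to 3 in room 0 and id 4 in room 1, which is why the sketch subtracts 1 first.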
I don't think this is possible with autoincrement..
Maybe you can do something like this:
-- Pseudo code (`table` stands for your own table name)
-- First count the rows that share the highest id, to see how many users are in the last room.
SELECT COUNT(*) FROM `table` WHERE id = (SELECT MAX(id) FROM `table`)
-- If the result of the last query is >= 4, insert the next customer with id + 1.
Don't use auto_increment for this - it can't handle a situation where multiple records will share the same number and although you can reset it manually (see below) it's also not designed for a situation where numbers may get reused in a random order.
You could just have a room_number field with one of the mysql integer types (e.g. tinyint, smallint, mediumint…) or you could separate your database into two tables, one for people (each of whom have an id) and a second to map those ids to rooms.
However you do it, you'd then write a select query to check which room numbers are available before you add the person's details to the database.
You may need to read up on relational databases if that doesn't sound very clear.
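As a rough sketch of the two-table idea with an availability check before each insert: the example below uses Python's built-in sqlite3 instead of MySQL so it can run on its own, and the tables rooms and tenants, their columns, and the function names are all assumptions for illustration.

```python
import sqlite3

ROOM_CAPACITY = 4  # tenants per room, as described in the question


def make_hostel(num_rooms):
    """Create an in-memory hostel with empty rooms (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rooms (room_number INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE tenants ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "name TEXT NOT NULL, "
        "room_number INTEGER NOT NULL REFERENCES rooms (room_number))"
    )
    conn.executemany(
        "INSERT INTO rooms VALUES (?)", [(n,) for n in range(1, num_rooms + 1)]
    )
    conn.commit()
    return conn


def assign_room(conn, name):
    """Put a tenant in the lowest-numbered room that still has a free bed."""
    with conn:  # check and insert happen in one transaction
        row = conn.execute(
            """
            SELECT r.room_number
            FROM rooms AS r
            LEFT JOIN tenants AS t ON t.room_number = r.room_number
            GROUP BY r.room_number
            HAVING COUNT(t.id) < ?
            ORDER BY r.room_number
            LIMIT 1
            """,
            (ROOM_CAPACITY,),
        ).fetchone()
        if row is None:
            raise RuntimeError("no room has a free bed")
        conn.execute(
            "INSERT INTO tenants (name, room_number) VALUES (?, ?)", (name, row[0])
        )
        return row[0]
```

Unlike an auto-increment scheme, this reuses a bed as soon as a tenant leaves, because the room is chosen from the current occupancy rather than from a counter.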
If you do need to reset the auto_increment (sometimes it's nice to do it if you've filled a database with test data which you're about to wipe, and you want the real "production" data to begin at 1) you can use:
ALTER TABLE [tablename] AUTO_INCREMENT = 1
https://dev.mysql.com/doc/refman/5.0/en/example-auto-increment.html

mysql query from subquery keeps hanging

Hello I have a mysql database and all I want is basically to get a value on the second table from a first table query
I have figured something like this but is not working.
select src, dst_number, state, duration
from cdrs, area_code_infos
where SUBSTRING(cdrs.src,2,3) = area_code_infos.`npa`;
Please help me figure out this. I have tried in PHP to have multiple queries running one after the other but when I loaded the page after 45 minutes of wait time I gave up.
Thanks,
I assume the tables are fairly big, and you are also doing an unindexed query. Basically, the substring has to be calculated for every row, so no index on src can be used for the join.
Whenever you do a join, you want to make sure both of the joined fields are indexed.
An option would be to create another column containing the substring calculation and then creating an index on that.
However, a better option would be to have an areaCodeInfosID column and set it as a foreign key to the area_code_infos table
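The first option (a stored, indexed copy of the substring) might look roughly like this. It is a sketch under assumptions: Python's built-in sqlite3 stands in for MySQL so it runs standalone, the column name src_npa and the index names are made up, and it assumes state lives in area_code_infos while the other selected columns come from cdrs.

```python
import sqlite3


def add_npa_column(conn):
    """Store the area code once per row and index both sides of the join."""
    conn.execute("ALTER TABLE cdrs ADD COLUMN src_npa TEXT")
    # substr(src, 2, 3) matches SUBSTRING(cdrs.src, 2, 3) from the question
    conn.execute("UPDATE cdrs SET src_npa = substr(src, 2, 3)")
    conn.execute("CREATE INDEX idx_cdrs_src_npa ON cdrs (src_npa)")
    conn.execute("CREATE INDEX idx_area_code_npa ON area_code_infos (npa)")
    conn.commit()


def calls_with_state(conn):
    """Join on the precomputed column instead of computing a substring per row."""
    return conn.execute(
        """
        SELECT c.src, c.dst_number, a.state, c.duration
        FROM cdrs AS c
        JOIN area_code_infos AS a ON a.npa = c.src_npa
        ORDER BY c.src
        """
    ).fetchall()


def build_demo():
    """Create tiny versions of the two tables (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cdrs (src TEXT, dst_number TEXT, duration INTEGER)")
    conn.execute("CREATE TABLE area_code_infos (npa TEXT, state TEXT)")
    conn.executemany(
        "INSERT INTO cdrs VALUES (?, ?, ?)",
        [("12125550100", "13105550111", 60), ("14155550123", "12125550199", 30)],
    )
    conn.executemany(
        "INSERT INTO area_code_infos VALUES (?, ?)", [("212", "NY"), ("415", "CA")]
    )
    conn.commit()
    return conn
```

New rows would also need src_npa filled in when they are inserted, which is one reason the foreign key approach is the cleaner of the two.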

check if a row was added to table mysql

I have a table which contains orders, and orders are being added to the table by users as time goes by.
I want to implement a service that checks if a row was added to the table.
Is there a specific way to do that?
thanks!
If you want to know which rows have been added since last time you checked, put a timestamp in each row, and keep track somewhere (separately) of the newest row you've seen so far. To find new rows, query for all rows whose timestamp is newer than newest one you've seen before. Then take the most recent timestamp from the result set, and use it to update your "newest row seen so far" variable.
The database itself doesn't keep track of which rows have been newly-added because the meaning of "new" depends on who's asking. A row that was added six months ago is "new" to someone who hasn't checked since then. That's why you have to use timestamps, and have the application keep track of which timestamp currently marks the boundary between "old" and "new".
Edit: Actually, instead of timestamps, you might want to use an auto-increment integer column. With timestamps there's a slight chance that two rows may be added so close together in time that they get the same timestamp, and if the application does its query at a moment when only one of those rows has been inserted, it'll "miss" the other one next time it checks for new rows because it thinks that timestamp has been seen already. A value that always increases for every new row would avoid that problem, plus many tables have one already (for use as a primary key).
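A minimal sketch of the auto-increment variant, assuming a table orders with an integer id column and a hypothetical item column. It uses Python's built-in sqlite3 rather than MySQL so it can run on its own, and NewOrderWatcher is an illustrative name, not an existing API.

```python
import sqlite3


def make_orders_db():
    """Create an in-memory orders table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY AUTOINCREMENT, item TEXT)"
    )
    conn.commit()
    return conn


class NewOrderWatcher:
    """Remembers the highest id seen so far and returns only newer rows."""

    def __init__(self, conn):
        self.conn = conn
        self.last_seen_id = 0  # nothing seen yet

    def poll(self):
        rows = self.conn.execute(
            "SELECT id, item FROM orders WHERE id > ? ORDER BY id",
            (self.last_seen_id,),
        ).fetchall()
        if rows:
            # advance the boundary between "old" and "new"
            self.last_seen_id = rows[-1][0]
        return rows
```

A service would call poll() on a schedule and persist last_seen_id somewhere durable if it needs to survive restarts.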
