I have a script that INSERTs data into a table; later on, when you INSERT new data, it DELETEs the previous record(s) and INSERTs the current data set.
The only issue is that the primary key gets whacked.
e.g. first four rows
1
2
3
4
then when I delete these and enter new data
5
3
4
6
note: the above numbers represent primary key id auto incrementations
Why does the incrementation seem to get confused?
Auto-increment numbers do not get confused. They are unique over the table, and that is their only purpose.
If you select the data, the DB will grab the records as fast as possible, and if you do not specify a specific order, the records are returned in an unpredictable order.
That means if you specify
select * from your_table
order by id
then the records come back in incrementing order. If you delete records, the gaps won't be filled.
If you want to restart the numbers, use truncate table instead of delete. This will reset the counter to 0:
truncate table <your table here>;
Related
So, there is this situation where you have a table into which you want to insert rows in pairs that reference each other. Just like in double-entry accounting, every item has its opposite as its pair. There is this table:
CREATE SEQUENCE tbl_item_id_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;
CREATE TABLE tbl_item (
id integer NOT NULL PRIMARY KEY DEFAULT nextval('tbl_item_id_seq'),
pair_id integer,
label character varying(50) NOT NULL,
FOREIGN KEY (pair_id) REFERENCES tbl_item (id)
);
ALTER SEQUENCE tbl_item_id_seq OWNED BY tbl_item.id;
The items are generated procedurally. Usually, multiple pairs are generated at once, and the ultimate goal would be to insert all the pairs with one query. I have solved this with PHP by inserting a row, returning its id, inserting the other row with pair_id filled in, and then updating the first row with the id of the second. This means 3 queries issued from PHP per pair, so with multiple pairs it becomes number_of_pairs * 3 queries. With about 100 pairs, that is 300 queries, which gives a noticeable processing-time overhead that I would like to minimize.
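In SQL terms, each pair currently costs something like this (ids illustrative):
INSERT INTO tbl_item (label) VALUES ('debit') RETURNING id;                  -- returns, say, 101
INSERT INTO tbl_item (pair_id, label) VALUES (101, 'credit') RETURNING id;   -- returns 102
UPDATE tbl_item SET pair_id = 102 WHERE id = 101;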
So, the question is: what is the fastest way to insert pairs of rows that reference each other's id into a single table in PostgreSQL?
You could reserve some ids:
select nextval('tbl_item_id_seq') from generate_series(1,200)
then manually assign the id/pair_id values. This way, the inserts could even be a single COPY statement (if your Postgres driver supports it).
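A minimal sketch that reserves two ids per pair and inserts both rows in one statement (labels are placeholders):
WITH ids AS (
    SELECT nextval('tbl_item_id_seq') AS a,
           nextval('tbl_item_id_seq') AS b
)
INSERT INTO tbl_item (id, pair_id, label)
SELECT a, b, 'first of pair' FROM ids
UNION ALL
SELECT b, a, 'second of pair' FROM ids;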
I am having a problem resetting the id numbers so the list starts again from 1, 2, 3, 4, 5, since I deleted a few records while testing SQL commands. Can you please help with how to make this id column start again from 1?
I could just edit the id values by hand; however, if I do that, then when I insert a new record it will again continue from the previous counter value, which was 66.
ID Name
1 A
32 B
34 C
35 D
55 E
66 F
Truncate your table first and then execute this
ALTER TABLE tablename AUTO_INCREMENT = 1
You should truncate the table to reseed it properly, rather than just using ALTER TABLE.
(tl;dr: it's usually better not to worry about the density or sequential order of an auto-increment column.)
It is not possible¹ to use AUTO_INCREMENT to automatically fill in values less than MAX(id).
However, the auto-increment ID can be reset if the existing IDs are updated first. The compacting phase is required because MySQL does not allow "filling in gaps" via an auto-increment column.
Compact the existing IDs, like so:
SET @i := 0;
UPDATE t SET id = (@i := @i + 1) ORDER BY id;
Important: Make sure that all relational usage is identified in the form of Foreign Key relations with CASCADE ON UPDATE before this is done or the data may become irreversibly corrupted.
Assign the auto-ID seed after compacting. Because of the rule quoted below¹, MySQL will not accept a subquery such as (SELECT MAX(id) FROM t) here, and any value less than or equal to the current maximum is bumped to MAX(id) + 1 automatically, so this suffices:
ALTER TABLE t AUTO_INCREMENT = 1;
¹ Per the AUTO_INCREMENT documentation in ALTER TABLE:
You cannot reset the counter to a value less than or equal to the value that is currently in use .. if the value is less than or equal to the maximum value currently in the AUTO_INCREMENT column, the value is reset to the current maximum AUTO_INCREMENT column value plus one.
This rule means it is not possible to set the counter lower than an already-used ID; conversely, manually inserting a row with a higher value automatically raises the AUTO_INCREMENT counter. For example, if MAX(id) is 66, ALTER TABLE t AUTO_INCREMENT = 10 leaves the counter at 67.
The easiest (and sometimes fastest) way is to remove the column and add it back. Updating the column may screw up indexes or make a mess of the values. Dropping the whole table makes no sense. But remember that if other tables refer to those ids, you can damage your app.
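A minimal sketch, assuming a MySQL table named your_table whose id column nothing else references:
ALTER TABLE your_table DROP COLUMN id;
ALTER TABLE your_table ADD COLUMN id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;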
I have a table A which has an auto-increment serial number field SLNO. When I insert values into the table, it increments automatically, like 1, 2, 3, 4... etc. But when I delete a row from the table, the order breaks; i.e., if I delete the row with serial number 2, then the serial number field has 1, 3, 4. But I want to maintain a continuous order like 1, 2, 3 even after deleting rows. Is there any way to maintain this order, like using a trigger or something?
A primary auto-increment key is only for uniquely identifying a record. Just leave it be.
Don't misuse the primary key as an indicator of your record order. If you need a specific order of your records, then use an extra column for that, for instance a timestamp column.
If you need a specific order of your records, use a timestamp column with a default value of current_timestamp. That way the value is filled in automatically on insert.
ALTER TABLE your_table
ADD COLUMN inserted_timestamp TIMESTAMP DEFAULT current_timestamp;
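Then order your queries by that column instead of by the id:
SELECT * FROM your_table ORDER BY inserted_timestamp;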
You should leave it as it is.
However, if you really do need to, you can "recalculate" the primary key, i.e. the index:
set @pk := 0;
update
your_table
set pk = (@pk := @pk + 1)
order by pk;
Add a column that specifies whether the row is deleted.
example:
1 - deleted already
0 - not deleted
and add where deleted = 0 in your select query (see the sketch after the table below)
primary key    column2    column3    .....    deleted
1                                             1
2                                             0
3                                             0
4                                             1
5                                             0
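A minimal sketch of this soft-delete approach (table and column names assumed):
ALTER TABLE your_table ADD COLUMN deleted TINYINT NOT NULL DEFAULT 0;
UPDATE your_table SET deleted = 1 WHERE id = 4;   -- mark row 4 as "deleted"
SELECT * FROM your_table WHERE deleted = 0;       -- read only live rows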
Storing an ordinal number on each record would make deletes inefficient. Instead you can rely on the existing SLNO index you already have; that should be enough for all use cases that come to mind.
If you SELECT whatever ORDER BY SLNO LIMIT ... OFFSET k, then the returned rows sit at positions k, k+1, k+2, ...
If you want to get the position of a record knowing its SLNO:
SELECT COUNT(SLNO) FROM A WHERE SLNO <= thatnumber
If you want to get thatnumber'th record:
SELECT * FROM A ORDER BY SLNO LIMIT 1 OFFSET thatnumber
You can do this by altering the table: drop the primary key, then create the primary key again.
But why do you need this? If you have used this key as a foreign key in another table, you will lose all that linked data.
I have a db table filled with ~30k records.
I want to randomly select a record one at a time (when demanded by users), delete the record from the table, and insert it in another table.
I've heard/found that doing ORDER BY RAND() can be quite slow. So I'm using this algorithm (pseudo code):
lowest = getLowestId(); //get lowest primary key id from table
highest = getHighestId(); //get highest primary key id from table
do
{
id = rand(lowest, highest); //get random number between a range of lowest id and highest id
idExists = checkIfRandomIdExists( id );
}
while (! idExists);
row = getRow (id);
process(row);
delete(id);
Right now, with 30k records, I seem to get random ids very quickly. However, as the table size decreases to 15k, 10k, 5k, 100, etc. (which can take months), I'm concerned that this might begin to get slower.
Can I do anything to make this method more effective, or is there a row count at which I should start doing ORDER BY RAND() instead of this method? (e.g. when 5k rows are left, start doing ORDER BY RAND()?)
You could get a random ID using that method, but instead of checking to see whether it exists, just try to get the closest one:
SELECT * FROM table WHERE id >= $randomId ORDER BY id LIMIT 0,1
Then if that fails, go for a lower one, as in the sketch below.
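For instance, with the same placeholders as above, the fallback could search downward from the same value:
SELECT * FROM table WHERE id <= $randomId ORDER BY id DESC LIMIT 0,1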
One way to do it might be to determine the number of records and choose one by record number:
select floor(count(*) * rand()) from thetable;
Use the resulting record number (e.g., chosenrec) in the limit:
select * from thetable limit chosenrec, 1;
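Note that MySQL's LIMIT clause does not accept an expression directly, so outside of application code the two steps could be combined with a user variable and a prepared statement; a sketch, using the same table name:
select cast(floor(count(*) * rand()) as unsigned) into @chosenrec from thetable;
prepare pick from 'select * from thetable limit ?, 1';
execute pick using @chosenrec;
deallocate prepare pick;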
I might recommend a Fisher-Yates Shuffle instead in a separate table. To generate this, create a table like:
CREATE TABLE Shuffle
(
SequentialId INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
OtherTableId INT NOT NULL
)
Notably, don't bother with the foreign key constraint. In SQL Server, for instance, I would say to add a foreign key constraint with ON DELETE CASCADE; if you have a storage engine for which that would be workable in MySQL, go for it.
Now, in the language of your choice:
Get an array of all the IDs in the other table (as @Truth suggested in the comments).
Shuffle these ids using Fisher-Yates (takes linear time).
Insert them into the Shuffle table in order.
Now, you have a random order, so you can just INNER JOIN to the Shuffle table, then ORDER BY Shuffle.SequentialId to find the first record. You can delete the record from Shuffle manually if you have no way to do ON DELETE CASCADE.
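A sketch of that lookup, assuming the original table is named OtherTable with primary key Id:
SELECT t.*
FROM Shuffle s
INNER JOIN OtherTable t ON t.Id = s.OtherTableId
ORDER BY s.SequentialId
LIMIT 1;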
For example:
Row Name
1 John
2 May
3 Marry
4 Tom
5 Peter
Suppose I delete row 2 and row 3, is it possible to update Tom and Peter to row id 2 and 3 respectively and the next insert row to be row id 4?
Yes, but you need to recreate Row:
ALTER TABLE `users` DROP `Row`;
ALTER TABLE `users` AUTO_INCREMENT = 1;
ALTER TABLE `users` ADD `Row` int UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;
No, because think of the problems that this could create. Imagine if you are managing a store inventory and each item has an internal barcode, based on an ID in a database table. You decide never to stock one item again and take it out of the database, and then every ID for every item with an ID greater than the removed item gets decremented by 1... all those barcodes on all those items become useless.
ID numbers in databases aren't generally meant to be used or referenced by people. They are meant to be a static index of a certain item in a database which allows the computer to pull up a given record quickly and reliably. When creating your ID field of your table, make sure you make the datatype large enough to account for these lost IDs when they do occur.
This is just a suggestion; I don't claim it is the best solution. Just consider it.
You execute your delete query.
DELETE FROM table_name WHERE Row IN (2,3);
Once deleted, make a SELECT query from PHP and fetch the result set into an array.
SELECT Row, Name from table_name ORDER BY Row ASC;
Then run an UPDATE for each row in a loop.
$index = 1;
foreach ($dataset as $item)
{
    // Renumber each row sequentially ($pdo is assumed to be an open PDO connection)
    $stmt = $pdo->prepare("UPDATE table_name SET Row = ? WHERE Name = ?");
    $stmt->execute(array($index, $item['Name']));
    $index++;
}
Before the next insert, you have to get the max value of Row and use that value plus one as the Row of the insert query.
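For example:
SELECT MAX(`Row`) + 1 AS next_row FROM table_name;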
This is just an idea. Not the complete code.