Hi :)
I need to insert 13500 rows and 500+ columns from a CSV file.
So I use LOAD DATA INFILE and it works.
But I need exactly the same order in my MySQL database as in my CSV.
Currently, for example, line 1000 of the CSV can end up in position 800 in my table.
I need something like "ORDER BY column1", but I can't find the solution.
Thanks for your help.
PS: I have 2 primary key columns (product references) and they are not in numerical order (e.g. 1, 8, 4, etc.).
EDIT: My code
$dataload = 'LOAD DATA LOCAL INFILE "'.__FILE__.'../../../../bo/csv/'.$nomfichier.'"
REPLACE
INTO TABLE gc_csv CHARACTER SET "latin1"
FIELDS TERMINATED BY "\t"
IGNORE 1 LINES
';
I just take the CSV and use LOAD DATA LOCAL INFILE with it... and the order isn't perfectly respected, I don't know why...
My table design
CREATE TABLE `csv` (
`example` int(20) unsigned NOT NULL,
`example` int(15) unsigned NOT NULL,
`example` varchar(10) default NULL,
[...]
`example` varchar(4) default NULL,
PRIMARY KEY (`RefCatSYS`,`IdProduit`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
Add an auto_increment column to your table, with DEFAULT NULL. When you load data with LOAD DATA INFILE, there will be no value for the column, and it will get assigned an automatically generated id. Select data ordered by the column.
kostja@annie:~$ sudo cat /var/lib/mysql/test/foo.csv
10
9
8
7
6
5
4
3
2
1
mysql> create table tmp (example int primary key, id int unique auto_increment default null);
Query OK, 0 rows affected (0.11 sec)
mysql> load data infile "foo.csv" into table tmp;
Query OK, 10 rows affected, 10 warnings (0.03 sec)
Records: 10 Deleted: 0 Skipped: 0 Warnings: 10
mysql> select * from tmp;
+---------+----+
| example | id |
+---------+----+
|      10 |  1 |
|       9 |  2 |
|       8 |  3 |
|       7 |  4 |
|       6 |  5 |
|       5 |  6 |
|       4 |  7 |
|       3 |  8 |
|       2 |  9 |
|       1 | 10 |
+---------+----+
10 rows in set (0.00 sec)
Tables in relational databases are unordered collections of records. You get the rows in a particular order from a query by using ORDER BY. If the query does not contain an ORDER BY clause, the server returns the rows in whatever order they happen to be stored on disk. That order can change when records are updated. You should never rely on it; always use ORDER BY (and indexes) to get a deterministic order of the rows in the result set.
LOAD DATA INFILE reads the lines from the CSV file and inserts them in the same order they are in the file. Apart from setting the value of an auto-incremented column (if there is one in the table), the order of the lines in the CSV file does not matter.
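Applied to the table from the question, a minimal sketch of this idea (the column name load_seq is made up): add the sequence column before running LOAD DATA, then read the rows back in file order.
-- AUTO_INCREMENT columns must be part of an index, hence the UNIQUE KEY.
ALTER TABLE gc_csv
  ADD COLUMN load_seq INT UNSIGNED NOT NULL AUTO_INCREMENT,
  ADD UNIQUE KEY (load_seq);
-- The CSV carries no value for load_seq, so MySQL numbers rows as they are loaded.
SELECT * FROM gc_csv ORDER BY load_seq;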
I have solved the problem. In fact, the web service sends me a CSV that contains duplicate pairs of the two primary key columns. At lines 2365 and 9798, RefCatSYS and IdProduit are the same, so LOAD DATA INFILE ... REPLACE overwrites line 2365 with line 9798, and that changes the order.
I asked them to send a third, UNIQUE key.
Thanks for your help, and sorry for the disruption.
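For anyone hitting the same symptom, one way such duplicated key pairs could be spotted (a hedged sketch: gc_csv_staging is an assumed copy of gc_csv without the primary key, filled by the same LOAD DATA statement minus the REPLACE keyword):
SELECT RefCatSYS, IdProduit, COUNT(*) AS occurrences
FROM gc_csv_staging
GROUP BY RefCatSYS, IdProduit
HAVING COUNT(*) > 1;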
Related
If I have a table with 2 columns, and I am going to update a column in that table in a way that could create duplicate rows (the table has a unique constraint as well), is there any way to detect and process a row when my update would create such a duplicate?
Usually adding an SQL inline IF statement with some criteria, optional processing, and a self-join to detect duplication will do what you're looking for. The exact answer will be specific to your structure, but I will give an example for a table called user with a column called id, which is the primary key, and SSN, which has a unique constraint on it. We'll populate it with 2 users and update one of them to duplicate the first one's value in the unique SSN column:
CREATE TABLE `test`.`user` (
`id` INT NOT NULL,
`SSN` VARCHAR(45) NULL,
PRIMARY KEY (`id`),
UNIQUE INDEX `ssn_UNIQUE` (`SSN` ASC));
INSERT INTO user VALUES (1, "1234567"), (2, "0123456");
As you may have noticed, if I run the following update when another user (with id=1) already has SSN="1234567", no update is made:
UPDATE user SET SSN="1234567" WHERE id=2;
ERROR 1062 (23000): Duplicate entry '1234567' for key 'ssn_UNIQUE'
However, consider the following instead:
UPDATE user u
LEFT JOIN user AS u2
ON u2.SSN="1234567"
SET u.SSN=IF(
u2.id IS NOT NULL,
CONCAT(u2.SSN, "duplicates", u2.id, "onto", u.id),
"1234567")
WHERE u.id=2;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
In the above example, the following scenarios could play out:
If user id=1 already has SSN="1234567", and I run the above update, the result will be:
SELECT * FROM test.user;
+----+-------------------------+
| id | SSN                     |
+----+-------------------------+
|  2 | 1234567duplicates1onto2 |
|  1 | 1234567                 |
+----+-------------------------+
2 rows in set (0.00 sec)
If I instead try to set the value to "01234567" and run the same update, the result will be:
SELECT * FROM test.user;
+----+----------+
| id | SSN      |
+----+----------+
|  2 | 01234567 |
|  1 | 1234567  |
+----+----------+
2 rows in set (0.00 sec)
If I had a 3rd user, that user could end up with the value "1234567duplicates1onto3" if two other users had made similar attempts at setting the value to "1234567":
SELECT * FROM test.user;
+----+-------------------------+
| id | SSN                     |
+----+-------------------------+
|  1 | 1234567                 |
|  2 | 1234567duplicates1onto2 |
|  3 | 1234567duplicates1onto3 |
+----+-------------------------+
3 rows in set (0.00 sec)
As you can see, the "onto" part allows me to have many duplicates in the same update batch.
To adapt this technique, just change the output of the inline IF to the formula you would use for processing, and make the JOIN criteria whatever provides your duplication detection.
http://dev.mysql.com/doc/refman/5.1/en/control-flow-functions.html
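If the flagged rows then need separate processing, a small hedged follow-up sketch, keyed on the marker strings used in the example above:
SELECT id, SSN
FROM user
WHERE SSN LIKE '%duplicates%onto%';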
I have a table which contains a standard auto-incrementing ID, a type identifier, a number, and some other irrelevant fields. When I insert a new object into this table, the number should auto-increment based on the type identifier.
Here is an example of how the output should look:
id  type_id  number
1   1        1
2   1        2
3   2        1
4   1        3
5   3        1
6   3        2
7   1        4
8   2        2
As you can see, every time I insert a new object, the number increments according to the type_id (i.e. if I insert an object with type_id of 1 and there are 5 objects matching this type_id already, the number on the new object should be 6).
I'm trying to find a performant way of doing this with huge concurrency. For example, there might be 300 inserts within the same second for the same type_id and they need to be handled sequentially.
Methods I've tried already:
PHP
This was a bad idea but I've added it for completeness. A request was made to get the MAX() number for the item type and then add the number + 1 as part of an insert. This is quick but doesn't work concurrently as there could be 200 inserts between the request for MAX() and that particular insert leading to multiple objects with the same number and type_id.
Locking
Manually locking and unlocking the table before and after each insert in order to maintain the increment. This caused performance issues due to the number of concurrent inserts and because the table is constantly read from throughout the app.
Transaction with Subquery
This is how I'm currently doing it but it still causes massive performance issues:
START TRANSACTION;
INSERT INTO objects (type_id,number) VALUES ($type_id, (SELECT COALESCE(MAX(number),0)+1 FROM objects WHERE type_id = $type_id FOR UPDATE));
COMMIT;
Another negative thing about this approach is that I need to do a follow up query in order to get the number that was added (i.e. searching for an object with the $type_id ordered by number desc so I can see the number that was created - this is done based on a $user_id so it works but adds an extra query which I'd like to avoid)
Triggers
I looked into using a trigger in order to dynamically add the number upon insert but this wasn't performant as I need to perform a query on the table I'm inserting into (which isn't allowed so has to be within a subquery causing performance issues).
Grouped Auto-Increment
I've had a look at grouped auto-increment (so that the number would auto-increment based on type_id) but then I lose my auto-increment ID.
Does anybody have any ideas on how I can make this performant at the level of concurrent inserts that I need? My table is currently InnoDB on MySQL 5.5
Appreciate any help!
Update: Just in case it is relevant, the objects table has several million objects in it. Some of the type_id can have around 500,000 objects assigned to them.
Use a transaction and SELECT ... FOR UPDATE. This will resolve the concurrency conflicts.
For your "Transaction with Subquery" approach, try adding an index on the type_id column.
I think an index on type_id will speed up your subquery considerably.
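A hedged sketch of that suggestion, using the objects table named in the question; extending the index to (type_id, number) would additionally let MAX(number) be read straight from the index, but the plain single-column index is what is described above.
-- Index the grouping column used in the correlated subquery.
ALTER TABLE objects ADD INDEX idx_type_id (type_id);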
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
,type_id INT NOT NULL
);
INSERT INTO my_table VALUES
(1,1),(2,1),(3,2),(4,1),(5,3),(6,3),(7,1),(8,2);
SELECT x.*
, COUNT(*) rank
FROM my_table x
JOIN my_table y
ON y.type_id = x.type_id
AND y.id <= x.id
GROUP
   BY x.id
ORDER
   BY x.type_id
    , rank;
+----+---------+------+
| id | type_id | rank |
+----+---------+------+
|  1 |       1 |    1 |
|  2 |       1 |    2 |
|  4 |       1 |    3 |
|  7 |       1 |    4 |
|  3 |       2 |    1 |
|  8 |       2 |    2 |
|  5 |       3 |    1 |
|  6 |       3 |    2 |
+----+---------+------+
or, if performance is an issue, just do the same thing with a couple of @variables (see the sketch below).
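A minimal sketch of that @variables approach, assuming the my_table example above: it numbers rows within each type_id without a self-join. It relies on the left-to-right evaluation of user variables in the SELECT list, which works on MySQL 5.x but is not guaranteed behaviour.
SELECT id
     , type_id
     , @rank := IF(@prev_type = type_id, @rank + 1, 1) AS `rank`
     , @prev_type := type_id AS prev_type
FROM (SELECT id, type_id FROM my_table ORDER BY type_id, id) AS ordered
CROSS JOIN (SELECT @rank := 0, @prev_type := NULL) AS vars;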
Perhaps an idea: create a (temporary) table for all rows sharing a common type_id.
In that table you can use auto-increment for your num column.
Then your num should be fully reliable.
Then you can select your data and update your first table, roughly as sketched below.
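A rough sketch of that idea under the assumption that it targets the objects(id, type_id, number) table from the question, renumbering one type_id at a time; the temporary table name is made up.
-- Hypothetical helper table: numbers the rows of one type in id order.
CREATE TEMPORARY TABLE tmp_seq (
  num    INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  obj_id INT NOT NULL
);
INSERT INTO tmp_seq (obj_id)
SELECT id FROM objects WHERE type_id = 1 ORDER BY id;
-- Copy the generated numbers back into the main table.
UPDATE objects o
JOIN tmp_seq t ON t.obj_id = o.id
SET o.number = t.num;
DROP TEMPORARY TABLE tmp_seq;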
I have mysql table of some records, e.g.:
CREATE TABLE test (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
value VARCHAR(255) NOT NULL -- VARCHAR needs a length; 255 is a placeholder
)
Now, what I need is to generate a unique sequence 1, 2, ..., N per value in a PHP script and store it in another table. How do I achieve this in a thread-safe way, without creating duplicates or skipping numbers?
I was wondering if some additional MySQL table could be helpful, but I don't know how to create something like "a separate auto-increment for each column value", or anything else.
test:
1 ... apples
2 ... oranges
3 ... lemons
some PHP script (accessed in parallel by multiple users at a time):
save_next_fruit($_GET['fruit']);
will create records in another table with values like this:
saved_fruit:
ID | FRUIT(FK) | FRUIT_NO
 1 |         1 |        1
 2 |         1 |        2
 3 |         2 |        1
 4 |         3 |        1
 5 |         3 |        2
 6 |         1 |        3
 7 |         3 |        3
 8 |         2 |        2
 9 |         1 |        4
10 |         2 |        3
11 |         1 |        5
12 |         2 |        4
13 |         1 |        6
14 |         3 |        4
15 |         3 |        5
In other words, I need to do this (e.g. for fruit 3, lemons):
insert into saved_fruit (fruit, fruit_no) values (3, (select MAX(fruit_no)+1 from saved_fruit where fruit = 3));
but in a thread-safe way (I understand that the above command is not thread-safe on a MyISAM MySQL database).
Can you help?
Thanks
MyISAM does support this behavior. Create a two-column primary key, and make the second column auto-increment. It'll start over for each distinct value in the first column.
CREATE TABLE t (i INT, j INT AUTO_INCREMENT, PRIMARY KEY (i,j)) ENGINE=MyISAM;
INSERT INTO t (i) VALUES (1), (1), (2), (2), (1), (3);
SELECT * FROM t;
+---+---+
| i | j |
+---+---+
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 1 |
| 2 | 2 |
| 3 | 1 |
+---+---+
But if you think about it, this is only thread-safe in a storage engine that does table-level locking for INSERT statements, because the INSERT has to read other rows in the table to find the max j value for the same i value. If other sessions run INSERTs concurrently, that creates a race condition.
Thus, the dependency on MyISAM, which does table-level locking on INSERT.
See this reference in the manual: http://dev.mysql.com/doc/refman/5.6/en/example-auto-increment.html under the section, MyISAM Notes.
There are a whole lot of good reasons not to use MyISAM. The deciding factor for me is MyISAM's tendency to corrupt data.
Re your comment:
InnoDB does not support the increment-per-group behavior described above. You can make a multi-column primary key, but the error you got is because InnoDB requires that the auto-increment column be the first column in a key of the table (it doesn't strictly have to be the primary key)
Regardless of the position of the auto-increment column in the multi-column key, it only increments when you use it with InnoDB; it does not number entries per distinct value in another column.
To do this with an InnoDB table, you'd have to lock the table explicitly for the duration of the INSERT, to avoid race conditions. You'd do your own SELECT query for the max value in the group you're inserting to. Then insert that value + 1.
Basically, you have to bypass the auto-increment feature and specify values instead of having them automatically generated.
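A minimal sketch of what that looks like with the question's saved_fruit table; the explicit lock is what makes the read-then-insert safe.
-- Bypass auto_increment: lock, compute the next per-fruit number ourselves, insert.
LOCK TABLES saved_fruit WRITE;
SELECT COALESCE(MAX(fruit_no), 0) + 1 INTO @next_no
FROM saved_fruit
WHERE fruit = 3;
INSERT INTO saved_fruit (fruit, fruit_no) VALUES (3, @next_no);
UNLOCK TABLES;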
As you are using MyISAM, you could lock the whole table:
LOCK TABLES `saved_fruit` WRITE;
-- insert query with select goes here
UNLOCK TABLES;
I'm attempting to build a database that stores messages for multiple users. Each user will be able to send/receive 5 different message "types" (strictly a label; the actual data types will be the same). My initial thought was to create multiple tables for each user, representing the 5 different message types. I quickly learned this is not such a good idea. My next thought was to create 1 table per message type with a users column, but I'm not sure that's the best method either from a performance perspective. What happens if user 1 sends 100 messages of type 1, while user 3 only sends 10? The remaining fields would be null values, and I'm really not sure whether that makes a difference or not. Thoughts? Suggestions and/or suggested reading? Thank you in advance!
No, that (the idea given in the subject of this question) would be tremendously inefficient. You'd need to introduce a new table each time a new user is created, and querying them all at once would be a nightmare.
It's far easier to do this with a single table for storing information about messages. Each row in this table will correspond to one, and only one, message.
Besides, this table should probably have three 'referential' columns: two for linking a specific message to its sender and receiver, and one for storing its type, which can only take a limited set of values.
For example:
MSG_ID | SENDER_ID | RECEIVER_ID | MSG_TYPE | MSG_TEXT
-------------------------------------------------------
     1 |         1 |           2 |        1 | .......
     2 |         2 |           1 |        1 | #######
     3 |         1 |           3 |        2 | $$$$$$$
     4 |         3 |           1 |        2 | %%%%%%%
...
It'll be quite easy to get all the messages sent by someone (with a WHERE sender_id = %someone_id% clause), sent to someone (WHERE receiver_id = %someone_id%), or of some specific type (WHERE msg_type = %some_type%). Best of all, you can easily combine these clauses to set up more sophisticated filters, as in the sketch below.
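For example, a hedged sketch (the table name messages is an assumption): all messages of type 2 that user 1 sent to user 3.
SELECT msg_id, msg_text
FROM messages
WHERE sender_id = 1
  AND receiver_id = 3
  AND msg_type = 2;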
What you initially thought of, it seems, looks like this:
IS_MSG_TYPE1 | IS_MSG_TYPE2 | IS_MSG_TYPE3 | IS_MSG_TYPE4
---------------------------------------------------------
           1 |            0 |            0 |            0
           0 |            1 |            0 |            0
           0 |            0 |            1 |            0
It can be NULLs instead of 0s; the core issue stays the same. And it's broken. Yes, you can still get all the messages of a single type with a WHERE is_msg_type_1 = 1 clause. But even such an easy task as getting the type of a specific message becomes not so easy: you'll have to check each of these 5 columns until you find the one that holds a truthy value.
Similar difficulties await anyone who tries to count the number of messages of each type, which is almost trivial with the structure given above: COUNT(msg_id) ... GROUP BY msg_type, as shown below.
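Spelled out, that per-type count is roughly (again assuming the table is called messages):
SELECT msg_type, COUNT(msg_id) AS nb_messages
FROM messages
GROUP BY msg_type;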
So please, don't do this. Unless you have a very strong reason not to, structure your tables so that, as time passes, they grow in height, not in width.
The remaining fields would be null values
If you design your database vertically, there will be no remaining fields:
user int
msgid int
msg text
create table `tv_ge_main`.`Users` (
  `USER_ID` bigint NOT NULL AUTO_INCREMENT,
  `USER_NAME` varchar(128),
  PRIMARY KEY (`USER_ID`)
);

create table `tv_ge_main`.`Message_Types` (
  `MESSAGE_TYPE_ID` bigint NOT NULL AUTO_INCREMENT,
  `MESSAGE_TYPE` varchar(128),
  PRIMARY KEY (`MESSAGE_TYPE_ID`)
);

create table `tv_ge_main`.`Messages` (
  `MESSAGE_ID` bigint NOT NULL AUTO_INCREMENT,
  `USER_ID` bigint,
  `MESSAGE_TYPE_ID` bigint,
  `MESSAGE_TEXT` varchar(255),
  PRIMARY KEY (`MESSAGE_ID`)
);
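A short usage sketch for this schema; the inserted ids assume a user 1 and a message type 2 already exist.
INSERT INTO `tv_ge_main`.`Messages` (`USER_ID`, `MESSAGE_TYPE_ID`, `MESSAGE_TEXT`)
VALUES (1, 2, 'Hello');

SELECT m.MESSAGE_ID, u.USER_NAME, t.MESSAGE_TYPE, m.MESSAGE_TEXT
FROM `tv_ge_main`.`Messages` m
JOIN `tv_ge_main`.`Users` u ON u.USER_ID = m.USER_ID
JOIN `tv_ge_main`.`Message_Types` t ON t.MESSAGE_TYPE_ID = m.MESSAGE_TYPE_ID;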
I have an existing MySQL table with the id column defined as the primary key and set to auto-increment. Now I would like to know whether I can make the auto-increment start from a predefined value, say 5678, instead of starting from 1.
I would also like to know whether I can set the step for auto-incrementing, say an increase of 15 for each new record insertion (instead of the default increment of 1).
Note: I am using phpMyAdmin to work with the DB, and I have many tables but only one DB.
Thanks.
ALTER TABLE tbl AUTO_INCREMENT = 5678 will set the auto increment to 5678 for that table. Have a look at the detailed information here.
You can set the auto-increment value using the command below:
ALTER TABLE tbl_name AUTO_INCREMENT = 5678;
And you can update the auto_increment_increment counter variable using this command:
SET @@auto_increment_increment=15;
Look here for more info.
mysql> SET @@auto_increment_increment=15;
Query OK, 0 rows affected (0.00 sec)
mysql> INSERT INTO autoinc1 VALUES (NULL), (NULL), (NULL), (NULL);
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> SELECT col FROM autoinc1;
+-----+
| col |
+-----+
|   1 |
|  16 |
|  31 |
|  46 |
You can also use the server system variables:
auto_increment_increment
and
auto_increment_offset
auto_increment_increment lets you step the generated id by a value other than 1 (e.g. 15) each time.
If you then start from a different offset, with the same increment, on each server, you can keep tables on different servers that can later be merged without their keys overlapping.
e.g.
(inc = 15, offset = 1)         (inc = 15, offset = 2)
table1 on server A             table1 on server B
-----------------------------------------------------
id    name                     id    name
1     bill                     2     john
16    monica                   17    claire
....
This can be very useful.
Because the main usage is to have the same table on different servers behave in a different way, it is a server setting and not a table setting.
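As a rough sketch, the settings for "server B" in the example above would be as follows (session scope here; use SET GLOBAL or my.cnf to make them persistent):
SET @@auto_increment_increment = 15; -- step between generated ids
SET @@auto_increment_offset = 2;     -- starting point for this server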
ALTER TABLE whatever AUTO_INCREMENT=5678 - alternatively in phpMyAdmin, go to the "Operations" tab of the table view and set it there. For the increment step, use the setting auto_increment_increment.
You can see an example here:
http://pranaydac08.blogspot.in/2013/10/how-set-auto-increment-value-start-from.html