I'm trying to import a large CSV file into MySQL. Unfortunately, the data within the file is separated by both spaces and tabs.
As a result, whenever I load the data into my table, I end up with countless empty cells (because MySQL only recognizes one field separator). Modifying the data before importing it is not an option.
Here is an example of the data:
# 1574 1 1 1
$ 1587 6 6 2
$115 1878 8 9 23
(Where the second and third value of every row are separated by a tab)
Any ideas?
If my goal were just to import the file, I'd use sed -i 's/,/ /g' *.txt so that there is just one delimiter to worry about.
I like CSVs, but perhaps there's a string encased in double quotes that contains a comma or space, in which case this isn't perfect. It'd still import, just would modify those strings.
In that case, another approach I've used in production is Stat/Transfer. There's a syntax language to create a shell script to convert the file and specify multiple delimiters.
MySQL import CSV file using regex delimiter
Assuming you're using LOAD DATA INFILE try this:
load data local infile 'c:/somefile.txt' into table tabspace
columns terminated by ' '
(col1, @col23, col4, col5)
set col2 = left(@col23, instr(@col23,char(9))-1),
col3 = substr(@col23,instr(@col23,char(9))+1);
Note that the separator is a space, so the second field read from each line contains both the col2 and col3 data, joined by a tab. This is assigned to the user variable @col23, which is then split at the tab (char(9)) and the parts assigned to col2 and col3.
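For anyone who wants to check the splitting logic outside MySQL, here is a minimal Python sketch of the same idea: split on spaces first, then split the combined second field on its tab. The function name split_row is made up for illustration, and it assumes every row has exactly the shape shown in the question (one tab, between the second and third values).

```python
def split_row(line):
    """Mirror the LOAD DATA approach: space-separated fields, then a tab split."""
    # Splitting on a single space yields four pieces; the second one still
    # holds two values joined by a tab (the "col23" of the SQL above).
    col1, col23, col4, col5 = line.rstrip("\r\n").split(" ")
    # Split the combined piece at its first tab, like left()/substr() in SQL.
    col2, col3 = col23.split("\t", 1)
    return [col1, col2, col3, col4, col5]


if __name__ == "__main__":
    print(split_row("# 1574\t1 1 1\n"))
```

A row that does not match this shape raises a ValueError on unpacking, which is probably what you want for a sanity check.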
Could somebody please help me find out how to load this raw text data into MySQL? The format is
user id | item id | rating | timestamp
and I want to insert this data into my table in MySQL (using phpMyAdmin). Let's say the table is named "Rating" and its structure is: user_id (int), item_id (int), rating (int), timestamp (int).
So I want to know how to insert this data into my table. I'm fine with PHP, or with an easier way if there is one.
If you want to generate raw SQL queries, you can do so by using find and replace in your text editor (that looks like Notepad++). I'm guessing that your delimiters are tabs.
Replace all tab characters with commas. We do not need to quote anything, as all of your fields are integers.
Replace all newline characters with the closing and opening parts of a SQL INSERT statement.
Execute these commands in regular expression mode:
Columns
Find: \t
Replace: ,
Rows
Find: \r\n (if that doesn't find anything, look for \n)
Replace: );\r\nINSERT INTO Rating (user_id, item_id, rating, timestamp) VALUES (
On the first row, insert the text INSERT INTO Rating (user_id, item_id, rating, timestamp) VALUES ( to make the row a valid SQL statement.
On the last row, remove any trailing portions of SQL query after the last semicolon.
Copy and paste the result into phpMyAdmin and it should be all good.
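If you would rather script it than use find and replace, the same transformation can be sketched in a few lines of Python. This assumes, like the steps above, that the fields are tab-separated integers; the function name build_inserts and the sample values in the usage line are made up for illustration.

```python
def build_inserts(text, table="Rating",
                  columns=("user_id", "item_id", "rating", "timestamp")):
    """Turn tab-delimited integer rows into one INSERT statement per row."""
    statements = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # int() rejects anything that is not a plain integer, so nothing
        # unexpected can end up inside the generated SQL.
        values = [int(v) for v in line.split("\t")]
        if len(values) != len(columns):
            raise ValueError("unexpected number of fields: %r" % line)
        statements.append("INSERT INTO {} ({}) VALUES ({});".format(
            table, ", ".join(columns), ", ".join(str(v) for v in values)))
    return statements


if __name__ == "__main__":
    # Hypothetical sample row, not taken from the original file.
    for statement in build_inserts("196\t242\t3\t881250949\n"):
        print(statement)
```

The output can be pasted into phpMyAdmin in the same way as the hand-edited version.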
The simplest way I have found for doing something similar is to use Excel. Import the text file into a new document; judging by the look of it, it should be easy to separate the columns, as they appear to be tab delimited. Once you have the required columns, set up a string concatenation to include the values, something like:
=CONCATENATE("INSERT INTO Rating SET user_id='",A1,"', item_id='",B1,"', rating='",C1,"', timestamp='",D1,"';")
Then repeat for all rows, and copy and paste the result into your SQL client.
You can use the import wizard in Toad for MySQL: create a table with the same structure as your file (user id | item id | rating | timestamp), import all the data, and then export the new table as SQL INSERT statements.
I'm using a script that I found online to import a CSV file into MySQL: http://www.johnboy.com/blog/tutorial-import-a-csv-file-using-php-and-mysql but when the script detects a comma inside a cell, as in an address like My address, CO 80113, it splits the cell there as well.
I've seen a solution where you save from Excel to a tab-delimited txt file, then go into Notepad and replace the tabs with semicolons.
Is this the best practice for "fixing the comma in the address" issue?
My end goal is to take a CSV file from Highrise full of hundreds of clients into our MySQL database, then make updates every couple of months so this seems to be a decent script but am I going at this the wrong way?
EDIT : It appears that the part of the PHP code that splits up the cells is this
while ($data = fgetcsv($handle,1000,",","'"));
If you use comma as column separator, then you should quote all string field values, e.g. -
1,'My address, CO 80113'
2,'His address, CO 80114'
4,'Her address, CO 80115'
and so on
Try using the LOAD DATA INFILE statement with the FIELDS ENCLOSED BY '\'' option.
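To see why the quoting matters, here is a small Python sketch (Python only because it is easy to run; the idea is the same as passing an enclosure character to fgetcsv). It parses the sample rows above with a comma delimiter and a single quote as the quote character. The function name parse_quoted is made up for illustration.

```python
import csv
import io


def parse_quoted(text):
    """Parse comma-separated rows whose string fields are enclosed in single quotes."""
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter=",", quotechar="'")
    return list(reader)


if __name__ == "__main__":
    sample = "1,'My address, CO 80113'\n2,'His address, CO 80114'\n"
    for row in parse_quoted(sample):
        print(row)
```

Each row comes back with two fields; the comma inside the quoted address is treated as data rather than as a separator.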
The comma (',') is the heart of a CSV file, so when the file is read, a cell is split wherever a comma is found.
So I think you should replace the commas inside your values with a space or some other delimiter. You can use PHP's built-in str_replace() function when you generate the final CSV.
I need to convert a fixed length text file into a MySQL Table.
My biggest problem is that multiple cells are contained on each line, and this is how the file is sent to me, and the main reason why I want to convert it.
The cells are all of a specific length; however all are included on the one line.
For example the first 3 positions (1 - 3) of a line are the IRT, the next three positions (4 - 6) are the IFTC the next 5 positions (7 - 11) are the FSC, etc.
As the file can contain up to 300 lines of records, I need an easy way to import it straight into the SQL Tables.
I have been searching the net for hours trying to find a solution, however without comma separation I haven't been able to find a working solution yet.
I would like to code this solution in PHP as well, if possible. I am willing to do the hard yards of working out how to use the required function if someone could give me its name; I don't expect people to write my code for me.
File:
testfile.txt (4 rows)
AAA11111xx
BBB22222yy
CCC33333zz
DDD 444 aa
Table:
CREATE TABLE TestLoadDataInfile
( a VARCHAR(3)
, b INT(5)
, c CHAR(2)
) CHARSET = latin1;
Code:
LOAD DATA INFILE 'D:\\...\\testfile.txt'
INTO TABLE TestLoadDataInfile
FIELDS TERMINATED BY ''
LINES TERMINATED BY '\r\n' ;
Result:
mysql> SELECT * FROM TestLoadDataInfile ;
+-----+-------+----+
| a | b | c |
+-----+-------+----+
| AAA | 11111 | xx |
| BBB | 22222 | yy |
| CCC | 33333 | zz |
| DDD | 444 | aa |
+-----+-------+----+
The LOAD DATA INFILE documentation is not very good on this point (fixed-size fields). Here are the relevant parts:
If the FIELDS TERMINATED BY and FIELDS ENCLOSED BY values are both empty (''), a fixed-row (nondelimited) format is used. With fixed-row format, no delimiters are used between fields (but you can still have a line terminator). Instead, column values are read and written using a field width wide enough to hold all values in the field. For TINYINT, SMALLINT, MEDIUMINT, INT, and BIGINT, the field widths are 4, 6, 8, 11, and 20, respectively, no matter what the declared display width is.
LINES TERMINATED BY is still used to separate lines. If a line does not contain all fields, the rest of the columns are set to their default values. If you do not have a line terminator, you should set this to ''. In this case, the text file must contain all fields for each row.
Fixed-row format also affects handling of NULL values, as described later.
Note that fixed-size format does not work if you are using a multi-byte character set.
NULL handling
With fixed-row format (which is used when FIELDS TERMINATED BY and FIELDS ENCLOSED BY are both empty), NULL is written as an empty string. Note that this causes both NULL values and empty strings in the table to be indistinguishable when written to the file because both are written as empty strings. If you need to be able to tell the two apart when reading the file back in, you should not use fixed-row format.
Some cases are not supported by LOAD DATA INFILE:
Fixed-size rows (FIELDS TERMINATED BY and FIELDS ENCLOSED BY both empty) and BLOB or TEXT columns.
User variables cannot be used when loading data with fixed-row format because user variables do not have a display width.
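If you end up slicing the lines yourself instead of relying on LOAD DATA INFILE, the logic is just substring extraction by position (substr() would be the equivalent in PHP). Here is a minimal Python sketch using the widths from the test file above (3, 5 and 2 characters). The FIELDS layout and the function name parse_fixed are assumptions for illustration; for the layout in the question, positions 1 - 3, 4 - 6 and 7 - 11 would become the zero-based slices (0, 3), (3, 6) and (6, 11).

```python
# (name, start, end) with zero-based start and exclusive end, matching
# the a/b/c columns of the test table above.
FIELDS = (("a", 0, 3), ("b", 3, 8), ("c", 8, 10))


def parse_fixed(line, fields=FIELDS):
    """Cut one fixed-width line into named fields and trim the padding."""
    line = line.rstrip("\r\n")
    return {name: line[start:end].strip() for name, start, end in fields}


if __name__ == "__main__":
    for raw in ("AAA11111xx\r\n", "DDD 444 aa\r\n"):
        print(parse_fixed(raw))
```

The resulting values can then be passed to a parameterized INSERT in whatever language you are using.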
You probably won't like it very much, but there really isn't an easy way to do what you're after. A long time ago (circa 1991), I wrote a tool, DBLDFMT (for 'database load format') to deal with such fixed-length, non-delimited files. It is tuned to generating the load format preferred by Informix databases (so it uses a pipe symbol by default to separate the fields, but of course you can tune that with a command line option or an environment variable). It can, however, create delimited data which you can then process more normally, probably using the LOAD DATA INFILE command.
Contact me by email (see my profile) if you want the source code for DBLDFMT. (The current version, 3.17 from 2008, does not have direct support for CSV output. It would not be hard to add it. You can, more or less, achieve the required effect, but it should be a lot easier than it is.)
I have a simple PHP script that loops over data in a CSV file and adds the records to the database accordingly. One of my fields is a description field, but the description itself has a comma (or multiple commas) in it. It seems as though data for that field is only read up until the first comma. However, the next field is correct: it does not contain the remainder of the description, but the value of the next column, which is right.
Am I supposed to escape the comma? I am adding this data to a MySQL database, could that be where the issue is being caused?
My SQL query could be something like:
$description = $data[7]; //description column eg: "hello, my name is xxxxx, I am old"
INSERT INTO tblsomething (id, description) VALUES ($id, '$description');
The above statement only inserts the description as "hello" and nothing after the first comma it encounters.
Any ideas why this is?
Many thanks,
Simon
EDIT: This is solved, apologies to all as it was a silly mistake. It appears that the person who did the front end was creating arrays of content using the pattern ',' to split the content. It seems that the description, although supposed to be one array entry, was being split into multiple entries because it contains commas. This will be solved by using a rarer character, like the pipe symbol, as our separator.
Thanks to all
Because it's not a CSV file. Fields in a CSV file that contain commas are supposed to be delimited by double quotes; this way the CSV functions in PHP will handle them properly.
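As a quick illustration of that quoting rule, here is a sketch in Python (only because it is easy to run; PHP's fputcsv() and fgetcsv() follow the same convention). The helper names to_csv and from_csv are made up. A field containing commas is wrapped in double quotes on the way out and comes back as a single field on the way in.

```python
import csv
import io


def to_csv(rows):
    """Serialize rows; fields containing commas get double-quoted automatically."""
    buf = io.StringIO(newline="")
    csv.writer(buf).writerows(rows)
    return buf.getvalue()


def from_csv(text):
    """Parse CSV text back into rows, honouring double-quoted fields."""
    return list(csv.reader(io.StringIO(text, newline="")))


if __name__ == "__main__":
    text = to_csv([["7", "hello, my name is xxxxx, I am old"]])
    print(text, end="")
    print(from_csv(text))
```

Splitting the same line on ',' by hand would instead produce four pieces, which is the behaviour described in the question's edit.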
I have a large file that I would like to read via php, and then insert various fields into MySQL.
Each file in the feed is in plain text format, separated into columns and rows. Each record has the same set of fields. The following are the delimiters for each field and record:
Field Separator (FS): SOH (ASCII character 1)
Record Separator (RS) : STX (ASCII character 2) + "\n"
If I look at the first few lines of the file they look like this:
#export_dateapplication_idlanguage_codetitledescriptionrelease_notescompany_urlsupport_urlscreenshot_url_1screenshot_url_2screenshot_url_3screenshot_url_4screenshot_width_height_1screenshot_width_height_2screenshot_width_height_3screenshot_width_height_4
#primaryKey:application_idlanguage_code
#dbTypes:BIGINTINTEGERVARCHAR(20)VARCHAR(1000)LONGTEXTLONGTEXTVARCHAR(1000)VARCHAR(1000)VARCHAR(1000)VARCHAR(1000)VARCHAR(1000)VARCHAR(1000)VARCHAR(20)VARCHAR(20)VARCHAR(20)VARCHAR(20)
#exportMode:FULL
I am struggling to know where to start in order to read this file into PHP. Can anyone help with the basic PHP to read each record and assign a variable to each field, which I will then be able to write into MySQL? I can handle the writing into SQL once I have the various fields set up.
Thanks in advance,
Greg
Files larger than 2 GB can't be read in PHP (32-bit limit).
For smaller files, use the simple fopen function.
Inserting into MySQL is then just a matter of matching patterns and running inserts.
If the table structure is the same for every row, it is better to write the statement by hand once and then just execute inserts, extracting the values either with a regex or with other functions like explode and split.
If every line has delimiters between each field, you may look at fgetcsv().
When you use fgetcsv() on a line, it will return an array with the contents of that line. Since you have several lines, put the function inside a while() loop (look at example #1).
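One caveat with reading line by line: the record separator here is STX followed by "\n", so a plain newline inside a LONGTEXT field would be mistaken for the end of a record. Here is a rough sketch of reading by record separator instead, written in Python for brevity (the same buffering idea carries over to fread() and explode() in PHP). It assumes the header lines starting with # are terminated the same way as data records and should be skipped; the function name iter_records is made up.

```python
import io

FIELD_SEP = "\x01"     # SOH
RECORD_SEP = "\x02\n"  # STX followed by a newline


def iter_records(stream, chunk_size=65536):
    """Yield each data record as a list of fields, reading the stream in chunks."""
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # Everything before the last separator is complete; keep the tail
        # in the buffer until more data arrives.
        *complete, buffer = buffer.split(RECORD_SEP)
        for raw in complete:
            if raw and not raw.startswith("#"):  # skip header/comment records
                yield raw.split(FIELD_SEP)
    if buffer and not buffer.startswith("#"):
        yield buffer.split(FIELD_SEP)  # final record without a trailing separator


if __name__ == "__main__":
    sample = "#export_date\x01application_id\x02\n123\x01456\x01EN\x02\n"
    for fields in iter_records(io.StringIO(sample, newline="")):
        print(fields)
```

Because only one chunk plus a partial record is held in memory at a time, this should cope with a large feed. When opening a real file in Python, pass newline="" so that line endings inside fields are not translated.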