I'm trying to get a CSV imported into a MySQL database, where each new line should represent a new row in the database.
Here is what I have so far in the CSV:
1one, 1two, 1three, 1four
2one, 2two, 2three, 2four
And in the application:
$handle = fopen($_FILES['filename']['tmp_name'], "r");
$data = fgetcsv($handle, 1000, ",");
$sql = "INSERT INTO tbl (col1, col2, col3, col4) VALUES (?, ?, ?, ?)";
$q = $c->prepare($sql);
$q->execute(array($data[0],$data[1],$data[2],$data[3]));
The problem is that only the first four values are being inserted, clearly due to the lack of a loop.
I can think of two options to solve this:
1) Do some "hacky" for loop, that remembers the position of the index, and then does n+1 on each of the inserted array values.
2) Realise that fgetcsv is not the function I need, and there is something better to handle new lines!
Thanks!
while ($data = fgetcsv($handle, 1000, ",")){
//process each $data row
}
You may also wish to set auto_detect_line_endings to true in php.ini, to avoid issues with Mac created CSVs.
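Putting it together with the prepared statement from the question, a minimal sketch (assuming $c is your PDO connection, as in the original snippet) might look like this:
$handle = fopen($_FILES['filename']['tmp_name'], "r");
$sql = "INSERT INTO tbl (col1, col2, col3, col4) VALUES (?, ?, ?, ?)";
$q = $c->prepare($sql); // prepare once, execute once per CSV row
while ($data = fgetcsv($handle, 1000, ",")) {
    $q->execute(array($data[0], $data[1], $data[2], $data[3]));
}
fclose($handle);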
Why would you need a script for this? You can do this in 1 simple query:
LOAD DATA LOCAL INFILE '/data/path/to/file.csv' INTO TABLE your_db.and_table
FIELDS TERMINATED BY ', ' /* space included here because there's one in your example */
LINES TERMINATED BY '\n' /* or, on a Windows box, probably '\r\n' */
(`col1`, `col2`, `col3`, `col4`);
That's all there is to it (the MySQL manual lists more options that can be specified, like OPTIONALLY ENCLOSED BY, etc.).
OK, as far as injection goes: while inserting, it is, to the best of my knowledge, not an issue. The data is at no point used to build a query; MySQL just parses it as varchar data and inserts it (it doesn't execute any of it). The only operation it undergoes is a type cast to int or float, if that turns out to be required.
What could happen is that the data contains query strings that could do harm later, when you start selecting data from your table. You might be able to set your MySQL server to escape certain characters for this session, or you could just run something like str_replace() over the data in your script first.
Bottom line is: I'm not entirely sure, but the risk of injection should be, overall, rather small.
A bit more can be found here
When it comes to temp files: since you're using $_FILES['filename']['tmp_name'], you might want to copy it to your own temp file, e.g. file_put_contents('myLoadFile.csv', file_get_contents($_FILES['filename']['tmp_name']));, and delete that file once you're done. It may well be possible to use the temp file directly, but I haven't tried that, so I don't know (and I'm not going to try today :-P).
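For reference, a rough sketch of doing the whole thing from PHP via PDO (the table and column names come from the first question; $pdo is an assumed PDO connection, and LOAD DATA LOCAL requires local-infile to be enabled, e.g. PDO::MYSQL_ATTR_LOCAL_INFILE => true in the connection options):
// copy the upload to a file we control, then load it
$csv = 'myLoadFile.csv';
file_put_contents($csv, file_get_contents($_FILES['filename']['tmp_name']));

$pdo->exec("LOAD DATA LOCAL INFILE " . $pdo->quote($csv) . "
            INTO TABLE tbl
            FIELDS TERMINATED BY ', '
            LINES TERMINATED BY '\\n'
            (col1, col2, col3, col4)");

unlink($csv); // clean up the temporary copy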
I'm faced with a problematic CSV file that I have to import into MySQL, either through the use of PHP and then INSERT commands, or straight through MySQL's LOAD DATA INFILE.
I have attached a partial screenshot of how the data within the file looks:
The values I need to insert are below "ACC1000" so I have to start at line 5 and make my way through the file of about 5500 lines.
It's not possible to skip to each next line because for some Accounts there are multiple payments as shown below.
I have been trying to get to the next row by scanning the rows for the occurrence of "ACC"
if (strpos($data[$c], 'ACC') !== FALSE){
echo "Yep ";
} else {
echo "Nope ";
}
I know it's crude, but I really don't know where to start.
If you have a (foreign key) constraint defined in your target table such that records with a blank value in the type column will be rejected, you could use MySQL's LOAD DATA INFILE to read the first column into a user variable (which is carried forward into subsequent records) and apply its IGNORE keyword to skip those "records" that fail the FK constraint:
LOAD DATA INFILE '/path/to/file.csv'
IGNORE
INTO TABLE my_table
CHARACTER SET utf8
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 4 LINES
(@a, type, date, terms, due_date, class, aging, balance)
SET account_no = @account_no := IF(@a='', @account_no, @a)
There are several approaches you could take.
1) You could go with @Jorge Campos' suggestion and read the file line by line, using PHP code to skip the lines you don't need and insert the ones you want into MySQL. A potential disadvantage of this approach with a very large file is that you will either have to run a bunch of little queries or build up one larger one, and it could take some time to run.
2) You could process the file and remove any rows/columns that you don't need, leaving the file in a format that can be inserted directly into MySQL via the command line or whatever (see the sketch after this answer).
Based on which approach you decide to take, either myself or the community can provide code samples if you need them.
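As a rough illustration of option 2, something like this could copy only the relevant rows into a cleaned-up CSV (the file names and the 'ACC' prefix test are assumptions based on the question; adjust to your real layout):
$in  = fopen('/path/to/original.csv', 'r');
$out = fopen('/path/to/cleaned.csv', 'w');

// skip the 4 header lines described in the question
for ($i = 0; $i < 4; $i++) {
    fgetcsv($in);
}

while (($row = fgetcsv($in)) !== false) {
    // keep account rows ("ACC...") and their continuation/payment rows (blank first column)
    if (strpos($row[0], 'ACC') === 0 || $row[0] === '') {
        fputcsv($out, $row);
    }
}

fclose($in);
fclose($out);
The resulting file can then be fed to LOAD DATA INFILE or imported from the command line.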
This snippet should get you going in the right direction:
$file = '/path/to/something.csv';
if( ! $fh = fopen($file, 'r') ) { die('bad file'); }
if( ! $headers = fgetcsv($fh) ) { die('bad data'); }
while($line = fgetcsv($fh)) {
echo var_export($line, true) . "\n";
if( preg_match('/^ACC/', $line[0]) ) { echo "record begin\n"; }
}
fclose($fh);
http://php.net/manual/en/function.fgetcsv.php
I need to insert data from a plain text file, exploding each line into 2 parts and then inserting them into the database. I'm doing it this way, but can this program be optimized for speed?
The file has around 27,000 lines of entries.
DB structure [unique key (ext,info)]
ext [varchar]
info [varchar]
code:
$string = file_get_contents('list.txt');
$file_list=explode("\n",$string);
$entry=0;
$db = new mysqli('localhost', 'root', '', 'file_type');
$sql = $db->prepare('INSERT INTO info (ext,info) VALUES(?, ?)');
$j=count($file_list);
for($i=0;$i<$j;$i++)
{
$data=explode(' ',$file_list[$i],2);
$sql->bind_param('ss', $data[0], $data[1]);
$sql->execute();
$entry++;
}
$sql->close();
echo $entry.' entry inserted !<hr>';
If you are sure that the file contains unique ext/info pairs, you can try disabling keys for the import:
ALTER TABLE `info` DISABLE KEYS;
And after import:
ALTER TABLE `info` ENABLE KEYS;
This way the unique index will be rebuilt once for all records, not every time something is inserted.
To increase speed even more, you should change the format of this file to be CSV-compatible and use MySQL's LOAD DATA to avoid parsing every line in PHP.
When there are multiple items to be inserted, you usually put all the data in a CSV file, create a temporary table with columns matching the CSV, do a LOAD DATA [LOCAL] INFILE, and then move that data into the destination table. But as far as I can see you don't need much additional processing, so you can even treat your input file as a CSV without any additional trouble.
// note: exec() assumes a PDO connection; with mysqli (as in the question) use $db->query() instead
$db->exec('CREATE TEMPORARY TABLE _tmp_info (ext VARCHAR(255), info VARCHAR(255))');
$db->exec("LOAD DATA LOCAL INFILE '{$filename}' INTO TABLE _tmp_info
FIELDS TERMINATED BY ' '
LINES TERMINATED BY '\n'"); // $filename = 'list.txt' in your case
$db->exec('INSERT INTO info (ext, info) SELECT t.ext, t.info FROM _tmp_info t');
You can run a COUNT(*) on temp table after that to show how many records were there.
If you have a large file that you want to read in I would not use file_get_contents. By using it you force the interpreter to store the entire contents in memory all at once, which is a bit wasteful.
The following is a snippet taken from here:
$file_handle = fopen("myfile", "r");
while (!feof($file_handle)) {
$line = fgets($file_handle);
echo $line;
}
fclose($file_handle);
This is different in that all you keep in memory from the file at any single instant is one line (not the entire contents of the file), which in your case will probably lower the run-time memory footprint of your script. You can use the same loop to perform your INSERT operation, as sketched below.
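For example, applied to the list.txt case above, a sketch combining fgets() with the mysqli prepared statement from the question might look like this:
$db = new mysqli('localhost', 'root', '', 'file_type');
$sql = $db->prepare('INSERT INTO info (ext, info) VALUES (?, ?)');

$file_handle = fopen('list.txt', 'r');
while (!feof($file_handle)) {
    $line = trim(fgets($file_handle));
    if ($line === '' || strpos($line, ' ') === false) {
        continue; // skip blank or malformed lines
    }
    $data = explode(' ', $line, 2);
    $sql->bind_param('ss', $data[0], $data[1]);
    $sql->execute();
}
fclose($file_handle);
$sql->close();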
You could use something like Talend. It's an ETL program, simple and free (it also has a paid version).
Here is the magic solution [3 seconds vs 240 seconds]
$db->query('ALTER TABLE info DISABLE KEYS');
$db->autocommit(FALSE);
// ... run the prepared INSERTs here ...
$db->commit();
$db->query('ALTER TABLE info ENABLE KEYS');
I'm downloading large sets of data via an XML Query through PHP with the following scenario:
- Query for records 1-1000, download all parts (1000 parts have roughly 4.5 MB of text), then store those in memory while I query the next 1001-2000, store in memory (up to potentially 400k)
I'm wondering whether it would be better to write these entries to a text file rather than keeping them in memory, then try to insert them all into the DB once the complete download is done, or to write them to the DB as they come in.
Any suggestions would be greatly appreciated.
Cheers
You can run a query like this:
INSERT INTO table (id, text)
VALUES (null, 'foo'), (null, 'bar'), ..., (null, 'value no 1000');
Doing this, you'll do the whole thing in one shot, and the parser will be called only once. The best you can do is measure it with MySQL's Benchmark function: compare running a query that inserts 1000 records 1000 times against 1,000,000 single-record inserts.
(Sorry about the previous answer; I misunderstood the question.)
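If you build that multi-row statement from PHP, a hedged sketch using a prepared statement with placeholders (the $pdo connection and the parts/body table and column names here are just assumptions for illustration) could be:
// $rows is the array of text parts collected from the XML download
$placeholders = implode(', ', array_fill(0, count($rows), '(NULL, ?)'));
$stmt = $pdo->prepare("INSERT INTO parts (id, body) VALUES $placeholders");

$params = array();
foreach ($rows as $text) {
    $params[] = $text; // id stays NULL so auto_increment fills it in
}
$stmt->execute($params);
For very large batches you would chunk $rows (e.g. 1000 rows per statement) to stay under packet and placeholder limits.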
I think you should write them to the database as soon as you receive them. This will save memory, and you won't have to execute one 400-times-slower query at the end. You will need a mechanism to deal with any problems that may occur in this process, like a disconnection after 399k results.
In my experience it would be better to download everything in a temporary area and then, when you are sure that everything went well, to move the data (or the files) in place.
As you are using a database you may want to dump everything into a table, something like this code:
$error=false;
while ( ($row = getNextRow($db)) && !$error ) {
$sql = "insert into temptable (`key`, `value`) values ('$row[0]', '$row[1]')"; // quote/escape real data as appropriate
if (mysql_query ($sql) ) {
echo '#';
} else {
$error=true;
}
}
if (!$error) {
$sql = "insert into myTable (select * from temptable)";
if (mysql_query($sql)) {
echo 'Finished';
} else {
echo 'Error';
}
}
Alternatively, if you know the table well, you can add a "new" flag field for newly inserted lines and update everything when you are finished.
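A rough sketch of that flag variant, in the same legacy mysql_* style as the snippet above (the is_new column is an assumption you would have to add to your table):
// 1. insert incoming rows with the flag set
mysql_query("insert into myTable (`key`, `value`, is_new) values ('$row[0]', '$row[1]', 1)");

// 2. once the whole download has finished successfully, clear the flag
mysql_query("update myTable set is_new = 0 where is_new = 1");

// 3. if something went wrong instead, throw the partial batch away
mysql_query("delete from myTable where is_new = 1");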
Hey, trying to figure out a way to use a file I have to generate an SQL insert to a database.
The file has many entries of the form:
100090 100090 bill smith 1998
That is, an id number, another id (not always the same), a full name, and a year, all separated by a space.
Basically, what I want to do is get variables from these lines as I iterate through the file, so that I can, for instance, give the values on each line the names id, id2, name, year. I then want to pass these to a database. So for each line I'd be able to do (in pseudocode):
INSERT INTO BLAH VALUES(id, id2,name , year)
This is in PHP (I notice I hadn't mentioned that above). I have also tried using grep to find a regex, but I can't find a way to wrap code such as "VALUES()" around the information from the file.
Any help would be appreciated. I'm kind of stuck on this one.
Try something like this:
$fh = fopen('filename', 'r');
$values = array();
while(!feof($fh)) {
//Read a line from the file
$line = trim(fgets($fh));
//Match the line against the specified format
$fields = array();
if(preg_match(':^(\d+) (\d+) (.+) (\d{4})$:', $line, $fields)) {
//If it does match, create a value tuple
//Don't forget to escape the string part ($db is your mysqli connection)
$values[] = sprintf('(%d, %d, "%s", %d)',
$fields[1], $fields[2], mysqli_real_escape_string($db, $fields[3]), $fields[4]);
}
}
fclose($fh);
$all_values = 'INSERT INTO BLAH VALUES ' . implode(', ', $values);
//Check out the full statement:
echo $all_values;
If the file is really big you'll have to do your SQL INSERTs inside the loop instead of saving them until the end, but for small files I think it's better to collect all the value tuples at the end so we only run one SQL query.
If you can rely on the file's structure (and don't need to do additional sanitation/checking), consider using LOAD DATA INFILE.
GUI tools like HeidiSQL come with great dialogs to build fully functional MySQL statements easily.
Alternatively, PHP has fgetcsv() to parse CSV files.
If all of your lines look like the one you posted, you can read the contents of the file into a string (see http://www.ehow.com/how_5074629_read-file-contents-string-php.html)
Then use PHP's split function to give you each piece of the query (as of PHP 5.3 that looks like preg_split(), since split() is deprecated).
The array will look like this:
myData[0] = 10090
myData[1] = 10090
myData[2] = Bill Smith
myData[3] = 1998
.....And so on for each record
Then you can use a nifty loop to build your query.
for ($i = 0; $i < count($myData); $i += 4)
{
$query = "INSERT INTO MyTABLE VALUES ({$myData[$i]}, {$myData[$i+1]}, '{$myData[$i+2]}', {$myData[$i+3]})";
//Then execute the query (escape the name string before using it)
}
This will be better and faster than introducing a 3rd party tool.
I have thousands of items inside an array that was parsed from XML. My concern is the processing time of my script: does it suffer because I have a hundred thousand records to be inserted into the database? Is there a way to process the insertion of the data into the database in batches?
Syntax is:
INSERT INTO tablename (fld1, fld2) VALUES (val1, val2), (val3, val4)... ;
So you can write something like this (dummy example):
foreach ($data AS $key=>$value)
{
$data[$key] = "($value[0], $value[1])";
}
$query = "INSERT INTO tablename (fld1, fld2) VALUES ".implode(',', $data);
This works quite fast even on huge datasets, and don't worry about performance if your dataset fits in memory.
This is for SQL files, but you can follow its model (if not just use it).
It splits the file up into parts of a size you specify, say 3000 lines, and then inserts them at a timed interval, from under 1 second to 1 minute or more.
This way a large file is broken into smaller inserts etc.
This helps you avoid editing the PHP server configuration and worrying about memory limits, script execution time, and the like.
Google Search "sql big dump" or go to: www.ozerov.de/bigdump.php
So you could even theoretically modify the above script to accept your array as the data source instead of the SQL file. It would take some modification, obviously.
Hope it helps.
-R
It's unlikely to affect the processing time, but you'll need to ensure the DB's transaction logs are big enough to build a rollback segment for 100k rows.
Or with the ADOdb wrapper (http://adodb.sourceforge.net/):
// assuming you have your data in a form like this:
$params = array(
array("key1","val1"),
array("key2","val2"),
array("key3","val3"),
// etc...
);
// you can do this:
$sql = "INSERT INTO `tablename` (`key`,`val`) VALUES ( ?, ? )";
$db->Execute( $sql, $params );
Have you thought about array_chunk()? It worked for me in another project.
http://www.php.net/manual/en/function.array-chunk.php
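For example, a rough sketch of batching the inserts with array_chunk() and multi-row VALUES ($records stands in for your parsed XML data, $db is an assumed mysqli connection, and the table/column names are taken from the earlier dummy example):
$batches = array_chunk($records, 1000); // 1000 rows per INSERT

foreach ($batches as $batch) {
    $values = array();
    foreach ($batch as $row) {
        $values[] = "('" . $db->real_escape_string($row[0]) . "', '"
                         . $db->real_escape_string($row[1]) . "')";
    }
    $db->query('INSERT INTO tablename (fld1, fld2) VALUES ' . implode(',', $values));
}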