Importing a large CSV into a MySQL database - PHP

I'm having a really troublesome time trying to import a large CSV file into MySQL on localhost.
The CSV is about 55 MB and has about 750,000 rows.
I've rewritten the script so that it parses the CSV and dumps the rows one by one.
Here's the code:
$row = 1;
if (($handle = fopen("postal_codes.csv", "r")) !== FALSE)
{
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE)
    {
        $num = count($data);
        $row++;
        for ($c = 0; $c < $num; $c++)
        {
            $arr = explode('|', $data[$c]);
            $postcode  = mysql_real_escape_string($arr[1]);
            $city_name = mysql_real_escape_string($arr[2]);
            $city_slug = mysql_real_escape_string(toAscii($city_name));
            $prov_name = mysql_real_escape_string($arr[3]);
            $prov_slug = mysql_real_escape_string(toAscii($prov_name));
            $prov_abbr = mysql_real_escape_string($arr[4]);
            $lat       = mysql_real_escape_string($arr[6]);
            $lng       = mysql_real_escape_string($arr[7]);
            mysql_query("insert into cities (`postcode`, `city_name`, `city_slug`, `prov_name`, `prov_slug`, `prov_abbr`, `lat`, `lng`)
                values ('$postcode', '$city_name', '$city_slug', '$prov_name', '$prov_slug', '$prov_abbr', '$lat', '$lng')") or die(mysql_error());
        }
    }
    fclose($handle);
}
The problem is that it's taking forever to execute. Any suggested solutions would be appreciated.

You are reinventing the wheel. Check out the mysqlimport tool, which comes with MySQL. It is an efficient tool for importing CSV data files.
mysqlimport is a command-line interface for the LOAD DATA LOCAL INFILE SQL statement.
Either should run 10-20x faster than inserting row by row.
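For the table in the question, a minimal sketch might look like the following. It assumes the file is pipe-delimited, that the two file columns the question's code skips ($arr[0] and $arr[5]) sit in those positions, and that local_infile is permitted on the server; the credentials are placeholders. Note the slug columns come from the PHP toAscii() helper, so they would still need a separate UPDATE pass afterwards.
$mysqli = mysqli_init();
$mysqli->options(MYSQLI_OPT_LOCAL_INFILE, true);                        // allow LOCAL INFILE from this client
$mysqli->real_connect('localhost', 'db_user', 'my_password', 'my_db'); // placeholder credentials
// @skip1 and @skip2 absorb the file columns the question's code ignores.
$sql = "LOAD DATA LOCAL INFILE 'postal_codes.csv'
        INTO TABLE cities
        FIELDS TERMINATED BY '|'
        LINES TERMINATED BY '\\n'
        (@skip1, postcode, city_name, prov_name, prov_abbr, @skip2, lat, lng)";
$mysqli->query($sql) or die($mysqli->error);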

Your problem is likely that you have autocommit on (by default) so MySQL is committing a new transaction for each insert. You should turn autocommit off with SET autocommit=0;. If you can switch to using the mysqli library (and you should if possible), you can use mysqli::autocommit(false) to turn off autocommitting.
$mysqli = new mysqli('localhost', 'db_user', 'my_password', 'mysql');
$mysqli->autocommit(false);
$stmt = $mysqli->prepare("insert into cities (`postcode`, `city_name`, `city_slug`, `prov_name`, `prov_slug`, `prov_abbr`, `lat`, `lng`)
    values (?, ?, ?, ?, ?, ?, ?, ?)");
$row = 1;
if (($handle = fopen("postal_codes.csv", "r")) !== FALSE)
{
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE)
    {
        $num = count($data);
        $row++;
        for ($c = 0; $c < $num; $c++)
        {
            $arr = explode('|', $data[$c]);
            // bind_param() takes its arguments by reference, so the slugs
            // must be computed into variables rather than passed as function calls
            $city_slug = toAscii($arr[2]);
            $prov_slug = toAscii($arr[3]);
            $stmt->bind_param('ssssssdd', $arr[1], $arr[2], $city_slug, $arr[3], $prov_slug, $arr[4], $arr[6], $arr[7]);
            $stmt->execute();
        }
    }
    fclose($handle);
}
$mysqli->commit();
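If holding all 750,000 rows in a single transaction is a concern, a hedged variant of the same loop commits in batches (the 10,000-row batch size here is a guess, not something from the thread; $stmt and $mysqli are set up as above):
$count = 0;
if (($handle = fopen("postal_codes.csv", "r")) !== FALSE)
{
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE)
    {
        foreach ($data as $field)
        {
            $arr = explode('|', $field);
            $city_slug = toAscii($arr[2]);
            $prov_slug = toAscii($arr[3]);
            $stmt->bind_param('ssssssdd', $arr[1], $arr[2], $city_slug, $arr[3], $prov_slug, $arr[4], $arr[6], $arr[7]);
            $stmt->execute();
            if (++$count % 10000 === 0) {
                $mysqli->commit(); // flush a batch so the transaction never grows too large
            }
        }
    }
    fclose($handle);
}
$mysqli->commit(); // commit the final partial batch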

It will be much faster to use LOAD DATA if you can

Try to do it in one query.
It could be limited by your my.cnf (MySQL configuration) though; the maximum size of a single statement is set by max_allowed_packet.
<?php
$row = 1;
// the column list appears once; each row appends only its values
$query = "insert into cities (`postcode`, `city_name`, `city_slug`, `prov_name`, `prov_slug`, `prov_abbr`, `lat`, `lng`) values ";
if (($handle = fopen("postal_codes.csv", "r")) !== FALSE)
{
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE)
    {
        $num = count($data);
        $row++;
        for ($c = 0; $c < $num; $c++)
        {
            $arr = explode('|', $data[$c]);
            $postcode  = mysql_real_escape_string($arr[1]);
            $city_name = mysql_real_escape_string($arr[2]);
            $city_slug = mysql_real_escape_string(toAscii($city_name));
            $prov_name = mysql_real_escape_string($arr[3]);
            $prov_slug = mysql_real_escape_string(toAscii($prov_name));
            $prov_abbr = mysql_real_escape_string($arr[4]);
            $lat       = mysql_real_escape_string($arr[6]);
            $lng       = mysql_real_escape_string($arr[7]);
            $query .= "('$postcode', '$city_name', '$city_slug', '$prov_name', '$prov_slug', '$prov_abbr', '$lat', '$lng'),";
        }
    }
    fclose($handle);
}
mysql_query(rtrim($query, ","));
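A single statement carrying all 750,000 rows will almost certainly exceed max_allowed_packet, so a safer variant (a sketch; the 500-row chunk size is a guess) flushes the accumulated query periodically:
$head = "insert into cities (`postcode`, `city_name`, `city_slug`, `prov_name`, `prov_slug`, `prov_abbr`, `lat`, `lng`) values ";
$query = $head;
$pending = 0;
if (($handle = fopen("postal_codes.csv", "r")) !== FALSE)
{
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE)
    {
        foreach ($data as $field)
        {
            $arr = explode('|', $field);
            $postcode  = mysql_real_escape_string($arr[1]);
            $city_name = mysql_real_escape_string($arr[2]);
            $city_slug = mysql_real_escape_string(toAscii($city_name));
            $prov_name = mysql_real_escape_string($arr[3]);
            $prov_slug = mysql_real_escape_string(toAscii($prov_name));
            $prov_abbr = mysql_real_escape_string($arr[4]);
            $lat       = mysql_real_escape_string($arr[6]);
            $lng       = mysql_real_escape_string($arr[7]);
            $query .= "('$postcode', '$city_name', '$city_slug', '$prov_name', '$prov_slug', '$prov_abbr', '$lat', '$lng'),";
            if (++$pending >= 500) { // tune the chunk size to your max_allowed_packet
                mysql_query(rtrim($query, ",")) or die(mysql_error());
                $query = $head; // start a fresh statement
                $pending = 0;
            }
        }
    }
    fclose($handle);
}
if ($pending > 0) {
    mysql_query(rtrim($query, ",")) or die(mysql_error()); // flush the remainder
}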
If that won't work, you can try this (disable automatic commit):
mysql_query("SET autocommit = 0");
$row = 1;
if (($handle = fopen("postal_codes.csv", "r")) !== FALSE)
{
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE)
{
$num = count($data);
$row++;
for ($c=0; $c < $num; $c++)
{
$arr = explode('|', $data[$c]);
$postcode = mysql_real_escape_string($arr[1]);
$city_name = mysql_real_escape_string($arr[2]);
$city_slug = mysql_real_escape_string(toAscii($city_name));
$prov_name = mysql_real_escape_string($arr[3]);
$prov_slug = mysql_real_escape_string(toAscii($prov_name));
$prov_abbr = mysql_real_escape_string($arr[4]);
$lat = mysql_real_escape_string($arr[6]);
$lng = mysql_real_escape_string($arr[7]);
mysql_query("insert into cities (`postcode`, `city_name`, `city_slug`, `prov_name`, `prov_slug`, `prov_abbr`, `lat`, `lng`)
values ('$postcode', '$city_name', '$city_slug', '$prov_name', '$prov_slug', '$prov_abbr', '$lat', '$lng')") or die(mysql_error());
}
}
fclose($handle);
}

I did this with SQL Server:
I used the SQL BulkInsert command combined with DataTables.
DataTables reside in memory and are built from reading rows inside the file.
Each DataTable is built from a chunk of rows, not the entire file.
Keep track of the chunk processed by keeping a pointer to the last row read and the maximum chunk size.
When you are reading the file, exit the loop when the row id > last row + chunk size.
Keep looping, and keep inserting.

Also, sometimes when you are using LOAD DATA, the import will stop if there are warnings. You can use the IGNORE keyword:
LOAD DATA INFILE 'file_path' IGNORE INTO TABLE your_table

I had a similar situation where it was NOT feasible to use LOAD DATA. Transactions were at times unacceptable as well, as data needed to be checked for duplicates. Yet, the following drastically improved the process time for some of my import data files.
Before your while loop (CSV Lines) set autocommit to 0 and start a transaction (InnoDB only):
mysql_query('SET autocommit=0;');
mysql_query('START TRANSACTION;');
After your loop, commit and reset autocommit back to 1 (default):
mysql_query('COMMIT;');
mysql_query('SET autocommit=1;');
Replace mysql_query() with whatever Database object your code is using. I hope this helps others.
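As a sketch, here is where those statements sit relative to the question's loop:
mysql_query('SET autocommit=0;');
mysql_query('START TRANSACTION;');
if (($handle = fopen("postal_codes.csv", "r")) !== FALSE)
{
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE)
    {
        // ... escape the fields and run the row INSERT exactly as in the question ...
    }
    fclose($handle);
}
mysql_query('COMMIT;');
mysql_query('SET autocommit=1;');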

Related

How to insert multiple arrays in a MySQL table

I'm trying to insert an array into a MySQL table... but my code doesn't work.
$File = 'testfile.csv';
$arrResult = array();
$handle = fopen($File, "r");
$row = 0;
if (empty($handle) === false) {
    while (($data = fgetcsv($handle, 1000, ";")) !== FALSE) {
        $arrResult[] = $data;
        $num = count($data); // 2100 results in my testfile
        $row++;
        if ($row > 1) { // ignore header line
            for ($c = 0; $c < $num; $c++) { // start loop
                $sql = '
                    INSERT INTO MyTable (name, class, level, ability)
                    VALUES ("'.$data[0].'","'.$data[1].'","'.$data[2].'","'.$data[3].'")
                ';
                $Add = $db->query($sql);
            }
        }
    }
    fclose($handle);
}
Result in MyTable:
1,Hero1, Warrior, 65, vitality;
2,Hero1, Warrior, 65, vitality;
3,Hero1, Warrior, 65, vitality;
4,Hero1, Warrior, 65, vitality;
...
You don't need the inner for loop. You're inserting the same row multiple times, since $num is the number of fields in the CSV line, not the number of rows.
And instead of checking $row each time through the loop, you can simply read the first line and ignore it before the loop.
if (empty($handle) === false) {
    fgets($handle); // skip header line
    while (($data = fgetcsv($handle, 1000, ";")) !== FALSE) {
        $sql = '
            INSERT INTO MyTable (name, class, level, ability)
            VALUES ("'.$data[0].'","'.$data[1].'","'.$data[2].'","'.$data[3].'")
        ';
        $Add = $db->query($sql);
    }
}
Or, if you prefer to keep the row counter, simply remove the for loop and insert once per line:
if ($row > 1) { // ignore header line
    $sql = '
        INSERT INTO MyTable (name, class, level, ability)
        VALUES ("'.$data[0].'","'.$data[1].'","'.$data[2].'","'.$data[3].'")
    ';
    $Add = $db->query($sql);
}
And of course move to prepared statements to make your code more secure.
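For example, a minimal sketch with PDO (the question does not say which driver $db is, so the DSN here is an assumption):
$db = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass'); // assumed DSN and credentials
$stmt = $db->prepare('INSERT INTO MyTable (name, class, level, ability) VALUES (?, ?, ?, ?)');
$handle = fopen('testfile.csv', 'r');
fgets($handle); // skip header line
while (($data = fgetcsv($handle, 1000, ';')) !== FALSE) {
    $stmt->execute(array($data[0], $data[1], $data[2], $data[3])); // values are bound, never interpolated
}
fclose($handle);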

Convert string to variable within a prepared statement

I am creating a PHP script that imports a CSV file into an existing MySQL database.
I am selecting the matching column headings by checking if the heading is in an array, and trying to use the column number to create the input for my prepared statement.
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
    $numCols = count($data);
    $row = array();
    // process the first row to select columns for extraction
    if ($isFirstRow) {
        $num = count($data);
        for ($col = 0; $col < count($columns); $col++) {
            for ($c = 0; $c < $numCols; $c++)
                if (!in_array($data[$c], $columns[$col])) {
                    if ($c == ($numCols - 1))
                    {
                        $matchingCol[$col] = '""';
                    }
                }
                else {
                    $matchingCol[$col] = 'data['.$c.']';
                    apc_store("foo$c", $matchingCol[$col]);
                }
        }
        $isFirstRow = false;
    }
    $data = array(
        'contractorName' => (apc_fetch('foo1')),
        'contractorType' => $matchingCol[3]);
    $query = "INSERT INTO uploadSQL SET" . bindFields($data);
    $result = $pdo->prepare($query);
    $result->execute($data);
}
The data posted into the database is '$data[3]', '$data[5]' etc.
How can I get the INSERT to input the data stored at $data[3] and not the string '$data[3]'?
Use $data[$c] instead of 'data['.$c.']'...!?
Single quoted strings will display things almost completely "as is." Variables and most escape sequences will not be interpreted.
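A quick illustration with a made-up value:
$data = array(3 => 'ACME Ltd');
$c = 3;
echo 'data[' . $c . ']'; // prints the literal text: data[3]
echo "$data[$c]";        // prints: ACME Ltd (double quotes interpolate)
$val = $data[$c];        // simplest: keep the value itself instead of building a string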
Not an exact solution to the problem, but a workaround. I'm sure there's a more efficient way.
if ($isFirstRow) {
    $csvRow1 = $data;
}
$num = count($data);
for ($col = 0; $col < count($columns); $col++) {
    for ($c = 0; $c < $numCols; $c++) {
        if (!in_array($csvRow1[$c], $columns[$col])) {
            if ($c == ($numCols - 1))
            {
                $matchingCol[$col] = '""';
            }
        }
        else {
            $matchingCol[$col] = "$data[$c]";
            $c = $numCols; // stop scanning once a match is found
        }
    }
}

Skipping blank rows with fopen();

I currently have some code like this:
$handle = fopen($_FILES['file']['tmp_name'], "r");
$i = 0;
while (($data = fgetcsv($handle, 1000, ",")) !== false) {
    if ($i > 0) {
        $sql = "
            insert into TABLE(A, B, C, D)
            values ('$data[0]', '$data[1]', '$data[2]', '$data[3]')
        ";
        $stmt = $dbh->prepare($sql);
        $stmt->execute();
    }
    $i++;
}
fclose($handle);
This allows me to write the contents of a CSV file to a certain table, excluding the first row where all the column names are. I want to process only the filled rows. How would I do so using this code?
fgetcsv returns an array consisting of a single null if a row is empty (see http://www.php.net/manual/en/function.fgetcsv.php), so you should be able to do a check based on that:
if ($data[0] === null)
{
    continue;
}
or something like that.
fgetcsv() returns an array containing a single null for blank lines, so you can do something like below.
$handle = fopen($_FILES['file']['tmp_name'], "r");
$i = 0;
while (($data = fgetcsv($handle, 1000, ",")) !== false) {
    if (array(null) === $data) { // ignore blank lines
        continue;
    }
    if ($i > 0) {
        $sql = "
            insert into TABLE(A, B, C, D)
            values ('$data[0]', '$data[1]', '$data[2]', '$data[3]')
        ";
        $stmt = $dbh->prepare($sql);
        $stmt->execute();
    }
    $i++;
}
fclose($handle);
Based on the documentation, fgetcsv will return an array consisting of a single null value for empty rows, so you should be able to test the return value against that and skip blank lines that way.
The following example code will skip processing blank lines. Note that I have changed the file and removed some other logic to make it more easily testable.
<?php
$handle = fopen("LocalInput.txt", "r");
$i = 0;
while (($data = fgetcsv($handle, 1000, ",")) !== false) {
    if ($data == array(null)) continue; // skip blank lines
    var_dump($data);
    $i++;
}
fclose($handle);
?>

Insert data from a CSV file into MySQL with PHP

I want to insert data from a CSV file into my MySQL database with PHP, but I don't know what I'm doing wrong.
This is my PHP code:
if ($_FILES['csv']['size'] > 0) {
    $csv_file = $_FILES['csv']['tmp_name']; // Name of your CSV file
    $csvfile = fopen($csv_file, 'r');
    $theData = fgets($csvfile); // read past the header line
    $i = 0;
    while (!feof($csvfile)) {
        $csv_data[] = fgets($csvfile, 1024);
        $csv_array = explode(",", $csv_data[$i]);
        $insert_csv = array();
        $insert_csv['id'] = $csv_array[0];
        $insert_csv['name'] = $csv_array[1];
        $insert_csv['email'] = $csv_array[2];
        if (!empty($insert_csv['email'])) {
            $query = "INSERT INTO contacts(id,name,email)
                VALUES('','".$insert_csv['name']."','".$insert_csv['email']."')";
            $n = mysqli_query($database->connection, $query);
        }
        $i++;
    }
    fclose($csvfile);
}
This is what my CSV looks like:
id | name   | email
1  | user1  | bla#hotmail.com
2  | user2  | blah
3  | user 3 | blah
When I run this code, the value that ends up in my email column is ##0.00 "TL")$#, and my name column also gets ##0.00 "TL")$#.
What am I doing wrong?
You might want to use MySQL to do the whole loading process with the LOAD DATA INFILE statement.
if ($_FILES['csv']['error'] === UPLOAD_ERR_OK && $_FILES['csv']['size'] > 0) {
    $query = "LOAD DATA INFILE '" . $_FILES['csv']['tmp_name']
        . "' INTO TABLE contacts FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\n' (id, name, email);";
    if (!mysqli_query($database->connection, $query)) { // procedural mysqli_query() needs the connection as its first argument
        die('Oops! Something went wrong!');
    }
}
If required you can tweak the loading parameters (FIELDS TERMINATED BY, ENCLOSED BY, LINES TERMINATED BY).
Do take note that if you use this approach, your temporary file needs to be stored in a place where it is accessible by the MySQL server (like /tmp); with LOAD DATA LOCAL INFILE, the file is read on the client side instead.
To start with, I think you should remove the first
$data = fgetcsv($getfile, 1000, ",");
line (the one outside of the while loop)...
Please try the example below; it should work for you. I think you missed the quotes in:
$query = "INSERT INTO contacts(id,name,email)
    VALUES('".$col1."','".$col2."','".$col3."')";
<?php
$csv_file = 'C:\wamp\www\stockmarket\test.csv'; // Name of your CSV file with path
if (($getfile = fopen($csv_file, "r")) !== FALSE) {
    $data = fgetcsv($getfile, 1000, ","); // read and discard the header row
    while (($data = fgetcsv($getfile, 1000, ",")) !== FALSE) {
        // fgetcsv() already splits the line, so there is no need for an
        // implode/explode round-trip or an inner for loop (which would
        // insert the same row once per column)
        $col1 = $data[0];
        $col2 = $data[1];
        $col3 = $data[2];
        // SQL query to insert data into the database
        $query = "INSERT INTO contacts(id,name,email)
            VALUES('".$col1."','".$col2."','".$col3."')";
        $s = mysql_query($query, $connect);
    }
    fclose($getfile);
}
?>

Using PHP to import CSV data into MySQL, Part 2

Following up on my last thread. Trying to import a user-generated CSV into MySQL via a PHP upload script. Uploads successfully, but I am not able to use LOAD DATA due to a permissions problem. Here is what I am trying to do instead:
$row = 1;
if (($handle = fopen($target_path, "r")) !== FALSE)
{
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE)
    {
        $num = count($data);
        echo "<p> $num fields in line $row: <br /></p>\n";
        $row++;
        for ($c = 0; $c < $num; $c++)
        {
            $fullRow = $fullRow . $data[$c] . ",";
        }
        echo $fullRow;
        mysql_query("INSERT INTO foo (field1, field2, field3, field4) VALUES ('$fullRow')");
        $fullRow = NULL;
    }
    fclose($handle);
}
echo $fullRow spits out a verbatim copy of the line from the CSV file, except for an additional comma on the end. Is this why the INSERT is not working correctly? When I do a manual upload via phpMyAdmin, the CSV file is imported without issue. Or is there a problem with the VALUES ('$fullRow') bit of the code?
You can simply remove the last comma after the loop:
for ($c = 0; $c < $num; $c++)
{
    $fullRow = $fullRow . $data[$c] . ",";
}
$fullRow = substr($fullRow, 0, -1); // strip the trailing comma
echo $fullRow;
Also, your script is not OK:
mysql_query("INSERT INTO foo (field1, field2, field3, field4) VALUES ('$fullRow')");
$fullRow = NULL;
Paolo_NL_FR's fixes should get you up and running. The script could use some TLC though, and does not have even basic SQL injection protection. Try something like this perhaps:
if (($handle = fopen($target_path, "r")) !== FALSE)
{
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE)
    {
        // quote and escape each field individually, so four values reach four columns
        $values = "'" . implode("','", array_map('mysql_real_escape_string', $data)) . "'";
        mysql_query("INSERT INTO foo (field1, field2, field3, field4) VALUES ($values);");
    }
    fclose($handle);
}
