I'm looping through a CSV to insert/update the name field of some records in a table. The script is meant to insert the record and, if it already exists, only update the name field.
It's taking quite some time for larger CSV files, so I was wondering whether this code could be modified into a multi-row INSERT query with an ON DUPLICATE KEY UPDATE clause that only updates the name field of the record.
The CSV does NOT contain all the fields for the table, only the ones for the primary key and the name. For that reason, REPLACE will not work in this case.
if (($handle = fopen($_FILES['csv']['tmp_name'], "r")) !== FALSE) {
    $row = 0;
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        $title = 'Import: '.date('d-m-Y').' '.$row;
        // split "areacode-exchange-number" into its parts
        $explode  = explode('-', $data[0]);
        $areacode = $explode[0];
        $exchange = $explode[1];
        $number   = $explode[2];
        // insert the record, or update only the name if it already exists
        $update = "INSERT INTO ".TBLPREFIX."numbers SET
                       area_code = ".$areacode.",
                       exchange  = ".$exchange.",
                       number    = ".$number.",
                       status    = 1,
                       name      = '".escape($data[1])."'
                   ON DUPLICATE KEY UPDATE name = '".escape($data[1])."'";
        mysql_query($update) or die(mysql_error());
        $row++;
    }
    fclose($handle);
    $content .= success($row.' numbers have been imported.');
}
Open a transaction before you start inserting, and commit it when you are done. That way the database can optimize the write to disk, because the whole batch is applied in one go. Without a transaction, every single query is auto-committed immediately and becomes visible to every other connection.
At least I hope you are using InnoDB as a storage engine - MyISAM does not support transactions and has other significant drawbacks. You should avoid it if possible.
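If you also want to collapse the loop into multi-row statements as you describe, here is a minimal sketch. It reuses your escape() helper and TBLPREFIX constant, keeps the mysql_* API you already use, and assumes a unique or primary key over area_code, exchange and number so that ON DUPLICATE KEY UPDATE fires; the batch size of 500 is arbitrary:
if (($handle = fopen($_FILES['csv']['tmp_name'], "r")) !== FALSE) {
    $values = array();
    $row = 0;
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        list($areacode, $exchange, $number) = explode('-', $data[0]);
        // collect one value tuple per CSV line
        $values[] = "(".(int)$areacode.",".(int)$exchange.",".(int)$number.",1,'".escape($data[1])."')";
        $row++;
        // flush in batches of 500 rows to keep the query size reasonable
        if (count($values) == 500) {
            $sql = "INSERT INTO ".TBLPREFIX."numbers (area_code, exchange, number, status, name)
                    VALUES ".implode(',', $values)."
                    ON DUPLICATE KEY UPDATE name = VALUES(name)";
            mysql_query($sql) or die(mysql_error());
            $values = array();
        }
    }
    // flush the remainder
    if (count($values) > 0) {
        $sql = "INSERT INTO ".TBLPREFIX."numbers (area_code, exchange, number, status, name)
                VALUES ".implode(',', $values)."
                ON DUPLICATE KEY UPDATE name = VALUES(name)";
        mysql_query($sql) or die(mysql_error());
    }
    fclose($handle);
}
This can be combined with the transaction advice above; fewer, larger statements plus a single commit is usually where most of the speed-up comes from.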
I asked a question yesterday that was unclear, and I've now expanded it slightly. In short, this project calls for a simple web interface where the user can upload a CSV file (this web page is created already). I've modified my PHP for a test file, but my situation calls for something different. Every day, the user will upload 1 to 5 different CSV reports. These reports have about 110 fields/columns, though not all fields will be filled in every report. I've created a database with 5 tables, each table covering different fields out of the 110. For instance, one table holds info on the water meters (25 fields) and another holds info for the tests done on the meters (45 fields). I'm having a hard time finding a way to take the CSV, once uploaded, and split the data into the different tables. I've heard of putting the whole CSV into one table and splitting it from there with INSERT statements, but I have questions with that:
Is there a way to put a CSV with 110 fields into one table without having the fields created beforehand? Or would I have to create 110 fields in MySQL Workbench and then create a variable for each in PHP?
If not, would I be able to declare variables from the table dump so that the right data then goes into its correct table?
I'm not as familiar with CSVs in terms of uploading like this (usually I'm just pulling a CSV from a folder with a known file name), so that's where my confusion is coming from. Here is the PHP I've used as a simple test with only 10 columns. This was done to make sure the CSV upload works, which it does.
<?php
$server = "localhost";
$user = "root";
$pw = "root";
$db = "uwstest";

$connect = mysqli_connect($server, $user, $pw, $db);
if ($connect->connect_error) {
    die("Connection failed: " . $connect->connect_error);
}

if (isset($_POST['submit'])) {
    $file = $_FILES['file']['tmp_name'];
    $handle = fopen($file, "r");
    $c = 0;
    while (($filesop = fgetcsv($handle, 1000, ",")) !== false) {
        $one   = $filesop[0];
        $two   = $filesop[1];
        $three = $filesop[2];
        $four  = $filesop[3];
        $five  = $filesop[4];
        $six   = $filesop[5];
        $seven = $filesop[6];
        $eight = $filesop[7];
        $nine  = $filesop[8];
        $ten   = $filesop[9];

        $sql = "INSERT INTO staging (One, Two, Three, Four, Five, Six, Seven, Eight, Nine, Ten)
                VALUES ('$one','$two','$three','$four','$five','$six','$seven','$eight','$nine','$ten')";

        // run the insert for every CSV line, not just the last one
        if ($connect->query($sql) === TRUE) {
            $c++;
        } else {
            echo "Error: " . $sql . "<br>" . $connect->error;
        }
    }
    fclose($handle);
    echo "Your database has imported successfully ($c rows)";
}?>
Depending on the CSV size, you might want to consider using MySQL's native CSV import (LOAD DATA INFILE), since it runs 10x-100x faster.
If you do insist on importing row by row, then you can do something like this with PDO (or adapt it to mysqli).
If you want to match columns, then either store your CSV row as an associative array, or parse the first row and store it in an array like $cols.
In this case, $result is an associative array that stores a row of the CSV as column_name => column_value:
// build "col1,col2,..." and ":col1,:col2,..." from the associative row
$cols = implode(',', array_keys($result));
$vals = ':' . str_replace(',', ',:', $cols);
$inserter = $pdo->prepare("INSERT INTO `mydb`.`mytable` ($cols) VALUES ($vals);");

// prefix each key with ':' so it matches the named placeholders
foreach ($result as $k => $v) {
    $result[':' . $k] = is_null($v) ? null : utf8_encode($v);
    unset($result[$k]);
}
$inserter->execute($result);
Hope this helps.
I suggest going with PDO just to avoid all kinds of weirdness that you may encounter in CSV data.
This is how I would create the columns/values:
$is_first = true;
$cols = '';
$vals = '';
$cols_array = array();
while (($csv = fgetcsv($handle)) !== false) {
    if ($is_first) {
        // the first CSV row holds the column names
        $cols_array = $csv;
        $cols = implode(',', $csv);
        $vals = ':' . str_replace(',', ',:', $cols);
        $inserter = $pdo->prepare("INSERT INTO `mydb`.`mytable` ($cols) VALUES ($vals);");
        $is_first = false;
        continue;
    }
    // map each value to its named placeholder, e.g. ':name' => 'john'
    $result = array();
    foreach ($csv as $k => $v) {
        $result[':' . $cols_array[$k]] = is_null($v) ? null : utf8_encode($v);
    }
    $inserter->execute($result);
}
Here is the code that I use for CSV imports via LOAD DATA:
$file = 'data/data.csv';
$path = realpath(dirname(__FILE__));
$full_path = $path."/../../$file";

// read the header row to get the column list for LOAD DATA
$handle = fopen($file, "r");
$headers = fgetcsv($handle, 10000, ",");
fclose($handle);

$alt_query = 'LOAD DATA LOCAL INFILE \''.$full_path.'\' INTO TABLE mytable
    FIELDS TERMINATED BY \',\'
    ENCLOSED BY \'\"\'
    LINES TERMINATED BY \'\r\n\'
    IGNORE 1 LINES
    (' . implode(',', $headers) . ')';

// shells out to the mysql CLI; credentials must come from its defaults (e.g. ~/.my.cnf)
echo exec("mysql -e \"USE mydb;$alt_query;\"", $output, $code);
Assuming the relation between the tables and the CSV is arbitrary but uniform from now on, you just need to establish that correspondence (array index -> table column) once.
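A rough sketch of that idea; the table names, column positions and the $pdo connection below are assumptions, so adjust the mapping to your actual schema:
// hypothetical mapping: CSV column index => array(table, column)
$map = array(
    0 => array('meters', 'serial_number'),
    1 => array('meters', 'size'),
    2 => array('tests',  'test_date'),
    3 => array('tests',  'result'),
);

while (($row = fgetcsv($handle, 1000, ",")) !== false) {
    $perTable = array(); // table => array(column => value)
    foreach ($map as $index => $target) {
        list($table, $column) = $target;
        $perTable[$table][$column] = $row[$index];
    }
    foreach ($perTable as $table => $fields) {
        // $table and the column names come from our own mapping above, not from user input
        $cols = implode(',', array_keys($fields));
        $placeholders = rtrim(str_repeat('?,', count($fields)), ',');
        $stmt = $pdo->prepare("INSERT INTO $table ($cols) VALUES ($placeholders)");
        $stmt->execute(array_values($fields));
    }
}
Once the mapping array is defined, each CSV row produces one insert per destination table.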
I'm trying to take a TSV file and some 'POST'ed inputs and load the TSV file's contents into a DB table, replacing any existing data for the specified columns. The TSV may contain any number of columns and rows, and the first row specifies the columns that are supposed to be modified.
My problem concerns data in columns that are NOT supposed to be modified when running the code-generated LOAD DATA INFILE ... REPLACE INTO TABLE ... MySQL statement. When I run my code (see below), data in columns that are NOT specified in $columnsText (which is generated from the first row of the TSV file) ends up getting set to NULL or its default value. On the other hand, data in columns that ARE specified in $columnsText has its contents replaced just as intended.
An example of the MySQL statement that is generated by my code and behaves as described above:
LOAD DATA INFILE 'C:\\MyProject\\public\\1459772537-cities7.tsv' REPLACE INTO TABLE cities FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '"' LINES TERMINATED BY '\n' IGNORE 1 LINES (id,UNLOCODE,name_english,UN_subdiv)
For every row mentioned in the TSV file, this statement updates the listed columns (id, UNLOCODE, name_english, UN_subdiv) correctly, but all unmentioned columns for that row then get set to NULL!
How do I modify this code to keep the data of unspecified columns from being set to their default/NULL values? Or, getting to the root of the problem, how do I fix the generated MySQL statement to achieve my objective?
I'm using PHP with Laravel.
// Get file, put it in a folder on the server.
if (Input::hasFile('file')) {
    echo "POST has file <br>";
    $file = Input::file('file');
    $name = time() . '-' . $file->getClientOriginalName();
    $path = public_path();
    $file->move($path, $name);
    $pathName = $path .'\\'.$name;
    echo "location: ".$pathName."<br>";

    // Determine whether to use IGNORE or REPLACE in the MySQL query.
    if (isset($_POST['replace']) && $_POST['replace'] == true) {
        $ignoreOrReplace = "REPLACE";
    } else {
        $ignoreOrReplace = "IGNORE";
    }
    echo "ignore or replace: ".$ignoreOrReplace."<br>";

    // Determine columns to insert in DB, based on values of input file's 1st row.
    $columnsText = "";
    if (($handle = fopen("$pathName", "r")) !== FALSE) { // "r" = read-only, file pointer at start of file.
        $columns = fgetcsv($handle, 0, "\t"); // array of the column names from the 1st row of the TSV file.
        $firstIteration = true;
        foreach ($columns as $column) {
            if ($firstIteration) {
                $firstIteration = false;
            } else {
                $columnsText .= ",";
            }
            $columnsText .= $column;
        }
        echo "DB columns to load: ".$columnsText;
        fclose($handle);
    }

    $query = sprintf(
        "LOAD DATA INFILE '%s' %s INTO TABLE %s
         FIELDS TERMINATED BY '\t'
         OPTIONALLY ENCLOSED BY '\"'
         ESCAPED BY '\"'
         LINES TERMINATED BY '\n'
         IGNORE 1 LINES (%s)",
        addslashes($pathName), $ignoreOrReplace, $_POST['mytable'], $columnsText
    );
    echo "<br>Here's the query: ".$query."<br>";
    echo "<br><br> Database update should be complete!<br><br>";
    echo 'Return to Home Page<br>';

    DB::connection()->getpdo()->exec("SET sql_mode ='';"); // I forgot what this does.
    return DB::connection()->getpdo()->exec($query);
}
The documentation states:
If you specify REPLACE, input rows replace existing rows. In other words, rows that have the same value for a primary key or unique index as an existing row. See Section 13.2.8, “REPLACE Syntax”.
REPLACE is not UPDATE. REPLACE is a MySQL extension to SQL that first deletes the row if it exists, then inserts the new one.
On INSERT, MySQL uses the default values for the fields that are not provided in the query. These fields probably default to NULL in your case.
There is no way to update the existing rows using LOAD DATA INFILE.
I suggest you create a working table and use it only for loading the data, as follows:
TRUNCATE it before using it.
LOAD DATA INFILE in it.
Join it against the table you want to update and use UPDATE on the join to copy the fields you need from the working table to the final table.
Use INSERT ... SELECT to get from the join the rows that are not in the final table and insert them.
TRUNCATE it.
Don't delete the working table after it is used; you'll need it again next time. The last step keeps its disk usage at a minimum; the table definition itself doesn't take much space.
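A sketch of steps 3 and 4, using the cities table and columns from the question, cities_staging as the working table, id as the key and a PDO handle $pdo (adjust the names to your schema):
// step 3: copy the loaded columns onto rows that already exist in the final table
$update = "UPDATE cities AS c
           JOIN cities_staging AS s ON s.id = c.id
           SET c.UNLOCODE = s.UNLOCODE,
               c.name_english = s.name_english,
               c.UN_subdiv = s.UN_subdiv";
$pdo->exec($update);

// step 4: insert the rows that exist only in the working table
$insert = "INSERT INTO cities (id, UNLOCODE, name_english, UN_subdiv)
           SELECT s.id, s.UNLOCODE, s.name_english, s.UN_subdiv
           FROM cities_staging AS s
           LEFT JOIN cities AS c ON c.id = s.id
           WHERE c.id IS NULL";
$pdo->exec($insert);
Because only the listed columns are touched, the unspecified columns of existing rows keep their values.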
The answer provided by axiac is, broadly speaking, the correct answer.
In case it is useful to anyone, I have included below the specific code (PHP/Laravel/MySQL) that solved my problem. I can't necessarily say this is the most efficient way to solve this problem, but it is working! :)
// (1) setup
DB::connection()->disableQueryLog();
// (2) Get file, put it in a folder on the server.
if (Input::hasFile('file')) {
$file = Input::file('file');
}
else {
echo "<br>Input file not found! Please review the submitted information.<br>";
return null;
}
$name = time() . '-' . $file->getClientOriginalName();
$path = public_path();
$file->move($path, $name);
$pathName= $path .'\\'.$name;
echo "Input file location: ".$pathName."<br>";
// (3) Determine main table and staging table.
$mainTable = $_POST['mytable'];
$stagingTable = $_POST['mytable'].'_staging'; // All staging tables are named: 'standardtable_staging'.
// (4) Determine destination DB table's columns and columns to be inserted into that table (based on values of input file's 1st row).
$columnsMain = Schema::getColumnListing($mainTable);
$columnsInput = [];
$columnsInputText = "";
if (($handle = fopen("$pathName", "r")) !== FALSE) { // "r" = read-only, file pointer at start of file.
    $columnsInput = fgetcsv($handle, 0, "\t"); // array of the column names from the 1st row of the TSV file.
    $firstIteration = true;
    foreach ($columnsInput as $columnInput) {
        if ($firstIteration) {
            $firstIteration = false;
        } else {
            $columnsInputText .= ",";
        }
        $columnsInputText .= $columnInput;
    }
    echo "<br>DB columns to load: ".$columnsInputText."<br>";
    fclose($handle);
}
// (5) Create a new empty staging table.
$statement = "DROP TABLE IF EXISTS ".$stagingTable; // we drop rather than truncate b/c we want to re-determine columns.
DB::connection()->getpdo()->exec($statement);
$statement = "CREATE TABLE ".$stagingTable." LIKE ".$mainTable;
DB::connection()->getpdo()->exec($statement);
// (6) The staging table only needs to have columns that exist in the TSV file, so let's minimize its columns.
$columnsToDrop = [];
foreach ($columnsMain as $columnMain) {
    if (! in_array($columnMain, $columnsInput)) {
        array_push($columnsToDrop, $columnMain);
    }
}
if (count($columnsToDrop) > 0) {
    Schema::table($stagingTable, function ($t) use ($columnsToDrop) {
        $t->dropColumn($columnsToDrop);
    });
}
// (7) Load data to the staging table.
$statement = sprintf(
"LOAD DATA INFILE '%s' INTO TABLE %s
FIELDS TERMINATED BY '\t'
OPTIONALLY ENCLOSED BY '\"'
ESCAPED BY '\"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES (%s)",
addslashes($pathName),$stagingTable,$columnsInputText
);
echo "<br>Here's the MySQL staging statement: <br>".$statement."<br>";
DB::connection()->getpdo()->exec("SET sql_mode ='';"); // don't actually recall why I put this here.
DB::connection()->getpdo()->exec($statement);
// (8) 'INSERT...ON DUPLICATE KEY UPDATE' is used here to get data from staging table to the actually-used table.
// Note: Any new columns in the staging table MUST already be defined in the main table.
$statement = sprintf("INSERT INTO %s (%s) SELECT * FROM %s ON DUPLICATE KEY UPDATE ", $mainTable,$columnsInputText,$stagingTable);
$firstClause = true;
foreach ($columnsInput as $columnInput) {
    if (strtoupper($columnInput) != "ID") {
        if ($firstClause) {
            $firstClause = false;
        } else {
            $statement .= ", ";
        }
        // keep the existing value in the main table when the staging value is NULL
        $clause = $mainTable.".".$columnInput." = IF(".$stagingTable.".".$columnInput." <=> NULL,".
                  $mainTable.".".$columnInput.",".
                  $stagingTable.".".$columnInput.")";
        $statement .= $clause;
    }
}
echo "<br>Here's the staging-to-actual-table statement:<br>".$statement."<br>";
DB::connection()->getpdo()->exec($statement);
echo "<br>New information added to database!<br>";
I have a MySQL table with the following fields:
ID
PHONE
NAME
CITY
COUNTRY
Using PHP, I am reading a comma separated dump of values off a text document, parsing the values and inserting records to the table. For reference, here's the code:
<?php
// Includes
require_once 'PROJdbconn.php';
// Read comma-separated text file
$arrindx = 0;
$i = 0;
$filehandle = fopen(PROJCDUMPPATH.PROJCDUMPNAME, "rb");
while (!feof($filehandle)) {
    $parts = explode(',', fgets($filehandle));
    $contnames[$arrindx] = $parts[0];
    $contnumbers[$arrindx] = preg_replace('/[^0-9]/', '', $parts[1]);
    $arrindx += 1;
}
fclose($filehandle);
$arrindx -= 1;
$filehandle = NULL;
$parts = NULL;
// Build SQL query
$sql = "INSERT INTO Contact_table (PHONE, NAME) VALUES ";
for ($i = 0; $i < $arrindx; ++$i) {
    $sql .= "('".$contnumbers[$i]."', '".$contnames[$i]."'),";
}
$i = NULL;
$arrindx = NULL;
$contnames = NULL;
$contnumbers = NULL;
$sql = substr($sql,0,strlen($sql)-1).";";
// Connect to MySQL database
$connect = dbconn(PROJHOST,PROJDB,PROJDBUSER,PROJDBPWD);
// Execute SQL query
$query = $connect->query($sql);
$sql = NULL;
$query = NULL;
// Close connection to MySQL database
$connect = NULL;
?>
Now, this code, as you can see, blindly dumps all records into the table. However, I need to modify the code logic as such:
Read text file and parse records into arrays (already doing)
For each record in the text file:
    Check if PHONE exists in the table
    If yes:
        For each field in the text file record:
            If the text file field != NULL:
                Update the corresponding field in the table
            Else:
                Skip
    If no:
        INSERT the record (already doing)
I apologize if the logic isn't terribly clear; feel free to ask me if any aspect confuses you. So, I understand this logic would involve an insane number of SELECT, UPDATE, and INSERT queries, depending on the number of fields (I intend to add more fields in future) and records. Is there any way to either somehow morph them into a single query, or at least optimize the code by minimizing the number of queries?
What you're trying to do is called an "upsert" (update/insert).
MySQL INSERT else if exists UPDATE
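For the PHONE/NAME case, a hedged sketch of that upsert could look like the following. It reuses the $contnumbers/$contnames arrays and the $connect handle from the script above, and assumes a UNIQUE index on PHONE so the duplicate path fires; the IF(...) guard keeps the existing NAME when the file supplies an empty value, and the same pattern can be repeated for each column you add later:
// build one multi-row statement, as before
$sql = "INSERT INTO Contact_table (PHONE, NAME) VALUES ";
for ($i = 0; $i < $arrindx; ++$i) {
    $sql .= "('".$contnumbers[$i]."', '".$contnames[$i]."'),";
}
$sql = substr($sql, 0, -1);
// VALUES(NAME) refers to the value this row tried to insert;
// keep the stored NAME when the incoming value is empty or NULL
$sql .= " ON DUPLICATE KEY UPDATE
          NAME = IF(VALUES(NAME) IS NULL OR VALUES(NAME) = '', NAME, VALUES(NAME))";
$query = $connect->query($sql);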
I would like to create an upload page in PHP and import the uploaded CSV file's data into multiple tables. I tried searching here, but it looks like I can't find anything that imports from a CSV into multiple tables. Any help here is greatly appreciated. Thank you.
As another variant to those proposed above, you can read your CSV line by line and explode each line into fields. Each field will correspond to one variable.
$handle = fopen("/my/file.csv", "r"); // open CSV file for reading
if ($handle) { // if file successfully opened
    while (($CSVrecord = fgets($handle, 4096)) !== false) { // iterate through each line of the CSV
        list($field1, $field2, $field3, $field4) = explode(',', $CSVrecord); // explode the CSV record (line) into variables (fields)
        // here you can easily compose SQL queries and map your data to the tables you need, using simple variables
    }
    fclose($handle); // close file handle
}
If you have access to phpMyAdmin, you can upload the CSV there, then copy it over to each desired table.
In response to your comment that some data is going to one table and other data is going to another table, here is a simple example.
Table1 has 3 fields: name, age and sex. Table2 has 2 fields: haircolour, shoesize. So your CSV could be laid out like:
john smith,32,m,blonde,11
jane doe,29,f,red,4
anders anderson,56,m,grey,9
For the next step you will be using the function fgetcsv. This will break each line of the csv into an array that you can then use to build your SQL statements:
if (($handle = fopen($mycsvfile, "r")) !== FALSE) {
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        // this loops through each line of your CSV, putting the values into array elements
        $sql1 = "INSERT INTO table1 (`name`, `age`, `sex`) VALUES ('".$data[0]."', '".$data[1]."', '".$data[2]."')";
        $sql2 = "INSERT INTO table2 (`haircolour`, `shoesize`) VALUES ('".$data[3]."', '".$data[4]."')";
    }
    fclose($handle);
}
Please note that this does not take any SQL security such as validation into account, but that is basically how it will work.
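For what it's worth, here is a sketch of how those two statements could actually be executed, using mysqli prepared statements (this assumes $connect is your mysqli connection and $mycsvfile the uploaded file), which also covers the escaping concern mentioned above:
if (($handle = fopen($mycsvfile, "r")) !== FALSE) {
    // prepare once, execute once per CSV line
    $stmt1 = $connect->prepare("INSERT INTO table1 (`name`, `age`, `sex`) VALUES (?, ?, ?)");
    $stmt1->bind_param("sss", $name, $age, $sex);
    $stmt2 = $connect->prepare("INSERT INTO table2 (`haircolour`, `shoesize`) VALUES (?, ?)");
    $stmt2->bind_param("ss", $haircolour, $shoesize);
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        list($name, $age, $sex, $haircolour, $shoesize) = $data;
        $stmt1->execute();
        $stmt2->execute();
    }
    fclose($handle);
}
Preparing once outside the loop and executing per line keeps the round-trips cheap and avoids building SQL strings by hand.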
The problem seems to me to be differentiating which field is for which table.
If you send a header like
table.field, table.field, table.field
and then split the header, you'll get all the tables and fields.
Could that be a way to go?
All the best
PS: because of your comment ...
A CSV file has (or can have) a first line with field names in it. When there is a need to copy CSV data into more than one table, you can use a workaround to find out which field is for which table.
user.username, user.lastname, blog.comment, blog.title
"sam" , "Manson" , "this is a comment", "and I am a title"
Now, when reading the CSV data, you can work over the first line and split each title at the dot to find out which tables are used and also the fields.
With this method you are able to copy CSV data to more than one table.
But it means you have to code it first :(
To split the field names (assuming $firstline holds the header line of the CSV):
// only the first line holds the field names
$topfields = preg_split('/,|;|\t/', $firstline);
foreach ($topfields as $t => $f) {
    // $f is "table.field"; split it at the dot to get the table name and the field name
    list($table, $field) = explode('.', trim($f));
}
In the fgetcsv example above you build two insert queries, $sql1 and $sql2, but never execute them. How are you going to run these queries?
I am stuck with a peculiar issue here. I have a script that basically imports a CSV file into a database using fgetcsv() in PHP. There is no problem in doing this at all, and I am able to update old entries as well using the MySQL ON DUPLICATE KEY UPDATE syntax (I am in no way a MySQL expert, hence me asking here).
Here is that part of the code:
$handle = fopen($file, "r");
fgetcsv($handle, 1000, ","); // skip first row since they are headers
while (($fileop = fgetcsv($handle, 1000, ",")) !== false) { // read line by line into $fileop
    // read array values into vars
    $item1 = $fileop[0];
    $item2 = $fileop[1];
    $key   = $fileop[2];
    // and a couple more

    // now INSERT / UPDATE data in the MySQL table
    $sql = mysql_query("INSERT INTO `table` (item1, item2, `key`)
                        VALUES ('$item1', '$item2', '$key')
                        ON DUPLICATE KEY UPDATE item1 = '$item1', item2 = '$item2'");
}
This all works fine. What I am stuck with is the fact that some entries may have been removed from the actual CSV (as in, the key may no longer exist). What I would like to do is remove the entries from the MySQL table that are no longer present in the CSV.
Meaning, if $key is gone from the CSV, also remove that row from the database table. I suppose I would do it before I run the INSERT/UPDATE query on the MySQL table?
I would appreciate any help, guys.
Just keep an account of your keys.
Save every $key in an array in your while loop, and at the end run a query that says:
DELETE FROM `table` WHERE `key` NOT IN (comma-separated list of keys goes here)
$arrayThatYouNeedToTest = array();
$handle = fopen($file, "r");
fgetcsv($handle, 1000, ","); // skip first row since they are headers
while (($fileop = fgetcsv($handle, 1000, ",")) !== false) { // read line by line into $fileop
    // read array values into vars
    $item1 = $fileop[0];
    $item2 = $fileop[1];
    $key   = $fileop[2];
    // and a couple more

    // now INSERT / UPDATE data in the MySQL table
    $sql = mysql_query("INSERT INTO `table` (item1, item2, `key`)
                        VALUES ('$item1', '$item2', '$key')
                        ON DUPLICATE KEY UPDATE item1 = '$item1', item2 = '$item2'");

    $arrayThatYouNeedToTest[] = $key;
}
// note: this assumes numeric keys; string keys would need quoting in the NOT IN list
$stringThatYouNeedToInspect = implode(",", $arrayThatYouNeedToTest);
$queryYouREALLYneedToCheckFirst = "DELETE FROM `table` WHERE `key` NOT IN (".$stringThatYouNeedToInspect.")";
//$result = mysql_query($queryYouREALLYneedToCheckFirst);
I do something very similar to this with an affiliate website that has just under 500,000 products.
In your database, simply add another column named "update_flag" or something similar, with a default of 0. As you add items from the CSV file, set update_flag to 1. In your ON DUPLICATE KEY UPDATE clause, set the field to 2. I also went and added two other fields: "date_added" and "date_updated".
After your import is complete, you can count the old items (to be deleted), the newly added items and those that have been updated. You can then simply DELETE FROM table WHERE update_flag = 0.
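A sketch of that flag workflow, reusing the item1/item2/key columns from the code earlier in this thread and assuming you reset the flag at the start of every import (table and column names are placeholders):
// before the import: mark every existing row as "not seen in this import"
mysql_query("UPDATE `table` SET update_flag = 0");

// inside the fgetcsv loop: new rows get flag 1, updated rows get flag 2
mysql_query("INSERT INTO `table` (item1, item2, `key`, update_flag, date_added)
             VALUES ('$item1', '$item2', '$key', 1, NOW())
             ON DUPLICATE KEY UPDATE
                 item1 = '$item1', item2 = '$item2',
                 update_flag = 2, date_updated = NOW()");

// after the import: anything still flagged 0 was not in the CSV
mysql_query("DELETE FROM `table` WHERE update_flag = 0");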
I hope this helps.