Well, I'm trying to insert several rows from a CSV file into a MySQL DB. My first attempt (wrong approach) was to insert each row by creating a new object with $o = new Model();
After reading/researching on the web I saw that what I need is to use a transaction. Right now I'm using the php-activerecord ORM and this is my code, but I'm still getting the 30-second fatal error:
try {
    if (($handle = fopen("somefile.csv", "r")) !== FALSE) {
        while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
            $someid = $data[4];
            Usuario::transaction(function() use ($someid) {
                Usuario::create(array("matricula" => $someid));
            });
        }
        fclose($handle);
    }
} catch (Exception $e) {
    // minimal catch so the try block is valid; log or handle the failure here
    echo $e->getMessage();
}
I think I'm coding the transaction the wrong way, but I can't figure out how to do it, so I need some help. Actually the insert is working; what I need is to insert everything before the 30-second error happens. My database is on GoDaddy, by the way.
thanks
EDIT -
This was solved with the set_time_limit function; it was not a transaction problem. Maybe this question can help someone else, so I will leave it up.
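For reference, a minimal sketch of that fix (the 600-second value is just an example; 0 removes the limit entirely):
// Raise the 30-second execution time limit before starting the CSV import loop.
set_time_limit(600);
// ...then run the import loop as above.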
Let me explain: I have a Symfony2 project and I need to import users into my database via a CSV file. I have to do some work on the data before importing it into MySQL. I created a service for this and everything works fine, but it takes too much time to execute and slows down my server if I give it my entire file. My files usually have between 500 and 1500 rows, and I have to split them into files of ~200 rows and import them one by one.
I need to handle related users that can be in the file and/or already in the database. A related user is usually the parent of a child.
Here is my simplified code:
$validator = $this->validator;
$members = array();
$children = array();
$mails = array();

$handle = fopen($filePath, 'r');
$datas = fgetcsv($handle, 0, ";"); // the first line is read and discarded here

while (($datas = fgetcsv($handle, 0, ";")) !== false) {
    $user = new User();

    // If there is a related user
    if ($datas[18] != "") {
        $user->setRelatedMemberEmail($datas[18]);
        $relation = array_search(ucfirst(strtolower($datas[19])), UserComiti::$RELATIONSHIPS);
        if ($relation !== false)
            $user->setParentRelationship($relation);
    }
    else {
        $user->setRelatedMemberEmail($datas[0]);
        $user->addRole("ROLE_MEMBER");
    }

    $user->setEmail($mail);
    $user->setLastName($lastName);
    $user->setFirstName($firstName);
    $user->setGender($gender);
    $user->setBirthdate($birthdate);
    $user->setCity($city);
    $user->setPostalCode($zipCode);
    $user->setAddressLine1($adressLine1);
    $user->setAddressLine2($adressLine2);
    $user->setCountry($country);
    $user->setNationality($nationality);
    $user->setPhoneNumber($phone);

    // Entity validation
    $listErrors = $validator->validate($user);

    // In case of errors
    if (count($listErrors) > 0) {
        foreach ($listErrors as $error) {
            $nbError++;
            $errors .= "Line " . $line . " : " . $error->getMessage() . "\n";
        }
    }
    else {
        if ($mailParent != null)
            $children[] = $user;
        else {
            $members[] = $user;
            $nbAdded++;
        }
    }

    foreach ($members as $user) {
        $this->em->persist($user);
        $this->em->flush();
    }

    foreach ($children as $child) {
        // If the related user is already in the DB
        $parent = $this->userRepo->findOneBy(array('username' => $child->getRelatedMemberEmail(), 'club' => $this->club));
        if ($parent !== false) {
            // Check if someone related to the related user already has the same last name and first name.
            // If that is the case, we can guess that this user has already been created.
            $testName = $this->userRepo->findByParentAndName($child->getFirstName(), $child->getLastName(), $parent, $this->club);
            if (!$testName) {
                $child->setParent($parent);
                $this->em->persist($child);
                $nbAdded++;
            }
            else
                $nbSkipped++;
        }
        // Else, in case the related user is neither in the file nor in the database,
        // we create a fake one that will be able to update his profile later.
        else {
            $newParent = clone $child;
            $newParent->setUsername($child->getRelatedMemberEmail());
            $newParent->setEmail($child->getRelatedMemberEmail());
            $newParent->setFirstName('Unknown');
            $this->em->persist($newParent);
            $child->setParent($newParent);
            $this->em->persist($child);
            $nbAdded += 2;
            $this->em->flush();
        }
    }
}
This is not my whole service, because I don't think the rest would help here, but if you need more information, ask me.
While I do not have the means to quantitatively determine the bottlenecks in your program, I can suggest a couple of guidelines that will likely increase its performance significantly.
Minimize the number of database commits you are making. A lot happens when you write to the database. Is it possible to commit only once at the end?
Minimize the number of database reads you are making. Similar to the previous point, a lot happens when you read from the database.
If after considering the above points you still have issues, determine what SQL the ORM is actually generating and executing. ORMs work great until efficiency becomes a problem and more care needs to go into ensuring optimal queries are being generated. At this point, becoming more familiar with the ORM and SQL would be beneficial.
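As a rough sketch of the first point with Doctrine (the batch size of 100 and the placement are assumptions about your loop, and clear() detaches managed entities, so parent lookups done later would have to re-fetch them):

$batchSize = 100; // arbitrary
$i = 0;
while (($datas = fgetcsv($handle, 0, ";")) !== false) {
    $user = new User();
    // ... build and validate $user as before ...
    $this->em->persist($user);

    // Flush (and clear) every $batchSize rows instead of once per row.
    if ((++$i % $batchSize) === 0) {
        $this->em->flush();
        $this->em->clear();
    }
}
// Flush whatever is left after the loop.
$this->em->flush();
$this->em->clear();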
You don't seem to be working with too much data, but if you were, MySQL alone supports reading CSV files.
The LOAD DATA INFILE statement reads rows from a text file into a table at a very high speed.
https://dev.mysql.com/doc/refman/5.7/en/load-data.html
You may be able to access this MySQL-specific feature through your ORM, but if not, you would need to write some plain SQL to utilize it. Since you need to modify the data you are reading from the CSV, you would likely be able to do this very, very quickly by following these steps (sketched below):
Use LOAD DATA INFILE to read the CSV into a temporary table.
Manipulate the data in the temporary table and other tables as required.
SELECT the data from the temporary table into your destination table.
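A rough sketch of those three steps from PHP with PDO; the staging table, column names, and file path are all placeholders, and LOAD DATA LOCAL requires local_infile to be enabled on the server and PDO::MYSQL_ATTR_LOCAL_INFILE on the client:

$pdo = new PDO(
    'mysql:host=localhost;dbname=mydb;charset=utf8',
    'user',
    'password',
    array(PDO::MYSQL_ATTR_LOCAL_INFILE => true) // needed for LOAD DATA LOCAL
);

// 1. Load the raw CSV into a throw-away staging table at high speed.
$pdo->exec("CREATE TABLE users_import (
    email VARCHAR(255), last_name VARCHAR(255), first_name VARCHAR(255)
)");
$pdo->exec("LOAD DATA LOCAL INFILE '/path/to/users.csv'
    INTO TABLE users_import
    FIELDS TERMINATED BY ';' ENCLOSED BY '\"'
    LINES TERMINATED BY '\\n'
    IGNORE 1 LINES
    (email, last_name, first_name)");

// 2. Manipulate the data in the staging table as required.
$pdo->exec("UPDATE users_import SET email = LOWER(TRIM(email))");

// 3. Copy the cleaned rows into the destination table, then drop the staging table.
$pdo->exec("INSERT INTO users (email, last_name, first_name)
    SELECT email, last_name, first_name FROM users_import");
$pdo->exec("DROP TABLE users_import");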
I know this is a very old topic, but some time ago I created a bundle which can help import entities from CSV to the database. So if someone sees this topic, it may be helpful for them.
https://github.com/jgrygierek/BatchEntityImportBundle
https://github.com/jgrygierek/SonataBatchEntityImportBundle
I'm trying to insert data (160,000+ rows) using INSERT INTO and PHP PDO, but I have a bug.
When I launch the PHP script, I see more rows inserted into my database than the exact number of lines in my CSV.
Can someone tell me if my loop is incorrect or something?
Here is the code I have:
$bdd = new PDO('mysql:host=<myhost>;dbname=<mydb>', '<user>', '<pswd>');
// I clean the table
$req = $bdd->prepare("TRUNCATE TABLE lbppan_ticket_reglements;");
$req->execute();
// I read and import the CSV file line by line
$handle = fopen('<pathToMyCsvFile>', "r");
while (($data = fgetcsv($handle, 0, ',')) !== FALSE) {
    $reqImport =
        "INSERT INTO lbppan_ticket_reglements
        (<my31Columns>)
        VALUES
        ('$data[0]','$data[1]','$data[2]','$data[3]','$data[4]','$data[5]','$data[6]','$data[7]','$data[8]',
        '$data[9]','$data[10]','$data[11]','$data[12]','$data[13]','$data[14]','$data[15]','$data[16]',
        '$data[17]','$data[18]','$data[19]','$data[20]','$data[21]','$data[22]','$data[23]','$data[24]',
        '$data[25]','$data[26]','$data[27]','$data[28]','$data[29]','$data[30]')";
    $req = $bdd->prepare($reqImport);
    $req->execute();
}
fclose($handle);
The script partly works, because data does end up in the table, but I don't know why it misbehaves and inserts extra rows. I think that maybe, due to the file size (18 MB), the script crashes and attempts to relaunch, inserting the same rows again.
I can't use LOAD DATA on the server I'm using.
Thanks for your help.
This is not an answer but adding this much into comments is quite tricky.
Start by upping the maximum execution time
If that does not solve your issue, start working your way through the code line by line and handle every exception you can think of. For example, you are truncating the table BUT you say you have loads more data after execution, so could the truncate be failing?
try {
    $req = $bdd->prepare("TRUNCATE TABLE lbppan_ticket_reglements;");
    $req->execute();
} catch (\Exception $e) {
    exit($e->getMessage()); // Die immediately for ease of reading
}
Not the most graceful of try/catches, but it will allow you to easily spot a problem. You can also apply this to the subsequent query...
try {
    $req = $bdd->prepare($reqImport);
    $req->execute();
} catch (\Exception $e) {
    exit($e->getMessage());
}
and also stick in some diagnostics: are you actually inserting 160k rows? You could optionally echo out $i on each loop iteration and see if you can spot any breaks or abnormalities.
$i = 0;
while (($data = fgetcsv($handle, 0, ',')) !== FALSE) {
    // ... your stuff
    $i++;
}
echo "Rows inserted " . $i . "\n\n";
Going beyond that, you can have the loop print out the SQL content for you to look at manually; perhaps it's doing something weird and fruity.
Hope that helps.
Assuming $data[0] is the unique identifier, you can try this to spot the offending row(s):
$i = 0;
while (($data = fgetcsv($handle, 0, ',')) !== FALSE) {
    echo 'Row #'.++$i.' - '.$data[0];
}
Since you are not using bound parameters in your prepared statement, it is very possible that one of the $data array items is causing a double insert or some other unknown issue.
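For example, a minimal sketch of that for the question's table; the <my31Columns> list is left elided, so the 31-placeholder count is only illustrative:

// Prepare once, outside the loop, with one "?" placeholder per column.
$placeholders = implode(',', array_fill(0, 31, '?'));
$stmt = $bdd->prepare("INSERT INTO lbppan_ticket_reglements (<my31Columns>) VALUES ($placeholders)");

while (($data = fgetcsv($handle, 0, ',')) !== FALSE) {
    // Values are bound here, so quotes or commas in the CSV can no longer break the SQL.
    $stmt->execute(array_slice($data, 0, 31));
}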
I'm going to explain my goal here as best I can. Everything I've searched for online hasn't been relevant enough for me to get an idea of how to do this.
First off, this is a PHP assignment where we have to load CSV files into a MySQL database.
Now, each table (4 in total) has the exact same fields. What I am trying to accomplish is using a foreach loop that populates each table with the information from its CSV file. I know I can do this by having a separate while loop for each table and CSV file, but I'm trying to go above the requirements and learn more about PHP. Here is my code for what I'm trying to accomplish:
$files = glob('*.txt'); // .txt required extension
foreach ($files as $file) {
    if (($handle = fopen($file, "r")) !== FALSE) {
        while (($data = fgetcsv($handle, 4048, ",")) !== FALSE) {
            echo $data[0]; // making sure data is correct
            $import = "INSERT INTO".basename($file)."(id,itemName,price) VALUES('$data[0]','$data[1]','$data[2]')";
            multi_query($import) or die (mysql_error());
        }
        fclose($handle);
    }
    else {
        echo "Could not open file: " . $file;
    }
}
Each CSV file contains the id, itemName and price. Hopefully this is understandable enough. Thank you
The way you are importing data into MySQL is OK for a small volume of data. However, if you are importing huge volumes (thousands of rows), the best way is to import it directly into MySQL using LOAD DATA INFILE. For example:
LOAD DATA LOCAL INFILE '/path/to/your_file.csv'
INTO TABLE your_table_name
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(id, itemName, price)
That's a smarter way to import your CSV data :)
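If you need to run that from PHP rather than from the mysql client, a rough sketch with mysqli might look like this (connection details are placeholders, and LOAD DATA LOCAL must be allowed by the server's local_infile setting):

$conn = mysqli_init();
mysqli_options($conn, MYSQLI_OPT_LOCAL_INFILE, true); // allow LOAD DATA LOCAL on the client side
mysqli_real_connect($conn, 'localhost', 'user', 'password', 'mydb');

$sql = "LOAD DATA LOCAL INFILE '/path/to/your_file.csv'
        INTO TABLE your_table_name
        FIELDS TERMINATED BY ','
        ENCLOSED BY '\"'
        LINES TERMINATED BY '\\n'
        (id, itemName, price)";

mysqli_query($conn, $sql) or die(mysqli_error($conn));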
I found and followed the directions contained within this StackOverflow thread: Update MySql Table from CSV using PHP
I've got an error somewhere that I'm unable to detect. I think there's a problem with my query, which works fine in actual MySQL but doesn't seem to quite translate to PHP.
In short, I'm trying to UPDATE the value of several rows within a single table (catalog_product_entity_varchar) with CSV column $data[1], but only where certain SKUs are concerned AND attribute_id = 523 AND entity_id matches $data[0] from my CSV. Here's my code (actual password/username, etc., obviously removed):
$con = mysqli_connect("localhost","username","password","some_db");
if (!$con) {
    die('Could not connect: ' . mysql_error());
}

if (($file = fopen("upload.csv", "r")) !== FALSE) {
    while (($data = fgetcsv($file)) !== FALSE) {
        $sql = "UPDATE catalog_product_entity_varchar
                JOIN catalog_product_flat_1
                    ON catalog_product_flat_1.entity_id = catalog_product_entity_varchar.entity_id
                SET catalog_product_entity_varchar.value = '{$data[1]}'
                WHERE catalog_product_entity_varchar.entity_id = '{$data[0]}'
                    AND catalog_product_entity_varchar.attribute_id = 523
                    AND (catalog_product_flat_1.sku LIKE '%PR%'
                        OR catalog_product_flat_1.sku LIKE '%PT%'
                        OR catalog_product_flat_1.sku LIKE '%PF%')";
        if (mysql_query($con,$sql)) {
            echo "Updated!";
        } else {
            echo "Error updating " . mysql_error();
        }
    }
}
fclose($file);
It simply returns "Error updating" for every line of the spreadsheet. This query, when run directly in MySQL (without the PHP) and modified to have actual values instead of $data[1] or $data[0], works just fine. What am I missing?
If you're unclear about what I'm trying to achieve, I posted this question yesterday (trying to do it via pure MySQL) and there's more context here: https://stackoverflow.com/questions/21170245/updating-a-joined-table-in-mysql-from-a-csv
Wow.
So I feel stupid. Apparently mixing mysqli_connect and mysql_query doesn't work very well. Adding the "i" to the "mysql" of mysql_query solved it. Thanks for looking everyone!
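For anyone hitting the same mix-up later, a minimal sketch of the corrected call, switched to a mysqli prepared statement with bound parameters (the binding part goes beyond the original fix and is just a suggested hardening):

// mysqli_connect() must be paired with mysqli_* calls, not mysql_*.
$stmt = mysqli_prepare($con,
    "UPDATE catalog_product_entity_varchar
     JOIN catalog_product_flat_1
         ON catalog_product_flat_1.entity_id = catalog_product_entity_varchar.entity_id
     SET catalog_product_entity_varchar.value = ?
     WHERE catalog_product_entity_varchar.entity_id = ?
         AND catalog_product_entity_varchar.attribute_id = 523
         AND (catalog_product_flat_1.sku LIKE '%PR%'
             OR catalog_product_flat_1.sku LIKE '%PT%'
             OR catalog_product_flat_1.sku LIKE '%PF%')");

while (($data = fgetcsv($file)) !== FALSE) {
    mysqli_stmt_bind_param($stmt, 'ss', $data[1], $data[0]);
    mysqli_stmt_execute($stmt) or die(mysqli_error($con));
}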
I have a csv file that has 3.5 million codes in it.
I should point out that this is only EVER going to be run this once.
The CSV looks like:
age9tlg,
rigfh34,
...
Here is my code:
ini_set('max_execution_time', 600);
ini_set("memory_limit", "512M");

$file_handle = fopen("Weekly.csv", "r");
while (!feof($file_handle)) {
    $line_of_text = fgetcsv($file_handle);
    if (is_array($line_of_text))
        foreach ($line_of_text as $col) {
            if (!empty($col)) {
                mysql_query("insert into `action_6_weekly` Values('$col', '')") or die(mysql_error());
            }
        }
    else {
        if (!empty($line_of_text)) {
            mysql_query("insert into `action_6_weekly` Values('$line_of_text', '')") or die(mysql_error());
        }
    }
}
fclose($file_handle);
Is this code going to die part way through on me?
Will my memory and max execution time be high enough?
NB:
This code will be run on my localhost, and the database is on the same PC, so latency is not an issue.
Update:
Here is another possible implementation.
This one does bulk inserts of 2000 records at a time:
$file_handle = fopen("Weekly.csv", "r");
$i = 0;
$vals = array();
while (!feof($file_handle)) {
    $line_of_text = fgetcsv($file_handle);
    if (is_array($line_of_text))
        foreach ($line_of_text as $col) {
            if (!empty($col)) {
                if ($i < 2000) {
                    $vals[] = "('$col', '')";
                    $i++;
                } else {
                    $vals = implode(', ', $vals);
                    mysql_query("insert into `action_6_weekly` Values $vals") or die(mysql_error());
                    $vals = array();
                    $i = 0;
                }
            }
        }
    else {
        if (!empty($line_of_text)) {
            if ($i < 2000) {
                $vals[] = "('$line_of_text', '')";
                $i++;
            } else {
                $vals = implode(', ', $vals);
                mysql_query("insert into `action_6_weekly` Values $vals") or die(mysql_error());
                $vals = array();
                $i = 0;
            }
        }
    }
}
fclose($file_handle);
If I were to use this method, what is the highest value I could set it to insert at once?
Update 2
So, I've found I can use
LOAD DATA LOCAL INFILE 'C:\\xampp\\htdocs\\weekly.csv'
INTO TABLE `action_6_weekly`
FIELDS TERMINATED BY ';' ENCLOSED BY '"' ESCAPED BY '\\'
LINES TERMINATED BY ','
(`code`)
but the issue now is that I was wrong about the CSV format;
it is actually 4 codes and then a line break, so:
fhroflg,qporlfg,vcalpfx,rplfigc,
vapworf,flofigx,apqoeei,clxosrc,
...
So I need to be able to specify two LINES TERMINATED BY delimiters.
This question has been branched out to Here.
Update 3
I set it to do bulk inserts of 20k rows, using:
while (!feof($file_handle)) {
    $val[] = fgetcsv($file_handle);
    $i++;
    if ($i == 20000) {
        //do insert
        //set $i = 0;
        //$val = array();
    }
}
//do insert (for the last few rows that don't reach 20k)
But it dies at this point, because for some reason $val contains 75k rows. Any idea why?
Note that the above code is simplified.
I doubt this will be the popular answer, but I would have your PHP application run mysqlimport on the CSV file. Surely it is optimized far beyond anything you will do in PHP.
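A rough sketch of what that could look like from PHP (database name, credentials, and file path are placeholders; mysqlimport takes the target table name from the file name, so the CSV here would need to be named action_6_weekly.csv):

// Shell out to mysqlimport, which is built for exactly this kind of bulk load.
$cmd = 'mysqlimport --local'
     . ' --fields-terminated-by=' . escapeshellarg(',')
     . ' --user=someuser --password=somepass'
     . ' mydb ' . escapeshellarg('/path/to/action_6_weekly.csv');

exec($cmd, $output, $exitCode);
if ($exitCode !== 0) {
    die('mysqlimport failed: ' . implode("\n", $output));
}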
is this code going to die part way through on me? will my memory and max execution time be high enough?
Why don't you try and find out?
You can adjust both the memory (memory_limit) and execution time (max_execution_time) limits, so if you really have to use that, it shouldn't be a problem.
Note that MySQL supports delayed and multiple row insertion:
INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);
http://dev.mysql.com/doc/refman/5.1/en/insert.html
Make sure there are no indexes on your table, as indexes will slow down inserts (add the indexes after you've done all the inserts; there is a sketch of this below).
Rather than creating a new SQL statement on each iteration of the loop, try preparing the SQL statement outside of the loop and executing that prepared statement with parameters inside the loop. Depending on the database this can be heaps faster.
I've done the above when importing a large Access database into Postgres using Perl and got the insert time down to 30 seconds. I would have used an importer tool, but I wanted Perl to enforce some rules while inserting.
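For the index point above, one way to defer index maintenance looks roughly like this; note that DISABLE KEYS only affects non-unique indexes on MyISAM tables and is a no-op on InnoDB, so this assumes a MyISAM table:

// Defer non-unique index maintenance while the bulk insert runs (MyISAM only).
mysql_query("ALTER TABLE `action_6_weekly` DISABLE KEYS") or die(mysql_error());

// ... run the bulk INSERT loop here ...

// Rebuild the deferred indexes in one pass at the end.
mysql_query("ALTER TABLE `action_6_weekly` ENABLE KEYS") or die(mysql_error());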
You should accumulate the values and insert them into the database all at once at the end, or in batches every x records. Doing a single query for each row means 3.5 million SQL queries, each carrying quite some overhead.
Also, you should run this on the command line, where you won't need to worry about execution time limits.
The real answer, though, is evilclown's answer: importing into MySQL from CSV is already a solved problem.
I hope there is not a web client waiting for a response on this. Other than calling the import utility already referenced, I would start this as a job and return feedback to the client almost immediately. Have the insert loop update a percentage-complete somewhere so the end user can check the status, if you absolutely must do it this way.
There are 2 possible ways.
1) Batch the process, then have a scheduled job import the file while updating a status. This way, you can have a page that keeps checking the status and refreshes itself if the status is not yet 100%. Users will have a live update of how much has been done. But for this you need access to the OS to be able to set up the scheduled task, and the task will sit idle when there is nothing to import.
2) Have the page handle 1000 rows (or any N number of rows... you decide), then send some JavaScript to the browser to refresh itself with a new parameter telling the script to handle the next 1000 rows. You can also display a status to the user while this is happening. The only problem is that if the page somehow does not refresh, the import stops. A rough sketch of this chunked approach follows below.
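A minimal sketch of option 2, with every name here (import.php, the offset parameter, the codes table) made up for illustration; each request processes one chunk and then tells the browser to reload with the next offset:

<?php
// import.php?offset=0  -- process CHUNK rows per request, then reload with the next offset.
const CHUNK = 1000;
$offset = isset($_GET['offset']) ? (int) $_GET['offset'] : 0;

$pdo  = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'password');
$stmt = $pdo->prepare("INSERT INTO codes (code) VALUES (?)");

$handle = fopen('Weekly.csv', 'r');
// Skip the rows already handled by previous requests.
for ($i = 0; $i < $offset && fgetcsv($handle) !== false; $i++);

$done = 0;
while ($done < CHUNK && ($row = fgetcsv($handle)) !== false) {
    $stmt->execute(array($row[0]));
    $done++;
}
fclose($handle);

if ($done === CHUNK) {
    $next = $offset + CHUNK;
    echo "Imported rows up to $next...";
    // Reload the page so the next chunk gets processed.
    echo "<script>location.href = 'import.php?offset=$next';</script>";
} else {
    echo "Import finished at row " . ($offset + $done) . ".";
}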