Best practice: Import mySQL file in PHP; split queries

I have a situation where I have to update a web site on a shared hosting provider. The site has a CMS. Uploading the CMS's files is pretty straightforward using FTP.
I also have to import a database file that is big relative to the confines of a PHP script (around 2-3 MB uncompressed). MySQL is closed to outside access, so I have to upload the file using FTP and start a PHP script to import it. Sadly, I do not have access to the mysql command-line client, so I have to parse the file and run the queries using native PHP. I also can't use LOAD DATA INFILE, mysqli_multi_query(), or any kind of interactive front-end like phpMyAdmin; the import needs to run in an automated fashion.
Does anybody know of, or have, an already-coded, simple solution that reliably splits such a file into single queries (there could be multi-line statements) and runs them? I would like to avoid fiddling with it myself because of the many gotchas I'm likely to come across (how to detect whether a field delimiter is part of the data, how to deal with line breaks in memo fields, and so on). There must be a ready-made solution for this.

Here is a memory-friendly function that should be able to split a big file into individual queries without needing to load the whole file at once:
function SplitSQL($file, $delimiter = ';')
{
    set_time_limit(0);
    if (is_file($file) === true)
    {
        $file = fopen($file, 'r');
        if (is_resource($file) === true)
        {
            $query = array();
            while (feof($file) === false)
            {
                $query[] = fgets($file);
                if (preg_match('~' . preg_quote($delimiter, '~') . '\s*$~iS', end($query)) === 1)
                {
                    $query = trim(implode('', $query));
                    if (mysql_query($query) === false)
                    {
                        echo '<h3>ERROR: ' . $query . '</h3>' . "\n";
                    }
                    else
                    {
                        echo '<h3>SUCCESS: ' . $query . '</h3>' . "\n";
                    }
                    while (ob_get_level() > 0)
                    {
                        ob_end_flush();
                    }
                    flush();
                }
                if (is_string($query) === true)
                {
                    $query = array();
                }
            }
            return fclose($file);
        }
    }
    return false;
}
I tested it on a big phpMyAdmin SQL dump and it worked just fine.
Some test data:
CREATE TABLE IF NOT EXISTS "test" (
"id" INTEGER PRIMARY KEY AUTOINCREMENT,
"name" TEXT,
"description" TEXT
);
BEGIN;
INSERT INTO "test" ("name", "description")
VALUES (";;;", "something for you mind; body; soul");
COMMIT;
UPDATE "test"
SET "name" = "; "
WHERE "id" = 1;
And the respective output:
SUCCESS: CREATE TABLE IF NOT EXISTS "test" ( "id" INTEGER PRIMARY KEY AUTOINCREMENT, "name" TEXT, "description" TEXT );
SUCCESS: BEGIN;
SUCCESS: INSERT INTO "test" ("name", "description") VALUES (";;;", "something for you mind; body; soul");
SUCCESS: COMMIT;
SUCCESS: UPDATE "test" SET "name" = "; " WHERE "id" = 1;

Adminer is a single-page alternative to phpMyAdmin: just one PHP script file.
See: http://www.adminer.org/en/

When StackOverflow released their monthly data dump in XML format, I wrote PHP scripts to load it into a MySQL database. I imported about 2.2 gigabytes of XML in a few minutes.
My technique is to prepare() an INSERT statement with parameter placeholders for the column values, then use XMLReader to loop over the XML elements and execute() the prepared query, plugging in values for the parameters. I chose XMLReader because it's a streaming XML reader; it reads the XML input incrementally instead of requiring the whole file to be loaded into memory.
You could also read a CSV file one line at a time with fgetcsv().
If you're importing into InnoDB tables, I recommend starting and committing transactions explicitly, to reduce the overhead of autocommit. I commit every 1000 rows, but this is arbitrary.
I'm not going to post the code here (because of StackOverflow's licensing policy), but in pseudocode:
connect to database
open data file
PREPARE parameterized INSERT statement
begin first transaction
loop, reading lines from data file: {
parse line into individual fields
EXECUTE prepared query, passing data fields as parameters
if ++counter % 1000 == 0,
commit transaction and begin new transaction
}
commit final transaction
Writing this code in PHP is not rocket science, and it runs pretty quickly when one uses prepared statements and explicit transactions. Those features are not available in the outdated mysql PHP extension, but you can use them if you use mysqli or PDO_MySQL.
I also added convenient stuff like error checking, progress reporting, and support for default values when the data file doesn't include one of the fields.
I wrote my code in an abstract PHP class that I subclass for each table I need to load. Each subclass declares the columns it wants to load, and maps them to fields in the XML data file by name (or by position if the data file is CSV).
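As a rough illustration of the pseudocode above, here is a minimal sketch using PDO. It is not the code described in this answer. The function name importRows, the table name items and its columns name and description are all hypothetical, and each row is assumed to arrive as an array of two values from whatever reader you use (XMLReader, fgetcsv(), and so on).

```php
<?php
/**
 * Insert rows through one prepared statement, committing every $batchSize rows.
 * Assumption: a table "items" with columns "name" and "description" exists.
 */
function importRows(PDO $pdo, iterable $rows, int $batchSize = 1000): int
{
    // Make PDO throw on errors so a failed insert aborts the current batch.
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $stmt = $pdo->prepare('INSERT INTO items (name, description) VALUES (?, ?)');

    $count = 0;
    $pdo->beginTransaction();
    try {
        foreach ($rows as $row) {
            $stmt->execute([$row[0], $row[1] ?? null]);
            // Commit every $batchSize rows and start a new transaction.
            if (++$count % $batchSize === 0) {
                $pdo->commit();
                $pdo->beginTransaction();
            }
        }
        // Commit whatever is left in the final, partial batch.
        $pdo->commit();
    } catch (Throwable $e) {
        if ($pdo->inTransaction()) {
            $pdo->rollBack();
        }
        throw $e;
    }
    return $count;
}
```

The batch size of 1000 mirrors the arbitrary figure mentioned above; it is worth measuring a few values against your own data.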

Can't you install phpMyAdmin, gzip the file (which should make it much smaller) and import it using phpMyAdmin?
EDIT: Well, if you can't use phpMyAdmin, you can use the code from phpMyAdmin. I'm not sure about this particular part, but it's generally nicely structured.

Export
The first step is getting the input in a sane format for parsing when you export it. From your question
it appears that you have control over the exporting of this data, but not the importing.
~: mysqldump test --opt --skip-extended-insert | grep -v '^--' | grep . > test.sql
This dumps the test database excluding all comment lines and blank lines into test.sql. It also disables
extended inserts, meaning there is one INSERT statement per line. This will help limit the memory usage
during the import, but at a cost of import speed.
Import
The import script is as simple as this:
<?php
$mysqli = new mysqli('localhost', 'hobodave', 'p4ssw3rd', 'test');
$handle = fopen('test.sql', 'rb');
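if ($handle) {
    while (!feof($handle)) {
        // This assumes you don't have a row that is > 1MB (1000000)
        // which is unlikely given the size of your DB
        // Note that it has a DIRECT effect on your script's memory
        // usage.
        $buffer = stream_get_line($handle, 1000000, ";\n");
        if ($buffer === false) {
            break;
        }
        // Skip the empty chunk left over after the last statement
        if (trim($buffer) === '') {
            continue;
        }
        if (!$mysqli->query($buffer)) {
            echo "Error: ", $mysqli->error, "\n";
        }
    }
    fclose($handle);
}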
echo "Peak MB: ",memory_get_peak_usage(true)/1024/1024;
This will utilize an absurdly low amount of memory as shown below:
daves-macbookpro:~ hobodave$ du -hs test.sql
15M test.sql
daves-macbookpro:~ hobodave$ time php import.php
Peak MB: 1.75
real 2m55.619s
user 0m4.998s
sys 0m4.588s
What that says is you processed a 15MB mysqldump with a peak RAM usage of 1.75 MB in just under 3 minutes.
Alternate Export
If you have a high enough memory_limit and this is too slow, you can try this using the following export:
~: mysqldump test --opt | grep -v '^--' | grep . > test.sql
This will allow extended inserts, which insert multiple rows in a single query. Here are the statistics for the same database:
daves-macbookpro:~ hobodave$ du -hs test.sql
11M test.sql
daves-macbookpro:~ hobodave$ time php import.php
Peak MB: 3.75
real 0m23.878s
user 0m0.110s
sys 0m0.101s
Notice that it uses over 2x the RAM at 3.75 MB, but takes about 1/6th as long. I suggest trying both methods and seeing which suits your needs.
Edit:
I was unable to get a newline to appear literally in any mysqldump output using any of CHAR, VARCHAR, BINARY, VARBINARY, and BLOB field types. If you do have BLOB/BINARY fields though then please use the following just in case:
~: mysqldump5 test --hex-blob --opt | grep -v '^--' | grep . > test.sql

Can you use LOAD DATA INFILE?
If you format your db dump file using SELECT INTO OUTFILE, this should be exactly what you need. No reason to have PHP parse anything.

I ran into the same problem. I solved it using a regular expression:
function splitQueryText($query) {
// the regex needs a trailing semicolon
$query = trim($query);
if (substr($query, -1) != ";")
$query .= ";";
// i spent 3 days figuring out this line
preg_match_all("/(?>[^;']|(''|(?>'([^']|\\')*[^\\\]')))+;/ixU", $query, $matches, PREG_SET_ORDER);
$querySplit = array();
foreach ($matches as $match) {
// get rid of the trailing semicolon
$querySplit[] = substr($match[0], 0, -1);
}
return $querySplit;
}
$queryList = splitQueryText($inputText);
foreach ($queryList as $query) {
$result = mysql_query($query);
}

Already answered: Loading .sql files from within PHP
Also:
http://webxadmin.free.fr/article/import-huge-mysql-dumps-using-php-only-342.php
http://www.phpbuilder.com/board/showthread.php?t=10323180
http://forums.tizag.com/archive/index.php?t-3581.html

Splitting a query cannot be reliably done without parsing. Here is valid SQL that would be impossible to split correctly with a regular expression.
SELECT ";"; SELECT ";\"; a;";
SELECT ";
abc";
I wrote a small SqlFormatter class in PHP that includes a query tokenizer. I added a splitQuery method to it that splits all queries (including the above example) reliably.
https://github.com/jdorn/sql-formatter/blob/master/SqlFormatter.php
You can remove the format and highlight methods if you don't need them.
One downside is that it requires the whole sql string to be in memory, which could be a problem if you're working with huge sql files. I'm sure with a little bit of tinkering, you could make the getNextToken method work on a file pointer instead.
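To make the point about parsing concrete, here is a minimal sketch of a quote-aware splitter. It is not the SqlFormatter tokenizer linked above; the function name splitSqlStatements is hypothetical, and the sketch assumes a single-character delimiter, no SQL comments and no DELIMITER directives. It walks the string one character at a time and only treats the delimiter as a statement boundary when it is outside quotes.

```php
<?php
/**
 * Split an SQL string on $delimiter, ignoring delimiters inside
 * single-quoted, double-quoted or backtick-quoted sections.
 * Assumptions: one-character delimiter, no comments, no DELIMITER directive.
 */
function splitSqlStatements(string $sql, string $delimiter = ';'): array
{
    $statements = [];
    $current = '';
    $quote = null; // the quote character currently open, or null
    $length = strlen($sql);

    for ($i = 0; $i < $length; $i++) {
        $char = $sql[$i];

        if ($quote !== null) {
            $current .= $char;
            if ($char === '\\' && $quote !== '`' && $i + 1 < $length) {
                // Backslash escape: copy the next character verbatim.
                $current .= $sql[++$i];
            } elseif ($char === $quote) {
                if ($i + 1 < $length && $sql[$i + 1] === $quote) {
                    // A doubled quote is an escaped quote, not the end.
                    $current .= $sql[++$i];
                } else {
                    $quote = null;
                }
            }
            continue;
        }

        if ($char === "'" || $char === '"' || $char === '`') {
            $quote = $char;
            $current .= $char;
        } elseif ($char === $delimiter) {
            $statement = trim($current);
            if ($statement !== '') {
                $statements[] = $statement;
            }
            $current = '';
        } else {
            $current .= $char;
        }
    }

    // Keep a trailing statement that has no final delimiter.
    $statement = trim($current);
    if ($statement !== '') {
        $statements[] = $statement;
    }
    return $statements;
}
```

Applied to the three-statement example above, this returns three entries with the semicolons inside the string literals left untouched. Like the tokenizer, it needs the whole SQL string in memory.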

First of all, thanks for this topic. It saved me a lot of time :)
Let me make a small fix to your code.
If the dump file contains TRIGGERS or PROCEDURES, it is not enough to look for the ; delimiter.
In that case the SQL may contain a DELIMITER [something] directive, which says that statements end with [something] rather than with ;. For example, a section of xxx.sql:
DELIMITER //
CREATE TRIGGER `mytrigger` BEFORE INSERT ON `mytable`
FOR EACH ROW BEGIN
SET NEW.`create_time` = NOW();
END
//
DELIMITER ;
So first we need a flag to detect that a query does not end with ;
and we have to drop the unwanted chunks, because mysql_query does not need the delimiter
(the end of the string is the delimiter),
so mysql_query needs something like this:
CREATE TRIGGER `mytrigger` BEFORE INSERT ON `mytable`
FOR EACH ROW BEGIN
SET NEW.`create_time` = NOW();
END;
With a little work, here is the fixed code:
function SplitSQL($file, $delimiter = ';')
{
    set_time_limit(0);
    $matches = array();
    $otherDelimiter = false;
    if (is_file($file) === true) {
        $file = fopen($file, 'r');
        if (is_resource($file) === true) {
            $query = array();
            while (feof($file) === false) {
                $query[] = fgets($file);
                if (preg_match('~' . preg_quote('delimiter', '~') . '\s*([^\s]+)$~iS', end($query), $matches) === 1) {
                    // DELIMITER directive detected; this line is not part of the SQL query
                    array_pop($query);
                    $otherDelimiter = ($matches[1] != $delimiter);
                    if (!$otherDelimiter) {
                        // Back to the default delimiter: drop the previous line (which should be
                        // the non-default delimiter) and close the statement
                        array_pop($query);
                        $query[] = $delimiter;
                    }
                }
                if (!$otherDelimiter && preg_match('~' . preg_quote($delimiter, '~') . '\s*$~iS', end($query)) === 1) {
                    $query = trim(implode('', $query));
                    if (mysql_query($query) === false) {
                        echo '<h3>ERROR: ' . $query . '</h3>' . "\n";
                    } else {
                        echo '<h3>SUCCESS: ' . $query . '</h3>' . "\n";
                    }
                    while (ob_get_level() > 0) {
                        ob_end_flush();
                    }
                    flush();
                }
                if (is_string($query) === true) {
                    $query = array();
                }
            }
            return fclose($file);
        }
    }
    return false;
}
I hope this helps somebody too.
Have a nice day!

http://www.ozerov.de/bigdump/ was very useful for me when importing a 200+ MB SQL file.
Note:
The SQL file must already be present on the server for the process to complete without issues.

You can use phpMyAdmin to import the file. Even if it is huge, just configure the UploadDir directory, upload the file there, and choose it on the phpMyAdmin import page. Once processing gets close to the PHP limits, phpMyAdmin interrupts the import and shows the import page again with predefined values indicating where to continue.

What do you think about:
system("cat xxx.sql | mysql -u username database");

Related

PHP: Export large amount of data to CSV or Excel from custom tables in Drupal 8

We have 2 million encrypted records in a table to export. We are using Drupal 8, but we cannot export the data through custom views or the webform export because the sensitive data is encrypted. So we have to write a custom function to export the data to CSV or Excel. But it throws an "Allowed Memory Exhausted" error whenever we try to export, because of the large amount of data.
It seems the best option is to load the data in smaller chunks and append them to the same sheet. How can we achieve this? Any idea for doing it in PHP or Drupal 8 is welcome.

Exporting to CSV is by far the simpler operation. There are a couple of ways to do this.
1. You could always use mysqldump with text delimiters to avoid PHP memory constraints:
mysqldump -u YOUR_USERNAME -p -t DATABASE_NAME TABLE_NAME
--fields-terminated-by=","
--fields-optionally-enclosed-by="\""
--fields-escaped-by="\""
--lines-terminated-by="\r\n"
--tab /PATH/TO/TARGET_DIR
Line breaks added for readability. By default, mysqldump also generates a .sql file with DROP/CREATE TABLE statements. The -t option skips that.
2. You can make a MySQL query and define INTO OUTFILE with the appropriate delimiters to format your data as CSV and save it into a file:
SELECT * FROM `db_name`.`table_name`
INTO OUTFILE 'path_to_folder/table_dump.csv'
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '"'
LINES TERMINATED BY '\r\n';
If you run this on the command line, you can probably get away with a single call without the need to batch it (subject to your server specs and MySQL memory config).
If you do need to batch, then add something like LIMIT 0, 100000 where 100000 is whatever is a good result set size, and adapt your filename to match: table_dump_100000.csv etc. Merging the resulting CSV dumps into one file should be a simple operation.
3. If you do want to run this over PHP, then you most likely have to batch it. Basic steps:
A loop with for($i = 0; $i <= $max_rows; $i += $incr) where $incr is the batch size. In the loop:
Make MySQL query with variables used in the LIMIT clause; as in LIMIT $i, $incr.
Write the rows with fputcsv into your target file. Define your handle before the loop.
The above is more of a homework assignment than an attempt to provide ready code. Get started and ask again (with code shown). Whatever you do, make sure the data variables used for each batch iteration are reused or cleared to prevent massive memory usage buildup.
You can up your script's memory limit with ini_set('memory_limit', '2048M'); (or whatever your server can handle). If you run into max execution time, set_time_limit(600) (10 min; or whatever seems enough) at the start of your script.
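The batching steps above can be sketched roughly as follows. The function name exportInBatches and the $fetchBatch callback are hypothetical: the callback is assumed to take an offset and a limit and return an array of rows, for example by running a query with a LIMIT clause. Only one batch is held in memory at a time, and the loop stops as soon as a batch comes back shorter than the batch size.

```php
<?php
/**
 * Write rows to an open CSV handle one batch at a time.
 * $fetchBatch is a hypothetical callback: fn(int $offset, int $limit): array
 * returning at most $limit rows, each row being an array of scalar values.
 */
function exportInBatches(callable $fetchBatch, $handle, int $batchSize = 10000): int
{
    $written = 0;
    for ($offset = 0; ; $offset += $batchSize) {
        // $rows is overwritten on each pass, so earlier batches can be freed.
        $rows = $fetchBatch($offset, $batchSize);
        foreach ($rows as $row) {
            fputcsv($handle, $row, ',', '"', '\\');
            $written++;
        }
        // A short (or empty) batch means there is nothing left to read.
        if (count($rows) < $batchSize) {
            break;
        }
    }
    return $written;
}
```

Open the target file once with fopen($path, 'w') before calling the function and close it afterwards. With large offsets, LIMIT queries tend to slow down, so paging on an indexed id column inside the callback may be a better fit.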

Never tried with 2 million records.
But it works with a few hundred thousand, using drush and a script similar to:
<?php
// php -d memory_limit=-1 vendor/bin/drush php:script export_nodes.php
// options
$type = 'the_type';
$csv_file_name = '/path/to/csv_file.csv';
$delimiter = '"';
$separator = ',';
// fields
$fields = [
'nid',
'title',
'field_one',
'field_two',
'field_three',
];
// header
$header = '';
foreach ($fields as $field) {
$header = $header . $delimiter . $field . $delimiter . $separator;
}
$header = $header . PHP_EOL;
file_put_contents ($csv_file_name, $header, FILE_APPEND);
unset ($header);
// get nodes
$nodes = \Drupal::entityTypeManager()
->getStorage('node')
->loadByProperties([
'type' => $type,
]);
// loop nodes
foreach ($nodes as $node) {
$line = '';
// loop fields
foreach ($fields as $field) {
$field_value_array = $node->get($field)->getValue();
if (empty ($field_value_array[0]['value'])) {
$field_value = '';
}
else {
$field_value = $field_value_array[0]['value'];
}
$line = $line . $delimiter . $field_value . $delimiter . $separator;
}
unset ($field_value_array);
unset ($field_value);
// break line
$line = $line . PHP_EOL;
// write line
file_put_contents ($csv_file_name, $line, FILE_APPEND);
unset ($line);
}
unset ($nodes);

PHP looping through csv file and returning incorrect row

I'm simply trying to use the URL query string as a lookup code in a CSV table. This works 9 times out of 10, but on occasion it returns the wrong line (usually a few lines below the correct one).
csv looks something like this
taskcode1, info1, info1, info1
taskcode2, info2, info2, info2
taskcode3, info3, info3, info3
The problem is that sometimes (around 1/10 times so far), a given url query of taskcode1 will actually return line info3.
This csv file is being read concurrently by more than one user. Could the problem be stemming from simultaneously reading? I know there can be issues for writing, and a flock on the file may be necessary. Here's the actual code in my php script. Thank you for any advice.
Notice that as soon as the task code is found, $this_taskcode == $taskcode, I break the while loop.
//get query from request and look up task configuration
$csv_file_path = "tasks.csv";
$taskcode = $_SERVER['QUERY_STRING'];
//open csv file and find taskcode
$fid = fopen($csv_file_path, 'r');
//loop through each line of csv until taskcode is found, then save the whole line as $hit
while (($line = fgetcsv($fid)) !== FALSE){
$this_taskcode = $line[0];
if ($this_taskcode == $taskcode){
$hit = $line;
break;
};
}
fclose($fid);
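One thing worth ruling out, offered as a guess rather than a diagnosis: the loose == comparison treats numeric-looking strings as numbers, so codes such as "10" and "1e1" compare as equal. The sketch below uses a strict comparison, trims the stored code, takes a shared lock while reading, and returns null when nothing matches. The function name findTask is hypothetical.

```php
<?php
/**
 * Return the first CSV row whose first column is exactly $taskcode,
 * or null if no row matches or the file cannot be opened.
 */
function findTask(string $csvPath, string $taskcode): ?array
{
    $handle = fopen($csvPath, 'r');
    if ($handle === false) {
        return null;
    }
    $hit = null;
    // Shared lock: many readers may hold it, a writer with LOCK_EX must wait.
    flock($handle, LOCK_SH);
    while (($line = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        // Strict comparison avoids numeric-string surprises with ==.
        if (isset($line[0]) && trim($line[0]) === $taskcode) {
            $hit = $line;
            break;
        }
    }
    flock($handle, LOCK_UN);
    fclose($handle);
    return $hit;
}
```

Returning null explicitly also avoids reusing a stale $hit when the code is not found.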

Handle Large File with PHP

I have a file with the size of around 10 GB or more. The file contains only numbers ranging from 1 to 10 on each line and nothing else. Now the task is to read the data[numbers] from the file and then sort the numbers in ascending or descending order and create a new file with the sorted numbers.
Can anyone of you please help me with the answer?

I'm assuming this is some kind of homework, and the goal is to sort more data than you can hold in RAM?
Since you only have the numbers 1-10, this is not that complicated a task. Just open your input file and count how many occurrences of each number you have. After that you can write a simple loop that writes the values into another file. The following example is pretty self-explanatory.
$inFile = '/path/to/input/file';
$outFile = '/path/to/output/file';
$input = fopen($inFile, 'r');
if ($input === false) {
    throw new Exception('Unable to open: ' . $inFile);
}
// $map is an array with keys 1..10, all values initialised to 0
$map = array_fill(1, 10, 0);
// Read the file line by line and count how many of each number there are
while (($line = fgets($input)) !== false) {
    $int = (int) $line;
    // Ignore blank or out-of-range lines instead of creating new keys
    if (isset($map[$int])) {
        $map[$int]++;
    }
}
fclose($input);
$output = fopen($outFile, 'w');
if ($output === false) {
    throw new Exception('Unable to open: ' . $outFile);
}
/*
 * Reverse the array if you need descending instead of ascending order.
 * Pass true as the second argument so the numeric keys are preserved.
 */
//$map = array_reverse($map, true);
// Write the values into the output file
foreach ($map as $number => $count) {
    $string = ((string) $number) . PHP_EOL;
    for ($i = 0; $i < $count; $i++) {
        fwrite($output, $string);
    }
}
fclose($output);
Since you are dealing with huge files, you should also check the script execution time limit of your PHP environment. The example above will take a VERY long time for 10 GB+ files, but since I didn't see any constraints on execution time or performance in your question, I'm assuming that is OK.

I had a similar issue before. Trying to manipulate such a large file ended up being a huge drain on resources, and the script couldn't cope. The easiest solution I found was to import the file into a MySQL database using LOAD DATA INFILE, a fast bulk-loading statement:
http://dev.mysql.com/doc/refman/5.1/en/load-data.html
Once it's in you should be able to manipulate the data.
Alternatively, you could just read the file line by line while outputting the result into another file line by line with the sorted numbers. Not too sure how well this would work though.
Have you had any previous attempts at it or are you just after a possible method of doing it?

If that's all, you don't need PHP (if you have a Linux machine at hand):
sort -n file > file_sorted-asc
sort -nr file > file_sorted-desc
Edit: OK, here's your solution in PHP (if you have a Linux machine at hand):
<?php
// Sort ascending
`sort -n file > file_sorted-asc`;
// Sort descending
`sort -nr file > file_sorted-desc`;
?>
:)

php fgetcsv multiple lines not only one or all

I want to read very big CSV files and insert them into a database. That already works:
if(($handleF = fopen($path."\\".$file, 'r')) !== false){
$i = 1;
// loop through the file line-by-line
while(($dataRow = fgetcsv($handleF,0,";")) !== false) {
// Only start at the startRow, otherwise skip the row.
if($i >= $startRow){
// Check if to use headers
if($lookAtHeaders == 1 && $i == $startRow){
$this->createUberschriften( array_map(array($this, "convert"), $dataRow ) );
} else {
$dataRow = array_map(array($this, "convert"), $dataRow );
$data = $this->changeMapping($dataRow, $startCol);
$this->executeInsert($data, $tableFields);
}
unset($dataRow);
}
$i++;
}
fclose($handleF);
}
My problem with this solution is that it's very slow. But the files are too big to load directly into memory... So I want to ask whether there is a possibility to read, for example, 10 lines at a time into the $dataRow array, instead of only one or all of them.
I want to get a better balance between memory and performance.
Do you understand what I mean? Thanks for your help.
Greetz
V
EDIT:
OK, I still have to try to find a solution for the MSSQL database. My solution was to stack the data and then make a multi-row MSSQL insert:
while(($dataRow = fgetcsv($handleF,0,";")) !== false) {
// Only start at the startRow, otherwise skip the row.
if($i >= $startRow){
// Check if to use headers
if($lookAtHeaders == 1 && $i == $startRow){
$this->createUberschriften( array_map(array($this, "convert"), $dataRow ) );
} else {
$dataRow = array_map(array($this, "convert"), $dataRow );
$data = $this->changeMapping($dataRow, $startCol);
$this->setCurrentRow($i);
if(count($dataStack) > 210){
array_push($dataStack, $data);
#echo '<pre>', print_r($dataStack), '</pre>';
$this->executeInsert($dataStack, $tableFields, true);
// reset the stack
unset($dataStack);
$dataStack = array();
} else {
array_push($dataStack, $data);
}
unset($data);
}
$i++;
unset($dataRow);
}
}
Finally, I have to loop over the stack in the method "executeInsert" and build a multi-row insert, to create a query like this:
INSERT INTO [myTable] (field1, field2) VALUES ('data1', 'data2'),('data2', 'data3')...
That works much better. I still have to find the best balance, but for that I only have to change the value '210' in the code above. I hope that helps everybody with a similar problem.
Attention: Don't forget to execute the method "executeInsert" again after reading the complete file, because there may still be some data in the stack, and the method is only executed when the stack reaches the size of 210.
Greetz
V
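For readers who want to try the stacking idea, here is a minimal sketch that is not the "executeInsert" method from the post. The helper names chunkRows and buildMultiInsert are hypothetical. The first groups rows into fixed-size chunks and also yields the final, partial chunk, which covers the reminder above about leftover data. The second builds a multi-row INSERT with ? placeholders rather than quoted values, on the assumption that the values are bound through a prepared statement.

```php
<?php
/**
 * Yield rows in chunks of $size; the last chunk may be smaller.
 */
function chunkRows(iterable $rows, int $size): Generator
{
    $chunk = [];
    foreach ($rows as $row) {
        $chunk[] = $row;
        if (count($chunk) === $size) {
            yield $chunk;
            $chunk = [];
        }
    }
    // Flush whatever is left so no rows are lost at end of file.
    if ($chunk !== []) {
        yield $chunk;
    }
}

/**
 * Build "INSERT INTO t (a, b) VALUES (?, ?), (?, ?)" for $rowCount rows.
 * Table and column names are inserted as given, so they must be trusted.
 */
function buildMultiInsert(string $table, array $columns, int $rowCount): string
{
    if ($rowCount < 1 || $columns === []) {
        throw new InvalidArgumentException('Need at least one column and one row.');
    }
    $rowPlaceholders = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';
    return sprintf(
        'INSERT INTO %s (%s) VALUES %s',
        $table,
        implode(', ', $columns),
        implode(', ', array_fill(0, $rowCount, $rowPlaceholders))
    );
}
```

For each chunk, the statement would be prepared with count($chunk) as the row count and executed with the chunk's values flattened into one array. Databases limit the number of parameters per statement, so the chunk size has to stay below that limit divided by the number of columns.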

I think your bottleneck is not reading the file, which is just a text file. Your bottleneck is the INSERT into the SQL table.
Try this: comment out the line that actually does the insert and you will see the difference.
I had this same issue in the past, where I did exactly what you are doing: reading a CSV with 5+ million lines and inserting it into a MySQL table. The execution time was 60 hours, which is unrealistic.
My solution was to switch to another database technology. I chose MongoDB, and the execution time was reduced to 5 minutes. MongoDB performs really fast in this scenario and also has a tool called mongoimport that lets you import a CSV file directly from the command line.
Give it a try if the database technology is not a constraint on your side.
Another solution would be to split the huge CSV file into chunks and then run the same PHP script multiple times in parallel, with each instance taking care of the chunks that have a specific prefix or suffix in the filename.
I don't know which OS you are using, but on Unix/Linux there is a command-line tool called split that will do that for you and will also add any prefix or suffix you want to the filenames of the chunks.

PHP Memory Debugging

For one of my projects I need to import a very large text file (~950 MB). I'm using Symfony2 and Doctrine 2 for my project.
My problem is that I get errors like:
Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 24 bytes)
The error even occurs if I increase the memory limit to 1GB.
I tried to analyze the problem using XDebug and KCacheGrind (as part of PHPEdit), but I don't really understand the values :(
I'm looking for a tool or a method (quick and simple, because I don't have much time) to find out why memory is allocated and not freed again.
Edit
To clear some things up here is my code:
$handle = fopen($geonameBasePath . 'allCountries.txt','r');
$i = 0;
$batchSize = 100;
if($handle) {
while (($buffer = fgets($handle,16384)) !== false) {
if( $buffer[0] == '#') //skip comments
continue;
//split parts
$parts = explode("\t",$buffer);
if( $parts[6] != 'P')
continue;
if( $i%$batchSize == 0 ) {
echo 'Flush & Clear' . PHP_EOL;
$em->flush();
$em->clear();
}
$entity = $em->getRepository('MyApplicationBundle:City')->findOneByGeonameId( $parts[0] );
if( $entity !== null) {
$i++;
continue;
}
//create city object
$city = new City();
$city->setGeonameId( $parts[0] );
$city->setName( $parts[1] );
$city->setInternationalName( $parts[2] );
$city->setLatitude($parts[4] );
$city->setLongitude( $parts[5] );
$city->setCountry( $em->getRepository('MyApplicationBundle:Country')->findOneByIsoCode( $parts[8] ) );
$em->persist($city);
unset($city);
unset($entity);
unset($parts);
unset($buffer);
echo $i . PHP_EOL;
$i++;
}
}
fclose($handle);
Things I have tried, but nothing helped:
Adding second parameter to fgets
Increasing memory_limit
Unsetting vars

Increasing memory limit is not going to be enough. When importing files like that, you buffer the reading.
$f = fopen('yourfile', 'r');
while (!feof($f)) {
    $data = fread($f, 4096);
    // Do your stuff using the read $data
}
fclose($f);
Update :
When working with an ORM, you have to understand that nothing is actually inserted in the database until the flush call. Meaning all those objects are stored by the ORM tagged as "to be inserted". Only when the flush call is made, the ORM will check the collection and start inserting.
Solution 1 : Flush often. And clear.
Solution 2 : Don't use the ORM. Go for plain SQL commands. They will take up far less memory than the object + ORM solution.

33554432 bytes is 32 MB.
Change the memory limit in php.ini, for example to 75 MB:
memory_limit = 75M
and restart the server.

Instead of reading the whole file at once, you should read it line by line and process each line as soon as you have read it. Do NOT try to fit EVERYTHING in memory. You will fail. The reason is that while you may be able to put the TEXT file in RAM, you will not be able to also hold the data as PHP objects/variables/whathaveyou at the same time, since PHP itself needs much larger amounts of memory for each of them.
What I instead suggest is
a) read a new line,
b) parse the data in the line
c) create the new object to store in the database
d) go to step a, unsetting the old object first or reusing its memory
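The steps above can be sketched with a generator, so that only one line is in memory at a time. The function name readLines is hypothetical, and the parsing and storing in steps b) and c) are left to the caller's loop body.

```php
<?php
/**
 * Lazily yield the lines of a file, one at a time, without line endings.
 */
function readLines(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException('Unable to open: ' . $path);
    }
    try {
        // fgets() returns false at end of file, which ends the loop.
        while (($line = fgets($handle)) !== false) {
            yield rtrim($line, "\r\n");
        }
    } finally {
        // Runs even if the caller stops iterating early.
        fclose($handle);
    }
}
```

A caller would loop with foreach (readLines($path) as $line), parse the line, store the result, and let the variables be overwritten on the next pass.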
