Importing a large delimited file to a MySQL table - php

I have this large (and oddly formatted) txt file from the USDA's website: the NUT_DATA.txt file.
The problem is that it is almost 27 MB. I was successful in importing a few other smaller files, but my method was using file_get_contents, and it makes sense that an error would be thrown if I try to pull 27+ MB into RAM.
So how can I import this massive file into my MySQL DB without running into a timeout or RAM issue? I've tried just getting one line at a time from the file, but that ran into the timeout issue.
Using PHP 5.2.0.
Here is the old script (the fields in the DB are just numbers because I could not figure out which number represented which nutrient; I found this data very poorly documented. Sorry about the ugliness of the code):
<?php
$file = "NUT_DATA.txt";
$data = split("\n", file_get_contents($file)); // split into lines
$link = mysql_connect("localhost", "username", "password");
mysql_select_db("database", $link);
for ($i = 0, $e = sizeof($data); $i < $e; $i++)
{
    // numeric column names must be backtick-quoted in MySQL
    $sql = "INSERT INTO `USDA` (`1`,`2`,`3`,`4`,`5`,`6`,`7`,`8`,`9`,`10`,`11`,`12`,`13`,`14`,`15`,`16`,`17`) VALUES(";
    $row = split("\^", trim($data[$i])); // split each line on the caret
    for ($j = 0, $k = sizeof($row); $j < $k; $j++) {
        $val = trim($row[$j], '~'); // strip the tildes
        $sql .= ((empty($val)) ? 0 : $val) . ','; // replace empty strings with 0s
    }
    $sql = rtrim($sql, ',') . ");";
    mysql_query($sql) or die(mysql_error()); // query the db
}
echo "Finished inserting data into database.\n";
mysql_close($link);
?>

If you have to use PHP, you can read the file line by line using fopen and fgets:
<?php
$file = "NUT_DATA.txt";
$fh = @fopen($file, "r"); // open the file for reading ('@' suppresses the warning on failure)
$link = mysql_connect("localhost", "username", "password");
mysql_select_db("database", $link);
while (!feof($fh))
{
    $data = fgets($fh, 4096); // read one line from the file
    if ($data === false) break;
    $sql = "INSERT INTO `USDA` (`1`,`2`,`3`,`4`,`5`,`6`,`7`,`8`,`9`,`10`,`11`,`12`,`13`,`14`,`15`,`16`,`17`) VALUES(";
    $row = split("\^", trim($data)); // split the line on the caret
    for ($j = 0, $k = sizeof($row); $j < $k; $j++) {
        $val = trim($row[$j], '~'); // strip the tildes
        $sql .= ((empty($val)) ? 0 : $val) . ','; // replace empty strings with 0s
    }
    $sql = rtrim($sql, ',') . ");";
    mysql_query($sql) or die(mysql_error()); // query the db
}
echo "Finished inserting data into database.\n";
fclose($fh);
mysql_close($link);
?>
Check out the fgets documentation for more info.

Read the file line by line so that you're not loading the entire file into memory. Use
set_time_limit(0);
to avoid having your script time out.
http://php.net/manual/en/function.set-time-limit.php
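A minimal sketch of the two suggestions combined (file name taken from the question):
set_time_limit(0); // 0 = no execution time limit, so the import can run as long as it needs
$fh = fopen("NUT_DATA.txt", "r");
while (($line = fgets($fh, 4096)) !== false) {
    // parse and insert $line here
}
fclose($fh);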

You can increase the amount of memory each script can use by setting this value in php.ini:
memory_limit = 64M
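If you can't edit php.ini, the same limit can usually be raised at runtime instead (subject to server configuration):
ini_set('memory_limit', '64M');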
Having said this: do you have to use PHP? Other scripting languages (like Python) might be more appropriate for this kind of task.

Related

limitation on writing line to .csv or .txt file php

I wrote code to convert bulk data from an Excel file to a .txt or .csv using fwrite in PHP. However, when I fread the .txt file that my data was placed in, the lines came out broken. How can I fix it? Attached here is the highlighted output from the .txt or .csv, and the code is below:
********** my write code **********
for loop here {
    $split_filename_path = fopen($tempfile_path.$split_filename.$total_round.$extension, "w");
    $split_file_txt = $ISBN.$comma.$Title.$comma.$Qty.$comma.$Location.$comma.$outlet;
    fwrite($split_filename_path, $split_file_txt);
}
fclose($split_filename_path);
Then it comes to here:
********** my read code **********
$myfile = fopen($file, "r");
$fread = fread($myfile, filesize($file));
fclose($myfile);
$split = explode("\n", $fread);
$datafields = array('ISBN', 'title', 'qty', 'location', 'outlet');
$insertvalues = array();
foreach ($split as $string) {
    $row = explode(",", $string);
    $questionmarks[] = '(' . $this->placeholder('?', $irow, sizeof($row)) . ')';
    $insertvalues = array_merge($insertvalues, array_values($row));
}
$sql = "INSERT INTO temp_data (" . implode(',', $datafields) . ")
        VALUES " . implode(',', $questionmarks);
function placeholder($text, $irow, $count = 0, $separator = ',') {
    $result = array();
    if ($count > 0) {
        for ($x = 0; $x < $count; $x++) {
            $result[] = $text;
        }
    }
    return implode($separator, $result);
}
With this code, the result that I get is this:
9780794435295,BARBIE & HER SISTERS IN THE GREAT PUPPY ADVENTURE: (supposed to be full :- BARBIE & HER SISTERS IN THE GREAT PUPPY ADVENTURE: A SLIDING TAB BOOK BARBIE MOVIE TIEIN )
Found the answer:
$Title = preg_replace( "/\r|\n/", "", $Title);
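A minimal sketch of applying the same fix to every field before writing (field names taken from the write code above):
$fields = array($ISBN, $Title, $Qty, $Location, $outlet);
foreach ($fields as $k => $f) {
    // strip stray CR/LF so newlines embedded in the Excel cells
    // can't break the CSV lines apart
    $fields[$k] = preg_replace("/\r|\n/", "", $f);
}
fwrite($split_filename_path, implode(",", $fields) . "\n");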

PHP, inserting CSV from external source into MySQL DB

I know this has been asked before; I tried to read previous Q&As on the topic but I'm still stuck. Probably I read too many Q&As and have a bad mix of techniques as a result.
I don't get an error, I just don't get anything in my table. The echo $i is to help debug, and I only ever get a 0 rather than the expected 0 1 2 3 ... N rows.
My DB connection credentials are all fine; I use them all over my site for SELECT statements.
$csv_file = str_getcsv("https://ds.data.jma.go.jp/tcc/tcc/products/gwp/temp/list/csv/year_wld.csv");
$csvfile = fopen($csv_file, 'r');
$theData = fgets($csvfile);
$i = 0;
while (!feof($csvfile))
{
    echo $i;
    $csv_data[] = fgets($csvfile, 1024);
    $csv_array = explode(",", $csv_data[$i]);
    $yrr = $csv_array[0];
    $vals = $csv_array[1];
    $sql1 = "INSERT INTO Table1(Year,Value) VALUES(" . $yrr . "," . $vals . ")";
    $conn->query($sql1);
    $i++;
}
The main problem here is the fact that you are trying to open a text variable as a file:
$csvfile = fopen($csv_file, 'r');
In fact you already have an array from str_getcsv, so your whole code should look like this (if you can read the whole file at once):
$csvFile = array_map('str_getcsv', file("https://ds.data.jma.go.jp/tcc/tcc/products/gwp/temp/list/csv/year_wld.csv"));
array_shift($csvFile); // we remove the headers
$i = 0;
/**
 * Removes all the "*" and "+" symbols, as I assume that you want a float
 * since you are not wrapping the value in quotes in the SQL query
 */
function removeUnwantedChars($string) {
    return preg_replace('/[^0-9\\.\\-]/i', '', $string);
}
foreach ($csvFile as $csvData) {
    echo $i++;
    $yrr = $csvData[0];
    $vals = removeUnwantedChars($csvData[1]);
    $sql1 = "INSERT INTO Table1(Year,Value) VALUES(" . $yrr . "," . $vals . ")";
    $conn->query($sql1);
}
If you cannot read it all at once, then I suggest reading the file line by line:
<?php
$url = "https://ds.data.jma.go.jp/tcc/tcc/products/gwp/temp/list/csv/year_wld.csv";
$fileHandle = fopen($url, "r");
/**
 * Removes all the "*" and "+" symbols
 */
function removeUnwantedChars($string) {
    return preg_replace('/[^0-9\\.\\-]/i', '', $string);
}
$i = 0;
$headersSkipped = false;
while ($csvData = fgetcsv($fileHandle)) {
    if (!$headersSkipped) {
        $headersSkipped = true;
        continue;
    }
    echo $i++;
    $yrr = $csvData[0];
    $vals = removeUnwantedChars($csvData[1]);
    $sql1 = "INSERT INTO Table1(Year,Value) VALUES(" . $yrr . "," . $vals . ")";
    $conn->query($sql1);
}
fclose($fileHandle);
And as @Shadow said above, it is always good to be more verbose: in case query() returns false, output the last error (if you are using PDO, via the errorInfo() function).
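For example, a minimal sketch of that check, assuming $conn is a PDO instance:
$result = $conn->query($sql1);
if ($result === false) {
    // errorInfo() returns array(SQLSTATE, driver error code, driver message)
    $error = $conn->errorInfo();
    echo "Insert failed: [" . $error[0] . "] " . $error[2] . "\n";
}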

Parsing tab delimited txt file PHP fgetcsv hangs / not ending / exiting function

This function is pretty generic; it's intended to parse a txt file that is delimited with tabs. The file I'm trying to parse is the geonames database. It tops out at 1,953,146 results every time, at which point nothing happens at all: no more queries, and it doesn't exit. Counting the lines, I can see there are 8,000,000 lines in the file, so I'm guessing it has stalled. Errors are enabled and no error is returned; memory_limit is set to 2048M and execution time is set to unlimited.
<?php
function table_populate($table, $file, $columns) {
    $handle = fopen($file, "r");
    $lines = count(explode("\n", file_get_contents($file)));
    $i = 0;
    while (($line = fgetcsv($handle, 10000, "\t")) !== false && $i < $lines) {
        if (preg_match('[#]', $line[0])) {
            // do nothing, row is commented out
        } else {
            $row = '';
            $comma = '';
            for ($z = 0; $z < count($line); $z++) {
                $row .= $comma . "'" . $line[$z] . "'";
                $comma = ', ';
            }
            $sql = "INSERT INTO " . $table . " (" . $columns . ") VALUES (" . $row . ")";
        }
        $i++;
    }
    fclose($handle);
    return;
}
?>

Read file lines backwards (fgets) with php

I have a txt file that I want to read backwards. Currently I'm using this:
$fh = fopen('myfile.txt', 'r');
while ($line = fgets($fh)) {
    echo $line . "<br />";
}
This outputs all the lines in my file.
I want to read the lines from bottom to top.
Is there a way to do it?
First way:
$file = file("test.txt");
$file = array_reverse($file);
foreach ($file as $f) {
    echo $f . "<br />";
}
Second Way (a):
To completely reverse a file:
$fl = fopen("some_file.txt", "r");
for ($x_pos = 0, $output = ''; fseek($fl, $x_pos, SEEK_END) !== -1; $x_pos--) {
    $output .= fgetc($fl);
}
fclose($fl);
print_r($output);
Second Way (b):
Of course, you wanted line-by-line reversal...
$fl = fopen("some_file.txt", "r");
for ($x_pos = 0, $ln = 0, $output = array(); fseek($fl, $x_pos, SEEK_END) !== -1; $x_pos--) {
    $char = fgetc($fl);
    if ($char === "\n") {
        // analyse completed line $output[$ln] if need be
        $ln++;
        continue;
    }
    $output[$ln] = $char . ((array_key_exists($ln, $output)) ? $output[$ln] : '');
}
fclose($fl);
print_r($output);
Try something simpler, like this:
print_r(array_reverse(file('myfile.txt')));
Here is my solution for just printing the file backwards. It is quite memory-friendly, and more readable in my opinion.
It goes through the file backwards, counts the characters up to the start of a line (or the start of the file), then reads and prints that many characters as a line, moves the cursor back, and reads the next line the same way...
if ($v = @fopen("PATH_TO_YOUR_FILE", 'r')) { // open the file
    fseek($v, 0, SEEK_END); // move cursor to the end of the file

    /* help functions: */
    // moves cursor one step back if it can - returns true; if it can't - returns false
    function moveOneStepBack(&$f) {
        if (ftell($f) > 0) { fseek($f, -1, SEEK_CUR); return true; }
        else return false;
    }
    // reads $length chars but moves cursor back to where it was before reading
    function readNotSeek(&$f, $length) {
        $r = fread($f, $length);
        fseek($f, -$length, SEEK_CUR);
        return $r;
    }

    /* THE READING+PRINTING ITSELF: */
    while (ftell($v) > 0) { // while there is at least 1 character to read
        $newLine = false;
        $charCounter = 0;
        // line counting
        while (!$newLine && moveOneStepBack($v)) { // not at the start of a line / the file
            if (readNotSeek($v, 1) == "\n") $newLine = true;
            $charCounter++;
        }
        // line reading / printing
        if ($charCounter > 1) { // if there was anything on the line
            if (!$newLine) echo "\n"; // prints missing "\n" before last *printed* line
            echo readNotSeek($v, $charCounter); // prints current line
        }
    }
    fclose($v); // close the file, because we are well-behaved
}
Of course, replace PATH_TO_YOUR_FILE with your own path to your file. '@' is used when opening the file because a warning is raised when the file is not found or can't be opened; if you want to display this warning, just remove the error suppressor '@'.
If the file is not so big you can use file():
$lines = file($file);
for ($i = count($lines) - 1; $i >= 0; $i--) {
    echo $lines[$i] . '<br/>';
}
However, this requires the whole file to be in memory, that's why it is not suited for really large files.
Here's my simple solution, without messing anything up or adding more complex code:
$result = '';
$fh = fopen('myfile.txt', 'r');
while ($line = fgets($fh)) {
    $result = $line . "<br>" . $result;
}
echo $result; // or return $result if you are using it as a function

PHP program will run and echo out nothing

I made a script that reads data from a .xls file and converts it into a .csv, then a script that takes the .csv and puts it in an array, and then a script with a foreach loop that at the end should echo out the end variable, but it echoes out nothing, just a blank page. The file writes okay, that's for sure, but I don't know if the script reads the csv, because if I put an echo after the read it just returns blank.
Here is my code:
<?php
ini_set('memory_limit', '300M');
$username = 'test';

function convert($in) {
    require_once 'Excel/reader.php';
    $excel = new Spreadsheet_Excel_Reader();
    $excel->setOutputEncoding('CP1251');
    $excel->read($in);
    $x = 1;
    $sep = ",";
    ob_start();
    while ($x <= $excel->sheets[0]['numRows']) {
        $y = 1;
        $row = "";
        while ($y <= $excel->sheets[0]['numCols']) {
            $cell = isset($excel->sheets[0]['cells'][$x][$y]) ? $excel->sheets[0]['cells'][$x][$y] : '';
            $row .= ($row == "") ? "\"" . $cell . "\"" : "" . $sep . "\"" . $cell . "\"";
            $y++;
        }
        echo $row . "\n";
        $x++;
    }
    return ob_get_contents();
    ob_end_clean();
}

$csv = convert('usage.xls');
$file = $username . '.csv';
$fh = fopen($file, 'w') or die("Can't open the file");
$stringData = $csv;
fwrite($fh, $stringData);
fclose($fh);

$maxlinelength = 1000;
$fh = fopen($file, 'r');
$firstline = fgetcsv($fh, $maxlinelength);
$cols = count($firstline);
$row = 0;
$inventory = array();
while (($nextline = fgetcsv($fh, $maxlinelength)) !== FALSE)
{
    for ($i = 0; $i < $cols; ++$i)
    {
        $inventory[$firstline[$i]][$row] = $nextline[$i];
    }
    ++$row;
}
fclose($fh);

$arr = $inventory['Category'];
$texts = 0;
$num2 = 0;
foreach ($inventory['Category'] as $key => $value) {
    $val = $value;
    if (is_object($value)) { echo 'true'; }
    if ($value == 'Messages ') {
        $texts++;
    }
}
echo 'You have used ' . $texts . ' text messages';
?>
Once you return, nothing else in the function executes:
return ob_get_contents();
ob_end_clean(); // THIS NEVER HAPPENS
Therefore the output buffer is never cleaned up: it stays active and swallows your later echo, so the page shows no output.
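A minimal sketch of the fix: capture the buffer's contents into a variable, close the buffer, and only then return.
$csvString = ob_get_contents(); // grab what was echoed into the buffer
ob_end_clean();                 // now this actually runs and closes the buffer
return $csvString;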
I see a lot of repetitive useless operations there. Why not simply build an array with the data you're pulling out of the Excel file? You can then write out that array with fputcsv(), instead of building the CSV string yourself.
You then write the CSV out to a file, then read the file back in and process it back into an array. Which begs the question... why? You've already got the raw individual bits of data at the moment you read from the Excel file, so why all the fancy-ish giftwrapping only to tear it all apart again?
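A hedged sketch of that suggestion, assuming the $excel object is laid out as in the question: write each row straight to the file with fputcsv(), which handles the quoting, instead of concatenating quoted strings by hand.
$fh = fopen($username . '.csv', 'w') or die("Can't open the file");
for ($x = 1; $x <= $excel->sheets[0]['numRows']; $x++) {
    $row = array();
    for ($y = 1; $y <= $excel->sheets[0]['numCols']; $y++) {
        $row[] = isset($excel->sheets[0]['cells'][$x][$y]) ? $excel->sheets[0]['cells'][$x][$y] : '';
    }
    fputcsv($fh, $row); // quotes and escapes each field as needed
}
fclose($fh);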
