Read large data from csv file in php [duplicate]

This question already has answers here:
file_get_contents => PHP Fatal error: Allowed memory exhausted
(4 answers)
Closed 3 years ago.
I am reading a CSV file and checking against MySQL whether each record is present in my table. The CSV has about 25,000 records, and when I run my code it displays a "Service Unavailable" error after 2m 10s (onload: 2m 10s).
Here is the code I have:
// set memory limit & execution time
ini_set('memory_limit', '512M');
ini_set('max_execution_time', '180');

// function to read a CSV file into an array of rows
function readCSV($csvFile)
{
    $line_of_text = array();
    $file_handle = fopen($csvFile, 'r');
    while (!feof($file_handle)) {
        set_time_limit(60); // you can enable this if you have a lot of data
        $line_of_text[] = fgetcsv($file_handle, 1024);
    }
    fclose($file_handle);
    return $line_of_text;
}
// Set path to CSV file
$csvFile = 'my_records.csv';
$csv = readCSV($csvFile);
for ($i = 1; $i < count($csv); $i++) {
    $user_email = $csv[$i][1];
    $qry = "SELECT u.user_id, u.user_email_id FROM tbl_user AS u WHERE u.user_email_id = '" . $user_email . "'";
    $result = @mysql_query($qry) or die("Couldn't execute query:" . mysql_error() . ' ' . mysql_errno());
    $rec = @mysql_fetch_row($result);
    if ($rec) {
        echo "Record exists";
    } else {
        echo "Record does not exist";
    }
}
Note: I just want to list the records that do not exist in my table.
Please suggest a solution for this...

An excellent method to deal with large files is located at: https://stackoverflow.com/a/5249971/797620
This method is used at http://www.cuddlycactus.com/knownpasswords/ (page has been taken down) to search through 170+ million passwords in just a few milliseconds.
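The core idea there, as I understand it: keep the lookup file sorted, then binary-search it on disk with fseek() so each lookup only touches a handful of bytes instead of loading the file into memory. A rough sketch of the technique (my own illustration, assuming fixed-length lines padded to the same width, not the linked answer's exact code):

function sortedFileContains($path, $needle, $lineLength)
{
    // $lineLength counts the trailing newline; every line must be padded to it
    $handle = fopen($path, 'r');
    $low  = 0;
    $high = (int) (filesize($path) / $lineLength) - 1;
    $found = false;
    while ($low <= $high) {
        $mid = (int) (($low + $high) / 2);
        fseek($handle, $mid * $lineLength);                // jump straight to a line boundary
        $value = rtrim(fgets($handle, $lineLength + 1));   // strip padding/newline
        if ($value === $needle) { $found = true; break; }
        if ($value < $needle) { $low = $mid + 1; } else { $high = $mid - 1; }
    }
    fclose($handle);
    return $found;
}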

After struggling a lot, I finally found a good solution; maybe it will help others too.
When I tried a 2,367 KB CSV file containing 18,226 rows, the shortest times taken by different PHP scripts were:
(1) the CsvImporter script from the php.net fgetcsv documentation, and
(2) file_get_contents (the approach from "file_get_contents => PHP Fatal error: Allowed memory exhausted")
(1) took 0.92574405670166 seconds
(2) took 0.12543702125549 seconds (as a string) and 0.52903485298157 seconds (split into an array)
Note: these timings do not include inserting into MySQL.
The best solution I found takes 3.0644409656525 seconds in total, including adding to the database and some conditional checks. It processed an 8 MB file in 11 seconds.
The solution is:
$csvInfo = analyse_file($file, 5);
$lineSeperator  = $csvInfo['line_ending']['value'];
$fieldSeperator = $csvInfo['delimiter']['value'];
$columns = getColumns($file);
echo '<br>========Details========<br>';
echo "Line Sep: \t " . $lineSeperator;
echo "<br>Field Sep:\t " . $fieldSeperator;
echo '<br>Columns: '; print_r($columns);
echo '<br>========Details========<br>';
$ext = pathinfo($file, PATHINFO_EXTENSION);
$table = str_replace(' ', '_', basename($file, "." . $ext));
$rslt = table_insert($table, $columns);
if ($rslt) {
    $query = "LOAD DATA LOCAL INFILE '" . $file . "' INTO TABLE $table FIELDS TERMINATED BY '$fieldSeperator' ";
    var_dump(addToDb($query, false));
}
function addToDb($query, $getRec = true) {
    //echo '<br>Query : ' . $query;
    $con = @mysql_connect('localhost', 'root', '');
    @mysql_select_db('rtest', $con);
    $result = mysql_query($query, $con);
    if ($result) {
        if ($getRec) {
            $data = array();
            while ($row = mysql_fetch_assoc($result)) {
                $data[] = $row;
            }
            return $data;
        } else {
            return true;
        }
    } else {
        var_dump(mysql_error());
        return false;
    }
}
function table_insert($table_name, $table_columns) {
    $queryString = "CREATE TABLE " . $table_name . " (";
    $values = '';
    foreach ($table_columns as $column) {
        $values .= (strtolower(str_replace(' ', '_', $column))) . " VARCHAR(2048), ";
    }
    $values = substr($values, 0, strlen($values) - 2); // drop the trailing ", "
    $queryString .= $values . ") ";
    // echo $queryString;
    return addToDb($queryString, false);
}
function getColumns($file) {
    $cols = array();
    if (($handle = fopen($file, 'r')) !== FALSE) {
        while (($row = fgetcsv($handle)) !== FALSE) {
            $cols = $row;
            if (count($cols) > 0) {
                break;
            }
        }
        return $cols;
    } else {
        return false;
    }
}
function analyse_file($file, $capture_limit_in_kb = 10) {
    // capture starting memory usage
    $output['peak_mem']['start'] = memory_get_peak_usage(true);
    // log how much of the file was sampled (in KB)
    $output['read_kb'] = $capture_limit_in_kb;
    // read in a sample of the file
    $fh = fopen($file, 'r');
    $contents = fread($fh, ($capture_limit_in_kb * 1024)); // in KB
    fclose($fh);
    // specify allowed field delimiters
    $delimiters = array(
        'comma'     => ',',
        'semicolon' => ';',
        'tab'       => "\t",
        'pipe'      => '|',
        'colon'     => ':'
    );
    // specify allowed line endings
    $line_endings = array(
        'rn' => "\r\n",
        'n'  => "\n",
        'r'  => "\r",
        'nr' => "\n\r"
    );
    // loop and count each line-ending instance
    foreach ($line_endings as $key => $value) {
        $line_result[$key] = substr_count($contents, $value);
    }
    // sort by largest array value
    asort($line_result);
    // log to output array
    $output['line_ending']['results'] = $line_result;
    $output['line_ending']['count']   = end($line_result);
    $output['line_ending']['key']     = key($line_result);
    $output['line_ending']['value']   = $line_endings[$output['line_ending']['key']];
    $lines = explode($output['line_ending']['value'], $contents);
    // remove the last line of the array, as it may be incomplete
    array_pop($lines);
    // create a string from the complete lines
    $complete_lines = implode(' ', $lines);
    // log statistics to output array
    $output['lines']['count']  = count($lines);
    $output['lines']['length'] = strlen($complete_lines);
    // loop and count each delimiter instance
    foreach ($delimiters as $delimiter_key => $delimiter) {
        $delimiter_result[$delimiter_key] = substr_count($complete_lines, $delimiter);
    }
    // sort by largest array value
    asort($delimiter_result);
    // log statistics to output array with the largest counts as the value
    $output['delimiter']['results'] = $delimiter_result;
    $output['delimiter']['count']   = end($delimiter_result);
    $output['delimiter']['key']     = key($delimiter_result);
    $output['delimiter']['value']   = $delimiters[$output['delimiter']['key']];
    // capture ending memory usage
    $output['peak_mem']['end'] = memory_get_peak_usage(true);
    return $output;
}

Normally, "Service Unavailable" error will come when 500 error occurs.
I think this is coming because of insufficient execution time. Please check your log/browser console, may be you can see 500 error.
First of all,
Keep set_time_limit(60) out of loop.
Do some changes like,
Apply INDEX on user_email_id column, so you can get the rows faster with your select query.
Do not echo message, Keep the output buffer free.
And
I have done these kind of take using Open source program. You can get it here http://sourceforge.net/projects/phpexcelreader/
Try this.
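Putting those suggestions together, here is a minimal sketch (my own, untested against the asker's schema) that streams the CSV one row at a time with fgetcsv, uses a mysqli prepared statement instead of the removed mysql_* API, and collects the missing emails instead of echoing inside the loop:

// run once beforehand: ALTER TABLE tbl_user ADD INDEX idx_user_email (user_email_id);
$mysqli = new mysqli('localhost', 'root', '', 'mydb'); // credentials assumed
$stmt = $mysqli->prepare(
    'SELECT u.user_id FROM tbl_user AS u WHERE u.user_email_id = ?'
);

$missing = array();
$handle  = fopen('my_records.csv', 'r');
fgetcsv($handle); // skip the header row, as the original loop starting at $i=1 did
while (($row = fgetcsv($handle, 1024)) !== false) {
    $email = $row[1];
    $stmt->bind_param('s', $email);
    $stmt->execute();
    $stmt->store_result();
    if ($stmt->num_rows === 0) {
        $missing[] = $email; // record does not exist in the table
    }
    $stmt->free_result();
}
fclose($handle);

print_r($missing); // output once, after the loop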

Related

PHP Memory Exhaustion, inherited code causes error with larger files, do I flush the memory, batch the processing, or increase memory allocation?

The uploader worked fine until the file became larger than 100,000 lines. I didn't write the code, but I want to fix it. I have worked with other languages, but not PHP. I know there are different ways to address the issue, but I am unsure of the best investment of time. Ideally I would like the uploader to accept files of any size. Changing the memory allocation seems to be the quickest fix, but I would expect long-term issues when files outgrow the memory. Flushing the memory and batching the uploads seem to be two sides of the same coin; however, the uploader currently processes a single file and a single upload to the database, and every time a file is uploaded it deletes the previous data and replaces it with data from the file. Specifically, I have been adjusting the CSV uploader, not the XLSX uploader.
I have already tried, unsuccessfully, to allocate additional memory to the program, but it crashed the server and I would prefer not to do that again. I have also attempted to batch the CSV file, but that failed as well.
<?php
class Part {
    public $id;
    public $oem;
    public $part_number;
    public $desc;

    // Assigning the values
    public function __construct($id, $oem, $part_number, $desc) {
        $this->id = $id;
        $this->oem = $oem;
        $this->part_number = $part_number;
        $this->desc = $desc;
    }
}

// imports a single csv file and returns an array of Parts
function importCSVpartfinder($filename, $brand, $root) { // $filename is a dataTable: first row contains dimension labels, second row units, first column the part number
    $handle = fopen($filename, 'r') or die("unable to open file: $filename");
    $contents = fread($handle, filesize($filename));
    fclose($handle);
    $row = explode("\r", $contents);
    $data = array();
    for ($i = 0; $i < sizeof($row); $i++) {
        $columns = explode(",", $row[$i]);
        array_push($data, $columns);
    }
    $all = array(); // array of all Parts
    // I should probably sanitize here
    for ($i = 0; $i < sizeof($data); $i++) {
        if (sizeof($data[$i]) != 1) {
            $id = $data[$i][0];
            $oem = $data[$i][1];
            $part_number = $data[$i][2];
            $desc = $data[$i][3];
            $obj = new Part($id, $oem, $part_number, $desc);
            array_push($all, $obj);
        }
    }
    return $all;
}
// returns a message with # of successes and a list of failures // this is slow with large uploads
function addPartsToDB($data, $connection) { // $data is an array of Parts
    // delete
    $deleteSQL = "DELETE FROM Part_finder WHERE 1";
    $res = $connection->query($deleteSQL);
    if (!$res) {
        echo " Failed to delete Part_finder data, ";
        exit;
    }
    // insert
    $e = 0;
    $s = 0;
    $failures = "";
    $d = "";
    for ($i = 0; $i < sizeof($data); $i++) {
        $d .= "(" . $data[$i]->id . ",'" . $data[$i]->oem . "','" . $data[$i]->part_number . "','" . $data[$i]->desc . "'),";
        $s++;
    }
    $d = substr($d, 0, -1);
    $sqlquery = "INSERT INTO Part_finder (id_part, oem, part_number, description) VALUES $d";
    $res = $connection->query($sqlquery);
    if (!$res) {
        $sqlError = $connection->error;
        return ($s . " items failed to update. Database error. " . $sqlError);
    } else {
        return ($s . " items updated.");
    }
    /*
    for ($i = 0; $i < sizeof($data); $i++) {
        $d = "(" . $data[$i]->id . ",'" . $data[$i]->oem . "','" . $data[$i]->part_number . "','" . $data[$i]->desc . "')";
        $sqlquery = "INSERT INTO Part_finder (id_part, oem, part_number, description) VALUES $d";
        #$res = $connection->query($sqlquery);
        if (!$res) {
            $failures .= $data[$i]->part_number . "\n";
            $e++;
        } else {
            $s++;
        }
    }
    */
    #return $sqlquery;
}
function importXLSXpartfinder($filename, $root) {
    require($root . './plugins/XLSXReader/XLSXReader.php');
    $xlsx = new XLSXReader($filename);
    /* $sheetNames = $xlsx->getSheetNames();
    foreach ($sheetNames as $Name) {
        $sheetName = $Name;
    } */
    $sheet = $xlsx->getSheet("Sheet1");
    $rawData = $sheet->getData();
    #$columnTitles = array_shift($rawData);
    $all = array(); // array of all Parts
    for ($i = 0; $i < sizeof($rawData); $i++) {
        if (sizeof($rawData[$i]) != 1) {
            $id = $rawData[$i][0];
            $oem = $rawData[$i][1];
            $part_number = $rawData[$i][2];
            $desc = $rawData[$i][3];
            $obj = new Part($id, $oem, $part_number, $desc);
            array_push($all, $obj);
        }
    }
    return $all;
}
$filename = $file["partfinder"]["tmp_name"];
if ($file["partfinder"]["size"] > 100000000) {
    echo "File too big " . $file["partfinder"]["size"];
    exit;
}
// $file comes from edit.php
if ($file["partfinder"]["type"] === "text/csv") {
    $a = importCSVpartfinder($filename, $brand, $root);
} elseif ($file["partfinder"]["type"] === "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet") {
    $a = importXLSXpartfinder($filename, $root);
} else {
    var_dump($file["partfinder"]["type"]);
    echo ".xlsx or .csv file types only";
    exit;
}
$b = addPartsToDB($a, $connection);
echo $b;
?>
The memory exhaustion currently occurs on line 25
$columns = explode(",", $row[$i]);
and the error code is
Fatal error: Allowed memory size of 94371840 bytes exhausted (tried to allocate 20480 bytes) in /www/tools/import-csv-partfinder.php on line 25
Ideally I would still like to upload a single file to update the database, and I would need to alter additional programs to be able to upload multiple files or to keep the database from wiping itself on every upload. Unfortunately I am not able to contact the person who wrote the programs originally, so I am pretty much on my own to figure this out.
I'd suggest using a generator to read your CSV rather than reading the whole thing into an array (actually two arrays, the way it's currently written). This way you only hold one line in memory at a time.
function importCSVpartfinder($filename = '') {
    $handle = fopen($filename, 'r');
    while (($row = fgetcsv($handle)) !== false) {
        yield $row;
    }
    fclose($handle);
}
Then for your database insert function, use a prepared statement and iterate the generator, executing the statement for each row in the file.
function addPartsToDB($parts, $connection) {
    $connection->query('DELETE FROM Part_finder');
    $statement = $connection->prepare('INSERT INTO Part_finder
                                       (id_part, oem, part_number, description)
                                       VALUES (?, ?, ?, ?)');
    foreach ($parts as $part) {
        $statement->execute($part);
    }
}
These examples are simplified just to show the concept. You should be able to adapt them to your exact needs, but they are working examples as written.
addPartsToDB(importCSVpartfinder($filename), $connection);
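If one execute() per row turns out to be slow for a 100,000-line file, a possible refinement is to batch the generator output into multi-row inserts. A sketch of that idea, assuming a PDO connection (or mysqli on PHP 8.1+, where execute() also accepts an array of values); the batch size of 500 is arbitrary:

function addPartsToDBBatched($parts, $connection, $batchSize = 500) {
    $connection->query('DELETE FROM Part_finder');
    $batch = array();
    foreach ($parts as $row) {
        $batch[] = $row;
        if (count($batch) === $batchSize) {
            insertBatch($batch, $connection);
            $batch = array();
        }
    }
    if ($batch) {
        insertBatch($batch, $connection); // flush the remainder
    }
}

function insertBatch($batch, $connection) {
    // one placeholder group per row: (?, ?, ?, ?), (?, ?, ?, ?), ...
    $groups = implode(', ', array_fill(0, count($batch), '(?, ?, ?, ?)'));
    $statement = $connection->prepare(
        "INSERT INTO Part_finder (id_part, oem, part_number, description) VALUES $groups"
    );
    $statement->execute(array_merge(...$batch)); // flatten the rows into one parameter list
}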

How to parse a .csv file into a multidimensional array in PHP?

I'm a newbie in PHP and I'm trying to make a todo list that communicates with a .csv file. So far I've managed to write a function that writes the user input into the CSV file, but I'm stuck on writing a function that would parse (I'm not even sure if that's the correct term) every line of the .csv file into a multidimensional array, so I could display every line of the list as I like in the PHTML file.
Here's what I have so far :
<?php
//
// ─── DATA ────────────────────────────────────────────────────────────────────
//
$user_entry = array(
    'title'       => '',
    'description' => '',
    'date'        => '',
    'priority'    => ''
);
// puts the data the user entered into an array
$user_entry['title'] = $_POST['title'];
$user_entry['description'] = $_POST['description'];
$user_entry['date'] = $_POST['date'];
$user_entry['priority'] = $_POST['priority'];
//
// ─── FUNCTIONS ──────────────────────────────────────────────────────────────
//
function writeInList() {
    // writes the $user_entry array to the .csv file
    global $user_entry;
    $file = fopen("todo.csv", "a");
    fputcsv($file, $user_entry, ",");
    fclose($file);
}

function displayList() {
    // That's where I'm stuck.
    $file = fopen("todo.csv", "r");
    $fileCountable = file("todo.csv");
    for ($i = 0; $i < count($fileCountable); $i++) {
        $csvContent = fgetcsv($file, 1000, ",");
        foreach ($csvContent as $value) {
            $var[$i] = $value;
        }
        echo '<br>';
    }
    fclose($file);
}
//
// ─── MAIN CODE ─────────────────────────────────────────────────────────────
//
writeInList();
include 'todolist.phtml';
I'm sorry if this has been discussed before. I've searched a lot and found similar questions, but I couldn't get them to work in my own code. Thanks a lot in advance to anyone who takes the time to look at my code!
This is also my very first time posting here, so I hope I'm doing it right.
You did pretty well. You can look at the fgetcsv documentation for more. I would change your function so it takes its arguments as input (try to avoid using global):
// insert data
function writeInList($user_entry, $path) {
    $file = fopen($path, "a");
    fputcsv($file, $user_entry, ",");
    fclose($file);
}

// extract data
function getList($path, $limit = 100000) {
    $file = fopen($path, "r");
    if (!$file) return null; // or throw an error, or print to a log
    $allRows = array();
    while (($data = fgetcsv($file, $limit, ",")) !== FALSE) {
        $allRows[] = $data; // fgetcsv already returns the line exploded by ","
    }
    fclose($file);
    return $allRows;
}
Now you have a two-dimensional array returned from getList. Call it as getList("todo.csv") and display the result as you please.
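For example, a minimal display loop over the returned array (assuming the four columns written by writeInList) might look like:

$rows = getList("todo.csv");
foreach ($rows as $row) {
    // $row is array(title, description, date, priority)
    echo implode(' | ', $row) . '<br>';
}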
Hope that helps!

Replace rows from one CSV file to another CSV where the ID is the same (PHP)?

I have one Excel file, original.csv:
ID Name Price
1 Xblue 12
2 Yblue 32
3 Zblue 52
And another, copy.csv:
ID Name Price
1 Xblue 89
2 Yblue 43
3 Zblue 45
I want to replace rows from original.csv into copy.csv where the ID is the same.
Can I do this manually, or maybe somehow using PHP?
I searched for options on the internet, but I only found getcsv and readcsv functions that can't help me in this case, because this is more like updating a CSV file.
This may end up in a request timeout in PHP, because it requires so many nested loops. If someone can reduce the time complexity of this program, then it will work; even if it works, it will take a lot of time.
while (!feof($f_pointer)) {                      // open old csv to update: loop1
    $ar = fgetcsv($f_pointer);                   // getting first row
    for ($i = 0; $i < count($ar); $i++) {        // loop2: first row array
        $f_pointer2 = fopen("new.csv", "r");     // open new csv to get data
        while (!feof($f_pointer2)) {             // loop3: find ID in new csv
            $ar2 = fgetcsv($f_pointer2);         // getting each row as an array
            for ($j = 0; $j < count($ar2); $j++) { // loop4: compare id of old csv to new csv and update data
                if ($ar[$i] == $ar2[$j]) {
                    foreach ($ar2 as $fields) {  // loop5
                        fputcsv($f_pointer, $fields);
                    }
                }
            }
        }
    }
}
?>
I've created a little solution. If order is important, you could skip indexing the array and loop through the copied array instead.
<?php
if (file_exists('output.csv')) {
    unlink('output.csv');
}

function fputcsv_eol($handle, $array, $delimiter = ',', $enclosure = '"', $eol = "\n") {
    $return = fputcsv($handle, $array, $delimiter, $enclosure);
    if ($return !== FALSE && "\n" != $eol && 0 === fseek($handle, -1, SEEK_CUR)) {
        fwrite($handle, $eol);
    }
    return $return;
}

function scanFile($sFilename, $iIndexColumn) {
    $rFile = fopen($sFilename, 'r');
    $aData = array();
    while (($aLine = fgetcsv($rFile)) !== false) {
        $aData[$aLine[$iIndexColumn]] = $aLine;
    }
    fclose($rFile);
    return $aData;
}

$iIndexColumn = 0;
$iValueColum = 2;
$aOriginalData = scanFile('original.csv', 0);
$aCopyData = scanFile('copy.csv', 0);
foreach ($aOriginalData as $iID => $aOriginalDatum) {
    if (array_key_exists($iID, $aCopyData)) {
        $aCopyData[$iID] = $aOriginalDatum;
    }
}
$rFile = fopen('output.csv', 'w');
foreach ($aCopyData as $aCopyDatum) {
    fputcsv_eol($rFile, $aCopyDatum, ',', '"', "\r\n");
}
fclose($rFile);

How can I divide data separated by commas onto different lines?

I have this script that extracts a .csv file from the database, holding data for the different locals a user has logged into. The .csv file comes like this:
"id_user";"id_local"
"1";""
"2";"2,3,4"
"3";""
"5";"2,5"
"10";""
"13";"2"
"14";"5"
"15";"2"
"16";"1"
"20";"2"
"21";""
As you can see, I get one record per user.
But, to manipulate it properly, we need it like this:
"id_user";"id_local"
"2";"2"
"2";"3
"2";"4"
"5";"2"
"5";"5"
"13";"2"
"14";"5"
"15";"2"
"16";"1"
"20";"2"
So, I need to create a function that deletes users with no local and splits the different locals of the same user into different records. Does anyone know how I can do it?
Here is the code I have so far, but I'm not sure if I'm on the right track:
function fix_local_secundario() {
    $filename = "local_secundario.csv";
    $file_locais = file_get_contents($filename);
    $locais = explode("\n", $file_locais);
    // $pattern = "/,/";
    // $replacement = "\"\n;\"";
    while ($line = current($locais)) {
        $line = str_getcsv($line, ';', '"', '\n');
        // $line = preg_replace($pattern, $replacement, $line);
        var_dump($line);
        echo "\n";
        next($locais);
    }
}
Try this and see if this works:
function fix_local_secundario() {
    $filename = "local_secundario.csv";
    $file_locais = file_get_contents($filename);
    $locais = explode("\n", $file_locais);
    while ($line = current($locais)) {
        // do the first split on the ; character
        $arr1 = explode(";", $line);
        // if the part after ; is not empty for this line
        if ($arr1[1] != '""') {
            // split it further on the , character
            $arr2 = explode(",", $arr1[1]);
            foreach ($arr2 as $key => $val) {
                if ($val[0] != '"') {
                    $val = '"' . $val;
                }
                if ($val[strlen($val) - 1] != '"') {
                    $val = $val . '"';
                }
                echo $arr1[0] . ";" . $val . "<BR>";
            }
        }
        next($locais);
    }
}
Once this basic piece is working, you should change it to return values rather than echo them, since this code is part of a function, as per the updates made to your question.
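A possible shape for that change, returning the id/local pairs instead of echoing them (a sketch of my own, using fgetcsv so the quoting is handled for us):

function fix_local_secundario() {
    $pairs = array();
    $handle = fopen("local_secundario.csv", "r");
    while (($row = fgetcsv($handle, 0, ";")) !== false) {
        if ($row[1] === "") {
            continue; // skip users with no local
        }
        foreach (explode(",", $row[1]) as $local) {
            $pairs[] = array($row[0], $local); // one record per (id_user, id_local)
        }
    }
    fclose($handle);
    return $pairs;
}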
What about this…
$f = fopen("myfile.csv", "r");
while ($row = fgetcsv($f, 0, ";")) {
    $locals = explode(",", $row[1]);
    if (count($locals) > 1) {
        foreach ($locals as $local) {
            // iterate with $row[0] and $local
        }
    } elseif ($row[1] != "") {
        // use $row[0] and $row[1]
    }
}

What does this code mean?

$file = fopen("test.txt","r");
while($line = fgets($file)) {
$line = trim($line);
list($model,$price) = preg_split('/\s+/',$line);
if(empty($price)) {
$price = 0;
}
$sql = "UPDATE products
SET products_price=$price
WHERE products_model='$model'";
// run the sql query.
}
fclose($file);
The txt file looks like this:
model price
LB2117 19.49
LB2381 25.99
1. What is the meaning of list($model,$price) = preg_split('/\s+/',$line);? I know preg_split is like explode, but I don't understand the parameters in the line above.
2. How do I skip the first record?
It's taking the results of the preg_split and assigning them to the variables $model and $price. You're looking at a parsing algorithm. Sorry if this is not enough; I have a hard time understanding the question as it is written.
Also, if I read this correctly, there is no need to skip line 1 unless you have an item with the model defined as "model" in the database.
But if you wanted to for some reason, you could add a counter...
$i = 0;
while ($line = fgets($file)) {
    if ($i > 0) {
        $line = trim($line);
        list($model, $price) = preg_split('/\s+/', $line);
        if (empty($price)) {
            $price = 0;
        }
        $sql = "UPDATE products
                SET products_price=$price
                WHERE products_model='$model'";
        // run the sql query.
    }
    $i++;
}
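A simpler alternative to the counter, if you prefer, is to consume the header line once before the loop:

$file = fopen("test.txt", "r");
fgets($file); // read and discard the header line ("model price")
while ($line = fgets($file)) {
    // process the data lines as before
}
fclose($file);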
That is a language construct that allows you to assign to multiple variables at once. You can think of it as array unpacking (preg_split returns an array). So, when you do:
<?php
list($a, $b) = explode(".","a.b");
echo $a . "\n";
echo $b . "\n";
You will get:
a
b
Having fewer elements in list() than in the array is OK; excess array elements are ignored. But having insufficient elements in the array will give you an undefined offset notice. For example:
list($a) = explode(".", "a.b"); // ok
list($a, $b, $c) = explode(".", "a.b"); // notice: undefined offset
I don't know if this is what you meant by "skip the first record", but:
$file = fopen("test.txt", "r");              // open file for reading
$first = true;
while ($line = fgets($file)) {               // read the file line by line
    if ($first === true) {                   // skip the first record
        $first = false;
    } else {
        $line = trim($line);                 // remove whitespace before and after the text,
                                             // not in the middle
        list($model, $price) = preg_split('/\s+/', $line); // create the variables $model and $price;
                                             // \s matches whitespace, tabs and line breaks
        if (empty($price)) {                 // if $price is empty, set it to 0
            $price = 0;
        }
        $sql = "UPDATE products
                SET products_price=$price
                WHERE products_model='$model'";
        // run the sql query.
    }
}
fclose($file);                               // close the file
