I have two functions, each one parsing an Excel file; the files are 673K and 131K. The functions contain the same code, only their names differ.
One function reads the data from its Excel file, while the other returns:
Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 72 bytes)
I have other Excel files that are much bigger than these, and their parsing functions work fine.
I want to create a log file that records every action these functions perform inside the system, but I have no idea how to do it. I found a solution on Stack Overflow from Lawrence Cherone.
The problem is that this is the first time I will create a log file. How do I create it? Do I create a new file and put it in my project? How do I run it, and how can I see the reason for the error in my function? Or do I put it inside the function where I get the error, as in the solution proposed by Lawrence Cherone?
That solution seems to work fine and to resolve the problem.
Below is the small code of my function. Can you guide me through creating this log file so I can debug it?
public function parseEquipment($filePath = null) {
    set_time_limit(0);
    $listEquipement = [];
    $count = 0;
    $chunkSize = 1024;
    $objReader = PHPExcel_IOFactory::createReader(PHPExcel_IOFactory::identify($filePath));
    $spreadsheetInfo = $objReader->listWorksheetInfo($filePath);
    $chunkFilter = new \Floose\Parse\ChunkReadFilter();
    $objReader->setReadFilter($chunkFilter);
    $objReader->setReadDataOnly(true);
    $chunkFilter->setRows(0, 1);
    $objPHPExcel = $objReader->load($filePath);
    $totalRows = $spreadsheetInfo[0]['totalRows'];
    for ($startRow = 1; $startRow <= $totalRows; $startRow += $chunkSize) {
        $chunkFilter->setRows($startRow, $chunkSize);
        $objPHPExcel = $objReader->load($filePath);
        $sheetData = $objPHPExcel->getActiveSheet()->toArray(null, null, true, false);
        $startIndex = ($startRow == 1) ? $startRow : $startRow - 1;
        if (!empty($sheetData) && $startRow < $totalRows) {
            $dataToAnalyse = array_slice($sheetData, $startIndex, $chunkSize);
            //echo 'test1';
            if ($dataToAnalyse[1][0] == NULL) {
                //echo 'test2';
                break;
            }
            //echo 'test3';
            //var_dump($sheetData);
            for ($i = 0; $i < $chunkSize; $i++) {
                if ($dataToAnalyse[$i]['0'] != NULL) {
                    //echo 'OK';
                    $listEquipement[] = new Article($dataToAnalyse[$i]['3'], $dataToAnalyse[$i]['4'], $dataToAnalyse[$i]['2']);
                    // echo 'test4';
                    $count++;
                }
            }
        }
        $objPHPExcel->disconnectWorksheets();
        unset($objPHPExcel, $sheetData);
    }
    //var_dump(array_slice($sheetData, $startIndex, $chunkSize));
    return $listEquipement;
}
error_log("You messed up!".$my_message, 3, "/var/tmp/custom-errors.log");
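The error_log() call above appends a message to a file of your choice when the third argument (message type 3) is used. Below is a minimal sketch of how it could be wrapped around the parsing call; the log path, the $parser object and the message format are assumptions for illustration, not part of the original code. Note that a fatal "allowed memory size exhausted" error cannot be caught with try/catch on PHP 5, so the log_errors / error_log ini settings are also set, letting PHP itself write the fatal error to the same file:

<?php
// Illustrative only: $parser, $filePath and the log path are assumptions.
$logFile = '/var/tmp/custom-errors.log';

// Ask PHP to write its own errors (including fatals) to the same file.
ini_set('log_errors', '1');
ini_set('error_log', $logFile);

error_log(date('c') . " parseEquipment() started for $filePath\n", 3, $logFile);

try {
    $listEquipement = $parser->parseEquipment($filePath);
    error_log(date('c') . ' parsed ' . count($listEquipement) . " articles\n", 3, $logFile);
} catch (Exception $e) {
    // Catches regular exceptions; memory-limit fatals are written by PHP via the ini settings above.
    error_log(date('c') . ' ERROR: ' . $e->getMessage() . "\n", 3, $logFile);
}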
I'm trying to stamp some text onto PDFs with TCPDI.
It works fine with most PDFs, but with some of them the code gets stuck when it reaches the useTemplate() call, and I get a 500 error (maximum execution time exceeded).
They are not long PDFs (1, 2, 3 pages at most), and other PDFs with more pages work fine. Here is my code:
$pdf = new TCPDI();
$pageCount = $pdf->setSourceFile($path);
for ($pageNo = 1; $pageNo <= $pageCount; $pageNo++) {
    $templateId = $pdf->importPage($pageNo);
    $size = $pdf->getTemplateSize($templateId);
    if ($size['w'] > $size['h']) {
        $pdf->AddPage('L', array($size['w'], $size['h']));
    } else {
        $pdf->AddPage('P', array($size['w'], $size['h']));
    }
    $pdf->useTemplate($templateId); // Here is where it takes so long that it exceeds the time limit
    $pdf->SetFont('Helvetica');
    $pdf->SetFontSize(10);
    $pdf->SetTextColor(255, 0, 0);
    $pdf->SetXY(2, 0);
    $pdf->Write(0, 'Code nº 4');
}
$pdf->Output($file, 'D');
Is there any property of a PDF that can make it hang like this? Are there alternatives?
I have code that processes several PDFs with this snippet in a loop and puts them into a zip, and when one PDF in the chain jams the code, the zip obviously isn't produced. So if there were a way to detect which PDFs are going to give me problems, I could skip over them and generate the zip with the good ones.
I have no control over the PDFs; they are uploaded by many different clients.
EDIT: In the log there are more than a million lines similar to PHP Warning: Illegal string offset 'DAmip' in ...\TCPDF\tcpdi_parser.php on line 712 before the maximum execution time fatal error.
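One way to keep a single hanging PDF from blocking the whole batch could be to stamp each file in a separate PHP process with a hard timeout, and only add the outputs that finished to the zip. This is just a sketch: it assumes a Unix host with the GNU timeout command available, and stamp_single.php is a hypothetical helper script that runs the TCPDI code above for one input file:

// Sketch only: run each PDF through a separate, time-limited process.
$zip = new ZipArchive();
$zip->open('stamped.zip', ZipArchive::CREATE | ZipArchive::OVERWRITE);

foreach ($pdfPaths as $path) {
    $out = $path . '.stamped.pdf';
    $cmd = 'timeout 20 php stamp_single.php ' . escapeshellarg($path) . ' ' . escapeshellarg($out);
    $output = [];
    exec($cmd, $output, $exitCode);

    if ($exitCode === 0 && is_file($out)) {
        $zip->addFile($out, basename($out));   // only the PDFs that finished in time
    } else {
        error_log("Skipped problematic PDF: $path (exit code $exitCode)");
    }
}
$zip->close();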
Same problem for me.
I "solved" the situation by adding the line set_time_limit(10); before the line
while (strspn($data{$offset}, "\x00\x09\x0a\x0c\x0d\x20") == 1) {
I know it's not the right solution, but at least it doesn't block your program.
I suggest you read my comment on GitHub: https://github.com/pauln/tcpdi_parser/pull/23
The function getDictValue() of tcpdi_parser.php is incompatible with some PDF files.
Everything is explained in my reply posted there today.
Suggested modification:
private function getDictValue($offset, &$data) {
    $objval = array();
    // Extract dict from data
    $i = 1;
    $dict = "";
    $offset += 2;
    $is_bracket = false;
    do {
        if ($data[$offset] == "[") {
            $is_bracket = true;
            $dict .= $data[$offset];
        } else if ($data[$offset] == "]") {
            $is_bracket = false;
            $dict .= $data[$offset];
        } else if (!$is_bracket && ($data[$offset] == "<") && ($data[$offset + 1] == "<")) {
            $i++;
            $dict .= "<<";
            $offset++;
        } else if (!$is_bracket && ($data[$offset] == ">") && ($data[$offset + 1] == ">")) {
            $i--;
            $dict .= ">>";
            $offset++;
        } else {
            $dict .= $data[$offset];
        }
        $offset++;
    } while ($i > 0);
    // Now that we have just the dict, parse it.
    $dictoffset = 0;
    do {
        // Get dict element.
        file_put_contents("debug_tcpdi.txt", "getRawObject($dictoffset, data) from getDictValue()" . "\r\n", FILE_APPEND);
        list($key, $eloffset) = $this->getRawObject($dictoffset, $dict);
        if ($key[0] == '>>') {
            break;
        }
        file_put_contents("debug_tcpdi.txt", "getRawObject($eloffset, data) from getDictValue()" . "\r\n", FILE_APPEND);
        list($element, $dictoffset) = $this->getRawObject($eloffset, $dict);
        $objval['/'.$key[1]] = $element;
        unset($key);
        unset($element);
    } while (true);
    return array($objval, $offset);
}
I need to read an xlsx file with 10 sheets, each sheet having about 3K rows.
Is there a way to loop through each sheet and chunk its rows?
Following the examples, I'm at this point:
public function import($file)
{
    $inputFileType = IOFactory::identify($file);
    $reader = IOFactory::createReader($inputFileType);
    // My ChunkReadFilter is exactly the same as in the PhpSpreadsheet examples
    $chunkFilter = new ChunkReadFilter();
    $reader->setReadFilter($chunkFilter);
    $chunkSize = 100;
    $spreadsheet = $reader->load($file);
    $loadedSheetNames = $spreadsheet->getSheetNames();
    foreach ($loadedSheetNames as $sheetIndex => $loadedSheetName) {
        $sheet = $spreadsheet->getSheet($sheetIndex);
        //$highestRow = $sheet->getHighestRow(); // Is returning 1 as result
        $highestRow = 3000;
        for ($startRow = 1; $startRow <= $highestRow; $startRow += $chunkSize) {
            /** Tell the Read Filter which rows we want this iteration **/
            $chunkFilter->setRows($startRow, $chunkSize);
            $sheetData = $sheet->toArray(null, true, false, true);
            var_dump($sheetData);
        }
    }
}
The var_dump($sheetData); prints the whole sheet's data, not just the rows of the current chunk.
So, how can I read each sheet's data and chunk its rows?
I'm using "phpoffice/phpspreadsheet": "^1.4"
I completely missed your goal (the question was not very clear), so I have completely changed my answer.
Assuming that you can loop through multiple sheets with the code below:
// .... add helper here....
$helper->log('Loading file ' . pathinfo($inputFileName, PATHINFO_BASENAME) . ' using IOFactory with a defined reader type of ' . $inputFileType);
$reader = IOFactory::createReader($inputFileType);
// Define how many rows we want for each "chunk"
$chunkSize = 10;
// Loop to read our worksheet in "chunk size" blocks
for ($startRow = 2; $startRow <= 50; $startRow += $chunkSize) {
    // ..... use the helper ...
    $helper->log('Loading WorkSheet using configurable filter for headings row 1 and for rows ' . $startRow . ' to ' . ($startRow + $chunkSize - 1));
    // Create a new Instance of our Read Filter, passing in the limits on which rows we want to read
    $chunkFilter = new ChunkReadFilter($startRow, $chunkSize);
    // Tell the Reader that we want to use the new Read Filter that we've just Instantiated
    $reader->setReadFilter($chunkFilter);
    // Load only the rows that match our filter from $inputFileName to a PhpSpreadsheet Object
    $spreadsheet = $reader->load($inputFileName);
    $sheetCount = $spreadsheet->getSheetCount();
    for ($i = 0; $i < $sheetCount; $i++) {
        $sheet = $spreadsheet->getSheet($i);
        // ...not what you want, but I leave this here
        $higestRow = $sheet->getHighestRow();
        echo "<p> Sheet n. " . $i . " highest row is:" . ($higestRow) . "</p>";
        $sheetData = $sheet->toArray(null, true, true, true);
        var_dump($sheetData);
    }
}
...to reach your goal, I think you need to add use PhpOffice\PhpSpreadsheet\Reader\IReadFilter; and build your own filter, so that you can set the highest row inside the for loop according to your needs.
This code is taken from the documentation; the public function setRows() is, I think, where you need to put your own code, and then you call the filter in the for loop:
namespace Samples\Sample12;

use PhpOffice\PhpSpreadsheet\IOFactory;
use PhpOffice\PhpSpreadsheet\Reader\IReadFilter;

require __DIR__ . '/../Header.php';

$inputFileType = 'Xls';
$inputFileName = __DIR__ . '/sampleData/example2.xls';

/** Define a Read Filter class implementing IReadFilter */
class ChunkReadFilter implements IReadFilter
{
    private $startRow = 0;
    private $endRow = 0;

    /**
     * Set the list of rows that we want to read.
     *
     * @param mixed $startRow
     * @param mixed $chunkSize
     */
    public function setRows($startRow, $chunkSize)
    {
        $this->startRow = $startRow;
        $this->endRow = $startRow + $chunkSize;
    }

    public function readCell($column, $row, $worksheetName = '')
    {
        // Only read the heading row, and the rows that are configured in $this->_startRow and $this->_endRow
        if (($row == 1) || ($row >= $this->startRow && $row < $this->endRow)) {
            return true;
        }
        return false;
    }
}

$helper->log('Loading file ' . pathinfo($inputFileName, PATHINFO_BASENAME) . ' using IOFactory with a defined reader type of ' . $inputFileType);
// Create a new Reader of the type defined in $inputFileType
$reader = IOFactory::createReader($inputFileType);
// Define how many rows we want to read for each "chunk"
$chunkSize = 10;
// Create a new Instance of our Read Filter
$chunkFilter = new ChunkReadFilter();
// Tell the Reader that we want to use the Read Filter that we've Instantiated
$reader->setReadFilter($chunkFilter);
$spreadsheet = $reader->load($inputFileName);
$sheetCount = $spreadsheet->getSheetCount();
for ($i = 0; $i < $sheetCount; $i++) {
    $sheet = $spreadsheet->getSheet($i);
    // ...we get the highest row here, now
    $higestRow = $sheet->getHighestRow();
    for ($startRow = 2; $startRow <= $higestRow; $startRow += $chunkSize) {
        // ..just to check the output
        echo "<p> Sheet n. " . $i . " highest row is:" . ($higestRow) . "</p>";
        $helper->log('Loading WorkSheet using configurable filter for headings row 1 and for rows ' . $startRow . ' to ' . ($higestRow + $chunkSize - 1));
        // Tell the Read Filter the limits on which rows we want to read this iteration
        $chunkFilter->setRows($startRow, $chunkSize);
        // Load only the rows that match our filter from $inputFileName to a PhpSpreadsheet Object
        $spreadsheet = $reader->load($inputFileName);
        // Do some processing here
        $sheetData = $spreadsheet->getActiveSheet()->toArray(null, true, true, true);
        var_dump($sheetData);
    }
}
I am still new to this, but I tried out a solution that helps here:
We can read the file in chunks through the read filter as mentioned in the comments above, but to save memory we can create the reader inside the loop and release it at the end of the loop, as shown below:
// Define how many rows we want to read for each "chunk"
$chunkSize = 1000;
// Loop to read our worksheet in "chunk size" blocks
for ($startRow = 1; $startRow <= $rawRows; $startRow += $chunkSize) {
    // Create a new Reader of the type defined in
    $reader = IOFactory::createReader($inputFileType);
    // Create a new Instance of our Read Filter
    $chunkFilter = new Chunk();
    // Tell the Reader that we want to use the Read Filter that we've Instantiated
    $reader->setReadFilter($chunkFilter);
    // Tell the Read Filter the limits on which rows we want to read this iteration
    $chunkFilter->setRows($startRow, $chunkSize);
    // Load only the rows that match our filter from $inputFileName to a PhpSpreadsheet Object
    $spreadsheet = $reader->load($inputFileName);
    // .....
    // process the file
    // .....
    // then release the memory
    $spreadsheet->__destruct();
    $spreadsheet = null;
    unset($spreadsheet);
    $reader->__destruct();
    $reader = null;
    unset($reader);
}
For large sheets this means only the memory of one chunk is used at a time, and the memory limit is never exceeded.
Please let me know if this is helpful.
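To confirm that creating and destroying the reader per chunk really keeps memory flat, one could log the current usage at the end of each iteration. This is purely a diagnostic sketch added for illustration:

// At the bottom of the chunk loop, after unset($reader):
error_log(sprintf(
    'Chunk starting at row %d done, memory in use: %.1f MB',
    $startRow,
    memory_get_usage(true) / 1048576
));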
I have a PHP function that parses an Excel file. It reads the total number of rows, but it doesn't return all of the data in the resulting objects.
The file has 3247 lines, but only 1023 lines are returned.
This is the parsing function:
public function parseEquipement($filePath = null) {
    set_time_limit(0);
    $listEquipement = [];
    $count = 0;
    $chunkSize = 8192;
    $objReader = PHPExcel_IOFactory::createReader(PHPExcel_IOFactory::identify($filePath));
    $spreadsheetInfo = $objReader->listWorksheetInfo($filePath);
    $chunkFilter = new \Floose\Parse\ChunkReadFilter();
    $objReader->setReadFilter($chunkFilter);
    $objReader->setReadDataOnly(true);
    $chunkFilter->setRows(0, 1);
    $objPHPExcel = $objReader->load($filePath);
    $totalRows = $spreadsheetInfo[0]['totalRows'];
    for ($startRow = 1; $startRow <= $totalRows; $startRow += $chunkSize) {
        $chunkFilter->setRows($startRow, $chunkSize);
        $objPHPExcel = $objReader->load($filePath);
        $sheetData = $objPHPExcel->getActiveSheet()->toArray(null, null, true, false);
        $startIndex = ($startRow == 1) ? $startRow : $startRow - 1;
        if (!empty($sheetData) && $startRow < $totalRows) {
            $dataToAnalyse = array_slice($sheetData, $startIndex, $chunkSize);
            if ($dataToAnalyse[0][0] == NULL) {
                break;
            }
            for ($i = 0; $i < $chunkSize; $i++) {
                if ($dataToAnalyse[$i]['0'] != NULL) {
                    $listEquipement[] = new Article($dataToAnalyse[$i]['0'], '', $dataToAnalyse[$i]['1']);
                    $count++;
                }
            }
        }
        //echo($totalRows); // is best
        //echo($count); // is wrong
        //print_r($listEquipement);
        $objPHPExcel->disconnectWorksheets();
        unset($objPHPExcel, $sheetData);
    }
    return $listEquipement;
}
I changed all the code to the following, but it doesn't work either:
public function parseEquipment($filePath = null) {
    $objReader = PHPExcel_IOFactory::createReader(PHPExcel_IOFactory::identify($filePath));
    $objReader->setReadDataOnly(true);
    $objPHPExcel = $objReader->load($filePath);
    $sheet = $objPHPExcel->getSheet(0);
    $highestRow = $sheet->getHighestRow();
    for ($row = 2; $row <= $highestRow; $row++) {
        echo $sheet->getCellByColumnAndRow(3, $row)->getCalculatedValue();
        echo $sheet->getCellByColumnAndRow(4, $row)->getCalculatedValue();
        echo $sheet->getCellByColumnAndRow(2, $row)->getCalculatedValue();
        $listEquipement[] = new Article(
            $sheet->getCellByColumnAndRow(3, $row)->getCalculatedValue(),
            $sheet->getCellByColumnAndRow(4, $row)->getCalculatedValue(),
            $sheet->getCellByColumnAndRow(2, $row)->getCalculatedValue()
        );
    }
}
When I run my code it always displays a memory-size error, even though my file is only 81K, yet it still displays the number of lines at the same time:
Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 54 bytes)
Could anyone be kind enough to guide me on how I should write this code, or suggest another way to parse an Excel file?
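As one possible direction for illustration only (not necessarily the fix for the 1023-row issue), PHPExcel also lets you walk a sheet row by row with iterators instead of building one big array with toArray(). The sketch below still loads the whole workbook, so on its own it may not avoid the memory-limit error; the Article constructor arguments simply mirror the ones used in the question:

public function parseEquipmentByIterator($filePath = null) {
    $listEquipement = [];

    $objReader = PHPExcel_IOFactory::createReader(PHPExcel_IOFactory::identify($filePath));
    $objReader->setReadDataOnly(true);
    $objPHPExcel = $objReader->load($filePath);

    // Start at row 2 to skip the heading row
    foreach ($objPHPExcel->getSheet(0)->getRowIterator(2) as $row) {
        $cellIterator = $row->getCellIterator();
        $cellIterator->setIterateOnlyExistingCells(false);

        $cells = [];
        foreach ($cellIterator as $cell) {
            $cells[] = $cell->getCalculatedValue();
        }

        if (!empty($cells[0])) { // skip rows whose first column is empty
            $listEquipement[] = new Article($cells[3], $cells[4], $cells[2]);
        }
    }

    $objPHPExcel->disconnectWorksheets();
    unset($objPHPExcel);

    return $listEquipement;
}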
I am currently migrating from one database to another. The project is on Laravel, so I am creating a Laravel command for it. I have one table with about 700,000 records. I created a function with LIMIT and transactions to optimize the query, but I am still getting an out-of-memory error from PHP.
Here is my code:
ini_set('memory_limit', '750M'); // at beginning of file

$circuit_c = DB::connection('legacy')->select('SELECT COUNT(*) FROM tbl_info');
$count = (array) $circuit_c[0];
$counc = $count['COUNT(*)'];
$max = 1000;
$pages = ceil($counc / $max);
for ($i = 1; $i < ($pages + 1); $i++) {
    $offset = (($i - 1) * $max);
    $start = ($offset == 0 ? 0 : ($offset + 1));
    $infos = DB::connection('legacy')->select('SELECT * from tbl_info LIMIT ' . $offset . ', ' . $max);
    DB::connection('mysql')->transaction(function() use ($infos) {
        foreach ($infos as $info) {
            $validator = Validator::make($data = (array) $info, Info::$rules);
            if ($validator->passes()) {
                if ($info->record_type == 'C') {
                    $b_user_new = Info::create($data);
                    unset($b_user_new);
                }
            }
            unset($info);
            unset($validator);
        }
    });
    unset($infos);
}
Error is this:
user#lenovo /var/www/info $ php artisan migratedata
PHP Fatal error: Allowed memory size of 786432000 bytes exhausted (tried to allocate 32 bytes) in /var/www/info/vendor/laravel/framework/src/Illuminate/Database/Grammar.php on line 75
The error appears after importing about 50,000 records.
There is a kind of "memory leak" here. You need to find out which of the variables is hogging all of this memory. Try this function to debug and see which variable keeps on growing:
function sizeofvar($var) {
    $start_memory = memory_get_usage();
    $tmp = unserialize(serialize($var));
    return memory_get_usage() - $start_memory;
}
Once you know which variable is taking all the memory, you can start implementing appropriate measures.
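For example, sizeofvar() could be called inside the migration loop on the candidates from the question's code (the variable names below are simply the ones used in the question):

// Rough per-iteration check; watch which number keeps climbing between pages
echo 'infos: ' . sizeofvar($infos) . " bytes\n";
echo 'data:  ' . sizeofvar($data) . " bytes\n";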
Found the answer: Laravel caches all executed queries, so just add DB::connection()->disableQueryLog();
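In the migration command from the question, this would go right after the memory_limit line, once per connection. The connection names are taken from the question, and the exact method availability depends on the Laravel version:

ini_set('memory_limit', '750M');

// Stop the framework from keeping every executed query in memory
DB::connection('legacy')->disableQueryLog();
DB::connection('mysql')->disableQueryLog();

// ... the existing COUNT(*) / LIMIT loop follows unchanged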
Similar to: How to read only 5 last line of the text file in PHP?
I have a large log file and I want to be able to show 100 lines from position X in the file.
I need to use fseek rather than file() because the log file is too large.
I have a similar function but it will only read from the end of the file. How can it be modified so that a start position can be specified as well? I would also need to start at the end of the file.
function read_line($filename, $lines, $revers = false)
{
    $offset = -1;
    $i = 0;
    $fp = @fopen($filename, "r");
    while ($lines && fseek($fp, $offset, SEEK_END) >= 0) {
        $c = fgetc($fp);
        if ($c == "\n" || $c == "\r") {
            $lines--;
            if ($revers) {
                $read[$i] = strrev($read[$i]);
                $i++;
            }
        }
        if ($revers) $read[$i] .= $c;
        else $read .= $c;
        $offset--;
    }
    fclose($fp);
    if ($revers) {
        if ($read[$i] == "\n" || $read[$i] == "\r")
            array_pop($read);
        else $read[$i] = strrev($read[$i]);
        return implode('', $read);
    }
    return strrev(rtrim($read, "\n\r"));
}
What I'm trying to do is create a web-based log viewer that starts from the end of the file and displays 100 lines; when the "Next" button is pressed, the 100 lines preceding those are shown.
If you're on Unix, you can use the sed tool. For example, to get lines 10-20 from a file:
sed -n 10,20p errors.log
And you can do this in your script:
<?php
$page = 1;
$limit = 100;
$off = ($page * $limit) - ($limit - 1);
exec("sed -n $off,".($limit+$off-1)."p errors.log", $out);
print_r($out);
The lines are available in $out array.
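If $page comes from the request instead of being hard-coded, it should be forced to an integer before it reaches exec(), since it is interpolated into a shell command. This is a small hardening sketch, not part of the original answer:

$page  = isset($_GET['page']) ? max(1, (int) $_GET['page']) : 1; // never pass raw input to the shell
$limit = 100;
$off   = ($page * $limit) - ($limit - 1);
exec('sed -n ' . $off . ',' . ($limit + $off - 1) . 'p ' . escapeshellarg('errors.log'), $out);
print_r($out);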
This uses fseek to read 100 lines of a file starting from a specified offset. If the offset is greater than the number of lines in the log, the first 100 lines are read.
In your application, you could pass the current offset through the query string for prev and next and base the next offset on that. You could also store and pass the current file position for more efficiency.
<?php
$GLOBALS["interval"] = 100;

read_log();

function read_log()
{
    $fp = fopen("log", "r");
    $offset = determine_offset();
    $interval = $GLOBALS["interval"];
    if (seek_to_offset($fp, $offset) != -1)
    {
        show_next_button($offset, $interval);
    }
    $lines = array();
    for ($ii = 0; $ii < $interval; $ii++)
    {
        $lines[] = trim(fgets($fp));
    }
    echo "<pre>";
    print_r(array_reverse($lines));
}

// Get the offset from the query string or default to the interval
function determine_offset()
{
    $interval = $GLOBALS["interval"];
    if (isset($_GET["offset"]))
    {
        return intval($_GET["offset"]) + $interval;
    }
    return $interval;
}

function show_next_button($offset, $interval)
{
    $next_offset = $offset + $interval;
    echo "Next";
}

// Seek to the end of the file, then seek backward $offset lines
function seek_to_offset($fp, $offset)
{
    fseek($fp, 0, SEEK_END);
    for ($ii = 0; $ii < $offset; $ii++)
    {
        if (seek_to_previous_line($fp) == -1)
        {
            rewind($fp);
            return -1;
        }
    }
}

// Seek backward by char until line break
function seek_to_previous_line($fp)
{
    fseek($fp, -2, SEEK_CUR);
    while (fgetc($fp) != "\n")
    {
        if (fseek($fp, -2, SEEK_CUR) == -1)
        {
            return -1;
        }
    }
}
Is "position X" measured in lines or bytes? If lines, you can easily use SplFileObject to seek to a certain line and then read 100 lines:
$file = new SplFileObject('log.txt');
$file->seek(199); // go to line 200
for ($i = 0; $i < 100 and $file->valid(); $i++, $file->next())
{
    echo $file->current();
}
If position X is measured in bytes, isn't it a simple matter of changing your initial $offset = -1 to a different value?
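If position X turns out to be a byte offset instead, SplFileObject also exposes fseek(), so roughly the same loop still works. The offset value below is arbitrary, just for illustration:

$file = new SplFileObject('log.txt');
$file->fseek(4096);   // jump to byte 4096
$file->fgets();       // discard the (probably partial) line we landed in
for ($i = 0; $i < 100 && !$file->eof(); $i++) {
    echo $file->fgets();
}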
I would do it as follows:
function readFileFunc($tempFile) {
    if (!@file_exists($tempFile)) {
        return FALSE;
    } else {
        return file($tempFile);
    }
}

$textArray = readFileFunc('./data/yourTextfile.txt');
$slicePos = count($textArray) - 101;
if ($slicePos < 0) {
    $slicePos = 0;
}
$last100 = array_slice($textArray, $slicePos);
$last100 = implode('<br />', $last100);
echo $last100;