PhpSpreadsheet - Chunk data on multiple sheets - php

I need to read an xlsx file with 10 sheets, each with about 3K rows.
Is there a way to loop over each sheet and read its rows in chunks?
Following the examples, I've got this far:
public function import($file)
{
$inputFileType = IOFactory::identify($file);
$reader = IOFactory::createReader($inputFileType);
//My ChunkReadFilter is exactly the same of the PhpSpreadsheet examples
$chunkFilter = new ChunkReadFilter();
$reader->setReadFilter($chunkFilter);
$chunkSize = 100;
$spreadsheet = $reader->load($file);
$loadedSheetNames = $spreadsheet->getSheetNames();
foreach ($loadedSheetNames as $sheetIndex => $loadedSheetName) {
$sheet = $spreadsheet->getSheet($sheetIndex);
//$highestRow = $sheet->getHighestRow(); //Is returning 1 as result
$highestRow = 3000;
for ($startRow = 1; $startRow <= $highestRow; $startRow += $chunkSize) {
/** Tell the Read Filter which rows we want this iteration **/
$chunkFilter->setRows($startRow, $chunkSize);
$sheetData = $sheet->toArray(null, true, false, true);
var_dump($sheetData);
}
}
}
The var_dump($sheetData); prints all of the sheet's data, not just the rows of the current chunk.
So, how can I read each sheet and chunk its rows?
I'm using "phpoffice/phpspreadsheet": "^1.4"

I completely missed your goal at first (the question was not so clear), so I have completely changed my answer.
Assuming that you can loop through multiple sheets with the code below:
// .... add helper here....
$helper->log('Loading file ' . pathinfo($inputFileName, PATHINFO_BASENAME) . ' using IOFactory with a defined reader type of ' . $inputFileType);
$reader = IOFactory::createReader($inputFileType);
// Define how many rows we want for each "chunk"
$chunkSize = 10;
// Loop to read our worksheet in "chunk size" blocks
for ($startRow = 2; $startRow <= 50 ; $startRow += $chunkSize) {
// ..... use the helper ...
$helper->log('Loading WorkSheet using configurable filter for headings row 1 and for rows ' . $startRow . ' to ' . ($startRow + $chunkSize - 1));
// Create a new Instance of our Read Filter, passing in the limits on which rows we want to read
$chunkFilter = new ChunkReadFilter($startRow, $chunkSize);
// Tell the Reader that we want to use the new Read Filter that we've just Instantiated
$reader->setReadFilter($chunkFilter);
// Load only the rows that match our filter from $inputFileName to a PhpSpreadsheet Object
$spreadsheet = $reader->load($inputFileName);
$sheetCount = $spreadsheet->getSheetCount();
for ($i = 0; $i < $sheetCount; $i++) {
$sheet = $spreadsheet->getSheet($i);
// ...not what you want, but I leave this here
$higestRow = $sheet->getHighestRow();
echo "<p> Sheet n. ".$i. " highest row is:" . ($higestRow) . "</p>";
$sheetData = $sheet->toArray(null, true, true, true);
var_dump($sheetData);
}
}
...to reach your goal, I guess you need to import PhpOffice\PhpSpreadsheet\Reader\IReadFilter and build your own filter, so that you can set the row range inside the for loop as your needs require.
This code is taken from the documentation. The public function setRows() is, I guess, where you need to put your own code; then call the filter in the for loop:
namespace Samples\Sample12;
use PhpOffice\PhpSpreadsheet\IOFactory;
use PhpOffice\PhpSpreadsheet\Reader\IReadFilter;
require __DIR__ . '/../Header.php';
$inputFileType = 'Xls';
$inputFileName = __DIR__ . '/sampleData/example2.xls';
/** Define a Read Filter class implementing IReadFilter */
class ChunkReadFilter implements IReadFilter
{
private $startRow = 0;
private $endRow = 0;
/**
* Set the list of rows that we want to read.
*
* @param mixed $startRow
* @param mixed $chunkSize
*/
public function setRows($startRow, $chunkSize)
{
$this->startRow = $startRow;
$this->endRow = $startRow + $chunkSize;
}
public function readCell($column, $row, $worksheetName = '')
{
// Only read the heading row, and the rows that are configured in $this->startRow and $this->endRow
if (($row == 1) || ($row >= $this->startRow && $row < $this->endRow)) {
return true;
}
return false;
}
}
$helper->log('Loading file ' . pathinfo($inputFileName, PATHINFO_BASENAME) . ' using IOFactory with a defined reader type of ' . $inputFileType);
// Create a new Reader of the type defined in $inputFileType
$reader = IOFactory::createReader($inputFileType);
// Define how many rows we want to read for each "chunk"
$chunkSize = 10;
// Create a new Instance of our Read Filter
$chunkFilter = new ChunkReadFilter();
// Tell the Reader that we want to use the Read Filter that we've Instantiated
$reader->setReadFilter($chunkFilter);
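// Read the row counts without loading any cell data. While the filter is still at its
// default (0, 0) only the heading row is loaded, so getHighestRow() would return 1.
$worksheetInfo = $reader->listWorksheetInfo($inputFileName);
$sheetCount = count($worksheetInfo);
for ($i = 0; $i < $sheetCount; $i++) {
// ...we get the highest row here, now
$higestRow = $worksheetInfo[$i]['totalRows'];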
for ($startRow = 2; $startRow <= $higestRow; $startRow += $chunkSize) {
// ..just for check the output
echo "<p> Sheet n. ".$i. " highest row is:" . ($higestRow) . "</p>";
$helper->log('Loading WorkSheet using configurable filter for headings row 1 and for rows ' . $startRow . ' to ' . ($startRow + $chunkSize - 1));
// Tell the Read Filter, the limits on which rows we want to read this iteration
$chunkFilter->setRows($startRow, $chunkSize);
// Load only the rows that match our filter from $inputFileName to a PhpSpreadsheet Object
$spreadsheet = $reader->load($inputFileName);
// Do some processing here
// Read the sheet of the current outer-loop iteration, not just the active one
$sheetData = $spreadsheet->getSheet($i)->toArray(null, true, true, true);
var_dump($sheetData);
}
}
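Since the question is about several sheets, the filter can also check the worksheet name that the reader passes as the third argument of readCell(). The following is only a minimal sketch under assumptions, not the library's sample: the class name SheetChunkFilter and the method setSheetAndRows() are hypothetical, and the class is written without "implements IReadFilter" so that it runs standalone. In a real project it would implement PhpOffice\PhpSpreadsheet\Reader\IReadFilter.

```php
<?php
// Hypothetical sheet-aware variant of ChunkReadFilter (illustration only).
// In a real project: class SheetChunkFilter implements IReadFilter
class SheetChunkFilter
{
    private $sheetName = '';
    private $startRow = 0;
    private $endRow = 0;

    // Select one worksheet and the half-open row range [$startRow, $startRow + $chunkSize)
    public function setSheetAndRows($sheetName, $startRow, $chunkSize)
    {
        $this->sheetName = $sheetName;
        $this->startRow = $startRow;
        $this->endRow = $startRow + $chunkSize;
    }

    public function readCell($column, $row, $worksheetName = '')
    {
        // Skip every cell that belongs to another worksheet
        if ($worksheetName !== $this->sheetName) {
            return false;
        }
        // Keep the heading row plus the rows of the current chunk
        return $row == 1 || ($row >= $this->startRow && $row < $this->endRow);
    }
}

$filter = new SheetChunkFilter();
$filter->setSheetAndRows('Sheet2', 101, 100);
var_dump($filter->readCell('A', 150, 'Sheet2')); // row inside the chunk of the selected sheet
var_dump($filter->readCell('A', 150, 'Sheet1')); // same row, but on another sheet
```

With a filter like this, an outer loop over the sheet names and an inner loop over the row ranges would call setSheetAndRows() and then $reader->load() once per chunk, so each load only keeps the cells of one sheet and one chunk in memory.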

I am still new to this, but I tried out a solution that helps here:
We can read the file in chunks as mentioned in the answers above, and to save memory we can create the reader inside the loop and release it at the end of each iteration, as shown below:
// Define how many rows we want to read for each "chunk"
$chunkSize = 1000;
// Loop to read our worksheet in "chunk size" blocks
for ($startRow = 1; $startRow <= $rawRows; $startRow += $chunkSize) {
// Create a new Reader of the type defined in $inputFileType
$reader = IOFactory::createReader($inputFileType);
// Create a new Instance of our Read Filter
$chunkFilter = new Chunk();
// Tell the Reader that we want to use the Read Filter that we've Instantiated
$reader->setReadFilter($chunkFilter);
// Tell the Read Filter, the limits on which rows we want to read this iteration
$chunkFilter->setRows($startRow, $chunkSize);
// Load only the rows that match our filter from $inputFileName to a PhpSpreadsheet Object
$spreadsheet = $reader->load($inputFileName);
// ... process the file here ...
// then release the memory: break the circular references between the
// spreadsheet and its worksheets before dropping the variables
$spreadsheet->disconnectWorksheets();
unset($spreadsheet, $reader);
}
For large sheets, this keeps memory usage to roughly one chunk at a time, which helps to stay under the memory limit.
Please let me know if this is helpful.

Related

Setting the column value by number in PHPExcel

I've got an array with specifications. I want each specification to become a column. I am having trouble with working this out though.
$specifications = new Specifications();
$columnCounter = 1;
foreach ($specifications as $specificationId => $specification) {
$column = PHPExcel_Cell::getColumnByNumber($columnCounter);
$objPHPExcel
->getActiveSheet()
->getColumnDimension($column)
->setAutoSize(true)
;
$objPHPExcel->setActiveSheetIndex(0)
->setCellValue($column.'1', $specification['value'])
;
$columnCounter++;
}
PHPExcel_Cell::getColumnByNumber() is of course an imaginary function. I am wondering how others do this and how best to approach it.
$book = new PHPExcel();
$book->setActiveSheetIndex(0);
$sheet = $book->getActiveSheet();
$sheet->setTitle('Sets');
$xls_row = 5;
$xls_col = 3;
foreach($specifications as $specificationId => &$specification)
{
$adr = coord($xls_row, $xls_col);
$sheet->setCellValueExplicit($adr, $specification->title, PHPExcel_Cell_DataType::TYPE_STRING);
$sheet->getColumnDimension(coord_x($xls_col))->setAutoSize(true);
$xls_col++;
}
// convert a 0-based coordinate value into EXCEL B1-format
function coord_x($x)
{
if($x<26) $x = chr(ord('A')+$x);
else
{
$x -= 26;
$c1 = $x % 26;
$c2 = intval(($x - $c1)/26);
$x = chr(ord('A')+$c2).chr(ord('A')+$c1);
}
return $x;
}
// convert X,Y 0-based cell address into EXCEL B1-format pair
function coord($y,$x)
{
return coord_x($x).($y+1);
}
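The coord_x() helper above covers one-letter and two-letter columns (up to ZZ). For wider sheets, a general conversion can be sketched as below. The function name columnLetters() is made up for this example; PHPExcel also has its own PHPExcel_Cell::stringFromColumnIndex(), which should be preferred when the library is loaded.

```php
<?php
// Convert a 0-based column index into Excel letters: 0 => A, 25 => Z, 26 => AA.
// columnLetters() is a hypothetical helper, not part of PHPExcel.
function columnLetters($index)
{
    $letters = '';
    $n = $index + 1; // switch to 1-based, because column names have no "zero" digit
    while ($n > 0) {
        $rem = ($n - 1) % 26;
        $letters = chr(ord('A') + $rem) . $letters;
        $n = intdiv($n - 1, 26);
    }
    return $letters;
}

foreach ([0, 25, 26, 701, 702] as $i) {
    echo $i, ' => ', columnLetters($i), PHP_EOL;
}
```

A cell address is then columnLetters($x) . ($y + 1) for 0-based $x and $y, the same convention as coord() above.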

Parsing file Excel from PHP

I have a PHP function that parses an Excel file. It reads the correct number of rows, but it doesn't return all the data in the object.
The file has 3247 lines, but the function returns just 1023.
This is the parsing function:
public function parseEquipement($filePath = null) {
set_time_limit(0);
$listEquipement = [];
$count = 0;
$chunkSize = 8192;
$objReader = PHPExcel_IOFactory::createReader(PHPExcel_IOFactory::identify($filePath));
$spreadsheetInfo = $objReader->listWorksheetInfo($filePath);
$chunkFilter = new \Floose\Parse\ChunkReadFilter();
$objReader->setReadFilter($chunkFilter);
$objReader->setReadDataOnly(true);
$chunkFilter->setRows(0, 1);
$objPHPExcel = $objReader->load($filePath);
$totalRows = $spreadsheetInfo[0]['totalRows'];
for ($startRow = 1; $startRow <= $totalRows; $startRow += $chunkSize) {
$chunkFilter->setRows($startRow, $chunkSize);
$objPHPExcel = $objReader->load($filePath);
$sheetData = $objPHPExcel->getActiveSheet()->toArray(null, null, true, false);
$startIndex = ($startRow == 1) ? $startRow : $startRow - 1;
if (!empty($sheetData) && $startRow < $totalRows) {
$dataToAnalyse = array_slice($sheetData, $startIndex, $chunkSize);
if($dataToAnalyse[0][0]==NULL){
break;
}
for ($i = 0; $i < $chunkSize; $i++) {
if ($dataToAnalyse[$i]['0'] != NULL) {
$listEquipement[] = new Article($dataToAnalyse[$i]['0'], '', $dataToAnalyse[$i]['1']);
$count++;
}
}
}
//echo($totalRows); // is best
//echo($count); // is wrong
//print_r($listEquipement);
$objPHPExcel->disconnectWorksheets();
unset($objPHPExcel, $sheetData);
}
return $listEquipement;
}
I replaced all the code with the following, but it doesn't work either:
public function parseEquipment($filePath = null) {
$objReader = PHPExcel_IOFactory::createReader(PHPExcel_IOFactory::identify($filePath));
$objReader->setReadDataOnly(true);
$objPHPExcel = $objReader->load($filePath);
$sheet = $objPHPExcel->getSheet(0);
$highestRow = $sheet->getHighestRow();
for ($row = 2; $row <= $highestRow; $row++){
echo $sheet->getCellByColumnAndRow(3, $row)->getCalculatedValue();
echo $sheet->getCellByColumnAndRow(4, $row)->getCalculatedValue();
echo $sheet->getCellByColumnAndRow(2,$row)->getCalculatedValue();
$listEquipement[] = new Article(
$sheet->getCellByColumnAndRow(3, $row)->getCalculatedValue(),
$sheet->getCellByColumnAndRow(4, $row)->getCalculatedValue(),
$sheet->getCellByColumnAndRow(2, $row)->getCalculatedValue()
);
}
}
When I run my code, it always fails with a memory-size error, even though my file is only 81K, and it displays the number of lines at the same time.
Fatal Error: Allowed Memory Size of 134217728 Bytes Exhausted (Tried to allocate 54byte)
Could anyone guide me on how I should write this code, or suggest another way to parse an Excel file?
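Whatever the cause of the missing rows turns out to be, one fragile spot in the first version is the inner loop that indexes $dataToAnalyse from 0 to $chunkSize - 1 even when fewer rows exist. The sketch below shows a safer way to pick the rows of one chunk. It assumes that the toArray() result is a 0-indexed list whose index 0 is sheet row 1, and chunkRows() is a hypothetical helper name; the sample data is invented.

```php
<?php
// Return the non-empty rows of one chunk from a toArray() result.
// Assumption: $sheetData is a 0-indexed list and index 0 holds sheet row 1.
function chunkRows(array $sheetData, $startRow, $chunkSize)
{
    // array_slice() stops at the end of the array, so a short last chunk is safe
    $rows = array_slice($sheetData, $startRow - 1, $chunkSize);
    // Drop padding rows whose first column is empty
    return array_values(array_filter($rows, function ($row) {
        return isset($row[0]) && $row[0] !== '';
    }));
}

// Simulated toArray() output: heading row, two empty padding rows, two data rows
$sheetData = [
    ['id', 'name'],
    [null, null],
    [null, null],
    ['A1', 'pump'],
    ['A2', 'valve'],
];
foreach (chunkRows($sheetData, 4, 100) as $row) {
    echo $row[0], ' ', $row[1], PHP_EOL;
}
```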

How to create a log file in PHP

I have 2 functions, each parsing one Excel file; the files are 673K and 131K. The functions have the same code and differ only in name.
One function reads the data from its Excel file and the other fails with:
Fatal Error: Allowed Memory Size of 134217728 Bytes Exhausted (Tried to allow 72 bytes)
I have other Excel files that are bigger than these, and their parsing functions work well.
I want to create a log file to record every action performed inside the system, but I have no idea how to do it. I did find a solution on Stack Overflow by @Lawrence Cherone.
But this is the first time I will create a log file, and I don't know how. Do I create a new file and put it in my project? How do I execute it, and how can I see the reason for the error in my function? Or do I put the logging code in the function where the error occurs, as in the solution proposed by @Lawrence Cherone?
That solution seems to have worked fine, and the problem there was resolved.
Below is the short code of my function; can you guide me on how to create this log file to debug it:
public function parseEquipment($filePath = null) {
set_time_limit(0);
$listEquipement = [];
$count = 0;
$chunkSize = 1024;
$objReader = PHPExcel_IOFactory::createReader(PHPExcel_IOFactory::identify($filePath));
$spreadsheetInfo = $objReader->listWorksheetInfo($filePath);
$chunkFilter = new \Floose\Parse\ChunkReadFilter();
$objReader->setReadFilter($chunkFilter);
$objReader->setReadDataOnly(true);
$chunkFilter->setRows(0, 1);
$objPHPExcel = $objReader->load($filePath);
$totalRows = $spreadsheetInfo[0]['totalRows'];
for ($startRow = 1; $startRow <= $totalRows; $startRow += $chunkSize) {
$chunkFilter->setRows($startRow, $chunkSize);
$objPHPExcel = $objReader->load($filePath);
$sheetData = $objPHPExcel->getActiveSheet()->toArray(null, null, true, false);
$startIndex = ($startRow == 1) ? $startRow : $startRow - 1;
if (!empty($sheetData) && $startRow < $totalRows) {
$dataToAnalyse = array_slice($sheetData, $startIndex, $chunkSize);
//echo 'test1';
if($dataToAnalyse[1][0]==NULL){
//echo 'test2';
break;
}
//echo 'test3';
//var_dump($sheetData);
for ($i = 0; $i < $chunkSize; $i++) {
if ($dataToAnalyse[$i]['0'] != NULL) {
//echo 'OK';
$listEquipement[] = new Article($dataToAnalyse[$i]['3'], $dataToAnalyse[$i]['4'], $dataToAnalyse[$i]['2']);
// echo 'test4';
$count++;
}
}
}
$objPHPExcel->disconnectWorksheets();
unset($objPHPExcel, $sheetData);
}
//var_dump(array_slice($sheetData, $startIndex, $chunkSize););
return $listEquipement;
}
error_log("You messed up!".$my_message, 3, "/var/tmp/custom-errors.log");
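Building on the error_log() call above, a small helper can record the memory usage at each step, which helps to see where a memory-limit error builds up. This is only a sketch: logStep() and the log file path are hypothetical names, and the range() call merely stands in for the real $objReader->load($filePath).

```php
<?php
// Append a timestamped line with current and peak memory usage to a log file.
// logStep() and the log path are hypothetical names chosen for this sketch.
function logStep($message, $logFile)
{
    $line = sprintf(
        "[%s] %s | memory: %.1f MB, peak: %.1f MB\n",
        date('Y-m-d H:i:s'),
        $message,
        memory_get_usage(true) / 1048576,
        memory_get_peak_usage(true) / 1048576
    );
    // Message type 3 appends $line to the file given as the third argument
    return error_log($line, 3, $logFile);
}

$logFile = sys_get_temp_dir() . '/parse-equipment.log';
logStep('before load', $logFile);
$data = range(1, 100000); // stand-in for $objReader->load($filePath)
logStep('after load', $logFile);
echo file_get_contents($logFile);
```

The file does not need to be created beforehand; error_log() creates it on the first call, as long as the PHP process can write to the directory. Calling logStep() inside the chunk loop shows whether memory grows from one chunk to the next.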

Copy an entire column using phpexcel

I am trying to copy an entire column, or copy its values, to another column. My script can determine the column that needs to be copied and the highest row value. Any suggestions?
I was having the same issue, and after several days of searching, all I had found was this topic [copy style and data in PHPExcel]. The code worked perfectly and is easy to understand, but just like you, I need to copy from column to column, not row to row. Then I figured out that copying a column is basically just "self-copying a cell to another cell index in a row". So here's the code; it is tested and it worked for me. Hope this can help.
/**
* Copy excel column to column
* @param $sheet: current active sheet
* @param $srcRow: source row
* @param $dstRow: destination row
* @param $height: number of rows to copy
* @param $width: number of columns to copy
**/
private function selfCopyRow(\PHPExcel_Worksheet $sheet, $srcRow, $dstRow, $height, $width)
{
for ($row = 0; $row < $height; $row++) {
for ($col = 0; $col < $width; $col++) {
$cell = $sheet->getCellByColumnAndRow($col, $srcRow + $row);
$style = $sheet->getStyleByColumnAndRow($col, $srcRow + $row);
$dstCell = \PHPExcel_Cell::stringFromColumnIndex(($width + $col)) . (string)($dstRow + $row);
$sheet->setCellValue($dstCell, $cell->getValue());
$sheet->duplicateStyle($style, $dstCell);
}
$h = $sheet->getRowDimension($srcRow + $row)->getRowHeight();
$sheet->getRowDimension($dstRow + $row)->setRowHeight($h);
}
// EN : Copy format
foreach ($sheet->getMergeCells() as $mergeCell) {
$mc = explode(":", $mergeCell);
$col_s = preg_replace("/[0-9]*/", "", $mc[0]);
$col_e = preg_replace("/[0-9]*/", "", $mc[1]);
$row_s = ((int)preg_replace("/[A-Z]*/", "", $mc[0])) - $srcRow;
$row_e = ((int)preg_replace("/[A-Z]*/", "", $mc[1])) - $srcRow;
if (0 <= $row_s && $row_s < $height) {
$merge = $col_s . (string)($dstRow + $row_s) . ":" . $col_e . (string)($dstRow + $row_e);
$sheet->mergeCells($merge);
}
}
}

How to read large worksheets from large Excel files (27MB+) with PHPExcel?

I have large Excel worksheets that I want to be able to read into MySQL using PHPExcel.
I am using the recent patch which allows you to read in Worksheets without opening the whole file. This way I can read one worksheet at a time.
However, one Excel file is 27MB large. I can successfully read in the first worksheet since it is small, but the second worksheet is so large that the cron job that started the process at 22:00 had not finished at 8:00 AM; the worksheet is simply too big.
Is there any way to read in a worksheet line by line, e.g. something like this:
$inputFileType = 'Excel2007';
$inputFileName = 'big_file.xlsx';
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
$worksheetNames = $objReader->listWorksheetNames($inputFileName);
foreach ($worksheetNames as $sheetName) {
//BELOW IS "WISH CODE":
for ($row = 1; $row <= $max_rows; $row += 100) {
$dataset = $objReader->getWorksheetWithRows($row, $row+100);
save_dataset_to_database($dataset);
}
}
Addendum
@mark, I used the code you posted to create the following example:
function readRowsFromWorksheet() {
$file_name = htmlentities($_POST['file_name']);
$file_type = htmlentities($_POST['file_type']);
echo 'Read rows from worksheet:<br />';
debug_log('----------start');
$objReader = PHPExcel_IOFactory::createReader($file_type);
$chunkSize = 20;
$chunkFilter = new ChunkReadFilter();
$objReader->setReadFilter($chunkFilter);
for ($startRow = 2; $startRow <= 240; $startRow += $chunkSize) {
$chunkFilter->setRows($startRow, $chunkSize);
$objPHPExcel = $objReader->load('data/' . $file_name);
debug_log('reading chunk starting at row '.$startRow);
$sheetData = $objPHPExcel->getActiveSheet()->toArray(null, true, true, true);
var_dump($sheetData);
echo '<hr />';
}
debug_log('end');
}
As the following log file shows, it runs fine on a small 8K Excel file, but when I run it on a 3 MB Excel file, it never gets past the first chunk. Is there any way I can optimize this code for performance? As it stands, it does not look performant enough to get chunks out of a large Excel file:
2011-01-12 11:07:15: ----------start
2011-01-12 11:07:15: reading chunk starting at row 2
2011-01-12 11:07:15: reading chunk starting at row 22
2011-01-12 11:07:15: reading chunk starting at row 42
2011-01-12 11:07:15: reading chunk starting at row 62
2011-01-12 11:07:15: reading chunk starting at row 82
2011-01-12 11:07:15: reading chunk starting at row 102
2011-01-12 11:07:15: reading chunk starting at row 122
2011-01-12 11:07:15: reading chunk starting at row 142
2011-01-12 11:07:15: reading chunk starting at row 162
2011-01-12 11:07:15: reading chunk starting at row 182
2011-01-12 11:07:15: reading chunk starting at row 202
2011-01-12 11:07:15: reading chunk starting at row 222
2011-01-12 11:07:15: end
2011-01-12 11:07:52: ----------start
2011-01-12 11:08:01: reading chunk starting at row 2
(...at 11:18, CPU usage at 93% still running...)
Addendum 2
When I comment out:
//$sheetData = $objPHPExcel->getActiveSheet()->toArray(null, true, true, true);
//var_dump($sheetData);
Then it parses at an acceptable speed (about 2 rows per second). Is there any way to increase the performance of toArray()?
2011-01-12 11:40:51: ----------start
2011-01-12 11:40:59: reading chunk starting at row 2
2011-01-12 11:41:07: reading chunk starting at row 22
2011-01-12 11:41:14: reading chunk starting at row 42
2011-01-12 11:41:22: reading chunk starting at row 62
2011-01-12 11:41:29: reading chunk starting at row 82
2011-01-12 11:41:37: reading chunk starting at row 102
2011-01-12 11:41:45: reading chunk starting at row 122
2011-01-12 11:41:52: reading chunk starting at row 142
2011-01-12 11:42:00: reading chunk starting at row 162
2011-01-12 11:42:07: reading chunk starting at row 182
2011-01-12 11:42:15: reading chunk starting at row 202
2011-01-12 11:42:22: reading chunk starting at row 222
2011-01-12 11:42:22: end
Addendum 3
This seems to work adequately, for instance, at least on the 3 MB file:
for ($startRow = 2; $startRow <= 240; $startRow += $chunkSize) {
echo 'Loading WorkSheet using configurable filter for headings row 1 and for rows ', $startRow, ' to ', ($startRow + $chunkSize - 1), '<br />';
$chunkFilter->setRows($startRow, $chunkSize);
$objPHPExcel = $objReader->load('data/' . $file_name);
debug_log('reading chunk starting at row ' . $startRow);
foreach ($objPHPExcel->getActiveSheet()->getRowIterator() as $row) {
$cellIterator = $row->getCellIterator();
$cellIterator->setIterateOnlyExistingCells(false);
echo '<tr>';
foreach ($cellIterator as $cell) {
if (!is_null($cell)) {
//$value = $cell->getCalculatedValue();
$rawValue = $cell->getValue();
debug_log($rawValue);
}
}
}
}
It is possible to read a worksheet in "chunks" using Read Filters, although I can make no guarantees about efficiency.
$inputFileType = 'Excel5';
$inputFileName = './sampleData/example2.xls';
/** Define a Read Filter class implementing PHPExcel_Reader_IReadFilter */
class chunkReadFilter implements PHPExcel_Reader_IReadFilter
{
private $_startRow = 0;
private $_endRow = 0;
/** Set the list of rows that we want to read */
public function setRows($startRow, $chunkSize) {
$this->_startRow = $startRow;
$this->_endRow = $startRow + $chunkSize;
}
public function readCell($column, $row, $worksheetName = '') {
// Only read the heading row, and the rows that are configured in $this->_startRow and $this->_endRow
if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow)) {
return true;
}
return false;
}
}
echo 'Loading file ',pathinfo($inputFileName,PATHINFO_BASENAME),' using IOFactory with a defined reader type of ',$inputFileType,'<br />';
/** Create a new Reader of the type defined in $inputFileType **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
echo '<hr />';
/** Define how many rows we want to read for each "chunk" **/
$chunkSize = 20;
/** Create a new Instance of our Read Filter **/
$chunkFilter = new chunkReadFilter();
/** Tell the Reader that we want to use the Read Filter that we've Instantiated **/
$objReader->setReadFilter($chunkFilter);
/** Loop to read our worksheet in "chunk size" blocks **/
/** $startRow is set to 2 initially because we always read the headings in row #1 **/
for ($startRow = 2; $startRow <= 240; $startRow += $chunkSize) {
echo 'Loading WorkSheet using configurable filter for headings row 1 and for rows ',$startRow,' to ',($startRow+$chunkSize-1),'<br />';
/** Tell the Read Filter, the limits on which rows we want to read this iteration **/
$chunkFilter->setRows($startRow,$chunkSize);
/** Load only the rows that match our filter from $inputFileName to a PHPExcel Object **/
$objPHPExcel = $objReader->load($inputFileName);
// Do some processing here
$sheetData = $objPHPExcel->getActiveSheet()->toArray(null,true,true,true);
var_dump($sheetData);
echo '<br /><br />';
}
Note that this Read Filter will always read the first row of the worksheet, as well as the rows defined by the chunk rule.
When using a read filter, PHPExcel still parses the entire file, but only loads those cells that match the defined read filter, so it only uses the memory required by that number of cells. However, it will parse the file multiple times, once for each chunk, so it will be slower. This example reads 20 rows at a time: to read line by line, simply set $chunkSize to 1.
This can also cause problems if you have formulae that reference cells in different "chunks", because the data simply isn't available for cells outside of the current "chunk".
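Because the file is parsed once per chunk, the number of passes grows as the chunk size shrinks. The row ranges and the resulting number of passes can be sketched as follows; chunkRanges() is a hypothetical helper and not part of PHPExcel, and row 1 is treated as the heading row, as in the example above.

```php
<?php
// List the [first, last] data rows of every chunk; row 1 is treated as the heading.
// chunkRanges() is a hypothetical helper, not part of PHPExcel.
function chunkRanges($totalRows, $chunkSize, $firstDataRow = 2)
{
    $ranges = [];
    for ($start = $firstDataRow; $start <= $totalRows; $start += $chunkSize) {
        // Clamp the last chunk to the final row of the sheet
        $ranges[] = [$start, min($start + $chunkSize - 1, $totalRows)];
    }
    return $ranges;
}

$ranges = chunkRanges(240, 20);
echo count($ranges), ' passes over the file', PHP_EOL;
foreach ($ranges as list($first, $last)) {
    echo "rows $first to $last", PHP_EOL;
}
```

Choosing a chunk size is therefore a trade-off: larger chunks mean fewer passes over the file but more cells held in memory at once.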
Currently, the best option for reading .xlsx, .csv and .ods is spreadsheet-reader (https://github.com/nuovo/spreadsheet-reader), because it can read these files without loading them entirely into memory. For the .xls extension it has limitations, because it uses PHPExcel for reading.
This is ChunkReadFilter.php:
<?php
Class ChunkReadFilter implements PHPExcel_Reader_IReadFilter {
private $_startRow = 0;
private $_endRow = 0;
/** Set the list of rows that we want to read */
public function setRows($startRow, $chunkSize) {
$this->_startRow = $startRow;
$this->_endRow = $startRow + $chunkSize;
}
public function readCell($column, $row, $worksheetName = '') {
// Only read the heading row, and the rows that are configured in $this->_startRow and $this->_endRow
if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow)) {
return true;
}
return false;
}
}
?>
And this is index.php, with a basic (not perfect) implementation at the end of the file.
<?php
require_once './Classes/PHPExcel/IOFactory.php';
require_once 'ChunkReadFilter.php';
class Excelreader {
/**
* This function is used to read data from excel file in chunks and insert into database
* @param string $filePath
* @param integer $chunkSize
*/
public function readFileAndDumpInDB($filePath, $chunkSize) {
echo("Loading file " . $filePath . " ....." . PHP_EOL);
/** Create a new Reader of the type that has been identified * */
$objReader = PHPExcel_IOFactory::createReader(PHPExcel_IOFactory::identify($filePath));
$spreadsheetInfo = $objReader->listWorksheetInfo($filePath);
/** Create a new Instance of our Read Filter * */
$chunkFilter = new ChunkReadFilter();
/** Tell the Reader that we want to use the Read Filter that we've Instantiated * */
$objReader->setReadFilter($chunkFilter);
$objReader->setReadDataOnly(true);
//$objReader->setLoadSheetsOnly("Sheet1");
//get header column name
$chunkFilter->setRows(0, 1);
echo("Reading file " . $filePath . PHP_EOL . "<br>");
$totalRows = $spreadsheetInfo[0]['totalRows'];
echo("Total rows in file " . $totalRows . " " . PHP_EOL . "<br>");
/** Loop to read our worksheet in "chunk size" blocks * */
/** $startRow is set to 1 initially because we always read the headings in row #1 * */
for ($startRow = 1; $startRow <= $totalRows; $startRow += $chunkSize) {
echo("Loading WorkSheet for rows " . $startRow . " to " . ($startRow + $chunkSize - 1) . PHP_EOL . "<br>");
$i = 0;
/** Tell the Read Filter, the limits on which rows we want to read this iteration * */
$chunkFilter->setRows($startRow, $chunkSize);
/** Load only the rows that match our filter from $inputFileName to a PHPExcel Object * */
$objPHPExcel = $objReader->load($filePath);
$sheetData = $objPHPExcel->getActiveSheet()->toArray(null, true, true, false);
$startIndex = ($startRow == 1) ? $startRow : $startRow - 1;
//dumping in database
if (!empty($sheetData) && $startRow < $totalRows) {
/**
* $this->dumpInDb(array_slice($sheetData, $startIndex, $chunkSize));
*/
echo "<table border='1'>";
foreach ($sheetData as $key => $value) {
$i++;
if ($value[0] != null) {
echo "<tr><td>id:$i</td><td>{$value[0]} </td><td>{$value[1]} </td><td>{$value[2]} </td><td>{$value[3]} </td></tr>";
}
}
echo "</table><br/><br/>";
}
$objPHPExcel->disconnectWorksheets();
unset($objPHPExcel, $sheetData);
}
echo("File " . $filePath . " has been uploaded successfully in database" . PHP_EOL . "<br>");
}
/**
* Insert data into database table
* @param Array $sheetData
* @return boolean
* @throws Exception
* THE METHOD FOR THE DATABASE IS NOT WORKING, JUST THE PUBLIC METHOD..
*/
protected function dumpInDb($sheetData) {
// DbAdapter is assumed to return a mysqli connection
$con = DbAdapter::getDBConnection();
$query = "INSERT INTO employe(name,address) VALUES ";
for ($i = 1; $i < count($sheetData); $i++) {
$query .= "('" . mysqli_real_escape_string($con, $sheetData[$i][0]) . "',"
. "'" . mysqli_real_escape_string($con, $sheetData[$i][1]) . "'),";
}
// Remove the trailing comma left by the loop
$query = rtrim($query, ",");
$query .= " ON DUPLICATE KEY UPDATE name=VALUES(name), address=VALUES(address)";
if (mysqli_query($con, $query)) {
mysqli_close($con);
return true;
} else {
// Read the error message before closing the connection
$error = mysqli_error($con);
mysqli_close($con);
throw new Exception($error);
}
}
/**
* This function returns list of files corresponding to given directory path
* @param String $dataFolderPath
* @return Array list of file
*/
protected function getFileList($dataFolderPath) {
if (!is_dir($dataFolderPath)) {
throw new Exception("Directory " . $dataFolderPath . " does not exist");
}
$root = scandir($dataFolderPath);
$fileList = array();
foreach ($root as $value) {
if ($value === '.' || $value === '..') {
continue;
}
if (is_file("$dataFolderPath/$value")) {
$fileList[] = "$dataFolderPath/$value";
continue;
}
}
return $fileList;
}
}
$inputFileName = './prueba_para_batch.xls';
$excelReader = new Excelreader();
$excelReader->readFileAndDumpInDB($inputFileName, 500);
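For the database step, an alternative to escaping every value by hand is to build the multi-row INSERT with placeholders and hand the values over separately. The following is a sketch under assumptions: buildInsert() is a hypothetical helper, and the table and column names are simply taken from the example above.

```php
<?php
// Build one multi-row INSERT with "?" placeholders plus the flat parameter list.
// buildInsert() is a hypothetical helper; table and columns follow the example above.
function buildInsert(array $rows)
{
    $tuples = [];
    $params = [];
    foreach ($rows as $row) {
        $tuples[] = '(?, ?)';
        $params[] = $row[0]; // name
        $params[] = $row[1]; // address
    }
    $sql = 'INSERT INTO employe (name, address) VALUES ' . implode(', ', $tuples)
        . ' ON DUPLICATE KEY UPDATE name = VALUES(name), address = VALUES(address)';
    return [$sql, $params];
}

list($sql, $params) = buildInsert([['Ann', '1 Main St'], ["O'Neil", '2 Oak Ave']]);
echo $sql, PHP_EOL;
print_r($params);
```

The resulting pair can be passed to a prepared statement, for example with PDO as $pdo->prepare($sql)->execute($params), so that quotes in the data never become part of the SQL text.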
