Remove empty columns in an excel file using phpexcel library - php

I am using the PHPExcel library to read an Excel file. It works perfectly 99% of the time, but sometimes it also reads empty columns. My code is:
try {
    $inputFileType = PHPExcel_IOFactory::identify($inputFileName);
    $objReader = PHPExcel_IOFactory::createReader($inputFileType);
    $objPHPExcel = $objReader->load($inputFileName);
} catch (Exception $e) {
    die('Error loading file');
}
$worksheet = $objPHPExcel->getActiveSheet();
$worksheetTitle = $worksheet->getTitle();
$highestRow = $worksheet->getHighestRow();
$highestColumn = $worksheet->getHighestColumn();
$highestColumnIndex = PHPExcel_Cell::columnIndexFromString($highestColumn);
// Only valid for single-letter columns (A to Z)
$nrColumns = ord($highestColumn) - 64;
Sometimes $highestColumn returns 'WVL' even though the data in the Excel sheet only goes up to column 'C'. Why?
I also want to check whether all the rows under a particular column are empty. Is there an easy way to do this without iterating over every row in a for loop?

The getHighestRow() and getHighestColumn() methods work on the basis of testing for anything related to a cell, even if that's a style setting or a named range or print settings or a column/row setting such as width/height or hidden.
That's why the getHighestDataRow() and getHighestDataColumn() methods exist. These two methods look at the actual data in cells.
Note: Just because a cell looks empty when you view it in MS Excel, doesn't mean that it actually is empty. NULL is a valid cell value, as is a space character, neither of which is visible.
In answer to your second question: you can pass an optional argument to the getHighestRow(), getHighestColumn() and to the getHighestDataRow() and getHighestDataColumn(), so a row number passed to getHighestColumn() or getHighestDataColumn() will return the highest column in the specified row; and a column letter passed to getHighestRow() or getHighestDataRow() will return the highest row in that column.
e.g.
$highestColumnInRow5 = $worksheet->getHighestColumn(5);
or
$highestDataRowInColumnAA = $worksheet->getHighestDataRow('AA');
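To make the distinction between "highest cell" and "highest data cell" concrete, here is a minimal sketch in Python rather than PHP. It is not PHPExcel's implementation: the `cells` dictionary, the helper names, and the rule that only `None` and `''` count as blank are all assumptions made for illustration.

```python
# Hypothetical in-memory sheet: {(column_letter, row_number): value}.
# This illustrates the idea only; it is not how PHPExcel stores cells.


def is_blank(value):
    """Assumption of this sketch: None and '' are blank, a space is not."""
    return value is None or value == ""


def highest_row(cells, column):
    """Highest row that has any cell record in the column (0 if none)."""
    return max((row for (col, row) in cells if col == column), default=0)


def highest_data_row(cells, column):
    """Highest row in the column whose value is not blank (0 if none)."""
    return max(
        (row for (col, row), value in cells.items()
         if col == column and not is_blank(value)),
        default=0,
    )


def column_is_empty(cells, column):
    """True when no cell in the column holds data."""
    return highest_data_row(cells, column) == 0


cells = {
    ("A", 1): "name",
    ("A", 2): "alice",
    ("B", 1): None,   # e.g. a styled cell with no value
    ("B", 7): "",     # another cell record without data
    ("C", 3): " ",    # a single space is still a value
}

print(highest_row(cells, "B"))       # 7
print(highest_data_row(cells, "B"))  # 0
print(column_is_empty(cells, "B"))   # True
print(column_is_empty(cells, "C"))   # False
```

In PHPExcel terms, the equivalent check for one column would be to compare the result of getHighestDataRow() for that column against the first data row, instead of looping over every cell yourself.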

Related

PHPSpreadsheet: how to get the number of loaded rows?

How do I find out how many rows I have loaded using PHPSpreadsheet\Reader\Xlsx::load() method?
I cannot find methods (or properties) for getting row count in Spreadsheet or Worksheet classes either.
BTW I am using following code:
$filename = 'test.xlsx';
$inputFileType = \PhpOffice\PhpSpreadsheet\IOFactory::identify($filename);
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
$reader->setReadDataOnly(true);
$reader->setLoadSheetsOnly($sheet);
$this->spreadsheet = $reader->load($filename);
$this->worksheet = $this->spreadsheet->getActiveSheet();
Use the worksheet's getHighestRow() method:
$highestRow = $this->spreadsheet->getActiveSheet()->getHighestRow();
Or use getHighestDataRow() if you're only interested in rows where cells contain data, and not in any blank rows at the end of the worksheet.

Read excel sheet containing merged cells using PHPExcel

I want to read an excel sheet completely and using AJAX send each row to another page for processing. So I have used the following code for converting the excel sheet data into JSON array(Reference PHPExcel example provided in Library):
<?php
error_reporting(E_ALL);
set_time_limit(0);
date_default_timezone_set('Asia/Kolkata');
set_include_path(get_include_path() . PATH_SEPARATOR . 'PHPExcel-1.8/Classes/');
require_once 'PHPExcel/IOFactory.php';
$inputFileType = PHPExcel_IOFactory::identify($fileLocation);
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
$objReader->setLoadSheetsOnly("SHEETNAME");
$objPHPExcel = $objReader->load($fileLocation);
$data = $objPHPExcel->getActiveSheet()->toArray(null,true,true,true);
?>
Here $fileLocation is the location of the uploaded file, which is read so that its rows can be sent individually to another page using AJAX.
I am using $data in javascript as
DataToBeUploaded=<?php echo json_encode($data);?>;
But the excel sheet contains some merged cells so PHPExcel is not able to read the values in these merged cells. Hence values in these cells are read as NULL.
Is there a way where I can use the merged cells' upper left cell value for all of the subsequent cells? (Actually in my case cells are merged vertically only)
Eg.
I have (Assume rows are numbered from 1 and columns from A)
Here PHPExcel reads this as:
$data[1][A]='abc'
$data[1][B]='123'
$data[2][A]=''
$data[2][B]='456'
$data[3][A]=''
$data[3][B]='789'
I want the snippet to result in these values:
$data[1][A]='abc'
$data[1][B]='123'
$data[2][A]='abc'
$data[2][B]='456'
$data[3][A]='abc'
$data[3][B]='789'
Referring to https://github.com/PHPOffice/PHPExcel/issues/643
I have written the following snippet:
$referenceRow = array();
$sheet = $objPHPExcel->getActiveSheet();
for ($row = 2; $row <= $noOfBooks; $row++) {
    for ($col = 0; $col < 7; $col++) {
        $cell = $sheet->getCellByColumnAndRow($col, $row);
        if (!$cell->isInMergeRange() || $cell->isMergeRangeValueCell()) {
            // Cell is not merged, or it is the cell that holds the merged value
            $data[$row][$col] = $cell->getCalculatedValue();
            // Store the value in $referenceRow so that merged rows below can reuse it for this column
            $referenceRow[$col] = $data[$row][$col];
        } else {
            // Cell is part of a merge range: reuse the value stored for this column in an earlier iteration
            $data[$row][$col] = $referenceRow[$col];
        }
    }
}
This will give the result exactly as required
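The same fill-down idea can also be applied after the fact to an array shaped like the toArray() output shown in the question. The following is a minimal sketch in Python rather than PHP; `fill_down` is a hypothetical helper, and it assumes that a blank in the listed columns always means "covered by a vertical merge", which the PHP snippet above does not need to assume because it asks each cell via isInMergeRange().

```python
def fill_down(rows, columns):
    """Copy the last non-empty value of each listed column into blanks below it.

    rows maps row numbers to {column_letter: value}; the input is not modified.
    """
    last_seen = {}
    filled = {}
    for row_number in sorted(rows):
        new_row = dict(rows[row_number])
        for col in columns:
            value = new_row.get(col)
            if value is None or value == "":
                # Assumption: a blank here is part of a vertical merge.
                if col in last_seen:
                    new_row[col] = last_seen[col]
            else:
                last_seen[col] = value
        filled[row_number] = new_row
    return filled


data = {
    1: {"A": "abc", "B": "123"},
    2: {"A": "", "B": "456"},
    3: {"A": None, "B": "789"},
}
result = fill_down(data, ["A"])
print(result[2]["A"], result[3]["A"])  # abc abc
```

Because this version cannot tell a merged cell from a genuinely empty one, it is only suitable for columns where empty values are not expected.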

Is it possible to import and export an Excel file of size 70MB using the PHPExcel library?

I have an Excel file with 3 columns, in which the 2nd column contains an email hyperlink. I have to import this file and export it with only 2 columns: the first should contain the name and the second the email, which means I have to split that hyperlink into name and email.
For a 31MB file I changed the memory limit to 2048MB and the execution time to 1200 in php.ini. With that I could successfully import and export the 31MB file, but for the 70MB file execution takes a very long time and fails with the following error message:
Fatal error: Allowed memory size of 2147483648 bytes exhausted (tried to allocate 15667514 bytes) in /var/www/html/PHPExcel/Reader/Excel2007.php on line 327
Is it possible to import and export an Excel file of size 70MB using the PHPExcel library? And what do I have to change in php.ini, such as the memory limit and max execution time?
require "PHPExcel.php";
require "PHPExcel/IOFactory.php";
$inputFileName = 'xxx.xlsx';
$inputFileType = PHPExcel_IOFactory::identify($inputFileName);
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
$objReader->setReadDataOnly(true);
$objPHPExcel = $objReader->load($inputFileName);
$outputObj = new PHPExcel();
// Get worksheet dimensions
$sheet = $objPHPExcel->getSheet(0);
$highestRow = $sheet->getHighestRow();
$outputObj->setActiveSheetIndex(0);
$outSheet = $outputObj->getActiveSheet();
// Loop through each row of the worksheet in turn
for ($row = 2; $row <= $highestRow; $row++) { // As row 1 seems to be header
    // Read cell B2, B3, etc.
    $line = $sheet->getCell('B' . $row)->getValue();
    preg_match("|([^\.]+)\ <([^>]+)>|", $line, $data);
    if (!empty($data)) {
        // $data[1] will be name & $data[2] will be email
        $outSheet->setCellValue('A' . $row, $data[1]);
        $outSheet->setCellValue('B' . $row, $data[2]);
    }
}
$objWriter = new PHPExcel_Writer_CSV($outputObj);
$objWriter->save("xxx.csv");
NOTE: Can I export the Excel file without making any changes to php.ini?
I found a solution: I managed to do this task in Python. Hopefully it will help someone. :)
# Time taken: 2 min 4 sec for a 69.9MB file.
import csv
import re
from openpyxl import load_workbook
location = 'big.xlsx'
wb = load_workbook(filename=location, read_only=True)
users_data = []
# pattern = '^(.+?) <([^>].+)>$' # matches "your name <email@email.com>"
# pattern_new = '^(.+?)<([^>].+)>$' # matches "your name<email@email.com>"
# pattern_email = '([\w.-]+@[\w.-]+)' # extracts email from sentence
# Define patterns to check on string.
patterns = [r'^(.+?) <([^>].+)>$', r'^(.+?)<([^>].+)>$']
# Loop through all sheets in XLSX.
for wsheet in wb.sheetnames:
    # Load data from Sheet.
    ws = wb[wsheet]
    # Loop through each row in current Sheet.
    for row in ws.rows:
        # We need column B data, so get that directly.
        # Check that the row has a column B and that it is not empty.
        if len(row) > 1 and row[1].value:
            val = ""
            # Get column B data as text and undo the "(at)"/"(dot)" obfuscation.
            data = str(row[1].value).replace("(at)", "@").replace("(dot)", ".")
            # Loop through all patterns to match in current data.
            for pattern in patterns:
                # Apply regex on data.
                name_data = re.search(pattern, data)
                # If match found, keep name and email and stop searching this row.
                if name_data:
                    val = [name_data.group(1), name_data.group(2)]
                    break
            # If no matches found, check for only email, if not then use data as it is.
            if not val:
                name_data = re.search(r'([\w.-]+@[\w.-]+)', data)
                if name_data:
                    val = [name_data.group(1)]
                else:
                    val = [data]
            # Append new data to users_data list.
            users_data.append(val)
# Open CSV file for writing (text mode with newline='' as the csv module expects).
with open('big.csv', 'w', newline='', encoding='utf-8') as myfile:
    wr = csv.writer(myfile, dialect='excel', delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL, lineterminator='\n')
    # Write each entry as one CSV row.
    for word in users_data:
        wr.writerow(word)
@Priyanka, you can also try using Spout: https://github.com/box/spout. It works great for large files! You won't have to change your php.ini file, as it won't require more than 10MB of memory and should finish before the default time limit.
You can do something like this:
$filePath = 'xxx.xlsx';
$reader = ReaderFactory::create(Type::XLSX);
$reader->open($filePath);
$writer = WriterFactory::create(Type::CSV);
$writer->openToFile('xxx.csv');
$rowCount = 0;
while ($reader->hasNextSheet()) {
    $reader->nextSheet();
    while ($reader->hasNextRow()) {
        $row = $reader->nextRow();
        $rowCount++;
        if ($rowCount === 1) {
            continue; // that's for the header row
        }
        // get the values you need in the current row
        // for example:
        $name = $row[1];
        $email = $row[2];
        // write the data to the CSV file
        $writer->addRow([$name, $email]);
    }
}
$reader->close();
$writer->close();
Give it a try! Hopefully it will solve your problem :)
I don't see the point in loading one spreadsheet file, copying everything from it to a second, and then saving the second; that will be memory and performance intensive.
Why not just load the first, delete heading row 1, then save it as your CSV output?
// Read the original spreadsheet
$inputFileName = 'TraiDBDump.xlsx';
$inputFileType = PHPExcel_IOFactory::identify($inputFileName);
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
$objReader->setReadDataOnly(true);
$objPHPExcel = $objReader->load($inputFileName);
// Remove header row
$objPHPExcel->getSheet(0)->removeRow(1, 1);
// Save as a csv file
$objWriter = new PHPExcel_Writer_CSV($objPHPExcel);
$objWriter->save("TraiDBDump.csv");
If your original has a lot of columns and you only need A and B, you could use a read filter to read only those two columns.
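Whichever reader is used, the per-row work in this thread is small: skip the header, split column B into name and email, and write two columns. The following is a minimal sketch of that step in Python rather than PHP. `split_name_email`, `convert`, and the sample rows are hypothetical, and the in-memory iterator merely stands in for a streaming spreadsheet reader.

```python
import csv
import io
import re

# "Some Name <user@example.com>" -> ("Some Name", "user@example.com")
NAME_EMAIL = re.compile(r"^\s*(.+?)\s*<([^<>\s]+)>\s*$")


def split_name_email(text):
    """Return (name, email), or None when the text does not match."""
    if not isinstance(text, str):
        return None
    match = NAME_EMAIL.match(text)
    return (match.group(1), match.group(2)) if match else None


def convert(rows, out_file):
    """Write name/email pairs taken from column B; return the rows written."""
    writer = csv.writer(out_file, lineterminator="\n")
    written = 0
    for index, row in enumerate(rows):
        if index == 0:
            continue  # skip the header row
        pair = split_name_email(row[1]) if len(row) > 1 else None
        if pair:
            writer.writerow(pair)
            written += 1
    return written


# Stand-in for a streaming reader: any iterable of rows works here.
sample_rows = iter([
    ["id", "contact", "notes"],
    [1, "Jane Doe <jane@example.com>", "x"],
    [2, "no email here", "y"],
    [3, "Bob<bob@example.org>", "z"],
])
buffer = io.StringIO()
count = convert(sample_rows, buffer)
print(count)
print(buffer.getvalue())
```

Since only one row is held at a time, memory use in this step does not grow with the size of the input; whether the whole job stays small depends on the reader that feeds it.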

PHPExcel How to get only 1 cell value?

I would think that a getCell($x, $y) or getCellValue($x, $y) would be available so one could easily pick a certain value. This can be useful, for example to cross-check data prior to a larger process.
How do you get a specific value from, say, cell C3?
I do not want an array of values to sort through.
Section 4.5.2 of the developer documentation
Retrieving a cell by coordinate
To retrieve the value of a cell, the cell should first be retrieved from the worksheet using the getCell method. A cell’s value can be read again using the following line of code:
$objPHPExcel->getActiveSheet()->getCell('B8')->getValue();
Section 4.5.4 of the developer documentation
Retrieving a cell by column and row
To retrieve the value of a cell, the cell should first be retrieved from the worksheet using the getCellByColumnAndRow method. A cell’s value can be read again using the following line of code:
// Get cell B8
$objPHPExcel->getActiveSheet()->getCellByColumnAndRow(1, 8)->getValue();
If you need the calculated value of a cell, use the following code. This is further explained in 4.4.35.
// Get cell B8
$objPHPExcel->getActiveSheet()->getCellByColumnAndRow(1, 8)->getCalculatedValue();
By far the simplest - and it uses normal Excel co-ordinates:
// Assuming $sheet is a PHPExcel_Worksheet
$value = $sheet->getCell( 'A1' )->getValue();
You can separate the co-ordinates out in a function if you like:
function getCell( PHPExcel_Worksheet $sheet, /* string */ $x = 'A', /* int */ $y = 1 ) {
    return $sheet->getCell( $x . $y );
}
// eg:
getCell( $sheet, 'B', 2 )->getValue();
This is a source-based answer; feel free to improve or comment.
function toNumber($dest)
{
    // getCellByColumnAndRow() uses zero-based column indexes: 'A' => 0, 'B' => 1
    if ($dest)
        return ord(strtolower($dest)) - 97;
    else
        return 0;
}
function myFunction($s, $x, $y) {
    $x = toNumber($x);
    return $s->getCellByColumnAndRow($x, $y)->getFormattedValue();
}
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
$objPHPExcel = $objReader->load($inputFileName);
$objPHPExcel->setActiveSheetIndex(0);
$sheetData = $objPHPExcel->getActiveSheet();
$cellData = myFunction($sheetData, 'B', '2');
var_dump($cellData);
This does not work past the letter Z, and could be improved but works for my needs.
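The limitation past Z comes from treating the column as a single character. Column names are effectively a base-26 numbering without a zero digit, so a general conversion has to process every letter. Here is a minimal sketch of that arithmetic in Python rather than PHP; `column_index` and `column_letters` are hypothetical helper names. Within PHPExcel itself, PHPExcel_Cell::columnIndexFromString(), which appears in the first question on this page, is the existing conversion to use.

```python
def column_index(letters, zero_based=False):
    """Convert 'A', 'Z', 'AA', ... to a column number (bijective base 26)."""
    if not letters or not letters.isalpha() or not letters.isascii():
        raise ValueError(f"not a column reference: {letters!r}")
    index = 0
    for ch in letters.upper():
        # Each letter is a digit from 1 (A) to 26 (Z).
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1 if zero_based else index


def column_letters(index):
    """Inverse of column_index for 1-based indexes."""
    if index < 1:
        raise ValueError("index must be >= 1")
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


print(column_index("B", zero_based=True))  # 1
print(column_index("AA"))                  # 27
print(column_index("WVL"))                 # 16132
print(column_letters(16132))               # WVL
```

The `zero_based` flag is there because the two conventions are easy to mix up: the getCellByColumnAndRow() examples quoted above address column B as 1, so A is 0 there.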

To upload Excel and store it in database?

I want to upload an Excel file to our web page and store the corresponding data in a database. Then I want to retrieve all the data and display it in table format. I have some code, but with it I can't upload all Excel files; only a single format can be uploaded.
Below is the function, but it has some restrictions.
public function check_excel($filename)
{
    $path='./assets/uploads/excel/'.$filename;
    $this->load->library('excel');
    $inputFileType = PHPExcel_IOFactory::identify($path);
    $objReader = PHPExcel_IOFactory::createReader($inputFileType);
    $objPHPExcel = PHPExcel_IOFactory::load($path);
    $sheet = $objPHPExcel->getSheet(0);
    $highestRow = $sheet->getHighestRow();
    $highestColumn = $sheet->getHighestColumn();
    $xf[]='';
    $result[]='';
    $first_check='';
    $var_check=0;
    for ($row = 13; $row <= $highestRow; $row++)
    {
        $xf[$row]=$objPHPExcel->getActiveSheet()->getCell('A'.$row)->getXfIndex(); // Get sheet index value
        if($row>13 && $row<16) //This block check first kpi data expand or not
        {
            if($xf[$row-1]==$xf[$row]) //check parent and child sheet index value same
                $first_check='false';
            if ($row==15)
            {
                if($xf[$row]==$xf[$row-1] || $xf[$row]==$xf[$row-2]) // check the grand-child sheet index value same in parent and child
                    $first_check='false';
                else
                {
                    $first_check='true';
                    $a=$row-2;
                    $b=$row-1;
                    $check_kpi=$objPHPExcel->getActiveSheet()->getCell('A'.$a)->getXfIndex();
                    $check_unit=$objPHPExcel->getActiveSheet()->getCell('A'.$b)->getXfIndex();
                    $check_sub_unit=$objPHPExcel->getActiveSheet()->getCell('A'.$row)->getXfIndex();
                }
            }
        }
        if($first_check=='true') //This block check second kpi to upto last kpi data expand or not
        {
            if($row>15)
            {
                if($var_check==1) // This block check the child data expand or not
                {
                    if($check_unit!=$objPHPExcel->getActiveSheet()->getCell('A'.$row)->getXfIndex())
                    {
                        $result[$row]='false';
                        break;
                    }
                }
                if($var_check==2) // this block check the grand - child data expand or not
                {
                    if($check_sub_unit!=$objPHPExcel->getActiveSheet()->getCell('A'.$row)->getXfIndex())
                    {
                        $result[$row]='false';
                        break;
                    }
                }
                if($xf[$row]!=$check_sub_unit)
                {
                    if($xf[$row]!=$check_unit)
                        $var_check=1; // var_check value is one, the kpi is present
                    else
                        $var_check=2; // var_check value is two, the unit is present
                }
                else
                    $var_check=0; // var_check value is zero, the sub_unit is present
            }
        }
        else if($first_check=='false')
        {
            $result[$row]='false';
            break;
        }
    }
    $return='true';
    for ($row = 13; $row <= $highestRow; $row++)
    {
        if(!empty($result[$row]))
        {
            if($result[$row]=='false'){
                $return='false';
                break;
            }
        }
    }
    return $return;
}
It sounds like you are using a relational DB (e.g. MySQL, Postgres, etc), which uses fixed column tables.
You should probably use a Document-based DB (e.g. CouchDB, Mongo, etc). This would be the best solution.
But, if you're stuck using a relational DB, you can use an EAV model.
Here is a basic example:
Create a table for the entity (excel file): EntityID, ExcelFileName
Create a table for the attribute (column info): AttributeID, EntityID, AttributeName
Create a table for the value (excel row/column): ValueID, RowNumber, AttributeID, AttributeValue
The downside is that the AttributeValue isn't specifically typed (it's just varchar/text). You can solve this by adding an "AttributeType" to the attribute table that is then used in your code to know what type of data that column should contain. BUT, unless you know the contents/format of the Excel file in advance, you'll probably have to GUESS what the type of a column is, which isn't hard as long as the Excel file isn't messed up.
If you're just displaying the data that was imported, this probably isn't a big deal.
There are other (more complex) ways to implement EAV, including one with typed columns, if you have such a need.
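The three-table EAV layout described above can be tried out quickly. The following is a minimal sketch in Python using the standard-library sqlite3 module instead of MySQL or Postgres. The column names follow the answer; the table names `entity`, `attribute`, and `cell_value`, and the helpers `store_sheet` and `load_sheet`, are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (
    EntityID INTEGER PRIMARY KEY,
    ExcelFileName TEXT NOT NULL
);
CREATE TABLE attribute (
    AttributeID INTEGER PRIMARY KEY,
    EntityID INTEGER NOT NULL REFERENCES entity(EntityID),
    AttributeName TEXT NOT NULL
);
CREATE TABLE cell_value (
    ValueID INTEGER PRIMARY KEY,
    RowNumber INTEGER NOT NULL,
    AttributeID INTEGER NOT NULL REFERENCES attribute(AttributeID),
    AttributeValue TEXT
);
""")


def store_sheet(conn, file_name, header, rows):
    """Store one sheet: one entity, one attribute per column, one value per cell."""
    cur = conn.execute(
        "INSERT INTO entity (ExcelFileName) VALUES (?)", (file_name,))
    entity_id = cur.lastrowid
    attribute_ids = []
    for name in header:
        cur = conn.execute(
            "INSERT INTO attribute (EntityID, AttributeName) VALUES (?, ?)",
            (entity_id, name))
        attribute_ids.append(cur.lastrowid)
    for row_number, row in enumerate(rows, start=1):
        for attribute_id, cell in zip(attribute_ids, row):
            # Values are stored untyped, as text.
            conn.execute(
                "INSERT INTO cell_value (RowNumber, AttributeID, AttributeValue)"
                " VALUES (?, ?, ?)",
                (row_number, attribute_id, None if cell is None else str(cell)))
    return entity_id


def load_sheet(conn, entity_id):
    """Rebuild the rows as dicts keyed by column name."""
    query = """
        SELECT v.RowNumber, a.AttributeName, v.AttributeValue
        FROM cell_value v JOIN attribute a ON a.AttributeID = v.AttributeID
        WHERE a.EntityID = ?
        ORDER BY v.RowNumber, a.AttributeID
    """
    rows = {}
    for row_number, name, value in conn.execute(query, (entity_id,)):
        rows.setdefault(row_number, {})[name] = value
    return [rows[n] for n in sorted(rows)]


sheet_id = store_sheet(conn, "report.xlsx", ["name", "qty"],
                       [["apple", 3], ["pear", 5]])
print(load_sheet(conn, sheet_id))
```

Note that `qty` comes back as the strings '3' and '5', which is the untyped-value downside mentioned above.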
Have you tried PHPExcel?
They also have a codeigniter library.
And this post might interest you : how to use phpexcel to read data and insert into database?
You can use PHPExcel of course, but have a look at other data formats. Using comma-separated or tab-separated values can help you solve your problem easily, and Excel can save datasheets in these simple formats. In any case, you cannot save formulas or conditional formatting in your database. Moreover, CSV is much faster and more robust, and you can import CSV files with a LOAD DATA INFILE query.
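As a rough illustration of how little code the CSV route needs, here is a minimal sketch in Python that uses the standard-library csv and sqlite3 modules in place of MySQL's LOAD DATA INFILE. The `import_csv` helper, the table name, and the sample data are assumptions for illustration; every column is created as TEXT, and rows with the wrong number of fields are skipped.

```python
import csv
import io
import sqlite3


def quote_identifier(name):
    """Quote a table or column name; header names come from the file."""
    return '"' + name.replace('"', '""') + '"'


def import_csv(conn, table, csv_file):
    """Create `table` from the CSV header and insert every data row as text."""
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = ", ".join(quote_identifier(name) + " TEXT" for name in header)
    conn.execute(f"CREATE TABLE {quote_identifier(table)} ({columns})")
    placeholders = ", ".join("?" for _ in header)
    # Rows whose field count differs from the header are skipped.
    rows = [row for row in reader if len(row) == len(header)]
    conn.executemany(
        f"INSERT INTO {quote_identifier(table)} VALUES ({placeholders})", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
sample = io.StringIO("name,qty\napple,3\npear,5\nbroken\n")
inserted = import_csv(conn, "uploads", sample)
print(inserted)
print(conn.execute('SELECT name, qty FROM "uploads"').fetchall())
```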
