PHPExcel dumping Excel contents onto Screen - php

I am using PHPExcel for the first time. I just wrote a basic code snippet to read one of my Excel files. I want to load the file, iterate through each row and process its contents. However, the function to load the file seems to dump its contents onto the screen.
My code snippet is this:
include 'lib/Classes/PHPExcel/IOFactory.php';
$dest = "uploads/";
$excel = "2012-12-STANDARD.xls";
$objPHPExcel = PHPExcel_IOFactory::load($dest.$excel);
On running this code, the data in the sheet is echoed onto the screen twice: first like a regular echo, and then as a var_dump.
Here is a sample snippet of the screen output:
DOM ELEMENT: HTML DOM ELEMENT: BODY DOM ELEMENT: P START OF
PARAGRAPH: END OF PARAGRAPH: FLUSH CELL: A1 => Type ZipCode City
State County AreaCode CityType CityAliasAbbreviation CityAliasName
Latitude Longitude TimeZone Elevation CountyFIPS DayLightSaving
PreferredLastLineKey ClassificationCode MultiCounty StateFIPS
CityStateKey CityAliasCode PrimaryRecord CityMixedCase
CityAliasMixedCase StateANSI CountyANSI FacilityCode
CityDeliveryIndicator CarrierRouteRateSortation FinanceNumber
UniqueZIPName D 18540 ......
array 1 =>
array
'A' => string 'Type ZipCode City State County AreaCode CityType CityAliasAbbreviation CityAliasName Latitude Longitude TimeZone
Elevation CountyFIPS DayLightSaving PreferredLastLineKey
ClassificationCode MultiCounty StateFIPS CityStateKey CityAliasCode ........'... (length=4573)
Am I doing something wrong here? Why would the load function echo the contents before accessing it?

The cause of the problem is that the IOFactory static load method fails to determine your file's format correctly. You might not want to use it after all, because according to the documentation:
While easy to implement in your code, and you don't need to worry
about the file type; this isn't the most efficient method to load a
file; and it lacks the flexibility to configure the loader in any way
before actually reading the file into a PHPExcel object.
To load the file successfully, you can instantiate a Reader and specify the format explicitly. For a file in Excel 2007 format it would be:
$xl_reader = PHPExcel_IOFactory::createReader('Excel2007');
$xl = $xl_reader->load("/tmp/yourfile.xls");
You can also use a Reader's canRead() method to determine whether the reader that you created can load the specified file:
$xl_reader = PHPExcel_IOFactory::createReader('Excel2007');
if ($xl_reader->canRead('/tmp/yourfile.xls')) {
echo "It's a success! Loading the file...";
$xl = $xl_reader->load('/tmp/yourfile.xls');
...
} else {
echo "Cannot read the file.";
...
}

That code is picking up on the HTML Reader, which still has some of my diagnostics in the code (mea culpa)... if you edit the file Classes/PHPExcel/Reader/HTML.php and comment out every line that contains an echo or a var_dump statement, then it should eliminate the problem.
Coincidence that it's something I was actually working on last night.
Then you can also ask the person who provided you with the file to give you a proper .xls file in future, rather than one which has an extension of .xls but contains html markup rather than a properly formatted BIFF file.

Try adding PHPExcel to the PHP include path:
set_include_path(get_include_path().PATH_SEPARATOR.'lib/Classes/PHPExcel');
include 'lib/Classes/PHPExcel/IOFactory.php';
$dest = "uploads/";
$excel = "2012-12-STANDARD.xls";
$objPHPExcel = PHPExcel_IOFactory::load($dest.$excel);

Related

Detect an edited csv file using PHPExcel

I am using PHPExcel to validate csv files before parsing them and storing them in my database and on my server. I am trying to use the file properties to determine whether the file has been modified or is the original file. I have used the following for .xls and .xlsx with great results (using the appropriate reader):
$file = $_FILES['file']['tmp_name'];
$reader = new PHPExcel_Reader_CSV();
if($reader->canRead($file)){
$object = $reader->load($file);
$created = $object->getProperties()->getCreated();
$modified = $object->getProperties()->getModified();
if($created !== $modified){ // note: !$created===$modified would negate $created first and never match
//File has been edited and cannot be used
}else{
//File is good, continue processing
}
}
However, when using CSV files, NOTHING is working as expected. I renamed an MS-Word doc to .csv: passed. Edited a csv: passed. Even used a .jpg: passed. What on earth am I missing? Any help would be greatly appreciated! Edit: I should note that $created and $modified are an exact match in var_dump($object), despite my having edited the file and confirmed the changes within the document properties.
The properties values accessible from PHPExcel are those stored within the file itself, not within the directory entries for that file.
CSV files don't have any inherent properties of their own; CSV is purely a raw data file format. These property methods are for accessing the properties that do exist in other spreadsheet formats such as BIFF (xls) and OfficeOpenXML (xlsx), which do support them. Loading a CSV (or another format that doesn't support properties) into PHPExcel will provide default values for those properties (so that calls like the ones you're making won't trigger fatal errors), but it cannot provide actual values for something that doesn't natively exist in the format being loaded.

PHPExcel: How to check whether a XLS file is valid or not?

I'm using PHPExcel 1.7.8 to read .xls files uploaded by a random user. Everything works properly with a valid .xls file, but now I want to run some tests with invalid files to check whether the program displays good error messages.
So I took a .csv file, and renamed it with .xls (without converting anything, just changing the name) to the end, just to check...
Broken! :)
DOM ELEMENT: HTML
DOM ELEMENT: BODY
DOM ELEMENT: P
START OF PARAGRAPH:
END OF PARAGRAPH:
FLUSH CELL: A1 => block,date,hour...
array
1 =>
array
'A' => string 'block,date,hour...' (length=2777)
{"step":"error","errors":[],"warnings":[]}
As you can see, an error message is displayed that I didn't ask for, followed by the JSON that I usually write.
It happens on this line :
<?php
echo "Loading file\n";
try {
if (!($objPHPExcel = PHPExcel_IOFactory::load('path'))) {
echo "Failed\n";
return;
// ...
}
} catch(Exception $e) {
echo 'Exception !';
}
echo "Done\n";
And this code displays:
Loading file
/!\ ERROR MESSAGE ABOVE /!\
Done
My question is, is there a way with PHPExcel or anything else to check whether a file is a valid XLS file before I try to parse it?
Thank you.
Even though this question is more than a year old, I still found it cumbersome to figure out how to deal with this issue, so I'll post my answer here.
If a try/catch block doesn't work (in my case, I renamed a jpg file to xls and the error handler didn't fire; instead of throwing an exception, the invalid file just raised a warning), you can consider checking manually with canRead(), as Mark said. Here's an example of how to use this function.
If you know what your file types are, you can define them manually and check against them:
$valid = false;
$types = array('Excel2007', 'Excel5');
foreach ($types as $type) {
$reader = PHPExcel_IOFactory::createReader($type);
if ($reader->canRead($file_path)) {
$valid = true;
break;
}
}
if ($valid) {
// TODO: load file
// e.g. PHPExcel_IOFactory::load($file_path)
} else {
// TODO: show error message
}
Hope this helps anyone with the same problem.
Each reader in PHPExcel has a canRead() method that validates that the file passed to the reader is of the appropriate format for that reader; the method returns a simple boolean true or false. A return of true from a call to the canRead() method of the PHPExcel_Reader_Excel5 class confirms that the file can be read by that reader, irrespective of the file extension.
The IOFactory identify() method uses this call, testing against the Readers for each supported format in turn until it gets a true return from the canRead() call. The IOFactory load() method, in its turn, uses identify() to determine which Reader should be used for the specified file.
The ability to verify a filetype (without depending on the file extension which can often be misleading) is particularly useful when you want to set additional arguments for the reader.
The fallback from identify()/load() is slightly less satisfactory: if canRead() returns false for all other Readers, then the file is treated as a CSV.
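As a complement to canRead(), here is a minimal sketch that looks at a file's leading bytes before handing it to any Reader. It does not use PHPExcel at all; the function name sniffSpreadsheetContainer() and its return labels are hypothetical. It only narrows things down: an OLE2 or ZIP container can hold other document types as well, so canRead() is still the check that matters.

```php
<?php
// Hypothetical helper (not part of PHPExcel): classify a file by its first bytes.
function sniffSpreadsheetContainer($path)
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return 'unreadable';
    }
    $head = fread($handle, 8);
    fclose($handle);
    if ($head === false) {
        return 'unreadable';
    }

    // OLE2 compound document signature: the container used by BIFF .xls files.
    if ($head === "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1") {
        return 'ole2';
    }
    // ZIP local file header: the container used by OfficeOpenXML .xlsx files.
    if (strncmp($head, "PK\x03\x04", 4) === 0) {
        return 'zip';
    }
    // Anything else (CSV, HTML, JPEG, ...) is not a native Excel container.
    return 'other';
}
```

A renamed .csv or .jpg would come back as 'other', which can be rejected up front instead of letting load() fall through to the CSV or HTML Reader.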

phpexcel - using it with excel template (chart goes missing) php

I have tried to use PHPExcel with my own template file. PHPExcel loads the file, writes data to some cells (A2, A3, A4, for example) and opens an output file with the new data.
My template file has a built-in chart. All I want PHPExcel to do is populate values in cells and leave the chart untouched, then open the new file. (Please note that I don't want to build the chart in code; I want the chart to pre-exist in my template in the same format as I originally created it.) Only the data should update.
But when I try to do this, the chart itself goes missing from the resulting file. I tried various ways and still failed.
Then I found the following code at http://phpexcel.codeplex.com/discussions/397263
require_once 'Classes/PHPExcel.php';
/** PHPExcel_IOFactory */
include 'Classes/PHPExcel/IOFactory.php';
$target ='Results/';
$fileType = 'Excel2007';
$InputFileName = $target.'Result.xlsx';
$OutputFileName = $target . '_Result.xlsx';
//Read the file (including chart template)
$objReader = PHPExcel_IOFactory::createReader($fileType);
$objReader->setIncludeCharts(TRUE);
$objPHPExcel = $objReader->load($InputFileName);
//Change the file
$objPHPExcel->setActiveSheetIndex(0)
// Add data
->setCellValue('C3','10' )
->setCellValue('C4','20' )
->setCellValue('C5','30')
->setCellValue('C5','40' );
//Write the file (including chart)
PHPExcel_Settings::setZipClass(PHPExcel_Settings::PCLZIP);
$objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel, $fileType);
$objWriter->setIncludeCharts(TRUE);
$objWriter->save($OutputFileName);
The above code works with Excel 2010 and now keeps my chart intact, but when I try to use the file type "Excel5" it doesn't work.
It throws the following error:
Fatal error: Call to undefined method PHPExcel_Reader_Excel5::setIncludeCharts()
in D:\IT\bfstools\PHPExcel\MyExamples\test1.php on line 16
Please provide a simple solution that lets my template file work with both .xls and .xlsx and keeps every original chart in the template intact. I do not want the chart removed from the resulting file, and I do not plan to create the chart using PHPExcel code. (Why write unnecessary code when Excel can do all the work for you?)
I want the easiest way out, which is to use everything within my template and just populate cells with new data, so that the existing chart in the template updates automatically. I don't want to write unnecessary code when I can safely rely on the Excel template and its charting functions.
Please help.
There's a very good reason for this:
Charting is only implemented in core, and for the Excel2007 Readers and Writers at this point in time, so all of the other readers or writers will ignore charts, treat them as though they simply don't exist. The intention is to roll out charting to the other readers/writers over the coming year.
EDIT
I see from your comment that you don't understand how PHPExcel works at all, so I have a lot of explaining to do.
PHPExcel is not a library for "editing" workbook files: you're not using PHPExcel to change a file, you're changing a PHPExcel object that can be loaded from a file, and can subsequently be written to a file.
PHPExcel Core is an in-memory representation of the spreadsheet, with the different constituent objects such as worksheets, cells, images, styles, etc all represented as PHP Objects.
The PHPExcel Readers parse a spreadsheet file and load all the components from a file that they have been programmed to recognise, and create the appropriate PHPExcel core objects from those file components. If there is no equivalent PHPExcel Core object (such as Pivot Tables), then that file component can't be "loaded"; if the loader hasn't been programmed to recognise a file component, then it can't be loaded. In these cases, those elements from the file are simply ignored. Once the Reader has done its job, a PHPExcel object exists, and the spreadsheet file is closed and forgotten.
When a PHPExcel Core object exists in memory, you have a set of methods allowing you to manipulate and change it, to add, modify or delete Core elements; but these work purely on the "in memory" collection of worksheet, cell, style objects that comprise the PHPExcel Core. The Core exists without knowledge of having been loaded from a file or having been created using a PHP "new PHPExcel()" statement; it makes no changes to files in any way.
When writing, the reverse is true. Each Writer takes the PHPExcel core objects, and writes them to a file in the appropriate format (Excel BIFF, OfficeOpenXML, HTML, etc). Like the Readers, each writer can only write those PHPExcel Core objects that it has been programmed to write. If it has not been programmed to write (for example, charts) then any charts defined in the PHPExcel Core will be ignored because that writer simply doesn't know how to write them yet. Likewise, features that exist in PHPExcel Core that are not supported by the file format that is being written to (such as cell styles for the CSV Writer) are ignored.
So to support a spreadsheet feature such as charts, it is necessary for the PHPExcel Core object collection to have been modified to provide an "in memory" representation of those elements, and for the different Readers to have been programmed to recognise those elements in the file they are loading and to convert them to the appropriate PHPExcel Core objects, and for the different Writers to have been programmed to convert the PHPExcel core representation to the appropriate file representation.
Each Reader and each Writer needs to be programmed individually. Charts is a relatively new feature, only added to the PHPExcel Core in the 1.7.7 release, and at this point only the Reader and Writer for the Excel2007 format have been programmed to recognise chart elements.
While it is the intention of the developers to extend this to cover the other formats as well, the necessary code isn't created automagically. Programming each individual Reader and Writer takes time and effort. While the Chart code for the Excel2007 Reader and Writer has now stabilised to the point where it is now no longer considered "experimental", and development focus is turning to writing the necessary code for chart handling in the Excel5 Reader and Writer, it is work that has not yet been completed.
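Since only some Readers and Writers offer setIncludeCharts(), a small guard avoids the "Call to undefined method" fatal error shown in the question. This is a minimal sketch; the helper name enableChartsIfSupported() is hypothetical and not part of PHPExcel. It simply checks whether the object passed in has the method before calling it.

```php
<?php
// Hypothetical helper: turn on chart handling only where the object offers it.
function enableChartsIfSupported($readerOrWriter)
{
    if (is_object($readerOrWriter) && method_exists($readerOrWriter, 'setIncludeCharts')) {
        $readerOrWriter->setIncludeCharts(true);
        return true;
    }
    // No chart support in this Reader/Writer: charts will be ignored.
    return false;
}
```

Passing it the result of PHPExcel_IOFactory::createReader($fileType) would enable charts for 'Excel2007'. For 'Excel5' it would return false, which tells the calling code that charts in the template will be dropped rather than preserved.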
If you can use Golang, try Excelize. It supports saving a file without losing the original charts of the XLSX.
Try calling setIncludeCharts():
$objReader = PHPExcel_IOFactory::createReader('Excel2007');
// Tell the reader to include charts when it loads a file
$objReader->setIncludeCharts(TRUE);
// Load the file
$objPHPExcel = $objReader->load($filePath);

Is it possible to read a pdf file as a txt?

I need to find a certain key in a pdf file. As far as I know, the only way to do that is to interpret the pdf as a txt file. I want to do this in PHP without installing an addon, framework, etc.
Thanks
You can certainly open a PDF file as text. The PDF file format is actually a collection of objects. There is a header in the first line that tells you the version. You would then go to the bottom to find the offset to the start of the xref table that tells where all the objects are located. The contents of individual objects in the file, like graphics, are often binary and compressed. The 1.7 specification can be found here.
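The two landmarks described above, the version header and the startxref offset, can be picked out of the raw bytes with plain string functions. This is a minimal sketch under the assumption that the whole file has already been read into a string; the function name readPdfOutline() and the array keys are hypothetical. It does not parse the xref table or decompress any objects.

```php
<?php
// Hypothetical helper: extract the header version and the startxref offset from raw PDF bytes.
function readPdfOutline($data)
{
    $result = array('version' => null, 'xrefOffset' => null);

    // The header, e.g. "%PDF-1.7", sits at the very start of the file.
    if (preg_match('/\A%PDF-(\d+\.\d+)/', $data, $m)) {
        $result['version'] = $m[1];
    }

    // The last "startxref" keyword is followed by the byte offset of the cross-reference data.
    $pos = strrpos($data, 'startxref');
    if ($pos !== false && preg_match('/\Astartxref\s+(\d+)/', substr($data, $pos), $m)) {
        $result['xrefOffset'] = (int) $m[1];
    }

    return $result;
}
```

Either value stays null when the corresponding marker is missing, which is a quick hint that the file is not a PDF or is truncated.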
I found this function, hope it helps.
http://community.livejournal.com/php/295413.html
You can't just open the file, as it is a binary dump of the objects used to create the PDF display, including encoding, fonts, text and images. I wrote a blog post explaining how text is stored at http://pdf.jpedal.org/java-pdf-blog/bid/27187/Understanding-the-PDF-file-format-text-streams
Thank you all for your help. I owe you this piece of code:
// Proceed if file exists
if(file_exists($sourcePath)){
$pdfFile = fopen($sourcePath,"rb");
$data = fread($pdfFile, filesize($sourcePath));
fclose($pdfFile);
// Check if file is encrypted or not
// stripos() returns false when there is no match, so compare strictly
if(stripos($data,$searchFor) !== false){ // $searchFor = "/Encrypt"
$counterEncrypted++;
}else{
$counterNotEncrypted++;
}
}else{
$counterNotExisting++;
}

Open an excel file using COM and save it as .xml file

I'm trying the following code:
<?php
$workbook = "D:\b2\\test.XLS";
$sheet = "Sheet1";
#Instantiate the spreadsheet component.
$ex = new COM("Excel.sheet") or Die ("Did not connect");
#Get the application name and version
print "Application name:{$ex->Application->value}<BR>" ;
print "Loaded version: {$ex->Application->version}<BR>";
#Open the workbook that we want to use.
$wkb = $ex->application->Workbooks->Open($workbook) or Die ("Did not open");
#Create a copy of the workbook, so the original workbook will be preserved.
$ex->Application->ActiveWorkbook->SaveAs("D:\b2\Ourtest.xml");
#$ex->Application->Visible = 1; #Uncomment to make Excel visible.
#Optionally, save the modified workbook
$ex->Application->ActiveWorkbook->SaveAs("D:\Ourtest.xml");
#Close all workbooks without questioning
$ex->application->ActiveWorkbook->Close("False");
unset ($ex);
?>
This actually works and creates the Ourtest.xml file. But I'm getting characters like:
ÐÏࡱá > þÿ þÿÿÿ
I have tried with SaveAs("D:\Ourtest.pdf") and it says the file has been corrupted or incorrectly decoded.
Can anyone help me, please? Thanks.
That is because you are saving it in Excel format. Check the Excel documentation for how to save it as an XML document; it might be a separate parameter to SaveAs or a different method altogether (Export*).
EDIT: It seems SaveAs is the right method to use. Check the MSDN documentation here. You probably want to specify the second parameter, FileFormat. Maybe setting it to the XlFileFormat.xlXMLSpreadsheet value?
Problem solved by using
$ex->Application->ActiveWorkbook->SaveAs("D:\Ourtest.xml",46);
46 is the value for xlXMLSpreadsheet
Thanks everyone
