I'm importing a file with a .dot extension, but I get an error:
ZipArchive::locateName(): Invalid or uninitialized Zip object
The weird thing is that a .dotx file does work. Is .dot not supported anymore?
From its README.md:
The current version of PHPWord supports Microsoft Office Open XML (OOXML or OpenXML), OASIS Open Document Format for Office Applications (OpenDocument or ODF), Rich Text Format (RTF), HTML, and PDF.
Office Open XML is the newer Microsoft Office file format consisting of a zip archive containing several other files. .dotx files are Office Open XML files.
In contrast, .dot files are not zip archives. They're for legacy versions of Microsoft Office.
Office Open XML is an open specification, making it much easier to use than legacy Office formats by third-party applications. This is probably one of the reasons that PHPWord supports it but doesn't support legacy file types.
I suggest saving your .dot file as a .dotx file in a modern version of Word and then working with the .dotx version of the file in your PHP code.
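To see which kind of file you actually have before handing it to a library, you can look at the container rather than the extension. The following is a minimal sketch in Python (the same idea carries over to PHP); the function name and the returned labels are made up for illustration. It assumes legacy .dot/.doc files are OLE2 compound files, which start with the 8-byte signature D0 CF 11 E0 A1 B1 1A E1, and that OOXML files are ordinary zip archives.

```python
import zipfile

# Signature at the start of OLE2 compound files (legacy Word 97-2003 formats).
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def classify_office_container(path):
    """Return 'ole', 'zip' or 'unknown' for the file at `path`.

    Hypothetical helper: it only inspects the container type and does not
    prove that the content is a valid Word template.
    """
    with open(path, "rb") as fh:
        if fh.read(8) == OLE_MAGIC:
            return "ole"  # legacy binary format, not a zip archive
    if zipfile.is_zipfile(path):
        return "zip"  # possibly OOXML (.dotx, .docx, ...)
    return "unknown"
```

A file that comes back as 'ole' will fail in any reader that tries to open it as a zip archive, which matches the ZipArchive error in the question.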
Related
I have heard of PHPExcel for working with Excel documents in PHP. However, do I need MS Office on the Ubuntu system in order to work with and serve Excel documents? Or can PHPExcel serve Excel documents on its own on any system?
PHPExcel is pure PHP; it has no requirement to access MS Office in any way.
From the readme of the PHPExcel repo on github
PHPExcel is a library written in pure PHP and providing a set of classes that allow you to write to and read from different spreadsheet file formats, like Excel (BIFF) .xls, Excel 2007 (OfficeOpenXML) .xlsx, CSV, Libre/OpenOffice Calc .ods, Gnumeric, PDF, HTML, ... This project is built around Microsoft's OpenXML standard and PHP.
Requirements
PHP version 5.2.0 or higher
PHP extension php_zip enabled (required if you need PHPExcel to handle .xlsx .ods or .gnumeric files)
PHP extension php_xml enabled
PHP extension php_gd2 enabled (optional, but required for exact column width autocalculation)
I have a problem using the PHPExcel API.
It takes too long to fill data into an existing Excel template,
so I want to do it in pure PHP without using any library.
How can I fill data into an Excel template with pure PHP?
Please give me some advice. Thanks :)
Old xls files use a proprietary and quite complicated binary format, also known as Excel BIFF. You can find:
reverse engineered specification here: http://www.openoffice.org/sc/excelfileformat.pdf
Microsoft's public specification here: [MS-XLS]: Excel Binary File Format (.xls) Structure (PDF) and here: [MS-XLS]: Excel Binary File Format (.xls) Structure (HTML)
New xlsx files use a "standardized" open format. An xlsx file is basically a zip file (rename it to *.zip and extract it) with a few XML files inside.
Some general information is available at http://en.wikipedia.org/wiki/Office_Open_XML
More detailed documentation is available from
MSDN: Office → Dev Center → Open XML SDK → Understanding the Open XML file formats
and from Ecma International → Ecma Office Open XML File Formats Standard
Still, even the new file format is quite complicated if you want to be able to do everything. In that case, reusing several man-years of development effort (including debugging) in the form of an existing PHP library, as suggested by @mark-baker, is reasonable.
If you just need to do a specific task, e.g. populate existing xlsx template file with some data then you only need
PHP functions for copying files
PHP functions for working with zip files
PHP functions for working with XML files
and the documentation (from the links above), or executable documentation in the form of Excel.exe
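The copy/zip/XML recipe above can be sketched roughly as follows. This is an illustration in Python rather than PHP, and only a sketch under assumptions: the function names are made up, the sheet is assumed to live at xl/worksheets/sheet1.xml (the usual location, but not guaranteed), and the target cell must already exist in the template. Instead of copying the file and editing it in place, it streams every entry of the template into a new archive and rewrites only the sheet part. Note that re-serialising with ElementTree may rename namespace prefixes other than the default one, which can matter for sheets that use extension namespaces.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespace used by worksheet parts in .xlsx files.
NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
ET.register_namespace("", NS)  # keep it as the default namespace on output


def _set_numeric_value(sheet_xml, cell_ref, value):
    """Return the sheet XML with the existing cell `cell_ref` set to a number."""
    root = ET.fromstring(sheet_xml)
    for cell in root.iter(f"{{{NS}}}c"):
        if cell.get("r") == cell_ref:
            cell.attrib.pop("t", None)  # without a type, the cell is numeric
            for child in list(cell):    # drop any old value or formula
                cell.remove(child)
            ET.SubElement(cell, f"{{{NS}}}v").text = str(value)
            return ET.tostring(root, encoding="utf-8", xml_declaration=True)
    raise KeyError(f"cell {cell_ref} not found in template sheet")


def fill_numeric_cell(template_path, output_path, cell_ref, value,
                      sheet_part="xl/worksheets/sheet1.xml"):
    """Copy the template to `output_path`, changing one numeric cell.

    `template_path` and `output_path` must be different files.
    """
    with zipfile.ZipFile(template_path) as src, \
            zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for name in src.namelist():
            data = src.read(name)
            if name == sheet_part:
                data = _set_numeric_value(data, cell_ref, value)
            dst.writestr(name, data)
```

Text values are more work than numbers because Excel normally stores strings in a separate shared-strings part, which is one reason a full library becomes attractive once the task grows.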
EDIT: Better links to the specifications for both the old and the new Excel file formats were provided by Mark Baker.
I'm developing a system where users will be uploading documents. It will only support MS Office files 2003 and earlier so it currently barfs if the user uploads any of the x files (docx, xlsx, pptx).
I've found PHPWord in another SO question, but this does the exact opposite of what I need and also is still in what looks to be early beta.
Worst case scenario, are there any good libraries for conversion from these same MSO-X formats to PDFs?
Is there any alternative for PHP_excel which can "Export to XLSX/XLS" file in a customized format?
This is a General Reference question for the php tag
For Writing Excel
PEAR's PHP_Excel_Writer (xls only)
php_writeexcel from Bettina Attack (xls only)
XLS File Generator commercial and xls only
Excel Writer for PHP from Sourceforge (spreadsheetML only)
Ilia Alshanetsky's Excel extension now on github (xls and xlsx, and requires commercial libXL component)
PHP's COM extension (requires a COM enabled spreadsheet program such as MS Excel or OpenOffice Calc running on the server)
The Open Office alternative to COM (PUNO) (requires Open Office installed on the server with Java support enabled)
PHP-Export-Data by Eli Dickinson (Writes SpreadsheetML - the Excel 2003 XML format, and CSV)
Oliver Schwarz's php-excel (SpreadsheetML)
Oliver Schwarz's original version of php-excel (SpreadsheetML)
excel_xml (SpreadsheetML, despite its name)... link reported as broken
The tiny-but-strong (tbs) project includes the OpenTBS tool for creating OfficeOpenXML documents (OpenDocument and OfficeOpenXML formats)
SimpleExcel Claims to read and write Microsoft Excel XML / CSV / TSV / HTML / JSON / etc formats
KoolGrid xls spreadsheets only, but also doc and pdf
PHP_XLSXWriter OfficeOpenXML
PHP_XLSXWriter_plus OfficeOpenXML, fork of PHP_XLSXWriter
php_writeexcel xls only (looks like it's based on PEAR SEW)
spout OfficeOpenXML (xlsx) and CSV
Slamdunk/php-excel (xls only) looks like an updated version of the old PEAR Spreadsheet Writer
For Reading Excel
php-spreadsheetreader reads a variety of formats (.xls, .ods and .csv)
PHP-ExcelReader (xls only)
PHP_Excel_Reader (xls only)
PHP_Excel_Reader2 (xls only)
XLS File Reader Commercial and xls only
SimpleXLSX From the description it reads xlsx files, though the author constantly refers to xls
PHP Excel Explorer Commercial and xls only
Ilia Alshanetsky's Excel extension now on github (xls and xlsx, and requires commercial libXL component)
PHP's COM extension (requires a COM enabled spreadsheet program such as MS Excel or OpenOffice Calc running on the server)
The Open Office alternative to COM (PUNO) (requires Open Office installed on the server with Java support enabled)
Nuovo's spreadsheet-reader (csv, xls, xlsx, and ods)
SimpleExcel Claims to read and write Microsoft Excel XML / CSV / TSV / HTML / JSON / etc formats
PHPExcleReader Is just a ZIP with an old version of PHPExcel
Akeneo Labs Spreadsheet Parser OfficeOpenXML (.xlsx) and CSV files
spout OfficeOpenXML (xlsx) and CSV
xhook's php-spreadsheetreader Claims to do most formats
phpexcellib from SIMITGROUP is a new C++ Excel extension for PHP, though you'll need to build it yourself, and the docs are pretty sparse about what functionality it offers (I can't even find out from the site what formats it supports, or whether it reads or writes or both; I'm guessing both).
All claim to be faster than PHPExcel from codeplex or from github, but (with the exception of COM, PUNO, Ilia's wrapper around libXL, and spout) they don't offer both reading and writing, or both xls and xlsx; may no longer be supported; and (while I haven't tested Ilia's extension) only COM and PUNO offer the same degree of control over the created workbook.
I wrote a very simple class for exporting to "Excel XML", aka SpreadsheetML. It's not quite as convenient for the end user as XLSX (depending on file extension and Excel version, they may get a warning message), but it's a lot easier to work with than XLS or XLSX.
http://github.com/elidickinson/php-export-data
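To give an idea of why SpreadsheetML is easy to work with, here is a minimal sketch of a writer. It is not the class from the link above and it is written in Python rather than PHP; the function name is made up, and it only handles a single sheet with numbers and strings (no styles, dates, or formulas).

```python
from xml.sax.saxutils import escape, quoteattr


def rows_to_spreadsheetml(rows, sheet_name="Sheet1"):
    """Render an iterable of rows as a SpreadsheetML (Excel 2003 XML) string."""
    out = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<?mso-application progid="Excel.Sheet"?>',
        '<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"',
        ' xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">',
        f'<Worksheet ss:Name={quoteattr(sheet_name)}>',
        '<Table>',
    ]
    for row in rows:
        out.append('<Row>')
        for value in row:
            # bool is a subclass of int in Python, so exclude it explicitly
            is_number = (isinstance(value, (int, float))
                         and not isinstance(value, bool))
            kind = "Number" if is_number else "String"
            out.append(
                f'<Cell><Data ss:Type="{kind}">{escape(str(value))}</Data></Cell>'
            )
        out.append('</Row>')
    out += ['</Table>', '</Worksheet>', '</Workbook>']
    return "\n".join(out)
```

The whole document is one plain XML file, so there is no zip container or shared-strings table to manage, which is the trade-off described above.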
I have a web project where I must import text and images from a user-supplied document, and one of the possible formats is Microsoft Office 2007. There's also a need to generate documents in this format.
The server runs CentOS 5.2 and has PHP/Perl/Python installed. I can execute local binaries and shell scripts if I must. We use Apache 2.2 but will be switching over to Nginx once it goes live.
What are my options? Anyone had experience with this?
The Office 2007 file formats are open and well documented. Roughly speaking, all of the new file formats ending in "x" are zip compressed XML documents. For example:
To open a Word 2007 XML file:
Create a temporary folder in which to store the file and its parts.
Save a Word 2007 document, containing text, pictures, and other elements, as a .docx file.
Add a .zip extension to the end of the file name.
Double-click the file. It will open in the ZIP application. You can see the parts that comprise the file.
Extract the parts to the folder that you created previously.
The other file formats are roughly similar. I don't know of any open source libraries for interacting with them as yet - but depending on your exact requirements, it doesn't look too difficult to read and write simple documents. Certainly it should be a lot easier than with the older formats.
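As a rough illustration of how little is needed to read a simple document, here is a sketch in Python that lists the parts of a .docx and pulls out the paragraph text. The function names are made up, and it assumes the main document part is stored at word/document.xml, which is where Word normally puts it (strictly speaking the location is declared in the package relationships). It ignores tables, headers, footnotes, and images.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, in ElementTree's {uri}tag notation.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_parts(path):
    """Return the names of all parts (zip entries) inside the package."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


def docx_paragraphs(path, main_part="word/document.xml"):
    """Return the plain text of each paragraph in the main document part."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read(main_part))
    paragraphs = []
    for para in root.iter(W + "p"):
        # A paragraph's text is split over runs; join all <w:t> fragments.
        paragraphs.append("".join(t.text or "" for t in para.iter(W + "t")))
    return paragraphs
```

Images live in their own parts (typically under word/media/), so importing them means reading those zip entries as well.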
If you need to read the older formats, OpenOffice has an API and can read and write Office 2003 and older documents with more or less success.
The Python docx module can generate formatted Microsoft Office docx files from pure Python. Out of the box, it does headers, paragraphs, tables, and bullets, but the makeelement() function can be extended to do arbitrary elements like images.
from docx import *
document = newdocument()
# This location is where most document content lives
docbody = document.xpath('/w:document/w:body',namespaces=wordnamespaces)[0]
# Append two headings
docbody.append(heading('Heading',1) )
docbody.append(heading('Subheading',2))
docbody.append(paragraph('Some text'))
I have successfully used the OpenXML Format SDK in a project to modify an Excel spreadsheet via code. This would require .NET and I'm not sure about how well it would work under Mono.
You can probably check the code for Sphider. They index docs and PDFs, so I'm sure they can read them. It might also lead you in the right direction for other Office formats.