Is there any quick way to read/write an ODF file from PHP?
---- edit -----
I need this for an ODS (spreadsheet) file.
http://www.opendocumentphp.org/ seems to be unmaintained, so I didn't even bother.
http://www.phpcraparchive.org/browse/package/4398.html (ods-php) gives a fatal error (it tries to allocate more than 512 MB) when opening the file.
http://www.odtphp.com/index.php?i=dev&p=Odf is for writing documents only, from what I saw (I also need to read from spreadsheets).
http://sourceforge.net/projects/php-o3-template/ (PHP ODF templates) is pre-alpha, so again I did not try it.
There seem to be multiple. OpenDocumentPHP looks most promising:
http://www.opendocumentphp.org/
http://www.odtphp.com/index.php?i=dev&p=Odf
http://sourceforge.net/projects/php-o3-template/
http://en.wikipedia.org/wiki/OpenDocument_software
http://www.phpcraparchive.org/browse/package/4398.html
Evaluate and report back! ;}
OpenTBS is a PHP tool that can create OpenDocument files (ODT, ODS, ODF, ODG, ...) using templates. It is also able to read data in OpenDocument files.
Related
I'm using PHPExcel and I have a problem: when creating a reader object I get this error:
Fatal error: Class 'PHPExcel_Reader_excel.php' not found in C:\xampp\htdocs\phpexcel\Classes\PHPExcel\IOFactory.php on line 170
My code is:
<?php
require_once(dirname(__FILE__)."/Classes/phpexcel.php");
//or
require_once(dirname(__FILE__)."/Classes/PHPExcel/IOFactory.php");
//$phpexcel = new PHPExcel();
$reader = PHPExcel_IOFactory::createReader("excel.php");
?>
I checked IOFactory.php at line 170 and found this:
$searchType = 'IReader';
// Include class
foreach (self::$_searchLocations as $searchLocation) {
    if ($searchLocation['type'] == $searchType) {
        $className = str_replace('{0}', $readerType, $searchLocation['class']);
        $instance = new $className();
        if ($instance !== NULL) {
            return $instance;
        }
    }
}
But it cannot locate any class, because the class names use _ instead of / (the path is phpexcel\Classes\PHPExcel\Reader, and it contains files like excel5.php and excel2007.php, but no excel.php).
What is wrong? The documentation is a little bit confusing.
Unless you've added a custom reader of your own called PHPExcel_Reader_excel.php then this will return an error.
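To see where the class name in the error comes from, here is a minimal sketch of the substitution that the quoted IOFactory code performs. The helper name resolveReaderClass() is hypothetical, and the 'PHPExcel_Reader_{0}' pattern is an assumption inferred from the error message rather than copied from the library source.

```php
<?php
// Hypothetical helper mirroring the str_replace() call quoted from IOFactory.php.
// Assumption: the search location's class pattern is 'PHPExcel_Reader_{0}',
// as suggested by the class name shown in the fatal error.
function resolveReaderClass(string $readerType, string $pattern = 'PHPExcel_Reader_{0}'): string
{
    // The reader type is dropped into a class name, not into a file path.
    return str_replace('{0}', $readerType, $pattern);
}

echo resolveReaderClass('excel.php'), PHP_EOL; // PHPExcel_Reader_excel.php
echo resolveReaderClass('Excel5'), PHP_EOL;    // PHPExcel_Reader_Excel5
```

So the argument has to be a reader name such as Excel5, not a file name.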
As described in section 1 of PHPExcel User Documentation - Reading Spreadsheet Files online and in the /Documentation folder, there are 7 different readers available for 7 different spreadsheet formats:
PHPExcel can read a number of different spreadsheet file formats, although not all features are supported by all of the readers. Check the Functionality Cross-Reference document (Functionality Cross-Reference.xls) for a list that identifies which features are supported by which readers.
Currently, PHPExcel supports the following File Types for Reading:
Excel5
The Microsoft Excel™ Binary file format (BIFF5 and BIFF8) is a binary file format that was used by Microsoft Excel™ between versions 95 and 2003. The format is supported (to various extents) by most spreadsheet programs. BIFF files normally have an extension of .xls. Documentation describing the format can be found online at http://msdn.microsoft.com/en-us/library/cc313154(v=office.12).aspx or from http://download.microsoft.com/download/2/4/8/24862317-78F0-4C4B-B355-C7B2C1D997DB/[MS-XLS].pdf (as a downloadable PDF).
Excel2003XML
Microsoft Excel™ 2003 included options for a file format called SpreadsheetML. This file is a zipped XML document. It is not very common, but its core features are supported. Documentation for the format can be found at http://msdn.microsoft.com/en-us/library/aa140066%28office.10%29.aspx though it’s sadly rather sparse in its detail.
Excel2007
Microsoft Excel™ 2007 shipped with a new file format, namely Microsoft Office Open XML SpreadsheetML, and Excel 2010 extended this still further with its new features such as sparklines. These files typically have an extension of .xlsx. This format is based around a zipped collection of eXtensible Markup Language (XML) files. Microsoft Office Open XML SpreadsheetML is mostly standardized in ECMA 376 (http://www.ecma-international.org/news/TC45_current_work/TC45_available_docs.htm) and ISO 29500.
OOCalc
aka Open Document Format (ODF) or OASIS, this is the OpenOffice.org XML File Format for spreadsheets. It comprises a zip archive including several components all of which are text files, most of these with markup in the eXtensible Markup Language (XML). It is the standard file format for OpenOffice.org Calc and StarCalc, and files typically have an extension of .ods. The published specification for the file format is available from the OASIS Open Office XML Format Technical Committee web page (http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office#technical). Other information is available from the OpenOffice.org XML File Format web page (http://xml.openoffice.org/general.html), part of the OpenOffice.org project.
SYLK
This is the Microsoft Multiplan Symbolic Link Interchange (SYLK) file format. Multiplan was a predecessor to Microsoft Excel™. Files normally have an extension of .slk. While not common, there are still a few applications that generate SYLK files as a cross-platform option, because (despite being limited to a single worksheet) it is a simple format to implement, and supports some basic data and cell formatting options (unlike CSV files).
Gnumeric
The Gnumeric file format is used by the Gnome Gnumeric spreadsheet application, and typically files have an extension of .gnumeric. The file contents are stored using eXtensible Markup Language (XML) markup, and the file is then compressed using the GNU project's gzip compression library. http://projects.gnome.org/gnumeric/doc/file-format-gnumeric.shtml
CSV
Comma Separated Value (CSV) file format is a common structuring strategy for text format files. In CSV files, each line in the file represents a row of data and (within each line of the file) the different data fields (or columns) are separated from one another using a comma (“,”). If a data field contains a comma, then it should be enclosed (typically in quotation marks, "). Sometimes tabs (“\t”) or the pipe symbol (“|”) are used as separators instead of a comma. Because CSV is a text-only format, it doesn't support any data formatting options.
You need to specify the reader by name when you use the createReader() method, e.g.:
$reader = PHPExcel_IOFactory::createReader("Excel5");
There are plenty of examples in the /Examples folder showing this usage for different readers, for letting PHPExcel itself select the correct reader using load(), and for verifying that your file is of the correct format before setting the reader, using the identify() method.
I have to confess, I'd thought this documentation was fairly straightforward, especially with the examples that are included.
To make it easier you could use
$objReader = PHPExcel_IOFactory::createReaderForFile($file);
and it will automatically pick a reader for your file
I have been trying to find a simple way to create OpenOffice calc files with no success.
I have tried:
openTBS - Seems to work by writing an XML file and a template file, but I can't find anything about the XML file format.
Ods php generator - I tried this one as it provides clear examples, but when I copy the files to my server I always get corrupted files
Php doc writer - Tried an example and got an sxw file. I don't even know what that is
ODS-PHP - No documentation, only one example for creating 4 cells
Everything looks old, stalled and undocumented. Any suggestions?
I have used opentbs successfully.
You can generate both Excel and Calc files. It's also nice that you can "reuse" your HTML implementation, so to speak.
Maybe this thread could get you going: http://www.tinybutstrong.com/forum.php?thr=3069
Do the HTML version first, then edit it for Calc/Excel.
Spout from Box works well enough for me. There are some missing features but it is simple to use, has a fluent API, and has no dependencies (it supports composer but you can use it standalone and its dependency graph has zero depth 😉 ).
Here's my "array of objects to ODS" pipeline, using Spout:
(I'm not using their recommended use import because all this code fits in a much larger file that I didn't want to contaminate and the $factory pattern looks cleaner to me anyway)
$factory = 'Box\Spout\Writer\Common\Creator\WriterEntityFactory';
$factory::createODSWriter()
    ->openToBrowser('filename.ods')
    ->addRow($factory::createRow([
        $factory::createCell(__('Heading 1')),
        $factory::createCell(__('Heading 2')),
        $factory::createCell(__('Heading 3')),
    ]))
    ->addRows(array_map(function($row) use ($factory) {
        return $factory::createRow([
            $factory::createCell($row->first_val),
            $factory::createCell($row->second_val),
            $factory::createCell($row->third_val),
        ]);
    }, loadDataFromSomewhere()))
    ->close();
I would like to merge multiple DOC or RTF files into a single file in the same format as the source files.
What I mean is that if a user selects multiple RTF template files from a list box and clicks a button on a web page, the output should be a single RTF file that combines those templates. I have to use PHP for this.
I haven't decided on the format of the template files yet, but it will be either RTF or DOC, and I assume the template files will contain some images as well.
I have spent many hours researching libraries for this, but still can't find one.
Please help me out here!! :(
Thanks in advance.
If you are searching for a solution for handling RTF documents only, you can find a PHP package to merge multiple RTF documents here :
www.rtftools.com
Here is a short example on how to merge multiple documents together :
include ( 'path/to/RtfMerger.phpclass' ) ;
$merger = new RtfMerger ( 'sample1.rtf', 'sample2.rtf' ) ; // You can specify docs to be merged to the class constructor...
$merger -> Add ( 'sample3.rtf' ) ; // or by using the Add() method
$merger [] = 'sample4.rtf' ; // or by using the array access methods
$merger -> SaveTo ( 'output.rtf' ) ; // Will save files 'sample1' to 'sample4' into 'output.rtf'
This package allows you to handle documents that are bigger than the available memory.
I've been working on a similar project and haven't managed to find any PHP (or any other open source language) libraries for manipulating MSWord files. The way I approach it is kind of complicated, but works. Here's how I would do it (assuming you have a Linux server):
Setup:
Install JODConverter and OpenOffice
Start OpenOffice as a server (see http://www.artofsolving.com/node/10)
Approach (ie. what to do in your PHP code):
Convert your MSWord or RTF files into ODT format by calling JODConverter via backticks or exec()
Unzip each file into a temporary directory of its own
Read the content.xml file from each unzipped document using a DOM Parser
Extract the <office:text> contents from each, and concatenate
Put this concatenated xml back into the right spot in one of the content.xml files
Re-zip the contents of that temporary directory and give it an .odt extension
Use JODConverter to convert this file back to MSWord again
As I said, it's not pretty, but it does the job.
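The extract-and-concatenate steps above can be sketched roughly as follows. This is only an illustration using PHP's bundled DOM and Zip extensions: the function names are hypothetical, it reads content.xml straight from the archive instead of unzipping to a temporary directory, and it ignores each document's automatic styles and embedded images, which a real merge would also need to carry over.

```php
<?php
// Namespace URI of the ODF "office" prefix.
const ODF_OFFICE_NS = 'urn:oasis:names:tc:opendocument:xmlns:office:1.0';

/** Read content.xml out of an .odt archive (needs ext-zip). Hypothetical helper. */
function readContentXml(string $odtPath): string
{
    $zip = new ZipArchive();
    if ($zip->open($odtPath) !== true) {
        throw new RuntimeException("Cannot open $odtPath");
    }
    $xml = $zip->getFromName('content.xml');
    $zip->close();
    if ($xml === false) {
        throw new RuntimeException("No content.xml in $odtPath");
    }
    return $xml;
}

/** Locate the <office:text> element of a parsed content.xml. */
function findOfficeText(DOMDocument $doc): DOMElement
{
    $node = $doc->getElementsByTagNameNS(ODF_OFFICE_NS, 'text')->item(0);
    if (!$node instanceof DOMElement) {
        throw new RuntimeException('No <office:text> element found');
    }
    return $node;
}

/**
 * Append the body of every later document to the first one.
 *
 * @param string[] $contentXmls raw content.xml strings, in output order
 */
function mergeContentXml(array $contentXmls): string
{
    if ($contentXmls === []) {
        throw new InvalidArgumentException('Nothing to merge');
    }
    $base = new DOMDocument();
    $base->loadXML(array_shift($contentXmls));
    $target = findOfficeText($base);

    foreach ($contentXmls as $xml) {
        $doc = new DOMDocument();
        $doc->loadXML($xml);
        foreach (findOfficeText($doc)->childNodes as $child) {
            // importNode() copies the node into the base document.
            $target->appendChild($base->importNode($child, true));
        }
    }
    return $base->saveXML();
}
```

Typical use would be mergeContentXml(array_map('readContentXml', $odtPaths)), after which the result is written back as content.xml into a copy of the first archive and handed to JODConverter. That write-back step is left out here.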
If you're looking to go down the RTF route, this question may also help: Concatenate RTF files in PHP (REGEX)
I guess no one has been lucky enough to find the best solution for handling reports in PHP, especially when it's a .doc/.docx report or file. I searched for some time and then found phpdocx.com. It looks like an amazing PHP script, but it just doesn't work, and I don't know where to find the output file. Unfortunately the documentation doesn't help at any level.
Now I need to know how this script works: how the results come out and become usable, and what the script needs in order to work, because it simply doesn't work on my localhost. I am using Apache 2 and PHP 5.2.6.
I don't actually need more than writing HTML into a real .doc file (not renaming an HTML file to .doc!), so if there is any solution (without the COM library, as I am not on a Windows server) to generate a real .doc file from HTML, please post it here.
Thanks very much in advance :)
I guess no one has been lucky enough to find the best solution for handling
reports in PHP, especially when it's a .doc/.docx report or file
This is not the question corresponding to the title, but you should try OpenTBS.
It's an open source PHP library which builds DOCX with the technique of templates.
No temp directory, no extra exe needed. First create your DOCX, XLSX, PPTX with Ms Office, (ODT, ODS, ODP are also supported, that's OpenOffice files). Then you use OpenTBS to load the template and change the content using the Template Engine (easy, see the demo). At the end, you save the result where you need. It can be a new file, a download flow, a PHP binary string.
OpenTBS can also change pictures and charts in a document.
Demo page
Documentation
The documentation of PHPDocX has been greatly improved.
Have you tried to look at the PHPDocX tutorial?
You may also have a look at the Forum.
require_once "Path of phpdocx library/CreateDocx.inc";
$docx = new CreateDocx();
$html = 'your data will store in this variable'; // the HTML you want to embed
$docx->embedHTML(
    $html,
    array(
        'parseDivsAsPs' => true,
        'downloadImages' => true,
        'WordStyles' => array(
            '<table>' => 'MediumGrid3-accent5PHPDOCX'
        ),
        'tableStyle' => 'NormalTablePHPDOCX'
    )
);
// This is where the docx file is generated: it is stored inside the word_export_file folder.
$docx->createDocx($varPublicPath.'/word_export_file/example1_'.time());
I want to add a Word import function to our CMS. The only problem is that I cannot seem to find a good library for reading docx files (Word 2007).
Does anyone have recommendations? The library should be able to extract the content of the document and basic styling such as italic, bold and superscript.
Thanks for your help
docx files are actually just containers for the document's XML. You should be able to unzip the docx file and then go to the word folder inside, then to the document.xml. This has the actual text. But things like the fonts and styles are in other xml files in the docx container, so you'll probably want to mess around a bit and figure out what is what and how to match it up (start by using namespaces, I bet).
But yea, unzip the file, then use simplexml to convert it into something you can actually mess around with.
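As a rough sketch of that approach, the snippet below pulls word/document.xml out of the archive and lists each text run with its direct bold, italic and superscript flags. It uses DOMXPath rather than simplexml because namespaced lookups are a little less awkward there. The function names are hypothetical, and formatting inherited from styles.xml is not resolved, so treat it as a starting point only.

```php
<?php
// Namespace URI of the WordprocessingML "w" prefix.
const WML_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

/** Read word/document.xml out of a .docx archive (needs ext-zip). Hypothetical helper. */
function readDocumentXml(string $docxPath): string
{
    $zip = new ZipArchive();
    if ($zip->open($docxPath) !== true) {
        throw new RuntimeException("Cannot open $docxPath");
    }
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        throw new RuntimeException("No word/document.xml in $docxPath");
    }
    return $xml;
}

/** True when a toggle property such as <w:b/> is set directly on the run. */
function runHasToggle(DOMXPath $xp, DOMNode $run, string $prop): bool
{
    $node = $xp->query("w:rPr/w:$prop", $run)->item(0);
    if (!$node instanceof DOMElement) {
        return false;
    }
    // <w:b w:val="0"/> (or "false"/"off") explicitly switches the property off.
    return !in_array($node->getAttributeNS(WML_NS, 'val'), ['0', 'false', 'off'], true);
}

/** Return one array per non-empty run: text, bold, italic, superscript. */
function extractRuns(string $documentXml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($documentXml);
    $xp = new DOMXPath($doc);
    $xp->registerNamespace('w', WML_NS);

    $runs = [];
    foreach ($xp->query('//w:r') as $run) {
        $text = '';
        foreach ($xp->query('w:t', $run) as $t) {
            $text .= $t->textContent;
        }
        if ($text === '') {
            continue; // skip runs that hold only breaks, drawings, etc.
        }
        $vert = $xp->query('w:rPr/w:vertAlign', $run)->item(0);
        $runs[] = [
            'text'        => $text,
            'bold'        => runHasToggle($xp, $run, 'b'),
            'italic'      => runHasToggle($xp, $run, 'i'),
            'superscript' => $vert instanceof DOMElement
                && $vert->getAttributeNS(WML_NS, 'val') === 'superscript',
        ];
    }
    return $runs;
}
```

Usage would be along the lines of extractRuns(readDocumentXml('import.docx')), where the file name is just a placeholder.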
PHPDocX PRO includes a TransformDoc class that can read .docx (zip) files and generate XHTML (or PDF) from it:
...
require_once 'phpdocx_pro/classes/TransformDoc.inc';
$doc = new TransformDoc();
$doc->setStrFile($file->filepath);
$doc->generateXHTML();
$html = $doc->getStrXHTML();
There is a library to do this, but it works with Zend Framework. Maybe it will help you.
It is called phpLiveDocx: http://www.phplivedocx.org/downloads/
The library is licensed under the New BSD license.
I have just found a library that has both reading and writing support. Check it out on the CodePlex forge at http://openxmlapi.codeplex.com; it is licensed under GPLv2.
Or, since you requested a library, you may want to look into something like Docvert. I was just looking around based on your question, and it's my favorite so far for PHP. You input the word file location, it transforms it into something simple with the attributes and all that good stuff.
Convert the docx document to ODT using OpenOffice, then use eZ Components to do the parsing and import. They actually use this import in their CMS, eZ Publish.
Here is a simple working solution I found
http://webcheatsheet.com/php/reading_the_clean_text_from_docx_odt.php