I am trying to edit ODT documents programmatically in PHP. In fact I just want to do some text replacement and add new rows to a table.
I know that a normal ODF document is an archive containing XML-files. But before I reinvent the wheel: is there any library which does most of the tasks? Or should I just parse the XML-file in a DOM-parser and modify it?
http://en.wikipedia.org/wiki/OpenDocument_software lists several PHP tools for working with ODT.
Docvert
OpenDocumentPHP
odf-xslt
OpenDocument
ods-php
odtPHP
I haven't used any of these, I'm just giving you the list. You'll have to evaluate them to see if they have the specific features you need.
I also suggest trying OpenTBS.
It can edit an OpenDocument file using the template technique, but it can also retrieve the XML from an OpenDocument file.
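If none of the libraries fits, the plain DOM route mentioned in the question could look roughly like the sketch below. It is not taken from any of the tools listed above: `appendOdtTableRow` is a hypothetical helper, and it assumes you have already read `content.xml` out of the ODT archive (for example with ZipArchive), that the table has a name without double quotes in it, and that each cell holds a single `text:p` paragraph.

```php
<?php
// Hypothetical helper: clone the last row of a named table in content.xml
// and fill the copy with new cell texts. Not part of any library named above.
const NS_TABLE = 'urn:oasis:names:tc:opendocument:xmlns:table:1.0';
const NS_TEXT  = 'urn:oasis:names:tc:opendocument:xmlns:text:1.0';

function appendOdtTableRow(string $contentXml, string $tableName, array $cellTexts): string
{
    $dom = new DOMDocument();
    if (!$dom->loadXML($contentXml)) {
        throw new RuntimeException('content.xml is not well-formed');
    }

    $xpath = new DOMXPath($dom);
    $xpath->registerNamespace('table', NS_TABLE);
    $xpath->registerNamespace('text', NS_TEXT);

    // Assumption: $tableName contains no double quotes.
    $rows = $xpath->query(
        sprintf('//table:table[@table:name="%s"]/table:table-row[last()]', $tableName)
    );
    if ($rows->length === 0) {
        throw new RuntimeException('Table or row not found: ' . $tableName);
    }

    // Cloning keeps the cell styles of the existing row.
    $last = $rows->item(0);
    $new  = $last->cloneNode(true);
    $last->parentNode->insertBefore($new, $last->nextSibling);

    $i = 0;
    foreach ($xpath->query('table:table-cell', $new) as $cell) {
        $p = $xpath->query('text:p', $cell)->item(0);
        if ($p === null) {
            $p = $cell->appendChild($dom->createElementNS(NS_TEXT, 'text:p'));
        }
        // Drop the copied content, then add the new text (escaped by DOM).
        while ($p->firstChild) {
            $p->removeChild($p->firstChild);
        }
        $p->appendChild($dom->createTextNode((string) ($cellTexts[$i] ?? '')));
        $i++;
    }

    return $dom->saveXML();
}
```

The returned string would then be written back into the archive as `content.xml`. Cloning an existing row instead of building one from scratch means the new row keeps whatever formatting the template row had.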
Related
To document my code I thought it would be best practice to use phpDoc syntax, because there are several parsers out there and some IDEs create IntelliSense out of it.
Now I need to put the documentation (API) into a word file, but I don't know which parser is able to output .doc or similar.
I tried Doxygen, which outputs .rtf, and phpDocumentor2, which can only export to .html and .xml (?).
Is there a way to generate a .doc(x) file from phpDoc? Or a simple way to get a document which can be imported to word?
I would appreciate if I don't have to change the phpDoc syntax, because my documentation is very long.
Edit: The preferred parser would be phpDocumentor2, because it supports PHP 5.3 features and is faster than Doxygen. However, phpDocumentor2 offers fewer output formats than the original phpDocumentor, which is no longer maintained.
Edit: I tried to copy content from the .rtf file into the .docx file, but when I select 'Use Destination Styles', both Word instances suspend and do not respond.
Presumably you want one large Word doc that contains all the info for your project in the one doc/file... therefore just opening the phpDoc2 HTML output into Word in order to convert it to docx will not meet your need, since that would be one docx per phpdoc2 HTML page.
You might try altering your searches to be for a tool that can spider a given HTML page, recursively pick up all its target page hierarchy, and convert it all into a single docx. You might have more luck finding a tool that does this but produces a PDF... then you could just use Word to convert the PDF into docx.
I've been asked to write a php script that should read/parse a docx file and do some operations such as duplicate a specific paragraph/table and fill-in some variables (#myvar or $myvar) with values.
What do you guys recommend: using the word/document.xml file directly, or converting the whole document to an HTML file and then parsing it using DOM (I don't like this solution :( )?
The structure of the docx to parse is not defined yet; it's my job to do that! And it has to be as general as possible.
To have a clear idea about what I'm doing, the docx file is a CV model that I have to fill-in with data from DB.
P.S.: I don't know how to efficiently parse/modify the XML file using XQuery, since the only solution I have is to use variables (plain text with $ or #..) inside that docx.
thanks for your help :)
There are 2 major PHP libraries able to create Word documents. Here's a description of features from both that might help you solve your problem:
PHPWord (open source) - lets you load template documents and replace values... take a look at this example in the library's source code; maybe you can define a CV template and use it to work out a solution;
PHPDocX (free with basic features, paid for more advanced features) - allows templates and search and replace of content in documents (probably only in paid versions though).
This is an old question, but I thought I'd give some pointers, as I have been struggling with this for some time and have ended up writing my own package on GitHub: wrklst/docxmustache.
Here are some solutions I know of:
Free solutions:
https://github.com/PHPOffice/PHPWord (as mentioned above, cumbersome and not very capable)
http://www.tinybutstrong.com/opentbs.php (works but is highly cumbersome, also introduces a lot of security issues if you plan to allow user supplied templates)
Partially Free and Paid:
https://www.phpdocx.com
http://www.docxpresso.com (looks like one of the more complete solutions to me; at 199 EUR for a server license it's not too expensive either)
https://modules.docxtemplater.com
I worked with OpenTBS quite a bit, but I am not happy with it and I am currently evaluating whether to write my own solution that is more geared to my specific needs. Generally you need:
- A zip class to unzip/rezip the docx file
- A template engine to replace values, I am using mustache (https://github.com/bobthecow/mustache.php)
- If you are planning to replace images as well, you need more advanced file, reference and XML handling. PHP's SimpleXMLElement should be sufficient to handle all the XML manipulation.
Of course you can always convert the docx into a more accessible format, but that will greatly mess with any styling. If that's not an issue, I recommend using LibreOffice to convert your docx into any format that LibreOffice supports. On Linux-based servers you can easily access it via the command line; here is an example that uses Symfony for command execution:
// Arguments are passed as an array, so no shell quoting is needed
// (recent Symfony versions no longer accept a plain command string here).
$command = ['soffice', '--headless', '--convert-to', 'html', $inputfile, '--outdir', $outputfile.'/'];
$process = new \Symfony\Component\Process\Process($command);
$process->start();
// block until the command finishes instead of busy-looping
$process->wait();
if (!$process->isSuccessful()) {
    throw new \Symfony\Component\Process\Exception\ProcessFailedException($process);
}
Check out my package wrklst/docxmustache if you want to see this in context.
good luck!
I have a template for a style of Avery stationery in a Word document. What I'd like to do is fill in the template with images (in this case, QR codes) for easy printing and labeling of objects.
I'm wondering, what would be the easiest way to do this? I saved the template as a Word XML file, but looking at the file, I feel hopeless. I also tried converting the template to HTML, but unsurprisingly it screwed up the formatting. I'm not sure where to go next. Any ideas?
There are a couple of good options out there to do this. I would recommend PHP LiveDocX (free) or PHP DocX (free basic version).
Check out the PHP Class, MsDoc Generator. It will let you create and add elements (text, tables, images) dynamically.
I need a way to manipulate .docx (Microsoft Office 2007) documents from within PHP.
I need to:
Read the internal text
Convert to .html, to view the documents inside a browser
Replace text
I know I can use Word Automation, creating a COM object of Microsoft Word, but it's too slow, unstable and I have to have it installed on the server.
Is there any library or code that can do it from PHP?
There is PHPWord for that by the authors of PHPExcel.
Docx is just a ZIP file containing multiple XML files and embedded media files like images. Because of this, you can read and edit the document with ease. Just unzip it, open word/document.xml, do reading & writing, and repack the files.
Converting to HTML may be difficult. But you'll find a thumbnail of the first page in docProps/thumbnail.jpeg.
Note that you'll have to familiarize yourself with the XML structure to do any complex edits. There's also a summary file, docProps/app.xml, which holds some metadata for the document, so don't forget to update it. Read more on Wikipedia: http://en.wikipedia.org/wiki/Office_Open_XML
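For the "read the internal text" part, a minimal sketch of the unzip-and-read approach could look like this. The function names are made up for illustration. It only collects the `w:t` text runs of each `w:p` paragraph, so tabs, line breaks, headers, footers and footnotes are ignored, and text inside nested paragraphs (text boxes, for instance) may show up twice.

```php
<?php
// WordprocessingML main namespace used by word/document.xml.
const NS_W = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

// Hypothetical helper: turn the XML of word/document.xml into plain text,
// one line per paragraph.
function docxXmlToText(string $documentXml): string
{
    $dom = new DOMDocument();
    if (!$dom->loadXML($documentXml)) {
        throw new RuntimeException('document.xml is not well-formed');
    }

    $lines = [];
    foreach ($dom->getElementsByTagNameNS(NS_W, 'p') as $paragraph) {
        $text = '';
        // A paragraph is usually split into several runs; join their text.
        foreach ($paragraph->getElementsByTagNameNS(NS_W, 't') as $run) {
            $text .= $run->textContent;
        }
        $lines[] = $text;
    }

    return implode("\n", $lines);
}

// Hypothetical wrapper: read word/document.xml straight out of the .docx.
function readDocxText(string $path): string
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        throw new RuntimeException('Cannot open ' . $path);
    }
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();

    if ($xml === false) {
        throw new RuntimeException('word/document.xml not found');
    }

    return docxXmlToText($xml);
}
```

Calling `readDocxText('/path/to/file.docx')` (the path is a placeholder) would return the body text without needing Word on the server.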
You may have a look at PHPDocX; I believe it does everything you are asking for.
You can replace variables in a template or just plain text from a preexisting Word document.
It offers quite a few conversion options.
You can also extract the text.
You can work with the internal format directly.
DOCX is just a zip file, and inside that there's word/document.xml containing the actual document.
It's quite trivial to unzip the file, read document.xml, str_replace() what you're looking for, save it and re-zip the directory, and it makes for a lightweight, quick and easy mail merge capability for word documents. This also works for other office formats.
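The steps above can be sketched as follows. `fillDocxTemplate` and the `$name`-style placeholder are assumptions made for this example, not an existing API. Values are XML-escaped before insertion, and the sketch assumes each placeholder sits in one piece in the XML; Word sometimes splits typed text across several runs, in which case a plain `str_replace()` will not find it.

```php
<?php
// Hypothetical helper: copy a .docx template and replace placeholders
// in word/document.xml inside the copy.
function fillDocxTemplate(string $template, string $output, array $values): void
{
    if (!copy($template, $output)) {
        throw new RuntimeException('Cannot copy template to ' . $output);
    }

    $zip = new ZipArchive();
    if ($zip->open($output) !== true) {
        throw new RuntimeException('Cannot open ' . $output);
    }

    $xml = $zip->getFromName('word/document.xml');
    if ($xml === false) {
        $zip->close();
        throw new RuntimeException('word/document.xml not found');
    }

    // Escape values so characters like & or < do not break the XML.
    $search  = array_keys($values);
    $replace = [];
    foreach ($values as $value) {
        $replace[] = htmlspecialchars((string) $value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    }

    // Adding an entry under an existing name replaces that entry.
    $zip->addFromString('word/document.xml', str_replace($search, $replace, $xml));
    $zip->close();
}
```

A call might look like `fillDocxTemplate('template.docx', 'letter.docx', ['$name' => 'Tom & Jerry']);`, where both file names are placeholders. Working on a copy keeps the template itself untouched.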
Here are the official docs on the internal structure for more information.
There is also a PHP class for merging new content into an existing .docx file. It is available here: http://www.tinybutstrong.com/ . The documentation is pretty good as well as having many examples and it is all free and open source. It does require familiarity with the .docx concepts, though.
I would like to know if anyone has tried exporting data from MySQL to an ODF format?
Any information / documentation would be very much appreciated.
I am going to try to export a MySQL result set to ODF spreadsheet if possible.
You could try looking at this: http://www.phpclasses.org/browse/package/4398.html
However, looking at the source code, it doesn't look great and has lots of hard-coded XML strings.
Another option to create OOo files with PHP is tbsOOo; however, it is mainly a templating engine.
This class allows to create OpenOffice documents dynamically by separating display formatting from logic and data. In practice, you create a template using OpenOffice with the TinyButStrong tags. Then you create a PHP script that merges the template with a data source to get a new OpenOffice document.
There is also: OpenDocument PHP which is more complex and can create the files dynamically.
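If you would rather not depend on any of these, the core of an ODF spreadsheet is small enough to write by hand. The sketch below is not taken from the classes above: `rowsToOdsContentXml` is a hypothetical function that turns an array of rows (for example what PDO's `fetchAll(PDO::FETCH_NUM)` returns for a MySQL query) into the `content.xml` of an .ods file, writing every cell as a string. To get a file that opens, `content.xml` still has to be zipped together with a `mimetype` entry and a `META-INF/manifest.xml`, which this sketch leaves out.

```php
<?php
// Hypothetical helper: build the content.xml of an ODF spreadsheet
// from an array of rows. Every cell is written as a string value.
function rowsToOdsContentXml(array $rows, string $sheetName = 'Sheet1'): string
{
    // Escape text for use in XML element content and attribute values.
    $esc = static function ($value): string {
        return htmlspecialchars((string) $value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    };

    $xml = '<?xml version="1.0" encoding="UTF-8"?' . '>' . "\n"
        . '<office:document-content'
        . ' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
        . ' xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"'
        . ' xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"'
        . ' office:version="1.2">'
        . '<office:body><office:spreadsheet>'
        . '<table:table table:name="' . $esc($sheetName) . '">';

    foreach ($rows as $row) {
        $xml .= '<table:table-row>';
        foreach ($row as $cell) {
            $xml .= '<table:table-cell office:value-type="string">'
                . '<text:p>' . $esc($cell) . '</text:p>'
                . '</table:table-cell>';
        }
        $xml .= '</table:table-row>';
    }

    return $xml . '</table:table></office:spreadsheet></office:body></office:document-content>';
}
```

Passing a header row followed by the result rows, e.g. `rowsToOdsContentXml([['id', 'name'], [1, 'Alice']])`, gives one sheet with two rows. Numeric or date cells would need other `office:value-type` values, which are not covered here.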