Fill in a Microsoft Word Form with PHP - php

I have a form created in Microsoft Word that I need to fill in via PHP. I have looked at PHPWord, but it looks like you can only create Word documents with it. I considered exporting the form to XML and editing it that way, but the formatting gets screwy from the export. Is there another way?

At the time of writing, there is no direct solution.
You might want to look at the best answer to "Create Word Document using PHP in Linux" for a hint: it works with OpenOffice documents, which you can change because they are XML inside a ZIP archive, and then converts the OpenDocument file to .doc on the command line.
Another option, if your script runs on a Windows server, is to use the COM interface to talk to Word. See http://drewd.com/2007/01/25/reading-from-a-word-document-with-com-in-php for an example that reads a file; by digging through the Word COM API you can also change existing documents.

Related

PHP str_replace into microsoft word template

I have been given the task of building a function that generates contract letters in our HRMS.
I am already using CKEditor, but the result looks very different, because CKEditor was not built for the same purpose as Microsoft Word or Google Docs.
So my idea is to create the template in Microsoft Word first and then use PHP's str_replace function to pass the data into the Word template.
My questions are:
1. Is that flow possible?
2. If so, can you give me a sample?
Many thanks,
Hendra
There are several classes that can do at least part of what you are trying to do:
wrklst/docxmustache
openTBS – Tiny But Strong
PHPWord
docxtemplater pro (basic opensource / free version / MIT license available as of writing; image replacing is a commercial plugin)
docxpresso (commercial)
phpdocx (commercial)
The first four of these are at least partially open source, and investigating the code will help you understand the process, which is not trivial with Word. In addition, you can check out http://officeopenxml.com for the format details.
The main problem I see is proper HTML to OpenXML conversion, that is, converting the styling from CKEditor (which is probably HTML) into the proper XML styling. The two work quite differently and a direct translation is not trivial. Check out https://github.com/wrklst/docxmustache/blob/master/src/WrkLst/DocxMustache/HtmlConversion.php to see some basic HTML conversion on single runs of bold, italic and underlined text.
To my knowledge there is no maintained open source package that delivers proper HTML to OpenXML conversion. If you need this and cannot write it yourself, you will probably have to go for one of the paid solutions.
Good luck.
Docx is a zipped format that contains some xml. If you want to build a simple replace {tag} by value system, it can already become complicated, because the {tag} is internally separated into <w:t>{</w:t><w:t>tag</w:t><w:t>}</w:t>. If you want to embed loops to iterate over an array, it becomes a real hassle.
Source: https://docxtemplater.readthedocs.io/en/latest/goals.html
You could use the library I created to solve this problem: https://github.com/open-xml-templating/docxtemplater. It works with JS in the browser or with node.js.
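To make the "docx is a zip with XML inside" point concrete, here is a minimal PHP sketch of the str_replace-style flow the question asks about. The helper names replaceTags() and fillDocxTemplate() are made up for illustration, and it assumes the zip extension is installed and that every {tag} sits inside a single <w:t> element. As noted above, Word often splits a tag across several runs, and this sketch does not handle that case.

```php
<?php
// Sketch only: replaceTags() and fillDocxTemplate() are hypothetical helpers.

/**
 * Replace {tag} placeholders in a WordprocessingML string.
 * Values are XML-escaped so characters such as & or < cannot break the document.
 */
function replaceTags(string $xml, array $values): string
{
    $map = [];
    foreach ($values as $tag => $value) {
        $map['{' . $tag . '}'] = htmlspecialchars((string) $value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    }
    if ($map === []) {
        return $xml;
    }
    // strtr() does not re-scan replaced text, so a value containing "{x}" stays as is.
    return strtr($xml, $map);
}

/**
 * Copy a .docx template and replace the placeholders in its main document part.
 * Assumes the zip extension (ZipArchive) is available.
 */
function fillDocxTemplate(string $template, string $output, array $values): void
{
    if (!copy($template, $output)) {
        throw new RuntimeException("Cannot copy $template to $output");
    }
    $zip = new ZipArchive();
    if ($zip->open($output) !== true) {
        throw new RuntimeException("Cannot open $output as a zip archive");
    }
    // The body text of a .docx lives in word/document.xml.
    $xml = $zip->getFromName('word/document.xml');
    if ($xml === false) {
        $zip->close();
        throw new RuntimeException('word/document.xml not found');
    }
    $zip->addFromString('word/document.xml', replaceTags($xml, $values));
    $zip->close();
}
```

A call might look like fillDocxTemplate('contract-template.docx', 'contract-hendra.docx', ['name' => 'Hendra']), where both file names are placeholders. If a tag is not replaced, unzip the template and check whether Word has split it across several <w:t> elements; retyping the tag in one go usually keeps it in a single run.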

Importing/Copying and Pasting Word Document to HTML

We need to import OR copy and paste word documents and convert them to HTML ready data.
Here are my thoughts:
collect the text with file_get_contents
apply the function nl2br
However, this does not account for bold and other text formatting.
Also, there are several Microsoft-specific characters that we don't need.
What is a good strategy for importing Word documents into clean HTML?
I wouldn't try to tackle all of this on your own. word2cleanhtml.com looks like it will suit your needs and may have an API offering soon.
However, it appears that you can use Word itself from the command line to convert your document for you. This will, of course, require that MS Word is installed on your PHP server.
shell_exec('"C:/Program Files/Microsoft Office/Office12/WINWORD.EXE" /msaveashtml "C:/path/to/your.doc"');
The above code uses the macro defined in this answer to a similar question. You will need to copy the saveashtml macro from that answer and add it to Word. Both paths are quoted because the path to WINWORD.EXE contains a space.
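For the simpler copy-and-paste case from the question (plain text plus nl2br), a rough sketch of cleaning up the "Microsoft characters" could look like the following. cleanWordText() is a hypothetical helper, the character map is only an illustrative subset, and the input is assumed to be UTF-8 plain text (for example pasted text or a .txt export), not the binary .doc file itself. It does not preserve bold or other formatting.

```php
<?php
// Sketch only: cleanWordText() is a hypothetical helper and the map below
// is an illustrative subset of the typographic characters Word inserts.
function cleanWordText(string $text): string
{
    $map = [
        "\u{2018}" => "'",   // left single quotation mark
        "\u{2019}" => "'",   // right single quotation mark
        "\u{201C}" => '"',   // left double quotation mark
        "\u{201D}" => '"',   // right double quotation mark
        "\u{2013}" => '-',   // en dash
        "\u{2014}" => '-',   // em dash
        "\u{2026}" => '...', // horizontal ellipsis
        "\u{00A0}" => ' ',   // non-breaking space
    ];
    $plain = strtr($text, $map);

    // Escape for HTML first, then turn line breaks into <br /> tags.
    return nl2br(htmlspecialchars($plain, ENT_QUOTES, 'UTF-8'));
}
```

Usage would be along the lines of echo cleanWordText(file_get_contents('pasted.txt')), where pasted.txt is a placeholder file name.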

Scan Php code looking for Gettext

I need to generate .po files for the PHP code of my web application. This is a very large application that needs to be translated into several different languages. So far, I have been using PoEdit to generate my .po files. The problem is that many of my files lack the Gettext notation echo _("message"); in the past I only used echo "message".
This is what I think could be the best solution for my issue:
Create a script that scans my PHP code and tells me which of my messages are and are not displayed using Gettext. How would I do this?
Replace the strings that do not use Gettext with the appropriate Gettext pattern.
Can you please advise me on the best approach to get all my code using Gettext?
There is one way to convert messages that are echoed without gettext into messages that are echoed through gettext:
If all messages are held in a single variable, for example $Message, the following approach is relatively fast. Otherwise you have to find every place where a message is echoed and change it manually (especially when the message is written as literal text rather than stored in a variable).
In your editor, start a global search (across all files), then go file by file and use replace, with $Message as the search term and _($Message) as the replacement. If your editor supports replacing across multiple files at once, you can do all the replacements in one pass.
I would not suggest doing this replacement directly in PHP.
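For the scanning half of the question (finding messages that are not yet wrapped in _()), a small report-only script is one option. The sketch below uses PHP's tokenizer; findUntranslatedEchoes() is a made-up name, and it only catches the simplest case, an echo whose first operand is a plain string literal. Strings with interpolated variables, print statements and HTML outside PHP tags are not detected. It changes nothing in the files, so the actual replacement can still be done in the editor.

```php
<?php
// Sketch only: findUntranslatedEchoes() is a hypothetical helper.
// It reports echo statements whose first operand is a plain string literal,
// i.e. candidates that are not wrapped in _() yet.
function findUntranslatedEchoes(string $code): array
{
    $tokens = token_get_all($code);
    $count = count($tokens);
    $found = [];

    for ($i = 0; $i < $count; $i++) {
        if (!is_array($tokens[$i]) || $tokens[$i][0] !== T_ECHO) {
            continue;
        }
        // Skip whitespace between "echo" and its first operand.
        $j = $i + 1;
        while ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
            $j++;
        }
        // A literal right after echo means the message bypasses gettext.
        if ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_CONSTANT_ENCAPSED_STRING) {
            $found[] = ['line' => $tokens[$j][2], 'text' => $tokens[$j][1]];
        }
    }

    return $found;
}
```

To scan a file, pass its contents, for example findUntranslatedEchoes(file_get_contents('page.php')) with page.php as a placeholder, and print the line numbers it returns.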

Saving single pages of a word file as separate documents using COM

Lately I've been playing with the Microsoft COM object class for PHP to manipulate Word files. So far so good, as I've been able to make it work and do some file conversions, such as saving an entire DOC as a PDF on the server.
Now I'm facing a problem: since I'll be converting and manipulating the given word file a lot at runtime, I thought it would be much better if I could save every single -page- separately and work on them one by one instead of reprocessing the whole document each time.
I have been reading all the MSDN part about the COM Document Class, and I have the feeling that I can't save just one page of the document, unless I do some sort of magic using the Range Method, but apparently there's -no way- to know the 'current end position' for each page. Any ideas?
tl;dr I'm trying to save single pages inside a word document using a 'word.application' COM object through a PHP script, but I can't find examples of the Document.Range method.
Francesco, I'll have to warn you. @SLaks is correct in that you really cannot use Word Automation on a server. No, really. We're serious.
There are two reasons:
First, Word is an incredibly complex piece of software designed to be used by an interactive user. It was not programmed or tested to be used under a server environment, and does not work correctly when running under a non-interactive account (the way services do). Sooner or later it will crash or freeze. I've seen it. I'm not talking necessarily about bugs. There are things that Word will do that require a full user account; or where Word expects somebody will be clicking on message boxes. There is no escaping it.
Second, even if you manage to make it do what you want, it turns out that the Office license expressly forbids you from running Word that way.
Now, exclusively from the point of view of Automation:
Word doesn't really manipulate 'pages'. 'Pages' are just an incidental side-effect of whichever printer is currently selected. Take the same file to a different computer with a different printer and/or driver, and the pagination can change. On large documents it will change.
Yes, most of the time the page breaks don't move (a lot), particularly if you have a document that is a bunch of not-quite-a-full-page forms, but I'm not trying to be fastidious: The point is, the Word document object model won't help you a lot to manipulate 'pages' because they are not a first-class citizen but incidental formatting.
I guess that your best bet would be to use section breaks between the pages, instead of letting the pages autoflow; that way you have something for the object model to grab onto.
You can use the ActiveDocument.Sections collection to locate your... ahem... 'pages' (really, section objects), then use the Range method (to extract the Range object) and the ExportAsFixedFormat method to export that range to a PDF.
If you want a Word document instead, I don't think the object model allows you to save a piece of the document as a separate document. However you can easily copy-and-paste the range to a new document and save that instead.
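Translated to PHP's COM extension, the Sections / Range / ExportAsFixedFormat approach might look roughly like this. It is an untested sketch: exportSectionsToPdf() is a made-up name, it assumes a Windows machine with Word and the com_dotnet extension, it assumes the document already has a section break after each "page", and the warnings above about running Word unattended still apply.

```php
<?php
// Sketch only: exportSectionsToPdf() is a hypothetical helper.
// Requires Windows, an installed copy of Word and PHP's com_dotnet extension.
const WD_EXPORT_FORMAT_PDF = 17; // value of Word's wdExportFormatPDF constant

function exportSectionsToPdf(string $docPath, string $outDir): array
{
    $word = new COM('Word.Application');
    $word->Visible = false;
    $files = [];

    try {
        $doc = $word->Documents->Open($docPath);
        $count = $doc->Sections->Count;

        // COM collections are 1-based.
        for ($i = 1; $i <= $count; $i++) {
            $file = $outDir . DIRECTORY_SEPARATOR . "section-$i.pdf";
            // Export only the range covered by this section.
            $doc->Sections->Item($i)->Range->ExportAsFixedFormat($file, WD_EXPORT_FORMAT_PDF);
            $files[] = $file;
        }

        $doc->Close(false); // close without saving changes
    } finally {
        $word->Quit();
    }

    return $files;
}
```

Both arguments should be absolute Windows paths, since Word resolves relative paths against its own working directory rather than the script's.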
I have written some code in VB.NET that splits a given Word document into individual pages. It then saves the pages as JPG images, so I think this is what you want.
I am happy to share the code with you if you haven't accomplished the task yet.

Does a PHP library exist to work with PRC/.mobi files?

I'm writing a WordPress plugin to create an eBook from a selected category in most major eBook formats. I would like to support MobiPocket since that's the format used by the Kindle but I'm not sure how to go about it. From what I've read .mobi files are actually Palm Resource Databases (PRC) but I haven't been able to find a PHP class to work with these.
I thought about using exec along with KindleGen but that would be undesirable as it would complicate initial setup. I've also thought about hosting a web service somewhere and using XML-RPC to accomplish this but that also complicates things.
My question is: is there a PHP class/library (PHP-only preferred) that can work with PRC or even better, a class that specialises in creating MobiPocket ebooks? (needs to be open source since I'm releasing under the GPL)
I've tried searching but haven't been able to find anything.
I don't know whether you're still looking for this PHP library, but just in case: https://github.com/raiju/phpMobi. This is a library that creates mobi files from html files.
It should still be seen as an experimental version, but it should work without problems for basic documents with a few images.
Unfortunately not; however, the binary compiled format is an open specification available at:
http://www.mobipocket.com/dev/article.asp?BaseFolder=prcgen
The only direct way of transforming the uncompiled format is using the native XML functionality of PHP to create them and then invoking a compiler with exec, which I understand you don't want to do. If you go with this route, the link above also has details about this XML format.
You might want to try the mobiperl tools: https://dev.mobileread.com/trac/mobiperl/wiki
Please note I haven't tested them yet, but they have been around since at least 2007, so they should work well by now. Google "Mobiperl - Perl tools for handling MobiPocket files" to find a thread on the mobileread board discussing it. As a new poster I can't put 2 hyperlinks into my reply.
Another tool I have recently found (but not yet tested), is: http://www.phpclasses.org/package/8173-PHP-Generate-Kindle-ebook-file-in-mobi-format.html#files
It is based on KindleGen and looks pretty straightforward to implement.
