Has anyone come across PHP code that converts text or DOC files into PDF?
It has to follow the same format as the original TXT or DOC file, meaning the line feeds as well as new paragraphs...
Converting from DOC to PDF is possible using phpLiveDocx:
$phpLiveDocx = new Zend_Service_LiveDocx_MailMerge();
$phpLiveDocx->setUsername('username')
->setPassword('password');
$phpLiveDocx->setLocalTemplate('document.doc');
// necessary as of LiveDocx 1.2
$phpLiveDocx->assign('dummyFieldName', 'dummyFieldValue');
$phpLiveDocx->createDocument();
$document = $phpLiveDocx->retrieveDocument('pdf');
file_put_contents('document.pdf', $document);
unset($phpLiveDocx);
For text to PDF, you can use the pdf extension in PHP.
You can view the examples here.
Have a look at this SO question. Conversions can be done using OpenOffice in command-line mode, though you'd have to search a bit for the conversion macros. I'm not saying it's lightweight though :)
See HTML_ToPDF. It also works for text.
It has been a long time since I touched PHP, but if you can make web service calls from it then try this product. It provides excellent conversion fidelity. It also supports additional formats including Infopath, Excel, PowerPoint etc as well as Watermarking support.
Please note that I have worked on this product so the usual disclaimers apply.
I have a PDF document and I want to check whether a specific text occurs in it (tags that I put in while generating the PDF). However, using these libraries (tcpdfFpdi, pdftk or fpdi) I couldn't figure out whether it's possible or how to do it.
$str = "{hello}";
$pdf = new TcpdfFpdi();
$pdf->setSourceFile($filePath);
$pdf->searchForText($str); // something like this which returns boolean
If I try dd(file_get_contents($filePath)) without any library, it returns a very long output that doesn't seem to contain the text I want, so I think it's better to use one of those libraries.
Just an idea…
It's not an actual PHP solution, but you could use a tool like pdftotext, which I know from this post (where a PDF file is converted into a string to count its words): https://superuser.com/a/221367/535203
You can install it and play around with that command and call it from within your PHP application.
As far as I remember (it's been a long time since I used pdftotext), the output text is not exactly the PDF's content, but for searching a few tags in it, it's at least worth a try.
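Calling pdftotext from PHP could look roughly like the sketch below. It assumes the pdftotext binary is installed and on the PATH; the helper names buildPdfToTextCommand and pdfContainsTag are made up for illustration.

```php
<?php
// Hypothetical helper: build the shell command that prints a PDF's text to stdout.
// The trailing "-" tells pdftotext to write to stdout instead of a .txt file.
function buildPdfToTextCommand(string $pdfPath): string
{
    return 'pdftotext ' . escapeshellarg($pdfPath) . ' -';
}

// Hypothetical helper: true if $tag occurs in the text extracted from the PDF.
// Assumes the pdftotext binary is installed and on the PATH.
function pdfContainsTag(string $pdfPath, string $tag): bool
{
    $text = shell_exec(buildPdfToTextCommand($pdfPath));
    // shell_exec() does not return a string when the command fails or prints nothing.
    if (!is_string($text)) {
        return false;
    }
    return strpos($text, $tag) !== false;
}

// Example call, using the variable names from the question:
// var_dump(pdfContainsTag($filePath, '{hello}'));
```

The extracted text may break a tag across lines or insert extra spaces, so short, distinctive tags are the safest to search for.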
I am trying to create a Microsoft Word document without using any third-party libraries. What I am trying to do is:
Create a template document in Microsoft Word
Save it as an XML File
Read this XML file and populate the data in PHP
I am able to do this so far. I would like to export the result in *.docx format. However, when I do that, Word throws an error when I try to open the file.
Error Message : File is corrupt and cannot be opened
However, when I save it as *.doc, I am able to open the word document.
Any idea what could be wrong? Do I need to use a library to export it to a docx file?
Thanks
Docx is not backwards-compatible with doc. Docx is a zipped format: Docx Tag Info.
I would recommend creating another template for the docx format, because the formats are so different.
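Since a .docx file is a ZIP archive, a docx template can be filled in with PHP's bundled ZipArchive class, without a third-party library. This is a minimal sketch, assuming the zip extension is enabled, that the template was saved by Word as .docx, and that placeholders such as {name} sit in the main part word/document.xml. The function name fillDocxTemplate and the placeholder syntax are made up for illustration.

```php
<?php
// Hypothetical helper: copy a .docx template and replace placeholders in its main XML part.
function fillDocxTemplate(string $templatePath, string $outputPath, array $values): bool
{
    // Work on a copy so the template itself stays untouched.
    if (!copy($templatePath, $outputPath)) {
        return false;
    }

    $zip = new ZipArchive();
    if ($zip->open($outputPath) !== true) {
        return false;
    }

    // The body text of a Word document lives in word/document.xml.
    $xml = $zip->getFromName('word/document.xml');
    if ($xml === false) {
        $zip->close();
        return false;
    }

    foreach ($values as $placeholder => $value) {
        // Escape the value so characters like & or < keep the XML well-formed.
        $escaped = htmlspecialchars((string) $value, ENT_XML1, 'UTF-8');
        $xml = str_replace($placeholder, $escaped, $xml);
    }

    // addFromString() replaces the existing entry of the same name.
    $zip->addFromString('word/document.xml', $xml);
    return $zip->close();
}

// Example call (file names are placeholders):
// fillDocxTemplate('template.docx', 'result.docx', ['{name}' => 'Alice']);
```

One caveat: Word sometimes splits a placeholder across several XML runs, in which case a plain str_replace will not find it.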
Also, you might want to check that your code is writing the correct encoding. Before I put it in the correct encoding I was getting odd letters that weren't compatible when I converted it into a .docx format. To do this I implemented it in the inputstream:
InputStreamReader isr= new InputStreamReader(template.getInputStream(entry), "UTF-8");
BufferedReader fileContents = new BufferedReader(isr);
I used this with enumeration for the entry, but the "UTF-8" puts it in the right format and eliminates the odd characters. I was also getting "null" typed out at the end of some of the xml's, so I eliminated that by taking it out (I brought the contents of each file into a string so I could manipulate it anyway):
String ending = "null";
while(sb.indexOf(ending) != -1){
sb.delete(sb.indexOf(ending), (sb.indexOf(ending) + ending.length()));
}
sb was the StringBuilder I put it into. This problem may have been solved by the UTF-8 fix, but I fixed it before I implemented the encoding, so I figured I'd include it in case it ends up being a problem. I hope this helps.
I'm using the GD Library to create images from data I'm pulling from an API.
The strings that are returned can sometimes be kind of lengthy, and I'm hoping to find a way to automatically create a new line for text if the string goes too far.
Is there something like this built into the GD library, or will I have to write some code to count the characters and move everything to a new line if it goes too long?
GD is strictly for drawing. You'll need a text layout engine such as Pango.
I am not familiar with a built-in function that automatically creates new lines, so I guess you need to write a PHP function that splits the string into substrings according to your maximum width and then use them in your image.
Consider looking at this post:
http://www.php.net/manual/en/function.imagestring.php#90481
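A splitting function along those lines could be sketched as follows. It assumes one of GD's built-in fonts (1 to 5), which are fixed-width, so the number of characters per line is simply the available width divided by imagefontwidth(). The helper name wrapTextToWidth and the sample sizes are made up for illustration. TrueType text would need per-line measuring with imagettfbbox() instead.

```php
<?php
// Hypothetical helper: split $text into lines that fit $maxWidthPx pixels,
// given a fixed character width (GD's built-in fonts 1-5 are monospaced).
function wrapTextToWidth(string $text, int $maxWidthPx, int $charWidthPx): array
{
    $charsPerLine = max(1, intdiv($maxWidthPx, $charWidthPx));
    // The final "true" also breaks single words that are longer than a line.
    return explode("\n", wordwrap($text, $charsPerLine, "\n", true));
}

// Usage with GD, guarded so the helper above also works without the extension.
if (extension_loaded('gd')) {
    $font  = 5; // largest built-in GD font
    $image = imagecreatetruecolor(200, 100);
    $white = imagecolorallocate($image, 255, 255, 255);
    $lines = wrapTextToWidth('A fairly long string returned by the API', 180, imagefontwidth($font));
    foreach ($lines as $i => $line) {
        // Draw each line one font-height below the previous one.
        imagestring($image, $font, 10, 10 + $i * imagefontheight($font), $line, $white);
    }
    // imagepng($image, 'wrapped.png') would then save the result.
}
```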
I need to convert single Powerpoint (PPT) slides/files to JPG or PNG format on linux but haven't found any way of doing so successfully so far. I have heard that it can be done with open office via php but haven't found any examples or much useful documentation. I'd consider doing it with python or java also, but I'm unsure which route to take.
I understand that it can be done using COM on a Windows server but would really like to refrain from doing so if possible.
Any ideas/pointers gratefully received. (And yes, I have searched the site and others before posting!)
Thanks in advance,
Rob Ganly
Quick answer (2 steps):
## First convert your presentation to PDF
unoconv -f pdf presentation.ppt
## Then convert your PDF to jpg
convert presentation.pdf presentation_%03d.jpg
And voilà.
Explaining a little more:
I had the same need: converting a PowerPoint set of slides into a set of images. I haven't found one tool that does exactly this, but I did find unoconv, which converts LibreOffice-supported formats to other formats, including JPG, PNG and PDF. The only drawback is that unoconv converts only one slide to a JPG/PNG file, whereas when converting to PDF it converts the whole presentation into a multi-page PDF file. So the answer was to convert the PPT to PDF and then use ImageMagick's convert to turn the multi-page PDF into a set of images.
Unoconv is distributed with Ubuntu:
apt-get install unoconv
And convert is distributed with the imagemagick package
apt-get install imagemagick
In my blog there is an entry about this
This can be done from PHP using a third-party library (Aspose.Slides). It will work on both .ppt and .pptx, and it's lightning fast.
Here is the relevant piece of code in PHP:
$runtime->RegisterAssemblyFromFile("libraries/_bin/aspose/Aspose.Slides.dll", "Aspose.Slides");
$runtime->RegisterAssemblyFromFullQualifiedName("System.Drawing, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a", "System.Drawing");
$sourcefile = "D:\\MYPRESENTATION.ppt";
$presentation = $runtime->TypeFromName("Aspose.Slides.Presentation")->Instantiate($sourcefile);
$format = $runtime->TypeFromName("System.Drawing.Imaging.ImageFormat")->Png;
$x = 0;
/** @var \NetPhp\Core\NetProxyCollection $slides */
$slides = $presentation->Slides->AsIterator();
foreach ($slides as $slide) {
$bitmap = $slide->GetThumbnail(1, 1);
// "{$x++}" is not valid inside a double-quoted string, so increment separately.
$destinationfile = "d:\\output\\slide_{$x}.png";
$x++;
$bitmap->Save($destinationfile, $format);
}
$presentation->Dispose();
It does not use Office Interop (which is NOT recommended for server-side automation) and is lightning fast.
You can control the output format, size and quality of the images. Indeed, you get a .NET Bitmap object, so you can do with it whatever you want.
The original post is here:
http://www.drupalonwindows.com/en/blog/powerpoint-presentation-images-php-drupal-example
Is there a quick-and-dirty way to access the "producer" metadata of a PDF file, using Regex or XML parsing, from a PHP application?
The technique does not have to be infallible. The objective is to prompt the user if they upload a PDF created using TeX.
You can hack the value out by looking for the producer or creator tag, but it might be encoded rather than available as plain ASCII.
On the command line, the following outputs the matching lines:
$ strings my.pdf | grep TeX
Producer (pdfTeX-1.40.10)
/Creator (TeX)
/PTEX.Fullbanner (This is pdfTeX, Version 3.1415926-1.40.10-2.2 (TeX Live 2009) kpathsea version 5.0.0)
You might do something similar in PHP, see Read plain text from binary file with PHP.
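A quick-and-dirty PHP version of the same idea could be a regex over the raw file contents. The sketch below is not infallible: it misses entries that are hex-encoded, stored as UTF-16, or inside compressed streams, and it does not handle nested parentheses. The helper names guessPdfTools and looksLikeTex are made up for illustration.

```php
<?php
// Hypothetical helper: pull the /Producer and /Creator strings out of raw PDF bytes.
function guessPdfTools(string $pdfBytes): array
{
    $found = [];
    foreach (['Producer', 'Creator'] as $key) {
        // Match "/Producer (some text)"; nested parentheses are not handled.
        if (preg_match('~/' . $key . '\s*\(([^)]*)\)~', $pdfBytes, $m)) {
            $found[$key] = $m[1];
        }
    }
    return $found;
}

// Hypothetical helper: true if either entry mentions TeX.
// The match is case-sensitive on purpose, so "iText" does not count as TeX.
function looksLikeTex(string $pdfBytes): bool
{
    foreach (guessPdfTools($pdfBytes) as $value) {
        if (strpos($value, 'TeX') !== false) {
            return true;
        }
    }
    return false;
}

// Example call (the path variable is a placeholder):
// $isTex = looksLikeTex(file_get_contents($uploadedPdfPath));
```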