Looking to render complex fonts (with diacritics, joined glyphs, right to left text) in various languages/scripts, output is an image (not web page), ideally need to use PHP. The commonly built in graphics libraries for PHP, Imagick and GD, don't support complex fonts, I believe because the version of Freetype they come with doesn't support it.
I've looked into custom building PHP with the possible support but it looks horribly complex and messy.
Any thoughts on an easier solution for this?
Thanks
This is a problem I have run into a lot too. imagettftext is the most commonly used GD function for writing text on an image, but it can't render complex Unicode scripts, so it only works as long as your language doesn't need complex text shaping.
To render complex Unicode text you will probably need Imagick plus Pango installed on your server. Most hosting providers do not have Pango installed, so you may need a dedicated server for your application.
Most Linux distributions come with Pango preinstalled, so if you manage to install Imagick on your local Linux machine, the following code should work without any problem:
/* complex unicode string */
$text = "වෙබ් මත ඕනෑම තැනක";
$im = new \Imagick();
$background = new \ImagickPixel('none');
$im->setBackgroundColor($background);
$im->setPointSize(30);
$im->setGravity(\Imagick::GRAVITY_EAST);
$im->newPseudoImage(300, 200, "pango:" . $text );
$im->setImageFormat("png");
$image = imagecreatefromstring($im->getImageBlob());
//just for print out to the browser
ob_start();
imagepng($image);
$base64 = base64_encode(ob_get_clean());
$url = "data:image/png;base64,$base64";
echo "<img src='$url' />";
Let me know if you find any difficulties with the code
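Since not every host has Pango available, it may be worth checking for it before relying on the "pango:" prefix. A minimal sketch, assuming the imagick extension is the one in use; the function name hasPangoSupport is made up for illustration:

```php
<?php
// Hypothetical helper: reports whether this ImageMagick build lists a PANGO coder.
// Imagick::queryFormats() returns the supported formats matching the given pattern.
function hasPangoSupport(): bool
{
    return extension_loaded('imagick')
        && count(\Imagick::queryFormats('PANGO')) > 0;
}

if (!hasPangoSupport()) {
    echo "This ImageMagick build has no Pango support; complex scripts will not be shaped.\n";
}
```

If this prints the warning, the newPseudoImage() call with "pango:" is likely to fail on that server.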
I cannot comment, but if it is Unicode (AFAIK I think it is) you could use character map and echo it from the PHP code. You would probably need to insert a <meta> to make the page a certain variation of Unicode (sorry I don't know too much about Unicode).
Related
I'm trying now to convert a PDF file to JPG, using ImageMagick with PHP and CakePHP. The PDF is in perfect shape and it's right the way it should be, but the image generated from the PDF is always overflowing the borders of the file.
Until now, I've tried tweaking the generation code with no success, and I've read a lot of the PHP docs (http://php.net/manual/pt_BR/book.imagick.php).
Here is the conversion code:
$image = new Imagick();
$image->setResolution(300,300);
$image->setBackgroundColor('white');
$image->readImage($workfile);
$image->setGravity(Imagick::GRAVITY_CENTER);
$image->setOption('pdf:fit-to-page',true);
$image->setImageFormat('jpeg');
$image->setImageCompression(imagick::COMPRESSION_JPEG);
$image->setImageCompressionQuality(60);
$image->scaleImage(1200,1200, true);
$image->mergeImageLayers(Imagick::LAYERMETHOD_FLATTEN);
$image->setImageAlphaChannel(Imagick::ALPHACHANNEL_REMOVE);
$image->writeImage(WWW_ROOT . 'files' . DS . 'Snapshots' . DS . $filename);
Here are the results:
https://imgur.com/a/ISBmDMv
The first image is the PDF before the conversion and the second one is the image generated from the PDF, where the text on the right side overflows.
So, why is this happening? Alternatives for any of the tech used (Ghostscript, ImageMagick, etc.) are also welcome!
Thanks everyone!
It's very hard to say why you see the result you do without seeing the original PDF file, rather than a picture of it.
The most likely explanation is that your original PDF file uses a font but does not embed that font in the PDF. When Ghostscript comes to render it to an image, it must then substitute 'something' in place of the missing font. If the metrics (e.g. spacing) of the substituted font do not precisely match the metrics of the missing font, the rendered text will be misplaced or incorrectly sized. Of course, since it's not using the same font, the shapes of the characters won't match either.
This can result in several different kinds of problems, but what you show is pretty typical of one such class of problem. Although you haven't mentioned it, I can also see several places in the document where text overwrites as well, which is another symptom of exactly the same problem.
If this is the case then the Ghostscript back channel transcript will have told you that it was unable to find a font and is substituting a named font for the missing one. I can't tell you if ImageMagick stores that anywhere; my guess would be it doesn't. However, you can copy the command line from the ImageMagick delegates.xml file and use that to run Ghostscript yourself, and then you will be able to see if that's what is happening.
If this is what is happening then you must do one of the following:
Create your PDF file with the fonts embedded (this is good practice anyway)
Supply Ghostscript with a copy of the missing font as a substitute
Live with the text as it is
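To see the Ghostscript transcript mentioned above, you can run Ghostscript directly instead of going through ImageMagick. A rough sketch from PHP, assuming the gs binary is on the PATH; the helper names and file names are placeholders, and the exact wording of any font warning depends on your Ghostscript version, so the output is just printed rather than parsed:

```php
<?php
// Hypothetical helper: builds a plain Ghostscript command that renders a PDF to JPEG.
// -dNOPAUSE/-dBATCH run without prompting, -r sets the resolution in DPI.
function ghostscriptCommand(string $pdf, string $outPattern, int $dpi = 300): string
{
    return 'gs -dNOPAUSE -dBATCH -sDEVICE=jpeg -r' . $dpi
        . ' -sOutputFile=' . escapeshellarg($outPattern)
        . ' ' . escapeshellarg($pdf);
}

// Hypothetical helper: runs the command and prints everything Ghostscript reported,
// including anything it wrote to stderr (font lookups, substitutions, errors).
function printGhostscriptTranscript(string $pdf, string $outPattern): int
{
    $lines = [];
    exec(ghostscriptCommand($pdf, $outPattern) . ' 2>&1', $lines, $status);
    echo implode(PHP_EOL, $lines), PHP_EOL;
    return $status;
}

// Usage (placeholder file names):
// printGhostscriptTranscript('input.pdf', 'page-%03d.jpg');
```

Any message about a font that could not be found or was substituted points at the non-embedded font problem.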
We have an image processing microservice that creates rich images with text on top of them, and we are in the process of adding an Arabic locale to our website.
While translating some of the content to Arabic, our translator told us that the text in the generated images is not rendering correctly in Arabic: it appears with no ligatures, so every character is separated.
We currently use ImageMagick (6.8.9-9) with PHP, and the text generation used ImagickDraw->annotateImage, which worked fine until we hit the ligatures problem.
I googled a little and found "pango", which solves that problem and also allows a sort of "html" syntax to define multiple settings for one text line, which is pretty cool.
The problem is that I haven't found a way to use custom font files.
This is the code:
$img = new \Imagick;
$img->newImage($width, $height, new ImagickPixel('transparent'));
$img->setBackgroundColor(new ImagickPixel('transparent'));
$img->setFont($fontFile); // didn't work
$img->setPointSize($fontSize);
$img->setOption("font", $fontFile); // didn't work
$img->newPseudoImage($width, $height, "pango:<span font='".$fontName."' foreground='".$textColor."'>".$text."</span>"); // font='".$fontName."' - also didn't work
I also tried installing the font in Ubuntu and using only the font name, but that didn't work either, and neither did combining all or some of the options above.
I found this question:
PHP Imagick don't use custom font > convert does. But he said he used "caption" instead of pango, which doesn't provide that cool "html" syntax and does not solve the Arabic issue.
Please - if you know any solution I will love you forever!
Thanks.
I need to convert single Powerpoint (PPT) slides/files to JPG or PNG format on linux but haven't found any way of doing so successfully so far. I have heard that it can be done with open office via php but haven't found any examples or much useful documentation. I'd consider doing it with python or java also, but I'm unsure which route to take.
I understand that it can be done using COM on a Windows server but would really like to refrain from doing so if possible.
Any ideas/pointers gratefully received. (And yes, I have searched the site and others before posting!)
Thanks in advance,
Rob Ganly
Quick answer (2 steps):
## First convert your presentation to PDF
unoconv -f pdf presentation.ppt
## Then convert your PDF to jpg
convert presentation.pdf presentation_%03d.jpg
And voilà.
Explaining a little more:
I had the same need: converting a PowerPoint set of slides into a set of images. I haven't found a single tool that does exactly this, but I did find unoconv, which converts LibreOffice formats to other formats, including JPG, PNG and PDF. The only drawback is that unoconv converts just one slide when the target is JPG/PNG, whereas converting to PDF turns the whole presentation into a multi-page PDF file. So the answer was to convert the PPT to PDF and then use ImageMagick's convert to turn the multi-page PDF into a set of images.
Unoconv is available in the Ubuntu repositories:
apt-get install unoconv
And convert is distributed with the imagemagick package
apt-get install imagemagick
In my blog there is an entry about this
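Since the question asked about doing this from PHP, the two steps can be wrapped in a small script. This is only a sketch, assuming unoconv and ImageMagick's convert are installed and on the PATH; the function names and paths are made up for illustration:

```php
<?php
// Hypothetical helper: builds the two shell commands (PPT -> PDF, PDF -> JPGs).
function buildSlideCommands(string $pptPath, string $outputDir): array
{
    $dir  = rtrim($outputDir, '/');
    $base = pathinfo($pptPath, PATHINFO_FILENAME);
    $pdf  = $dir . '/' . $base . '.pdf';
    $jpgs = $dir . '/' . $base . '_%03d.jpg';

    return [
        // Step 1: convert the presentation to a multi-page PDF (-o sets the output file).
        'unoconv -f pdf -o ' . escapeshellarg($pdf) . ' ' . escapeshellarg($pptPath),
        // Step 2: split the PDF into one numbered JPG per page.
        'convert ' . escapeshellarg($pdf) . ' ' . escapeshellarg($jpgs),
    ];
}

// Hypothetical helper: runs both steps and stops at the first failure.
function convertSlides(string $pptPath, string $outputDir): void
{
    foreach (buildSlideCommands($pptPath, $outputDir) as $command) {
        $output = [];
        exec($command . ' 2>&1', $output, $status);
        if ($status !== 0) {
            throw new RuntimeException("Command failed: $command\n" . implode("\n", $output));
        }
    }
}

// Usage (placeholder paths):
// convertSlides('/tmp/presentation.ppt', '/tmp/slides');
```

escapeshellarg() is used so that file names with spaces or quotes do not break the command line.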
This can be done from PHP using a third-party library (Aspose.Slides). It works on both .ppt and .pptx, and it's lightning fast.
Here is the relevant piece of code in PHP:
$runtime->RegisterAssemblyFromFile("libraries/_bin/aspose/Aspose.Slides.dll", "Aspose.Slides");
$runtime->RegisterAssemblyFromFullQualifiedName("System.Drawing, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a", "System.Drawing");
$sourcefile = "D:\\MYPRESENTATION.ppt";
$presentation = $runtime->TypeFromName("Aspose.Slides.Presentation")->Instantiate($sourcefile);
$format = $runtime->TypeFromName("System.Drawing.Imaging.ImageFormat")->Png;
$x = 0;
/** @var \NetPhp\Core\NetProxyCollection $slides */
$slides = $presentation->Slides->AsIterator();
foreach ($slides as $slide) {
    $bitmap = $slide->GetThumbnail(1, 1);
    // An expression such as $x++ cannot be interpolated inside a string, so concatenate instead.
    $destinationfile = "d:\\output\\slide_" . $x++ . ".png";
    $bitmap->Save($destinationfile, $format);
}
$presentation->Dispose();
It does not use Office Interop (which is NOT recommended for server-side automation) and is lightning fast.
You can control the output format, size and quality of the images. Indeed you get a .Net Bitmap object so you can do with it whatever you want.
The original post is here:
http://www.drupalonwindows.com/en/blog/powerpoint-presentation-images-php-drupal-example
// I've added a new take on this, please see "Cheating PHP integers". Any help will be much appreciated. I've had the idea of trying to hack the storage of the arrays by packing the integers into unsigned bytes (I only need 8- or 16-bit integers, which would reduce the memory considerably).
Hi
I'm currently working on custom charset detection libraries. I created a port of Mozilla's charset detection algorithm and used chardet (the Python port) for a helping hand. However, this is extremely memory intensive in PHP (around 30 MB of memory if I just load the Western language detection). I've optimised all I can without rewriting it from scratch to load each piece on demand (this would reduce memory but make it a lot slower).
My question is: do you know of any LGPL PHP libraries that do charset detection?
This would be purely for research, to give me a slight guiding hand in the right direction.
I already know of mb_detect_encoding, but it's far too limited and brings up far too many false positives with the text files I have (yet Python's chardet detects them perfectly).
I created a method which encodes correctly to UTF-8. But it was hard to figure out what encoding the content currently has, so I came up with this solution:
<?php
function _convert($content) {
    // Re-encode when the content is not valid UTF-8 or does not survive
    // a UTF-8 -> UTF-32 -> UTF-8 round trip unchanged.
    if (!mb_check_encoding($content, 'UTF-8')
        OR !($content === mb_convert_encoding(mb_convert_encoding($content, 'UTF-32', 'UTF-8'), 'UTF-8', 'UTF-32'))) {
        $content = mb_convert_encoding($content, 'UTF-8');
        if (mb_check_encoding($content, 'UTF-8')) {
            // log('Converted to UTF-8');
        } else {
            // log('Could not convert to UTF-8');
        }
    }
    return $content;
}
?>
As you can see, I do a round-trip conversion (UTF-8 to UTF-32 and back) to check whether the content stays the same, and convert it if not. Maybe you can use some of this code.
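One thing to be aware of: when mb_convert_encoding() is called without a source encoding, it assumes the internal encoding (UTF-8 by default on current PHP), so it cannot recover text that is really in another charset. A small sketch that makes the source explicit instead; the function name toUtf8 and the Windows-1252 fallback are assumptions for illustration, not something mbstring can work out for you:

```php
<?php
// Hypothetical helper: returns $content as UTF-8.
// $fallback is an assumption about what non-UTF-8 input is encoded in.
function toUtf8(string $content, string $fallback = 'Windows-1252'): string
{
    if (mb_check_encoding($content, 'UTF-8')) {
        return $content; // already valid UTF-8, leave it untouched
    }
    // Not valid UTF-8: reinterpret the bytes using the assumed source encoding.
    return mb_convert_encoding($content, 'UTF-8', $fallback);
}

// Usage: "caf\xE9" is "café" in Windows-1252 and is not valid UTF-8.
echo toUtf8("caf\xE9"), PHP_EOL;
```

This still does no real detection; it only avoids silently mangling input when you have a reasonable guess about where the files came from.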
First of all, interesting project you are working on! I'm curious how the end product will turn out.
Have you taken a look at the ICU project already?
Has anyone come across PHP code that converts a text or doc file into PDF?
It has to keep the same formatting as the original txt or doc file, meaning line feeds as well as new paragraphs...
Converting from DOC to PDF is possible using phpLiveDocx:
$phpLiveDocx = new Zend_Service_LiveDocx_MailMerge();
$phpLiveDocx->setUsername('username')
->setPassword('password');
$phpLiveDocx->setLocalTemplate('document.doc');
// necessary as of LiveDocx 1.2
$phpLiveDocx->assign('dummyFieldName', 'dummyFieldValue');
$phpLiveDocx->createDocument();
$document = $phpLiveDocx->retrieveDocument('pdf');
file_put_contents('document.pdf', $document);
unset($phpLiveDocx);
For text to PDF, you can use the PDF extension in PHP.
You can view the examples here.
Have a look at this SO question. Using OpenOffice in command line mode for conversions can be done, though you'd have to search a bit for the conversion macros. I'm not saying it's lightweight though :)
See HTML_ToPDF. It also works for text.
It has been a long time since I touched PHP, but if you can make web service calls from it then try this product. It provides excellent conversion fidelity. It also supports additional formats including Infopath, Excel, PowerPoint etc as well as Watermarking support.
Please note that I have worked on this product so the usual disclaimers apply.