PrestaShop TCPDF validation errors - PHP

I'm using TCPDF to generate PDF files from HTML snippets in the PrestaShop (version 1.6) source code.
Everything looks fine when viewing the PDF in Adobe Reader. But when I try to send it via an electronic fax service (e-post.de), the upload fails, sadly without an error log or message.
Checking the generated file on http://www.pdf-tools.com/pdf/validate-pdfa-online.aspx, I get this result:
Validating file "AYTKXFQRB_TestEins.pdf" for conformance level pdfa-3b
The required XMP property 'pdfaid:part' is missing.
The required XMP property 'pdfaid:conformance' is missing.
The embedded font program 'AAAAAB+ArialMT' cannot be read.
A device-specific color space (DeviceRGB) without an appropriate output intent is used.
The embedded font program 'AAAAAC+ArialMT,Bold' cannot be read.
The glyph for character 8364 in font 'AAAAAB+ArialMT' is missing.
The glyph for character 8364 in font 'AAAAAC+ArialMT,Bold' is missing.
The width for character 0 in font 'AAAAAB+ArialMT' does not match.
The width for character 0 in font 'AAAAAC+ArialMT,Bold' does not match.
The document does not conform to the requested standard.
The document contains device-specific color spaces.
The document contains fonts without embedded font programs or encoding information (CMAPs).
The document's meta data is either missing or inconsistent or corrupt.
Done.
I think a lot of these warnings can be ignored. Files generated directly from Adobe InDesign also contain the majority of them. The only warnings that seem to hurt are these:
The required XMP property 'pdfaid:part' is missing.
The required XMP property 'pdfaid:conformance' is missing.
Any ideas how to resolve this issue?
EDIT:
Solution for the XMP error:
Set the $pdfa flag to true when calling the constructor to activate PDF/A mode. The constructor signature is:
public function __construct($orientation='P', $unit='mm', $format='A4', $unicode=true, $encoding='UTF-8', $diskcache=false, $pdfa=false)
So the call becomes, for example:
$pdf = new TCPDF('P', 'mm', 'A4', true, 'UTF-8', false, true);
But I still have to get rid of the font errors :-(

Related

Pdflib textfield in table cell - encoding breaks when losing focus

While entering text the first time, it seems fine. But as soon as the text field loses focus, the encoding seems to break, and it remains broken when entering the field again.
The following optlist was given to "add_table_cell":
fieldtype=textfield fieldname={price_5f215239aaa89} fitfield={multiline=true linewidth=1 font=1 fontsize=3 scrollable=false}
Edit
Pdflib version: 9.1.1
Font loaded with $pdflib->load_font($fontName, 'unicode', 'kerning=true embedding=true fontwarning=true');
The font is used by other elements in the Document without problems.
Font in Textflow
Font in Form Field
With PDFlib 9.1.1 you have to use a font with 8-bit encoding for a form field.
So you should load a second font with:
$fieldfont = $pdflib->load_font($fontName, 'winansi', 'simplefont embedding nosubsetting');
Please see the PDFlib 9.1.1 API Reference, chapter 4.1, table 4.2 for details on the simplefont option.
Then apply the $fieldfont font handle in the form field option list.

ImageMagick with PHP text overflowing PDF to JPG conversion

I'm trying to convert a PDF file to JPG using ImageMagick with PHP and CakePHP. The PDF is in perfect shape and looks exactly the way it should, but the image generated from it always overflows the borders of the file.
So far I've tried tweaking the generation code without success, and I've read a lot of the PHP docs (http://php.net/manual/pt_BR/book.imagick.php).
Here is the conversion code:
$image = new Imagick();
$image->setResolution(300, 300); // must be set before reading the PDF
$image->setBackgroundColor('white');
$image->readImage($workfile);
$image->setGravity(Imagick::GRAVITY_CENTER);
$image->setOption('pdf:fit-to-page', 'true');
$image->setImageFormat('jpeg');
$image->setImageCompression(Imagick::COMPRESSION_JPEG);
$image->setImageCompressionQuality(60);
$image->scaleImage(1200, 1200, true);
// mergeImageLayers() returns a new Imagick object rather than modifying in place
$image = $image->mergeImageLayers(Imagick::LAYERMETHOD_FLATTEN);
$image->setImageAlphaChannel(Imagick::ALPHACHANNEL_REMOVE);
$image->writeImage(WWW_ROOT . 'files' . DS . 'Snapshots' . DS . $filename);
Here are the results:
https://imgur.com/a/ISBmDMv
The first image is the PDF before the conversion, and the second is the image generated from it, where the text overflows on the right side.
So, why is this happening? Alternatives for any of the tech used (Ghostscript, ImageMagick, etc.) are also welcome!
Thanks everyone!
It's very hard to say why you see the result you do without seeing the original PDF file, rather than a picture of it.
The most likely explanation is that your original PDF file uses a font but does not embed that font in the PDF. When Ghostscript comes to render it to an image, it must then substitute 'something' in place of the missing font. If the metrics (e.g. spacing) of the substituted font do not precisely match the metrics of the missing font, the rendered text will be misplaced or incorrectly sized. Of course, since it's not using the same font, it also won't match the shapes of the characters.
This can result in several different kinds of problems, but what you show is pretty typical of one such class of problem. Although you haven't mentioned it, I can also see several places in the document where text overwrites itself, which is another symptom of exactly the same problem.
If this is the case, then the Ghostscript back-channel transcript will have told you that it was unable to find a font and is substituting a named font for the missing one. I can't tell you whether ImageMagick stores that anywhere; my guess is that it doesn't. However, you can copy the command line from the ImageMagick delegates.xml file and then use it to run Ghostscript yourself, and then you will be able to see whether that is what is happening.
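To illustrate, here is a minimal Python sketch (the helper names and option set are my own, not from the answer) that assembles a Ghostscript command line roughly like the one ImageMagick delegates to, and runs it while capturing the console transcript where font-substitution messages appear. It assumes `gs` is on the PATH.

```python
import subprocess

def build_gs_cmd(pdf_path, out_path, dpi=300):
    """Assemble a Ghostscript argv list roughly matching ImageMagick's delegate call."""
    return [
        "gs", "-dSAFER", "-dBATCH", "-dNOPAUSE",
        "-sDEVICE=jpeg",             # render to JPEG
        f"-r{dpi}",                  # resolution in dpi
        f"-sOutputFile={out_path}",
        pdf_path,
    ]

def render_and_capture(pdf_path, out_path):
    """Run Ghostscript and return its console transcript (stdout + stderr).

    Look for lines such as 'Substituting font ...' or a message that a
    font cannot be found to confirm missing-font substitution.
    """
    result = subprocess.run(build_gs_cmd(pdf_path, out_path),
                            capture_output=True, text=True)
    return result.stdout + result.stderr
```

If the transcript mentions substituting a font, the overflow in the JPG is almost certainly the metrics mismatch described above.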
If this is what is happening, then you must either:
Create your PDF file with the fonts embedded (this is good practice anyway)
Supply Ghostscript with a copy of the missing font as a substitute
Live with the text as it is
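As a quick check for the first option, you can scan the PDF's raw bytes for /FontFile, /FontFile2, or /FontFile3 keys, which are where embedded font programs hang off a font descriptor. A minimal Python sketch (my own heuristic, not from the answer; it will miss descriptors stored inside compressed object streams in PDF 1.5+):

```python
import re

def embedded_font_count(pdf_bytes: bytes) -> int:
    """Count /FontFile, /FontFile2 and /FontFile3 keys in raw PDF bytes.

    Zero hits for a PDF that draws text is a strong hint that its fonts
    are not embedded. Descriptors hidden inside compressed object
    streams are not visible to this naive scan.
    """
    return len(re.findall(rb"/FontFile[23]?\b", pdf_bytes))
```

Usage: read the file with `open(path, "rb").read()` and compare the count against the number of fonts the document is supposed to use.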

Can't use custom fonts to PHP Imagick with Pango / another solution for arabic ligatures

We have an image-processing microservice that creates rich images with text on top of them, and we are in the process of adding an Arabic locale to our website.
While translating some of the content to Arabic, our translator told us that the text in the generated images is not rendering correctly in Arabic: it appears with no ligatures, so every character is separated.
We currently use ImageMagick (6.8.9-9) with PHP, and the text generation used ImagickDraw->annotateImage, which worked fine until we hit that ligatures problem.
I googled a little and found Pango, which solves that problem and also allows an "html"-like syntax to define multiple settings for one text line, which is pretty cool.
The problem is that I didn't find a way to use custom font files.
This is the code:
$img = new \Imagick;
$img->newImage($width, $height, new ImagickPixel('transparent'));
$img->setBackgroundColor(new ImagickPixel('transparent'));
$img->setFont($fontFile); // didn't work
$img->setPointSize($fontSize);
$img->setOption("font", $fontFile); // didn't work
$img->newPseudoImage($width, $height, "pango:<span font='".$fontName."' foreground='".$textColor."'>".$text."</span>"); // font='".$fontName."' - also didn't work
I also tried installing the font in the Ubuntu OS and using only the font name, but that didn't work either, nor did combining all of the options above or some of them.
I found this question:
PHP Imagick don't use custom font > convert does. But the asker used "caption" instead of pango, which doesn't provide that cool "html" syntax and doesn't solve the Arabic issue.
Please - if you know any solution I will love you forever!
Thanks.

How to fix dotted output when adding a custom font in TCPDF?

I want to add a custom font. I converted an OTF file to TTF and load it via:
$std = \TCPDF_FONTS::addTTFfont($frutigerStd, 'TrueTypeUnicode', '', 96);
This command seems to do something, as the return value is set: $std will have the value frutigerltstdcn.
I then use the font, in my extended TCPDF class, via:
$this->SetFont($std);
Yet once I open my generated pdf, Adobe Reader will declare:
Cannot extract the embedded font 'AAAAAC+FrutigerLTStd-Cn'.
Some characters may not display or print correctly.
And true enough, the result is a dotted mess:
What am I missing or doing wrong?
It turned out that my font was erroneous. It was generated by converting an OTF font to TTF, and even though I can use the generated font within macOS, it has issues with TCPDF. Once I got the font as an actual TrueType file, the issue was resolved.
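For anyone hitting the same issue: the first four bytes of the file (the sfnt version tag) give a quick sanity check on what a converter actually produced. A small Python sketch (stdlib only; the function name is my own): 0x00010000 or 'true' means TrueType (glyf) outlines, while 'OTTO' means the file still carries CFF/PostScript outlines, i.e. it was not really converted to TrueType.

```python
def sfnt_flavor(font_bytes: bytes) -> str:
    """Classify a font file by its 4-byte sfnt version tag."""
    tag = font_bytes[:4]
    if tag in (b"\x00\x01\x00\x00", b"true"):
        return "truetype"    # TrueType (glyf) outlines
    if tag == b"OTTO":
        return "cff"         # OpenType with CFF/PostScript outlines
    if tag == b"ttcf":
        return "collection"  # TrueType/OpenType collection
    return "unknown"
```

Usage: `sfnt_flavor(open(path, "rb").read(4))`; anything other than "truetype" is worth investigating before feeding the file to addTTFfont.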

Metadata extraction from PNG images

How do I extract metadata from an image the way this website does? I have used the exiv2 library, but it gives only limited data compared to this website. Is there a more advanced library?
I have already tried the hachoir-metadata Python library.
Also, how does Windows extract image details (the ones we see in Properties)?
PNG files are made up of chunks, most of which, in an average PNG, are IDAT chunks containing compressed pixel data. All PNGs start with an IHDR chunk and end with an IEND chunk. Since PNG is a very flexible standard in this way, it can be extended by making up new types of chunks--this is how Animated PNG (APNG) works. All browsers can see the first frame, but browsers which understand the chunk types used in APNG can see the animation.
There are many places that text data can live in a PNG image, and even more places metadata can live. Here is a very convenient summary. You mentioned the "Description" tag, which can only live in text chunks, so that is what I'll be focusing on.
The PNG standard contains three different types of text chunks: tEXt (Latin-1 encoded, uncompressed), zTXt (also Latin-1, but compressed), and finally iTXt, which is the most useful of the three as it can contain UTF-8 encoded text and can be either compressed or uncompressed.
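The chunk layout described above is simple enough to walk with the standard library alone. A minimal Python sketch (my own illustration): after the 8-byte PNG signature, each chunk is a 4-byte big-endian length, a 4-byte type, the data, and a 4-byte CRC, so tEXt and zTXt entries can be pulled out directly.

```python
import struct
import zlib

PNG_SIG = b"\x89PNG\r\n\x1a\n"

def iter_chunks(data: bytes):
    """Yield (chunk_type, chunk_data) pairs from raw PNG bytes."""
    assert data[:8] == PNG_SIG, "not a PNG file"
    pos = 8
    while pos < len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        ctype = data[pos + 4:pos + 8]
        cdata = data[pos + 8:pos + 8 + length]
        yield ctype, cdata
        pos += 12 + length  # 4 (length) + 4 (type) + data + 4 (CRC)

def text_chunks(data: bytes) -> dict:
    """Collect tEXt and zTXt entries as {keyword: text} (both Latin-1)."""
    out = {}
    for ctype, cdata in iter_chunks(data):
        if ctype == b"tEXt":
            key, _, val = cdata.partition(b"\x00")
            out[key.decode("latin-1")] = val.decode("latin-1")
        elif ctype == b"zTXt":
            key, _, rest = cdata.partition(b"\x00")
            # rest[0] is the compression method byte (0 = deflate)
            out[key.decode("latin-1")] = zlib.decompress(rest[1:]).decode("latin-1")
    return out
```

(iTXt adds language and translated-keyword fields plus optional compression, so a real extractor needs a little more parsing for it.)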
So, your question becomes: what is a convenient way to extract the text chunks?
At first, I thought pypng could do this, but it cannot:
tEXt/zTXt/iTXt
Ignored when reading. Not generated.
Luckily, Pillow has support for this; amusingly, it was added only one day before you asked your original question!
So, without further ado, let's find an image containing an iTXt chunk: this example ought to do.
>>> from PIL import Image
>>> im = Image.open('/tmp/itxt.png')
>>> im.info
{'interlace': 1, 'gamma': 0.45455, 'dpi': (72, 72), 'Title': 'PNG', 'Author': 'La plume de ma tante'}
According to the source code, tEXt and zTXt are also covered.
For the more general case, looking over the other readers, the JPEG and GIF ones also seem to have good coverage of those formats as well - so I would recommend PIL for this. That's not to say that the maintainers of hacoir-metadata wouldn't appreciate a pull request adding text block support though! :-)
I found this code buried in a Pillow pull request
from PIL import PngImagePlugin
info = PngImagePlugin.PngInfo()  # container for the text chunks to write
info.add_text("foo", "bar")  # adds a tEXt chunk with keyword "foo"
img.save(filenew, "png", pnginfo=info)  # write the chunks into the new file
You can try this pre-alpha solution by Daniel Chesterton. I am not sure whether it is exactly what you want or only part of the wanted solution, but I believe you can sort it out by playing with it.
https://github.com/dchesterton/image
