Metadata extraction from PNG images

How can I extract metadata from an image the way this website does? I have used the exiv2 library, but it gives only limited data compared to this website. Is there a more advanced library?
I have already tried the hachoir-metadata Python library.
Also, how does Windows extract the details of an image (the ones we see under Properties)?

PNG files are made up of blocks (the specification calls them chunks), most of which, in an average PNG, are IDAT blocks containing compressed pixel data. All PNGs start with an IHDR block and end with an IEND block. Because PNG is flexible in this way, it can be extended by defining new block types; this is how Animated PNG (APNG) works. All browsers can display the first frame, but only browsers that understand the block types used in APNG can play the animation.
There are many places where text data can live in a PNG image, and even more places where metadata can live. Here is a very convenient summary. You mentioned the "Description tag", which can only live in text blocks, so that is what I'll be focusing on.
The PNG standard contains three different types of text blocks: tEXt (Latin-1 encoded, uncompressed), zTXt (compressed, also Latin-1), and finally iTXt, which is the most useful of the three, as it can contain UTF-8 encoded text and can be either compressed or uncompressed.
So, your question becomes, "what is a convenient way to extract the text blocks?"
At first, I thought pypng could do this, but it cannot:
tEXt/zTXt/iTXt
Ignored when reading. Not generated.
Luckily, Pillow has support for this - humorously it was added only one day before you asked your original question!
So, without further ado, let's find an image containing an iTXt chunk: this example ought to do.
>>> from PIL import Image
>>> im = Image.open('/tmp/itxt.png')
>>> im.info
{'interlace': 1, 'gamma': 0.45455, 'dpi': (72, 72), 'Title': 'PNG', 'Author': 'La plume de ma tante'}
According to the source code, tEXt and zTXt are also covered.
For the more general case, looking over the other readers, the JPEG and GIF ones also seem to have good coverage of those formats, so I would recommend PIL for this. That's not to say that the maintainers of hachoir-metadata wouldn't appreciate a pull request adding text block support, though! :-)
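To make the block layout described above concrete, here is a minimal sketch in plain Python that walks the blocks by hand and decodes tEXt and zTXt entries. It is not how Pillow does it internally; the function names (iter_chunks, read_text_chunks, make_chunk) are made up for this illustration, iTXt and CRC verification are left out, and the sample bytes are a hand-built block sequence without pixel data rather than a viewable image.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(data):
    """Yield (type, body) for every block after the 8-byte PNG signature."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(data):
        # Each block: 4-byte big-endian length, 4-byte type, body, 4-byte CRC
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        yield ctype, data[pos + 8:pos + 8 + length]
        pos += 12 + length


def read_text_chunks(data):
    """Collect tEXt and zTXt entries into a dict (iTXt is not handled)."""
    texts = {}
    for ctype, body in iter_chunks(data):
        if ctype == b"tEXt":
            key, _, value = body.partition(b"\x00")
            texts[key.decode("latin-1")] = value.decode("latin-1")
        elif ctype == b"zTXt":
            key, _, rest = body.partition(b"\x00")
            # rest[0] is the compression method byte (0 = zlib/deflate)
            value = zlib.decompress(rest[1:])
            texts[key.decode("latin-1")] = value.decode("latin-1")
    return texts


def make_chunk(ctype, body):
    """Helper for the demo only: serialise one block including its CRC."""
    crc = zlib.crc32(ctype + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


# Hand-built sample: IHDR, one tEXt, one zTXt, IEND (no IDAT pixel data)
sample = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
    + make_chunk(b"tEXt", b"Title\x00PNG")
    + make_chunk(b"zTXt", b"Author\x00\x00" + zlib.compress(b"La plume de ma tante"))
    + make_chunk(b"IEND", b"")
)

print(read_text_chunks(sample))
```

For a real file, read it with open(path, 'rb').read() and pass the bytes to read_text_chunks; for anything beyond a quick inspection, Pillow's im.info remains the more convenient route.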

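I found this code for writing text blocks buried in a Pillow pull request:
from PIL import Image, PngImagePlugin
img = Image.open(filename) # filename: path to the source image
info = PngImagePlugin.PngInfo() # container for the text blocks to write
info.add_text("foo", "bar") # add a text block with key "foo" and value "bar"
img.save(filenew, "png", pnginfo=info) # filenew: path of the output PNG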

You can try this pre-alpha solution by Daniel Chesterton. I am not sure whether it is exactly what you want or only part of the solution you need, but I believe you can sort it out by playing with it.
https://github.com/dchesterton/image

Related

ImageMagick with PHP text overflowing PDF to JPG conversion

I'm trying to convert a PDF file to JPG using ImageMagick with PHP and CakePHP. The PDF is in perfect shape and looks exactly the way it should, but the image generated from the PDF always overflows the borders of the file.
So far, I've tried tweaking the generation code with no success, and I've read a lot of the PHP docs (http://php.net/manual/pt_BR/book.imagick.php).
Here is the conversion code:
$image = new Imagick();
$image->setResolution(300,300);
$image->setBackgroundColor('white');
$image->readImage($workfile);
$image->setGravity(Imagick::GRAVITY_CENTER);
$image->setOption('pdf:fit-to-page',true);
$image->setImageFormat('jpeg');
$image->setImageCompression(imagick::COMPRESSION_JPEG);
$image->setImageCompressionQuality(60);
$image->scaleImage(1200,1200, true);
$image->mergeImageLayers(Imagick::LAYERMETHOD_FLATTEN);
$image->setImageAlphaChannel(Imagick::ALPHACHANNEL_REMOVE);
$image->writeImage(WWW_ROOT . 'files' . DS . 'Snapshots' . DS . $filename);
Here are the results:
https://imgur.com/a/ISBmDMv
The first image is the PDF before the conversion and the second one, the image generated from the PDF where the right side text overflows.
So, why is this happening? And if anyone has an alternative for any of the tech used (Ghostscript, ImageMagick, etc.), that is also welcome!
Thanks everyone!

It's very hard to say why you see the result you do without seeing the original PDF file, rather than a picture of it.
The most likely explanation is that your original PDF file uses a font but does not embed that font in the PDF. When Ghostscript comes to render it to an image, it must then substitute 'something' in place of the missing font. If the metrics (e.g. spacing) of the substituted font do not match the metrics of the missing font precisely, then the rendered text will be misplaced or incorrectly sized. Of course, since it's not using the same font, it won't match the shapes of the characters either.
This can result in several different kinds of problems, but what you show is pretty typical of one such class of problem. Although you haven't mentioned it, I can also see several places in the document where text overwrites as well, which is another symptom of exactly the same problem.
If this is the case, then the Ghostscript back channel transcript will have told you that it was unable to find a font and is substituting a named font for the missing one. I can't tell you whether ImageMagick stores that anywhere; my guess would be that it doesn't. However, you can copy the command line from the ImageMagick delegates.xml file and use it to run Ghostscript yourself, and then you will be able to see whether that's what is happening.
If this is what is happening, then you must do one of the following:
Create your PDF file with the fonts embedded (this is good practice anyway)
Supply Ghostscript with a copy of the missing font as a substitute
Live with the text as it is

Imagemagick resizing control quality and extension

I am trying to learn Imagemagick, php.net docs are terrible T_T, and I cannot seem to find any answers to my questions. I am wanting to allow people to upload images then resize them and lose EXIF data.
Here's what I have currently.
$thumbnail = new Imagick("http://4.bp.blogspot.com/-hsypkqxCH6g/UGHEHIH43sI/AAAAAAAADGE/0JBu9izewQs/s1600/luna-llena1.jpg");
$thumbnail->thumbnailImage( 100, 100, true );
$thumbnail->writeImage( "avatar/thumbnail.jpg" );
Now, how do I control the format the image file is saved in? Let's say the user submits a gif/png/jpg: how would I go about taking that image and then saving it in the same input format, or changing them all to .png?

This IMO produces the best results for Imagick thumbnails:
Load the picture
$img = new imagick( $_FILES['Picture']['tmp_name'] );
Trim any excess off the picture
$img->trimImage(0);
Create the thumbnail, in this case, I'm using 'cropThumbnailImage'
$img->cropThumbnailImage( 180, 180 );
Set the format so all pics can now be the same standard format
$img->setImageFormat( 'jpeg' );
Set the Image compression to that of a jpg
$img->setImageCompression(Imagick::COMPRESSION_JPEG);
Set the quality to be 100
$img->setImageCompressionQuality(100);
The resulting thumbnail is then a little bit blurry IMO, so I add a slight sharpening effect to make it 'sharper'. Play around with these settings, but I like:
$img->unsharpMaskImage(0.5 , 1 , 1 , 0.05);

I agree, the PHP.net docs are not very helpful. I've found that it's easiest to find how to do things using commands, then match the commands up with the PHP methods. I'm a little late replying so you might have figured it out by now, but if not, or for the benefit of anyone else:
If you want to change the image format before saving, add this before your writeImage line:
$thumbnail->setImageFormat('png');
Then change the extension in your writeImage line to match, e.g. thumbnail.png
To change the quality, write:
$thumbnail->setImageCompressionQuality(40); // Adjust the number 40
In some cases you might also want to set the compression type by writing:
$thumbnail->setImageCompression(Imagick::COMPRESSION_JPEG);
You can find the COMPRESSION constants here: http://www.php.net/manual/en/imagick.constants.php
Note: These are just examples. This compression would not actually work with a png file.

exif_read_data - Incorrect APP1 Exif Identifier Code

I have a problem with some of my photos when I want to read EXIF data.
My code below:
$exif_date = exif_read_data($file_path, 'IFD0');
With some images I get this warning:
Message: exif_read_data(001.jpg) [function.exif-read-data]: Incorrect APP1 Exif Identifier Code
My question is: how can I avoid this warning? Can I somehow check whether APP1 is correct before calling exif_read_data?
Thanks for help.

For the quick answer, take a look at the last lines of this post.
I think some code is still missing. I came exactly across the same problem and after searching I found multiple websites related to this problem:
http://drupal.org/node/556970
a bug report with 2 solutions:
simply put an @ in front of exif_read_data
check $imageinfo['APP1'] if it contains Exif
After reading dcro's answer here, I found out that the second parameter of getimagesize() returns such an $imageinfo array. Now I tested one of my images with the following code:
<?php
getimagesize("test.jpg", $info);
var_dump($info);
?>
This returned the following:
array(1) {
["APP1"]=>
string(434) "http://ns.adobe.com/xap/1.0/<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Exempi + XMP Core 4.1.1">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about=""
xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:type>Image</dc:type>
<dc:format>image/jpeg</dc:format>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>"
}
This, by the way, doesn't look like Exif. It looks more like XMP, but the funny part is that exiftool, for example, finds some Exif data (orientation, for example). In the XMP specification I found that it is possible to have XMP and Exif data side by side in one file (page 18). Further searching revealed that there are scripts like this one to extract Exif from XMP.
Anyway, since
getimagesize() does not give me usable information about the Exif in my picture and
the stated script shows that in my image the Exif data is not embedded into the XMP data and
it simply works to suppress the exif_read_data() warning
I will still use the @exif_read_data($file_path) solution.

You can use PHP's getimagesize() function to extract the APP markers from the file and then verify if the APP1 marker actually contains EXIF data (the content for that marker should start with 'Exif')
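The same check can be sketched outside PHP. The following plain-Python illustration walks the JPEG segments, collects every APP1 payload, and reports whether any of them starts with the Exif identifier. The function names (app1_payloads, has_exif, segment) are invented for this sketch, and the parser is deliberately simplified: it assumes each marker is directly followed by a length field and stops at the start of the scan data, so unusual files with fill bytes between segments are not handled.

```python
import struct


def app1_payloads(data):
    """Return the payload of every APP1 (0xFFE1) segment before the scan data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file")
    payloads = []
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):  # start of scan / end of image
            break
        # The 2-byte big-endian length counts itself but not the marker
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == 0xE1:
            payloads.append(data[pos + 4:pos + 2 + length])
        pos += 2 + length
    return payloads


def has_exif(data):
    """True if some APP1 segment starts with the Exif identifier code."""
    return any(p.startswith(b"Exif\x00\x00") for p in app1_payloads(data))


def segment(marker, payload):
    """Helper for the demo only: serialise one marker segment."""
    return b"\xff" + bytes([marker]) + struct.pack(">H", len(payload) + 2) + payload


# Two hand-built byte strings: one with only an XMP APP1, one with an Exif APP1
xmp_only = b"\xff\xd8" + segment(0xE1, b"http://ns.adobe.com/xap/1.0/\x00<x/>") + b"\xff\xd9"
with_exif = b"\xff\xd8" + segment(0xE1, b"Exif\x00\x00MM") + b"\xff\xd9"

print(has_exif(xmp_only), has_exif(with_exif))
```

A file like the one in the question, whose only APP1 segment carries XMP, would be reported as having no Exif block by this check, so the Exif reader would simply not be called for it.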

On the fly pdf creation with 16bit colour depth png support in php

I'm trying to create pdf documents on the fly in an application, i.e. a user clicks a link and a pdf document is displayed to them with some text and some images.
I'm currently using FPDF v1.6 (http://www.fpdf.org/) which supports 24bit (true colour) png's but the problem I have is that this is a legacy application and there's 1000's of png's that are of 16bit colour depth which FPDF does not support and I can't simply convert due to other parts of the application using these images.
The only solutions I see are:
convert the 16bit png image on the fly and embed that into the pdf.
find a new PDF class that will accept 16bit colour depth png's.
Anyone have any ideas?

Maybe you could try using TCPDF (never used it with 16bit PNGs but it should be easy to test it).

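Fixed with this in Python:
import os
import tempfile
import cv2
import numpy as np
def fix_16_bit_depth_not_supported(raw_image_path):
    """
    Work around
    RuntimeError: FPDF error: 16-bit depth not supported: test.png
    by writing an 8-bit copy of the image and returning its path.
    """
    # Create an empty temporary .png file to hold the converted image
    new_file, filename = tempfile.mkstemp(suffix='.png')
    os.close(new_file)
    # IMREAD_UNCHANGED keeps the original 16-bit samples
    img = cv2.imread(raw_image_path, cv2.IMREAD_UNCHANGED)
    # Scale 0..65535 down to 0..255 (65535 // 257 == 255)
    convert = (img // 257).astype(np.uint8)
    cv2.imwrite(filename, convert)
    return filename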
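The scaling step is worth checking in isolation. A 16-bit sample ranges from 0 to 65535 and an 8-bit sample from 0 to 255; dividing by 257 maps the two endpoints onto each other exactly, whereas dividing by 255 produces values up to 257 that then have to be clipped. The sketch below is plain Python with a made-up function name (scale_16_to_8), independent of OpenCV and NumPy.

```python
def scale_16_to_8(sample):
    """Map one 16-bit sample (0..65535) to 8 bits (0..255)."""
    if not 0 <= sample <= 65535:
        raise ValueError("sample out of 16-bit range")
    # 65535 == 255 * 257, so integer division by 257 hits both endpoints
    return sample // 257


# Endpoints and a mid-range value
print(scale_16_to_8(0), scale_16_to_8(32768), scale_16_to_8(65535))

# For comparison: dividing by 255 overshoots the 8-bit range at the top end
print(65535 / 255)
```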

ImageMagick failing to convert to JPG

We recently installed the latest version of ImageMagick onto our Linux server. I seem to be having issues performing the most basic of tasks.
I am running this command line:
/usr/bin/convert /location/to/source/design.ai /location/to/save/output.jpg
Unfortunately, it saves the output as an Illustrator file (if I rename the file to output.ai, it opens). Even if I do this:
/usr/bin/convert /location/to/source/design.ai -rotate 90 /location/to/save/design.jpg
It rotates the file and saves again as an illustrator document. This happens with all filetypes (e.g. png, bmp, etc...)
It appears ImageMagick cannot figure out what I want it converted to and just saves as the same file type.
Any ideas on fixing this?
Regards:
John

(Yes, McKay is probably right. This question would be better placed on Server Fault.)
But I have an idea. Running 'convert' on its own prints a hint at the bottom:
To specify a particular image format, precede the filename
with an image format name and a colon (i.e. ps:image) or specify the
image type as the filename suffix (i.e. image.ps).
Perhaps convert gets confused by the path given.
So you could try this:
convert /location/to/source/design.ai output.jpg
or
convert /location/to/source/design.ai jpg:/location/to/save/output.jpg
Regards
Sigersted
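When the command is launched from a script, the explicit format prefix quoted above can be attached programmatically. This is only a sketch: build_convert_command and run_convert are hypothetical helper names, the binary is assumed to be reachable as convert, and whether the prefix cures the problem on the server in question is untested.

```python
import subprocess


def build_convert_command(source, destination, fmt, binary="convert"):
    """Build the argument list, forcing the output format with a 'fmt:' prefix."""
    return [binary, source, f"{fmt}:{destination}"]


def run_convert(source, destination, fmt):
    """Run the conversion; raises CalledProcessError if convert fails."""
    subprocess.run(build_convert_command(source, destination, fmt), check=True)


command = build_convert_command(
    "/location/to/source/design.ai", "/location/to/save/output.jpg", "jpg"
)
print(command)
```

Passing the arguments as a list avoids any shell quoting issues with the paths.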
