Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
Any ideas on how to extract images from a PDF in PHP?
Take a look at pdfimages. Here is the description from the page:
Pdfimages saves images from a Portable Document Format (PDF) file as Portable Pixmap (PPM), Portable Bitmap (PBM), or JPEG files.
Pdfimages reads the PDF file PDF-file, scans one or more pages, and writes one PPM, PBM, or JPEG file for each image, image-root-nnn.xxx, where nnn is the image number and xxx is the image type (.ppm, .pbm, .jpg).
NB: pdfimages extracts the raw image data from the PDF file, without performing any additional transforms. Any rotation, clipping, color inversion, etc. done by the PDF content stream is ignored.
I believe you can use ImageMagick as well. You can pass it command-line arguments to render a page and crop out a region at coordinates you provide. You will need to install some packages first (RPMs, for example).
Check out PDFlib. Their TET product does just that: you can get the images and text out. The only thing it doesn't cover is vector images.
If you have an existing PDF file, I guess it's close to impossible to extract an image from it using pure PHP; maybe you'll have better luck with C. You would need to parse the binary file, decode and decompress its streams, find where the image is stored, and then copy it out.
It's easier if you just copy'n'paste it.
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 10 months ago.
I want to display a simple image gallery on a PHP webpage where the images shown are compressed, but the full-res JPEGs can still be downloaded. What method would you recommend for a project like this?
I'm thinking I store the full-res JPEG on my server and use server-side PHP imagecreatefromjpeg() and imagejpeg() to create a lower-res thumbnail of the image, with an option to download the original. Or I suppose I could store both the lower-res and high-res JPEGs on the server and just echo them out, but I would rather not store the lower-res copies if possible.
Are there any other options for a project like this? And if imagejpeg() is a good option, could someone show me how to use it?
Lower-rez images typically take far less disk space than hi-rez images, so disk space is extremely unlikely to be a limiting factor when you pre-create the smaller images.
And resizing and recompressing on the fly in response to every user request eats server power. Store the low-rez images: think green.
For what it's worth, WordPress (which runs roughly 40% of websites) resizes on upload and stores the resized images, so that approach is proven effective.
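As for how to use imagecreatefromjpeg() and imagejpeg(): here is a minimal sketch of creating the smaller copy once, at upload time, and saving it next to the original. It assumes the GD extension is loaded; the function names, the 400-pixel limit and the quality of 75 are illustrative choices, not recommendations from the thread.

```php
<?php
// Compute thumbnail dimensions so the longest side is at most $maxSide,
// keeping the aspect ratio. Images that are already small are never upscaled.
function thumbnailSize(int $width, int $height, int $maxSide): array
{
    $longest = max($width, $height);
    if ($longest <= $maxSide) {
        return [$width, $height];
    }
    return [
        max(1, (int) round($width * $maxSide / $longest)),
        max(1, (int) round($height * $maxSide / $longest)),
    ];
}

// Create a low-res copy of a JPEG and write it to $thumbPath.
// Requires the GD extension. Returns true on success, false on failure.
function createThumbnail(string $sourcePath, string $thumbPath, int $maxSide = 400, int $quality = 75): bool
{
    $source = imagecreatefromjpeg($sourcePath);
    if ($source === false) {
        return false;
    }
    [$w, $h] = thumbnailSize(imagesx($source), imagesy($source), $maxSide);
    $thumb = imagescale($source, $w, $h);
    if ($thumb === false) {
        return false;
    }
    // The third argument is the JPEG quality (0-100); lower means smaller files.
    return imagejpeg($thumb, $thumbPath, $quality);
}
```

The gallery page then links each stored thumbnail to the untouched full-res file, so nothing is resized per request.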
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I have written a REST application in Laravel. It accepts a JSON payload and creates a formatted PDF from this data.
Is it possible to write a test that checks the PDF has been generated correctly?
Edit:
Ideally I'd like to know that the PDF is not corrupted, i.e. that it will open in a PDF reader.
It would also be good to somehow check the content of the PDF. For example, does it contain the customer's name?
Thanks.
Yes, this is possible, provided that the same JSON input always generates the same PDF.
You wouldn't really check the PDF file. PDF is a complex format, based on PostScript and some dark magic.
What you can do is generate a “sample” PDF once, then write a unit test that uses the same input data to generate a PDF file, then compare this to your sample.
This would look something like (just some example code):
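// Generate the PDF under test (generatePdf() is assumed to return the raw PDF bytes).
$myPdf = $pdfGenerator->generatePdf();
// Load the reference PDF that was generated once and checked by hand.
$samplePdf = file_get_contents('/some/example/file.pdf');
// With PHPUnit: the two byte strings must be identical.
$this->assertSame($samplePdf, $myPdf);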
That's a bit dirty, but it does the job … if something in your PDF or JSON implementation changes, the unit test will make you aware of it.
It is important, however, that your PDF generator does not insert any “dynamic” data, such as date stamps. In that case, the PDF files could obviously never be identical.
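If the generator does write date stamps, one option is to blank them out on both sides before comparing, and to add a cheap sanity check for the "is it corrupted" part of the question. The sketch below rests on stated assumptions: both helper names are hypothetical, the header and trailer check is only a smoke test and not real validation, and only the /CreationDate and /ModDate entries are normalized (some generators also write other varying data, such as a trailer /ID, which this does not handle).

```php
<?php
// Smoke test: a PDF starts with "%PDF-" and has an "%%EOF" marker near the end.
// Passing this does not prove that a reader can open the file.
function looksLikePdf(string $pdfBytes): bool
{
    return strncmp($pdfBytes, '%PDF-', 5) === 0
        && strpos(substr($pdfBytes, -1024), '%%EOF') !== false;
}

// Replace the values of /CreationDate and /ModDate with a fixed placeholder
// so that two PDFs generated at different times can be compared byte for byte.
function normalizePdfDates(string $pdfBytes): string
{
    $normalized = preg_replace(
        '~/(CreationDate|ModDate)\s*\([^)]*\)~',
        '/$1 (D:00000000000000)',
        $pdfBytes
    );
    // preg_replace returns null on error; fall back to the untouched input.
    return $normalized ?? $pdfBytes;
}
```

In the PHPUnit test this becomes $this->assertTrue(looksLikePdf($myPdf)); followed by $this->assertSame(normalizePdfDates($samplePdf), normalizePdfDates($myPdf));.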
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I want to display some PDF files on my website, but I do not want people to be able to download them, similar to how scribd.com displays documents.
Can anyone help me?
If you want to make sure your user cannot download the original PDF file you will have to convert it to something else before.
Scribd converts the PDF to HTML (with one image in the background that contains all non-text objects). I am not aware of any PDF to HTML parsers, so you would have to write your own. Due to the nature of PDF files, this will unfortunately not be easy (see this question for some more details: Convert PDF to HTML in PHP?)
If you are fine with relying on some external web service, you might try this: https://cloudconvert.org/pdf-to-html
As an alternative to parsing the PDF to HTML, you could also just output it as an image. This is much easier to achieve but also not very nice in terms of user experience. If you choose this method, the easiest way would be to use ImageMagick and Ghostscript (see https://stackoverflow.com/a/467805):
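<?php
// Render only the first page ([0]) of the PDF; needs the Imagick extension plus Ghostscript.
$im = new Imagick('file.pdf[0]');
// Convert the rendered page to JPEG.
$im->setImageFormat('jpg');
header('Content-Type: image/jpeg');
// Send the encoded image bytes to the browser.
echo $im->getImageBlob();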
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I'm generating a PDF for every page using FPDF. It's working properly, but the main problem is the file size: some files are larger than 2 MB. How can I keep each file under 300 KB? Any help would be appreciated.
Typically PDFs are large because they contain large images or because they contain large fonts.
So the solution is typically to reduce the resolution of the images and to avoid the fonts getting embedded.
If FPDF will allow you to do this then this will likely solve your problem.
If not then you will need to post-process your PDF using another library to unembed the fonts and resample the images.
ABCpdf will do this using the ReduceSize operation. No doubt other libraries will allow something similar.
I work on the ABCpdf .NET software component so my replies may feature concepts based around ABCpdf. It's just what I know. :-)
The best way to compress a PDF file is to generate it first and then run it through a separate tool that compresses it. As far as my experience goes, there is no other way to compress a PDF file.
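As one example of such a post-processing tool (an assumption on my part; neither answer names it), Ghostscript can rewrite a PDF with downsampled images through its pdfwrite device. The sketch below only builds the command line; the function name is hypothetical, and there is no guarantee the result lands under any particular size, so check filesize() on the output afterwards.

```php
<?php
// Build a Ghostscript command that rewrites $inputPdf into a smaller $outputPdf.
// $preset is one of Ghostscript's -dPDFSETTINGS presets; /screen and /ebook
// downsample images the most aggressively.
function buildGhostscriptCompressCommand(string $inputPdf, string $outputPdf, string $preset = '/ebook'): string
{
    $allowed = ['/screen', '/ebook', '/printer', '/prepress', '/default'];
    if (!in_array($preset, $allowed, true)) {
        throw new InvalidArgumentException("Unknown preset: $preset");
    }
    return implode(' ', [
        'gs',
        '-sDEVICE=pdfwrite',
        '-dPDFSETTINGS=' . $preset,
        '-dNOPAUSE',
        '-dBATCH',
        '-dQUIET',
        '-sOutputFile=' . escapeshellarg($outputPdf),
        escapeshellarg($inputPdf),
    ]);
}
```

The resulting string can be passed to exec() once FPDF has written the original file to disk.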
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 6 months ago.
I am working on a web application where users can upload different kinds of files: MS Word (.doc and .docx), Excel (.xls and .xlsx), PowerPoint, PDF, text files and Rich Text Format files (.rtf).
As part of the application flow I would like to display a preview of the contents of each file in an iframe, using a PHP class. HTML would be best, but plain text is acceptable.
The approach I am using is:
Identify the extension of each file
Process each file differently
Display the text or HTML
Is there any library that does this?
There is no single library that solves the problem so I solved it using the following libraries for each file type:
a) MS Word documents - LiveDocx (http://www.phplivedocx.org/2009/08/13/convert-docx-doc-rtf-to-html-in-php/)
b) MS Excel - PHPExcel (http://phpexcel.codeplex.com/)
c) Text from PDF - class from this Pastebin http://pastebin.com/hRviHKp1
d) PowerPoint - still a work in progress
I have provided more details on my blog http://ssmusoke.wordpress.com/2012/06/16/display-contents-of-different-file-formats-wordexcelpowerpointpdfrtf-as-html/
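The "identify the extension, then process each file differently" flow can be sketched as a small lookup table. Everything here is illustrative: the handler labels are made-up names standing in for whichever library handles that type, and only the plain-text case is implemented, since it needs no library.

```php
<?php
// Map a file name to the kind of preview handler it needs, based on its extension.
// Returns null for file types that have no handler.
function previewHandlerFor(string $filename): ?string
{
    $handlers = [
        'doc' => 'word', 'docx' => 'word', 'rtf' => 'word',
        'xls' => 'excel', 'xlsx' => 'excel',
        'ppt' => 'powerpoint', 'pptx' => 'powerpoint',
        'pdf' => 'pdf',
        'txt' => 'text',
    ];
    $extension = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    return $handlers[$extension] ?? null;
}

// Plain text is the simplest case: escape it and wrap it in <pre> for display as HTML.
function previewPlainText(string $path): string
{
    $contents = (string) file_get_contents($path);
    return '<pre>' . htmlspecialchars($contents, ENT_QUOTES, 'UTF-8') . '</pre>';
}
```

The calling code can switch on the returned label and hand the file to the matching converter. Note that the extension comes from the uploader, so it is only a hint about the real content.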
I had a similar task a few years ago, for a kind of presentation library, and we ended up using OpenOffice in server mode with ImageMagick to generate thumbnail images of PowerPoint documents.
Basically the idea is to run OpenOffice and convert your documents to PDF and then use ImageMagick to create a thumbnail image of the first page of that PDF.
This guy here uses OpenOffice with another tool to convert documents: https://stackoverflow.com/a/1046159/626621 (could help you)
Advantage of this is, I think, that an image as a preview of the document will be more telling to your users than just the text.
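The two steps (convert to PDF, then thumbnail the first page) could look roughly like this. This is a sketch under assumptions, not the setup described above: it uses LibreOffice's one-shot headless command line (soffice --headless --convert-to) instead of OpenOffice in server mode, it needs the Imagick extension with Ghostscript for the second step, and all function names are hypothetical.

```php
<?php
// Step 1: build the command that converts an office document to PDF.
// The converted file is written into $outputDir.
function buildConvertToPdfCommand(string $documentPath, string $outputDir): string
{
    return implode(' ', [
        'soffice', '--headless', '--convert-to', 'pdf',
        '--outdir', escapeshellarg($outputDir),
        escapeshellarg($documentPath),
    ]);
}

// The converter keeps the base name and swaps the extension for .pdf.
function convertedPdfPath(string $documentPath, string $outputDir): string
{
    return rtrim($outputDir, '/') . '/' . pathinfo($documentPath, PATHINFO_FILENAME) . '.pdf';
}

// Step 2: render the first page of the PDF and save it as a PNG thumbnail.
function createFirstPageThumbnail(string $pdfPath, string $thumbPath, int $width = 200): void
{
    $image = new Imagick($pdfPath . '[0]'); // [0] selects the first page
    $image->setImageFormat('png');
    $image->thumbnailImage($width, 0);      // height 0 keeps the aspect ratio
    $image->writeImage($thumbPath);
}
```

Run the first command with exec(), then pass convertedPdfPath() to createFirstPageThumbnail().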