Creating printable content with Php/JavaScript/Html/CSS


I work for a care centre that would like a feature on their website where friends and family can choose from a selection of care cards to deliver to someone they know. They will be able to choose a title, an image and type in some text on the card that we assemble and deliver. They need me to make an application for them that assembles the cards in a printer-friendly fashion (placing text and images in the right areas) that they will print and fold before delivery.
Image of what I am trying to create: http://i.imgur.com/f8GnD.png
Reading about how to do this I realize that I have two issues:
Size of card on-screen can't be fixed due to printer DPI
Should I use html/CSS to make a table with 4 cells to create this card? Php image library? JavaScript?
Any help would be great.

I have the best luck, in terms of printing, with PDFs. The document format is nice, too, because it is portable and the user may choose to print somewhere other than where they accessed your site.
The best PDF-generating library I've used for PHP is FPDF: http://www.fpdf.org/
PDFs are great for printing full-page documents. All but the most ancient operating systems provide users the ability to open and print PDFs, and because PDF is a document format the printed output is fairly consistent between systems and printers.
The other route you suggest is certainly possible: you can build it up using HTML and CSS. There are serious drawbacks to this, however. Foremost, each user will have different print settings in their browser, and browsers are not configured by default for clean full-page printing. Most user agents add page numbers, margins, the date and time, and the URL. In short, printing from the browser relies on the user tinkering with their print settings, and there is little you can reliably do to influence those settings from your end.

There are third-party utilities that generate PDFs on the server, based on your HTML. PDFs have solved many print-related issues internally so you don't have to worry about them yourself.
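Part of the reason PDF sidesteps the DPI worry in the question is that positions are given in physical units (points, 1/72 inch) rather than pixels. As a rough sketch only, assuming a US Letter sheet folded into four equal panels (the sheet size, panel names and function names below are illustrative assumptions, not taken from the question's image), the panel rectangles could be worked out like this:

```python
from dataclasses import dataclass

POINTS_PER_INCH = 72.0   # PDF unit: 1 point = 1/72 inch
MM_PER_INCH = 25.4


def mm_to_pt(mm: float) -> float:
    """Convert millimetres to PDF points."""
    return mm * POINTS_PER_INCH / MM_PER_INCH


@dataclass(frozen=True)
class Rect:
    x: float       # left edge, in points
    y: float       # top edge, in points, measured from the top of the sheet
    width: float
    height: float


def quarter_fold_panels(sheet_w_pt: float, sheet_h_pt: float) -> dict:
    """Split a sheet into four equal panels for a card that is folded twice.

    The panel names are hypothetical; which panel becomes the cover
    depends on how the card is folded.
    """
    w, h = sheet_w_pt / 2, sheet_h_pt / 2
    return {
        "top_left": Rect(0, 0, w, h),
        "top_right": Rect(w, 0, w, h),
        "bottom_left": Rect(0, h, w, h),
        "bottom_right": Rect(w, h, w, h),
    }


# US Letter is 8.5 x 11 inches, i.e. 612 x 792 points.
panels = quarter_fold_panels(8.5 * POINTS_PER_INCH, 11 * POINTS_PER_INCH)
print(panels["bottom_right"])
```

The same rectangles can then be passed to whatever text and image placement calls the chosen PDF library offers. Depending on the fold, the panels on one half of the sheet may need to be rotated 180 degrees so that they read the right way up once folded.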

Related

Save html file as PDF

I'm using a PHP output buffer to capture a dynamic 'Data Review' page as HTML, which I then save to the server as an HTML file. I would like to create a PDF from this stored HTML file automatically. Every solution I've looked at requires putting the HTML code into a variable, whereas I already have the .html file on the server, and I can't seem to find a solution that converts it directly.
The overall idea here is to supply the user a 'copy' of the data review via email, so I assumed a PDF would be best, but if there are any other suggestions, I would happily consider something else.
Any help would be greatly appreciated.
Thank you!

I've looked heavily into generating PDFs in PHP and so here is what I've found over a few years...
PDF Conversion tools
FPDF
This option is really good if you want to build a PDF file piece by piece through direct drawing calls (I'll call this the PDF method, because you literally generate the PDF one element at a time).
Features include:
Choice of measure unit, page format and margins
Page header and footer management
Automatic page break
Automatic line break and text justification
Image support (JPEG, PNG and GIF)
Colors
Links
TrueType, Type1 and encoding support
Page compression
Notes
Performance: Fast
Cost: Free
Ease of use: Difficult
Difficult to use unless you play a lot with it.
Good documentation.
Other:
Duplication of files (need to have HTML version of a page and an FPDF version of a page if you need to generate PDFs)
MPDF
This option is really good if you want to generate a PDF file from HTML and CSS and still have additional and extensive PDF customization.
Features include:
PDF generation from UTF-8 encoded HTML
It is based on FPDF and HTML2FPDF with a number of enhancements
Notes
Performance: Mediocre
Not the fastest but does the job
Cost: Free
Ease of use: Easy
The hardest part is knowing what is and is not valid HTML and CSS for MPDF.
Great documentation.
Not all CSS is supported and some CSS is extended causing some confusion
PrinceXML
This option is probably the best if you want high performance and high reliability.
Features include:
Powerful Layout
Headers and footers
Page numbers, duplex printing
Tables, lists, columns, floats
Footnotes, cross-references
Web Standards
HTML, XHTML, XML, SVG
Cascading Style Sheets (CSS)
JavaScript/ECMAScript
JPEG, PNG, GIF, TIFF
PDF Output
Bookmarks, links, metadata
Encryption and Document Security
Font embedding and subsetting
PDF attachments
Easy Integration
PHP and Ruby on Rails
Java class for servlets
.NET for C# and ASP
ActiveX/COM for VB6
Fonts & Unicode
OpenType fonts, TrueType and CFF
Kerning, Ligatures, Small Caps
Chinese, Japanese, Korean, Arabic, Hebrew, Hindi and others
Friendly Support
Prompt email support
Web forum, user guide
Regular upgrades
Notes
Performance: Fast
Pricing: $$$
Server License
1 license - $3,800
2 license - $3,420
3 license - $3,040
4 license - $2,850
5+ license - $2,800
OEM (with minimum commitment of 2 years, can be run on any number of servers; so you can create a server farm if you really need)
20,000 documents/month at $5,000
100,000 documents/month at $7,500
500,000 documents/month at $10,000
They also have an academic discount of 50% at $1,900 and a Desktop License for $495, as well as other plans (see their website for the full list).
Ease of use: Easy
I have not used PrinceXML directly (pricey), but we are currently looking into this as an option for our business.
DocRaptor
This option is really good if you want a high quality API. This is a cloud-hosted option for creating PDF and XLS files. Uses PrinceXML in the backend.
Features include:
You just send HTML, JS, and CSS
Uptime guaranteed
Unlimited document size
Expert support, including document debugging
Pretty much offers everything that PrinceXML does, but double check with their support or documentation for anything specific you may require.
API-based: Works with PHP, NodeJS, Ruby, Python, Java, C#
Notes
Performance: Fast
Depends on internet connection, so if your internet goes down, so does this part of your code.
Pricing: $ - $$$
Currently, their pricing plans are as follows (taken from their website):
Basic - 125 docs/mo - $15/mo
Professional - 325 docs/mo - $29/mo
Premium - 1,250 docs/mo - $75/mo
Max - 5,000 docs/mo - $149/mo
Bronze - 15,000 docs/mo - $399/mo
Silver - 40,000 docs/mo - $1,000/mo
Gold - 100,000 docs/mo - $2,250/mo
Enterprise - ∞ docs/mo - unlisted (contact them)
Ease of use: Very easy
Probably the easiest because you don't actually deal with the document or setup, etc. You just send your files and get a PDF back.
Great documentation
I contacted their support in the past and it was actually very helpful.
They use a proprietary JavaScript engine that allows you to use delayed or asynchronous JavaScript
wkhtmltopdf
This option is really good if you want the next best thing behind the purchased options above (PrinceXML and DocRaptor).
Features include:
Uses the Qt WebKit rendering engine
Create your HTML document that you want to turn into a PDF (or image). Run your HTML document through the tool.
Notes
Performance: Fast
Cost: Free
Ease of use: Easy
Uses command line unless you use a library such as the one created by MikeHaertl
We currently use this option and find it performs very well and has great support for HTML tags and CSS properties.
If the pages to be converted need variables, you cannot rely on $_SESSION: wkhtmltopdf is run from the command line and requests the page as a separate browser, so it does not share the user's session. Pass everything the page needs as $_GET variables instead.
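To illustrate the point about passing state in the query string, here is a minimal sketch that only builds the command line (it does not run anything). It relies on the basic wkhtmltopdf usage of an input URL or file followed by an output file; the URL, parameter names and output path are made up for the example, and Python is used purely for illustration:

```python
import shlex
from urllib.parse import urlencode


def build_wkhtmltopdf_command(page_url: str, params: dict, output_path: str) -> list:
    """Return an argv list of the form: wkhtmltopdf <input url> <output file>.

    Everything the page needs travels in the query string, because the
    wkhtmltopdf process does not share the visitor's PHP session.
    """
    query = urlencode(params)
    separator = "&" if "?" in page_url else "?"
    full_url = f"{page_url}{separator}{query}" if query else page_url
    return ["wkhtmltopdf", full_url, output_path]


# Hypothetical URL, parameters and output path, for illustration only.
command = build_wkhtmltopdf_command(
    "https://example.com/review.php",
    {"review_id": 42, "token": "abc123"},
    "/tmp/review.pdf",
)
print(shlex.join(command))

# To actually run it (requires wkhtmltopdf to be installed):
#   import subprocess
#   subprocess.run(command, check=True)
```

Since the input can also be a local file path, the same call shape covers an HTML file that is already saved on the server. If the page is private, a short-lived token in the query string is probably safer than passing the sensitive values themselves.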
Other options (many taken from a related question)
Cloud-based
HTM2PDF
PDFmyURL
PDFCrowd
PDFLayer
RotativaHQ
Client-side
jsPDF
Server-side
TCPDF - Many people recommended this option
ZendPDF - Part of Zend Framework
flying-saucer - Java library usable via system()
CutyCapt
PhantomJS
Snappy
DOMPDF
HTML2PDF
PDFReactor
HTML2PS
Apache FOP
PHP - PHP has its native library for creating PDFs, I assume this is probably one of the most difficult ways to go about doing this, but if you're really adventurous, why not?
PDFLib - Many other libraries are based off this one
ReportLab - Python-based
iText - Java-based
ActivePDF
WeasyPrint - Python-based. This is apparently really good?
xHTML2PDF - Python-based
Other options
We deal with many vendors. Some vendors send us PDFs for their invoices or other documents while others send us HTML emails (with all our invoice information in it), and some others even send us links to the invoices.
The easiest option is to create the document in HTML and send users a link to that document (secured obviously). This would allow users to view the invoice whenever they want (and from any device with a browser) and would also allow them to print from the browser if needed. This method also generates traffic to your website which is usually also beneficial to the business.
What we've done in the past is create a link to the file on the website (secured) so that they can view it in the browser, and then have a button to download the invoice (which just downloads a PDF version of that webpage generated with one of the PDF Conversion tools listed above - currently wkhtmltopdf).
In my opinion, the best method would be to combine all delivery approaches into one. Send an email with the file information in the email's HTML content and attach a PDF of that file. Inside the header portion of the email content (at the top of the email), send a link giving the recipient direct access to the webpage containing all the information (located within their account in your secure portal). This allows them to view it in the browser just in case they can't view it properly in their email and in case they don't have a PDF viewer (I know it's rare nowadays, but you'd be surprised just how many people out there have outdated systems - we still need to send faxes to some clients because they still don't have emails; yes still now in 2017, sigh...). On your website, also provide them with a download link for the PDF document (which would again just take the page they are currently on and convert it into a PDF and automatically download it through the browser).
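The combined delivery described above (link at the top, HTML content in the body, PDF attached) might look roughly like the following. This is a sketch using Python's standard email library purely for illustration; the addresses, subject, URL and file name are placeholders, and the PDF bytes would come from one of the conversion tools listed earlier:

```python
from email.message import EmailMessage


def build_review_email(sender, recipient, portal_url, html_body, pdf_bytes):
    """Build one message carrying the portal link, the HTML content and the PDF."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "Your data review"

    # Plain-text fallback with the portal link at the top.
    message.set_content(f"View this document online: {portal_url}\n")

    # HTML alternative: the link first, then the document content itself.
    message.add_alternative(
        f'<p><a href="{portal_url}">View this document online</a></p>{html_body}',
        subtype="html",
    )

    # Attach the PDF rendering of the same page.
    message.add_attachment(
        pdf_bytes,
        maintype="application",
        subtype="pdf",
        filename="data-review.pdf",
    )
    return message


# Placeholder values only; nothing is sent here.
review_email = build_review_email(
    "noreply@example.com",
    "client@example.com",
    "https://example.com/portal/reviews/123",
    "<h1>Data review</h1><p>...</p>",
    b"%PDF-1.4 placeholder bytes, not a real PDF",
)
attachments = list(review_email.iter_attachments())
print(review_email["Subject"], [part.get_filename() for part in attachments])
```

Sending is left out on purpose; with a reachable SMTP server, smtplib's send_message would take the returned object as-is.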
I hope this helps!

I would like to add another option to the list of possible solutions. Aspose.PDF Cloud API also offers features to convert HTML to PDF. It provides SDKs for all popular programming languages.
PHP sample code for HTML to PDF conversion:
// HTML file packaged together with its resource files
$name = "HtmlWithImage.zip";
$html_file_name = "HtmlWithImage.html";
$height = 650;
$width = 250;
$src_path = $name;
// $pdfApi is assumed to be an already-configured Aspose.PDF Cloud API client
$response = $pdfApi->getHtmlInStorageToPdf($src_path, $html_file_name, $height, $width);
print_r($response);
echo "Completed!!!!";
I work with Aspose as a developer evangelist.

How mature is HTML+CSS now in relation to generating reports for printing?

I'm considering generating all the reports for a series of desktop business apps directly as HTML. Most of the reports are tables (possibly compound reports) with headers, footers, etc. (no images, vector graphics, etc.).
After searching SO, I've read lots of posts about problems with page breaks and similar issues (I don't need pixel positioning at all, but I do need control over page breaks).
For example, let's say I have a big table of currency values and I need the last row on each page to display the running total at that point. Is that feasible to do easily, or will I run into lots of trouble?
What technologies can help me here?
HTML5
Javascript
CSS
PHP libraries
jQuery
Some notes:
The HTML will be displayed with an embedded Chrome or Firefox engine, so differences between browsers are not a problem for me.
I can have the PHP preprocessor embedded if that makes generating the reports easier. I'm just looking for the best technology at hand to do the job well.
I'm tired of report generators with "WYSIWYG" designers (Crystal Report, FastReport, ReportBuilder, etc.)
Thanks!

We made the exact move you're thinking about almost a year ago and haven't looked back. Most communication with our client is over the web, so it's been a perfect fit. They can view html outputs easily on our website, and can generate pdf's of the page (server side) whenever necessary. The program we use for pdf conversion is a free, easy-to-use, open-source project called wkhtmltopdf.
Where we are is great, but getting here was difficult.
Deciding which PDF engine to use was a long, painful process. The short of it is that HTML is for viewing pages on the internet, not for viewing pages on paper. Page breaks will be the bane of your existence in this game -- you literally have to measure each page and create your own clean-looking breaks for every single report (otherwise, html-to-pdf converters will just keep rendering the document onto the next page as if they had encountered no page break at all). Further complicating the matter, every html-to-pdf engine out there handles this differently, and you'll have to write a tailored test for each one to see whether it meets your individual needs.
Now, the good news:
You can save yourself a lot of trouble by heeding my advice and going with wkhtmltopdf for your finalized reporting outputs. This little program is simply amazing -- it uses a webkit engine, renders CSS/javascript accurately, has header/footer control, optionally creates a table-of-contents page, and (most importantly) consistently produces excellent looking pdf's without having to customize your code base. It also has a variety of great command line switches, and it is very, very fast. I say again: it is very, very fast.
Best of all, it's a command line tool that can be used in batch processing. And did I mention that it's really, really fast?
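For the running-totals case raised in the question, the "create your own breaks" approach can be sketched as follows: split the rows into pages yourself, append a running-total row to each page, and force a break between pages with the standard CSS page-break-after property. The rows-per-page value is an assumption that has to be measured for the actual layout, the sample data is made up, and Python is used only to illustrate the logic:

```python
from decimal import Decimal
from html import escape


def paginate_with_running_totals(rows, rows_per_page):
    """Split (label, amount) rows into pages, each carrying a running total.

    rows_per_page is the number of data rows that fit on one printed page.
    """
    if rows_per_page < 1:
        raise ValueError("rows_per_page must be at least 1")
    pages = []
    running_total = Decimal("0")
    for start in range(0, len(rows), rows_per_page):
        page_rows = rows[start:start + rows_per_page]
        running_total += sum((amount for _, amount in page_rows), Decimal("0"))
        pages.append({"rows": page_rows, "running_total": running_total})
    return pages


def render_html(pages):
    """Emit one <table> per page, with a forced break after all but the last."""
    parts = []
    for index, page in enumerate(pages):
        body = "".join(
            f"<tr><td>{escape(label)}</td><td>{amount}</td></tr>"
            for label, amount in page["rows"]
        )
        total = f"<tr><th>Running total</th><th>{page['running_total']}</th></tr>"
        is_last = index == len(pages) - 1
        style = "" if is_last else ' style="page-break-after: always"'
        parts.append(f"<table{style}>{body}{total}</table>")
    return "\n".join(parts)


# Made-up sample data.
rows = [("A", Decimal("10.50")), ("B", Decimal("4.25")), ("C", Decimal("100.00"))]
pages = paginate_with_running_totals(rows, rows_per_page=2)
print(render_html(pages))
```

Decimal is used instead of float so that currency sums do not pick up binary rounding noise. How faithfully a given converter honours the break still needs to be checked per engine, as the answer says.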

Browser support for printing is generally terrible. However, there are other tools, notably Prince (which is not free) and Flying Saucer (which is free) that can generate PDF output from XML/HTML plus CSS. Prince even supports JavaScript though I don't have any experience with it.
I've got a Java back end in my current application, so for me Flying Saucer works fine for simple reports. I pre-process an HTML template with FreeMarker and then run the result through Flying Saucer. It's got a surprisingly smart rendering engine.
The CSS3 Paged Media spec (well, proposed spec) has all sorts of cool stuff in it, but it's almost totally unimplemented in the browsers. Even the CSS2 paged media stuff is only supported half-heartedly.

Speaking of Prince, you might look into DocRaptor. DocRaptor is another HTML to PDF conversion application. It uses Prince XML, and handles CSS better than comparable programs.
It isn't free, but offers a free 30 day trial for all accounts, so there's no harm in trying it out, at least.

How do I extract the text layer and background layer from a PDF?

In my project I have to build a PDF viewer in HTML5/CSS3, and the application has to let users add comments and annotations. In fact, I have to build something very similar to crocodoc.com.
At first I was thinking of creating images from the PDF and letting users mark an area and post comments associated with that area. Unfortunately, the client also wants to navigate within the PDF and add comments only on allowed sections (for example, paragraphs or selected text).
So now I'm facing the problem of how to get the text, and the best way to do it. If anybody has some clues on how I can achieve this, I would appreciate it.
I tried pdftohtml, but the output doesn't look like the original document, which is really complex (example of document). Even this one doesn't really reflect the output, but it is much better than pdftohtml.
I'm open to any solutions, with preference for command line under linux.

I've been down the same road as you, with even much more complex tasks.
After trying out everything I ended up using C# under Mono (so it runs on linux) with iTextSharp.
Even with a very complete library such as iTextSharp, some tasks required a lot of trial and error :)
Extracting the text from a page is easy (check the snippet below); however, if you intend to keep the text coordinates, fonts and sizes, you will have more work to do.
int pdf_page = 5;
string page_text = "";
PdfReader reader = new PdfReader("path/to/pdf/file.pdf");
PRTokeniser token = new PRTokeniser(reader.GetPageContent(pdf_page));
while (token.NextToken())
{
    if (token.TokenType == PRTokeniser.TokType.STRING)
    {
        page_text += token.StringValue;
    }
    else if (token.StringValue == "Tj")
    {
        page_text += " ";
    }
}
Do a Console.WriteLine(token.StringValue) on all tokens to see how paragraphs of text are structured in PDFs. This way you can detect coordinates, font, font size, etc.
Addition:
Given the task you are required to do, I have a suggestion for you:
Extract the text with coordinates, font families and sizes: all the information about each paragraph. Then do a PDF-to-image conversion, and in your online viewer, place invisible selectable text over the paragraphs on the image where needed.
This way your users can select a part of the text where needed, without the need of reconstructing the whole PDF in html :)
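The overlay step mostly comes down to a coordinate conversion: PDF positions are in points with the origin at the bottom-left of the page, while the overlay is positioned in pixels from the top-left of the rendered image. Here is a rough sketch, assuming an unrotated page whose origin is at (0, 0) and text boxes that were already extracted by some other tool; the class and function names are made up for the example, and Python is used only for illustration:

```python
from dataclasses import dataclass
from html import escape

POINTS_PER_INCH = 72.0


@dataclass(frozen=True)
class PdfTextBox:
    """A text run in PDF space: points, origin at the bottom-left."""
    text: str
    x: float        # left edge
    y: float        # bottom edge
    width: float
    height: float


def to_css_box(box: PdfTextBox, page_height_pt: float, render_dpi: float) -> dict:
    """Map a PDF text box onto a page image rendered at render_dpi.

    The y axis is flipped because CSS positions are measured from the top.
    """
    scale = render_dpi / POINTS_PER_INCH
    return {
        "left": box.x * scale,
        "top": (page_height_pt - box.y - box.height) * scale,
        "width": box.width * scale,
        "height": box.height * scale,
    }


def overlay_span(box: PdfTextBox, page_height_pt: float, render_dpi: float) -> str:
    """Return an invisible but selectable <span> positioned over the image."""
    css = to_css_box(box, page_height_pt, render_dpi)
    style = (
        "position:absolute;color:transparent;"
        f"left:{css['left']:.1f}px;top:{css['top']:.1f}px;"
        f"width:{css['width']:.1f}px;height:{css['height']:.1f}px"
    )
    return f'<span style="{style}">{escape(box.text)}</span>'


# Made-up text box on a 792 pt tall page, rendered at 144 DPI.
box = PdfTextBox("Hello", x=72, y=700, width=144, height=12)
span = overlay_span(box, page_height_pt=792, render_dpi=144)
print(span)
```

The spans would sit inside a relatively positioned container that holds the page image, so that the absolute offsets line up with it.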

I recently researched and discovered a native PHP solution to achieve this using FOSS. The FPDI PHP class can be used to import a PDF document for use with either the TCPDF or FPDF PHP classes, both of which provide functionality for creating, reading, updating and writing PDF documents. Personally, I prefer TCPDF, as it provides a larger feature set, a richer API, more usage examples and a more active community forum than FPDF.
Choose one of the aforementioned classes, or another, to handle PDF documents programmatically. Focusing on both current and possible future deliverables, as well as the desired user experience, decide where (e.g. server - PHP, client - JavaScript, both) and to what extent (feature driven) your interactive logic should be implemented.
Personally, I would use a TCPDF instance obtained by importing a PDF document via FPDI to iteratively inspect, translate to a common format (XML, JSON, etc.) and store the resulting representation in relational tables designed to persist data pertinent to the desired level of document hierarchy and detail. The necessary level of detail is often dictated by a specifications document and its mention of both current and possible future deliverables.
Note: In this case, I strongly advise translating documents and storing them in a common format to create a layer of abstraction and transparency. For example, a possible and unforeseen future deliverable might be to provide the same application functionality for users uploading Microsoft Word documents. If the uploaded Microsoft Word document was not translated and stored in a common format then updates to the Web service API and dependent business logic would almost certainly be necessary. This ultimately results in storing bloated, sub-optimal data and inefficient use of development resources in designing, developing and supporting multiple translators. It would also be an inefficient use of server resources to translate outbound data for every request, as opposed to translating inbound data to an optimal format only once.
I would then extend the base document tables by designing and relating additional tables for persisting functionality specific document asset data such as:
Versioned Additions / Edits / Deletions
  What
    Header / Footer
    Text
      Original Value
      New Value
    Image
      Page(s) (one, many or all)
      Location (relative - textual anchor, absolute - x/y coordinates)
      File (relative or absolute directory or url)
    Brush (drawing)
      Page(s) (one, many or all)
      Location (relative - textual anchor, absolute - x/y coordinates)
      Shape (x/y coordinates to redraw line, square, circle, user defined, etc.)
      Type (pen, pencil, marker, etc.)
      Weight (1px, 3px, 5px, etc.)
      Color
    Annotation
      Page
      Location (relative - textual anchor, absolute - x/y coordinates)
      Shape (line, square, circle, user defined, etc.)
      Value (annotation text)
    Comment
      Target (page, another text/image/brush/annotation asset, parent comment - threading)
      Value (comment text)
  When
    Date
    Time
  Who
    User
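As a loose illustration of the asset records listed above, here is how two of them (an annotation and a threaded comment) might be modeled and serialized to a common format such as JSON. The field names and sample values are my own assumptions, not a fixed schema, and Python is used only to keep the sketch short:

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Location:
    page: int
    anchor_text: Optional[str] = None   # relative: textual anchor
    x: Optional[float] = None           # absolute: x/y coordinates
    y: Optional[float] = None


@dataclass
class Annotation:
    id: int
    user: str            # who
    created_at: str      # when, as an ISO 8601 string
    location: Location
    shape: str           # e.g. "line", "square", "circle"
    value: str           # annotation text


@dataclass
class Comment:
    id: int
    user: str
    created_at: str
    target_type: str     # "page", "annotation", "comment" (threading), ...
    target_id: int
    value: str           # comment text


def to_json(record) -> str:
    """Serialize any of the records above, including nested ones."""
    return json.dumps(asdict(record), sort_keys=True)


# Made-up sample records.
note = Annotation(
    1, "alice", "2017-01-01T09:30:00Z",
    Location(page=3, x=120.0, y=540.5), "square", "Check this figure",
)
reply = Comment(2, "bob", "2017-01-01T10:00:00Z", "annotation", note.id, "Agreed")
print(to_json(note))
print(to_json(reply))
```

Each class maps naturally onto one relational table, with target_type and target_id standing in for the comment's link to a page, another asset or a parent comment.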
Once some, all or more, of the document and its asset data has a place to persist I would design, document and develop a PHP Web service API to expose CRUD and PDF document upload functionality to the UI consumer, while enforcing core business rules. At this point, the remaining work now lies on the Client-side. Currently, I have relational tables persisting both a document and its asset data, as well as an API exposing sufficient functionality to the consumer, in this case the Client-side JavaScript.
I can now design and develop a Client-side application using the latest Web technologies such as HTML5, JavaScript and CSS3. I can upload and request PDF documents using the Web service API and easily render the returned common format out to the browser however I decide (probably HTML in this case). I can then use 100% native JavaScript and/or 3rd party libraries for DOM helper functionality, creating vector graphics to provide drawing and annotation features, as well as access and control functional and stylistic attributes of currently selected document text and/or images. I can provide a real-time collaborative experience by employing WebSockets (before mentioned WebService API does not apply), or a semi-delayed, but still fairly seamless experience using XMLHttpRequest.
From this point forward the sky is the limit and the ball is in your court!

It's a hard task you're trying to accomplish.
To read text from a PDF, have a look at PEAR's PDF_Reader proposal code.

There's also very extensive documentation for Zend_Pdf, which also allows loading and parsing a PDF document. The various elements of the PDF can be iterated over and thus transformed to HTML5 or whatever you like. You may even embed the annotations from your website into the PDFs and vice versa.
Still, you have been given no easy task. Good Luck.

pdftk is a very good tool for things like that (I don't know if it can do exactly this task).
http://www.pdflabs.com/docs/pdftk-cli-examples/

Problem in picture overlapping when I convert HTML page to PDF

I want to overlap pictures, but it is not working and I need some help.
Here's the link to the page I'd like to convert:
http://9m9.com/innovative/sample/two.html
I want to convert this page to a PDF. You can see the small image overlapping the bigger one.
This is the page where you can click on a link that will convert the page to PDF.
http://citysoftsolutions.com/eclients/virtualtour/view_property_images.php?pid=9&uid=67
As you can see the image is placed behind the big image.
I'm using this converter script: http://mpdf.bpm1.com/

When I printed it using PrimoPDF driver it came out just fine. Last image was easily laid over. So there must be a bug with the script you're using.
What do I suggest?
If you'd like to convert your pages to PDFs "on the fly" I suggest you either
contact the script's creator and inform them of the bug
use a different script (I'd check out this question that can help you)
If you'd like to just provide PDFs of your page I suggest you install a PDF printer driver (like PrimoPDF that I'm using) and print those pages yourself and use those.
I'm not working for Nitro PDF Software company nor am I related to them in any way. So this is not me advertising their products/services.
On a sidenote
Something's telling me that what you'd actually like to do is create a PDF flyer, promo material or something similar. If that's what you're after, I suggest you do it with software that's meant for the job. Microsoft Office Word will do, but you'll be better off using something else. If it's a one-page leaflet, you could use Adobe Illustrator or CorelDraw. But if it's going to be an actual multipage document, use something like Word or Adobe InDesign.
Word is probably something you can easily master. So go with that one.

Is php capable of doing what I want?

I'm working on a biology web based application and trying to figure out what language to use. The features I need to include are:
Image viewing frame - This area will display the current image that the biologists wish to see. The application needs to take in a number of coordinates from a file and draw those points on the image displayed here. When the biologist wishes to change images there needs to be no flickering from the refresh. Will do this using multiple image buffers probably. Content needs to be scrollable and able to be zoomed in.
There need to be labeled buttons that advance, step back, zoom, and play the images displaying in the image frame. There also needs to be some type of list view where images titles can be selected to be displayed.
There will be a bunch of folders of images on the server that can be selected from. The application must allow the user to select which folder of images to load. It also must be able to read from either a txt or xml file and visually display the information there by way of a line graph.
Would like to be able to run scripts on the server from the application.
I feel that all these things are doable in a web application, but I have no idea what language to use. Most people recommend PHP, but I don't want to delve deeper until I know what its limitations are. Any suggestions are welcome. Thanks in advance.
-Mike

PHP can do everything you need for the back end, but most of the stuff that you describe is UI based, and this is dependent on the client, which is, of course, the browser. For highly graphical projects, you can do a lot in JavaScript and some JavaScript libraries have a lot of these capabilities built in. You might also consider Flash or Flex.
You might even consider a desktop application that runs outside of the browser. You can use Java, which is easy to deploy, but still requires the user to have the Java Runtime Engine, or you could go with a language that you can compile down to a native application.
Regardless of the front end technology that you choose, you'll still need a back end, and PHP can handle this.

You will find that almost every server-side platform, such as PHP, ASP.NET, ASP, etc., will do all of the above.

PHP is a language that resides on the server and handles all requests. Javascript (and associated libraries) is a language which is executed by the client's browser and handles (almost) all interaction. PHP is definitely able to do what you want, but for the interaction stuff (particularly the zoom, scrolling, etc.), you'll also need to use Javascript.
So, short answer, PHP is good, but you're going to need to use client-side scripting as well.
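To make the server/client split concrete, here is a sketch of the server-side half only: read the coordinates from a file and hand them to the browser as JSON, where client-side script would draw them over the image. The XML layout is invented for the example (the question does not say what the files look like), and Python is used purely for illustration; the same steps carry over to PHP:

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical file layout; the real coordinate files may look different.
SAMPLE_XML = """
<points>
  <point x="12.5" y="40" />
  <point x="80" y="15.25" />
</points>
"""


def points_from_xml(xml_text: str) -> list:
    """Parse <point x=".." y=".."/> elements into a list of dicts."""
    root = ET.fromstring(xml_text)
    return [
        {"x": float(point.get("x")), "y": float(point.get("y"))}
        for point in root.iter("point")
    ]


def points_as_json(xml_text: str) -> str:
    """Build the JSON body a client-side script could fetch and draw."""
    return json.dumps({"points": points_from_xml(xml_text)})


print(points_as_json(SAMPLE_XML))
```

Drawing, zooming and flicker-free image swapping would all happen in the browser; the server only needs to answer requests like this one.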

PHP is more than capable of doing this. You are going to need to use it in combination with some JavaScript to handle the client-side effects you describe. I would look into modifying Galleriffic for your needs and then whip up some JavaScript to write points over the images.

From your concerns about image refresh/flicker, it really sounds like a desktop app is what you are looking for, for a rapid response on image changes. The requirements on this really seem to need to be defined better before you can choose a language... PHP can do all the server side stuff you mentioned, but you might have a harder time getting the image viewing "frame" to provide the functionality you want.

Due to the image manipulation requirements it might be easier to go with something like flash with a php backend or asp.net with silverlight. It might be difficult to prevent flicker and delays with using pure javascript as opposed to flash/silverlight.

Image viewing frame
This will most likely need to be done on the client side using tools/frameworks such as jQuery, the canvas element, silverlight, or any of the other 100's that are out there.
There need to be labeled buttons that advance, step back, zoom, and play the images displaying in the image frame. There also needs to be some type of list view where images titles can be selected to be displayed.
PHP or any other server-side scripting language could pull this off. If this is meant to be a quick project running on free/cheap hardware then PHP would be a good choice. If the plan is a large application that will have to be maintained over the course of many years and hosting/price is not an issue then I would suggest something like ASP.NET
There will be a bunch of folders of images on the server that can be selected from. The application must allow the user to select which folder of images to load. It also must be able to read from either a txt or xml file and visually display the information there by way of a line graph.
Again any server side language could do the folder listing portion. As for reading files and creating graphs, this would most likely be a combination of server side and client side programming. jQuery for example, has plugins that could quite easily take a xml file and create a line graph.
Would like to be able to run scripts on the server from the application.
PHP, ASP.NET - both could do this. I'm sure many others could, but these are the ones I use most often.
The issue with PHP is that quite often, the code turns into a mess over time. This is maybe not so much an issue with the language as the people using it and the purpose the app was built for (a quick, one time project). Classic ASP also has the same issues.
ASP.NET is a good combination of OOP programming that allows you to separate presentation from logic with minimal effort.
