I'm currently building an app for managing a small organization. One of the functions of the app is printing out a bunch of monthly letters to the members. The way it works is I pull data from the DB about the member (name, address, dates, etc...) and then populate a letter template that has placeholder variables for all the details.
After all the letters are populated I need to give the user the option to print the entire block of letters. This is where my problem comes in. I want each letter to print on a single sheet of paper, with the content centered horizontally and vertically on the page.
I've attempted to make a Print Media stylesheet and inject the content into a div which I then style to fill the page, but this solution doesn't seem to work properly mainly due to layout issues.
Is this something I should be doing with another format? Should I be sending this to Word or PDF for proper handling, or is this something that can be done with HTML and CSS?
Note: The stack I'm using is bog-standard Linux/PHP, and I can install pretty much any additional third-party library that I might need.
Any ideas?
HTML isn't very good for this job, especially if you have a single HTML page with many letters (i.e. you want each letter to start on a new page).
There are lots of PDF generation tools for PHP and PDF is actually rather good for print layouts.
PDF will let you control the fine-grained aspects of printing, such as keeping blocks together, when to start a new page and much, much more!
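If the letters do stay in HTML, whether for direct printing or as input to an HTML-to-PDF converter, the "each letter on its own sheet" requirement is usually expressed with CSS page-break rules. The following is only a rough sketch in Python (the same idea carries over to PHP); the template text, the placeholder names `name` and `renewal_date`, and the `letter` class are all made up for illustration.

```python
from html import escape
from string import Template

# Hypothetical letter template; the placeholder names are assumptions.
LETTER = Template(
    '<div class="letter"><p>Dear $name,</p>'
    "<p>Your membership renews on $renewal_date.</p></div>"
)

# Ask the print engine to start a new sheet after every letter except the last.
PRINT_CSS = """
@media print {
  .letter { page-break-after: always; }
  .letter:last-child { page-break-after: auto; }
}
"""

def build_letters_html(members):
    """Return one HTML document containing one .letter block per member."""
    letters = [
        LETTER.substitute({key: escape(str(value)) for key, value in member.items()})
        for member in members
    ]
    return (
        "<!DOCTYPE html><html><head><style>" + PRINT_CSS + "</style></head><body>"
        + "".join(letters)
        + "</body></html>"
    )

members = [
    {"name": "Ann & Bob", "renewal_date": "2024-01-01"},
    {"name": "Carol", "renewal_date": "2024-02-01"},
]
html_doc = build_letters_html(members)
```

This only handles the page breaks. Vertical centering and margins still depend on the browser or converter that renders the page, which is the weakness the answer above points out.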
Which one is the best PDF-API for PHP?
I want to extract table data from images or scanned documents and map the header fields to their values, mostly in insurance documents. I have tried extracting the text line by line and then mapping the fields using their position on the page. I defined the table boundary with a start pivot and an end pivot, but this doesn't give me proper results, because headers sometimes span multiple lines (I implemented this in PHP). I also want to know whether I can use machine learning to achieve the same.
For PDF documents I have used tabula-java, which worked pretty well for me. Is there a similar implementation for images as well?
Insurance_Image
The documents would be of similar type as in the link above but of different service providers so a generic method of extracting such data would be very useful.
In the image above I want to map values like Make = YAMAHA, Model = FZ-S, CC = 153, etc.
Thanks.
I would definitely give Tesseract a go; it is a very good OCR engine. I have been using it successfully to read all sorts of documents embedded in emails (PDFs, images), and a colleague of mine used it for something very similar to your use case: reading specific fields from invoices.
After you parse the document, simply use regex to pick the fields of interest.
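As a rough illustration of the regex step, here is a small Python sketch. The sample text stands in for OCR output and is invented; real Tesseract output will be noisier, and the label spellings and separators in the patterns are assumptions about the document layout.

```python
import re

# Hypothetical OCR output; real text will vary from document to document.
ocr_text = """
Make: YAMAHA    Model: FZ-S
CC: 153
"""

# One pattern per field of interest; the labels are assumptions about the layout.
FIELD_PATTERNS = {
    "make": re.compile(r"Make\s*[:=]\s*([A-Z0-9-]+)", re.IGNORECASE),
    "model": re.compile(r"Model\s*[:=]\s*([A-Z0-9-]+)", re.IGNORECASE),
    "cc": re.compile(r"\bCC\s*[:=]\s*(\d+)", re.IGNORECASE),
}

def extract_fields(text):
    """Return the first match for each field, or None when it is missing."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None
    return fields

fields = extract_fields(ocr_text)
```

Returning `None` for a missing field makes it easy to flag documents that need manual review instead of silently storing a wrong value.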
I don't think machine learning would be particularly useful for you, unless you plan to build your own OCR engine. I'd start with existing libraries, since they offer very good performance.
The easiest and most reliable way to do it without much knowledge in OCR would be this:
- Take an empty template for reference and mark the coordinates of the boxes that you need to extract data from. Label them and save them for future use. This is done only once for each template.
- Now, when reading a document based on the same template, resize it to match the reference template's dimensions (if it doesn't already match).
- You already have every box's coordinates and know what data it should contain (because you labeled and saved them in the first step).
Which means that now you can just analyze the pixels contained in each box to know what is written there.
This means that, given the list of labeled boxes (which you extracted in the first step), you should be able to get the data in each of these boxes. If the data is typed rather than handwritten, it will be easier to analyze or process with simple OCR libraries.
Or, if the data is always in the same size and font, as in your example template above, you could build your own small database of letters in that font and size, or maybe of full words. It depends on the possible answers for each box.
Anyway, this is by no means the best approach, but it would definitely get the work done with minimal effort and knowledge of OCR.
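The labeled-box idea above can be sketched as follows. Note one swap: instead of resizing the scanned image to the reference size, this version scales the saved box coordinates to the scan's size, which gives the same crop regions. The reference size, the box coordinates, and the field labels are invented for illustration.

```python
from typing import Dict, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels

# Hypothetical reference template size and labeled boxes (marked once by hand).
REFERENCE_SIZE = (1000, 1400)
REFERENCE_BOXES: Dict[str, Box] = {
    "make": (100, 200, 400, 240),
    "model": (450, 200, 750, 240),
}

def scale_boxes(boxes, reference_size, scan_size):
    """Map reference-template boxes onto a scan of a different size."""
    sx = scan_size[0] / reference_size[0]
    sy = scan_size[1] / reference_size[1]
    return {
        label: (round(l * sx), round(t * sy), round(r * sx), round(b * sy))
        for label, (l, t, r, b) in boxes.items()
    }

# A scan at twice the reference resolution.
scan_boxes = scale_boxes(REFERENCE_BOXES, REFERENCE_SIZE, (2000, 2800))
```

Each resulting box could then be cropped from the scan and passed to whatever OCR library is in use. This assumes the scan is not rotated or skewed relative to the template.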
I have a web app that lets users design their own invitation cards, which are then ordered, printed by us, and sent to the customer.
The problem we have is that it's difficult to print the cards exactly the way the user designed them. We are currently using wkhtmltopdf to export a PDF file from the user's design, which is then sent to print. This has caused us months of headaches. See this example:
As you can see, there are some important differences between the HTML result and the PDF. The most noticeable is the line break in "Välkommen på". Another common difference is that the line height changes, so that lines of text overlap each other in the PDF file or are spaced further apart or closer together than in the HTML.
My questions to you are:
Would you use this method, or is there another, simpler method we could use to print the cards? For example, is there an easy way to just print the HTML itself from the browser (auto-fitting to the correct size of the content and so on), or do you have any other ideas?
If you are a wkhtmltopdf wizard, do you know how we can solve issues like in the picture with the fonts?
I was able to solve the problem with the line breaks by setting the CSS property white-space: nowrap on all divs in my HTML content.
Our company allows its clients to view reports via our website. The pages are PHP-based and the data is pulled from MySQL. These reports were written a long time ago and include inline CSS. The pages themselves look fine, but the print version is lacking. I want to take the reports and create visually appealing "printable" pages that contain our branding.
I have found three solutions so far.
Media Print Stylesheets
This is the easiest method, but it does not give me complete layout control. I want landscape mode and need to control where the page breaks occur, so this method has been eliminated from my list of possible solutions. The reports are built by looping through PHP data, so while I can always put a page break after a given HTML element, I can't stop the page from breaking before it gets to the next set of data.
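One way to think about the page-break problem, whichever output format is chosen, is to decide the breaks on the server while looping through the data: before emitting a set of rows, check whether it still fits on the current page. The sketch below is in Python for brevity and assumes every row has the same height, so that a page holds a fixed `rows_per_page`; both the function name and that assumption are illustrative only.

```python
def paginate(groups, rows_per_page):
    """Pack groups of rows into pages without splitting a group.

    A group larger than a whole page gets a page to itself
    (it will still overflow; this sketch does not split it).
    """
    pages, current, used = [], [], 0
    for group in groups:
        # Start a new page if this group would not fit on the current one.
        if current and used + len(group) > rows_per_page:
            pages.append(current)
            current, used = [], 0
        current.append(group)
        used += len(group)
    if current:
        pages.append(current)
    return pages

groups = [["a1", "a2", "a3"], ["b1", "b2"], ["c1", "c2", "c3", "c4"]]
pages = paginate(groups, rows_per_page=5)
```

Here the first two groups share a page (5 rows) and the third group is pushed to a second page rather than being cut in half. A forced page break would then be emitted between the pages.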
TCPDF/FPDF
From what I have seen, these classes will give me all of the control I need to customize a PDF. The challenge is that this appears to be a little more advanced than my programming skills allow, and all of the inline CSS contained within the HTML tables may throw off the formatting.
FDF
I am leaning towards this method, if I understand it correctly. First I would create a PDF form and define all of the fields to be populated by the MySQL data. Then I would create an FDF file that would populate the form template with the data from the database. It seems easier to me to create a visually pleasing form as a PDF and then populate that form using this method, rather than create the entire PDF from scratch using method 2.
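For reference, an FDF file is a small text file that lists field names and values. The sketch below builds a minimal one in Python; the field names `client_name` and `total` are hypothetical and would have to match the field names defined in the PDF form. It assumes plain ASCII values; other characters need additional encoding that is not shown here.

```python
def fdf_escape(value):
    """Escape characters that are special inside a PDF literal string."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace("(", "\\(")
        .replace(")", "\\)")
    )

def build_fdf(fields):
    """Return a minimal FDF document that sets each form field to a value."""
    entries = "\n".join(
        f"<< /T ({fdf_escape(name)}) /V ({fdf_escape(value)}) >>"
        for name, value in fields.items()
    )
    return (
        "%FDF-1.2\n"
        "1 0 obj\n"
        "<< /FDF << /Fields [\n" + entries + "\n] >> >>\n"
        "endobj\n"
        "trailer\n"
        "<< /Root 1 0 R >>\n"
        "%%EOF\n"
    )

# Hypothetical field names; they must match the names in the PDF form.
fdf = build_fdf({"client_name": "Acme (EU)", "total": "1200.50"})
```

The FDF file only carries the data. Merging it into the form still needs a separate tool; pdftk's `fill_form` operation is one commonly used option.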
Does it sound like I am on the right track? Are any of these methods "easier" than the other?
Any help is greatly appreciated.
TCPDF gives the most control over each page, which is what I am looking for. It is extremely sensitive to how the HTML is written, but that is the only downside I have found so far.
There's this excellent answer on SO already.
If you're looking for easy, my money is on mPDF. I found it to be the easiest, and essentially an out-of-the-box solution (often zero server configuration to do).
I think you should try out wkhtmltopdf.
https://code.google.com/p/wkhtmltopdf/
As for the TCPDF/FPDF pagination issue, you can see this other question for the solution provided and use the flow in it to sort yours out.
TCPDF / FPDF - Page break issue
Just found this other solution as well and think you'll need it
Convert HTML + CSS to PDF with PHP?
For me personally, FPDF works great for fetching data from my database, passing it to the FPDF class, and dynamically creating PDFs for customers.
I see some people want to write HTML/CSS to create PDFs, but you will always have differences, because the browser parses the HTML/CSS differently from a PDF converter.
Using FPDF's built-in methods, I have been able to get exactly what I wanted and haven't seen any issues (yet).
I have been looking extensively for a simple solution to a not-very-complicated problem.
I have a great deal of data in a SQL database which needs to be printed (for example, each entry would have a name, address, phone number, etc.).
The vast majority of the data on the eventual printed page is static; only a small handful of fields need to be 'variables' in the 'template'. Helpfully, the areas that the variable data would be dropped into are fixed in location, orientation, and dimensions, so no adjustments to spacing are needed for the other static/redundant data on the page.
I would like to have some form of 'accounting': since the number of pages printed is going to be on the order of tens of thousands, I would like to know which SQL entries have been printed so far.
I would rather not 'reinvent the wheel' and write a PHP front end which loops through arrays and deposits the SQL data in the right place on the page before or after it is rendered as a PDF...
I would prefer to print directly from the server (*nix), and would be very enthusiastic if there is a way to do this without actually having to render tens of thousands of individual PDFs. With today's open source software packages, which route is the best to take?
(So far, it looks like if there isn't a simple way, I am going to need to learn LaTeX, Cheetah, and some Python.)
Dabo's report writer is a banded reporting engine like Crystal. It takes as input a set of data (the output of cur.fetchall(), for example) and a report template (an XML string or file), and outputs a PDF or a set of PDFs (it can output a stream of bytes instead of writing to a file directly, if desired).
Dabo's main purpose is to be a desktop-application framework on top of wxPython, but the reporting can be done on the web with no desktop interaction. It does help, though, to design the reports on the desktop using the included report designer.
http://dabodev.com
There will be some installation hurdles and a learning curve, but you'll find this to be an easy task once you are ramped up.
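Independently of the reporting tool, the 'accounting' requirement from the question can be handled in the database itself by recording when each entry was printed. A minimal sketch with Python's built-in `sqlite3` follows; the `entries` table, the `printed_at` column, and the batch size are assumptions, and the in-memory database only stands in for the real one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the real database
conn.execute(
    "CREATE TABLE entries (id INTEGER PRIMARY KEY, name TEXT, printed_at TEXT)"
)
conn.executemany(
    "INSERT INTO entries (name) VALUES (?)", [("Ann",), ("Bob",), ("Carol",)]
)

def next_batch(conn, size):
    """Fetch up to `size` entries that have not been printed yet."""
    return conn.execute(
        "SELECT id, name FROM entries WHERE printed_at IS NULL ORDER BY id LIMIT ?",
        (size,),
    ).fetchall()

def mark_printed(conn, ids):
    """Record the print time for entries whose pages were sent to the printer."""
    conn.executemany(
        "UPDATE entries SET printed_at = datetime('now') WHERE id = ?",
        [(i,) for i in ids],
    )
    conn.commit()

batch = next_batch(conn, 2)
# ... render and print the batch here ...
mark_printed(conn, [row[0] for row in batch])
remaining = next_batch(conn, 10)
```

Marking entries only after a batch has actually been sent to the printer means an interrupted run can be resumed without reprinting or skipping entries.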
I work for a care centre that would like a feature on their website where friends and family can choose from a selection of care cards to deliver to someone they know. They will be able to choose a title, an image and type in some text on the card that we assemble and deliver. They need me to make an application for them that assembles the cards in a printer-friendly fashion (placing text and images in the right areas) that they will print and fold before delivery.
Image of what I am trying to create: http://i.imgur.com/f8GnD.png
Reading about how to do this I realize that I have two issues:
Size of card on-screen can't be fixed due to printer DPI
Should I use html/CSS to make a table with 4 cells to create this card? Php image library? JavaScript?
Any help would be great.
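On the DPI point: if the card is assembled as an image, its pixel size follows from the physical paper size and the chosen print resolution, not from the screen. A small Python sketch of that arithmetic is below; the A4 sheet, the 300 DPI target, and the four equal panels are assumptions made for illustration.

```python
MM_PER_INCH = 25.4

def mm_to_pixels(mm, dpi):
    """Convert a physical length to pixels at a given print resolution."""
    return round(mm / MM_PER_INCH * dpi)

# Assumed values: an A4 sheet folded into four panels, printed at 300 DPI.
SHEET_MM = (210, 297)
DPI = 300

sheet_px = (mm_to_pixels(SHEET_MM[0], DPI), mm_to_pixels(SHEET_MM[1], DPI))
panel_px = (sheet_px[0] // 2, sheet_px[1] // 2)
```

With these assumptions the full sheet is 2480 x 3508 pixels and each of the four panels is 1240 x 1754 pixels. The on-screen preview can be a scaled-down version of the same layout.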
I have the best luck, in terms of printing, with PDFs. The document format is nice, too, because it is portable and the user may choose to print somewhere other than where they accessed your site.
The best PDF-generating library I've used for PHP is FPDF: http://www.fpdf.org/
PDFs are great for printing full-page documents. All but the most ancient operating systems provide users the ability to open and print PDFs, and because PDF is a document format the printed output is fairly consistent between systems and printers.
The other route you suggest is certainly possible: you can build the card using HTML and CSS. There are serious drawbacks to this, however. Foremost, each user is going to have different printer settings in their browser, and browsers are not configured by default for full-page printing. Most user agents add page numbers, margins, the date and time, and the URL. In short, printing from the browser relies on the user tinkering with their browser print settings, and there is nothing you can do to influence these settings from your end.
There are third-party utilities that generate PDFs on the server, based on your HTML. PDFs have solved many print-related issues internally so you don't have to worry about them yourself.