Easiest and fastest way to template, possibly in a PDF - php

I have been looking extensively for a simple solution to a not-very-complicated problem.
I have a great deal of data in a sql database which needs to be printed (for example, each entry would have name, address, phone number, etc).
The vast majority of the data on the eventual printed page is static- there would only need to be a small handful of fields that need to be 'variables' in the 'template'. Quite beneficially the areas that the variable data would be dropped into are themselves in both location orientation and dimensions fixed-- so there need be no adjustments to spacing for the other static/redundant data on the page.
I would like to have some form of 'accounting' in the sense that, since the amount of pages printed are going to be on the order of the tens of thousands, I would like to know which sql entries have been printed thus far.
I would not like to 'reinvent the wheel' and write a php front end which loops through arrays and deposits the sql data onto the right place on the page before or after it is rendered as pdf...
I would prefer to print directly from the server (*nix), and would be very enthusiastic if there is a way to do this without actually having to render tens of thousands of individual pdfs. With todays open source software packages, which route is the best to take?
(so far, it is looking like if there isn't a simple way, I am going to need to learn LaTeX, Cheetah, and some python)

Dabo's report writer is a banded reporting engine like Crystal, which takes as input a set of data (output of cur.fetchall(), for example) and a report template (xml string or file), and outputs a PDF or set of PDF's (it can output a stream of bytes instead of writing to a file directly, if desired).
Dabo's main purpose is a desktop-application framework on top of wxPython, but the reporting can be done on the web with no desktop interaction. Though it does help to design the reports using the desktop though using the included report designer.
http://dabodev.com
There will be some installation hurdles and a learning curve, but you'll find this to be an easy task once you are ramped up.

Related

Is it bad to use JSON files instead of real databases?

I have no access to create/edit any databases but due to the massive amount of content I need some sort of management system which is why I created my own. Here's how it works:
Each blog post has got its own .php file which loads static parts of the website like the header or the menu bar. But there are many category sites that display previews of the respective posts. It would be sooo annoying having to edit the same preview on 10 sites due to a misspelled word or so. That's why I store those previews (not the full content since there's no need to) in a JSON file.
Is that bad practice? Could this lead to long loading times if the number of previews rose? And could I prevent that by creating multiple JSON files?
Thanks for your advice!
If you are working on a small scale, then using JSON files is fine however it would definitely be beneficial to switch to a database management system whenever possible for storage when it comes to PHP (Or the majority of languages for that matter).
It can be considered bad practice if JSON is used to store large amounts of data or if a lot of data is stored in the same file in which case yes, using multiple JSON files rather than one large one is indeed more viable since the input stream when reading the file does not have to go over as much data.

Database for Content - OK to store HTML?

Basic question is - is it safe to store HTML in a database if I restrict who can submit to it?
I have a pretty simple question. I provide video tutorials and other content. Without spending months writing a proper BBCode parser, I would need to store the HTML so I can have it look exactly the way I want when I grab it from the database.
Basically I plan to store all information in the database about a tutorial series and each episode. I would like to have some formatting for the descriptions for both so I can add multiple paragraphs, ordered and unordered lists, links to required resources, and so on.
I am using PHP and creating my own database. I am using phpMyAdmin to store the information in the table right now. I will use a user with read only rights when I pull the information in the PHP code.
What is the best way to do this? Thank you!
Like others have pointed out there's nothing dangerous about storing HTML in the DB. But when you display it you need to know the HTML is safe. Seeing as you're the only one editing the HTML I see no problem.
However, I wouldn't store HTML at all. If all you need are headings, paragraphs, lists, links, images etc I'd say Markdown is a perfect fit. The benefit with Markdown is that it looks just like normal text (ie you could send your articles as e-mails or save them as txt-documents), it takes up a lot less space than HTML and you don't have to change it once HTML gets updated.
http://michelf.ca/projects/php-markdown/
From the security point of view it is not less secure to store your HTML in a database than storing it anywhere else - if you are the only author of that HTML. But then again if other people can author HTML in your website then it doesn't matter where you store it - only how you sanitize it and how and where you display it.
Now whether or not it is an efficient way to store HTML is a completely different matter. If I were you I would use some decent templating system and store HTML in files.
Storing HTML code is fine. But if it is not from trusted source, you need to check it and allow a secure subset of markup only. HTML Tidy library will help you with that.
Also, you need to count with a future change in website design, so do not use too much markup, only basic tags. To make it look like you want, use global CSS rules and semantically named classes in the markup.
But even better is to use Markdown or another wiki-like syntax. There are nice JS editors for Markdown with real-time preview (like the one here at Stackowerflow), and you can avoid HTML altogether.
My initial answer to "should I store html in a db" is generally no. Sure it's safe if you know what you're storing, but are you really considering best practices when you ask only that question? The true answer is "It depends".
I'm sure there are things like Wordpress that store html in a database, however, as a professional website designer, I like to remember the Separation of Concerns principle. How reusable is storing html in your database for a mobile app? Is your back end now in charge of display as well as data? Do you have many implementation possibilities for a front end or are you now stuck with whatever the back end portrays, what if you want it a different color and you've stacked ul within ul within ul? How easy is the css styling now? How easy is it to change or update that html?
I could be wrong, but even Sitecore and Kentico may store an html template in a database somewhere, but the data associated with that html template is a model, not directly on the html template.
So, when you are considering this question, you may want to store your models one place and your templates another, that way when you say "hey, lets build a mobile app" you can grab your data and go, rather than creating yet another table to store the same data.
I made a really big mistake by storing text data in Mongodb gridFS + compression and using mongodump for daily backup. GridFS is 1GB of textfiles but after backup memory usage rises sometimes 1GB daily after one month 20GB in memory due to how this backup is made.
In mongodb you should do a snapshot of the data folder - rather than do mongodump. The possible reason is that it copies unused data from disk into memory then makes bson dump. So in my case text that was never used for a long time should never be loaded into memory. I think this is how backup works as even right now my Mongodb is using 200MB of ram after run mongodump its can rise to 3GB
So i think the best solution is to use a filesystem for storing HTML files as your even RAID like PERC H700 has many amazing caching features including read ahead. But it has some limitations like network access and with my experiences some data was corrupted in time and needed to run chkdsk for repair as many GB of data was add or removed daily. Also you should consider to use proper raid features like Write trough that prevent data loss when power failure.
Sqlite is not designed to be used with extremely big data so you shouldn't not use it and has missing many caching features.
Not perfect solution is to use MariaDB or its own caching script in nodejs that can use memcached/Linux ramdisk with maybe 1GB of hot cache. Using an internal nodejs caching mechanism after some time can produce many memory leak. So i can use it for network connection and I/O are using filesystem lock and many "HOT" most used files can be programmed to cached in RAM or just leave as is

Bulk template based pdf generation in PHP using pdftk

I am doing a bulk generation of pdf files based on templates and I ran into big performance issues pretty fast.
My current scenario is as follows:
get data to be filled from db
create fdf based on single data row and pdf form
write .fdf file to disk
merge the pdf with fdf using pdftk (fill_form with flatten command)
continue iterating over rows until all .pdf's are generated
all the generated files are merged together in the end and the single pdf is given to the client
I use passthru to give the raw output to the client (saves time writing file), but this is just a little performance improvements. The total operation time is about 50 seconds for 200 records and I would like to get down to at least 10 seconds in some way.
The ideal scenario would be operating all these pdfs in memory and not writing every single one of them to separate file but then the output would be impossible to do as I can't pass that kind of data to external tool like pdftk.
One other idea was to generate one big .fdf file with all those rows, but it looks like that is not allowed.
Am I missing something very trivial here?
I'm thanksfull for any advice.
PS. I know I could use some good library like pdflib but I am considering only open licensed libraries now.
EDIT:
I am up to figuring out the syntax to build an .fdf file with multiple pages using the same pdf as a template, spent few hours and couldn't find any good documentation.
After beeing faced with the same problem for a long time (wanted to generate my pdfs based on LaTeX) i finally decided to switch to another crude but effective technique:
i generate my pdfs in two steps: first i generate html with a template engine like twig or smarty. second i use mpdf to generate pdfs out of it. I tryed many other html2pdf frameworks and ended up using mpdf, it's very mature and is developed since a long time (frequent updates, rich functionality). the benefit using this technique: you can use css to design your documents (mpdf completely features css) - which comes along with the css benefit (http://www.csszengarden.com) and generate dynamic tables very easy.
Mpdf parses the html tables and looks for the theader, tfooter element and puts it on each page if your tables are bigger than one page size. Also you have the possibility to define page header and page footer elements with dynamic entities like page nr and so on.
i know, using this detour seems to be a workaround, but to be honest, no latex, pdf whatever engine is as strong and simple as html!
Try a different less complex library like fpdf (http://www.fpdf.org/)
I find it quite good and lite.
Always find libraries that are small and only do what you need them to do.
The bigger the library the more resources it consumes.
This won't help your multiple-page problem, but I notice that pdftk accepts the - character to mean 'read from standard input'.
You may be able to send the .fdf to the pdftk process via it's stdin, in order to avoid having to write them to disk.

how mature is HTML+CSS now in relation to generating reports for printing?

I'm considering creating all the reports of a series of desktop business apps directly to html. Most of the reports are tables (maybe compound reports), headers, footers, etc. (no images, vector graphics, etc.).
After a search in SO, I've read lots of post regarding problems with page breaks and things like that (I don't need pixel positioning at all, but yes control at page breaks).
For example, let's say I have a big table with currency values and I need the last row of the table per page to display the running totals at that point.. it is something feasible to do easily or I will run in lots of trouble?
What technologies can help me here?
HTML5
Javascript
CSS
PHP Librarys
JQuery
Some notes:
The html will be displayed with the chrome or firefox engine embeded, so the diferences between browsers it's not a problem for me.
I can have the php preprocessor embedded if that helps to generate more easily the reports, I'm just looking fot the best technology at hand to make the work well..
I'm tired of report generators with "WYSIWYG" designers (Crystal Report, FastReport, ReportBuilder, etc.)
Thanks!
We made the exact move you're thinking about almost a year ago and haven't looked back. Most communication with our client is over the web, so it's been a perfect fit. They can view html outputs easily on our website, and can generate pdf's of the page (server side) whenever necessary. The program we use for pdf conversion is a free, easy-to-use, open-source project called wkhtmltopdf.
Where we are is great, but getting here was difficult.
Deciding which pdf engine to use was a long, painful process. The short of it is that HTML is for viewing pages on the internet, not for viewing pages on paper. Page-breaks will be the bane of your existence in this game -- you literally have to measure each page and create your own clean-looking breaks for every single report (otherwise, all html-to-pdf converters out there will just keep rendering the document onto the next page as it if encountered no page-break at all). Further complicating the matter is that every html-to-pdf engine out there handles this sh*t differently and you'll have to write a tailored solution to test each one to see if it meets your individual needs.
Now, the good news:
You can save yourself a lot of trouble by heeding my advice and going with wkhtmltopdf for your finalized reporting outputs. This little program is simply amazing -- it uses a webkit engine, renders CSS/javascript accurately, has header/footer control, optionally creates a table-of-contents page, and (most importantly) consistently produces excellent looking pdf's without having to customize your code base. It also has a variety of great command line switches, and it is very, very fast. I say again: it is very, very fast.
Best of all, it's a command line tool that can be used in batch processing. And did I mention that it's really, really fast?
Browser support for printing is generally terrible. However, there are other tools, notably Prince (which is not free) and Flying Saucer (which is free) that can generate PDF output from XML/HTML plus CSS. Prince even supports JavaScript though I don't have any experience with it.
I've got a Java back end in my current application, so for me Flying Saucer works fine for simple reports. I pre-process an HTML template with FreeMarker and then run the result through Flying Saucer. It's got a surprisingly smart rendering engine.
The CSS3 Paged Media spec (well, proposed spec) has all sorts of cool stuff in it but they're almost totally unimplemented in the browsers. Even the CSS2 paged media stuff is only supported half-heartedly.
Speaking of Prince, you might look into DocRaptor. DocRaptor is another HTML to PDF conversion application. It uses Prince XML, and handles CSS better than comparable programs.
It isn't free, but offers a free 30 day trial for all accounts, so there's no harm in trying it out, at least.
DocRaptor

Is php capable of doing what I want?

I'm working on a biology web based application and trying to figure out what language to use. The features I need to include are:
Image viewing frame - This area will display the current image that the biologists wish to see. The application needs to take in a number of coordinates from a file and draw those points on the image displayed here. When the biologist wishes to change images there needs to be no flickering from the refresh. Will do this using multiple image buffers probably. Content needs to be scrollable and able to be zoomed in.
There need to be labeled buttons that advance, step back, zoom, and play the images displaying in the image frame. There also needs to be some type of list view where images titles can be selected to be displayed.
There will be a bunch of folders of images on the server that can be selected from. The application must allow the user to select which folder of images to be loaded. It also must be able to read from either an txt or xml file and visually display the information there by way of line graph.
Would like to be able to run scripts on the server from the application.
I feel that all these things are doable by a web application but I have no idea what language to use. Most people recommend php, but i don't want to delve deeper until I know what its limitations are. Any suggestions are welcome. Thanks in advance.
-Mike
PHP can do everything you need for the back end, but most of the stuff that you describe is UI based, and this is dependent on the client, which is, of course, the browser. For highly graphical projects, you can do a lot in JavaScript and some JavaScript libraries have a lot of these capabilities built in. You might also consider Flash or Flex.
You might even consider a desktop application that runs outside of the browser. You can use Java, which is easy to deploy, but still requires the user to have the Java Runtime Engine, or you could go with a language that you can compile down to a native application.
Regardless of the front end technology that you choose, you'll still need a back end, and PHP can handle this.
You will find almost every server side platforms such as php , asp.net, asp, etc will do all of the above.
PHP is a language that resides on the server and handles all requests. Javascript (and associated libraries) is a language which is executed by the client's browser and handles (almost) all interaction. PHP is definitely able to do what you want, but for the interaction stuff (particularly the zoom, scrolling, etc.), you'll also need to use Javascript.
So, short answer, PHP is good, but you're going to need to use client-side scripting as well.
PHP is more than capable of doing this. You are going to need to use it in combination with some Javascript to handle the client side effects you describe. I would look into modifying galerific for your needs and then whip up some javascript to write points over the images.
From your concerns about image refresh/flicker, it really sounds like a desktop app is what you are looking for, for a rapid response on image changes. The requirements on this really seem to need to be defined better before you can choose a language... PHP can do all the server side stuff you mentioned, but you might have a harder time getting the image viewing "frame" to provide the functionality you want.
Due to the image manipulation requirements it might be easier to go with something like flash with a php backend or asp.net with silverlight. It might be difficult to prevent flicker and delays with using pure javascript as opposed to flash/silverlight.
Image viewing frame
This will most likely need to be done on the client side using tools/frameworks such as jQuery, the canvas element, silverlight, or any of the other 100's that are out there.
There need to be labeled buttons that advance, step back, zoom, and play the images displaying in the image frame. There also needs to be some type of list view where images titles can be selected to be displayed.
PHP or any other server-side scripting language could pull this off. If this is meant to be a quick project running on free/cheap hardware then PHP would be a good choice. If the plan is a large application that will have to be maintained over the course of many years and hosting/price is not an issue then I would suggest something like ASP.NET
There will be a bunch of folders of images on the server that can be selected from. The application must allow the user to select which folder of images to be loaded. It also must be able to read from either an txt or xml file and visually display the information there by way of line graph.
Again any server side language could do the folder listing portion. As for reading files and creating graphs, this would most likely be a combination of server side and client side programming. jQuery for example, has plugins that could quite easily take a xml file and create a line graph.
Would like to be able to run scripts on the server from the application.
PHP, ASP.NET - both could do this. I'm sure many others could, but these are the ones i use most often
The issue with PHP is that quite often, the code turns into a mess over time. This is maybe not so much an issue with the language as the people using it and the purpose the app was built for (a quick, one time project). Classic ASP also has the same issues.
ASP.NET is a good combination of OOP programming that allows you to separate presentation from logic with minimal effort.

Categories