Creating Tables With PHP Using PDFLib - php

I was wondering if there was a good source on how to build tables using the PDFLib for PHP. I am planning to populate a PDF Document with a Database Table (well a few that I join together to create a new view) and I wanted to make it a PDF document for the web. I've been searching all over and I find plenty of information on PDFLib except how to create a table with it.
I've checked out the PDFLib commands on PHP.net and also can't quite get a clear grasp of what's needed.

I've worked with PDF generation a few times in the past, and generally find it to be a huge pain in the neck.
PDFLib's documentation http://www.pdflib.com/fileadmin/pdflib/pdf/manuals/PDFlib-8.0.2-tutorial.pdf starts explaining what you're looking for in section 8.2, page 193. You'll be creating multi-line flows. The code there looks intimidating, but take some time to work through it; it's pretty close to what you'll end up using.
I may be able to find some code later, but I forget what library I was using. For now a few tips:
Work it out on paper, just like their marked up examples. Where you want things to start, end, and such.
Use clear variable names to store those offsets. Not constants!
Find good extreme examples to test with while developing. Developing with text like "test" to find out later you need to support "I am the very model of the modern major general" may throw off your entire flow, and require you to start from scratch.
Some libraries "support" HTML embeds, including HTML tables. This siren song is sweet, but will lead you into the razor sharp rocks. Every library I've used supports them a little bit, but then you run into a wall where you can't get the next little tweak without dropping tables and reverting to native functions. They've been a huge waste of time to play with, one and all.
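The tips above about named offsets and extreme test strings can be tried out before touching any PDF library. Here is a minimal sketch in plain PHP. It assumes a crude "characters per line" estimate instead of real font metrics, and the function names and numbers are made up for illustration; none of them come from PDFlib or TCPDF.

```php
<?php
// Hypothetical layout helpers; none of these names come from PDFlib or TCPDF.

// Named offsets instead of magic numbers scattered through the drawing code.
$tableLeft    = 50; // x position of the table's left edge, in points
$columnWidths = ['name' => 120, 'description' => 300, 'price' => 60];

/** Return the x position where each column starts. */
function columnOffsets(int $tableLeft, array $columnWidths): array
{
    $offsets = [];
    $x = $tableLeft;
    foreach ($columnWidths as $column => $width) {
        $offsets[$column] = $x;
        $x += $width;
    }
    return $offsets;
}

/**
 * Estimate how many lines a cell needs, assuming roughly $charsPerLine
 * characters fit in the column (a stand-in for real font metrics).
 */
function estimateLineCount(string $text, int $charsPerLine): int
{
    $wrapped = wordwrap($text, $charsPerLine, "\n", true);
    return substr_count($wrapped, "\n") + 1;
}

$offsets = columnOffsets($tableLeft, $columnWidths);

// Test with an extreme value, not just "test".
$short = estimateLineCount('test', 20);
$long  = estimateLineCount('I am the very model of the modern major general', 20);
```

With the row height derived from the line count, a long cell pushes the next row down instead of overflowing it, which is exactly the case that short sample data hides.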
update
I've found my most recent code iteration, we used the library from http://www.tcpdf.org. It worked, mostly. I dealt with a lot of inconsistencies in where the cursor was left after writing multiple lines of text to a page. I ended up ripping out anything that used their multi-line code and writing my own. That done it got pretty easy to work with.

Table handling in PDFlib is needlessly difficult. Tables work, but you are in trouble once you stack several tables on top of each other and want each lower table to sit at a fixed distance below the bottom edge of the one above it, or once you want nested tables. That kind of behavior can be achieved, but the code gets complicated. Why didn't the PDFlib team adopt the behavior of HTML tables, which have worked well for about two decades?
Because HTML tables are easy to use, one good method is to use PhantomJS to generate the PDF from HTML. PhantomJS uses WebKit for page rendering and supports HTML5, CSS3, SVG and canvas. In addition to PDF, it can output PNG, JPEG and GIF.
Here is an example of using phantomJS to generate PDF-invoices:
http://we-love-php.blogspot.fi/2012/12/create-pdf-invoices-with-html5-and-phantomjs.html

Related

Argument for PHP vs. DWT

I was having a "discussion" with my manager today about the merits of using PHP includes and functions as a template to build websites more quickly and efficiently. He has been using Dreamweaver templates for years and sees it as really the best way to go. I would like to start using some different and more efficient methods for web creation, because we need to get through our projects faster. I would like to know in detail what would make Dreamweaver dwts better than using code to accomplish the same task, or vice versa.
His reasoning is:
When you change links on the dwt file, it changes links for every page made from that dwt.
Even if you move pages around in directories, it maintains links to images
Everyone in the company should do it one way, and this is the way he chose (there are two of us, with someone who's just started who needs to learn web design from the beginning, and he plans to teach her the dwt method)
If you edit a site made with a dwt, you can't change anything in the template (it's grayed out), making it safer
If he's building sites with dwt, and I'm doing it with PHP includes, we can't edit each others' sites. It gets all over the place. When we have new employees in the future, it will get all crazy and people can't make changes to others' sites if they're out of the office.
I've been studying PHP these days, and am thrilled with how powerful it is for creating dynamic pages. The site in question which sparked this "discussion" is more or less static, so a dwt would work fine. However, I wanted to stretch my wings a bit, and the code was getting REALLY jumbled as the pages grew. So I chopped off the header, footer, and sidebar, and brought them in to all the pages with a PHP include, as well as dynamically assigned the title, meta data, and description for each page using variables echoed in the header. The reasons I like this better are:
It's cleaner. If every page contains all the data for the header and footer, as well as the extra tags Dreamweaver throws in there, then I have to sift through everything to find where I need to be.
It's safer. It's sort of like the above reason dwts are safe, except I do all my code editing in a text editor like Coda. So on occasion I have accidentally deleted a dwt-protected line of code because those rules only apply within dreamweaver. I can't chop off part of the header if I can't see it. Incidentally, this makes it easier to identify bugs.
It's modern. I look through source when I see great pages made by designers and design firms I admire. I've never seen dwt tags. I believe by using PHP to dynamically grab files and perform other tasks that keeps me from having to go through and change something on every page, life becomes easier, and keeps things streamlined and up-to-date with current web trends and standards.
It's simple. This should be at the top of the list. Like I said we have to train a new person in how to create for the web. Isn't it much better for her to learn a simple line of PHP and get an understanding for how the language works, rather than learn an entire piece of (not exactly user-friendly) software just for the purpose of keeping her work the exact same as everyone else's? On that note, I believe PHP is a powerful tool in a web designer's arsenal, and it would be a sin to prevent her from learning it for the sake of uniformity.
It's fast. Am I mistaken in my thought that a page built with header and footer includes loads faster than one big page with everything in it? Or does that just apply when the body is loaded dynamically with AJAX?
I did extensive searching on Google and Stack Overflow on this topic and this is the most relevant article I could find:
Why would one use Dreamweaver Templates over PHP or Javascript for templating?
The answer is helpful, but I would really like to understand in more detail why exactly we shouldn't switch to a new method if it's simpler and has more potential. My manager insists that "the result is the same, so if there isn't something that makes me say, 'oh wow that's amazing and much better!', then we should just stay how we are now."
(I do apologize for the length of this question, but the guidelines asked that I be as specific as possible.)
Like I said in the comments, without knowing exactly what sites you are working with it's hard to tell which PHP features are most important to showcase. However, I can try to describe the simplest kind of site I have dealt with, and where and how PHP came in handy. If you work with something more complicated, the need for a programming language only increases.
The simple website may have a few different pages with text and images. I'm assuming nothing interactive (i.e. no inquiry form), no large amount of structured data (i.e. no product catalog), only one design template which is used by every page with no differences whatsoever. Here's the typical structure:
One PHP file (index.php) for handling all sorts of php-ish stuff
One design file (template.php for example) for storing everything html-ish (including header, footer and more. Basically all html with placeholders for text and menu)
One CSS file for, well, the site CSS
Most of the texts are stored in database or (worst case) just txt files. Menu (navigation) is stored in database as well
Images folder with all the needed images
The key features here are:
Simplicity. You only have as many files and code as you really need to keep things organized and clear
Reusability. You can basically copy/paste your php code with little to no changes for a new similar website
No duplicates whatsoever.
Data and design separation. Wanna change texts, fix typos? You do it without as much as touching design (html) files. Wanna make a completely brand new design for your website? You can do it without even knowing what those texts are or where they are kept.
Like deceze said, no lock-ins. Use whatever software you like. (Including Dreamweaver)
PHP is responsible for taking texts, menus and design and rendering them all into a web page. If the website is in more than one language, the PHP code chooses the right texts for the language of the visitor's choice.
If the texts are stored in a database, you don't even need Notepad and FTP. You just need, e.g., phpMyAdmin (hosted on the server) so you can connect directly to the database and edit any text you like using only a web browser, from anywhere in the world. (I am assuming no real CMS.) If you need to add one more page, you connect to the database using phpMyAdmin and a browser, enter the page name (for the menu) in one or more languages, enter the text for the new page (in one or more languages), done! New page created, name placed in the menu, all hyperlinks generated for you. If you need to remove a page, you connect to the database and click delete. If you need to hide a page for a while (e.g. for proofreading before publishing), you connect to the database and uncheck the "published" box.
All this doesn't come with just using a database, of course; you need to program these features with PHP. It may take about 1 - 3 hours depending on experience, and the code is fully reusable for every similar website in the future. Basically you just copy/paste the PHP file, copy/paste the database tables, enter the new text and menu into the database, put placeholders into your HTML file and done! Brand new site created.
Which immediately makes most of the reasoning for DWT irrelevant. You don't move files around because you have only one html file and no directories, you don't need grayed out template because texts/images (content) and template are not even in the same file, there's no such thing as changing links in dwt file because it's PHP that generates them on the fly (these are not real links to real html files but rather links with parameters to tell PHP which exactly page must be rendered.. because remember we have just 1 file). The bottom line is, comparing features of the two side by side is like comparing features of a sword vs machinegun. Sharpness and length of the blade concepts are meaningless in a case of machinegun; while lifetime sword user won't really get the meaning of velocity and caliber before he tries and uses machinegun. And yet, while you can't compare their features one by one, no one brings sword to a gunfight for a reason :)
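To make the "one PHP file, one template, links with parameters" idea concrete, here is a minimal sketch. It assumes an array in place of the database tables, and the page slugs, placeholder syntax and function names are all made up for illustration; it is not the code from any particular site.

```php
<?php
// Stand-in for the database: in the setup described above these rows would
// come from tables edited through phpMyAdmin. All names here are hypothetical.
$pages = [
    'home'    => ['title' => 'Home',    'text' => 'Welcome to our site.', 'published' => true],
    'about'   => ['title' => 'About',   'text' => 'Who we are.',          'published' => true],
    'pricing' => ['title' => 'Pricing', 'text' => 'Draft, not ready.',    'published' => false],
];

/** Build the navigation from the published pages; links are index.php parameters. */
function renderMenu(array $pages): string
{
    $items = '';
    foreach ($pages as $slug => $page) {
        if (!$page['published']) {
            continue;
        }
        $items .= '<li><a href="index.php?page=' . rawurlencode($slug) . '">'
                . htmlspecialchars($page['title']) . '</a></li>';
    }
    return '<ul>' . $items . '</ul>';
}

/** Fill the placeholders of the single design template. */
function renderPage(array $pages, string $slug, string $template): string
{
    // Unknown or unpublished pages fall back to the home page.
    if (!isset($pages[$slug]) || !$pages[$slug]['published']) {
        $slug = 'home';
    }
    return strtr($template, [
        '{{title}}' => htmlspecialchars($pages[$slug]['title']),
        '{{menu}}'  => renderMenu($pages),
        '{{text}}'  => htmlspecialchars($pages[$slug]['text']),
    ]);
}

// In a real site this string would live in template.php (or similar).
$template = '<html><head><title>{{title}}</title></head>'
          . '<body>{{menu}}<main>{{text}}</main></body></html>';

$requested = (isset($_GET['page']) && is_string($_GET['page'])) ? $_GET['page'] : 'home';
$html = renderPage($pages, $requested, $template);
echo $html;
```

Hiding a page is just flipping its "published" flag: it disappears from the menu and requests for it fall back to the home page, without touching the template.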
As for #3, currently there are many more people working with PHP than with DWT (in case you need more employees in the future, or other people need to work with your websites later, etc.). As for #5, you can edit PHP websites with Dreamweaver just as well as DWT websites.
That's just off the top of my head. I wrote this in my lunch break so I likely forgot or missed quite a few things. I hope you will get a proper answer with detailed DWT vs PHP comparison too.
You simply can't compare PHP vs. DWT.
PHP is a programming language, where templating is just one of its numerous features, and DWT is just a silly proprietary system to build simple web pages.
There is actually nothing to compare.
I would say that using DWT templates over PHP does have some advantages.
It does not need any extra server-side process, like PHP to process the files at the server.
You can serve all files to the user as .html files rather than .php files, though I suspect that it is possible to hide the .php extension. Why should any user see anything other than .html?
You don't have to learn PHP syntax/programming. It is true that you can do more with PHP than with plain .dwt files, but for plain templating the .dwt files can be just as clean.
It is not true that .dwt files are a lock-in technology. The feature is also implemented by other web editors, e.g. Microsoft Expression Web.

Thesaurus class or API for PHP [edited]

TL;DR Summary: I need a single command-line application which I can use to get synonyms and other related words. It needs to be multi-lingual and works cross platform. Can anyone suggest a suitable program for me, or help me with the ones I've already found? Thanks.
Longer version:
I've been tasked with writing a system in PHP that can come up with alternative suggestions for words entered by the user. I need to find a thesaurus application / API or similar which I can use to generate these suggestions.
Importantly, it needs to be multilingual (English, Danish, French and German). This rules out most of the software that I managed to find using Google. It also needs to be cross-platform (it needs to work on Linux and Windows).
My research has led me to two promising candidates: WordNet and Stardict.
I've been focusing on WordNet so far, calling it from PHP using the shell_exec() function, and I've managed to use it to create a very promising prototype PHP page, but so far in English only. I'm struggling with how to make it multilingual.
The WordNet site has external links to WordNet projects in other languages (e.g. DanNet for Danish), but although they're often called WordNet, they seem to use a variety of database formats and software, which makes them unsuitable for me. I need a consistent interface that I can call from my PHP program.
Stardict looked more promising from that perspective: they provide dictionaries in many languages in a standard DB format for the one application.
But the downside of Stardict is that it's primarily a GUI app. Calling it from the command line launches the GUI. There is apparently a command-line version (SDCV), but it seems quite out of date (last update 2006), and only for Linux.
Can anyone help me with my problems with either of these programs? Or else, can anyone suggest any other alternative software or API that I could use?
Many thanks.
You could try to leverage PostgreSQL's full text search functionality:
http://www.postgresql.org/docs/9.0/static/textsearch.html
You can configure it with any of the available languages and all sorts of collations to fit your needs. PostgreSQL 9.1 adds some extra collation functionality that you may want to look into if the approach seems reasonable.
The basic steps would be (for each language):
Create the needed table (collated appropriately). For our sake, a single column is enough, e.g.:
create table dict_en (
word text check (word = lower(word)) primary key
);
Fetch the needed dictionary/thesaurus files (those from aspell/Open-Office should work).
Configure text search (see link above, namely section 12.6) using the relevant files.
Insert the whole dictionary into the table. (Surely there's a csv file somewhere...)
And finally index the vector, e.g.:
create index on dict_en using gin (to_tsvector('english', word));
You can now run queries that use this index:
-- Find words related to `:word`
select word
from dict_en
where to_tsvector('english', word) @@ plainto_tsquery('english', :word)
and word <> :word;
You might need to create a separate database or schema for each language, and add an additional field (tsvector) if Postgres refuses to index the expression because of the language parameter. (I read the full text docs a long time ago). The details on this would be in section 12.2, and I'm sure you'll know how to adjust the above if this is the case.
Whichever the implementation details, though, I believe the approach should work.
There is a PHP example for a thesaurus API usage here...
http://thesaurus.altervista.org/testphp
Available for Italian, English, French, German, Spanish and Portuguese.
This seems to be an option, though I'm not sure whether it's multilingual:
http://developer.dictionary.com/products/synonyms
I also found the following site which does something similar to your end goal, maybe you could try contacting the owner and ask him how he did it:
http://www.synonymlab.com/

PHP Reporting With Templates

I'm looking for recommendations on how to do some reporting in PHP. Specifically, I need PDF rendering and I would like to separate the presentation from my code as much as possible (templates of some sort?).
I've recently received a PHP4 code-base on the job and am in charge of upgrading to PHP5. The reports currently are created via TCPDF which are driven from HTML generated straight from PHP. The upgrade process is proving extremely challenging. I'm running into multiple problems with TCPDF where it will just go into an infinite loop and never return. Through some troubleshooting and forum posts, it appears we're doing some stuff that TCPDF4 with PHP4 didn't have a problem with but TCPDF5 and PHP5 does. Unfortunately, rather than error on whatever rules we're breaking, I just get infinite hangs. Our in-house code could use some refactoring and I feel a re-write could mitigate many of these problems, however I'm not opposed to looking beyond TCPDF right now.
Our code also has a lot of the HTML generation mixed in with the rest of the reporting features. I would really like to separate this out so that little to no PHP code is required when editing or creating a report. I don't need anything too fancy for the report. Needs to support basic data tables and multiple horizontal panes on a single tab. There are charts but they are currently generated by a separate package and just read directly as an image. Suggestions on the best way to re-do these reports? Thanks.
You can use PHP as a template engine, and then you can convert the HTML output to a PDF.
Here are the options for PHP templating:
Use Smarty. It's pretty easy and has its own template language, but I think it's a bit of overkill for your use case.
Or you can use just plain PHP: put the template code into a separate directory and create a function render_template('template_name.php') or something like this.
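A plain-PHP render_template() along those lines could look like the sketch below. The signature (a file path plus an array of variables) is an assumption, not something prescribed above, and the throwaway template file exists only so the example runs on its own.

```php
<?php
/**
 * Render a plain-PHP template file and return its output as a string.
 * $vars become local variables inside the template.
 */
function render_template(string $file, array $vars = []): string
{
    if (!is_file($file)) {
        throw new InvalidArgumentException("Template not found: $file");
    }
    extract($vars, EXTR_SKIP); // EXTR_SKIP: never overwrite $file or $vars
    ob_start();
    try {
        include $file;
    } catch (Throwable $e) {
        ob_end_clean();
        throw $e;
    }
    return ob_get_clean();
}

// Demo only: write a throwaway template so the sketch is self-contained.
// In practice the templates would live in their own directory.
$templateFile = tempnam(sys_get_temp_dir(), 'tpl');
file_put_contents($templateFile, <<<'TPL'
<h1><?= htmlspecialchars($title) ?></h1>
<ul>
<?php foreach ($users as $user): ?>
  <li><?= htmlspecialchars($user) ?></li>
<?php endforeach; ?>
</ul>
TPL
);

$html = render_template($templateFile, [
    'title' => 'Users',
    'users' => ['Claire', 'Fred', 'Kevin'],
]);
unlink($templateFile);
```

The returned HTML string is what you would hand to your HTML-to-PDF step, so whoever edits a report only touches the template file.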
In your case, I think that you could write your own class for the handling of reports. Here just a simple sketch of the usage:
<?php
$report = new Report(1234); // report id
$report->template = 'default'; // or something different
$report->addField('Users', array('Claire', 'Fred', 'Kevin'), array('renderArray' => true));
$report->addField('Note', '<p>This report was generated automatically.</p>');
So, eventually you can use some of the simple ideas behind this simple sketch.
I hope that I understood your question correctly, and that I could help you with this answer.
I've created some dynamic PDFs using multiple classes, separating the PDF generation and PDF design from data access and management.
I have one abstract class that you can use for all your reports. This class handles the PDF generation (all the setup), and you may design some styles (e.g. "writeTitle1()") or helper methods to render data (e.g. "writeReportTable()").
If you plan on keeping a system that converts html2pdf, this class can generate some html too, or call any templating engine.
In my design, this class generally wraps the PDF engine. (I often use Cezpdf)
Then for each report, a child class that focuses on data management : getting and formatting the data, then calling the rendering method.
This is a first step to gain some visibility: independent layers.
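A rough sketch of that layering, with hypothetical names throughout. The "engine" here is a stub that only records calls, standing in for whatever PDF library (Cezpdf in my case) the base class wraps; it does not reproduce any real library's API.

```php
<?php
// Hypothetical names throughout; the "engine" below is a stub that records
// drawing calls, standing in for whatever PDF library the base class wraps.
class RecordingPdfEngine
{
    public $calls = [];

    public function text(string $style, string $content): void
    {
        $this->calls[] = "$style: $content";
    }
}

abstract class AbstractReport
{
    protected $pdf;

    public function __construct(RecordingPdfEngine $pdf)
    {
        $this->pdf = $pdf;
    }

    // Presentation helpers shared by every report.
    protected function writeTitle1(string $title): void
    {
        $this->pdf->text('title1', $title);
    }

    protected function writeReportTable(array $headers, array $rows): void
    {
        $this->pdf->text('table-header', implode(' | ', $headers));
        foreach ($rows as $row) {
            $this->pdf->text('table-row', implode(' | ', $row));
        }
    }

    // Each child class decides where its data comes from...
    abstract protected function fetchData(): array;

    // ...and how the report is assembled from the shared helpers.
    abstract public function render(): void;
}

class UserReport extends AbstractReport
{
    protected function fetchData(): array
    {
        // A real report would query the database here.
        return [['Claire', 'admin'], ['Fred', 'editor']];
    }

    public function render(): void
    {
        $this->writeTitle1('Users');
        $this->writeReportTable(['Name', 'Role'], $this->fetchData());
    }
}

$engine = new RecordingPdfEngine();
(new UserReport($engine))->render();
```

Because the child class never talks to the engine directly, swapping the PDF library later means changing only the base class.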

Saving single pages of a word file as separate documents using COM

Lately I've been playing with the Microsoft COM object class for PHP to manipulate Word files. So far so good, as I've been able to make it work and do some file conversions, such as saving an entire DOC as a PDF on the server.
Now I'm facing a problem: since I'll be converting and manipulating the given word file a lot at runtime, I thought it would be much better if I could save every single -page- separately and work on them one by one instead of reprocessing the whole document each time.
I have been reading all the MSDN part about the COM Document Class, and I have the feeling that I can't save just one page of the document, unless I do some sort of magic using the Range Method, but apparently there's -no way- to know the 'current end position' for each page. Any ideas?
tl;dr I'm trying to save single pages inside a word document using a 'word.application' COM object through a PHP script, but I can't find examples of the Document.Range method.
Francesco, I'll have to warn you. @SLaks is correct in that you really cannot use Word Automation on a server. No, really. We're serious.
There are two reasons:
First, Word is an incredibly complex piece of software designed to be used by an interactive user. It was not programmed or tested to be used under a server environment, and does not work correctly when running under a non-interactive account (the way services do). Sooner or later it will crash or freeze. I've seen it. I'm not talking necessarily about bugs. There are things that Word will do that require a full user account; or where Word expects somebody will be clicking on message boxes. There is no escaping it.
Second, because even if you manage to make it do what you want, it turns out that the Office license expressly forbids you from running Word that way.
Now, exclusively from the point of view of Automation:
Word doesn't really manipulate 'pages'. 'Pages' are just an incidental side-effect of whichever printer is currently selected. Take the same file to a different computer with a different printer and/or driver, and the pagination can change. On large documents it will change.
Yes, most of the time the page breaks don't move (a lot), particularly if you have a document that is a bunch of not-quite-a-full-page forms, but I'm not trying to be fastidious: The point is, the Word document object model won't help you a lot to manipulate 'pages' because they are not a first-class citizen but incidental formatting.
I guess that your best bet would be to use section breaks between the pages, instead of letting the pages autoflow; that way you have something for the object model to grab onto.
You can use the ActiveDocument.Sections collection to locate your... ahem... 'pages' (really, section objects), then use the Range method (to extract the Range object) and the ExportAsFixedFormat method to export that range to a PDF.
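Translated to PHP's COM class, that could look roughly like the sketch below. It is Windows-only, assumes the document already uses section breaks as "pages", and the helper names and output file naming are made up. The warnings above about running Word on a server still apply.

```php
<?php
// Windows-only sketch: needs the php_com_dotnet extension and an installed Word.
// The Word object-model members used here (Documents.Open, Sections, Range,
// ExportAsFixedFormat) are the ones named above; everything else is hypothetical.

const WD_EXPORT_FORMAT_PDF = 17; // value of Word's wdExportFormatPDF constant

/** Build the output file name for section number $index (1-based). */
function sectionPdfPath(string $outputDir, int $index): string
{
    return rtrim($outputDir, '\\/') . DIRECTORY_SEPARATOR
         . sprintf('section-%03d.pdf', $index);
}

/** Export every section of $docPath to its own PDF; returns the files written. */
function exportSectionsToPdf(string $docPath, string $outputDir): array
{
    if (!class_exists('COM')) {
        throw new RuntimeException('The COM extension is not available.');
    }
    $word = new COM('Word.Application');
    $word->Visible = false;
    $written = [];
    try {
        $doc = $word->Documents->Open($docPath);
        $count = $doc->Sections->Count;
        for ($i = 1; $i <= $count; $i++) { // Word collections are 1-based
            $range  = $doc->Sections->Item($i)->Range;
            $target = sectionPdfPath($outputDir, $i);
            $range->ExportAsFixedFormat($target, WD_EXPORT_FORMAT_PDF);
            $written[] = $target;
        }
        $doc->Close(false); // close without saving changes
    } finally {
        $word->Quit();
    }
    return $written;
}
```

Calling exportSectionsToPdf() with absolute paths for the source document and the output directory is the safest choice, since Word resolves relative paths against its own working directory rather than the script's.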
If you want a Word document instead, I don't think the object model allows you to save a piece of the document as a separate document. However you can easily copy-and-paste the range to a new document and save that instead.
I have written some code in VB.net that splits a passed word document into individual pages. It then goes on to save the pages as JPG images so I would think this is what you want.
I am happy to share the code with you if you've not accomplished the task yet?

extract data from site and put into a file

I've got a project where the client has lost their database, so I have to go through their current (live) site and retrieve the information. The problem is that there is too much data to copy and insert into the database by hand, which is taking a lot of time. Could you suggest some code which could help me?
You can use the DOMDocument library for PHP and write automated scripts to retrieve the data, after identifying from the tags where your information sits in the page.
http://www.php.net/manual/en/book.dom.php
The library is very robust and uses xpaths.
http://www.w3schools.com/xpath/xpath_examples.asp
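As a rough illustration, here is a sketch that pulls rows out of a page with DOMDocument and DOMXPath and writes them as CSV. The sample markup, class names and field names are invented; on the real site you would fetch each page and adjust the XPath expressions to its structure.

```php
<?php
// Inline sample standing in for a page fetched from the live site
// (e.g. with file_get_contents); the markup and class names are made up.
$html = <<<'HTML'
<html><body>
  <div class="product"><h2>Blue Widget</h2><span class="price">9.99</span></div>
  <div class="product"><h2>Red Widget</h2><span class="price">12.50</span></div>
</body></html>
HTML;

/** Extract one row per product block using XPath queries. */
function extractProducts(string $html): array
{
    $dom = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $rows = [];
    foreach ($xpath->query('//div[@class="product"]') as $node) {
        $rows[] = [
            'name'  => trim($xpath->evaluate('string(h2)', $node)),
            'price' => trim($xpath->evaluate('string(span[@class="price"])', $node)),
        ];
    }
    return $rows;
}

$products = extractProducts($html);

// Write the rows as CSV, ready to be imported into the new database.
$out = fopen('php://memory', 'w+');
foreach ($products as $product) {
    fputcsv($out, $product, ',', '"', '\\');
}
rewind($out);
$csv = stream_get_contents($out);
fclose($out);
```

Swapping php://memory for a real file path gives you a file you can load with your database's CSV import.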
If the pages are all very similar in structure, you could try to use regular expressions or a html parser (tidy) to filter out the relevant data.
I did a similar thing for a customer who had 200+ handwritten product pages with images, titles and text. The source seemed to have been copy-pasted from the last page, and had evolved into a few different flavors. It worked great after some tweaking.
