OpenTBS - Repeating Paragraph with Token Replacement? - php

I have a text-based (as opposed to table-based) .docx document with a section that has several places in the paragraph for replacement. This paragraph must be duplicated and the tokens replaced for each entry in the array. I'm continuing to comb through the documentation but am kind of stuck on this one -- most of the examples are table-based and I'm not seeing how I can accomplish what I'm shooting for.
Here's an example of the section in the .docx file I am using as a template so far:
[onshow; block=begin; personsblock=tbs:p;]
Person 1
[flex_27.xx.01] [flex_27.xx.02], currently of [flex_27.xx.05], as amazing with the following: __{this has yet to be determined}__.
[onshow; block=end; personsblock=tbs:p]
... and my data so far is:
$personarray = array();
$personarray[] = array('tID27.01.01'=> 'Steve', 'tID27.01.02'=>'Klutcher' , 'tID27.01.05'=>'', 'tID27.01.06'=>'Cook');
$personarray[] = array('tID27.02.01'=> 'Tommy', 'tID27.02.02'=>'Boonary' , 'tID27.02.05'=>'Clarksville', 'tID27.02.06'=>'Montgomery');
... At this point, I am pretty much lost. I'll be programmatically replacing the center 'code' (marked by xx) with the count of the person involved. What is the difference between a merge and a replace? How do I combine the actions? Can I do a multi-pass on the document somehow?
Sorry if these seem such basic questions but as I said, I've been stuck on this for two days.

block=begin ... block=end is the syntax for absolute bounds: the two tags mark the exact positions where the block starts and ends in the source.
In a DOCX, you don't have access to the inner XML source, so it is advisable to use the relative syntax for blocks.
In PHP, your data structure should look like this:
$personarray = array();
$personarray[] = array('c01'=> 'Steve', 'c02'=>'Klutcher' , 'c05'=>'', 'c06'=>'Cook');
$personarray[] = array('c01'=> 'Tommy', 'c02'=>'Boonary' , 'c05'=>'Clarksville', 'c06'=>'Montgomery');
In your DOCX, your template snippet should look like this:
Person 1
[person.c01;block=tbs:p] [person.c02], currently of [person.c05], as amazing ...
This assumes that those two lines are in the same paragraph.
The expression [person.c01;block=tbs:p] is a TBS field that merges column 'c01' and also defines the block bounds as the paragraph that contains this field.
There are also some problems in your original PHP snippet:
Column names should not contain a dot (.) because it is the separator character in TBS field names.
I use the prefix 'c' because PHP turns numeric string keys into numeric indexes.
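As a rough sketch of how the original keys could be reshaped into the dot-free form described above: the helper name normalizePersonKeys is hypothetical (it is not part of TBS), and it assumes every key ends in a ".NN" column suffix as in the question's data. The MergeBlock call is shown only as a comment because it needs the TBS library and a loaded template.

```php
<?php
// Hypothetical helper: rewrites keys such as 'tID27.01.05' to 'c05'
// by keeping only the part after the last dot and prefixing it with 'c'.
function normalizePersonKeys(array $row): array
{
    $out = [];
    foreach ($row as $key => $value) {
        $key = (string) $key;
        $pos = strrpos($key, '.');
        $suffix = ($pos === false) ? $key : substr($key, $pos + 1);
        $out['c' . $suffix] = $value;
    }
    return $out;
}

// Data as posted in the question.
$raw = [
    ['tID27.01.01' => 'Steve', 'tID27.01.02' => 'Klutcher', 'tID27.01.05' => '', 'tID27.01.06' => 'Cook'],
    ['tID27.02.01' => 'Tommy', 'tID27.02.02' => 'Boonary', 'tID27.02.05' => 'Clarksville', 'tID27.02.06' => 'Montgomery'],
];

$personarray = array_map('normalizePersonKeys', $raw);

// With TBS installed and the template loaded, the block named 'person'
// would then be merged with:
// $TBS->MergeBlock('person', $personarray);
```

Each row then has the keys c01, c02, c05 and c06, which match the [person.c01] style fields in the template.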

Related

OpenTBS/TinyButStrong Is Not Replacing Merge Fields In DOCX

I'm trying to use the OpenTBS/TinyButStrong library to replace merge fields in a word document.
We can take a very basic word document like this:
Hello, My Name Is Bob, My Age Is <<BOBAGE>>
Which in word has the following code:
{MERGEFIELD BOBAGE\*MERGEFORMAT}
And my code would be basic:
$TBS = new \clsTinyButStrong();
$TBS->PlugIn(TBS_INSTALL,OPENTBS_PLUGIN);
$TBS->LoadTemplate($path,OPENTBS_ALREADY_UTF8);
$TBS->MergeField('BOBAGE','TEST');
$TBS->Show(OPENTBS_FILE,$tmpPath . 'test.docx');
When I open test.docx, the merge field isn't replaced!
It works if I use [bobage], which isn't actually a Word merge field! That's not what I expected it to do, and it's pretty useless to me.
Is there a way to replace the actual word merge fields?
The instruction $TBS->MergeField() is for merging TBS fields, not MS Word Mail Merge fields.
TBS fields are those like [my_field] or [my_block.my_field] in your template.
So your snippet could work if you have a piece of text in your template like [BOBAGE].
By the way, OpenTBS can merge document fields if the type is an IF field, but not MERGEFIELD. See the documentation and examples for more details.

Combining array values in PHP when sometimes a value doesn't exist

I apologize if this question has a no brainer answer. I am still learning more ins and outs of php.
I have a snippet of code that takes in a CSV file. This CSV file is uploaded by a user who downloads it from an external source. In the CSV file, the person's first name and last name are not split into separate columns. Therefore, in PHP the following is used:
$member_name = explode( " ", $application_data[5]);
The problem is that when this data is then used to render a PDF document to send a letter to the member, it cuts off their last name if their last name is two words.
The information is loaded into the PDF document with the first name and last name field by using:
$member_name[0],
$member_name[1]
Can I safely do:
$member_name[0],
$member_name[1] + $member_name[2]
Even if 99% of the members do not have a space in the last name? Will I get an error that member_name[2] doesn't exist 99% of the time this is done?
I've done some searching on array_merge. Is that a better option? I've been trying to search for how php handles when you add something that doesn't exist and I'm drawing a blank.
I don't want to assume my solution will work and then when the person uploads their CSV file tomorrow, they get an error.
Or maybe I'm looking at this the wrong way and before it attempts to render a pdf document, I should do an if statement that figures out if $member_name[2] exists.
Thank you!
You can use the limit parameter of explode to only split at the first space.
$member_name = explode( " ", $application_data[5], 2);
Of course, if the first name also has more than one word, this still won't be quite right. Names are tricky.
Regarding array_merge, I don't think it would really be useful in this situation.
You could just use a limit on your explode to separate only on the first space. Here is an example.
$name = "George The King";
print_r(explode(' ', $name, 2)); //prints -> Array ( [0] => George [1] => The King )
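A minimal sketch combining the limit parameter with array_pad, so that index 1 always exists even when the name contains no space. The function name splitName is hypothetical. Note also that + is arithmetic in PHP; string concatenation uses the . operator, and reading an index that does not exist raises a notice or warning depending on the PHP version.

```php
<?php
// Hypothetical helper: splits a full name at the first space only and
// always returns exactly two elements (first name, rest of the name).
function splitName(string $fullName): array
{
    $parts = explode(' ', trim($fullName), 2);
    // Pad with an empty string so $parts[1] exists for single-word names.
    return array_pad($parts, 2, '');
}

[$first, $last] = splitName('George The King');
echo $first, ' | ', $last, "\n"; // George | The King

[$first, $last] = splitName('Madonna');
echo $first, ' | ', $last, "\n"; // Madonna |
```

As the first answer notes, this still puts everything after the first space into the last name, so a two-word first name would be split in the wrong place.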

PHP pdf form parse regex

I have two PDF forms that I'd like to input values for using PHP. There don't seem to be any open source solutions. The only solution seems to be SetaSign, which is over $400. So instead I'm trying to dump the data as a string, parse it using a regex and then save. This is what I have so far:
$pdf = file_get_contents("../forms/mypdf.pdf");
$decode = utf8_decode($pdf);
$re = "/(\d+)\s(?:0 obj <>\/AP<>\/)(.*)(?:>> endobj)/U";
preg_match_all($re, $decode, $matches);
print_r($matches);
However, my print_r is empty even after testing here. The matches on the right are first a numerical identifier for the field (I think) and then V(XX1) where "XX1" is the text I've manually entered into the form and saved (as a test to find how and where that data is stored). I'm assuming (but haven't tested) that N<>>>/AS/Off is a checkbox.
Is there something I need to change in my regex to find matches like (2811 0 obj <>/AP<>/V(XX2)>> endobj) where the first find will be a key and the second find is the value?
Part 1 - Extract text from PDF
Download class.pdf2text.php from http://pastebin.com/dvwySU1a (updated on 5 April 2014) or http://www.phpclasses.org/browse/file/31030.html (registration required)
Usage:
include('class.pdf2text.php');
$a = new PDF2Text();
$a->setFilename('test.pdf');
$a->decodePDF();
echo $a->output();
The class doesn't work with all the PDFs I've tested; give it a try and you may get lucky :)
Part 2 - Write to PDF
To write the PDF contents, use tcpdf, which is an enhanced and maintained version of fpdf.
Thanks to those who've looked into this. I decided to convert the PDFs (since I'm not doing this as a batch) into SVG files. This online converter kept the form fields and with some small edits I've made them printable. Now, I'll be able to populate the values and have a visual representation of the PDF. I may try tcpdf in the event I want to make it an actual PDF again, though I'm assuming it won't keep the form fields.

Using non standard characters in associative array

Good day all! I am working on a parser for a chat room that can color text based on who was talking, for archive purposes. I have it working perfectly, except now the administrator wants to be able to remove the "fancy" names and replace them with more readable versions for some of their regular people.
The chat room allows an extended range of letters and symbols, which may not transfer correctly when written to an RTF file.
I can't get it to work, and don't see any reason why it should not.
This is an example of what I have:
$nameconvert = array(
"îrúål__Þħōþħ" => "Eriel__Thoth",
);
***Scripting that parses an uploaded text
file line by line, each line places in an
array using space as delimiter... thus
name of person talking is $row_data[0]***
$name = $row_data[0];
$name = $nameconvert[$name];
** Code to throw everything back together **
Now, this is just a simplified snippet, but for whatever reason, it does not work. If I do $name = $nameconvert['îrúål__Þħōþħ'] then it does work, which tells me that the name I'm putting in the script and the name being pulled from my text file are two different things, though they are visually identical.
HELP!
I have found the answer, and wish to share my solution to others.
This is the modified code
$nameconvert = array(
"0123456789abcdef" => "Eriel__Thoth",
);
***Scripting that parses an uploaded text
file line by line, each line places in an
array using space as delimiter... thus
name of person talking is $row_data[0]***
$name = $row_data[0];
$name = $nameconvert[bin2hex(mb_convert_encoding($name,"UTF-8"))];
** Code to throw everything back together **
The command bin2hex(mb_convert_encoding($name,"UTF-8")) takes the name from the file, ensures it is in UTF-8 format, then creates its hexadecimal equivalent. That hexadecimal string is then used as the array key that maps to an easier-to-read name.
It works just the way I am wanting!
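For anyone debugging a similar mismatch, bin2hex is also a quick way to see why two visually identical strings differ. The sketch below assumes, purely for illustration, that the name read from the file carries a leading UTF-8 byte order mark; the thread does not confirm that this was the actual cause. The helper name lookupName is hypothetical.

```php
<?php
// Hypothetical helper: strips a UTF-8 byte order mark and surrounding
// whitespace before the lookup; unknown names are returned unchanged.
function lookupName(string $raw, array $map): string
{
    if (strncmp($raw, "\xEF\xBB\xBF", 3) === 0) {
        $raw = substr($raw, 3);
    }
    $raw = trim($raw);
    return $map[$raw] ?? $raw;
}

$nameconvert = ["îrúål__Þħōþħ" => "Eriel__Thoth"];

// Assumed for illustration: the first token of the file starts with a BOM.
$fromFile = "\xEF\xBB\xBF" . "îrúål__Þħōþħ";

// Dumping both strings as hex makes the invisible difference visible.
echo bin2hex($fromFile), "\n";
echo bin2hex("îrúål__Þħōþħ"), "\n";

echo lookupName($fromFile, $nameconvert), "\n";
```

If the two hex dumps differ in some other way (for example a different encoding of the same letters), the cleanup step inside the helper would need to change accordingly.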

PHP preg_replace markdown issue - detecting duplicates

In a project I am building I would like to use markdown as follows
*text* = <em>text</em>
**text** = <strong>text</strong>
***text*** = <strong><em>text</em></strong>
As those are the only three markdown formats I require, I would like to remain lightweight and avoid importing the entire PHP markdown library as that would introduce features I do not require and create issues.
So I have been trying to build some simple regex replaces. Using preg_replace I run:
'/(\*\*\*)(.*?)\1/' to '<strong><em>\2</em></strong>'
'/(\*\*)(.*?)\1/' to '<strong>\2</strong>'
'/(\*)(.*?)\1/' to '<em>\2</em>',
And this works great! em, bold, and the combo all work fine...
But if the user makes a mistake or enters too many stars, everything breaks.
i.e.
****hello**** = <strong><em><em>hello</em></strong></em>
*****hello***** = <strong><em><strong>hello</em></strong></strong>
******hello****** = <strong><em></em></strong>hello<strong><em></em></strong>
etc
When ideally it would create
****hello**** = *<strong><em>hello</em></strong>*
*****hello***** = **<strong><em>hello</em></strong>**
******hello****** = ***<strong><em>hello</em></strong>***
etc
Ignoring the un-required stars (so it would become clear to the user they made a mistake, and more importantly, the rendered HTML remains valid).
I presume there must be some way to modify my regex to do this, but I cannot for the life of me work it out, even after a whole day of trying!
I would also be happy with the result of
******hello****** = <strong><em>hello</em></strong>
So please, can anybody help me?
Also please consider uneven stars. In this case the below scenario would be ideal.
***hello* = **<em>hello</em>
And the time when a star should be part of the body and not detected, such as if a user inputs:
'terms and conditions may apply*'
or
'I give the film 5* out of 10'
Many many thanks
Try a different capturing pattern (match anything except * one or more times),
'/(\*\*\*)([^*]+)\1/'
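One way to extend this idea, sketched here as a variation rather than as the answer's own code, is to handle all three cases in a single pass with preg_replace_callback, so that stars left over from one replacement are never re-scanned by a later pattern. The function name renderEmphasis is hypothetical, and the sketch does not escape HTML in the input.

```php
<?php
// Hypothetical helper: converts *x*, **x** and ***x*** in one pass.
// The body may not contain '*', and the closing run must equal the opening run.
function renderEmphasis(string $text): string
{
    $wrappers = [
        1 => ['<em>', '</em>'],
        2 => ['<strong>', '</strong>'],
        3 => ['<strong><em>', '</em></strong>'],
    ];

    return preg_replace_callback(
        '/(\*{1,3})([^*]+)\1/',
        function (array $m) use ($wrappers): string {
            // The number of stars captured decides which tags to use.
            [$open, $close] = $wrappers[strlen($m[1])];
            return $open . $m[2] . $close;
        },
        $text
    );
}

echo renderEmphasis('****hello****'), "\n"; // *<strong><em>hello</em></strong>*
echo renderEmphasis('***hello*'), "\n";     // **<em>hello</em>
echo renderEmphasis('terms and conditions may apply*'), "\n"; // unchanged
```

A lone star is left alone, but two unrelated stars on the same line (for example "5* and 6*") would still be treated as a pair, so this remains a heuristic.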
