XML Parsing error; PHPWord - php

I am using PHPOffice/PHPWord in my Laravel application to generate a .docx document with results in tables. This works well for a document with 3 tables of 6 rows, but with more rows the document is still generated, yet opening it produces the following error:
We're sorry, We can't open (documentname) because we found a problem with its contents.
Details: XML parsing error Location: Part:/word/document.xml, Line: 2, Column 14349.
Now I have started working on another results page where I also want to generate a .docx document. It will contain 5 tables, but with only 3 rows I get the same XML parsing error at a different location (Location: Part: /word/document.xml, Line: 4, Column: 2888). Could someone explain whether this is an error in my code or in PHPWord/Word?
I have done some troubleshooting by deleting everything and slowly adding rows back. I have found where the error occurs, but how can I fix it? The first two tables are generated correctly.
$phpWord = new \PhpOffice\PhpWord\PhpWord();
$section = $phpWord->addSection();
$section->addImage('../public/img/2.jpg', array('width' => 230, 'height' => 65, 'alignment' => 'left'));
$section->addText('Project IDs:' . $parameter);
$header =$section->addHeader();
$header->addText('Results Summary');
$section->addLine(
array(
'width' => \PhpOffice\PhpWord\Shared\Converter::cmToPixel(16),
'height' => \PhpOffice\PhpWord\Shared\Converter::cmToPixel(0),
'positioning' => 'absolute',
)
);
$tableName = 'rStyle';
$phpWord->addFontStyle($tableName, array('italic' => true, 'size' => 12));
$thName = 'tStyle';
$phpWord->addFontStyle($thName, array('bold' => true, 'size' => 9));
$section->addText('General Information Table', $tableName);
$fancyTableStyle = array('borderSize' => 6, 'borderColor' => '999999');
$spanTableStyleName = 'Overview tables';
$phpWord->addTableStyle($spanTableStyleName, $fancyTableStyle);
$table = $section->addTable($spanTableStyleName);
$table->addRow(null, array('tblHeader' => true, 'cantSplit' => true));
$table->addCell(1750)->addText('Project ID',$thName);
$table->addCell(1750)->addText('Description',$thName);
$table->addCell(1750)->addText('Notes',$thName);
foreach ($id_array_explode as $char) {
$table->addRow();
$singlenumber = (int)$char;
$cursor = $collection->find(array("id" => $singlenumber));
foreach ($cursor as $document) {
$table->addCell(1750)->addText($document["project_id"]);
$table->addCell(1750)->addText($document["description"]);
$table->addCell(1750)->addText($document["notes"]);
}
}
$section->addText('
');
$section->addLine(
array(
'width' => \PhpOffice\PhpWord\Shared\Converter::cmToPixel(16),
'height' => \PhpOffice\PhpWord\Shared\Converter::cmToPixel(0),
'positioning' => 'absolute',
)
);
$section->addText('Input Table', $tableName);
$table1 = $section->addTable($spanTableStyleName);
$table1->addRow(null, array('tblHeader' => true, 'cantSplit' => true));
$table1->addCell(1750)->addText('Project ID',$thName);
$table1->addCell(1750)->addText('#',$thName);
foreach ($id_array_explode as $char) {
$table1->addRow();
$singlenumber = (int)$char;
$cursor = $collection->find(array("id" => $singlenumber));
foreach ($cursor as $document) {
if (is_array($document['input'])) {
foreach ($document['input'] as $samples) {
$table1->addCell(1750)->addText($document["project_id"]);
$table1->addCell(1750)->addText($samples['nr']);
}
}
}
}
$section->addText('
');
$section->addLine(
array(
'width' => \PhpOffice\PhpWord\Shared\Converter::cmToPixel(16),
'height' => \PhpOffice\PhpWord\Shared\Converter::cmToPixel(0),
'positioning' => 'absolute',
)
);
$section->addText('Output Table', $tableName);
$table2 = $section->addTable($spanTableStyleName);
//// THIS IS WHERE THE ERROR OCCURS!!
$table2->addRow(null, array('tblHeader' => true, 'cantSplit' => true));
$table2->addCell(1750)->addText('ID',$thName);
Thank you!
SOLUTION
OK, so I deleted the whole document and added every single statement back separately to see where the error occurred. This showed that the error came from the data I was inserting: it couldn't handle ">" and "&" signs!
So, if you ever get this error, check the data you are printing!

A better solution is to add the following line of code before you do anything with the word document:
PhpOffice\PhpWord\Settings::setOutputEscapingEnabled(true);
This will automatically escape any problematic characters.

Indeed, it comes from your data: it contains an XML special character, and when Word parses your document it cannot interpret it.
I solved this problem by using htmlspecialchars(). I'm not sure it is the best way, but it works.
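As a minimal sketch of the htmlspecialchars() approach, the snippet below escapes a value before it would be handed to addText(). The helper name xml_safe() and the sample string are made up for illustration. If output escaping is already enabled through Settings::setOutputEscapingEnabled(true), escaping by hand as well would most likely double-escape the text, so use one approach or the other.

```php
<?php
// Hypothetical helper (not part of PHPWord): escape a value so it is safe
// to embed in the XML of word/document.xml.
function xml_safe($value)
{
    // ENT_XML1 turns &, < and > into XML entities; ENT_QUOTES also covers quotes.
    return htmlspecialchars((string) $value, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

// Sample data containing the characters that broke the document.
$notes = 'pH > 7 & temp < 40';
echo xml_safe($notes), "\n"; // prints: pH &gt; 7 &amp; temp &lt; 40
```

In the question's loops this would be used as, for example, addText(xml_safe($document["notes"])).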

I'm not very familiar with PHPWord, but make sure the encoding of your document and of the data you are inserting into it are the same. I used to have the same problem with an old library for creating Excel files.

Related

Can PHPWord show a pie chart in percentages with two decimals?

I use the PHPWord class to create Word files in PHP.
Is it possible to create a pie chart that shows its values as percentages with two decimals?
$c3 = array('Expensive kW', 'Cheap kW');
$s3 = array($expensive, $cheap);
$tablePie2Charts = $section->addTable('Chart');
$tablePie2Charts->addRow();
$stylePie2Chart = array(
'width' => Converter::inchToEmu(5),
'height' => Converter::inchToEmu(3),
'valueAxisTitle' => 'Last month consumed in kW',
'showLegend' => true,
'dataLabelOptions' => array(
'showCatName' => false,
'showVal' => false,
'showPercent' => true
)
);
$c1 = $tablePie2Charts->addCell()->addChart('pie', $c3, $s3, $stylePie2Chart);
I ran into the same problem and could not find an API option that supports it.
Here is my workaround:
Edit the file /PhpOffice/Phpword/Writer/Word2007/Part/Chart.php
and, near line 250, insert the code
$xmlWriter->writeElementBlock("c:numFmt",['formatCode'=>'0.00%','sourceLinked'=>'0']);
between
$xmlWriter->startElement('c:dLbls');
and
foreach ($style->getDataLabelOptions() as $option => $val) {
In the end it looks like this:
$xmlWriter->startElement('c:dLbls');
$xmlWriter->writeElementBlock("c:numFmt",['formatCode'=>'0.00%','sourceLinked'=>'0']);
foreach ($style->getDataLabelOptions() as $option => $val) {
If you only want this for pie charts, wrap the new line in a type check:
if ('pie' === $this->element->getType()){
$xmlWriter->writeElementBlock("c:numFmt",['formatCode'=>'0.00%','sourceLinked'=>'0']);
}
You can change the formatCode to 0.0%, 0.00%, 0.000% or whatever you need.
If someone finds a better way, please tell me. Thanks.

Nested tables with phpdocx

I'm trying to create a nested table with the phpdocx library. Their documentation says it is possible to have a nested table inside a table cell, but it does not clearly explain how to make it work.
I tried the following code:
$valuesTable = array(
// first cell of the first row holds the array meant to become the nested table
array(array(1,2,34),12,13,14),
array(21,22,23,24),
array(31,32,33,34),
);
$params = array(
'border' => 'single',
'tableAlign' => 'center',
'borderWidth' => 10,
'borderColor' => 'B70000',
'textProperties' => array('bold' => true, 'font' => 'Algerian', 'fontSize' => 18),
);
$docx->addTable($valuesTable, $params);
But the cell is just empty. Is there an easy way to get this nested table displayed?
I finally found the solution. It is possible with WordFragments.
$innerData = array(1,2,3,4);
$innerTable = new \WordFragment($docx);
$innerTable->addTable($innerData, array('rawWordML' => true));
$tableParams = array(); // Add here the table params
$outerData = array("A", "B", $innerTable);
$docx->addTable($outerData, $tableParams);

Regarding footer issue in pdf file creation using php and fpdf library

I have successfully created my PDF file in PHP with the FPDF library.
The problem is that my footer takes up too much space.
I want to reduce the space underneath my text. My output looks like this (screenshot not included).
Here is my code:
<?php
require('fpdf/fpdf.php');
class PDF extends FPDF {
function Header() {
$this->SetY(0.208333);
}
function Footer() {
if ($this->footer <> 1)
{
$this->SetY(-15);
}
else
{
echo "bye";
}
}
}
//class instantiation
$pdf=new PDF("l","in",array(8.5,4.17));
$pdf->SetFont('Arial','',8);
$pdf->footer = -15;
//Array1
$datas = array
(
'Address1' => array
(
'Name' => 'Vijaya',
'Area' => 'Valasaravakkam',
'City' => 'Chennai',
),
'Address2' => array
(
'Companyname' => 'Vy Systems',
'Area' => 'Valasaravakkam',
'City' => 'Chennai',
),
'Address3' => array
(
'Companyname' => 'Vy Systems1',
'Area' => 'Valasaravakkam1',
'City' => 'Chennai1',
),
);
//Array2
$datas1 = array
(
'Address4' => array
(
'Name' => 'Jaya',
'Area' => 'Valasaravakkam',
'City' => 'Chennai',
),
);
foreach($datas1 as $address1 => $details1)
{
//pdf_set_text_pos($pdf, 1240, 490);
//$pdf->ln(1);
foreach($datas as $address => $details)
{
$pdf->SetMargins(0,0,0.3);
$pdf->AddPage();
if((is_array($details)) and (is_array($details1)))
{
foreach($details1 as $rows1 => $value1)
{
$pdf->SetX(0.520833);
$pdf->MultiCell(0, 0.2, $value1, 0, "L");
}
$pdf->ln(1.96);
foreach($details as $rows => $value)
{
$pdf->SetX(5);
$pdf->MultiCell(5, 0.2, $value, 0, "L");
}
}
}//end of sub foreach
}//end of main foreach
$pdf->Output();
?>
I didn't follow the code completely, but it seems you're using the Header and Footer methods to set Y and nothing more, expecting that to be enough to correctly position the MultiCells being output outside of the Header and Footer. Maybe so, but the interaction of positioning inside and outside the Header/Footer isn't well defined.
For example, the process may be something like this: Y is calculated for the MultiCell, that trips the footer, the footer changes Y, the MultiCell is output. Is this the original Y, the revised (by the footer Y), or some other value? Absent a precise definition of what happens, you've set up a complex sequence of things that would be very difficult to sort out.
I would suggest vastly simplifying the code. You may find that the automatic header/footer tripping isn't helpful at all. In that case, turn off the auto page break, get rid of the Footer/Header functions, and totally control each page yourself. That way at least you have a clear, reliable model of what's going on.

php logic loop step by 5

I am reading an Excel file with PHP. That part works, but I am stuck on a small piece of logic. I want to build an array that contains multiple other arrays with data.
The data comes from my Excel file. I know which column to start reading from, but not where to stop, because that is dynamic.
My question is: how can I write a loop that reads my columns and starts a new array every 5th column?
What I want is something like this:
(The data from the Excel file is provided in $line[]; each column has its own index.)
array(
'length' => $line[15],
'width' => $line[16],
'price_per' => $line[17],
'price' => $line[18],
'stock' => $line[19]
),
array(
'length' => $line[20],
'width' => $line[21],
'price_per' => $line[22],
'price' => $line[23],
'stock' => $line[24]
),
array(
'length' => $line[25],
'width' => $line[26],
'price_per' => $line[27],
'price' => $line[28],
'stock' => $line[29]
), ....
So how can I make this dynamic (a for loop?) so that I end up with one big indexed array containing multiple associative arrays? Note: my loop should always start from $line[15]!
To begin with, if $line has any elements that you don't want to process (e.g. the first 15 as your example indicates), slice them off with array_slice:
$line = array_slice($line, 15);
Then use array_chunk to split your original array into as many pieces as there are:
$chunks = array_chunk($line, 5);
Then, turn each chunk into its own array by associating each value with the correct key using array_combine:
$results = array();
$keys = array('length', 'width', 'price_per', 'price', 'stock');
foreach ($chunks as $chunk) {
$results[] = array_combine($keys, $chunk);
}
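Put together, the three steps above might look like the sketch below. The sample row is invented (15 columns to skip, then two groups of five), and the length check is an added assumption: array_combine() requires both arrays to have the same number of elements, so a short trailing chunk is skipped rather than combined.

```php
<?php
// Hypothetical row: 15 leading columns to skip, then two groups of five values.
$line = array_merge(array_fill(0, 15, 'skip'), range(1, 10));

$keys = array('length', 'width', 'price_per', 'price', 'stock');
$results = array();
foreach (array_chunk(array_slice($line, 15), count($keys)) as $chunk) {
    // Skip an incomplete trailing group instead of passing it to array_combine().
    if (count($chunk) === count($keys)) {
        $results[] = array_combine($keys, $chunk);
    }
}
print_r($results[1]); // second group: length 6, width 7, price_per 8, price 9, stock 10
```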
for ($i = 15; $i + 4 < count($line); $i += 5)
{
$your_array[] = array(
'length' => $line[$i],
'width' => $line[$i+1],
'price_per' => $line[$i+2],
'price' => $line[$i+3],
'stock' => $line[$i+4]
);
}
The upper bound is the number of elements in $line; checking $i + 4 skips an incomplete group at the end.

making IPTC data searchable

I have a question about IPTC metadata. Is it possible to search images that aren't in a database by their IPTC metadata (keywords) and show them and how would I go about doing this? I just need a basic idea.
I know there is the iptcparse() function for PHP.
I have already written a function to grab the image name, location, and extension for all images within a galleries folder and all subdirectories by .jpg extension.
I need to figure out how to extract the metadata without storing it in a database, how to search through it and grab the images whose IPTC keywords match the search tag, and how to display them. Once I have the final (post-search) results in an array, I know I can echo an img tag with src="$filelocation" for each one.
Basically, I am not sure if I need to store all my images into a mysql database and also extract the keywords and store them in the database as well before I can actually search and display the results. Also, if you could guide me to any gallery that already is able to do this, that could help as well.
Thanks for any help regarding this issue.
It is not clear what in particular is giving you problems, but perhaps this will give you some ideas:
<?php
# Images we're searching
$images = array('/path/to/image.jpg', 'another-image.jpg');
# IPTC keywords to values (from exiv2, see below)
$query = array('Byline' => 'Some Author');
# Perform the search
$result = select_jpgs_by_iptc_fields($images, $query);
# Display the results
foreach ($result as $path) {
echo '<img src="', htmlspecialchars($path), '">';
}
function select_jpgs_by_iptc_fields($jpgs, $query) {
$matches = array();
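foreach ($jpgs as $path) {
$iptc = get_jpg_iptc_metadata($path);
if ($iptc === null)
continue; // no readable IPTC block in this file
foreach ($query as $name => $values) {
if (!is_array($values))
$values = array($values);
$found = isset($iptc[$name]) ? $iptc[$name] : array();
// every requested value must be present in the field
if (count(array_diff($values, $found)) > 0)
continue 2;
}
$matches[] = $path;
}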
return $matches;
}
function get_jpg_iptc_metadata($path) {
$size = getimagesize($path, $info);
if (isset($info['APP13'])) {
$iptc = iptcparse($info['APP13']);
// iptcparse() returns false when the block cannot be parsed
if ($iptc !== false) {
return human_readable_iptc($iptc);
}
}
return null;
}
function human_readable_iptc($iptc) {
# From the exiv2 sources
static $iptc_codes_to_names =
array(
// IPTC.Envelope-->
"1#000" => 'ModelVersion',
"1#005" => 'Destination',
"1#020" => 'FileFormat',
"1#022" => 'FileVersion',
"1#030" => 'ServiceId',
"1#040" => 'EnvelopeNumber',
"1#050" => 'ProductId',
"1#060" => 'EnvelopePriority',
"1#070" => 'DateSent',
"1#080" => 'TimeSent',
"1#090" => 'CharacterSet',
"1#100" => 'UNO',
"1#120" => 'ARMId',
"1#122" => 'ARMVersion',
// <-- IPTC.Envelope
// IPTC.Application2 -->
"2#000" => 'RecordVersion',
"2#003" => 'ObjectType',
"2#004" => 'ObjectAttribute',
"2#005" => 'ObjectName',
"2#007" => 'EditStatus',
"2#008" => 'EditorialUpdate',
"2#010" => 'Urgency',
"2#012" => 'Subject',
"2#015" => 'Category',
"2#020" => 'SuppCategory',
"2#022" => 'FixtureId',
"2#025" => 'Keywords',
"2#026" => 'LocationCode',
"2#027" => 'LocationName',
"2#030" => 'ReleaseDate',
"2#035" => 'ReleaseTime',
"2#037" => 'ExpirationDate',
"2#038" => 'ExpirationTime',
"2#040" => 'SpecialInstructions',
"2#042" => 'ActionAdvised',
"2#045" => 'ReferenceService',
"2#047" => 'ReferenceDate',
"2#050" => 'ReferenceNumber',
"2#055" => 'DateCreated',
"2#060" => 'TimeCreated',
"2#062" => 'DigitizationDate',
"2#063" => 'DigitizationTime',
"2#065" => 'Program',
"2#070" => 'ProgramVersion',
"2#075" => 'ObjectCycle',
"2#080" => 'Byline',
"2#085" => 'BylineTitle',
"2#090" => 'City',
"2#092" => 'SubLocation',
"2#095" => 'ProvinceState',
"2#100" => 'CountryCode',
"2#101" => 'CountryName',
"2#103" => 'TransmissionReference',
"2#105" => 'Headline',
"2#110" => 'Credit',
"2#115" => 'Source',
"2#116" => 'Copyright',
"2#118" => 'Contact',
"2#120" => 'Caption',
"2#122" => 'Writer',
"2#125" => 'RasterizedCaption',
"2#130" => 'ImageType',
"2#131" => 'ImageOrientation',
"2#135" => 'Language',
"2#150" => 'AudioType',
"2#151" => 'AudioRate',
"2#152" => 'AudioResolution',
"2#153" => 'AudioDuration',
"2#154" => 'AudioOutcue',
"2#200" => 'PreviewFormat',
"2#201" => 'PreviewVersion',
"2#202" => 'Preview',
// <--IPTC.Application2
);
$human_readable = array();
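foreach ($iptc as $code => $field_value) {
// fall back to the raw code for tags missing from the table above
$name = isset($iptc_codes_to_names[$code]) ? $iptc_codes_to_names[$code] : $code;
$human_readable[$name] = $field_value;
}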
return $human_readable;
}
If you haven't already extracted the IPTC data from your images, then each time someone searches you'll have to:
loop over every image
for each image, extract the IPTC data
check whether the IPTC data for the current image matches
If you have more than a couple of images, this will be really bad for performance, I'd say.
So, in my opinion, it would be far better to:
add a couple of fields to your database
extract the relevant IPTC data when the image is uploaded / stored
store the IPTC data in those DB fields
search in those DB fields
Or use a search engine like Lucene or Sphinx, but that is another problem.
It'll mean a bit more work for you right now, since you have more code to write...
... but it also means your website will have a better chance of surviving when there are many images and many users doing searches.
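The extract-once, search-many idea can be sketched without committing to a particular database. The example below assumes the keywords have already been pulled out of each file (for instance with iptcparse() at upload time) and builds a keyword-to-paths lookup. The paths, keywords and function names are all hypothetical. In a real site the lookup would live in a DB table or a cache file rather than in a PHP array.

```php
<?php
// Hypothetical input: image path => IPTC keywords already extracted.
$keywordsByImage = array(
    'galleries/a.jpg' => array('Beach', 'Sunset'),
    'galleries/b.jpg' => array('sunset', 'City'),
    'galleries/c.jpg' => array('Forest'),
);

// Build keyword => list of paths once, instead of re-reading every file per search.
function build_keyword_index(array $keywordsByImage)
{
    $index = array();
    foreach ($keywordsByImage as $path => $keywords) {
        foreach ($keywords as $keyword) {
            // Lower-case the key so searches are case-insensitive.
            $index[strtolower($keyword)][] = $path;
        }
    }
    return $index;
}

// Look up one keyword; returns an empty array when nothing matches.
function search_index(array $index, $keyword)
{
    $key = strtolower($keyword);
    return isset($index[$key]) ? $index[$key] : array();
}

$index = build_keyword_index($keywordsByImage);
print_r(search_index($index, 'Sunset')); // galleries/a.jpg and galleries/b.jpg
```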
