PHPWord: Creating an Arabic right to left word document - php

I'm trying to use PHPWord to create a Word document that includes dynamic data pulled from a MySQL database. The database uses the MySQL charset UTF-8 Unicode (utf8) and the connection collation utf8_unicode_ci, and so do the table fields.
Data is stored and previewed fine in HTML, however when creating the document with the arabic variables, the output in Word looks like أحÙد Ùبار٠اÙÙرÙ.
$PHPWord = new PHPWord();
$document = $PHPWord->loadTemplate('templates/.../wtvr.docx');
$document->setValue('name', $name);
$document->setValue('overall_percent_100', $overall_percent_100);
$document->save('Individual Report - ' . $name . '.docx');
Is there any way to fix that?

Well, yes. But you must unfortunately modify the library. The author of the library uses utf8_encode/utf8_decode obviously without understanding what they do at all.
On line 150, of Shared/String.php:
Replace
public static function IsUTF8($value = '') {
return utf8_encode(utf8_decode($value)) === $value;
}
With
public static function IsUTF8($value = '') {
return mb_check_encoding($value, "UTF-8");
}
Then, if you do
$ grep -rn "utf8_encode" .
On the project root, you will find all lines where utf8_encode is used. You will see lines like
$linkSrc = utf8_encode($linkSrc); //$linkSrc = $linkSrc;
$givenText = utf8_encode($text); //$givenText = $text;
You can simply remove the utf8_encode as shown in the comments.
Why is utf8_encode/utf8_decode wrong? First of all, because that's not what they do. They do from_iso88591_to_utf8 and from_utf8_to_iso88591. Secondly, ISO-8859-1 is almost never used, and usually when someone claims they use it, they are actually using Windows-1252. ISO-8859-1 is a very tiny character set, not even capable of encoding €, let alone Arabic letters.
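To make the double-encoding concrete, here is a minimal sketch. It assumes the mbstring extension is loaded and that the source file is saved as UTF-8. Because utf8_encode() is deprecated as of PHP 8.2, the sketch reproduces its ISO-8859-1 to UTF-8 conversion with mb_convert_encoding() instead of calling it directly. The sample name is only an illustration.

```php
<?php
// Assumption: mbstring is available and this file is saved as UTF-8.
$name = "أحمد"; // 4 Arabic letters, 2 bytes each in UTF-8

// The conversion utf8_encode() performs: every input byte is treated
// as one ISO-8859-1 character and re-encoded as UTF-8.
$doubleEncoded = mb_convert_encoding($name, 'UTF-8', 'ISO-8859-1');

var_dump(mb_check_encoding($name, 'UTF-8')); // the original is already valid UTF-8
var_dump($doubleEncoded === $name);          // false: the text has been altered
var_dump(strlen($name), strlen($doubleEncoded)); // 8 bytes before, 16 after
```

The byte count doubles because each of the 8 original bytes is above 0x7F and is therefore re-encoded as a 2-byte sequence, which is what shows up as mojibake in Word.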
You can do fast reviews of a library by doing:
$ grep -rn "utf8_\(en\|de\)code" .
If you get matches, you should move on and look for some other library. These functions simply do the wrong thing every time, and even if someone needed some edge case to use these functions, it's far better to be explicit about it when you really need ISO-8859-1, because you normally never do.

The following steps let you insert any kind of UTF-8 right-to-left text into a PHPWord template.
In the setValue function (line 95) of Template.php, comment out the following code:
//if(!is_array($replace)) {
// $replace = utf8_encode($replace);
//}
If right-to-left text gets mixed up with left-to-right text, as happens in some languages, add the following line to the same setValue function:
$replace = "<w:rPr><w:rtl/></w:rPr>".$replace;
// Working example of writing data into a Word template
// Load the PHPWord library
$this->load->library("phpword/PHPWord");
$PHPWord = new PHPWord();
$document = $PHPWord->loadTemplate('./forms/data.docx');
$document->setValue('NAME', 'شراف الدين');
$document->setValue('SURNAME', 'مشرف');
$document->setValue('FNAME', 'ظهرالدين');
$document->setValue('MYVALUE', '15 / سنبله / 1363');
$document->setValue('PROVINCE', 'سمنگان');
$document->setValue('DNAME', 'عبدالله');
$document->setValue('DMOBILE', '0775060701');
$document->setValue('BOX','<w:sym w:font="Wingdings" w:char="F06F"/>');
$document->setValue('NO','<w:sym w:font="Wingdings" w:char="F06F"/>');
//$document->setValue('BOX2','<w:sectPr w:rsidR="00000000"><w:pgSz w:w="12240" w:h="15840"/><w:pgMar w:top="1440" w:right="1440" w:bottom="1440" w:left="1440" w:header="720" w:footer="720" w:gutter="0"/><w:cols w:space="720"/><w:docGrid w:linePitch="360"/>');
$document->setValue('YES','<w:sym w:font="Wingdings" w:char="F0FE"/>');
$document->setValue('CLASS1','<w:sym w:font="Wingdings" w:char="F06F"/>');
$document->setValue('CLASS2','<w:sym w:font="Wingdings" w:char="F0FE"/>');
$document->setValue('DNAME','يما شاه رخي');
$document->setValue('TEL','0799852369');
$document->setValue('ENTITY','مشاور حقوقي و نهادي');
$document->setValue('ENTITY','مشاور حقوقي و نهادي');
$document->setValue('REMARKS','در مسابقات سال 2012 میلادی در میدان Judo بر علاوه به تعداد 39 نفر در تاریخ 4/میزان/ سال 1391 قرار ذیل اند.');
$file = "./forms/data2.docx";
$document->save($file);
header("Cache-Control: public");
header("Content-Description: File Transfer");
header("Content-Disposition: attachment; filename=data2.docx");
header("Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document");
header("Content-Transfer-Encoding: binary");
ob_clean();
flush();
readfile($file);

Find
$objWriter->startElement('w:t');
$objWriter->writeAttribute('xml:space', 'preserve'); // needed because of drawing spaces before and after text
$objWriter->writeRaw($strText);
$objWriter->endElement();
in Writer/Word2007/Base.php and replace it with
$objWriter->startElement('w:textDirection');
$objWriter->writeAttribute('w:val', 'rlTb');
$objWriter->startElement('w:t');
$objWriter->writeAttribute('xml:space', 'preserve'); // needed because of drawing spaces before and after text
$objWriter->writeRaw($strText);
$objWriter->endElement();
$objWriter->endElement();
Also, this only works if you don't use any styles; otherwise you will have to repeat this step in every function you use.

I had to fix it in two places, differently from Naser's approach:
1. In the addText function of Section.php, I did this:
//$givenText = utf8_encode($text);
$givenText = $text;
2. In the addText function of cell.php, I did this:
// $text = utf8_encode($text);
Now the Word file displays Unicode characters correctly.
I then had a problem with text direction, which I solved with this code:
$section->addText($val['notetitle'],array('textDirection'=>PHPWord_Style_Cell::TEXT_DIR_TBRL));
You can see the two constants in cell.php:
const TEXT_DIR_TBRL = 'tbRl';
const TEXT_DIR_BTLR = 'btLr';
Note that you cannot put other array styles, such as Paragraph, before 'textDirection', because those styles disable 'textDirection'.

Open PHPWord\Template.php.
In the setValue function (line 89), change
$replace = utf8_encode($replace);
to
$replace = $replace;

Related

Cakephp response cannot read UTF-8 file name

I want to download a file after a login check, so I wrote a function in my controller like this:
// Function to check login and download News PDF file
public function download(){
if($this->Auth->user()){
// Get the news file path from newsId
$pNewsObj = ClassRegistry::init('PublicNews');
$news = $pNewsObj->findById($newsId);
$filePath = ROOT.DS.APP_DIR.DS.'webroot/upload_news'.DS.$news['PublicNews']['reference'];
// Check if file exists
if(!file_exists($filePath)){
return $this->redirect('/404/index.php');
}
$this->response->charset('UTF-8');
//$this->response->type('pdf');
$this->response->file('webroot/upload_news'.DS.$news['PublicNews']['reference'], array('download' => true, 'name' => $news['PublicNews']['reference']));
//$this->response->download($news['PublicNews']['reference']);
return $this->response;
}else{
return $this->redirect(array('controller'=> 'users', 'action' => 'login'));
}
}
Everything works as required, with one exception.
PROBLEM: when the file name contains non-ASCII UTF-8 characters, e.g. テスト.pdf ("Test.pdf" in Japanese), CakePHP throws an error.
For English file names it works perfectly, but my client wants the file name to be the same as the uploaded one, so I can't change the file name to English.
If you want to know the character encoding, you can use the mb_detect_encoding() function, provided the input text is long enough for the encoding to be detected.
But I am guessing your client uploads files with SJIS names. Most Japanese people use SJIS, because Windows adopted it for the Japanese language.
I tested your code in my local environment. Cake's File class does not seem to handle SJIS correctly, so you cannot use Response::file(). I wrote alternative code instead.
public function download(){
if($this->Auth->user()){
// Get the news file path from newsId
$pNewsObj = ClassRegistry::init('PublicNews');
$news = $pNewsObj->findById($newsId);
if (!$news) {
throw new NotFoundException();
}
$fileName = mb_convert_encoding($news['PublicNews']['reference'], 'SJIS-win', 'UTF8');
// Directory traversal protection
if (strpos($fileName, '..') !== false) {
throw new ForbiddenException();
}
$filePath = WWW_ROOT . 'upload_news' . DS . $fileName;
if (!is_readable($filePath)) {
throw new NotFoundException();
}
if (function_exists('mime_content_type')) {
$type = mime_content_type($filePath);
$this->response->type( $type );
} else {
// TODO: If Finfo extension is not loaded, you need to detect content type here;
}
$this->response->download( $fileName );
$this->response->body( file_get_contents($filePath) );
return $this->response;
}else{
return $this->redirect(array('controller'=> 'users', 'action' => 'login'));
}
}
However, I recommend converting SJIS to UTF-8 before saving it to your database and your disk. SJIS characters are difficult to handle without enough knowledge of the encoding, because the second byte of an SJIS character may be an ASCII character. The backslash (\) is the most dangerous. For example, 表 (955C) contains a backslash (5C = backslash). Note that I am not talking about rare cases: 表 means table or appearance in Japanese, 十 also contains a backslash and means 10, and 能 also contains a backslash and means skill.
Unlike with UTF-8 byte sequences, almost all string functions work incorrectly on SJIS characters. explode() can break an SJIS byte sequence, and strpos() can return a wrong result.
Does your client connect to your server directly over FTP or SCP? If not, it would be better to convert SJIS to UTF-8 before saving, and convert UTF-8 back to SJIS before returning data to your client.
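The backslash problem can be reproduced in a few lines. This is only a sketch, assuming the mbstring extension is loaded and the source file is saved as UTF-8; it uses the same 'SJIS-win' encoding name as the code above.

```php
<?php
// Assumption: mbstring is available and this file is saved as UTF-8.
$utf8 = '表';
$sjis = mb_convert_encoding($utf8, 'SJIS-win', 'UTF-8');

// The SJIS form is the two bytes 0x95 0x5C; 0x5C is the ASCII backslash.
echo bin2hex($sjis), "\n";

// A byte-wise search finds a "backslash" inside the SJIS character...
var_dump(strpos($sjis, '\\')); // int(1)

// ...but not inside the UTF-8 form, whose bytes are all above 0x7F.
var_dump(strpos($utf8, '\\')); // bool(false)
```

Any code that splits or escapes on backslashes will therefore cut SJIS characters like this one in half, which is why converting to UTF-8 at the boundary is the safer design.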
If you like, you can rename the file before uploading it so that this error does not occur at download time.
public function change_file_name($fileName= '') {
$ext = pathinfo($fileName, PATHINFO_EXTENSION);
$fileName = 'file_'.time().".".$ext;
$exFileName = strtolower(substr($fileName,strrpos($fileName,".") + 1));
$sampleFileName = str_replace('.'.$exFileName,'', $fileName);
$name = Sanitize::paranoid($sampleFileName,array('_'));
$fileRename = $name.'.'.$exFileName;
return $fileRename;
}
Call this function before uploading the file
$return_file_name = $this->change_file_name($file_name);
if($this->moveUploadedFile($tmp_name,WEBSITE_PROFILE_ROOT_PATH.$return_file_name)){
$saveData['profile_image'] = $return_file_name;
}
I know this is not a proper answer for your case. To handle existing files, you could write a similar function that fetches the data from the database, automatically renames all your saved files, and updates the names in the database.
Some more information about your client's specifications would help greatly, but Tom Scott found base64 to be the simplest method of making Unicode characters work correctly in PHP.
Depending on how crucial the preservation of filenames in storage is, a solution could be to encode the filenames in base64 when files are uploaded, and reverse the encoding on download. You can then know that you are dealing with ASCII, which should be much more likely to work correctly.
You may need to replace / characters with %2F to make it work.
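A rough sketch of that idea follows. The helper names encodeFileName and decodeFileName are hypothetical, not part of CakePHP or PHP. Instead of replacing / with %2F, this sketch swaps + and / for - and _ (the URL-safe base64 alphabet), which is a substitution of mine rather than what the answer describes.

```php
<?php
// Hypothetical helpers; the names are illustrative only.
function encodeFileName(string $name): string
{
    // base64 can emit "+" and "/"; swap them for "-" and "_" so the
    // result can be used as a single path segment.
    return strtr(base64_encode($name), '+/', '-_');
}

function decodeFileName(string $stored): string
{
    // Reverse the character swap, then decode in strict mode.
    $decoded = base64_decode(strtr($stored, '-_', '+/'), true);
    if ($decoded === false) {
        throw new InvalidArgumentException('Not a valid encoded file name');
    }
    return $decoded;
}

$stored   = encodeFileName('テスト.pdf'); // ASCII-only name for storage
$original = decodeFileName($stored);       // original name for the download
```

One trade-off to keep in mind: base64 makes names about a third longer, so very long original names may exceed file system limits.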
Hope this helps,
Issa Chanzi

Writing to a file adds weird content at the end of the line

I am working on a program that parses text files uploaded by a user and then saves the parsed XML file on the server. However, when I write the XML file I get the text &#13; at the end of each line. This text is not in my original text file. I didn't even notice it until I opened the new XML file to verify that it was writing all of the content. Has anyone run into this before, and if so, can you tell me whether it's due to the way I'm creating and writing my file?
fileUpload.php - These lines run when the user uploads the file.
$fileName = basename($_FILES['fileaddress']['name']);
$fileContents = file_get_contents($_FILES['fileaddress']['tmp_name']);
$xml = $parser->parseUnformattedText($fileContents);
$parsedFileName = pathinfo($fileName, PATHINFO_FILENAME) . ".xml";
file_put_contents($parsedFileName, $xml);
parser.php
function parseUnformattedText($inputText, $bookName = "")
{
//create book, clause, text nodes
$book = new SimpleXmlElement("<book></book>");
$book->addAttribute("bookName", $bookName);
$conj = $book->addChild("conj", "X");
$clause = $book->addChild("clause");
$trimmedText = $this->trimNewLines($inputText);
$trimmedText = $this->trimSpaces($inputText);
$text = $clause->addChild("text", $trimmedText);
$this->addChapterVerse($text, "", "");
//make list of pconj's for beginning of file
$pconjs = $this->getPconjList();
//convert the xml to string
$xml = $book->asXml();
//combine the list of pconj's and xml string
$xml = "$pconjs\n$xml";
return $xml;
}
Input text file
1:1 X
it seemed good to me also,
X
having had perfect understanding of all things from the very first
to write you an orderly account, [most] excellent Theophilius
and
1:4
that
you may know the certainty of those things in which you were instructed
1:5 X
There was in the days of Herod, the king of Judea and a certain priest named Zacharias
X
his wife[was] of the daughters of Aaron
and
her name [was] Elizabeth.
1:8 So
it was,
that
while he was serving as priest 1:9 before God in the order of his division,
1:10 and
the whole multitude of the people was praying outside at the hour of incense
but
therefore
it was done.
Going off of Seroczynski's answer, I was able to create a function that removes any carriage returns from the text. The XML output looked fine after that. Here's the function I used to fix the issue:
function trimCarriageReturns($text)
{
$textOut = str_replace("\r", "\n", $text);
$textOut = str_replace("\n\n", "\n", $textOut);
return $textOut;
}
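A slightly different variant, shown here only as a sketch, normalizes Windows (\r\n) and old Mac (\r) line endings to \n without collapsing blank lines. The function name normalizeLineEndings is hypothetical; unlike the function above, it leaves intentional empty lines in the input untouched.

```php
<?php
// Hypothetical helper; converts every line ending to "\n".
function normalizeLineEndings(string $text): string
{
    // Order matters: replace "\r\n" first so it becomes one "\n", not two.
    return str_replace(["\r\n", "\r"], "\n", $text);
}

$clean = normalizeLineEndings("1:1 X\r\nit seemed good to me also,\r\n");
```

Running the uploaded text through such a function before passing it to parseUnformattedText() should keep carriage returns out of the XML.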
&#13; is the character reference for a carriage return (\r), which doesn't seem to come out correctly from parseUnformattedText().
Try $xml = nl2br($parser->parseUnformattedText($fileContents));

Giving images random name in PHP

This is how I load an image using PHP:
header("Content-type: image/jpeg");
$image=imagecreatefromjpeg("http://i.imgur.com/zWaQJNCb.jpg");
imagejpeg($image);
The problem is that if I save the image manually from my web browser to my desktop, all the images get the same name, like ix.jpeg (the file name here is ix.php). I can't work out how to configure the header so that the images get random names, like 25xc.jpeg or 36s5a2f.jpeg, when saved to the desktop. Any ideas?
This will do the job!
$dt = time();
header("Content-type: image/jpeg");
header('Content-Disposition: inline; filename="'. $dt .'.jpg"');
$image=imagecreatefromjpeg("http://i.imgur.com/knNxDFnb.jpg");
imagejpeg($image);
Use the filename value in your header call. See Example 1.
<?php
// We'll be outputting a PDF
header('Content-type: application/pdf');
// It will be called downloaded.pdf
header('Content-Disposition: attachment; filename="downloaded.pdf"');
// The PDF source is in original.pdf
readfile('original.pdf');
?>
I'll leave the "generate a random string" part up to you. I'd suggest basing it on the checksum of the file, but that's just me.
If you would like to generate unique names for your users, all you need to do is use the current timestamp in the filename parameter of the header, as below (with random generation there will be a very few cases where you get the same name twice):
$dt = time();
header("Content-type: image/jpeg");
header('Content-Disposition: attachment; filename="'. $dt .'.jpg"');
The above will generate unique image names like this:
1362504465.jpg
I hope this helps.
There are dozens if not hundreds of ways to do that.
Append time() after file name. This way all the file names will
have a (unique)number.
Use a hashing function, like md5().
Use this code-
function generateRandomName($len) {
    $out = '';
    for ($i = 0; $i < $len; $i++) {
        // Note: codes 65-122 also cover the punctuation characters 91-96 ([ \ ] ^ _ `)
        $out .= chr(rand(65, 122));
    }
    return $out;
}
Simply append rand function after your file name (But then the file names won't be of equal length).
Google is your friend.
This would make an 8-character name of letters a-z, but you could do some magic and/or use http://www.asciitable.com/
function randChar() {
    // A random code from 97 to 122 gives a-z; alternatively, pick from an array of all the characters you would like in the filename.
    return chr(rand(97, 122));
}
$filename = '';
$amountOfChars = 8;
for ($i = 0; $i < $amountOfChars; $i++) $filename .= randChar();
Hope this helps
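As another option, here is a sketch that builds the name from random_bytes() (available since PHP 7) so the result contains only hexadecimal characters. The function name randomImageName and the 8-byte default are assumptions chosen for illustration.

```php
<?php
// Hypothetical helper; returns a name such as "<16 hex chars>.jpg".
function randomImageName(int $bytes = 8): string
{
    // bin2hex() turns each random byte into two characters from 0-9a-f,
    // so the name never contains characters that are unsafe in file names.
    return bin2hex(random_bytes($bytes)) . '.jpg';
}

$name = randomImageName();
// Header line to send before the image data, e.g. via header($headerLine).
$headerLine = 'Content-Disposition: inline; filename="' . $name . '"';
```

Compared with a timestamp, two requests in the same second are very unlikely to get the same name, and the name reveals nothing about when the image was served.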

writing special characters to txt file

I am reading some data from a remote file and have everything working up to the point where I write some specific lines to a text file.
The problem is that when I write something like Girl's Goldtone 'X' CZ Ring, it becomes Girl&apos;s Goldtone &apos;X&apos; CZ Ring in the txt file.
How do I write to the txt file so that it retains the text as written above and shows the actual characters rather than character codes?
Sample of my code:
$content_to_write = '<li class="category-top"><span class="top-span"><a class="category-top" href="'.$linktext.'.html">'.$productName.'</a></span></li>'."\r\n";
fwrite($fp, $content_to_write);
$linktext = "Girls-Goldtone-X-CZ-Ring";
$productName = "Girl's Goldtone 'X' CZ Ring";
var_dump
string '<li class="category-top"><span class="top-span"><a class="category-top" href="Stellar-Steed-Gallery-wrapped-Canvas-Art.html">&apos;Stellar Steed&apos; Gallery-wrapped Canvas Art</a></span></li>
' (length=195)
Code
$productName =$linktext;
$linktext = str_replace(" ", "-", $linktext);
$delChar = substr($linktext, -1);
if($delChar == '.')
{
$linktext = substr($linktext, 0, -1);
}
$linktext = removeRepeated($linktext);
$linktext = remove_invalid_char($linktext);
$productName = html_entity_decode($productName);
$content_to_write = '<li class="category-top"><span class="top-span"><a class="category-top" href="'.$linktext.'.html">'.$productName.'</a></span></li>'."\r\n";
var_dump($content_to_write);
fwrite($fp, utf8_encode($content_to_write));
Are you reading the data from a remote file and then writing the same data to a txt file? I agree with the above comment: it's an issue with encoding. Try the following code:
$file = file_get_contents("messages.txt");
$file = mb_convert_encoding($file, 'HTML-ENTITIES', "UTF-8");
echo $file;
Echo the response to your browser and check it. If it looks right, write the response to your txt file. Make sure your txt file is UTF-8 encoded.
Check this out:: Write Special characters in a file.
fwrite is binary-safe, meaning it doesn't do any encoding stuff but just writes whatever you feed it directly to the file. It looks like the $productName variable you're writing is already entity-encoded before writing. Try running html_entity_decode over the variable first.
Note that html_entity_decode doesn't touch single quotes (&apos;) by default; you'll have to set the ENT_QUOTES flag in the second parameter. You might also want to explicitly specify an encoding in the third parameter.
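A minimal sketch of that call, using the product name from the question. The ENT_HTML5 flag is an addition of mine: &apos; is not a named entity in HTML 4.01, so a document-type flag such as ENT_HTML5 (or ENT_XML1 / ENT_XHTML) is needed alongside ENT_QUOTES for it to be decoded.

```php
<?php
// The entity-encoded value as it arrives from the remote file.
$productName = 'Girl&apos;s Goldtone &apos;X&apos; CZ Ring';

// ENT_QUOTES decodes both double and single quotes; ENT_HTML5 makes
// the &apos; named entity known; the third argument fixes the charset.
$decoded = html_entity_decode($productName, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $decoded, "\n"; // Girl's Goldtone 'X' CZ Ring
```

The decoded string can then be passed to fwrite() as-is, since fwrite() writes the bytes it is given without any further encoding.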

exporting php output as excel

include_once 'mysqlconn.php';
include_once "functions.php";
$filename = $_GET['par'].".xls";
header("Content-type: application/x-msexcel");
header('Content-Disposition: attachment; filename="'.basename($filename).'"');
if ($_GET['i'] == "par1") {
func1();
} else if ($_GET['i'] == "par2") {
echo "şşşıııİİİ";
func2();
} else if ($_GET['i'] == "par3") {
echo "şşşıııİİİ";
func3();
}
This is my export2excel.php file. func1, func2, and func3 are in functions.php and produce table output. Everything works well except the character encoding, which behaves strangely. I am using UTF-8 encoding for all my files. The second else if statement above produces correctly encoded output, but the other two encode my output with strange characters like "BÃœTÇE İÇİ"; it should be "BÜTÇE İÇİ" in Turkish.
In short: same files, same encoding, same database, but different results.
Any ideas?
Excel uses UTF-16LE + BOM as default Unicode encoding.
So you have to convert your output to UTF-16LE and prepend the UTF-16LE-BOM "\xFF\xFE".
Some further information:
Microsoft Excel mangles Diacritics in .csv files?
Exporting data to CSV and Excel in your Rails apps
Instead I would use one of the existing libraries
PHP Excel Extension PECL extension by Ilia Alshanetsky (Core PHP Developer & Release Master)
Spreadsheet_Excel_Writer PEAR Package
PHPExcel
Edit:
Some code that could help if you really do not want to use an existing library:
<?php
$output = <<<EOT
<table>
<tr>
<td>Foo</td>
<td>IñtërnâtiônàlizætiøöäÄn</td>
</tr>
<tr>
<td>Bar</td>
<td>Перевод русского текста в транслит</td>
</tr>
</table>
EOT;
// Convert to UTF-16LE
$output = mb_convert_encoding($output, 'UTF-16LE', 'UTF-8');
// Prepend BOM
$output = "\xFF\xFE" . $output;
header('Pragma: public');
header("Content-type: application/x-msexcel");
header('Content-Disposition: attachment; filename="utf8_bom.xls"');
echo $output;
If anyone is trying to use the excel_writer in Moodle and is getting encoding issues with the output (say, you're developing a report that has a URL as data in a field), a simple fix is to wrap the data in quotes so that it at least opens in Excel. Here's my example:
// Moodles using the PEAR excel_writer export
$table->setup();
$ex=new table_excel_export_format($table);
$ex->start_document( {string} );
$ex->start_table( {string} );
// heading on the spreadsheet
$title = array('Report Title'=>'Report 1');
$ex->add_data($title);
// end heading
$ex->output_headers( array_keys($table->columns) );
foreach($data as $row){
$string="'".trim($row->resname,"'")."'";
$row->resname=$string;
$ex->add_data( $table->get_row_from_keyed($row) );
}
$ex->finish_table();
$ex->finish_document();
Excel uses UTF-16LE as the default encoding. So you should either convert UTF-8 to UTF-16LE yourself or use one of the tried and tested Excel PHP libs instead of trying to reinvent the wheel. I would recommend using PHPExcel...
