PHP TCPDF - Half-width kana is being considered as Full-width - php

I'm having a problem with rendering a string of text that contains half-width kana in a PDF. It considers the half-width kana to be full-width so it turns out something like this:
This is my code snippet:
PDF::Cell(15, 6, '商品コード', 1, 0, 'C', 0, '', 0);
I'm also using the cid0jp font provided in TCPDF to display Japanese characters:
PDF::SetFont('cid0jp', 'B', 9);
In the end, I want it to maintain the half-width katakana to fit the cell and remove unnecessary spaces.
TCPDF Library used: https://tcpdf.org/

When you use the cid0jp font you're leaving the font rendering up to the PDF reader which can introduce differences in rendering between different readers and operating systems. The spacing differences can be pretty major, but I'm not sure if that's an issue with TCPDF's implementation or just a consequence of relying on the reader to provide the font.
Below, I've included an example comparing Microsoft Edge and Foxit Reader rendering of that text in cid0jp. I also included the full-width versions on the second line. Edge came a little closer on the spacing for half-widths than Foxit. Google Drive's PDF preview did the same thing as Foxit did with the additional spacing around the half-widths.
Since the space you're working with there is so tight, it might be worth embedding a specific font into the document. In my tests that was a lot more reliable as far as rendering went. (I've also included screenshots of that test below. Make sure subsetting is on if you don't want the entire font included in each file.)
Just in case you might not know how to do that:
$embfont = TCPDF_FONTS::addTTFfont('/Path/to/font.ttf', 'TrueTypeUnicode', '', 32);
$pdf->setFont($embfont, '', '9');
$pdf->Cell(15,6,'商品コード',1,0,'C',0,'',0);
Examples with cid0jp:
Examples with embedded font:
(Admittedly, this font isn't very good at small sizes.)
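Before comparing readers, it may be worth confirming that the string passed to Cell() really contains half-width kana, since the two forms are easy to mix up in an editor. The sketch below is only an illustration: hasHalfwidthKana() is a hypothetical helper, not part of TCPDF, and it assumes the mbstring extension is available for mb_strwidth().

```php
<?php
// Hypothetical helper (not part of TCPDF): true if $text contains any
// code point in the half-width katakana range U+FF61..U+FF9F.
function hasHalfwidthKana(string $text): bool
{
    return preg_match('/[\x{FF61}-\x{FF9F}]/u', $text) === 1;
}

// Written with \u{...} escapes so the two forms cannot be confused.
$halfwidth = "商品\u{FF7A}\u{FF70}\u{FF84}\u{FF9E}"; // kanji + half-width kana
$fullwidth = "商品\u{30B3}\u{30FC}\u{30C9}";          // kanji + full-width kana

var_dump(hasHalfwidthKana($halfwidth));
var_dump(hasHalfwidthKana($fullwidth));

// mb_strwidth() counts half-width characters as 1 column and full-width
// characters as 2, which gives a rough idea of how much room each needs.
var_dump(mb_strwidth($halfwidth, 'UTF-8'));
var_dump(mb_strwidth($fullwidth, 'UTF-8'));
```

If the first check returns false for your real input, the text was already full-width before it reached the PDF library, and no font change will narrow it.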

Related

How to export emoji to PDF document using PHP?

I am trying to export to PDF using the FPDF and TCPDF PHP libraries. I found that emojis like 😁 😀 💃🏻 ❤️ 🥳 were not converted; only rectangular boxes appear in the generated PDF. I also tried tFPDF.
$text = "There is my text 😁 , 😀 and emojis 💃🏻 ❤️ 🥳";
require('tfpdf/tfpdf.php');
$pdf = new tFPDF();
$pdf->AddPage();
//Add a Unicode font (uses UTF-8)
$pdf->AddFont('Segoe UI Symbol','','seguisym.ttf',true); // DejaVuSans.ttf
$pdf->SetFont('Segoe UI Symbol','',12);
$pdf->Write(8,$text);
$pdf->Output();
I also tried different fonts, but that didn't work for me. Can anyone help me with this?
Sadly, neither FPDF, TCPDF, nor tFPDF can print those characters. The issue is that these characters are not part of the BMP (Basic Multilingual Plane): their code points are above 65535, so in UTF-16 they are expressed as surrogate pairs and behave like two characters (which is why one emoticon is printed as two rectangular boxes, not one). However, all of the mentioned PDF libraries, as well as the TFontFile class that reads TTF files, rely on the code point being <= 65535.
You would also need to add a TTF file containing the complete Unicode character set, or at least the emoticons. Most fonts do not have them. This raises another issue for the PDF library, which would probably need to support a fallback font to be used when a code point is not found in the main font (for example, you want to print text in Gotham, but since that does not include emoji, another font is used for them). By the way, the emoji font "Noto Color Emoji", for example, has a 23 MB TTF file, so the output gets big easily.
Anyway, all of the above can be added to the PDF libraries, but it will require some effort. I am planning to do it for my own needs sometime. I think it will take roughly one man-day.
Alternatively, you might try something more robust like mPDF, but that library is huge and slow, and it requires a complete rewrite of your FPDF code. I also can't guarantee that it can print emojis.
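To see which characters in a string fall outside the BMP, and to replace them until the library can handle them, something like the following could be used. It is a minimal sketch: both function names are hypothetical, and the replacement character is an arbitrary choice.

```php
<?php
// Hypothetical helper: true if $text contains a code point above U+FFFF,
// i.e. one that needs a surrogate pair in UTF-16.
function containsNonBmp(string $text): bool
{
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $text) === 1;
}

// Hypothetical helper: replaces every code point above U+FFFF with
// $placeholder so the remaining text stays within the BMP.
function replaceNonBmp(string $text, string $placeholder = '?'): string
{
    return preg_replace('/[\x{10000}-\x{10FFFF}]/u', $placeholder, $text);
}

$text = "There is my text \u{1F601} and a heart \u{2764}";

var_dump(containsNonBmp($text));
echo replaceNonBmp($text), PHP_EOL;
```

Note that the heart (U+2764) is inside the BMP, so it is left alone here; whether it prints still depends on the font containing that glyph.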

tcpdf truetype unicode font install

I cannot convert this font (TrueType Unicode) for use in tcpdf. I'm using UTF-8 encoding
https://dl.dropboxusercontent.com/u/14964998/chamberssansoffc.ttf
https://dl.dropboxusercontent.com/u/14964998/chamberssansoffcb.ttf
https://dl.dropboxusercontent.com/u/14964998/chamberssansoffcbi.ttf
https://dl.dropboxusercontent.com/u/14964998/chamberssansoffci.ttf
This font works fine on my website: it renders Chinese, Polish, German, and Romanian characters.
I would be grateful for help.
To display all UTF-8 characters in TCPDF, I prefer to use the freeserif font. You can see a good example in TCPDF's example number 8.
Try this
$pdf->SetFont('freeserif', '', 12);
If you don't want to use freeserif, you can convert your own font using one of the TCPDF examples. Open the TCPDF examples folder, open example_001, and paste the line below anywhere after line 31. Make sure the path points to your font file, then run the script. Afterwards, look in the fonts folder: it will contain all the generated files you need.
$fontname = $pdf->addTTFfont('/wamp/www/tcpdf/arial/arialuni.ttf', 'TrueTypeUnicode', '', 32);
As far as I know you can't use a TTF directly, but you can search for "font converter tcpdf" or use this tool: http://fonts.snm-portal.com/. It helped me.
It looks like you are trying to load the fonts remotely. That won't work with TCPDF because it needs to write to the font directory in order to create the pseudo-files for the PDF. Try adding them locally:
$fontname = $pdf->addTTFfont('/path-to-font/chamberssansoffc.ttf', 'TrueTypeUnicode', '', 32);
Also, you have to make sure that the font folder is writable on the server to allow TCPDF to create the pseudo-files.
Hope this helps!
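The two conditions mentioned above (a local font file and a writable font folder) can be checked before calling addTTFfont(). The following is only a sketch: fontInstallProblems() is a hypothetical helper, not a TCPDF function, and the example URL is a placeholder.

```php
<?php
// Hypothetical pre-flight check (not a TCPDF function): returns a list of
// reasons why converting $ttfPath into $fontDir is likely to fail.
function fontInstallProblems(string $ttfPath, string $fontDir): array
{
    $problems = [];

    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $ttfPath) === 1) {
        // Remote paths such as http://... should be downloaded first.
        $problems[] = 'font path is a URL; save the file locally first';
    } elseif (!is_file($ttfPath) || !is_readable($ttfPath)) {
        $problems[] = 'font file is missing or unreadable';
    }

    if (!is_dir($fontDir) || !is_writable($fontDir)) {
        $problems[] = 'font directory is missing or not writable';
    }

    return $problems;
}

// Placeholder URL and the system temp directory, for illustration only.
print_r(fontInstallProblems('https://example.com/font.ttf', sys_get_temp_dir()));
```

In a real script the second argument would be TCPDF's font folder (the K_PATH_FONTS constant), and addTTFfont() would only be called when the returned list is empty.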

can't display ¥ with TCPDF, but other kanji are okay

I'm using TCPDF to create PDFs that include Japanese characters. Using the TrueType font ArialUni, most characters are displayed correctly, except the yen symbol shows up as a square box instead of ¥.
Here's a snippet of the resulting PDF using ArialUni:
So I tried another font. Here's the same section of the resulting PDF using GT200001:
And here's the same section using Helvetica:
Here's the same section using GNU's FreeSans:
I would like that second line to show up as "(渋谷猿, ¥8,000)"
I'm not surprised that Helvetica and Freesans cannot render the kanji correctly, but I cannot fathom why the other two fonts can render the kanji, but not the yen symbol, which is much more common.
The web server creating the PDFs is LAMP running Ubuntu. I'm viewing the PDFs on OS X with Chrome (using its in-browser view). I've also tried downloading the PDFs with Firefox and displaying in Preview. I get essentially the same results: ArialUni and GT200001 don't display the yen symbol, while Helvetica and Freesans don't display the kanji (but do display the yen symbol).
I know I can use different fonts for different lines/cells of the PDF, but the kanji and yen symbol are on the same line.
How can I get the kanji and yen symbol to display in a single line using TCPDF?
Near the top of my PDF code, I load the font using TCPDF's addTTFfont();
$this->font = $this->addTTFfont(K_PATH_FONTS.'arialuni.ttf', 'TrueTypeUnicode', '', 32);
Here's the code I'm using to write the section of the PDF.
$pdf->SetFont('arialuni','',10);
$pdf->MultiCell(105, $remarks_height, $remarks, 'B', 'L', false, 0, '', '', true, 1, false, true, $remarks_height, 'T');
In this wikipedia article you can read some more about this character.
Basically, there are 2 different ways of writing this symbol, as happens with some other symbols. From the accepted answer ("I was using ¥, not ¥"), we can see that the asker was using the 'occidental' yen sign ¥ (U+00A5) when the text actually needed the double-width (full-width) character ￥ (U+FFE5).
Oh dear, I figured it out.
I was using ¥, not ¥. Sorry for being confused!!
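Because the two yen signs look almost identical on screen, it can help to check which code point a string actually contains. This is a rough sketch with hypothetical helper names; which form your font supports is something to verify for yourself.

```php
<?php
const YEN_SIGN = "\u{00A5}";           // ordinary (half-width) yen sign
const FULLWIDTH_YEN_SIGN = "\u{FFE5}"; // full-width yen sign

// Hypothetical helper: lists which yen code points occur in $text.
function whichYen(string $text): array
{
    $found = [];
    if (strpos($text, YEN_SIGN) !== false) {
        $found[] = 'U+00A5';
    }
    if (strpos($text, FULLWIDTH_YEN_SIGN) !== false) {
        $found[] = 'U+FFE5';
    }
    return $found;
}

// Hypothetical helper: swaps one form for the other. The direction shown
// here (full-width to half-width) is an assumption; reverse the arguments
// if your font only contains the full-width glyph.
function toHalfwidthYen(string $text): string
{
    return str_replace(FULLWIDTH_YEN_SIGN, YEN_SIGN, $text);
}

$remarks = "(渋谷猿, \u{FFE5}8,000)";
print_r(whichYen($remarks));
print_r(whichYen(toHalfwidthYen($remarks)));
```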
What I have noticed with TCPDF is that it's all about fonts.
I have used it for 15 totally different languages with different writing systems, and the only font I am using is Arial Unicode MS.
TCPDF has a function that converts that font into TCPDF-friendly files: arialuni.ctg.z (70k), arialuni.php (447k), and arialuni.z (14M).
The function is documented at http://www.tcpdf.org/fonts.php:
$fontname = $pdf->addTTFfont('/path-to-font/ARIALUNICODE.TTF', 'TrueTypeUnicode', '', 32);

Creating PDFs using TCPDF that supports all languages especially CJK

Can someone put together a clear and concise example of how you can create a PDF using TCPDF that will support text strings from any language?
It appears there is not a single font that will support all languages. I'm guessing the font would be too large?
I assume the correct way would be to detect the language of the string and dynamically set the font type to a compatible font. If this is the case then it gets very complex in detecting the language for each string.
Most languages are supported if you use the "freeserif" font. However, it does not support CJK characters. I've tried many fonts (kozminproregular, cid0jp, cid0kr, stsongstdlight) to get support for Chinese, Japanese, and Korean, but none of them seem to support all three languages.
This worked out perfectly for me. Thank you!
To make sure the generated PDF file does not get too big, use font subsetting. I have a 10-page PDF generated with only a few lines of Chinese (names on diplomas):
$pdf->setFontSubsetting(true); // PDF file is only slightly bigger: 925 KB vs. 755 KB without the Chinese names
If you instead use
$pdf->setFontSubsetting(false); // the PDF file size is about 17.5 MB
I solved this problem by making my own font from Arial Unicode MS with these steps:
1. Put a copy of ARIALUNI.ttf in the fonts folder under the TCPDF installation (I took my copy from the windows\fonts folder).
2. Make a temporary script in the examples folder of TCPDF and execute it with this line:
$fontname = $pdf->addTTFfont('../fonts/ARIALUNI.ttf', 'TrueTypeUnicode', '', 32);
3. Set the new font in your PDF generator script:
$pdf->SetFont('arialuni', '', 20);
Now the PDF should show CJK characters correctly.
Hope this helps.
I just tried Etiennez0r's solution, and it didn't work for me. Needed to make a minor modification as below:
$fontname = TCPDF_FONTS::addTTFfont('../fonts/ARIALUNI.TTF', 'TrueTypeUnicode', '', 96);
My settings:
$fontname = TCPDF_FONTS::addTTFfont(FCPATH . 'TCPDF/fonts/ARIALUNI.ttf', 'TrueTypeUnicode', '', 32);
.......
// set font
$pdf->SetFont('dejavusans', '', 14);
$pdf->SetFont('cid0cs', '', 14);
Exporting Japanese works well with this.
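The question's idea of choosing a font per string can be approximated by looking at the Unicode script of the characters rather than detecting the language. The sketch below makes several assumptions: pickFontForText() is a hypothetical helper, the font names are simply those mentioned in this thread, and text containing only Han characters is treated as Chinese even though it could equally be Japanese.

```php
<?php
// Hypothetical helper: picks a TCPDF font name from the scripts found in
// $text. The mapping is an illustrative assumption, not a TCPDF feature.
function pickFontForText(string $text): string
{
    if (preg_match('/\p{Hangul}/u', $text) === 1) {
        return 'cid0kr';    // Korean
    }
    if (preg_match('/[\p{Hiragana}\p{Katakana}]/u', $text) === 1) {
        return 'cid0jp';    // kana present, so treat as Japanese
    }
    if (preg_match('/\p{Han}/u', $text) === 1) {
        return 'cid0cs';    // Han only: assumed to be Chinese
    }
    return 'freeserif';     // everything else
}

foreach (['商品コード', '한국어', '中文', 'aąbcčdeę'] as $sample) {
    echo $sample, ' => ', pickFontForText($sample), PHP_EOL;
}
```

Each string would then be written with $pdf->SetFont(pickFontForText($string), '', $size) before the corresponding Cell() or Write() call. Strings that mix scripts would need to be split into runs first, which this sketch does not attempt.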

How to generate pdf files _with_ utf-8 multibyte characters using Zend Framework

I've got a "little" problem with Zend Framework Zend_Pdf class. Multibyte characters are stripped from generated pdf files. E.g. when I write aąbcčdeę it becomes abcd with lithuanian letters stripped.
I'm not sure if it's particularly Zend_Pdf problem or php in general.
Source text is encoded in utf-8, as well as the php source file which does the job.
Thank you in advance for your help ;)
P.S. I run Zend Framework v. 1.6 and I use FONT_TIMES_BOLD font. FONT_TIMES_ROMAN does work
Zend_Pdf supports UTF-8 in version 1.5 of Zend Framework. However, the standard PDF fonts support only the Latin1 character set. This means you can't use Zend_Pdf_Font::FONT_TIMES_BOLD or any other "built-in" font. To use special characters you must load another TTF font that includes characters from other character sets.
I use Mac OS X, so I tried the following code and it produces a PDF document with the correct characters.
$pdfDoc = new Zend_Pdf();
$pdfPage = $pdfDoc->newPage(Zend_Pdf_Page::SIZE_LETTER);
// load TTF font from Mac system library
$font = Zend_Pdf_Font::fontWithPath('/Library/Fonts/Times New Roman Bold.ttf');
$pdfPage->setFont($font, 36);
$unicodeString = 'aąbcčdeę';
$pdfPage->drawText($unicodeString, 72, 720, 'UTF-8');
$pdfDoc->pages[] = $pdfPage;
$pdfDoc->save('utf8.pdf');
See also this bug log: http://framework.zend.com/issues/browse/ZF-3649
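Following the point above that the built-in fonts only cover Latin1, a quick check can tell whether a given string needs a loaded TTF font at all. This is a minimal sketch; needsEmbeddedFont() is a hypothetical helper and not part of Zend_Pdf.

```php
<?php
// Hypothetical helper (not part of Zend_Pdf): true if $text contains any
// character outside U+0000..U+00FF, i.e. outside the Latin-1 range.
function needsEmbeddedFont(string $text): bool
{
    return preg_match('/[^\x{00}-\x{FF}]/u', $text) === 1;
}

var_dump(needsEmbeddedFont('aąbcčdeę')); // Lithuanian letters from the question
var_dump(needsEmbeddedFont('abcd'));
```

When the check returns true, a font loaded with Zend_Pdf_Font::fontWithPath() is the safer choice, as in the example above.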
I believe Zend_Pdf got UTF-8 support in 1.5 - What version of Zend Framework are you running?
Also - what font are you trying to render with? Have you tried alternate fonts?
Have you made sure that you are setting the character encoding as this example from the manual?
// Draw the string on the page
$pdfPage->drawText($unicodeString, 72, 720, 'UTF-8');
If you're stuck with having to use a bold font, maybe try one of the other bold fonts?
Zend_Pdf_Font::FONT_COURIER_BOLD
Zend_Pdf_Font::FONT_TIMES_BOLD
Zend_Pdf_Font::FONT_HELVETICA_BOLD
ZF v. 1.6, TIMES_BOLD (as I understand it, that's the only way to make text bold?)
