I find Yii a great framework, and the example website created with yiic shell is a good starting point... however, it unfortunately doesn't cover multi-language websites. The docs cover translating short messages, but not storing multilingual content ...
I'm about to start working on a website which needs to be in at least two languages, and I'm wondering what is the best way to keep content for that ...
The problem is that the content is mixed extensively with common elements (like embedded video files).
I need to avoid duplicating those common elements ... so far I have kept an array of arrays containing the texts (usually no more than 1-2 short paragraphs), and the view file just rendered the text from the array.
Now I'd like to avoid keeping it in arrays (which requires some attention when putting double quotations " " and is inconvenient in general...).
So, what is the best way to keep those short paragraphs? Should I keep them in DB like (id | msg_id | language | content ) and then select them by msg_id & language? That still requires me to create some msg_id's and embed them into view file ...
Is there any recommended paradigm for which Yii has some solutions?
Thanks,
m.
Gettext is good for its ease of translation, but the default PHP implementation is not thread safe. Yii therefore uses its own unpacker, dramatically increasing processing time compared to php arrays.
Since I was setting up a high volume, high transaction site, the performance hit was not acceptable. Also, by using APC, we could cache the PHP translations, further increasing performance.
My approach was therefore to use PHP arrays but to keep the translations in a DB for ease of translation, generating the needed files when translations are changed.
The DB is similar to this :
TABLE Message // stores source language, updated by script
id INT UNSIGNED
category VARCHAR(20) // first argument to Yii::t()
key TEXT // second argument to Yii::t()
occurences TINYINT UNSIGNED // number of times found in sources
TABLE MessageTranslation // stores target language, translated by human
id INT UNSIGNED
language VARCHAR(3) // ISO 639-1 or 639-3, as used by Yii
messageId INT UNSIGNED // foreign key on Message table
value TEXT
version VARCHAR(15)
creationTime TIMESTAMP DEFAULT NOW()
lastModifiedTime TIMESTAMP DEFAULT NULL
lastModifiedUserId INT UNSIGNED
I then modified the CLI tool yiic 'message' command to dump the collected strings into the DB.
http://www.yiiframework.com/wiki/41/how-to-extend-yiic-shell-commands/
Once the strings are in the DB, a simple CMS can be set up to give translators an easy way to translate, while also providing versioning information, reverting to older versions, checking the quality of translators, etc ...
Another script, also modified from yiic, then takes the DB info and compiles it into PHP arrays. Basically a JOIN of the two tables for each language, then build an array using 'Message'.'key' and 'MessageTranslation'.'value' as (what else?) key => value ... saving to file named from 'Message'.'category' in folder specified by language.
The generated files are loaded as normal by Yii CPhpMessageSource.
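A minimal sketch of that compile step, assuming the JOIN result has already been fetched into an array of rows with `category`, `key` and `value` fields. The function names `groupByCategory` and `writeMessageFiles` and the row layout are hypothetical and not part of Yii; the only thing taken from Yii is the file layout (one `<category>.php` per language folder, each returning an array).

```php
<?php
// Group joined Message/MessageTranslation rows by category.
// Each row is assumed to look like:
//   ['category' => 'app', 'key' => 'Hello', 'value' => 'Hallo']
function groupByCategory(array $rows): array
{
    $grouped = [];
    foreach ($rows as $row) {
        $grouped[$row['category']][$row['key']] = $row['value'];
    }
    return $grouped;
}

// Write one "<category>.php" file per category into "$baseDir/<language>/".
// Each generated file simply returns a source => translation array.
function writeMessageFiles(array $rows, string $language, string $baseDir): array
{
    $dir = $baseDir . '/' . $language;
    if (!is_dir($dir)) {
        mkdir($dir, 0775, true);
    }
    $written = [];
    foreach (groupByCategory($rows) as $category => $messages) {
        $file = $dir . '/' . $category . '.php';
        // var_export() takes care of quoting, so no manual escaping is needed.
        $code = "<?php\nreturn " . var_export($messages, true) . ";\n";
        file_put_contents($file, $code, LOCK_EX);
        $written[] = $file;
    }
    return $written;
}
```

Because the output is plain PHP, an opcode cache can keep the compiled arrays in memory, which is the point of generating files instead of querying the DB on each request.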
For images, this was as simple as placing them in folders with the proper language and getting the app language when linking.
<img src="/images/<?php echo Yii::app()->language; ?>/help_button.png">
Note that in real life, I wrote a little helper method to strip the country off the language string: 'en_us' should become 'en'.
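Such a helper might look roughly like this. It is only a sketch: the names `baseLanguage` and `localizedImage` are made up for illustration, and it assumes the locale separator is either `_` or `-`.

```php
<?php
// Reduce a locale ID such as "en_us" or "pt-BR" to its language part.
function baseLanguage(string $locale): string
{
    $parts = preg_split('/[_-]/', $locale);
    return strtolower($parts[0]);
}

// Build the language-specific image URL (hypothetical helper).
function localizedImage(string $locale, string $file): string
{
    return '/images/' . baseLanguage($locale) . '/' . $file;
}
```

In a view, the image tag would then use localizedImage(Yii::app()->language, 'help_button.png') as its src.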
A Yii application by default uses the Yii::t() method for translating text messages, and there are 3 different types of message sources:
CPhpMessageSource : Translations are stored as key-value pairs in a PHP array.
CGettextMessageSource : Translations are stored as GNU Gettext files. (PO Files)
CDbMessageSource : Message translations are stored in database tables.
If I understand correctly, you are using classic arrays for translations. I recommend using Gettext and PO files with Yii for translation.
You can find a lot of information about translation and i18n with Yii on this official documentation page.
Well, I think the concern here is how to translate static text/messages on the page. Yii solves that pretty well using Yii::t(), and Edigu's answer covers it.
I checked out the post on FlexicaCMS about translating dynamic content in the database. Ultimately that will be the next step after you solve the static text/message problem, and it is a truly good approach using Yii's behaviors. Not sure if the FlexicaCMS authors are too ambitious in supporting translation that way, as it would make content translation a worry-free thing; really great.
One thing they don't mention is the URL of the translated page, for example your.site.com/fr/translated_article_title.html. I mean the URL must have a /language_id/ part in it to help with SEO.
In Yii1 and Yii2, yii\i18n\GettextMessageSource doesn't use Yii's cache engine (look at the source) to speed up loading of PO or MO files. Loading these files with pure PHP code (including yii\i18n\GettextMessageSource) is not recommended, because it is much slower than a PHP array lookup:
http://mel.melaxis.com/devblog/2006/04/10/benchmarking-php-localization-is-gettext-fast-enough/
However, the PHP gettext extension for MO files is somewhat faster than PHP translation arrays because it caches them; the downside is that every change to an MO file requires a server restart.
I think the best solution would be to extend yii\i18n\GettextMessageSource in your own code library and add the cache ability to GettextMessageSource to enhance its performance and use your extended version as the component.
protected function loadMessages($category, $language);
Just don't check the MO modification date on every load to compare it against the cache; instead, clear the cache when the MO or PO files change (this can be a scheduled job).
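To illustrate the idea without depending on Yii, here is a rough, framework-free sketch. `CachedMessageLoader` is a hypothetical class, the wrapped callable stands in for the slow PO/MO parser, and the private array stands in for a real cache component; in an actual Yii extension the same logic would go into the overridden loadMessages().

```php
<?php
// Wraps any slow message loader (e.g. a PO/MO parser) with a cache that is
// only invalidated explicitly, never by checking file modification times.
class CachedMessageLoader
{
    private $loader;      // callable(string $category, string $language): array
    private $cache = [];  // in-memory stand-in for a real cache component

    public function __construct(callable $loader)
    {
        $this->loader = $loader;
    }

    public function loadMessages(string $category, string $language): array
    {
        $key = $language . '/' . $category;
        if (!isset($this->cache[$key])) {
            // Cache miss: run the expensive loader once and remember the result.
            $this->cache[$key] = ($this->loader)($category, $language);
        }
        return $this->cache[$key];
    }

    // Call this from a deploy step or scheduled job after PO/MO files change.
    public function clear(): void
    {
        $this->cache = [];
    }
}
```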
I'm trying to rewrite an ASP.NET MVC application in CodeIgniter.
CodeIgniter basically follows the MVC pattern, so it's been pretty much OK. Now I'm stuck at localization.
I do not want to change the URLs, so that /Company/About remains the same in English and German. In ASP.NET I had Views/Index.cshtml as the default view for German and Views/Index.en.US.cshtml for the English localized page.
I will describe the scenario, which works perfectly on my ASP.NET website:
1. The user clicks on a country flag.
2. Based on the value from step 1, a cookie is populated with the country value.
3. A helper loads the desired culture into the current thread.
4. Views are localized.
How can I apply this approach, or at least a similar one, in CodeIgniter?
Thanks
I don't think it is a good idea to have separate files for each language. You can do it easily using other techniques that are simple to implement.
While working on one of my CI projects, I needed to make it support multiple languages. As I had worked with other multilingual systems like Prestashop, I borrowed ideas from there and implemented them in my CI project.
I implemented it as follows:
1) I store words in language files. Each language has a single file named after its ISO code; for English, for example, it is en.php. In this language file, words are stored in an array under keys of the form file_name_md5(of the word), like below for Hello World in the view file hello.php.
$_lang = array(
'hello_b10a8db164e0754105b7a99be72e3fe5' => 'hallo Welt',
...
...
...
);
The key of the $_lang associative array is the file name followed by the md5 of the word to be translated, and the value is the translation.
Storing the words in and fetching them from these language files is handled by the Translation library I created.
2) All the static text in my view files is written in English. I created a helper function for it called "l" (lowercase L). Let's say I want Hello World in my view (say hello.php), translated into multiple languages. So in my view I write it like this:
<?=$this->l('Hello World...', 'hello')?>
3) The l helper function performs a small operation on the arguments passed to it. It takes the md5 of the word and prepends the file name, as you can see above. Then it calls a member function of my Translation library, which looks in the $_lang array for a match. If it finds one, it returns the translation. If no translation is found for that word, the l helper function returns the original text.
4) I created my own controller library from which all my controllers are extended. To preserve CI features, my parent controllers extend the default CI controller. In my parent controller, I load the language files according to the user's language. This way the $_lang array is available for the Translation library to look words up in.
5) On the admin side, I created a translation system that reads all my view files for a specific pattern like the one below:
<?=$this->l('Hello World...', 'hello')?>
The code generates a form in which a text field is created for each word. The text field name is the same as the $_lang array key, like filename_md5_of_word. The text field label is the original word, in this case "Hello World...". The translation is written into the text field.
On saving, the translations are stored in the language file for the language selected for translation.
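The lookup from steps 1) to 3) can be sketched as plain functions. This is not the author's Translation library: `translationKey` is a hypothetical name, and `l` here takes the language array as an explicit third argument instead of being a controller method.

```php
<?php
// Build the lookup key: "<view file name>_<md5 of the source text>".
function translationKey(string $text, string $file): string
{
    return $file . '_' . md5($text);
}

// Return the translation if one exists, otherwise the original text.
function l(string $text, string $file, array $lang): string
{
    $key = translationKey($text, $file);
    return $lang[$key] ?? $text;
}

// Stand-in for the array loaded from a language file such as de.php.
$_lang = [
    translationKey('Hello World', 'hello') => 'Hallo Welt',
];

echo l('Hello World', 'hello', $_lang), "\n"; // translated text
echo l('Goodbye', 'hello', $_lang), "\n";     // no entry, so the original text
```

Hashing the source text keeps the keys short and safe to use as form field names, at the cost that any edit to the English text produces a new key and orphans the old translation.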
Using this method, you can add as many languages as you like in the future without creating view files for each language, so it is flexible.
I hope I have explained enough for you to see how easily you can implement a translation system and avoid separate view files for each language.
If you have any questions, feel free to contact me here.
Thank you
This is the scenario:
I have a website that I'll translate and eventually apply good SEO to.
Which method is best for translating the content (menu links, about 10 articles, alt tags, title tags, meta tags, html lang, etc.) while being easily indexed by Google, Bing, Yandex and other search engines?
My first idea is to use a PHP translate function built on arrays I write myself (I already have a prototype of it) that takes the content and displays it in the user's language.
Is this the right path? The problem is that I want a dynamic system that allows me to add a new language in the future.
Maybe MySQL is the right choice?
The website doesn't use a CMS; I built it myself in PHP, though I have no problem relying on MySQL if I need to.
Thank you in advance :)
You've basically got 3 choices and there are pros and cons to each:
1: as Dainis Abols suggests, chuck it in the database; depending on how your server is set up this could be the slowest, most system-heavy route (it's all relative though; it's unlikely to make any difference unless you're getting millions of views an hour).
2: use PHP library files; I tend to use library files for small, single items like field labels (forename, surname etc) and store larger things like CMS-managed HTML in the database... this reduces the database calls but adds a small overhead for each dictionary you load into a script <?php $this->page->dictionary->product = Dictionary::load("product"); ?> sort of thing.
3: finally, I personally think it's worth taking a look at PHP's implementation of gettext, though you'll need something like Poedit to maintain the PO files (plain-text translation files that are compiled to binary MO files). This lets you maintain translations very rapidly, as you just enter the text in your PHP document by wrapping it in a simple underscore function:
e.g. <?= _("Hello World"); ?>
You then maintain the translations in PO files and compile them to binary MO files. It's very efficient (potentially faster than doing it with native PHP files), but it does have some drawbacks when it comes to the nuances of natural language.
As an example, if you have a field label "title" <?= _("Title"); ?> then all instances of _("Title") will be translated in the same way.
This means you can't use "Title" as both a form label for a person's title and as the title of a book; for instance, in German, you may want to use Anrede for one "Title" and Titel for the other.
Although, to really use gettext you'd probably need to be running your own server - it can require an Apache reboot when you change the PO files :\
As for search engines, they read the output from your code, so it doesn't really make much difference which method you use to perform the translations. Ideally, though, you want to keep the URLs RESTful: whether you're including PHP dictionaries, calling the database or using gettext (or changing your mind from one to another later), you can map the language to the URL with something like http://www.mysite.com/en_gb/widgets, so you can change how the program works without changing the URLs.
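Mapping the language to the URL could be sketched like this, independent of which translation backend is used. The function name `splitLanguage`, the list of supported locales and the default are assumptions for illustration only.

```php
<?php
// Split "/en_gb/widgets" into a language and the remaining path.
// Unknown or missing prefixes fall back to the default language.
function splitLanguage(string $path, array $supported, string $default): array
{
    $trimmed = ltrim($path, '/');
    $segments = explode('/', $trimmed, 2);
    $first = strtolower($segments[0]);
    if (in_array($first, $supported, true)) {
        // Known language prefix: strip it and keep the rest of the path.
        return [$first, '/' . ($segments[1] ?? '')];
    }
    return [$default, '/' . $trimmed];
}

list($language, $route) = splitLanguage('/en_gb/widgets', ['en_gb', 'de_de'], 'en_gb');
echo $language, ' ', $route, "\n";
```

The rest of the application only ever sees `$language` and `$route`, so swapping dictionaries for a database or gettext later does not touch the URLs.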
Store all texts inside a db and add a field for each language:
+----+---------+---------+
| id | text_en | text_de |
+----+---------+---------+
| 1  | English | Deutsch |
+----+---------+---------+
Now, when user switches languages, just use the field for that language:
$lang = 'en';
$query = "SELECT text_".$lang." FROM texts WHERE id = 1";
Something like that. So, all your client side texts will be stored inside the db at all times. So your output will be like:
<div id="header"><?=get_db_text_for_id(1)?></div>
Of course, you need precautions and some more fields, but that's the general idea.
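One such precaution, sketched under the assumption that the table and columns are named as above: since the language ends up in a column name, it cannot be bound as a query parameter, so it should be checked against a fixed list before the query is built. `textQuery` is a hypothetical helper name.

```php
<?php
// Build the SELECT for a per-language column, accepting only known languages.
function textQuery(string $lang, array $allowed = ['en', 'de']): string
{
    if (!in_array($lang, $allowed, true)) {
        throw new InvalidArgumentException("Unsupported language: $lang");
    }
    // The id stays a named placeholder, to be bound by a prepared statement.
    return "SELECT text_{$lang} AS text FROM texts WHERE id = :id";
}

echo textQuery('de'), "\n";
```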
I've been struggling with this for a while now. One of my CMSs is ready to be extended with a translation module. I've been thinking of different methods but haven't figured out the best way so far.
Basically I have a CMS which uses a template system to parse all data from a database to the screen. I've come so far as to "split up" my templates into different folders to be able to translate things that are "static", like images with text, footer links, etc.
However, there are many modules (pages, news, products) that have multiple fields that require a database-driven method to be translated. I started off with a "languages" table which describes languages (id, iso_code, name). That's as far as I've come. Since there were a couple of projects that had to be done, I haven't spent any more time on this subject so far.
My first thought ("the quick fix") was to add multiple fields inside the tables (such as "title_nl", "title_en"), but this actually makes the database more crowded than is needed in my opinion.
My second thought was to create a table, "news_translations" for example, which contains the language ISO code, a news_id, and the fields that require translation. The news_id connects the translation to its original, and the language ISO code is used to get the right language from the database. Then in my front-end code I would first check whether the default language is selected (=> select from the "news" table) or a translation (=> check inside the translations table). If the 2nd case does not return any results, a message is displayed ("Sorry, no translation available") and the default is shown (or an error message, whatever fits the client best).
But then there's a 3rd option.. my websites all use search engine friendly links (www.domain.com/pagename/ or www.domain.com/news/1-news-item-here.html). It would be far better if I had the ability to also "override" the SEF URL in my translations table. But I guess in this case I would always need 1 extra query to the translations table (since we first want to check for a translated page)... guess it's not such a big deal, but it's worth considering I guess.
In the end I guess by describing my options number 3 is what I need. But I'd like to have some other opinions on the subject as well! This is what I am trying to achieve:
Create a CMS system with multi language support
No language files (obviously this is why I use templates)
Being able to translate an original page/newsitem/product
Optionally: to change the SEF URL according to the language
I think option 3 has all this, so the steps to create this solution are:
1. Create a _translation table for each item (or perhaps even in the original table by adding 2 new fields, 'translation_to' (containing the primary key) and 'translation_is' (containing the ISO code); however, in that case all fields would need to be edited, which is not always necessary. Plus, by creating a 2nd table I keep the originals separated from their translations, right?)
2. If the default language is NOT chosen, first query the translations table to find a translation; if one is found, display the translation. Otherwise notify/error the user and/or display the original text (based on the SEF URL; if the SEF URL is not found within the translations or original table, then obviously display an error only).
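The lookup order in step 2 can be sketched with plain arrays standing in for the two tables. Everything here is assumed for illustration: the function name `findPage`, the array layouts, and keying both tables by SEF URL.

```php
<?php
// Look up a page by language and SEF URL, falling back to the original.
// $originals:    [sefUrl => ['id' => int, 'title' => string]]
// $translations: [language => [sefUrl => ['news_id' => int, 'title' => string]]]
function findPage(string $sef, string $lang, string $default, array $originals, array $translations): ?array
{
    if ($lang !== $default && isset($translations[$lang][$sef])) {
        // A translation exists for the requested language.
        return ['title' => $translations[$lang][$sef]['title'], 'fallback' => false];
    }
    if (isset($originals[$sef])) {
        // No translation: show the original. 'fallback' tells the template
        // to display a "no translation available" notice.
        return ['title' => $originals[$sef]['title'], 'fallback' => $lang !== $default];
    }
    return null; // unknown URL: the caller should show an error page
}
```

With real tables this is at most two queries per request, and the default language needs only one.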
Any suggestions? :-)
Thanks for thinking along!
I would like to see what your table structure looks like. Probably the best thing you can do is create two separate new tables named something like "CONTENT_MULTI_LANG" and "SITE_LOCALES".
Then in the code that prints out your content do an initial check for a language flag. I'd create two separate classes for loading static content, something like "Content_LoadStandard" and "Content_LoadMultiLang". So then your conditional will look like this:
if ($this->site_locale == 'standard'){
$contentLoader = new Content_LoadStandard();
} else {
$contentLoader = new Content_LoadMultiLang($this->site_locale);
}
$contentLoader->blah($cheese); // placeholder: call whatever load method both classes define
Your "CONTENT_MULTI_LANG" table should be a narrowed down version of your standard CMS object table, only containing the relevant content field(s) that need to be in alternative languages.
// PSEUDO SQL
CREATE TABLE `LOCALE` (
`id` int(11),
`locale` varchar(16), // name of locale (language)
... // any other fields
)
CREATE TABLE `CONTENT_MULTI_LANG` (
`id` int(11),
`pcid` int(11), // parent content id
`lid` medint(), // locale id
`content` {$type}, // whatever type you use (varchar, text, bin, etc)
... // any other fields
)
In your Content_LoadMultiLang class, create methods to query alternate content using a join.
TIP: Might be a good idea to establish relationships in your table to do cascading deletes on content rows, that way if you delete content in standard your multi lingual version(s) will also be deleted.
From what I've seen of Drupal, option three is how they handle it, with a couple of tweaks. They keep it all in one table, with a field called language. Then there is a separate table that maps which items are connected.
This way is primary language agnostic, meaning the content can be created in any language without requiring a translation in any other.
I have a website built in PHP.
I like the revision control system SO uses for edited answers, where we can see all the old revisions.
I am new to this and have no idea how to implement it. Is it a software plugin, or is it programmed that way?
If I want to do that for all my PHP files, how do I do it?
I know there is software for that, but how do I link it with a website like SO has done? The software may hold all the old versions, but how do I link those with the website's PHP?
It requires more work with the database. Your application will have to store old revisions itself (or use triggers/views). Commonly a separate archive table is created for everything you normally store in your database. The crucial part of that is a version field:
CREATE TABLE articles ( // always the current version
title VARCHAR,
content TEXT
)
CREATE TABLE articles_ARCHIVE (
version INT with AUTO_INCREMENT,
title VARCHAR,
content TEXT,
) // yes, that's not a correct CREATE TABLE, just figuratively
And whenever you would normally just update the articles table, you instead first store the current version into the _archive table, and only afterwards store the new current text into the regular table.
Now to replicate what SO provides, you will also need some more UI logic. But a diff view for comparing _archive texts against the current version is not difficult (see PEAR Text_Diff).
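The archive-then-update order can be sketched with two arrays standing in for the articles and articles_ARCHIVE tables. `saveArticle` is a hypothetical function; with a real database the two writes would belong in one transaction.

```php
<?php
// Copy the current row into the archive with the next version number,
// then overwrite the current row. Arrays stand in for the two tables.
function saveArticle(array &$current, array &$archive, string $title, string $content): void
{
    if ($current !== []) {
        // Archive the version that is about to be replaced.
        $archive[] = ['version' => count($archive) + 1] + $current;
    }
    $current = ['title' => $title, 'content' => $content];
}

$current = [];
$archive = [];
saveArticle($current, $archive, 'Draft', 'First text');
saveArticle($current, $archive, 'Final', 'Second text');
echo count($archive), " archived revision(s)\n";
```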
I think you could look into a common Wiki implementation to get an idea how it is done in practice.
Use Git for your code control, and have a look at these existing PHP frontends to Git to get you started.
SO's review system is custom-built I guess, so you'll have to build what you want on your own.
I'm developing a website in PHP and I'd like to let the user switch from German to English easily.
So a translation strategy must be considered:
Should I store the data and its translations in a database table ((1, "Hello", "hallo"), (2, "Good morning", "Guten Tag"), etc.)?
Or should I use ".mo" files to store them?
Which way is the best?
What are the pros and the cons?
After having just tackled this myself recently (12 languages and counting) on a production system and having run into some major performance issues along the way I would suggest a hybrid system.
1) Store the language strings and translations in a database--this will make it easy to interact with/update/remove items plus will be part of your normal backup routines.
2) Cache the languages into flat files on the server and draw those out as necessary to display on the page.
The benefits here are many--mostly it is fast! I am not dealing with connection overhead for MySQL or any traffic slowdowns during the transfer. (especially important if your DB server is not localhost).
This will also make it very easy to use. Store the data from your database in the file as a php serialized array and GZIP the contents of the file to shrink storage overhead (this also makes it faster in my benchmarking).
Example:
$lang = array(
'hello' => 'Hallo',
'good_morning' => 'Guten Tag',
'logout_message' => 'We are sorry to see you go, come again!'
);
$storage_lang = gzcompress( serialize( $lang ) );
// WRITE THIS INTO A FILE SUCH AS 'my_page.de'
When a user loads your system for the first time do a file_exists('/files/languages/my_page.de'). If the file exists then load the content, un-gzip, and un-serialize and it is ready to go.
Example
$file_contents = file_get_contents( 'my_page.de' );
$lang = unserialize( gzuncompress( $file_contents ) );
As you can see you can make the caching specific to each page in the system keeping the overhead even smaller and use the file extension to denote language... (my_page.en, my_page.de, my_page.fr)
If the file DOESN'T exist then query the DB, build your array, serialize it, gzip it and write the missing file--at the same time you have just constructed the array that the page needed so continue on to display the page and everyone is happy.
Finally, this allows you to build in update pages accessible to non-programmers but you also control when changes appear by deciding when to remove cache files so they can be rebuilt by the system.
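Put together, the cache-or-rebuild flow described above might look roughly like this. It is a sketch, not the production code: `loadLanguage` and `clearLanguageCache` are hypothetical names, and `$fetchFromDb` is a callable standing in for the real database query.

```php
<?php
// Load the language array for a page from a gzipped cache file,
// rebuilding the file from the database when it is missing.
function loadLanguage(string $page, string $lang, string $cacheDir, callable $fetchFromDb): array
{
    $file = $cacheDir . '/' . $page . '.' . $lang;
    if (is_file($file)) {
        // Cache hit: un-gzip and unserialize, no database involved.
        return unserialize(gzuncompress(file_get_contents($file)));
    }
    // Cache miss: query once, write the cache file, and use the result.
    $strings = $fetchFromDb($page, $lang);
    file_put_contents($file, gzcompress(serialize($strings)), LOCK_EX);
    return $strings;
}

// Removing the cache file makes the next request rebuild it from the DB.
function clearLanguageCache(string $page, string $lang, string $cacheDir): void
{
    $file = $cacheDir . '/' . $page . '.' . $lang;
    if (is_file($file)) {
        unlink($file);
    }
}
```

Calling clearLanguageCache() from the translation admin page is what gives you control over when edited strings go live.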
Warnings and Pitfalls
When I kept everything in the database directly we hit some MAJOR slowdowns when our traffic spiked.
Trying to keep them in flat-file arrays only was so much trouble because updates were painful and prone to errors.
Not GZIP compressing the contents of the cache files made the language system about 20% slower in my benchmarks.
Make sure all of your database fields containing languages are set to utf8_general_ci (or at least one of the UTF-8 collations; I find general_ci best for my use). If you don't, you will not be able to store non-Latin scripts in your database (like Chinese, Japanese, etc.).
Extension:
In response to a comment below, be sure to set your database tables up with page level language strings in mind.
id  string        page         global
1   hello         NULL         1
2   good_morning  my_page.php  0
Anything that shows up in headers or footers can have a global flag that will be queried in every cache file created, otherwise query them by page to keep your system responsive.
PHP arrays are indeed the fastest way to load translations. However, you really don't want to update these files by hand in an editor. This might work in the beginning, and for one or two languages, but when your site grows this gets really hard to maintain.
I advise you to set up a few simple tables in a database where you keep the translations, and build a simple app that lets you update the translations (some forms to add and update texts). As for the database: use one table to store translation variables; use another to link translations to these variables.
Example:
`text`
id  variable
1   hello
2   bye
`text_translations`
id  textId  language  translation
1   1       en        hello
2   1       de        hallo
3   2       en        bye
4   2       de        tschüss
So what you do is:
create the variable in the first table
add translations for it in the second table (in whatever language you want)
After you've updated the translations, create/update a language file for each language that you're using:
select the variables you need and its translation (tip: use English if there's no translation)
create a big array with all this stuff, e.g.:
$texts = array('hello' => 'hallo', 'bye' => 'tschüss');
write the array to a file, e.g.:
file_put_contents('de.php', serialize($texts));
in your PHP/HTML create the array from the file (based on selected language by user), e.g.:
$texts = unserialize(file_get_contents('de.php'));
in your PHP/HTML use the variables, e.g.:
<h1><?php echo $texts['hello']; ?></h1>
or if you like/enabled PHP short tags:
<p><?=$texts['bye'];?></p>
This setup is very flexible, and with a few forms to update the translations it's easy to keep your site up to date in multiple languages.
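The "use English if there's no translation" tip from the steps above can be sketched as follows, assuming the two tables have already been joined into rows with `variable`, `language` and `translation` fields. `buildTexts` is a hypothetical name.

```php
<?php
// Build the texts array for one language, using the fallback language
// wherever a translation is missing.
// Each row: ['variable' => 'hello', 'language' => 'de', 'translation' => 'hallo']
function buildTexts(array $rows, string $lang, string $fallback = 'en'): array
{
    $texts = [];
    foreach ($rows as $row) {
        // Fallback text is only used while nothing is set for the variable.
        if ($row['language'] === $fallback && !isset($texts[$row['variable']])) {
            $texts[$row['variable']] = $row['translation'];
        }
        // A translation in the requested language always wins.
        if ($row['language'] === $lang) {
            $texts[$row['variable']] = $row['translation'];
        }
    }
    return $texts;
}
```

The resulting array is what gets serialized into the per-language file, so the fallback costs nothing at page load.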
I'd also suggest Zend Framework Zend_Translate package.
The manual gives a good overview on How to decide which translation adapter to use. Even when not using ZF, this will give you some ideas about what is out there and what the pros and cons are.
Adapters for Zend_Translate
Array: Use PHP arrays. Small pages; simplest usage; only for programmers.
Csv: Use comma separated (*.csv/*.txt) files. Simple text file format; fast; possible problems with unicode characters.
Gettext: Use binary gettext (*.mo) files. GNU standard for linux; thread-safe; needs tools for translation.
Ini: Use simple ini (*.ini) files. Simple text file format; fast; possible problems with unicode characters.
Tbx: Use termbase exchange (*.tbx/*.xml) files. Industry standard for inter application terminology strings; XML format.
Tmx: Use tmx (*.tmx/*.xml) files. Industry standard for inter application translation; XML format; human readable.
Qt: Use qt linguist (*.ts) files. Cross platform application framework; XML format; human readable.
Xliff: Use xliff (*.xliff/*.xml) files. A simpler format than TMX but related to it; XML format; human readable.
XmlTm: Use xmltm (*.xml) files. Industry standard for XML document translation memory; XML format; human readable.
There are some factors you should consider.
Will the website be updated frequently? If yes, by whom: you or the owner? How much data/information are you dealing with? And also, are you doing this frequently (for many clients)?
I can hardly imagine that using a relational database would cause any serious speed impact unless you have VERY high traffic (several hundred thousand pageviews per day).
Should you be doing this frequently (for lots of clients), think no further: build a CMS (or use an existing one). If you really need to consider the speed impact, you can customize it so that when you are done with the website you can export static HTML pages where possible.
If you are updating frequently, the same as above applies.
If the client has to update (and not you), again, you need a CMS.
If you are dealing with lots of information (many large articles), you need a CMS.
All in all, a CMS will help you build up your website structure fast, add content fast and not worry that much about code since it will be reusable.
Now, if you just need to create a small website fast, you can easily do this with hardcoded arrays and datafiles.
If you need to provide a web interface for adding/editing translations, then a database is a good idea.
If, however, your translations are static, I would use gettext or even plain PHP array.
Either way you can take advantage of Zend_Translate.
Small comparison, the first two from Zend tutorial:
Plain PHP arrays: Small pages; simplest usage; only for programmers.
Gettext: GNU standard for linux; thread-safe; needs tools for translation.
Database: Dynamic; Worst performance.
I would recommend PHP arrays; a GUI can be built around them for easy access.
Keep in mind that almost everyone who uses a computer knows some of the common English terms used on computers or the internet, like About Us, Home, Send, Delete, Read More, etc. Question: do they really need to be translated?
OK, honestly, translating those words is not really about being 'required'; it's all about 'style'.
Now, if it's really wanted, for common words that never need to change, it's better to use a PHP file that outputs a lang array for just the local language and English. For content such as blog posts, news, and descriptions, use a database and save as many translations as required. You must do this manually.
Using and relying on Google Translate? I think you have to think 1000 times. At least for this decade.