I have a PHP application and now I need to implement multi language support. This is the first time I have to deal with this.
I did some searches on the internet and always come to PHP's gettext function, which I have compiled on my server.
I would just like to know whether gettext() is the best way of doing this. Most articles date back as far as 2002; isn't there a newer way, maybe in PHP 5.2?
Also I read that you have to reboot the server when you make adjustments to the translations??
The intl extension uses the new ICU library but it's only available in PHP 5.3+
Yes, rebooting the server is a major issue, and it was a dealbreaker for me. Also, gettext's primary purpose is to translate language strings, not to substitute constants with text; whether that suits you is for you to decide. (Text replacement is 'A dog is brown' => 'Das hund ist braun' (I don't know German ;); constant replacement is 'catalog_greeting' => 'Welcome to the catalog'.)
There are lots of alternative pure-PHP solutions that may work for you. I use a constant replacement scheme that I store in the database, and on each save I create a separate serialized array for every language, so fetching is extremely quick and the database format does not matter for performance. It works great and is easy to set up (even from scratch), maintain and extend.
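A minimal sketch of that scheme, assuming one serialized file per language. The function names, the `.ser` file layout and the sample strings are hypothetical; the answer does not show its own code.

```php
<?php
// Sketch only: function names and the cache file layout are assumptions.

// Run on every save: dump each language's key => text map to its own file.
function write_language_cache(array $catalogs, $cacheDir)
{
    foreach ($catalogs as $lang => $strings) {
        file_put_contents($cacheDir . '/' . $lang . '.ser', serialize($strings));
    }
}

// Run once per request: one file read and one unserialize, no database query.
function load_language_cache($lang, $cacheDir)
{
    $file = $cacheDir . '/' . $lang . '.ser';
    if (!is_file($file)) {
        return array();
    }
    $strings = unserialize(file_get_contents($file));
    return is_array($strings) ? $strings : array();
}

// Constant-style lookup; falls back to the key itself when no text exists.
function t(array $strings, $key)
{
    return isset($strings[$key]) ? $strings[$key] : $key;
}

// In a real application $catalogs would be read from the translation table.
$catalogs = array(
    'en' => array('catalog_greeting' => 'Welcome to the catalog'),
    'de' => array('catalog_greeting' => 'Willkommen im Katalog'),
);
$cacheDir = sys_get_temp_dir();
write_language_cache($catalogs, $cacheDir);

$strings = load_language_cache('de', $cacheDir);
echo t($strings, 'catalog_greeting'), "\n";
```

Because the page request only ever touches the cache file, the database schema can be whatever is convenient for the editing interface.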
Related
I'm making constants in my PHP application to organize its responses, and I'm planning to add foreign language support soon. Here's what I'm doing right now: define('SOME_SYSTEM_MESSAGE',array('EN' => 'This is the system message!') [USER_LANGUAGE]); (I know that the [ ] syntax is only supported in PHP 5.5, I have a backwards-compatible function for older versions). Is that good practice for coding, or should I keep the messages in an array (like $en for english messages or $fr for french messages), or some other way?
I think it depends on where you want to go with your application.
Your proposed solution might work if you want to keep it mostly limited to yourself / your team.
But if other people should use your code later on, or if you want to be able to translate it into other languages, it would be handy to have files like .po or .csv which translators can handle.
Before PHP 5.6, constants cannot hold arrays; they are limited to scalar values (boolean, integer, float, string) and null. Array constants arrived with const in PHP 5.6 and with define() in PHP 7.0.
If you want to localize your application into different languages use gettext or other localization technologies. You should prefer components instead of reinventing something.
For example, the Symfony framework includes a standalone Translation component.
To make my application multilingual I'm wondering if there are big advantages to GNU's gettext or if there are big disadvantages of building your own 'library'.
Also if 'build your own' is more advised, what are the best practices? Obviously they have to be stored in the database, I doubt I wanna work with flat files, so at some point I'm better off caching them, how should I go about this?
The gettext extension has some quirks.
It keeps translation strings in memory, and thus can necessitate a restart (under the mod_php runtime that is) when catalogs are updated.
The gettext API wasn't really designed for web apps. (It looks for environment variables and system settings. You have to spoon feed the Accept-Language header.)
Many people run into problems setting it up.
On the other hand there is more tool support for gettext.
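As a rough illustration of "spoon feeding" the Accept-Language header, here is a sketch that maps the header onto a locale string. The function pick_locale() and the list of supported locales are hypothetical, and the gettext calls are left as comments because they need the extension and installed system locales.

```php
<?php
// Sketch only: pick_locale() and the supported-locale list are hypothetical.

// Map an Accept-Language header onto one of the locales the application ships.
function pick_locale($header, array $supported, $default)
{
    $candidates = array();
    foreach (explode(',', (string) $header) as $i => $part) {
        $pieces = explode(';', trim($part));
        $tag = strtolower(trim($pieces[0]));
        if ($tag === '' || $tag === '*') {
            continue;
        }
        // Quality defaults to 1.0 when no ";q=" parameter is given.
        $q = 1.0;
        if (isset($pieces[1]) && preg_match('/^\s*q=([0-9.]+)\s*$/i', $pieces[1], $m)) {
            $q = (float) $m[1];
        }
        if ($q > 0) {
            $candidates[] = array($tag, $q, $i);
        }
    }
    // Highest quality first; header order breaks ties.
    usort($candidates, function ($a, $b) {
        if ($a[1] == $b[1]) {
            return $a[2] - $b[2];
        }
        return ($a[1] > $b[1]) ? -1 : 1;
    });
    foreach ($candidates as $c) {
        // Try the full tag (e.g. "de-ch"), then its primary subtag ("de").
        $parts = explode('-', $c[0]);
        foreach (array($c[0], $parts[0]) as $tag) {
            if (isset($supported[$tag])) {
                return $supported[$tag];
            }
        }
    }
    return $default;
}

$supported = array('en' => 'en_US.UTF-8', 'de' => 'de_DE.UTF-8');
$header = isset($_SERVER['HTTP_ACCEPT_LANGUAGE']) ? $_SERVER['HTTP_ACCEPT_LANGUAGE'] : '';
$locale = pick_locale($header, $supported, 'en_US.UTF-8');

// With the gettext extension loaded, the result would then be passed on
// explicitly, for example:
//   setlocale(LC_MESSAGES, $locale);
//   bindtextdomain('messages', __DIR__ . '/locale');
//   textdomain('messages');
echo $locale, "\n";
```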
You will almost always have less trouble with a handcrafted solution. But that being said, the gettext API is unbeatable in conciseness. _("orig text") is more or less the optimal interface for translating text.
If you want to code something up yourself, I recommend you concentrate on that.
Use a simple function name. In lieu of _() a few php apps use the double underscore __(). Don't adopt any library that makes it cumbersome to actually use translated strings. (E.g. if using Zend Framework, always write a wrapper function.)
Accept raw English text as input. Avoid mnemonic translation keys (e.g. BTN_SUBMT).
Do not under any circumstances use the database for translation catalogues. Those texts are runtime data, not application data. (For a bad example see osCommerce.)
You can often get away with PHP array scripts lang/nl.php containing nothing but $text["orig english"] = "dutch here";, which are easy to utilize from whatever access method you use.
Also avoid pressing everything into that system. Sometimes it's unavoidable to adopt a second mechanism for longer texts. I for example used template/mail.EN.txt for bigger blobs.
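The recommendations above could be sketched roughly as follows. The global $text catalog mirrors the lang/nl.php idea; the sprintf-style extra arguments and the sample strings are assumptions added for illustration.

```php
<?php
// Sketch only: the global $text catalog and the sprintf-style arguments are
// assumptions layered on the lang/nl.php idea.

// Normally these lines would live in lang/nl.php and be pulled in via include.
$text = array();
$text["Save"] = "Opslaan";
$text["Hello %s"] = "Hallo %s";

// Short name, raw English in, translated text out.
function __($english)
{
    global $text;
    // Untranslated strings fall back to the English original.
    $translated = isset($text[$english]) ? $text[$english] : $english;
    // Any further arguments fill printf-style placeholders.
    $args = array_slice(func_get_args(), 1);
    return $args ? vsprintf($translated, $args) : $translated;
}

echo __("Save"), "\n";
echo __("Hello %s", "Anna"), "\n";
echo __("Cancel"), "\n";
```

Falling back to the English input means a missing translation degrades to readable text instead of a raw key.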
Gettext is not thread-safe.
Before deciding to implement your own I suggest you take a look at Zend_Translate. It has native support for a gettext adapter, as well as TMX, CSV, INI, Array and more formats. It should also be easy enough to write your own adapter if your preferred format isn't supported, such as database storage.
I am currently researching the best methods to integrate i18n into projects.
There are several methods I have thought of for doing this. The first is a database scheme to store the strings and the relevant locale, but the problem is that selecting the strings would not be that easy, because I would not like to perform queries like these:
SELECT text FROM locales WHERE locale = 'en_GB' AND text_id = 245543
Or
SELECT text FROM locales WHERE locale = 'en_GB' AND text_primary = 'hello'
The next method would be to store them in files such as locales/en_gb/login/strings.php and then access them via a class developed specifically for that, like so:
$Language = Registry::Construct('Language',array('en_GB'));
echo $Language->login->strings->hello;
The issue with this is that I would have to build a system to update these files via an administration panel, which is very time-consuming: not just building the system to manage the strings, but actually managing the strings as the site grows.
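For reference, the file-based accessor could be sketched with a magic __get that walks the directory tree. LanguageNode and the convention that each file returns an array are assumptions; the question does not show how Registry::Construct or its Language class work.

```php
<?php
// Sketch only: LanguageNode and the "file returns an array" convention are
// assumptions, not the asker's actual class.

class LanguageNode
{
    private $path;

    public function __construct($path)
    {
        $this->path = $path;
    }

    // Each property access descends one level: directory, then file, then key.
    public function __get($name)
    {
        $next = $this->path . '/' . $name;
        if (is_dir($next)) {
            return new LanguageNode($next);
        }
        if (is_file($next . '.php')) {
            // The file is expected to "return array(...)"; expose it as an object.
            $strings = include $next . '.php';
            return (object) $strings;
        }
        throw new RuntimeException('No such language path: ' . $next);
    }
}

// Demo fixture: build locales/en_gb/login/strings.php in a temporary directory.
$root = sys_get_temp_dir() . '/locales_demo_' . uniqid();
mkdir($root . '/en_gb/login', 0777, true);
file_put_contents(
    $root . '/en_gb/login/strings.php',
    "<?php return array('hello' => 'Hello, please log in');"
);

$Language = new LanguageNode($root . '/' . strtolower('en_GB'));
echo $Language->login->strings->hello, "\n";
```

This only covers reading; the administration side the question worries about would still have to rewrite those files.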
What other methods are there that would be beneficial for a large system?
Is there any automated way to do "translation" as such?
Should I stick with a database method and build a system for users to translate strings with rating / suggest better version ?
What systems have you tried in the past and should I look into them or totally avoid them.
In addition to gettext already mentioned, PHP 5.3 has native Internationalization support
If that's not an option, consider using Zend Framework's Zend_Translate, Zend_Locale and related components for that. Zend_Translate supports a number of adapters, including but not limited to simple arrays, gettext, XmlTm and others.
I've implemented an XML translation utility as part of a bigger project. You can find it here, and a sample translation file is here (en_US).
The most impressive method to study is Drupal's implementation. Second best would be WordPress. Both use gettext and .pot/.po/.mo files for localization. And the good thing is that there is a beautiful open-source .po editor called Poedit. It's available for Windows, which gives it wider appeal, as well as for Mac and Linux. Check it out here: http://www.poedit.net/
Have a look at the Gettext (http://php.net/manual/en/book.gettext.php) library.
Don't put your text into a database. That'll just make life hard on the translation team.
I am building a website and it needs to be in 7 languages.
I was wondering if there is a good practice that can be applied to make a PHP script multilingual?
Easy for me
Easy for the translators
Also, what do you think: should I store it in a DB, in XML or in a PHP file?
There are plenty of options for storing translations:
TMX: A relatively new XML format for translations. Seems to be gaining in popularity.
Gettext is another open format for translations. Been the de-facto standard for a long time.
ini files - easy to edit, very simple format
PHP files (arrays) - easy to edit for PHP programmers, good performance
CSV format - relatively simple to use.
I'd suggest you use something like Zend_Translate which supports multiple adapters and provides a basic approach to embedding translations in your application.
Contrary to daddz I would recommend against using gettext in PHP:
The locale setting is per-process. This means that when you are working with a multithreaded apache or any other multithreaded webserver running PHP in-process, calling setlocale in one thread will affect the other threads.
Because you can't know which thread/process is handling which request, you'll run into awful problems with users intermittently getting the wrong locale.
The locale you set in PHP has influence on functions like printf or even strtotime. You will certainly get bit by "strange" number formats arriving in your backend code if you work with gettext/setlocale
Use any of the other solutions linked to by Eran or quickly do something yourself (PHP arrays work very nicely). Also use the intl extension, which will be in core PHP 5.3, for number and date formatting and collation.
Using gettext on a web based solution over and over proved to be quite like opening the proverbial can of worms.
I'd suggest Gettext.
It's cross-platform, open-source, widely used and available for php: PHP Gettext
I have built multilingual CMS. All content was stored in a database, with main tables for common (not language specific values) and separate tables for the language specific content.
For instance, let us imagine storing products - we have a 'products' table (contains unique_id, date created, image urls etc etc) and a 'product_local' table (contains any language specific fields).
Using this method it is very easy to maintain content.
I have no experience on gettext so no comment on that topic, but I have built a few multi-lingual sites using the following methods:
METHOD 1
I wouldn't say my format is the best, just that it's effective. I've also used arrays, depending on where the content is stored.
For example, I'll have an associative array of text with the indexes identifying which text:
$text['english']['welcome'] = "Welcome to my site. blah blah blah";
$text['english']['login'] = "Please enter your username and password to login";
And maybe set your language with a constant or config variable.
METHOD 2
I've built two sites with identical structures and back-ends, but each one used a different database and was maintained separately: data_french, data_english.
You may find this article on the topic an interesting read:
http://cubicspot.blogspot.com/2011/12/cross-platform-multilingual-support-in.html
The author advocates a "lazy programmer" strategy - do it only if you need multilingual stuff - and seems to recommend the PHP array approach with IANA language codes. The article is kind of vague though.
Check this forum. I think you'd probably need a different approach if you have somebody helping you with translation.
Most efficient approach for multilingual PHP website
I have been looking at a few options for enabling localization and internationalization of a dynamic php application. There appears to be a variety of tools available such as gettext and Yahoo's R3 and I am interested in hearing from both developers and translators about which tools are good to use and what functionality is important in easing the task of implementation and translation.
PHP's gettext implementation works very smoothly. And .po files with Poedit and gettext are about as good a way as you can get to deal with localization, bearing in mind that no solution of this kind can completely handle the complexities of the various languages. For example, the gettext method is very good with plural forms, but nothing I've seen can handle things like conjugation.
For more info see my post here: How do you build a multi-language web site?
We've been tinkering with Zend_Translate, since we use the Zend Framework anyway. It's very well documented and so far extremely solid.
In the past, I've pretty much used my own home-grown solution mostly. Which involves language files with constants or variables which hold all text parts and are just echo'ed in the view/template later on.
As for gettext, in the past I've heard references about PHP's gettext implementation being faulty, but I can't really back that up nor do I have any references right now.
There are a number of useful extensions in pecl:
http://pecl.php.net/packages.php?catpid=28&catname=Internationalization
In particular, you may want to check out php-intl, which provides most of the key i18n functions from International Components for Unicode (ICU)
A database-driven solution for displaying messages is not always a good one. I worked on a site with more than 15 languages, and translations were an issue.
So our design was:
a translation app in PHP/MySQL (translation access, etc.)
the translations are then written out as PHP arrays
these arrays are also cached in APC to speed up the site.
So to load a particular language you only need to do an include,
like:
<?php
include('lang/en.php');
include('lang/en_us.php'); // this file overrides few keys from the last one.
?>
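A sketch of how the override and the cache could fit together, assuming each language file assigns into a $text array. load_catalog(), cached_catalog() and the sample strings are hypothetical, and APCu is swapped in here for the original APC user cache.

```php
<?php
// Sketch only: load_catalog(), cached_catalog() and the $text convention are
// assumptions; APCu stands in for the APC cache mentioned above.

// Include the files in order inside one function scope, so each later file
// can overwrite keys set by an earlier one.
function load_catalog(array $files)
{
    $text = array();
    foreach ($files as $file) {
        if (is_file($file)) {
            include $file; // each file does $text['key'] = '...';
        }
    }
    return $text;
}

// Use the shared-memory cache when it is available, otherwise just load.
function cached_catalog($cacheKey, array $files)
{
    $useApcu = function_exists('apcu_fetch');
    if ($useApcu) {
        $hit = apcu_fetch($cacheKey, $ok);
        if ($ok) {
            return $hit;
        }
    }
    $text = load_catalog($files);
    if ($useApcu) {
        apcu_store($cacheKey, $text);
    }
    return $text;
}

// Demo fixture: a base file plus a regional file that overrides one key.
$dir = sys_get_temp_dir() . '/lang_demo_' . uniqid();
mkdir($dir);
file_put_contents($dir . '/en.php', "<?php \$text['colour'] = 'colour'; \$text['hello'] = 'Hello';");
file_put_contents($dir . '/en_us.php', "<?php \$text['colour'] = 'color';");

$text = cached_catalog('lang:en_us', array($dir . '/en.php', $dir . '/en_us.php'));
echo $text['colour'], ' / ', $text['hello'], "\n";
```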
Xataface can be used to quite easily internationalize an arbitrary PHP/MySQL application. It supports translation of both your static text and your database data. All you have to do is add a line or two of code in a couple of places in your application and it's good to go.
http://xataface.com/documentation/tutorial/internationalization-with-dataface-0.6