I am currently researching the best methods to integrate i18n into projects.
There are several methods I have thought of. The first is a database schema that stores the strings along with their locale, but selecting the strings would not be that easy, because I would not like to perform queries like so:
SELECT text FROM locales WHERE locale = 'en_GB' AND text_id = 245543
Or
SELECT text FROM locales WHERE locale = 'en_GB' AND text_primary = 'hello'
The next method would be to store them in files such as locales/en_gb/login/strings.php and then access them via a class developed specifically for this, like so:
$Language = Registry::Construct('Language',array('en_GB'));
echo $Language->login->strings->hello;
The issue with this is that I would have to build a system to update these files via an administration panel, which is very time-consuming: not just building the system to manage the strings, but actually managing the strings as the site grows.
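A minimal sketch of the file-based approach, assuming each file such as locales/en_GB/login/strings.php simply returns an associative array. The class names `Language` and `LanguageNode` and the directory layout are hypothetical, and `Registry::Construct` from the snippet above is not reproduced here:

```php
<?php
// Hypothetical lazy loader: $lang->login->strings->hello maps to
// <baseDir>/<locale>/login/strings.php, which must `return [...]`.
class LanguageNode
{
    private string $path;

    public function __construct(string $path)
    {
        $this->path = $path;
    }

    public function __get(string $name)
    {
        $candidate = $this->path . '/' . $name;
        if (is_file($candidate . '.php')) {
            // Leaf: a PHP file returning an array of strings.
            return (object) (require $candidate . '.php');
        }
        if (is_dir($candidate)) {
            // Directory: descend one level and keep resolving lazily.
            return new LanguageNode($candidate);
        }
        throw new OutOfBoundsException("No translations at $candidate");
    }
}

class Language extends LanguageNode
{
    public function __construct(string $baseDir, string $locale)
    {
        parent::__construct($baseDir . '/' . $locale);
    }
}

// Usage with a throwaway directory so the sketch runs on its own.
$base = sys_get_temp_dir() . '/locales_' . uniqid();
mkdir($base . '/en_GB/login', 0777, true);
file_put_contents(
    $base . '/en_GB/login/strings.php',
    "<?php return ['hello' => 'Hello'];"
);

$Language = new Language($base, 'en_GB');
echo $Language->login->strings->hello, PHP_EOL; // Hello
```

This version re-reads the file on every access; a real implementation would probably keep loaded arrays in memory.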
What other methods are there that would be beneficial for a large system?
Is there any automated way to do 'translation' as such?
Should I stick with a database method and build a system for users to translate strings, with ratings and suggestions for better versions?
What systems have you tried in the past, and should I look into them or avoid them entirely?
In addition to gettext, already mentioned, PHP 5.3 has native internationalization support.
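As a rough illustration of what that native support offers, here is a sketch using `MessageFormatter` from the intl extension to pick a plural form. The helper name `formatItemCount` is made up for this example, and in practice the pattern would come from a per-locale translation file instead of being hard-coded in English:

```php
<?php
// Hypothetical helper: pluralised message via ext/intl when available.
function formatItemCount(string $locale, int $count): string
{
    if (class_exists('MessageFormatter')) {
        // ICU message syntax; '#' is replaced by the formatted number.
        $pattern = '{0, plural, one{# item} other{# items}}';
        $result = MessageFormatter::formatMessage($locale, $pattern, [$count]);
        if ($result !== false) {
            return $result;
        }
    }
    // Naive English-only fallback when intl is not installed.
    return $count . ($count === 1 ? ' item' : ' items');
}

echo formatItemCount('en_US', 1), PHP_EOL; // 1 item
echo formatItemCount('en_US', 3), PHP_EOL; // 3 items
```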
If that's not an option, consider using Zend Framework's Zend_Translate, Zend_Locale and related components for that. Zend_Translate supports a number of adapters, including but not limited to simple arrays, gettext, XmlTm and others.
I've implemented an XML translation utility as part of a bigger project. You can find it here, and a sample translation file is here (en_US).
The most impressive method to study is Drupal's implementation. Second best would be WordPress. Both use gettext and .pot/.po/.mo files for localization. The good thing is that there is a beautiful open source .po editor called Poedit. It is available for Windows, which gives it wider appeal, and also for Mac and Linux. Check it out here: http://www.poedit.net/
Have a look at the Gettext (http://php.net/manual/en/book.gettext.php) library.
Don't put your text into a database. That'll just make life hard on the translation team.
Related
Which is better:
gettext
custom MySQL+cache based functionality
Gettext is a sort of built-in feature, so I assume it's tuned for performance. Using Poedit is a pain and impossible to show to any client.
A custom implementation allows for a simple translation interface, but might be heavy on PHP/DB usage.
I suppose the question is: which one would you use, and when?
Localization is difficult. It is really difficult. It's not just "pairs of words" => "Wortpaare", it's a lot more complex than that. What most people forget when they look at gettext and go "Ugh, ugly" is that the localization process is a lot more important than the technical details of the implementation. That's because the actual translators are typically not programmers and are probably not even in-house. This causes a lot more headaches than you may think. gettext is really old, is battle tested and has a huge toolchain behind it that is tuned to support this process. If you want to do i18n and l10n properly, you need a powerful system. gettext is that and has support from a wide range of tools. Your Homebrewed Translation System™ does not.
First of all, you need a robust system to extract translatable strings. Without being able to automatically and reproducibly extract translatable strings from source code, you have a mountain of work for each new string you want to translate. In gettext, xgettext does that.
Next, you need a tool to synchronize the extracted strings with already existing translations in a way that no translations are lost and that only slightly changed translations are kept if possible. In gettext, msgmerge does that.
Next, you want a way to add extra information to strings. You want to be able to group them by category, "domain" and context, you may want to add comments for the translator to the source code and you may want translators to be able to add comments to the translations. gettext supports all that.
Next, you want a file format that has good support from a variety of tools, since you may be sending your files to China to get them translated there. The reason you may be sending them away to external translators is also the reason you need a good synching tool to merge changes, since this can be a very asynchronous process. PO files are very well supported, because gettext is so old. There are many open source and commercial tools that support the localization process at many levels, depending on your specific needs.
Do not underestimate the task of localization, choose a tool that is well suited for the process and learn it. gettext is a great tool, if admittedly not the most beginner friendly.
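To make the points about extraction, translator comments and context concrete, here is a hedged sketch of how source strings might be marked up in PHP. The wrapper names `t` and `pt` are hypothetical. PHP's gettext extension has no `pgettext()`, so the context helper uses the common workaround of joining context and msgid with the `\x04` byte. With wrappers like these, xgettext would need to be told about them, for example via `--keyword=t --keyword=pt:1c,2 --add-comments=TRANSLATORS`:

```php
<?php
// Sketch only: falls back to returning the msgid when ext/gettext is absent.
function t(string $msgid): string
{
    return function_exists('gettext') ? gettext($msgid) : $msgid;
}

// Context-aware lookup: context and msgid are joined with the EOT byte
// (\x04), which is how compiled MO files store msgctxt entries.
function pt(string $context, string $msgid): string
{
    $key = $context . "\x04" . $msgid;
    $translated = t($key);
    // Untranslated lookups return the key itself; show the plain msgid.
    return $translated === $key ? $msgid : $translated;
}

// TRANSLATORS: shown on the login button, keep it short.
echo t('Log in'), PHP_EOL;

// Same English word, two meanings, disambiguated by context.
echo pt('menu', 'Open'), PHP_EOL;
echo pt('status', 'Open'), PHP_EOL;
```

With no catalog bound, every call simply returns the English source string, which is also what users see while a translation is still missing.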
For what it's worth, here's my gettext extension for Twig, which makes gettext for PHP even better.
Maybe you should look into Memcached, which you can use in combination with MySQL.
It's very useful for fetching data which doesn't change too often, like translations.
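A minimal cache-aside sketch of that idea. The function `loadTranslations` and the `ArrayCache` stand-in are hypothetical; the stand-in only mimics the `get`/`set` shape of PHP's `Memcached` class so the example runs without a server, and the database call is faked with a closure:

```php
<?php
// Cache-aside sketch. $cache is any object with Memcached-style
// get($key) (false on a miss) and set($key, $value, $ttl) methods.
function loadTranslations(object $cache, callable $loadFromDb, string $locale): array
{
    $key = 'translations.' . $locale;
    $cached = $cache->get($key);
    if ($cached !== false) {
        return $cached;
    }
    $strings = $loadFromDb($locale);   // e.g. a SELECT against MySQL
    $cache->set($key, $strings, 3600); // keep for an hour
    return $strings;
}

// In-memory stand-in so the example runs without a Memcached server.
class ArrayCache
{
    private array $items = [];

    public function get(string $key)
    {
        return $this->items[$key] ?? false;
    }

    public function set(string $key, $value, int $ttl = 0): bool
    {
        $this->items[$key] = $value; // TTL ignored in this stand-in
        return true;
    }
}

$dbCalls = 0;
$loader = function (string $locale) use (&$dbCalls): array {
    $dbCalls++;                        // pretend this hits the database
    return ['hello' => 'Hello'];
};

$cache = new ArrayCache();
loadTranslations($cache, $loader, 'en_GB');
$strings = loadTranslations($cache, $loader, 'en_GB');
echo $strings['hello'], ' (database calls: ', $dbCalls, ')', PHP_EOL;
```

The second call is served from the cache, so the loader runs only once.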
Gettext is a very old format. It uses files to store translations. It's clumsy, especially when you have translations by the thousands, let's say 20,000. Managing a PO file with 20,000 translation strings is a nightmare; across 50 languages it is impossible. Then you have to actually compile it into an MO file. No thanks. It might have made sense back in the 1990s, not now.
Databases instead are powerful. Like really powerful. Name what you need and you can get it. In a second they can tell you exactly:
Which of the translation strings are not translated in which language
When was the translation first created and by whom
When was the translation last updated and by whom
Full history of every translation with person who has done the change
You can have all texts pre-translated in materialized views and get them with one select statement
Order translation strings in alphabetic order, in pages for page by page view and edit
Set which user can update exactly which translations
With some simple HTML web forms, anyone anywhere in the world can translate your application in real time, within seconds, with full history, and can comment on every translation pair, receive and read replies, flag translations as to-do, etc.
Have analytics in seconds of who has made how many translations, in the last day, week, month, year - so you can give incentives out
Still want PO files? You can get these created by your database on a schedule that you need them
Missing translations? Your database can send automatic emails, SMS to the responsible translator for this language.
A translation has been updated by the translator? Now the database can send an email to the responsible reviewer to approve
Need the translation fast? Your database can call an API to have it translated right now, then send email to the responsible person to review
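As one concrete instance of the list above, here is a sketch of the "which strings are not translated in which language" query. It uses an in-memory SQLite database through PDO so it runs on its own; the table and column names are hypothetical, and the same SQL should carry over to MySQL with little change:

```php
<?php
// Sketch with an in-memory SQLite database; schema names are assumptions.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE strings (id INTEGER PRIMARY KEY, source TEXT)');
$db->exec('CREATE TABLE locales (code TEXT PRIMARY KEY)');
$db->exec('CREATE TABLE translations (
    string_id INTEGER, locale TEXT, body TEXT, updated_by TEXT,
    PRIMARY KEY (string_id, locale))');

$db->exec("INSERT INTO strings VALUES (1, 'Hello'), (2, 'Goodbye')");
$db->exec("INSERT INTO locales VALUES ('de_DE'), ('fr_FR')");
$db->exec("INSERT INTO translations VALUES
    (1, 'de_DE', 'Hallo', 'anna'),
    (2, 'de_DE', 'Auf Wiedersehen', 'anna'),
    (1, 'fr_FR', 'Bonjour', 'luc')");

// Every (string, locale) pair that has no translation row yet.
$missing = $db->query('
    SELECT l.code AS locale, s.source
    FROM strings s
    CROSS JOIN locales l
    LEFT JOIN translations t
        ON t.string_id = s.id AND t.locale = l.code
    WHERE t.string_id IS NULL
    ORDER BY l.code, s.id
')->fetchAll(PDO::FETCH_ASSOC);

foreach ($missing as $row) {
    echo $row['locale'], ': ', $row['source'], PHP_EOL; // fr_FR: Goodbye
}
```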
I am creating a website and it has to be multi-language. The translations have to be written by hand and fixed in advance (NO auto-translation API). My question is: which is more efficient?
Create one file set for each language.
Create one file set and show text through PHP constants.
I also thought of making a MySQL query to get an array with all translations at the beginning of the document.
Note: There will not be really large texts.
Longer term, your best option is going to be using one file set for each language. If you use an industry-standard format, such as GNU gettext, PHP has built-in support. Also, third-party translation companies and translation tools generally support the format, so long-term site maintenance depends less on developers.
I'm using Zend Translate for a website I'm working on.
Most of the components in Zend Framework can be used as standalone components. I'm using the full stack but it shouldn't be a major problem to use only Zend Translate.
As for using a database to get translations, I think it depends on the type of content you are dealing with. For instance, for Joomla! there are components that store different versions of the same article, in different languages.
I would recommend Zend Translate, as you have different options to get the translations from: PHP arrays, INI files, gettext, xml.
You can even extend the adapter class to create a database backend adapter.
Hope it helps.
I think it will help you: http://www.youtube.com/watch?v=v7vCp_TFcdU
It uses the session to store the chosen language and an array for each language.
I've been surprised by how little I've found on externalizing strings in PHP. Does everyone use gettext, or is there some other framework or tool that I'm not aware of?
Zend_Translate / Zend_Locale are nice and very flexible. They do not need the whole Zend Framework to be present. They support gettext .mo/.po files but also CSV and other formats.
Hope this library helps you:
The i18n package is a bunch of classes for internationalization. It gives you the possibility to maintain multilanguage webpages more easily. The translation strings are stored in flat text files, special Gettext files which are basically precompiled translation files or in a MySQL database. And it works independently from PHP’s setlocale function.
I would say that you should use gettext because it is mature and easy to set up. Also, by using gettext you will be able to extend its usage to other types of sources than PHP. Consider the PO file format the standard for this.
I have been working in the i18n area for many years, and I can tell you that gettext will give you the best results with minimal effort if you have more than 50-100 strings in your project.
Once you've set the foundation for localizing your application, if you find yourself needing to manage the translations and/or just get the actual translation done, we have what I like to think (obviously :) is a pretty cool tool called String - http://mygengo.com/string
String is great for not just managing translations, where you can invite others to projects to help with translation, but you can order translations right in the service too. We've integrated our API into String to showcase our API and the ability to see status updates for numerous (100s...1000s) of jobs, translated by real people!
If you're interested in the API itself, we held a bounty contest not long ago with some fun winners for a number of platforms (Wordpress, Django, etc.): http://mygengo.com/services/api/lab/winners/
Just thought I'd share.
I am building a website and it needs to be in 7 languages.
I was wondering if there is a good practice that can be applied to get a multilingual PHP script?
Easy for me
Easy for the translators
Also, what do you think: should I store it in a DB, XML or a PHP file?
There are plenty of options for storing translations:
TMX: A relatively new XML format for translations. Seems to be gaining in popularity.
Gettext is another open format for translations. Been the de-facto standard for a long time.
ini files - easy to edit, very simple format
PHP files (arrays) - easy to edit for PHP programmers, good performance
CSV format - relatively simple to use.
I'd suggest you use something like Zend_Translate which supports multiple adapters and provides a basic approach to embedding translations in your application.
Contrary to daddz I would recommend against using gettext in PHP:
The locale setting is per-process. This means that when you are working with a multithreaded Apache or any other multithreaded web server running PHP in-process, calling setlocale in one thread will affect the other threads.
Because you can't know which thread/process is handling which request, you'll run into awful problems with users intermittently getting the wrong locale.
The locale you set in PHP has influence on functions like printf or even strtotime. You will certainly get bitten by "strange" number formats arriving in your backend code if you work with gettext/setlocale.
Use any of the other solutions linked to by Eran or quickly do something yourself (PHP arrays work very nicely). Also use the intl extension, which will be in core PHP 5.3, for number and date formatting and collation.
Using gettext on a web based solution over and over proved to be quite like opening the proverbial can of worms.
I'd suggest Gettext.
It's cross-platform, open-source, widely used and available for php: PHP Gettext
I have built multilingual CMS. All content was stored in a database, with main tables for common (not language specific values) and separate tables for the language specific content.
For instance, let us imagine storing products - we have a 'products' table (contains unique_id, date created, image urls etc etc) and a 'product_local' table (contains any language specific fields).
Using this method it is very easy to maintain content.
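A rough sketch of that two-table layout, using an in-memory SQLite database through PDO so it runs on its own. Only `products`, `product_local` and `unique_id` come from the description above; the other column names, the helper `fetchProduct`, and the fallback to English when a language row is missing are assumptions added for illustration:

```php
<?php
// In-memory SQLite sketch; most column names here are assumed.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE products (
    unique_id INTEGER PRIMARY KEY, date_created TEXT, image_url TEXT)');
$db->exec('CREATE TABLE product_local (
    product_id INTEGER, locale TEXT, name TEXT,
    PRIMARY KEY (product_id, locale))');
$db->exec("INSERT INTO products VALUES (1, '2010-01-01', '/img/chair.png')");
$db->exec("INSERT INTO product_local VALUES
    (1, 'en', 'Chair'), (1, 'de', 'Stuhl')");

// Fetch a product in the requested language, falling back to English
// when that language has no row. Returns null for an unknown product.
function fetchProduct(PDO $db, int $id, string $locale): ?array
{
    $stmt = $db->prepare('
        SELECT p.unique_id, p.image_url,
               COALESCE(wanted.name, fallback.name) AS name
        FROM products p
        LEFT JOIN product_local wanted
            ON wanted.product_id = p.unique_id AND wanted.locale = :locale
        LEFT JOIN product_local fallback
            ON fallback.product_id = p.unique_id AND fallback.locale = :fallback
        WHERE p.unique_id = :id');
    $stmt->execute([':locale' => $locale, ':fallback' => 'en', ':id' => $id]);
    return $stmt->fetch(PDO::FETCH_ASSOC) ?: null;
}

echo fetchProduct($db, 1, 'de')['name'], PHP_EOL; // Stuhl
echo fetchProduct($db, 1, 'fr')['name'], PHP_EOL; // Chair
```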
I have no experience with gettext, so no comment on that topic, but I have built a few multilingual sites using the following methods:
METHOD 1
I wouldn't say my format is the best, just that it's effective. I've also used arrays, depending on where the content is stored.
For example, I'll have an associative array of text with the indexes identifying which text:
$text['english']['welcome'] = "Welcome to my site. blah blah blah";
$text['english']['login'] = "Please enter your username and password to login";
And maybe set your language with a constant or config variable.
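Put together, METHOD 1 might look like the sketch below. The constant `SITE_LANGUAGE`, the helper `text()` and the fallback to English are assumptions for illustration, not part of the description above:

```php
<?php
$text = [
    'english' => [
        'welcome' => 'Welcome to my site.',
        'login'   => 'Please enter your username and password to login',
    ],
    'french' => [
        'welcome' => 'Bienvenue sur mon site.',
    ],
];

// Hypothetical config constant selecting the active language.
const SITE_LANGUAGE = 'french';

// Look up a key in the active language, then in English, then give up
// and return the key itself so a missing string is visible on the page.
function text(array $text, string $key, string $language = SITE_LANGUAGE): string
{
    return $text[$language][$key]
        ?? $text['english'][$key]
        ?? $key;
}

echo text($text, 'welcome'), PHP_EOL; // Bienvenue sur mon site.
echo text($text, 'login'), PHP_EOL;   // falls back to the English string
```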
METHOD 2
I've built two sites with identical structures and back-ends, but each used a different database and was maintained separately: data_french, data_english.
You may find this article on the topic an interesting read:
http://cubicspot.blogspot.com/2011/12/cross-platform-multilingual-support-in.html
The author advocates a "lazy programmer" strategy - do it only if you need multilingual stuff - and seems to recommend the PHP array approach with IANA language codes. The article is kind of vague though.
Check this forum. I think you'd probably need a different approach if you have somebody helping you with translation.
Most efficient approach for multilingual PHP website
I have been looking at a few options for enabling localization and internationalization of a dynamic php application. There appears to be a variety of tools available such as gettext and Yahoo's R3 and I am interested in hearing from both developers and translators about which tools are good to use and what functionality is important in easing the task of implementation and translation.
PHP's gettext implementation works very smoothly. PO files with Poedit and gettext are about as good a way as you can get to deal with localization, bearing in mind that no solution of this kind can completely handle the complexities of the various languages. For example, the gettext method is very good on plural forms, but nothing I've seen can handle things like conjugation.
For more info see my post here: How do you build a multi-language web site?
We've been tinkering with Zend_Translate, since we use the Zend Framework anyway. It's very well documented and so far extremely solid.
In the past, I've mostly used my own home-grown solution, which involves language files with constants or variables that hold all text parts and are just echoed in the view/template later on.
As for gettext, in the past I've heard references about PHP's gettext implementation being faulty, but I can't really back that up nor do I have any references right now.
There are a number of useful extensions in pecl:
http://pecl.php.net/packages.php?catpid=28&catname=Internationalization
In particular, you may want to check out php-intl, which provides most of the key i18n functions from International Components for Unicode (ICU).
The database-driven solution for showing messages is not always the best one. I worked on a site with more than 15 languages, and translations were an issue.
So our design was:
a translation app in PHP/MySQL (translation access, etc.)
then translations are written out to PHP arrays
these arrays are also cached in APC to speed up the site.
So to localize for different languages you only need to do an include, like:
<?php
include('lang/en.php');
include('lang/en_us.php'); // this file overrides a few keys from the previous one.
?>
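One possible variant of the override idea, assuming each language file returns its array instead of assigning to a shared variable. The two arrays below stand in for lang/en.php and lang/en_us.php, and their keys are made up:

```php
<?php
// Stand-ins for lang/en.php and lang/en_us.php; in this variant each
// file would `return` its array instead of assigning to a global.
$en = [
    'colour' => 'Colour',
    'hello'  => 'Hello',
];
$enUs = [
    'colour' => 'Color', // only the keys that differ
];

// array_replace() keeps base keys and lets later arrays override them.
$strings = array_replace($en, $enUs);

echo $strings['colour'], ' / ', $strings['hello'], PHP_EOL; // Color / Hello
```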
Xataface can be used to quite easily internationalize an arbitrary PHP/MySQL application. It supports translation of both your static text and your database data. All you have to do is add a line or two of code in a couple of places in your application and it's good to go.
http://xataface.com/documentation/tutorial/internationalization-with-dataface-0.6