Trying to use gettext without country code ('es' instead 'es_ES') - php

I have the next files in my php project:
libraries/locale/es_ES/LC_MESSAGES/messages.po
libraries/locale/es_ES/LC_MESSAGES/messages.mo
libraries/locale/es/LC_MESSAGES/messages.po
libraries/locale/es/LC_MESSAGES/messages.mo
Both are the same file edited with PoEdit just differenced by Catalog->Properties->Language (es and es_ES respectively)
And this code into localization.php file
$language = "es_ES.UTF-8";
putenv("LANG=$language");
setlocale(LC_ALL, $language);
bindtextdomain(STRING_DOMAIN, LOCALE_PATH);
textdomain(STRING_DOMAIN);
echo "Test translation: "._('string to translate');
This code works fine and 'string to translate' is displayed correctly. However if I try to use the generic 'es' code:
$language = "es.UTF-8";
...String is not translated. Seems to be related to locales installed in my ubuntu (es_ES.utf8 exists but not es.utf8)
Can I force gettext to use es.UTF-8 file?

As a workaround, you can always use localedef to compile a new locale based on an existing one. To create es based on es_ES.UTF-8:
localedef -i es_ES -f UTF-8 es.UTF8
But here comes some important questions which covers areas beyond simple loading of translation files. Since your locale-specific information like date and time formats, measurements, etc are depending on installed locales, it is always a good idea to have a plan regarding use of locales. Assuming that you are using es_ES for Spanish (Spain), what is es? Is it intended for a specific flavor of Spanish like es_IC (Canary Islands) or some Latin American flavor?
Here is a clarifying example; if I want to design a customized repertoire of locales to cover various flavors of Spanish, I will do it like this; first I add those locales which are easily installable with locale-gen:
locale-gen es_ES.UTF-8 es_MX.UTF-8 es_AR.UTF-8
Then I would like to have es_US based on es_MX and es_419 (419 is the geographical area code for Latin America) based on es_AR:
localedef -i es_MX -f UTF-8 es_US.UTF-8
localedef -i es_AR -f UTF-8 es_419.UTF-8

Related

gettext only translate to English in PHP

I'm spanish, and making tests internacionalizing a text width PHP, i only get it translated to english.
I got this structure of files:
locale/en_US/LC_MESSAGES/con los ficheros messages.mo y messages.po
locale/es_ES/LC_MESSAGES/con los ficheros messages.mo y messages.po
locale/fr_FR/LC_MESSAGES/con los ficheros messages.mo y messages.po
Every files have the key word "Servicios" translated to each languaje.
And in PHP i have this code:
<?php
putenv("LANG=en_US");
setlocale(LC_ALL, "en_US");
bindtextdomain("messages", "locale");
textdomain("messages");
?>
When i put the code 'en_US' show the good translation, but when i change it to 'es_ES' or 'fr_FR' that way:
<?php
putenv("LANG=es_ES");
setlocale(LC_ALL, "es_ES");
?>
or
<?php
putenv("LANG=fr_FR");
setlocale(LC_ALL, "fr_FR");
?>
still showing the translation to English
I am working on Widnows 7 and the function
echo $_SERVER['HTTP_ACCEPT_LANGUAGE'] ;
returns to
"es-ES,es;q=0.8"
always,
Which problem could it be?
Thank you
It is quite likely that the languages are not installed on the server your running the script on - do you have shell access to the server? Then try
locale -a
to see which locales are installed. Also have a look here Is it feasible to rely on setlocale, and rely on locales being installed?
NOTE:
be careful with the LC_ALL setting, as it may introduce some unwanted conversions. For example, I used
setlocale (LC_ALL, "Dutch");
to get my weekdays in dutch on the page. From that moment on (as I found out many hours later) my floating point values from MYSQL where interpreted as integers because the Dutch locale wants a comma (,) instead of a point (.) before the decimals. I tried printf, number_format, floatval.... all to no avail. 1.50 was always printed as 1.00 :(
When I set my locale to :
setlocale (LC_TIME, "Dutch");
my weekdays are good now and my floating point values too.

NumberFormatter::SPELLOUT spellout-ordinal in russian and italian

this code works for english, spanish and german ordninal numbers, but with russian or italian ordninal numbers it doesn't work.
'ru-RU','it-IT' also don't work
I get for example in russian for 2 -> два (this is the cardinal number) , but I want the ordinal number and this would be here 2 -> второй.
I get for example in italian for 2 -> due (this is the cardinal number) , but I want the ordinal number and this would be here 2 -> secondo.
Update:
I found a solution with works in french, spain, german and some other languages:
maskuline ordinal numbers: %spellout-ordinal-maskuline
feminine ordinal numbers: %spellout-ordinal-feminine
russian and italian version doesn't work and I tried already with -maskuline/-feminine
$ru_ordinal = new NumberFormatter('ru', NumberFormatter::SPELLOUT);
$ru_ordinal->setTextAttribute(NumberFormatter::DEFAULT_RULESET, "%spellout-ordinal");
NumberFormatter is using ICU formatting.
As you can check here: http://saxonica.com/html/documentation/extensibility/config-extend/localizing/ICU-numbering-dates/ICU-numbering.html
... Russian (ru) has following formatting available:
spellout-cardinal-feminine (scf)
spellout-cardinal-masculine (scm)
spellout-cardinal-neuter (scne)
spellout-numbering (sn)
spellout-numbering-year (sny)
... and Italian (it):
spellout-cardinal-feminine (scf)
spellout-cardinal-masculine (scm)
spellout-numbering (sn)
spellout-numbering-year (sny)
spellout-ordinal-feminine (sof)
spellout-ordinal-masculine (som)
That is why you will not be able to set ordinal format for (ru) and following code:
$nFormat = new NumberFormatter('it', NumberFormatter::SPELLOUT);
$nFormat->setTextAttribute(NumberFormatter::DEFAULT_RULESET, "%spellout-ordinal-feminine");
var_dump($nFormat->format(42));
Will print:
string 'quaranta­duesima' (length=17)
Like you (propably) want.
EDIT:
Informations about used formatting with references to ICU: http://php.net/manual/en/numberformatter.create.php
Tested with PHP 5.4.x and ICU version => 51.2; ICU Data version => 51.2.
You can use shell command:
$ php -i | grep ICU
To check what version of ICU you have.
For latest ICU version you should propably install/update php-intl package: http://php.net/manual/en/intl.installation.php
EDIT 2:
I have created extension for NumberFormatter (so far with polish ordinals). Feel free to contribute another languages: https://github.com/arius86/number-formatter
Just a Recommendation, I am not sure if this works or have an Apache services open at this point of time as I am at college, but have you tried to put ru-RU for Russia. In PHP I personally put my Language Codes as "en-GB"
http://download1.parallels.com/SiteBuilder/Windows/docs/3.2/en_US/sitebulder-3.2-win-sdk-localization-pack-creation-guide/30801.htm
Here is a List I found on the internet with some to help you.

Get "list separator" character for any locale

Starting with only the locale identifier name (string) provided by clients, how or where do I look up the default "list separator" character for that locale?
The "list separator" setting is the character many different types of applications and programming languages may use as the default grouping character when joining or splitting strings and arrays. This is especially important for opening CSV files in spreadsheet programs. Though this is often the comma ",", this default character may be different depending on the machine's region settings. It may even differ between OS's.
I'm not interested in my own server environment here. Instead, I need to know more about the client's based off their locale identifier which they've given to me, so my own server settings are irrelevant. Also for this solution, I can not change the locale setting on this server to match a client's for the entire current process as a shortcut to look this value up.
If this is defined in the ICU library, I'm not able to find any way to look this value up using the INTL extension.
Any hints?
I am not sure if my answer will satisfy your requirements but I suggest (especially as you don't want to change the locale on the server) to use a function that will give you the answer:
To my knowledge (and also Wikipedia's it seems) the list separator in a CSV is a comma unless the decimal point of the locale is a comma, in that case the list separator is a semicolon.
So you could get a list of all locales that use a comma (Unicode U+002C) as separator using this command:
cd /usr/share/i18n/locales/
grep decimal_point.*2C *_* -l
and you could then take this list to determine the appropriate list separator:
function get_csv_list_separator($locale) {
$locales_with_comma_separator = "az_AZ be_BY bg_BG bs_BA ca_ES crh_UA cs_CZ da_DK de_AT de_BE de_DE de_LU el_CY el_GR es_AR es_BO es_CL es_CO es_CR es_EC es_ES es_PY es_UY es_VE et_EE eu_ES eu_ES#euro ff_SN fi_FI fr_BE fr_CA fr_FR fr_LU gl_ES hr_HR ht_HT hu_HU id_ID is_IS it_IT ka_GE kk_KZ ky_KG lt_LT lv_LV mg_MG mk_MK mn_MN nb_NO nl_AW nl_NL nn_NO pap_AN pl_PL pt_BR pt_PT ro_RO ru_RU ru_UA rw_RW se_NO sk_SK sl_SI sq_AL sq_MK sr_ME sr_RS sr_RS#latin sv_SE tg_TJ tr_TR tt_RU#iqtelif uk_UA vi_VN wo_SN");
if (stripos($locales_with_comma_separator, $locale) !== false) {
return ";";
}
return ",";
}
(the list of locales is taken from my own Debian machine, I don't know about the completeness of the list)
If you don't want to have this static list of locales (though I assume that this doesn't change that often), you can of course generate the list using the command above and cache it.
As a final note, according to RFC4180 section 2.6 the list separator actually never changes but rather fields containing a comma (so this also means floating numbers, depending on the locale) should be enclosed in double-quotes. Though (as linked above) not many people follow the RFC standard.
There's no such locale setting as "list separator" it might be software specific, but I doubt it's user specific.
However... You can detect user's locale and try to match the settings.
Get browsers locale: $accept_lang = $_SERVER['HTTP_ACCEPT_LANGUAGE']; this might contain a list of comma-separated values. Some browser don't send this though. more here...
Next you can use setlocale(LC_ALL, $accept_lang); and get available locale settings using $locale_info = localeconv(); more here...

PHP Gettext - No translation

I am trying to use the PHP gettext extension in order to translate some strings. All functions appear to return the correct values but calling gettext()/_() returns the original string only. The PO/MO files seem correct and I believe I have set the directories up correctly. I am running WAMP Server with PHP 5.3.10 on Windows (also tried running 5.3.4 and 5.3.8 because I have the installations).
Firstly, see /new2/www/index.php:
$locale = 'esn'; # returns Spanish_Spain.1252 in var dump
putenv("LC_ALL={$locale}"); // Returns TRUE
setlocale(LC_ALL, $locale); // Returns 'Spanish_Spain.1252'
$domain = 'messages';
bindtextdomain($domain, './locale'); // Returns C:\wamp\www\new2\www\locale
bind_textdomain_codeset($domain, 'UTF-8'); // Returns UTF-8
textdomain($domain); // Returns'messages'
print gettext("In the dashboard"); // Prints the original text, not the translation.
exit;
I have created the following file structure:
www/new2/www/locale/Spanish_Spain.1252/LC_MESSAGES/messages.mo
I have also tried replacing Spanish_Spain.1252 with: es_ES, esn, esp, Spanish, and Spanish_Spain.
The PO file used to generate the MO is like so (only the relevant entry given):
#: C:\wamp\www\new2/www/index.php:76
msgid "In the dashboard"
msgstr "TRANSLATED es_ES DASHBOARD"
This was generated using PoEdit. I have restarted Apache after adding any new .MO file. Also note that I was previously using Zend_Translate with Gettext and it was translating correctly. I wish to rely on the native gettext extension, though, in part because I am attempting to create a lightweight framework of my own.
Any help would be appreciated.
Edit: Amended directory structure. Note - will be able to try recent answers within 24hrs.
I set this up on my XAMPP instance and figure it out.
Flat out setlocale does not work on Windows, so what it returns is irrelevant.
For Windows you set the locale using the standard language/country codes (in this case es_ES is Spanish as spoken in Spain)
Under your locale directory create es_ES/LC_MESSAGES/. This where your messages.mo file lives.
$locale = 'es_ES';
putenv("LC_ALL={$locale}"); // Returns TRUE
$domain = 'messages';
bindtextdomain($domain, './locale');
bind_textdomain_codeset($domain, 'UTF-8');
textdomain($domain); // Returns'messages'
print gettext("In the dashboard");
exit;
I am not sure if this made a different, but I did two things when creating the po file. In poEdit under File -> Preferences I changed the Line ending format to Windows. And after I created the initial po with poEdit I opened the file in Notepad++ and switched the encoding type to UTF-8 as poEdit did not do this.
I hope this at least points you in the right direction.
References
PHP Localization Tutorial on Windows
Country Codes
Language Codes
Your code mentions this as the return value from bindtextdomain:
C:\wamp\www\new2\www\locale
With the setlocale of Spanish_Spain.1252 and textdomain of messages, calls to gettext will look in this path:
C:\wamp\www\new2\www\locale\Spanish_Spain.1252\LC_MESSAGES\messages.mo
But you created the file structure of:
www/new2/locale/Spanish_Spain.1252/LC_MESSAGES/messages.mo
^^
www/ missing here
Edit
Okay, so that didn't help. I've created a test script on Windows and using POEdit like you:
$locale = "Dutch_Netherlands.1252";
putenv("LC_ALL=$locale"); // 'true'
setlocale(LC_ALL, $locale); // 'Dutch_Netherlands.1252'
bindtextdomain("messages", "./locale"); // 'D:\work\so\l10n\locale'
textdomain("messages"); // 'messages'
echo _("Hello world"); // 'Hallo wereld'
My folder structure is like this:
D:\work\so\l10n\
\locale\Dutch_Netherlands.1252\LC_MESSAGES\messages.mo
\locale\Dutch_Netherlands.1252\LC_MESSAGES\messages.po
\test.php
Hope it helps, although it looks almost identical to yours. A few things I found online:
It's important to set the character set in .po file
Spaces inside the localization file might have a UTF8 alternative, so be wary of key lookups failing. Probably the best thing to test first is keys without spaces at all.
A suggestion: you may need the full locale for the .mo file. This is probably Spanish_Spain.UTF8 or esn_esn.UTF8 or esn_esp.UTF8 (not 1252, as you change the code base).
To track what directory it's looking for, you can install Process monitor (http://technet.microsoft.com/en-us/sysinternals/bb896645). It spews out bucket loads on info, but you should be able to find out which file/directory is being looked for.
(My other thought is to check file permissions - but if you already had something similar in Zend_Translate, then probably not the cause, but worth checking anyway).
Sorry if not good - but might give you a clue.
Look here. It works for me on windows and on linux also. The last values in the array works for windows. List of languages names can be found here. My catalogs are in
./locales/en/LC_MESSAGES/domain.mo
/cs/LC_MESSAGES/domain.mo
I have never tried using gettext on Windows, but each time I had problems with gettext on linux systems, the reason was that an appropriate language pack was not installed.
Problem can be also that when you change your *.po and *.mo files, you have to restart the Apache Server. This can be problem, so you can use workaround - always rename these files to some new name and they will be reloaded.

Adding support for i18n in PHP with gettext?

I've always heard about gettext - I know it's some sort of unix command to lookup a translation based on the string argument provided, and then produces a .pot file but can someone explain to me in layman's terms how this is taken care of in a web framework?
I might get around to looking at how some established framework has done it, but a layman's explanation would help because it just might help clear the picture a bit more before I actually delve into things to provide my own solution.
The gettext system echoes strings from a set of binary files that are created from source text files containing the translations in different languages for the same sentence.
The lookup key is the sentence in a "base" language.
in your source code you will have something like
echo _("Hello, world!");
for each language you will have a corresponding text file with the key and the translated version (note the %s that can be used with printf functions)
french
msgid "Hello, world!"
msgstr "Salut, monde!"
msgid "My name is %s"
msgstr "Mon nom est %s"
italian
msgid "Hello, world!"
msgstr "Ciao, mondo!"
msgid "My name is %s"
msgstr "Il mio nome è %s"
These are the main steps you need to go through for creating your localizations
all your text output must use gettext functions (gettext(), ngettext(), _())
use xgettext (*nix) to parse your php files and create the base .po text file
use poedit to add translation texts to the .po files
use msgfmt (*nix) to create the binary .mo file from the .po file
put the .mo files in a directory structure like
locale/de_DE/LC_MESSAGES/myPHPApp.mo
locale/en_EN/LC_MESSAGES/myPHPApp.mo
locale/it_IT/LC_MESSAGES/myPHPApp.mo
then you php script must set the locale that need to be used
The example from the php manual is very clear for that part
<?php
// Set language to German
setlocale(LC_ALL, 'de_DE');
// Specify location of translation tables
bindtextdomain("myPHPApp", "./locale");
// Choose domain
textdomain("myPHPApp");
// Translation is looking for in ./locale/de_DE/LC_MESSAGES/myPHPApp.mo now
// Print a test message
echo gettext("Welcome to My PHP Application");
// Or use the alias _() for gettext()
echo _("Have a nice day");
?>
Always from the php manual look here for a good tutorial

Categories