I want to use the NumberFormatter class from PHP's intl extension to display prices in a human-readable format. What our project needs:
The CLDR number pattern, and the currency and separator symbols will need to be configured through our code and not default to what Intl/ICU knows.
Our application will take care of the decimals. NumberFormatter should display any decimals that we pass on to it.
However, while experimenting with different configurations to find the exact combination that works for our project, I noticed some effects that I can't explain. The three formatters in the following code snippet are almost identical. Unlike the first one, the second one uses the U.S. dollar instead of the euro, and the third one has a custom currency symbol set. The output of the first formatter is what I expected, but when I change the currency or set a currency symbol, the MIN_FRACTION_DIGITS attribute is ignored, and the custom symbol is never applied.
<?php
$fmt = new NumberFormatter('de_DE', NumberFormatter::CURRENCY);
$fmt->setAttribute(NumberFormatter::MIN_FRACTION_DIGITS, 4);
echo $fmt->formatCurrency(1234567890.891234567890000, "EUR")."\n";
// Outputs 1.234.567.890,8912 €
$fmt = new NumberFormatter('de_DE', NumberFormatter::CURRENCY);
$fmt->setAttribute(NumberFormatter::MIN_FRACTION_DIGITS, 4);
echo $fmt->formatCurrency(1234567890.891234567890000, "USD")."\n";
// Outputs 1.234.567.890,89 $
$fmt = new NumberFormatter('de_DE', NumberFormatter::CURRENCY);
$fmt->setAttribute(NumberFormatter::MIN_FRACTION_DIGITS, 4);
$fmt->setSymbol(\NumberFormatter::CURRENCY_SYMBOL, '%');
echo $fmt->formatCurrency(1234567890.891234567890000, "EUR")."\n";
// Outputs 1.234.567.890,89 €
?>
The first table row under General Purpose Numbers of the Unicode CLDR number pattern documentation describes that when parsing currency patterns, the two zeroes in the decimal part of the pattern will need to be replaced by however many digits the application thinks is appropriate. The application here is ICU (the C library that PHP uses for this), and the MIN_FRACTION_DIGITS attribute does its job of letting me override default behavior in the first example, but not in the second or the third.
Can someone please explain this seemingly random change in behavior? Let me know if there is any additional information that you need.
I just found the following:
https://bugs.php.net/bug.php?id=63140
http://bugs.icu-project.org/trac/ticket/7667
[2012-10-05 08:21 UTC] jpauli#email.com
I confirm this is an ICU bug in 4.4.x branch.
Consider upgrading libicu, 4.8.x gives correct result
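To check whether a given PHP build is likely affected, the ICU version that intl was compiled against can be compared with the version named in the bug comment. This is a minimal sketch that assumes the 4.8 threshold from that comment is accurate. The helper name icu_is_at_least() is hypothetical; INTL_ICU_VERSION is the constant the intl extension defines.

```php
<?php
// Hypothetical helper: compares an ICU version string against a minimum version.
function icu_is_at_least(string $icuVersion, string $minimum = '4.8'): bool
{
    return version_compare($icuVersion, $minimum, '>=');
}

// INTL_ICU_VERSION only exists when the intl extension is loaded.
if (defined('INTL_ICU_VERSION')) {
    echo 'ICU ', INTL_ICU_VERSION, ': ',
        icu_is_at_least(INTL_ICU_VERSION) ? 'at or above 4.8' : 'older than 4.8, possibly affected',
        "\n";
}
```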
setlocale() takes a language and a country as its parameters.
money_format() takes a format and an amount.
But how can I tell PHP which currency to use?
What if I want to use EUR in Australia?
JavaScript can do it like this:
var formatter = new Intl.NumberFormat('en-AU', {
style: 'currency',
currency: 'EUR',
});
formatter.format(2500);
Chrome obviously understands that this is a foreign currency that needs to be formatted in the international currency format, like so: "EUR 2,500.00".
IE does not understand that, but it outputs something like "€2,500.00".
PHP (this is how I use it):
setlocale(LC_MONETARY, 'en-AU');
return money_format('%.2n', 2500);
gives "$2,500.00" (I know that Ubuntu is to blame here).
It takes some default currency for Australia and I cannot find a way to change it. Is there a formatting library or something that I'm missing that can help?
JavaScript does not require installing locales individually, like Ubuntu does.
Should we maybe rely on the browser in this case?
You can use the NumberFormatter in the intl extension instead.
$fmt = new NumberFormatter('en_AU', NumberFormatter::CURRENCY);
print $fmt->formatCurrency(2500, 'EUR');
It also doesn't require installing every locale individually, since intl ships with its own locale data.
When we write an invoice, we have to respect the local money format.
For example:
In France, you write 1000,00 €
In the USA, $ 1,000.00
I would like to know whether some PHP library handles this, especially whether the currency symbol goes on the left or on the right.
Edit:
I have never been downvoted like this, and I think my question wasn't asked that well. Sorry for that.
I already know the different formatting functions in PHP, and I understand that the formatter options have to be selected for each country and its money format. However, I don't have the time to do that job.
My objective is to format any money value for all possible locales in the world without registering all those locales.
Maybe somebody wrote a class that can do the job, but I didn't find it.
By the way, I know it is difficult. Take the EUR as an example:
In many countries they write EUR xxxxx. In some countries, it is written xxxx EUR.
Yep, by the intl module.
For currencies there is the NumberFormatter class:
http://www.php.net/manual/en/numberformatter.formatcurrency.php
http://ru2.php.net/number_format will help you.
You can simply set it up yourself using number_format(). Save a rule for each currency in the database, then apply it in the view according to the currency's type.
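The rule-per-currency idea could look roughly like this. It is only a sketch: the $rules array stands in for rows loaded from a database, and its keys and the function name format_money() are hypothetical. The two sample rules mirror the formats quoted in the question.

```php
<?php
// Hypothetical per-currency rules; in practice these would be loaded from a database.
$rules = [
    'EUR' => ['decimals' => 2, 'dec_point' => ',', 'thousands' => '',  'symbol' => '€', 'symbol_first' => false],
    'USD' => ['decimals' => 2, 'dec_point' => '.', 'thousands' => ',', 'symbol' => '$', 'symbol_first' => true],
];

function format_money(float $amount, string $currency, array $rules): string
{
    if (!isset($rules[$currency])) {
        throw new InvalidArgumentException("No formatting rule for $currency");
    }
    $r = $rules[$currency];
    $number = number_format($amount, $r['decimals'], $r['dec_point'], $r['thousands']);

    // Place the symbol before or after the number, separated by a space.
    return $r['symbol_first'] ? $r['symbol'] . ' ' . $number : $number . ' ' . $r['symbol'];
}

echo format_money(1000, 'EUR', $rules), "\n"; // 1000,00 €
echo format_money(1000, 'USD', $rules), "\n"; // $ 1,000.00
```

Keying the rules by currency alone is a simplification: as the question notes, the same currency is written differently in different countries, so a real table would probably need the locale as part of the key.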
The basic PHP function money_format() can handle your output quite well.
If you need more, check out the Money PHP library. It is very powerful.
From the Documentation:
<?php
$number = 1234.56;
// let's print the international format for the en_US locale
setlocale(LC_MONETARY, 'en_US');
echo money_format('%i', $number) . "\n";
// USD 1,234.56
// Italian national format with 2 decimals
setlocale(LC_MONETARY, 'it_IT');
echo money_format('%.2n', $number) . "\n";
// Eu 1.234,56
// Using a negative number
$number = -1234.5672;
// US national format, using () for negative numbers
// and 10 digits for left precision
setlocale(LC_MONETARY, 'en_US');
echo money_format('%(#10n', $number) . "\n";
// ($ 1,234.57)
// Similar format as above, adding the use of 2 digits of right
// precision and '*' as a fill character
echo money_format('%=*(#10.2n', $number) . "\n";
// ($********1,234.57)
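Note that money_format() was deprecated in PHP 7.4 and removed in PHP 8.0, so the documentation example above only runs on older versions. A rough equivalent with NumberFormatter might look like the sketch below. It assumes the intl extension is loaded, and the wrapper name format_price() is hypothetical.

```php
<?php
// Hypothetical wrapper around NumberFormatter::formatCurrency().
// Returns null when the intl extension is not available or formatting fails.
function format_price(float $amount, string $locale, string $currency): ?string
{
    if (!class_exists('NumberFormatter')) {
        return null;
    }
    $fmt = new NumberFormatter($locale, NumberFormatter::CURRENCY);
    $out = $fmt->formatCurrency($amount, $currency);

    return $out === false ? null : $out;
}

var_dump(format_price(1234.56, 'en_US', 'USD'));
var_dump(format_price(1234.56, 'it_IT', 'EUR'));
```

Unlike setlocale(), this does not depend on which locales are installed on the operating system; symbol placement and separators come from the locale data bundled with ICU.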
I'm using gettext() to translate some of the texts on my website. Mostly these are short texts/buttons like "Back", "Name",...
// I18N support information here
$language = "en_US";
putenv("LANG=$language");
setlocale(LC_ALL, $language);
// Set the text domain as 'messages'
$domain = 'messages';
bindtextdomain($domain, "/opt/www/abc/web/www/lcl");
textdomain($domain);
echo gettext("Back");
My question is: how long can this text (the msgid) be in the echo gettext("") call?
Does it slow things down for long texts, or does it work just fine too? For example:
echo _("LZ adfadffs is a VVV contributor who writes a weekly column for Cv00m. The former Hechinger Institute Fellow has had his commentary recognized by the Online News Association, the National Association of Black Journalists and the National ");
The official gettext documentation merely has this advice:
Translatable strings should be limited to one paragraph; don't let a single message be longer than ten lines. The reason is that when the translatable string changes, the translator is faced with the task of updating the entire translated string. Maybe only a single word will have changed in the English string, but the translator doesn't see that (with the current translation tools), therefore she has to proofread the entire message.
There's no official limitation on the length of strings, and they can obviously exceed at least "one paragraph/10 lines".
There should be virtually no measurable performance penalty for long strings.
gettext effectively has a limit of 4096 characters on the length of strings.
When you exceed this limit, you get a warning:
Warning: gettext(): msgid passed too long in %s on line %d
and gettext() returns bool(false) instead of the text.
Source:
PHP Interpreter repository - The real fix for the gettext overflow bug
function gettext http://www.php.net/manual/en/function.gettext.php
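A caller can guard against this before handing a string to gettext(). The sketch below assumes the 4096 limit quoted above holds for the PHP build in use. The function name translate_checked() is hypothetical, and the translator is passed in as a callable so the check can be tried without the gettext extension.

```php
<?php
// Limit quoted in the answer above; treated here as an assumption about the PHP build.
const MAX_MSGID_LENGTH = 4096;

// Hypothetical wrapper: returns the untranslated text when the msgid is too long
// or when the translator does not return a string.
function translate_checked(string $msgid, callable $translator): string
{
    // Conservative check: a msgid of exactly 4096 bytes is also treated as too long.
    if (strlen($msgid) >= MAX_MSGID_LENGTH) {
        return $msgid;
    }
    $translated = $translator($msgid);

    return is_string($translated) ? $translated : $msgid;
}

// With the gettext extension loaded, 'gettext' itself can be the translator.
$translator = function_exists('gettext') ? 'gettext' : fn(string $s): string => $s;
echo translate_checked('Back', $translator), "\n";
```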
It's defined as taking a string input, so your machine's memory would be the limiting factor.
Try benchmarking it with microtime() or, better, with Xdebug if you have it on your development machine.
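A microtime() benchmark along those lines could be sketched as follows. The helper name average_seconds() is hypothetical, and the script falls back to an identity function when the gettext extension is not loaded, so the numbers are only meaningful on a machine that has it.

```php
<?php
// Hypothetical helper: average wall-clock seconds per call of $fn over $iterations runs.
function average_seconds(callable $fn, int $iterations = 10000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }

    return (microtime(true) - $start) / $iterations;
}

// Falls back to an identity function when the gettext extension is not loaded.
$translate = function_exists('gettext') ? 'gettext' : fn(string $s): string => $s;
$short = 'Back';
$long  = str_repeat('A long paragraph of translatable text. ', 50); // about 2000 bytes

printf("short: %.9f s per call\n", average_seconds(fn() => $translate($short)));
printf("long:  %.9f s per call\n", average_seconds(fn() => $translate($long)));
```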
Starting with only the locale identifier name (string) provided by clients, how or where do I look up the default "list separator" character for that locale?
The "list separator" setting is the character many different types of applications and programming languages may use as the default grouping character when joining or splitting strings and arrays. This is especially important for opening CSV files in spreadsheet programs. Though this is often the comma ",", this default character may be different depending on the machine's region settings. It may even differ between OS's.
I'm not interested in my own server environment here. Instead, I need to know more about the client's settings, based on the locale identifier they have given me, so my own server settings are irrelevant. Also, for this solution I cannot change the locale setting on this server to match a client's for the entire current process as a shortcut to look this value up.
If this is defined in the ICU library, I'm not able to find any way to look this value up using the INTL extension.
Any hints?
I am not sure if my answer will satisfy your requirements but I suggest (especially as you don't want to change the locale on the server) to use a function that will give you the answer:
To my knowledge (and also Wikipedia's it seems) the list separator in a CSV is a comma unless the decimal point of the locale is a comma, in that case the list separator is a semicolon.
So you could get a list of all locales that use a comma (Unicode U+002C) as separator using this command:
cd /usr/share/i18n/locales/
grep decimal_point.*2C *_* -l
and you could then take this list to determine the appropriate list separator:
function get_csv_list_separator($locale) {
    // Locales whose decimal point is a comma (U+002C); for these the list separator is ";".
    $locales_with_comma_separator = "az_AZ be_BY bg_BG bs_BA ca_ES crh_UA cs_CZ da_DK de_AT de_BE de_DE de_LU el_CY el_GR es_AR es_BO es_CL es_CO es_CR es_EC es_ES es_PY es_UY es_VE et_EE eu_ES eu_ES#euro ff_SN fi_FI fr_BE fr_CA fr_FR fr_LU gl_ES hr_HR ht_HT hu_HU id_ID is_IS it_IT ka_GE kk_KZ ky_KG lt_LT lv_LV mg_MG mk_MK mn_MN nb_NO nl_AW nl_NL nn_NO pap_AN pl_PL pt_BR pt_PT ro_RO ru_RU ru_UA rw_RW se_NO sk_SK sl_SI sq_AL sq_MK sr_ME sr_RS sr_RS#latin sv_SE tg_TJ tr_TR tt_RU#iqtelif uk_UA vi_VN wo_SN";
    // Compare whole locale names (case-insensitively); a substring search
    // would also match fragments such as "de" or "ES".
    $known = explode(' ', strtolower($locales_with_comma_separator));
    if (in_array(strtolower($locale), $known, true)) {
        return ";";
    }
    return ",";
}
(the list of locales is taken from my own Debian machine, I don't know about the completeness of the list)
If you don't want to have this static list of locales (though I assume that this doesn't change that often), you can of course generate the list using the command above and cache it.
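The same rule of thumb can be applied without a static list and without touching the server's own locale, because NumberFormatter can report the decimal separator that ICU knows for any locale identifier. This is a sketch under the assumption that the comma/semicolon rule above is good enough for the target spreadsheet programs; both function names are hypothetical.

```php
<?php
// Hypothetical helper: applies the rule of thumb to a decimal separator.
function csv_delimiter_for_decimal_separator(string $decimalSeparator): string
{
    return $decimalSeparator === ',' ? ';' : ',';
}

// Hypothetical helper: looks up the decimal separator for a locale through intl.
// Returns null when the intl extension is not available.
function csv_delimiter_for_locale(string $locale): ?string
{
    if (!class_exists('NumberFormatter')) {
        return null;
    }
    $fmt = new NumberFormatter($locale, NumberFormatter::DECIMAL);

    return csv_delimiter_for_decimal_separator(
        $fmt->getSymbol(NumberFormatter::DECIMAL_SEPARATOR_SYMBOL)
    );
}

var_dump(csv_delimiter_for_locale('de_DE'));
var_dump(csv_delimiter_for_locale('en_US'));
```

The formatter is created per call and only reads ICU's data, so the process-wide locale set through setlocale() stays untouched.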
As a final note, according to RFC4180 section 2.6 the list separator actually never changes but rather fields containing a comma (so this also means floating numbers, depending on the locale) should be enclosed in double-quotes. Though (as linked above) not many people follow the RFC standard.
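The quoting rule from the RFC can be seen with PHP's own fputcsv(), which encloses a field in double quotes when it contains the delimiter. A small sketch follows; the helper name csv_line() is hypothetical, and passing an empty escape character requires PHP 7.4 or later.

```php
<?php
// Hypothetical helper: writes one row as a CSV line into memory and returns it.
function csv_line(array $fields): string
{
    $stream = fopen('php://memory', 'w+');
    // Comma delimiter, double-quote enclosure, no escape character.
    fputcsv($stream, $fields, ',', '"', '');
    rewind($stream);
    $line = stream_get_contents($stream);
    fclose($stream);

    return $line;
}

echo csv_line(['Widget', '1234,56']); // Widget,"1234,56"
```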
There's no such locale setting as "list separator". It might be software-specific, but I doubt it's user-specific.
However, you can detect the user's locale and try to match the settings.
Get the browser's locale: $accept_lang = $_SERVER['HTTP_ACCEPT_LANGUAGE']; this might contain a list of comma-separated values. Some browsers don't send it, though. More here...
Next you can use setlocale(LC_ALL, $accept_lang); and get the available locale settings using $locale_info = localeconv(); more here...
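Because the header is a weighted list rather than a single locale name, it usually has to be reduced to one tag before it can be used for anything. A minimal sketch of that step is below; the function name preferred_language() is hypothetical, and the fallback header is only sample data for running the script outside a web request.

```php
<?php
// Hypothetical helper: returns the language tag with the highest q-value
// from an Accept-Language header, or null when no usable tag is present.
function preferred_language(string $header): ?string
{
    $best = null;
    $bestQ = -1.0;
    foreach (explode(',', $header) as $part) {
        $pieces = explode(';', trim($part));
        $tag = trim($pieces[0]);
        if ($tag === '' || $tag === '*') {
            continue;
        }
        $q = 1.0; // a tag without a q parameter has quality 1
        foreach (array_slice($pieces, 1) as $param) {
            if (preg_match('/^\s*q\s*=\s*([0-9.]+)\s*$/i', $param, $m)) {
                $q = (float) $m[1];
            }
        }
        // On ties the earlier entry wins.
        if ($q > $bestQ) {
            $best = $tag;
            $bestQ = $q;
        }
    }

    return $best;
}

$header = $_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? 'de-DE,de;q=0.9,en;q=0.8';
echo preferred_language($header), "\n";
```

When the intl extension is available, Locale::acceptFromHttp() performs a similar reduction. Either way, the resulting tag uses a hyphen (de-DE), whereas setlocale() expects whatever locale names the operating system has installed, so a mapping step is still needed.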
I'm using the PECL intl module to localize dates and numbers in a PHP project. In all the other languages I'm using (40), localizing ordinal numbers works fine. In Swedish, however, I get strange output: it appears to contain the template constants used to generate the ordinals.
$fnf = new NumberFormatter('sv_FI', NumberFormatter::ORDINAL);
echo $fnf->format(1);
and
$snf = new NumberFormatter('sv_SE', NumberFormatter::ORDINAL);
echo $snf->format(1);
Both return "1:e%digits-ordinal-neutre:0: 1:a" instead of something like 1st or 1er.
My only guess, other than a bug, is that I'm missing some additional argument such as the gender of an associated verb.
If you output the rules of the rule-based number formatter with $fnf->getPattern(), you get:
%digits-ordinal-masculine:
0: =#,##0==%%dord-mascabbrev=;
-x: −>%digits-ordinal-masculine>;
%%dord-mascabbrev:
0: :e%digits-ordinal-neutre:0: =%digits-ordinal-feminine=;
%digits-ordinal-reale:
0: =%digits-ordinal-feminine=;
%digits-ordinal-feminine:
0: =#,##0==%%dord-femabbrev=;
-x: −>%digits-ordinal-feminine>;
%%dord-femabbrev:
0: :e;
1: :a;
2: :a;
3: :e;
20: >%%dord-femabbrev>;
100: >%%dord-femabbrev>;
%digits-ordinal:
0: =%digits-ordinal-masculine=;
You can see that the private rule set dord-mascabbrev has only one rule, which yields this value:
:e%digits-ordinal-neutre:0: 1:a
That is what then gets output after your 1, as you describe in your question.
This is not a bug in PECL intl. The underlying rule, which is part of the ICU libraries, is malformed. About three years ago the sv number formatter rules were fixed for missing semicolons; it looks like this one line slipped through.
These rules are taken into ICU from the CLDR (Common Locale Data Repository) at the Unicode Consortium. I opened a bug report there, because unless this is fixed in CLDR, and then put into ICU, it can't work with the PHP INTL extension.
The alternative might be to manually patch the ICU libraries (version 4.8) and then build the PECL package against your patched libraries.
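Short of patching ICU, another possible workaround is to bypass the broken default rule set and select one of the intact ones from the dump above by name. This is only a sketch: it assumes the installed ICU data contains a rule set called %digits-ordinal-feminine for Swedish, as the dump shows, and the function name swedish_ordinal() is hypothetical. NumberFormatter::DEFAULT_RULESET is the text attribute that rule-based formatters use to choose their rule set.

```php
<?php
// Hypothetical helper: formats $n with an explicitly chosen ordinal rule set.
// Returns null when intl is missing, the rule set is unknown, or formatting fails.
function swedish_ordinal(int $n, string $ruleset = '%digits-ordinal-feminine'): ?string
{
    if (!class_exists('NumberFormatter')) {
        return null;
    }
    $fmt = new NumberFormatter('sv_SE', NumberFormatter::ORDINAL);
    if (!$fmt->setTextAttribute(NumberFormatter::DEFAULT_RULESET, $ruleset)) {
        return null;
    }
    $out = $fmt->format($n);

    return $out === false ? null : $out;
}

var_dump(swedish_ordinal(1));
```

Whether the feminine form is grammatically appropriate depends on the noun the ordinal refers to, so this trades a malformed result for a possibly wrong gender.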