Convert en_US to en-US - php

I'm writing a PHP application that supports multiple languages.
When setting the locale in PHP, I am required to provide a value defined in, what I believe to be, RFC 1766 / ISO 639, according to the setlocale documentation.
setlocale( LC_ALL, 'en_US' );
var_dump( setlocale( LC_MESSAGES, '0' ) );
// string(5) "en_US"
When using this locale to describe the HTML lang attribute, validation fails because it is not formatted to RFC 5646. The RFC 5646 value for this language is actually en-US (note the use of a hyphen instead of an underscore).
Using this value in PHP's setlocale function, as above, results in the following output:
string(1) "C"
I have no idea why it is returning C, but I presume it is because the locale provided was incorrectly formatted, so the locale stayed at C, the original server default, which is described as ASCII (thanks to @Cheery for the reference).
So, I'm wondering what I should do about that. I could, feasibly, use PHP's str_replace function to switch _ to - before outputting the lang attribute, like so:
<?php setlocale( LC_ALL, 'en_US' ); ?>
<!doctype html>
<html lang="<?= str_replace( '_', '-', setlocale(LC_MESSAGES, '0') ); ?>">
...
But, I'm concerned that there may be other differences between the two language specifications that could yield an unexpected problem down the road. If so, is there a preferred way to translate the language codes already in PHP, or a translation class that can be used?
Bonus question: why does my server default to a value of C for the locale?

Keep in mind that setlocale accepts many kinds of locale names, including full names and mixed forms. For example (from the PHP documentation):
$loc_de = setlocale(LC_ALL, 'de_DE@euro', 'de_DE', 'de', 'ge');
Here 'de_DE@euro' isn't a valid HTML lang code.
So first, you need to ensure that the value is in the lang_region format before trying to convert it.
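The check described above could be sketched roughly as follows. The function name posixLocaleToLangTag and its fallback parameter are hypothetical, not part of PHP. The sketch only handles a language with an optional two-letter region: it drops any codeset or modifier (such as ".UTF-8" or "@euro") and returns the fallback for anything else, such as "C" or Windows-style names.

```php
<?php
// Hypothetical helper (not a PHP built-in): converts a POSIX-style locale
// name such as "de_DE.UTF-8@euro" into a hyphenated tag such as "de-DE".
function posixLocaleToLangTag(string $locale, string $fallback = 'en'): string
{
    // Drop the codeset (".UTF-8") and the modifier ("@euro") if present.
    $base = preg_split('/[.@]/', $locale, 2)[0];

    // Accept only "ll" or "ll_RR": a 2-3 letter language code and an
    // optional 2-letter region code.
    if (!preg_match('/^([A-Za-z]{2,3})(?:_([A-Za-z]{2}))?$/', $base, $m)) {
        return $fallback; // e.g. "C", "POSIX", "nld_nld" or "Dutch"
    }

    $tag = strtolower($m[1]);
    if (isset($m[2])) {
        $tag .= '-' . strtoupper($m[2]);
    }
    return $tag;
}

echo posixLocaleToLangTag('en_US'), "\n";      // en-US
echo posixLocaleToLangTag('de_DE@euro'), "\n"; // de-DE
echo posixLocaleToLangTag('C'), "\n";          // en (the fallback)
```

This checks the shape of the name only. It does not verify that the language or region codes are registered, and it discards modifiers that may carry meaning, so treat it as a starting point rather than a complete mapping between the two formats.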

You can set the locale using setlocale.
Its documentation includes this and other examples.
This example tries different possible locale names for German:
<?php
/* Set locale to Dutch */
setlocale(LC_ALL, 'nld_nld');
/* Output: vrijdag 22 december 1978 */
echo strftime("%A %d %B %Y", mktime(0, 0, 0, 12, 22, 1978));
/* try different possible locale names for german as of PHP 4.3.0 */
$loc_de = setlocale(LC_ALL, 'de_DE@euro', 'de_DE', 'deu_deu');
echo "Preferred locale for german on this system is '$loc_de'";
?>

Related

gettext only translate to English in PHP

I'm Spanish, and while testing the internationalization of a text with PHP, I only get it translated to English.
I have this file structure:
locale/en_US/LC_MESSAGES/ containing the files messages.mo and messages.po
locale/es_ES/LC_MESSAGES/ containing the files messages.mo and messages.po
locale/fr_FR/LC_MESSAGES/ containing the files messages.mo and messages.po
Every file has the keyword "Servicios" translated into its language.
And in PHP I have this code:
<?php
putenv("LANG=en_US");
setlocale(LC_ALL, "en_US");
bindtextdomain("messages", "locale");
textdomain("messages");
?>
When I use the code 'en_US' it shows the correct translation, but when I change it to 'es_ES' or 'fr_FR' like this:
<?php
putenv("LANG=es_ES");
setlocale(LC_ALL, "es_ES");
?>
or
<?php
putenv("LANG=fr_FR");
setlocale(LC_ALL, "fr_FR");
?>
it still shows the English translation.
I am working on Windows 7, and
echo $_SERVER['HTTP_ACCEPT_LANGUAGE'];
always returns
"es-ES,es;q=0.8"
What could the problem be?
Thank you
It is quite likely that the languages are not installed on the server you're running the script on. Do you have shell access to the server? Then try
locale -a
to see which locales are installed. Also have a look at "Is it feasible to rely on setlocale, and rely on locales being installed?"
NOTE:
be careful with the LC_ALL setting, as it may introduce some unwanted conversions. For example, I used
setlocale (LC_ALL, "Dutch");
to get my weekdays in Dutch on the page. From that moment on (as I found out many hours later) my floating point values from MySQL were interpreted as integers, because the Dutch locale wants a comma (,) instead of a point (.) before the decimals. I tried printf, number_format, floatval... all to no avail. 1.50 was always printed as 1.00 :(
When I set my locale to:
setlocale (LC_TIME, "Dutch");
my weekdays are good now and my floating point values too.
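A small sketch of the same idea: change only LC_TIME and confirm that number formatting is left alone. The candidate locale names are assumptions and may not be installed on a given system; the check on the decimal separator holds either way, because LC_NUMERIC is never touched.

```php
<?php
// Start from the "C" locale so the example is reproducible.
setlocale(LC_ALL, 'C');

// Change only date/time formatting. The candidate names are assumptions;
// setlocale() returns false if none of them is installed.
$timeLocale = setlocale(LC_TIME, 'nl_NL.UTF-8', 'nl_NL', 'Dutch');

// LC_NUMERIC was not changed, so the decimal separator is still ".".
$decimalPoint = localeconv()['decimal_point'];

echo 'LC_TIME: ', ($timeLocale === false ? 'not available' : $timeLocale), "\n";
echo 'decimal point: ', $decimalPoint, "\n";
```

Setting only the categories you need keeps side effects such as the comma decimal separator out of numeric parsing and output.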

PHP gettext() - putenv and setlocale

I see most of the examples are using something like:
putenv('LC_ALL=de_DE');
setlocale(LC_ALL, 'de_DE');
bindtextdomain("myPHPApp", "./locale");
echo gettext("Welcome to My PHP Application");
If I only want message translation, my tests suggest that only putenv is needed, and I don't need the other things setlocale provides, such as time and monetary formatting.
So, is it safe to ignore setlocale?
No, it is required.
You can of course call setlocale(LC_ALL, ''); as the other answer suggests, but this just makes it fall back to the environment variable set by putenv on the line above.
What can in fact be removed is the putenv call. At least for me, the following snippet still returns the German translation:
putenv('LC_ALL=en_US');
setlocale(LC_ALL, 'de_DE');
bindtextdomain("myPHPApp", "./locale");
echo gettext("Welcome to My PHP Application");
Another good thing you can use setlocale for is checking whether a given locale is actually installed on the system. E.g.:
if (false === setlocale(LC_ALL, $localeCode)) {
throw new LocaleNotSupportedException(sprintf('Locale "%s" is not installed in the system.', $localeCode));
}
echo gettext("Welcome to My PHP Application");
There's also no need to set the locale before binding a domain. You can do it any time after.
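The installation check above can be combined with a list of candidate names, since setlocale also accepts an array. This is only a sketch: LocaleNotSupportedException is not a built-in class, so it is defined here as a hypothetical type, and activateLocale is an illustrative helper name.

```php
<?php
// Hypothetical exception type; PHP does not ship a class with this name.
class LocaleNotSupportedException extends RuntimeException {}

/**
 * Tries each candidate name in order and returns the one the system accepted.
 * activateLocale() is an illustrative helper, not a PHP built-in.
 */
function activateLocale(array $candidates): string
{
    // setlocale() accepts an array and returns false if no candidate works.
    $accepted = setlocale(LC_ALL, $candidates);
    if ($accepted === false) {
        throw new LocaleNotSupportedException(sprintf(
            'None of these locales is installed: %s',
            implode(', ', $candidates)
        ));
    }
    return $accepted;
}

// "C" is always available, so this call is safe on any system.
echo activateLocale(['C']), "\n";
```

In an application the list would hold the spellings you expect across systems, for example ['de_DE.UTF-8', 'de_DE', 'deu_deu'], followed by the bindtextdomain and gettext calls.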
It seems you would be safe in not including it.
If locale is NULL or the empty string "", the locale names will be set
from the values of environment variables with the same names as the
above categories, or from "LANG".
Info: http://us2.php.net//manual/en/function.setlocale.php

Setlocale returns false (WAMP)

I have working localization in my project. Working means that my project gets translated to whatever language I have in the locale/sk folder, sk for slovak being my default system language.
Setting to any other language doesn't work. I have tried $lang = 'cs', 'cz', 'en', 'en_UK', 'en_UK.utf8' and others. Still, only the translation in the 'sk' folder is taken and still the setlocale() function returns false. I have tried to change default language in browser - no effect.
This is my code:
putenv("LANG=$lang");
setlocale(LC_ALL, $lang);
bindtextdomain("messages", realpath("../localem"));
textdomain("messages");
...
_("Welcome!")
I have also tried these:
putenv("LANGUAGE=$lang");
putenv("LC_ALL=$lang");
Any suggestions are welcome.
Edit:
$loc = array('nor');
if (setlocale(LC_ALL, $loc)==false) print ' false'; else print setlocale(LC_ALL, $loc);
'nor' prints Norwegian (Bokmål)_Norway.1252, 'rus' prints Russian, but 'svk' prints false and so does 'cze'.
On the list all of these are mentioned:
http://msdn.microsoft.com/en-us/library/cdax410z%28v=vs.80%29.aspx
Windows uses another format for the locale setting, see MSDN: List of Country/Region Strings.
You can send a list of locales to setlocale by sending in an array, such as to get Norwegian month names and time formats:
setlocale(LC_TIME, array('nb_NO.UTF-8', 'no_NO.UTF-8', 'nor'));
Windows might however return strings in another encoding than UTF-8, so you might want to handle this manually (converting from cpXXXX to UTF-8).
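One way to handle that conversion manually is to read the code page from the name that setlocale returns and pass it to iconv. This is a sketch under the assumption that the name ends in ".<code page>", as in the "Norwegian (Bokmål)_Norway.1252" output above; localeTextToUtf8 is a hypothetical helper name.

```php
<?php
/**
 * Illustrative helper (not a PHP built-in): converts $text to UTF-8 based on
 * the code page suffix of a Windows locale name such as "Norwegian_Norway.1252".
 */
function localeTextToUtf8(string $text, string $localeName): string
{
    // Windows locale names usually end in ".<code page number>".
    if (!preg_match('/\.(\d+)$/', $localeName, $m)) {
        return $text; // no numeric code page (e.g. "nb_NO.UTF-8"): leave as is
    }
    if ($m[1] === '65001') {
        return $text; // code page 65001 is UTF-8 already
    }
    $converted = iconv('CP' . $m[1], 'UTF-8', $text);

    // iconv() returns false on failure; fall back to the original bytes.
    return $converted === false ? $text : $converted;
}

// "l\xF8rdag" is "lørdag" (Saturday) encoded in code page 1252.
echo localeTextToUtf8("l\xF8rdag", 'Norwegian (Bokmål)_Norway.1252'), "\n";
```

The value passed as $localeName would normally be the return value of the setlocale call, so the conversion follows whichever locale the system actually selected.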

Bad encoding when using strftime in Spanish?

I'm trying to echo the date with strftime but I'm getting bad encoding on utf-8 only characters. (accented characters basically)
setlocale(LC_TIME, 'spanish');
define("CHARSET", "iso-8859-1");
echo strftime("%A, %d de %B",strtotime($row['Date']));
Is there any problem in this part of the code? Everything is encoded in utf-8 and echoing a 'á' character above it displays the character correctly.
Try adding utf8_encode()
setlocale(LC_TIME, 'spanish');
define("CHARSET", "iso-8859-1");
echo utf8_encode(strftime("%A, %d de %B",strtotime($row['Date'])));
I'm a bit late, but Googling around I found this post. And the answers weren't appropriate in my case.
I'm experiencing the same problem as the OP, but my locale is fr_FR and everything works fine on my computer but not on the dev server.
If I add an iconv (as most people suggest when you Google this issue), it works on the dev server but not on my computer, so I needed a "bulletproof" solution that would work the same everywhere (as there is also a production server).
So, the issue here is with setlocale. This function changes the locale for the current execution, but every locale is associated with a charset, and if none is specified, it falls back to the default one of your system (I think; in my case it was falling back to ISO-8859-1 when using the fr_FR locale). You can list all available locales on your computer/server with the locale -a command. You will most likely see the locale you want with ".UTF-8" appended (in my case "fr_FR.UTF-8"), and that's how you must set it: setlocale(LC_TIME, 'fr_FR.UTF-8');
perhaps:
echo iconv("iso-8859-1","utf-8",strftime("%A, %d %B",strtotime($row['Date'])));
For those that don't have iconv, you can use the mb_convert_encoding function to convert the strftime-encoded string to UTF-8:
echo mb_convert_encoding(strftime("%A, %d de %B",strtotime($row['Date'])), 'UTF-8', mb_internal_encoding());
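Since the answers above disagree on whether the string needs converting at all, one option is to convert only when the text is not already valid UTF-8, which should behave the same on machines with different default charsets. This is a minimal sketch: ensureUtf8 is a hypothetical helper name, and the ISO-8859-1 default is an assumption about the non-UTF-8 case.

```php
<?php
/**
 * Illustrative helper: returns $text as UTF-8, converting only when it is
 * not already valid UTF-8. $assumedCharset is a guess for the other case.
 */
function ensureUtf8(string $text, string $assumedCharset = 'ISO-8859-1'): string
{
    // With the /u modifier, preg_match() fails on bytes that are not valid UTF-8.
    if (preg_match('//u', $text) === 1) {
        return $text;
    }
    $converted = iconv($assumedCharset, 'UTF-8', $text);

    // iconv() returns false on failure; fall back to the original bytes.
    return $converted === false ? $text : $converted;
}

echo ensureUtf8("mi\xE9rcoles"), "\n";     // ISO-8859-1 input gets converted
echo ensureUtf8("mi\xC3\xA9rcoles"), "\n"; // UTF-8 input is passed through
```

Wrapping the strftime call from the question in ensureUtf8() would cover both the server that returns ISO-8859-1 and the one that already returns UTF-8. Setting an explicit ".UTF-8" locale remains the cleaner fix where that locale is installed.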

Inconsistent get_class_methods vs method_exists when using UTF8 characters in PHP code

I have this class in a UTF-8 encoded file called EnUTF8.Class.php:
class EnUTF8 {
public function ñññ() {
return 'ñññ()';
}
}
and in another UTF-8 encoded file:
require_once('EnUTF8.Class.php');
require_once('OneBuggy.Class.php');
$utf8 = new EnUTF8();
//$buggy = new OneBuggy();
echo (method_exists($utf8, 'ñññ')) ? 'ñññ() exists!' : 'ñññ() does not exist...';
echo "\n\n----------------------------------\n\n";
print_r(get_class_methods($utf8));
echo "\n----------------------------------\n\n";
echo $utf8->ñññ();
that produces the expected result:
ñññ() exists!
----------------------------------
Array
(
[0] => ñññ
)
----------------------------------
ñññ()
but if...
require_once('EnUTF8.Class.php');
require_once('OneBuggy.Class.php');
$utf8 = new EnUTF8();
$buggy = new OneBuggy();
echo (method_exists($utf8, 'ñññ')) ? 'ñññ() exists!' : 'ñññ() does not exist...';
echo "\n\n----------------------------------\n\n";
print_r(get_class_methods($utf8));
echo "\n----------------------------------\n\n";
echo $utf8->ñññ();
then the weirdness appears!!!:
ñññ() does not exist...
----------------------------------
Array
(
[0] => ñññ
)
----------------------------------
Fatal error: Call to undefined method EnUTF8::ñññ() in /var/www/test.php on line 16
Well, the thing is that OneBuggy.Class.php is UTF-8 encoded too and shares absolutely nothing with EnUTF8.Class.php, so...
where is the bug?
UPDATED:
Well, after a long debugging time I found this in OneBuggy.Class.php constructor:
setlocale (LC_ALL, "es_ES@euro", "es_ES", "esp");
so I did...
//setlocale (LC_ALL, "es_ES@euro", "es_ES", "esp");
and now it works, but why?
Re your update, I think it goes into this direction:
With setlocale(), among other things, you set
LC_CTYPE for character classification and conversion, for example strtoupper()
method_exists() is case insensitive, so within method_exists(), some case conversion must take place. I bet the string breaks at that point. Why it would break if you explicitly set the Spanish locale, but not if you don't, I don't understand, though.
Is there a specific Spanish rule for uppercasing ñ other than making it Ñ? Is it possible to lowercase ñ?
It could also be that the Spanish locale the function is trying to switch to isn't installed on your system at all, and the fallback locale is a different one than PHP uses by default.
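One plausible mechanism, offered as an assumption rather than a confirmed diagnosis: if the case conversion runs byte by byte under a single-byte locale such as ISO-8859-1, the first byte of the UTF-8 sequence for ñ (0xC3, which is "Ã" in ISO-8859-1) would be lowercased to 0xE3, producing a different name than the one registered before the locale changed. The function below simulates that byte-wise lowercasing by hand; it is a stand-in for illustration, not PHP's internal routine.

```php
<?php
/**
 * Simulates byte-wise lowercasing under a single-byte ISO-8859-1 locale.
 * A stand-in for illustration only, not PHP's internal routine.
 */
function latin1ByteLower(string $bytes): string
{
    if ($bytes === '') {
        return '';
    }
    $out = '';
    foreach (str_split($bytes) as $byte) {
        $code = ord($byte);
        $isAsciiUpper  = $code >= 0x41 && $code <= 0x5A;                   // A-Z
        $isLatin1Upper = $code >= 0xC0 && $code <= 0xDE && $code !== 0xD7; // À-Þ except ×
        $out .= chr(($isAsciiUpper || $isLatin1Upper) ? $code + 0x20 : $code);
    }
    return $out;
}

$name = "\xC3\xB1"; // "ñ" encoded as UTF-8
echo bin2hex($name), ' -> ', bin2hex(latin1ByteLower($name)), "\n";
// prints "c3b1 -> e3b1"
```

If that is what happens, it would be consistent with the symptoms in the question: get_class_methods() lists the stored name, while the lookup in method_exists() uses a name whose bytes were altered after setlocale() ran.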
If you are working with PHP 5.x, you should not use UTF-8 names for your variables, classes, functions, and so on: in some cases, for some characters, it will work, but in general it will not.
And note this is true for identifiers, but you'll have the same problem with the content of variables. For example, to manipulate strings in UTF-8, you have to work with the mb_* family of functions.
This is because PHP 5.x is not really using Unicode : it's the big thing that's planned for PHP 6 (which is not even in alpha stage yet).
