Template Strings.
This link might help a little bit:
Does PHP have a feature like Python's template strings?
My main question is whether there's a better way to store text strings.
Is this normally done with one folder (DIR) holding plenty of standalone files, each with a different string, so that you grab the contents of the file you need and replace the {tags} with values?
Or, is it better to define all of them inside one single file array[]?
greetings.tpl.txt
['welcome'] = 'Welcome {firstname} {lastname}'.
['good_morning'] = 'Good morning {firstname}'.
['good_afternoon'] = 'Good afternoon {firstname}'.
Here's another example, https://github.com/oren/string-template-example/blob/master/template.txt
Thx in advance!
Answers that tell me to use include("../file.php"); will NOT be accepted. I'm looking for a solution that shows how to read a LIST of defined strings into an array. The definition is already array based.
To add values to templates, you can use strtr. Example below:
$msg = strtr('Welcome {firstname} {lastname}', array(
'{firstname}' => $user->getFirstName(),
'{lastname}' => $user->getLastName()
));
Regarding storing strings, you can save one array per language and then load only the relevant one. E.g. you'll have a directory with 2 files:
language
en.php
de.php
Each file should contain the following:
<?php
return (object) array(
'WELCOME' => 'Welcome {firstname} {lastname}'
);
When you need translations, you can just do the following:
$dictionary = include('language/en.php');
And the dictionary will then have an object that you can address. Changing the example above, it will be something like this:
$dic = include('language/en.php');
$msg = strtr($dic->WELCOME, array(
'{firstname}' => $user->getFirstName(),
'{lastname}' => $user->getLastName()
));
To handle the case where the template is missing from the dictionary, you can use a ternary operator with a default text:
$dic = include('language/en.php');
$tpl = isset($dic->WELCOME) ? $dic->WELCOME : 'Welcome {firstname} {lastname}';
$msg = strtr($tpl, array(
'{firstname}' => $user->getFirstName(),
'{lastname}' => $user->getLastName()
));
To be able to edit the texts in a database, people usually keep the texts there and run a simple export script (e.g. using var_export) to sync them from the database to the files.
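A minimal sketch of such an export script, assuming the rows have already been fetched from the database into an array. The $rows structure, the exportDictionary() name and the temp-file path are hypothetical, chosen only for illustration:

```php
<?php
// Hypothetical rows, as they might come back from a translations table.
$rows = array(
    array('name' => 'WELCOME', 'text' => 'Welcome {firstname} {lastname}'),
    array('name' => 'GOOD_MORNING', 'text' => 'Good morning {firstname}'),
);

// Write the rows to a PHP file that returns the dictionary object.
function exportDictionary(array $rows, $path)
{
    $dictionary = array();
    foreach ($rows as $row) {
        $dictionary[$row['name']] = $row['text'];
    }
    // var_export() emits valid PHP source for the array.
    $code = "<?php\nreturn (object) " . var_export($dictionary, true) . ";\n";
    return file_put_contents($path, $code) !== false;
}

$path = sys_get_temp_dir() . '/en.php'; // stand-in for language/en.php
exportDictionary($rows, $path);

// The generated file can be loaded exactly like a hand-written one.
$dic = include $path;
echo $dic->WELCOME, "\n";
```

The generated file has the same shape as the hand-written language files above, so the rest of the code does not need to know where the texts came from.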
Hope this helps.
OK John I will elaborate.
The best way is to create a php file, for each language, containing the definition of an array of texts, using printf format for string substitution.
If the amount of text is very large, you might consider partitioning it further. (a few MB is usually fine)
This is efficient in production, assuming the OS has a well-tuned file cache. It is slightly more so if you use numerical indexes for the array.
It is much more efficient to let PHP populate the array than to do it yourself by reading a text file. This is, after all, static text, I assume?
If production performance is not an issue, please disregard this post.
greetings_tpl_en.php
<?php
$text_tpl = array(
'welcome' => 'Welcome %s %s'
,'good_morning' => 'Good morning %s'
,'good_afternoon' => 'Good afternoon %s'
);
your.php
$language="en";
require('greetings_tpl_'. $language .'.php');
....
printf($text_tpl['welcome'],$first_name,$last_name);
printf is a nice legacy from the C language. sprintf returns a string instead of outputting it.
You can find the full description of the php printf format here: http://php.net/manual/en/function.sprintf.php
(Do read Josef Kufner post again, when this is solved. +1 :c)
Hope this helps?
First, take a look at gettext. It is widely used and there are plenty of tools to handle the translation process, like xgettext and POEdit. It is more comfortable to use real English strings in source code and then extract them using the xgettext tool. Gettext can handle plural forms of practically all languages, which is not possible when using simple arrays of strings.
Very useful function to combine with gettext is sprintf() (or printf(), if you want to output text directly).
Example:
printf(gettext('Welcome %s %s.'), $firstname, $lastname);
printf(ngettext('You have %d new message.', 'You have %d new messages.',
$number_of_new_messages), $number_of_new_messages);
Then, when you want to translate this into language where last name usually precedes first name, you can use this: 'Welcome %2$s, %1$s.'
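To see the argument swapping in isolation, here is a small sketch without gettext: the two format strings stand in for an original and a translated message, and the names are made up for the example.

```php
<?php
// The same two arguments, rendered by an English and a reordered template.
$firstname = 'Harry';
$lastname  = 'Potter';

$english   = 'Welcome %s %s.';
$reordered = 'Welcome %2$s, %1$s.'; // single quotes, so $s is not parsed as a variable

echo sprintf($english, $firstname, $lastname), "\n";   // Welcome Harry Potter.
echo sprintf($reordered, $firstname, $lastname), "\n"; // Welcome Potter, Harry.
```

The calling code passes the arguments in the same order both times; only the format string decides where each one ends up.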
The second example, the plural form, can be translated using more than two strings, because part of the localization file is how plural forms are arranged. While for English it is nplurals=2; plural=(n != 1);, in Czech, for example, it is nplurals=3; plural=(n==1) ? 0 : (n>=2 && n<=4) ? 1 : 2; (three forms: the first is for one item, the second for 2 to 4 items and the third for the rest). Irish, for example, has five plural forms.
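To make the Czech rule concrete, here is a rough sketch that evaluates it by hand. Gettext does this internally from the catalogue header; the czechPluralIndex() function and the $forms array are hypothetical and only illustrate how the index selects a form.

```php
<?php
// Plural-form index for Czech, following
// plural=(n==1) ? 0 : (n>=2 && n<=4) ? 1 : 2
function czechPluralIndex($n)
{
    // Nested ternaries need explicit parentheses in PHP.
    return ($n == 1) ? 0 : (($n >= 2 && $n <= 4) ? 1 : 2);
}

// Illustrative translations of "new message(s)", indexed by plural form.
$forms = array('nová zpráva', 'nové zprávy', 'nových zpráv');

foreach (array(1, 3, 5) as $n) {
    echo $n, ' ', $forms[czechPluralIndex($n)], "\n";
}
```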
To extract strings from source code use xgettext -L php .... I recommend writing a short script with the exact command fitting your project, something like:
# assuming this file is in locales directory
# and source code in src directory
find ../src -type f -iname '*.php' > "files.list"
xgettext -L php --from-code 'UTF-8' -f "files.list" -o messages.pot
You may want to add custom function names using -k argument.
You could store all the templates in one associative array and also the variables that are to replace the placeholders, like
$capt=array('welcome' => 'Welcome {firstname} {lastname}',
'good_morning' => 'Good morning {firstname}',
'good_afternoon' => 'Good afternoon {firstname}');
$vars=array('firstname'=>'Harry','lastname'=>'Potter', 'profession'=>'wizzard');
Then, you could transform the sentences through a simple preg_replace_callback call like
function repl($a){ global $vars;
return $vars[$a[1]];
}
function getcapt($type){ global $capt;
$str=$capt[$type];
$str=preg_replace_callback('/\{([^}]+)\}/','repl' ,$str);
echo "$str<br>";
}
getcapt('welcome');
getcapt('good_afternoon');
This example would produce
Welcome Harry Potter
Good afternoon Harry
Related
I am working on a project and there are going to be 3 different languages: English, French and Spanish. This will be defined when the user signs up.
Now in my config file I have the following:
define("DEFAULT_SLOGAN", "The default slogan will go here.");
Until I started realizing that I needed to accept different languages.
The user has an assigned language code (EN, FR, SP). How would I go about having different language strings for each page? Would I need to have something like this:
define("DEFAULT_SLOGAN_EN", "Slogan in english");
define("DEFAULT_SLOGAN_FR", "Slogan in french");
define("DEFAULT_SLOGAN_SP", "Slogan in spanish");
And for each string just have 3 different versions of it? Not too sure the best way to approach this.
Thanks!
A simple strategy that is often used is arrays:
$lang['en'] = array(
'DEFAULT_SLOGAN' => 'The default slogan will go here.',
);
// Same for other languages
Then, in your actual code, make sure that $lang is available (you can use the global keyword for this) and use these arrays.
A better approach would be to create
locale_definitions_EN.php containing
define("DEFAULT_SLOGAN", "The english default slogan will go here.");
locale_definitions_FR.php containing
define("DEFAULT_SLOGAN", "The french default slogan will go here.");
etc. and then do something like
include "locale_definitions_$userLocale.php";
This has the advantage that you don't spend memory on the constants of the unused languages.
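Since $userLocale ends up in a file name, it may be worth checking it against a fixed list before the include. A minimal sketch, assuming the three language codes from the question; the localeDefinitionsFile() helper is hypothetical:

```php
<?php
// Hypothetical helper: map a user-supplied locale to a file name from a fixed list.
function localeDefinitionsFile($userLocale, array $allowed = array('EN', 'FR', 'SP'), $default = 'EN')
{
    $locale = strtoupper((string) $userLocale);
    // Anything not on the list falls back to the default language.
    if (!in_array($locale, $allowed, true)) {
        $locale = $default;
    }
    return "locale_definitions_$locale.php";
}

echo localeDefinitionsFile('fr'), "\n";
echo localeDefinitionsFile('../../etc/passwd'), "\n";
```

The returned name can then be passed to include, so an unexpected value can never select a file outside the known set.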
I need to be able to permanently change variables in a php file using php.
I am creating a multilanguage site using codeigniter and using the language helper which stores the text in php files in variables in this format:
$lang['title'] = "Stuff";
I've been able to access the plain text of the files using fopen() etc., and it seems that I could probably locate the areas I want to edit with regular expressions and rewrite the file once I've made the changes, but it seems a bit hacky.
Is there any easy way to edit these variables permanently using php?
Cheers
If it's just an array you're dealing with, you may want to consider var_export. It will print out or return the expression in a format that's valid PHP code.
So if you had language_foo.php which contained a bunch of $lang['title'] = "Stuff"; lines, you could do something along the lines of:
include('language_foo.php');
$lang['title2'] = 'stuff2';
$data = '$lang = ' . var_export($lang, true) . ';';
file_put_contents('language_foo.php', '<?PHP ' . $data . ' ?>');
Alternatively, if you won't want to hand-edit them in the future, you should consider storing the data in a different way (such as in a database, or serialize()'d, etc etc).
It looks way easier to store data somewhere else (for instance, a database) and write a simple script to generate the *.php files, with this comment on top:
#
# THIS FILE IS AUTOGENERATED - DO NOT EDIT
#
I once faced a similar issue. I fixed it by simply adding a smarty template. The way I did it was as follows:
Read the array from the file
Add to the array
Pass the array to smarty
Loop over the array in smarty and generate the file using a template (this way you have total control, which might be missing in reg-ex)
Replace the file
Let me know if this helps.
Assuming that
You need the dictionary file in a human-readable and human-editable form (no serializing etc.)
The dictionary array is a one-dimensional, associative array
I would
Include() the dictionary file inside a function
Do all necessary operations on the $lang array (add words, remove words, change words)
Write the $lang array back into the file using a simple loop:
foreach ($lang as $key => $value)
fwrite($file, "\$lang['$key'] = '$value';\n");
This is an extremely limited approach, of course (a value containing a single quote would already break the generated file). I would be really interested to see whether there is a genuine "PHP source code parser, changer and writer" around. This should be possible to do using the tokenizer functions.
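As a rough idea of what the tokenizer gives you, the sketch below only reads a language file and lists its string literals; it does not rewrite anything. The source is held in a string for the example, and it assumes the tokenizer extension is available (it normally is).

```php
<?php
// Source of a small language file, held in a string for the example.
$source = "<?php\n\$lang['title'] = 'Stuff';\n\$lang['intro'] = 'Hello';\n";

// Collect every quoted string literal the tokenizer finds.
$literals = array();
foreach (token_get_all($source) as $token) {
    // Tokens are either single characters or arrays of (id, text, line).
    if (is_array($token) && $token[0] === T_CONSTANT_ENCAPSED_STRING) {
        $literals[] = $token[1]; // the text still includes its quotes
    }
}

print_r($literals);
```

A real "changer and writer" would have to walk the same token stream, replace the text of the tokens it cares about and concatenate everything back together.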
If it also is about a truly multilingual site, you might enjoy looking into the gettext extension of PHP. It falls back to a library that has been in use for localizing stuff for many years, and where tools to keep up with the translation files have been around for almost quite as long. This makes supporting all the languages in later revisions of the product more fun, too.
In other news, I would not use an array but rather appropriate definitions, so that you have a file
switch ($lang) {
case 'de':
define('HELLO','Hallo.');
define('BYE','Auf wiedersehen.');
break;
case 'fr':
define('HELLO','Bonjour');
define('BYE','Au revoir.');
break;
case 'en':
default:
define ('HELLO','Hello.');
define ('BYE','Bye.');
}
And I'd also auto-generate that from a database, if maintenance becomes a hassle.
Pear Config will let you read and write PHP files containing settings using its 'PHPArray' container. I have found that the generated PHP is more readable than that from var_export()
This is the most optimal way of dealing with a multilingual website I can think of, right now (not sure) which doesn't involve gettext, zend_translate or any php plugin or framework.
I think its pretty straight forward: I have 3 languages and I write their "content" in different files (in form of arrays), and later, I call that content to my index.php like you can appreciate in the following picture:
(screenshot of the code and the files tree: http://img31.imageshack.us/img31/1471/codew.png)
I just started with php and I would like to know if I'm breaking php good practices, if the code is vulnerable to XSS attack or if I'm writing more code than necessary.
EDIT: I posted a picture so that you can see the files tree (I'm not being lazy)
EDIT2: I'm using Vim with the theme ir_black and NERDTree.
Looks all right to me, although I personally prefer creating and using a dictionary helper function:
<?php echo dictionary("showcase_li2"); ?>
that would enable you to easily switch methods later, and gives you generally more control over your dictionary. Also with an array, you will have the problem of scope - you will have to import it into every function using global $language; very annoying.
You will probably also reach the point when you have to insert values into an internationalized string:
You have %1 votes left in the next %2 hours.
Sie haben %1 stimmen übrig für die nächsten %2 stunden.
Sinulla on %1 ääntä jäljellä seuraavan %2 tunnin ajassa.
that is something a helper function can be very useful for:
<?php echo dictionary("xyz", $value1, $value2 ); ?>
$value1 and $value2 would be inserted into %1 and %2 in the dictionary string.
Such a helper function can easily be built with an unlimited number of parameters using func_get_args().
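A minimal sketch of such a helper, assuming the %1, %2 placeholder style shown above. The inline $strings array and the votes_left key are hypothetical stand-ins for whatever language file is actually loaded:

```php
<?php
// Look up a key and insert any extra arguments into %1, %2, ...
function dictionary($key)
{
    // Hypothetical inline dictionary; normally loaded from a language file.
    static $strings = array(
        'votes_left' => 'You have %1 votes left in the next %2 hours.',
    );

    $args = func_get_args();
    array_shift($args); // drop the key, keep the values

    // Fall back to the key itself when no translation exists.
    $text = isset($strings[$key]) ? $strings[$key] : $key;

    $replacements = array();
    foreach ($args as $i => $value) {
        $replacements['%' . ($i + 1)] = (string) $value;
    }
    // strtr() tries the longest placeholder first, so %10 is not mistaken for %1.
    return strtr($text, $replacements);
}

echo dictionary('votes_left', 3, 24), "\n";
```

Because the lookup sits behind one function, the storage can later be swapped (arrays, gettext, a database) without touching the templates.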
It's OK generally. For instance, punBB's localization works this way. It is very fast. Faster than calling a function or an object's method or property. But I see a problem with this approach, since it doesn't support language fallbacks easily. I mean, if you don't have a string for Chinese, let it be displayed in English.
This problem is topical when you upgrade your system and you don't have time to translate everything in every language.
I'd rather use something like
lang.en.php
$langs['en'] = array(
...
);
lang.cn.php
$langs['cn'] = array(
...
);
[prepend].php (some common lib)
define('DEFAULT_LANG', 'en');
include_once('lang.' . DEFAULT_LANG . '.php');
include_once('lang.' . $user->lang . '.php');
$lang = array_merge($langs[DEFAULT_LANG], $langs[$user->lang]);
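To show the fallback at work, here is a small self-contained sketch with made-up dictionaries, where the Chinese one is missing a key:

```php
<?php
// Hypothetical dictionaries: the Chinese one has no 'bye' entry.
$langs = array(
    'en' => array('hello' => 'Hello', 'bye' => 'Goodbye'),
    'cn' => array('hello' => '你好'),
);

// Later arrays win in array_merge(), so translated keys override the default.
$lang = array_merge($langs['en'], $langs['cn']);

echo $lang['hello'], "\n"; // translated
echo $lang['bye'], "\n";   // falls back to English
```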
Looks all right to me also, but:
Seems that you have localization for multiple modules/sites, so why not break it down to multidimensional array?
$localization = array(
'module' => (object)array(
'heading' => 'oh, no!',
'perex' => 'oh, yes!'
)
);
I personally like to create stdClass out of arrays with
$localization = (object)$localization;
so you can use
$localization->module->heading;
:) my 2 cents
The only way that this could be XSS is if you have register_globals=On and you don't set $lang['showcase_lil'] or other $lang's. But I don't think you have to worry about this. So I think you're in the clear.
as an xss test:
http://127.0.0.1/whatever.php?lang[showcase_lil]=alert(/xss/)
Wouldn't it have been better to post code and briefly explain this issue to us?
Anyway, putting each language in its own file and loading it through some sort of language component seems okay. I'd prefer using some sort of gettext, but this is okay too, I guess.
You should make a function for calling the language keys rather than relying on an array, something like
<?php echo lang('yourKey'); ?>
One thing to watch for is interpolation; that's really the only place XSS could sneak in if your server settings are sensible. If you at any point need to do something along the lines of translating "$project->name has $project->member_count members", you'll have to make sure you escape all HTML that goes in there.
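A short sketch of what that escaping could look like, assuming the translated string uses sprintf placeholders. The template and the project data are invented for the example:

```php
<?php
// Hypothetical project data; the name contains markup on purpose.
$name = '<script>alert(1)</script>';
$memberCount = 3;

$template = '%s has %d members'; // would normally come from the language file

// Escape the dynamic value, not the translated template itself.
$safe = sprintf($template, htmlspecialchars($name, ENT_QUOTES, 'UTF-8'), $memberCount);
echo $safe, "\n";
```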
But other than that, you should be fine.
My current implementation, which is array based, stores keys and values in a dictionary. For example:
$arr = array(
'message' => 'Paste a flickr URL below.',
);
I realize that it was probably a bad idea storing html inside of a string such as this, but if I'm using gettext then in my .mo/.po files how should I handle storing a similar string? Should I just store words, such as 'Paste a' and 'URL below' and 'flickr' separately?
You should store something like
"Paste a %1 URL below"
and replace all 'vars' using something simple like str_replace('%1', $link, $message);
$link can also be translatable
"%1"
although that might be overkill (does flickr translate between languages?)
The rationale behind this is that different languages have different grammatical structures, and the ordering of the words won't always be the same.
Update:
As @alex and @chelmertz mention in the comments, try using the sprintf function, which is built for this very thing.
I'd go for this:
$arr = array(
'message' => _('Paste a %s URL below.'),
);
Having all translations as string literals within gettext function calls lets you use standard tools to update *.po catalogues.
In PHP (or maybe gettext in general), what does gettext do when it sees a variable or dynamic content?
I have 2 cases in mind.
1) Let's say I have <?=$user1?> poked John <?=$user2?>. Maybe in some language the order of the words is different. How does gettext handle that? (no, I'm not building facebook, that was just an example)
2) Let's say I store some categories in a database. They rarely change, but they are stored in a database. What would happen if I do <?php echo gettext($data['name']); ?> ? I would like the translators to translate those category names too, but does it have to be done in the database itself?
Thanks
Your best option is to use sprintf() function. Then you would use printf notation to handle dynamic content in your strings. Here is a function I found on here a while ago to handle this easily for you:
function translate()
{
$args = func_get_args();
$num = func_num_args();
$args[0] = gettext($args[0]);
if($num <= 1)
return $args[0];
return call_user_func_array('sprintf', $args);
}
Now for example 1, you would want to change the string to:
%s poked %s
Which you would input into the translate() function like this:
<?php echo translate('%s poked %s', $user1, $user2); ?>
You would parse out all translate() functions with poEdit, and then translate the string "%s poked %s" into whatever language you wanted, without modifying the %s string placeholders. Those would get replaced upon output by the translate() function with user1 and user2 respectively. You can read more on sprintf() in the PHP Manual for more advanced usages.
For issue #2. You would need to create a static file which poEdit could parse containing the category names. For example misctranslations.php:
<?php
_('Cars');
_('Trains');
_('Airplanes');
Then have poEdit parse misctranslations.php. You would then be able to output the category name translation using <?php echo gettext($data['name']); ?>
To build a little on what Mark said... the only problem with the above solution is that the static list must always be maintained by hand, and if you add a new string before all the others or you completely change an existing one, the software you use for translating might confuse the new strings and you could lose some translations.
I'm actually writing an article about this (too little time to finish it anytime soon!) but my proposed answer goes something like this:
Gettext allows you to store the line number that the string appears in the code inside the .po file. If you change the string entirely, the .po editor will know that the string is not new but it is an old one (thanks to the line number).
My solution to this is to write a script that reads the database and creates a static file with all the gettext strings. The big difference to Mark's solution is to have the primary key (let's call it ID) on the database match the line number in the new file. In that case, if you completely change one original translation, the lines are still the same and your translation software will recognize the strings.
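A rough sketch of such a generator, assuming the categories have already been read into an array keyed by ID. The buildGettextStub() name is hypothetical, and because the opening <?php tag takes line 1, this version puts ID n on line n + 1 rather than on line n:

```php
<?php
// Build the source of a static file where each category sits on a line
// derived from its ID (ID n ends up on line n + 1, after the opening tag).
function buildGettextStub(array $categories)
{
    $lines = array('<?php');
    $maxId = $categories ? max(array_keys($categories)) : 0;
    for ($id = 1; $id <= $maxId; $id++) {
        // Missing IDs become blank lines so later entries keep their position.
        $lines[] = isset($categories[$id])
            ? '_(' . var_export($categories[$id], true) . ');'
            : '';
    }
    return implode("\n", $lines) . "\n";
}

// Hypothetical rows: ID 2 was deleted at some point.
echo buildGettextStub(array(1 => 'Cars', 3 => 'Airplanes'));
```

Deleting or renaming a category then leaves every other string on its old line, which is the property the approach relies on.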
Of course there might be newer and more intelligent .po editors out there but at least if yours is giving you trouble with newer strings then this will solve them.
My 2 cents.
If you have somewhere in your code:
<?=sprintf(_('%s poked %s'), $user1, $user2)?>
and one of your languages needs to swap the arguments it is very simple. Simply translate your code like this:
msgid "%s poked %s"
msgstr "%2$s translation_of_poked %1$s"