In PHP (or maybe gettext in general), what does gettext do when it is given a variable or dynamic content?
I have 2 cases in mind.
1) Let's say I have <?=$user1?> poked John <?=$user2?>. Maybe in some language the order of the words is different. How does gettext handle that? (no, I'm not building facebook, that was just an example)
2) Let's say I store some categories in a database. They rarely change, but they are stored in a database. What would happen if I do <?php echo gettext($data['name']); ?> ? I would like the translators to translate those category names too, but does it have to be done in the database itself?
Thanks
Your best option is to use the sprintf() function, with printf notation handling the dynamic content in your strings. Here is a function I found on here a while ago that makes this easy:
function translate()
{
    $args = func_get_args();
    $num = func_num_args();
    $args[0] = gettext($args[0]);
    if ($num <= 1)
        return $args[0];
    return call_user_func_array('sprintf', $args);
}
Now for example 1, you would want to change the string to:
%s poked %s
Which you would input into the translate() function like this:
<?php echo translate('%s poked %s', $user1, $user2); ?>
You would extract all translate() calls with poEdit and then translate the string "%s poked %s" into whatever language you want, without modifying the %s placeholders. On output, translate() replaces them with user1 and user2 respectively. The sprintf() page in the PHP Manual covers more advanced usage.
For issue #2, you would need to create a static file that poEdit could parse, containing the category names. For example, misctranslations.php:
<?php
_('Cars');
_('Trains');
_('Airplanes');
Then have poEdit parse misctranslations.php. You would then be able to output the category name translation using <?php echo gettext($data['name']); ?>
To build a little on what Mark said: the only problem with the above solution is that the static list must always be maintained by hand. If you add a new string before all the others, or completely change an existing one, the software you use for translating might confuse the new strings with old ones and you could lose some translations.
I'm actually writing an article about this (too little time to finish it anytime soon!) but my proposed answer goes something like this:
Gettext allows you to store the line number that the string appears in the code inside the .po file. If you change the string entirely, the .po editor will know that the string is not new but it is an old one (thanks to the line number).
My solution is to write a script that reads the database and creates a static file with all the gettext strings. The big difference from Mark's solution is that the primary key (let's call it ID) in the database matches the line number in the new file. That way, if you completely change one original string, the line numbers stay the same and your translation software will still recognize the strings.
Of course there might be newer and more intelligent .po editors out there but at least if yours is giving you trouble with newer strings then this will solve them.
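The generator script described above could look roughly like this. It is only a sketch under assumptions: build_category_strings() is a hypothetical name, the $categories array (ID => name) stands in for rows read from the database, and wrapping every line in its own PHP tags is my choice so that ID 1 can sit on line 1. Gaps in the IDs become empty lines so the remaining entries keep their line numbers.

```php
<?php
// Hypothetical helper: build the contents of a static file for the extractor,
// placing the string for category ID n on line n of the output.
function build_category_strings(array $categories): string
{
    if ($categories === array()) {
        return '';
    }
    $lines = array();
    $maxId = max(array_keys($categories));
    for ($id = 1; $id <= $maxId; $id++) {
        if (isset($categories[$id])) {
            // Escape single quotes and backslashes so the generated PHP stays valid.
            $escaped = addcslashes($categories[$id], "'\\");
            $lines[] = "<?php _('" . $escaped . "'); ?>";
        } else {
            // Missing ID: keep an empty line so later IDs do not shift.
            $lines[] = '';
        }
    }
    return implode("\n", $lines) . "\n";
}

// Sample data standing in for a database query result (ID => name).
echo build_category_strings(array(1 => 'Cars', 2 => 'Trains', 4 => 'Airplanes'));
```

The returned string can be written to misctranslations.php with file_put_contents() before running the extraction.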
My 2 cents.
If you have somewhere in your code:
<?=sprintf(_('%s poked %s'), $user1, $user2)?>
and one of your languages needs to swap the arguments it is very simple. Simply translate your code like this:
msgid "%s poked %s"
msgstr "%2$s translation_of_poked %1$s"
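To see the argument swap work without a compiled catalog, here is a small sketch. The $catalog array is a stand-in for what gettext would return per locale, and the 'xx' entry is an English placeholder rather than a real translation; poke() is a hypothetical helper.

```php
<?php
// Stand-in for the .po catalog: message per locale (placeholder data).
$catalog = array(
    'en' => '%s poked %s',
    'xx' => '%2$s was poked by %1$s',
);

// Hypothetical helper: format the message for one locale.
function poke(array $catalog, string $locale, string $user1, string $user2): string
{
    // sprintf() honours the positional markers %1$s and %2$s,
    // so the translator decides the order, not the code.
    return sprintf($catalog[$locale], $user1, $user2);
}

echo poke($catalog, 'en', 'Alice', 'Bob'), "\n"; // Alice poked Bob
echo poke($catalog, 'xx', 'Alice', 'Bob'), "\n"; // Bob was poked by Alice
```

Note that the strings are single-quoted in PHP. Inside double quotes, $s would be treated as a variable and the placeholder would break.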
I'm a newbie starting to learn from source code. I bought some source code on the internet that was supposed to be complete, but it turns out part of it is hidden. How do I decrypt/decode lines like this:
<?php
$keystroke1 = base64_decode("d2RyMTU5c3E0YXllejd4Y2duZl90djhubHVrNmpoYmlvMzJtcA==");
eval(gzinflate(base64_decode('hY5NCsIwEIWv8ixdZDCKWZcuPUfRdqrBmsBkAkrp3aVIi3Tj9v1+vje7PodWfQwNv3zSZAqJyqGNHRdE4+JiVU2ZVHy42fLyjDkoYUT54DdqpHxNKmsAJwtHFXxvksrAYXGort1cE9YsAe1dTJTOzCuEPZbhChN4SPw/iePMd/7ybSmcxeb+4Mj+vkzTBw==')));
$O0O0O0O0O0O0=$keystroke1[2].$keystroke1[32].$keystroke1[20].$keystroke1[11].$keystroke1[23].$keystroke1[15].$keystroke1[32].$keystroke1[1].$keystroke1[11];
$keystroke2 = $O0O0O0O0O0O0("xes26:tr5bzf{8ydhog`uw9omvl7kicjp43nq", -1);
$OO000OO000OO=$keystroke2[16].$keystroke2[12].$keystroke2[31].$keystroke2[23].$keystroke2[18].$keystroke2[24].$keystroke2[9].$keystroke2[20].$keystroke2[11];
$O0000000000O=$keystroke1[30].$keystroke1[9].$keystroke1[6].$keystroke1[11].$keystroke1[27].$keystroke1[8].$keystroke1[19].$keystroke1[1].$keystroke1[11].$keystroke1[15].$keystroke1[32].$keystroke1[1].$keystroke1[11];
eval($OO000OO000OO(base64_decode('LcTLsm
tKAADQn7lVZ+8yoBtB3ZH3OyEEMbnl0SLxTJrQvv
5M7hos9C36n38uF4Zh/u+nLDA6cf/VqJpq9PPHq2
IHD+dQlrVwpIa3BPicV2atbjLVsx+to7il1297dn
c+9PeDJGOoGn0MJUJnSqiJwrGcK5/bG2iiJtUoOk
3GKbHYjjzd5yLu3q2dPpWSFjDVTKWSS6MFsF6MU5
dsbJn7qHRxhGo0MNuluk29F3iwyAx/cYO+OfPWi1
ECDkWG1NsMLuAcM3F98vtMsubbvQjf1ZpVMUP5Eh
puFNzCi/CYkoM1VgsAetzjpvEe1M2AlX4YFjQZF0
A0VBRQKS0B5mcI7na2N/nER993+qocgmh9WawUrU
YhBMUiPNpuXNQy2o7VxHvhyO3nZkcWTmQu5kV1C2
ECbZiH8XsL4QuYbf7lI4SF1gDM/vVqRz4qyj7a8b
qS1nXP79731t4O0qcDaqN97BHDzlPwTEF6H7p9a3
Zu1Ut6X5GNTgZhWe3dHa+6yzJ58MX1Pc8mwAWK4v
EVLjGolQQLieOvkn4jD4d0FMQuLYvXhaxbzJyLR2
OHDKhMu2EwHthDt+I7YwOvVUydwEnCigk/n4iQei
SzwWNKicdunzmrVoOWl9gt8lhK+WzNpbPqkHEK7i
xBHT84UAbkHpity8i9eLUUulASI5d7cfpGWF6I4l
7tYBeJmYzXycA3FbbrSb+yNgd8XM5u7wU0mL8tVP
hJ2J/nu2QLr/OgzZrmp7xvKmpZCgHU7w0RlS1PT9
4JvxXtekif9dDGvBxSQjcwj2i32C7Abbcosvey5I
iq2hW7mjn/lUS6OUQ64Kw/v7+///4F')));
?>
Is code like this dangerous?
You are looking at a piece of obfuscated code. I will explain it line by line, but first let's go over the functions that are used:
base64_decode()
This function decodes a base64 encoded string. It's used here to unscramble intentionally scrambled code.
gzinflate()
This function decompresses a compressed string. It's used the same way as base64_decode().
eval()
This function executes a string as code. Its use is discouraged and is in itself a bit of a red flag, though it has legitimate uses.
$keystroke1 = base64_decode("d2RyMTU5c3E0YXllejd4Y2duZl90djhubHVrNmpoYmlvMzJtcA==");
This line creates an apparently random string of characters: wdr159sq4ayez7xcgnf_tv8nluk6jhbio32mp
This string is saved to a variable, $keystroke1. The string itself is not important, other than that it contains some letters that are used later.
eval(gzinflate(base64_decode('hY5NCsIwEIWv8ixdZDCKWZcuPUfRdqrBmsBkAkrp3aVIi3Tj9v1+vje7PodWfQwNv3zSZAqJyqGNHRdE4+JiVU2ZVHy42fLyjDkoYUT54DdqpHxNKmsAJwtHFXxvksrAYXGort1cE9YsAe1dTJTOzCuEPZbhChN4SPw/iePMd/7ybSmcxeb+4Mj+vkzTBw==')));
This line unscrambles a doubly scrambled string and then runs this resulting code:
if(!function_exists("rotencode")){function rotencode($string,$amount) { $key = substr($string, 0, 1); if(strlen($string)==1) { return chr(ord($key) + $amount); } else { return chr(ord($key) + $amount) . rotEncode(substr($string, 1, strlen($string)-1), $amount); }}}
This creates a new function called rotencode(), which is yet another way of unscrambling strings.
$O0O0O0O0O0O0=$keystroke1[2].$keystroke1[32].$keystroke1[20].$keystroke1[11].$keystroke1[23].$keystroke1[15].$keystroke1[32].$keystroke1[1].$keystroke1[11];
This line takes specific characters from that random string from earlier to create the word "rotencode" as a string, stored in the variable named $O0O0O0O0O0O0.
$keystroke2 = $O0O0O0O0O0O0("xes26:tr5bzf{8ydhog`uw9omvl7kicjp43nq", -1);
This line uses the rotencode() function to unscramble yet another string (actually exactly the same string as before, for some reason).
$OO000OO000OO=$keystroke2[16].$keystroke2[12].$keystroke2[31].$keystroke2[23].$keystroke2[18].$keystroke2[24].$keystroke2[9].$keystroke2[20].$keystroke2[11];
$O0000000000O=$keystroke1[30].$keystroke1[9].$keystroke1[6].$keystroke1[11].$keystroke1[27].$keystroke1[8].$keystroke1[19].$keystroke1[1].$keystroke1[11].$keystroke1[15].$keystroke1[32].$keystroke1[1].$keystroke1[11];
On these lines the two (identical but separate) random strings are used to create the words gzinflate and base64_decode. This is done so the coder can use these functions without it being apparent that that's what is happening. However, base64_decode() is never used this way in the snippet you posted. That might suggest that it is used later in the code in places you haven't seen or recognized yet. Searching your code for "$O0000000000O" might yield other uses.
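If you want to check this yourself without running any hidden code, you can rebuild the names and print them instead of calling them. The sketch below does that. rot_shift() is my own iterative rewrite of the rotencode() function shown earlier, and pick() is a hypothetical helper; neither appears in the purchased code.

```php
<?php
// Iterative equivalent of the recursive rotencode(): shift each byte by $amount.
function rot_shift(string $string, int $amount): string
{
    $out = '';
    foreach (str_split($string) as $char) {
        $out .= chr(ord($char) + $amount);
    }
    return $out;
}

// Join the characters found at the given positions of $source.
function pick(string $source, array $indexes): string
{
    $out = '';
    foreach ($indexes as $i) {
        $out .= $source[$i];
    }
    return $out;
}

$keystroke1 = base64_decode('d2RyMTU5c3E0YXllejd4Y2duZl90djhubHVrNmpoYmlvMzJtcA==');
$keystroke2 = rot_shift('xes26:tr5bzf{8ydhog`uw9omvl7kicjp43nq', -1);

// Print the hidden function names; nothing is executed via eval().
echo pick($keystroke1, array(2, 32, 20, 11, 23, 15, 32, 1, 11)), "\n";                // rotencode
echo pick($keystroke2, array(16, 12, 31, 23, 18, 24, 9, 20, 11)), "\n";               // gzinflate
echo pick($keystroke1, array(30, 9, 6, 11, 27, 8, 19, 1, 11, 15, 32, 1, 11)), "\n";   // base64_decode
```

The same trick works for the eval() lines: replace eval with echo and read what would have been run.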
eval($OO000OO000OO(base64_decode('LcTLsmtKAADQn7lVZ+8yoBtB3ZH3OyEEMbnl0SLxTJrQvv5M7hos9C36n38uF4Zh/u+nLDA6cf/VqJpq9PPHq2IHD+dQlrVwpIa3BPicV2atbjLVsx+to7il1297dnc+9PeDJGOoGn0MJUJnSqiJwrGcK5/bG2iiJtUoOk3GKbHYjjzd5yLu3q2dPpWSFjDVTKWSS6MFsF6MU5dsbJn7qHRxhGo0MNuluk29F3iwyAx/cYO+OfPWi1ECDkWG1NsMLuAcM3F98vtMsubbvQjf1ZpVMUP5EhpuFNzCi/CYkoM1VgsAetzjpvEe1M2AlX4YFjQZF0A0VBRQKS0B5mcI7na2N/nER993+qocgmh9WawUrUYhBMUiPNpuXNQy2o7VxHvhyO3nZkcWTmQu5kV1C2ECbZiH8XsL4QuYbf7lI4SF1gDM/vVqRz4qyj7a8bqS1nXP79731t4O0qcDaqN97BHDzlPwTEF6H7p9a3Zu1Ut6X5GNTgZhWe3dHa+6yzJ58MX1Pc8mwAWK4vEVLjGolQQLieOvkn4jD4d0FMQuLYvXhaxbzJyLR2OHDKhMu2EwHthDt+I7YwOvVUydwEnCigk/n4iQeiSzwWNKicdunzmrVoOWl9gt8lhK+WzNpbPqkHEK7ixBHT84UAbkHpity8i9eLUUulASI5d7cfpGWF6I4l7tYBeJmYzXycA3FbbrSb+yNgd8XM5u7wU0mL8tVPhJ2J/nu2QLr/OgzZrmp7xvKmpZCgHU7w0RlS1PT94JvxXtekif9dDGvBxSQjcwj2i32C7Abbcosvey5Iiq2hW7mjn/lUS6OUQ64Kw/v7+///4F')));
This is where it all comes together. This line unscrambles a line of code which has been compressed and encoded 10 times over. The final result is this:
$cnk = array('localhost');
That's it. It sets the string "localhost" as the sole element of an array and saves it in a variable named $cnk.
In and of itself, there's nothing hazardous about running this code, but noting the lengths that the coder went to in order to hide this line, it's probably a safe bet that it wasn't placed there to help you - the buyer - in any way. Search your code for the $cnk variable if you want to know exactly what's being done. Or better yet, chalk this experience up to a loss and find a better way to learn coding. There are plenty of books, video tutorials and free resources online. Do not place your trust in whoever sold you this code. While they may not have been malicious (people suggested in comments that this might be part of a license check), anyone who includes something like this in their code is not someone you should be learning from.
Good luck on your coding journey!
Template Strings.
This link might help a little bit:
Does PHP have a feature like Python's template strings?
My main issue is knowing whether there's a better way to store text strings.
Is this normally done with one folder (DIR) holding plenty of standalone files with different strings, so that, depending on what you need, you grab the contents of one file, process it and replace the {tags} with values?
Or is it better to define all of them inside one single file as an array[]?
greetings.tpl.txt
['welcome'] = 'Welcome {firstname} {lastname}'.
['good_morning'] = 'Good morning {firstname}'.
['good_afternoon'] = 'Good afternoon {firstname}'.
Here's another example, https://github.com/oren/string-template-example/blob/master/template.txt
Thx in advance!
Answers whose solution is to use include("../file.php"); will NOT be accepted. I'm looking for a solution that shows how to read a LIST of defined strings into an array. The definition is already array based.
To add values to templates, you can use strtr. Example below:
$msg = strtr('Welcome {firstname} {lastname}', array(
    '{firstname}' => $user->getFirstName(),
    '{lastname}' => $user->getLastName()
));
Regarding storing strings, you can save one array per language and then load only the relevant one. E.g. you'll have a directory with 2 files:
language
en.php
de.php
Each file should contain the following:
<?php
return (object) array(
'WELCOME' => 'Welcome {firstname} {lastname}'
);
When you need translations, you can just do the following:
$dictionary = include('language/en.php');
And the dictionary will then have an object that you can address. Changing the example above, it will be something like this:
$dic = include('language/en.php');
$msg = strtr($dic->WELCOME, array(
    '{firstname}' => $user->getFirstName(),
    '{lastname}' => $user->getLastName()
));
To handle the case where a template is missing from the dictionary, you can use a ternary operator with a default text:
$dic = include('language/en.php');
$tpl = isset($dic->WELCOME) ? $dic->WELCOME : 'Welcome {firstname} {lastname}';
$msg = strtr($tpl, array(
    '{firstname}' => $user->getFirstName(),
    '{lastname}' => $user->getLastName()
));
If you want the texts to be editable in a database, the usual approach is a simple export script (e.g. using var_export) that syncs them from the db to the files.
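Such an export script might look like the sketch below. export_dictionary() is a hypothetical name, and the $templates array stands in for rows fetched from the database. The generated file has the same shape as the language files above, so it can be loaded with include as before.

```php
<?php
// Hypothetical export step: write key => template pairs to a PHP language file.
function export_dictionary(array $templates, string $path): void
{
    // var_export() emits valid PHP source for the array, quoting included.
    $code = "<?php\nreturn (object) " . var_export($templates, true) . ";\n";
    file_put_contents($path, $code, LOCK_EX);
}

// Sample data standing in for a database query result.
$templates = array('WELCOME' => 'Welcome {firstname} {lastname}');

// A temporary file is used here so the sketch can run anywhere.
$path = tempnam(sys_get_temp_dir(), 'lang');
export_dictionary($templates, $path);

$dic = include $path;
echo strtr($dic->WELCOME, array('{firstname}' => 'Ada', '{lastname}' => 'Lovelace')), "\n";
unlink($path);
```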
Hope this helps.
OK John I will elaborate.
The best way is to create a PHP file for each language, containing the definition of an array of texts, using printf format for string substitution.
If the amount of text is very large, you might consider partitioning it further (a few MB is usually fine).
This is efficient in production, assuming the OS has a well-tuned file cache. Slightly more so if you use numerical indexes for the array.
It is much more efficient to let PHP populate the array than to do it yourself by reading a text file. This is, after all, static text, I assume?
If production performance is not an issue, please disregard this post.
greetings_tpl_en.php
<?php
$text_tpl = array(
    'welcome' => 'Welcome %s %s',
    'good_morning' => 'Good morning %s',
    'good_afternoon' => 'Good afternoon %s',
);
your.php
$language = "en";
require('greetings_tpl_' . $language . '.php');
....
printf($text_tpl['welcome'], $first_name, $last_name);
printf is a nice legacy from the C language. sprintf returns a string instead of outputting it.
You can find the full description of the php printf format here: http://php.net/manual/en/function.sprintf.php
(Do read Josef Kufner post again, when this is solved. +1 :c)
Hope this helps?
First, take a look at gettext. It is widely used and there are plenty of tools to handle the translation process, like xgettext and POEdit. It is more comfortable to use real English strings in the source code and then extract them using the xgettext tool. Gettext can handle the plural forms of practically all languages, which is not possible when using simple arrays of strings.
Very useful function to combine with gettext is sprintf() (or printf(), if you want to output text directly).
Example:
printf(gettext('Welcome %s %s.'), $firstname, $lastname);
printf(ngettext('You have %d new message.', 'You have %d new messages.',
$number_of_new_messages), $number_of_new_messages);
Then, when you want to translate this into a language where the last name usually precedes the first name, you can use this: 'Welcome %2$s, %1$s.'
The second example, the plural form, can be translated using more than two strings, because part of the localization file describes how plural forms are arranged. While for English it is nplurals=2; plural=(n != 1);, for Czech, for example, it is nplurals=3; plural=(n==1) ? 0 : (n>=2 && n<=4) ? 1 : 2; (three forms: the first for one item, the second for 2 to 4 items and the third for the rest). Irish, for example, has five plural forms.
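To make the plural rule concrete, here is the Czech expression written out as plain PHP. Gettext evaluates the rule from the catalog header itself; this sketch only imitates that selection step. czech_plural_index() is a hypothetical name and the three Czech strings are illustrative sample translations.

```php
<?php
// The Czech rule quoted above, written as a PHP function:
// nplurals=3; plural=(n==1) ? 0 : (n>=2 && n<=4) ? 1 : 2;
function czech_plural_index(int $n): int
{
    if ($n == 1) {
        return 0;
    }
    return ($n >= 2 && $n <= 4) ? 1 : 2;
}

// One translated string per plural form (illustrative sample data).
$forms = array(
    'Máte %d novou zprávu.',
    'Máte %d nové zprávy.',
    'Máte %d nových zpráv.',
);

foreach (array(1, 3, 5) as $n) {
    // Pick the form by index, then fill in the number.
    printf($forms[czech_plural_index($n)] . "\n", $n);
}
```

With ngettext() you never write this function; the translator's catalog header carries the rule and the msgstr[0], msgstr[1], msgstr[2] entries carry the forms.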
To extract strings from source code use xgettext -L php .... I recommend writing short script with the exact command fitting your project, something like:
# assuming this file is in locales directory
# and source code in src directory
find ../src -type f -iname '*.php' > "files.list"
xgettext -L php --from-code 'UTF-8' -f "files.list" -o messages.pot
You may want to add custom function names using -k argument.
You could store all the templates in one associative array and also the variables that are to replace the placeholders, like
$capt=array('welcome' => 'Welcome {firstname} {lastname}',
'good_morning' => 'Good morning {firstname}',
'good_afternoon' => 'Good afternoon {firstname}');
$vars=array('firstname'=>'Harry','lastname'=>'Potter', 'profession'=>'wizard');
Then, you could transform the sentences through a simple preg_replace_callback call like
function repl($a) {
    global $vars;
    return $vars[$a[1]];
}
function getcapt($type) {
    global $capt;
    $str = $capt[$type];
    $str = preg_replace_callback('/\{([^}]+)\}/', 'repl', $str);
    echo "$str<br>";
}
getcapt('welcome');
getcapt('good_afternoon');
This example would produce
Welcome Harry Potter
Good afternoon Harry
So I'm importing ExpressionEngine fields into a PHP array. I want to display one field, called {gearboxx_body}, unless that field has more than 300 characters, in which case I want to display a field called {article_blurb}. I'm pretty sure there isn't a way to do this just in ExpressionEngine fields and conditionals, so I tried some PHP, which I'm just starting to learn:
<?php
$info = array('{gearboxx_body}','{article_blurb}');
if(mb_strlen($info[0]) <= 300) {
    echo($info[0]);
}
else {
    echo($info[1]);
}
?>
So that works well, but there's a problem. If the tag includes any apostrophes or quote marks, it ends the string and the page won't load. So what can I do about this? I've tried to replace the quote marks in the string, but I have to have loaded the string from the fields first, and as soon as I do that the page is already broken.
Hopefully that made sense. Any suggestions?
I would recommend you handle this in an EE plugin rather than in the template:
- Faster to render (because you don't need the overhead of PHP in the templates)
- More secure and reliable
- Faster to develop once you get the basics of EE development down, which is a useful life skill
- All-around best practice
The plugin I have in mind takes three parameters:
body, blurb and character limit.
Let's say you call your plugin "Blurby". In the template you would just have this:
{exp:blurby body="{gearboxx_body}" blurb="{article_blurb}" char_limit="300"}
It variably returns either of your fields based on the logic you define in the plugin itself.
See plugin developer documentation.
Alternatively you could use the dreaded HEREDOC syntax to set variables before passing them into your array:
$body = <<<EOT
{gearboxx_body}
EOT;
$blurb = <<<EOT
{article_blurb}
EOT;
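With both fields held in variables, the original length check no longer breaks on quotes. The sketch below uses sample text in place of the EE tags so it can run on its own, and pick_field() is a hypothetical helper. It also swaps heredoc for nowdoc (the quoted <<<'EOT' form), which is my substitution: nowdoc does not interpolate, so a $ inside the field text is left alone as well.

```php
<?php
// Hypothetical helper: return the body unless it is longer than $limit characters.
function pick_field(string $body, string $blurb, int $limit = 300): string
{
    return mb_strlen($body) <= $limit ? $body : $blurb;
}

// Sample text standing in for {gearboxx_body}; quotes and $ are safe in nowdoc.
$body = <<<'EOT'
It's a "short" body that costs $5.
EOT;

// Sample text standing in for {article_blurb}.
$blurb = <<<'EOT'
A shorter blurb.
EOT;

echo pick_field($body, $blurb), "\n";
```

One caveat that applies to both heredoc and nowdoc: the field content must not contain a line that starts with the closing marker (EOT here).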
I would like my PHP website to be multilingual. I thought of using:
echo $lang[$_SESSION['lang']]['WellcomeMessage'];
but I found that I will be needing to format the text, say for example male/female or putting some values from the DB. So I thought that simple strings might not do the trick for formatting?
I know #define might have worked in C as the string translates to code, but I don't know how php does that. For example:
define ($lang['en']['credit_left'],'you have $credits_left');
define ($lang['sp']['credit_left'],'tienes $credits_left creditos mas');
Any suggestions?
This is the best way of dealing with a multilingual website that I can think of right now (though I'm not sure) that doesn't involve gettext, zend_translate or any PHP plugin or framework.
I think it's pretty straightforward: I have 3 languages and I write their "content" in different files (in the form of arrays), and later I call that content from my index.php, as you can see in the following picture:
alt text http://img31.imageshack.us/img31/1471/codew.png
I just started with php and I would like to know if I'm breaking php good practices, if the code is vulnerable to XSS attack or if I'm writing more code than necessary.
EDIT: I posted a picture so that you can see the files tree (I'm not being lazy)
EDIT2: I'm using Vim with the theme ir_black and NERDTree.
Looks all right to me, although I personally prefer creating and using a dictionary helper function:
<?php echo dictionary("showcase_li2"); ?>
that would enable you to easily switch methods later, and gives you generally more control over your dictionary. Also with an array, you will have the problem of scope - you will have to import it into every function using global $language; very annoying.
You will probably also reach the point when you have to insert values into an internationalized string:
You have %1 votes left in the next %2 hours.
Sie haben %1 stimmen übrig für die nächsten %2 stunden.
Sinulla on %1 ääntä jäljellä seuraavan %2 tunnin ajassa.
that is something a helper function can be very useful for:
<?php echo dictionary("xyz", $value1, $value2 ); ?>
$value1 and $value2 would be inserted into %1 and %2 in the dictionary string.
Such a helper function can easily be built with an unlimited number of parameters using func_get_args().
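A minimal sketch of such a helper follows. The built-in $strings array and its 'votes_left' key are hypothetical sample data; in practice the strings would come from the language file for the current user.

```php
<?php
// Sketch of a dictionary helper: first argument is the key,
// any further arguments fill the %1, %2, ... placeholders.
function dictionary(string $key)
{
    // Hypothetical sample data; normally loaded from a language file.
    static $strings = array(
        'votes_left' => 'You have %1 votes left in the next %2 hours.',
    );

    // Fall back to the key itself when no string is defined.
    $template = isset($strings[$key]) ? $strings[$key] : $key;

    // Everything after the key is a placeholder value.
    $values = array_slice(func_get_args(), 1);
    if ($values === array()) {
        return $template;
    }

    $replacements = array();
    foreach ($values as $i => $value) {
        $replacements['%' . ($i + 1)] = (string) $value;
    }
    // strtr() tries the longest keys first, so %10 is not mistaken for %1.
    return strtr($template, $replacements);
}

echo dictionary('votes_left', 3, 24), "\n"; // You have 3 votes left in the next 24 hours.
```

Because the placeholders are numbered, a translation is free to use them in a different order.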
It's OK generally. For instance, punBB's localization works this way. It is very fast. Faster than calling a function or an object's method or property. But I see a problem with this approach, since it doesn't support language fallbacks easily. I mean, if you don't have a string for Chinese, let it be displayed in English.
This problem becomes relevant when you upgrade your system and you don't have time to translate everything into every language.
I'd rather use something like
lang.en.php
$langs['en'] = array(
...
);
lang.cn.php
$langs['cn'] = array(
...
);
[prepend].php (some common lib)
define('DEFAULT_LANG', 'en');
include_once('lang.' . DEFAULT_LANG . '.php');
include_once('lang.' . $user->lang . '.php');
$lang = array_merge($langs[DEFAULT_LANG], $langs[$user->lang]);
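The fallback works because array_merge() lets later string keys override earlier ones. A self-contained sketch with inline sample arrays instead of the included files; merged_lang() is a hypothetical helper and the strings are placeholder data:

```php
<?php
// Sample data standing in for lang.en.php and lang.cn.php.
$langs = array(
    'en' => array('hello' => 'Hello', 'bye' => 'Goodbye'),
    'cn' => array('hello' => '你好'), // 'bye' is not translated yet
);

// Hypothetical helper: default language first, user language on top.
function merged_lang(array $langs, string $default, string $userLang): array
{
    $selected = isset($langs[$userLang]) ? $langs[$userLang] : array();
    // Keys present in $selected win; missing ones fall back to the default.
    return array_merge($langs[$default], $selected);
}

$lang = merged_lang($langs, 'en', 'cn');
echo $lang['hello'], ' / ', $lang['bye'], "\n"; // 你好 / Goodbye
```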
Looks all right to me also, but:
Seems that you have localization for multiple modules/sites, so why not break it down to multidimensional array?
$localization = array(
'module' => (object)array(
'heading' => 'oh, no!',
'perex' => 'oh, yes!'
)
);
I personally like to create stdClass out of arrays with
$localization = (object)$localization;
so you can use
$localization->module->heading;
:) my 2 cents
The only way this could lead to XSS is if you have register_globals=On and you don't set $lang['showcase_lil'] or the other $lang entries. But I don't think you have to worry about this, so I think you're in the clear.
As an XSS test:
http://127.0.0.1/whatever.php?lang[showcase_lil]=alert(/xss/)
Wouldn't it have been better to post code and briefly explain this issue to us?
Anyway, putting each language in its own file and loading it through some sort of language component seems okay. I'd prefer using some sort of gettext, but this is okay too, I guess.
You should make a function for calling the language keys rather than relying on an array, something like
<?php echo lang('yourKey'); ?>
One thing to watch for is interpolation; that's really the only place XSS could sneak in if your server settings are sensible. If you at any point need to do something along the lines of translating "$project->name has $project->member_count members", you'll have to make sure you escape all HTML that goes in there.
But other than that, you should be fine.
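One way to make that escaping hard to forget is to do it inside the formatting helper. This is a sketch only: lang_format() is a hypothetical name, and it assumes the translated template itself is trusted while the inserted values are not.

```php
<?php
// Hypothetical helper: escape every inserted value, then fill the template.
function lang_format(string $template, ...$values): string
{
    $escaped = array_map(function ($value) {
        // Convert <, >, &, and both quote types to HTML entities.
        return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
    }, $values);
    return vsprintf($template, $escaped);
}

// A project name containing markup is rendered harmless.
echo lang_format('%s has %d members', '<script>alert(1)</script>', 12), "\n";
// &lt;script&gt;alert(1)&lt;/script&gt; has 12 members
```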