What is the better site localization method? - php

I recently joined a team that was formerly a one man show to maintain and develop a company's PHP-mysql website.
The current localization method is, for each section of the site, there exists a file ending in _en.php and _fr.php that contains long lists of same named variables with text in the appropriate language. At the top of each content page, the user language is determined and then the appropriate 'dictionary' file is loaded.
What I am trying to promote as an alternative is a db table like (id, code, en, fr) and a function to look up the correct translation in the current page.
My boss tells me that the benefits of the first approach are having a context for each translation and having the translations under source control.
His concerns with my proposed approach are the lack of these things, and he doesn't like the idea of having two translation systems on the site.
My concerns are that this is data in a code file, which I was taught is a bad idea. To search for a string you have to use an IDE search tool, so I don't see how a non-programmer would be comfortable editing these.
So, is his approach better? Is mine better, but only marginally and not worth rocking the boat? Is the current system a disaster waiting to happen that I shouldn't let go?

I think that for interface things (name, surname, text in buttons, etc.) it is more natural to use a resource file. In .NET we use .resx; in PHP, an include file is enough.
Loading a file with an include is not resource-consuming; parsing XML would be.
If we were talking about big texts, I would put them in a db under a separate code, simply because I would normally have a back office to modify that content, not for performance reasons.
Keep in mind that DB access has a cost too; how much depends on the number of users.
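The include-file approach described above can be sketched roughly as follows. This is only an illustration: the file names (account_en.php, account_fr.php), the loadDictionary() helper, and the English fallback are assumptions, not part of the site in the question. The files are written to a temp directory here only so the sketch runs on its own.

```php
<?php
// Hypothetical dictionary files; on a real site these would be checked-in
// files such as account_en.php and account_fr.php, each returning an array.
$dir = sys_get_temp_dir() . '/i18n_demo_' . uniqid();
mkdir($dir);
file_put_contents("$dir/account_en.php", "<?php return ['login' => 'Log in', 'logout' => 'Log out'];");
file_put_contents("$dir/account_fr.php", "<?php return ['login' => 'Connexion'];");

/**
 * Load the dictionary for one section of the site, falling back to
 * English for any key the requested language does not define.
 */
function loadDictionary(string $dir, string $section, string $lang): array
{
    $supported = ['en', 'fr'];            // whitelist: never build a path from raw input
    if (!in_array($lang, $supported, true)) {
        $lang = 'en';
    }
    $base = include "$dir/{$section}_en.php";
    if ($lang === 'en') {
        return $base;
    }
    $dict = include "$dir/{$section}_{$lang}.php";
    return $dict + $base;                 // array union: keeps $dict, fills gaps from $base
}

$text = loadDictionary($dir, 'account', 'fr');
echo $text['login'], "\n";    // translated entry
echo $text['logout'], "\n";   // missing in French, so the English entry is used
```

Because the dictionaries stay plain PHP files, they remain under source control, which is the benefit the boss in the question cares about.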

Related

Best practice for dynamically translating content into different languages

I am the project manager on a website that needs to be converted into multiple languages. I am trying to figure out what the best option to go with is. I don't have a problem paying for something, but I just want to make sure it will work properly.
One option I have thought of is to (somehow) integrate Google Translate so that, when the user clicks on the language they want to read the page in, the page is translated into that language. I did work with Google Translate a little bit, but I found it to be a little clumsy. Maybe I am not using it properly.
Another alternative, definitely not the best idea but a backup if need be, is to put the content in a database and pull it depending on the user's language. The only problem I have is that changing one word in the English version would mean changing it in every other language.
I am open to any other idea. I can clarify the project more, if need be.
As someone who speaks several languages, I can assure you that Google Translate often misses the mark. In many cases their translations are embarrassing, especially when you try to translate individual words or phrases without a sufficient context. Some language pairs are better than others, but overall this is not an option at this point.
Compiled languages have the advantage of static i18n, where a different version of the code is compiled for each UI language.
Database-driven dynamic i18n is a bad option, and almost all programming frameworks try to avoid it. I would recommend, therefore, that you look for an i18n solution that works with properties (text) files to look up translated strings. In PHP this is gettext or intl.
Note also that i18n involves not only translation of text, but it also requires appropriate localization of dates, numbers, currencies, etc.
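To make the point about dates and numbers concrete, here is a minimal sketch in plain PHP. The LOCALE_FORMATS table and both helper functions are made up for illustration, and the table is deliberately simplified; a real project would more likely rely on the intl extension (NumberFormatter, IntlDateFormatter) than on a hand-written table like this.

```php
<?php
// Illustrative conventions only (simplified); not a substitute for intl.
const LOCALE_FORMATS = [
    'en-US' => ['decimal' => '.', 'thousands' => ',', 'date' => 'm/d/Y'],
    'fr-FR' => ['decimal' => ',', 'thousands' => ' ', 'date' => 'd/m/Y'],
    'de-DE' => ['decimal' => ',', 'thousands' => '.', 'date' => 'd.m.Y'],
];

// Format a number with the separators of the given locale (en-US if unknown).
function formatNumber(float $value, string $locale): string
{
    $f = LOCALE_FORMATS[$locale] ?? LOCALE_FORMATS['en-US'];
    return number_format($value, 2, $f['decimal'], $f['thousands']);
}

// Format a date in the day/month order of the given locale (en-US if unknown).
function formatDate(DateTimeInterface $date, string $locale): string
{
    $f = LOCALE_FORMATS[$locale] ?? LOCALE_FORMATS['en-US'];
    return $date->format($f['date']);
}

$when = new DateTimeImmutable('2024-03-05');
foreach (array_keys(LOCALE_FORMATS) as $locale) {
    echo $locale, ': ', formatNumber(1234567.891, $locale),
         ' | ', formatDate($when, $locale), "\n";
}
```

The same value reads as 03/05/2024 for one audience and 05/03/2024 for another, which is why translating the text alone is not enough.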
I don't have a problem paying for something, but I just want to make sure it will work properly.
Based on that statement, I would suggest that hiring a firm that specializes in translation is your best bet; then just add links that lead to each language version of your website.
Problems that you might encounter:
Adjusting content: some translations might be too short, some too long.
Using Google Translate can ruin your site, because it sometimes fails, especially for some languages.

Can XML be used to store, modify and retrieve data in PHP like it can with MySQL? Managing data for general-purpose storage (eg. CRUD, CMS)

This is a long and old question that doesn't get to the point. I basically wanted to know practices involving flat files and the extent they could be used, as a replacement for SQL, mostly in terms of multi-user capability.
At the time, I was wanting to replicate a SQL table editor interface with flat files, allowing collaborative editing. Basically like a multiuser Excel, with an automated data-entry interface, and interactive sortable tables.
I also wanted to build a CMS index page for a server, which parsed text files in order to construct a dynamic webpage, which allowed for easy updating/managing.
I've been beginning to learn MySQL, and XML. For dynamic data storage, I prefer XML over MySQL because it doesn't require a server and can be edited within a text-editor, but I'm unsure whether they can be used for similar things.
(I know that MySQL and XML are two completely different things, but I'm looking at this in regards to data storage.)
In the past I've manually stored lists of stuff in *.txt files (to keep track of things), sometimes with multiple fields per-row kinda thing, like a table, or lines with related data. HTML tables are good for this, but it would be even nicer to be able to edit directly in the page without the need for a text editor, and in certain situations, allow multiple persons (collaborative editing) to edit different sections at the same time.
(I want to use PHP to create scripts that can do this - allow editing of files in browser, including collaborative. I want to learn data manipulation methods in general.)
So basically, I want to create an index for whatever scripts and documents I'd want to display, in the form of a Content Management System. I'd want pages to be modular somehow.. Some modules would be a CRUD (create, read, update, delete) with tabular data, another module could be a pastebin-like text dump derived from a PHP script, some sort of article-publishing system for wiki-like linked articles, single articles or blog posts.
Anyway, I've made scripts that parse XML files, and I like the idea of separating content from presentation, but I don't know how/if XML could be incorporated into a CMS (or any dynamically-editable situation), as most popular ones use MySQL. This is only for personal use and not for some big site, and it would be nice for it to be simple and portable, only requiring the Web server. I'd only prefer MySQL as a last resort, as I don't like having to setup MySQL every time I switch servers, or going through MySQL connection errors.
What should I do / Any suggestions?
I prefer XML over MySQL because it doesn't require a server
I prefer to travel on foot rather than on wheels because it doesn't require a car. So, I spend 6 hours to get to my job and back every day.
XML can be edited within a text-editor
In theory.
In practice, XML is bound with such a number of strict rules and standards that you scarcely can edit a comma without breaking the whole file.
Face the truth - it is for programs, not humans.
In the past I've manually stored lists of stuff in *.txt files
You'd better stick with this approach.
HTML tables are good for this,
HTML tables are worse for this, even worse than XML.
I want to create an index for whatever scripts and documents I'd want to display, in the form of a Content Management System.
You are getting Content Management Systems wrong. It's a Content Management System, not a scripts management system. It merely manages the content, the data stored somewhere.
I like the idea of separating content from presentation,
I like it too, but your XML has nothing to do with this idea.
What should I do / Any suggestions?
Learn to drive a car. Do not remain a pedestrian.
Learn databases.
An interesting read: http://www.joelonsoftware.com/articles/fog0000000319.html
So you have two questions: "Can XML be used like MySQL?" (No, it must be treated differently: use XPath instead of SQL, etc.) and "Can XML be used for building a CMS?" (Yes, there are some like that, e.g. GetSimple CMS - see http://get-simple.info/start/)
You must be aware that XML is suitable only for smaller amounts of data, but in that case you probably don't need the weight of a database.
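As a rough illustration of "XPath instead of SQL", the sketch below runs create, read, update and delete against a small XML document using PHP's SimpleXML extension. The <books> structure and its values are invented for the example.

```php
<?php
// Hypothetical data set; element and attribute names are made up.
$source = <<<XML
<books>
  <book id="1"><title>PHP Basics</title><year>2008</year></book>
  <book id="2"><title>XML in Practice</title><year>2010</year></book>
</books>
XML;
$xml = simplexml_load_string($source);

// Create: append a new record.
$new = $xml->addChild('book');
$new->addAttribute('id', '3');
$new->addChild('title', 'Flat Files');
$new->addChild('year', '2011');

// Read: XPath plays the role of a WHERE clause.
$hits = $xml->xpath('//book[year > 2009]');
foreach ($hits as $book) {
    echo $book['id'], ': ', $book->title, "\n";
}

// Update: find one record and change a field in place.
$second = $xml->xpath('//book[@id="2"]')[0];
$second->title = 'XML in Practice, 2nd ed.';

// Delete: remove the first <book> element.
unset($xml->book[0]);

// Persist: asXML() returns the document as a string (pass a path to write a file).
echo $xml->asXML();
```

Nothing here handles two people saving at the same time. That would need file locking (for example flock) around every read-modify-write cycle, which is the kind of work a database server already does for you.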

How to make a Multilanguage website [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
Can anybody tell me how to make a dynamic multilanguage website in PHP and MySQL? I have no idea about it. I searched on Google and didn't find any good solutions.
Can anyone provide me with a step-by-step guide? If possible, make a demo of a multilanguage website, or please refer me to a link that explains the details.
Short answer: there is no short answer, as there are a lot of variables to consider, and plenty of work to do. So...
Long answer: I'm going to break it down as well as I can, but there isn't a "good for all" answer to a question as broad as yours.
First, variables of the task at hand:
List of languages: will your site be in a predefined set of languages, or will it be varied/heterogeneous? For example, a site may be entirely bilingual in two well defined languages (or, to put another example, I run an English/Catalan/Spanish site); or different sections could be available on different sets of languages (for an example, look at MS's sites: they are mostly homogeneous, but stuff like blogs, KB articles, and some docs are just available in a subset of the supposedly supported languages).
Translations source: is content provided in each relevant language by you or some collaborator? Or are some versions run through translation software from a single "base" language? The first approach takes a lot of extra work to produce the contents, but yields higher quality results than the second.
Languages themselves: once you have 1) and 2) answered, you will need to be aware of exactly which languages you are working with. Note that if you include dialects (e.g. US English + UK English, or Argentina Spanish + Spain Spanish), you may encounter some "duplicate content" issues with search engines, but details on that are too off-topic here (just mentioning so you are aware of the potential issues).
Are you targeting languages in the abstract (for example, my site offers the three languages without caring at all where the visitor is: that's what I have, so choose what you prefer); or rather targeting different regions/countries? In the latter case, things can get extra complex, as you may need to care about other stuff besides languages (like timezones, currencies, or date-time format conventions, to name some), but you get the benefit of being able to use country-specific TLDs.
Once you have the above well-defined, let's start working. These are the most prominent tasks you'd need to handle:
Language detection: the most reasonable approach is to use a GET parameter (something like ?lang=en-us on the URL). Also, you might use some cookie and/or IP geolocation to fall back when a URL with no language argument is requested. Also, if you have the means, consider the topic of URL beautification (what looks better: example.com/index.php?lang=en-us or example.com/en-us/home?). Personally, I love the power ModRewrite grants to my .htaccess file, but that'll only work on Apache-powered servers.
Content management: regardless of whether you are fetching content from a DB (like article content), include files (typical for breadcrumbs, menus, site-wide headings, etc), or any other means, you will need some way to separate each version (language) of the content. Here are some examples of how it can be done:
For DB content, my best advice is to come up with some solid field naming pattern and stick to it. For example, I append _en, _es, or _ca to all language-dependent fields of my DB. This way, I can access the right content with expressions like $row["title_$lang"].
For include files, again a file naming convention is the sanest approach. In my case, I have file names ending with .en.php, .ca.htm, etc. My include calls then look like include("some-filename.$lang.php").
From time to time, you will be spitting out small chunks of text directly from your PHP code (for example, when labeling the headings of a table). You can use an include file per language defining a "chunks" array with the same keys, or a DB table like Geert suggested. The former approach takes less work to develop, the latter should take less work to maintain (especially if you aren't working alone).
Language pick: quite essential, you should provide your users a way to choose their own language, other than tweaking the GET arguments on the URL itself. For few languages, "flags" often work great, since they can be understood even if the page has initially fallen back to a language the user doesn't know at all. For more languages, a dropdown menu seems more efficient (in terms of viewport space), but you should make sure to add some visual (ie: non-textual) hints. Some sites force you to pick a language upon entering, and only have links to the home-page on each language. Personally, I have my three flags standing out on top of my site's menu, each pointing to the current address with only the language argument changed. A code like this can work quite well:
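function translatedURI($new_lang) {
    global $lang; // current language, assumed to be set during language detection
    // Rebuild the current address with only the language argument changed.
    $uri = "http://" . $_SERVER["HTTP_HOST"] . $_SERVER["REQUEST_URI"];
    return str_replace("lang=$lang", "lang=$new_lang", $uri);
}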
CMS tweaking: if your site (or part of it) is using some kind of CMS, discussion board, etc, things can get quite messy. Speaking from my own experience, I have a phpBB forum on my site split in three main categories (one per language), in such a way that they look like three independent forums (but users just need to login/register on one of them to gain access to all languages, since they are indeed just categories of the same board). The tweaks I had to make for this to work smoothly threatened the last remnants of sanity I still keep :S. For these cases, I advise looking up the docs and support features of the specific software you are using.
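The language detection step described above could look something like the following sketch. The detectLanguage() function and its parameters are hypothetical, and the Accept-Language fallback is an assumption added here (the answer itself mentions cookies and IP geolocation). Whatever the source, the result is checked against a whitelist before it is used in field names such as title_$lang or in include paths.

```php
<?php
/**
 * Pick a language: URL argument first, then cookie, then the browser's
 * Accept-Language header, then a default. Only whitelisted codes are returned.
 */
function detectLanguage(array $get, array $cookie, string $acceptHeader,
                        array $supported, string $default): string
{
    foreach ([$get['lang'] ?? null, $cookie['lang'] ?? null] as $candidate) {
        if (is_string($candidate) && in_array(strtolower($candidate), $supported, true)) {
            return strtolower($candidate);
        }
    }
    // The header looks like "ca,es;q=0.8,en;q=0.5". Simplification: entries are
    // tried in the order listed and the q-values are ignored.
    foreach (explode(',', $acceptHeader) as $entry) {
        $code = strtolower(trim(explode(';', $entry)[0]));
        $primary = explode('-', $code)[0];    // "en-us" -> "en"
        if (in_array($primary, $supported, true)) {
            return $primary;
        }
    }
    return $default;
}

// On a real page the inputs would be $_GET, $_COOKIE and
// $_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? ''. Fixed values keep the demo repeatable.
$supported = ['en', 'es', 'ca'];
$lang = detectLanguage(['lang' => 'ca'], [], 'en-US,en;q=0.9', $supported, 'en');

// Hypothetical DB row following the field-suffix convention.
$row = ['title_en' => 'Home', 'title_es' => 'Inicio', 'title_ca' => 'Inici'];
echo $row["title_$lang"], "\n";
```

Since $lang can only ever be one of the whitelisted codes, expressions like $row["title_$lang"] or include("some-filename.$lang.php") cannot be steered by a crafted URL.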
Well, that's everything I can come up with for now. I think you should have enough to roll up your sleeves and get to work. Then, if you hit some wall on your path, come back with specific questions and I'm confident you'll get more specific answers.
Hope this helps.
The solution I always use is to create a database table, with all possible messages. For each message I use a code (abbreviation) to look it up. So for example:
lang  id        message
en    login     Login
en    lostpass  Lost your password?
de    login     Anmelden
de    lostpass  Passwort vergessen?
nl    login     Aanmelden
nl    lostpass  Wachtwoord vergeten?
etc. Looking up the translations is usually fast enough using a MySQL query, but you can also place all messages in an array and load it into memory when your script loads. Users should always be able to set the language they prefer; don't rely blindly on the language header set by the web browser.
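A minimal sketch of the "load all messages into an array" idea follows. It uses PDO with an in-memory SQLite database purely so the example runs without a server; the answer itself uses MySQL, and the table name messages plus the helpers loadMessages() and msg() are assumptions made for the illustration.

```php
<?php
// SQLite in memory stands in for MySQL here (assumption for the demo).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE messages (lang TEXT, id TEXT, message TEXT, PRIMARY KEY (lang, id))');

$insert = $pdo->prepare('INSERT INTO messages (lang, id, message) VALUES (?, ?, ?)');
$rows = [
    ['en', 'login', 'Login'],    ['en', 'lostpass', 'Lost your password?'],
    ['de', 'login', 'Anmelden'], ['de', 'lostpass', 'Passwort vergessen?'],
];
foreach ($rows as $row) {
    $insert->execute($row);
}

// One query per request: fetch every message for the language as id => message.
function loadMessages(PDO $pdo, string $lang): array
{
    $stmt = $pdo->prepare('SELECT id, message FROM messages WHERE lang = ?');
    $stmt->execute([$lang]);
    return $stmt->fetchAll(PDO::FETCH_KEY_PAIR);
}

// Look up a code; a missing translation shows up as "[code]" instead of failing.
function msg(array $messages, string $id): string
{
    return $messages[$id] ?? "[$id]";
}

$messages = loadMessages($pdo, 'de');
echo msg($messages, 'login'), "\n";
echo msg($messages, 'lostpass'), "\n";
```

Loading the whole language once keeps the page at a single query, at the cost of holding every message in memory for the request.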
I am now designing a very tiny CMS that must be multilanguage.
One of the features that concerns me most is that the client can spontaneously decide to add or remove a language.
For this reason, I am not basing the design on adding suffixes to the database tables: I cannot (and do not want to) modify the table names or access them using dynamic names, nor add or remove fields each time a language is defined or removed.
I would not use files either, just because I like databases and they are easy to maintain.
And lastly, I think of two types of translation:
The web text.
The content text.
Therefore, my design aims to:
languages A table with the languages defined.
translations A single table that will have all the messages, as follows:
[pk] table_name the name of the table whose content will be translated.
[pk] field_name the name of the field whose content will be translated.
[pk] row_id the row identifier of the item that will be translated.
[pk] language the language the text is translated into.
text the translated text.
That means that the fields which would hold content in a single-language scenario will now be empty, because their content will always be in the translations table.
That will increase the SQL queries complexity, but it allows me to develop tools to maintain the translations in an easy way. Also, the complexity of the SQL will exist only once, just when implementing the solution. If that implementation is properly designed, the maintenance / extensibility of the site doesn't have to be a major problem.
Edit:
After some conversation with developer friends, I think that the solution I am approaching here puts too much load on a single table.
Another approach that I will study from now on is creating an extra table for each "translatable table" as follows:
any_translatable_table: the table that needs any of its fields translated.
any_translatable_table_translations: the table where the translations will be stored.
[pk] field_name the name of the field whose content will be translated.
[pk] row_id the row identifier of the item that will be translated.
[pk] language the language the text is translated into.
text the translated text.
This scheme inherits the concepts from the first one, but separates its content per table. This alternative may improve performance and isolate problems (such as index problems).
The extra translation table for each "translatable table" will be created at the same time as the original one.
As for the SQL queries, the complexity is still the same: the first approach needs the table name to search the translations table, while the second just adds the suffix "_translations" to the table name.
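The per-table scheme could be sketched like this. The articles table, the translatedRow() helper, and the fallback-language behaviour are assumptions added for illustration, and in-memory SQLite is used only so the example runs without a server.

```php
<?php
$pdo = new PDO('sqlite::memory:');    // stand-in for the real database (assumption)
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// The translatable table keeps only language-neutral columns.
$pdo->exec('CREATE TABLE articles (id INTEGER PRIMARY KEY, created TEXT)');
// Its companion table uses the composite key described above.
$pdo->exec('CREATE TABLE articles_translations (
    field_name TEXT NOT NULL,
    row_id     INTEGER NOT NULL,
    language   TEXT NOT NULL,
    text       TEXT NOT NULL,
    PRIMARY KEY (field_name, row_id, language)
)');

$pdo->exec("INSERT INTO articles (id, created) VALUES (1, '2012-01-01')");
$add = $pdo->prepare('INSERT INTO articles_translations VALUES (?, ?, ?, ?)');
$seed = [['title', 1, 'en', 'Hello'], ['title', 1, 'es', 'Hola'], ['body', 1, 'en', 'Welcome.']];
foreach ($seed as $t) {
    $add->execute($t);
}

/**
 * Return field => text for one row, preferring $lang and filling any field
 * that has no translation yet from $fallback.
 */
function translatedRow(PDO $pdo, string $table, int $rowId, string $lang, string $fallback = 'en'): array
{
    // Table names cannot be bound as parameters, so accept only known tables.
    if (!in_array($table, ['articles'], true)) {
        throw new InvalidArgumentException("Unknown translatable table: $table");
    }
    $stmt = $pdo->prepare(
        "SELECT field_name, language, text FROM {$table}_translations
         WHERE row_id = ? AND language IN (?, ?)"
    );
    $stmt->execute([$rowId, $lang, $fallback]);

    $fields = [];
    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $t) {
        // The requested language always wins; the fallback only fills gaps.
        if ($t['language'] === $lang || !isset($fields[$t['field_name']])) {
            $fields[$t['field_name']] = $t['text'];
        }
    }
    return $fields;
}

print_r(translatedRow($pdo, 'articles', 1, 'es'));
```

Adding or removing a language then means inserting or deleting rows, with no change to table or column names, which is the requirement stated at the start of this answer.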

Several copies of a PHP site with tweaks: maximize code reuse and minimize duplication?

Sorry for the confusing title....
We are developing an application to be used by multiple companies. For the most part, the application is the same, your standard sort of database manipulation pages (search pages, edit pages, etc.) customized for the data that it is designed for.
However, each company has a slightly different process, and we will be dealing directly with each company so we'd like to use some sort of system that would allow us to tweak pages depending on which company is viewing the page. For example, one company might want a couple extra fields on a data input page, or another company might want to view a different piece of data on a search results screen, and so on.
I understand this is all hypothetical and I wish I had a concrete example to give you, but honestly the companies haven't even given us very good examples. We just want to be ready.
So my basic question is, what is the most flexible way to allow for these tweaks and customizations on a per-company basis? Obviously, the most flexible but least programmer-friendly way would be to make a complete copy of the app for each company. This obviously isn't an option because we'd need to manage updating code on all the sites, trying to keep them all running and tested and having issues resulting from the customized code.
What are your thoughts on Smarty being a solution to this? Perhaps if we have a master set of templates, but then each company can have a different subfolder with any replacement template files... Of course we'd still need to update a bunch of different template files whenever we change one of them, but it would be a little more localized anyway.
Is there a better way? Some sort of differencing template engine maybe, so that we can still edit the original files and the changes will adapt on top of the originals (kind of like a patch)? Or perhaps we should use the object-oriented features of PHP5 and then use polymorphism? What is your best suggestion, and especially if you've had experience with this sort of thing, what are the options and which have you used and why?
I think the template method pattern will help you out a lot. It's really a great pattern for factoring stuff that is mostly the same but differs in a few places. I'm actually working out a template method hierarchy for my own project right now.
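As a rough sketch of how the template method pattern might apply to the per-company tweaks in the question: the base class fixes the page skeleton and each company overrides only the steps it needs. The class names and the "region" column are invented for the example.

```php
<?php
abstract class SearchResultsPage
{
    // Template method: the skeleton is fixed (final); subclasses change only the hooks.
    final public function render(array $rows): string
    {
        $out = $this->header() . "\n";
        foreach ($rows as $row) {
            $cells = [];
            foreach ($this->columns() as $col) {
                $cells[] = $row[$col] ?? '';
            }
            $out .= implode(' | ', $cells) . "\n";
        }
        return $out . $this->footer();
    }

    // Hooks with sensible defaults shared by every company.
    protected function columns(): array { return ['id', 'name']; }
    protected function header(): string { return implode(' | ', $this->columns()); }
    protected function footer(): string { return ''; }
}

// The standard application, unchanged.
class DefaultSearchResultsPage extends SearchResultsPage {}

// Hypothetical company that wants one extra column on the results screen.
class AcmeSearchResultsPage extends SearchResultsPage
{
    protected function columns(): array { return ['id', 'name', 'region']; }
}

$rows = [['id' => 1, 'name' => 'Widget', 'region' => 'EU']];
echo (new DefaultSearchResultsPage())->render($rows);
echo (new AcmeSearchResultsPage())->render($rows);
```

A fix to render() reaches every company at once, while each customization stays in one small subclass instead of a full copy of the page.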
I would suggest you try to create the application either using an mvc framework or using your own implementation of mvc.
In this manner you could create models that could be reused (and also views) for other companies.

internationalization of php website

I am currently working on a project / website and I will need to make it available in several languages. The site was done with PHP / MySQL and a lot of JavaScript (jQuery). I have no idea where to start and I was hoping somebody could give me some hints. I would like to know opinions about what the best approach is, whether there are good tools for such a PHP site, and what to do with the existing scripts, or rather with the text inside the scripts, which needs to be translated as well. Has anybody had to do something like this before who could point me down the right path :) ??
thanks
There are a number of ways of tackling this. None of them is "the best way", and all of them have problems in the short term or the long term. The very first thing to say is that multilingual sites are not easy: translators are lovely people but hard to work with, and most programmers see the problem as a technical one only. There is also another dimension, outside the scope of this answer, as to whether you are translating or localising. Localising involves looking at the target audience's cultural mores and then tailoring language, style, layout, colour, typeface, etc. to that culture. Finally, do not use MT (machine translation) for anything serious or anything that needs to be accurate, and when hiring translators, ensure that they are translating from a foreign language into their native language, which means that they understand all the nuances of the target language.
Right. Solutions. On the basis that you do not want to rewrite the site, simply clone the site you have and translate the copies into the target languages. Assuming the code base is stable, you can use a VCS to manage any code changes. You can tweak individual parts of the site to fit the target language; for example, French text is on average 30% larger than the equivalent English text, so using one site to deliver this means you may (will) have formatting problems and need to swap a different CSS file in and out depending on the language. It might seem a clunky way to do it, but then how long are the sites going to exist? The management overhead of doing it this way may well be less than other options.
Second way, without rebuilding: replace all content in the current site with tags and put the text for each language in files or db tables. Sniff the user's desired language (do you have registered users who can set a preference, do you want to read the browser language tag, or is the choice going to be made by URL: dot-com, dot-fr, dot-de?) and then replace the tags with text in the target language. Then you need to address the sizing issues and the image issues separately. This solution is in effect what frameworks like Symfony and Zend do to implement l10n.
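The tag-replacement idea above can be sketched in a few lines. The {{key}} tag syntax, the renderTemplate() helper, and the sample catalog are assumptions for illustration; this is not how Symfony or Zend implement it internally.

```php
<?php
/**
 * Replace {{key}} tags in a template with the strings for one language.
 * Values are HTML-escaped because they end up inside markup.
 */
function renderTemplate(string $template, array $strings): string
{
    $pairs = [];
    foreach ($strings as $key => $text) {
        $pairs['{{' . $key . '}}'] = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    }
    return strtr($template, $pairs);    // all tags are replaced in a single pass
}

// The page keeps only tags; the text lives in per-language catalogs
// (inline arrays here, files or db tables in practice).
$template = '<h1>{{welcome}}</h1><a href="/cart">{{cart}}</a>';
$catalog = [
    'en' => ['welcome' => 'Welcome',   'cart' => 'Your cart'],
    'fr' => ['welcome' => 'Bienvenue', 'cart' => 'Votre panier'],
];

echo renderTemplate($template, $catalog['fr']), "\n";
```

The longer French strings in the output hint at the sizing issue mentioned above: the markup is identical, but the layout still has to cope with text of different length.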
Then you could rebuild with a framework or with gettext and possibly have a cleaner solution, but remember that frameworks were designed to solve other problems, not translation, and the translation component has come into the framework as a partial solution, not the full one.
The big problem with all the solutions is ongoing maintenance, because not only do you have a code base but also multiple language bases to maintain. Unless your all-in-one solution is really clever and effective, the ongoing task will be difficult.
