I've been creating websites in PHP for some years, but I have never had to manage multi-language scenarios. I plan to create from scratch a website which will be available in French, English, Spanish and German, and I'd like to avoid common mistakes! :-)
I have already read various blogs and posts, and this is how I see things for now:
Regarding the URLs, I will use static routes which associate each URL with a specific controller/action. This should give me SEO-friendly URLs and should be quite fast (I won't use regular expressions; I'll compare the URL parts to determine which route to use).
Note that I won't have too many pages, probably fewer than 100, so the routes shouldn't be hard to define.
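A minimal sketch of such a static route table, assuming an exact-match lookup on the cleaned path. The paths, controller names and the resolveRoute() helper are hypothetical and only illustrate the idea:

```php
<?php
// Hypothetical route table: localized path => [language, controller, action].
$routes = [
    'en/contact'  => ['en', 'ContactController', 'show'],
    'fr/contact'  => ['fr', 'ContactController', 'show'],
    'de/kontakt'  => ['de', 'ContactController', 'show'],
    'es/contacto' => ['es', 'ContactController', 'show'],
];

/**
 * Resolve a request path by exact array lookup (no regular expressions).
 * Returns the route entry, or null when no route matches.
 */
function resolveRoute(array $routes, string $requestUri): ?array
{
    // Drop the query string, then trim leading/trailing slashes.
    $path = trim((string) parse_url($requestUri, PHP_URL_PATH), '/');
    return $routes[$path] ?? null;
}
```

With fewer than 100 pages, a plain array lookup like this stays cheap, and the table itself can come from the cached database content.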
Regarding the user interface, I'll have one template per language so that I can make adjustments (change the buttons, personalize the design for a specific country, …).
I plan to use the database to store most of the website content (routes, menus, error messages, static page contents, page titles …).
I will separate the localized content in different tables in order to minimize the size of each table.
I chose this option so that the content can easily be edited through the GUI (I want admins to be able to change the translations if they want, without FTP or phpMyAdmin access).
I don't expect any extra load on the SQL server, since the content, which should be almost static (content pages, menus, error messages, route list, etc.), will be cached, and the cache will be rebuilt only when content is updated via the GUI.
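One way this caching could look, as a rough sketch: the cache is a generated PHP file that returns an array, and the admin GUI deletes it after every update. The function names and the $loader callable (which would run the real SQL queries) are assumptions for illustration:

```php
<?php
/**
 * Return cached data from $cacheFile, rebuilding it with $loader when missing.
 * $loader is a hypothetical callable that would run the real SQL queries.
 */
function loadCached(string $cacheFile, callable $loader): array
{
    if (is_file($cacheFile)) {
        return include $cacheFile; // the file returns a plain PHP array
    }
    $data = $loader();
    $php  = '<?php return ' . var_export($data, true) . ';';
    file_put_contents($cacheFile, $php, LOCK_EX);
    return $data;
}

/** Call this from the admin GUI after any content update. */
function invalidateCache(string $cacheFile): void
{
    if (is_file($cacheFile)) {
        unlink($cacheFile);
    }
}
```

The database is then only queried on the first request after an edit; every other request reads the generated file.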
My question is the following:
- What do you think about my plan? Do you see any important drawbacks in the choices I made? Did I forget something important?
Thanks in advance!
NOTES:
I don't plan to use a framework, as I want to do things myself in order to improve my knowledge.
I already use UTF-8 everywhere.
I follow the MVC pattern.
I'd like to avoid templating language, and keep only PHP in my views.
I'm using the Laravel framework and MySQL to build a site which needs to be translated into different languages (3 at the moment, but that may increase). With Laravel it is relatively easy to create language files for different languages and call them in the HTML for relatively static strings. However, for data coming out of a database, what is the best practice for localizing such data? It seems that storing the localized versions of the strings would work well in the sense that they can all be added at the same time as the data is created; however, it seems it would be horrible when adding a new language. Using language files for the data seems better for adding new languages, but would be annoying to keep up with when new rows are added to the database.
I've come up with a solution to translate some of the data in a relatively static selectbox (it uses a static array of ids/names of some of the most common data) so that users may be able to utilize that part of the site pretty well no matter the language, but the data as it shows up on the rest of the site is naturally unaffected by that change.
What is the best practice for handling such an issue? The few solutions I can come up with all seem rather flawed.
Note: the new rows are added on a management site which is separate from the main site, but shares the same data in the database. The people adding the new rows would not have direct access to any language files.
Instead of reinventing the wheel, you can try something like Laravel Translation - https://github.com/Waavi/translation
I'm working on some PHP web apps that are almost identical, except for the content database, texts, and themes. Think of something similar to the Stack Exchange sites.
The objective is to maintain only one project in a single repository, so that if I introduce a new feature, I only have to implement it once, and not for every site.
For the themes and database this is not a problem, but for the texts I don't know how to proceed (right now the texts are hardcoded in the PHP files).
I've googled and searched on SO, and I've found some similar questions, like this one, where the answer was to use gettext for i18n.
But in my case it's not exactly i18n, because some of the sites are in the same language but the texts are different.
How can I store these strings?
There are tons of ways to get this done. I have a fitting project scenario where I achieve different things in different ways:
All instances of a project come from the same SVN folder.
Each instance has its own config-file, which is not included in SVN.
In addition to that, every project has its own:
Textbase (Labels, etc)
Templateset
Content
I use my own translation system to distribute the textbase over my instances. You can use some of PHP's built-in functions: http://verens.com/2008/04/03/translation-in-php/
The Templateset is just done by a config-variable that sets the root directory for the template-engine: ./templates/instance-x/...
The content could also be referenced by a config variable. In my case, I have a shop system that uses a product base, which is used by all instances identically. To make a product available/visible to the public, you have to assign it to a category. Categories are instance-dependent.
Categories have an instance-specific ID (in my case it's the shop_id).
You could, for example, create a table called "texts" in each database that contains the specific texts for that app. As you already have to connect to a different database for each app, you can easily always run something like SELECT value FROM {current_database}.texts WHERE `key` = "headertext" (key is a reserved word in MySQL, hence the backticks).
That way, your code can stay the same and only the databases differ from each other.
UPDATE: And of course i18n is also easily implemented this way, by adding another column to the texts table containing the locale (like en, de or nl).
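A minimal sketch of such a lookup with PDO, assuming a hypothetical table texts(text_key, locale, value). The column is called text_key here only to sidestep the reserved word; the fetchText() helper is not part of any library:

```php
<?php
/**
 * Fetch one text for the current instance and locale.
 * Assumes a hypothetical table texts(text_key, locale, value); "key" itself
 * is a reserved word in MySQL, so the column is named text_key here.
 */
function fetchText(PDO $db, string $key, string $locale): ?string
{
    $stmt = $db->prepare(
        'SELECT value FROM texts WHERE text_key = :key AND locale = :locale'
    );
    $stmt->execute([':key' => $key, ':locale' => $locale]);
    $value = $stmt->fetchColumn();
    // fetchColumn() returns false when no row matched.
    return $value === false ? null : (string) $value;
}
```

Since each app already opens its own database connection, the same call works unchanged on every instance, for example fetchText($db, 'headertext', 'de').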
I recently joined a team that was formerly a one man show to maintain and develop a company's PHP-mysql website.
The current localization method is, for each section of the site, there exists a file ending in _en.php and _fr.php that contains long lists of same named variables with text in the appropriate language. At the top of each content page, the user language is determined and then the appropriate 'dictionary' file is loaded.
What I am trying to promote as an alternative is a DB table like (id, code, en, fr) and a function to look up the correct translation in the current page.
My boss tells me that the benefits of the first approach are having a context for each translation, and having the translations under source control.
His concerns with my proposed approach are the lack of these things, and he doesn't like the idea of having two translation systems on the site.
My concern is that this is data in a code file, which I was taught is a bad idea. To search for a string you have to use an IDE search tool, so I don't see how a non-programmer would be comfortable editing these.
So, is his approach better? Is mine better but only marginally and not worth rocking the boat? Is the current system a disaster waiting to happen that I shouldn't let go?
I think that for interface things (name, surname, text in buttons, etc.) it is more natural to use a resource file. In .NET we use .resx; in PHP, an include file is enough.
Loading a file with an include is not resource-consuming, the way parsing XML would be.
If we were talking about big texts, I would put them in a DB with a different code, simply because I would normally have a back office to modify these contents, not for performance reasons.
Keep in mind that DB access is costly too; how much depends on the number of users.
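As a rough illustration of the include-file approach, the sketch below loads one dictionary per section and language. It assumes each file ends with "return [ ... ];" instead of defining loose variables, which is a variation on the setup in the question; the loadDictionary() name and the file layout are hypothetical:

```php
<?php
/**
 * Load the dictionary for one section and language, e.g. "account_fr.php".
 * Each dictionary file is assumed to end with "return [ ... ];".
 */
function loadDictionary(string $dir, string $section, string $lang,
                        array $allowed = ['en', 'fr']): array
{
    // Whitelist the language so user input never reaches the file path.
    if (!in_array($lang, $allowed, true)) {
        $lang = $allowed[0];
    }
    $file = $dir . '/' . basename($section) . '_' . $lang . '.php';
    if (!is_file($file)) {
        return [];
    }
    return include $file;
}
```

The files stay under source control, and each one keeps the context of its section, which addresses both points raised in the question.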
Can anybody tell me how to make a dynamic multilanguage website in PHP and MySQL? I have no idea about it. I searched on Google and didn't find any good solutions.
Can anyone provide me with a step-by-step guide? If possible, make a demo of a multilanguage website. Or please refer me to a link that explains the details.
Short answer: there is no short answer, as there are a lot of variables to consider, and plenty of work to do. So...
Long answer: I'm going to break it down as well as I can, but there isn't a "good for all" answer to a question as broad as yours.
First, variables of the task at hand:
List of languages: will your site be in a predefined set of languages, or will it be varied/heterogeneous? For example, a site may be entirely bilingual in two well defined languages (or, to put another example, I run an English/Catalan/Spanish site); or different sections could be available on different sets of languages (for an example, look at MS's sites: they are mostly homogeneous, but stuff like blogs, KB articles, and some docs are just available in a subset of the supposedly supported languages).
Translations source: is content provided in each relevant language by you or some collaborator? Or are some versions run through translation software from a single "base" language? The first approach takes a lot of extra work to produce the contents, but yields higher quality results than the second.
Languages themselves: once you have 1) and 2) answered, you will need to be aware of exactly which languages you are working with. Note that if you include dialects (ex: US English + UK English, or Argentina Spanish + Spain Spanish), you may encounter some "duplicate content" issues with search engines, but details on that are too off-topic here (just mentioning so you are aware of the potential issues).
Are you targeting languages in the abstract (for example, my site offers the three languages without caring at all where the visitor is: that's what I have, so choose what you prefer); or rather targeting different regions/countries? In the latter case, things can get extra complex, as you may need to care about other stuff besides languages (like timezones, currencies, or date-time format conventions, to name some), but you get the benefit of being able to use country-specific TLDs.
Once you have the above well-defined, let's start working. These are the most prominent tasks you'd need to handle:
Language detection: the most reasonable approach is to use a GET parameter (something like ?lang=en-us on the URL). Also, you might use some cookie and/or IP geolocation to fall back when a URL with no language argument is requested. Also, if you have the means, consider the topic of URL beautification (what looks better: example.com/index.php?lang=en-us or example.com/en-us/home?). Personally, I love the power ModRewrite grants to my .htaccess file, but that'll only work on Apache-powered servers.
Content management: regardless of whether you are fetching content from a DB (like article content), include files (typical for breadcrumbs, menus, site-wide headings, etc), or any other means, you will need some way to separate each version (language) of the content. Here are some examples of how it can be done:
For DB content, my best advice is to come up with some solid field naming pattern and stick to it. For example, I append _en, _es, or _ca to all language-dependent fields of my DB. This way, I can access the right content with expressions like $row["title_$lang"].
For include files, again a file naming convention is the sanest approach. In my case, I have file names ending with .en.php, .ca.htm, etc. My include calls then look like include("some-filename.$lang.php").
From time to time, you will be spitting out small chunks of text directly from your PHP code (for example, when labeling the headings of a table). You can use an include file per language defining a "chunks" array with the same keys, or a DB table like Geert suggested. The former approach takes less work to develop, the latter should take less work to maintain (especially if you aren't working alone).
Language pick: quite essential, you should provide your users a way to choose their own language, other than tweaking the GET arguments on the URL itself. For few languages, "flags" often work great, since they can be understood even if the page has initially fallen back to a language the user doesn't know at all. For more languages, a dropdown menu seems more efficient (in terms of viewport space), but you should make sure to add some visual (ie: non-textual) hints. Some sites force you to pick a language upon entering, and only have links to the home-page on each language. Personally, I have my three flags standing out on top of my site's menu, each pointing to the current address with only the language argument changed. A code like this can work quite well:
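function translatedURI($new_lang) {
    // $lang is the current language, assumed to be set globally during language detection.
    global $lang;
    $uri = "http://" . $_SERVER["HTTP_HOST"] . $_SERVER["REQUEST_URI"];
    return str_replace("lang=$lang", "lang=$new_lang", $uri);
}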
CMS tweaking: if your site (or part of it) is using some kind of CMS, discussion board, etc, things can get quite messy. Speaking from my own experience, I have a phpBB forum on my site split in three main categories (one per language), in such a way that they look like three independent forums (but users just need to login/register on one of them to gain access to all languages, since they are indeed just categories of the same board). The tweaks I had to make for this to work smoothly threatened the last remnants of sanity I still keep :S. For these cases, I advise looking up the docs and support features of the specific software you are using.
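The language detection order described above (URL argument first, then a cookie, then a fallback) could be sketched roughly as follows. Note that this sketch swaps in the browser's Accept-Language header as the last fallback instead of IP geolocation, parses that header very loosely, and uses a hypothetical detectLanguage() helper with an assumed list of supported languages:

```php
<?php
/**
 * Pick the language for a request: URL argument first, then cookie,
 * then the browser's Accept-Language header, then a default.
 * The supported list and the "lang" parameter/cookie names are assumptions.
 */
function detectLanguage(array $get, array $cookie, string $acceptHeader,
                        array $supported = ['en', 'es', 'ca'],
                        string $default = 'en'): string
{
    // Candidates in priority order: URL argument, cookie, then header entries.
    $candidates = array_merge(
        [$get['lang'] ?? '', $cookie['lang'] ?? ''],
        explode(',', $acceptHeader)
    );
    foreach ($candidates as $candidate) {
        if (!is_string($candidate)) {
            continue; // e.g. ?lang[]=en arrives as an array
        }
        // Keep only the first two letters: "en-US;q=0.8" becomes "en".
        // Header entries are tried in the order sent; q-values are ignored.
        $tag = strtolower(substr(trim($candidate), 0, 2));
        if (in_array($tag, $supported, true)) {
            return $tag;
        }
    }
    return $default;
}
```

A typical call would be detectLanguage($_GET, $_COOKIE, $_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? ''). Because the result is always taken from the supported list, it is safe to use in field names or include paths.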
Well, that's everything I can come up with for now. I think you should have enough to roll up your sleeves and get to work. Then, if you hit some wall on your path, come back with specific questions and I'm confident you'll get more specific answers.
Hope this helps.
The solution I always use is to create a database table, with all possible messages. For each message I use a code (abbreviation) to look it up. So for example:
lang  id        message
en    login     Login
en    lostpass  Lost your password?
de    login     Anmelden
de    lostpass  Passwort vergessen?
nl    login     Aanmelden
nl    lostpass  Wachtwoord vergeten?
etc. Looking up the translations is usually fast enough with a MySQL query, but you can also place all messages in an array and load it into memory when your script loads. Users should always be able to set the language they prefer; don't rely blindly on the language header set by the web browser.
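Loading all messages into an array could look roughly like this. The two helper names are made up for the example, $rows stands in for the result of a "SELECT lang, id, message" query, and the fallback to a default language is an added assumption:

```php
<?php
/**
 * Index message rows by language and code so lookups need no further queries.
 * $rows mimics what a "SELECT lang, id, message" query would return.
 */
function indexMessages(array $rows): array
{
    $messages = [];
    foreach ($rows as $row) {
        $messages[$row['lang']][$row['id']] = $row['message'];
    }
    return $messages;
}

/** Look up a message, falling back to a default language, then to the code. */
function translate(array $messages, string $lang, string $code,
                   string $fallback = 'en'): string
{
    return $messages[$lang][$code] ?? $messages[$fallback][$code] ?? $code;
}
```

Returning the code itself as a last resort makes missing translations easy to spot on the page without breaking it.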
I am now designing a very tiny CMS that must be multilanguage.
One of the features that concerns me most is that the client can spontaneously decide to add or remove a language.
For this reason, I am not basing the design on adding suffixes to the database tables: I cannot (and do not want to) modify the table names or access them using dynamic names, nor add or remove fields each time a language is defined or removed.
I would not use files either, just because I like databases and they are easy to maintain.
And lastly, I am thinking of two types of translation:
The web text.
The content text.
Therefore, my design consists of:
languages: a table with the languages defined.
translations: a single table that will hold all the messages, as follows:
[pk] table_name: the name of the table whose content will be translated.
[pk] field_name: the name of the field whose content will be translated.
[pk] row_id: the row identifier of the item that will be translated.
[pk] language: the language the text is translated into.
text: the translated text.
That means that the fields which would hold content in a single-language scenario will now be empty, because their content will always be in the translations table.
That will increase the SQL queries complexity, but it allows me to develop tools to maintain the translations in an easy way. Also, the complexity of the SQL will exist only once, just when implementing the solution. If that implementation is properly designed, the maintenance / extensibility of the site doesn't have to be a major problem.
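To give an idea of the extra query this design requires, here is a small sketch using PDO against the single translations table. The column names follow the design above; the fetchTranslatedFields() helper is hypothetical:

```php
<?php
/**
 * Fetch the translated fields of one row from the single translations table.
 * Returns an array of field name => translated text.
 */
function fetchTranslatedFields(PDO $db, string $table, int $rowId,
                               string $language): array
{
    $stmt = $db->prepare(
        'SELECT field_name, text FROM translations
          WHERE table_name = :t AND row_id = :id AND language = :lang'
    );
    $stmt->execute([':t' => $table, ':id' => $rowId, ':lang' => $language]);
    // PDO::FETCH_KEY_PAIR maps the first column to the second one.
    return $stmt->fetchAll(PDO::FETCH_KEY_PAIR);
}
```

The result can be merged over the base row, for example array_merge($row, fetchTranslatedFields($db, 'pages', $row['id'], 'fr')), so the rest of the code keeps reading fields by their usual names.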
Edit:
After some conversations with developer friends, I think that the solution I am describing here puts too much load on a single table.
Another approach that I will study from now on is creating an extra table for each "translatable table" as follows:
any_translatable_table: the table that needs some of its fields translated.
any_translatable_table_translations: the table where the translations will be stored.
[pk] field_name: the name of the field whose content will be translated.
[pk] row_id: the row identifier of the item that will be translated.
[pk] language: the language the text is translated into.
text: the translated text.
This scheme inherits the concepts from the first one, but separates its content per table. This alternative solution may improve performance and isolate problems (such as index problems).
The extra translation table per "translatable table" will be created at the same time as the original one.
As for the SQL queries, the complexity is still the same: the first approach needs the table name to search the translations table, while the second just adds the suffix "_translations" to the table name.
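For the per-table variant, the query text has to be built from the table name, and identifiers cannot be bound as prepared-statement parameters. A cautious sketch, with a hypothetical translationQuery() helper and a caller-supplied list of translatable tables, might be:

```php
<?php
/**
 * Build the lookup query for the "<table>_translations" scheme.
 * The table name is checked against a whitelist because identifiers
 * cannot be passed as bound parameters.
 */
function translationQuery(string $table, array $translatable): string
{
    if (!in_array($table, $translatable, true)) {
        throw new InvalidArgumentException("Table not translatable: $table");
    }
    return "SELECT field_name, text FROM {$table}_translations"
         . " WHERE row_id = :id AND language = :lang";
}
```

The returned string still uses :id and :lang placeholders, so the values themselves go through a prepared statement as usual.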