Multilanguage Database: which method is better?

I have a website in 3 languages.
Which is the best way to structure the DB?
1) Create 3 tables, one for each language (e.g. Product_en, Product_es, Product_de), and pick the table with a language identifier:
e.g. on the PHP page I have a string:
$language = 'en';
so I get only that language's data:
SELECT * FROM Product_$language
2) Create 1 table with:
ID LANGUAGE NAME DESCR
and show on the page only the rows
WHERE LANGUAGE = '$language'
3) Create 1 table with:
ID NAME_EN DESCR_EN NAME_ES DESCR_ES NAME_DE DESCR_DE
Thank you!

I'd go for the second option.
The first option doesn't seem flexible enough for searching records. What if you need to search in two languages? The best you can do is UNION the results of two SELECT statements. The third one seems to have redundancy: you need a language suffix on every name and description column.
The second one is very flexible and handy. You can do whatever operations you want without any special handling, unless you want to pivot the records.

I would opt for option one or two. Which one really depends on your application and how you plan to access your data. When I have done similar localization in the past, I have used the single table approach.
I prefer this approach because you don't need to change the DB schema at all when you add localizations. You should not need to change the related code either, as the language identifier just becomes another value used in the query.
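A minimal sketch of the single-table layout, written in Python with the standard-library sqlite3 module as a stand-in for PHP and MySQL. The table name product_translation, its columns and the sample rows are assumptions for illustration, not something from the question. It shows that adding a language is only an INSERT, and that the language code is passed as a bound parameter instead of being spliced into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product_translation (
        product_id INTEGER NOT NULL,
        language   TEXT    NOT NULL,   -- e.g. 'en', 'es', 'de'
        name       TEXT    NOT NULL,
        descr      TEXT,
        PRIMARY KEY (product_id, language)
    )
    """
)
conn.executemany(
    "INSERT INTO product_translation VALUES (?, ?, ?, ?)",
    [
        (1, "en", "Chair", "A wooden chair"),
        (1, "es", "Silla", "Una silla de madera"),
        (1, "de", "Stuhl", "Ein Holzstuhl"),
    ],
)


def products_for(conn, language):
    """Return (product_id, name, descr) rows for one language."""
    cur = conn.execute(
        "SELECT product_id, name, descr FROM product_translation "
        "WHERE language = ? ORDER BY product_id",
        (language,),
    )
    return cur.fetchall()


# Adding a fourth language needs no schema change, only new rows.
conn.execute(
    "INSERT INTO product_translation VALUES (1, 'fr', 'Chaise', 'Une chaise en bois')"
)

spanish = products_for(conn, "es")
french = products_for(conn, "fr")
```

The same query serves every language, so the calling code does not change when a language is added.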

That way you would be killing the database in no time.
Just do a table like:
TABLE languages with fields:
-- product name
-- product description
-- two-letter language code
This not only gives you a better-structured database, it also lets you have products that only have one translation. If you want, you can even show the default language if no other is specified. That you'll do programmatically of course, but I think you get the idea.
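One possible reading of the default-language fallback, sketched in Python with sqlite3 standing in for PHP and MySQL. The table name product_text, the product_id column and DEFAULT_LANG are assumptions. The fallback is done inside the query here; it could equally be done in application code with two lookups.

```python
import sqlite3

DEFAULT_LANG = "en"  # assumed default language

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_text ("
    " product_id INTEGER NOT NULL,"
    " lang TEXT NOT NULL,"      # two-letter language code
    " name TEXT NOT NULL,"
    " descr TEXT,"
    " PRIMARY KEY (product_id, lang))"
)
conn.executemany(
    "INSERT INTO product_text VALUES (?, ?, ?, ?)",
    [
        (1, "en", "Chair", "A wooden chair"),
        (1, "de", "Stuhl", "Ein Holzstuhl"),
        (2, "en", "Table", "A small table"),  # product 2 has no German text
    ],
)


def product_text(conn, product_id, lang):
    """Return (lang, name, descr) in the requested language, else the default one."""
    return conn.execute(
        "SELECT lang, name, descr FROM product_text "
        "WHERE product_id = ? AND lang IN (?, ?) "
        # (lang = ?) is 1 for the requested language, 0 otherwise,
        # so the requested language sorts first when it exists.
        "ORDER BY (lang = ?) DESC LIMIT 1",
        (product_id, lang, DEFAULT_LANG, lang),
    ).fetchone()


german_chair = product_text(conn, 1, "de")
german_table = product_text(conn, 2, "de")  # falls back to English
```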

Related

What's the most efficient way to save data in MySQL Database

Hi I have data that I need to store in my database. My website is about tv-shows, and the data I'm talking about is basically seasons and episodes. My concern is about whether I should use two tables or one. I'll make myself more clear:
Option 1:
seasons_table
post_id
season_number
season_title
language
subtitles
item_date (when it was created)
item_modified (when it was last modified)
episodes_table
post_id
season_id
episode_number
episode_title
item_date
item_modified
Option 2:
unique table
post_id
item_type (season or episode)
season_number
season_title
language
subtitles
season_id
episode_number
episode_title
item_date
item_modified
I can already see for myself that with Option 1 there's gonna be a lot of common fields between the two tables, while with Option 2 there's gonna be a lot of fields that are never gonna be used (e.g. an episode will never have a value in the field season_title since it just needs a value for season_id to be linked to that season).
So which one is the best option? I'm willing to choose option 2, but I'm worried that those empty fields are gonna waste memory or loading time or whatever while processing any data in that table. Is that true? Thanks in advance to everyone, I hope I made myself clear.
By the way, my website is WordPress-based and I'm gonna use a custom table, but I think I'm gonna use some WordPress functions to process data, like $wpdb->insert and so on...

Two tables is the best approach here, since it more closely adheres to the rules of database normalization. If you're concerned about duplication, evaluate more carefully where you're storing data.
item_date and item_modified are probably unique for each entry even if they are "duplicated" in terms of schema. Don't worry about this.
Whatever post_id is, you'll have to evaluate whether you have a direct relationship between post and each of these two tables, or a chain from post to seasons to episodes.
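A rough sketch of the two-table layout, in Python with sqlite3 standing in for MySQL and $wpdb. The primary-key columns season_id and episode_id and the sample rows are assumptions, item_date and item_modified are left out for brevity, and post_id is kept only on seasons, which is one of the two relationships mentioned above and not necessarily the right one for the asker's site.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE seasons (
        season_id     INTEGER PRIMARY KEY,
        post_id       INTEGER NOT NULL,
        season_number INTEGER NOT NULL,
        season_title  TEXT,
        language      TEXT,
        subtitles     TEXT
    );
    CREATE TABLE episodes (
        episode_id     INTEGER PRIMARY KEY,
        season_id      INTEGER NOT NULL REFERENCES seasons (season_id),
        episode_number INTEGER NOT NULL,
        episode_title  TEXT
    );
    INSERT INTO seasons VALUES (1, 100, 1, 'Season One', 'en', 'es');
    INSERT INTO episodes VALUES (1, 1, 1, 'Pilot');
    INSERT INTO episodes VALUES (2, 1, 2, 'Second Episode');
    """
)

# Season-level data is stored once and joined in when needed.
episodes_with_season = conn.execute(
    "SELECT s.season_number, e.episode_number, e.episode_title "
    "FROM episodes AS e JOIN seasons AS s ON s.season_id = e.season_id "
    "ORDER BY s.season_number, e.episode_number"
).fetchall()

# An episode pointing at a season that does not exist is rejected.
try:
    conn.execute("INSERT INTO episodes VALUES (3, 99, 1, 'Orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```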

more efficient database structure across multiple tables

I am setting up a MySQL database with multiple tables. Several of the tables will have fields with similar names that aren't necessarily for the same purpose.
For example, there's a users table that will have a name field, a category table with a name field and so on.
I've previously seen this set up either with or without a prefix on the field name, so in the above example using user_name, cat_name etc.
As these are all in separate tables, is there any benefit to structuring the database with or without this prefix? I know that when using joins and calling the data through PHP you have to add a SELECT users.name AS username... to keep the fields from overwriting each other when using mysql_fetch_array. But I'm not sure if there are any efficiencies in using one method over the other?

It depends on what your shop does or your preference. There is nothing about a prefix that will make this better. Personally I would just keep it as name since: Users.Name and Orders.Name and Products.Name all contain tuples with different object types.
At the end of the day you want to be consistent. If you prefer a cat_ and a user_ prefix just be consistent with your design and include this prefix for all object types. To me less is more.

It's really just a matter of preference. I personally prefer the approach of using just name.
One thing to watch out for though, if you're doing any SELECT * FROM ... queries (which you shouldn't be; always select fields explicitly), you may end up selecting the wrong data.

One disadvantage: if anyone is stupid enough to use natural joins (you can guess that I find this a poor practice, but MySQL does allow it, so you need to consider whether that will happen), you may end up joining on those same-named fields by accident.
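The natural-join pitfall can be reproduced in a few lines. This is an illustrative sketch in Python with sqlite3 (which also supports NATURAL JOIN); the users and categories tables and their rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE categories (category_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE users (
        user_id     INTEGER PRIMARY KEY,
        name        TEXT,
        category_id INTEGER
    );
    INSERT INTO categories VALUES (1, 'admin'), (2, 'guest');
    INSERT INTO users VALUES (1, 'alice', 1), (2, 'bob', 2), (3, 'guest', 2);
    """
)

# Explicit join: matches on category_id only, aliases keep both names apart.
explicit = conn.execute(
    "SELECT u.name AS user_name, c.name AS category_name "
    "FROM users AS u JOIN categories AS c ON c.category_id = u.category_id "
    "ORDER BY u.user_id"
).fetchall()

# Natural join: silently matches on every shared column name,
# here category_id AND name, so most rows disappear.
natural = conn.execute(
    "SELECT user_id FROM users NATURAL JOIN categories ORDER BY user_id"
).fetchall()
```

Only the user whose name happens to equal the category name survives the natural join, which is exactly the accidental match described above.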

How to handle hardcoded stuffs in database and coding

Some languages and courses (based on language) are defined in two tables. The course table references the language table to relate each course to a particular language. I also have a notes table that contains the notes of a specific course and is related to the course table. Now I have two issues.
In the code I need to take some specific action for the Spanish language only. How should I handle this, given that languages will be entered by users and we won't have any idea what ID Spanish gets? If I use text (the language name), then each time I need to fetch the ID for Spanish from the language table and then fetch all courses related to it from the course table.
Suppose Spanish notes are stored in four separate sections and other notes have only one section. Should I use the same table with four columns (one for each section) or two tables (notes and spanish_notes)? The former leaves three columns blank for other languages' notes. I don't think that is good.

One quick solution to your first issue is to use language codes such as 'en', 'es', 'fr' etc. For instance, your language table could have both id and code columns, while your content tables have an FK on code. You could then get the language code from the request's Accept-Language header, or from somewhere else.
For the second question, in terms of normalization it is better to have a separate table for Spanish notes. It is better for several reasons, such as redundancy and dependency concerns.
EDIT: PS. You could also have a look at language codes from here and HTTP Accept-Language from here.
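A small sketch of the language-code idea, in Python with sqlite3 as a stand-in for PHP and MySQL. The column names language_code and title, the SPANISH constant and the sample courses are assumptions. Because the code 'es' is known when the program is written, the Spanish-only branch needs no lookup of a user-generated ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE language (
        id   INTEGER PRIMARY KEY,
        code TEXT NOT NULL UNIQUE,   -- e.g. 'en', 'es', 'fr'
        name TEXT NOT NULL
    );
    CREATE TABLE course (
        id            INTEGER PRIMARY KEY,
        title         TEXT NOT NULL,
        language_code TEXT NOT NULL REFERENCES language (code)
    );
    INSERT INTO language (code, name) VALUES ('en', 'English'), ('es', 'Spanish');
    INSERT INTO course (title, language_code) VALUES
        ('English grammar', 'en'),
        ('Spanish grammar', 'es'),
        ('Spanish conversation', 'es');
    """
)

SPANISH = "es"  # stable code, independent of whatever id the row received


def courses_in(conn, code):
    """Return course titles for one language code, without joining on id."""
    rows = conn.execute(
        "SELECT title FROM course WHERE language_code = ? ORDER BY id", (code,)
    )
    return [title for (title,) in rows]


def needs_spanish_handling(code):
    """Hypothetical hook for the Spanish-only behaviour."""
    return code == SPANISH


spanish_courses = courses_in(conn, SPANISH)
```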

Some inputs:
There could be 2 ways of doing this:
a. When languages are entered by users, use a SELECT dropdown box for accepting user inputs. For each SELECT option, you can set the language name as the text and language id as the value. This way you will know the language ID as
b. You can use MySQL INNER JOIN between "language" and "courses" table, something like:
SELECT *
FROM `language` `l`
INNER JOIN `courses` `c` ON `l`.`language_id` = `c`.`language_id`
WHERE `l`.`language_name` = 'spanish';
I think it's okay to keep all notes for all sections in a single table. For the other 3 columns, which will only contain Spanish notes, you can set them to accept NULL values.
Hope it helps.

PHP/MySQL web-app Internationalization with enum DB fields

I have joined a project recently and now I'm working on its Internationalization improvement. Technologies used are PHP/MySQL/Zend Framework/Dojo. I18n is implemented using gettext almost as described here link to SO question in the second answer.
But I encountered one problem. Some part of the information specific to certain DB tables is stored within those tables in the enum type columns. For example there is a field usr_online_status in the table "user" which could be one of either 'online' or 'offline'. There are many such tables with enum fields which contain info like ('yes' ,'no') ,('download', 'upload') and so on. Of course this info is displayed in English regardless of the current Language chosen by user.
I would like to solve this inconvenience. But don't know what is the best way to do this in terms of performance and ease of implementation.
I see two possible options:
1) Make language specific dictionary tables for each table which uses such enums.
2) Download all the info from enums. Translate it. Make a script which could on demand alter every table and replace those enums with the required translations.
But there may be simpler or better solutions for this problem.
What would you do ?
Thanks for your answers.
UPD1
Important remark. Info from the enums is not only displayed at the GUI but is used in search. For example - there is a grid on a webpage which contains info about users. You can type 'line' in a search field and the result will be only those users with the word '%line%' in their info, for example 'online' status.

You definitely want dictionary tables: only with these can 2 different users of the app work in different languages at the same time.
I recommend putting some of these dictionary tables into PHP though, as this has proven to be quite an unintrusive and performant way of doing it, e.g.
$translation=array('yes'=>'Ja','no'=>'Nein', ..)
//...
$row=mysql_fetch_row($qry);
//$row[1] has yes/no
$row[1]=$translation[$row[1]];
//...
$translation could be require_once()'ed depending on the current user's language preferences, the URL or whatever.
Basically you trade some RAM for speed and ease.
UPDATE:
With Gior312 adding the info about search, here is my solution for it: have the reverse translation in a DB table (you might even use it to generate $translation via a script):
CREATE TABLE translations (
id INT PRIMARY KEY AUTO_INCREMENT,
languageid INT NOT NULL,
enumword VARCHAR(m) NOT NULL,
langword VARCHAR(n) NOT NULL,
-- n and m to your needs
INDEX(languageid)
-- other indices to your needs
)
Now when the search up until now was
$line=... //Maybe coming from $_POST['line'] via mysql_real_escape_string()
$sql="SELECT * FROM sometable WHERE somefield LIKE '%$line%'";
What you now do is
$line=... //Maybe coming from $_POST['line'] via mysql_real_escape_string()
$sql="SELECT enumword FROM translations WHERE languageid=$currentlanguageid AND langword LIKE '%$line%'";
//fetch resulting enumwords into array $enumwords
$enumlist=implode("','",$enumwords);
//This assumes, that the field enumwords contains nothing, that needs to be escaped
$sql="SELECT * FROM sometable WHERE somefield IN ('$enumlist')";
The rationale behind treating forward and back translation differently is:
There will be many more lines in the code where you display than where you search, so the unintrusiveness of the forward translation is more important.
The forward translation has to be done PER ROW (with a join), the reverse only PER QUERY, so the performance of the forward translation is more important than the performance of the reverse translation.
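The two-step reverse search can be sketched as follows, in Python with sqlite3 as a stand-in for PHP and MySQL. The translations table follows the definition above; the usr_id and usr_name columns, the language ids and the German words are assumptions. Unlike the string-built IN list, this version binds every value as a parameter, so the enum words need no escaping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE translations (
        id         INTEGER PRIMARY KEY,
        languageid INTEGER NOT NULL,
        enumword   TEXT NOT NULL,
        langword   TEXT NOT NULL
    );
    CREATE TABLE user (
        usr_id            INTEGER PRIMARY KEY,
        usr_name          TEXT,
        usr_online_status TEXT
    );
    INSERT INTO translations (languageid, enumword, langword) VALUES
        (1, 'online', 'online'), (1, 'offline', 'offline'),
        (2, 'online', 'verbunden'), (2, 'offline', 'getrennt');
    INSERT INTO user VALUES (1, 'anna', 'online'), (2, 'ben', 'offline');
    """
)


def search_users_by_status(conn, language_id, term):
    """Find users whose status, as shown in the given language, contains term."""
    # Step 1: reverse translation, done once per query.
    enumwords = [
        word
        for (word,) in conn.execute(
            "SELECT enumword FROM translations "
            "WHERE languageid = ? AND langword LIKE ?",
            (language_id, f"%{term}%"),
        )
    ]
    if not enumwords:
        return []
    # Step 2: one bound placeholder per matching enum word.
    placeholders = ", ".join("?" * len(enumwords))
    return conn.execute(
        f"SELECT usr_name FROM user WHERE usr_online_status IN ({placeholders}) "
        "ORDER BY usr_id",
        enumwords,
    ).fetchall()


german_hits = search_users_by_status(conn, 2, "bund")
english_hits = search_users_by_status(conn, 1, "line")
```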

Optimal database structure design for one to many

I am building an inventory tracking system for internal use at my company. I am working on the database structure and want to get some feedback on which design is better*.
I need a recursive (I might be using this term wrong...) system where a part could be made up of zero or more parts. I thought of two ways to do this but am not sure which one to use. I am not an expert in database design, so maybe there is a third option that I haven't thought of.
Option 1:
Two tables: one with part_id, and the other with part_id, sub_part_id (which refers to another part_id) and quantity. In the first table part_id would be unique; in the other there could be zero or more rows listing the parts that make up a certain part.
Option 2:
One table with part_id and assembly. assembly would be a text field that looks something like this: part_id,quantity;part_id,quantity;.... I would then use the PHP explode() function to separate by semicolon and again by comma to get an array of the sub-parts.
I hope this all makes sense. I am using PHP/MySQL.
*community wiki because this may be subjective.

Generally, option 1 is preferable to option 2, not least because some of the part IDs in the assembly would themselves be assemblies.
You do have to deal with recursive or tree-structured queries. That is not particularly easy in any dialect of SQL. Some systems have better support for them than others. Oracle has its CONNECT BY PRIOR system (weird, but it sort of works), and DB2 has recursive WITH clauses, and ...

NEVER, never ever use procedural languages like PHP or C# to process data structures when you have a database engine for that. Relational data structures are much faster, more flexible and more reliable than storing text. Forget about Option 2.
You could use recursive UDFs to retrieve the whole tree with no big fuss about it.

How about a nullable foreign key on the same table? Something like:
CREATE TABLE part (
part_id int not null auto_increment primary key,
parent_part_id int null,
constraint fk_parent_part foreign key (parent_part_id) references part (part_id)
)

Definitely not option 2. That is a recipe for trouble. The correct answer depends on how many potential levels of assemblies are possible, and how you think of the assemblies. Do you think of an assembly (a composite object consisting of 2 or more atomic parts) as a part in its own right, that can itself be used as a subpart in another assembly? Or are assemblies a fundamentally different kind of thing from an atomic part?
If the former is the case, then put all assemblies and parts in one table, with a PartID, and add a second table that just has the construction details for those parts that are composed of multiple other parts (which themselves may be assemblies of yet more atomic parts). This second table would look like this:
ConstructionDetails
PartId, SubPartId, QuantityRequired
If you think of things more like the second way, then only put the atomic parts in the first table, and put the assemblies in the second table:
Assemblies
AssemblyId, PartId, QuantityRequired
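For the first layout (parts and assemblies in one table plus ConstructionDetails), the tree can be expanded with a recursive common table expression. The sketch below uses Python with sqlite3; MySQL 8.0 and later also accept WITH RECURSIVE. The Part table name, its Name column and the bicycle data are assumptions for illustration, and the query assumes the structure contains no cycles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Part (PartId INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE ConstructionDetails (
        PartId           INTEGER NOT NULL REFERENCES Part (PartId),
        SubPartId        INTEGER NOT NULL REFERENCES Part (PartId),
        QuantityRequired INTEGER NOT NULL,
        PRIMARY KEY (PartId, SubPartId)
    );
    INSERT INTO Part VALUES (1, 'bicycle'), (2, 'wheel'), (3, 'spoke'), (4, 'frame');
    INSERT INTO ConstructionDetails VALUES
        (1, 2, 2),   -- a bicycle needs 2 wheels
        (1, 4, 1),   -- and 1 frame
        (2, 3, 32);  -- a wheel needs 32 spokes
    """
)


def atomic_parts(conn, part_id):
    """Total quantity of each atomic part needed to build part_id."""
    return conn.execute(
        """
        WITH RECURSIVE exploded (PartId, Quantity) AS (
            -- direct sub-parts of the requested part
            SELECT SubPartId, QuantityRequired
            FROM ConstructionDetails
            WHERE PartId = ?
            UNION ALL
            -- sub-parts of sub-parts, multiplying quantities on the way down
            SELECT cd.SubPartId, e.Quantity * cd.QuantityRequired
            FROM exploded AS e
            JOIN ConstructionDetails AS cd ON cd.PartId = e.PartId
        )
        SELECT p.Name, SUM(e.Quantity)
        FROM exploded AS e
        JOIN Part AS p ON p.PartId = e.PartId
        -- keep only parts that are not themselves assemblies
        WHERE e.PartId NOT IN (SELECT PartId FROM ConstructionDetails)
        GROUP BY p.PartId, p.Name
        ORDER BY p.Name
        """,
        (part_id,),
    ).fetchall()


bicycle_parts = atomic_parts(conn, 1)
```

With the sample data a bicycle expands to 1 frame and 64 spokes (2 wheels of 32 spokes each).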
