What's the best database structure to keep multilingual data?

MysqlDatabaseLocalizationTranslationMultilingual

Mysql Problem Overview


> Possible Duplicate:
> Schema for a multilanguage database

Here's an example:

[ products ]
id (INT)
name-en_us (VARCHAR)
name-es_es (VARCHAR)
name-pt_br (VARCHAR)
description-en_us (VARCHAR)
description-es_es (VARCHAR)
description-pt_br (VARCHAR)
price (DECIMAL)

The problem: every new language will need modify the table structure.

Here's another example:

[ products-en_us ]
id (INT)
name (VARCHAR)
description (VARCHAR)
price (DECIMAL)

[ products-es_es ]
id (INT)
name (VARCHAR)
description (VARCHAR)
price (DECIMAL)

The problem: every new language will need the creation of new tables and the "price" field is duplicated in every table.

Here's another example:

[ languages ]
id (INT)
name (VARCHAR)

[ products ]
id (INT)
price (DECIMAL)

[ translation ]
id (INT, PK)
model (VARCHAR) // product
field (VARCHAR) // name
language_id (INT, FK) 
text (VARCHAR)

The problem: hard?

Mysql Solutions


Solution 1 - Mysql

Similar to method 3:

[languages]
id (int PK)
code (varchar)

[products]
id (int PK)
neutral_fields (mixed)

[products_t]
id (int FK)
language (int FK)
translated_fields (mixed)
PRIMARY KEY: id,language

So for each table, make another table (in my case with "_t" suffix) which holds the translated fields. When you SELECT * FROM products, simply ... LEFT JOIN products_t ON products_t.id = products.id AND products_t.language = CURRENT_LANGUAGE.

Not that hard, and keeps you free from headaches.

Solution 2 - Mysql

Your third example is actually the way the problem is usually solved. Hard, but doable.

Remove the reference to product from the translation table and put a reference to translation where you need it (the other way around).

[ products ]
id (INT)
price (DECIMAL)
title_translation_id (INT, FK)

[ translation ]
id (INT, PK)
neutral_text (VARCHAR)
-- other properties that may be useful (date, creator etc.)

[ translation_text ]
translation_id (INT, FK)
language_id (INT, FK) 
text (VARCHAR)

As an alternative (not especially a good one) you can have one single field and keep all translations there merged together (as XML, for example).

<translation>
  <en>Supplier</en>
  <de>Lieferant</de>
  <fr>Fournisseur</fr>
</translation>

Solution 3 - Mysql

In order to reduce the number of JOIN's, you could keep separate the translated and non translated in 2 separate tables :

[ products ]
id (INT)
price (DECIMAL)

[ products_i18n ]
id (INT)
name (VARCHAR)
description (VARCHAR)
lang_code (CHAR(5))

Solution 4 - Mysql

At my $DAYJOB we use gettext for I18N. I wrote a plugin to http://search.cpan.org/~drtech/Locale-Maketext-Lexicon-0.77/script/xgettext.pl">xgettext.pl</a> that extracts all English text from the database tables and add them to the master messages.pot.

It works very well - translators deal with only one file when doing translation - the po file. There's no fiddling with database entries when doing translations.

Solution 5 - Mysql

[languages] id (int PK) code (varchar)

[products]
id (int PK)
name
price
all other fields of product
id_language ( int FK )

I actually use this method, but in my case, it's not in a product point of view, for the various pages in my CMS, this work's quite well.

If you have a lot of products it might be a headache to update a single one in 5 or 6 languages... but it's a question of working the layout.

Solution 6 - Mysql

What about fourth solution?

[ products ]
id (INT)
language (VARCHAR 2)
name (VARCHAR)
description (VARCHAR)
price (DECIMAL)
*translation_of (INT FK)*

Translation_of is FK of it self. When You add default language translation_of is set to Null. But when you add second language translation_of takes primary produkt language id.

SELECT * FROM products WHERE id = 1 AND translation_of = 1

In that case we get all translations for product with id is 1.

SELECT * FROM products WHERE id = 1 AND translation_of = 1 AND language = 'pl'

We get only product in Polish translation. Without second table and JOINS.

Solution 7 - Mysql

Have many to many relationship.

You have your data table, languages table and a data_language table.

In the data_language table you have

> id, data_id, language_id

I think that might work best for your.

Solution 8 - Mysql

We use this concept for our webiste (600k views per day) and (maybe surprisingly) it works. Sure along with caching and query optimalization.

[attribute_names]
id (INT)
name (VARCHAR)

[languages_names]
id (INT)
name (VARCHAR)

[products]
id (INT)
attr_id (INT)
value (MEDIUMTEXT)
lang_id (INT)

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionThiago BelemView Question on Stackoverflow
Solution 1 - MysqlGipsy KingView Answer on Stackoverflow
Solution 2 - Mysqluser151323View Answer on Stackoverflow
Solution 3 - MysqlClémentView Answer on Stackoverflow
Solution 4 - MysqlholygeekView Answer on Stackoverflow
Solution 5 - MysqlTioView Answer on Stackoverflow
Solution 6 - MysqlchfView Answer on Stackoverflow
Solution 7 - MysqlAntonioCSView Answer on Stackoverflow
Solution 8 - MysqlPetr PellerView Answer on Stackoverflow