Model for storing dimension attributes in multiple languages
4 posters
Page 1 of 1
Model for storing dimension attributes in multiple languages
We have a traditional dimensional model and are about to add multiple languages to our source systems and need the option of reporting certain dimensional attributes in either the local language or English. Has anyone developed a flexible model for this that would allow adding multiple language versions of attributes without adding columns to the dimension table directly?
2 options I can think of are:
1) An abstracted translation dimension that contains the attribute name, language, and the original and translated version of the attribute content. The granularity of this dimension is attribute name, language, and original value of the attribute. With this option any attribute that becomes multi-language in the source system after initial implmentation requires only a new row in this table.
2a) A translated version of the original dimension. This table would contain the dimension surrogate key, natural key, language, and translated version of the attributes requiring translation. With this option any attribute that becomes multi-language in the source system after initial implmentation requires a new column in the translated table.
2b) Similar to 2a except it contains all the columns of the original dimension.
Thanks for any ideas, and especially for examples of models that are in production and working well.
2 options I can think of are:
1) An abstracted translation dimension that contains the attribute name, language, and the original and translated version of the attribute content. The granularity of this dimension is attribute name, language, and original value of the attribute. With this option any attribute that becomes multi-language in the source system after initial implmentation requires only a new row in this table.
2a) A translated version of the original dimension. This table would contain the dimension surrogate key, natural key, language, and translated version of the attributes requiring translation. With this option any attribute that becomes multi-language in the source system after initial implmentation requires a new column in the translated table.
2b) Similar to 2a except it contains all the columns of the original dimension.
Thanks for any ideas, and especially for examples of models that are in production and working well.
jefoster- Posts : 1
Join date : 2009-02-03
Re: Model for storing dimension attributes in multiple languages
I would not go with option 1. While it is a nice, simple, flexible model to construct and maintain on the back-end, it is a horrible model for querying. Pretty much any commercial BI tool would have a very difficult time working with such a model. Queries would be very complex, requiring an alias of the table for every descriptive attribute, plus you have so somehow hard-code column name filters in the BI meta-layer so it gets the right row.
Some variation of option 2 is the way to go. The option you mentioned, keep the existing dimension as-is and add a sub-dimension table with the same key plus a language code to contain translated descriptive attributes, is probably the most flexible and least disruptive (current applications will continue to work). You can carry this futher by implementing language specific schema in the database, which contain views for a specific local language, and users would be directed to their perferred language based on their user ID. Or, most BI tools support user specific content filters allowing you to transparently present the peferred language based on user ID. Also, since you keep the original dimension as is, there is no need to store the English version in the sub-table.
From a maintenance standpoint, you might as well put all descriptive attributes in the sub-table. If there is no translation at this time, leave it blank or populate it with the value in the base dimension table.
Some variation of option 2 is the way to go. The option you mentioned, keep the existing dimension as-is and add a sub-dimension table with the same key plus a language code to contain translated descriptive attributes, is probably the most flexible and least disruptive (current applications will continue to work). You can carry this futher by implementing language specific schema in the database, which contain views for a specific local language, and users would be directed to their perferred language based on their user ID. Or, most BI tools support user specific content filters allowing you to transparently present the peferred language based on user ID. Also, since you keep the original dimension as is, there is no need to store the English version in the sub-table.
From a maintenance standpoint, you might as well put all descriptive attributes in the sub-table. If there is no translation at this time, leave it blank or populate it with the value in the base dimension table.
Re: Model for storing dimension attributes in multiple languages
I wonder if there are new approaches to this topic.
One option that can be considered at a cost of performance is let the BI tool consume data from a set of VIEWs ... you can create a view for each translatable dimension and for each language. With this solution you could have the flexibility of first approach but avoiding all the hard work in the OLAP tool, at a the cost of performance ... however, since most OLAP tools have great cache features the impact of view query complexity (performance) could also be reduced.
im sure here are people much more experienced than me ... hope they will point the weaknesses of this approach.
One option that can be considered at a cost of performance is let the BI tool consume data from a set of VIEWs ... you can create a view for each translatable dimension and for each language. With this solution you could have the flexibility of first approach but avoiding all the hard work in the OLAP tool, at a the cost of performance ... however, since most OLAP tools have great cache features the impact of view query complexity (performance) could also be reduced.
im sure here are people much more experienced than me ... hope they will point the weaknesses of this approach.
blacar- Posts : 1
Join date : 2013-02-15
Re: Model for storing dimension attributes in multiple languages
I think the best approach is to copy the dimension tables completely into new tables. So, for example, if you have a 'PersonDim' table you could copy it into a 'PersonDimSpanish' table. You would keep all the numeric keys the same and change all the translatable columns into the proper languate. You would also change the format on the dates.
I like this better than the sub-dimension table for the following reasons:
1. The queries will all have similar performance, regardless of which language you are querying.
2. Users will only query a single language.
3. You avoid the join to the sub-dimension, so the queries are simpler.
Downsides:
1. You are duplicating columns that don't need translation. However, there are not many of these in a dimension!
2. Each user has to query different tables when using different languages. However, this can be handled in the application based on the user's login. And this problem exists when you use a sub-dimension anyways.
So I think that it's simpler and the downsides are negligible...
Thoughts?
I like this better than the sub-dimension table for the following reasons:
1. The queries will all have similar performance, regardless of which language you are querying.
2. Users will only query a single language.
3. You avoid the join to the sub-dimension, so the queries are simpler.
Downsides:
1. You are duplicating columns that don't need translation. However, there are not many of these in a dimension!
2. Each user has to query different tables when using different languages. However, this can be handled in the application based on the user's login. And this problem exists when you use a sub-dimension anyways.
So I think that it's simpler and the downsides are negligible...
Thoughts?
sbendayan- Posts : 6
Join date : 2013-12-20
Re: Model for storing dimension attributes in multiple languages
That's another approach, and will work fine. The only downside is a situation where you need to create bilinqual (or multi-linqual) reports. It can still be done, but access permissions get messy.
Similar topics
» Dimension Attributes and Fact attributes storing same data in multiple data marts??
» Date Dimension, multiple timezone and offset attributes
» Attributes from a dimension appearing in other dimensions; multiple-inheritance in hierarchies.
» How to Model Fact table having 1:1 relationship with key Dimension attributes
» How to model article dimension for unpredictable changes with multiple versions and multiple article categories
» Date Dimension, multiple timezone and offset attributes
» Attributes from a dimension appearing in other dimensions; multiple-inheritance in hierarchies.
» How to Model Fact table having 1:1 relationship with key Dimension attributes
» How to model article dimension for unpredictable changes with multiple versions and multiple article categories
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum