Language translation for large number of codes
The significant dimension tables in our data model each contain 25 or so codes, e.g. currency code, category code,
etc. Each of these dimension tables has a different set of codes, i.e. they do not share identical sets of codes. Each
code requires a short and a long description for each supported language. The descriptions do not change frequently.
One model would be to store the three attributes required for each code, i.e. code, short description, and long
description, with a language prefix in the dimension table itself, thus:
-- Language English - Currency Code
English Currency Code
English Currency Short Description
English Currency Long Description
-- Language Chinese - Currency Code
Chinese Currency Code
Chinese Currency Short Description
Chinese Currency Long Description
-- Language English - Category Code
English Category Code
English Category Short Description
English Category Long Description
-- Language Chinese - Category Code
Chinese Category Code
Chinese Category Short Description
Chinese Category Long Description
...
If there were 15 supported languages, then 15 * 25 * 3 = 1125 distinct attributes would be required. The BI layer, Cognos, knows
the language from the login ID. An attribute named, say, CURRENCY SHORT DESCRIPTION in the Cognos metadata layer
could be a CASE statement returning the appropriate column based on the language code of the logged-in user. The
other two attributes can be treated similarly. However, the problem with this approach is the large number of
attributes in each dimension table. I would appreciate any suggestions/comments/solutions.
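For illustration only, here is a minimal sketch of the language-prefixed-column design and the per-attribute CASE expression described above. All table and column names are hypothetical (only two of the 15 languages are shown), and SQLite stands in for Oracle so the sketch is self-contained:

```python
import sqlite3

# Hypothetical language-prefixed design: one set of code/short/long
# columns per language, all in the dimension table itself.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE currency_dim (
    currency_key           INTEGER PRIMARY KEY,
    en_currency_code       TEXT,
    en_currency_short_desc TEXT,
    en_currency_long_desc  TEXT,
    zh_currency_code       TEXT,
    zh_currency_short_desc TEXT,
    zh_currency_long_desc  TEXT
);
INSERT INTO currency_dim VALUES
    (1, 'USD', 'US Dollar', 'United States Dollar',
        'USD', '美元', '美利坚合众国美元');
""")

# The BI layer would generate a CASE keyed on the logged-in user's
# language code; :lang stands in for that session attribute.
sql = """
SELECT CASE :lang
         WHEN 'EN' THEN en_currency_short_desc
         WHEN 'ZH' THEN zh_currency_short_desc
       END AS currency_short_description
FROM currency_dim
WHERE currency_key = 1
"""
for lang in ("EN", "ZH"):
    print(conn.execute(sql, {"lang": lang}).fetchone()[0])
```

With 15 languages the CASE expression simply grows to 15 WHEN branches per description attribute, which is exactly where the attribute explosion in the question comes from.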
Last edited by dennisf on Sat Aug 08, 2009 3:57 pm; edited 1 time in total
dennisf- Posts : 9
Join date : 2009-04-14
Re: Language translation for large number of codes
I would not consider 60 columns large. I think your design is fine.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Language translation for large number of codes
Thanks for the response. I have two comments/questions.
1. These code columns are not the only columns in each of the dimension tables. So how many columns is large for an Oracle table? Is there a performance impact? If there are a large number of columns, then there are fewer rows in each block. When a read and a subsequent update are executed, all the columns not impacted by the update still go along for the ride, consuming I/O channel bandwidth and memory in the Oracle cache.
What are the considerations for separating attributes into multiple tables that share the same primary key? What considerations justify the two joins instead of one?
2. I thought that these 20 * 3 code attributes should be in a separate table with the same primary key, as in a MINI-DIMENSION, but on closer examination the reasons for a mini-dimension do not exist. My understanding is that MINI-DIMENSIONS are for when you need TYPE 2 on a large dimension. Since a new record duplicates every single attribute, we isolate the SLOWLY CHANGING attributes into a separate table and apply TYPE 2 only on that table. The other, non-slowly-changing attributes are not duplicated in vain.
I would appreciate any suggestions/comments.
Cheers!
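For what it's worth, the split asked about in question 1 can be sketched as two tables sharing the dimension's primary key. This is purely illustrative (names are made up), again using SQLite as a stand-in:

```python
import sqlite3

# Hypothetical vertical split: the base dimension keeps the frequently
# read attributes; the translation attributes move to a second table
# that shares the same primary key plus a language code.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE currency_dim (
    currency_key INTEGER PRIMARY KEY,
    iso_code     TEXT
    -- ... other dimension attributes ...
);
CREATE TABLE currency_dim_lang (
    currency_key  INTEGER,
    language_code TEXT,
    short_desc    TEXT,
    long_desc     TEXT,
    PRIMARY KEY (currency_key, language_code)
);
INSERT INTO currency_dim VALUES (1, 'USD');
INSERT INTO currency_dim_lang VALUES
    (1, 'EN', 'US Dollar', 'United States Dollar'),
    (1, 'ZH', '美元', '美利坚合众国美元');
""")

# The cost is one extra join; the payoff is a narrow base dimension,
# so more rows fit per block for queries that skip the descriptions.
row = conn.execute("""
    SELECT d.iso_code, l.short_desc
    FROM currency_dim d
    JOIN currency_dim_lang l ON l.currency_key = d.currency_key
    WHERE l.language_code = 'EN'
""").fetchone()
print(row)
```

Note that filtering on language_code in the join also replaces the per-language CASE expression entirely.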
dennisf- Posts : 9
Join date : 2009-04-14
Re: Language translation for large number of codes
I don't make mini-dimensions or other tuning changes unless I know I have a problem. If the basic dimensional model will support your queries, then don't bother tweaking every last bit of performance out of Oracle. The most common problem I see is developers putting all of the dimensional columns on the fact table along with the surrogate dimension key, which kind of defeats the whole principle of dimensional modeling. Unless your dimensions are in the multi-million-row range, your joins should be fine.
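To illustrate that principle with a made-up star-schema fragment (not from the thread): the fact table carries only surrogate dimension keys and measures, and descriptive columns are picked up by joining to the dimension at query time.

```python
import sqlite3

# Hypothetical star-schema fragment: descriptions live on the
# dimension; the fact table holds only surrogate keys and measures.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE currency_dim (
    currency_key INTEGER PRIMARY KEY,
    iso_code     TEXT,
    short_desc   TEXT
);
CREATE TABLE sales_fact (
    currency_key INTEGER REFERENCES currency_dim(currency_key),
    amount       REAL
    -- no descriptive columns duplicated here
);
INSERT INTO currency_dim VALUES (1, 'USD', 'US Dollar');
INSERT INTO sales_fact VALUES (1, 100.0), (1, 250.0);
""")

# Reports join back to the dimension for the descriptions.
total, desc = conn.execute("""
    SELECT SUM(f.amount), d.short_desc
    FROM sales_fact f
    JOIN currency_dim d USING (currency_key)
    GROUP BY d.short_desc
""").fetchone()
print(desc, total)
```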
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Language translation for codes
My original post had a math error which I have corrected. The number of attributes should be:
15 languages * 25 codes * 3 columns each code = 1125 attributes.
dennisf- Posts : 9
Join date : 2009-04-14
Similar topics
» Large number of snowflake code tables per dimension
» Large number of late arriving facts
» A dozen systems with many dimension codes
» Codes and indicators
» How to handle bad zip codes