Data types for staging area db tables
4 posters
Page 1 of 1
Data types for staging area db tables
I am doing my first real world dwh project and have not found any valuable information or best practice guidelines regarding which datatypes to choose for stagring area tables.
I have some ssis packages which are reading data from flatfile sources and filling tables of my SQL Server database. Transformations and data integration is performed by a T-SQL stored procedure after the initial import. Now i have no idea wether it's ok to use char/varchar datatypes for all columns or if i should do the tedious work of choosing and assigning the expected datatype to each of my 200+ table columns.
Are there any benefits / drawbacks of loose vs. strong typing regarding dwh tables in general and regarding initial staging talbes for 1:1 imports? What is your approach? What would you suggest?
thanks 4 your help & kind regards
I have some ssis packages which are reading data from flatfile sources and filling tables of my SQL Server database. Transformations and data integration is performed by a T-SQL stored procedure after the initial import. Now i have no idea wether it's ok to use char/varchar datatypes for all columns or if i should do the tedious work of choosing and assigning the expected datatype to each of my 200+ table columns.
Are there any benefits / drawbacks of loose vs. strong typing regarding dwh tables in general and regarding initial staging talbes for 1:1 imports? What is your approach? What would you suggest?
thanks 4 your help & kind regards
at710- Posts : 10
Join date : 2012-11-02
Re: Data types for staging area db tables
Well, that's what data modeling is all about. If you don't care to let the database enforce anything whatsoever and do every validation through ETL then you're really doing the same thing. My take is if you're coding up data validation in your ETL then you are better off letting the database do it since the database never has coding errors.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Data types for staging area db tables
That's a good point indeed. I'd have to write at least some data check routines/queries myself if i don't set appropriate datatypes. I think I will stick to the strong typing paradigm when designing my staging area.
Thanks for your advice!
Thanks for your advice!
at710- Posts : 10
Join date : 2012-11-02
Re: Data types for staging area db tables
Hi ... at ?
I certainly wouldn't recommend modeling every column as char/varchar, but I do think it's worthwhile standardising to a handful of datatypes, e.g. nvarchar (250) for string data, datetime2 for dates etc.
If your ETL layer is SQL SPs it may not matter too much in that layer, but other ETL and downstream tools (I'm looking at you, SSIS & SSAS), can be hypersensitive to datatype changes.
You also want to minimize your exposure to trivial datatype changes upstream, e.g. columnX changes from varchar ( 30 ) to varchar ( 60 ).
Good luck!
Mike
I certainly wouldn't recommend modeling every column as char/varchar, but I do think it's worthwhile standardising to a handful of datatypes, e.g. nvarchar (250) for string data, datetime2 for dates etc.
If your ETL layer is SQL SPs it may not matter too much in that layer, but other ETL and downstream tools (I'm looking at you, SSIS & SSAS), can be hypersensitive to datatype changes.
You also want to minimize your exposure to trivial datatype changes upstream, e.g. columnX changes from varchar ( 30 ) to varchar ( 60 ).
Good luck!
Mike
Re: Data types for staging area db tables
It all depends on the source of the data feeds. You must ultimately store the data in its native data type, so you must know what type it is.
If you are receiving data in text files or XML, the data needs to be converted. If the data is from a trusted source that you know correctly represents the data in well formed strings, you don't have to do much validation. On the other hand, if the data is from an unknown source, such as a web log of user entered information, you need to check to see if strings are in the proper form.
If you are receiving data in text files or XML, the data needs to be converted. If the data is from a trusted source that you know correctly represents the data in well formed strings, you don't have to do much validation. On the other hand, if the data is from an unknown source, such as a web log of user entered information, you need to check to see if strings are in the proper form.
Similar topics
» Start Schema vs Snow flake schema
» Data in Staging area
» facts and dimension in staging area
» Fact Tables Types in AdventureWorks Microsoft Sample DB
» Staging area.. is it a must?
» Data in Staging area
» facts and dimension in staging area
» Fact Tables Types in AdventureWorks Microsoft Sample DB
» Staging area.. is it a must?
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum