MS SQL Server Collation
2 posters
Page 1 of 1
MS SQL Server Collation
I've started migrating our data warehouse from Oracle to MS SQL and ran into an issue that bit me converting the ETL over to SSIS. It appears that out of the box, the database collation is case-insensitive. I wrote a query to presort some data and do a merge join on some other data. This bit me as the merge join transform IS case-sensitive and it failed to join some rows because it was not sorted as it expected. My question is, what is the standard practice for building data warehouse's on MS SQL? Should I create the database using a case-sensitive collation, only sort in the dataflow, or do what Andy Leong suggested here: http://sqlblog.com/blogs/andy_leonard/archive/2010/05/28/ssis-is-case-senstivie.aspx and use an UPPER() join column. But if I go with Andy's suggestion, how do you handle the case where the source system is case-sensitive and "Andy" is different from "AnDy"? I checked in the MDWT 2008R2 book, but could not find any reference to collation.
mugen_kanosei- Posts : 13
Join date : 2009-02-03
Age : 42
Location : Japan
Re: MS SQL Server Collation
Case sensitivity could affect query result involving joins and delta detection in DW. But sometimes we do want most attribute values to be case insensitive, for fussy matching purpose, and "Andy" should be the same person as "AnDy" if all other identifying attributes match as well. I would treat case sensitive attributes individually by defining them using "COLLATE Latin1_General_CS_AS" option in CREATE statement. Once defined as case sensitive, all the joins should work without having anything extra in the queries.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Similar topics
» Any Benefit to Defining Foreign Keys in SQL Server without enforcing foreign key constraint in SQL Server
» Implementing secondary indexes on fact tables. Is it a good idea?
» ETL from SAP ECC to SQL Server DW
» RAID, SAN & Equality on DEVL & PROD environments
» Run ETL on VMWare server
» Implementing secondary indexes on fact tables. Is it a good idea?
» ETL from SAP ECC to SQL Server DW
» RAID, SAN & Equality on DEVL & PROD environments
» Run ETL on VMWare server
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum
|
|