Conformed DW on single or multiple SQL Server DB?
Page 1 of 1 • Share •
Conformed DW on single or multiple SQL Server DB?
We are moving towards an integrated Enterprise Data Warehouse database architecture approach, where we wish to implement conformed dimensions across multiple subject areas. This entails distinct Staging, Integration, and Presentation layers, in a new proposed EDW architecture roadmap.
We would like to get input, however, on the advisability of implementing this Presentation layer within a single physical database in a SQL Server environment, and whether there are practical limitations around doing so, and if so, what those are, and how to address these while still achieving our goals.
While the single Presentation database approach seems clearly to be preferable for several reasons, there may be some general considerations related to this that we’d like to think through, such as database size and maintenance windows, as well as how to isolate user access to subject areas that have been updated while still loading others, if these are within a single database environment. Have others come across any of these sorts of limitations (or others) within an enterprise data warehouse implemented on SQL Server that we should be aware of?
As a way to mitigate those possible limitations if they exist, we’re also interested in learning about implementing conformed dimensions across multiple databases to achieve our goals of an integrated enterprise data warehouse. How would that work from an RI and user standpoint?
Any input, specific SQL Server Data Warehouse considerations we should be aware of, or implementation options, particularly around how to make the conformed dimensions work as intended if we need to have multiple databases, are much appreciated. We are a State University and so with state budget situation, have no training or consulting budget.
We would like to get input, however, on the advisability of implementing this Presentation layer within a single physical database in a SQL Server environment, and whether there are practical limitations around doing so, and if so, what those are, and how to address these while still achieving our goals.
While the single Presentation database approach seems clearly to be preferable for several reasons, there may be some general considerations related to this that we’d like to think through, such as database size and maintenance windows, as well as how to isolate user access to subject areas that have been updated while still loading others, if these are within a single database environment. Have others come across any of these sorts of limitations (or others) within an enterprise data warehouse implemented on SQL Server that we should be aware of?
As a way to mitigate those possible limitations if they exist, we’re also interested in learning about implementing conformed dimensions across multiple databases to achieve our goals of an integrated enterprise data warehouse. How would that work from an RI and user standpoint?
Any input, specific SQL Server Data Warehouse considerations we should be aware of, or implementation options, particularly around how to make the conformed dimensions work as intended if we need to have multiple databases, are much appreciated. We are a State University and so with state budget situation, have no training or consulting budget.
tembree- Posts: 1
Join date: 2009-03-30
Re: Conformed DW on single or multiple SQL Server DB?
The most common solution is to have a monolithic presentation database: a single database that contains the enterprise data warehouse, all conformed dimensions, and the fact tables that hook into them. This is true for SQL Server as for other RDBMSs. It's the most popular architecture because it's (usually) simplest, and because it enables easy connection between subect areas.
It sounds like you may be concerned about scalability. I have seen quite a few terabyte+ databases implemented on SQL Server, which in my experience should meet the needs of a university DW. I have seen complex, multi-subject-areas DWs, and I've seen DWs with many users doing all sorts of crazy things. Combine those different dimensions of "scale," and complexity increases substantially. In other words, if you're trying to solve a hard, large-scale problem, don't expect to be able to get there simply by reading Books Online (ha!). But it is possible to get there on SQL Server.
That said, there's nothing about the Kimball Method that requires a monolithic data warehose database. It is quite feasible to "chunk out" pieces of the DW to multiple servers. Of course, that's quite common with very large systems. One common approach is to identify one server as the Dimension Manager; the final step of dimension processing is to replicate the dimension wherever it's used (using replication or simply copying). With most dimension tables, this task performs fast enough (especially if you do not enforce RI between facts and dimensions in the database).
In the SQL Server world, it's also common to use Analysis Services as the presentation database. Analysis Services does make it easy to do some of the things you mentioned, such as allowing users to query a consistent copy of the database during processing.
It sounds like you may be concerned about scalability. I have seen quite a few terabyte+ databases implemented on SQL Server, which in my experience should meet the needs of a university DW. I have seen complex, multi-subject-areas DWs, and I've seen DWs with many users doing all sorts of crazy things. Combine those different dimensions of "scale," and complexity increases substantially. In other words, if you're trying to solve a hard, large-scale problem, don't expect to be able to get there simply by reading Books Online (ha!). But it is possible to get there on SQL Server.
That said, there's nothing about the Kimball Method that requires a monolithic data warehose database. It is quite feasible to "chunk out" pieces of the DW to multiple servers. Of course, that's quite common with very large systems. One common approach is to identify one server as the Dimension Manager; the final step of dimension processing is to replicate the dimension wherever it's used (using replication or simply copying). With most dimension tables, this task performs fast enough (especially if you do not enforce RI between facts and dimensions in the database).
In the SQL Server world, it's also common to use Analysis Services as the presentation database. Analysis Services does make it easy to do some of the things you mentioned, such as allowing users to query a consistent copy of the database during processing.

Joy- Posts: 11
Join date: 2009-02-03
Location: Kimball Group

Permissions of this forum:
You cannot reply to topics in this forum





