1 instance or 2 in SQL Server 2008
4 posters
Page 1 of 1
1 instance or 2 in SQL Server 2008
I work for a Health care insurance company. I have 2 lines of business. The same data exists for both lines. 99% of the time, queries will be for either Line of business A or line of Business B. Very rarely will queries ever go against data for both lines.
The size of the big fact tables is in the 100s of millions of rows - so the data is not huge but it's not tiny. And remember, we are using SQL Server and not a unix based software.
I was thinking of putting the data for the 2 lines of business on 2 instances. The databases would have the exact same structure. The reason for doing this is purely performance of the queries. If Line of business A will never touch the other line of business B's data, then why force queries by LOB A to search through LOB B's data.
The fact tables are partitioned based on date (year).
One of the issues is the member dimension. Line of business A has 22 million members but relatively few claims. The Line of business B has 3 million members, but lots of claims.
I was planning to share the Dimension tables. For the member dimension, I was going to create 2 dimensions with the exact same structure - 1 member dimension would have surrogates that are positive numbers and the other would have surrogates that are negative numbers, in case I ever wanted to create a union join of the 2 dimension tables.
As I said, the primary reason for doing the 2 instances is for performance. But, to create the 2 instances, I have to allocate ROM and I think CPUs to each instance. A server with 4 quad core processors and 16 GB of ram effectively becomes 2 servers with 2 quad core processors and 8 GB of RAM.
Jeff
The size of the big fact tables is in the 100s of millions of rows - so the data is not huge but it's not tiny. And remember, we are using SQL Server and not a unix based software.
I was thinking of putting the data for the 2 lines of business on 2 instances. The databases would have the exact same structure. The reason for doing this is purely performance of the queries. If Line of business A will never touch the other line of business B's data, then why force queries by LOB A to search through LOB B's data.
The fact tables are partitioned based on date (year).
One of the issues is the member dimension. Line of business A has 22 million members but relatively few claims. The Line of business B has 3 million members, but lots of claims.
I was planning to share the Dimension tables. For the member dimension, I was going to create 2 dimensions with the exact same structure - 1 member dimension would have surrogates that are positive numbers and the other would have surrogates that are negative numbers, in case I ever wanted to create a union join of the 2 dimension tables.
As I said, the primary reason for doing the 2 instances is for performance. But, to create the 2 instances, I have to allocate ROM and I think CPUs to each instance. A server with 4 quad core processors and 16 GB of ram effectively becomes 2 servers with 2 quad core processors and 8 GB of RAM.
Jeff
Jeff Smith- Posts : 471
Join date : 2009-02-03
Re: 1 instance or 2 in SQL Server 2008
I had an expririence with something like this at a Health Insurer in the mid 90's.
They cloned their operational system for each line of business. There was an 'HMO' system, a 'PPO' system, and a 'Medicare' system. Each a slightly modified version of the base system. They then followed the same model when building the data warehouse. There were three complete sets of almost identical tables, one for each product line.
I don't know where you work, but at this Insurer, they were always looking across business lines. Reporting was a very tedious and labor intensive challenge.
They cloned their operational system for each line of business. There was an 'HMO' system, a 'PPO' system, and a 'Medicare' system. Each a slightly modified version of the base system. They then followed the same model when building the data warehouse. There were three complete sets of almost identical tables, one for each product line.
I don't know where you work, but at this Insurer, they were always looking across business lines. Reporting was a very tedious and labor intensive challenge.
Re: 1 instance or 2 in SQL Server 2008
Hi Jeff,
I agree with ngalemmo's comments - splitting the content could create a nightmare down the track. From my experience in these scenarios, the worst problem is that the schema and logic starts out the same, then slowly drifts further and further apart, typically because specific projects only are of interest to one LOB or the other. You then end up with massive duplication of slightly different code, both in the datawarehouse and ETL layer, but also in queries, reports and cubes.
I'd suggest keeping everything in one database, and just include a simple LOB dimension linked to every Fact. Ensure that each Fact's FK to the LOB dimension has an index. I think in your scenario that will provide more than adequate performance. It's also easily adapted to LOB changes (e.g. M&A).
If you must achieve total separation on disk, I'd suggest two databases within one SQL instance would give you much better use of resources than two instances. That architecture also has some hope of supporting federated queries.
Good luck!
Mike
I agree with ngalemmo's comments - splitting the content could create a nightmare down the track. From my experience in these scenarios, the worst problem is that the schema and logic starts out the same, then slowly drifts further and further apart, typically because specific projects only are of interest to one LOB or the other. You then end up with massive duplication of slightly different code, both in the datawarehouse and ETL layer, but also in queries, reports and cubes.
I'd suggest keeping everything in one database, and just include a simple LOB dimension linked to every Fact. Ensure that each Fact's FK to the LOB dimension has an index. I think in your scenario that will provide more than adequate performance. It's also easily adapted to LOB changes (e.g. M&A).
If you must achieve total separation on disk, I'd suggest two databases within one SQL instance would give you much better use of resources than two instances. That architecture also has some hope of supporting federated queries.
Good luck!
Mike
Last edited by Mike Honey on Mon Oct 10, 2011 12:25 am; edited 2 times in total (Reason for editing : brain fade)
Re: 1 instance or 2 in SQL Server 2008
We've decided to keep the data in 1 instance and modify the partitions. We are going to create a "Paid_Date_Key" that is connected to a dimension that is essentially the Date Dimension with the Line of Business identifier. We will denormalize the Date Dimension to the Paid Date Dimensions. The Surrogate Key for the Paid Date Dimension is the Date_Key plus either 0 for Line of Business A or 10,000,000 for Line of Business B. We will create partitions in a way that Claims paid in January 2011 for LOB A goes into 1 partition and Claims paid for January 2011 for LOB B go into a different partition. As long as the query includes the LOB and the Paid Date, the database should be able to go directly to the proper database.
Jeff Smith- Posts : 471
Join date : 2009-02-03
Re: 1 instance or 2 in SQL Server 2008
If you use the Slicer properties within SSAS you can improve your cube performance to restrict it to only the data required for that partition.
Similar topics
» SQL Server 2008, page compression and indexing
» Fact Indexing -SQL Server 2008
» rule of thumb regarding log space for SQL Server 2008
» SK generation in SQL Server 2005/2008
» SQL Server 2008 Date data type as dimension key
» Fact Indexing -SQL Server 2008
» rule of thumb regarding log space for SQL Server 2008
» SK generation in SQL Server 2005/2008
» SQL Server 2008 Date data type as dimension key
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum