How to report on sparse areas of sparse fact table
2 posters
Page 1 of 1
How to report on sparse areas of sparse fact table
A source system tracks student attendance for a school district by reporting absence events. Attendance on any particular day can be determined by examining three datasets: school calendar, student enrollment, and absence.
On any given school day, the number of enrolled students in attendance is usually much larger than the number that are absent, so this approach reduces the number of records stored to track attendance significantly.
I am trying to determine the proper way to represent daily attendance in a dimensional model. The most obvious way is to create a factless table with a grain per school day per student, and an attendance dimension that has values for both attendance and absence reasons. This is quite straightforward to work with OLAP, but the downside is the size of the fact table.
For example, for 30,000 students and 188 school days means that there are approximately 0.5 million records per year (if this doesn't seem large enough to be an issue, then consider an example in which attendance must be reported on per period rather than per day). Contrast this to a fact table that records only absences and the number is considerably smaller. However, if I do this, then I am not sure how to build cubes that aggregate daily attendance facts.
The specific OLAP technology being used is SQL Server Analysis Services 2008 R2. Any thoughts?
On any given school day, the number of enrolled students in attendance is usually much larger than the number that are absent, so this approach reduces the number of records stored to track attendance significantly.
I am trying to determine the proper way to represent daily attendance in a dimensional model. The most obvious way is to create a factless table with a grain per school day per student, and an attendance dimension that has values for both attendance and absence reasons. This is quite straightforward to work with OLAP, but the downside is the size of the fact table.
For example, for 30,000 students and 188 school days means that there are approximately 0.5 million records per year (if this doesn't seem large enough to be an issue, then consider an example in which attendance must be reported on per period rather than per day). Contrast this to a fact table that records only absences and the number is considerably smaller. However, if I do this, then I am not sure how to build cubes that aggregate daily attendance facts.
The specific OLAP technology being used is SQL Server Analysis Services 2008 R2. Any thoughts?
Last edited by gxclarke on Sun Oct 09, 2011 7:03 pm; edited 1 time in total (Reason for editing : Corrected typo)
gxclarke- Posts : 2
Join date : 2011-10-08
Re: How to report on sparse areas of sparse fact table
Hi gxclarke,
Your "obvious" design sounds more useful to me.
From my experience with SQL Server Analysis Services 2008 R2, I'd say your data volumes are relatively light. I'd expect you could comfortably process that data on a commodity server in a matter of minutes, and expect query response times below 5 seconds.
For example, on a virtual dev server at my current site I can process a roughly equivalent-sized fact (2.5m rows) in 30 seconds (including "30%" aggregations).
Good luck!
Mike
Your "obvious" design sounds more useful to me.
From my experience with SQL Server Analysis Services 2008 R2, I'd say your data volumes are relatively light. I'd expect you could comfortably process that data on a commodity server in a matter of minutes, and expect query response times below 5 seconds.
For example, on a virtual dev server at my current site I can process a roughly equivalent-sized fact (2.5m rows) in 30 seconds (including "30%" aggregations).
Good luck!
Mike
Similar topics
» Sparse data in fact table
» Periodic snapshot fact tables with sparse data
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
» Integrating new fact table which has one to many relationship with the main fact table in existing star schema
» How to create fact table with measures derived from comparing two fact table rows
» Periodic snapshot fact tables with sparse data
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
» Integrating new fact table which has one to many relationship with the main fact table in existing star schema
» How to create fact table with measures derived from comparing two fact table rows
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum