Dimension modeling for academic data
3 posters
Page 1 of 1
Dimension modeling for academic data
What am doing some initial planning of the steps I need to take to create a DW for a medical school. Tables would include students, courses, exams, questions, exam_questions, grades, stats and such. If a fact table is where I would store performance measurement like grades or stats, then all my dimensions tables would be stuff like students, courses, exams... correct? If this is a star schema would grades be my primary table? Seems my fact table would be huge, and my dimensions not so much. My current data is just over 500 students and 5 years of data. They are broken out to around 100 students per year for each of the Class_Years...(1,2,3,4)
Currently reading: The Data Warehouse Toolkit, 2nd Edition and The Data Warehouse Lifecycle Toolkit 2nd Edition.
I don't have any money in the budget and trying to build this on MySQL and utilize Pentaho CE as the front end.
Currently reading: The Data Warehouse Toolkit, 2nd Edition and The Data Warehouse Lifecycle Toolkit 2nd Edition.
I don't have any money in the budget and trying to build this on MySQL and utilize Pentaho CE as the front end.
SFDonovan- Posts : 2
Join date : 2010-08-15
Re: Dimension modeling for academic data
I am doing DW in education area. However we are dealing with half million students each year for the state government. I think that's a proper size for DW venture for the performance benefit when querying against multi-year historical data.
In your case, I am not sure about the justification for different architecture (DW) that is squarely targeted at significant size of database for performance reason. Perhaps the current system needs to be further normalised and have more maintainable structure in place, and have some logical layer (views) based on the dimensional concepts you found in Kimball’s books to cater for OLAP analysis by the tools. I would not bother creating any sophisticated ETL process rather than just simple data dump exercise.
In your case, I am not sure about the justification for different architecture (DW) that is squarely targeted at significant size of database for performance reason. Perhaps the current system needs to be further normalised and have more maintainable structure in place, and have some logical layer (views) based on the dimensional concepts you found in Kimball’s books to cater for OLAP analysis by the tools. I would not bother creating any sophisticated ETL process rather than just simple data dump exercise.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Dimension modeling for academic data
I'm not worried about performance yet as my db is small. My total count of students is over 3000 but I don't have data for all 3000. Most of the data prior to 1990 are in hard copy in filing cabinets. I have some grade data I could extract from webCT, and the rest from our in-house exam tool. We only have an OLTP written in Filemaker that serves the entire medical school, but the application has very little reporting and analysis. This is what I'm trying to correct with a DW. I'm just unsure of what architecture to use. I see that Pentaho Community Edition is a open source BI tool I could use for a MySQL database. I just need to create a DW that can utilize this technology.
SFDonovan- Posts : 2
Join date : 2010-08-15
Re: Dimension modeling for academic data
Your thinking makes sense given the nature of your source system. Since you have to rehost the data for reporting, you might as well restructure it into a dimensional model to make that reporting easier. Given your data size and budget, MySQL and Pentaho ought to be fine. The real challenge, as you are figuring out, is designing a good target dimensional model.
I would suggest working within the academic community to see what other schools, especially medical schools have come up with. There is a group called the Higher Education Data Warehousing forum (http://www.hedw.org/). that meets every year. Join up with them, and try and contact some of the relevent member schools directly. Educause has a data warehouse resource section as well: http://www.educause.edu/Resources/Browse/Data%20Warehouse/17982
Getting some ideas from others in your institution/data/requirements space will help you better understand how to design your specific data model for analytics. The great thing about the academic world is you can actually go ask others and they tend to respond.
I would suggest working within the academic community to see what other schools, especially medical schools have come up with. There is a group called the Higher Education Data Warehousing forum (http://www.hedw.org/). that meets every year. Join up with them, and try and contact some of the relevent member schools directly. Educause has a data warehouse resource section as well: http://www.educause.edu/Resources/Browse/Data%20Warehouse/17982
Getting some ideas from others in your institution/data/requirements space will help you better understand how to design your specific data model for analytics. The great thing about the academic world is you can actually go ask others and they tend to respond.
warrent- Posts : 41
Join date : 2008-08-18
Similar topics
» Modeling Student Academic Programs
» Dimension Modeling for Big Data
» Modeling Supporting Dimension Data in a Enterprise DW
» Data WebHouse WebObject Dimension modeling
» Designing a data warehouse to store academic publications for natural language processing
» Dimension Modeling for Big Data
» Modeling Supporting Dimension Data in a Enterprise DW
» Data WebHouse WebObject Dimension modeling
» Designing a data warehouse to store academic publications for natural language processing
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum