Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

modelling questions & answers dimensions where answer can be multiple choice or freetext

2 posters

Go down

modelling questions & answers dimensions where answer can be multiple choice or freetext Empty modelling questions & answers dimensions where answer can be multiple choice or freetext

Post  wilson_smyth Wed Jun 03, 2015 7:31 am

Im building a star schema to allow reporting against usage of an application.
Application has a section where user answers questions.

A question can have the following types of answers:
- multiple choice (radio buttons), chose 1 out of 4.
- multiple choice (tick box), chose one or more of the following.
- free text (text box), what are your thoughts on...? or if "other" radio button is chosen.

I was considering just a questions dimension, and an answers dimension.
The problem with this is the free text will pollute the answers dimension with lots of answers that are specific to only one user on one question.
This model also means that there is a row in the fact table for each answer, which i dont think is ideal.

Note I will have to report on questions that have not been answered by users as well as what has been answered.

Id appreciate some suggestions on how best to approach this.

Thanks!

wilson_smyth

Posts : 1
Join date : 2015-06-03

Back to top Go down

modelling questions & answers dimensions where answer can be multiple choice or freetext Empty Re: modelling questions & answers dimensions where answer can be multiple choice or freetext

Post  ngalemmo Wed Jun 03, 2015 6:33 pm

If you don't want answer to be the grain, then I assume you want question as the grain. In that case you have a bridge table that relates the instance of the question to multiple answers. There are a couple of ways you can build and maintain such a bridge.

The question about collecting the free-form text is a business one.  Do they want it, do the want to pay for it.  There are a number of ways this data is collected and consumed by analytics.  If they want it and don't know what to do with it, store it in a separate structure from the primary data so it doesn't slow most queries. It would be a dimension of the instance of the fact .  

Note that some databases do this internally.  My understanding is that SQL server stores large objects (text, blobs) in a separate storage structure that is referenced from the main columnar table.  Having such columns in the table does not hinder the performance of queries that do not use the column.  Whereas if the data is stored in the same physical row (as some databases do), queries that do not use the column must still deal with a very wide row, requiring significantly more I/O to get through the table.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum