Seeking suggestions on how to potentially model 0:M scenario

cjtravis Wed Jan 07, 2015 5:42 pm

Greetings,

Perhaps I'm overthinking the situation, but I've come to a mental roadblock on how to sensibly model a scenario within our data warehouse. Simply put, our call center takes in a number of cases each day. Transactionally speaking, the cases are a sort-of "parent-child" relationship in the sense that a case may or may not have case notes. Each new case note ties to an existing case number in the header table.

As expected, due to the nature of the data, this is really large data - a lot of free form text detailing the call/problem being logged in the system - and all information that our reporting end users do want to see.

My question is, how do I model the "zero:many" at the case note level? So far, I have gone about it by having a fact at the case note level but several (if not a majority of) cases may never have notes attached to them. Maintaining the fact at this level would be strange.

I am thinking of creating one fact that is at the CASE NOTE level

CASE_ID
NOTE_SEQ_NBR
Other Dim FKs

Dimensions:

Case Profile Dimension (Junk/Low Cardinality)
Case Detail Dim (High Cardinality, Free Form Fields)
Case Note Detail Dim (High Cardinality, Free Form Fields)

Maybe it's been a long day, but I'm struggling with how to model this with one fact (which may also be my problem). I don't want to repeat the large text fields in a single dimension if I can avoid it. Thanks for your help.

nick_white Thu Jan 08, 2015 4:21 am

Hi - Kimball talks about this in his book (search for "Freeform Text Comments"). To summarise: don't put freeform Notes in Fact tables, either put them in the relevant dimension or create a Notes Dimension that can hold notes created in relation to any activity.

Given that a dimensional model is there to support analytical queries and you can't use freeform text fields in such queries (or not in anyway that makes sense) my personal preference is to put notes in a standalone table. I don't consider it strictly part of the dimensional model but more a standalone table that the the resultset of an analytical query can be joined to

Regards,

BoxesAndLines Thu Jan 08, 2015 1:45 pm

Kimball or any other relational model doesn't handle unstructured data well. If you want to use a dimensional model, you need to do some preprocessing to make it structured. This is one good use for big data, sentiment analysis, etc.

Sponsored content

Seeking suggestions on how to potentially model 0:M scenario

Seeking suggestions on how to potentially model 0:M scenario

Re: Seeking suggestions on how to potentially model 0:M scenario

Re: Seeking suggestions on how to potentially model 0:M scenario

Re: Seeking suggestions on how to potentially model 0:M scenario