Fact table Modeling (1:n relations)

ThomasP Thu Oct 10, 2013 6:43 am

Hello,

in my storage data model I got the following relations:

root_tbl -- 1:n -- entry_tbl -- n:1 -- action_tbl

There are a few more tables but this covers the basics.
Alright, so basically one ID from the root table has several datasets in the entry table.

Example data:

root_tbl:
ID_root ; Country ; FK_User ; FK_Product
1 ; UK ; 23 ; 31
2 ; NL ; 42 ; 01

entry_tbl:
ID_entry ; FK_root ; FK_Action ; Duration
1 ; 1 ; 42 ; 200ms
2 ; 1 ; 10 ; 94ms
3 ; 1 ; 9 ; 300ms
4 ; 2 ; 10 ; 322ms
5 ; 2 ; 30 ; 100ms

So far so good ... with this data model it is pretty easy to answer things like how many records have "UK" as country with action "10" and so on.
Now I would like to put this data into a fact table but my problem are the relations of these three tables. For example would I use the records of entry_tbl as fact than I would have to do a select distinct on ID everytime I count country, user or product.

The fact table would look more or less like this (just imagine the strings as foreign keys ):

entry_tbl:
ID ; FK_root ; FK_Action ; Duration ; Country ; User ; Product
1 ; 42 ; 200ms ; UK ; 23 ; 31
1 ; 10 ; 94ms ; UK ; 23 ; 31
1 ; 9 ; 300ms ; UK ; 23 ; 31
2 ; 10 ; 322ms ; NL ; 42 ; 01
2 ; 30 ; 100ms ; NL ; 42 ; 01

This means I would have a lot of redundant data.

Are there any way around these solution? The fact table would contain ~ 300 - 500m rows.

I hope you got my point. If anything is not clear feel free to ask

regards
Thomas

ngalemmo Thu Oct 10, 2013 9:14 am

Yes. If that bothers you, you can always create an additional aggregate fact that does not include action.

ThomasP Thu Oct 10, 2013 9:24 am

so basically a denormalized fact table like this

ID ; FK_root ; FK_Action ; Duration ; Country ; User ; Product
1 ; 42 ; 200ms ; UK ; 23 ; 31
1 ; 10 ; 94ms ; UK ; 23 ; 31
1 ; 9 ; 300ms ; UK ; 23 ; 31
2 ; 10 ; 322ms ; NL ; 42 ; 01
2 ; 30 ; 100ms ; NL ; 42 ; 01

would be pretty normal?

ngalemmo Thu Oct 10, 2013 10:00 am

In addition to the detail, you have an aggregate like:

FK_root ; Duration ; Country ; User ; Product
1 ; 594ms ; UK ; 23 ; 31
2 ; 422ms ; NL ; 42 ; 01

ThomasP Thu Oct 10, 2013 10:03 am

yes, but "action" will be part of most of the analysis being done.
I guess then my fact table from above would be suitable?

ngalemmo Thu Oct 10, 2013 2:39 pm

Yes, an atomic level fact is the best choice. An aggregate would be an additional table that you could create if the need arises. Usually to address performance issues. If performance is not a problem, but query complexity is (if users are writing SQL versus using a BI tool, you can always create a view that embeds the complexity.

Sponsored content

Fact table Modeling (1:n relations)

Fact table Modeling (1:n relations)

Re: Fact table Modeling (1:n relations)

Re: Fact table Modeling (1:n relations)

Re: Fact table Modeling (1:n relations)

Re: Fact table Modeling (1:n relations)

Re: Fact table Modeling (1:n relations)

Re: Fact table Modeling (1:n relations)