Usage flags in subscriber dimension
4 posters
Page 1 of 1
Usage flags in subscriber dimension
I have a subscriber dimension with 25 millions distinct subscriber. Business users want to see some flag objects (like active data user flag, active mobile content user flag etc) in BO. These columns will be calculated from last month's usage data. For example if one subscriber has a data usage last month, this subscriber will be marked as "active data user".
We have a monthly aggregated snapshot table and I can get monthly usage data of each subscriber. This table has 160 millions row per month with 16 millions distinct subscriber. And the flag columns will be calculated using this aggregated table.
I don't prefer to add these flag columns to the subscriber dimension. Because they are easily calculable from usage aggregation, and the rules may be change in time. And these flag columns will not be used by all departments.
But my aggregation table is huge, so I decided to create another aggregated table with less detail (or a materialized view). When a user needs to query one of these flags, there will be an outer join to the aggregated table. But if the user needs two flag columns, it will be two joins. I did not execute any performnce test but I have some performance concerns. Do you have any comments or design tips?
Thanks in advace.
We have a monthly aggregated snapshot table and I can get monthly usage data of each subscriber. This table has 160 millions row per month with 16 millions distinct subscriber. And the flag columns will be calculated using this aggregated table.
I don't prefer to add these flag columns to the subscriber dimension. Because they are easily calculable from usage aggregation, and the rules may be change in time. And these flag columns will not be used by all departments.
But my aggregation table is huge, so I decided to create another aggregated table with less detail (or a materialized view). When a user needs to query one of these flags, there will be an outer join to the aggregated table. But if the user needs two flag columns, it will be two joins. I did not execute any performnce test but I have some performance concerns. Do you have any comments or design tips?
Thanks in advace.
umutiscan- Posts : 51
Join date : 2010-11-26
Age : 44
Location : Istanbul, Turkey
Usage flags in subscriber dimension
Hi,
you can try to have both the flags in the MV or the summary monthly table ? Did you try that
How frequently you will be refreshong the MV or updating the summary table ? is it every hour, once a day or once a month and how much time dose it take to do the MV/Summary table refresh for a month ?
If it is not taking much time then you can outer join the summary table and see how it goes.
thanks
Himanshu
you can try to have both the flags in the MV or the summary monthly table ? Did you try that
How frequently you will be refreshong the MV or updating the summary table ? is it every hour, once a day or once a month and how much time dose it take to do the MV/Summary table refresh for a month ?
If it is not taking much time then you can outer join the summary table and see how it goes.
thanks
Himanshu
hkandpal- Posts : 113
Join date : 2010-08-16
Re: Usage flags in subscriber dimension
Why not simply define column expressions in BOBJ based on the measures in the monthly aggregate? You could also include predefined filters based on the same.
Re: Usage flags in subscriber dimension
Why not, I think subscriber dimension is the most suitable place for such an attribute. It's true that rule may change, but the model stays the same. This comes down to where you should calculate a derived attribute, in ETL or report layer. In my opinion, including this kind of attribute as type 1 in the dimension is the most pragmatic and efficient approach. You can easily hide any attribute in a view if some user does not need to see it.umutiscan wrote:I don't prefer to add these flag columns to the subscriber dimension. Because they are easily calculable from usage aggregation, and the rules may be change in time. And these flag columns will not be used by all departments.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Usage flags in subscriber dimension
Why not, I think subscriber dimension is the most suitable place for such an attribute. It's true that rule may change, but the model stays the same. This comes down to where you should calculate a derived attribute, in ETL or report layer. In my opinion, including this kind of attribute as type 1 in the dimension is the most pragmatic and efficient approach. You can easily hide any attribute in a view if some user does not need to see it.
These flag columns cannot be type 1 attributes because change of customer behaviour is important for the business. For example, If active data user flag turns to 0 from 1, this is an alert for them. On the other hand, This is the usage behaviour and this can be done querying usage aggregation. These kind of flag requests may be continious (like has automatic payment transaction flag, has sms transaction flag, has unpaid bill flag, has limit excess flag etc). Business users like to see everything as a subscriber attribute, they don't try to calculate them.
Why not simply define column expressions in BOBJ based on the measures in the monthly aggregate? You could also include predefined filters based on the same.
I'm not a BO designer, but this is what I try to implement. I will join subscriber dimension with usage aggregation like below. X,Y and Z are the data usage types and I will show the active_data_user_flag with a decode expression.
SELECT subs.id, decode(agg.subs_id, null, 0, 1) active_data_user_flag
FROM dim_subscriber subs, monthly_aggregation agg
WHERE subs.id = agg.subs_id(+)
AND agg.usage_type(+) in (X,Y,Z)
AND .......
But when I add another flag to the query there will be another join to usage aggregation with the other usage types. This may cause low performance.
umutiscan- Posts : 51
Join date : 2010-11-26
Age : 44
Location : Istanbul, Turkey
Re: Usage flags in subscriber dimension
In that case, I would still resolve it in ETL, materializing the attributes in table, maybe a shadow table would be a good solution. I firstly heard of the concept from Ngalemmo and found its usage by other modelers.
As shadow table is normally behind the scene, you may update the table structure without concern of distabalising the model interface. You may only surface up the shadow attributes for special requirements. I think Ngalemmo might be able to give you the full flavor of its working.
As shadow table is normally behind the scene, you may update the table structure without concern of distabalising the model interface. You may only surface up the shadow attributes for special requirements. I think Ngalemmo might be able to give you the full flavor of its working.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Usage flags in subscriber dimension
Yeah, I was thinking about that. Depending on your DBMS, updating 25M rows can be a big deal. Alternately you can consider creating a table with the subscriber PK and the flags, and just build a new table each month. If you need the previous month to alert changes, you can make it a type 3 dimension and carry forward the old flags from the previous month. Either way, you can construct the process so you just insert rows into a new table. A small table like that (~250MB) should perform fairly well in queries.
Re: Usage flags in subscriber dimension
Thanks for your helpful inputs. To resolve this problem in ETL seems more reasonable to me now. We have another analytics platform and they may need these flag attributes and they will have to calculate these flags again. One single source for these attributes will be more useful and a type three dimension will be a good solution. Thank you!
umutiscan- Posts : 51
Join date : 2010-11-26
Age : 44
Location : Istanbul, Turkey
Re: Usage flags in subscriber dimension
And if you do build a new table, it is generally much faster if you do not define indexes or constraints for the new table. Create indexes after the data has been loaded.
Similar topics
» Factless Fact Table can contain Flags (Yes or No)
» Flag attributes in Subscriber Dimension
» Operational Datastore Usage
» Should a True/False value be a dimension
» Cross Subscriber Analysis
» Flag attributes in Subscriber Dimension
» Operational Datastore Usage
» Should a True/False value be a dimension
» Cross Subscriber Analysis
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum