INTERVAL TIME SUM COLUMN
3 posters
Page 1 of 1
INTERVAL TIME SUM COLUMN
Hello,
In my data warehouse I have a fact table that joins with calendar table on BETWEEN clause ( calendar.date between myfact.startdate and myfact.enddate ).
In calendar date I have a record for each day.
I need to sum a fact table column in my query only one time for each fact table records. Now my fact table column is sum for each records of the query.
select sum(myfact.value)
from calendar, myfact
where calendar.date between myfact.startdate and myfact.enddate
How I can to do this?
In my data warehouse I have a fact table that joins with calendar table on BETWEEN clause ( calendar.date between myfact.startdate and myfact.enddate ).
In calendar date I have a record for each day.
I need to sum a fact table column in my query only one time for each fact table records. Now my fact table column is sum for each records of the query.
select sum(myfact.value)
from calendar, myfact
where calendar.date between myfact.startdate and myfact.enddate
How I can to do this?
Enrico- Posts : 3
Join date : 2009-06-15
Re: INTERVAL TIME SUM COLUMN
You can't do it the way you are doing it.
If you only want to count a fact once, why are you intentionally creating multiple occurances by joining to all dates between the start and end of whatever the fact represents? Why no just join where date = start date? Why are you using date at all?
If you only want to count a fact once, why are you intentionally creating multiple occurances by joining to all dates between the start and end of whatever the fact represents? Why no just join where date = start date? Why are you using date at all?
Re: INTERVAL TIME SUM COLUMN
Thanks,
My fact table contains employees and the fact column contains FTE ( Full Time Equvalent percent).
I have a record for each change of fact column of employee.
Ex. :
Employee start date end date FTE
1 2008-11-01 2009-02-28 100
1 2009-03-01 2009-03-31 75
1 2009-04-01 2050-12-31 50
2 2007-01-01 2009-05-30 60
2 2009-06-01 2050-12-31 100
In my query I need to sum a FTE column at certain date. If the user choose year the query must show the sum of employees in year, if the user choose year and month the query must show the sum of employees in year/month.
I can't join date = start date because the start date couldn't be in required interval time. In 2009 the first record of example don't join.
I use Business Objects.
I could create different fact table ( a year fact table, a year/month fact table, a date fact table ). I a user choose a year ( year = year ) I join with a year fact table ecc...
The limitation of this solution is if the user choose different dimension time object ( for ex. a quarter ). In this case the query don't works and I must create a quarter fact table.
Enrico
My fact table contains employees and the fact column contains FTE ( Full Time Equvalent percent).
I have a record for each change of fact column of employee.
Ex. :
Employee start date end date FTE
1 2008-11-01 2009-02-28 100
1 2009-03-01 2009-03-31 75
1 2009-04-01 2050-12-31 50
2 2007-01-01 2009-05-30 60
2 2009-06-01 2050-12-31 100
In my query I need to sum a FTE column at certain date. If the user choose year the query must show the sum of employees in year, if the user choose year and month the query must show the sum of employees in year/month.
I can't join date = start date because the start date couldn't be in required interval time. In 2009 the first record of example don't join.
I use Business Objects.
I could create different fact table ( a year fact table, a year/month fact table, a date fact table ). I a user choose a year ( year = year ) I join with a year fact table ecc...
The limitation of this solution is if the user choose different dimension time object ( for ex. a quarter ). In this case the query don't works and I must create a quarter fact table.
Enrico
Enrico- Posts : 3
Join date : 2009-06-15
Re: INTERVAL TIME SUM COLUMN
You only have one date, the reporting date. You do not want to find dates within the effective and expration date but rather facts that are in effect on the reporting date. You will not double count employees.
You left out a very important line in your sample query:
select sum(myfact.value)
from calendar, myfact
where calendar.date between myfact.startdate and myfact.enddate
and calendar.date = reporting date
You left out a very important line in your sample query:
select sum(myfact.value)
from calendar, myfact
where calendar.date between myfact.startdate and myfact.enddate
and calendar.date = reporting date
Re: INTERVAL TIME SUM COLUMN
Thanks,
I don't have reporting date ( only one date).
If the user choose the year my SQL is:
select sum(myfact.value), calendar.year
from calendar, myfact
where calendar.date between myfact.startdate and myfact.enddate
group by calendar.year
If the user choose the year and month my SQL is:
select sum(myfact.value), calendar.year, ), calendar.month
from calendar, myfact
where calendar.date between myfact.startdate and myfact.enddate
group by calendar.year, ), calendar.month
Enrico
I don't have reporting date ( only one date).
If the user choose the year my SQL is:
select sum(myfact.value), calendar.year
from calendar, myfact
where calendar.date between myfact.startdate and myfact.enddate
group by calendar.year
If the user choose the year and month my SQL is:
select sum(myfact.value), calendar.year, ), calendar.month
from calendar, myfact
where calendar.date between myfact.startdate and myfact.enddate
group by calendar.year, ), calendar.month
Enrico
Enrico- Posts : 3
Join date : 2009-06-15
Re: INTERVAL TIME SUM COLUMN
So, the problem is identifying facts that fall within a time period.
Because your facts span a time range, you cannot directly join them to the date dimension. You must use the date dimension to define the time range to look for. This requires a two-pass query...
For example, where a user chooses a year, the query would be:
select sum(myfact.value), a.year
from myfact,
(select min(date) mindate, max(date) maxdate, max(year) year
from calendar
where year = #user selected year) a
where myfact.startdate between a.mindate and a.maxdate
or myfact.enddate between a.mindate and a.maxdate
or (myfact.startdate < a.mindate and mystart.enddate > a.maxdate)
group by a.year
Because your facts span a time range, you cannot directly join them to the date dimension. You must use the date dimension to define the time range to look for. This requires a two-pass query...
For example, where a user chooses a year, the query would be:
select sum(myfact.value), a.year
from myfact,
(select min(date) mindate, max(date) maxdate, max(year) year
from calendar
where year = #user selected year) a
where myfact.startdate between a.mindate and a.maxdate
or myfact.enddate between a.mindate and a.maxdate
or (myfact.startdate < a.mindate and mystart.enddate > a.maxdate)
group by a.year
Re: INTERVAL TIME SUM COLUMN
Another thought is that if you are trying to count employees, it usually isn't done the way you are attempting to do it. Usually employee headcounts are as of a specific point in time. A typical HR person may ask 'What is my headcount for June?'... what they are really asking is 'What is my headcount on June 30th?' or 'June 1' or whatever the business rule is as far as when a count is taken. What you can't do is count one employee who left early in the month and another employee who started later in the month. It will create very misleading totals.
So, when you say 'I don't have a report date' is not accurate. You need to go back to the business and find out what they really mean when they say 'Give me counts for xxxx'.
So, when you say 'I don't have a report date' is not accurate. You need to go back to the business and find out what they really mean when they say 'Give me counts for xxxx'.
Define the Grain of the Fact Table
It is not explicit what a row in your fact able represents, could you provide a grain statement to help clarify?
I have done my fair share of HR designs and most don't implement a fact table having a range of dates but either (Periodic Snapshot Type)
1. A reporting date (as ngalemmo references) that aggregates HR data during a period since the last reporting date (Transactional Type)
2. A specific single date that provides details (as an example) for FTE in a given day (this can of course be aggregated as needed
In both of these designs simple joins to the Date dimension provide the answers to question such as:
1. What is the total FTE as of a certain date?
2. What is the total/average FTE in a certain month/quarter/year?
I have done my fair share of HR designs and most don't implement a fact table having a range of dates but either (Periodic Snapshot Type)
1. A reporting date (as ngalemmo references) that aggregates HR data during a period since the last reporting date (Transactional Type)
2. A specific single date that provides details (as an example) for FTE in a given day (this can of course be aggregated as needed
In both of these designs simple joins to the Date dimension provide the answers to question such as:
1. What is the total FTE as of a certain date?
2. What is the total/average FTE in a certain month/quarter/year?
JoeSalvatore- Posts : 4
Join date : 2009-06-19
Similar topics
» Interval analysis
» Combined Date Time dimension vs. Separate Day and Time dimensions and timezones
» Time dimension design for cross time zone and custom calendar warehouse, 1min granularity
» Column Property Enforcement
» SCD type 2 approach.
» Combined Date Time dimension vs. Separate Day and Time dimensions and timezones
» Time dimension design for cross time zone and custom calendar warehouse, 1min granularity
» Column Property Enforcement
» SCD type 2 approach.
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum