Keyword dimension question
2 posters
Page 1 of 1
Keyword dimension question
Hi
I want to build a DW for my websites and ensure that I won't have to re-design it.
I have two main objects. Keyword (for page that belongs to website) and page (that belongs to website). I think doing two dimensions: Keyword and Page is not needed so I thought about creating just one keyword dimension that will contain the following attributes:
Keyword
Page
Website
-- the above attributes are one unique business key
Now complications:
-- Each page has unique ID and name however it is not uncommon that I will do "redirect" on some pages so there will be another record in the table that will represent new Page and the old record/page will have ParentPageID of the new page/PageID (simple parent-child relationship).
-- Keywords may belong to old or new page (but not both) and I will handle that logic in ETL
I have two questions:
1) Is it best to have one dimension "keyword" (any disadvantages?)
2) How to handle load of pages into the dimension with "redirects" . For instance Keyword/Page/Website is unique however I can do Page redirect therefore PageName (and ID) will change but it is still the same Page that has the same keyword and website.
Example of data below (source)
PageID, PageName, WebsiteID, ParentPageID
1, MyPage1, 1, 2
2, MyPage2, 1, NULL
There are two records however it is only one current page (PageID = 2)
Many thanks in advance
Emil
I want to build a DW for my websites and ensure that I won't have to re-design it.
I have two main objects. Keyword (for page that belongs to website) and page (that belongs to website). I think doing two dimensions: Keyword and Page is not needed so I thought about creating just one keyword dimension that will contain the following attributes:
Keyword
Page
Website
-- the above attributes are one unique business key
Now complications:
-- Each page has unique ID and name however it is not uncommon that I will do "redirect" on some pages so there will be another record in the table that will represent new Page and the old record/page will have ParentPageID of the new page/PageID (simple parent-child relationship).
-- Keywords may belong to old or new page (but not both) and I will handle that logic in ETL
I have two questions:
1) Is it best to have one dimension "keyword" (any disadvantages?)
2) How to handle load of pages into the dimension with "redirects" . For instance Keyword/Page/Website is unique however I can do Page redirect therefore PageName (and ID) will change but it is still the same Page that has the same keyword and website.
Example of data below (source)
PageID, PageName, WebsiteID, ParentPageID
1, MyPage1, 1, 2
2, MyPage2, 1, NULL
There are two records however it is only one current page (PageID = 2)
Many thanks in advance
Emil
itcouple- Posts : 45
Join date : 2010-10-13
Keyword dimension
I have been thinking about that and have one idea.
DimKeyword:
ID (surrogateKey)
WebsiteName
PageNameID (unique for website. This is actually page name not id but I called it ID due to next field)
PageName (Latest PageName. This is what I will use but I want "pageNameID" for history analysis)
Keyword
So I think that might work so I extract Keyword (ensure it belongs only to one page) and populate with other attributes. PageNameID would be Type 2 and PageName Type 1 but I would update it for all related PageNameIDs (redirects).
This is my first "on my own" dimension design so I would appreciate your comments.
Regards
Emil
DimKeyword:
ID (surrogateKey)
WebsiteName
PageNameID (unique for website. This is actually page name not id but I called it ID due to next field)
PageName (Latest PageName. This is what I will use but I want "pageNameID" for history analysis)
Keyword
So I think that might work so I extract Keyword (ensure it belongs only to one page) and populate with other attributes. PageNameID would be Type 2 and PageName Type 1 but I would update it for all related PageNameIDs (redirects).
This is my first "on my own" dimension design so I would appreciate your comments.
Regards
Emil
itcouple- Posts : 45
Join date : 2010-10-13
3 dimension after all
I've been thinking about fact tables I need and it seems my dimension approach doesn't make sense... Initially I thought it would be nice to have one dimension so it is easier to find things and I thought my example is similar to product and category/subcategory attribute but now thinking about it it isn't because my attributes are part of fact FKs.
So I think I will go with the following dimensions:
Website:
ID
WebsiteAddress
WebsiteName (for redirect purposes, one name multiple addresses)
Page:
ID
PageID (page name but used as ID)
PageName (latest page name)
Keyword:
ID
Keyword
(I don't think I need anything else here?)
Regards
Emil
So I think I will go with the following dimensions:
Website:
ID
WebsiteAddress
WebsiteName (for redirect purposes, one name multiple addresses)
Page:
ID
PageID (page name but used as ID)
PageName (latest page name)
Keyword:
ID
Keyword
(I don't think I need anything else here?)
Regards
Emil
itcouple- Posts : 45
Join date : 2010-10-13
Re: Keyword dimension question
Is keyword simply an attribute of the page? That is something the categorizes the page?
If you are pulling from your own logs, other than the referring URL, you will only get references to your pages. Why wouldn't a single page dimension work? Keyword (aka page category) and site are just attributes of the page.
If you are pulling from your own logs, other than the referring URL, you will only get references to your pages. Why wouldn't a single page dimension work? Keyword (aka page category) and site are just attributes of the page.
keyword
Hi,
Thanks for your reply. In my case it is actually the other way. Page categorizes keyword. So I can have multiple keywords for one page. Keyword is actually more important for me than page as keyword is what I use for analysis in most cases. The reason why I changed my mind was that I will have several fact tables and some of them will be page granularity and some keyword granularity (I might create website granularity fact table as well for targets/forecast).
The main traffic log table will actually contain website/page and keyword as I do record previous URL and if it comes from google I know what the user typed to get to a specific page.
Initially I thought that having one keyword dimension will be easier to use but now I think 3 dimensions might be much simplier to implement and probably the right thing to do??
Regards
Emil
Thanks for your reply. In my case it is actually the other way. Page categorizes keyword. So I can have multiple keywords for one page. Keyword is actually more important for me than page as keyword is what I use for analysis in most cases. The reason why I changed my mind was that I will have several fact tables and some of them will be page granularity and some keyword granularity (I might create website granularity fact table as well for targets/forecast).
The main traffic log table will actually contain website/page and keyword as I do record previous URL and if it comes from google I know what the user typed to get to a specific page.
Initially I thought that having one keyword dimension will be easier to use but now I think 3 dimensions might be much simplier to implement and probably the right thing to do??
Regards
Emil
itcouple- Posts : 45
Join date : 2010-10-13
Similar topics
» Question about keyword outrigger
» clickstream keyword dimension
» Question about using date dimension keys in other dimension tables
» Question on breaking out Degenerate Dimension to separate dimension
» Dimension Question
» clickstream keyword dimension
» Question about using date dimension keys in other dimension tables
» Question on breaking out Degenerate Dimension to separate dimension
» Dimension Question
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum