predictive analytics suggested reading?
predictive analytics suggested reading?
I am curious whether anyone has recommendations for understanding predictive analytics concepts and/or techniques, and in particular how to apply such concepts to data warehousing and dimensional modeling. I am not sure whether there are any conceptual books on this (agnostic of tool or vendor) that give a good overview. Thanks.
obiapps- Posts : 21
Join date : 2010-09-28
Re: predictive analytics suggested reading?
I think you are touching on the data mining area. Data warehousing and dimensional modeling are about how historical data can be stored and retrieved more efficiently, whereas data mining is the process of exploring data, ideally stored in a data warehouse, to find the patterns that describe the data and to predict its future values. Most data mining models are vendor specific, so the only way to get hands-on with the technology is to play with commercial tools using sample data. The cheapest approach is to play with the mining models built into Microsoft SSAS Developer Edition, priced at around $80.
However data mining skills only sound good on your CV, but in practice, it is still a long way for it to become a mainstream technology. Data warehousing has been around for more than two decades, yet people are still arguing about relational versus dimensional, and only in the last 5-10 years has dimensional modeling gradually emerged as a dominant force. Thinking of data mining as a downstream requirement after data warehousing, I imagine it will take at least another 5 years, if not more, for data mining to become a mainstream technology at the same level as data warehousing today.
Last edited by hang on Wed Nov 10, 2010 4:40 pm; edited 1 time in total
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: predictive analytics suggested reading?
You make a very good point. I work with Oracle packaged DW solutions for ERP systems, and there is a lot of literature (vaporware) using the term "predictive analytics". I began researching this, and I know that there is a solution called ODM (Oracle Data Mining) that comes as an add-on to the DB itself, but it has no relationship per se to the actual DW model. It seems it is possible to use ODM to enhance the existing DW, but there is no literature on how to do this. The few articles I found are heavily focused on algorithms, and I cannot find a way to relate them to my existing DW. From what you are saying, data mining is in its infancy and there has not been much proof of its actual advantages. Is that correct? I have read many articles about how companies are using predictive models and data mining to gain an advantage. I wonder how many of these companies there really are, and whether they are adding data mining as a strategy on top of an existing DW model. At my current client, there are a few "statisticians" who have a deep background in these algorithms, but they do not understand the DW dimensional model. They have a lot of access to all the DBs as well as SAS, and they search for patterns. Is there a bridge between data mining and data modeling? If so, I would be interested in learning about it. Any thoughts or comments are appreciated.
obiapps- Posts : 21
Join date : 2010-09-28
Re: predictive analytics suggested reading?
You need a good understanding of probability and statistics to understand predictive modeling. It is a mathematical discipline, not a data discipline.
I have implemented predictive models in data warehouses based on clickstream data (behavior on a web site). The company had a team of statisticians (PhDs) who crunched the data and developed a model. The model was basically a set of scores assigned to a visitor based on what page or type of page they viewed. You sum the scores, multiply by e (or apply as an exponent... I forget) and come up with a probability that indicates the visitor's likelihood of purchasing a product. This probability is then used to weight interest in such products to predict future sales.
So the DW implementation is fairly simple, depending on how simple a model the stats people come up with. The process to develop the model is all math.
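To make the scoring step concrete, here is a minimal sketch. It assumes the half-remembered "apply as an exponent" step was the logistic function, which is the usual way a summed score becomes a probability; the thread does not confirm that. The page types, score values, and intercept below are made up for illustration.

```python
import math

# Hypothetical per-page-type scores. In a real system these would be the
# values the statisticians fit, typically loaded from a lookup table in the DW.
PAGE_SCORES = {
    "home": 0.1,
    "product_detail": 0.8,
    "cart": 1.5,
    "checkout": 2.2,
}
INTERCEPT = -4.0  # hypothetical baseline score for a visit with no scored pages


def purchase_probability(page_views):
    """Sum the scores for the pages viewed and map the total to a probability."""
    total = INTERCEPT + sum(PAGE_SCORES.get(page, 0.0) for page in page_views)
    # Logistic function (assumed): maps any real-valued score into (0, 1).
    return 1.0 / (1.0 + math.exp(-total))


def expected_purchases(visits):
    """Weight each visit by its purchase probability to estimate purchases."""
    return sum(purchase_probability(pages) for pages in visits)
```

Calling `expected_purchases([["home"], ["product_detail", "cart", "checkout"]])` adds up the two visit probabilities, which is one simple way to turn per-visitor scores into a sales estimate.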
As far as 'data mining' itself goes, the term has been butchered over time by marketing people to the point where it has little meaning. It originally was a term applied to statistical analysis of data sets but has morphed into a much wider meaning that includes simple dashboards, graphic displays and reports.
Re: predictive analytics suggested reading?
hang wrote:...However data mining skills only sound good on your CV, but in practice, it is still a long way for it to become a mainstream technology. ...
Walk into a financial services company. Predictive modeling is everywhere.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: predictive analytics suggested reading?
Predictive Analytics is the current marketing term for the same kinds of statistical analyses that have been around, and in use behind the scenes, for decades. This is where products like SAS and SPSS got started.
The previous marketing term was data mining, but that didn't catch on so well. Regardless of the name, these techniques have definitely proven their value in certain industries, and with certain kinds of data. The two referenced in this thread are good examples. As Nick described, web-based activities are a great data mining opportunity because you can track all of your customers' behaviors: not only what they bought, but what they were shown, and what they clicked on along the way.
The dimensional model is relevant only in that it is probably part of a well-built DW/BI system, with nicely cleaned data, dimensions with lots of interesting attributes whose changes have been accurately tracked, and fact tables from multiple business processes that capture a long history of atomic-level events using shared, conformed dimensions.
There are lots of other opportunities to apply these algorithms in a useful way. One book that helped me learn more about what the various algorithms are, and how they can be applied to business scenarios, is Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management, 2nd Ed. by Berry and Linoff. It's not for the hard-core statistician, but it's a good general overview for folks getting started in the field.
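As a rough illustration of how a dimensional model can feed a mining tool, the sketch below flattens a tiny star schema into one row per customer, the shape most algorithms expect. All table and column names are hypothetical, and the rows are made-up sample data. It uses Python's built-in sqlite3 module only so that it runs anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, segment TEXT, region TEXT);
CREATE TABLE fact_page_view (customer_key INTEGER, page_type TEXT);
CREATE TABLE fact_order (customer_key INTEGER, amount REAL);
INSERT INTO dim_customer VALUES (1, 'retail', 'east'), (2, 'wholesale', 'west');
INSERT INTO fact_page_view VALUES (1, 'product'), (1, 'cart'), (2, 'product');
INSERT INTO fact_order VALUES (1, 25.0);
""")

# One row per customer: dimension attributes become input features, and each
# fact table is summarized separately through the conformed customer key.
# Correlated subqueries keep the two fact tables from multiplying each
# other's rows, which a direct join across both would do.
FEATURE_SQL = """
SELECT c.customer_key,
       c.segment,
       c.region,
       (SELECT COUNT(*) FROM fact_page_view v
         WHERE v.customer_key = c.customer_key) AS page_views,
       EXISTS (SELECT 1 FROM fact_order o
                WHERE o.customer_key = c.customer_key) AS purchased
FROM dim_customer c
ORDER BY c.customer_key
"""

training_rows = conn.execute(FEATURE_SQL).fetchall()
```

Each tuple in `training_rows` is one training example: the dimension attributes and the page-view count are candidate inputs, and `purchased` is the outcome to predict.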
--Warren
warrent- Posts : 41
Join date : 2008-08-18
Re: predictive analytics suggested reading?
Warren, thanks for the suggestion. Any idea if the predictive analytic concept has ever been applied to HR data? I have yet to find any reference to this.
obiapps- Posts : 21
Join date : 2010-09-28
Re: predictive analytics suggested reading?
Yes, sort of. I worked on a project with a large utility to build a database to support workforce planning. It took into account the hiring pipeline and all aspects of turnover to better plan staffing and placement of employees. Of particular interest was the staffing of line positions (i.e., people who repair electrical lines) and their geographic locations, so that appropriate manpower was available during seasonal crises. The model primarily relied on historical trends.
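A trend-based projection of this kind can be sketched very simply. The version below is only an assumption about what "relied on historical trends" might look like, not the actual model from that project: it averages past turnover rates and rolls headcount forward against planned hires. All numbers are invented.

```python
def average_rate(events, headcounts):
    """Average per-period rate: events divided by headcount at period start."""
    rates = [e / h for e, h in zip(events, headcounts)]
    return sum(rates) / len(rates)


def project_headcount(current, turnover_rate, planned_hires):
    """Roll headcount forward one period at a time."""
    projection = []
    for hires in planned_hires:
        # Lose the historical share of staff, then add the period's hires.
        current = current - current * turnover_rate + hires
        projection.append(round(current, 1))
    return projection


# Made-up history for one line-crew location (illustrative only).
headcounts = [100, 98, 97, 95]   # staff at the start of each past period
separations = [5, 4, 5, 4]       # staff who left during each past period

rate = average_rate(separations, headcounts)
forecast = project_headcount(current=94, turnover_rate=rate, planned_hires=[3, 3, 6])
```

Running the same projection per location and per job class would show where staffing is likely to fall short before a seasonal peak.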