Social Media Data Model
4 posters
Hi All,
I'm looking to design a dimensional model for managing social media data, e.g. Twitter, blogs, etc. Has anyone ever worked with this? What would be the facts and dimensions? I have so many designs going on in my head that I'm looking for some direction :-).
Thanks in advance,
Krystal
kclark- Posts : 70
Join date : 2010-08-13
Re: Social Media Data Model
I would start by going back to the business and getting requirements on what they want to get out of this social media data. What are the ultimate goals, or what problems are they trying to solve that this data might help with? Once you know that, you can model facts and dimensions more easily.
TheNJDevil- Posts : 68
Join date : 2011-03-01
Re: Social Media Data Model
Thanks NJ,
Unfortunately, this is new, so the business is looking for the basics. What is a common model that is used? That way I can talk my way through this... I'm looking to get an idea of what a model others have used looks like. Offhand, I can say there is an interest in seeing which tools/social media are used most and in monitoring customer responses on Products. Does that help any?
Thanks,
Krystal
kclark- Posts : 70
Join date : 2010-08-13
Re: Social Media Data Model
It boils down to the basics: who, what, where, when, and why. The challenge is processing the data so you can answer those questions.
When is easy, and where may not be a concern. Who starts getting messy, particularly if you are trying to relate something like a Twitter handle to a customer. This requires creative strategies to get such people to identify themselves to you, such as getting them to link their Twitter, Facebook, and email (and whatever else) accounts to their loyalty account. At the start, just capturing the handle will give you a sense of frequency and of the general perceptions of the population. You can get more sophisticated later.
What and why are the most difficult due to the ambiguity of the source. There are a variety of natural language processing strategies to try to get meaning from such data. Such processes attach attributes to the message: a ranking of how positive or negative it is, the products mentioned, and so forth. Hadoop or similar technologies are often used for this process due to the volumes involved.
Finally, you can go as far as capturing all the text and building a keyword structure around it. However, such data grows at a significant rate and its value degrades very quickly over time. Most people just store the attributes from the NL process, and maybe keep the raw data for a short period of time.
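To make the who/what/when/why breakdown above concrete, here is a minimal Python sketch of what one processed message might look like once the NL step has attached its attributes. All of the names (`Mention`, `sentiment`, `expire_raw_text`), the sentiment scale, and the 30-day retention window are assumptions for illustration, not something taken from a particular tool.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta
from typing import Optional, Tuple


@dataclass(frozen=True)
class Mention:
    """One social media message after NL processing (hypothetical layout)."""
    handle: str                # who: the account handle as captured
    source: str                # which platform the message came from
    posted_at: datetime        # when
    products: Tuple[str, ...]  # what: products the NL step detected
    sentiment: float           # assumed scale, -1.0 (negative) to 1.0 (positive)
    raw_text: Optional[str]    # original text, kept only for a short time


# Assumed retention window for the raw text; pick whatever the business needs.
RAW_TEXT_RETENTION = timedelta(days=30)


def expire_raw_text(mention: Mention, now: datetime) -> Mention:
    """Drop the raw text once it is older than the retention window.

    The derived attributes (products, sentiment) are always kept.
    """
    if mention.raw_text is not None and now - mention.posted_at > RAW_TEXT_RETENTION:
        return replace(mention, raw_text=None)
    return mention
```

The point of the split is that the small, structured attributes are what ends up in the warehouse, while the bulky text can be dropped on a schedule.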
Re: Social Media Data Model
Wow! That's a lot to consider. So hmmm, let's see. Yes, that was my first thought, associating users to actual customers. That sounds like it would take a whole lot of participation from the users. Putting that aside, my thoughts are more along the lines of summaries. So 1) Summary of usage by source (Facebook - 21K, Twitter - 100K), 2) Summary of ratings by Product (Tylenol 4; Aspirin 2), and 3) Summary of product preference by company.
Totally agree on the volume
What about that?
kclark- Posts : 70
Join date : 2010-08-13
Re: Social Media Data Model
Hahaha! Gee thanks. :-) Specifically speaking, I have designed this:
Dimensions:
Users
Source
Product
Date
Facts:
Source Utilization
Blogs
Communities (e.g. Facebook)
Videos (e.g. YouTube)
Movements (e.g. Twitter)
Is this too granular? Or should I have more of the summary tables I mentioned above?
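One way to look at the granular-versus-summary question is that a single detail-level fact table can produce the summaries on demand. The sketch below uses Python's built-in `sqlite3` module with the four dimensions listed above and one fact table at the grain of one row per mention. The table and column names, the `source_type` attribute (which holds Blog/Community/Video/Movement as a property of the source rather than as separate facts), and the sample rows are all assumptions made up for illustration.

```python
import sqlite3

# Hypothetical star schema: four dimensions and one mention-grain fact table.
DDL = """
CREATE TABLE dim_user    (user_key INTEGER PRIMARY KEY, handle TEXT);
CREATE TABLE dim_source  (source_key INTEGER PRIMARY KEY, source_name TEXT, source_type TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, calendar_date TEXT);
CREATE TABLE fact_mention (
    date_key    INTEGER REFERENCES dim_date(date_key),
    user_key    INTEGER REFERENCES dim_user(user_key),
    source_key  INTEGER REFERENCES dim_source(source_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    rating      INTEGER
);
"""


def build_demo() -> sqlite3.Connection:
    """Create the schema in memory and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    conn.executemany("INSERT INTO dim_user VALUES (?, ?)",
                     [(1, "@user_a"), (2, "@user_b")])
    conn.executemany("INSERT INTO dim_source VALUES (?, ?, ?)",
                     [(1, "Facebook", "Community"), (2, "Twitter", "Movement")])
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(1, "Tylenol"), (2, "Aspirin")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?)",
                     [(20120101, "2012-01-01"), (20120102, "2012-01-02")])
    conn.executemany("INSERT INTO fact_mention VALUES (?, ?, ?, ?, ?)",
                     [(20120101, 1, 1, 1, 4),
                      (20120101, 2, 2, 1, 4),
                      (20120102, 1, 2, 2, 2)])
    return conn


def usage_by_source(conn: sqlite3.Connection):
    """Summary 1: number of mentions per source."""
    return conn.execute(
        "SELECT s.source_name, COUNT(*) FROM fact_mention f "
        "JOIN dim_source s USING (source_key) "
        "GROUP BY s.source_name ORDER BY s.source_name").fetchall()


def avg_rating_by_product(conn: sqlite3.Connection):
    """Summary 2: average rating per product."""
    return conn.execute(
        "SELECT p.product_name, AVG(f.rating) FROM fact_mention f "
        "JOIN dim_product p USING (product_key) "
        "GROUP BY p.product_name ORDER BY p.product_name").fetchall()
```

With this layout the summaries are just GROUP BY queries over the detail, so separate summary tables become a performance or retention decision rather than a modeling one.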
kclark- Posts : 70
Join date : 2010-08-13
Re: Social Media Data Model
I'm not trying to blow you off, but honestly it's a pretty broad area and it's really hard to comment without spending time looking at what you are collecting from where. As far as summary versus detail goes, I always prefer detail as it gives you opportunities that may be lost in summaries. One option is to collect detail for the short term (with frequent historical purges) and maintain summaries for the long term. But if you're not keeping the raw text, the detailed attributes of the text don't take a lot of space.
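The short-term detail, long-term summary option can be sketched in a few lines of Python. This is only an illustration under assumed names: `monthly_summary`, `purge_detail`, the `(date, source)` row shape, and the 30-day retention in the usage note are hypothetical choices, and a real warehouse would do this in the database or ETL tool.

```python
from collections import Counter
from datetime import date, timedelta
from typing import List, Tuple

DetailRow = Tuple[date, str]  # (posting date, source name), assumed shape


def monthly_summary(detail: List[DetailRow]) -> Counter:
    """Count mentions per (year-month, source); this is what is kept long term."""
    return Counter((day.strftime("%Y-%m"), source) for day, source in detail)


def purge_detail(detail: List[DetailRow], today: date, keep_days: int) -> List[DetailRow]:
    """Return only the detail rows that are still inside the retention window."""
    cutoff = today - timedelta(days=keep_days)
    return [row for row in detail if row[0] >= cutoff]
```

In use, the summary is built (or incrementally updated) before each purge, for example `summary = monthly_summary(detail)` followed by `detail = purge_detail(detail, date.today(), keep_days=30)`, so nothing is lost from the long-term counts when old detail is dropped.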
Re: Social Media Data Model
Hi - at the risk of moving this discussion backwards, I would return to TheNJDevil's comment about getting the business to articulate their requirements. One of the major reasons why so many IT projects fail (to a greater or lesser extent, and not just DW projects) is because they are IT-driven and not business-driven.
If the business cannot articulate their reporting requirements then you should put the project on hold until they can - why is the business spending time and money on building a reporting system if they have no reporting requirements?
You can obviously help/guide them towards being able to articulate their requirements - and if what you are proposing here is a quick and dirty prototype to help the process then that's fine; but in that case I wouldn't worry about granularity too much at this time.
Regards,
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
Re: Social Media Data Model
Thanks,
Nick - Yes it is a quick and dirty prototype just to get the business to visualize the possibilities. Then we will get into the requirements.
NG - Thanks for some more feedback. Going back to your comment regarding Hadoop/data processing: so Hadoop, Cassandra, etc. are typically used for ETL and storage of big data? Something like DataStage or Informatica wouldn't be able to process the enormous volume of data, correct?
kclark- Posts : 70
Join date : 2010-08-13
Re: Social Media Data Model
Krystal, seek out whitepapers to give you clearer examples of what has been done and what kind of effort it took to get those projects done. You are in the same boat I was in 2 years ago: build it, and they will then see what can be done, and that will spark better ideas to build on. The problem with that is that you do it at little to no cost, and if it is successful, management then expects you to deliver such successes at little to no cost.
Check out whitepapers that cover similar topics to what you are trying to do (social media analysis shouldn't be difficult to find). Those would be better than a quick and dirty prototype of unknown purpose or outcome. To find success, you have to first define what will make this a successful project, then work toward that. "Build it and they will come" is just a path to frustration in the long run.
TheNJDevil- Posts : 68
Join date : 2011-03-01
Re: Social Media Data Model
NJ,
Totally AGREE! Behind closed doors we say the SAME thing. ARGH! This initiative is very conceptual and the objective is just generic entity models and recommendations. I'm not even doing a prototype or POC; just because the effort and time allotted don't align. I'm just looking for a quick word of direction. But if the advice is whitepapers...I may end up having to shadow my way through this part, since I'm really short on time to read. :-(
Thanks again for your help!
Krystal
kclark- Posts : 70
Join date : 2010-08-13
Re: Social Media Data Model
One of the ways you can use Hadoop is as a transformation engine to prep data for loading into a DW. Informatica and DataStage can do the same thing. The decision to use Hadoop for such a task is driven by data volume above everything else. If you have reasonable volumes that can be handled with your existing toolset, by all means, use it.