Kimball Forum

Big data vs rdbms

3 posters


Big data vs rdbms

Post  dbadwh Sun Jan 26, 2014 4:24 am

Hi,
I am new to big data. Can anybody provide a comparison between an RDBMS and big data? In what way will big data change the DWH architecture and the way we model the database?

dbadwh

Posts : 31
Join date : 2011-09-30


Re: Big data vs rdbms

Post  ngalemmo Sun Jan 26, 2014 7:33 pm

I guess it depends on your definition of 'big data'.

There is the size distinction, which is primarily a hardware and physical modeling issue rather than a logical modeling challenge. Then there is the structured versus unstructured data distinction.

When it comes to unstructured data, it's a matter of deciding whether you want to structure it or leave it unstructured. Frankly, for analysis, I tend to lean towards imposing structure on the data so you can deal with it using more traditional SQL-based tools. For very high volumes, Hadoop or some other highly parallel framework is useful for handling the parsing and structuring processes before loading into an MPP database platform.
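
As a rough illustration of the "structure it first" approach, here is a minimal Python sketch that parses raw text lines into typed rows ready for loading. The log format, the field names, and the `parse_line` helper are all assumptions made up for this example. At very high volumes the same parsing logic would run inside a parallel framework rather than in a single script.

```python
import re
from datetime import datetime

# Hypothetical raw format: "<ISO timestamp> <user> <action> <free text>"
LINE_RE = re.compile(r"^(\S+)\s+(\S+)\s+(\S+)\s*(.*)$")

def parse_line(line):
    """Turn one raw line into a (timestamp, user, action, detail) row.

    Returns None when the line does not fit the assumed format.
    """
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts, user, action, detail = m.groups()
    try:
        when = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S")
    except ValueError:
        return None
    return (when, user, action, detail)

raw_lines = [
    "2014-01-26T04:24:00 alice login from mobile app",
    "2014-01-26T04:25:10 bob purchase order=1234 amount=19.99",
    "garbage",
]

# Keep the rows that parsed cleanly and count the ones that did not.
rows = [r for r in (parse_line(l) for l in raw_lines) if r is not None]
rejected = len(raw_lines) - len(rows)
```

Counting rejected lines separately is a deliberate choice: with unstructured input you usually want to know how much data failed to fit the structure you imposed.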

The problem with leaving it unstructured and using Hadoop for analytics is the labor required to do it. Essentially, a Hadoop process requires writing a bunch of Java code, which usually means IT needs to be heavily involved in coding whatever needs to get done. If you structure it and put it in a relational database, end users can be much more self-sufficient.
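
To make the self-sufficiency point concrete, here is a small sketch using Python's built-in `sqlite3` module as a stand-in for a real warehouse database. The `events` table, its columns, and the sample rows are hypothetical. Once the data is in a table, the question is answered with a few lines of plain SQL rather than a custom program.

```python
import sqlite3

# Hypothetical structured rows, e.g. the output of an upstream parsing job.
events = [
    ("2014-01-26", "alice", "login"),
    ("2014-01-26", "bob", "purchase"),
    ("2014-01-27", "alice", "purchase"),
]

conn = sqlite3.connect(":memory:")  # in-memory stand-in for a warehouse
conn.execute(
    "CREATE TABLE events (event_date TEXT, user_name TEXT, action TEXT)"
)
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", events)

# The kind of question an end user can answer with SQL alone.
query = """
    SELECT action, COUNT(*) AS n
    FROM events
    GROUP BY action
    ORDER BY action
"""
counts = conn.execute(query).fetchall()
conn.close()
```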

When dealing with an MPP platform, there are physical modeling considerations (distribution, organization, indexing, etc.) to achieve optimal performance. What you do varies widely depending on the particular database system.
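
To show why distribution matters, here is a generic Python sketch of hash distribution: each row is assigned to a node by hashing a distribution key. This is not any vendor's actual algorithm. The CRC32 hash, the node count, and the helper names are assumptions chosen only to illustrate how a low-cardinality key concentrates the data on a few nodes.

```python
import zlib
from collections import Counter

def node_for(key, n_nodes):
    """Pick a node by hashing the distribution key (CRC32 is illustrative only)."""
    return zlib.crc32(str(key).encode("utf-8")) % n_nodes

def rows_per_node(keys, n_nodes):
    """Count how many rows land on each node for a given list of key values."""
    counts = Counter(node_for(k, n_nodes) for k in keys)
    return [counts.get(i, 0) for i in range(n_nodes)]

N_NODES = 4  # hypothetical cluster size

# High-cardinality key (e.g. a customer id): many distinct hash values.
by_customer = rows_per_node(range(10000), N_NODES)

# Low-cardinality key (e.g. a Y/N flag): only two distinct hash values,
# so at most two nodes can hold any rows and the rest sit idle.
by_flag = rows_per_node(["Y", "N"] * 5000, N_NODES)
```

Printing `by_customer` and `by_flag` side by side shows the difference in balance. On a real system the choice of key also affects whether joins can be done locally on each node, which is platform specific.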

On the modeling side, choosing between a dimensional model and a 3NF model depends a lot on the peculiarities of the hardware you are using. For example, star schemas perform very well on a Netezza platform (aka IBM PureData) but not so well on a Teradata platform.
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com


Re: Big data vs rdbms

Post  rendybjunior Thu Oct 23, 2014 9:08 pm

Nowadays, more tools are coming into the picture to help end users use Hadoop without the additional effort of writing Java code, for example Hive, Presto, Tajo, etc. However, for low-latency queries, the performance is not as good as an RDBMS. If you have a huge amount of data and will do massive processing on it, Hadoop will be more appropriate.

rendybjunior

Posts : 7
Join date : 2014-09-30
