Loading data without key
2 posters
Page 1 of 1
Loading data without key
For a customer of mine i'm loading messages in a datawarehouse. The messages don't have an id(?!). With this message i want to load some dimensions and the fact. Are there best/Good practices of doing this? Currently i'm thinking of giving these messages a self generated key. Load the data and compare afterwards if the load went ok. So the dimensions are using this key and the fact.
I'm not very happy with this solution. So i hope that someone gives me a better solution. O and we're talking about 100000 messages a day..
Thanks.
I'm not very happy with this solution. So i hope that someone gives me a better solution. O and we're talking about 100000 messages a day..
Thanks.
hennie7863- Posts : 31
Join date : 2009-10-19
Re: Loading data without key
The message is the key (sort of...). It really depends on what you want to do with the message. If all you need to do is store it with the facts and be able to spit it back out, the easiest thing to do is store it in a CLOB column in the fact table. What is nice about this is most database systems will place the CLOB in a separate data structure and place a pointer to it in the fact table. Essentially the same thing as creating a separate dimension table but without the extra work. If your query does not include the message you will not get a performance hit from the message.
If you want to include the ability to do efficient keyword searches on the message, then you are looking at a bit more work...
If you want to include the ability to do efficient keyword searches on the message, then you are looking at a bit more work...
Re: Loading data without key
Thanx for your reply. I want to extract some data and put it into a dimension, like the sender of the message. Also, i want to store the message for retrieval (showing on the screen for some reason). Yes i want to store the message and my plan was to store it in a dimension as a Varchar(MAX) in SQL Server 2008. My ETL proces is first loading the dimensions, store the surogate key with the application key in the lookup table and use this for loading the foreign keys in the fact. BUT i don't have a key....So i can't lookup for a surogate key. So the only solution i have so far is that i'll keep record of the surrogate key in autonumbering field and do a lookup.
Another problem is that i want to check whether i've already loaded the message. Normally, first i do a fast comparison based on a hash and to make sure i'll compare the fields. Then the record is inserted, updated or marked deleted. But in this case the size of the messages can be 10000 characters and after a couple of years the dimension table can be huge. The process will be performance killer over a couple of years. And i have three more types of messages and another increment of 3 messages. So the problem is multiplied to six.
Another problem is that i want to check whether i've already loaded the message. Normally, first i do a fast comparison based on a hash and to make sure i'll compare the fields. Then the record is inserted, updated or marked deleted. But in this case the size of the messages can be 10000 characters and after a couple of years the dimension table can be huge. The process will be performance killer over a couple of years. And i have three more types of messages and another increment of 3 messages. So the problem is multiplied to six.
hennie7863- Posts : 31
Join date : 2009-10-19
Similar topics
» Loading data in Data Warehouse
» Incremental Loading of data
» Loading Duration Data
» Loading data into fact table
» Loading Data Aggregated to Date into Fact Table
» Incremental Loading of data
» Loading Duration Data
» Loading data into fact table
» Loading Data Aggregated to Date into Fact Table
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum