Column Encryption in the data warehouse
3 posters
Page 1 of 1
Column Encryption in the data warehouse
Hello everyone,
I'm new to dimensional modeling so please forgive me for these questions if they seem foolish to some.
In the OLTP there is a standard to keep data encrypted at rest. We have various methods to decrypt the data.
Lets say for example the field SSN is encrypted. SSN is needed in my OLAP system. Should the same security standard be applied to the OLAP system? Reason I ask is because there is a performance hit to decrypt data and if this is done against millions of records then there is an issue.
So generally speaking is data encrypted on the column level in a data warehouse?
If yes then what about performance concerns? If no then what about security concerns? or is there a better way to handle this?
I'm new to dimensional modeling so please forgive me for these questions if they seem foolish to some.
In the OLTP there is a standard to keep data encrypted at rest. We have various methods to decrypt the data.
Lets say for example the field SSN is encrypted. SSN is needed in my OLAP system. Should the same security standard be applied to the OLAP system? Reason I ask is because there is a performance hit to decrypt data and if this is done against millions of records then there is an issue.
So generally speaking is data encrypted on the column level in a data warehouse?
If yes then what about performance concerns? If no then what about security concerns? or is there a better way to handle this?
cmp66- Posts : 6
Join date : 2014-03-12
Re: Column Encryption in the data warehouse
As a general principle, security standards should be consistently applied but it's always useful going back to the reason why a security standard has been defined before making your life difficult by applying it to your DW. For example, if SSN is encrypted in your OLAP system partly because all your company employees (or 3rd parties) have access to your OLAP system - but only a limited group of employees have access to your reporting system, and they can be given a higher security clearance, then it may be this particular security standard need not apply to them.
Assuming that you do have to encrypt data then what you do in your DW probably needs to be considered on a case-by-case basis e.g.
if you need to query by a particular SSN "Show me all transactions in March for SSN = 123456789" then either you can't encrypt the SSN or you need to be able to encrypt your query prompt on the fly to match the encrypted value in the DW - which may or may not be possible depending on your BI tool, encryption process, etc.
The solution is definitely not to decrypt data in order to run a query (and obviously not to display data as if you can display it unencrypted there's no point in encrypting it in the first place) - as you say, decrypting millions of records impacts performance and the whole point of a DW is to deliver high performance results from your queries. Also, you probably want fields like SSN indexed if you try and run queries like the example I've given above - and I'm not aware of any way of indexing the unencrypted version of an encrypted column.
You may also be able to get to a solution by looking at the security options your DB offers. I'm not a DBA and this is not an area of expertise for me so apologies if this is not 100% accurate but I know in Oracle you can restrict access to specific objects (tables, columns, etc) using security roles-but I don't know if restricting visibility of an object means you also can't interact with it i.e. if you stop users being able to select the SSN column being returned in a query would that also stop you filtering by that column.
Probably nothing like a complete answer but hopefully this has given you something to think about and may prompt someone else to contribute their opinions as well
Assuming that you do have to encrypt data then what you do in your DW probably needs to be considered on a case-by-case basis e.g.
if you need to query by a particular SSN "Show me all transactions in March for SSN = 123456789" then either you can't encrypt the SSN or you need to be able to encrypt your query prompt on the fly to match the encrypted value in the DW - which may or may not be possible depending on your BI tool, encryption process, etc.
The solution is definitely not to decrypt data in order to run a query (and obviously not to display data as if you can display it unencrypted there's no point in encrypting it in the first place) - as you say, decrypting millions of records impacts performance and the whole point of a DW is to deliver high performance results from your queries. Also, you probably want fields like SSN indexed if you try and run queries like the example I've given above - and I'm not aware of any way of indexing the unencrypted version of an encrypted column.
You may also be able to get to a solution by looking at the security options your DB offers. I'm not a DBA and this is not an area of expertise for me so apologies if this is not 100% accurate but I know in Oracle you can restrict access to specific objects (tables, columns, etc) using security roles-but I don't know if restricting visibility of an object means you also can't interact with it i.e. if you stop users being able to select the SSN column being returned in a query would that also stop you filtering by that column.
Probably nothing like a complete answer but hopefully this has given you something to think about and may prompt someone else to contribute their opinions as well
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
Re: Column Encryption in the data warehouse
The other aspect of this is to question the need and use of sensitive data, such as SSN.
Unless you are the government, specifically the SSA or IRS, you should not have it in the DW. If someone needs to get the SSN of someone (presumably this is an HR application) they should do so from the operational system. There is no valid use of such data in an analytic environment.
Rather than leave your organization exposed to significant liability, if the only reason they want SSN is as an identifier, then the value should be obfuscated in a manner that it cannot be decrypted.
Unless you are the government, specifically the SSA or IRS, you should not have it in the DW. If someone needs to get the SSN of someone (presumably this is an HR application) they should do so from the operational system. There is no valid use of such data in an analytic environment.
Rather than leave your organization exposed to significant liability, if the only reason they want SSN is as an identifier, then the value should be obfuscated in a manner that it cannot be decrypted.
Similar topics
» Ebook The Data Warehouse Lifecycle Toolkit, 2nd Edition: Practical Techniques for Building Data Warehouse and Business Intelligence Systems
» data warehouse or not ? when is it okay to use OLAP without a data warehouse database
» data warehouse and data warehouse system
» difference between data mart and data warehouse at logical/physical level
» Managing column and data conformities
» data warehouse or not ? when is it okay to use OLAP without a data warehouse database
» data warehouse and data warehouse system
» difference between data mart and data warehouse at logical/physical level
» Managing column and data conformities
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum