Wednesday, September 27, 2017

A Data Classification Project

In another post, I mentioned that organizations without a data classification standard and an associated policy will have a difficult time implementing many information security controls, such as DLP and rights management.  Data classification can also help with DR optimization and justifying spend on technology, and it may be a cyber-insurance requirement.  Since many security projects rely on proper classification of data, we have seen an uptick in requests from clients for help in this area.

What does a Data Classification Project look like?

A Data Classification Project assesses an organization’s digital assets to determine their criticality, sensitivity, and privacy requirements, and to define a naming taxonomy for categorizing the data.

A typical project consists of a series of interviews, research, and tools-based discovery to establish the following regarding the data:
·         Location
·         Criticality to the organization (using the typical CIA Triad)
·         Regulations relevant to the data

While a full-blown Data Classification exercise could fill a book, I'll attempt to condense it in this post.

Location of Data and its Risk

Interviews with business unit managers and technical staff are coupled with a tools-based discovery effort to identify all of the repositories containing data.  Discovery starts with physical data containers such as hard drives, SANs, NAS devices, and cloud providers, and ultimately drills down to the files, databases, and applications they hold.  Once the data locations are mapped, we start grouping the data by the risk it poses to the organization.
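As a rough sketch of what the resulting inventory might look like, the snippet below models each discovered data element as a simple record carrying its location, CIA impact ratings, and applicable regulations.  The field names, repositories, and regulations shown are hypothetical placeholders, not the output of any particular discovery tool:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DataElement:
    """One discovered data element and the attributes gathered during interviews and discovery."""
    name: str                 # e.g., "HR payroll database"
    location: str             # repository it lives in: file share, SAN volume, SaaS application, etc.
    confidentiality: int      # impact of leakage, theft, or disclosure, rated 1-10
    integrity: int            # impact of compromise or manipulation, rated 1-10
    availability: int         # impact of system failure, comms loss, or deletion, rated 1-10
    regulations: List[str] = field(default_factory=list)  # e.g., ["HIPAA"] or ["PCI DSS"]

# Hypothetical entries matching the three data elements scored in the table below
inventory = [
    DataElement("Data Element 1", r"\\fileserver\finance", 5, 5, 2, ["SOX"]),
    DataElement("Data Element 2", "CRM (SaaS provider)", 2, 8, 6),
    DataElement("Data Element 3", "EHR database", 9, 8, 5, ["HIPAA"]),
]
```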

The following table illustrates an example of the risk each discovered data element poses to the organization and how a risk score might be calculated:

                                                      Data Element 1   Data Element 2   Data Element 3
Confidentiality (data leakage, theft, disclosure)           5                2                9
Integrity (data compromise, manipulation)                   5                8                8
Availability (failure of system, comms, deletion)           2                6                5
Risk Score                                                  4                5                7
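The Risk Score row above is consistent with a rounded average of the three CIA ratings (for example, (5 + 5 + 2) / 3 = 4).  Continuing the sketch from the inventory example, here is one way that calculation might look; equal weighting of the three factors is an assumption, and an organization could just as easily weight them differently or take the maximum:

```python
def risk_score(element: DataElement) -> int:
    """Rounded average of the three CIA ratings; equal weighting is an assumption."""
    return round((element.confidentiality + element.integrity + element.availability) / 3)

for element in inventory:
    print(f"{element.name}: risk score {risk_score(element)}")
# Data Element 1: risk score 4
# Data Element 2: risk score 5
# Data Element 3: risk score 7
```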

Data Labels and Classification


Once the data is identified and ranked, a naming taxonomy needs to be decided upon.  Data labels are then used to mark the data so the appropriate controls can be applied, whether automatic or manual, technical or otherwise.  Most people have at least heard of one of the federal government’s data labels, “Top Secret”: a very high level of sensitivity, viewable only by individuals holding a “Top Secret” or higher clearance.  The “Top Secret” designation is the data label applied to documents, emails, and so on, and the classification is the understanding of what information falls into that category.  While regulated entities such as healthcare and banking organizations already have data protections defined for them, an organization will still need to come up with its own labeling scheme.  Carnegie Mellon University provides a great example of how its data is labeled and classified [1]:
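To make a taxonomy actionable, it helps to pair each label with the handling controls it implies.  The mapping below is purely illustrative: the label names mirror the CMU scheme quoted next, and the control lists are placeholders an organization would replace with its own requirements:

```python
# Hypothetical mapping of data labels to the handling controls they imply.
# Label names mirror the CMU scheme quoted below; the control lists are placeholders.
HANDLING_REQUIREMENTS = {
    "Restricted": ["encrypt at rest and in transit", "need-to-know access", "DLP monitoring"],
    "Private": ["authentication required", "internal access only"],
    "Public": ["integrity controls to prevent unauthorized modification"],
}

def controls_for(label: str) -> list:
    """Look up the handling controls implied by a data label, defaulting to Private."""
    return HANDLING_REQUIREMENTS.get(label, HANDLING_REQUIREMENTS["Private"])
```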

Restricted Data - Data should be classified as Restricted when the unauthorized disclosure, alteration or destruction of that data could cause a significant level of risk to the University or its affiliates.  Examples of Restricted data include data protected by state or federal privacy regulations and data protected by confidentiality agreements.  The highest level of security controls should be applied to Restricted data.

Private Data - Data should be classified as Private when the unauthorized disclosure, alteration or destruction of that data could result in a moderate level of risk to the University or its affiliates.  By default, all Institutional Data that is not explicitly classified as Restricted or Public data should be treated as Private data.  A reasonable level of security controls should be applied to Private data.

Public Data - Data should be classified as Public when the unauthorized disclosure, alteration or destruction of that data would result in little or no risk to the University and its affiliates.  Examples of Public data include press releases, course information and research publications.  While little or no controls are required to protect the confidentiality of Public data, some level of control is required to prevent unauthorized modification or destruction of Public data.
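Tying the two halves of the exercise together, a simple rule can map the risk scores from the earlier table onto labels like these.  The thresholds below are hypothetical and would be set by each organization; they are not part of the CMU guideline:

```python
def classify(score: int) -> str:
    """Map a risk score (1-10) to a data label; the thresholds are hypothetical."""
    if score >= 7:
        return "Restricted"
    if score >= 4:
        return "Private"
    return "Public"

for element in inventory:
    print(f"{element.name} -> {classify(risk_score(element))}")
# Data Element 1 -> Private
# Data Element 2 -> Private
# Data Element 3 -> Restricted
```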

Once a project like this is completed, it becomes much easier to implement effective protective controls on the appropriate data.  I usually find the exercise also sheds light on the sheer expanse of data an organization maintains.  When management is given visibility into the amount of data, backed up by numbers showing the risk that data poses to the organization, informed decisions can be made and budgets created to ensure the data is protected appropriately.

[1] http://www.cmu.edu/iso/governance/guidelines/data-classification.html