Data Classification
Data classification is the process of applying confidentiality and privacy labels to information based on the adverse effect of unauthorized disclosure.
- classification and typing schemas tag data assets so that they can be managed through the information life cycle
- determines data governance and retention processes:
- data governance
- is a collection of processes detailing how data is collected and accessed during the data’s life cycle
- data retention
- is a collection of processes detailing how data is stored for a specified amount of time
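As a minimal sketch of how a classification label can drive a retention process, the Python below maps each label to a retention period and checks whether an asset has outlived it. The labels, the periods in RETENTION_DAYS, and the names DataAsset, retention_expiry, and is_expired are all hypothetical; real retention periods come from organizational policy or regulation.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods in days, keyed by classification label.
# Real values are set by policy or regulation, not by this sketch.
RETENTION_DAYS = {
    "public": 365,
    "confidential": 3 * 365,
    "private": 7 * 365,
}


@dataclass
class DataAsset:
    name: str
    label: str      # classification label applied to the asset
    created: date   # date the asset entered the life cycle


def retention_expiry(asset: DataAsset) -> date:
    """Return the last day the asset must be retained."""
    return asset.created + timedelta(days=RETENTION_DAYS[asset.label])


def is_expired(asset: DataAsset, today: date) -> bool:
    """True once the retention period for the asset's label has passed."""
    return today > retention_expiry(asset)
```

The point of the sketch is only that the label, once applied, becomes the key for later life cycle decisions such as retention and disposal.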
Data Classification Schema
A data classification schema is a decision tree for applying one or more tags or labels to each data asset.
- multiple kinds of classification schemas
- based on the degree of confidentiality required:
- Public (unclassified)
- no restrictions on viewing the data
- only presents a risk when availability or integrity is compromised
- may require authorization before release
- Confidential
- information is sensitive but can be declassified
- suitable for viewing only by personnel within the organization and possibly by trusted third parties under conditions such as NDAs
- does not necessarily include information requiring protection at the national security level
- Secret
- information that, if disclosed, could cause serious damage to national security
- restricted to individuals with a need to know
- Top Secret
- highest level of classification
- information whose unauthorized disclosure could cause exceptionally grave damage to national security
- extremely restricted and monitored
- based on the kind of information asset:
- Proprietary
- aka intellectual property (IP)
- is information created and owned by the company, typically about the products it makes or the services it performs
- Private/personal data
- information that relates to an individual's identity
- Sensitive
- label is usually used in the context of personal data
- privacy-sensitive information about a subject could harm them if made public and could prejudice decisions made about them if referred to by internal procedures
- as defined by the EU’s GDPR, sensitive personal data includes religious beliefs, political opinions, trade union membership, gender, sexual orientation, racial or ethnic origin, genetic data, and health information
- Restricted
- refers to sensitive information that requires stringent controls and limited access due to its highly confidential nature
- includes data that, if disclosed or accessed by unauthorized individuals, could cause significant harm to individuals, organizations, or national security
- based on impact level (defined in FIPS 199 and used for the control baselines in NIST SP 800-53B):
- low impact level
- is a data classification level indicating unauthorized disclosure causes a limited adverse effect
- moderate impact level
- is a data classification level indicating unauthorized disclosure causes a serious adverse effect
- high impact level
- is a data classification level indicating unauthorized disclosure causes a severe or catastrophic adverse effect
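The idea of a schema as a decision tree that applies one or more labels can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not a standard implementation: the DataAsset attributes, the branch order in classify, and the may_view helper are all hypothetical, and a real schema would ask many more questions. Need-to-know restrictions are not modeled.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Set, Tuple


class Confidentiality(IntEnum):
    """Confidentiality labels, ordered from least to most restrictive."""
    PUBLIC = 0
    CONFIDENTIAL = 1
    SECRET = 2
    TOP_SECRET = 3


@dataclass
class DataAsset:
    # Every attribute below is a hypothetical input to the decision tree.
    name: str
    national_security_damage: str = "none"  # "none", "serious", or "grave"
    internal_only: bool = False
    is_company_ip: bool = False
    is_personal: bool = False


def classify(asset: DataAsset) -> Tuple[Confidentiality, Set[str]]:
    """Walk a small decision tree and return (confidentiality, kind labels)."""
    # Branch 1: degree of confidentiality required.
    if asset.national_security_damage == "grave":
        level = Confidentiality.TOP_SECRET
    elif asset.national_security_damage == "serious":
        level = Confidentiality.SECRET
    elif asset.internal_only:
        level = Confidentiality.CONFIDENTIAL
    else:
        level = Confidentiality.PUBLIC

    # Branch 2: kind of information asset; an asset may carry several labels.
    kinds: Set[str] = set()
    if asset.is_company_ip:
        kinds.add("proprietary")
    if asset.is_personal:
        kinds.add("private")
    return level, kinds


def may_view(clearance: Confidentiality, level: Confidentiality) -> bool:
    """A viewer may see data at or below their own clearance."""
    return clearance >= level
```

For example, an asset flagged internal_only and is_company_ip would receive the CONFIDENTIAL level together with the "proprietary" kind label, which shows how one asset can carry labels from more than one schema at once.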