The why, how and how-nots of developing a data classification policy – The 2023 guide


On the heels of a record-breaking year for data breaches in 2021, hackers seem to be doubling down on their efforts. In the first quarter of 2022, data security incidents were up by 14% compared to the same period of the previous year, the Identity Theft Resource Center (ITRC) reports, which marks the third year in a row of first-quarter growth. According to ITRC President and CEO Eva Velasquez, historically this period has shown the lowest number of data compromises. As she explained: “The fact the number of breach events in Q1 represents a double-digit increase over the same time last year is another indicator that data compromises will continue to rise in 2022.”

Some 92% of the breaches were orchestrated through cyberattacks, with phishing and ransomware leading the pack. But what if the call – or in this case, the threat to a company’s data assets – is coming from inside the house?

Take 2016, when an “inadvertent” breach caused by a departing employee hit thousands of Federal Deposit Insurance Corp. (FDIC) customers. On her way out, the former agency staffer accidentally took a storage device containing the sensitive information of over 44,000 individuals, including names, addresses and Social Security numbers. Was she authorized to access the data while working at the FDIC? Yes. Was the information shared or misused by anyone? Apparently not. But what does it say about an organization when an employee can walk out the door with sensitive customer data without anyone noticing?

That it doesn’t have the right data classification framework in place, for one thing. In this article, we’ll dig deep into the whats, whys, hows and how-nots of developing and implementing data classification policies.

What is information classification and why does it matter?

Data classification means categorizing information based on common characteristics, such as type or sensitivity, to make it easier to find, use, and protect. But it’s only half the story. To make sure that employees and security teams are on the same page about who does what and why to safeguard sensitive data, a formalized information classification policy is a must.

The primary purpose of classifying data is to ensure that each piece of information within an organization is handled according to the risk it poses to the business if stolen, modified, or destroyed. If done right, it can save money, lower risk exposure, boost efficiency, and help companies stay compliant with data protection rules, industry or otherwise.

What should go into a data classification policy?

As data classification policies are tailored to businesses’ data management needs and protocols, no two frameworks will look the same. However, there are a few questions that every security team needs to answer when developing a data classification program: what type of data the company collects, what level of sensitivity it should be assigned, who owns it and who can access it, and what laws or industry standards govern its handling. At a minimum, a well-rounded data classification policy should touch upon the following topics.

  • Purpose: give a high-level overview of why your organization needs a data classification policy, outlining the key functions, objectives, and benefits.
  • Scope: describe the types of data that should be classified and the information systems as well as the people (e.g. employees, third-party vendors) the policy applies to.
  • Roles and responsibilities: clearly outline who will be in charge of creating, implementing, updating, and enforcing the data classification policy, and who will educate stakeholders about it.
  • Data classification process: break down how data classification will be carried out within your organization, with a special focus on how data will be assessed for sensitivity.
  • Data classification guideline: explain what categories data assets will be classified into (e.g. confidential, internal, etc.) and list the specific types of data that fall under each category.

    The name game: the types of data classification, explained

    The terminologies used to describe data classification levels might seem confusing at first. Some companies use three, some four, and some even more categories, especially those in the healthcare and financial services space. Some use labels such as public, controlled, restricted, and confidential, while others go for public, private, internal, confidential, and restricted. Still others sort high-risk data into categories including sensitive, critical, and classified.

    Here’s a rule of thumb to help you decide: the exact terms don’t matter all that much as long as they accurately distinguish and describe the various data sensitivity levels within your organization. These are the most widely used ones, including some common data classification examples:

    • Public: any information that’s intended for public access, including marketing collateral or website content, and can’t cause harm if leaked.
    • Personal: personal data, like social security numbers and health information, the exposure of which can bring about severe legal and financial repercussions.
    • Confidential: sensitive information, such as payroll details or vendor contracts, that can put a dent in your reputation or revenues if made public.
    • Internal: company data that’s critical for business operations, including company-wide guidelines or policies, with a mid-level risk profile.
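To make the idea of a classification guideline concrete, here is a minimal Python sketch of a lookup table built from the examples above. The ordering of the levels, the names `Level`, `DATA_TYPE_LEVELS`, `classify`, and `highest_level`, and the cautious default for unknown data types are all illustrative assumptions, not part of any standard.

```python
from enum import IntEnum


class Level(IntEnum):
    """Sensitivity levels, least to most sensitive (the ordering is an assumption)."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    PERSONAL = 3


# Hypothetical guideline table, filled in with the examples from the list above.
DATA_TYPE_LEVELS = {
    "marketing collateral": Level.PUBLIC,
    "website content": Level.PUBLIC,
    "company-wide guidelines": Level.INTERNAL,
    "payroll details": Level.CONFIDENTIAL,
    "vendor contracts": Level.CONFIDENTIAL,
    "social security numbers": Level.PERSONAL,
    "health information": Level.PERSONAL,
}


def classify(data_type: str, default: Level = Level.CONFIDENTIAL) -> Level:
    """Look up a data type; unknown types fall back to a cautious default."""
    return DATA_TYPE_LEVELS.get(data_type.strip().lower(), default)


def highest_level(data_types: list[str]) -> Level:
    """A document inherits the most sensitive level of anything it contains."""
    return max((classify(t) for t in data_types), default=Level.PUBLIC)
```

For instance, `highest_level(["website content", "health information"])` returns `Level.PERSONAL`, because one sensitive element is enough to raise the level of the whole document.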

    That said, region- or industry-specific regulations – think the EU’s General Data Protection Regulation (GDPR), the Health Insurance Portability and Accountability Act (HIPAA) in the US or the Payment Card Industry Data Security Standard (PCI DSS) that applies to any organization that accepts, transmits, or stores cardholder data – might require you to rethink or further refine your classification scheme.

    Under the GDPR, for example, personal data is any information related to an identified or identifiable natural person, such as name and surname; home address; ID card number; the location data on a mobile phone; IP address; cookie ID; or advertising identifier. On top of that, a special category of personal data, sensitive personal data, is subject to specific processing and protection measures.

    Examples of the latter category include trade union membership; biometric data, such as face, voice, palm, retina, or ear shape recognition; health data, such as medical history or fitness tracker information; genetic data, such as DNA and RNA; and any information revealing someone’s racial or ethnic origin, political opinions, religious or philosophical beliefs, or sexual orientation.

    PCI DSS requires covered entities to build and maintain a secure network and systems to protect cardholder data and use methods such as encryption, truncation, masking, or hashing to render it unreadable to intruders. The account data in scope includes cardholder data (primary account number, cardholder name, expiration date, and service code) and sensitive authentication data (full track data, meaning magnetic-stripe data or its equivalent on a chip; CAV2, CVC2, CVV2, and CID codes; and PINs and PIN blocks).
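As a rough illustration of two of the methods named above, the Python sketch below masks a primary account number for display and derives a keyed hash that can be compared without keeping the number itself. The function names, the 13- to 19-digit length check, and the choice of HMAC-SHA-256 are assumptions made for this example, not PCI DSS wording; a real implementation should follow the current version of the standard and handle key management properly.

```python
import hashlib
import hmac


def _digits(pan: str) -> str:
    """Strip spaces and dashes; the length check is an illustrative assumption."""
    digits = "".join(ch for ch in pan if ch.isascii() and ch.isdigit())
    if not 13 <= len(digits) <= 19:
        raise ValueError("expected a 13- to 19-digit account number")
    return digits


def mask_pan(pan: str, visible: int = 4) -> str:
    """Masking: hide everything except the last `visible` digits when displaying."""
    digits = _digits(pan)
    cut = len(digits) - visible
    return "*" * cut + digits[cut:]


def pan_fingerprint(pan: str, key: bytes) -> str:
    """Keyed hashing: a one-way value that still allows equality checks."""
    return hmac.new(key, _digits(pan).encode("ascii"), hashlib.sha256).hexdigest()
```

Masking only changes what is shown on screen or in a report, while the keyed hash replaces the stored value; the two serve different purposes and are often used side by side.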

    What are the benefits of a data classification policy for any organization?

    You can’t protect what you can’t see

    The most obvious plus of a well-thought-out data classification policy is lower organizational exposure to data compromise as well as the disruption and reputational damage that usually come with it. This is in no small part because such a policy gives you a better understanding and a full view of your data landscape.

    Go beyond ticking compliance boxes

    Not only can a data classification policy boost immunity against hefty penalties for non-compliance, but it can also help you build a culture of data security awareness within your organization. If detailed but easy to grasp, it “demystifies” data management for employees by providing clear guidance on their data handling obligations.

    Cut unnecessary security costs

    According to Gartner, global cybersecurity spending in 2021 reached around $150 billion, up more than 12% from 2020. How much of this is spent on safeguarding data that needs no safeguarding at all is unclear, but one thing’s for sure: proper data classification will show you where you’ve been overspending or underspending on security controls.

    It’s time to make your data work for you

    More than 80% of enterprise data is unstructured. In other words, it’s tucked away in email messages, videos, photos, webpages, and audio files across organizations’ information systems. Data classification can help you unearth these hidden data assets and turn them into searchable, manageable, and actionable insights.
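One simple way to start surfacing sensitive items in unstructured text is pattern matching. The Python sketch below is only an assumption-laden illustration: the `PATTERNS` table and the `find_sensitive` helper are hypothetical names, the two regular expressions are deliberately simplistic, and real discovery tools combine many more detectors with validation and context checks.

```python
import re

# Hypothetical, deliberately simple detectors; real rules need far more care.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_sensitive(text: str) -> dict[str, list[str]]:
    """Return every match per detector, leaving out detectors with no hits."""
    hits = {name: pattern.findall(text) for name, pattern in PATTERNS.items()}
    return {name: found for name, found in hits.items() if found}
```

A file that produces any hits could then be queued for review and assigned a classification level, rather than being labeled automatically on the strength of a regular expression alone.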

    Best practices and pitfalls in creating a data classification policy

    1. Mind the law (above everything else)
       Should your data classification policy be fully tailored to your business’s profile, workflows, goals, and needs? Absolutely. But a deep dive into relevant laws and regulations should be step zero of your data classification process to make sure it’s compliant first and customized second.

    2. Use a top-down approach
       Data classification can get overwhelming fast. So it’s best to think about which gaps in your data handling practices pose the biggest threat to your organization and take it from there. Once you’ve ironed out the high-impact areas, you can continue to patch up other vulnerabilities one step at a time.

    3. Make it detailed but easy to interpret
       The only thing that overcomplicated data classification policies (or policies of any kind) drive employees to do is look for workarounds. Restrict details to a need-to-know basis and use language that non-technical stakeholders can understand without any problem.

    4. Make staff education a priority
       The weakest link in any organization’s security efforts is, and will always be, people. This is why educating staff on the classification levels of the data they handle in their everyday jobs and what it means in terms of enterprise information security should be a top-of-agenda issue.

    5. Keep the policy up to date at all times
       Pencil in an annual review to make sure that your data classification policy accurately reflects any changes inside and outside the organization, whether it’s the implementation of a new information system or the introduction of new regulatory requirements.