Dark data refers to information collected during regular business operations that is not promptly or fully used for its intended purpose. This forgotten data remains within the company’s data ecosystem, creating potential problems.
According to a Data Dynamics blog, dark data originates from employees, customers, and business processes. The blog reports that 60 to 85% of unstructured data in shared storage setups is dark and belongs to ex-employees, leaving it inaccessible to others for years. Identifying where dark data resides and what it contains is the first step in proper data management, because it allows organizations to protect sensitive information and delete unnecessary data.
The Costs of Dark Data
Storing vast amounts of unused data comes at a significant cost, affecting storage budgets, regulatory compliance, and security exposure. Here’s a breakdown of these costs:
- Data Breaches: When organizations lack complete visibility into where data resides and its content, accidental copies on unsecured devices expose the data to security threats. This can lead to reputational damage and fines for violating data breach regulations.
- Data Regulation: Regulations require companies to collect and store customer data for compliance. While most of this data may never be used, organizations must adhere to strict security protocols to avoid fines.
- Data Storage: Storing large volumes of unused data creates a significant financial burden. Organizations spend vast amounts of money simply housing dark data, draining resources from other areas.
- Data R.O.T. (Redundant, Obsolete, and Trivial): Dark data can be inaccurate and misleading. Analysis built on redundant, obsolete, or trivial datasets wastes productivity.
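As an illustration of the "redundant" part of R.O.T., the following is a minimal Python sketch that groups byte-identical files by content hash. It is not taken from any of the reports cited here; the function names `file_digest` and `find_duplicate_files` are hypothetical, and the sketch assumes the data sits on a file system the script can read.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_files(root: str) -> list[list[Path]]:
    """Group files under `root` that have byte-identical content."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    # Keep only groups with more than one file: these are redundant copies.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]


if __name__ == "__main__":
    # Hypothetical usage: scan the current directory and list redundant copies.
    for group in find_duplicate_files("."):
        print(", ".join(str(p) for p in group))
```

Hashing every file is simple but slow on large shares; a common refinement is to compare file sizes first and hash only files whose sizes match.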
Mitigating the Risks of Dark Data
A report by BigID reveals that 84% of organizations are highly concerned about dark data. Chief information security officers (CISOs) and legal/compliance teams often implement information governance (IG) and security programs to address dark data. These programs might include defensible deletion, data migration, and data audits across unstructured data sets to identify risks and improve compliance.
However, achieving these goals requires effective and scalable technology platforms. Here are some key challenges in mitigating dark data risks:
- Governance: CISOs typically manage perimeter security, encryption, and other priorities like zero-day attacks and malware. While crucial, these measures often leave distributed dark data as a blind spot. This lack of visibility into data content creates challenges, especially with data privacy regulations like GDPR and CCPA.
To address this, CISOs need to implement scalable solutions. An effective approach involves a dedicated IG/compliance team equipped with technology that empowers them to collaborate with data owners and make informed decisions.
- Data Storage: Data discovery involves gaining complete visibility into an organization’s data environment. Companies can utilize various data analytics tools and algorithms to identify valuable data within vast amounts of unstructured data.
The availability of affordable cloud storage has made data hoarding easier than ever. Organizations should proactively update their data retention policies to avoid unnecessary compliance and security risks. This is particularly important for non-critical data, where extended storage exposes the company to potential risks. Remember, storing data comes at a cost, so organizations need to be mindful of the data they accumulate.
Another crucial step is data classification using a dedicated engine. This process helps businesses determine the value, ownership, security needs, and potential risks associated with specific data points within dark data. Classification helps organizations understand the nature of their dark data.
- Effective Data Use: CISOs ensure timely and effective data utilization. Investing in tools that enable data querying across storage locations and data migration to centralized platforms can help achieve this goal.
Companies can leverage data analysis tools to access and visualize data stored across various platforms, eliminating the need for redundant storage. This streamlines data management and simplifies access for authorized personnel. Additionally, consolidating data storage reduces the number of data stores requiring management.
Policies that optimize data location further enhance data management. These policies can involve identifying and isolating high-risk dark data, archiving less frequently used data, and prioritizing critical data by storing it on the most secure platforms.
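The classification and placement ideas above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not how any particular classification engine works: the two regular expressions, the 365-day `RETENTION_DAYS` threshold, the action names, and the use of file modification time as a proxy for "unused" are all hypothetical choices made for the example.

```python
import re
import time
from pathlib import Path
from typing import Optional

# Hypothetical patterns; a real classification engine would use far richer rules.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

RETENTION_DAYS = 365  # assumed policy threshold, not taken from the article


def classify_text(text: str) -> set[str]:
    """Return the label of every sensitive pattern found in the text."""
    return {label for label, pattern in SENSITIVE_PATTERNS.items() if pattern.search(text)}


def recommend_action(age_days: float, labels: set[str],
                     retention_days: int = RETENTION_DAYS) -> str:
    """Map a file's age and sensitivity labels to a handling recommendation."""
    stale = age_days > retention_days
    if labels and stale:
        return "isolate"  # high-risk dark data: sensitive and unused
    if stale:
        return "archive"  # rarely used, no sensitive pattern detected
    if labels:
        return "secure"   # still in use but sensitive: keep on a hardened platform
    return "keep"


def review_file(path: Path, now: Optional[float] = None) -> tuple[set[str], str]:
    """Classify one text file and recommend an action based on its modification time."""
    now = time.time() if now is None else now
    age_days = (now - path.stat().st_mtime) / 86400
    text = path.read_text(encoding="utf-8", errors="ignore")
    labels = classify_text(text)
    return labels, recommend_action(age_days, labels)
```

Calling `review_file(Path("notes.txt"))` on a hypothetical file returns the detected labels and one of the four actions. Separating `classify_text` from `recommend_action` keeps the policy thresholds adjustable without touching the detection rules.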
Conclusion
Storing dark data can financially burden organizations, leading to high storage costs and potential non-compliance fines. However, dark data also holds untapped potential for business growth. Identifying and managing dark data effectively unlocks its value and empowers organizations to leverage their data for strategic decision-making.