Sources and volumes of data are growing exponentially.  Website clicks, social media, sensors, and card swipers generate massive amounts of data every second.  More and more enterprises are collecting and using this Big Data for all kinds of purposes, including improved business intelligence, targeted marketing and fraud detection.  With so much attention focused on the adoption of Big Data and analytics, one important question must be asked: is this data being properly governed, or governed at all?

What is Information Governance?

Gartner defines Information Governance (IG) as “the specification of decision rights and an accountability framework to encourage desirable behavior in the valuation, creation, storage, use, archival and deletion of information. It includes the processes, roles, standards and metrics that ensure the effective and efficient use of information in enabling an organization to achieve its goals.”  IG is a program, not a project.  Effective IG programs have C-suite support and incorporate input from across the enterprise, including business units, legal, IT, information security, and human resources.  A well-run IG program can reduce costs and risks, improve efficiencies, and protect the enterprise’s most valuable assets: its information and its reputation.

Information Governance for Big Data

All data within an enterprise’s possession or control needs to be governed.  Policies should be developed to define how all data will be managed.  The policies should address how the data will be maintained throughout its lifecycle, including how it can be used, accessed and secured, and when it can be archived and legally destroyed.

Big Data is no exception to this rule.  In fact, given the speed with which Big Data volumes multiply, IG takes on added importance.  As the size of data increases, so do the associated costs and risks.  And while analytics are helping enterprises uncover troves of useful information from within Big Data, entities can face liability for using data in a manner that is inconsistent with customer consents, laws, regulations or contractual obligations. These issues are particularly important when data is moved or accessed across state and/or international borders.

IG can mitigate the risk of misuse of Big Data by ensuring, for example, that personal health information is treated in compliance with HIPAA and HITECH regulations.  IG also can reduce the chances of privacy and security incidents involving Big Data.  For instance, the IG policy may mandate masking or redaction of financial information within data sets.  It also may prohibit the inclusion of private information in test data sets.  IG can help enterprises ensure that they’re not keeping data too long (thereby incurring unnecessary costs and increased risks), or destroying data too soon (thereby failing to maximize its business value or violating legal or regulatory requirements).
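
As a rough illustration of how two of these policy rules might be turned into code, the Python sketch below masks card-like numbers in free text and checks a record against a retention period.  Everything in it is an assumption made for the example rather than something prescribed by any regulation or IG framework: the function names, the 13-to-16-digit pattern, the keep-the-last-four-digits convention, and the "review_for_disposal" label are all hypothetical, and a real policy would be defined with legal and compliance input.

```python
import re
from datetime import date, timedelta

# Assumed pattern: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def mask_card_numbers(text: str) -> str:
    """Replace all but the last four digits of each card-like number with '*'."""
    def _mask(match):
        digits = re.sub(r"\D", "", match.group())  # drop spaces and hyphens
        return "*" * (len(digits) - 4) + digits[-4:]
    return CARD_PATTERN.sub(_mask, text)


def retention_action(created: date, today: date, retention_days: int) -> str:
    """Return 'retain' while the record is inside its retention period,
    otherwise 'review_for_disposal' (a hypothetical label; actual disposal
    would still depend on legal holds and other requirements)."""
    expires = created + timedelta(days=retention_days)
    return "retain" if today < expires else "review_for_disposal"


if __name__ == "__main__":
    print(mask_card_numbers("Card 4111 1111 1111 1111 charged"))
    print(retention_action(date(2020, 1, 1), date(2021, 1, 1), 365))
```

The retention check deliberately returns a recommendation rather than deleting anything, since deciding when data can be legally destroyed usually requires human review.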

IG should be established before Big Data collection and analysis begin.  Retrofitting IG to existing Big Data can be difficult and costly, particularly if it must be implemented in reaction to a crisis.  Addressing IG at the outset is far more effective, efficient and economical, and it allows entities to derive the maximum benefit from their Big Data and analytics projects.