In a statement at an April 2011 Senate subcommittee hearing, Dr. David McClure, Associate Administrator, Office of Citizen Services and Innovative Technology, said that one of the biggest challenges federal agencies face in migrating to the cloud is data management. Data management in cloud computing needs to be critically analyzed and strategized before solutions can be implemented. So let's take a look at some of the data management challenges that exist in federal cloud computing solutions:
First of all, it is important to understand that the IT needs of global organizations pale in comparison with those of the US federal government. Quite simply, the US federal government is enormous: it comprises more than 2.1 million full-time federal employees, each of whom uses at least one IT system, and 2,094 federal government data centers housing thousands of servers.
Some reports estimate that the majority of these servers are operating at 20 percent of capacity. Managing that amount of information is a daunting task that all agencies must face. For those that have managed to switch over to cloud-based solutions, the change has taken a great amount of time and effort. This can be seen in the examples of cloud computing within the federal government.
Moving toward the government of the future, agencies will continue to be challenged to acquire, identify, integrate, analyze and disseminate relevant information to achieve high performance. In addition, analysts must overcome the following challenges:
- Information Overload – too much information to process
- Information Accessibility – ability to access information needed
- Missed Discoveries – patterns or problems previously overlooked
- Limited Access to Organizational Knowledge – unclear understanding of document flow and ownership, leaving questions around permissions
- Reporting Challenges – uncertainty about how to report problems that need to be addressed within the system
On top of this comes the initial consolidation of unprecedented amounts of data, which must be addressed before a cloud solution can be implemented. The process can quickly come to resemble arriving home to find someone reorganizing your closet, with all of your clothes spread out across your room. Regardless of the end result, that's enough to overwhelm anyone.
That being said, data management strategies for consolidation and migration need to be tackled first. Managing data during migration is a common challenge with the cloud: a number of organizations report individuals or business units moving data, often sensitive, to cloud services without the approval or even notification of IT or security. In both public and private cloud deployments, it is important to protect data in transit. This includes:
- Data moving from traditional infrastructure to cloud providers
- Data moving between cloud providers
- Data moving between instances (or other components) within a given cloud
A number of technology practices that are not new to the enterprise information arena are key to preventing problems in these three areas. They include Database Activity Monitoring, File Activity Monitoring, and Data Loss Prevention, all of which are seasoned practices geared toward tracking data movement and migration.
Once information is migrated to the cloud or another repository, controls must be implemented to ensure that all data is used in accordance with policies and expectations. To accomplish this, agencies need clear processes and policies both for understanding how agency information is used and for governing that usage.
Information governance is the solution to these panic-attack-inducing challenges. A coherent information governance solution includes the systems, processes and tools that manage information according to its value and risk, treating information as a corporate asset that can be effectively discovered, managed and disposed of.
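To make the idea of a Data Loss Prevention check more concrete, here is a minimal sketch in Python. It is illustrative only: the pattern, the function names `find_sensitive` and `approve_transfer`, and the notion of an "approved destinations" set are assumptions made for this example, not features of any particular DLP product. Real tools inspect far more than a single pattern.

```python
import re

# Assumed pattern for illustration: text shaped like a US Social Security
# number (NNN-NN-NNNN). Real DLP products use many richer detectors.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_sensitive(documents):
    """Return the names of documents whose text matches the sensitive pattern."""
    return [name for name, text in documents.items() if SSN_PATTERN.search(text)]


def approve_transfer(documents, destination, approved_destinations):
    """Allow a transfer if the destination is approved or nothing sensitive is moving.

    Returns a tuple: (allowed, names of flagged documents).
    """
    flagged = find_sensitive(documents)
    allowed = destination in approved_destinations or not flagged
    return allowed, flagged


# Hypothetical usage: a business unit tries to copy files to an unapproved service.
documents = {
    "memo.txt": "Lunch is at noon on Friday.",
    "hr.txt": "Employee SSN: 123-45-6789",
}
allowed, flagged = approve_transfer(documents, "unapproved-cloud", {"agency-cloud"})
print(allowed, flagged)
```

The point of the sketch is the ordering: the check runs before data leaves traditional infrastructure, which is where unapproved migrations of sensitive data would otherwise go unnoticed.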
Information governance includes the policies and procedures for managing information usage. It includes the following key features:
- Information Classification – high-level descriptions of important information categories. Unlike data classification, the goal is not to label every piece of data in the organization, but rather to define high-level categories like “regulated” and “trade-secret” to determine which security controls may apply.
- Information Management Policies – Policies to define what activities are allowed for different information types.
- Location and Jurisdictional Policies – Policies that outline where data may or may not be geographically located. These policies typically have important legal and regulatory ramifications.
- Authorizations – Define which types of employees/users are allowed to access which types of information
- Ownership – Who is ultimately responsible for the information
- Custodianship – Who is responsible for managing the information, at the behest of the owner.
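The features above can be sketched as a small data model. This is a minimal illustration, assuming a simplified world in which each information category carries its own owner, custodian, and allowed roles, actions, and regions. The class name `InformationCategory`, the function `is_permitted`, and all sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InformationCategory:
    """One high-level information category and the policies attached to it."""
    name: str                   # classification, e.g. "regulated"
    owner: str                  # ultimately responsible for the information
    custodian: str              # manages the information on the owner's behalf
    allowed_roles: frozenset    # authorizations: who may access it
    allowed_actions: frozenset  # management policy: what may be done with it
    allowed_regions: frozenset  # location policy: where it may reside


def is_permitted(category, role, action, region):
    """Check a request against the category's authorization, management,
    and location policies. All three must pass."""
    return (
        role in category.allowed_roles
        and action in category.allowed_actions
        and region in category.allowed_regions
    )


# Hypothetical category and values, for illustration only.
regulated = InformationCategory(
    name="regulated",
    owner="Chief Privacy Officer",
    custodian="Records Management",
    allowed_roles=frozenset({"case-officer"}),
    allowed_actions=frozenset({"read", "edit"}),
    allowed_regions=frozenset({"us-east"}),
)

print(is_permitted(regulated, "case-officer", "read", "us-east"))
print(is_permitted(regulated, "contractor", "read", "us-east"))
```

Attaching policies to a handful of categories rather than to individual files mirrors the classification approach described above: the categories stay few, and the controls follow from them.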
Another aspect of data management that many experts believe will be key to meeting security requirements in the cloud is the ability to track changes to documents, to the point of knowing who made a change and what the change was. The logging and auditing controls provided by some vendors are not yet as robust as the log controls found within enterprises and enterprise applications. The challenge is to ensure that, after an incident, agencies can almost immediately determine where the issue occurred, what login credentials were used, and what was done to a document (edit, download, change of access, etc.).
Effective management of the billions of documents archived by the federal government, both in transit and at rest in repositories, must be controlled by guidelines and procedures. Once this hurdle is cleared, however, federal agencies will be able to use their information on a whole new level. Increased compliance, exposure of missed problems and patterns that are unattainable without the use of Big Data, and all-around better decision making are results that will be seen in every agency that uses its data effectively.
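As a rough sketch of the kind of audit trail this implies, the Python below records who did what to which document and lets an investigator pull a document's history after an incident. The names `AuditEvent` and `AuditLog` are hypothetical, and the in-memory list is an assumption for brevity. A real system would write to durable, tamper-resistant storage.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """A single immutable record of an action taken on a document."""
    timestamp: datetime
    document_id: str
    user: str      # login credentials used
    action: str    # e.g. "edit", "download", "change access"
    detail: str    # what the change was


class AuditLog:
    """Append-only, in-memory audit log (illustrative only)."""

    def __init__(self):
        self._events = []

    def record(self, document_id, user, action, detail=""):
        """Append an event stamped with the current UTC time and return it."""
        event = AuditEvent(datetime.now(timezone.utc), document_id, user, action, detail)
        self._events.append(event)
        return event

    def history(self, document_id):
        """Return all events for one document, in the order they were recorded."""
        return [e for e in self._events if e.document_id == document_id]


# Hypothetical usage: reconstruct what happened to one document.
log = AuditLog()
log.record("doc-1", "alice", "edit", "rewrote section 2")
log.record("doc-2", "bob", "download")
log.record("doc-1", "bob", "change access", "granted read to carol")
for event in log.history("doc-1"):
    print(event.timestamp.isoformat(), event.user, event.action, event.detail)
```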