Data, The Neglected Part of Content Management

by | Dec 13, 2010 | Enterprise Content Management, Information Sharing | 1 comment

I call it “the question”. You’re in a crowded conference room and someone asks (usually with the senior manager present), “Could you give me a definition of content management?” Everyone looks at you, the learned professional, for some wise, insightful definition, and you’re thinking, wow, how big is big? This would appear to be an easy question to answer. In reality, it is not.

Experience shapes perception, and everyone around the table has a different experience base when it comes to the idea of managing content. Research firms such as Gartner consider content management part of the essential enterprise infrastructure landscape. But the term doesn’t generate the same instant recognition or understanding as, say, enterprise resource planning. Why is that? It’s simple: the company wouldn’t exist if it weren’t managing its financial resources. The fact that I’m in the room being asked “the question” confirms my suspicion that the company hasn’t developed a strategy for managing content.

So, back to “the question”. Supreme Court Justice Stewart’s famous phrase “I know it when I see it” applies in this case. It’s hard to define content management, but most people know what it is when they see it or use it. My answer always starts at the lowest level of defining content management: the application of structured data to index and manage unstructured content. From that level it is easy to build out the various elements of content management as they apply to the target audience. Why do I prefer this definition? It explicitly calls out what I believe is an overlooked area of content management: the element of structured data.

Without structured data, unstructured content is virtually useless to an organization; it is lost in place. Data allows the content to participate in business process decision making. Data allows the content to be located, searched and utilized, secured and protected, retained, archived or destroyed. Without the proper structured data placing the unstructured content in the proper context, a content management system runs the risk of being rejected. Users will brand the system as untrustworthy, out of date or just too hard to use.
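To make the point concrete, here is a minimal sketch of how a few structured fields let otherwise opaque files be found and retained or destroyed on schedule. The `ContentItem` record, its field names, the file paths and the retention periods are all assumptions made up for illustration; they do not come from any particular content management product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ContentItem:
    """Hypothetical metadata record wrapped around one unstructured file."""
    path: str             # where the unstructured file lives
    doc_type: str         # what the content is
    department: str       # who it belongs to
    created: date         # when it entered the system
    retention_years: int  # how long it must be kept


def find(items, **criteria):
    """Locate content by exact match on any combination of metadata fields."""
    return [i for i in items
            if all(getattr(i, k) == v for k, v in criteria.items())]


def due_for_disposition(items, today):
    """Return items whose retention period has ended as of `today`."""
    now = (today.year, today.month, today.day)
    return [i for i in items
            if now >= (i.created.year + i.retention_years,
                       i.created.month, i.created.day)]


# Illustrative data only: the file names say nothing, the metadata says everything.
ITEMS = [
    ContentItem("/share/scan_0001.pdf", "invoice", "Finance", date(2003, 5, 1), 7),
    ContentItem("/share/final_v2.doc", "contract", "Legal", date(2008, 1, 15), 10),
    ContentItem("/share/misc.pdf", "invoice", "Finance", date(2009, 11, 30), 7),
]

invoices = find(ITEMS, doc_type="invoice")
expired = due_for_disposition(ITEMS, date(2010, 12, 13))
```

Nothing in a name like `scan_0001.pdf` tells the system that the file is an invoice or that its retention period has ended; only the structured fields do.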

Understanding the structured data is just one part of implementing a content management system. Why is it overlooked? It’s not sexy or flashy. It’s dirty, hard work requiring lots of research, interviews, analysis and, unfortunately, creating new data standards out of thin air. I call this work “data archaeology”. The language of the business must be learned and mastered, then managed and standardized. And as with most dirty work, it gets avoided. Where to start?

First, determine the scope of content information to be included in the system. Having clearly defined expectations from the executive sponsor will help set boundaries and minimize scope creep. Next, roll up your sleeves and collect as much information as possible for the data analysis and aggregation work. There is no exhaustive or complete listing of information sources to consider, so it is important not to overlook information in the gathering stage. The information set left out is invariably the one that causes rework later in the project. Figure One (below) illustrates this process.

The output of this effort is the metadata required to manage and process the organization’s content. The data must classify the content in a structured manner so that it is consistently and uniformly classified and described. This is an important goal: if you don’t know what the content is, how can you trust it and process it?

Simply stated, structured data, or metadata, has only two reasons to exist in a content management system: 1) it is required to support a business process or 2) it is required to support search and filtering. A slightly more expansive list is put forth by David Haynes in his book “Metadata for Information Management and Retrieval”, which lists five reasons for metadata:
1) resource description
2) information retrieval
3) management of information
4) ownership and authentication
5) interoperability.

Either list is fine; mine is just a little more bottom line. But when developing the inventory of structured data, the following rules should be applied:
1) each attribute should have defined data owners
2) each attribute should have an established process to populate the data
3) when possible, the attribute should be required
4) when possible, the entered data should be validated
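
The four rules above can be sketched as a small attribute inventory. This is only an illustration under stated assumptions: `AttributeDefinition`, `check_record`, the attribute names, the owning departments and the allowed values are all hypothetical, chosen to show how each rule might be written down and enforced.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class AttributeDefinition:
    """Hypothetical definition of one metadata attribute."""
    name: str
    owner: str             # rule 1: a defined data owner
    populated_by: str      # rule 2: the process that fills in the value
    required: bool = True  # rule 3: required whenever possible
    validator: Optional[Callable[[str], bool]] = None  # rule 4: validate entries


def check_record(schema, record):
    """Return a list of problems found in one metadata record (empty if clean)."""
    problems = []
    for attr in schema:
        value = record.get(attr.name)
        if value is None or value == "":
            if attr.required:
                problems.append(f"{attr.name}: missing required value (owner: {attr.owner})")
            continue
        if attr.validator is not None and not attr.validator(value):
            problems.append(f"{attr.name}: invalid value {value!r} (owner: {attr.owner})")
    return problems


# Illustrative inventory; every name and value here is an assumption.
SCHEMA = [
    AttributeDefinition(
        "document_type", owner="Records Management",
        populated_by="chosen by the author at check-in",
        validator=lambda v: v in {"contract", "invoice", "policy"}),
    AttributeDefinition(
        "effective_date", owner="Legal",
        populated_by="entered by the author at check-in",
        validator=lambda v: re.fullmatch(r"\d{4}-\d{2}-\d{2}", v) is not None),
    AttributeDefinition(
        "project_code", owner="Finance",
        populated_by="copied from the project system",
        required=False),
]

good = {"document_type": "contract", "effective_date": "2010-12-13"}
bad = {"document_type": "memo"}  # unknown type, and no effective date

good_problems = check_record(SCHEMA, good)
bad_problems = check_record(SCHEMA, bad)
```

Including the owner in each problem message is a deliberate choice in this sketch: when a value is missing or wrong, the record itself says who can fix it.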

Doing the hard work of mastering the users’ business language and applying these metadata rules will go a long way toward ensuring user acceptance and a successful content management project.

1 Comment

  1. James Bailey

    Very good blog and thanks for highlighting the value of structured data. Structured data could be in the form of metadata or associated entities (people, place, email, telephone, GIS, etc.). Getting this information out of unstructured data using business intelligence tools is an important step in helping organizations leverage their content when making business decisions. At the FBI and DIA, we used two different entity extraction tools. Once we had the data structured, we could use visualization tools to help agents and analysts discover patterns. Like entity extraction, there are great tools on the market for visualization (e.g. i2 Analyst’s Notebook and Palantir).
