Kirsten McKnight's List: Records and Information Management

The Value of Offsite Storage 3

Apr 29, 09

Discusses the value of off site storage and the costs of storing records on site

www.use-sba.com/...off-site-storage.html Records_Management Cost_Benefit
- Risk management is critical to reducing exposure to litigation, audit and statutory compliance.
- if no formal records management program is in place, “between 30 and 60 percent of its records are either inactive or semiactive and need to be destroyed immediately or transferred from prime office space to a low-cost storage facility.”
- For 50 cabinets of inactive records (approximately 350 cubic feet), the savings could be $20,000 to $95,000 per year, based on the data provided in the Information and Records Management text. This is based on a computation of $50 in storage costs per 7 cartons per month, or $7.14 per carton. When the number is lowered to a still exaggerated but more realistic figure of $1.00 per carton per month, the savings per cabinet increase to a range of $916-$2,416 per year, or $45,800-$120,800 per year for 50 four-drawer cabinets of inactive records. As you can see, effective records management combined with effective outsourcing can significantly reduce operating costs for the organization.
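
The figures in the last highlight can be checked with a little arithmetic. The sketch below is one reading that reproduces the quoted numbers; it assumes that one cabinet holds 7 cartons (350 cubic feet across 50 cabinets) and that the $20,000 to $95,000 savings were already net of the $50 per cabinet per month storage charge. Both assumptions, and all the names in the code, are mine rather than the article's.

```python
CABINETS = 50
CARTONS_PER_CABINET = 7   # assumption: 350 cubic feet / 50 cabinets
MONTHS = 12


def revised_savings_per_cabinet(total_savings_year, old_monthly, new_monthly):
    """Per-cabinet yearly savings after the storage rate drops.

    total_savings_year: quoted yearly savings for all cabinets at the old rate.
    old_monthly / new_monthly: storage cost per cabinet per month, in dollars.
    """
    per_cabinet = total_savings_year // CABINETS
    # Every dollar no longer spent on storage is added to the savings.
    return per_cabinet + (old_monthly - new_monthly) * MONTHS


OLD_MONTHLY = 50                        # $50 per 7 cartons per month
NEW_MONTHLY = 1 * CARTONS_PER_CABINET   # $1.00 per carton per month

low = revised_savings_per_cabinet(20_000, OLD_MONTHLY, NEW_MONTHLY)
high = revised_savings_per_cabinet(95_000, OLD_MONTHLY, NEW_MONTHLY)

print(low, high)                        # 916 2416
print(low * CABINETS, high * CABINETS)  # 45800 120800
```

Under those assumptions the extra $516 per cabinet per year (the difference between $600 and $84 in storage charges) accounts for the whole change in the quoted ranges.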
Legal Technology - Legal Holds: Watch the Door

Apr 23, 09

Article about the importance of good legal holds processes particularly around exiting employees

www.law.com/...pubArticleLT.jsp Human_Resources Legal_Holds Records_Hold Litigation Employees Best_Practices
21st century literary archives examined | Columbia Spectator

Apr 23, 09

How will we know what to preserve? Arguably, diaries, letters, etc. are of interest, particularly when the individual in question is of interest to the public for whatever reason - but should things like GChat and Twitter messages be preserved? How do you determine what is of value? I don't think my diaries ranging back to when I was in grade school are of any value, nor transcripts of my chats or tweets or whatever.

www.columbiaspectator.com/...ury-literary-archives-examined Preservation Records_Management Web_2.0 Information_Overload
Bush library at SMU won't house Cheney records soon | George W. Bush Presidential Library | Local News | News for Dallas, Texas | Dallas Morning News 4

Apr 22, 09

Will they or won't they go to the Bush library? Who knows.

www.dallasnews.com/...40609dnmetveeplib.38d4385.html Politics Records_Management USA
- Comments
  
  Loving some of these comments!
  
- All vice presidents choose where to keep their official papers, which are handed over to the National Archives and Records Administration when they leave office.
- A Cheney spokeswoman said the former vice president anticipated needing access to his records while working on his book.
  
  "It made more sense and was more convenient to keep them in D.C.," said Lucy Tutwiler.
  
  As one cynical reader commented, is that so he can "edit" his records?
  
- During talks last year, the National Archives suggested that Cheney's artifacts – like a set of gold Murano glass candlesticks and bowls from Italian Prime Minister Silvio Berlusconi – be sent to the Bush library. That way they could be displayed with Bush's items, including the 9 mm pistol that Saddam Hussein held when captured by American soldiers in Iraq.
  
  Just weird....
  
Feds Urge New Lender Record Keeping Requirements : HousingWire || financial news for the mortgage market 2

Apr 22, 09

www.housingwire.com/...er-record-keeping-requirements
- The U.S. Justice Department is urging Congressional leaders to require mortgage lenders and servicers to retain borrower records for up to 10 years, as part of an effort to make it easier to prosecute fraud
- But the collapse of the mortgage industry has hindered investigations, she contended, as lenders, brokers and even title companies are largely going under at a rapid rate and/or haven’t retained borrower records. In terms of “subprime,” the catch-all that now seems to be synonymous with shady business practices, 5 of the 10 top U.S. originators in the subprime market have since folded or otherwise been acquired.
  
  As far as I am concerned, here is a great argument for regulation of the banks and lending institutions. Gasp!
  
Mayo Clinic, Microsoft Team on Patient Health Record Plus | BNET Healthcare Blog | BNET 5

Apr 22, 09

Another cloud computing effort, this time by Microsoft and the Mayo Clinic.

industry.bnet.com/...-on-patient-health-record-plus Health Privacy Records_Management
- it is offering a free personal health record on the Microsoft HealthVault platform
  
  Google and MS - head to head!
  
- The Mayo Clinic Health Manager, as the PHR is known, will provide not only a secure place to store medical records online, but also guidance from Mayo experts that will be tailored to the information contained in that PHR. If you are a 50-year-old man with diabetes and hypertension, for example, you will receive information about how you should take care of those conditions and the tests you should receive.
  
  There should still be concerns around privacy, security, access, retention... granted, this is not an in-depth article.
  
- Initially, Mayo Clinic Health Manager will include tools and features that help consumers manage pediatric and adult wellness, immunizations, pregnancy and asthma. Forthcoming features will help users with type 2 diabetes, high cholesterol, and/or high blood pressure.
- But curiously, the Mayo Clinic’s own patients cannot yet upload their medical records to the PHR, according to Mayo spokesperson Ginger Plumbo. “We’re working on it,” she added, noting that the group hopes to clear away the technical obstacles before the end of the year.
- Microsoft HealthVault, like Google Health, has not been very successful so far in getting healthcare providers to share their clinical information online with their patients. Dr. Bill Crounse, senior director of worldwide health for Microsoft, told BNET recently that HealthVault won’t realize its full potential until it includes clinical data.
Cloud Computing Poses E-Discovery, Legal Risks 4

Apr 22, 09

Cloud computing, also known as Software as a Service (SaaS), is, according to one attorney, a corporate counsel's nightmare. From a Canadian perspective, I think one of the major issues is the Patriot Act: if you choose to use services such as Google or Yahoo, their servers are all based in the US, and therefore any information stored using these services is subject to the Patriot Act. Major privacy implications....

www.enterprisestorageforum.com/...3814821 Electronic_records Litigation E-discovery Privacy Patriot_Act
- The 2006 e-discovery amendments to the Federal Rules of Civil Procedure (FRCP) changed the legal and corporate information landscape, putting custody and control at top of mind.
- And that creates a big challenge for corporate counsel. How can they identify "who, when and where" in the cloud? How can organizations handle document retention? And to add another layer of worry, information targeted for the cloud may also be subject to laws requiring privacy and persistent data integrity, and other requirements that the storage manager may not even be aware of.
  
  Not to mention, if you are a Canadian company, do you really want to be subject to the Patriot Act?
  
- Top cloud computing shortcomings: no native security attributes; inadequate or no security provisioning by providers; the lack of understanding of cloud legal issues (a real problem for not only cloud computing providers, but also corporate counsel and IT consultants); and the failure to recognize potential liability from either legal issues or a lack of security.
- Users of cloud services will need to insist on service level agreement (SLA) terms with their providers to ensure legal and regulatory compliance, searchability, demonstrable customer care (security), provably persistent data integrity and reliability, and demonstrable storage security and integrity for electronically stored information in the cloud.
E-Mail: Treat It as Just Another Record on ADVANCE for Health Information Professionals

Apr 22, 09

An article with some interesting ideas about how to manage email in a health/patient environment. As pointed out by one reader though, email needs to be managed by its content; a crucial and key concept!

health-information.advanceweb.com/...editorial.aspx E-mail Records_Management Electronic_records
One Picture, 1,000 Tags By PAMELA LiCALZI O'CONNELL Publis...

Mar 13, 08

Classification Taxonomies Information_Management

One Picture, 1,000 Tags

By PAMELA LiCALZI O’CONNELL
Published: March 28, 2007
WHAT do you see when you see a work of art?

Strolling through a museum, a painting of a shipwreck catches your eye. You are struck by the dominance of blue in another work. Yet another painting, featuring a silvery moon, seems sad.
If you try to find those paintings on the museum’s Web site you will probably fail unless you know the title or artist. You can’t search based on what you see.
“Museums have recognized that their online collections are not doing the job — we’re hiding the content away from nonspecialists,” said Jennifer Trant, a partner at Archives and Museum Informatics in Toronto. “We’ve got to provide access on the same level as visual memory.”
Now, after spending millions of dollars and years of effort on their virtual homes — which draw many more visitors than their physical ones — museums are rethinking their online collections. They are experimenting with one of the hottest Web 2.0 trends: tagging, the basis for popular sites like Flickr.com. In social tagging, users of a service provide the tags, or labels, that describe the content (of photos, Web links, art), thus creating a user-generated taxonomy, or folksonomy, as it’s called.
Museums plan to encourage the public to annotate their collections by supplying descriptive tags that could exist alongside professional documentation, creating a new shared vocabulary. Van Gogh’s “Starry Night,” for example, could elicit tags like “stars,” “planets,” “swirls” or “insanity.”
The Cleveland Museum of Art, the Smithsonian Institution and the Powerhouse Museum in Sydney, Australia, already have prototype tagging applications on their Web sites, and nearly a dozen other museums plan similar projects.
But can the public be trusted to tag art? Will curators let them?
The Metropolitan Museum of Art ran a test in fall 2005 in which volunteers supplied keywords for 30 images of paintings, sculpture and other artwork. The tags were compared with the museum’s curatorial catalog, and more than 80 percent of the terms were not in the museum’s documentation. Joachim Friess’s ornate sculpture “Diana and the Stag,” for example, was tagged with the expected “antler,” “archery” and “huntress.” But it was also tagged “precious” and “luxury.”
“The results were staggering,” said Susan Chun, general manager for collections information planning at the Met. “There’s a huge semantic gap between museums and the public.”
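
The comparison the Met ran can be sketched as a set difference: collect the public tags, remove whatever already appears in the curatorial record, and see what share is left. This is only an illustration; the catalog terms below are hypothetical, and the function name is my own.

```python
def share_of_new_terms(public_tags, catalog_terms):
    """Fraction of distinct public tags that are absent from the catalog."""
    tags = {t.lower() for t in public_tags}
    catalog = {t.lower() for t in catalog_terms}
    if not tags:
        return 0.0
    return len(tags - catalog) / len(tags)


# Tags quoted in the article for "Diana and the Stag".
public_tags = ["antler", "archery", "huntress", "precious", "luxury"]
# Hypothetical curatorial terms, invented for this example.
catalog_terms = ["Diana", "stag", "huntress"]

print(share_of_new_terms(public_tags, catalog_terms))  # 0.8
```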
Based on this and other research, a group of museums formed the steve.museum tagging project, which recently received a two-year grant from the Institute of Museum and Library Services. The grant work, which began last fall, is based at the Met and the Indianapolis Museum of Art, and includes the Cleveland Museum of Art, the Denver Art Museum, the Guggenheim Museum, the Minneapolis Institute of Arts, the Rubin Museum of Art in Manhattan and the San Francisco Museum of Modern Art. People may tag selected art from these museums on the project Web site; some of the museums plan applications on their own sites as well.
Aside from the prohibitive cost of subject indexing thousands of works, there are other reasons museums want the public to tag art. For one, “art professionals can find it surprisingly difficult to describe the visual elements of a picture,” said Ms. Trant, who is managing the grant work. She recalled that during early testing of tagging at the Met, a frustrated curator complained, “Everything I know isn’t in the picture.”
“We would never say a work is mostly red, or instills a sense of ennui, or features a dog playing poker,” agreed Bruce Wyman, director of new technologies for the Denver Art Museum. “Tagging gives us a set of eyes we don’t have.”
Since August 2006, the Smithsonian Photography Initiative has asked visitors to its Web site, photography.si.edu, to “Enter the Frame” and label 2,000 images culled from various archives. The tags typed in by users become immediately visible to them, but are not added to the database until a professional has reviewed them.
“Our keywording was insufficient in a lot of ways,” said Effie Kapsalis, senior digital producer of the site. “There’s no taxonomic system that could cover the subjects of all these photographs. And we want a lot of tags for each image. So that’s why we turned to the public.”
The tags range from the obvious and mundane to the impressionistic and personal. A photo of Greta Garbo was labeled “lonely” while one of boys dressed in Civil War uniforms inspired “innocence.” The tags create sometimes instructive, sometimes amusing links between disparate images, and these unexpected connections make the photos easy to browse through in a way museum sites rarely do.
Will the public want to tag art? Most large museums have a healthy volunteer corps, and art lovers in general might jump at the chance to assist in museum work. There’s also a powerful Web ethos that spurs participation in “collective intelligence” projects, like the user-written Wikipedia. And, as tagging projects have revealed so far, some people also are likely to use tags to proclaim a personal connection with a work of art.
A look at the tags on the steve.museum site reveals that for each work there is a tendency for a small number of tags to be assigned frequently and for a large number of terms to be assigned once. Take the case of John Singer Sargent’s “Madame X.” After the top five most common labels (woman, black dress, portrait, table, gown), the tags taper off into a long list of increasingly subjective terms (aristocrat, stiff, daring, snob, scandalous, etc.).
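
That "few frequent tags, many one-off tags" pattern is easy to see by counting assignments. A minimal sketch follows, using a made-up list of tag assignments (the real steve.museum data is not reproduced here) and a hypothetical helper name.

```python
from collections import Counter


def split_head_and_tail(assignments):
    """Separate tags assigned more than once from tags assigned exactly once."""
    counts = Counter(assignments)
    head = [tag for tag, n in counts.most_common() if n > 1]
    tail = sorted(tag for tag, n in counts.items() if n == 1)
    return head, tail


# Hypothetical assignments for one painting, one entry per tagging event.
assignments = [
    "woman", "portrait", "woman", "black dress", "portrait",
    "woman", "black dress", "aristocrat", "stiff", "daring",
]

head, tail = split_head_and_tail(assignments)
print(head)  # ['woman', 'portrait', 'black dress']
print(tail)  # ['aristocrat', 'daring', 'stiff']
```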
If a tag is a user’s assertion that a work of art is about something, as the proponents of tagging suggest, then clearly everyone has a different idea of what “something” is. That’s a good thing, said Sebastian Chan, manager of Web services for the Powerhouse Museum in Sydney, which, since its site was redesigned last June, has encouraged tagging (powerhousemuseum.com). Paul McCarthy, a digital media manager in Sydney, has tagged numerous images on the Powerhouse site, often using lingo and street nomenclature, like the nickname “spacies” for the arcade game Space Invaders.
“There’s real power in idiomatic lingo,” Mr. McCarthy said via e-mail. Tagging, he added, “unleashes the power of the vernacular. It brings the collection alive.”
Cuisinarts, E-Commerce, and ... Controlled Vocabularies By Lou Rosenfeld,...

Mar 13, 08

Classification Taxonomies Information_Management
Cuisinarts, E-Commerce, and ... Controlled Vocabularies
By Lou Rosenfeld, Dr. Dobb's Journal
Jan 01, 2002
URL:http://www.ddj.com/184412772

My friend related this interesting e-commerce experience to me recently.
He wanted to buy a food processor. Being the sensible guy that he is, he wanted quality but didn't want to pay an arm and a leg for it. A used Cuisinart would fit the bill quite nicely. So my pal fired up his browser and went on his happy way to eBay, where he proceeded to enter a search for exactly what he thought he was looking for:
"Cusinart."
Now, my friend is a pretty bright guy. He has a PhD and a host of publications to his credit. He's been a successful entrepreneur and academic. Nowadays he's helping the federal government get up to speed on e-commerce. So misspelling Cuisinart is obviously independent of intelligence—something that could happen to anyone.
It gets better. He submits his search and, voila, eBay has a "Cusinart" available. He bids successfully, and the misspelled "Cusinart" shows up on his doorstep a few days later.
You think that's the end of the story. It's not. My friend eventually realizes his mistake and out of curiosity goes back to eBay. This time he searches for the correctly spelled "Cuisinart" and finds lots of them listed, as well as lots of bids pending. More bidders mean higher prices. The upshot is that if my friend had spelled the word correctly in the first place, he probably would have spent roughly triple the amount for a "Cuisinart" rather than his bargain-priced "Cusinart."
So what's my point? Well, it's not so much about how to make the most of your eBay experience (besides, there are already whole books on playing eBay to your advantage). Instead, it's just another illustration of how using controlled vocabularies can be the difference between a productive, efficient site and ... well, giving away your old Cuisinart at a third of its worth.
Controlled vocabularies and the guessing game
Whether you realize it or not, you're already familiar with controlled vocabularies. The Library of Congress subject headings and Yahoo's search criteria are a couple of examples. So, as you've probably guessed by now, controlled vocabularies are predetermined sets of terms that fit together to describe a specific domain such as kitchen appliances, nuclear engineering, or dirt biking.
The terms are standardized because language is ambiguous. People use different terms to say the same thing all the time. Or, worse yet, the same terms can mean different things. Sometimes folks just honestly screw up—like my friend did.
By predetermining the terms that make up a controlled vocabulary, and using those terms to describe your site's content, you can minimize the negative effects that variants, synonyms, and various other annoyances can have on your site and its users.
Here's an example. Let's say you're a webmaster at AT&T. Your site describes a huge host of products; one of them is the One Rate plan. There are many pages in your site that deal with One Rate. The problem is that there isn't a standardized spelling for it. So, some pages are about the "One Rate" plan, others describe it as "1 Rate" and on and on. Here are some of the possible references:
One Rate Plan
OneRate Plan
1 Rate Plan
The One Rate Plan
In this case, a patient user might eventually guess the right variant. But, as we all know, patience is a rare commodity on the Web. Without an effective controlled vocabulary strategy, users who enter the wrong term would find nothing. Consequently AT&T would lose a number of new business opportunities. This is where e-commerce degrades into e-guessing.
The guessing gets a lot more difficult with synonyms. Let's say a user visits a financial services site to find information on "income dividends." Would he or she have guessed that much of the content was listed under the following synonymous terms?
Dividend income
Income returns
Investment income
It's unrealistic to expect users to even bother trying to make guesses. Which, if you've invested any amount of time or money into building your web site, is really bad news.
Users who are browsing a site benefit from controlled vocabularies because they will find the information they need in one place under one heading. It also makes your life as a webmaster easier since you don't have to make up a new label for each piece of information you want to add to your site—just choose from the controlled vocabulary.
An even better approach (though more work) is to consider expanding your controlled vocabulary into a thesaurus that includes variant, related, broader, and narrower terms, as well as glossary definitions. That way, a user browsing your site could, for example, look for "investment income" and be directed to content indexed under the standard term, "income dividends." Searching can be similarly improved—the user's query, "1Rate," would be enriched to automatically include "One Rate" and the variants that have already been mapped to "One Rate," terms that the user hadn't considered.
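
The variant mapping and query enrichment described above can be sketched in a few lines. This is a minimal illustration, not how any particular search engine does it: the thesaurus contents come from the article's examples, while the function names and the normalization rule (lowercase, collapsed whitespace) are assumptions of mine.

```python
def _key(term):
    """Normalize a term for lookup: lowercase, single spaces."""
    return " ".join(term.lower().split())


# Preferred term -> known variants and synonyms (from the article's examples).
THESAURUS = {
    "One Rate": ["OneRate", "1 Rate", "1Rate"],
    "income dividends": ["dividend income", "income returns", "investment income"],
}

# Reverse index: any normalized variant or preferred term -> preferred term.
VARIANT_INDEX = {}
for preferred, variants in THESAURUS.items():
    VARIANT_INDEX[_key(preferred)] = preferred
    for variant in variants:
        VARIANT_INDEX[_key(variant)] = preferred


def preferred_term(query):
    """Return the standard term for a query, or None if it is not mapped."""
    return VARIANT_INDEX.get(_key(query))


def expand_query(query):
    """Enrich a query with its preferred term and all mapped variants."""
    preferred = preferred_term(query)
    if preferred is None:
        return [query]  # unknown term: search for it unchanged
    return [preferred] + THESAURUS[preferred]


print(preferred_term("investment income"))  # income dividends
print(expand_query("1Rate"))  # ['One Rate', 'OneRate', '1 Rate', '1Rate']
```

An unmapped misspelling such as "Cusinart" passes through unchanged, which is exactly the gap that adding it to the thesaurus as a variant would close.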
Resources
If you are going to explore using a controlled vocabulary, here are a few resources. Visit the American Society of Indexers web site, which includes a listing of web-based resources. You might even consider joining ASI, or hiring its members. You might also read Peter Morville's Web Architect column on developing a thesaurus, which he defines as "a controlled vocabulary that leverages synonymous, hierarchical, and associative relationships among terms to help users find the information they need."
Tips
Consider using multiple controlled vocabularies to describe the same content. Organize each around a theme like product names, subjects, processes, audience types, and so on. Avoid mixing these themes together because that's mixing apples and oranges, making them confusing for users to understand and more difficult for you to maintain.
Balance the desire to use multiple vocabularies with the overhead involved in applying and maintaining each. Just as the domain changes over time, so should the terms used to describe it.
Plagiarize (I mean, borrow from) good vocabularies that are already found in similar web sites, not to mention the examples you'll find linked from the ASI site. You might even find some useful candidates in your organization's printed materials.
As described above, automatically enriching a search query behind the scenes can improve retrieval. But that doesn't mean it should be transparent to users who might not understand what's happening. Always provide at least a basic explanation of how your site's searching system works.
Finally, remember that a good controlled vocabulary not only describes a domain's content, but should also reflect the language of users. If you don't have a good feel for the kinds of words your site's users commonly use, then analyze your search engine's query log.
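
As a rough idea of what analyzing a query log might look like, the sketch below counts normalized queries and lists the most common ones. The log format (one raw query per line) and the sample entries are hypothetical; real search engines log in their own formats.

```python
from collections import Counter


def top_queries(log_lines, n=3):
    """Return the n most frequent queries after light normalization."""
    counts = Counter(
        " ".join(line.lower().split())  # lowercase, collapse whitespace
        for line in log_lines
        if line.strip()                 # skip empty lines
    )
    return counts.most_common(n)


# Hypothetical log: one user query per line.
sample_log = [
    "Cuisinart", "cusinart", "cuisinart ", "food processor",
    "Cusinart", "cuisinart", "",
]

print(top_queries(sample_log, n=2))  # [('cuisinart', 3), ('cusinart', 2)]
```

Frequent misspellings that surface this way are natural candidates for variant terms in the vocabulary.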
Junk in, junk out
Look around the Web and you'll find a wasteland of chaotic vocabularies. It's almost amusing: with all the hype about e-commerce these days (somehow I remember commercial web sites existing long before I ever heard that term), webmasters are going nuts for taxonomies to describe their products and services. But you don't hear too much about what terms should be used to actually populate those taxonomies.
Same thing goes for the corporate portal and the Yahoo-ized intranet. Vendors of portal software, XML-based approaches, and other products espouse metadata as solutions to the challenges of searching and browsing. Again, their solutions are only halfway there. They provide you with descriptive metadata fields, but you'll still need standardized terms to enter into those fields. Otherwise, it's simply another case of junk in, junk out.
Taxonomies Value of Organized Knowledge by Jack Bryar 21-Jan-2002 ...

Mar 13, 08

Classification Taxonomies Information_Management
Taxonomies
Value of Organized Knowledge
by Jack Bryar
21-Jan-2002

An Old Problem Gets Worse
In recent years, the volume of news and information resources available to the typical corporate employee has grown exponentially. Corporate Web traffic has jumped by over 600% annually. Web-available content exceeds several billion items. Executives frequently receive more than 200 e-mails a day. The amount of corporate data generated per employee doubles every 18 months.
"If you printed the information available through our Intranet, it would stretch from the earth to the sun."
-- Marc Auckland, World-Wide Chief Knowledge Manager, British Telecom
Corporate managers are worried. Sixty percent say that info-glut is having a negative effect on productivity. IDC estimated that in 1999, US Fortune 500 companies lost $12 billion due to an inability to locate knowledge resources amidst all the clutter. Eighty percent of executives believe the problem will get worse before it gets better.
Adding to the strain is the fact that this content is so difficult to access. Corporate information exists in many forms. Each form (electronic news, email, databases, Web pages, archived documents, etc.) resides in its own format, accessible only through some unique index system. Often content is not easily accessible. Much of it is scattered across the enterprise.
Yet, access is critical. New applications have sprung up requiring access to information across the enterprise, sometimes across multiple businesses. These include next-generation customer care, competitive research, and B2B transactions. In order to get at the information needed to run these applications, information itself needs to be re-structured, and re-organized -- and so does the method of getting at that information.
Info-Illiteracy: A Barrier to Finding Information
The problem of business info-glut is worse than it appears. Many employees lack the skills needed to find the information they require.
For years, putting tools in the hands of the users was considered the best way for companies and their knowledge workers to get their hands on the information they needed. In most cases, that has meant providing users with a search engine similar to systems found on public Websites. Today, many knowledge workers have to navigate as many as six different search engines and database indexes each day.
New research casts doubt on how well search engines work for most users. A study of AltaVista users revealed a surprising amount of info-illiteracy. According to that study:
80% couldn’t/wouldn’t build a working Boolean search
87% used less than 3 words
A big part of the problem is that the same term can have different meanings to different people. Not knowing which terms will uncover sought-after information is a significant barrier for many knowledge workers. Any successful strategy for managing information has to overcome this problem.
In 1814, Thomas Jefferson was so dissatisfied with the ruined and disorganized state of the documents at the Library of Congress that he donated his collection and then personally reclassified all the books there.
-- Source: Systems of Knowledge Organization for Digital Libraries, Gail Hodge
XML to the Rescue?
The Internet has been described as the world’s largest library, with the books thrown all over the floor. Many corporate information systems look just as disorganized. Information managers are convinced that the best solution to this clutter involves wrapping up all electronic document forms inside a common format, so that the content inside can be more easily found, and used by different applications.
The wrapper being used by most organizations today is XML. XML allows the tagging of a document with a description of what the document is about, and where it came from. Searching on XML meta-tags can certainly simplify the search process.
Unfortunately, XML does not solve the problem of finding information. It only standardizes the problem. It requires that any XML tagging system clearly understand what the document is about, and it needs to anticipate the search process someone might try to use to find it. This takes time, a great deal of sophistication, or both. Otherwise, the process results in hiding essential documents behind generic, idiosyncratic or meaningless tags, making the information management and retrieval problem even worse.
In order for XML tagging to be meaningful for search and retrieval, the terms used to tag content have to be intuitive enough to encourage their use by information-seekers. They should be structured in a standardized way; less as a set of variable keywords and more like a set of subject categories. These subject categories should be set up in a hierarchical fashion, with logical subtopics and overviews. This, in short, is a taxonomy.
Enabling the ability to search or manipulate content "by category" is an essential benefit of a successful XML tagging process.
Taxonomies Defined
Taxonomies are sometimes called "classification schemes" or "categorization schemes." Each refers to grouping together similar items into broad "buckets" or "topics" which themselves can be grouped together in ever-broader "hierarchies." Examples of taxonomies include systems as diverse as the Dewey Decimal system found in small libraries, Yahoo’s Subject Index, and the massive taxonomic system proposed by Linnaeus used by generations of biology students. Wherever they are used, they have the same goal -- to organize knowledge about a given subject.
[Figure: A sample taxonomy from NewsEdge]
Taxonomies and The Search Process
Perhaps the greatest benefit to taxonomies is improved searching.
Properly constructed taxonomies simplify the process of gathering "the right" information for daily business use by simplifying the vocabulary used in the search process. Tagging systems using raw key words or similar strategies are likely to generate search error rates approaching that of straight text searches. For example, while a search on the word "DSL" will find stories on a particular type of broadband technology, it will miss others, and may accidentally find content referring to Dutch sign language or Data SubLanguage.
A better approach would be to define these documents as belonging to the subject category, "Digital Subscriber Line." If the searcher can focus on a proven set of categories rather than guess at keywords, the chances of finding the right content are far greater, and the process will be faster and more reliable.
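
The "DSL" example can be made concrete with a toy comparison between a raw keyword match and a lookup by assigned subject category. The documents, titles, and function names below are invented for illustration only.

```python
# Hypothetical documents, each tagged with one subject category.
DOCUMENTS = [
    {"title": "DSL rollout reaches rural towns",
     "category": "Digital Subscriber Line"},
    {"title": "Broadband over copper lines gains speed",
     "category": "Digital Subscriber Line"},
    {"title": "DSL interpreters in Amsterdam",
     "category": "Dutch Sign Language"},
]


def keyword_search(term):
    """Naive text search: match the term anywhere in the title."""
    return [d["title"] for d in DOCUMENTS if term.lower() in d["title"].lower()]


def category_search(category):
    """Taxonomy search: match on the assigned subject category."""
    return [d["title"] for d in DOCUMENTS if d["category"] == category]


print(keyword_search("DSL"))
# ['DSL rollout reaches rural towns', 'DSL interpreters in Amsterdam']
print(category_search("Digital Subscriber Line"))
# ['DSL rollout reaches rural towns', 'Broadband over copper lines gains speed']
```

The keyword search both misses a relevant story and returns an irrelevant one; the category search does neither, provided the documents were categorized well in the first place.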
The most important contribution of taxonomies to the search process is that they work.
Even using a relatively primitive taxonomic system, Microsoft reported a 40% improvement in hit rates. Satisfaction metrics doubled. In addition, the time spent trying to find a given document was significantly reduced. The success rate of taxonomic-based searching reduces the strain on systems and on the people who use them.
Business is Complicated
Naturally, one of the most important criteria for a taxonomy is that it should be easy to navigate. But building solid taxonomies is much easier said than done. Consider, for example, a taxonomy of business subjects.
Businesses vary in size and have multiple points of focus. Business activities involve an array of subjects that do not always fall into logical groupings. Subject boundaries are often fuzzy.
Subject hierarchies can feel artificial, as content, particularly business critical content, may fall into multiple categories. Indeed, most executive-level business documents involve several categories. Traditional indexing schemes dissolve in complexity as the number of unique concepts grows.
So, while some subjects are relatively easy to categorize, most business functions are not. (I should know: NewsEdge has spent several years developing a proprietary business taxonomy). Nevertheless, you should seriously consider developing a taxonomy for the content management system residing underneath your e-business efforts in general, and your Intranet in particular. Your content contributors and end-users alike will be grateful.

About the Author
Jack Bryar is a Practice Leader in the Knowledge Management Consulting and Editorial Services unit of NewsEdge Corporation. Bryar has helped NewsEdge clients to determine their information needs in industries ranging from energy production to international banking. Prior to NewsEdge, Bryar led the I.T. Practice of a corporate advisory services consultancy based in Toronto, Canada.
Taxonomy Design Types Barbara Blackburn 5/31/2006 So you've decided...

Mar 13, 08

Classification Taxonomies Information_Management

Taxonomy Design Types

Barbara Blackburn
5/31/2006

So you’ve decided you need a taxonomy to categorize and organize your documents and records. But how do you decide what type of taxonomy to design? The type of taxonomy you choose is as important as the taxonomy itself. If the design doesn’t meet the needs of the users, it will not be used.
Taxonomy Types
Taxonomies are usually hierarchical. Categories (nodes) in the hierarchy progress from general to specific. Each subsequent node is a subset of the higher level node. There are three basic types of hierarchical taxonomies: subject, business-unit, and functional.
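The general-to-specific structure described above can be sketched as a small tree. This is a minimal illustration, not part of the original article: the `Node` class and `path_to` helper are hypothetical names, and the category labels are borrowed from the medicine sample shown later in this piece.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One category (node) in a hierarchical taxonomy."""
    name: str
    children: list[Node] = field(default_factory=list)

    def add(self, name: str) -> Node:
        """Attach a more specific category beneath this one and return it."""
        child = Node(name)
        self.children.append(child)
        return child


def path_to(root: Node, target: str, trail: tuple[str, ...] = ()) -> tuple[str, ...] | None:
    """Return the general-to-specific path from root to target, or None."""
    trail = trail + (root.name,)
    if root.name == target:
        return trail
    for child in root.children:
        found = path_to(child, target, trail)
        if found is not None:
            return found
    return None


# Illustrative hierarchy: each child is a subset of its parent.
medicine = Node("Medicine")
internal = medicine.add("Internal medicine")
medicine.add("Pathology")
internal.add("Psychosomatic medicine")

print(" > ".join(path_to(medicine, "Psychosomatic medicine")))
# prints: Medicine > Internal medicine > Psychosomatic medicine
```

Each node knows only its own children, so the same structure could hold a subject, business-unit, or functional hierarchy; only the labels change.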
A subject taxonomy uses controlled terms for subjects. The subject headings are arranged in alphabetical order by the broadest subjects, with more precise subjects listed under them. An example is the Library of Congress Subject Headings (LCSH) used to categorize holdings in a library collection (see the Subject Taxonomy Sample below). The Yellow Pages could be considered a subject taxonomy.
It is difficult to establish a universally recognized set of terms in a subject taxonomy. If users are unfamiliar with the topic, they may not know the appropriate heading under which to begin their search. For example, a person searches through the Yellow Pages for a place to purchase eyeglasses. They begin their search alphabetically by turning to the E’s and scanning for the term “eyeglasses.” Since there is no topic titled “eyeglasses,” the person consults the Yellow Pages index and finds the term “eyeglasses” there. The index entry lists preferred terms, or a “see also” reference, directing the person to “optical – retail” for a listing of eyeglass businesses.
LIBRARY OF CONGRESS SUBJECT HEADINGS
H -- SOCIAL SCIENCES
J -- POLITICAL SCIENCE
K -- LAW
L -- EDUCATION
M -- MUSIC AND BOOKS ON MUSIC
N -- FINE ARTS
P -- LANGUAGE AND LITERATURE
Q -- SCIENCE
R -- MEDICINE
- Subclass RA Public aspects of medicine
- Subclass RB Pathology
- Subclass RC Internal medicine
-- RC31-1245 Internal medicine
-- RC49-52 Psychosomatic medicine
-- RC251 Constitutional diseases (General)
-- RC254-282 Neoplasm. Tumors. Oncology
Subject Taxonomy Sample
COUNTY GOVERNMENT BUSINESS-UNIT TAXONOMY
Assessor
Building
Commissioners
Coroner
District Attorney
Finance
Health and Environment
Human Resources
Human Services
Motor Vehicle
Clerk and Recorder <------------ Department
- Elections <------------ Divisions
- Motor Vehicle
- Recording
-- TD1000 <-------- Records
-- Warranty Deed
-- Quit Claim Deed
-- Subdivision Plat
Sheriff
Treasury
Business-Unit Taxonomy Sample
In both examples (LCSH and Yellow Pages), the subject taxonomy is supported by a thesaurus. A thesaurus is a controlled vocabulary that includes synonyms, related terms, and preferred terms. In the case of the Yellow Pages, the index functions as a basic thesaurus.
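The role of the thesaurus can be sketched in a few lines: entry terms a user might try are mapped to the preferred term under which listings are actually filed. This is a minimal sketch under stated assumptions. Only the "eyeglasses" to "optical - retail" mapping comes from the Yellow Pages example; the other synonyms, the listing name, and the function names are hypothetical.

```python
# Hypothetical thesaurus: entry term -> preferred term.
# Only "eyeglasses" comes from the article; the other synonyms are assumed.
PREFERRED_TERM = {
    "eyeglasses": "optical - retail",
    "glasses": "optical - retail",
    "spectacles": "optical - retail",
}

# Hypothetical listings, filed under preferred terms only.
LISTINGS = {
    "optical - retail": ["Example Eyewear Shop"],
}


def resolve(term: str) -> str:
    """Map a search term to its preferred heading, or return it unchanged."""
    key = term.strip().lower()
    return PREFERRED_TERM.get(key, key)


def search(term: str) -> list[str]:
    """Look up listings under the preferred heading for the given term."""
    return LISTINGS.get(resolve(term), [])


print(search("Eyeglasses"))  # prints: ['Example Eyewear Shop']
```

A term with no thesaurus entry falls through unchanged, so a search on an unknown heading simply returns no listings rather than failing.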

In a business-unit taxonomy, the hierarchy reflects the organizational chart (e.g., department/division/unit). Records are categorized based on the business unit that manages them. The Business-Unit Taxonomy Sample above shows partial detail of one node of a business-unit taxonomy that was developed for a county government.
One advantage of a business-unit taxonomy is that it mimics most existing paper-filing system schemas. Therefore, users are not required to learn a “new” system. However, conflicts arise when documents are managed or shared among multiple business units. For example, in the county government example above, a property transfer document called the “TD1000” is submitted to the recording office for recording and then forwarded to the assessor for property tax evaluation processing. This poses a dilemma as to where to categorize the “TD1000” in the taxonomy. Another issue arises with organizational changes. When the organizational structure changes, the business-unit taxonomy has to change as well.
In a functional taxonomy, records are categorized based on the functions and activities that produce them (function/activity/transaction). The organization’s business processes are used to establish the taxonomy. The highest or broadest level represents the business functions. The next level down the hierarchy constitutes the activities performed for the function. The lowest level in the hierarchy consists of the records that are created as a result of the activity (e.g., transactions).
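The three levels (function/activity/transaction) can be sketched as a nested mapping. This is an illustration only: the function and activity names below are hypothetical, chosen to resemble a regulatory agency, and `classify` is an assumed helper name rather than anything from a real records system.

```python
from __future__ import annotations

# Hypothetical functional taxonomy: function -> activity -> record types.
FUNCTIONAL_TAXONOMY = {
    "Permitting": {
        "Application review": ["Permit Application", "Permit"],
    },
    "Enforcement": {
        "Violation response": ["Notice of Violation", "Consent Decree"],
    },
}


def classify(record_type: str) -> str | None:
    """Return the function/activity/transaction path for a record type."""
    for function, activities in FUNCTIONAL_TAXONOMY.items():
        for activity, record_types in activities.items():
            if record_type in record_types:
                return f"{function}/{activity}/{record_type}"
    return None


print(classify("Consent Decree"))
# prints: Enforcement/Violation response/Consent Decree
```

Because the path names a function and an activity rather than a department, a reorganization that moves the work to a different business unit would not change where the record is filed.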
The example above shows partial detail of a functional taxonomy developed for a state government regulatory agency. The agency organizational structure is based on regulatory programs. Within the program areas are similar (repeated) functions and activities (e.g., permitting, compliance, and enforcement). When the repeated functions and activities are universalized, the result is a “flatter” taxonomy. This type of taxonomy is better suited to endure organizational shifts and changes. In addition, the process of universalizing the functions and activities inherently results in broader and more generic naming conventions. This provides flexibility when adding new record types (transactions) because there will be fewer changes to the hierarchy structure.
One disadvantage of a functional taxonomy is its inability to address case files (or project files). A case file is a collection of records that relate to a particular entity, person, or project. The records in the case file can be generated by multiple activities. For example, at the regulatory agency, enforcement files are maintained that contain records generated by enforcement activities (Notice of Violation, Consent Decree, etc.) and by other ancillary but related activities such as contracting, inspections, and permitting. To address the case file issue at the regulatory agency, metadata cross-referencing was used to provide a virtual case-file view of the records collection.
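One way to picture a virtual case-file view is sketched below: each record stays filed under its own function, and a shared metadata field ties related records together at query time. The article does not describe how the agency implemented its cross-referencing, so the `case_id` field, the sample records, and the `case_file_view` helper are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical records: each remains filed under its own function and
# carries a case_id metadata field used only for cross-referencing.
RECORDS = [
    {"title": "Notice of Violation", "function": "Enforcement", "case_id": "CASE-001"},
    {"title": "Inspection Report", "function": "Inspections", "case_id": "CASE-001"},
    {"title": "Permit", "function": "Permitting", "case_id": "CASE-001"},
    {"title": "Consent Decree", "function": "Enforcement", "case_id": "CASE-002"},
]


def case_file_view(records):
    """Group record titles by case_id without moving them in the taxonomy."""
    view = defaultdict(list)
    for record in records:
        view[record["case_id"]].append(record["title"])
    return dict(view)


for case_id, titles in case_file_view(RECORDS).items():
    print(case_id, titles)
# prints: CASE-001 ['Notice of Violation', 'Inspection Report', 'Permit']
# prints: CASE-002 ['Consent Decree']
```

The grouping is computed on demand, so the functional hierarchy itself is untouched; the case file exists only as a view over the metadata.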
Which Taxonomy Type Should You Use?
Each taxonomy type has its pros and cons. In most cases, a hybrid approach combining the taxonomy types is the most appropriate.
In choosing a taxonomy type, consider the following:
- How is your organization structured, and how do the business units function and interact?
- What are the needs of the users (both internal and external users)? Will you need multiple “views” or methods for records searching and categorization?
- Where will the taxonomy be applied and what are the operating parameters or limitations of those systems? (electronic content or records management system, paper files, shared network drive, website, etc.)

Barbara Blackburn is a senior consultant with IMERGE Consulting (www.imergeconsult.com). She specializes in document and records management technology planning, selection, and deployment. Ms. Blackburn is also an instructor for AIIM’s ERM Certificate program. Contact her at barbb@imergeconsult.com.

Data Stewardship Program Manager

Data Stewardship Project Manager

Subject Matter Managers (SMM)

Data Definition Stewards (Definers)

The Data Definition Steward is primarily a business role; however, there will be opportunities for technology-focused individuals in IT infrastructure asset management to put business definitions to technology-based data.

Data Production Stewards (Producers)

Data Usage Stewards (Users)
