Data Standardization: Short-term Decisions Become Long-term Decisions

October 18, 2010

"[H]ow shall we expect charity toward others, when we are uncharitable to ourselves? 'Charity begins at home,' is the voice of the world; yet is every man his greatest enemy, and, as it were, his own executioner." Sir Thomas Browne, 1642.

Like charity, the implementation of standards begins at home and we are our own worst enemies. Human beings are extremely territorial and a bit superstitious. Having tried something new, they are quick to develop a religious attachment to it. Were these nationwide or organization-wide rather than individual traits, standardization would have already happened. Europe and Asia, with their differing cultural traditions and propensities, have made more progress.

The first step toward standardization is to develop the belief that data is community property, not personal property. This step is hampered when people zealously believe that their approach to the data is the correct one and everyone else should convert to their religion. It is complicated by the NIH ("not invented here") Syndrome, in which people prefer anything they invented over anything else. Unfamiliarity with computer technology and data management principles, and even a lack of basic computer literacy, complicates matters further. Without the ability to envision potential solutions, the only obvious way to solve most data gathering problems is to "create a spreadsheet" or the equivalent, that is, to develop a proprietary, totally closed solution to the immediate problem at hand.

Data may actually be available elsewhere within the organization, but seeing no clear way to get or use it, people enter it again. Ad hoc solutions often have negative long-term consequences; the life-span of such solutions is limited and the data that is collected spoils rapidly. This behavior pattern has one advantage: people who lack the ability to create a more elegant solution and cannot wait for their IT department to get around to them actually get things done. To borrow the "survival of the fittest" analogy, these small-scale activities continue and multiply precisely because they are better adapted to the typical organizational ecosystem than other solutions that are more theoretically sound. While each of these projects may transiently thrive in its niche, from the organizational perspective the situation is chaotic, totally out of control, and a source of significant long-term costs and inefficiencies.

It is a principle of data system design that each data element of interest should exist in only one column of one table in one database within a natural boundary, e.g., an organization. Data stored in only one place may be incorrect, but it can never conflict with another copy. Each data element is assigned an owner who is responsible for assuring its accuracy.
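
To make the one-place principle concrete, here is a minimal sketch in Python using the standard sqlite3 module. The table and column names (patient, clinic_visit, date_of_birth) are hypothetical, chosen only for illustration. The date of birth lives in exactly one column; the visit table refers to the patient by key instead of keeping its own copy, so a correction made once by the owner is what every consumer sees.

```python
import sqlite3

# Hypothetical schema for illustration only; all names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (
        patient_id    INTEGER PRIMARY KEY,
        date_of_birth TEXT NOT NULL   -- the only place this element is stored
    );
    CREATE TABLE clinic_visit (
        visit_id   INTEGER PRIMARY KEY,
        patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
        visit_date TEXT NOT NULL      -- no copy of date_of_birth here
    );
""")
conn.execute("INSERT INTO patient VALUES (1, '1970-01-01')")
conn.execute("INSERT INTO clinic_visit VALUES (10, 1, '2010-10-18')")
conn.execute("INSERT INTO clinic_visit VALUES (11, 1, '2010-11-02')")


def visits_with_dob(connection):
    """Return (visit_id, date_of_birth) pairs, looking the date up by key."""
    return connection.execute("""
        SELECT v.visit_id, p.date_of_birth
        FROM clinic_visit AS v
        JOIN patient AS p USING (patient_id)
        ORDER BY v.visit_id
    """).fetchall()


# The owner corrects the value once, in its single home.
conn.execute(
    "UPDATE patient SET date_of_birth = '1970-01-02' WHERE patient_id = 1"
)
rows = visits_with_dob(conn)
```

After the single update, every visit row reports the corrected date, because there is no second copy that could disagree with the first.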

First, an organization has to adopt conventions. A database technology must be selected. A catalog of the data that is available must be developed and published. Those needing to create applications that use or add to the store of data must find it easier to acquire the tools and assistance required to make use of the existing data store than to go their own way. These activities must be encouraged, not marginalized or demonized.
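
A published catalog can be very simple. The sketch below, again in Python, is one possible shape and not a description of any particular product: the CatalogEntry and DataCatalog names, the fields, and the sample entries are all hypothetical. Each data element is registered with its single location and its owner, and an attempt to register a second home for the same element is refused.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Where one data element lives and who answers for its accuracy."""
    element: str
    database: str
    table: str
    column: str
    owner: str


class DataCatalog:
    """Hypothetical in-memory catalog: one entry per data element."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        existing = self._entries.get(entry.element)
        if existing is not None and existing != entry:
            # A second location for the same element would create a data island.
            raise ValueError(
                f"{entry.element} already lives in "
                f"{existing.database}.{existing.table}.{existing.column}"
            )
        self._entries[entry.element] = entry

    def find(self, element):
        """Return the entry for an element, or None if it is not cataloged."""
        return self._entries.get(element)


catalog = DataCatalog()
catalog.register(CatalogEntry(
    "patient.date_of_birth", "clinical", "patient", "date_of_birth",
    "Medical Records",
))
location = catalog.find("patient.date_of_birth")

# Someone about to build a separate copy is pointed at the existing one.
try:
    catalog.register(CatalogEntry(
        "patient.date_of_birth", "billing", "accounts", "dob", "Billing",
    ))
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

The point of the design is that looking up an existing element takes one call, which is less work than creating a new spreadsheet to hold it.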

The $64,000 questions are:

  • How could an organization encourage and assist those who would otherwise build small, isolated data islands, so that they instead channel their efforts in ways that share work and data and benefit the organization at large?
  • Can the existence of national standards help enough to justify the overhead they impose?

Daniel Essin, MA, MD, FAAP, FCCP, will be a regular contributor to the Practice Notes Blog. He has been a programmer since 1967 and earned his MD in 1974. He has worked at the Los Angeles County and USC Medical Center where he developed a number of internal systems, chaired the Medical Records Committee, and served as the director of medical informatics. His main research interests are electronic medical records, systems architecture, software engineering, database theory and inferential methods of achieving security and confidentiality in healthcare systems.