EHRs Have Scientific Value, but are Physicians Gathering Information?

December 9, 2013

It's time for a new paradigm for physicians that focuses on information gathering, not just data gathering.

"Bad times have a scientific value. These are occasions a good learner would not miss."

― Ralph Waldo Emerson

One of the things we have learned is that creating a medical record is not a math problem; it's more like history and archaeology.

In the so-called "soft sciences," the relationship between data and information is reversed. Information comes first and, if faithfully captured and preserved, data can be derived from it. From their inception, computers were designed and built with one task in mind - calculation - which is why they are called computers and not documenters or informaters.

Being the only thing of their kind, computers have been coerced into performing myriad tasks that their inventors never imagined. The computing paradigm has been absorbed by non-technical people, causing them to view the world through a computational lens. This is why people find it so hard to accept that data associated with medical events (raw values stored in database fields) is not information. The isolated values lack properties that are necessary to make them informative.

This matters because in healthcare organizations, individuals come and go while the "data" remains. It ceases to be fully informative when the people aware of its context move on. Once the institutional memory has been rebooted, resources are spent keeping the data around, even though its meaning can no longer be ascertained. That's why I contend that raw data is fundamentally meaningless.

Certain elements are necessary in order for an item of information to accurately signify what its producer (the author) intended. These elements include the units of measure, a representation of its degree of accuracy and precision, other contextual modifiers, and the external references (lexicons, etc.) necessary to make the meaning clear in the future.
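The contrast can be sketched in code. This is a minimal illustration, not any real EHR schema: the field names are hypothetical, and the LOINC code is included only as an example of the kind of external reference the text describes.

```python
from dataclasses import dataclass

# A raw value as it might sit in a database field: no units,
# no precision, no context. By itself it signifies nothing.
raw_value = 98.6

@dataclass
class Observation:
    """A value carried with the elements that make it informative."""
    value: float
    units: str        # units of measure
    precision: float  # degree of accuracy, e.g. +/- 0.1
    context: dict     # contextual modifiers: site, method, patient state
    reference: str    # external lexicon entry that fixes the meaning

temp = Observation(
    value=98.6,
    units="degF",
    precision=0.1,
    context={"site": "oral", "state": "resting"},
    reference="LOINC 8331-1",  # illustrative code for an oral temperature
)

# Data can always be distilled from the information...
print(temp.value)  # 98.6
# ...but the raw value alone cannot tell a future reader that this
# was an oral temperature in degrees Fahrenheit.
```

The point of the sketch is the asymmetry: dropping from `Observation` to `raw_value` is trivial, while going the other way requires knowledge that the database no longer holds.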

Creating information that faithfully represents reality is hard. It requires skill and knowledge to choose those details necessary to convey meaning accurately and unambiguously. It can be even harder to choose words that don't mislead. Probing deeper, how is it that people can ever find data (that is, naked and devoid of modifiers, metadata, etc.) or chart entries informative?

Here's how. Consider a typical chart note written in 1981 by my wife's first partner in practice:

The date (stamped in the chart by the clerk)

Otitis

Ampicillin

(scrawled signature)

Is that data? No. Can it be turned into data? Not really. We're pretty sure that it means otitis media, given the Rx, but we can't be sure. It might have been otitis externa with the Rx given for an undocumented co-occurring condition. Coding it (collecting the data) as "otitis media, NOS" would be a stretch. Which ear? Don't know. How many days of treatment? Probably 10, but who could tell? Return for a recheck? We'll know if the patient appears again. Do we need to flag this for a recall? Don't know. Any other instructions? What about fever, signs of toxicity, pain, etc.? Don't know.
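To make the gap concrete, here is a sketch of what a coder could honestly extract from that note into structured fields. The field names are hypothetical; `None` marks what the note simply does not say.

```python
# What the 1981 note actually supports, field by field.
# None means the note is silent, not that the answer was "no".
note_as_data = {
    "diagnosis": "otitis",       # media or externa? the note doesn't say
    "laterality": None,          # which ear? unknown
    "medication": "ampicillin",
    "days_of_treatment": None,   # probably 10, but not recorded
    "recheck_planned": None,
    "other_instructions": None,  # fever, toxicity, pain? unknown
}

# Count how much of the "record" is actually missing.
missing = [k for k, v in note_as_data.items() if v is None]
print(f"{len(missing)} of {len(note_as_data)} fields unknown")  # 4 of 6
```

Most of the structure a database would demand is absent; filling those `None`s with plausible defaults would be fabrication, not data collection.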

Doctors use experience and shared knowledge to compensate for missing and uninformative data. People know their co-workers, the policies and procedures, and the workflow. They have used the same data-collection systems and know what compromises and fabrications are needed to complete the task at hand.

When people are shown "data" and reports, they fill in the blanks. They supply the context and infer the meaning. Having done all of that heavy lifting, it's easy to succumb to one or more of the cognitive fallacies that Daniel Kahneman describes in "Thinking, Fast and Slow," such as "The Illusion of Understanding" or "The Illusion of Validity." You basically trick yourself into believing that it was the data alone that led directly to your conclusions; the data suddenly seems like information.

Every researcher has felt the frustration of designing a "carefully controlled" experiment with carefully constructed data collection, only to overlook, or be accused of overlooking, some critical element - failing to collect that crucial piece of data, or wording questionnaires badly and eliciting responses that cannot be interpreted.

What EHR has had its data-collection processes designed and tested as thoroughly as the typical R-1 research study? I have yet to see one. Hospitals and vendors assign these tasks to novices, not to data scientists and statisticians, and never give it a second thought. That's why I assert that naked data elements, without context and metadata, are not information and have insufficient long-term value to justify the effort and expense that is currently being devoted to their collection.

Today's EHRs are the culmination of a 50-year experiment in applying computing machines to information gathering. There is a growing consensus that something is wrong and a prevailing opinion that the problem is us - that we are resistant, poorly organized and managed, unwilling to conform to standards. The real take-home message is that the computing paradigm doesn't fit the problem. Data can always be distilled from information, but given only data, information cannot be resurrected.

If EHR were a drug, the Phase III trial would have ended with the drug deemed neither sufficiently safe nor sufficiently effective to be approved for general use. It's time for a new paradigm that focuses on information gathering. It's time for a new experiment.