IBM's Watson: Has the Time Come for Expert Systems in Medicine?

March 12, 2012

Effective technology that can help physicians make decisions and improve the coordination of care without being disruptive would indeed constitute a major advance.

IBM scored a huge technical and public relations coup when its Watson computer appeared on Jeopardy. Over the years, IBM has poured millions into developing the hardware and software, not to mention the time and effort involved in accumulating data, vetting it, and loading it into Watson's databanks so that Watson would be prepared to respond on Jeopardy.

It is unlikely that IBM committed to this project simply to score a PR coup. Last week the company began to reveal a commercial vision for Watson. Not surprisingly, the target is medicine. Most of what was presented in the flurry of publicity last week had to do with IBM’s vision that Watson, loaded with medical knowledge, could transform medicine as we know it. It could also turn out to be no more than a curiosity. The outcome depends on whether Watson is applied to tasks with which doctors actually need and want assistance, and on the extent to which care is improved.

IBM has suggested that Watson would be enabled to supplement its databanks with information pulled from the Internet in real time. This not only sounds sexy but probably also reflects the true cost of preparing knowledge for Watson. It carries the implication, however, that data from the Internet potentially has the same value as expert knowledge that has been thoroughly vetted.

Crowd-sourcing (aka the wisdom of crowds) is a fancy name for making decisions based on anecdotal experience rather than on statistically valid samples studied under controlled conditions. Pulling information from the Internet in real time sounds a lot like crowd-sourcing.

Crowd-sourcing is not always appropriate. It has been shown to be effective for tasks that require a quantitative answer, such as estimating the number of jelly beans in a jar. A crowd of people, each making an independent estimate, will produce an average result that is more likely to be accurate than the estimate of any one person, but this is no surprise. It is a predictable and provable result of statistics and probability theory. It says nothing about the quality of knowledge possessed by the crowd.
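The statistical point can be illustrated with a small simulation. This is only a sketch under stated assumptions: the jar size, the crowd size, and the spread of the guesses are made-up numbers, and every guess is assumed to be independent and unbiased (centered on the true count). The function name `simulate_crowd` is hypothetical.

```python
import random
import statistics

def simulate_crowd(true_count=1000, crowd_size=500, spread=300, seed=0):
    """Simulate independent guesses about the number of beans in a jar.

    Assumption: each guess is the true count plus Gaussian noise, so the
    guesses are unbiased and independent. All numbers are illustrative.
    """
    rng = random.Random(seed)
    guesses = [rng.gauss(true_count, spread) for _ in range(crowd_size)]

    # Error of the crowd's averaged answer.
    crowd_error = abs(statistics.mean(guesses) - true_count)

    # Average error of a single member of the crowd.
    individual_error = statistics.mean(abs(g - true_count) for g in guesses)

    return crowd_error, individual_error

crowd_error, individual_error = simulate_crowd()
print(f"error of the crowd average: {crowd_error:.1f}")
print(f"average individual error:   {individual_error:.1f}")
```

The averaged guess can never be further from the truth than the average individual guess, because the individual errors partly cancel. That cancellation only helps when the errors are independent and not skewed in one direction. If everyone in the crowd shares the same misconception, averaging simply reproduces it.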

On the other hand, there is little evidence that crowd-sourcing can produce meaningful results when no one in the crowd possesses expert knowledge about a question that requires such knowledge. For example, there is a difference between everyone “knowing” that apples fall from trees to the ground and grasping the full significance of Newton's theory of gravitation, which is that it is possible for forces to act at a distance with no obvious agency or mechanism for transmitting those forces. The concept has such profound implications, and is so contrary to human intuition, that Einstein devoted an entire lecture to the subject in 1920. If there were no gravity, in the Newtonian sense, then perhaps apples might fall from trees but stones on the ground might rise into the air. The crowd would still “know” about falling apples, but that “knowledge” could not be generalized or applied to a novel situation with any certainty.

The value of an expert system is greatest when its recommendations are consistently appropriate. Only then can users develop the confidence necessary to accept the decision or recommendation of the system. Should IBM choose to mingle expert, well-vetted information with Internet-derived information, which may be little more than hearsay, the consistency and accuracy of advice would always be open to question. It would be incumbent on Watson to inform the user whenever a recommendation was based on anything other than validated, expert knowledge, and it would be incumbent on the physician to be skeptical of every recommendation. Certainly, changing the treatment of a patient because a powerful computer crowd-sourced the Internet for a recommendation would be inappropriate.

Effective technology that can help physicians make decisions and improve the coordination of care without being disruptive would indeed constitute a major advance. Anticipating the possibility that such a goal may be realized, we must establish the ground rules for acceptance. First, any such system must be thoroughly vetted by independent authorities. Second, we must insist on total transparency; this is no place for proprietary code or data. Third, no program should be accepted that is restricted to running on only one vendor’s platform; remember that the lack of interoperability is considered to be one of the major shortcomings of existing EHRs. And finally, no expert system should be accepted that is not fully documented; the sources of information, the validity of that information, and the program logic used to derive each decision or recommendation must be available for public scrutiny.

If IBM succeeds in hitting a “home run” with Watson and produces something revolutionary that improves outcomes or patient safety, the product cannot remain something from which only the patients of IBM customers can benefit. It must be equally accessible to every practitioner at every patient encounter regardless of which EHR he or she uses.

Find out more about Dan Essin and our other Practice Notes bloggers.