One year after its birth, the ultimate data parasite has come of age by latching on to a $3 billion host. It is big, blue, pretty good at Jeopardy, and if you were to believe the data detractors, utterly useless.

In a recent post on The Healthcare Blog regarding Watson Health’s acquisition of Truven Health Analytics, the authors, Drs. Koppel and Meissner, rant about why IBM’s very expensive parasitic spawn won’t yield any results. In a particularly snarky backhanded compliment, they write:

All data are abstractions and the “bad data” also reflect a reality. We may not like it, but it’s there. Thus, these data will predict things within this reality that we don’t care about (e.g., more redheads are slightly more likely to have heart attacks on Thursdays than would be expected by chance.) But, we can’t rule out the possibility that Watson will predict something we do care about. Science is full of serendipity, Watson’s logical crunching may lead to some discoveries that are useful.

The authors believe that even though Truven’s data suck, and the premise of using those data in analytical constructs is misguided, IBM might as well continue to analyze these data on the off chance that they find something useful. So, IBM, go ahead with your little project, you never know… you may get lucky and be the next Alexander Fleming!

I suppose this viewpoint might be considered enlightened compared to the editors at NEJM, who believe that performing data analytics is akin to doing harm.

My guess is that Drs. Koppel and Meissner’s goal wasn’t to spurn technological advances in healthcare data science, but to bring to light the data integrity issues that exist within claims and EHR data. To be fair, the data are messy, unstandardized, and incomplete. However, does that mean that these data should not be analyzed? Does it mean that no consequential healthcare discoveries can be made? Does it mean that Watson Health’s purchase of Truven is only a $3 billion shot in the dark?

Absolutely not. The authors’ views are shortsighted and regressive. They are the same arguments we have heard against using supposedly “unusable” datasets like the FDA Adverse Event Reporting System (FAERS) to help improve outcomes and lower costs in the healthcare system. The crux of their argument is that unless data are pristine going in, there can be no accurate insight coming out. Garbage in, garbage out. From our perspective, this view leaves out two fundamental aspects of the data optimization process.

First, the argument assumes that nothing happens between the time the data go into the database and the time they hit the analytics engine. At Advera Health, when we take in a dataset, there is an amazing amount of actual human work that needs to be done before the data are ready for analysis. In our latest endeavor to curate the results of clinical trials into a usable dataset, the raw data were obtained programmatically, but it took the manual work of a trained analyst team to correct errors and fill in blanks. Sure, we could have just taken the raw datasets and built a front end around them, but the result would have been incomplete and certainly misleading. The data needed to be annotated and properly optimized. Doing so allowed us to create powerful meta-analyses and comparison tools for healthcare decision makers.

Second, there is a reason why data companies employ not just data scientists, but really smart people with clinical and academic training to work closely with the data. To properly use the data, you need to understand both the clinical and academic applications and the intricacies of the data themselves. Top-tier healthcare companies work closely with their clients to ensure the data are applied within their limitations. Watson Health didn’t just buy a dataset that it will haphazardly feed into the supercomputer; it also bought the collective brain power of a lot of very smart people who know these data inside and out. By understanding the nuance of every limitation, they will be able to help correct for it and, rather than generating “misguided” insight as the authors suggest, make true advances in health outcomes.

Will Watson Health make advancements in healthcare, or will the big blue parasite simply get lucky? With money, scale, and smarts, I’d wager on the former.

As an aside, I think it would be a real PR milestone for Boston-based Watson Health if it were to conclude that Thursdays are dangerous for redheads.

To learn more about Advera Health's data optimization processes, visit


Jim Davis, EVP Advera Health Analytics, Inc. 




Topics: Drug Safety, FAERS

Written by Jim Davis

As Executive Vice President, Jim is responsible for the commercialization strategy for Advera Health Analytics.