Supporting clinical research with the MIMIC-III Critical Care Database

Monday, June 27, 2016


Critical care decisions in modern-day intensive care units (ICUs) are made every day by doctors who rely on decades of studies and clinical trials to identify the most effective treatments for their patients. Yet the amount of evidence supporting clinical decisions remains embarrassingly low. A recent study evaluated seven interventions in the ICU and found that only one provided actual patient benefit, with the others having no effect or even causing harm [1].

Thousands of medications and interventions are in regular use in the ICU, with an uncountable number of interactions among them. Further, patient demographics and genetics play a role in what treatments are optimal. Patients have their vital signs and serum blood values monitored frequently, sometimes continuously, and as a result the ICU environment is undeniably complex. The role of the care provider is to synthesize this deluge of data into a useful treatment course, and it is not an easy one.

A paucity of well-curated clinical data is often cited as a key challenge in conducting research [2]. There is a need for a new approach to knowledge generation: one that is efficient and can progress much faster than research has in the past.

Researchers at the MIT Laboratory for Computational Physiology (LCP) have taken an alternative approach to clinical research by freely releasing ICU data to researchers globally, with a goal of crowdsourcing the knowledge generation process [3]. Thanks to this work, researchers are able to test hypotheses using real data acquired from patients admitted to the Beth Israel Deaconess Medical Center in Boston, Massachusetts.

“One of the biggest challenges in healthcare research is accessing the data,” says Alistair Johnson, a postdoctoral associate at MIT’s Institute for Medical Engineering and Science (IMES) and author of a paper recently published in Scientific Data that describes the data. “You need ethics approval, assistance from the hospital IT department, technical knowledge, and clinical knowledge. We’ve taken care of all that.”

The database, Medical Information Mart for Intensive Care (MIMIC), houses data on over 40,000 patients admitted to ICUs at the Beth Israel Deaconess Medical Center since 2000. The data was de-identified to conform with the Health Insurance Portability and Accountability Act (HIPAA), and interested researchers must sign a data use agreement, promising, among other guarantees, not to use the data for any unlawful purpose. The data collected includes vital signs, medications, laboratory measurements, observations and notes charted by care providers, fluid balance, procedure codes, diagnostic codes, imaging reports, hospital length of stay, survival data, and more.

In addition to making the data available, the Laboratory has created an open code repository to allow researchers to collaboratively develop and reuse analytical code. Tom Pollard, a co-author on the paper, notes that when data and code are shared together, studies become completely reproducible, and says that collaboration is the key to advancing knowledge. “Sharing this code and data helps to advance the work of researchers around the world, from early-career students to highly experienced academics. Our belief is that together we can achieve much more than would be possible in closed groups.”

Alistair Johnson, Tom Pollard, Roger Mark


[1] Ospina-Tascón GA, Büchele GL, Vincent JL. Multicenter, randomized, controlled trials evaluating mortality in intensive care: doomed to fail? Crit Care Med 2008; 36:1311–1322. http://dx.doi.org/10.1097/CCM.0b013e318168ea3e

[2] Pisani E and Abou-Zahr C. Sharing health data: good intentions are not enough. Bulletin of the World Health Organization. Volume 88, Number 6, June 2010, 401-480. http://www.who.int/bulletin/volumes/88/6/09-074393.pdf

[3] Johnson AEW, Pollard TJ, Shen L, Lehman L, Feng M, Ghassemi M, Moody B, Szolovits P, Celi LA, and Mark RG. MIMIC-III, a freely accessible critical care database. Scientific Data (2016). http://www.nature.com/articles/sdata201635