Date and time
Thursday, June 16, 2022, 3:00 PM Eastern Time (US and Canada)
Location

HMS Countway Library, Room 403, 10 Shattuck Street, Boston, MA 02115

and Zoom (see below for full information)

Few Shot Learning for Rare Disease Diagnosis

Rare diseases affect 300-400 million people worldwide, yet each disease has very low prevalence, affecting no more than 50 per 100,000 individuals. Many patients with rare genetic conditions remain undiagnosed due to clinicians' lack of experience with the individual diseases and the considerable heterogeneity of clinical presentations. Machine-assisted diagnosis offers the opportunity to shorten diagnostic delays for rare disease patients. Recent advances in deep learning have considerably improved the accuracy of medical diagnosis. However, much of the success thus far is contingent on the availability of large annotated datasets. Machine-assisted diagnosis of rare diseases presents unique challenges: approaches must learn from limited data and extrapolate beyond the training distribution to novel genetic conditions.

The goal of this thesis is to develop few-shot learning methods that can overcome the data limitations of deep learning approaches to diagnose patients with rare genetic conditions. Motivated by the need to infuse external knowledge into models, we first develop novel graph neural network methods for subgraph representation learning that encode how subgraphs (e.g., a set of patient phenotypes) relate to a larger knowledge graph. To address the issue of data scarcity, we next develop a framework for simulating realistic rare disease patients with novel genetic conditions and demonstrate that these simulated patients are similar to real rare disease patients. Finally, we leverage these advances to develop SHEPHERD, a few-shot method for diagnosing patients with rare genetic conditions in the Undiagnosed Diseases Network. SHEPHERD reasons over biomedical knowledge via geometric deep learning to learn generalizable representations of rare disease patients. SHEPHERD can support multiple facets of the rare disease diagnosis process: performing causal gene discovery, retrieving "patients-like-me" with the same causal gene or disease, and providing interpretable characterizations of novel disease presentations. Our work illustrates the potential for deep learning methods to rapidly accelerate molecular diagnosis and shorten the diagnostic odyssey for rare disease patients.

Thesis Supervisor:
Isaac Kohane, MD, PhD
Chair and Professor of Biomedical Informatics, HMS

Thesis Committee Chair:
Peter Szolovits, PhD
Professor of Computer Science and Engineering, MIT

Thesis Committee Member:
Marinka Zitnik, PhD
Assistant Professor of Biomedical Informatics, HMS

------------------------------------------------------------------------------------------------------

Zoom invitation –

Emily Alsentzer is inviting you to a scheduled Zoom meeting.

Topic: Emily Alsentzer PhD Thesis Defense
Time: Thursday, June 16, 2022 03:00 PM Eastern Time (US and Canada)

Your participation is important to us: please notify hst [at] mit.edu at least 3 business days in advance if you require accommodations in order to access this event.

Join Zoom Meeting
https://mit.zoom.us/j/96819116457?pwd=QUxrcHZuTys3T1BwWVFLajVTbXBRZz09

Password: EATHESIS

One tap mobile
+16465588656,,96819116457# US (New York)
+16699006833,,96819116457# US (San Jose)

Meeting ID: 968 1911 6457

US: +1 646 558 8656 or +1 669 900 6833

International Numbers: https://mit.zoom.us/u/aeImX04XXS

Join by SIP
96819116457 [at] zoomcrc.com

Join by Skype for Business
https://mit.zoom.us/skype/96819116457