Current Projects:
Building Virtual Cells and Virtual Tissues
We are pioneering the development of "Virtual Cells" and "Virtual Tissues," using AI to predict and control biological behavior from molecular to tissue scales. By integrating high-content single-cell data with spatial tissue architecture and machine learning, we aim to create foundational multi-scale models of biology that can effectively model perturbations—a critical capability that enables these models to learn causal relationships rather than just correlations. Our data collection focuses specifically on genetic and chemical perturbation experiments, filling a crucial gap as such systematic perturbation data remains scarce in the field. This innovative approach to understanding causality in biological systems has the potential to solve complex challenges in genetic disease, cancer, aging, and beyond. As part of this effort, we are collecting the largest single-cell and spatial perturbation datasets ever compiled to build the most comprehensive biology models ever created.
Building Health Superintelligence
We are developing advanced reasoning models that integrate multimodal and multiscale biomedical data to accelerate science and medicine at every level—from discovering new therapeutic targets and simulating drug efficacy/toxicity to reasoning about complex health patterns and enabling truly personalized medicine. Our approach leverages reinforcement learning, self-improving model architectures, and other cutting-edge AI techniques while simultaneously building better evaluation frameworks and benchmark datasets. This initiative includes strategic collaboration with Google to apply these approaches to the most advanced language models as well as open-source models, pushing the boundaries of what's possible in computational biomedicine.
Deciphering the human secretome for discovery of novel peptides ("Ozempic for X")
Combining pooled screening approaches, single-cell readouts, and machine learning methods, we are screening for new classes of peptide and protein drugs to treat diverse diseases and augment the human condition (e.g., neuromodulation). We are applying this approach in multiple contexts, including muscle aging, inflammatory responses, immune system rejuvenation, and neural circuit modulation.
Aging therapeutics, rejuvenation, and life span extension
We are applying molecular tools and approaches that our lab has developed over the past few years to define molecular signatures of aging and to rejuvenate diverse aged tissues and cell types, including hematopoietic stem cells, muscle, and the immune system.
Large language models for directed evolution and protein engineering
We are building the largest and most advanced models for protein design and engineering using generative foundation models and groundbreaking few-shot learning approaches. By integrating protein language models with active learning, we can accelerate the evolution of proteins, achieving dramatic improvements in protein function with minimal experimental rounds. This work paves the way for unparalleled advancements in protein engineering with applications across antibody therapies, genome editing, delivery, diagnostics, and sustainability/climate change.
Nucleic Acid Delivery
Efficient delivery of nucleic acids into cells beyond the liver is critical for developing new gene and cell therapies. Our lab is combining the natural biology of nanoparticles with protein engineering to develop programmable delivery solutions that target extra-hepatic tissues.
New gene editing tools
We have multiple projects creating novel systems to perturb and modify DNA and RNA. Through rational engineering methods, machine learning, and natural enzyme discovery, we aim to develop tools to unlock new classes of gene and cell therapies.
Contact: hello-abugoot [at] mit.edu | oabudayyeh [at] bwh.harvard.edu
Location: MIT and Longwood