2024 ARCHIVES

Cambridge Healthtech Institute’s 3rd Annual

Machine Learning Approaches for Protein Engineering

Balancing Theory with Practice

MAY 16 - 17, 2024 ALL TIMES EST

The incorporation of machine learning, AI tools, and generative as well as protein language models will have a tremendous impact on the field of protein engineering, but it has also sparked a heated debate over which methods save time and money and which instead add inefficiencies and uncertainty to the process. Historically, drug discovery and development processes have been fraught with inefficiencies due to the lack of predictive tools. Machine learning and AI have the capacity to completely change the way protein structures and biologics are predicted, discovered, designed, and optimized, but it remains imperative to evolve and adapt them for use in antibody and protein engineering. Join the esteemed faculty of the 3rd Annual Machine Learning Approaches for Protein Engineering conference at PEGS Boston to transform the process of antibody development and improve success rates.

Scientific Advisory Board:

M. Frank Erasmus, PhD, Head, Bioinformatics, Specifica, Inc.

Victor Greiff, PhD, Associate Professor, Oslo University Hospital

Maria Wendt, PhD, Global Head and Vice President, Digital and Biologics Strategy and Innovation, Sanofi

Sunday, May 12

1:00 pm

Main Conference Registration

2:00 pm

Recommended Pre-Conference Short Course

SC3: In silico and Machine Learning Tools for Antibody Design and Developability Predictions

*Separate registration required. See short course page for details.

Thursday, May 16

7:30 am

PANEL DISCUSSION: Fostering Mentorship and Company Culture for the Advancement of Gender Equity: IN-PERSON ONLY
(Continental Breakfast Provided)
Co-Organized with Thinkubator Media

PANEL MODERATOR:

Lori Lennon, Founder & CEO, Thinkubator Media

Advancing gender equity in the workplace is an effort that requires mentorship, shifts in company culture, and investment from all levels of an organization. Join us for a robust and insightful conversation on how companies can foster quality mentorship, create team-based success models, and develop meaningful and measurable commitments to DEI, and on how this important work can greatly benefit an organization and its goals.

PANELISTS:

Tom Browne, Director of Diversity, Equity, & Inclusion, MassBio

Sheila Phicil, Equity Architect, Director of Innovation, Health Equity Accelerator, Boston Medical Center (BMC)

Nicole Renaud, PhD, Director, Global Co-Lead of Human Genetics and Targets, Discovery Science, Biomedical Research, Novartis

Kerry Robert, Senior Vice President, Head of People & Culture, Entrada Therapeutics

Minmin (Mimi) Yen, PhD, CEO & Co-Founder, PhagePro Inc.

7:30 am

Registration and Morning Coffee

8:45 am

Chairperson's Remarks

M. Frank Erasmus, PhD, Head, Bioinformatics, Specifica, Inc.

8:50 am

KEYNOTE PRESENTATION: Why Does the Virus Change Its Spots? The Role of Immunodominance in Viral Evolution

Stephen J. Elledge, PhD, Gregor Mendel Professor, Genetics & Medicine, Harvard Medical School

Despite the vast diversity of the antibody repertoire, infected individuals often mount antibody responses to precisely the same epitopes within antigens through unknown mechanisms. By mapping 376 immunodominant “public epitopes” at high resolution and characterizing several of their cognate antibodies, we conclude that germline-encoded sequences in CDR1 and CDR2 in antibodies drive recurrent recognition. Systematic analysis of antibody–antigen structures uncovered 18 human and 21 mouse germline-encoded amino acid-binding (GRAB) motifs within heavy and light V gene segments that are critical for public epitope recognition. GRAB motifs represent a fundamental component of the immune system’s architecture that ensures antibody recognition of pathogens and promotes species-specific reproducible responses that can exert selective pressure on pathogens.

9:20 am

De novo Peptide Sequencing with InstaNovo: Accurate, Database-Free Peptide Identification for Large-Scale Proteomics Experiments

Timothy Patrick Jenkins, PhD, Assistant Professor & Head, Data Science, DTU Bioengineering

InstaNovo, a cutting-edge transformer neural network, addresses the challenges of de novo peptide sequencing in mass spectrometry-based proteomics. Trained on 28M spectra, InstaNovo outperforms current state-of-the-art methods and showcases its utility in several applications. We further introduce InstaNovo+, a multinomial diffusion model that improves performance via iterative refinement. Together, these models unlock a plethora of opportunities across different scientific domains, including direct protein sequencing, immunopeptidomics, and dark proteome exploration.

1. How to improve accuracy of de novo peptide sequencing

2. What are the next big areas to be included in such models?

3. What can they already be used for?

9:50 am

Predicting Antibody Binders and Generating Synthetic Antibodies Using Deep Learning

David Johnson, PhD, Founder and CEO, GigaMune

First, we generated a large panel of binder and non-binder antibody sequences to the cancer immunotherapy targets PD-1 and CTLA-4. Next, we encoded the antibody light and heavy chain complementarity-determining regions (CDR3s) into antibody images, then built and trained convolutional neural network models to classify binders and nonbinders. We then built generative deep learning models, using generative adversarial network models to produce synthetic antibodies against PD-1 and CTLA-4.

10:20 am

Innovative Antibody Discovery Workflow Leveraging Machine Learning to Prioritize Leads

Crystal Richardson, Dr., Senior Business Partnership Manager, Azenta Life Sciences

Azenta now offers an innovative end-to-end antibody screening solution that guides your discovery program to more diverse leads while reducing liabilities for antibody development. Using next-generation sequencing of your in vivo samples (e.g., B cells, PBMCs) or in vitro libraries (e.g., phage display), a bioinformatics platform, and gene synthesis, antibodies are produced with promising biophysical profiles for commercialization.

10:35 am

Cutting Through the Hype: Real-World Applications of AI in Antibody Discovery and Engineering

Patrick Doonan, PhD, Director of Antibody Engineering, Antibody Discovery, Ailux Biologics

Artificial intelligence (AI) is transforming antibody discovery and engineering. Ailux's platform synergistically combines the best of wet lab and AI. We will explore a series of case studies that exemplify the applications of our AI-driven approach for tackling difficult GPCR targets, designing next-gen display libraries, predicting Ab-Ag complex structures and engineering challenging molecules. This presentation provides a realistic and evidence-based perspective on AI’s impact on the industry.

10:50 am

Coffee Break in the Exhibit Hall with Poster Viewing

11:00 am

Meet Fellow Women Scientists, Celebrate Successes, and Inspire the Future Generations of Female Leaders

Lori Lennon, Founder & CEO, Thinkubator Media

The Women in Science Meet-Up celebrates female trailblazers who are setting their own course in science. We invite all to come celebrate the successes of these women in breaking down barriers and inspiring future generations of female leaders. Come join fellow scientists and share your personal and professional journey.

Who or what inspires you to explore a career in science?
What fuels your imagination and spirit when you’re faced with challenges?
What is your proudest moment?
What can each of us do to improve things further?

11:50 am

Transition to Plenary Fireside Chat

12:00 pm

What Comes Next in Antibody Discovery and Engineering?

PANEL MODERATOR:

K. Dane Wittrup, PhD, C.P. Dubbs Professor, Chemical Engineering & Bioengineering, Massachusetts Institute of Technology

How significantly will domain antibodies supersede Fabs in antibody-like structures in the future? Considering the generally superior biophysical attributes of domain antibodies relative to Fabs, what advantages, aside from extensive clinical experience, do Fabs offer?

Is the field of antibody engineering nearing a point where it can be considered a solved problem? How frequently do we fail to discover a lead candidate that aligns with a realistic target product profile?

If we had access to a completely predictive computational method for antibody design, how would this quantifiably enhance the antibody discovery and optimization process? Would this truly revolutionize the field, especially considering the advanced experimental techniques we currently possess? Is there often (or ever) an atomically precise understanding of the exact structural epitope we aim for an antibody to target in order to achieve pharmacological benefit? Are there gaps in the existing experimental tools for developability optimization?

PANELISTS:

Paul J. Carter, PhD, Genentech Fellow, Antibody Engineering, Genentech

Daniel Chen, MD, PhD, Founder & CEO, Synthetic Design Lab

Jane K. Osbourn, PhD, CSO, Alchemab Therapeutics Ltd.

12:55 pm

Luncheon in the Exhibit Hall and Last Chance for Poster Viewing

2:20 pm

How Good are Machine Learning, Artificial Intelligence and Other Computational Methods in Antibody Discovery and Optimization?

Andrew R.M. Bradbury, MD, PhD, CSO, Specifica, Inc., a Q2 Solutions Company

Notwithstanding the claims, it is challenging to understand how effective ML/AI and other computational methods really are in antibody discovery and optimization. This talk will describe the launch of a blinded benchmarking competition using several Specifica NGS datasets which will be made available to the community. We hope this will become a regular CASP-like competition, with challenges of increasing difficulty.

2:35 pm

Improving Computational Models of Human Antibodies

Bryan Briney, PhD, Assistant Professor, Immunology & Microbial Science, Scripps Research Institute

Antibody language models are an emerging class of tools with the potential to revolutionize our ability to understand the linkage between antibody sequence, structure, and function. The paucity of natively paired antibody sequence datasets means most antibody language models have been trained exclusively using unpaired antibody sequences; however, we have shown that training with natively paired sequences allows models to learn critically important cross-chain features. This talk will explore the utility of training using natively paired antibody sequences and strategies for training such models in a data-limited environment.

3:05 pm

Building Next-Generation Platforms to Enable AI-Guided Biologics Design

Alicia Kaestli, PhD, Senior Associate, Flagship Pioneering

The fusion of AI with biologics design offers unprecedented potential in drug discovery. However, building platforms that enable AI-guided drug discovery is not a straightforward process. I’ll walk through how we’ve built and scaled new biotech platforms from the ground up for de novo biologics design.

3:34 pm

Chairperson's Remarks

M. Frank Erasmus, PhD, Head, Bioinformatics, Specifica, Inc.

3:35 pm

Learning Protein Fitness Models from Evolutionary and Assay-Labeled Data

Chloe Hsu, PhD, Co-Founder & CEO, Stealth

Machine learning-based models of protein fitness typically learn from either unlabeled, evolutionarily related sequences (such as in the case of protein language models) or variant sequences with experimentally measured labels. For regimes where only limited experimental data are available, we combine both sources of information in a simple combination approach that is competitive with, and on average outperforms, more sophisticated methods.

4:05 pm

LENSai: Empowering Diversity-Driven Discovery, Intelligent Lead Selection and Optimization

Arnout Van Hyfte, Head of Products & Platform, BioStrand, IPA (ImmunoPrecise Antibodies)

To select the best lead antibody for clinical use, it is crucial to have access to a highly diverse panel of lead candidates, matched with highly scalable down-selection and de-risking technologies. Leveraging our LENSai software, antibody discovery and optimization are accelerated through the integration of advanced ML and AI algorithms combined with experimental methodologies. Our pioneering LENSai software empowers cost-efficient development of next-generation antibodies with precision and speed.

4:20 pm

To B cell or not to B cell: Cellular Versus Serum Approaches for Antibody Discovery

Natalie Castellana, PhD, CEO, Abterra Biosciences

Serum antibodies are a key component of the phenotype assessed when screening immunized animals or patients. Alicanto is a technology for identifying antibodies from serum using machine learning. Alicanto integrates immunosequencing data with mass spectrometry measurements of serum antibodies to create a map of an individual’s immune response to challenge. In this talk we compare the B cell-derived repertoire versus serum antibody repertoire using Alicanto as well as phage display.

4:35 pm

Networking Refreshment Break

5:00 pm

The RESP AI Model Accelerates the Identification of Tight-Binding Antibodies

Wei Wang, PhD, Professor, Chemistry and Biochemistry, University of California San Diego

Deep learning techniques hold the potential to accelerate identification of effective antibodies, but existing methods cannot provide the confidence intervals or uncertainty estimates needed to assess the reliability of their predictions. Here we present a pipeline called RESP for efficient identification of high-affinity antibodies using uncertainty-aware machine learning methods.

5:30 pm

Meaningful Biological Priors as Guiding Constraints for Graph Neural Network-Based Antibody Developability Prediction

Pranav M. Khade, PhD, Postdoctoral Fellow, Prescient Design, Genentech

Antibody developability properties are dependent on the relative disposition of constituent amino acids. We develop a graph neural network with meaningful biological priors such as Delaunay-based adjacency and Kidera factors to build an efficient and explainable model to predict antibody developability.

6:00 pm

Deploying Synthetic Coevolution and Machine Learning to Engineer Protein-Protein Interactions

Aerin Yang, PhD, Basic Life Research Scientist, Molecular and Cellular Physiology, Stanford University

We present a platform for synthetic protein-protein coevolution, isolating diversely remodeled interacting pairs. This dataset enables comprehensive analysis of protein pairs, uncovering insights into structural diversity, affinities, cross-reactivities, and orthogonalities. Leveraging pretrained protein language models, we expand the amino acid diversity of our coevolution screen in silico, predicting remodeled interfaces. This integrated approach simulates protein coevolution, creating protein complexes with diverse recognition properties, benefiting biotechnology and synthetic biology.

6:30 pm

Close of Day

Friday, May 17

7:00 am

Registration Open

7:30 am

Interactive Roundtable Discussions with Continental Breakfast

Interactive Roundtable Discussions are informal, moderated discussions, allowing participants to exchange ideas and experiences and develop future collaborations around a focused topic. Each discussion will be led by a facilitator who keeps the discussion on track and the group engaged. To get the most out of this format, please come prepared to share examples from your work, be a part of a collective, problem-solving session, and participate in active idea sharing. Please visit the Interactive Roundtable Discussions page on the conference website for a complete listing of topics and descriptions.

TABLE 1: Engineering Novel Cytokine Functions through Experimental and Computational Approaches: IN-PERSON ONLY

Jamie B. Spangler, PhD, Associate Professor, Departments of Biomedical and Chemical & Biomolecular Engineering, Johns Hopkins University

Design parameters and objectives for cytokine engineering
Tailoring engineering strategies to various cytokine systems
Considerations for selecting the appropriate evolutionary and/or rational design strategies
Key factors in choosing a computational workflow for cytokine engineering

TABLE 2: De novo Peptide Sequencing: Applications and Opportunities: IN-PERSON ONLY

Timothy Patrick Jenkins, PhD, Assistant Professor & Head, Data Science, DTU Bioengineering

Which tools are out there and how good are they?
Which applications benefit most from de novo peptide sequencing?
What are current limitations and opportunities for further development?

8:25 am

Chairperson's Remarks

M. Frank Erasmus, PhD, Head, Bioinformatics, Specifica, Inc.

8:30 am

A Machine Learning Strategy for the Identification of in silico Descriptors and Prediction Models for IgG Monoclonal Antibody Developability Properties

Andrew B. Waight, PhD, Senior Director, Machine Learning, Discovery Biologics & Protein Sciences, Merck Research Labs

Identification of favorable biophysical properties for protein therapeutics as part of developability assessment is a crucial part of the preclinical development process. Successful prediction of such properties and bioassay results from calculated in silico features has the potential to reduce the time and cost of delivering clinical-grade material to patients. We have developed and implemented an automated machine learning workflow designed to compare and identify the most powerful features from computationally derived physicochemical feature sets. We demonstrate the use of this workflow with medium-sized datasets of IgG molecules to generate predictive regression models for key developability endpoints.

8:59 am

Chairperson's Remarks

Timothy Patrick Jenkins, PhD, Assistant Professor & Head, Data Science, DTU Bioengineering

9:00 am

KEYNOTE PRESENTATION: Launching into the Future: Sanofi’s Biologics AI Moonshot Program—Advancing AI Strategy and Innovation for Biologics

Yves Fomekong Nanfack, PhD, Executive Director, Head of End to End AI Foundations, Large Molecules Research, Sanofi

Sanofi recently launched the BioAIM program to push forward on our ambition to transform biologics drug discovery. This talk will discuss the landscape of opportunities for ML and AI in all aspects, from antibody generation to the design and engineering of advanced modalities, along with our approach, examples of novel methods developed, and early results.

9:30 am

De novo Cytokine Engineering to Probe and Manipulate Immune Biology

Jamie B. Spangler, PhD, Associate Professor, Departments of Biomedical and Chemical & Biomolecular Engineering, Johns Hopkins University

Cytokines are soluble factors that signal through stimulation of their cognate transmembrane receptors on target cells to perform critical biological functions, particularly those related to immune homeostasis. We combined best-in-class computational protein design software with directed evolution technologies to generate a de novo engineered cytokine mimetic with superior stability and unique biochemical activities compared to naturally derived cytokines. Collectively, our work pioneers a novel cytokine engineering platform that integrates computational and experimental approaches to create new molecules that expand the repertoire of protein function and inform the development of next-generation immunotherapeutics.

10:00 am

Designing Antibodies by Utilizing Large Language Models

Satoshi Tamaki, PhD, CEO/CSO, MOLCURE Inc.

Considering the challenges in de novo antibody discovery, MOLCURE has developed a platform that integrates AI, laboratory automation, and molecular biology experiments. In this presentation, we will demonstrate the functionality of our AI-designed antibodies, such as pM-level Kd values and diverse target epitopes. We will also propose generative AI approaches to design antibodies with desired functionality while requiring minimal experimental validation.

10:15 am

Presentation to be Announced

10:30 am

Networking Coffee Break

11:00 am

Design of Highly Functional Genome Editors by Modeling the Universe of CRISPR-Cas Sequences

Jeffrey Ruffolo, PhD, Machine Learning Scientist, Profluent Bio

Gene editing has the potential to solve fundamental challenges in agriculture, biotechnology, and human health. Using large language models (LLMs) trained on biological diversity at scale, we demonstrate the first successful precision editing of the human genome with programmable gene editors designed with AI. We release OpenCRISPR-1 for free and ethical usage; it showed comparable or improved activity and specificity relative to SpCas9 while being 400 mutations away in sequence.

11:30 am

Next-Generation Protein Language Model for Antibody Engineering and Discovery

Abhinav Gupta, PhD, Senior Machine Learning Scientist, Next-Generation Biologics Design, Sanofi

Current Ab-specific PLMs are limited in knowledge because they use only unpaired sequences and the traditional masked-language-modeling approach to self-supervised training, and are therefore not ideal for tasks such as affinity, thermal stability, and contact-map, epitope, and paratope prediction. We aim to alleviate this knowledge limitation in our Next-Gen PLM by infusing information from multiple internal and external data modalities and developing new training strategies to improve performance on these tasks and beyond.

12:00 pm

Close of Conference