Abstract:
Large datasets are being generated that can transform biology and medicine. New machine learning methods are necessary to
unlock these data and open doors for scientific discoveries. In this talk, I will argue that, in order to advance science,
machine learning models should not be trained in the context of one particular dataset. Instead, we should be developing
methods that can integrate rich, heterogeneous data and knowledge into multimodal networks, enhance these networks to reduce
biases and uncertainty, and learn over the networks.
My talk will focus on two key aspects of this goal: deep learning and network science for multimodal networks. I will first
show how we can move beyond prevailing deep learning methods, which treat network features as simple variables and ignore
interactions between entities. I will then present an algorithm that learns deep models by mapping multimodal networks
into compact embedding spaces whose geometry is optimized to reflect the interactions, the essence of multimodal networks.
These deep models open new frontiers, including the prediction of protein functions in specific human tissues,
modeling of drug combinations, and repurposing of old drugs for new diseases. Beyond such predictive ability, a hallmark of
science is to achieve a holistic understanding of the world. I will discuss how we can blend network algorithms with rigorous
statistics to harness biomedical networks at the scale of billions of interactions. These methods revealed, among other findings, how
Darwinian evolution changes molecular networks, providing evidence for a longstanding hypothesis in biology. In all studies, I
collaborated closely with experimental biologists and clinical scientists to gain insights and validate the predictions made by our
methods. I will conclude with future directions for contextual models of rich interaction data, which open up new avenues for science.