Selected Publications

For a more complete list of my publications, please see my Google Scholar page or my CV.

Yasha Ektefaie, George Dasoulas, Ayush Noori, Maha Farhat, Marinka Zitnik

arXiv 2022

Geometric Multimodal Representation Learning

Graph-centric artificial intelligence (graph AI) has achieved remarkable success in modeling interacting systems prevalent in nature, from dynamical systems in biology to particle physics. The increasing heterogeneity of data calls for graph neural architectures that can combine multiple inductive biases. However, combining data from various sources is challenging because appropriate inductive bias may vary by data modality. Multimodal learning methods fuse multiple data modalities while leveraging cross-modal dependencies to address this challenge. Here, we survey 140 studies in graph-centric AI and realize that diverse data types are increasingly brought together using graphs and fed into sophisticated multimodal models. These models stratify into image-, language-, and knowledge-grounded multimodal learning. We put forward an algorithmic blueprint for multimodal graph learning based on this categorization…Read more.

Yasha Ektefaie, Avika Dixit, Luca Freschi, Maha Farhat

The Lancet Microbe 2021

Globally diverse Mycobacterium tuberculosis resistance acquisition: a retrospective geographical and temporal analysis of whole genome sequences

Mycobacterium tuberculosis whole genome sequencing (WGS) data can provide insights into temporal and geographical trends in resistance acquisition and inform public health interventions. We aimed to use a large clinical collection of M tuberculosis WGS and resistance phenotype data to study how, when, and where resistance was acquired on a global scale. We did a retrospective analysis of WGS data. We curated a set of clinical M tuberculosis isolates with high-quality sequencing and culture-based drug susceptibility data (spanning four lineages and 52 countries in Africa, Asia, the Americas, and Europe) using public databases and literature curation. For inclusion, sequence quality criteria and country of origin data were required. We constructed geographical and lineage-specific M tuberculosis phylogenies and used Bayesian molecular dating with BEAST, version 1.10.4, to infer the most recent common susceptible ancestor age for 4869 instances of resistance to ten drugs…Read more.

Yasha Ektefaie, William Yuan, Deborah A. Dillon, Nancy U. Lin, Jeffrey A. Golden, Isaac S. Kohane & Kun-Hsing Yu

npj Breast Cancer 2021

Integrative multiomics-histopathology analysis for breast cancer classification

Histopathologic evaluation of biopsy slides is a critical step in diagnosing and subtyping breast cancers. However, the connections between histology and multi-omics status have never been systematically explored or interpreted. We developed weakly supervised deep learning models over hematoxylin-and-eosin-stained slides to examine the relations between visual morphological signal, clinical subtyping, gene expression, and mutation status in breast cancer. We first designed fully automated models for tumor detection and pathology subtype classification, with the results validated in independent cohorts (area under the receiver operating characteristic curve ≥ 0.950). Using only visual information, our models achieved strong predictive performance in estrogen/progesterone/HER2 receptor status, PAM50 status, and TP53 mutation status. We demonstrated that these models learned lymphocyte-specific morphological signals to identify estrogen receptor status…Read more.