Modeling the Tumor Microenvironment with Graph Concept Learners

Using concept learning and geometric deep learning to enable interpretable predictions from tumor tissue images.

Abstract

Heterogeneity is an emergent property of tumors, linked to cancer resistance and poor treatment outcomes (Marusyk et al., 2012). Geometric deep learning on graph representations has emerged as a promising approach to investigate tumor heterogeneity. However, these approaches suffer from limited interpretability and transferability (Rudin, 2019).

In this work, we propose a geometric deep-learning model with an enhanced interpretability framework to predict metadata from spatial datasets. Our approach draws inspiration from concept learning in computer vision and introduces concept graphs as a way of defining high-level postulates relevant to the task at hand (Cao et al., 2021).

Cell graphs are a graphical representation of a tissue, where nodes represent cells and edges denote adjacency between cells in the high-dimensional omic space. Concept graphs are cell graphs constructed on a subset of all cells, where the filtering that induces the subset is defined by an underlying postulate. We propose to model the tumor microenvironment by constructing concept graphs that represent relevant postulates, such as well-established cancer hallmarks (e.g., tumor-promoting inflammation), including only the cells that participate in the relevant interactions (e.g., tumor and immune cells).
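The construction above can be sketched in a few lines of numpy. This is a minimal, illustrative sketch, not the project's actual implementation: the function names, the distance-threshold adjacency criterion, and the cell-type filter are all assumptions chosen for clarity.

```python
import numpy as np

def build_cell_graph(profiles, radius):
    """Cell graph: nodes are cells, edges connect cells whose omic
    profiles lie within `radius` of each other (illustrative criterion)."""
    n = len(profiles)
    dist = np.linalg.norm(profiles[:, None, :] - profiles[None, :, :], axis=-1)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if dist[i, j] < radius]

def concept_subgraph(profiles, cell_types, keep_types, radius):
    """Concept graph: apply the postulate's filter (here, a set of cell
    types, e.g. tumor and immune cells), then rebuild the graph on the subset."""
    idx = [i for i, t in enumerate(cell_types) if t in keep_types]
    return idx, build_cell_graph(profiles[idx], radius)
```

Each concept thus yields its own smaller graph over the same tissue, which is what allows the downstream model to reason about one postulate at a time.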

Altogether, our approach decomposes each tumor sample into concept graphs, computes an embedding for each of them using a concept-specific graph neural network, and finally integrates all embeddings in a downstream prediction engine via an attention mechanism (Vaswani et al., 2017). This mechanism highlights the relevance of each concept for a given prediction, thereby opening an avenue for interpretation.
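The two-stage pipeline can be sketched with plain numpy: one mean-aggregation message-passing layer standing in for each concept-specific GNN, and a softmax attention step that pools the concept embeddings while exposing per-concept weights. All names, the single-layer GNN, and the dot-product scoring are simplifying assumptions, not the model actually used in the work.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gnn_embed(features, edges, W):
    """Stand-in for a concept-specific GNN: one mean-aggregation
    message-passing layer, then mean pooling to a graph-level embedding."""
    n = features.shape[0]
    agg, deg = features.copy(), np.ones(n)
    for i, j in edges:                       # symmetric neighbor aggregation
        agg[i] += features[j]; agg[j] += features[i]
        deg[i] += 1; deg[j] += 1
    h = np.maximum((agg / deg[:, None]) @ W, 0)   # linear transform + ReLU
    return h.mean(axis=0)

def attend(concept_embeddings, query):
    """Attention over concept embeddings: the weights quantify how much
    each concept contributes to the pooled representation."""
    weights = softmax(np.array([e @ query for e in concept_embeddings]))
    pooled = sum(w * e for w, e in zip(weights, concept_embeddings))
    return pooled, weights
```

In the real model, `pooled` would feed a prediction head for the clinical metadata, and inspecting `weights` per sample is what makes the prediction interpretable at the level of concepts rather than pixels or nodes.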

The novelty of our approach lies in disentangling complex tumor graphs into interpretable concept graphs, allowing us to explore the relevance of inputs in terms of high-level concepts instead of low-level features like pixels or nodes in a graph. We applied our framework on a publicly available breast cancer imaging mass cytometry dataset (Jackson et al., 2020), achieving highly accurate and interpretable predictions of several clinically relevant metadata.

With spatial omics and multiplexed imaging data becoming increasingly popular, we hope that our approach will offer an interpretable and transferable framework that can assist in identifying cancer-related mechanisms and their relevance in clinical predictions.

Schematic representation of a small cell population and a superimposed cell graph.

Want to know more?

  • đź“– Thesis (will be published soon).
  • đź“ť Paper (coming soon!).
  • 🚀 Check out the GitHub repo.

References

2021

  1. Concept Learners for Few-Shot Learning
    Kaidi Cao, Maria Brbic, and Jure Leskovec
    Mar 2021
    arXiv:2007.07375 [cs, stat]

2020

  1. Imaging mass cytometry and multiplatform genomics define the phenogenomic landscape of breast cancer
    H. Raza Ali, Hartland W. Jackson, Vito R. T. Zanotelli, and 10 more authors
    Nature Cancer, Feb 2020

2019

  1. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
    Cynthia Rudin
    Nature Machine Intelligence, May 2019

2017

  1. Attention Is All You Need
    Ashish Vaswani, Noam Shazeer, Niki Parmar, and 5 more authors
    Dec 2017
    arXiv:1706.03762 [cs]

2012

  1. Intra-tumour heterogeneity: a looking glass for cancer?
    Andriy Marusyk, Vanessa Almendro, and Kornelia Polyak
    Nature Reviews Cancer, May 2012