## Overview

### The Workshop

This workshop focuses on recent advances in the diverse areas of computational topology.
The application of topological techniques to traditional data analysis, as well as the growing use of algorithmic topology in theoretical mathematics, has led to a boost in research in the topic.
At the same time, computational topology is closely connected to discrete and computational geometry, reflected by the fact that CG week is the major annual conference for both fields simultaneously.

### Main Themes

This year, the plenary talks at the workshop will emphasize connections between topology and biomedicine.

### Confirmed Plenary Speakers

Laxmi Parida (IBM and New York University)

Moo K. Chung (University of Wisconsin-Madison)

### Confirmed Invited Speakers

Yusu Wang (Ohio State University)

Rachel Neville (University of Arizona)

Kritika Singhal (Ohio State University)

Chao Chen (Stony Brook University)

Mathieu Carriere (Columbia University)

Samir Chowdhury (Stanford University)

Tamal Dey (Ohio State University)

### Format

The workshop will span two half days.
For each half day session, we plan to hold one plenary talk (1 hour, 45 minutes followed by 15 minutes discussions and questions) and 4 (or 5) invited talks (30 minutes, 25 minutes followed by 5 minutes discussions and questions).
There will be 30 minutes break after the first 2 talks.

### Intended Audience

The workshop primarily targets both the computational topological community represented at SoCG, general SoCG attendees interested in learning topological data analysis and its application in biomedicine, as well as researchers from biomedicine domain.
Our aim is to help both theoretical and applied communities exchange and learn from each other.

## Abstracts

Exact Topological Inference of the Resting-State Brain Network in Twins

Moo K. Chung

A cycle in a brain network is a subset of a connected component with
redundant additional connections. If there are many cycles in a connected
component, the connected component is more densely connected. While the
number of connected components represents the integration of the brain
network, the number of cycles represents how strong the integration is. However,
it is unclear how to perform statistical inference on the number of cycles in the
brain network. In this lecture, we present a new Exact Topological Inference
framework for determining the statistical significance of the number of cycles
through the Kolmogorov-Smirnov (KS) distance, which was recently introduced to
measure the similarity between networks across different filtration values using
the zeroth Betti number. We show how to extend the method to the first Betti
number. Using a twin imaging study, which provides biological ground truth, the
methods are applied in determining if cycles are heritable network features in the
resting-state functional brain networks of 217 twins. This talk is based on a paper
of the same title: doi.org/10.1162/netn_a_00091. The MATLAB codes as well as
the connectivity matrices used in the paper are freely available at
www.stat.wisc.edu/~mchung/TDA.

Topological methods in neuronal data analysis

Yusu Wang

The persistent homology is a fundamental development in the field of topological data analysis in the past two decades. Given a potentially complex object (e.g, a manifold, an image, a graph, or a time series), persistent homology can summarize and characterize it from different perspectives in a multi-scale manner. It thus provides a general and powerful way for data summarization / feature vectorization. In the past few years, there have been many interesting approaches developed either to provide kernels for persistence-based summaries or to vectorize them, so as to make persistence-based summaries more friendly with machine learning frameworks for downstream data analysis tasks. In these approaches, the importance (weight) of different persistence features are often pre-set.
However often in practice, the choice of the weight function should depend on the nature of the specific type of data one considers. In this talk, I will talk about our recent work on learning the ``best'' weight function (and thus metric for persistence-based summaries) from labelled data, as well as an application of this idea to the challenging graph classification task. I will show that a graph classification framework based on the learned kernel between persistence-based summaries can obtain similar or (sometimes significantly) better results than the best results from a range of previous graph classification frameworks on a collection of benchmark datasets. This is joint work with Qi Zhao.

Topological Techniques for Characterizing Patterns Induced by Ion Bombardment

Rachel Neville

Bombarding a solid surface with a broad ion beam produces a wide variety of self-assembled nanoscale patterns. These complex spatial-temporal patterns can be difficult to characterize quantitatively, but understanding differences in these patterns can be important for the fabrication of nanostructures. Persistent homology can be used as a low-dimensional quantitative summary of the structure of the data, including measures of symmetry and order in patterns. These summaries retain a remarkable amount of information that allows for the investigation of the influence of parameters in pattern formation and defect evolution.

Topology-Preserving Deep Image Segmentation for Thin Biomedical Structures

Chao Chen

Segmentation algorithms are prone to make topological errors on fine-scale structures, which is very common in biomedicine. A topological error may only induce marginal per-pixel error, but can cause catastrophic functional mistakes in downstream analysis. We propose a novel segmentation method that learns to segment with correct topology. We use a persistent homology based loss function that is differentiable and can be incorporated into end-to-end training of a topology-preserving deep neural network. We apply our method on various biomedical images such as neuron images and retinal images. Combined with the standard pixel-wise loss, our method achieves much better performance on metrics that directly measures structural correctness, e.g., the Rand Index, Variation of Information, and Betti Number Error, without sacrificing the per-pixel accuracy.

Sketching and Clustering Metric Measure Spaces

Kritika Singhal

We study the problem of approximating a metric space with another metric space of cardinality k. We refer to this problem as “Sketching”. Sketching has been studied by various methods, such as by sampling k randomly distributed points from an input metric space, using the farthest point sampling method, and using kernel methods. We provide a natural mathematical formulation of the sketching problem and show that k-sketching i.e. approximating with a metric space of cardinality k, is equivalent to certain notions of k-clustering. This equivalence provides us with a polynomial time constant factor approximation algorithm for computing a k-sketch of a metric space.
We also study the sketching problem for metric measure spaces. These are metric spaces with a notion of weights on points. These spaces arise naturally in various applications, wherein the points are assigned different weights based on their importance for the underlying application. We follow a similar procedure as for metric spaces, and show that k-sketching is equivalent to certain notions of k-clustering, of metric measure spaces. This again implies existence of polynomial time constant factor approximation algorithms for obtaining a k-point approximation of a metric measure space.

Generalized Persistence Algorithm for Decomposing Multi-parameter Persistence Modules

Tamal Dey

There is no known generalization of the classical matrix reduction-based persistence algorithm for simplicial filtration in a multi-parameter setting. We present for the first time such a generalization. It improves over the Meataxe algorithm commonly used for the purpose by several orders. This is joint work with Cheng Xin.

TDA on Genomics Data

Laxmi Parida

Many problems in genomics are combinatorial in nature, i.e., the interrelationship amongst the entities is at par, if not more important, than the value of the entity themselves.
Graphs are the most commonly used mathematical object to model such relationships. However, often it is important to capture higher order relationships as well. TDA provides a natural basis to model such interactions and, additionally, provides mechanisms for extracting signal patterns from noisy data. I will discuss a few applications of such models.

Geodesics in persistence diagram space

Samir Chowdhury

It is known that for a variety of choices of metrics, including the standard bottleneck distance, the space of persistence diagrams admits geodesics. Typically these existence results produce geodesics that have the form of a convex combination. We prove that for several families of metrics, all geodesics have the form of a convex combination. For certain other choices of metrics, we explicitly construct geodesics that do not have this form.

PersLay: A Simple and Versatile Neural Network Layer for Persistence Diagrams

Mathieu Carriere

Persistence diagrams, a key descriptor from Topological Data Analysis, encode and summarize all sorts of topological features and have already proved pivotal in many different applications of data science. But persistence diagrams are weakly structured and therefore constitute a difficult input for most Machine Learning techniques. To address this concern several vectorization methods have been put forward that embed persistence diagrams into either finite-dimensional Euclidean spaces or implicit Hilbert spaces with kernels. But finite-dimensional embeddings are prone to miss a lot of information about persistence diagrams, while kernel methods require the full computation of the kernel matrix.
We introduce PersLay: a simple, highly modular layer of learning architecture for persistence diagrams that allows to exploit the full capacities of neural networks on topological information from any dataset. This layer encompasses most of the vectorization methods of the literature. We illustrate its strengths on challenging classification problems on dynamical systems orbit or real-life graph data, with results improving or comparable to the state-of-the-art. In order to exploit topological information from graph data, we show how graph structures can be encoded in the so-called extended persistence diagrams computed with the heat kernel signatures of the graphs.