PDF
Collaborative Research: SCALE MoDL: Advancing Theoretical Minimax Deep Learning: Optimization, Resilience, and Interpretability

Award Number and Duration

NSF DMS 2134223

NSF DMS 2134148

September 1, 2021 to August 31, 2024 (Estimated)

PI and Point of Contact

Yi Zhou (PI)
Assistant Professor
Department of Electrical and Computer Engineering
University of Utah
yi.zhou AT utah.edu
Home Page

Bei Wang (co-PI)
Assistant Professor
School of Computing and Scientific Computing and Imaging Institute
University of Utah
beiwang AT sci.utah.edu
Home Page

Jie Ding (PI)
Assistant Professor
School of Statistics
University of Minnesota-Twin Cities
dingj AT umn.edu
Home Page

Overview

The past decade has witnessed the great success of deep learning in broad societal and commercial applications. However, conventional deep learning relies on fitting data with neural networks, which is known to produce models that lack resilience. For instance, models used in autonomous driving are vulnerable to malicious attacks, e.g., putting an art sticker on a stop sign can cause the model to classify it as a speed limit sign; models used in facial recognition are known to be biased toward people of a certain race or gender; models in healthcare can be hacked to reconstruct the identities of patients that are used in training those models. The next-generation deep learning paradigm needs to deliver resilient models that promote robustness to malicious attacks, fairness among users, and privacy preservation. This project aims to develop a comprehensive learning theory to enhance the model resilience of deep learning. The project will produce fast algorithms and new diagnostic tools for training, enhancing, visualizing, and interpreting model resilience, all of which can have broad research and societal significance. The research activities will also generate positive educational impacts on undergraduate and graduate students. The materials developed by this project will be integrated into courses on machine learning, statistics, and data visualization and will benefit interdisciplinary students majoring in electrical and computer engineering, statistics, mathematics, and computer science. The project will actively involve underrepresented students and integrate research with education for undergraduate and graduate students in STEM. It will also produce introductory materials for K-12 students to be used in engineering summer camps.

In this project, the investigators will collaboratively develop a comprehensive minimax learning theory that advances the fundamental understanding of minimax deep learning from the perspectives of optimization, resilience, and interpretability. These complementary theoretical developments, in turn, will guide the design of novel minimax learning algorithms with substantially improved computational efficiency, statistical guarantees, and interpretability. The research includes three major thrusts. First, the investigators will develop a principled non-convex minimax optimization theory that supports scalable, fast, and convergent gradient-descent-ascent algorithms for training complex minimax deep learning models. The theory will focus on analyzing the convergence rate and sample complexity of the developed algorithms. Second, the investigators will formulate a measure of vulnerability of deep learning models and study how minimaxity can enhance their resilience against data, model, and task deviations. This theory will focus on the statistical limits of deep learning. Lastly, the investigators will establish the mathematical foundations for a set of novel visual analytics techniques that increase the model interpretability of minimax learning. In particular, the theory will provide guidance on visualizing and interpreting model resilience.

Honors and Awards

Highlights

Publications

Papers marked with * use alphabetic ordering of authors.
Students are underlined.

Software Downloads

Presentations, Educational Development and Broader Impacts

Students and Postdocs

Collaborators

Acknowledgement

This material is based upon work supported or partially supported by the National Science Foundation under Grant No. 2134223 and No. 2134148.

Any opinions, findings, and conclusions or recommendations expressed in this project are those of author(s) and do not necessarily reflect the views of the National Science Foundation.

Web page last update: August 26, 2021.