Danica J. Sutherland (she/her)

dsuth[a t]cs.ubc.ca; CV; orcid; github; crossvalidated; twitter
Assistant Professor, UBC Computer Science; CIFAR AI Chair, Amii

My current research interests include:

I was previously a Research Assistant Professor at TTIC; before that, a postdoc with Arthur Gretton at the Gatsby Computational Neuroscience Unit, University College London; and before that, a Ph.D. student at Carnegie Mellon University with Jeff Schneider.

Publications and selected talks are listed below.

I previously published under a different name; not all papers have been corrected yet, though they should be. If you're citing my papers (please do!), please use the name "Danica J. Sutherland". Note that Google Scholar's bib entries often deadname me (and most other trans people), even on papers whose official versions have been updated; please correct them if you must use them (or use the more accurate bib entries from here).
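For example, a corrected BibTeX entry for the ICLR 2022 paper listed below could look like this; this is only a sketch, with an arbitrary citation key, and fields taken from the listing on this page:

    % the citation key "ren2022better" is arbitrary; use whatever your bibliography expects
    @inproceedings{ren2022better,
      title     = {Better Supervisory Signals by Observing Learning Paths},
      author    = {Yi Ren and Shangmin Guo and Danica J. Sutherland},
      booktitle = {International Conference on Learning Representations (ICLR)},
      year      = {2022},
    }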

Group

Courses

Publications

Below, ** denotes equal contribution. This list is also available as a .bib file, and most of these papers are on Semantic Scholar. If you must (but I'd rather you didn't), here's Google Scholar.

Coauthors with multiple papers below (paper counts in parentheses):
  • Michael Arbel (3)
  • Mikołaj Bińkowski (2)
  • Seth Flaxman (3)
  • Roman Garnett (2)
  • Arthur Gretton (7)
  • Frederic Koehler (2)
  • Ho Chung Leon Law (2)
  • Feng Liu (2)
  • Jie Lu (2)
  • Yifei Ma (3)
  • Michelle Ntampaka (3)
  • Junier B. Oliva (4)
  • Barnabás Póczos (9)
  • Jeff Schneider (11)
  • Dino Sejdinovic (2)
  • Nathan Srebro (4)
  • Heiko Strathmann (3)
  • Hy Trac (3)
  • Liang Xiong (2)
  • Wenkai Xu (2)
  • Lijia Zhou (3)

Preprints

Optimistic Rates: A Unifying Theory for Interpolation Learning and Regularization in Linear Regression. Lijia Zhou**, Frederic Koehler**, Danica J. Sutherland, and Nathan Srebro. Preprint 2021.

Journal and Low-Acceptance-Rate Conference Papers

Better Supervisory Signals by Observing Learning Paths. Yi Ren, Shangmin Guo, and Danica J. Sutherland. International Conference on Learning Representations (ICLR) 2022.
Uniform Convergence of Interpolators: Gaussian Width, Norm Bounds and Benign Overfitting. Frederic Koehler**, Lijia Zhou**, Danica J. Sutherland, and Nathan Srebro. Neural Information Processing Systems (NeurIPS) 2021. Selected for oral presentation.
Self-Supervised Learning with Kernel Dependence Maximization. Yazhe Li**, Roman Pogodin**, Danica J. Sutherland, and Arthur Gretton. Neural Information Processing Systems (NeurIPS) 2021.
Meta Two-Sample Testing: Learning Kernels for Testing with Limited Data. Feng Liu**, Wenkai Xu**, Jie Lu, and Danica J. Sutherland. Neural Information Processing Systems (NeurIPS) 2021.
POT: Python Optimal Transport. Rémi Flamary, Nicolas Courty, Alexandre Gramfort, Mokhtar Z. Alaya, Aurélie Boisbunon, Stanislas Chambon, Laetitia Chapel, Adrien Corenflos, Kilian Fatras, Nemo Fournier, Léo Gautheron, Nathalie T.H. Gayraud, Hicham Janati, Alain Rakotomamonjy, Ievgen Redko, Antoine Rolet, Antony Schutz, Vivien Seguy, Danica J. Sutherland, Romain Tavenard, Alexander Tong, and Titouan Vayer. Journal of Machine Learning Research (JMLR) 2021. Machine Learning Open Source Software Paper.
Does Invariant Risk Minimization Capture Invariance? Pritish Kamath, Akilesh Tangella, Danica J. Sutherland, and Nathan Srebro. Artificial Intelligence and Statistics (AISTATS) 2021. Selected for oral presentation.
On Uniform Convergence and Low-Norm Interpolation Learning. Lijia Zhou, Danica J. Sutherland, and Nathan Srebro. Neural Information Processing Systems (NeurIPS) 2020. Selected for spotlight presentation.
Learning Deep Kernels for Non-Parametric Two-Sample Tests. Feng Liu**, Wenkai Xu**, Jie Lu, Guangquan Zhang, Arthur Gretton, and Danica J. Sutherland. International Conference on Machine Learning (ICML) 2020.
Learning deep kernels for exponential family densities. Li Wenliang**, Danica J. Sutherland**, Heiko Strathmann, and Arthur Gretton. International Conference on Machine Learning (ICML) 2019.
On gradient regularizers for MMD GANs. Michael Arbel**, Danica J. Sutherland**, Mikołaj Bińkowski, and Arthur Gretton. Neural Information Processing Systems (NeurIPS) 2018.
Demystifying MMD GANs. Mikołaj Bińkowski**, Danica J. Sutherland**, Michael Arbel, and Arthur Gretton. International Conference on Learning Representations (ICLR) 2018.
Efficient and principled score estimation with Nyström kernel exponential families. Danica J. Sutherland**, Heiko Strathmann**, Michael Arbel, and Arthur Gretton. Artificial Intelligence and Statistics (AISTATS) 2018. Selected for oral presentation.
Bayesian Approaches to Distribution Regression. Ho Chung Leon Law**, Danica J. Sutherland**, Dino Sejdinovic, and Seth Flaxman. Artificial Intelligence and Statistics (AISTATS) 2018.
Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy. Danica J. Sutherland, Hsiao-Yu Tung, Heiko Strathmann, Soumyajit De, Aaditya Ramdas, Alex Smola, and Arthur Gretton. International Conference on Learning Representations (ICLR) 2017.
Dynamical Mass Measurements of Contaminated Galaxy Clusters Using Machine Learning. Michelle Ntampaka, Hy Trac, Danica J. Sutherland, Sebastian Fromenteau, Barnabás Póczos, and Jeff Schneider. The Astrophysical Journal (ApJ) 831, 2, 135. 2016.
Linear-time Learning on Distributions with Approximate Kernel Embeddings. Danica J. Sutherland**, Junier B. Oliva**, Barnabás Póczos, and Jeff Schneider. AAAI Conference on Artificial Intelligence (AAAI) 2016.
On the Error of Random Fourier Features. Danica J. Sutherland and Jeff Schneider. Uncertainty in Artificial Intelligence (UAI) 2015. Chapter 3 / Section 4.1 of my thesis supersedes this paper, fixing a few errors in constants and providing more results.
Active Pointillistic Pattern Search. Yifei Ma**, Danica J. Sutherland**, Roman Garnett, and Jeff Schneider. Artificial Intelligence and Statistics (AISTATS) 2015.
A Machine Learning Approach for Dynamical Mass Measurements of Galaxy Clusters. Michelle Ntampaka, Hy Trac, Danica J. Sutherland, Nicholas Battaglia, Barnabás Póczos, and Jeff Schneider. The Astrophysical Journal (ApJ) 803, 2, 50. 2015.
Active learning and search on low-rank matrices. Danica J. Sutherland, Barnabás Póczos, and Jeff Schneider. Knowledge Discovery and Data Mining (KDD) 2013. Selected for oral presentation.
Nonparametric kernel estimators for image classification. Barnabás Póczos, Liang Xiong, Danica J. Sutherland, and Jeff Schneider. Computer Vision and Pattern Recognition (CVPR) 2012.
Managing User Requests with the Grand Unified Task System (GUTS). Andrew Stromme, Danica J. Sutherland, Alexander Burka, Benjamin Lipton, Nicholas Felt, Rebecca Roelofs, Daniel-Elia Feist-Alexandrov, Steve Dini, and Allen Welkie. Large Installation System Administration (LISA) 2012. Work done as part of the Swarthmore College Computer Society.

Dissertations

Scalable, Flexible, and Active Learning on Distributions. Danica J. Sutherland. Committee: Jeff Schneider, Barnabás Póczos, Maria-Florina Balcan, and Arthur Gretton. Computer Science Department, Carnegie Mellon University. Ph.D. thesis, 2016.
Integrating Human Knowledge into a Relational Learning System. Danica J. Sutherland. Computer Science Department, Swarthmore College. B.A. thesis, 2011.

Technical Reports, Posters, etc.

How to Make Virtual Conferences Queer-Friendly: A Guide. Organizers of QueerInAI, A Pranav, MaryLena Bleile, Arjun Subramonian, Luca Soldaini, Danica J. Sutherland, Sabine Weber, and Pan Xu. Workshop on Widening NLP (WiNLP), EMNLP 2021.
Unbiased estimators for the variance of MMD estimators. Danica J. Sutherland. Technical report 2019.
The Role of Machine Learning in the Next Decade of Cosmology. Michelle Ntampaka, Camille Avestruz, Steven Boada, João Caldeira, Jessi Cisewski-Kehe, Rosanne Di Stefano, Cora Dvorkin, August E. Evrard, Arya Farahi, Doug Finkbeiner, Shy Genel, Alyssa Goodman, Andy Goulding, Shirley Ho, Arthur Kosowsky, Paul La Plante, François Lanusse, Michelle Lochner, Rachel Mandelbaum, Daisuke Nagai, Jeffrey A. Newman, Brian Nord, J. E. G. Peek, Austin Peel, Barnabás Póczos, Markus Michael Rau, Aneta Siemiginowska, Danica J. Sutherland, Hy Trac, and Benjamin Wandelt. White paper 2019.
Bayesian Approaches to Distribution Regression. Ho Chung Leon Law**, Danica J. Sutherland**, Dino Sejdinovic, and Seth Flaxman. Learning on Distributions, Functions, Graphs and Groups, NeurIPS 2017. Selected for oral presentation.
Fixing an error in Caponnetto and de Vito (2007). Danica J. Sutherland. Technical report 2017.
Understanding the 2016 US Presidential Election using ecological inference and distribution regression with census microdata. Seth Flaxman, Danica J. Sutherland, Yu-Xiang Wang, and Yee Whye Teh. Technical report 2016.
List Mode Regression for Low Count Detection. Jay Jin, Kyle Miller, Danica J. Sutherland, Simon Labov, Karl Nelson, and Artur Dubrawski. IEEE Nuclear Science Symposium (IEEE NSS/MIC) 2016.
Deep Mean Maps. Junier B. Oliva**, Danica J. Sutherland**, Barnabás Póczos, and Jeff Schneider. Technical report 2015.
Linear-time Learning on Distributions with Approximate Kernel Embeddings. Danica J. Sutherland**, Junier B. Oliva**, Barnabás Póczos, and Jeff Schneider. Feature Extraction: Modern Questions and Challenges, NeurIPS 2015.
Active Pointillistic Pattern Search. Yifei Ma**, Danica J. Sutherland**, Roman Garnett, and Jeff Schneider. Bayesian Optimization (BayesOpt), NeurIPS 2014.
Kernels on Sample Sets via Nonparametric Divergence Estimates. Danica J. Sutherland, Liang Xiong, Barnabás Póczos, and Jeff Schneider. Technical report 2012.
Finding Representative Objects with Sparse Modeling. Junier B. Oliva, Danica J. Sutherland, and Yifei Ma. CMU 10-725 Optimization course project 2012. Best poster award.
Grounding Conceptual Knowledge with Spatio-Temporal Multi-Dimensional Relational Framework Trees. Matthew Bodenhamer, Thomas Palmer, Danica J. Sutherland, and Andrew H. Fagg. Technical report 2012.

Invited talks

Slides for conference and workshop talks given directly about a paper are linked next to that paper above.

Better deep learning (sometimes) by learning kernel mean embeddings. January 2022. Lifting Inference with Kernel Embeddings (LIKE22), University of Bern. Related papers: Meta Two-Sample Testing: Learning Kernels for Testing with Limited Data, Self-Supervised Learning with Kernel Dependence Maximization.
Deep kernel-based distances between distributions. January 2021. Kickoff Workshop, Pacific Interdisciplinary Hub on Optimal Transport.
Can Uniform Convergence Explain Interpolation Learning? October 2020. Penn State, Statistics colloquium. Related papers: On Uniform Convergence and Low-Norm Interpolation Learning.
Tutorial: Interpretable Comparison of Distributions and Models. December 2019. Neural Information Processing Systems (NeurIPS). Related papers: Learning Deep Kernels for Non-Parametric Two-Sample Tests, Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy. With Arthur Gretton and Wittawat Jitkrittum.
Better GANs by Using Kernels. October 2019. University of Massachusetts Amherst, College of Information and Computer Sciences. Related papers: Demystifying MMD GANs, On gradient regularizers for MMD GANs.
Kernel distances between distributions for generative models. July 2019. Distance Metrics and Mass Transfer Between High Dimensional Point Clouds, ICIAM. Related papers: Demystifying MMD GANs, On gradient regularizers for MMD GANs.
Introduction to Generative Adversarial Networks. June 2019. Machine Learning Crash Course (MLCC).
Kernel Distances for Better Deep Generative Models. September 2018. Advances in Kernel Methods, GPSS. Related papers: Demystifying MMD GANs, On gradient regularizers for MMD GANs.
Better GANs by using the MMD. June 2018. Facebook AI Research New York. Related papers: Demystifying MMD GANs, On gradient regularizers for MMD GANs.
Efficiently Estimating Densities and Scores with Kernel Exponential Families. June 2018. Gatsby Tri-Center Meeting. Related papers: Efficient and principled score estimation with Nyström kernel exponential families.
Better GANs by using the MMD. June 2018. Machine Learning reading group, Google New York. Related papers: Demystifying MMD GANs, On gradient regularizers for MMD GANs.
Better GANs by using the MMD. June 2018. Machine Learning reading group, Columbia University. Related papers: Demystifying MMD GANs, On gradient regularizers for MMD GANs. No slides were actually used at this talk because of a projector mishap; they would have been the same as those for the Google talk.
Efficient and principled score estimation with kernel exponential families. December 2017. Approximating high dimensional functions, Alan Turing Institute. Related papers: Efficient and principled score estimation with Nyström kernel exponential families.
Efficient and principled score estimation with kernel exponential families. December 2017. Computational Statistics and Machine Learning seminar, University College London. Related papers: Efficient and principled score estimation with Nyström kernel exponential families.
Evaluating and Training Implicit Generative Models with Two-Sample Tests. August 2017. Implicit Models, ICML. Related papers: Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy.
Two-Sample Tests, Integral Probability Metrics, and GAN Objectives. April 2017. Theory of Generative Adversarial Networks, DALI. Related papers: Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy.
Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy. February 2017. Computational Statistics and Machine Learning seminar, Oxford University. Related papers: Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy.