Danica J. Sutherland (she)

Assistant Professor, UBC Computer Science
Canada CIFAR AI Chair, Amii

UBC Machine Learning, AML-TN, MILD (ML theory), CAIDA (AI), PIHOT/Kantorovich Initiative (optimal transport)
Queer in AI Name Change Policy Working Group

dsuth[a t]cs.ubc.ca or djs[a t]djsutherland.ml
CV orcid github crossvalidated twitter mastodon

Prospective students: Like most North American schools, we only accept applications through the departmental process, with a deadline in December. There is no need to email me about admissions, and due to the volume of emails I will probably not reply.

Anyone with specific research connections / questions / etc should feel free to get in touch at any time, via email / Twitter DM / whatever.

Trans and gender-expansive or other queer people, also please reach out whenever, about specific things or just saying hi. Consider using my personal email (the djsutherland.ml one) for privacy reasons; Twitter DMs or Queer in AI's Slack are also good.

I was previously at TTIC (non-tenure-track faculty, affiliated with Nati Srebro), Gatsby (postdoc with Arthur Gretton), and CMU (PhD with Jeff Schneider).

Publications and selected talks are listed below.

You may come across various old items referring to me with a different first name. Please only use the name Danica to cite or refer to me, and check that your old .bib entries are correct, e.g. by replacing them with the entries here.

Group

Alumni:

Courses

Publications

Below, ** denotes equal contribution, and a coloured name denotes one of my students. Also available as a .bib file, and most of these are on Semantic Scholar. If you must (but I'd rather you didn't), here's Google Scholar.

Preprints

Language Model Evolution: An Iterated Learning Perspective. Yi Ren, Shangmin Guo, Linlu Qiu, Bailin Wang, and Danica J. Sutherland. Preprint 2024.
Practical Kernel Tests of Conditional Independence. Roman Pogodin, Antonin Schrab, Yazhe Li, Danica J. Sutherland, and Arthur Gretton. Preprint 2024.
AdaFlood: Adaptive Flood Regularization. Wonho Bae, Yi Ren, Mohamad Osama Ahmed, Frederick Tung, Danica J. Sutherland, and Gabriel Oliveira. Preprint 2023.

Journal and Low-Acceptance-Rate Conference Papers

Generalized Coverage for More Robust Low-Budget Active Learning. Wonho Bae, Jyunhug Noh, and Danica J. Sutherland. European Conference on Computer Vision (ECCV) 2024.
Exploring Active Learning in Meta-Learning: Enhancing Context Set Labeling. Wonho Bae, Jing Wang, and Danica J. Sutherland. European Conference on Computer Vision (ECCV) 2024.
Why Do You Grok? A Theoretical Analysis on Grokking Modular Addition. Mohamad Amin Mohamadi, Zhiyuan Li, Lei Wu, and Danica J. Sutherland. International Conference on Machine Learning (ICML) 2024.
Improving Compositional Generalization using Iterated Learning and Simplicial Embeddings. Yi Ren, Samuel Lavoie, Mikhail Galkin, Danica J. Sutherland, and Aaron Courville. Neural Information Processing Systems (NeurIPS) 2023.
Exphormer: Scaling Graph Transformers with Expander Graphs. Hamed Shirzad**, Ameya Velingker**, Balaji Venkatachalam**, Danica J. Sutherland, and Ali Kemal Sinop. International Conference on Machine Learning (ICML) 2023.
A Fast, Well-Founded Approximation to the Empirical Neural Tangent Kernel. Mohamad Amin Mohamadi, Wonho Bae, and Danica J. Sutherland. International Conference on Machine Learning (ICML) 2023.
Queer in AI: A Case Study in Community-Led Participatory AI. Organizers of QueerInAI, Anaelia Ovalle, Arjun Subramonian, Ashwin Singh, Claas Voelcker, Danica J. Sutherland, Davide Locatelli, Eva Breznik, Filip Klubička, Hang Yuan, Hetvi J, Huan Zhang, Jaidev Shriram, Kruno Lehman, Luca Soldaini, Maarten Sap, Marc Peter Deisenroth, Maria Leonor Pacheco, Maria Ryskina, Martin Mundt, Melvin Selim Atay, Milind Agarwal, Nyx McLean, Pan Xu, A Pranav, Raj Korpan, Ruchira Ray, Sarah Mathew, Sarthak Arora, ST John, Tanvi Anand, Vishakha Agrawal, William Agnew, Yanan Long, Zijie J. Wang, Zeerak Talat, Avijit Ghosh, Nathaniel Dennler, Michael Noseworthy, Sharvani Jha, Emi Baylor, Aditya Joshi, Natalia Y. Bilenko, Andrew McNamara, Raphael Gontijo-Lopes, Alex Markham, Evyn Dǒng, Jackie Kay, Manu Saraswat, Nikhil Vytla, and Luke Stark. ACM Conference on Fairness, Accountability, and Transparency (FAccT) 2023. Best Paper award.
Pre-trained Perceptual Features Improve Differentially Private Image Generation. Frederik Harder, Milad Jalali Asadabadi, Danica J. Sutherland, and Mijung Park. Transactions on Machine Learning Research (TMLR) 2023.
Efficient Conditionally Invariant Representation Learning. Roman Pogodin**, Namrata Deka**, Yazhe Li**, Danica J. Sutherland, Victor Veitch, and Arthur Gretton. International Conference on Learning Representations (ICLR) 2023. Selected as notable (top 5%), i.e. as an oral.
How to prepare your task head for finetuning. Yi Ren, Shangmin Guo, Wonho Bae, and Danica J. Sutherland. International Conference on Learning Representations (ICLR) 2023.
MMD-B-Fair: Learning Fair Representations with Statistical Testing. Namrata Deka and Danica J. Sutherland. Artificial Intelligence and Statistics (AISTATS) 2023.
A Non-Asymptotic Moreau Envelope Theory for High-Dimensional Generalized Linear Models. Lijia Zhou**, Frederic Koehler**, Pragya Sur, Danica J. Sutherland, and Nathan Srebro. Neural Information Processing Systems (NeurIPS) 2022.
Making Look-Ahead Active Learning Strategies Feasible with Neural Tangent Kernels. Mohamad Amin Mohamadi**, Wonho Bae**, and Danica J. Sutherland. Neural Information Processing Systems (NeurIPS) 2022.
Evaluating Graph Generative Models with Contrastively Learned Features. Hamed Shirzad, Kaveh Hassani, and Danica J. Sutherland. Neural Information Processing Systems (NeurIPS) 2022.
Object Discovery via Contrastive Learning for Weakly Supervised Object Detection. Jinhwan Seo, Wonho Bae, Danica J. Sutherland, Jyunhug Noh, and Daijin Kim. European Conference on Computer Vision (ECCV) 2022.
One Weird Trick to Improve Your Semi-Weakly Supervised Semantic Segmentation Model. Wonho Bae, Jyunhug Noh, Milad Jalali Asadabadi, and Danica J. Sutherland. International Joint Conference on Artificial Intelligence (IJCAI) 2022.
Better Supervisory Signals by Observing Learning Paths. Yi Ren, Shangmin Guo, and Danica J. Sutherland. International Conference on Learning Representations (ICLR) 2022.
Optimistic Rates: A Unifying Theory for Interpolation Learning and Regularization in Linear Regression. Lijia Zhou**, Frederic Koehler**, Danica J. Sutherland, and Nathan Srebro. ACM/IMS Journal of Data Science (JDS) 2023.
Uniform Convergence of Interpolators: Gaussian Width, Norm Bounds and Benign Overfitting. Frederic Koehler**, Lijia Zhou**, Danica J. Sutherland, and Nathan Srebro. Neural Information Processing Systems (NeurIPS) 2021. Selected for oral presentation.
Self-Supervised Learning with Kernel Dependence Maximization. Yazhe Li**, Roman Pogodin**, Danica J. Sutherland, and Arthur Gretton. Neural Information Processing Systems (NeurIPS) 2021.
Meta Two-Sample Testing: Learning Kernels for Testing with Limited Data. Feng Liu**, Wenkai Xu**, Jie Lu, and Danica J. Sutherland. Neural Information Processing Systems (NeurIPS) 2021.
POT: Python Optimal Transport. Rémi Flamary, Nicolas Courty, Alexandre Gramfort, Mokhtar Z. Alaya, Aurélie Boisbunon, Stanislas Chambon, Laetitia Chapel, Adrien Corenflos, Kilian Fatras, Nemo Fournier, Léo Gautheron, Nathalie T.H. Gayraud, Hicham Janati, Alain Rakotomamonjy, Ievgen Redko, Antoine Rolet, Antony Schutz, Vivien Seguy, Danica J. Sutherland, Romain Tavenard, Alexander Tong, and Titouan Vayer. Journal of Machine Learning Research (JMLR) 2021. Machine Learning Open Source Software Paper.
Does Invariant Risk Minimization Capture Invariance? Pritish Kamath, Akilesh Tangella, Danica J. Sutherland, and Nathan Srebro. Artificial Intelligence and Statistics (AISTATS) 2021. Selected for oral presentation.
On Uniform Convergence and Low-Norm Interpolation Learning. Lijia Zhou, Danica J. Sutherland, and Nathan Srebro. Neural Information Processing Systems (NeurIPS) 2020. Selected for spotlight presentation.
Learning Deep Kernels for Non-Parametric Two-Sample Tests. Feng Liu**, Wenkai Xu**, Jie Lu, Guangquan Zhang, Arthur Gretton, and Danica J. Sutherland. International Conference on Machine Learning (ICML) 2020.
Learning deep kernels for exponential family densities. Li Wenliang**, Danica J. Sutherland**, Heiko Strathmann, and Arthur Gretton. International Conference on Machine Learning (ICML) 2019.
On gradient regularizers for MMD GANs. Michael Arbel**, Danica J. Sutherland**, Mikołaj Bińkowski, and Arthur Gretton. Neural Information Processing Systems (NeurIPS) 2018.
Demystifying MMD GANs. Mikołaj Bińkowski**, Danica J. Sutherland**, Michael Arbel, and Arthur Gretton. International Conference on Learning Representations (ICLR) 2018.
Efficient and principled score estimation with Nyström kernel exponential families. Danica J. Sutherland**, Heiko Strathmann**, Michael Arbel, and Arthur Gretton. Artificial Intelligence and Statistics (AISTATS) 2018. Selected for oral presentation.
Bayesian Approaches to Distribution Regression. Ho Chung Leon Law**, Danica J. Sutherland**, Dino Sejdinovic, and Seth Flaxman. Artificial Intelligence and Statistics (AISTATS) 2018.
Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy. Danica J. Sutherland, Hsiao-Yu Tung, Heiko Strathmann, Soumyajit De, Aaditya Ramdas, Alex Smola, and Arthur Gretton. International Conference on Learning Representations (ICLR) 2017.
Dynamical Mass Measurements of Contaminated Galaxy Clusters Using Machine Learning. Michelle Ntampaka, Hy Trac, Danica J. Sutherland, Sebastian Fromenteau, Barnabás Póczos, and Jeff Schneider. The Astrophysical Journal (ApJ) 831, 2, 135. 2016.
Linear-time Learning on Distributions with Approximate Kernel Embeddings. Danica J. Sutherland**, Junier B. Oliva**, Barnabás Póczos, and Jeff Schneider. AAAI Conference on Artificial Intelligence (AAAI) 2016.
On the Error of Random Fourier Features. Danica J. Sutherland and Jeff Schneider. Uncertainty in Artificial Intelligence (UAI) 2015. Chapter 3 / Section 4.1 of my thesis supersedes this paper, fixing a few errors in constants and providing more results.
Active Pointillistic Pattern Search. Yifei Ma**, Danica J. Sutherland**, Roman Garnett, and Jeff Schneider. Artificial Intelligence and Statistics (AISTATS) 2015.
A Machine Learning Approach for Dynamical Mass Measurements of Galaxy Clusters. Michelle Ntampaka, Hy Trac, Danica J. Sutherland, Nicholas Battaglia, Barnabás Póczos, and Jeff Schneider. The Astrophysical Journal (ApJ) 803, 2, 50. 2015.
Active learning and search on low-rank matrices. Danica J. Sutherland, Barnabás Póczos, and Jeff Schneider. Knowledge Discovery and Data Mining (KDD) 2013. Selected for oral presentation.
Nonparametric kernel estimators for image classification. Barnabás Póczos, Liang Xiong, Danica J. Sutherland, and Jeff Schneider. Computer Vision and Pattern Recognition (CVPR) 2012.
Managing User Requests with the Grand Unified Task System (GUTS). Andrew Stromme, Danica J. Sutherland, Alexander Burka, Benjamin Lipton, Nicholas Felt, Rebecca Roelofs, Daniel-Elia Feist-Alexandrov, Steve Dini, and Allen Welkie. Large Installation System Administration (LISA) 2012. Work done as part of the Swarthmore College Computer Society.

Dissertations

Scalable, Flexible, and Active Learning on Distributions. Danica J. Sutherland. Committee: Jeff Schneider, Barnabás Póczos, Maria-Florina Balcan, and Arthur Gretton. Computer Science Department, Carnegie Mellon University. Ph.D. thesis, 2016.
Integrating Human Knowledge into a Relational Learning System. Danica J. Sutherland. Computer Science Department, Swarthmore College. B.A. thesis, 2011.

Technical Reports, Posters, etc.

Differentially Private Neural Tangent Kernels for Privacy-Preserving Data Generation. Yilin Yang, Kamil Adamczewski, Danica J. Sutherland, Xiaoxiao Li, and Mijung Park. Privacy-Preserving Artificial Intelligence (PPAI-24), AAAI 2024.
Low-Width Approximations and Sparsification for Scaling Graph Transformers. Hamed Shirzad, Balaji Venkatachalam, Ameya Velingker, Danica J. Sutherland, and David Woodruff. New Frontiers in Graph Learning, NeurIPS 2023.
Grokking modular arithmetic can be explained by margin maximization. Mohamad Amin Mohamadi, Zhiyuan Li, Lei Wu, and Danica J. Sutherland. Mathematics of Modern Machine Learning, NeurIPS 2023.
Learning Privacy-Preserving Deep Kernels with Known Demographics. Namrata Deka and Danica J. Sutherland. Privacy-Preserving Artificial Intelligence (PPAI-22), AAAI 2022.
How to Make Virtual Conferences Queer-Friendly: A Guide. Organizers of QueerInAI, A Pranav, MaryLena Bleile, Arjun Subramonian, Luca Soldaini, Danica J. Sutherland, Sabine Weber, and Pan Xu. Workshop on Widening NLP (WiNLP), EMNLP 2021.
Unbiased estimators for the variance of MMD estimators. Danica J. Sutherland and Namrata Deka. Technical report 2019.
The Role of Machine Learning in the Next Decade of Cosmology. Michelle Ntampaka, Camille Avestruz, Steven Boada, João Caldeira, Jessi Cisewski-Kehe, Rosanne Di Stefano, Cora Dvorkin, August E. Evrard, Arya Farahi, Doug Finkbeiner, Shy Genel, Alyssa Goodman, Andy Goulding, Shirley Ho, Arthur Kosowsky, Paul La Plante, François Lanusse, Michelle Lochner, Rachel Mandelbaum, Daisuke Nagai, Jeffrey A. Newman, Brian Nord, J. E. G. Peek, Austin Peel, Barnabás Póczos, Markus Michael Rau, Aneta Siemiginowska, Danica J. Sutherland, Hy Trac, and Benjamin Wandelt. White paper 2019.
Bayesian Approaches to Distribution Regression. Ho Chung Leon Law**, Danica J. Sutherland**, Dino Sejdinovic, and Seth Flaxman. Learning on Distributions, Functions, Graphs and Groups, NeurIPS 2017. Selected for oral presentation.
Fixing an error in Caponnetto and de Vito (2007). Danica J. Sutherland. Technical report 2017.
Understanding the 2016 US Presidential Election using ecological inference and distribution regression with census microdata. Seth Flaxman, Danica J. Sutherland, Yu-Xiang Wang, and Yee Whye Teh. Technical report 2016.
List Mode Regression for Low Count Detection. Jay Jin, Kyle Miller, Danica J. Sutherland, Simon Labov, Karl Nelson, and Artur Dubrawski. IEEE Nuclear Science Symposium (IEEE NSS/MIC) 2016.
Deep Mean Maps. Junier B. Oliva**, Danica J. Sutherland**, Barnabás Póczos, and Jeff Schneider. Technical report 2015.
Linear-time Learning on Distributions with Approximate Kernel Embeddings. Danica J. Sutherland**, Junier B. Oliva**, Barnabás Póczos, and Jeff Schneider. Feature Extraction: Modern Questions and Challenges, NeurIPS 2015.
Active Pointillistic Pattern Search. Yifei Ma**, Danica J. Sutherland**, Roman Garnett, and Jeff Schneider. Bayesian Optimization (BayesOpt), NeurIPS 2014.
Kernels on Sample Sets via Nonparametric Divergence Estimates. Danica J. Sutherland, Liang Xiong, Barnabás Póczos, and Jeff Schneider. Technical report 2012.
Grounding Conceptual Knowledge with Spatio-Temporal Multi-Dimensional Relational Framework Trees. Matthew Bodenhamer, Thomas Palmer, Danica J. Sutherland, and Andrew H. Fagg. Technical report 2012.

Invited talks

Slides for conference and workshop talks that directly present a paper are linked next to that paper above.

Scaling Graph Transformers with Expander Graphs. June 2024. Simon Fraser University, AI seminar. Related papers: ICML-23.
Conditional independence measures for fairer, more reliable models. February 2024. Statistical Aspects of Trustworthy Machine Learning, Banff International Research Station. Related papers: ICLR-23, Preprint 2024.
Learning conditionally independent representations with kernel regularizers. June 2023. Lifting Inference with Kernel Embeddings (LIKE23), University of Bern. Related papers: ICLR-23.
[Lecture] (Deep) Kernel Mean Embeddings for Representing and Learning on Distributions. June 2023. Lifting Inference with Kernel Embeddings (LIKE23), University of Bern.
Learning conditionally independent representations with kernel regularizers. June 2023. Gatsby25. Related papers: ICLR-23.
Are these datasets different? Two-sample testing for data scientists. April 2023. Pacific Conference on Artificial Intelligence (PCA). Related papers: AISTATS-23, NeurIPS-21, ICML-20, ICLR-17.
A Defense of (Empirical) Neural Tangent Kernels. March 2023. AI Seminar, University of Michigan. Related papers: ICLR-22, ICLR-23, NeurIPS-22, ICML-23.
Post-Publication Name Change Policies, Why they Matter, and Whether they Work. March 2023. Robotics DEI Seminar, University of Michigan.
In Defence of (Empirical) Neural Tangent Kernels. March 2023. Microsoft Research Montréal. Related papers: ICLR-22, ICLR-23, NeurIPS-22, ICML-23.
[Lecture] Neural Tangent Kernels, Finite and Infinite. February 2023. Winter School on Deep Learning, Indian Statistical Institute.
Name Change Policies: A Brief (Personal) Tour. November 2022. Queer in AI Workshop, NeurIPS.
[Lecture] Modern Kernel Methods in Machine Learning. October 2022. Research School on Uncertainty in Scientific Computing, Corsica (ETICS).
Are These Datasets The Same? Learning Kernels for Efficient and Fair Two-sample Tests. April 2022. Toronto Womxn in Data Science Conference. Related papers: NeurIPS-21, ICML-20, ICLR-17.
Are These Datasets The Same? Learning Kernels for Efficient and Fair Two-sample Tests. February 2022. TrustML Young Scientist Seminars. Related papers: NeurIPS-21, ICML-20, ICLR-17.
Better deep learning (sometimes) by learning kernel mean embeddings. January 2022. Lifting Inference with Kernel Embeddings (LIKE22), University of Bern. Related papers: NeurIPS-21, NeurIPS-21.
Can Uniform Convergence Explain Interpolation Learning? November 2021. NYU Center for Data Science Lunch Seminar Series. Related papers: NeurIPS-20, NeurIPS-21.
Deep kernel-based distances between distributions. January 2021. Kickoff Workshop, Pacific Interdisciplinary Hub on Optimal Transport. Related papers: ICML-20, ICLR-17, NeurIPS-18, ICLR-18.
[Lecture] Kernel Methods: From Basics to Modern Applications. January 2021. Data Science Summer School (DS3), École Polytechnique, Paris.
Can Uniform Convergence Explain Interpolation Learning? October 2020. Penn State, Statistics colloquium. Related papers: NeurIPS-20.
[Tutorial] Interpretable Comparison of Distributions and Models. December 2019. Neural Information Processing Systems (NeurIPS). Related papers: ICML-20, ICLR-17. With Arthur Gretton and Wittawat Jitkrittum.
Better GANs by Using Kernels. October 2019. University of Massachusetts Amherst, College of Information and Computer Sciences. Related papers: ICLR-18, NeurIPS-18.
Kernel distances between distributions for generative models. July 2019. Distance Metrics and Mass Transfer Between High Dimensional Point Clouds, ICIAM. Related papers: ICLR-18, NeurIPS-18.
[Lecture] Learning with Positive Definite Kernels: Theory, Algorithms and Applications. June 2019. Data Science Summer School (DS3), École Polytechnique, Paris. With Bharath Sriperumbudur.
[Lecture] Introduction to Generative Adversarial Networks. June 2019. Machine Learning Crash Course (MLCC), University of Genoa.
Kernel Distances for Better Deep Generative Models. September 2018. Advances in Kernel Methods, GPSS. Related papers: ICLR-18, NeurIPS-18.
Better GANs by using the MMD. June 2018. Facebook AI Research New York. Related papers: ICLR-18, NeurIPS-18.
Efficiently Estimating Densities and Scores with Kernel Exponential Families. June 2018. Gatsby Tri-Center Meeting. Related papers: AISTATS-18.
Better GANs by using the MMD. June 2018. Machine Learning reading group, Google New York. Related papers: ICLR-18, NeurIPS-18.
Better GANs by using the MMD. June 2018. Machine Learning reading group, Columbia University. Related papers: ICLR-18, NeurIPS-18. No slides actually used at the talk because of a projector mishap, but they would have been the same as the Google talk.
Advances in GANs based on the MMD. May 2018. Machine Learning Seminar, University of Sheffield. Related papers: ICLR-18, NeurIPS-18.
Efficient and principled score estimation with kernel exponential families. December 2017. Approximating high dimensional functions, Alan Turing Institute. Related papers: AISTATS-18.
Efficient and principled score estimation with kernel exponential families. December 2017. Computational Statistics and Machine Learning seminar, University College London. Related papers: AISTATS-18.
Evaluating and Training Implicit Generative Models with Two-Sample Tests. August 2017. Implicit Models, ICML. Related papers: ICLR-17.
Two-Sample Tests, Integral Probability Metrics, and GAN Objectives. April 2017. Theory of Generative Adversarial Networks, DALI. Related papers: ICLR-17.
Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy. February 2017. Computational Statistics and Machine Learning seminar, Oxford University. Related papers: ICLR-17.