Training GANs - From Theory to Practice
GANs, originally introduced in the context of unsupervised learning, have had far-reaching implications for science, engineering, and society. However, training GANs remains challenging (in part) due to...
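For reference, the standard GAN training objective is the two-player min-max problem of Goodfellow et al. (notation ours: $G$ is the generator, $D$ the discriminator, $p_{\mathrm{data}}$ the data distribution, and $p_z$ the noise prior):

$$\min_{G}\,\max_{D}\;\mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big] \;+\; \mathbb{E}_{z \sim p_{z}}\big[\log\big(1 - D(G(z))\big)\big],$$

and the simultaneous min-max structure of this objective is one commonly cited source of training instability.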
Beyond log-concave sampling
As the growing number of posts on this blog would suggest, recent years have seen a lot of progress in understanding optimization beyond convexity. However, optimization is only one of the basic...
Mismatches between Traditional Optimization Analyses and Modern Deep Learning
You may remember our previous blog post showing that it is possible to do state-of-the-art deep learning with a learning rate that increases exponentially during training. It was meant to be a dramatic...
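If you want to see the mechanics of such a schedule, here is a minimal PyTorch sketch; the linear model, the growth factor 1.01, and the other hyperparameters are illustrative placeholders, not the setup from that post (which, as we recall, involved normalized networks trained with weight decay).

import torch

# Toy model and optimizer; architecture and hyperparameters are placeholders.
model = torch.nn.Linear(10, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)

# ExponentialLR multiplies the learning rate by gamma at every step();
# gamma > 1 makes the learning rate grow exponentially during training.
sched = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=1.01)

for step in range(100):
    x, y = torch.randn(32, 10), torch.randint(0, 2, (32,))
    loss = torch.nn.functional.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    sched.step()  # learning rate is now 0.1 * 1.01 ** (step + 1)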
How to allow deep learning on your data without revealing the data
Today’s online world and the emerging internet of things are built around a Faustian bargain: consumers (and their internet of things) hand over their data, and in return get customization of the world...
Can implicit regularization in deep learning be explained by norms?
This post is based on my recent paper with Noam Razin (to appear at NeurIPS 2020), studying the question of whether norms can explain implicit regularization in deep learning. TL;DR: we argue they...
Beyond log-concave sampling (Part 2)
In our previous blog post, we introduced the challenges of sampling distributions beyond log-concavity. We first described the problem of sampling from a distribution $p(x) \propto e^{-f(x)}$ given...
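As a point of reference for this series, (unadjusted) Langevin dynamics is the textbook first-order method for sampling from $p(x) \propto e^{-f(x)}$ when $f$ is smooth: iterate $x_{t+1} = x_t - \eta \nabla f(x_t) + \sqrt{2\eta}\,\xi_t$ with $\xi_t \sim N(0, I)$. The sketch below is a minimal illustration; the quadratic potential, step size, and iteration count are arbitrary choices, not anything from the posts.

import numpy as np

def grad_f(x):
    # Gradient of an illustrative potential f(x) = ||x||^2 / 2,
    # so the target p(x) ∝ e^{-f(x)} is a standard Gaussian.
    return x

def langevin(x0, step=0.01, n_iter=5000, rng=np.random.default_rng(0)):
    # Unadjusted Langevin algorithm: a gradient step on f plus Gaussian noise.
    x = np.array(x0, dtype=float)
    samples = []
    for _ in range(n_iter):
        x = x - step * grad_f(x) + np.sqrt(2 * step) * rng.standard_normal(x.shape)
        samples.append(x.copy())
    return np.array(samples)

samples = langevin(np.zeros(2))
print(samples.mean(axis=0), samples.var(axis=0))  # roughly 0 and 1 for this Gaussian target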
Beyond log-concave sampling (Part 3)
In the first post of this series, we introduced the challenges of sampling distributions beyond log-concavity. In Part 2 we tackled sampling from multimodal distributions: a typical obstacle occurring...
When are Neural Networks more powerful than Neural Tangent Kernels?
The empirical success of deep learning has posed significant challenges to machine learning theory: Why can we efficiently train neural networks with gradient descent despite its highly non-convex...
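For reference (in generic notation, not necessarily that of the post), the Neural Tangent Kernel of a network $f(x;\theta)$ at initialization $\theta_0$ is

$$\Theta(x, x') \;=\; \big\langle \nabla_{\theta} f(x;\theta_0),\; \nabla_{\theta} f(x';\theta_0) \big\rangle,$$

and in the infinite-width limit, gradient descent on the network behaves like kernel regression with this fixed kernel; the title asks when trained networks provably go beyond this kernel regime.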
Rip van Winkle's Razor, a Simple New Estimate for Adaptive Data Analysis
Can you trust a model whose designer had access to the test/holdout set? This implicit question in Dwork et al. 2015 launched a new field, adaptive data analysis. The question referred to the fact that...
Implicit Regularization in Tensor Factorization: Can Tensor Rank Shed...
In an effort to understand implicit regularization in deep learning, a lot of theoretical focus is being directed at matrix factorization, which can be seen as a linear neural network. This post is based...
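To make the connection concrete: fitting the observed entries of a matrix with a product $W_2 W_1$ and gradient descent is a depth-2 linear neural network solving matrix completion. The sketch below is a minimal illustration; the matrix sizes, observation rate, initialization scale, and step size are arbitrary choices.

import numpy as np

rng = np.random.default_rng(0)
d, r = 20, 2
target = rng.standard_normal((d, r)) @ rng.standard_normal((r, d))  # low-rank ground truth
mask = rng.random((d, d)) < 0.3                                      # observed entries

# Depth-2 linear network W2 @ W1, trained by gradient descent on the observed entries only.
W1 = 1e-3 * rng.standard_normal((d, d))
W2 = 1e-3 * rng.standard_normal((d, d))
lr = 0.01
for _ in range(3000):
    resid = mask * (W2 @ W1 - target)  # error on observed entries
    gW2, gW1 = resid @ W1.T, W2.T @ resid
    W2 -= lr * gW2
    W1 -= lr * gW1

# With small initialization, the learned product tends toward low effective rank.
print(np.linalg.svd(W2 @ W1, compute_uv=False)[:5].round(3))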
Does Gradient Flow Over Neural Networks Really Represent Gradient Descent?
TL;DR: A lot was said in this blog (cf. post by Sanjeev) about the importance of studying trajectories of gradient descent (GD) for understanding deep learning. Researchers often conduct such studies by...
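For context (the notation $\theta$ for parameters, $L$ for the training loss, and $\eta$ for the step size is ours), gradient flow is the continuous-time model obtained from gradient descent by letting the step size tend to zero:

$$\dot{\theta}(t) \;=\; -\nabla L\big(\theta(t)\big), \qquad\qquad \theta_{k+1} \;=\; \theta_k - \eta\,\nabla L(\theta_k).$$

Gradient descent is the Euler discretization of the flow, and the title's question can be read as asking how faithful this approximation is at the finite step sizes used in practice.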
Predicting Generalization using GANs
A central problem of generalization theory is the following: Given a training dataset and a deep net trained with that dataset, give a mathematical estimate of the test error. While this may seem...
Implicit Regularization in Hierarchical Tensor Factorization and Deep...
The ability of large neural networks to generalize is commonly believed to stem from an implicit regularization — a tendency of gradient-based optimization towards predictors of low complexity. A lot...