Channel: Off the convex path
Browsing all 53 articles


Training GANs - From Theory to Practice

GANs, originally discovered in the context of unsupervised learning, have had far-reaching implications for science, engineering, and society. However, training GANs remains challenging (in part) due to...


Beyond log-concave sampling

As the growing number of posts on this blog would suggest, recent years have seen a lot of progress in understanding optimization beyond convexity. However, optimization is only one of the basic...


Mismatches between Traditional Optimization Analyses and Modern Deep Learning

You may remember our previous blog post showing that it is possible to do state-of-the-art deep learning with a learning rate that increases exponentially during training. It was meant to be a dramatic...
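The schedule mentioned above can be sketched in a few lines. The base rate `lr0` and growth factor `alpha` below are illustrative parameters, not values taken from the post:

```python
# Sketch of an exponentially increasing learning-rate schedule.
# lr0 and alpha are illustrative; the post's actual schedule may differ.
def exponential_lr(lr0: float, alpha: float, step: int) -> float:
    """Learning rate after `step` updates, growing by factor alpha per step."""
    return lr0 * (alpha ** step)

# The rate grows monotonically during training.
rates = [exponential_lr(0.1, 1.001, t) for t in (0, 1000, 2000)]
```

Even a growth factor as small as 1.001 compounds quickly over thousands of steps, which is what makes the result surprising.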


How to allow deep learning on your data without revealing the data

Today’s online world and the emerging internet of things are built around a Faustian bargain: consumers (and their internet of things) hand over their data, and in return get customization of the world...


Can implicit regularization in deep learning be explained by norms?

This post is based on my recent paper with Noam Razin (to appear at NeurIPS 2020), studying the question of whether norms can explain implicit regularization in deep learning. TL;DR: we argue they...


Beyond log-concave sampling (Part 2)

In our previous blog post, we introduced the challenges of sampling distributions beyond log-concavity. We first introduced the problem of sampling from a distribution $p(x) \propto e^{-f(x)}$ given...
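For context, the standard baseline for sampling $p(x) \propto e^{-f(x)}$ is unadjusted Langevin dynamics: repeatedly take a gradient step on $f$ and add Gaussian noise. A minimal one-dimensional sketch, assuming a standard Gaussian target $f(x) = x^2/2$ and an illustrative step size `eta` (this is the generic baseline algorithm, not the method developed in the post):

```python
import math
import random

# Unadjusted Langevin dynamics for p(x) ∝ exp(-f(x)) in one dimension.
# Target here is f(x) = x^2 / 2 (standard Gaussian); eta is illustrative.
def grad_f(x: float) -> float:
    return x  # gradient of f(x) = x^2 / 2

def langevin_samples(n_steps: int = 50_000, eta: float = 0.01, seed: int = 0):
    rng = random.Random(seed)
    x, samples = 0.0, []
    for _ in range(n_steps):
        # x_{t+1} = x_t - eta * grad f(x_t) + sqrt(2 * eta) * N(0, 1)
        x = x - eta * grad_f(x) + math.sqrt(2 * eta) * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples

samples = langevin_samples()
mean = sum(samples) / len(samples)  # close to 0 for the Gaussian target
```

When $f$ is convex (the target is log-concave), this chain mixes rapidly; the series is about what happens when that assumption fails, e.g. for multimodal targets.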


Beyond log-concave sampling (Part 3)

In the first post of this series, we introduced the challenges of sampling distributions beyond log-concavity. In Part 2 we tackled sampling from multimodal distributions: a typical obstacle occurring...


When are Neural Networks more powerful than Neural Tangent Kernels?

The empirical success of deep learning has posed significant challenges to machine learning theory: Why can we efficiently train neural networks with gradient descent despite its highly non-convex...


Rip van Winkle's Razor, a Simple New Estimate for Adaptive Data Analysis

Can you trust a model whose designer had access to the test/holdout set? This implicit question in Dwork et al. 2015 launched a new field, adaptive data analysis. The question referred to the fact that...


Implicit Regularization in Tensor Factorization: Can Tensor Rank Shed...

In an effort to understand implicit regularization in deep learning, a lot of theoretical focus is being directed at matrix factorization, which can be seen as a linear neural network. This post is based...
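As a toy illustration of "matrix factorization as a linear neural network," one can fit a scalar target through a product of two trainable weights using plain gradient descent. Everything below (target, step size, initialization) is illustrative; the actual analyses concern full matrices and deeper products:

```python
# Depth-2 scalar "factorization": fit y = w2 * w1 to a target by gradient
# descent on the squared loss. Target, step size, and init are illustrative.
def train(target: float = 2.0, lr: float = 0.01, steps: int = 5000,
          w1: float = 0.5, w2: float = 0.5):
    for _ in range(steps):
        err = w2 * w1 - target          # residual of the end-to-end linear map
        g1, g2 = err * w2, err * w1     # gradients of 0.5 * err**2
        w1 -= lr * g1
        w2 -= lr * g2
    return w1, w2

w1, w2 = train()
product = w1 * w2  # the end-to-end map converges to the target
```

Note that the loss is non-convex in (w1, w2) even though the end-to-end map is linear; this overparametrized reformulation is exactly what makes the implicit bias of gradient descent interesting here.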


Does Gradient Flow Over Neural Networks Really Represent Gradient Descent?

TL;DR: A lot was said in this blog (cf. post by Sanjeev) about the importance of studying trajectories of gradient descent (GD) for understanding deep learning. Researchers often conduct such studies by...


Predicting Generalization using GANs

A central problem of generalization theory is the following: Given a training dataset and a deep net trained with that dataset, give a mathematical estimate of the test error. While this may seem...


Implicit Regularization in Hierarchical Tensor Factorization and Deep...

The ability of large neural networks to generalize is commonly believed to stem from an implicit regularization — a tendency of gradient-based optimization towards predictors of low complexity. A lot...
