\documentclass[11pt, a4paper]{article}
\usepackage{amsmath, amssymb, hyperref}
\hypersetup{colorlinks=true, linkcolor=blue, urlcolor=blue, citecolor=blue, breaklinks=true}
\usepackage[round]{natbib}
\usepackage{xcolor}
\newcommand{\pnote}[1]{\textcolor{blue}{#1}}
\newcommand{\wnote}[1]{\textcolor{red}{#1}}
\title{Modules}
\date{last modified: \today}
\author{}
\begin{document}
\maketitle
The modules below are in no particular order (except for the Basics, of course).
%\footnote{\wnote{The order is actually not too arbitrary :) I noted a few exceptions below. Let's try and come up with an ideal order or some other form of dependency.}}
\section{Basics}
\begin{itemize}
\item What is a prior?
\item What is a posterior and what is posterior inference? $ \rightarrow $ recap of Bayes' rule
\item Sampling as an intuitive way of performing inference before diving into the realm of VI?
\item Example problems: Factorial HMMs, Bayesian Mixture Models (show GMs)
\item ELBO derivation I: from the KL divergence (sketched after this list)
\item ELBO derivation II: via Jensen's inequality (sketched after this list)
\item Connection to EM
\item Mean Field inference
\item Application to example problems (show GMs)
\end{itemize}
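For reference, a minimal sketch of both derivations, assuming a joint model $ p(x, z) $ over data $ x $ and latents $ z $ and a variational distribution $ q(z) $. From the KL divergence:
\begin{align*}
\mathrm{KL}(q(z) \,\|\, p(z \mid x)) &= \mathbb{E}_{q}[\log q(z) - \log p(z \mid x)] \\
&= \log p(x) - \underbrace{\mathbb{E}_{q}[\log p(x, z) - \log q(z)]}_{\mathrm{ELBO}},
\end{align*}
so $ \log p(x) = \mathrm{ELBO} + \mathrm{KL} \geq \mathrm{ELBO} $, with equality iff $ q(z) = p(z \mid x) $. Via Jensen's inequality:
\[
\log p(x) = \log \mathbb{E}_{q}\!\left[ \frac{p(x, z)}{q(z)} \right] \geq \mathbb{E}_{q}\!\left[ \log \frac{p(x, z)}{q(z)} \right] = \mathrm{ELBO}.
\]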
\section{Conjugate Models}
\begin{itemize}
\item Exponential families
\item Gaussian-Gaussian conjugacy
\item Example: Bayesian Linear Regression
\item Beta-Binomial warmup for Dirichlet-multinomial? (update sketched after this list)
\item Dirichlet-multinomial conjugacy
\item Example: LDA
\item Conjugate VI in the general case \citep{Beal:2003}
\end{itemize}
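The Beta-Binomial warmup in one line: with prior $ \theta \sim \mathrm{Beta}(\alpha, \beta) $ and $ k $ successes in $ n $ trials,
\[
p(\theta \mid k, n) \propto \theta^{k} (1 - \theta)^{n - k} \cdot \theta^{\alpha - 1} (1 - \theta)^{\beta - 1} = \theta^{\alpha + k - 1} (1 - \theta)^{\beta + n - k - 1},
\]
i.e.\ the posterior is $ \mathrm{Beta}(\alpha + k, \beta + n - k) $; the Dirichlet-multinomial case generalises exactly this count-updating pattern to $ K $ outcomes.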
\section{Stochastic algorithms}
\begin{itemize}
\item Stochastic optimisation \citep{RobbinsEtAl:1951} (step-size conditions recalled after this list)
\item SVI \citep{HoffmanEtAl:2013}
\end{itemize}
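As a reminder of what this module builds on: noisy gradient ascent $ \lambda_{t+1} = \lambda_t + \rho_t \hat{g}_t $, with $ \hat{g}_t $ an unbiased gradient estimate, converges (under the usual regularity conditions) for step sizes satisfying
\[
\sum_{t=1}^{\infty} \rho_t = \infty, \qquad \sum_{t=1}^{\infty} \rho_t^2 < \infty .
\]
SVI applies this scheme to natural-gradient updates of the global variational parameters, with the stochasticity coming from subsampling data points.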
\section{Deep Generative Models}
\subsection{Continuous Latent Variables}
\begin{itemize}
\item Review of generative models
\item Exact case: EM with features \citep{BergkirkpatrickEtAl:2010}
\item First attempt: Wake-sleep \citep{HintonEtAl:1995}
\item Variational Autoencoders \citep{KingmaWelling:2013, RezendeEtAl:2014} (objective sketched after this list)
\item Example models: Product of Bernoullis
\item Jupyter notebook as support
\end{itemize}
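The endpoint of this module in one formula, the per-datapoint VAE objective of \citep{KingmaWelling:2013}:
\[
\mathcal{L}(\theta, \phi; x) = \mathbb{E}_{q_{\phi}(z \mid x)}\!\left[ \log p_{\theta}(x \mid z) \right] - \mathrm{KL}\!\left( q_{\phi}(z \mid x) \,\|\, p(z) \right),
\]
where encoder $ q_{\phi} $ and decoder $ p_{\theta} $ are neural networks and the expectation is estimated with reparametrised samples; in the product-of-Bernoullis example, $ p_{\theta}(x \mid z) $ factorises into independent Bernoulli likelihoods.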
\subsection{Discrete Latent Variables}
\begin{itemize}
\item Laplace Approximation
\item Gradient methods
\item Problem: the gradient of a Monte Carlo average with respect to the parameters of the sampling distribution cannot be taken naively
\item Idea: push the gradient inside the expectation, turning $ \nabla_{\lambda} \mathbb{E}_{q_{\lambda}}[\cdot] $ into an expectation that can itself be estimated by sampling (both identities sketched after this list)
\item Score function gradient $ \rightarrow $ Black Box VI \citep{PaisleyEtAl:2012, RanganathEtAl:2014}
\item Reparametrisation gradient \citep{KingmaWelling:2013, RezendeEtAl:2014, TitsiasLazarogredilla:2014}
\end{itemize}
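The two identities in question, for a variational family $ q_{\lambda} $ and an integrand $ f $ (assumed, for this sketch, not to depend on $ \lambda $):
\[
\nabla_{\lambda} \mathbb{E}_{q_{\lambda}}[f(z)] = \mathbb{E}_{q_{\lambda}}\!\left[ f(z) \, \nabla_{\lambda} \log q_{\lambda}(z) \right] \quad \text{(score function)},
\]
which follows from $ \nabla_{\lambda} q_{\lambda} = q_{\lambda} \nabla_{\lambda} \log q_{\lambda} $ and applies even to discrete $ z $, though with high variance; and
\[
\nabla_{\lambda} \mathbb{E}_{q_{\lambda}}[f(z)] = \mathbb{E}_{q_0(\epsilon)}\!\left[ \nabla_{\lambda} f(g_{\lambda}(\epsilon)) \right] \quad \text{(reparametrisation, } z = g_{\lambda}(\epsilon), \ \epsilon \sim q_0\text{)},
\]
which requires a differentiable sampling path and so does not apply directly to discrete $ z $.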
\section{Bayesian Neural Networks}
\begin{itemize}
\item Putting priors on weights
\item Early work by Neal, MacKay and Hinton \citep{HintonVancamp:1993}
\item Recent work by DeepMind and others \citep{Graves:2011, BlundellEtAl:2015} (shared objective sketched after this list)
\item Bayesian Interpretation of Dropout \citep{Gal:2016}
\end{itemize}
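The common thread of these items is minimising a variational free energy over a weight posterior $ q_{\lambda}(w) $, roughly as in \citep{BlundellEtAl:2015}:
\[
\mathcal{F}(\lambda) = \mathrm{KL}\!\left( q_{\lambda}(w) \,\|\, p(w) \right) - \mathbb{E}_{q_{\lambda}(w)}\!\left[ \log p(\mathcal{D} \mid w) \right],
\]
estimated with reparametrised weight samples; \citet{HintonVancamp:1993} arrive at essentially the same objective from a minimum description length argument.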
\section{Reparametrisation Gradients}
Whether to teach this module at all should depend on the audience; the location-scale case can instead be covered in the modules about Nonconjugate models and/or DGMs.
\begin{itemize}
\item Recap: Gaussian reparametrisation (location-scale form recalled after this list)
\item Extension to general location-scale families \citep{TitsiasLazarogredilla:2014}
\item ADVI (depending on the audience, stop here; the next two are considerably more involved) \citep{KucukelbirEtAl:2017}
\item Generalised Reparametrisation Gradient \citep{RuizEtAl:2016}
\item Rejection Sampling VI \citep{NaessethEtAl:2017}
\end{itemize}
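The location-scale case in one line: if $ \epsilon \sim q_0 $ is parameter-free and $ z = \mu + \sigma \epsilon $, then
\[
\nabla_{\mu, \sigma} \, \mathbb{E}_{q_{\mu, \sigma}(z)}[f(z)] = \mathbb{E}_{q_0(\epsilon)}\!\left[ \nabla_{\mu, \sigma} \, f(\mu + \sigma \epsilon) \right]
\]
for differentiable $ f $; the generalised and rejection-sampling variants extend this to families (e.g.\ Gamma or Dirichlet) that lack an exact parameter-free standardisation.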
\section{Normalising Flows [Advanced]}
\begin{itemize}
\item Review Gaussian Reparametrisation
\item MADE \citep{GermainEtAl:2015}
\item Generative RNNs on continuous data as normalising flows \citep{KingmaEtAl:2016,PapamakariosEtAl:2017} (change-of-variables identity recalled after this list)
\end{itemize}
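The identity the whole module rests on: for invertible maps $ f_1, \dots, f_K $ with $ z_k = f_k(z_{k-1}) $ and $ z_0 \sim q_0 $,
\[
\log q_K(z_K) = \log q_0(z_0) - \sum_{k=1}^{K} \log \left| \det \frac{\partial f_k}{\partial z_{k-1}} \right|,
\]
so flexible posteriors remain tractable as long as each Jacobian determinant is cheap; autoregressive masking (as in MADE) yields triangular Jacobians, which is what makes the cited flow constructions efficient.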
\section{Nonparametric Models [Advanced]}
\begin{itemize}
\item Intro to stick-breaking processes \citep{IshwaranJames:2001} (construction sketched after this list)
\item VI for HDP/PYP \citep{WangEtAl:2011}
\item Intro to GPs
\item VI for GPs
\end{itemize}
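The stick-breaking construction in its simplest (DP) form, a special case of the scheme in \citep{IshwaranJames:2001}: draw $ v_k \sim \mathrm{Beta}(1, \alpha) $ and set
\[
\pi_k = v_k \prod_{j=1}^{k-1} (1 - v_j), \qquad k = 1, 2, \dots,
\]
so the weights $ \pi_k $ sum to one; truncating at a finite level $ K $ yields the finite variational families used for HDP/PYP models.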
\section{Beyond Mean Field [Advanced]}
\begin{itemize}
\item Structured VI (example: Bayesian or Factorial HMMs; block factorisation sketched after this list)
\item Auxiliary variables
\item Hierarchical variational models
\end{itemize}
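A one-line statement of the relaxation, assuming the latents are partitioned into blocks $ z_1, \dots, z_M $: instead of fully factorising over individual variables, structured VI posits
\[
q(z) = \prod_{m=1}^{M} q(z_m),
\]
keeping tractable dependencies (e.g.\ the chain structure of an HMM) intact within each block; auxiliary-variable and hierarchical constructions instead enrich $ q $ by mixing over additional randomness.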
\section{Collapsed VB [Advanced]}
Another module whose inclusion depends on the audience: people with Bayesian aspirations versus people who want to play with DGMs.
\begin{itemize}
\item Taylor expansions
\item Example: LDA
\item Connection between collapsed VB and unconstrained variational approximation \citep{TehEtAl:2007} (bound sketched after this list)
\item CVB0 \citep{AsuncionEtAl:2009}
\end{itemize}
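The starting point, as in \citep{TehEtAl:2007}: marginalise the parameters $ \theta $ exactly and bound the evidence with a variational distribution over the latent assignments $ z $ alone,
\[
\log p(x) \geq \mathbb{E}_{q(z)}\!\left[ \log \int p(x, z, \theta) \, d\theta \right] + \mathcal{H}[q(z)],
\]
which is at least as tight as the joint mean-field bound over $ (z, \theta) $; the Taylor expansions (and the zeroth-order CVB0 variant) are what make the collapsed expectations tractable.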
\section{Beyond KL [Advanced]}
\begin{itemize}
\item $ \alpha $-divergence (make connection to EP; definition recalled after this list)
\item Stein VI
\item Implicit models
\item H\"older bound
\end{itemize}
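For orientation, the $ \alpha $-divergence in one common parameterisation (conventions differ across papers):
\[
D_{\alpha}(p \,\|\, q) = \frac{1}{\alpha (1 - \alpha)} \int \left( \alpha \, p(x) + (1 - \alpha) \, q(x) - p(x)^{\alpha} q(x)^{1 - \alpha} \right) dx,
\]
recovering $ \mathrm{KL}(q \,\|\, p) $ as $ \alpha \to 0 $ and $ \mathrm{KL}(p \,\|\, q) $ as $ \alpha \to 1 $; EP corresponds to local minimisation at $ \alpha = 1 $.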
\bibliographystyle{plainnat}
\bibliography{VI}
\end{document}