## Friday, April 28, 2017

### Bias in Markov Chain Monte Carlo

Markov Chain Monte Carlo (MCMC) simulation can be used to calculate sums $$\label{eq:20170427a} I = \sum_a \pi_a f(a)$$ One finds a Markov process $X_t$ with stationary distribution $\pi_a$, then the sum \eqref{eq:20170427a} is approximated by \begin{equation*} S =\frac{1}{N} \sum_{t=1}^N f(X_t) \end{equation*} One can prove that under certain assumptions, \begin{equation*} \lim_{N \to \infty} \frac{1}{N} \sum_{t=1}^N f(X_t) = \sum_a \pi_a f(a) \end{equation*} This is Birkhoff's ergodic theorem. In this post I illustrate the behaviour of $ES$ for large $N$.
A formula
First I calculated that for large $N$ $$\label{eq:20170427c} ES \approx \sum_a \pi_a f(a) + \frac{\beta}{N}$$ with $$\label{eq:20170427d} \beta = \sum_{\lambda \neq 1} \langle f | \lambda \rangle \langle \lambda | X_0 \rangle \frac{\lambda}{1 - \lambda}$$ In equation \eqref{eq:20170427d} the sum is over the eigenvalues of the transition matrix $P$ of the Markov process. I have used notation that is common in quantum mechanics: $| \lambda \rangle$ is a right eigenvector with eigenvalue $\lambda$, thus $P | \lambda \rangle =\lambda | \lambda \rangle$. Also, $\langle \lambda|$ is a left eigenvector with eigenvalue $\lambda$, thus $\langle \lambda| P$ =$\langle \lambda| \lambda$. I have normalized the eigenvectors. Furthermore, I write \begin{equation*} \langle f | = \sum_a f(a) \langle a | \end{equation*} with $\langle a |$ a vector with 1 on place $a$ and zero otherwise.

Test case
I work on the circle $a = 0, 1, \ldots, M-1 , M = 0$ and calculate with MCMC the sum \begin{equation*} I = \frac{1}{M} \sum_{a=0}^{M-1} f(a) \end{equation*} $X_t$ is the Markov process which moves left or right, each with probability $1/2$. The transition matrix $P$ has thus the form \begin{equation*} P_{ab} = \frac{1}{2} \left( \delta_{a,b+1} + \delta_{a, b-1} \right) \end{equation*} Its eigenvectors are exponential functions and its eigenvalues are $\cos\left(\frac{2 \pi k}{m} \right)$ with $k = 0, 1, \ldots, M-1$. Thus, for the case at hand \begin{equation*} \beta = \sum_{a=1}^M \sum_{k=1}^{M-1} \frac{1}{M} f(a) e^{ i 2 \pi k ( a - X_0)/M } \frac{ \cos (2 \pi k /M )}{1 - \cos (2 \pi k /M )} \end{equation*}
Illustration
I tested formula \eqref{eq:20170427c} in Python for the function $f(a) = \sqrt a$. I took $N = 2, 4, \ldots, 2^{20}$ and ran the MCMC simulation for each $N$ 10,000 times. I used these 10,000 simulations to estimate $ES$. In the graph below, I have plotted the bias $\beta/N$ for 2 different values of the starting point $X_0$ of the Markov process. For $X_0 = 0, \beta \approx 86$, for $X_0 = 40, \beta \approx 217$. In both cases the numerical data fit the formula \eqref{eq:20170427c} well.
References