Two problems involving random walks on graphs are studied. First, the starting point from which a random walk on is most likely to hit a given point before another given point is determined. Second, the slowest mixing initial distribution under a random walk on a given finite graph is found.
Introduction
This article describes my investigation into several basic problems regarding random walks on graphs. On several occasions, I asked myself questions which my intuition failed to answer. I guessed at an answer, and spent some time in a fruitless attempt at proving that it was correct. Out of frustration I turned to computer simulations, only to discover that my guesses were faulty. Once I had the correct answer, I was able to supply the proofs. As every mathematician knows, it is much easier to solve a problem when you know the right answer ahead of time. This presentation is deliberately informal, as it represents the record of an actual investigation that took place, rather than a crafted paper. In fact, the notebook that I used to run my experiments has become the paper, with explanatory text added and unnecessary debris removed.
Probability That a Random Walk Hits One Point before Another
A graph is a set of vertices
and a set of pairs of vertices
, known as edges. We will assume throughout that
is an undirected, simple graph. That is, no edges are of the form
, the edge
is the same as the edge
, and the set of edges
contains no repeated elements. The degree
of a vertex
of a graph is the number of edges containing
. A random walk on
is the random movement of a particle from vertex to vertex, where at each step a particle at any vertex moves to an adjacent vertex, with the probability of moving to each adjacent vertex given by the reciprocal of the degree of the vertex. This movement is memoryless, in the sense that each step is independent of earlier steps.
One of the most interesting and fundamental questions in this field is the question of recurrence and transience in the integer lattice of dimension . The basic and well-known result is that a random walk on
is recurrent (returns infinitely often to any point) if and only if
. See [1] for a beautiful proof of this involving resistance in electric circuits, as well as a more elementary proof. I have recently asked myself a related question. Given two fixed points
,
and a variable point
, let us find
, the probability that a random walk starting at
hits
before hitting
. In one dimension this question is quite easy, but in two dimensions, as far as I have been able to ascertain, no closed form for the solution is known. To circumvent the difficulty of this question, however, we can ask a simpler one. Namely, given
and
, let us determine the point
that maximizes
. I spent some time working in vain on this problem, before I realized that it would help to know what the answer is before attempting a proof. I wrote the following program.
The function q is used to generate the correct range of points, a diamond-shaped region of radius n, and place the points in the right order. Here is an example.
This nests the diamonds to create a square lattice.
To understand the function f one must understand a bit of the mathematics of random walks. Due to the memoryless property of random walks, is a harmonic function. That is,
is the average of its surrounding values,
for all in
and the sum is over all
adjacent to
. In fact, it is easy to see that
is the unique function harmonic on
, such that
,
, and
as
(see [1]). The program f begins by setting
,
, and
for all other
. In each iteration through the While loop, the value of the function at each point is set to the average of the neighboring values. In this way the function fun becomes more and more nearly harmonic, and as a result closer to the values of our desired function
. This process is described in [1], and just to check that f is working correctly I tested it on an example that appears in [1].
This exactly matches the values given in [1]. Confident that my function works well, I tested it on an example.
And here we see the value of numerical experimentation. Lacking what I now know to be the proper intuition, I was foolishly confident that the maximum would occur at the point , since this seems to be the point most “separated” from
by
. However, the simulation above clearly shows that this is not the case, and that in fact
and
are the maximal values. After playing with several other examples, which readers are encouraged to do on their own, I realized that the maximal point is always one adjacent to
. Once that is realized, it is clear that it must be the point or points “farthest” from
, in some sense. One can define farthest in terms of the Euclidean metric, but it is actually more natural to define it in the way given in the statement of theorem 1.
Theorem 1

















Proof
The statement of the theorem is that the maximum occurs at , since that is the point adjacent to
that does not lie in a half-plane with
. First let us prove that the maximum occurs at a point adjacent to
. Let
be the set of points adjacent to
. Then, for any
not in
,
Thus, is seen to be a weighted average over
of the quantities
with weights
. Since
it follows that
so that the maximum is obtained on . Now, to determine at which point it is obtained, let us first define a reflection in this context. The reflection of a point
in the line
is the unique point
such that
is a line perpendicular to
with its midpoint on
. So, for instance, in the graphic above the reflection of
over
is the point
. If
is a sequence of points in
such that
is adjacent to
for all
, then we call
a path. The reflection of a path
in
is the path
, where each
is the reflection of
. So, in the graphic below, the blue path is a reflection in
of the green path.
Now,
Suppose that lies in one of the half-planes formed by the line
and that
is a point adjacent to
that also lies in this half-plane. Let
be any path starting at
that hits
before
. This path must hit
at some point, possibly at
. Let
be the smallest value such that
. Then we set
. That is,
is the reflection of
up until
and
both hit
, from which point
and
coincide.
is then a path, beginning at
, that must hit
before
(note that for
,
lies in the half-plane not containing
). The map
is an injective map from the set of paths starting from
that hit
before
to the set of paths starting from
that hit
before
. Thus, the number of distinct paths starting at
of length
that hit
before
is less than or equal to the number of distinct paths starting at
of length
that hit
before
. It is in fact easy to see that this inequality is strict, since there are paths starting at
that hit
before
, but whose reflections hit
before
. It follows that
. Thus, the maximum occurs at the point or points adjacent to
that do not share a half-plane with
, and the theorem follows.
Slowest Mixing Distribution
Suppose now that is a finite graph. The question of hitting probabilities of points can be handled in a similar way to the above argument, where we still must look for harmonic functions on the graph. However, there is a related question that is a bit different, but quite interesting. Suppose, rather than a sole walker, we have a large number of random walkers, all moving at the same time. The object of interest will be the fraction of walkers that are at each vertex at any given time. Suppose
has
vertices, numbered from 1 to
. A distribution vector is a vector in
whose components sum to 1, representing a distribution of random walkers on the graph. Form an
matrix whose elements satisfy
unless vertices
and
are adjacent, in which case
. This is known as the adjacency matrix. We will suppose below that the matrix is regular (every vertex has the same degree) and not bipartite (bipartite means that the vertices of the graph can be divided into two sets
and
, such that every vertex in
is adjacent only to vertices in
, and vice versa). Let
denote the probability matrix obtained by dividing all elements in the adjacency matrix by the degree of each vertex. If at time 0 the walkers have distribution
, then
is the distribution at time 1, followed by
, etc. The distribution vector
gives the stationary distribution, the distribution that is unchanged (since each row of
sums to 1) when the random walk undergoes a step. The key question is, when we start with an arbitrary distribution, what happens? Do the walkers somehow approach the stationary distribution, or do they stay disordered forever? It is a consequence of the Perron-Frobenius theorem that all eigenvalues of
are real and lie in the interval
, and that the eigenvectors
form an orthogonal basis of
. The largest eigenvalue
is simple and equal to 1, with the stationary distribution
as an eigenvector. Any function
on the vertices of the graph can be expressed uniquely as
, and
,
, and so on. If
is a distribution vector, then
since the basis of eigenvectors is orthogonal and the sum of the components of any vector orthogonal to
must be 0. We know that
. Let us assume that none of
are equal to
(in case one of them is, it can be shown that the graph is bipartite). In that case, an arbitrary distribution will converge geometrically to the stationary distribution, since
as
. Thus, we see that the largest magnitude of the eigenvectors other than
determines the slowest possible rate of convergence of any distribution to the stationary one. This is all standard and well known in this field (see [2]). I was interested in using Mathematica to understand these theorems better, so I decided to try to determine the “slowest mixing” distribution, given any graph. Again, my intuition failed me, as I imagined that the slowest mixing distribution was likely to be complicated and difficult to obtain. The slowest mixing distribution will be the distribution vector
where
is maximized, where
is chosen to be the largest eigenvalue (in magnitude) other than
. If we define
by
, then we seek to find the maximal value of
that lies in
. Let us use GraphData to find interesting examples with which to work.
We only want regular graphs, and let us consider a number of vertices that is small enough with which to calculate but still large enough to be interesting; nine seems reasonable. Let us also take only graphs of degree four.
We get rid of the bipartite and disconnected graphs.
Here are the graphs.
Here are the spectrums, the sets of eigenvalues of adjacency matrices. They all have degree four as their largest eigenvalue, as they should, and all other eigenvalues lie in .
Just as a check, let us make sure that all spectrums sum to 0. This is because the sum of the eigenvalues of a matrix is equal to the trace, and the trace of the adjacency matrix is 0.
The sum of the squares of the eigenvalues gives twice the number of edges.
The sum of the cubes of the eigenvalues gives six times the number of triangles in the graph.
See [3] for a proof of the preceding facts. We see that the fourth graph in our list has many triangles, while the next to last has hardly any. It may be guessed that this property has something to do with how rapidly a distribution gets mixed on a graph, so let us closely consider these two graphs.
Here is what they look like.
Recall that we are trying to find the distribution that converges most slowly to the uniform one. The probability matrix is defined to be the adjacency matrix divided by the degree, in this case four. In the context of random walks on regular graphs, the probability matrix is more relevant than the adjacency matrix, as it represents the transition probabilities to the neighboring states of each step of the walk. Here are the all-important eigenvalues and eigenvectors for the probability matrix of the graph with many triangles.
We define , …,
to be the eigenvectors.
Now, if the theory is correct, this should be an orthogonal basis of , so that for any vector
we should have
This is a standard result from linear algebra, and holds in the much more general context of Hilbert spaces (see [4]). Let us verify it with the vector .
In order to find the slowest mixing distribution, then, we need to maximize the coefficient of in the representation of any distribution vector
in
. A distribution vector is a vector
such that
. Thus, we are led to ask Mathematica the following.
Interestingly enough, the slowest mixing distribution occurs when we concentrate the entire mass of walkers on vertex 8. But which is vertex 8? Here is the graph with the vertices labeled.
There does not seem to be anything too special about vertex 8, and considering that , we see that vertices 1, 2, and 9 would have fared equally well.
Let us now see how the other graph works.
Here, , so we see that vertices 6, 7, and 9 would have worked equally well. Here is the labeled graph.
Having worked through all of this, it is now clear which is the slowest mixing distribution. One can simply take the maximal component of and place all of the mass on that vertex. One could also distribute the mass over several vertices with maximal size and the same sign, for instance over vertices 8 and 9 in the notri graph. We are led once again to the correct theorem.
Theorem 2
Acknowledgments
This work was supported by the Priority Research Centers Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (Grant #2009-0094070). I would like to thank Professor Sangu Lee of Sungkyunkwan University for the invitation that led to the writing of this paper.
References
[1] | P. Doyle and L. Snell, Random Walks and Electric Networks, Washington, D.C.: Mathematical Association of America, 1984. |
[2] | J. Doob, Stochastic Processes, New York: Wiley, 1953. |
[3] | D. West, Introduction to Graph Theory, Upper Saddle River, NJ: Prentice Hall Publishing, 2001. |
[4] | G. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed., New York: Wiley, 1999. |
G. Markowsky, “Two Basic Results Concerning Random Walks on Graphs,” The Mathematica Journal, 2011. dx.doi.org/doi:10.3888/tmj.13-16. |
About the Author
Greg Markowsky is currently a lecturer in mathematics at Monash University. He does research on probability theory, complex analysis, and geometry. He also worked for two years at Wolfram Research on the Wolfram|Alpha project.
Greg Markowsky
School of Mathematical Sciences
Monash University
Building 28
Wellington Road
Clayton
Victoria 3800, Australia 9905-4487
gmarkowsky@gmail.com