Find the probability of X and Y meeting at a vertex in the cases
(a) X starts at A and Y starts at B;
(b) X starts at A and Y starts at E.
For I = B, D let $M_I$ denote the expected time, when both X and Y start at I, until they are once again both at I. Show that $9M_B = 16M_D$.
1.10 Ergodic theorem
Ergodic theorems concern the limiting behaviour of averages over time. We shall prove a theorem which identifies for Markov chains the long-run proportion of time spent in each state. An essential tool is the following ergodic theorem for independent random variables which is a version of the strong law of large numbers.
Theorem 1.10.1 (Strong law of large numbers). Let $Y_1, Y_2, \ldots$ be a sequence of independent, identically distributed, non-negative random variables with $E(Y_1) = \mu$. Then
\[
P\left( \frac{Y_1 + \cdots + Y_n}{n} \to \mu \text{ as } n \to \infty \right) = 1.
\]
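As a quick numerical illustration (my own aside, not from the text), averaging independent draws shows the convergence; the exponential distribution below is an arbitrary choice of non-negative variable with $\mu = 2$.

```python
import numpy as np

# Sample means of i.i.d. non-negative variables, here Exp with mean mu = 2;
# the distribution is an arbitrary choice for illustration.
rng = np.random.default_rng(3)
for n in (10, 1_000, 100_000):
    Y = rng.exponential(scale=2.0, size=n)
    print(n, Y.mean())            # approaches mu = 2 as n grows
```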
Proof. A proof for the case $\mu < \infty$ may be found, for example, in Probability with Martingales by David Williams (Cambridge University Press, 1991).
The case where $\mu = \infty$ is a simple deduction. Fix $N < \infty$ and set $Y_n^{(N)} = Y_n \wedge N$. Then
\[
\frac{Y_1 + \cdots + Y_n}{n} \ge \frac{Y_1^{(N)} + \cdots + Y_n^{(N)}}{n} \to E(Y_1 \wedge N) \quad \text{as } n \to \infty
\]
with probability one. As $N \uparrow \infty$ we have $E(Y_1 \wedge N) \uparrow \mu$ by monotone convergence (see Section 6.4). So we must have, with probability 1,
\[
\frac{Y_1 + \cdots + Y_n}{n} \to \infty \quad \text{as } n \to \infty.
\]

We denote by $V_i(n)$ the number of visits to $i$ before $n$:
\[
V_i(n) = \sum_{k=0}^{n-1} 1_{\{X_k = i\}}.
\]
Then $V_i(n)/n$ is the proportion of time before $n$ spent in state $i$. The following result gives the long-run proportion of time spent by a Markov chain in each state.
Theorem 1.10.2 (Ergodic theorem). Let $P$ be irreducible and let $\lambda$ be any distribution. If $(X_n)_{n \ge 0}$ is Markov$(\lambda, P)$ then
\[
P\left( \frac{V_i(n)}{n} \to \frac{1}{m_i} \text{ as } n \to \infty \right) = 1
\]
where $m_i = E_i(T_i)$ is the expected return time to state $i$. Moreover, in the positive recurrent case, for any bounded function $f : I \to \mathbb{R}$ we have
\[
P\left( \frac{1}{n} \sum_{k=0}^{n-1} f(X_k) \to \bar f \text{ as } n \to \infty \right) = 1
\]
where $\bar f = \sum_{i \in I} \pi_i f_i$ and $(\pi_i : i \in I)$ is the unique invariant distribution.
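As an illustration (not from the text), here is a minimal Python sketch that simulates a small irreducible chain and checks that the proportion of time $V_i(n)/n$ spent in each state settles near $1/m_i = \pi_i$; the two-state matrix is an arbitrary hypothetical example.

```python
import numpy as np

# Hypothetical two-state chain, chosen only to illustrate the theorem.
rng = np.random.default_rng(0)
P = np.array([[0.7, 0.3],
              [0.2, 0.8]])

n = 200_000
X = np.empty(n, dtype=int)
X[0] = 0
for k in range(1, n):
    X[k] = rng.choice(2, p=P[X[k - 1]])   # one step of the chain

for i in range(2):
    print(i, np.mean(X == i))             # V_i(n)/n, the time average

# Solving pi P = pi gives pi = (0.4, 0.6), so the averages should be
# close to 1/m_0 = 0.4 and 1/m_1 = 0.6.
```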
Proof. If $P$ is transient, then, with probability 1, the total number $V_i$ of visits to $i$ is finite, so
\[
\frac{V_i(n)}{n} \le \frac{V_i}{n} \to 0 = \frac{1}{m_i}.
\]
Suppose then that $P$ is recurrent and fix a state $i$. For $T = T_i$ we have $P(T < \infty) = 1$ by Theorem 1.5.7, and $(X_{T+n})_{n \ge 0}$ is Markov$(\delta_i, P)$ and independent of $X_0, X_1, \ldots, X_T$ by the strong Markov property. The long-run proportion of time spent in $i$ is the same for $(X_{T+n})_{n \ge 0}$ and $(X_n)_{n \ge 0}$, so it suffices to consider the case $\lambda = \delta_i$.

Write $S_i^{(r)}$ for the length of the $r$th excursion to $i$, as in Section 1.5. By Lemma 1.5.1, the non-negative random variables $S_i^{(1)}, S_i^{(2)}, \ldots$ are independent and identically distributed with $E(S_i^{(r)}) = m_i$. Now
\[
S_i^{(1)} + \cdots + S_i^{(V_i(n)-1)} \le n - 1,
\]
the left-hand side being the time of the last visit to $i$ before $n$. Also
\[
S_i^{(1)} + \cdots + S_i^{(V_i(n))} \ge n,
\]
the left-hand side being the time of the first visit to $i$ after $n - 1$. Hence
\[
\frac{S_i^{(1)} + \cdots + S_i^{(V_i(n)-1)}}{V_i(n)} \le \frac{n}{V_i(n)} \le \frac{S_i^{(1)} + \cdots + S_i^{(V_i(n))}}{V_i(n)}. \tag{1.8}
\]
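The excursion decomposition itself can also be checked numerically. The sketch below (same hypothetical two-state chain as above) extracts the excursion lengths $S_i^{(r)}$ as the gaps between successive visits to $i$ and compares their empirical mean with $m_i = 1/\pi_i$.

```python
import numpy as np

# Same illustrative chain as before; start at i, so lambda = delta_i.
rng = np.random.default_rng(1)
P = np.array([[0.7, 0.3],
              [0.2, 0.8]])
i, n = 0, 200_000

X = np.empty(n, dtype=int)
X[0] = i
for k in range(1, n):
    X[k] = rng.choice(2, p=P[X[k - 1]])

visits = np.flatnonzero(X == i)   # times of successive visits to i
S = np.diff(visits)               # excursion lengths S_i^(1), S_i^(2), ...
print(S.mean())                   # should be close to m_i = 1/pi_i = 2.5
```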
By the strong law of large numbers,
\[
P\left( \frac{S_i^{(1)} + \cdots + S_i^{(n)}}{n} \to m_i \text{ as } n \to \infty \right) = 1
\]
and, since $P$ is recurrent,
\[
P(V_i(n) \to \infty \text{ as } n \to \infty) = 1.
\]
So, letting $n \to \infty$ in (1.8), we get
\[
P\left( \frac{n}{V_i(n)} \to m_i \text{ as } n \to \infty \right) = 1,
\]
which implies
\[
P\left( \frac{V_i(n)}{n} \to \frac{1}{m_i} \text{ as } n \to \infty \right) = 1.
\]
Assume now that $(X_n)_{n \ge 0}$ has an invariant distribution $(\pi_i : i \in I)$. Let $f : I \to \mathbb{R}$ be a bounded function and assume without loss of generality that $|f| \le 1$. For any $J \subseteq I$ we have
\[
\left| \frac{1}{n} \sum_{k=0}^{n-1} f(X_k) - \bar f \right| = \left| \sum_{i \in I} \left( \frac{V_i(n)}{n} - \pi_i \right) f_i \right|
\le \sum_{i \in J} \left| \frac{V_i(n)}{n} - \pi_i \right| + \sum_{i \notin J} \left( \frac{V_i(n)}{n} + \pi_i \right)
\le 2 \sum_{i \in J} \left| \frac{V_i(n)}{n} - \pi_i \right| + 2 \sum_{i \notin J} \pi_i.
\]
We proved above that
\[
P\left( \frac{V_i(n)}{n} \to \pi_i \text{ as } n \to \infty \text{ for all } i \right) = 1.
\]
Given $\varepsilon > 0$, choose $J$ finite so that
\[
\sum_{i \notin J} \pi_i < \varepsilon/4
\]
and then $N = N(\omega)$ so that, for $n \ge N(\omega)$,
\[
\sum_{i \in J} \left| \frac{V_i(n)}{n} - \pi_i \right| < \varepsilon/4.
\]
Then, for $n \ge N(\omega)$, we have
\[
\left| \frac{1}{n} \sum_{k=0}^{n-1} f(X_k) - \bar f \right| < \varepsilon,
\]
which establishes the desired convergence.

We consider now the statistical problem of estimating an unknown transition matrix $P$ on the basis of observations of the corresponding Markov chain. Consider, to begin, the case where we have $N + 1$ observations $(X_n)_{0 \le n \le N}$. The log-likelihood function is given by
\[
l(P) = \log(\lambda_{X_0} p_{X_0 X_1} \cdots p_{X_{N-1} X_N}) = \sum_{i,j \in I} N_{ij} \log p_{ij}
\]
up to a constant independent of $P$, where $N_{ij}$ is the number of transitions from $i$ to $j$. A standard statistical procedure is to find the maximum likelihood estimate $\hat P$, which is the choice of $P$ maximizing $l(P)$. Since $P$ must satisfy the linear constraint $\sum_j p_{ij} = 1$ for each $i$, we first try to maximize
\[
l(P) + \sum_{i,j \in I} \mu_i p_{ij}
\]
and then choose $(\mu_i : i \in I)$ to fit the constraints. This is the method of Lagrange multipliers. Thus we find
\[
\hat p_{ij} = \sum_{n=0}^{N-1} 1_{\{X_n = i,\, X_{n+1} = j\}} \bigg/ \sum_{n=0}^{N-1} 1_{\{X_n = i\}},
\]
which is the proportion of jumps from $i$ which go to $j$.
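In code, the estimate amounts to counting transitions and normalizing each row. The helper below is a minimal sketch under that reading; the function name and the guard for states that are never left are my own additions, not part of the text.

```python
import numpy as np

def mle_transition_matrix(X, num_states):
    """Maximum likelihood estimate: p_ij = N_ij / (number of jumps from i)."""
    N = np.zeros((num_states, num_states))
    for a, b in zip(X[:-1], X[1:]):
        N[a, b] += 1                       # N_ij: transitions from i to j
    rows = N.sum(axis=1, keepdims=True)
    # Rows for states never left are returned as zeros rather than 0/0.
    return np.divide(N, rows, out=np.zeros_like(N), where=rows > 0)
```

For an observed path X (an integer array), `mle_transition_matrix(X, num_states)` returns the matrix of proportions $\hat p_{ij}$.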
We now turn to consider the consistency of this sort of estimate, that is to say whether $\hat p_{ij} \to p_{ij}$ with probability 1 as $N \to \infty$. Since this is clearly false when $i$ is transient, we shall slightly modify our approach. Note that to find $\hat p_{ij}$ we simply have to maximize
\[
\sum_{j \in I} N_{ij} \log p_{ij}
\]
subject to $\sum_j p_{ij} = 1$: the other terms and constraints are irrelevant. Suppose then that instead of $N + 1$ observations we make enough observations to ensure the chain leaves state $i$ a total of $N$ times. In the transient case this may involve restarting the chain several times. Denote again by $N_{ij}$ the number of transitions from $i$ to $j$. To maximize the likelihood for $(p_{ij} : j \in I)$ we still maximize
\[
\sum_{j \in I} N_{ij} \log p_{ij}
\]
subject to $\sum_j p_{ij} = 1$, which leads to the maximum likelihood estimate $\hat p_{ij} = N_{ij}/N$. But $N_{ij} = Y_1 + \cdots + Y_N$, where $Y_n = 1$ if the $n$th transition from $i$ is to $j$, and $Y_n = 0$ otherwise. By the strong Markov property $Y_1, \ldots, Y_N$ are independent and identically distributed random variables with mean $p_{ij}$. So, by the strong law of large numbers,
\[
P(\hat p_{ij} \to p_{ij} \text{ as } N \to \infty) = 1,
\]
which shows that $\hat p_{ij}$ is consistent.
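The i.i.d. representation $N_{ij} = Y_1 + \cdots + Y_N$ also suggests a direct numerical check. In the sketch below (same hypothetical chain as earlier; since that chain is recurrent, no restarts are needed), each of the $N$ departures from $i$ is drawn independently from row $i$ of $P$, and $N_{ij}/N$ visibly approaches $p_{ij}$ as $N$ grows.

```python
import numpy as np

rng = np.random.default_rng(2)
P = np.array([[0.7, 0.3],
              [0.2, 0.8]])
i, j = 0, 1

for N in (100, 10_000, 1_000_000):
    # Each of the N departures from i is an independent draw from row i,
    # which is exactly the representation N_ij = Y_1 + ... + Y_N.
    Y = rng.choice(2, size=N, p=P[i]) == j
    print(N, Y.mean())            # N_ij / N, tending to p_ij = 0.3
```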
Exercises
1.10.1 Prove the claim (d) made in example (v) of the Introduction.
1.10.2 A professor has N umbrellas. He walks to the office in the morning and walks home in the evening. If it is raining he likes to carry an umbrella and if it is fine he does not. Suppose that it rains on each journey with probability p, independently of past weather. What is the
long-run proportion of journeys on which the professor gets wet?
1.10.3 Let $(X_n)_{n \ge 0}$ be an irreducible Markov chain on $I$ having an invariant distribution $\pi$. For $J \subseteq I$ let $(Y_m)_{m \ge 0}$ be the Markov chain on $J$ obtained by observing $(X_n)_{n \ge 0}$ whilst in $J$. (See Example 1.4.4.) Show that $(Y_m)_{m \ge 0}$ is positive recurrent and find its invariant distribution.
1.10.4 An opera singer is due to perform a long series of concerts. Having a fine artistic temperament, she is liable to pull out each night with probability 1/2. Once this has happened she will not sing again until the promoter convinces her of his high regard. This he does by sending flowers every day until she returns. Flowers costing $x$ thousand pounds, $0 \le x \le 1$, bring about a reconciliation with probability $\sqrt{x}$. The promoter stands to make £750 from each successful concert. How much should he spend on flowers?
1.11 Appendix: recurrence relations

Recurrence relations often arise in the linear equations associated to Markov chains. Here is an account of the simplest cases. A more specialized case was dealt with in Example 1.3.4. In Example 1.1.4 we found a recurrence relation of the form
\[
x_{n+1} = a x_n + b.
\]
We look first for a constant solution $x_n = x$; then $x = ax + b$, so provided $a \ne 1$ we must have $x = b/(1 - a)$. Now $y_n = x_n - b/(1 - a)$ satisfies $y_{n+1} = a y_n$, so $y_n = a^n y_0$. Thus the general solution when $a \ne 1$ is given by
\[
x_n = A a^n + b/(1 - a)
\]
where $A$ is a constant. When $a = 1$ the general solution is obviously
\[
x_n = x_0 + nb.
\]
In Example 1.3.3 we found a recurrence relation of the form
\[
a x_{n+1} + b x_n + c x_{n-1} = 0
\]
where $a$ and $c$ were both non-zero. Let us try a solution of the form $x_n = \lambda^n$; then $a\lambda^2 + b\lambda + c = 0$. Denote by $\alpha$ and $\beta$ the roots of this quadratic. Then
\[
y_n = A\alpha^n + B\beta^n
\]
is a solution. If $\alpha \ne \beta$ then we can solve the equations
\[
x_0 = A + B, \quad x_1 = A\alpha + B\beta
\]
so that $y_0 = x_0$ and $y_1 = x_1$; but
\[
a(y_{n+1} - x_{n+1}) + b(y_n - x_n) + c(y_{n-1} - x_{n-1}) = 0
\]
for all $n$, so by induction $y_n = x_n$ for all $n$. If $\alpha = \beta \ne 0$, then
\[
y_n = (A + nB)\alpha^n
\]
is a solution and we can solve
\[
x_0 = A, \quad x_1 = (A + B)\alpha
\]
so that $y_0 = x_0$ and $y_1 = x_1$; then, as before, $y_n = x_n$ for all $n$.
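Both closed forms are easy to check numerically. The sketch below is a minimal illustration with arbitrary coefficients of my own choosing: it iterates each recurrence directly and compares the result with the general solution derived above (the second-order case uses $(a, b, c) = (1, -1, -1)$, whose characteristic roots are distinct).

```python
import numpy as np

# First order: x_{n+1} = a x_n + b with a != 1; arbitrary test values.
a, b, x0, n = 0.5, 3.0, 1.0, 20
x = x0
for _ in range(n):
    x = a * x + b                          # direct iteration
A = x0 - b / (1 - a)                       # A = y_0 = x_0 - b/(1 - a)
print(x, A * a**n + b / (1 - a))           # the two values agree

# Second order: a x_{n+1} + b x_n + c x_{n-1} = 0 with (a, b, c) = (1, -1, -1),
# i.e. x_{n+1} = x_n + x_{n-1}; alpha, beta are the roots of t^2 - t - 1 = 0.
alpha, beta = np.roots([1, -1, -1])
x0, x1 = 0.0, 1.0                          # Fibonacci numbers as a test case
B = (x1 - alpha * x0) / (beta - alpha)     # solve x0 = A + B, x1 = A*alpha + B*beta
A = x0 - B
x_prev, x_cur = x0, x1
for _ in range(n - 1):                     # advance to x_n with n = 20
    x_prev, x_cur = x_cur, x_cur + x_prev
print(x_cur, A * alpha**n + B * beta**n)   # both print x_20 = 6765 (up to rounding)
```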