On Poisson process - Daijiang Li

This is my study note for Math 605, UW-Madison. See more details here.

Definitions

Almost every entry-level statistics course introduces the Poisson distribution formula (we call this the first formulation): $f(k; μ) = Pr(X = k) = \frac{μ^{k} e^{-μ}}{k!}$, where $X$ is a discrete random variable and $μ$ is its mean. To calculate the probability of observing $k$ events, plug $k$ into the formula. One important property of the Poisson distribution is that its mean equals its variance, i.e. $μ = E(X) = Var(X)$.
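As a quick numerical check of the first formulation, the sketch below evaluates the formula by hand, compares it with R's built-in `dpois`, and then looks at the sample mean and variance of simulated draws. The value `mu = 3`, the range of `k`, and the sample size are arbitrary choices made here for illustration.

```r
mu <- 3                                   # arbitrary mean chosen for illustration
k  <- 0:10                                # event counts to evaluate

# Probability of observing k events, straight from the formula
p_manual <- mu^k * exp(-mu) / factorial(k)
# Same probabilities from R's built-in Poisson pmf
p_dpois  <- dpois(k, lambda = mu)
all.equal(p_manual, p_dpois)              # agreement up to floating-point error

# Mean and variance of simulated draws should both be close to mu
set.seed(1)
x <- rpois(1e5, lambda = mu)
c(mean = mean(x), variance = var(x))
```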

Now let’s consider a continuous-time process. Suppose that we start counting events (earthquakes, car accidents, deaths, etc.) at time 0. For each time t, we obtain a number N(t), which is the total number of events that have occurred up to time t. We then make the following modeling assumptions on the process N(t):

1. For some $λ > 0$, the probability of exactly one event occurring in a given time interval of length $h$ is equal to $λ h + o(h)$. That is, for any $t \geq 0$, $P\{N(t + h) - N(t) = 1\} = λ h + o(h)$ as $h \to 0$.
2. The probability that 2 or more events occur in an interval of length $h$ is $o(h)$: $P\{N(t + h) - N(t) \geq 2\} = o(h)$ as $h \to 0$.
3. The random variables $N(t_{1}) - N(s_{1})$ and $N(t_{2}) - N(s_{2})$ are independent for any choice of $s_{1} \leq t_{1} \leq s_{2} \leq t_{2}$. This is usually termed the independent increments assumption.

We then say that N(t) is a homogeneous Poisson process with intensity, propensity, or rate $λ$.

Proposition 1: Let N(t) be a Poisson process satisfying the three assumptions above. Then for any $t \geq s \geq 0$ and $k \in \{0, 1, 2, \dots\}$, we can prove that $P\{N(t) - N(s) = k\} = e^{-λ(t - s)} \frac{(λ(t - s))^{k}}{k!}$. If we choose $s = 0$ (so that $N(0) = 0$), then we get $P\{N(t) = k\} = e^{-λ t} \frac{(λ t)^{k}}{k!}$.
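Here is a sketch of why Proposition 1 holds, taking $s = 0$ and $N(0) = 0$ and writing $P_{k}(t) = P\{N(t) = k\}$ (a shorthand introduced here, not part of the original notes). The first two assumptions give $P\{N(t + h) - N(t) = 0\} = 1 - λ h + o(h)$, and the third lets us multiply probabilities over the disjoint intervals $[0, t]$ and $(t, t + h]$:

$$
\begin{aligned}
P_{0}(t + h) &= P_{0}(t)\,(1 - λ h) + o(h) \\
P_{k}(t + h) &= P_{k}(t)\,(1 - λ h) + P_{k-1}(t)\, λ h + o(h), \qquad k \geq 1
\end{aligned}
$$

Subtracting $P_{k}(t)$, dividing by $h$, and letting $h \to 0$ gives

$$
\begin{aligned}
P_{0}'(t) &= -λ P_{0}(t), & P_{0}(0) &= 1, \\
P_{k}'(t) &= -λ P_{k}(t) + λ P_{k-1}(t), & P_{k}(0) &= 0 \quad (k \geq 1).
\end{aligned}
$$

The first equation gives $P_{0}(t) = e^{-λ t}$, and solving recursively in $k$ gives $P_{k}(t) = e^{-λ t} \frac{(λ t)^{k}}{k!}$, which can be checked by substituting it back into the second equation.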

Let $S_{1}$ denote the time of the first increase in the process (i.e. the time at which the first event occurs). Then, according to Proposition 1, $P\{S_{1} > t\} = P\{N(t) = 0\} = e^{-λ t}$. Therefore, the first event time is exponentially distributed with parameter $λ$. Because of the independent increments assumption (assumption 3), $S_{2} - S_{1}$ is also exponentially distributed with parameter $λ$. Therefore, *N(t)* is simply the counting process of a renewal process with inter-event times given by *exponential random variables* (we call this the second formulation).
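The second formulation suggests a direct way to simulate a homogeneous Poisson process: add up independent exponential gaps and count how many event times fall in the window. The sketch below does this many times and compares the counts with Proposition 1. The rate, window length, number of replications, and the helper name `count_events` are all illustrative choices, not part of the original notes.

```r
set.seed(42)
lambda <- 2        # rate, arbitrary value for illustration
t_end  <- 5        # length of the observation window [0, t_end]
n_rep  <- 10000    # number of independent replications

# Count the events in [0, t_end] for one path built from exponential gaps
count_events <- function(lambda, t_end) {
  # Draw many more gaps than needed, so that the last event time
  # exceeds t_end with overwhelming probability
  n_gaps <- ceiling(10 * lambda * t_end) + 50
  gaps <- rexp(n_gaps, rate = lambda)   # inter-event times S_1, S_2 - S_1, ...
  event_times <- cumsum(gaps)           # S_1, S_2, ...
  sum(event_times <= t_end)             # N(t_end)
}

counts <- replicate(n_rep, count_events(lambda, t_end))

# Both should be close to lambda * t_end
c(mean = mean(counts), variance = var(counts))

# Empirical P{N(t_end) = k} next to the Poisson probabilities, k = 8, ..., 12
k <- 8:12
rbind(empirical = sapply(k, function(j) mean(counts == j)),
      poisson   = dpois(k, lambda * t_end))
```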

We now move to the third formulation of a one-dimensional Poisson process. Write $N(A)$ for the number of events falling in a set $A$. We say that N is a Poisson process with intensity $λ$ if, for any $A \subset R_{\geq 0}$ and $k \geq 0$, we have $P\{N(A) = k\} = e^{-λ |A|} \frac{(λ |A|)^{k}}{k!}$, where $|A|$ is the Lebesgue measure of $A$, and if $N(A_{1}), \dots, N(A_{k})$ are independent random variables whenever $A_{1}, \dots, A_{k}$ are disjoint subsets of $R_{\geq 0}$.

Definition 1. Let N be a point process with state space $E \subset R^{d}$, and let $μ$ be a measure on $R^{d}$. We say that N is a Poisson process with mean measure $μ$, or a Poisson random measure, if the following two conditions hold:

1. For $A \subset E$ and $k \geq 0$, $P\{N(A) = k\} = \begin{cases} \frac{e^{-μ(A)} (μ(A))^{k}}{k!} & \text{if } μ(A) < \infty \\ 0 & \text{if } μ(A) = \infty \end{cases}$
2. If $A_{1}, \dots, A_{k}$ are disjoint subsets of the state space $E$, then $N(A_{1}), \dots, N(A_{k})$ are independent random variables.

Note that the mean measure of a Poisson process, $μ(A)$, completely determines the process. One choice of mean measure is a multiple of Lebesgue measure, which gives length in $R^{1}$, area in $R^{2}$, volume in $R^{3}$, etc. That is, if $μ((a, b]) = λ(b - a)$ for $a \leq b$ in $R$, then $μ$ is said to be Lebesgue measure with rate, or intensity, $λ$. If $λ = 1$, then the measure is said to be unit-rate. For another example, a unit-rate Poisson process with Lebesgue mean measure on $R^{2}$ satisfies $μ(A) = Area(A)$. When the mean measure is a multiple $λ > 0$ of Lebesgue measure, we call the process homogeneous.
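To make the two-dimensional case concrete, here is a minimal sketch that simulates a homogeneous Poisson process on a rectangle and counts the points in a unit square $A$, for which $μ(A) = λ \cdot Area(A) = λ$. It relies on a standard fact that these notes do not state: given the total number of points in a bounded window, the points of a homogeneous process are i.i.d. uniform on that window. The intensity, window size, and helper names `sim_ppp` and `count_in_A` are illustrative assumptions.

```r
set.seed(7)
lambda <- 5                 # intensity per unit area (illustrative value)
width  <- 2; height <- 3    # the window is [0, width] x [0, height]

# One realization: Poisson total count, then uniform locations in the window
sim_ppp <- function(lambda, width, height) {
  n <- rpois(1, lambda * width * height)   # N(window) ~ Poisson(mu(window))
  data.frame(x = runif(n, 0, width), y = runif(n, 0, height))
}

# Count points falling in the unit square A = [0, 1] x [0, 1]
count_in_A <- function(pts) sum(pts$x <= 1 & pts$y <= 1)

counts <- replicate(5000, count_in_A(sim_ppp(lambda, width, height)))

# N(A) should be Poisson with mean mu(A) = lambda, so both are near lambda
c(mean = mean(counts), variance = var(counts))
```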

If $\Lambda$ is a non-decreasing, absolutely continuous function, then a mean measure $μ$ for a Poisson process can be defined on any open interval $(a, b)$ by $μ((a, b)) = \Lambda(b) - \Lambda(a)$. If $\Lambda$ has density $λ$ (i.e. it is differentiable with $\Lambda' = λ$), then $μ((a, b)) = \Lambda(b) - \Lambda(a) = \int_{a}^{b} λ(s)\,ds$. As a result, for any $A \subset R$, $P\{N(A) = k\} = e^{-\int_{A} λ(s)\,ds} \frac{(\int_{A} λ(s)\,ds)^{k}}{k!}$.
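For a concrete intensity, the probabilities above can be computed by integrating $λ$ numerically and passing the result to `dpois`. The sketch below uses $λ(s) = s^{2}$ and the interval $(1, 2)$ purely as an illustration; the names `lambda_fun` and `mu_ab` are introduced here.

```r
lambda_fun <- function(s) s^2        # illustrative intensity function
a <- 1; b <- 2                       # interval (a, b), arbitrary choice

# mu((a, b)) = integral of lambda over (a, b); here it equals (b^3 - a^3) / 3
mu_ab <- integrate(lambda_fun, lower = a, upper = b)$value

# P{N((a, b)) = k} for k = 0, ..., 5
k <- 0:5
data.frame(k = k, prob = dpois(k, lambda = mu_ab))
```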

The three formulations above are all equivalent.

Some definitions:

>1. Renewal process: used to model events occurring at random times, where the gaps between points (inter-event times) are i.i.d. random variables.
>1. Point process: used to model a random distribution of points in space or time, e.g. the locations of diseased deer in a given region (space) or the breakdown times of a certain part of a car (time). The simplest and most ubiquitous example of a point process is the Poisson point process; on the line, the homogeneous Poisson process is the renewal process whose gaps are i.i.d. exponential random variables.
>1. Lebesgue measure: length, area, volume, etc.

Transformations of Poisson processes

If the positions of the points of a homogeneous process of rate $λ > 0$ are multiplied by $λ$, then the resulting point process is also Poisson; in fact, it is a homogeneous process of rate 1.
Likewise, we could start with a unit-rate process and divide the position of each point by $λ$ to get a homogeneous process with rate $λ$.
More generally, moving the points around via a one-to-one function, or transformation, results in another Poisson process.
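The scaling statement can be checked numerically. The sketch below builds the points of a rate-$λ$ process from exponential gaps, multiplies every position by $λ$, and looks at the gaps of the rescaled process, which should behave like unit-rate exponential variables (mean 1, variance 1). The rate and the number of points are arbitrary illustrative values.

```r
set.seed(3)
lambda <- 4                              # illustrative rate
n <- 1e5                                 # number of points to generate

# Points of a homogeneous process of rate lambda: cumulative exponential gaps
points_rate_lambda <- cumsum(rexp(n, rate = lambda))

# Multiply every position by lambda
points_scaled <- lambda * points_rate_lambda

# Gaps of the rescaled process should look like Exp(1): mean 1 and variance 1
gaps_scaled <- diff(c(0, points_scaled))
c(mean = mean(gaps_scaled), variance = var(gaps_scaled))
```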

Simulating a non-homogeneous Poisson process

Set $t_{0} = 0$ and $n = 1$.

1. Let $E_{n}$ be an exponential random variable with parameter one, independent of all other random variables already generated.
2. Find the smallest $u \geq 0$ for which $\int_{0}^{u} λ(s)\,ds = E_{1} + \dots + E_{n}$, and set $t_{n} = u$.
3. Set $n \leftarrow n + 1$.
4. Return to step 1 or break.
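When the integral of $λ$ has no convenient closed-form inverse, the steps above can be carried out numerically. The following is only a sketch under stated assumptions: it uses base R's `integrate` and `uniroot`, assumes the intensity is strictly positive (so the equation has a unique solution), and needs an upper bound on the last jump time supplied by the user. The names `simulate_nhpp` and `t_max` and the example intensity $1 + \sin^{2}(s)$ are hypothetical choices made for this illustration.

```r
# Simulate the first n_jumps jump times of a non-homogeneous Poisson process
# with intensity lambda_fun by numerically inverting its integral.
# Assumptions: lambda_fun is vectorized and strictly positive, and t_max is
# large enough that the integral up to t_max exceeds E_1 + ... + E_n.
simulate_nhpp <- function(lambda_fun, n_jumps, t_max) {
  Lambda <- function(t) integrate(lambda_fun, lower = 0, upper = t)$value
  targets <- cumsum(rexp(n_jumps))       # E_1 + ... + E_n for n = 1, ..., n_jumps
  if (Lambda(t_max) < targets[n_jumps]) stop("t_max is too small")
  jump_times <- numeric(n_jumps)
  for (n in seq_len(n_jumps)) {
    # Solve Lambda(u) = E_1 + ... + E_n for u in [0, t_max]
    jump_times[n] <- uniroot(function(u) Lambda(u) - targets[n],
                             lower = 0, upper = t_max, tol = 1e-10)$root
  }
  jump_times
}

set.seed(11)
jumps <- simulate_nhpp(function(s) 1 + sin(s)^2, n_jumps = 20, t_max = 50)
jumps
```

Searching from 0 for every jump is wasteful but keeps the bracket for `uniroot` valid without further checks; restarting each search from the previous jump time would be a natural refinement.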

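### Example

Let N be a non-homogeneous Poisson process with local intensity $λ(t) = t^{2}$. Write code that simulates this process until 500 jumps have taken place. Here $\int_{0}^{u} λ(s)\,ds = u^{3}/3$, so the $n$-th jump time is $t_{n} = (3(E_{1} + \dots + E_{n}))^{1/3}$.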

library(ggplot2)
theme_set(theme_bw())

N <- 500                          # number of jumps to simulate
E_n <- rexp(N)                    # N independent unit-rate exponential variables
# The n-th jump time solves t^3 / 3 = E_1 + ... + E_n
time <- (3 * cumsum(E_n))^(1/3)

# Step plot of N(t), with the mean function t^3 / 3 overlaid in blue
ggplot(data = NULL) +
  geom_step(aes(x = c(0, time), y = 0:N)) +
  geom_line(aes(x = c(0, time), y = c(0, time)^3 / 3), color = "blue")