Analysis of Biomedical Data and Signals: appunti del corso

Appunti completi del corso di Analysis of Biomedical Data and Signals integrati con le slides delle lezioni.
Argomenti:
- Data Analysis and Display
- Probability density estimates
- Pattern analysis and decision theory
- Model selection and hypothesis testing
- Dynamic models

Esame Analysis of biomedical data and signals

Facoltà Ingegneria

Dal corso del Prof. Sanguineti Vittorio

Università Università degli studi di Genova

Publisher ES_01

A.A. 2023-2024

177 pagine

Appunti esame

Vota 3,5 / 5 (2)

Scarica

Estratto del documento

Y

1 t

use of to predict .

Y Y

t t+1

Then: ∣ ...Y ) = ∣ )

P r(Y Y P r(Y Y

t+1 t t+1 t

05 - Dynamic models 2

We can write the so called Markov model as:

) = ) ∣ )...P ∣ )

P r(Y P r(Y P r(Y Y r(Y Y

1:T 1 2 1 −1

T T

This is a first order Markov Model. , ...,

If we assume that each depends on the previous N values than we can

Y Y Y

t t−1 t−N

write the N-th order markov model as:

∣ , ..., ) = ∣ , ..., )

P r(Y Y Y P r(Y Y Y

1 +1

t+1 t t+1 t t−N

Example

In temporal data there is the auto-regressive model of order N.

The model is formulated as:

where is the noise that can’t be explained by the model.

e(t)

5.1.4 Dynamic models as graphical models

Dynamic models are also called Dynamic Bayesian Networks.

They may involve both hidden and observed variables, which can have

complex interdependencies.

The graphical structure provides an easy way to specify these conditional

independencies, and hence to provide a compact parameterization of the model.

We need to add time to the graphical representation and in order to do that typically samples

are divided into time slices that then can be interconnected.

So, to specify a dynamic model as graphical models we need to define:

Intra-slice topology

Inter-slice topology

The model parameters for the first two slices

Two example of dynamic models are:

Hidden Markov Models

05 - Dynamic models 3

Example

It’s the simplest type of dynamic model.

There is an observable continue variable (grey one) and a non-observable categorical

variable (white one) which is time dependent.

The Q variable at a given time depends on the value of the Q variable at the previous

time. This is a first order Markov Model. These variables are often called state

variables.

The observable variables are not time dependent, so there is no memory at this level.

These are often called output variables.

Linear Dynamical Systems

5.1.5 Time invariance

We assume that the parameters do not change. So we’ll deal with time-invariant model.

If model parameters change with time we need to change the model structure over time.

5.2 Hidden Markov Models

A tool for representing probability distributions over sequences of observations.

The observation of time is denoted as and it can be discrete (categorical variable) or

t Y

continuous.

5.2.1 Markov property

There are three properties:

Observation at time t was generated by some dynamic stochastic variable which is

Y S

t t

hidden.

is discrete-valued and can take one of K values. The hidden variable represents the

S S

t t

underlying state of the system or process. These states often represent different

configurations, conditions, or modes of the system being modeled.

The state of the hidden process satisfies the Markov property: given the value the

S t−1

− 1

current state is independent of all the states prior to .

S t

This means that the current state encapsulates all relevant information from the past

S t

and serves as a sufficient statistic for predicting the future.

In other words, the hidden state is a Markov model.

05 - Dynamic models 4

Example

Imagine deploying a wearable device onto an elderly person with the primary objective of

closely monitoring their daily activities—specifically, the durations of walking, lying down,

standing and sitting.

In this context, the "state variable" refers to the person's activity state at any given

moment, which can take on one of three values: walking, lying, standing, or sitting.

To infer these activity states, the wearable device collects data from an accelerometer,

capturing the person's movements over time.

The accelerometer provides a continuous stream of information, and by analyzing these

values, we can deduce the person's current activity state—whether they are actively

walking, in a resting position, or engaged in a seated posture.

Importantly, there exists a logical relationship between these activity states. Specifically,

there is a dependence or sequence in the transitions between different movements. For

instance, before engaging in a walking activity, it is typical for the person to be in a

standing position. This dependency reflects a real-world scenario where certain activities

often follow a logical order or pattern. In this case, the person is more likely to transition

from a standing state to walking, establishing a connection between consecutive activities.

5.2.2 HMM: joint probability

The joint probability of the S and Y can be written as:

T T

∏ ∏

, ) = ) ∣S ) ∣S )

P r(S Y P r(S P r(Y P r(S

1:T 1:T 1 t t t t−1

t=1 t=2

State 1 doesn’t depend on anything

)

is a vector of K values and it’s the probability of initial state.

P r(S 1

Importantly, it doesn't depend on any prior states. It's a vector with values, where is the

K K

number of possible states. Each value in the vector corresponds to the probability of the

system starting in a specific state.

∣S ) ×

is the probability of state transition from state to state . It’s a

P r(S S S K

t t−1 t−1 t

(i,

matrix, where each entry corresponds to the probability of transitioning from state

K j)

05 - Dynamic models 5

to . Every state depends only on the previous state.

i j ∣ )

is the probability of observing a particular output (also called emission

P r(Y S Y

t t t

probability) given the current state . It specifies the relation between the observation

S t

and the state.

If the observation is a continuous variable, we have to consider the mean and the variance

for each State. ×

If the observation is a discrete variable, it can be represented as a matrix , where

L K

is the number of possible observations, while are the different states.

L K

5.2.3 HMM: time invariance

Usually the state transition matrix and the output model are assumed to be independent on t,

i.e. we assume that the model is time-invariant.

5.2.4 HMM with external input

There may be also an external input that could be observable or not.

U

If it’s not observable, we can just forget about it.

We will not consider this case.

5.2.5 Variants

5.2.6 Learning in a HMM {Y , ..., }

Suppose we have a data sequence .

Y

1 T

Let’s define the set of all the parameters (transition probability matrix, initial state probability,

parameters of the output model).

In order to estimate it from the data sequence, we can use the log form of likelihood.

T T

∏ ∏

∣θ) ∣S , ∣S ,

= , ∣θ) =

L(θ) P r(S Y P r(S P r(Y θ) P r(S θ)

1:T 1:T 1 t t t t−1

t=1 t=2

05 - Dynamic models 6

Log form: T T

∑ ∑

= ∣ + ∣S , + ∣S ,

logL(θ) logP r(S θ) logP r(Y θ) logP r(S θ)

1 t t t t−1

t=1 t=2

We can then procede as usual taking the derivative and put it equal to 0.

The difficulty arises from the fact that the components of the HMM (initial state probabilities,

state transition probabilities, and emission probabilities) are distinct and involve different

calculations.

A significant challenge in the HMM learning process is the unobservable nature of the hidden

states ( ). Unlike the observed data sequence, the states are not directly known.

S

It’s the exactly same problem that we dealt in mixture gaussians where there were an hidden

categorical and unknown variable.

Solution:

To address these challenges, the process begins by taking the logarithm of the

probabilities to simplify calculations.

One-hot encoding is then introduced to represent the hidden states ( ). This encoding

S

transforms each state into a k-dimensional column unit vector, where is the number of

possible states.

For instance, this is the state taking the value 2 at time t:

T

= [0 1 0... 0]

In fact:

Using the unit vector notation, to express the fact that:

⎧ = 1∣θ) =

P r(S π

1 1

⎨

...

⎩

= =

P r(S K∣θ) π

1 K

where is the value of the probability, we can simply write:

i K

∏ ,

is i

= ∣ = 1

P r(S s θ) π

1 !

i=1

NB: is the i-th component of the unit vector and it’s used as exponent.

s s

1,i 1

= 1

Only one peer is different from 0, so in the end we obtain the element for which .

s S

This is a more compact way to write the system above.

05 - Dynamic models 7

Taking the log of this compact form we obtain a more simple expression, that we can

vectorize to obtain k

∑ ⋅

= T

= ∣θ) = (logπ)

logP r(S s s logπ s

1 1 1,i 1

i=1

NB: both and are vectors and log is applied element wise.

s π

Similarly, if the probability of transition from state j to state i is expressed as follows:

= = = = 1∣S = 1) =

P r(S i∣S j, θ) P r(S ϕ

t t−1 t,i t−1,j ij

We get: K K

∏ ∏ s s

= ∣ = , = (ϕ )

P r(S s S s θ) t,i t−1,j

t t t−1 t−1 ij

i=1 j=1

NB: In this double product there is just one element that is different from 0.

Same trick as before.

We can take the logarithm: K K

∑ ∑ = tT

= ∣ = , = (logΦ)S

ogϕ

log(P r(S s S s θ) s s l S

t t t−1 t−1 t,i t−1,j ij t−1

i=1 j=1

where is the matrix of elements .

5.2.7 HMM: Emission probability

We need now to consider the emission probability, which depends on the nature of

observations .

Y

We have to distinguish the two cases:

discrete-valued

Y

Using a coding similar to that used for we can write the emission probabilities as

S t L K

∏ ∏ y s

= ∣ = , = (E )

P r(Y y S s θ) t,i t,j

t t t t ij

i=1

Anteprima

Vedrai una selezione di 10 pagine su 177