Vision and Cognitive Systems

Exam: Vision and Cognitive Systems

Faculty: Engineering

From the course of Prof. Rita Cucchiara

University: Università degli Studi di Modena e Reggio Emilia

Notes for the "Vision and Cognitive Systems" exam taught by Prof. Rita Cucchiara (marked as C in the notes) and Prof. Lorenzo Baraldi. I wrote the notes directly in English, based on careful study of the course. The exam is part of the Master's degree in Computer Engineering, Artificial Intelligence Engineering curriculum. I passed the exam with a grade of 30.

Document excerpt

Random Sample Consensus (RANSAC)

Random Sample Consensus (RANSAC) is one of the best-known iterative methods for estimating the parameters of a mathematical model from a set of observed data that contains both inliers (points that fit the model) and outliers. It is a non-deterministic algorithm: it produces a reasonable result only with a certain probability, which increases with the number of iterations.
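
As an illustration, here is a minimal sketch of RANSAC for fitting a line y = m x + b, assuming NumPy. The function name `ransac_line`, the iteration count, the inlier threshold and the toy data are all assumptions made for this example and do not come from the course material.

```python
import numpy as np

def ransac_line(points, n_iters=200, threshold=0.1, seed=0):
    """Fit y = m*x + b to 2D points with RANSAC; returns (m, b, inlier_mask)."""
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        # 1. Draw a minimal random sample: two points define a line.
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:
            continue  # vertical sample, not representable as y = m*x + b
        m = (y2 - y1) / (x2 - x1)
        b = y1 - m * x1
        # 2. Count the points that agree with this candidate model.
        residuals = np.abs(points[:, 1] - (m * points[:, 0] + b))
        mask = residuals < threshold
        # 3. Keep the candidate with the largest consensus set.
        if mask.sum() > best_mask.sum():
            best_mask = mask
    # Final least-squares refit using only the inliers of the best candidate.
    m, b = np.polyfit(points[best_mask, 0], points[best_mask, 1], deg=1)
    return m, b, best_mask

# Toy data: 50 points on y = 2x + 1 plus 5 gross outliers.
x = np.linspace(0.0, 10.0, 50)
inliers = np.column_stack([x, 2.0 * x + 1.0])
outliers = np.array([[1.0, 30.0], [4.0, -20.0], [8.0, 50.0], [9.0, -5.0], [2.0, 40.0]])
data = np.vstack([inliers, outliers])
m_est, b_est, inlier_mask = ransac_line(data)
```

The random sampling is what makes the method non-deterministic: with too few iterations, no sample made only of inliers may ever be drawn.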

Hough Transform

The Hough Transform maps points from the coordinate (image) space to a parameter space, where detection is easier because the evidence for a curve is accumulated there. For shape detection in images it is better to use only feature points, so we use an Edge Hough Transform (EHT).

Hough transform for line detection: We can use the Hough transform when we have a geometric model such as y = m x + b, where (x, y) are the image coordinates and (m, b) are the parameters of the line. In particular, for a line, each point in the Cartesian space corresponds...

...to a point in the parameter space. However, this is not a good model because vertical lines cannot be represented (the slope m goes to infinity). So we use the polar representation ρ = x cos θ + y sin θ, with parameters θ and ρ (ρ is discretized).

Given a point in the image, we can apply the transformation to the parameter space. In this space the point is represented not by a line but by a sinusoid.

EHT Algorithm:

Quantize the HT space: identify maximum and minimum of θ and ρ.
Discretize the Hough space, sampling steps Δθ and Δρ.
Generate an accumulator array for all θ and ρ.
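
The steps listed above can be sketched as follows, assuming NumPy and the polar line model ρ = x cos θ + y sin θ. The voting loop and the peak search are the usual continuation of the algorithm and are added here as an assumption, since the list in the notes stops at the accumulator. The function name `hough_lines` and its parameters are hypothetical.

```python
import numpy as np

def hough_lines(edge_points, rho_max, d_theta=np.pi / 180, d_rho=1.0):
    """Accumulate votes in (rho, theta) space for a list of (x, y) edge points.

    rho_max must be at least the largest distance of any point from the origin.
    """
    # Quantize the Hough space: theta in [0, pi), rho in [-rho_max, rho_max].
    thetas = np.arange(0.0, np.pi, d_theta)
    rhos = np.arange(-rho_max, rho_max + d_rho, d_rho)
    # Accumulator array with one cell per (rho, theta) pair.
    acc = np.zeros((len(rhos), len(thetas)), dtype=np.int64)
    cols = np.arange(len(thetas))
    for x, y in edge_points:
        # Each edge point traces a sinusoid in the parameter space.
        rho = x * np.cos(thetas) + y * np.sin(thetas)
        rows = np.round((rho + rho_max) / d_rho).astype(int)
        acc[rows, cols] += 1
    return acc, rhos, thetas

# Toy example: 20 edge points on the horizontal line y = 5.
points = [(float(x), 5.0) for x in range(0, 100, 5)]
acc, rhos, thetas = hough_lines(points, rho_max=100.0)
r_i, t_i = np.unravel_index(np.argmax(acc), acc.shape)
rho_peak, theta_peak = rhos[r_i], thetas[t_i]
```

In this toy case the peak falls at ρ = 5 and θ = π/2, which is the polar form of the line y = 5.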

Hough Transform for circle detection:

The equation of the circle, with parameters (a, b) for the center and r for the radius, can be written as:

(x − a)² + (y − b)² = r²

A circle in the image space corresponds to a point in the parameter space.

A point in the image space corresponds to a circular cone in the parameter space.
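
A minimal sketch of the voting step for circles, assuming NumPy and, as a simplification, a known radius: with r fixed, the cone in the (a, b, r) parameter space reduces to a single circular slice, so the accumulator is 2D. The function name `vote_circle_centers` and the symbols a, b for the center coordinates are assumptions made for this example.

```python
import numpy as np

def vote_circle_centers(edge_points, radius, shape, n_angles=360):
    """Vote for circle centers (a, b) given edge points and a known radius."""
    acc = np.zeros(shape, dtype=np.int64)  # acc[b, a]: row = center y, col = center x
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    for x, y in edge_points:
        # All centers at distance `radius` from this edge point.
        a = np.round(x - radius * np.cos(angles)).astype(int)
        b = np.round(y - radius * np.sin(angles)).astype(int)
        inside = (a >= 0) & (a < shape[1]) & (b >= 0) & (b < shape[0])
        # Use a set so that each edge point votes at most once per cell.
        for bi, ai in set(zip(b[inside].tolist(), a[inside].tolist())):
            acc[bi, ai] += 1
    return acc

# Toy example: 36 edge points on a circle of radius 8 centered at (20, 15).
t = np.linspace(0.0, 2.0 * np.pi, 36, endpoint=False)
edge = np.column_stack([20.0 + 8.0 * np.cos(t), 15.0 + 8.0 * np.sin(t)])
acc = vote_circle_centers(edge, radius=8.0, shape=(40, 40))
b_peak, a_peak = np.unravel_index(np.argmax(acc), acc.shape)
```

When the radius is unknown, the same voting is repeated for each candidate radius, which fills a 3D accumulator.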

Generalized HT:

Any shape that can be described parametrically can be used.

Given a chosen center (origin) point, the shape is stored as a list of edge points, each represented by a pair: its distance from the center and an angle.

We create a hash table to collect these points.

For each point that belongs to the edge, we compute the distance from the point to the chosen center and the angle between the normal and the line joining the point to the center.

For each point, we compute the direction of the gradient and accumulate the possible radii and centers, to find the best center.
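
A minimal sketch of this idea, assuming NumPy and handling translation only (no rotation or scale). The hash table is keyed by the quantized gradient direction and stores (distance, angle) pairs pointing to the chosen center. The names `angle_bin`, `build_r_table` and `vote_centers`, the number of bins, and the toy gradient directions are all assumptions made for this example.

```python
import math
from collections import defaultdict
import numpy as np

N_BINS = 36  # assumed quantization of the gradient direction

def angle_bin(phi):
    """Quantize an angle in radians into one of N_BINS hash-table keys."""
    return int((phi % (2.0 * math.pi)) / (2.0 * math.pi) * N_BINS) % N_BINS

def build_r_table(edge_points, gradient_angles, center):
    """For each gradient-direction bin, store (distance, angle) pairs to the center."""
    table = defaultdict(list)
    for (x, y), phi in zip(edge_points, gradient_angles):
        dx, dy = center[0] - x, center[1] - y
        table[angle_bin(phi)].append((math.hypot(dx, dy), math.atan2(dy, dx)))
    return table

def vote_centers(edge_points, gradient_angles, table, shape):
    """Accumulate candidate centers for the edge points of a new image."""
    acc = np.zeros(shape, dtype=np.int64)  # acc[row = y, col = x]
    for (x, y), phi in zip(edge_points, gradient_angles):
        for r, alpha in table.get(angle_bin(phi), []):
            cx = int(round(x + r * math.cos(alpha)))
            cy = int(round(y + r * math.sin(alpha)))
            if 0 <= cx < shape[1] and 0 <= cy < shape[0]:
                acc[cy, cx] += 1
    return acc

# Toy template: four edge points with made-up gradient directions, center (2, 2).
template = [(0, 0), (4, 0), (4, 4), (0, 4)]
angles = [0.1, 1.7, 3.3, 4.9]
table = build_r_table(template, angles, center=(2, 2))

# The same shape translated by (10, 5): the votes should agree on center (12, 7).
shifted = [(x + 10, y + 5) for x, y in template]
acc = vote_centers(shifted, angles, table, shape=(20, 20))
cy, cx = np.unravel_index(np.argmax(acc), acc.shape)
```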

C Camera Model and Calibration

Saturday, 4 April 2020, 11:30

The image sensor is the acquisition system for an image.

Image formation has two important aspects:

The geometry of image formation, which determines where the projection of a point in the scene will be located in the image plane.

The physics of light, which determines the brightness of a point in the image plane as a function of illumination and surface properties (and sensor capabilities).

Camera physics of light model:

The object reflects radiation towards the camera, and the camera senses the radiance reflected (not absorbed) by the object.

Camera geometry model:

Each point radiates light in every direction. We add a barrier to block most of the rays, because otherwise the film receives too much light and turns white. The rays of light pass through a "pinhole" and form an inverted image of the object on the image plane (the film).

Pinhole Model

We have a 3D object, and every point of it is a source of light. Every point emits many rays, but only one passes through the pinhole; this hole is called the camera center or optical center. The rays project the 3D world onto a plane (2D world), called the image plane, projective plane or virtual image plane; in particular, they project the inverted image. The center of the image plane is called the principal point. This is a transformation from the 3D world to the 2D world. The focal length is the distance between the pinhole and the image plane. The virtual image lies on a virtual image plane at the same distance from the pinhole.

The optical axis passes through the optical center and the principal point. The model transforms a 3D point into a 2D point on the image plane. We place the origin of the axes at the camera center. We need to know the extrinsic parameters of the camera, i.e. its translation and rotation (fixed if the camera is not moving), and the intrinsic (factory) parameters, e.g. the focal length. We need to convert the 3D world coordinates into 3D camera coordinates and then into 2D coordinates on the image plane (film).

Problems:

We don't know the distance of the object.
We don't know the parameters of the camera.
The image is not centered on the image plane, so we need to apply a translation.

Projective perspective:

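Since the real image plane is inverted, we use the virtual image plane, which lies on the opposite side of the pinhole at the same distance and carries the same information without the inversion. We exploit the similarity of triangles: a scene point (X, Y, Z) projects to x = f X / Z, y = f Y / Z.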
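
A minimal sketch of the similar-triangles projection x = f X / Z, y = f Y / Z, assuming NumPy, points expressed in camera coordinates, and the virtual (non-inverted) image plane. The function name `project_pinhole` and the sample points are assumptions made for this example.

```python
import numpy as np

def project_pinhole(points_3d, f):
    """Project camera-frame 3D points (X, Y, Z) onto the virtual image plane."""
    pts = np.asarray(points_3d, dtype=float)
    X, Y, Z = pts[:, 0], pts[:, 1], pts[:, 2]
    # Similar triangles: x / f = X / Z and y / f = Y / Z.
    return np.column_stack([f * X / Z, f * Y / Z])

# The same (X, Y) seen at two different depths: doubling Z halves the image size.
pts = np.array([[1.0, 2.0, 4.0],
                [1.0, 2.0, 8.0]])
proj = project_pinhole(pts, f=2.0)
```

The division by Z is the reason why the distance of the object cannot be recovered from a single image: all points along the same ray give the same (x, y).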

Orthographic perspective:

When the object is much smaller than its distance from the camera (the object is far away), we can use a constant depth instead of the depth of each individual point.
We don't know the values of θ and f.
To obtain the third dimension we use two cameras. We need to know the precise value of b, the distance between the two cameras.
Normalized camera model: the model obtained when f = 1.
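
For the two-camera setup mentioned above, a hedged sketch of how depth could be recovered. It assumes that b is the baseline between two parallel, rectified cameras with the same focal length f, and it uses the standard relation Z = f b / d, where d is the disparity. The relation, the function name `depth_from_disparity` and the sample values are not taken from the notes.

```python
def depth_from_disparity(f, b, x_left, x_right):
    """Depth of a point seen at column x_left (left image) and x_right (right image).

    f is the focal length in pixels and b is the baseline; the result has the units of b.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return f * b / disparity

# Hypothetical values: f = 700 px, baseline 0.1 m, disparity 20 px.
z = depth_from_disparity(f=700.0, b=0.1, x_left=420.0, x_right=400.0)
```

An error on b scales the estimated depth by the same factor, which is why the precise value of b is needed.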
Perspective Effect
In an image, perspective can alter the apparent dimensions of an object. We need a homographic transformation from a 3D point to a 2D point.
Vanishing point (Italian: punto di fuga): the point where parallel lines viewed in perspective appear to converge.
An image can have more than one vanishing point. Any two parallel lines have the same vanishing point.
If a line on the ground plane is parallel to the optical axis, the ray from each point of that line crosses the image plane on its way to the camera center. The point at infinity on the line maps to the vanishing point, which lies on the optical axis.
The vanishing point is not always at the center of the image: it is at the center only when the line on the ground plane is parallel to the optical axis.

Computing Distances

If we know the position of the camera, the focal length and the other camera parameters, we can extract a lot of information about the external world (distances, speed of the camera, etc.).

The extrinsic parameters of the camera (highlighted in blue in the original figure) are the translation and rotation of the camera with respect to the world coordinates.

Optical Physics

The larger the pinhole size (aperture), the more light passes through. The aperture also affects the sharpness of the image.

Lens:

We use a lens to collect more of the rays coming from an object. With a lens, the image can be blurred if it is not in focus. In focus means that the object is at the correct distance from the lens, so that its rays converge on the image plane. If the distance is not correct, the rays converge in front of or behind the image plane.

Focus and depth of field (DOF): the depth of field is the distance between the nearest and farthest objects in a scene that appear acceptably sharp in the image. It can also be defined as the distance between image planes where blur is tolerable. DOF depends on the lens width.

The Field of View (FOV) is the angular measure of the portion of 3D space seen by the camera. If the FOV is larger, the image becomes more wide angle. If the FOV is smaller, the image becomes more telephoto. A smaller FOV means a larger focal length.
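
The relation between FOV and focal length can be sketched with the standard pinhole formula FOV = 2 atan(w / 2f), where w is the sensor width. This formula, the function name `field_of_view_deg` and the sample values (a 36 mm wide sensor) are assumptions made for this example and are not taken from the notes.

```python
import math

def field_of_view_deg(sensor_width, focal_length):
    """Horizontal field of view in degrees for an ideal pinhole camera."""
    return math.degrees(2.0 * math.atan(sensor_width / (2.0 * focal_length)))

# Same sensor, two focal lengths (all values in millimetres).
fov_wide = field_of_view_deg(sensor_width=36.0, focal_length=18.0)   # short focal length
fov_tele = field_of_view_deg(sensor_width=36.0, focal_length=200.0)  # long focal length
```

With these values the short focal length gives a 90 degree FOV, while the long one gives a much narrower view.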

Calibration is the process of finding the matrix that maps a 3D object onto the image plane. In the pinhole camera model, the origin of the image plane is at the principal point. Knowing the coordinates, we can pass from one coordinate system to the other using the perspective projection.

In general, we don't know the exact position of the principal point: it is not at the origin of the image coordinates, but is translated by an offset along each of the two image axes.

In the case of non-square pixels (digital video), we need to take the pixel distortion into account, so the focal length expressed in pixels differs between the two axes.

The most general form of the intrinsic matrix also includes the skew, which depends on the actual angle between the two image axes.

Normally the skew is set to zero, which is a good approximation.

So there are 5 intrinsic parameters to consider (they are intrinsic to the image and the camera): the two focal lengths, the two coordinates of the principal point, and the skew.
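
The five intrinsic parameters can be arranged in a 3x3 matrix, sketched below with NumPy. The symbol names `fx`, `fy` (focal lengths in pixels along the two axes), `cx`, `cy` (principal point) and `skew`, as well as the numeric values, are the usual textbook notation and are assumptions here, since the symbols are missing from this excerpt.

```python
import numpy as np

def intrinsic_matrix(fx, fy, cx, cy, skew=0.0):
    """Build the 3x3 intrinsic matrix from the five intrinsic parameters."""
    return np.array([[fx, skew, cx],
                     [0.0, fy,  cy],
                     [0.0, 0.0, 1.0]])

# Hypothetical camera with non-square pixels (fx != fy) and zero skew.
K = intrinsic_matrix(fx=800.0, fy=820.0, cx=320.0, cy=240.0)

# Project a point given in camera coordinates and divide by the third component.
p_cam = np.array([0.5, -0.25, 2.0])
u, v, w = K @ p_cam
pixel = (u / w, v / w)
```

The offsets `cx`, `cy` implement the translation of the principal point, and the two different focal lengths account for non-square pixels.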

---

Camera Rotation and Translation

As well as the intrinsic camera parameters, there are some extrinsic parameters due to the position of the camera wrt the real world coordinates: Rotation and Translation.

The camera center and the world origin are different points, so the camera has a translation and a rotation with respect to the world origin.

Complete perspective camera model with 11 DOF (5 intrinsic and 6 extrinsic):

Therefore, to go from a point in the world/scene coordinate system to a point in the image coordinate system, we have to perform two matrix multiplications:

One takes into account the rotation and translation of the camera.

One takes into account the intrinsic parameters of the camera.
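
The two multiplications can be sketched as follows, assuming NumPy and homogeneous coordinates. The names `K` (intrinsic matrix), `R` (rotation), `t` (translation) and `P` (the combined 3x4 matrix), as well as all numeric values, are assumptions made for this example.

```python
import numpy as np

def project(K, R, t, X_world):
    """World point -> camera frame (extrinsics) -> pixel coordinates (intrinsics)."""
    X_cam = R @ X_world + t   # first multiplication: rotation and translation
    x = K @ X_cam             # second multiplication: intrinsic parameters
    return x[:2] / x[2]       # divide by the third homogeneous component

# Hypothetical camera: zero skew, rotated 90 degrees about the z axis.
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 40.0],
              [0.0, 0.0, 1.0]])
R = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([0.0, 0.0, 5.0])

pixel = project(K, R, t, np.array([1.0, 0.0, 0.0]))

# The same mapping written as a single 3x4 matrix P = K [R | t].
P = K @ np.hstack([R, t[:, None]])
x_h = P @ np.array([1.0, 0.0, 0.0, 1.0])
pixel_from_P = x_h[:2] / x_h[2]
```

Both paths give the same pixel, which shows that the extrinsic and intrinsic steps can be merged into one projection matrix.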

Camera calibration: How can we find the matrix parameters?

We need to know:

Position of image center in the image
Focal length
Scaling factors for row and column pixels
Skew factor
Lens distortion

Weak perspective: it occurs when the object is very far away, so the depth variation within the object is so small that it is negligible.

Calibration method:

Calibration means finding the intrinsic and extrinsic parameters, so the 11 DOF of the transformation between 3D and 2D world.

In order to calibrate a camera we can use:

Multi-plane calibration: we place a planar chessboard in front of the camera in several different positions. It is very precise, but it requires manual calibration over multiple images.

It has many advantages: it only requires a plane, you don't have to know the positions/orientations, and good code is available online. The minimum number of images to take is 2, with 3-4 points each, because we have 4 intrinsic parameters, 6 extrinsic parameters per image, and K corners per image. With N images we obtain 2NK constraints, so we need 2NK ≥ 6N + 4.

Single image metrology: use geometric structures of the scene as a reference (e.g. vanishing points, or selected existing lines). It works only for man-made scenes.

Camera self-calibration.

Deep Learning.

Distortion

Types of distortion:

Barrel distortion

Pincushion distortion

Using a Taylor approximation around the point, we find that the distortion can be computed through 2 values.

Radial distortion: caused by the lens; a circle is no longer a circle.
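
A minimal sketch of a two-coefficient radial distortion model, assuming NumPy and normalized image coordinates centered on the principal point. The polynomial form 1 + k1 r² + k2 r⁴ and the names `k1`, `k2` are the commonly used model and are an assumption here, since the two values are not shown in this excerpt.

```python
import numpy as np

def apply_radial_distortion(xy, k1, k2):
    """Apply radial distortion to normalized points (origin at the principal point)."""
    xy = np.asarray(xy, dtype=float)
    r2 = np.sum(xy ** 2, axis=-1, keepdims=True)  # squared distance from the center
    return xy * (1.0 + k1 * r2 + k2 * r2 ** 2)

# Points at increasing distance from the center along the x axis.
pts = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]])
barrel = apply_radial_distortion(pts, k1=-0.2, k2=0.0)      # points pulled inwards
pincushion = apply_radial_distortion(pts, k1=0.2, k2=0.0)   # points pushed outwards
```

In this convention a negative `k1` gives barrel distortion and a positive `k1` gives pincushion distortion; the effect grows with the distance from the center.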

Details

Academic year 2021-2022

139 pages

SSD (scientific sector): Industrial and Information Engineering, ING-INF/05 Information Processing Systems

The contents of this page are a personal reworking by the publisher Dino_A of information learned by attending the Vision and Cognitive Systems lectures and through independent study of any reference books, in preparation for the final exam or thesis. They are not to be taken as official material of the Università degli Studi di Modena e Reggio Emilia or of Prof. Rita Cucchiara.
