Document excerpt

PC

shord between files

theads page

,

ore E -

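FINE-GRAINED vs COARSE-GRAINED MULTITHREADING

FINE-GRAINED
- Switches between threads on each instruction: the CPU must be able to change thread every clock cycle.
- It can hide both short and long stalls, since instructions from other threads are executed when one thread stalls.
- It slows down the execution of individual threads, since a thread ready to execute without stalls will be delayed by instructions from the other threads.

COARSE-GRAINED
- Switches thread only on long (costly) stalls.
- In normal conditions only one thread is active, so in theory individual threads are not slowed down.
- It doesn't really reduce the cost of short stalls: after each switch the pipeline has to be filled again.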

SIMULTANEOUS MULTITHREADING
- It's a more flexible way to manage TLP, but the hardware it requires is complex.
- The CPU has more functional (EU) resources than a single thread can use.
- A large set of registers is needed.
- The system can be dynamically adaptive to the execution environment, allowing (if possible) the execution of instructions from each thread, and allowing the instructions of a single thread to use all the functional units if the other threads incur a long latency event.

DEEP PIPELINE
- More stages, each with a smaller delay, give a higher frequency.
- The price is more faults, higher heat dissipation and a harder design.

SIMD
· Each processor has its own data memory.
· Special purpose architecture, with a simple programming model.
· 1 PC: computation is fully synchronized.
· Only 1 copy of the code.
· Execution units have their own registers and memory addressing.
· 3 variations: vector architectures, multimedia extensions (x86: MMX, SSE, AVX), GPUs.
· A central controller broadcasts instructions to multiple processing elements.

· alls

#Amples and

11 comel (3-core)

vener

=

VECTOR ARCHITECTURE
· A single instruction operates on vectors of data and results in dozens of register-to-register operations.
· Used to hide memory latency.
· Vector processor = pipelined scalar unit + vector unit.
· Memory-memory vector processor: all vector operations are memory-to-memory.
· Vector-register processor: all vector operations are between vector registers (except load and store).
· Example: VMIPS.

+ No loops: shorter code
+ No control hazards
+ No WAW and WAR hazards

MMX VS VECTORS
· Limited instruction set
· Limited vector register length
· Trend towards fuller vector support in microprocessors

SUPERCOMPUTER
- Fastest machine at a given time.
- They turn a compute-bound problem into an I/O-bound problem.

MIMD
· Flexible: can be used as
  - single-user machines, focused on high performance for a specific application
  - multiprogrammed multiprocessors, running many tasks simultaneously
· To exploit n processors there must be at least n independent threads or processes to execute.
· Cost/performance advantage: can be built from off-the-shelf components.
· Each processor fetches its own instructions and operates on its own data.
· Parallelism is identified by software (e.g. in the OS), not by hardware.

⑨ apersdor CpUs

Parallelism is achieved by:
- Data Level Parallelism: many data items can be processed at the same time
- Task Level Parallelism: tasks can be executed in parallel and independently, depending on the processes involved

CENTRALIZED/SYMMETRIC vs DISTRIBUTED MULTIPROCESSORS

CENTRALIZED/SYMMETRIC
- Large caches
- Symmetric = Uniform Memory Access (UMA)

DISTRIBUTED
- To support a large processor count
- Requires a high-bandwidth connection
- High volume of data communication between processors
- Non-Uniform Memory Access (Non-UMA)

SIMD vs MIMD
· SIMD exploits DLP for:
  - matrix-oriented scientific computing
  - media-oriented image and sound processing
· It is more energy efficient (compared to MIMD) because it only needs to fetch 1 instruction per data operation.
· It allows the programmer to continue to think sequentially.

PIPELINE
CPI is calculated taking into account:
- Ideal pipeline CPI
- Structural stalls (limited resources)
- Data hazards (dependences), solved with forwarding or compiler scheduling
- Control hazards (caused by branches), solved by early evaluation, delayed branch, branch predictors
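The CPI breakdown can be sketched numerically. This is only a minimal illustration: the function name `pipeline_cpi` and the stall values are hypothetical and do not come from the notes; each stall term is meant as an average number of stall cycles per instruction.

```python
def pipeline_cpi(ideal_cpi, structural_stalls, data_stalls, control_stalls):
    """Effective CPI = ideal CPI + average stall cycles per instruction."""
    return ideal_cpi + structural_stalls + data_stalls + control_stalls


# Hypothetical averages, chosen only for illustration (not measured data).
cpi = pipeline_cpi(
    ideal_cpi=1.0,
    structural_stalls=0.05,
    data_stalls=0.20,
    control_stalls=0.15,
)
print(f"effective CPI: {cpi:.2f}")
```

With these assumed values the stalls add 0.40 cycles per instruction on top of the ideal CPI of 1.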

Features:
- Higher throughput through multiple simultaneous tasks
- Pipeline stages: IF, ID, EX, ME, WB
- The time to fill and empty the pipeline reduces the speedup
- 5 activities running at once: instruction fetch, decode (also register read), execute, memory access, write back (register write)

PIPELINE HAZARDS
HAZARD = a fault in the pipeline where there is a dependence

- Structural: attempt to use the same resource from different instructions at the same time
- Data: attempt to use a result before it is ready
  · RAW: true dependence
  · WAR: anti dependence
  · WAW: output dependence
- Control: attempt to make a decision on the next instruction to execute before the condition is evaluated

Solutions are:
- Compilation techniques: instruction scheduling, nop insertion
- Hardware techniques: insertion of stalls or bubbles, data forwarding or bypassing
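The three kinds of data hazard can be told apart by comparing the registers that two instructions read and write. The sketch below is an assumption-laden illustration: the `Instr` type and `classify_hazards` function are hypothetical names, and the result only flags potential hazards. Whether one actually stalls a given pipeline depends on its structure.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction model: one destination, some sources."""
    dest: Optional[str]           # register written, or None (e.g. a store)
    srcs: Tuple[str, ...] = ()    # registers read


def classify_hazards(first: Instr, second: Instr) -> Set[str]:
    """Potential data hazards when `second` follows `first` in program order."""
    hazards = set()
    if first.dest is not None and first.dest in second.srcs:
        hazards.add("RAW")   # second reads what first writes
    if second.dest is not None and second.dest in first.srcs:
        hazards.add("WAR")   # second writes what first reads
    if first.dest is not None and first.dest == second.dest:
        hazards.add("WAW")   # both write the same register
    return hazards


add = Instr("r1", ("r2", "r3"))   # add r1, r2, r3
sub = Instr("r4", ("r1", "r5"))   # sub r4, r1, r5
print(classify_hazards(add, sub))
```

Here `sub` reads `r1`, which `add` writes, so the pair is flagged as RAW.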

COMPLEX PIPELINE
IN-ORDER execution
· Simple, but with more stalls.
· An issue (IS) stage is inserted to detect conflicts on the FUs and delay the instruction.
· Stages: IF, ID, IS, EX, WB
· Used to have high performance where there are:
  - long latency operations (floating point)
  - memory systems with variable access time
  - multiple functional units and memory
  - precise exceptions
· Main issues are:
  - structural conflicts of the execution stage
  - structural conflicts of the write stage
  - out-of-order write hazards
  - exceptions, hard to handle

DEPENDENCES
DEPENDENCE = two instructions are so closely related that changing their order would change the outcome

- Name: the two instructions involved use the same name (register or memory location) to access operands; easier to detect on registers, difficult on memory
  · Anti: WAR
  · Output: WAW
- Data (true): RAW
- Control: determine the ordering of the instructions

Dependences are a property of the program, hazards are a property of the pipeline.

BRANCH PREDICTION

CONDITIONAL BRANCH INSTRUCTION
- The branch is taken if the condition is satisfied, and the branch target address (BTA) is stored in the PC instead of the address of the next instruction in the sequential instruction stream.
- The branch outcome and the BTA are ready at the end of EX.
- Branches are solved when the PC is updated, at the end of ME.

BRANCH HAZARDS SOLUTIONS
Stall the pipeline until the branch decision is taken, and then fetch the correct instruction.

Without forwarding: 3 stalls
  branch:  IF ID EX ME WB
  next:       S  S  S  IF ID EX ME WB

With forwarding: 2 stalls
  branch:  IF ID EX ME WB
  next:       S  S  IF ID EX ME WB

We can do better with the early evaluation of the PC. During the ID stage of a branch we need to:
1. Compare the registers
2. Compute the BTA
3. Update the PC
In MIPS processors, doing these steps in the ID stage costs 1 stall per branch:
  branch:  IF ID EX ME WB
  next:       S  IF ID EX ME WB
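The stall counts above translate directly into a CPI overhead. The sketch below assumes a base CPI of 1 and a hypothetical branch fraction of 20%; both values, and the names `STALLS_PER_BRANCH` and `cpi_with_branch_stalls`, are illustrative assumptions rather than figures from the notes.

```python
# Stall cycles per branch for each solution, as listed in the notes.
STALLS_PER_BRANCH = {
    "stall, without forwarding": 3,
    "stall, with forwarding": 2,
    "early evaluation in ID": 1,
}


def cpi_with_branch_stalls(branch_fraction, stalls_per_branch, base_cpi=1.0):
    """CPI when every branch pays a fixed number of stall cycles."""
    return base_cpi + branch_fraction * stalls_per_branch


BRANCH_FRACTION = 0.20  # hypothetical share of branches in the instruction mix

results = {
    name: cpi_with_branch_stalls(BRANCH_FRACTION, stalls)
    for name, stalls in STALLS_PER_BRANCH.items()
}
for name, cpi in results.items():
    print(f"{name}: CPI = {cpi:.2f}")
```

Under these assumptions, moving the branch resolution to the ID stage cuts the branch overhead to a third of the no-forwarding case.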

BRANCH PREDICTION TECHNIQUE
The performance of a branch prediction technique depends on:
- accuracy of the predictions
- cost of an incorrect prediction
- branch frequency
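These three factors combine multiplicatively into the cycles lost per instruction. A minimal sketch, assuming a simple model in which only mispredicted branches pay a fixed penalty; the function name `misprediction_overhead` and all the numbers are hypothetical.

```python
def misprediction_overhead(branch_frequency, misprediction_rate, penalty_cycles):
    """Average cycles lost per instruction because of wrong predictions."""
    return branch_frequency * misprediction_rate * penalty_cycles


# Hypothetical values: 20% branches, 90% accuracy, 1-cycle penalty.
overhead = misprediction_overhead(
    branch_frequency=0.20,
    misprediction_rate=0.10,
    penalty_cycles=1,
)
print(f"cycles lost per instruction: {overhead:.3f}")
```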

STATIC: the actions for a branch are fixed for each branch during the entire execution

1. Branch always taken: makes sense for pipelines in which the BTA is known before the actual outcome (not MIPS)
2. Branch always not taken:
   - if the branch is not taken, performance is preserved
   - if the branch condition is satisfied (taken), the instruction being fetched is flushed (turned into a nop) and execution restarts by fetching the instruction at the BTA, with a 1-cycle penalty
3. Backward taken, forward not taken:
   - backward-going branches are predicted as taken (branches at the end of loops)
   - forward-going branches are predicted as not taken
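The backward taken, forward not taken rule only needs the branch address and its target. A minimal sketch, with a hypothetical function name and made-up addresses; treating a branch to its own address as backward is an assumption of this example.

```python
def predict_btfnt(branch_pc: int, target_pc: int) -> bool:
    """Return True (predict taken) for backward branches, False for forward ones."""
    return target_pc <= branch_pc


# Made-up addresses: a loop-closing branch jumps back, an if-skip jumps ahead.
loop_branch_taken = predict_btfnt(branch_pc=0x40, target_pc=0x20)
skip_branch_taken = predict_btfnt(branch_pc=0x40, target_pc=0x60)
print(loop_branch_taken, skip_branch_taken)
```

The prediction is static: it never changes during execution, so a loop branch is mispredicted once, when the loop exits.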

4. Profile-driven prediction: prediction based on profiling information collected from earlier runs. It is complicated and needs additional [...]
5. Delayed branch: the compiler schedules an independent instruction in the branch delay slot; the instruction in the slot is executed whether or not the branch is taken.
   - taken: execution continues at the BTA
   - not taken: execution continues with the instruction after the branch delay slot

There are 4 ways to schedule the Branch Delay Slot:
- From before: an independent instruction from before the branch

Notes for the Advanced Computer Architectures exam - part 1
The contents of this page are a personal reworking by the Publisher nicole_perrotta of information learned by attending the Advanced computer architectures lectures and through independent study of any reference books, in preparation for the final exam or the thesis. They should not be regarded as official material of the Politecnico di Milano university or of Prof. Conficconi Davide.