# Riassunto esame Decision Support Database, prof. Ruggieri, libro consigliato Decision Support Databases Essentials, Albano

Anteprima

### ESTRATTO DOCUMENTO

You must consider the nature of the fact which may be of the following types:

1. A Transaction Fact represents the information on a speciﬁc event that

occurred at a speciﬁc point in time during the execution of a business

process. Ex. a fact is a transaction (withdrawal or deposit) on a bank

account

2. A Periodic Snapshot Fact represents the information on a series of

events that have occurred over a period of time. Ex. a fact is the

monthly summary of all transactions on a bank account.

3. An Accumulating Snapshot Fact represents the information on the

lifetime of an evolving event that has a duration and change over

time. Ex. A mortgage application

The grain of the fact determines the size of the set of facts that can be estimated using the estimates of

the number of possible values for each dimension. The cardinality of the facts depends on both the

number of dimensions and the grain of the fact.

STEP 2: Identify the Fact Measures

A measure describes one of the fact’s quantitative aspects of interest for analysis.

Facts may be also without measures (factless), when used only to represent the occurrence of an event,

such as the attendance of a student in a course.

Measures are numerical values that make sense to add, but not everything that is numeric is a measure!

Ex. OrdeNo is numeric but it is a dimension not a measure

Measure Function Dimensions Time Evaluation Example

Additive SUM Any Period End of the period. The number of

(also called flow Record the products sold in a

or rate) cumulative effect day or the monthly

over the period income

Semi-additive SUM Certain Particular Record the state The monthly account

(also called a point in of an event balance or the

stock or level) time monthly inventory

quantity-on-hand

Non-additive NO AGGREGATED Any For such a Per-unit price,

(also called WITH SUM measure, we percentages and

value-per-unit count the number ratios, measure of

measures) of facts with such intensity such as the

measures room temperature,

averages.

7

N.B: A measure M is semi-additive with respect to a dimension D1 when it cannot be aggregated with

the function SUM for groups of data with different values of D1. However, M may be aggregated with

other functions such as AVG, MIN, MAX, for groups of data with different values of D1.

STEP 3: Identify the Fact Dimensions

The dimensions are chosen to provide context for facts. Without context, facts are impossible to analyse.

We use the 5W-1H rule: who, what, when, where, why, how.

STEP 4: Identify Dimensional Attributes

To perform a more interesting analysis, it is necessary to describe each dimension with attributes

relevant to the analysis that must be performed.

In general, the structure of a dimension should therefore reﬂect two logics:

– The logic of the event to be analysed: the values of dimensional attributes at every level of the

hierarchy, are used to group fact data so that groups are internally homogeneous and different between

themselves with respect to the values of the measures, to help the analyst to understand what the

factors that inﬂuence the event are.

– The logic of company operations: the values of dimensional attributes at every level of the hierarchy

are used to group fact data so that groups are internally homogeneous and different between

themselves with respect to their reaction to the actions of the company, to help the decision maker to

revise their actions in order to inﬂuence the event.

STEP 5: Identify the Dimensional Attribute Hierarchies

Dimensional attributes are useful for generating readable reports, but the most interesting attributes

for interactive multidimensional analysis are organized into hierarchies to allow groupings of facts data

and aggregations of the measures at different levels of generality, as usually required in practice. For

example, in the case of the Date dimension, the hierarchy Day → Month → Quarter → Year is relevant.

## 2.2 A MULTIDIMENSIONAL RELATIONAL MODEL

It is the traditional logical model to represent data in data warehouse systems.

A conceptual multidimensional schema is transformed into a relational logical schema by applying a set of

mapping rules.

This model can adopt the following structures:

1. STAR

4. A fact table: contains the data about the

facts to be analysed. It is the centre of the

star

5. A set of dimension tables, one for each

dimension.

Each of the dimension tables has a single

attribute primary key which has a one-to-

many relationship with a foreign key in

the fact table.

2. SNOWFLAKE

It is a variant of the star schema, where some

dimension tables are normalized, thereby further

splitting the data into additional tables. 8

3. CONSTELLATION

It has multiple fact tables that share dimension

tables.

## 2.3 A MULTIDIMENSIONAL CUBE MODEL

It is useful to show the basic operators for data

analysis.

It represents facts with n dimensions by points in an

n-dimensional space. A point (a fact) is identiﬁed by

the values of dimensions and has an associated set of

measures.

## OLAP OPERATIONS IN THE MULTIDIMENSIONAL DATA

## MODEL

1. To generate sub-cubes by selections without make summarizations.

a. Slice operator selects a cross section that cuts across a cube with a selection on one dimension

b. Dice operator selects a sub-cube with a selection on two or more dimensions.

2. To perform summarizations

a. Roll-up operator performs summarizations at different levels by dimension reduction or by

climbing up dimension hierarchy.

b. Drill-down operator is the reverse of roll-up. It produces more detailed data from less detailed

data.

3. To perform rotation of data axes

a. Pivot operator. It is used to an alternative presentation of data.

Moreover, a cube can be extended with new “borders” made of cells containing the value of aggregate

functions. In general, the border represents only a small addition to the volume of the data cube.

For example, using the notation Sales (Store, Product,

Date, Qty) for a cube with dimensions Store, Product, Date

and a measure Qty, we can denote sub cubes as follows:

– (’S1’, ’P1’, ’D1’) is the cell that contains 300, the sales for

the product P1 on date D1 by the store S1;

∗,

– (’S1’, ’D1’) is the cell that contains 330, the sum of sales

for all products on date D1 by the store S1

– When a dimension is used as a coordinate instead of one

of its values, the notation denotes a cuboid.

The total number of cuboids for a data cube with three

3

dimensions is 2 = 8. The possible cuboids can also be denoted without using the “∗” as follows:

(Store, Product, Date), (Store, Product), (Store, Date), (Product, Date), (Product), (Date), (Store), ().

(Store, Product, Date) denotes the data cube, while () denotes the total sum of all sales.

9

An extended cube consists of a lattice of cuboids, each

corresponding to a different degree of summarization of data.

We say that the cuboid C1 is below the cuboid C2 if and only if

C1 can be computed from C2. In general, the computation of

the cuboid C1 from C2 depends on the aggregate function used.

Types of aggregate functions:

1. Distributive

If there is a local aggregate function f and a global aggregate function f , such that for any k-

l g

partition{V , ... ,V } of V we have

1 k

### DESCRIZIONE APPUNTO

Riassunto per l'esame di Decision Support Database e del prof. Ruggieri, basato su appunti personali del publisher e studio autonomo del libro consigliato dal docente **Decision Support Databases Essentials, Albano**. Scarica il file in PDF!

Chapter 1-5 of mandatory book. Information Systems, Types of Information Systems, Data Warehouse: a Decision Support Database, Data Warehousing Architecture, What to Model, Data Warehouse Modeling, Conceptual Multidimensional Model,Multidimensional Relational Model, Multidimensional Cube Model, Data Warehouse Design, Data Warehouse Design Approaches, Db to Support Analytical CRM Analysis,Operational and Analytical CRM, Sales and Marketing Analysis,Profitability Analysis, Service Quality Analysis, Customer Analysis, Data Warehouse Logical Design,OLAP Systems Solutions,Data Analysis Using SQL,Simple Reports with SQL, Moderately Difficult Reports with SQL,Very Difficult Reports Without Analytic SQL.

I contenuti di questa pagina costituiscono rielaborazioni personali del Publisher macchia17 di informazioni apprese con la frequenza delle lezioni di Decision Support Database e studio autonomo di eventuali libri di riferimento in preparazione dell'esame finale o della tesi. Non devono intendersi come materiale ufficiale dell'università Pisa - Unipi o del prof Ruggieri Salvatore.

Acquista con carta o conto PayPal

Scarica il file tutte le volte che vuoi

Paga con un conto PayPal per usufruire della garanzia Soddisfatto o rimborsato