Forecasting and information
Forecasting and disaggregation
• The level of functional disaggregation of y relates
the variable of interest to the variables on
which it is based (i.e. its components).
• Instead of a univariate model for y, we can use
a set of separate models to forecast the components,
and then aggregate the component forecasts.
• Examples: demand components (or supply-side GDP
models). Sectors to forecast the industrial production
index (IPI). Geographic area: single-country (region)
vs area-wide models.
• Question: does disaggregation provide more
information or only noise? There are no general theorems;
the answer depends on the specific case: it is an empirical issue.
• Examples: Baffigi et al. (2004), Golinelli & Pastorello (2002).
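The choice between a direct forecast of the aggregate and a bottom-up forecast built from its components can be sketched as follows. This is only an illustration: the AR(1) model without constant, the function names and the toy sector data are all assumptions, not the models used in the cited papers.

```python
# Hypothetical illustration: direct vs. bottom-up (disaggregate) forecasts.
# The AR(1)-without-constant model and the toy data are assumptions.

def ar1_forecast(series):
    """One-step-ahead forecast from an AR(1) without constant, fitted by OLS."""
    x, y = series[:-1], series[1:]
    phi = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    return phi * series[-1]

def direct_forecast(components):
    """Forecast the aggregate y with a single univariate model."""
    aggregate = [sum(values) for values in zip(*components.values())]
    return ar1_forecast(aggregate)

def bottom_up_forecast(components):
    """Forecast each component separately, then sum the forecasts."""
    return sum(ar1_forecast(series) for series in components.values())

# Toy components of y (e.g. two sectors); y is their sum.
components = {
    "sector_a": [1.0, 0.5, 0.25, 0.125],
    "sector_b": [1.0, -1.0, 1.0, -1.0],
}
direct = direct_forecast(components)
bottom_up = bottom_up_forecast(components)
print(direct, bottom_up)
```

The two numbers generally differ; which one is closer to the outcome is the empirical issue raised above, and it can only be settled by comparing forecast errors on real data.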
The two main statistical competitors
• In more complex indicator (statistical) models, the
information in y data is augmented by other time
series P_t, available up to t: I_t = ( y_t ; P_t ).
• If P includes few indicators pre-selected by the
practitioner on the basis of her skills and
experience, we have the Selected indicator Model
(SM) or the Bridge Model (BM).
• If P is very large (including up to 100-200
indicators) we have the alternative Factor-based
Models (FM). In this case we define P_t = X_t.
Bridge (selected-indicator) models
• BM deliver early forecasts by using the information in a few
timely indicators through linear dynamic equations, where
the target (e.g. GDP) or its components are explained by
suitable short-term indicators, selected on the basis of
the researchers' experience and statistical testing procedures.
• BM for GDP can be seen as tools to ‘translate’ the noisy and
timely information of short-term indicators into the more
coherent and complete ‘language’ of national accounts (NA).
Since indicators cover a wide range of short-run macroeconomic
phenomena, they can be used in different bridge equations for
the main GDP components (namely C, G, I, S, X and M:
‘demand-side’ BM, where GDP is predicted by the NA
income–expenditure identity), or directly at the aggregate GDP
level (‘supply-side’ BM, where GDP is forecast by a single bridge
equation).
• IPI is the main (coincident) indicator of GDP in supply-side BM.
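A supply-side bridge equation can be sketched as an OLS regression of quarterly GDP growth on a monthly indicator averaged to quarterly frequency. The sketch below is a static simplification (the slides describe dynamic equations, so lags are omitted here for brevity). The function names and the toy numbers are assumptions.

```python
import numpy as np

def monthly_to_quarterly(monthly):
    """Average each block of three months into one quarterly value."""
    monthly = np.asarray(monthly, dtype=float)
    return monthly.reshape(-1, 3).mean(axis=1)

def fit_bridge(gdp_growth, indicator_q):
    """OLS of quarterly GDP growth on a constant and the quarterly indicator."""
    X = np.column_stack([np.ones_like(indicator_q), indicator_q])
    coef, *_ = np.linalg.lstsq(X, gdp_growth, rcond=None)
    return coef  # [intercept, slope]

def bridge_forecast(coef, indicator_q_next):
    """GDP growth implied by the bridge equation for a new indicator value."""
    return coef[0] + coef[1] * indicator_q_next

# Toy data (assumed): 12 months of an IPI-like indicator, 4 quarters of GDP growth.
ipi_monthly = [1, 2, 3, 0, 1, 2, 2, 2, 2, -1, 0, 1]
gdp_growth = np.array([1.2, 0.7, 1.2, 0.2])

ipi_quarterly = monthly_to_quarterly(ipi_monthly)
coef = fit_bridge(gdp_growth, ipi_quarterly)
gdp_next = bridge_forecast(coef, indicator_q_next=3.0)
print(coef, gdp_next)
```

In a demand-side BM the same kind of equation would be fitted for each GDP component, with the component forecasts then combined through the NA identity.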
Forecasting GDP with BM & auxiliary models
• The following cases mimic as closely as possible the actual activity
of forecasting. Since one-quarter-ahead GDP forecasts with BM
require knowledge of the conditioning indicator data for that
quarter, we suggest four alternative cases, depending on monthly
data availability (T is the last quarter of the estimation sample):
• (1) The pure one-step ahead forecast: one-quarter ahead (T+1)
GDP forecast when all the conditioning indicators are unknown. In
this case such indicators have to be forecast three-months ahead by
an auxiliary model.
• (2) Forecast with one month known: one-quarter ahead GDP
forecast when the conditioning indicators are known only for the
first month of the quarter T+1 (in this case indicators have to be
forecast two-months ahead by an auxiliary model).
• (3) Forecast with two months known: one-quarter ahead GDP
forecast when the conditioning indicators are known for two
months of the quarter T+1 (in this case indicators have to be
forecast one-month ahead by an auxiliary model).
• (4) Forecast with all months known (nowcast): one-quarter ahead
GDP forecast when the conditioning indicators for the quarter T+1
are fully known (in this case, the auxiliary models are not used).
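The four cases differ only in how many months of quarter T+1 must be filled in by the auxiliary model. A minimal sketch, assuming a simple AR(1) without constant as the auxiliary model (the function names and the toy indicator values are hypothetical):

```python
def ar1_slope(series):
    """OLS slope of an AR(1) without constant (a deliberately simple auxiliary model)."""
    x, y = series[:-1], series[1:]
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def complete_quarter(history, known_months):
    """Return the three monthly values of quarter T+1:
    the known months first, then AR(1) forecasts for the missing ones."""
    observed = list(history) + list(known_months)
    phi = ar1_slope(observed)
    months = list(known_months)
    last = observed[-1]
    while len(months) < 3:
        last = phi * last          # iterate the auxiliary model forward
        months.append(last)
    return months

# Toy monthly indicator up to the end of quarter T (assumed values).
history = [8.0, 4.0, 2.0, 1.0]

case1 = complete_quarter(history, [])               # (1) no month known
case2 = complete_quarter(history, [0.6])            # (2) one month known
case3 = complete_quarter(history, [0.6, 0.2])       # (3) two months known
case4 = complete_quarter(history, [0.6, 0.2, 0.1])  # (4) nowcast, no auxiliary forecast

# Quarterly value of the indicator to plug into the bridge equation.
quarterly_inputs = [sum(c) / 3 for c in (case1, case2, case3, case4)]
print(quarterly_inputs)
```

As more months become known, less of the quarterly input comes from the auxiliary model, so the auxiliary model's forecast error should matter less from case (1) to case (4).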
Factor based models
Factor analysis and principal components analysis (PCA) are
two longstanding methods for summarizing the main sources
of variation and covariation among N variables in X:
$X_t = \Lambda F_t + e_t$
The relationship describes the N×1 vector X_t using a k×1
vector of unobserved factors F_t and an error term e_t, where
\Lambda is the N×k matrix of factor loadings.
Empirical content is given to this relation by assuming that
k is much smaller than N and the elements of e_t are only
weakly correlated; this implies that covariances between
the (many) elements of X_t are explained, in large part, by
the (few) factors in F_t. A single variable in X, say Y, is
predicted with an ad hoc equation of the form:
$Y_{t+h} = \beta(L)' F_t + \gamma(L) Y_t + u_{t+h}$
where \beta(L) and \gamma(L) are lag polynomials.
The equation above describes the forecast h steps ahead
using the factors, autoregressive lags, and a forecast error
u_{t+h}. Because X does not enter this equation directly, the
elements of X are useful for predicting Y only because they
contain information about F. This equation is a factor-augmented
autoregression (FAAR).
With FM, forecasts of the target are computed through the
first k principal components of the N indicators in the large
data set (N>>k). The main advantage of this
approach is that it exploits not only the information content of
the single variables but their covariance as well, without
incurring the “curse of dimensionality” as in unrestricted
vector autoregressive models. Problem: the composition of
the sample of indicators.
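The two steps of a factor-based forecast (extract principal components, then run a factor-augmented autoregression) can be sketched as below. This is a sketch under stated assumptions: one lag of the factor and of the target, simulated data with a single true factor, and hypothetical function names.

```python
import numpy as np

def pca_factors(X, k):
    """First k principal components of the standardized T x N panel X."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U[:, :k] * s[:k]        # T x k estimated factors (equals Z @ Vt[:k].T)

def faar_forecast(y, F, h=1):
    """Regress y_{t+h} on a constant, F_t and y_t; return the forecast of y_{T+h}."""
    T = len(y)
    regressors = np.column_stack([np.ones(T), F, y])
    coef, *_ = np.linalg.lstsq(regressors[:T - h], y[h:], rcond=None)
    return float(regressors[-1] @ coef)

# Simulated panel (assumption): one AR(1) factor driving N = 30 indicators.
rng = np.random.default_rng(0)
T, N, k = 120, 30, 1
f = np.zeros(T)
for t in range(1, T):
    f[t] = 0.7 * f[t - 1] + rng.standard_normal()
loadings = rng.uniform(0.5, 1.5, size=N)
X = np.outer(f, loadings) + 0.5 * rng.standard_normal((T, N))
y = np.zeros(T)
y[1:] = 0.8 * f[:-1] + 0.1 * rng.standard_normal(T - 1)   # target led by the factor

F = pca_factors(X, k)
forecast = faar_forecast(y, F, h=1)
factor_corr = abs(np.corrcoef(F[:, 0], f)[0, 1])  # sign of a principal component is arbitrary
print(factor_corr, forecast)
```

Only k + 2 coefficients are estimated in the forecasting equation, however large N is, which is how the approach avoids the dimensionality problem of an unrestricted VAR.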
The ragged edge problem with FM
• Issue: only incomplete data on current and immediate past
values of indicators are available, because they are
released at different times within the period t to t+1.
• Which strategy to deal with non-synchronous data?
1. Shifting operator: all indicators with missing
observations for the latest period are shifted in time so
as to have a balanced panel.
2. Forecasting the missing observations with
autoregressive (AR) models.
3. Using the EM algorithm, which not only guarantees a
coherent solution to the ragged edge issue, but also
provides forecasts of missing observations that are
efficient (they exploit all available information) and
consistent with the factor estimates.
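The third strategy can be illustrated with an EM-style iteration that alternates between a principal components fit and an update of the missing cells. This is a simplified sketch, not the exact algorithm used in the literature: the function name, the number of iterations and the noise-free toy panel are assumptions.

```python
import numpy as np

def em_pca_fill(X, k=1, n_iter=100):
    """Fill NaNs in the T x N panel X by iterating rank-k PCA fits (EM-style)."""
    X = np.array(X, dtype=float)
    missing = np.isnan(X)
    filled = np.where(missing, np.nanmean(X, axis=0), X)   # start from column means
    for _ in range(n_iter):
        mean = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mean, full_matrices=False)
        common = (U[:, :k] * s[:k]) @ Vt[:k] + mean        # rank-k common component
        filled = np.where(missing, common, X)              # update only the gaps
    return filled

# Toy panel (assumption): exact one-factor structure, 6 indicators, 40 months.
rng = np.random.default_rng(1)
f = rng.standard_normal(40)
lam = np.array([1.0, 0.8, 1.2, 0.9, 1.1, 1.0])
X_true = np.outer(f, lam)

X_ragged = X_true.copy()
X_ragged[-1, 4:] = np.nan          # two slow indicators lack the latest month

X_filled = em_pca_fill(X_ragged, k=1)
max_error = np.abs(X_filled - X_true).max()
print(max_error)
```

Unlike the AR strategy, the filled values use the cross-section: the latest observations of the timely indicators pin down the current factor, which in turn determines the missing values.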
The BM-FM alternative to the use of AR models
represents a good example of the trade-off between
simple and complex models depicted in the previous lecture.
If the increase in information (from exploiting
additional predictors) is effective, we expect an
improvement in the forecasting ability of both BM and
FM over the simple AR model.
Since the AR model is nested in the BM, the comparison
can be made both in- and out-of-sample. Given that the
same is not always true for the AR vs FM comparison, there we
can only rely on out-of-sample comparisons.
BM can tell the story behind the forecasts, FM cannot. More
discussion and comparisons in Bulligan-Golinelli-Parigi.
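An out-of-sample comparison of this kind can be sketched as a recursive (expanding-window) exercise that reports the RMSE of the richer model relative to the AR benchmark. The simulated data, the single-indicator bridge equation and the function name are assumptions made for illustration.

```python
import numpy as np

def recursive_rmse(y, X, start):
    """Expanding-window one-step-ahead RMSE of the regression y_t = X_t'b + e_t.
    Row X[t] must only contain information available when y[t] is forecast."""
    errors = []
    for t in range(start, len(y)):
        b, *_ = np.linalg.lstsq(X[:t], y[:t], rcond=None)  # estimate on data before t
        errors.append(y[t] - X[t] @ b)                      # out-of-sample error
    return float(np.sqrt(np.mean(np.square(errors))))

# Simulated target (assumption): depends on its own lag and on a timely indicator x_t.
rng = np.random.default_rng(1)
T = 100
x = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.3 * y[t - 1] + 0.8 * x[t] + 0.1 * rng.standard_normal()

target, lag, indicator = y[1:], y[:-1], x[1:]
const = np.ones(T - 1)
X_ar = np.column_stack([const, lag])              # AR(1) benchmark
X_bm = np.column_stack([const, lag, indicator])   # AR(1) plus the indicator (nests the AR)

rmse_ar = recursive_rmse(target, X_ar, start=40)
rmse_bm = recursive_rmse(target, X_bm, start=40)
relative_rmse = rmse_bm / rmse_ar
print(relative_rmse)
```

A relative RMSE below one indicates that the additional predictor is informative in this simulated setting; with real data the ratio can just as well exceed one.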
Forecasting with real-time data
• The forecasting literature often compares the performance of
competing models through pseudo out-of-sample forecasting
exercises with the latest available data rather than the
real-time data actually at the forecaster’s command (main
reference: the work of Croushore & Stark).
• A real-time data-set – i.e. a collection of data vintages that
gives the modeler a snapshot of the macroeconomic data
available at any given date in the past – makes it possible to
take into account the revision process applied by statistical
institutes after the first published data.
• For example, the preliminary GDP estimate of a given
quarter is updated until, after a number of both statistical and
definitional changes, all relevant information is incorporated
and a stable measure is reached: the “actual” or final GDP
estimate for that quarter. Thus, data revisions imply two
possible alternative targets in prediction: (i) the first release
or (ii) the final GDP data.
A sketch of real-time data
• The problem of data revisions: the forecast target can be
preliminary data (e.g. only the 1st releases) or final data,
depending on whether the purpose of the forecast is, respectively,
in the field of financial markets (which react to the publication
of preliminary data) or in guiding policy makers' decisions.
• Statistical agencies usually revise their preliminary data
as new information becomes available because of:
1. statistical revisions (new information)
2. definitional (benchmark) revisions (base, account…)
• This continues until, after a number of revisions, all relevant
information is incorporated and a stable measure is
reached (?): the “actual” or final data of that period.
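A real-time data-set
Rows: period t. Columns: vintage v = 1, ..., f. Cell: y_t^{o,v}, the o-th release of period t as published in vintage v.

period t   vintage 1        vintage 2        vintage 3        vintage 4        ....  vintage f
1          y_1^{T+1,1}      y_1^{T+2,2}      y_1^{T+3,3}      y_1^{T+4,4}      ....  y_1^{T+f,f}
2          y_2^{T,1}        y_2^{T+1,2}      y_2^{T+2,3}      y_2^{T+3,4}      ....  y_2^{T+f-1,f}
3          y_3^{T-1,1}      y_3^{T,2}        y_3^{T+1,3}      y_3^{T+2,4}      ....  y_3^{T+f-2,f}
4          y_4^{T-2,1}      y_4^{T-1,2}      y_4^{T,3}        y_4^{T+1,4}      ....  y_4^{T+f-3,f}
....       ....             ....             ....             ....             ....  ....
T-1        y_{T-1}^{3,1}    y_{T-1}^{4,2}    y_{T-1}^{5,3}    y_{T-1}^{6,4}    ....  y_{T-1}^{f+2,f}
T          y_T^{2,1}        y_T^{3,2}        y_T^{4,3}        y_T^{5,4}        ....  y_T^{f+1,f}
T+1        y_{T+1}^{1,1}    y_{T+1}^{2,2}    y_{T+1}^{3,3}    y_{T+1}^{4,4}    ....  y_{T+1}^{f,f}
T+2        n.a.             y_{T+2}^{1,2}    y_{T+2}^{2,3}    y_{T+2}^{3,4}    ....  y_{T+2}^{f-1,f}
T+3        n.a.             n.a.             y_{T+3}^{1,3}    y_{T+3}^{2,4}    ....  y_{T+3}^{f-2,f}
T+4        n.a.             n.a.             n.a.             y_{T+4}^{1,4}    ....  y_{T+4}^{f-3,f}
....       ....             ....             ....             ....             ....  ....
T+f        n.a.             n.a.             n.a.             n.a.             ....  y_{T+f}^{1,f}

Each column is a vintage (e.g. the 2nd vintage is the second column; vintage f is the latest available). Entries with o = 1 are 1st releases.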
Alternative data to forecast
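Vintage y^v series represent the state of the art at the time of
publication. Observations are homogeneous over t (but
not in quality: old data are better than recent ones).
y^v growth rate calculation is straightforward:
$dy_t^{o,v} = 100 \, ( y_t^{o,v} / y_{t-1}^{o',v} - 1 )$
where both observations come from the same vintage v and
o' = o + 1, since period t-1 has been released once more than period t.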
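Release y^o series include data revised the same
number of times, which come from different vintages;
observations are not homogeneous over t (but they are
in quality).
y^o growth rates need collecting dy_t^{o,v} (growth-within):
each growth rate is computed within the vintage v in which
period t is at its o-th release.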
→ What is the effect of data revisions on the simulated
predictive ability of the models?
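The difference between vintage and release growth rates can be sketched with a small real-time data set stored as one dictionary per vintage. The data values and function names below are hypothetical. The release number follows the layout of the real-time table: the newest period in a vintage is at its 1st release, the one before at its 2nd, and so on.

```python
# Hypothetical toy real-time data set: vintages[v][t] is the level of y for
# period t as published in vintage v (each vintage adds one new period).
vintages = {
    1: {1: 100.0, 2: 101.0},
    2: {1: 100.5, 2: 101.2, 3: 102.0},
    3: {1: 100.6, 2: 101.5, 3: 102.4, 4: 103.0},
}

def release_number(vintages, t, v):
    """Release number o of period t in vintage v (1 for the newest period)."""
    return max(vintages[v]) - t + 1

def vintage_growth(vintages, v):
    """Growth rates computed within a single vintage v."""
    s = vintages[v]
    return {t: 100 * (s[t] / s[t - 1] - 1) for t in s if t - 1 in s}

def release_growth(vintages, o):
    """Growth rates of the o-th release series: for each period t, the growth
    rate is taken within the vintage where t is at its o-th release."""
    out = {}
    for v in sorted(vintages):
        for t, g in vintage_growth(vintages, v).items():
            if release_number(vintages, t, v) == o:
                out[t] = g
    return out

first_release = release_growth(vintages, o=1)       # one observation per vintage
latest = vintage_growth(vintages, v=max(vintages))  # latest available data
revision = {t: latest[t] - first_release[t] for t in first_release}
print(first_release, latest, revision)
```

Evaluating the same forecasts against `first_release` and against `latest` is one way to see how much the choice of target affects the measured accuracy.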
Teaching material for the course Econometria per la politica economica (Econometrics for Economic Policy) taught by Prof. Roberto Golinelli. These are slides in English prepared by the lecturer, covering the following topics: the statistical model, forecasts and information; forecasting and disaggregation; factor-based models; the Giacomini and White test; the rationality test.
The contents of this page are personal reworkings by the publisher Atreyu of information learned by attending the lectures of Econometria per la politica economica and by independent study of any reference books in preparation for the final exam or thesis. They should not be understood as official material of the University of Bologna (Unibo) or of Prof. Roberto Golinelli.