Sune Karlsson

Bayesian analysis of dynamic factor models


When forecasting macroeconomic variables, such as GDP growth and inflation, a huge number of potentially informative time series of varying quality are often available. Ignoring part of this information will necessarily lead to sub-optimal forecasts. On the other hand, utilizing all series in a statistically and computationally efficient manner is a formidable task.


Within this project, so-called dynamic factor models are studied. Such models have become increasingly popular in finance and economics lately, and aim at summarizing the information content of a large number of variables in a few factors. In contrast to traditionally used factor analysis, dynamic factor models incorporate the serial dependence found in time-series data.


The project will aim at constructing statistically and computationally efficient forecasting methods as well as methods for model specification, with a particular focus on the number of factors to be used as well as detailed modeling of the time-series dynamics.
Final report

Sune Karlsson, Örebro University

2009-2015

The aims of the project

The project is based on the increasing richness of available data and the opportunities and challenges that larger data sets give rise to. While these data sets provide new opportunities for modeling economic relationships and improving forecasts, it is a challenge to effectively summarize and incorporate information from a large number of variables in a statistical model.

More specifically, the project aims to develop dynamic models and methods for forecasting macroeconomic variables that take full advantage of the large data sets that are available today. The project uses two approaches. The first is so-called dynamic factor models, where a large number of variables can be included directly in the model and the common dynamics are modeled. The second is based on many smaller models that use different amounts of information, combined through "model averaging", i.e. forecasts from different models are weighted together according to different criteria.

A Bayesian approach to inference is used throughout the project.

The project's three main results

A fundamental problem for model selection and model averaging in multivariate models of differing dimensions is that traditional likelihood-based measures of fit, such as the marginal likelihood, are not applicable. This is because the marginal likelihoods of such models refer to different sets of variables and are therefore not comparable. Ding and Karlsson (2014a) suggest the marginalized marginal likelihood (MML) as a solution to the problem in situations where only a small number of variables are of interest, such as when the aim is to forecast a few variables. By marginalizing out the other variables in the model, a measure that is comparable between models and focused on the essential part of the models is obtained. With vector autoregressive (VAR) models as examples, analytical expressions for the MML with conjugate prior distributions are derived. For situations where analytical expressions for the MML cannot be obtained, it is shown how simulation methods (MCMC) can be used to estimate the MML, and various methods for estimating the MML are evaluated. A simulation study demonstrates that the MML works well as a model selection criterion, and in an application to forecasting GDP growth and inflation for the US, forecast combination based on the MML proves to be competitive.
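The idea of marginalizing out the variables that are not of interest can be written down as a short sketch. The notation below (Y, Y_1, Y_2, \theta, M) is assumed here for illustration and is not taken from the report; the exact definition used by Ding and Karlsson (2014a) may differ in detail, for example in how initial observations or predictive versions of the likelihood are handled.

```latex
% Notation assumed for illustration; not taken from the report.
% Y = (Y_1, Y_2): Y_1 holds the variables of interest,
% Y_2 holds the remaining variables included in model M.
\begin{align}
m(Y \mid M) &= \int p(Y \mid \theta, M)\, p(\theta \mid M)\, d\theta, \\
m(Y_1 \mid M) &= \int m(Y_1, Y_2 \mid M)\, dY_2
              = \int p(Y_1 \mid \theta, M)\, p(\theta \mid M)\, d\theta .
\end{align}
```

The first line is the usual marginal likelihood, which depends on which variables Y_2 a model happens to contain. The second line always refers to the same data Y_1, so it can be compared across models with different sets of additional variables.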

Ding and Karlsson (2014c) build on Ding and Karlsson (2014a) and show how MML-based model selection and forecast combination can be scaled up to situations with over 100 variables and an almost astronomical number of potential models. In order to handle a large number of models in a Bayesian approach, the specification of the prior distribution for each model must be close to automatic. A new method for automatic calibration of the prior distribution of the different models, based on the model's degrees of freedom, is proposed. This method is faster than existing methods and gives comparable results. Since it is practically impossible to evaluate all potential models, effective methods to quickly identify the subset of well-functioning models are required. That is, we need to quickly identify the models that are of interest in model selection or that would be given a weight different from zero in a forecast combination if all models were to be evaluated. A simple and efficient algorithm based on Reversible Jump Markov Chain Monte Carlo is proposed. The methods are tested on a dataset with 135 variables describing the US economy, where GDP growth, inflation and the interest rate are forecast. Both forecast combination and model selection based on the MML work well in comparison with other forecasting methods for rich datasets.
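As a rough illustration of how an MML-based forecast combination could be computed once the MML of each visited model is available, the Python sketch below converts log MML values into normalized weights and forms a weighted forecast. It is not the project's implementation: the function names, the equal prior model probabilities and all numbers are hypothetical, and the weighting rule shown is ordinary Bayesian model averaging with the MML in place of the marginal likelihood.

```python
import numpy as np


def mml_weights(log_mml, log_prior=None):
    """Turn log MML values into model weights that sum to one.

    log_prior holds log prior model probabilities; equal prior
    probabilities are assumed when it is omitted (an assumption of
    this sketch, not a statement about the project's choice).
    """
    log_mml = np.asarray(log_mml, dtype=float)
    if log_prior is None:
        log_prior = np.zeros_like(log_mml)
    log_post = log_mml + np.asarray(log_prior, dtype=float)
    # Subtract the maximum before exponentiating to avoid underflow.
    log_post = log_post - log_post.max()
    weights = np.exp(log_post)
    return weights / weights.sum()


def combine_forecasts(forecasts, weights):
    """Weighted average of point forecasts, one forecast per model."""
    return float(np.asarray(weights) @ np.asarray(forecasts, dtype=float))


# Hypothetical log MML values and point forecasts for three models.
log_mml = [-512.3, -510.1, -530.8]
forecasts = [2.1, 2.4, 1.5]

w = mml_weights(log_mml)
combined = combine_forecasts(forecasts, w)
```

Because the weights are exponential in the log MML, most models receive a weight that is numerically zero, which is why a search algorithm only needs to find the relatively small subset of models with non-negligible weight.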

The previous two studies are based on the approach of combining smaller models which together incorporate the information in a large number of variables. Ding and Karlsson (2014b) show how to, alternatively, directly include a large number of variables in a single model using a VAR model with reduced rank structure. In the proposed parameterization of the reduced rank structure the model can also be interpreted as a dynamic factor model. The parameters of a model with a reduced rank structure are fundamentally unidentified and further restrictions are needed in order to identify the parameters. Two different identification strategies - a "traditional" linear normalization and a semi-orthogonal normalization - with associated prior distributions and simulation algorithms are proposed and evaluated. An important model selection issue is the choice of dimension (rank) of the reduced rank structure. Various criteria for the choice of the rank are proposed and evaluated. The methods are also evaluated empirically in an application where global stock markets are forecast. Overall, the results show that a reduced rank structure can lead to better results, and that the semi-orthogonal normalization of the model gives good results even though it is more computationally demanding than the linear normalization.
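The identification problem described above can be illustrated numerically. The NumPy sketch below assumes, purely for illustration, that the coefficient matrix is factored as Pi = alpha @ beta.T with both factors of rank r, and that the normalizations are applied to beta; the names, dimensions and the choice of which factor to normalize are assumptions of this sketch and are not taken from Ding and Karlsson (2014b).

```python
import numpy as np

rng = np.random.default_rng(0)
m, r = 6, 2  # m variables, rank r (illustrative sizes)

alpha = rng.normal(size=(m, r))  # hypothetical loading matrix
beta = rng.normal(size=(m, r))   # hypothetical factor weights
Pi = alpha @ beta.T              # m x m coefficient matrix of rank r

# Lack of identification: any invertible r x r matrix Q yields a
# different pair (alpha_q, beta_q) with exactly the same product Pi.
Q = np.array([[2.0, 1.0], [0.5, 3.0]])
alpha_q = alpha @ Q
beta_q = beta @ np.linalg.inv(Q).T
same_product = np.allclose(Pi, alpha_q @ beta_q.T)

# Linear normalization: make the top r x r block of beta the identity.
top = beta[:r, :]
beta_lin = beta @ np.linalg.inv(top)
alpha_lin = alpha @ top.T

# Semi-orthogonal normalization: beta_orth.T @ beta_orth = I_r,
# obtained here from the reduced QR decomposition beta = Qb @ Rb.
Qb, Rb = np.linalg.qr(beta)
beta_orth = Qb
alpha_orth = alpha @ Rb.T
```

Both normalized pairs reproduce the same Pi, so the restrictions change the parameterization and not the model. The linear normalization depends on the ordering of the variables, since it requires the top block of beta to be invertible, whereas the semi-orthogonal version does not single out particular variables.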

New research questions

The project has explored two avenues for exploiting the information contained in today's rich datasets, which consist of a large number of variables, in order to model and forecast economic data. Throughout we have assumed that the data are normally distributed, an assumption that is questionable - especially when it comes to financial data. There is thus a need to generalize the approach to other data distributions that can be skewed or have thicker tails than the normal distribution. Further research is also needed to gain more insight into which model formulation works well in different situations.

The project's international footprint

Results from the project have been presented at several international workshops and conferences, and some are already cited in the international research literature. Citations according to Google Scholar (without self-citations):
Karlsson (2013): 20
Ding and Karlsson (2014a): 3

Outreach and research information

As this project is concerned with methodological development, the target audience is the scientific community and practitioners concerned with modeling or forecasting of economic relationships. These target groups are reached quickly through presentations at international conferences and workshops, such as the workshop on short term forecasting hosted by the National Bank of Poland. Publication in international journals is also an important channel for disseminating the research results.

The project's two main publications

As the project has been delayed and essentially completed during the past two years, only the research review Karlsson (2013) and Andersson et al. (2015) have been published. The two main papers produced in the project are Ding and Karlsson (2014b) and Ding and Karlsson (2014c).

Project publication strategy

The aim is to publish the results that are currently available in manuscript form in well-regarded international peer-reviewed journals. As far as possible, open access journals will be selected.

 

Publications

Edited volumes

Karlsson, Sune, (2013), ‘Forecasting with Bayesian Vector Autoregression’, ch. 15, pp. 791-897 in Elliott, G. and Timmermann, A., eds, Handbook of Economic Forecasting, vol 2B, Elsevier.

Articles in peer-reviewed journals

Andersson, August, Junjun Deng, Ke Du, Mei Zheng, Caiqing Yan, Martin Sköld, and Örjan Gustafsson, (2015), ‘Regionally-Varying Combustion Sources of the January 2013 Severe Haze Events over Eastern China’, Environmental Science and Technology, 49, 2038-43.

Manuscripts

Ding, Shutong, Bayesian VAR models with asymmetric lags, Mimeo, Handelshögskolan vid Örebro universitet.
Ding, Shutong and Sune Karlsson (2013), Model Selection in Dynamic Factor Models, Mimeo, Handelshögskolan vid Örebro universitet.
Ding, Shutong and Sune Karlsson (2014a), Model averaging and variable selection in VAR-models, Mimeo, Handelshögskolan vid Örebro universitet.
Ding, Shutong and Sune Karlsson (2014b), Bayesian forecasting using reduced rank VARs, Mimeo, Handelshögskolan vid Örebro universitet.
Ding, Shutong and Sune Karlsson (2014c), Bayesian forecast combination in VAR models with many predictors, Mimeo, Handelshögskolan vid Örebro universitet.
Karlsson, Sune, (2012), ‘Conditional posteriors for the reduced rank regression model’, Working Paper 2012:11, Örebro University Business School.

Dissertations

Shutong Ding, (2014), “Model Choice in Bayesian VAR Models”, Doctoral dissertation, Örebro University.

Conference presentations

Workshop on Model Uncertainty, University of Warwick, 30/5-1/6 2010: Model Selection in Dynamic Factor Models (Shutong Ding and Sune Karlsson)
Computational and Financial Econometrics (CFE ’11), London, 17-19/12, 2011: Model averaging and variable selection in VAR-models (Shutong Ding and Sune Karlsson)
Rimini Conferences in Economics & Finance, Toronto, 16-18/8 2012: Model averaging and variable selection in VAR-models (Shutong Ding and Sune Karlsson)
7th Rimini Bayesian Econometrics Workshop, Rimini, 25-26/6, 2013: Bayesian forecast combination in VAR models with many predictors (Shutong Ding and Sune Karlsson)
Rimini Conferences in Economics & Finance, Rimini, 9-10/6, 2014: Bayesian forecasting with reduced rank VARs (Shutong Ding and Sune Karlsson)
Short term forecasting workshop, Warsaw, 13-14/11, 2014: Bayesian forecasting with reduced rank VARs (Shutong Ding and Sune Karlsson)
 

Grant administrator: Örebro University
Reference number: P09-0972:1-E
Amount: SEK 2,600,000
Funding: RJ Projects
Subject: Probability Theory and Statistics
Year: 2009