At the international level, the study developed by Stock and Watson (1991) has served as a catalyst for the construction of monthly indicators of economic activity. Such indicators enable the monitoring of general economic conditions before gross domestic product (GDP) data are published because statistical offices typically record GDP on a quarterly basis and publish the first results after several weeks.
However, the application of this methodology has focused on national statistics. There are very few examples of monthly indicators of economic activity for a region within a country. We did not find any regional indicators constructed using Stock and Watson’s (1991) methodology in Latin America. In the international literature, we identified only two applications, both in the United States: the work by Megna and Xu (2003, 701–713) for the state of New York and by Crone (2005) for all fifty states.
This article applies a version of this methodology adjusted to the characteristics of the data usually available at the regional level. The contributions of this study are twofold. First, we develop a three-step empirical strategy to build monthly regional economic indicators that can be replicated in provinces or municipalities in Latin America; that is, we add two steps to the conventional methodology of Stock and Watson (1991). Second, we apply this empirical strategy to develop the Monthly Indicator of Economic Activity (in Spanish, IMAE) for one of the main regions of Colombia: Valle del Cauca. The interest of the study, however, goes beyond the specific case of Valle del Cauca, which is presented as an example of the viability and applicability of the proposed empirical strategy.
Valle del Cauca occupies a strategic geographic position for Colombia because it is the gateway to trade with Asia and the entire Pacific Coast of Canada, the United States, and Latin America. However, as in most provinces in Latin America, to plan finances and make spending and investment decisions, the regional government and the companies operating in the department monitor a disparate set of economic variables on a monthly basis and receive data on the region’s annual GDP with a lag of several months. In this article, we show how we constructed an indicator that unifies all this information in a single variable to estimate the general state of the department’s economy on a monthly basis. Because this problem is not particular to the government and companies of Valle del Cauca, the methodology we present can be easily replicated in other regions in Latin America.
The IMAE is already in use in the Regional Economic Bulletin of the central bank of Colombia (Banco de la República 2015). Each month, the IMAE indicates whether the economic activity of Valle del Cauca is accelerating or decelerating, and whether the estimated economic growth is above or below the historical average. This real-time information about the economic cycle falls within what is internationally known as “nowcasting” because it provides an estimate of the state of the economy at that time (in each month), according to the information available to date.
Indicators of economic activity are expected to contribute to better fiscal management of the regional governments in Latin America, because the government will have greater ability to project its income and plan its spending budget. Meanwhile, companies can make a better assessment of the income effect on demand for their goods and services. In general, all economic actors will have information in “real time” on the general state of the economy, which is key to early decision making and efficient resource allocation.
The first step of this empirical strategy begins with the estimation of a dynamic factor model (DFM) with the Kalman filter. The DFM assumes that there is an unobserved or latent variable common to a group of various observed variables. By using series related to economic activity, the latent variable approximates the general state of the economy. The DFM seeks to identify common repetitive sequences in the series, that is, the co-movements. This pattern in the common dynamic provides a signal about the evolution of the economic cycle. For regional indicators, an initial study has to be conducted to search for and select key variables for the economic cycle according to the economic structure of the province or municipality and the characteristics of the information available.
The second step includes the temporal disaggregation of the department’s annual GDP using the Litterman method (1983, 169–173). A monthly GDP series is obtained to include this information in the coincident activity index. The methodology applied in this study is therefore unusual because it combines information on an annual basis and a monthly basis in a single indicator. The use of the Litterman method allows us to take into consideration the fact that regions do not have a quarterly GDP, only an annual one.
Finally, the third step applies the univariate structural time-series model initially developed by Harvey (1989) to smooth the estimated common factor and obtain a better signal of the business cycle. Including Harvey’s (1989) methodology accounts for the high volatility of the series available at the regional level, which makes it necessary to smooth the indicator’s trajectory for a more reliable estimate of the economic cycle.
The most common methods of DFM estimation are the principal components method and the Kalman filter. The former determines the least number of principal components that explain the largest proportion of variability in the original data set, with the first principal component being a linear combination of the original variables in the direction of maximum variability. Some studies in this regard include those by Gupta and Kabundi (2011, 1076–1088); Forni, Hallin, Lippi, and Reichlin (2009, 62–85); Schumacher (2007, 271–302); and Boivin and Ng (2006, 169–194). One of the characteristics of principal components estimation is that it requires a large number of variables, which becomes a disadvantage for regional application.
The origin of the filter can be found in a paper by Kalman (1960, 35–45). The Kalman filter is the main algorithm for estimating dynamic systems represented in state-space form. Applications of this method to construct indicators of economic activity include those by Angelini, Banbura, and Rünstler (2008, 1–22) and Camacho and Doménech (2012, 475–497), who applied a DFM mixing monthly and quarterly data to predict economic activity in the Eurozone and Spain, respectively. Additionally, Aruoba, Diebold, and Scotti (2009, 417–427) mixed frequencies with few series for the United States: with only five variables, they produced an index of economic activity that can be observed weekly. Other models focused on the Eurozone are those by Camacho, Pérez-Quirós, and Poncela (2013) and Camacho and Pérez-Quirós (2010, 663–694). There are also other DFM applications, such as the case study by Poncela, Senra, and Sierra (2014, 3724–3735), which evaluates common movement in the prices of different raw materials.
In Colombian national literature, the development of economic indicators at the macroeconomic level using DFMs has also had an extended presence. The following studies are noted: Melo and Nieto (2001, 1–32); Melo and colleagues (2001, 1–66); Melo, Nieto, and Ramos (2003, 1–57); Castro (2003, 1–26); Kamil, Pulido, and Torres (2010, 1–40); and Marcillo (2013, 1–41).
The sections that follow describe the three steps of the empirical strategy used to build a regional economic indicator, explain the selection and transformation process of the variables, report the results of each of the three steps, and present conclusions.
Step 1: The Dynamic Factor Model and the Kalman Filter
The methodology in this study takes Stock and Watson (1991) as one of its principal references. In general, the dynamic factor model assumes that a vector Yt, with N observed variables of economic activity, can be represented as the sum of two unobservable components that are mutually independent: a component common to all variables (Ft) that represents the state of the economy and an idiosyncratic component (μt) that represents the dynamics of each series. Accordingly, the equation for the observed series Yt is as follows:

$$Y_t = P F_t + \mu_t \qquad (1)$$
where Yt is a vector (N × 1) of the monthly variables of economic activity; Ft is the common factor or co-movement of the series; P is a factor loading matrix (N × 1); and μt is a vector (N × 1) of specific or idiosyncratic components. The dynamics of the factors are given by the following:

$$\Phi(B) F_t = a_t \qquad (2)$$
where at is normal multivariate white noise with variance-covariance matrix ∑a, Φ(B) = I – Φ1B – … – ΦpB^p, and B is the lag operator. We also consider that the idiosyncratic components may have a dynamic structure in the following form:

$$D(B)\,\mu_t = e_t \qquad (3)$$
where D(B) = diag(Di(B)) is a diagonal matrix that includes the specific dynamics of each idiosyncratic disturbance, and Di(B) corresponds to the autoregressive structure of the idiosyncratic component of series i, expressed with the lag operator. In addition, et is zero-mean white noise with a diagonal covariance matrix.
The DFM is estimated by maximum likelihood, applying the Kalman filter. To that end, the model is written in state-space form:

$$Y_t = H z_t + \epsilon_t \qquad (4)$$

$$z_t = G z_{t-1} + \nu_t \qquad (5)$$
The measurement equation, equation 4, describes the relationship between the observed variables (data) and a vector of state variables zt (m × 1). The state equation, equation 5, describes the dynamics of the state variables (unobserved variables) over time, where H is a matrix (N × m) and ϵt is (N × 1), with ϵt ~ iid N(0, R). G is a transition matrix m × m, νt is m × 1, and νt ~ iid N(0, Q). It is assumed that the errors in the measurement equation ϵt are independent from the errors in the transition equation νt, such that:

$$E(\epsilon_t \nu_s') = 0 \quad \text{for all } t, s$$
The state vector contains the most important information of the system at each point in time. The basic Kalman filter is an algorithm that predicts and updates the vector zt, repeating both steps for each observation from the beginning to the end of the sample and starting from initial values of the system parameters (matrices H, G, R, and Q). The algorithm minimizes the mean squared prediction error recursively: at each observation, zt is updated with the new information contained in the prediction error, weighted by the Kalman gain.
Thus, the estimated state vector zt and the matrix H contain the information most relevant to this study: the common factor or co-movement (Ft) of the variables used and the matrix of loadings or weights (P) with which each variable contributes to the factor.
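The prediction and updating recursion described above can be illustrated with a minimal sketch. It is not the estimation routine used in this study: it assumes a single common factor that follows an AR(1) process, white-noise idiosyncratic components, and known parameters (in the study they are estimated by maximum likelihood). The function name, the simulated data, and all parameter values are hypothetical.

```python
import random

def kalman_filter_one_factor(Y, loadings, phi, q, r):
    """Filter a scalar AR(1) common factor from N observed series.

    Y        : list of T observations, each a list of N values
    loadings : list of N factor loadings (the column P, here equal to H)
    phi      : AR(1) coefficient of the factor (the 1 x 1 matrix G)
    q        : variance of the state noise (Q)
    r        : list of N idiosyncratic variances (diagonal of R)
    """
    # Start from the stationary mean and variance of the factor
    z, p = 0.0, q / (1.0 - phi ** 2)
    filtered = []
    for y_t in Y:
        # Prediction step
        z, p = phi * z, phi * phi * p + q
        # Update step: R is diagonal, so series can be processed one at a time
        for y_it, h_i, r_i in zip(y_t, loadings, r):
            error = y_it - h_i * z        # prediction error
            f = h_i * h_i * p + r_i       # variance of the prediction error
            gain = p * h_i / f            # Kalman gain
            z += gain * error
            p -= gain * h_i * p
        filtered.append(z)
    return filtered

# Simulated example with hypothetical parameter values
rng = random.Random(0)
phi, loadings, r = 0.8, [1.0, 0.7, 0.5], [0.5, 0.5, 0.5]
f, true_factor, Y = 0.0, [], []
for _ in range(300):
    f = phi * f + rng.gauss(0, 1)
    true_factor.append(f)
    Y.append([h * f + rng.gauss(0, r_i ** 0.5) for h, r_i in zip(loadings, r)])

estimate = kalman_filter_one_factor(Y, loadings, phi, 1.0, r)
```

In this simplified setting the state vector zt reduces to the factor itself, so `estimate` plays the role of Ft and `loadings` the role of P.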
Step 2: The Temporal Disaggregation of GDP
Step 1 is used to estimate a common factor of the monthly variables available. However, to efficiently use the available information, we worked to incorporate the annual GDP data available for the department since 2000 in the indicator. The goal is for the indicator cycles to be more consistent with the annual GDP growth rates for the province or municipality in the period in which this information is available.
In an extension of the work of Stock and Watson (1991), Angelini, Banbura, and Rünstler (2008, 1–22) and Camacho and Doménech (2012, 475–497) developed a methodology of dynamic factor models with the possibility of combining data on quarterly GDP with higher frequency series. This methodology is not applicable to Valle del Cauca because no quarterly GDP information is available. In general, very few regions within countries have quarterly GDP statistics in Latin America.
In this article, we apply a methodology that is consistent with the characteristics of the information that is typically available at the regional level. We propose a methodology in which monthly and annual data are combined in the indicator after temporally disaggregating GDP with the common factor previously estimated with the Kalman filter in the DFM. Specifically, the common factor is used to measure GDP on a monthly basis, following the Litterman method (1983, 169–173).
Temporal disaggregation consists of generating a high-frequency series starting from low-frequency data. Some of these methods use one or more additional related variables with a high frequency (for a detailed review of the different methods of temporal disaggregation, see Quilis 2001; Hurtado and Melo 2010, 1–36). In this article, a factor (Ft) that was previously extracted as co-movement from the monthly series is used as the related variable.
The temporal disaggregation technique for economic series proposed by Litterman (1983) generalizes the approach described by Fernández (1981, 471–478) and assumes the existence of a linear relationship between the series to be estimated (h) and the p high-frequency related series (w):

$$h = w\beta + u$$
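where h = [h1, …, hN] is a vector N × 1 with unobserved high-frequency values of the series after temporal disaggregation; w = [w1, …, wp] is a matrix N × p with p high-frequency related variables (for this study, p = 1); β is a vector p × 1 of coefficients; and u has the dimension N × 1 with zero mean and covariance matrix V. In Litterman's specification, the disturbance follows:

$$(1 - \alpha B)(1 - B)\,u_t = a_t$$

where at is a white noise process with variance σ², B is the lag operator, and α is an autoregressive parameter; α = 0 gives the random-walk case of Fernández (1981).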
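The following sketch shows one way the disaggregation could be computed by generalized least squares. It is an illustration under stated assumptions, not the routine used in this study: the autoregressive parameter `alpha` is taken as given rather than estimated, the disturbance is initialized at zero, annual values are treated as sums of monthly values, and the input data are hypothetical. It relies on NumPy.

```python
import numpy as np

def litterman_disaggregate(y_annual, w, alpha=0.0, months=12):
    """Distribute annual totals over months using related series w.

    y_annual : (n_years,) annual totals (e.g. GDP at constant prices)
    w        : (n_years * months, p) high-frequency related series
    alpha    : AR(1) parameter of the differenced disturbance;
               alpha = 0 gives the Fernandez (1981) special case.
               It is assumed known here, not estimated.
    """
    y = np.asarray(y_annual, dtype=float)
    w = np.asarray(w, dtype=float).reshape(len(y) * months, -1)
    n = w.shape[0]
    # Aggregation matrix: each annual value is the sum of its months
    C = np.kron(np.eye(len(y)), np.ones((1, months)))
    # (1 - alpha B)(1 - B) u_t = a_t, written with lower-triangular matrices
    D = np.eye(n) - np.eye(n, k=-1)            # first-difference operator
    A = np.eye(n) - alpha * np.eye(n, k=-1)    # AR(1) filter
    V = np.linalg.inv(D.T @ A.T @ A @ D)       # covariance of u, up to sigma^2
    S_inv = np.linalg.inv(C @ V @ C.T)
    Cw = C @ w
    # GLS estimate of beta from the annual (aggregated) regression
    beta = np.linalg.solve(Cw.T @ S_inv @ Cw, Cw.T @ S_inv @ y)
    # Fitted monthly values plus the annual residuals spread across months
    return w @ beta + V @ C.T @ S_inv @ (y - Cw @ beta)

# Hypothetical inputs: four years of annual totals and a monthly indicator
rng = np.random.default_rng(0)
factor = 100 + np.cumsum(rng.normal(size=48))
gdp_annual = np.array([1250.0, 1290.0, 1340.0, 1375.0])
gdp_monthly = litterman_disaggregate(gdp_annual, factor, alpha=0.5)
```

By construction, the twelve monthly values of each year add up to the corresponding annual figure, which is the property that keeps the monthly series consistent with the published annual GDP.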
Step 3: The Univariate Structural Time-Series Model
The common factor resulting from the estimate with the DFM sometimes contains too much noise, which makes it difficult to have a clear reading of the state of the economy. To have an indicator that can be interpreted and used by economic actors, it is necessary to extract a stronger signal, that is, to estimate the most permanent trajectory of the variable.
To that end, there are different methods of signal extraction. There are empiricist methods, which are characterized by implementing decomposition based on linear filters whose structure and parameters are not dependent on the nature of the data but instead have preset values. These include moving averages, exponential smoothing, X-11 ARIMA (autoregressive integrated moving average) (and its most recent versions), and the Hodrick-Prescott filter.
In this article, we use the univariate structural time-series model initially developed by Harvey (1989). This method, unlike the empiricist methods, takes into account the particular characteristics of each time series when estimating its different unobservable components (trend, cycle, seasonality, and irregular component). It allows each of the unobservable components to have a stochastic nature and is distinguished by estimating the trend independently of the cyclical component; it can also estimate up to three cycles of different periods.
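Its broadest specification, considering all components of a time series yt, is as follows:

$$y_t = \mu_t + \psi_t + \gamma_t + \varepsilon_t$$

$$\mu_t = \mu_{t-1} + \beta_{t-1} + \eta_t$$

$$\beta_t = \beta_{t-1} + \xi_t$$

$$\gamma_t = -\sum_{j=1}^{s-1} \gamma_{t-j} + w_t$$

where the trend is broken down into changes of level μt and slope βt; γt is the seasonal component, with s seasons per year; and ɛt is the irregular component, which can consist of white noise or follow an autoregressive process. In turn, ηt, ξt, and wt are white noise with variances σ²η, σ²ξ, and σ²w; these variances are called hyperparameters and allow the components to evolve stochastically if they are not equal to 0. ψt represents the cyclical component, which is modeled with periodic sine and cosine functions:

$$\begin{pmatrix} \psi_t \\ \psi_t^* \end{pmatrix} = \rho \begin{pmatrix} \cos\lambda_c & \sin\lambda_c \\ -\sin\lambda_c & \cos\lambda_c \end{pmatrix} \begin{pmatrix} \psi_{t-1} \\ \psi_{t-1}^* \end{pmatrix} + \begin{pmatrix} \kappa_t \\ \kappa_t^* \end{pmatrix}$$

where κt and κt* are white noises that are not correlated with each other or with any other disturbances in the model and have common variance σ²κ; ρ is a damping factor (0 < ρ ≤ 1); and the parameter λc is the frequency measured in radians; that is, it represents the number of times the cycle is repeated in a time period with length 2π. To estimate the model, we used the program Structural Time Series Analyser, Modeller and Predictor (STAMP) by Koopman, Harvey, Doornik, and Shephard (2009).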
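To make the role of the frequency λc concrete, the sketch below simulates the cyclical component with a sine-cosine recursion of the kind used in Harvey-type structural models. It is only an illustration: the damping factor `rho`, the 48-month period, and the function name are assumptions of this example, and STAMP estimates these quantities rather than taking them as given.

```python
import math
import random

def simulate_cycle(n, period, rho=1.0, sigma_kappa=0.0, psi0=1.0, seed=0):
    """Simulate n steps of a stochastic cycle with the given period (in months).

    rho         : damping factor (assumed; rho = 1 means no damping)
    sigma_kappa : common standard deviation of the two cycle disturbances
    """
    lam = 2 * math.pi / period          # frequency in radians
    rng = random.Random(seed)
    psi, psi_star = psi0, 0.0
    path = [psi]
    for _ in range(n):
        kappa = rng.gauss(0, sigma_kappa)
        kappa_star = rng.gauss(0, sigma_kappa)
        # Rotate (psi, psi*) by lam, damp by rho, and add the disturbances
        psi, psi_star = (
            rho * (math.cos(lam) * psi + math.sin(lam) * psi_star) + kappa,
            rho * (-math.sin(lam) * psi + math.cos(lam) * psi_star) + kappa_star,
        )
        path.append(psi)
    return path

# Without disturbances or damping, the cycle repeats exactly every 48 months
deterministic = simulate_cycle(n=48, period=48)
# With disturbances, the period holds only on average
stochastic = simulate_cycle(n=240, period=48, rho=0.95, sigma_kappa=0.1)
```

Setting the disturbance variance to zero recovers a fixed sine wave, which is why a nonzero hyperparameter is what lets the estimated cycle vary in length and amplitude over time.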
The initial stage in building a regional indicator is variable selection. For Valle del Cauca, initially, twenty-seven monthly variables related to the economic activity of the department were analyzed (in Table A1 of the Appendix, the entire group of variables considered is shown).
The selection of the series to calculate the indicator is based on the following criteria: (1) series with a monthly frequency that were published during the period of study; (2) variables with the largest annual correlation with departmental GDP; (3) key variables that represent different sectors and/or components of demand in the region’s economy; and (4) variables with less lag in monthly publication.
The seven variables chosen that met these criteria and were eventually used to estimate the IMAE were the following:
- Ground sugarcane (CAN)
- Cement shipments (CEM)
- Nonresidential energy consumption (ENER)
- New vehicle sales (VEH)
- Exports at constant prices (X)
- The Regional Industrial Production Index (Índice de Producción Industrial Regional, or IPIR)
- Imports at constant prices (M)
The selected variables contain direct and/or indirect information about different key activities of the regional economy on either the supply side or the demand side. Importantly, each variable contains information beyond the sector where it is measured; this is the case for nonresidential energy demand and the data on exports and imports. All the variables provide indirect information on the general state of business activity in the region. Similarly, new vehicle sales and imports are linked to consumption and the economic situation of households. In this manner, estimating the state of the department’s economy is facilitated with few variables. The fact that the indicator is constructed using a few variables makes it feasible to repeat the methodology in other regions.
The sectors that are directly considered in the indicator are agriculture, construction, energy, industry, and trade. Compared to the sectors that are typically considered indicators of economic activity internationally, the inclusion of ground sugarcane in the IMAE is noteworthy. According to the Ministry of Agriculture and Rural Development, sugarcane accounts for 94 percent of agricultural production in Valle del Cauca and 49 percent of the sown area, making it the main agricultural crop of the department. Figures from the Asocaña business organization show that 77.3 percent of the production of sugar mills in Colombia is concentrated in Valle del Cauca. Sugarcane is the raw material for many important networks in other industries in the department, such as beverages, food, paper, cardboard, and pharmaceuticals; additionally, it is used in ethanol production and electricity cogeneration (Escobar, Moreno, and Collazos 2013, 3–10).
Furthermore, it is not usual to include external trade data in indicators of economic activity. According to figures from the Annual Report on Transportation in 2013, in Colombia, 96.7 percent of foreign trade is moved by maritime shipping. Valle del Cauca has a comparative advantage in having the only seaport on the Colombian Pacific Coast. The port of Buenaventura handled 46.2 percent of total foreign trade cargo in 2013. Thus, higher shipments of goods to and from abroad indicate increased economic activity in the department by promoting transportation, logistics, employment, and industry in the region.
The variables that comprise the IMAE are of the type used internationally in coincident economic activity indicators. They contain contemporaneous information on the supply side about economic sectors or effective demand in the economy. Therefore, it is assumed that the turning points of the IMAE coincide with the dates of turning points in the economic cycle, unlike leading indicators, whose turning points anticipate the economic cycle. To construct the latter type of indicator, data would be needed on expectations, interest rates, monetary aggregates, and other variables that contain advance information on economic activity.
Stock and Watson (1991) used the following macroeconomic variables in their estimations: industrial production, household income, sales, and employment. The variables considered for the IMAE directly or indirectly contain information about these categories, with the exception of employment. The series of total employment and unemployment rate available for the department had many outliers and were not consistent with the other variables used in the IMAE. Other regions in Latin America may have better employment data, which would be worth taking into account.
The following analysis and treatment of the series were performed. First, the augmented Dickey-Fuller (ADF), Phillips-Perron (PP), and Kwiatkowski, Phillips, Schmidt, and Shin (KPSS) tests were performed for each of the series to evaluate the presence of unit roots. Except for sugarcane, which is stationary, all selected series were integrated of order one. In addition, seasonality and outliers in the series were eliminated using the TRAMO-SEATS program of Gómez and Maravall (1996). A logarithmic first difference was applied to the I(1) variables, which expresses them as monthly rates of change. Last, all variables were standardized (see Table A2 in the Appendix for details on the treatment performed on each series).
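The last two transformations can be sketched as follows. This is a minimal illustration with hypothetical data; it does not cover the unit root tests or the TRAMO-SEATS adjustment, and the use of the sample standard deviation is an assumption, since the divisor is not specified here.

```python
import math
import statistics

def log_first_difference(levels):
    """Monthly rate of change of an I(1) series: ln(x_t) - ln(x_{t-1})."""
    return [math.log(curr) - math.log(prev)
            for prev, curr in zip(levels, levels[1:])]

def standardize(values):
    """Subtract the mean and divide by the sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical monthly levels of one seasonally adjusted I(1) series
levels = [100.0, 102.0, 101.0, 104.0, 107.0, 106.0]
growth = log_first_difference(levels)   # one observation is lost
z = standardize(growth)                 # mean 0, standard deviation 1
```

A stationary series such as ground sugarcane would skip the differencing step and go straight to standardization.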
Figure 1 shows the transformed series. All series have high variance; thus, it is expected that the co-movement of the series also show high variability.
Step 1: The Estimation of the Common Factor
Figure 2 shows the results for the common factor estimated by the Kalman filter. In Table 1, the weights that each variable contributes are shown (normalized to sum to 1). All series have positive correlations with the factor; that is, all are procyclical.
|Variable|Weight|
|---|---|
|1. Ground sugarcane|0.06|
|2. Cement shipments|0.06|
|3. Energy consumption|0.12|
|4. New vehicle sales|0.24|
|5. Exports|0.16|
|6. Regional IPI|0.22|
|7. Imports|0.15|
Table 1 shows that 46 percent of the common factor is composed of industry and the new vehicle sales variable (which approximates conditions regarding consumption and the economic situation of households). Next in importance are variables related to the external sector: exports with 16 percent and imports with 15 percent. Energy consumption has a weight of 12 percent, and the ground sugarcane and cement shipments each have a weight of 6 percent.
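As a quick check of these shares, the weights reported above can be added up directly. The dictionary keys follow the variable abbreviations used earlier; the snippet is only arithmetic on the published, rounded figures.

```python
# Weights as reported in the text (rounded to two decimals)
weights = {
    "CAN": 0.06,   # ground sugarcane
    "CEM": 0.06,   # cement shipments
    "ENER": 0.12,  # nonresidential energy consumption
    "VEH": 0.24,   # new vehicle sales
    "X": 0.16,     # exports
    "IPIR": 0.22,  # regional industrial production index
    "M": 0.15,     # imports
}

industry_and_vehicles = round(weights["IPIR"] + weights["VEH"], 2)
external_sector = round(weights["X"] + weights["M"], 2)
total = round(sum(weights.values()), 2)

print(industry_and_vehicles, external_sector, total)
```

Because each weight is rounded to two decimals, the reported figures add up to 1.01 rather than exactly 1.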
Step 2: Common Factor Corrected with GDP Information
As explained already in the methodology section, the indicator incorporates information on the department’s annual GDP data published by the National Administrative Department of Statistics (in Spanish, DANE). Thus, the aim is for indicator cycles to be more consistent with the annual growth rates of GDP in the period in which this information is available.
Table 2 shows the estimated monthly GDP data for the period 2000–2013 using the common factor (Ft) as related variable in the Litterman method. It is shown at constant prices (given that the annual data were in constant prices) and without seasonality (given that the common factor was constructed with seasonally adjusted variables).
Figure 3 shows the common factor corrected with monthly GDP information. The new series is composed from January 2000 to December 2013, using the rates of change of monthly GDP, and from January 2014 to March 2015, using the common factor previously estimated by the Kalman filter. The fact that GDP is disaggregated with the common factor ensures consistency throughout the series.
Importantly, the routine for the systematic use of the IMAE involves updating the factor by taking into account new annual GDP data whenever DANE publishes new information. However, there will never be a complete updated series for the IMAE corrected with GDP data, given the lag in publication. Therefore, monthly GDP does not replace the IMAE. The real-time assessment of the economy in the last period of the series (nowcasting) is always made with the “pure” data from the common factor. In this sense, the main benefit of measuring GDP monthly as part of the methodology to construct the IMAE is that doing so allows for a better assessment of the past, which improves the estimation of economic cycles.
Step 3: The Economic Cycle
In the trajectory of the factor in Figure 2, a cyclical component can be observed but with turning points that are hidden by the “noise” of the series. With this series, it is difficult to obtain a clear reading on the state of the economy in the region. Therefore, it is necessary to smooth the series by applying the univariate structural time-series model.
In the specification of the univariate structural model, neither seasonality components nor slope were included because both were eliminated in the factor estimation. The results of the estimation by the univariate structural model show that the series has the following components:
- A fixed level equal to 0.0031, which indicates the average monthly growth rate of the series (annual would be 3.8 percent).
- An irregular AR(1) component. This type of component coincides with the specification of both the DFM and the Litterman method.
- A cycle period of 4 years and 10 months. This length is the average period of a stochastic cycle based on growth rates (growth cycle). Given its average period, it corresponds to the cycle known in the literature as the “business cycle.” The estimated period of the cycle for Valle del Cauca approaches the four years estimated for national GDP by Arango, Arias, Flores, and Jalil (2008, 9–37).
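The two conversions implied by these results, from a monthly to an annual growth rate and from a cycle period to a frequency in radians, can be verified with a few lines of arithmetic. The variable names are illustrative, and compounding over twelve months is assumed for the annualization.

```python
import math

monthly_level = 0.0031                      # estimated fixed level
annual_growth = (1 + monthly_level) ** 12 - 1

period_months = 4 * 12 + 10                 # 4 years and 10 months
lambda_c = 2 * math.pi / period_months      # cycle frequency in radians

print(f"annual growth: {annual_growth:.1%}")
print(f"period: {period_months} months, frequency: {lambda_c:.4f} rad")
```

The annualized figure rounds to the 3.8 percent quoted in the text.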
Figure 4 shows the cyclical component and Figure 5 the estimated irregular component. The sum of these plus the fixed level (0.0031) would result in the common factor previously shown in Figure 3. The irregular component includes noise to be removed from the series, whereas the cycle contains the signal sought for the indicator: the strong, continued evolution of the common factor. Indeed, the estimated cycle is the monthly indicator of economic activity (IMAE) proposed for the department of Valle del Cauca.
Figure 6 shows the IMAE and the annual growth rates of GDP for Valle del Cauca. It is observed that when the IMAE exceeds zero, the GDP grows above its historical average (3.8 percent). This occurs in 2004, 2006, 2007, 2011, and 2013. In 2012, the IMAE remained near zero, and the economy grew by 3.79 percent. In the years when the IMAE is more negative, the GDP grows more slowly, as is the case of 2001, 2005, 2009, and 2010. The correlation between the average annual value of the IMAE and the annual GDP growth rate reported by DANE is 0.84.
Table A3 in the appendix shows the maximum and minimum points of the estimated growth cycle. The last turning point of the IMAE (0.14) is reached in June 2013. In this year, the department showed its fastest growth rate in the previous five years (GDP grew by 4.6 percent). Since then, the IMAE has estimated a slowdown, which may have bottomed out in November 2014. In the first quarter of 2015, Valle del Cauca grew at values that are very close to its historical average (3.8 percent) and below the growth in the first quarter of 2014.
An analysis of the contribution of the variables that constitute the IMAE suggests that from November 2014 to March 2015, industry in Valle del Cauca seems to be leading growth, replacing vehicle sales and foreign trade, which were the elements that contributed to the economy’s dynamism in the first three quarters of 2014.
Finally, we calculate the correlations between the IMAE and two monthly indicators of national economic activity: the Indicador de Seguimiento a la Economía (ISE), prepared by DANE, and the Índice Mensual de Actividad Económica Colombiana (IMACO), calculated by the central bank. The correlations are 0.45 and 0.34, respectively. These relatively low correlations suggest that the economic cycle in the region has a high degree of independence from the national business cycle. This result further justifies the need for Valle del Cauca to have its own monthly indicator of economic activity.
Although it is typically applied in the construction of indicators of economic activity at the macroeconomic level, in this paper, a version of the methodology of Stock and Watson (1991) was applied to a regional context.
On the basis of seven key historical series from the department of Valle del Cauca in the period from January 2000 to March 2015, a monthly coincident economic activity indicator was constructed. To that end, a methodology was employed that uses dynamic factor models estimated with the Kalman filter. Additionally, the estimated common factor for the series was adjusted to annual GDP growth data and then smoothed for a more stable signal about the evolution of economic activity. We present a three-step methodology that can be easily replicated in other regions in Latin America.
The regional industrial production index, vehicle sales, and exports are the series with the greatest weight within the estimated indicator. These are the variables that, to a greater extent, determine the economic cycle of the department. The smoothed indicator approximates a cycle for the department’s economic activity, with an average periodicity of four years and ten months.
The IMAE reveals a low correlation between the economic cycle of Valle del Cauca and the macroeconomic business cycle in Colombia. This can be explained by the peculiarities of the economic structure of Valle del Cauca. Unlike the predominant structure in the country, Valle del Cauca does not depend on extractive activities such as oil and minerals; rather, it has a diversified industry, trade concentrated with border partners, an agricultural sector dominated by sugarcane, and consumption fueled largely by remittances. While the Colombian economy as a whole slowed from 4.9 percent growth in 2013 to 2.5 percent in the first quarter of 2016, the economy of Valle del Cauca has shown resilience and maintained a growth rate close to 4 percent.
The cycle and structure of Valle del Cauca are similar to those of several regions in Latin America. These are regions that did not benefit directly from the commodities boom and increasing Asian demand in the 2000s. Nevertheless, they can now offer strength and provide momentum to the growth of exports and GDP in the current scenario of low commodity prices and a slowing Chinese economy.
A necessary extension of the present work would be to estimate time-series models with the IMAE and leading variables of economic activity, not only for nowcasting but also to project GDP growth in the region under different scenarios. This would allow new regional analysis of the transmission mechanisms through which external and domestic shocks, such as variations in interest rates, the exchange rate, and commodity prices, affect the monthly economic activity of the region.