Data users want to understand how official data are generated, and statistical agencies want to explain their methodologies. But sometimes the two groups seem to speak different languages.
I saw an example in a recent article in the Journal of Finance by Tim Kroencke, “Asset Pricing without Garbage” (HT: Jonathan Parker). Kroencke is analyzing a well-known model in financial economics, the “consumption-based capital asset pricing model.” This model says that variation in consumption should, in theory, help explain the relationships between asset prices, risk, and returns. However, when this model has been tested empirically with national accounts data on personal consumption expenditures (PCE), it has produced poor or even implausible results. (This 1986 paper by Greg Mankiw and Matt Shapiro provides some background and a nice overview of the model.)
Kroencke’s explanation for the poor empirical performance starts with the idea that raw consumption data are subject to measurement error. He hypothesizes that when statistical agencies measure consumption, they attempt to remove the measurement error:
NIPA statisticians do not attempt to provide a consumption series to measure stock market consumption risk. Instead, they try to estimate the level of consumption as precisely as possible. As a result, they optimally filter observable consumption to generate their series of reported NIPA consumption. Concerning asset pricing, however, filtered consumption leads to disastrous results when the consumption risk of stocks is estimated… On top, filtering is intensified by the well-known bias stemming from time-aggregation: While reported consumption is an estimate of consumption flow during a specific period, the consumption-based asset pricing model relates asset returns to consumption at a specific point in time.
My immediate reaction when I read the assertion that NIPA statisticians “optimally filter observable consumption to generate their series” was surprise. I worked with and oversaw the U.S. national income and product accounts for 19 years, and I know that BEA staff didn’t typically think of themselves as filtering observable consumption. Indeed, they seldom used the term “filter.” Nevertheless, I wondered if some BEA methods were, either directly or indirectly, acting as a filter.
Before talking about statistical agency methodology, here’s a quick explanation of what statisticians mean when they talk about filters. The idea originated with signal processing—for example, the way that engineers remove noise from a radio or television signal to enhance the clarity of the signal that’s transmitted. That work eventually led to the development of digital filters, and statisticians soon realized that these same methods could also be applied to remove measurement error and other types of unwanted noise from time series. In time series econometrics, the Kalman filter is widely used, and you also occasionally see applications of other types of filters, such as the fast Fourier transform.
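To make the idea concrete, here is a minimal sketch of a Kalman filter for the simplest possible case, a "local level" model in which the true series follows a random walk and each observation adds independent measurement noise. The function name, the variance values, and the simulated data are all illustrative assumptions; this is not a description of any agency's procedure.

```python
import random


def kalman_local_level(observations, process_var, measurement_var,
                       initial_mean=0.0, initial_var=1e6):
    """Filter a noisy series under a random-walk-plus-noise model.

    State:        level[t] = level[t-1] + w[t],  Var(w) = process_var
    Measurement:  y[t]     = level[t]   + v[t],  Var(v) = measurement_var
    """
    mean, var = initial_mean, initial_var
    filtered = []
    for y in observations:
        # Predict: a random walk keeps the same expected level,
        # but uncertainty about it grows by the process variance.
        var = var + process_var
        # Update: the gain decides how far to move toward the new observation.
        gain = var / (var + measurement_var)
        mean = mean + gain * (y - mean)
        var = (1.0 - gain) * var
        filtered.append(mean)
    return filtered


def roughness(series):
    """Sum of squared period-to-period changes (a crude volatility measure)."""
    return sum((b - a) ** 2 for a, b in zip(series, series[1:]))


# Simulated data (hypothetical): a steadily rising level observed with noise.
random.seed(0)
true_level = [100.0 + 0.5 * t for t in range(40)]
noisy = [x + random.gauss(0.0, 2.0) for x in true_level]
filtered = kalman_local_level(noisy, process_var=0.25, measurement_var=4.0)
```

The larger the assumed measurement variance relative to the process variance, the smaller the gain and the smoother the filtered series. Comparing `roughness(noisy)` with `roughness(filtered)` shows how much period-to-period volatility the filter removes.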
The Philadelphia Fed’s GDPplus is an example of a series that’s derived from a formal filtering process. The statistic was developed by Borağan Aruoba, Francis Diebold, Jeremy Nalewaik, Frank Schorfheide, and Dongho Song, who note that BEA produces two measures of gross domestic product: the familiar expenditure-side measure, which BEA calls GDP, and an income-side measure that BEA calls gross domestic income. These measures are conceptually equivalent, but because they are produced from different source data and are each subject to measurement error, they differ in practice. The authors describe their approach: “We view ‘true GDP’ as a latent variable on which we have several indicators, the two most obvious being GDP-E and GDP-I, and we then extract true GDP using optimal filtering techniques.”
As shown in the chart, the Philadelphia Fed’s filtered estimate of GDP, “GDPplus,” shown with the light blue line, is smoother than either BEA’s expenditure-based or income-based GDP measures.
Although “filtering” is not really a concept that’s explicitly used in BEA’s methodologies for PCE, after some thought I decided several methods might be related to filtering.
First, here’s a quick summary of the methodology for PCE:
- BEA’s estimates of quarterly PCE for particular spending categories are mostly based on Census monthly retail sales (for goods), the Census quarterly services survey (for services), and a few miscellaneous data sources for specific components (e.g., Energy Information Administration data for gasoline sales and household electricity and gas consumption, Wards data for motor vehicle sales, etc.).
- These estimates of nominal spending are seasonally adjusted, and then deflated by price indices (primarily from the consumer price index, or CPI) to obtain real PCE.
- Chain-type price and quantity indexes are used to aggregate the individual spending categories to obtain total PCE and other broad categories.
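The deflation and aggregation steps in this summary can be sketched in a few lines. The example below assumes a Fisher formula for the chain-type quantity index (the geometric mean of the Laspeyres and Paasche indexes between adjacent periods). The two spending categories and all of the numbers are made up for illustration, and the function names are hypothetical.

```python
from math import sqrt


def fisher_quantity_relative(p0, q0, p1, q1):
    """Fisher quantity index between two adjacent periods."""
    # Laspeyres: value both periods' quantities at the earlier period's prices.
    laspeyres = (sum(p * q for p, q in zip(p0, q1))
                 / sum(p * q for p, q in zip(p0, q0)))
    # Paasche: value both periods' quantities at the later period's prices.
    paasche = (sum(p * q for p, q in zip(p1, q1))
               / sum(p * q for p, q in zip(p1, q0)))
    return sqrt(laspeyres * paasche)


def chained_quantity_index(prices, quantities, base=100.0):
    """Chain the period-to-period Fisher relatives into one index series."""
    index = [base]
    for t in range(1, len(prices)):
        relative = fisher_quantity_relative(
            prices[t - 1], quantities[t - 1], prices[t], quantities[t])
        index.append(index[-1] * relative)
    return index


# Hypothetical data: nominal spending and price indexes for two categories
# over three periods (rows are periods, columns are categories).
nominal = [[50.0, 30.0], [55.0, 33.0], [60.5, 39.6]]
prices = [[1.0, 1.0], [1.1, 1.0], [1.1, 1.1]]

# Deflation: real spending is nominal spending divided by the price index.
real = [[n / p for n, p in zip(n_row, p_row)]
        for n_row, p_row in zip(nominal, prices)]

total_real_index = chained_quantity_index(prices, real)
```

Because the weights are updated every period, the aggregate does not depend on a single fixed base year's prices. Neither the deflation nor the chaining step smooths the series over time; both operate within a period or between two adjacent periods.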
The most obvious example of filtering is seasonal adjustment. Seasonal adjustment methods decompose a time series into three components: seasonal, trend, and irregular. The seasonal component reflects regular seasonal fluctuations, such as increased retail purchases around the Christmas holiday, as well as other regular, predictable effects, such as the number of trading days during a month. The trend reflects the underlying long-term movement of the series, and the irregular component reflects unpredictable movements, such as unseasonable weather, strikes, and natural disasters, as well as random measurement error. When a statistical agency produces seasonally adjusted data, it has removed the seasonal component from the data, leaving the trend and irregular components. Here are adjusted and unadjusted data from the Census Bureau on e-commerce retail sales:
While seasonal adjustment filters out regular seasonal effects and generates a smoother overall series, it doesn’t remove the irregular component, which includes idiosyncratic shocks and measurement error. (I understand that the Australian Bureau of Statistics sometimes publishes the trend estimate, which filters out both the seasonal and irregular components, but that is not common practice for U.S. statistical agencies.) The PCE data do not currently filter out unusual non-seasonal events like weather, strikes, riots, etc.
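A toy version of the decomposition helps show what is and is not removed. The sketch below is a deliberately simplified additive decomposition for quarterly data: a centered moving average estimates the trend, and the average deviation from trend in each quarter estimates the seasonal factor. Production seasonal adjustment (for example with X-13ARIMA-SEATS) is far more elaborate; the function names and data here are hypothetical.

```python
QUARTERS = 4


def centered_moving_average(series):
    """2x4 centered moving average; None where the window runs off the ends."""
    trend = [None] * len(series)
    for t in range(2, len(series) - 2):
        window = (0.5 * series[t - 2] + series[t - 1] + series[t]
                  + series[t + 1] + 0.5 * series[t + 2])
        trend[t] = window / QUARTERS
    return trend


def seasonal_factors(series, trend):
    """Average deviation from trend for each quarter, normalized to sum to zero."""
    sums = [0.0] * QUARTERS
    counts = [0] * QUARTERS
    for t, (value, level) in enumerate(zip(series, trend)):
        if level is not None:
            sums[t % QUARTERS] += value - level
            counts[t % QUARTERS] += 1
    raw = [s / c for s, c in zip(sums, counts)]
    mean = sum(raw) / QUARTERS
    return [r - mean for r in raw]


def seasonally_adjust(series):
    """Subtract the estimated seasonal factor; trend and irregular remain.

    Needs at least 8 quarterly observations so every quarter has a trend value.
    """
    factors = seasonal_factors(series, centered_moving_average(series))
    return [value - factors[t % QUARTERS] for t, value in enumerate(series)]


# Hypothetical data: a linear trend plus a repeating pattern with a Q4 spike.
trend_part = [100.0 + 2.0 * t for t in range(12)]
pattern = [-3.0, -1.0, -2.0, 6.0]
observed = [level + pattern[t % QUARTERS] for t, level in enumerate(trend_part)]
adjusted = seasonally_adjust(observed)
```

In this noise-free example the adjusted series recovers the underlying trend. The procedure only targets the pattern that repeats every year, so movements that do not recur on that schedule are not treated as seasonal and stay in the adjusted series.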
What else happens when statistical agencies process their data that might be considered similar to filtering? When the Census Bureau collects survey data, the data first go through a data editing process that identifies outliers and erroneous data. While I tend to think of the data editing process as eliminating obvious errors (e.g., a respondent leaving off a digit in reporting a number), it does tend to reduce measurement error, and I suppose you could think of it as a type of filtering.
The data also go through a periodic benchmarking process, in which BEA benchmarks all of the GDP components at 5-year intervals to the benchmark input-output accounts, which in turn are mostly based on the economic census. The 5-year census, in essence, sets the levels of GDP and PCE. The annual and quarterly estimates are interpolated to maintain consistency between the quarterly, annual, and economic census-year estimates without introducing breaks in the time series. But the interpolation procedure (the “Denton” method) is designed to preserve as much of the quarter-to-quarter movements in the quarterly source data as possible while still maintaining the annual and 5-year benchmark levels. That is, the benchmarking and interpolation methods should improve the accuracy of the data, but are not primarily designed to smooth out or reduce volatility in the quarterly estimates.
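For readers who want to see the mechanics, here is a rough sketch of one common variant of the Denton method, the proportional first-difference form. It chooses quarterly values whose ratio to the indicator changes as little as possible from quarter to quarter, subject to the quarters in each year summing to the annual benchmark. Which variant BEA applies in any given case is not something this sketch claims to reproduce; the function names and numbers are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def denton_proportional(indicator, annual_totals, per_year=4):
    """Benchmark a quarterly indicator to annual totals.

    Minimizes sum_t (r[t] - r[t-1])**2 with r[t] = x[t] / indicator[t],
    subject to each year's quarters summing to that year's benchmark.
    """
    t_count, y_count = len(indicator), len(annual_totals)
    if t_count != y_count * per_year:
        raise ValueError("indicator must cover exactly the benchmark years")
    size = t_count + y_count
    kkt = [[0.0] * size for _ in range(size)]
    # Objective block: 2 * D'D, where D takes first differences of the ratios.
    for t in range(t_count - 1):
        kkt[t][t] += 2.0
        kkt[t + 1][t + 1] += 2.0
        kkt[t][t + 1] -= 2.0
        kkt[t + 1][t] -= 2.0
    # Constraint blocks: sum over the year of indicator[t] * r[t] = benchmark.
    for y in range(y_count):
        for q in range(per_year):
            t = y * per_year + q
            kkt[t_count + y][t] = indicator[t]
            kkt[t][t_count + y] = indicator[t]
    rhs = [0.0] * t_count + list(annual_totals)
    ratios = solve_linear(kkt, rhs)[:t_count]
    return [r * z for r, z in zip(ratios, indicator)]


# Hypothetical data: two years of a quarterly indicator and annual benchmarks.
indicator = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
benchmarks = [100.0, 130.0]
benchmarked = denton_proportional(indicator, benchmarks)
```

The objective penalizes changes in the ratio of the benchmarked series to the indicator, not changes in the series itself. That is why the method tends to preserve the indicator's quarter-to-quarter movements instead of smoothing them away: if the benchmarks are a constant multiple of the indicator's annual sums, the result is just the indicator scaled by that multiple.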
For data that haven’t yet been benchmarked (that is, the data for the most recent period that hasn’t yet been through an annual revision), BEA uses a “best change” method–that is, the estimates for the most recent quarters are based on the quarter-to-quarter changes in the source data, thereby avoiding any level breaks in the estimates.
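As a small illustration of the idea, the sketch below carries the last benchmarked level forward using only the period-to-period changes in an indicator. It assumes the changes are applied proportionally, which is a simplification, and the function name and numbers are hypothetical.

```python
def best_change_extrapolate(last_benchmarked_level, indicator):
    """Extend a benchmarked level using the indicator's proportional changes.

    indicator[0] is the indicator's value in the last benchmarked period;
    later entries cover the periods that have not yet been benchmarked.
    """
    levels = [last_benchmarked_level]
    for previous, current in zip(indicator, indicator[1:]):
        # Only the ratio of successive indicator values is used, so a gap
        # between the indicator's level and the benchmarked level is ignored.
        levels.append(levels[-1] * current / previous)
    return levels


# Hypothetical data: the benchmarked estimate stands at 200, while the
# source-data indicator reads 50 in that period and then grows 10% twice.
estimates = best_change_extrapolate(200.0, [50.0, 55.0, 60.5])
```

Because the indicator's own level never enters the estimate, a difference in level between the source data and the benchmarked series cannot show up as an artificial jump in the most recent quarters.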
Another factor that especially affects the BEA estimates of PCE services is that there have historically been significant gaps in the quarterly source data. Over the last 15 years, a lot of progress has been made in eliminating those gaps. In particular, the Census Bureau’s quarterly services survey (QSS) now provides source data for a large share of PCE services. But that’s a relatively recent survey. The QSS started in 2003, and it took several years of expansion in coverage until it covered most of the services categories, sometime around 2010. That means that for most of the history of the PCE estimates, BEA really didn’t have much actual quarterly data on consumption of services. The older numbers are based on proxy variables—data like industry employment or hours—rather than on actual sales or consumption. These proxy data were clearly measured with error relative to what PCE is conceptually designed to measure. The proxy data are also probably smoother in most cases than actual consumption data would have been, so they could be considered another type of filtering.
So I guess the bottom line is that there’s certainly a type of filtering going on in the PCE and GDP estimation processes, especially in seasonal adjustment. But the overall filtering process is less formal and very different from what we see in the statistical literature. And BEA generally tries to avoid smoothing out too much volatility in the data. After all, one of the main uses of quarterly GDP is to measure the effects of business cycles, and BEA staff recognize that excessive data smoothing would risk making GDP less reliable as a business cycle indicator.
I’m not going to try to evaluate how BEA’s actual methods relate to Kroencke’s analysis, but I thought it would be interesting to share my take on how BEA’s methods relate to the overall idea of filtering.
Statistics New Zealand also publishes trend estimates, though generally you have to go into our InfoShare system to get them. In the past we often put them into news releases and published graphs, though we are in the de-emphasis phase at present.
There have been many discussions here and with our colleagues at the Australian Bureau of Statistics as to the place of trend estimates in official statistics. In my view it is part of the more general conversation of how we convey the idea that some outputs are more reliable than others (i.e., users read the figures but generally not the footnotes and explanations in any detail).
Offtopic, but I thought you might be interested in this
As you can imagine, your clustering research is heavily cited.