How reliable are the employment/unemployment data? Standard errors and mean absolute revisions


At 8:30 am tomorrow (March 10), BLS will release the “Employment Situation” for February. I expect most of the media attention will focus on whether U.S. employment growth accelerated or decelerated in the first month of the Trump administration; Trump has already tweeted about an ADP jobs report that was above expectations. (ADP is a private-sector payroll services provider.) But the February data will be preliminary and subject to revision, so another important question is “how reliable are the first monthly estimates?”

The Employment Situation combines data from two major surveys—a household survey (the “current population survey”) of about 60,000 households, and an establishment survey (the “current employment statistics” survey) of about 147,000 businesses covering about 634,000 worksites (or “establishments”). The household survey is the primary source for information on labor force participation and unemployment, while the establishment survey is the primary source for information on jobs, hours worked, and wages.

Both surveys report information on employment, but the employment concepts differ. The establishment survey counts “jobs” (and some individuals hold more than one job), whereas the household survey counts the employment status of individuals. The establishment survey excludes the self-employed, farm workers, and people working for private households, all of whom are covered by the household survey. But in terms of the reliability of the employment data, the big difference is the sample sizes. The household survey is much smaller (in terms of the number of workers being measured) than the establishment survey, so it is subject to more sampling error. That’s why, for job growth, the media focus exclusively on the establishment (or payroll) number.

The BLS reports information on the reliability of CPS estimates of month-to-month changes based on standard errors of the household data. The table may be difficult to interpret, since it’s set up as a set of significance tests (I don’t think the table actually tells the reader that the null hypothesis being tested is no change for each measure). The column that provides information on the precision of the estimates is labeled “Needed” and can be interpreted as the ± half-width of a 90% confidence interval (about ±1.645 standard errors). For example, according to the preliminary estimate released on February 3, the unemployment rate (measured to two decimal places) increased 0.06 percentage point from December 2016 to January 2017 (though the published unemployment rate for both months rounded to 4.8%). The 90% confidence interval for this change was ±0.17 percentage point. What does this mean in practice? It means we shouldn’t be too concerned if the unemployment rate increases by a tenth, or even two-tenths, of a percentage point in any given month, since such a small change may just represent sampling error.
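To make the arithmetic concrete, here is a minimal Python sketch of how I read the “Needed” column. It uses only the two figures quoted above (a 0.06 point change and a ±0.17 point interval). The function names are my own, and the normal approximation is an assumption on my part rather than a description of BLS's exact procedure.

```python
from statistics import NormalDist


def critical_value(confidence: float) -> float:
    """Two-sided normal critical value, e.g. about 1.645 for 90%."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def exceeds_sampling_error(change: float, half_width: float) -> bool:
    """True if the observed change lies outside +/- half_width around zero."""
    return abs(change) > half_width


# Figures quoted in the text, in percentage points.
observed_change = 0.06  # December 2016 to January 2017
half_width_90 = 0.17    # the "Needed" value for the unemployment rate

z90 = critical_value(0.90)
# Standard error implied by the half-width (assumes a normal approximation).
implied_se = half_width_90 / z90

print(f"z for 90%: {z90:.3f}")
print(f"implied standard error: {implied_se:.3f} percentage point")
print("outside the 90% interval?",
      exceeds_sampling_error(observed_change, half_width_90))
```

The comparison is simply whether the absolute change is larger than the “Needed” value; the January change is well inside the interval, so it cannot be distinguished from no change at the 90% level.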

These measures of reliability show quite large confidence intervals for some measures from the household survey. For example, a 90% confidence interval for the change in the unemployment rate for teenagers is ±1.59 percentage points. For changes in labor force participation, the 90% confidence interval is ±484,000 persons, and for changes in household-based employment, it’s ±482,000 persons. In contrast, the 90% confidence interval for changes in establishment-based employment is about ±120,000 jobs, which, again, is why the payroll estimates are the ones everyone pays attention to.

There aren’t a lot of subsequent revisions to the estimates from the household survey—the seasonal adjustment factors and Census population controls are updated annually, but these revisions are generally small and infrequent.

In contrast, many establishments are not able to provide payroll information in time to be included in the preliminary payroll estimates. For the next two months, BLS continues to collect late payroll reports and add them to the sample, which results in revisions. Most attention focuses on the revisions in the second and third monthly estimates, but in addition, each year the BLS benchmarks the payroll estimates to more comprehensive administrative data. The benchmark data are primarily from the unemployment insurance program (or “quarterly census of employment and wages”), and are supplemented with data from other sources for industries that are not covered by the UI program. The seasonal adjustment factors are also updated during the annual benchmark revisions.

BLS publishes a nice table showing the revisions that have taken place between the first, second, and third monthly estimates. For nonfarm payroll employment from May 2003 through the third estimate for November 2016, the average revision without regard to sign (the “mean absolute revision”) between the first and third monthly estimates was 42,000 jobs (for the seasonally adjusted estimates) and 46,000 jobs (for the not seasonally adjusted estimates). The table doesn’t show the revisions between the first and “latest” estimates (that is, the estimates that include benchmark and seasonal revisions), but I was able to add the latest data to a spreadsheet and do the calculation myself. The mean absolute revisions between the first and latest estimates were 58,000 jobs (seasonally adjusted estimates) and 53,000 jobs (not seasonally adjusted estimates).
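For readers who would rather script the calculation than use a spreadsheet, a mean absolute revision can be sketched in a few lines of Python. The monthly figures below are made-up placeholders, not BLS data, and the function name is my own; the point is only to show why the sign is dropped before averaging.

```python
def mean_absolute_revision(first, later):
    """Average size of the revisions (later minus first), ignoring sign."""
    if len(first) != len(later):
        raise ValueError("both series must cover the same months")
    if not first:
        raise ValueError("need at least one month")
    return sum(abs(l - f) for f, l in zip(first, later)) / len(first)


def mean_revision(first, later):
    """Average signed revision; upward and downward revisions offset."""
    if len(first) != len(later):
        raise ValueError("both series must cover the same months")
    if not first:
        raise ValueError("need at least one month")
    return sum(l - f for f, l in zip(first, later)) / len(first)


# HYPOTHETICAL one-month employment changes, in thousands of jobs.
first_estimates = [150, 200, -50, 175]
third_estimates = [180, 170, -90, 175]

mar = mean_absolute_revision(first_estimates, third_estimates)
mr = mean_revision(first_estimates, third_estimates)
print(f"mean absolute revision: {mar:.1f} thousand jobs")
print(f"mean (signed) revision: {mr:.1f} thousand jobs")
```

With these placeholder numbers the signed average is smaller in magnitude than the absolute average, because upward and downward revisions partly cancel. That is why the absolute version is the more useful gauge of how far a first estimate typically moves.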

Another interesting thing one can observe in the BLS revisions table is that the revisions were much larger during the 2008–09 recession. The largest monthly revision (first to latest) to the seasonally adjusted net one-month employment change was for September 2008, a revision of –291,000 jobs: the first estimate for September 2008 was –159,000 and the latest estimate is –450,000. For the not seasonally adjusted monthly net change, the largest revision (first to latest) was for July 2009, a revision of –264,000 jobs.
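The September 2008 figure is just the latest estimate minus the first estimate. The short Python sketch below reproduces that subtraction and shows one way to pick out the largest revision by absolute size. Only the September 2008 pair comes from the figures above; the other two entries are hypothetical placeholders I added so the search has something to compare against.

```python
# One-month employment change in thousands of jobs: (first, latest).
estimates = {
    "2008-09": (-159, -450),        # figures quoted in the text
    "placeholder-A": (100, 120),    # HYPOTHETICAL, for illustration only
    "placeholder-B": (-50, -20),    # HYPOTHETICAL, for illustration only
}

# Revision is defined here as latest minus first, so a negative value
# means the estimate was revised downward.
revisions = {month: latest - first
             for month, (first, latest) in estimates.items()}

# Largest revision without regard to sign.
largest_month = max(revisions, key=lambda month: abs(revisions[month]))

print(f"September 2008 revision: {revisions['2008-09']} thousand jobs")
print(f"largest revision: {largest_month} "
      f"({revisions[largest_month]} thousand jobs)")
```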

A final note: federal agencies are required to follow the 2002 OMB Information Quality Guidelines, which set standards of best statistical practice, objectivity, and integrity. As discussed in earlier posts, the OMB has issued statistical policy directives and other regulations that require statistical agencies to regularly provide information on the quality and reliability of the data they produce. In this post, I’ve illustrated some of the available information about the BLS jobs report, but users should expect to find similar information about most other federal statistical data. I’ll add that this situation contrasts with most private-sector data that I’ve used. My impression is that it’s really quite rare for private-sector data providers to give detailed information on their methodologies and on the reliability of their data. This is an important reason for supporting the maintenance and improvement of federal statistics.

 
