Labor Market Equity Simulation Detailed Methods


Federal Reserve Community Development Staff


We apply the standard shift-share techniques from a recent Federal Reserve working paper to simulate, for each of the 50 US states and the District of Columbia, conditions in a counterfactual world where racial and gender gaps in the employment-to-population ratio, educational attainment, average hourly earnings, and average hours worked do not exist. (We use “race” as shorthand for “race and ethnicity.” Individuals who identify as Hispanic may be of any race.)

Using the labor market variables listed above, we simulate the labor component of aggregate gross domestic product (GDP) in the economy for each state. (Note that ours is not a general equilibrium model, and is not intended to holistically describe state economies; simulated GDP in our calculation will differ from other GDP measures.) Then, we simulate what the total output could be if gaps were closed. The difference between the two is our main outcome variable: simulated gains to GDP for each state. To simulate closing racial and gender gaps, we set the floor for each labor market measure equal to that of white men (the group that on average has faced the fewest structural barriers to economic opportunity). To simulate closing gender gaps only, we set the floor to the value of same-race men; to simulate closing racial gaps only, we set the floor to the value of whites.
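The floor rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the working paper's implementation: it assumes the labor component of output for a group is the product of population, employment-to-population ratio, average hours, and average hourly earnings; it assumes "the value of whites" means whites of the same gender; and it omits education and age groups entirely. The `Group` class, its field names, and the scenario labels are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Group:
    race: str
    sex: str
    population: float  # working-age people in the cell
    emp_ratio: float   # employment-to-population ratio
    hours: float       # average annual hours per worker
    wage: float        # average hourly earnings


MEASURES = ("emp_ratio", "hours", "wage")


def labor_output(g):
    # Assumed functional form: people x share employed x hours x hourly pay.
    return g.population * g.emp_ratio * g.hours * g.wage


def floor_group(g, groups, scenario):
    """Return the group whose measures serve as the floor for g."""
    if scenario == "both":        # close racial and gender gaps
        key = ("white", "men")
    elif scenario == "gender":    # close gender gaps only
        key = (g.race, "men")
    elif scenario == "race":      # close racial gaps only (same-gender whites, assumed)
        key = ("white", g.sex)
    else:
        raise ValueError(f"unknown scenario: {scenario}")
    return next(x for x in groups if (x.race, x.sex) == key)


def close_gaps(groups, scenario, measures=MEASURES):
    """Raise each measure to the floor; never lower a measure already above it."""
    closed = []
    for g in groups:
        f = floor_group(g, groups, scenario)
        raised = {m: max(getattr(g, m), getattr(f, m)) for m in measures}
        closed.append(replace(g, **raised))
    return closed


def simulated_gain(groups, scenario, measures=MEASURES):
    """Counterfactual labor output minus baseline labor output."""
    baseline = sum(labor_output(g) for g in groups)
    counterfactual = sum(labor_output(g) for g in close_gaps(groups, scenario, measures))
    return counterfactual - baseline
```

Using `max` rather than simple assignment is what makes the reference value a floor: a group whose measure already exceeds the floor group's keeps its own value.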

For groups with real-world labor market measures above those of their floor group, no changes to those measures are made for our simulation. That is, in cases where a labor market measure for a historically marginalized group is in reality higher than for its floor group, our simulation does not result in a GDP gain for that measure. In addition, the total simulated gain to GDP differs from the sum of the GDP gains that come from closing the gaps in each labor market variable. This is due to the interacting effects of these variables on GDP, which our methodology for each individual measure does not capture. Further, because education in our approach acts to increase GDP through average hourly earnings, there exist some cases in which closing gaps in educational attainment does not result in a simulated GDP gain due to persistent gaps in average hourly earnings.
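A small numeric sketch shows why the gains from closing each gap separately need not add up to the gain from closing all gaps together. It assumes, for illustration only, that a group's labor output is the product of population, employment-to-population ratio, hours, and hourly earnings; all input values are made up.

```python
def output(pop, emp_ratio, hours, wage):
    # Assumed multiplicative form for a group's labor output.
    return pop * emp_ratio * hours * wage


POP = 100.0
# Hypothetical measures for one group and for its floor group.
group = {"emp_ratio": 0.70, "hours": 1800.0, "wage": 25.0}
floor = {"emp_ratio": 0.80, "hours": 2000.0, "wage": 30.0}


def gain_from(measures):
    """Gain in output from raising only the named measures to the floor."""
    raised = dict(group)
    for m in measures:
        raised[m] = max(group[m], floor[m])
    return output(POP, **raised) - output(POP, **group)


# Close one gap at a time and add up the gains ...
separate = sum(gain_from([m]) for m in group)
# ... versus closing every gap at once.
joint = gain_from(list(group))
```

With these inputs the one-at-a-time gains sum to about 1,430,000, while closing all three gaps at once yields about 1,650,000. The difference comes from the cross terms: higher hourly earnings are worth more when more people are employed and working more hours.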

To simulate economic gains at the state level, we modify the original methodology as follows: we aggregate the US Census Bureau American Community Survey (ACS) Public Use Microdata Samples 1-year files over 2005 through 2019 and calculate annualized economic gains for each state. We pool data over this period to increase sample sizes and maximize the number of groups we can include (by age, race/ethnicity, education, and state), an approach we judged appropriate given the long-standing structural nature of the disparities. Due to data limitations, educational utilization (the degree to which workers’ jobs utilize their education) and industry-occupation allocation are not included in the analysis.

For the calculation of state-level economic output, we elected to retain the greatest amount of racial/ethnic disaggregation possible to account for the disproportionate impacts of racialized policies and practices over time. We employ mutually exclusive categories of race/ethnicity: Hispanic refers to Hispanic ethnicity of any racial category, and the categories of white, Black, American Indian or Alaska Native, Asian, and all other races (which includes Native Hawaiian or Other Pacific Islander, some other race alone, and two or more races) refer to non-Hispanic racial categories. We use these terms while acknowledging that the terminology used to describe race/ethnicity is evolving and complex and that many people may prefer alternate terminology to identify themselves. We are unable to disaggregate subgroups within the category of Asian race but wish to acknowledge that this heading represents many diverse subgroups who have faced varying levels of economic disadvantage.

For earnings and hours calculations, we limit our sample to employed individuals (both full-time and part-time) and, to obtain more reliable estimates, exclude those in the top and bottom 1% of average hourly earnings for each state. Furthermore, we impose the following sample restrictions: include only individuals aged 25-64; exclude self-employed workers, those working on family farms, and those with missing employment information; exclude individuals in the armed forces; exclude those with zero or negative earnings; and exclude those reporting zero average weekly hours worked in the last 12 months. We define education groups as follows: high school or less, associate degree, bachelor’s degree, and master’s or above. We calculate hours worked (which includes hours for both full- and part-time workers) using continuous values when available. When values are available only in categories (for example, “40-47 weeks”), we use the midpoints of the categories. For real annual earnings, we adjust earnings to 2019 dollars using the CPI-U, making values comparable over time. We account for differences in labor market outcomes by age by dividing our data into 10-year age groups: 25-34, 35-44, 45-54, and 55-64. To calculate the employment-to-population ratio, we divide the number of currently employed people in the civilian labor force by the total working-age population (ages 25-64) of the state.
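Several of these data-preparation steps can be sketched with plain Python. The record layout (dictionaries with `age`, `worker_class`, `earnings`, and `weekly_hours` keys) and the class-of-worker labels are hypothetical stand-ins rather than actual ACS variable names. The trimming step below drops an equal count of records from each tail and ignores survey weights, which is a simplification of whatever percentile rule the analysis actually used.

```python
AGE_GROUPS = ((25, 34), (35, 44), (45, 54), (55, 64))

# Hypothetical labels for worker classes excluded from the sample.
EXCLUDED_CLASSES = {"self-employed", "family farm", "armed forces", None}


def age_group(age):
    """Map an age to its 10-year group label, or None if outside 25-64."""
    for lo, hi in AGE_GROUPS:
        if lo <= age <= hi:
            return f"{lo}-{hi}"
    return None


def category_midpoint(label):
    """Midpoint of a range category, e.g. '40-47 weeks' gives 43.5."""
    lo, hi = label.split()[0].split("-")
    return (float(lo) + float(hi)) / 2


def in_sample(person):
    """Apply the age, worker-class, earnings, and hours restrictions."""
    return (
        age_group(person["age"]) is not None
        and person["worker_class"] not in EXCLUDED_CLASSES
        and person["earnings"] > 0
        and person["weekly_hours"] > 0
    )


def trim_tails(records, key, percent=1):
    """Drop the lowest and highest `percent` of records ranked by `key`.

    Unweighted simplification: removes the same number of records per tail.
    """
    ranked = sorted(records, key=lambda r: r[key])
    k = len(ranked) * percent // 100
    return ranked[k:len(ranked) - k]
```

In the analysis described above, trimming is done separately for each state, so a function like `trim_tails` would be applied to each state's records in turn.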

For state-level summary statistics in the visualization, we do not show data for racial/ethnic and gender groups with fewer than 100 observations. For our simulation calculations, we repeated the analysis excluding groups with fewer than 30 observations, and our estimates did not markedly change. Based on this robustness check, we do not remove these groups from the final analysis.
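The display rule for the visualization amounts to a minimum cell-size filter, sketched below. The record keys (`state`, `race`, `sex`) are hypothetical, and the count here is of unweighted observations, which is an assumption about how "observations" is meant.

```python
from collections import Counter

MIN_DISPLAY_OBS = 100  # display threshold stated in the text


def displayable_cells(records, min_obs=MIN_DISPLAY_OBS):
    """Return the (state, race, sex) cells with at least min_obs observations."""
    counts = Counter((r["state"], r["race"], r["sex"]) for r in records)
    return {cell for cell, n in counts.items() if n >= min_obs}
```

Calling the same function with `min_obs=30` identifies the groups retained in a robustness run like the one described above.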

A limitation of our analysis is that our simulation does not include the Commonwealth of Puerto Rico, the US Virgin Islands, American Samoa, Guam, and the Commonwealth of the Northern Mariana Islands. This is due to limited data availability as well as the differing histories, contexts, and demographics of those places.