Routinization and Covid-19: a comparison between United States and Portugal

This paper identifies the role of automation in increasing wage inequality, comparing the United States and Portugal. Using the PSID and Quadros de Pessoal, we find that labor income dynamics are strongly determined by the variance of the individual fixed component. This effect is drastically reduced by adding information on workers' occupational tasks, confirming that the decreasing price of capital and the consequent replacement of routine manual workers have deepened wage inequality. During the current crisis, we find that the ability to keep working is strongly related to occupation type. We therefore simulate the impact of a permanent demand shock using an overlapping-generations model with incomplete markets and heterogeneous agents to quantitatively predict the impact of Covid-19 and lockdown measures on the wage premium and earnings inequality. We find that wage premia and earnings dispersion increase, suggesting that earnings inequality will increase at the expense of manual workers.


Introduction
Technological progress is considered one of the main drivers behind earnings inequality.
Factor-biased technological change and skill-biased technological change represent two main sources of wage inequality. To this end, we explore empirically the differences between workers in different categories, according to their occupational tasks, to assess how the labor market has been affected by changes in task premia. This paper makes two main contributions to the existing literature. First, we use a rolling window of size 10 to estimate the evolution of the determinants of dispersion in labor income processes and to investigate whether changes in task premia represent a major source of labor income inequality. Second, we implement an overlapping generations model with incomplete markets to study the role of skill-biased technological change in increasing wage inequality and to assess the potential impact of Covid-19 when people's ability to continue working is mostly determined by the type of task they perform. We calibrate the model to match the US and Portuguese economies using 2010 as the benchmark year, and we repeat the exercise targeting different ratios of working hours between cognitive and manual workers in order to simulate the impact of demand-side shocks.
Figure 1 shows the steady rise in wage inequality and wage growth at different percentiles, suggesting that both Portugal and the U.S. experienced wage polarization, in two different time periods. In Portugal, low wages in routine task-intensive occupations, combined with the same price of computer capital, may limit the gains from substituting workers with machines. We separate agents into non-routine and routine, according to how substitutable their abilities are with machines, and into cognitive and manual, depending on the level of skills required to perform daily tasks. In this framework, we expect the wage premium of non-routine workers to increase following the drop in the investment price and the decrease in tax progressivity 1. This mechanism is triggered by a drop in firms' demand for routine labor and by cheaper capital accumulation.
The trends in labor force composition (Figure 2) confirm that Portugal is experiencing patterns of labor market polarization similar to those of the U.S., explained by technological advances such as computerization and automation, which displace routine tasks and complement cognitive tasks. There is a clear increase in the employment share of non-routine cognitive occupations: these workers are complementary to capital and less likely to be substituted by machines.
Both countries show a decrease in routine manual occupations; in Portugal the change is larger, with the share falling from 50% of the labor force in 1987 to 30% in 2017. Routine cognitive occupations remained at approximately the same level in both countries, driven by the increasing importance of the service sector. Non-routine workers, both cognitive and manual, show an upward trend, steeper for cognitive occupations. The differences between the US and Portugal are evident in the composition of the labor force: in the U.S. there is a steady increase in the non-routine cognitive employment share from 30% in 1976 to 40% in 2017, whereas in Portugal the same occupation category increases from 3.5% in 1987 to 20% in 2017. The increase in demand for non-routine occupations confirms that Portugal is experiencing labor market polarization but is lagging behind the United States in the adoption of computer capital. Fonseca et al. (2018) claim that routinization is the main cause of this shift in labor force composition in Portugal 2.
1 ? and Nóbrega (2020)

Literature Review
Autor et al. (2003) first introduced the routinization hypothesis, defined as the decrease in labor input of routine manual tasks and the increase in labor input of non-routine cognitive tasks. Autor et al. (2006) pointed out that the US wage structure widened due to an increase in the demand for skills, driven by skill-biased technical change, and a slowdown in the growth of the relative supply of college workers. ? argues that differences in education are an important source of inequality, and Krusell et al. (2000) found that factor-biased technological change has the strongest impact in determining the increase in wage inequality. Acemoglu & Restrepo (2018) discuss the impact of increasing demand for skilled workers, who are able to perform more abstract tasks, outlining how automation can replace manual tasks in the long run if the rental rate of capital remains lower than wages. Guerreiro et al. (2017) also found that substitutability is higher for routine occupations requiring low skills.
Recent improvements in Artificial Intelligence have brought astonishing changes in different fields and are expected to be even more disruptive in the future. Acemoglu & Restrepo (2018) investigate the trade-off between the displacement effect, i.e. the change caused by the automation of tasks, which reduces the demand for labor, and the overall increase in labor demand triggered by productivity-enhancing technologies. On the other side, the creation of new tasks in which human capital has a comparative advantage relative to machines, the reinstatement effect, may counterbalance the displacement effect. These effects do not grow equally fast, and different economies require different amounts of time to absorb and smooth these processes efficiently. Goos & Manning (2007) argue that the "routinization" hypothesis is the driving factor behind the increase in the highest- and lowest-wage occupations in the United Kingdom since 1975, and Goos et al. (2009) extend the study to Western European countries, explaining job polarization using both routine-biased technological change and offshoring. In the spirit of Fonseca et al. (2018) we replicated Figure 7, which shows that wage inequality is mainly determined by skill levels but, more importantly, that the increase in the minimum wage had a positive impact on the 10th percentile in Portugal, and may have contributed to the observed wage convergence and to the growth in wages for manual workers. For the U.S. we cannot argue the same, as the wage difference is still clearly not affected by the rise in the minimum wage. Krusell et al. (2000) and Karabarbounis & Neiman (2014) argue that the more recent decline in the relative price of investment has been triggered by investment-specific technological change. ? shows that the previously mentioned drop in demand for routine occupations was concurrent with the decrease in the price of information and communication technology capital goods: this drop is responsible for 50% of the drop in the labor share.
2 Workers in the two samples are unlikely to change occupation across the panel, meaning that changes in labor composition are driven by replacement with machines. This can also be checked in transition matrices 18-20 in Appendix B.

Data
To divide workers into categories according to the level of automation of their job, we followed Cortes et al. (2014).

PSID
The Panel Study of Income Dynamics (PSID) is one of the longest-running longitudinal studies: it includes almost 5000 families followed from 1968 to 2017. Data are collected every year from 1968 to 1997 and every two years from 1997 to 2017. All the information collected refers to the previous year. The survey contains information at both the individual and the family level; in this work we focus on individuals. In particular, to define the sample used for the estimation of the labor income processes we followed the approach of Heathcote et al. (2010).
The only difference is that we split households to create a panel of single individuals, and we generate individual characteristics by splitting variables based on household composition. Figure 3 shows that the PSID sample, despite two minor divergences between 1995-1999 and after 2008, is representative of the US labor market 3. The sample consists only of heads and spouses of the families, for whom the greatest level of accuracy in the data is guaranteed. Observations with a wage lower than half of the minimum wage 4 have been dropped, and individuals working less than 260 hours per year have also been dropped from the samples. Table 1 and Table 2 report the two samples that we use for our analysis. For Quadros de Pessoal we followed the approach of Fonseca et al. (2018), adapting their method to Heathcote et al. (2010) for consistency between the two samples.
3 Series for the National Income and Product Accounts have been obtained from the Bureau of Economic Analysis website. The series is obtained as the ratio between National Income from Wages and Salaries and full-time equivalent employees, which include employees on full-time schedules plus the number of employees on part-time schedules converted to a full-time basis.
4 The minimum wage is calculated hourly for the US and monthly for Portugal. Source: Federal Reserve Economic Data (US) and OECD Labour Data (Portugal).

Impact of Covid-19
The current pandemic and the lockdown measures adopted by governments in many countries obliged people to work from home, but many occupations simply cannot be done from home. To link our results to the recent developments in working conditions, we replicate and improve the mapping made by Dingel & Neiman (2020) 5, adapting it to the PSID and Quadros de Pessoal samples in order to define whether occupations can be performed at home or not. For the U.S. we used the same crosswalk between SOC and Census codes made for mapping occupation categories; for Portugal the method is described in detail in the appendix. The teleworking index we use is based on two O-NET surveys covering "work context" and "generalized work activities": if the respondents' job needs to be done outdoors, or requires the use of specific machines for which other facilities are needed, then that occupation cannot be performed at home and receives a teleworking index equal to 0. We also mapped every worker to three other indexes obtained from O-NET surveys: i) exposure to diseases or infections, ii) contact with others and iii) proximity to others 6. [...] to viruses and diseases due to many occupations involved in the health care industry, for example dental hygienists, critical care nurses, hospitalists and respiratory therapists.
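The teleworking rule described above can be illustrated with a minimal sketch. It assumes a binary index (1 unless a disqualifying condition holds), and the condition names are hypothetical labels introduced here, not the actual O-NET item names.

```python
# Hypothetical flags derived from the "work context" and "generalized work
# activities" surveys; the names are illustrative assumptions only.
DISQUALIFYING_CONDITIONS = ("works_outdoors", "needs_on_site_machines")


def teleworking_index(occupation_flags):
    """Return 0 if any disqualifying condition holds for the occupation, else 1."""
    if any(occupation_flags.get(name, False) for name in DISQUALIFYING_CONDITIONS):
        return 0
    return 1


def teleworkable_share(workers):
    """Share of workers whose occupation has a teleworking index equal to 1."""
    if not workers:
        raise ValueError("workers must be non-empty")
    return sum(teleworking_index(w) for w in workers) / len(workers)


if __name__ == "__main__":
    # Two made-up workers: one in an outdoor job, one in a desk job.
    sample = [
        {"works_outdoors": True, "needs_on_site_machines": False},
        {"works_outdoors": False, "needs_on_site_machines": False},
    ]
    print(teleworkable_share(sample))
```

Aggregating the index by sector or by task category, as in the comparison across occupation groups, would only require grouping workers before calling `teleworkable_share`.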
For the U.S. (Figure 9), there is a clear separation between the non-routine cognitive share of each sector and the other categories; this difference in teleworking ability could further increase the demand for non-routine cognitive labor and decrease the demand for manual and routine workers. Furthermore, considering that a large part of the labor force is at the bottom of the teleworking scale, earnings inequality is very likely to increase. The susceptibility index 9 is quite heterogeneous across sectors, for both the U.S. and Portugal.

Estimation of the labor income processes
One of the main contributions of this work is the estimation of the dispersion of the permanent component over time, using the previously described samples from both the PSID and Quadros de Pessoal. We estimate the evolution over time of the dispersion of the permanent and transitory components of labor income processes following Brinca et al. (2016) and Chakraborty et al. (2015). Different characteristics determine the number of efficiency units of labor the individual is endowed with, namely age $j$ plus a set of year dummies $D_t' \xi_i$. The productivity shock $u$ follows an AR(1) process in which $\alpha \sim N(0, \sigma_\alpha^2)$ represents the individual permanent ability and $\epsilon_{i,t} \sim N(0, \sigma_\epsilon^2)$ the idiosyncratic shock to the productivity process. Thanks to this specification, we are able to separate the permanent component from the individual fixed effect and the random noise in the productivity process. This specification outlines the same sources of heterogeneity as Heathcote et al. (2017): (i) the individual fixed effect defines innate individual ability; (ii) the realization of idiosyncratic efficiency shocks determines individual fortune in labor market outcomes; and (iii) the experience of the individual in the labor market 10. We adjust nominal wages for inflation using the CPI inflation series from the OECD with 2015 as the base year. We find that the contribution of the individual fixed component to wage dispersion is increasing over time, as the ratio between the variance of individual ability and the variance of the idiosyncratic shock increases.
9 Obtained as a combination of the previously stated 3 measures of infection riskiness.
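The rolling-window bookkeeping behind this ratio can be sketched as follows. The sketch assumes windows of consecutive survey years shifted one year at a time, and the variance estimates passed to `variance_ratio` are made-up placeholders rather than estimates from either dataset.

```python
def rolling_windows(years, size=10):
    """Return consecutive windows of `size` years, shifted one year at a time."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [years[i:i + size] for i in range(len(years) - size + 1)]


def variance_ratio(sigma2_alpha, sigma2_eps):
    """Ratio of the variance of individual ability to that of the idiosyncratic shock."""
    if sigma2_eps <= 0:
        raise ValueError("sigma2_eps must be positive")
    return sigma2_alpha / sigma2_eps


if __name__ == "__main__":
    windows = rolling_windows(list(range(1987, 2018)), size=10)
    print(len(windows), windows[0][0], windows[0][-1])
    # Placeholder estimates for two windows (illustrative values only).
    print(variance_ratio(0.20, 0.10), variance_ratio(0.30, 0.10))
```

An increasing sequence of ratios across windows corresponds to a growing contribution of the individual fixed component to wage dispersion.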
To assess the impact of skill-biased and factor-biased technological change, we included dummies for the different occupation categories in the wage equation. This result is robust to different specifications. For the US, where the initial sample also includes non-workers, we use the Heckman estimation method of Chakraborty et al. (2015), a two-step approach that controls for selection into the labor market, as described in Heckman (1976) and Heckman (1977). For Portugal, where the dataset includes only workers, we use different sizes of the rolling window as a robustness check. More information on the Heckman selection equation can be found in the Appendix.
This change in the determinants of wage dispersion originates from different dynamics in the U.S. and Portugal. For the U.S. (Tables 6 and 7), the variance of individual ability is increasing over time more than the variance of the residual idiosyncratic shock. This increase, together with a decrease in the persistence of the permanent component and the lower impact of individual experience on wages, is likely to have a large effect on long-run earnings, as suggested by Autor et al. (2006) and ?. When dummies for different tasks are included, the increase in the dispersion of individual ability is much lower, meaning that occupation categories can explain two thirds of the total increase in the relative variance of labor income.
For Portugal (Tables 9 and 10), the same increase in the ratio is driven by different dynamics 11: the dispersion of the noisy component is now decreasing more than the variance of individual ability, and the persistence of the residual increases across years. The impact of individual experience increases, particularly from 2006. When we include dummies in the wage regression these trends do not change, but the dispersion of individual ability decreases in size whereas the variance of the transitory component remains approximately the same. This underlines the role that investment-specific technological change 12 and the drop in the relative price of investment play in explaining increases in wage premia and, consequently, income and earnings inequality.
11 We capture dynamics from 1987 for Portugal, a period for which U.S. estimates are different.

The Model
The model is an incomplete markets economy with overlapping generations of heterogeneous agents and partially uninsurable idiosyncratic risk, which generates both an income and a wealth distribution. Households are differentiated into Cognitive and Manual, according to the level of education required to perform daily tasks.

Demographics
In the economy there are $J$ overlapping generations of households, who start life at age 20 and enter retirement at age 65. After retirement, households face an age-dependent probability of death $\pi(j)$, and when they reach age 100 they die with certainty. Time is discrete and one period is 1 year, so there are 40 model periods of active work life. Population size is considered to be constant over time. We define the age-dependent probability of survival as $w(j) = 1 - \pi(j)$, so that the mass of retired agents of age $j \geq 65$ still alive at any given period is $\Omega_j = \prod_{q=65}^{j-1} w(q)$. Given that there are no annuity markets, a fraction of households leave unintended bequests, denoted by $\Gamma$ (per-household bequest), which are redistributed in a lump-sum manner among the households that are currently alive. Moreover, retired households receive a subsidy $\Psi$ from the government.
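As an illustration of the survival mass, the sketch below reads $\Omega_j$ as the product of $w(q)$ from $q = 65$ to $j - 1$, which is our interpretation of the formula in the text. The constant death probability in the example is a made-up placeholder, not a calibrated value.

```python
def retired_mass(death_prob, retirement_age=65, max_age=100):
    """Return {age: Omega_age}, the mass of retirees still alive at each age.

    `death_prob` maps an age j to pi(j); survival is w(j) = 1 - pi(j).
    Omega at the retirement age is 1 (empty product).
    """
    omega = {}
    alive = 1.0
    for age in range(retirement_age, max_age + 1):
        omega[age] = alive
        alive *= 1.0 - death_prob(age)
    return omega


if __name__ == "__main__":
    # Placeholder death probabilities: 10% per year, certain death at age 100.
    def pi(age):
        return 1.0 if age >= 100 else 0.1

    omega = retired_mass(pi)
    print(omega[65], omega[66], omega[67])
```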
12 Brinca et al. (2019b)

Preferences
Agents' utility is decreasing in work hours $n \in (0, 1]$ and increasing in consumption $c$, and takes a CRRA representation in which $\chi$ is the disutility from work and $\eta$ the Frisch labor elasticity. For retired households, the utility function is extended with the scrap value of the bequest they leave to living generations.

Technology
By means of a linear production technology, intermediate inputs are transformed into consumption and investment goods. A quantity $z^c_t$ of intermediate input is used to produce one unit of the consumption good, which represents the numeraire and is sold to households and the government at price $P^c_t$. In the transformation technology, $z^c_t$ is the quantity of input, paid $p^z_t$, from a representative intermediate goods firm. Assuming a perfectly competitive environment, the final consumption good has a price equal to its marginal cost of production. The investment good $X_t$ uses a transformation technology in which $\xi_t$ is the level of technology used in the production of $X_t$ relative to the final consumption good and $z^x_t(z)$ represents the quantity of input $z$ used to produce the final investment good. Through the zero-profit condition, the price of the investment good can be expressed in terms of $\xi_t$, which can be interpreted as the relative price of the investment good. The production function used in the economy has constant returns to scale and uses capital and labor as inputs, with form $y_t(z) = F(k_t(z), n^C_t(z), n^M_t(z))$. Moreover, $r_t$ is the rental rate of capital and $w^C_t$ and $w^M_t$ are the costs of cognitive and manual labor. We measure aggregate demand $Y_t = C_t + G_t + \xi_t X_t$ in terms of the consumption good.
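One set of expressions consistent with the description of the linear transformation technology is given below. It is a sketch under our own assumptions (aggregation over input varieties $z$ is suppressed, and the exact normalization is not taken from the source), intended only to show why $\xi_t$ reads as the relative price of investment.

```latex
\begin{align}
C_t &= z^c_t, & P^c_t &= p^z_t, \\
X_t &= \frac{z^x_t}{\xi_t}, & P^x_t &= \xi_t\, p^z_t
\quad\Longrightarrow\quad \frac{P^x_t}{P^c_t} = \xi_t .
\end{align}
```

With the consumption good as numeraire, the last ratio is what makes $\xi_t X_t$ the value of investment in units of consumption, matching the aggregate demand expression $Y_t = C_t + G_t + \xi_t X_t$.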
Input choices maximize the firms' profit function, where $A_t$ is total factor productivity, $\rho$ the elasticity of substitution between capital and non-routine labor, $\phi_1$ and $\phi_2$ are factor shares, and $\sigma$ is the elasticity of substitution between the composite of those factors and routine labor. Capital depreciates at rate $\delta$ and $X_t$ represents aggregate gross investment, which together determine the transition equation for capital.
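A nested CES production function of the kind described here can be sketched as follows. The functional form, the placement of the shares $\phi_1$ and $\phi_2$, and the use of $\rho$ and $\sigma$ directly as elasticities are assumptions made for illustration; the law of motion $K_{t+1} = (1 - \delta) K_t + X_t$ is likewise the standard form implied by the definitions rather than an equation recovered from the source.

```python
def ces(x1, x2, share, elasticity):
    """CES aggregate of two positive inputs with weight `share` on x1."""
    if elasticity <= 0 or elasticity == 1:
        raise ValueError("elasticity must be positive and different from 1")
    e = (elasticity - 1.0) / elasticity
    return (share * x1 ** e + (1.0 - share) * x2 ** e) ** (1.0 / e)


def output(k, n_nonroutine, n_routine, tfp, phi1, phi2, rho, sigma):
    """Nested CES: capital and non-routine labor form a composite (elasticity rho),
    which is then combined with routine labor (elasticity sigma)."""
    composite = ces(k, n_nonroutine, phi2, rho)
    return tfp * ces(composite, n_routine, phi1, sigma)


def next_capital(k, x, delta):
    """Assumed law of motion: K' = (1 - delta) * K + X."""
    return (1.0 - delta) * k + x


if __name__ == "__main__":
    # Illustrative parameter values only.
    params = dict(tfp=1.0, phi1=0.5, phi2=0.5, rho=0.7, sigma=1.5)
    y = output(2.0, 1.0, 1.0, **params)
    y_scaled = output(4.0, 2.0, 2.0, **params)
    print(y, y_scaled / y)  # the ratio illustrates constant returns to scale
    print(next_capital(10.0, 1.0, 0.1))
```

Choosing $\rho < 1 < \sigma$ in the example makes capital complementary to non-routine labor and substitutable with routine labor, which is the mechanism the text relies on.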

Government
The government manages the social security system, balancing tax rates for employees and employers, defined respectively by $\tau_{ss}$ and $\tilde{\tau}_{ss}$, and benefits paid to retirees $\Psi$. Expenditures on pure public consumption goods $G_t$, interest payments on the national debt $rB_t$ and the lump-sum redistribution $g_t$ are assumed to be separable in the utility function and are financed by the government through taxes on consumption ($\tau_c$), labor ($\tau_l$) and capital ($\tau_k$) income. The government uses flat rates for $\tau_c$ and $\tau_k$, whereas the labor income tax follows a non-linear functional form as in Bénabou & Tirole (2002) and Heathcote et al. (2020), $y_a = \theta_0 y^{1-\theta_1}$, where $y$ is pre-tax labor income, $y_a$ is after-tax labor income, and $\theta_0$ and $\theta_1$ represent respectively the level and progressivity of the tax schedule. The government budget constraint involves $R^{ss}_t$, the social security revenues, and $T_t$, the other tax revenues.

Asset Structure
The economy has two types of assets, capital ($k$) and government bonds ($b$). The relative price of the equipment good is constant, as there is no investment-specific technological change in the steady state, i.e. $\xi = \xi'$. Moreover, the return rate on the bond must satisfy the non-arbitrage condition, which ensures that investing in capital has the same return as investing in bonds. Using the non-arbitrage condition, the expression for the state variable observed by the consumer when taking decisions can be rewritten.

Household Problem
In every period the agent is endowed with certain characteristics: age $j$, asset position $h$, time discount factor $\beta \in \{\beta_1, \beta_2\}$, permanent ability $\alpha$, a persistent idiosyncratic productivity shock $u$ and, according to skill level, a labor variety supplied that is constant over time, $s \in \{C, M\}$. Consumption $c$, hours worked $n^C$ and $n^M$, and future asset holdings $h'$ are the control variables of the optimization process. Each household is subject to a budget constraint which, combined with equations 20 and 21, can be written in terms of $Y^N$, the labor income of the household after deductions. The household problem then assumes a recursive form. When the household retires, the optimization problem is characterized by the age-dependent probability of dying $\pi(j)$, retirement benefits and the bequest motive 13 $D(h')$.

Calibration
The benchmark calibration of the model matches the US and Portuguese economies in 2010.
The exogenous parameters are set to match the data, while the endogenous parameters are estimated through the simulated method of moments (SMM).

Preferences
The Frisch elasticity parameter follows Brinca et al. (2016) and is set to 1.0, the same value as the risk aversion parameter.

Taxes and Social Security
We use the previously described labor income tax function proposed by Bénabou & Tirole (2002) for both the US and Portugal, and we estimate the tax level and progressivity parameters, respectively $\theta_0$ and $\theta_1$, using labor income tax data provided by the OECD. We then compute the weighted average over the population of $\theta_0$ and $\theta_1$ for different individuals, depending on whether they are single or married and on the number of children. Social security parameters, $\tau_{ss}$ and $\tilde{\tau}_{ss}$, are estimated from OECD Tax Data, and $\tau_c$ and $\tau_k$ are taken from Trabandt & Uhlig (2011).
13 Scrap value of the dynamic problem introduced by ?

Parameters calibrated using SMM
We use the simulated method of moments to calibrate parameters that do not have an empirical counterpart. This method is used to estimate $\psi$, $\beta_1$, $\beta_2$, $\beta_3$, $\beta_4$, $h$, $\chi$, $T_C$, $T_M$, $\sigma_C$ and $\sigma_M$ by minimizing a loss function between moments from the model and moments observed in the data. The targeted moments are 75-100/all, $n^C$, $n^M$, $K/Y$, $w^C/w^M$, $\sigma_{\ln(w);C}$, $\sigma_{\ln(w);M}$, $Q_{20}$, $Q_{40}$, $Q_{60}$ and $Q_{80}$.
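A minimal sketch of such a loss function is shown below. The sum of squared relative deviations is an assumed choice of distance (the text does not specify the norm), and the moment values in the example are placeholders.

```python
def smm_loss(model_moments, data_moments):
    """Sum of squared relative deviations between model and data moments.

    Both arguments map a moment name to its value; data moments must be nonzero.
    """
    if model_moments.keys() != data_moments.keys():
        raise ValueError("model and data must report the same moments")
    return sum(
        ((model_moments[name] - data_moments[name]) / data_moments[name]) ** 2
        for name in data_moments
    )


if __name__ == "__main__":
    # Placeholder moments, for illustration only.
    data = {"K/Y": 3.0, "wC/wM": 1.5}
    model = {"K/Y": 3.3, "wC/wM": 1.5}
    print(smm_loss(model, data))
```

In a full calibration this function would be wrapped in a routine that solves the model for a candidate parameter vector and then passed to a numerical minimizer.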

Quantitative Results
Our main experiment consists in estimating how wage and earnings inequality change following the demand shocks caused by the pandemic outbreak. We argue that demand for many jobs that cannot be performed from home, such as occupations in the hospitality and leisure services sector, will drop in the long run. Brinca et al. (2020) separate demand and supply shocks, finding evidence of a predominant negative supply shock in the short run and a correlation between both demand and supply shocks and the teleworking ability of occupations. In this context, we estimate the impact of the COVID-19 outbreak by applying the drop in working hours, aggregating the drop in demand for each sector and weighting occupations by teleworking ability, as we expect firms to adapt to the new social distancing norms. We find a large decrease in monthly hours worked for manual workers in almost every sector and a modest drop in hours worked by cognitive workers. For structural reasons, Quadros de Pessoal gives a better representation of the effects on the whole labor market, as it includes employees from every industry, whereas the PSID includes only a panel of selected families and so does not entirely capture the heterogeneity of demand shocks.
Aggregating the results, we find that for Portugal the share of cognitive workers increases from 47.2% to 93.1% of the labor force, whereas the share of manual workers decreases to 6.8% from the pre-Covid 52.7%. For the U.S. the impact has a similar magnitude, going from 48.9% to 88.1% for cognitive workers and from 51.07% to 11.9% for manual workers. The effects in the short run 14 are quite strong, although we expect that once the restriction measures are relaxed the shock will be smoother and that, in the long run, many occupations will be adapted so that they can be performed from home. This will reduce the overall impact on hours worked, but many manual occupations may be permanently replaced. The objective of this experiment is to study the heterogeneous impact of Covid-19 on cognitive and manual workers. To do so, we assume that only 20% of the observed demand shock will be permanent 15, so the demand shock will be -15.6% for the U.S. and -17.4% for Portugal, and the share of hours worked by manual workers will drop to 43.1% and 43.5% respectively. Recalibrating the model to match the decrease in working hours for manual workers, we find that the wage premium between cognitive and manual workers increases from the initially observed 0.518 to 1.83 for the U.S. and from 0.624 to 2.19 for Portugal, and the variance of log-earnings increases from 0.63 to 1.81 for the U.S. and from 0.44 to 1.49 for Portugal. The U.S. is characterized by higher inequality within the same occupation-task group but is more advanced in the adoption of technological capital and has a higher share of skilled human capital. Portugal's delay in using new technologies will foster a higher demand for cognitive-task occupations, which, in turn, will raise the wage premium for cognitive workers.
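The long-run manual shares quoted above appear to follow from applying the permanent shock proportionally to the pre-Covid manual share. This reading is our assumption, checked here only against the figures reported in the text.

```python
def permanent_manual_share(pre_share, permanent_shock):
    """Apply a permanent proportional shock (in percent) to a pre-Covid share (in percent)."""
    return pre_share * (1.0 + permanent_shock / 100.0)


if __name__ == "__main__":
    # Pre-Covid manual shares and permanent shocks as reported in the text.
    us = permanent_manual_share(51.07, -15.6)
    pt = permanent_manual_share(52.7, -17.4)
    print(round(us, 1), round(pt, 1))
```

Under this reading the sketch returns shares that round to the 43.1% and 43.5% reported for the U.S. and Portugal.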

Conclusions
In this paper we study the role of task complementarity in explaining an important component of earnings inequality, namely the task wage premia. As the relative price of capital drops, workers whose tasks are complementary 16 with capital tend to see an increase in demand, whereas workers whose main tasks are substitutable 17 see a drop. Empirical findings show that Portugal is experiencing the same labor market trends but is still lagging behind the U.S. due to the lower supply of skilled human capital, which slows down the adoption of computer capital. We estimate income processes for the US and Portugal, based on the PSID and Quadros de Pessoal respectively, and find that in both instances the part of the variance of wages that is explained by permanent differences across individuals, relative to transitory shocks, is increasing over time. Under the assumption that workers tend to stay in the same task-type occupations over their life course, the impact of changes in the relative demand for routine vs non-routine types of work on wage premia is captured mainly through individual fixed effects. When we include dummies for the worker's type of occupation, we can explain about two thirds of the total increase in the relative variance of earnings for the US and about 30% of the same increase for Portugal in the overall sample. This stresses the role that investment-specific technological change and the drop in the relative price of investment play in explaining increases in wage premia and, consequently, income and earnings inequality. The recent Covid-19 pandemic is also likely to have an impact on earnings inequality, as low-wage manual and routine workers are being disproportionately affected, since their tasks typically involve physical contact and cannot be performed from home. In order to study the impact that social distancing may have on inequality in the future, we simulate a permanent change in the demand for workers in those occupations. We study these counterfactuals in a structural model and find that the wage premium and the variance of log-earnings increase significantly for both the US and Portugal, even if only a fifth of the observed drop in the relative demand for manual workers persists in the long run. This relative drop in demand is justified by the fact that manual workers tend to be over-represented in jobs that are most affected by social distancing policies and less doable from home.
14 Figure 6
15 Calculated on the shock estimated from data.
16 In our taxonomy, workers who perform mostly non-routine tasks involving cognitive work.
17 Workers who perform mostly routine tasks involving manual work.
In future work, we want to study the effects of the pandemic on wage and earnings inequality from the supply side and divide workers according to the four categories initially used in the empirical analysis. This would allow us to capture entirely the heterogeneous effects of demand and supply shocks on different categories of workers.
Figure 6: Decomposition of demand shocks between sectors in April 2020.

Heckman correction on returns to experiences and shocks processes
We use Heckman's selection model to control for selection bias only for the PSID, as it contains information on non-workers, through a two-step statistical approach that corrects for the non-randomly selected sample. The first step consists in estimating the probability of entering the labor force through a selection equation in which $Z$ includes education, age, marital status and number of children. As we are using a rolling window to capture the dynamics of the income process, time dummies for the specific window are used together with an interaction term between education and age. From these estimates the inverse Mills ratio, $\lambda_i$, is stored for each observation ($\lambda_i = \phi(\cdot)/\Phi(\cdot)$ evaluated at the fitted selection index, with $\phi$ being the normal density and $\Phi$ the normal CDF), and we use it to obtain a consistent estimate of the conditional expectation of the log wage. $u_{it}$ is then modelled as an AR(1) with panel data to separate the individual fixed effect from the permanent and the idiosyncratic components.
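The inverse Mills ratio used in the second step can be computed as in the following sketch, which relies only on the standard library. The argument `index` stands for the fitted value of the selection equation for one observation; how that index is estimated (the probit step) is outside this sketch.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(index):
    """Ratio of the normal density to the normal CDF at the selection index.

    For very negative indexes the CDF underflows to zero, so a production
    implementation would need a numerically safer formula in that region.
    """
    return normal_pdf(index) / normal_cdf(index)


if __name__ == "__main__":
    for z in (-1.0, 0.0, 1.0):
        print(z, inverse_mills_ratio(z))
```

The ratio is decreasing in the index: observations that are very likely to participate receive a small correction term in the wage equation.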

Stationary Recursive Competitive Equilibrium
An agent with characteristics $(j, h, \beta, a, u)$ has measure $\Phi(j, h, \beta, a, u)$. We define the recursive competitive equilibrium in the following way:
1. The household's optimization problem is solved dynamically through the value function $V(j, h, \beta, a, u)$ and the policy functions $c(j, h, \beta, a, u)$, $h'(j, h, \beta, a, u)$ and $n(j, h, \beta, a, u)$, given factor prices and initial conditions.
6. The assets of the deceased at the beginning of the period are uniformly distributed among the living: $\Gamma \int w(j)\, d\Phi = \int (1 - w(j))\, h\, d\Phi$.

After having created a full correspondence between the three codes, we defined a multiple dictionary that maps every ISCO-08 code to multiple Census values. The approach we followed here is based on Dingel & Neiman (2020).
19 Fonseca et al. (2018) matching is at the 2-digit level.
20 We use the official crosswalk documents from the Bureau of Labor Statistics. Some official crosswalks have been used in combination with files available on David Autor's website.
21 For multiple matches, we used the first occurrence in the list, manually checking its consistency.
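The multiple dictionary and the first-occurrence rule can be sketched as follows. The code pairs in the example are hypothetical placeholders, not entries from the actual crosswalk files, and the manual consistency check is not represented.

```python
from collections import defaultdict


def build_crosswalk(pairs):
    """Map each ISCO-08 code to the ordered list of Census codes it matches."""
    mapping = defaultdict(list)
    for isco_code, census_code in pairs:
        if census_code not in mapping[isco_code]:
            mapping[isco_code].append(census_code)
    return dict(mapping)


def first_match(mapping, isco_code):
    """Return the first Census code listed for an ISCO-08 code, or None if unmatched."""
    matches = mapping.get(isco_code)
    return matches[0] if matches else None


if __name__ == "__main__":
    # Hypothetical (ISCO-08, Census) pairs in crosswalk order.
    pairs = [("ISCO_A", "CENSUS_1"), ("ISCO_A", "CENSUS_2"), ("ISCO_B", "CENSUS_3")]
    crosswalk = build_crosswalk(pairs)
    print(crosswalk)
    print(first_match(crosswalk, "ISCO_A"), first_match(crosswalk, "ISCO_Z"))
```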

Tax Function
Given the tax function $y_a = \theta_0 y^{1-\theta_1}$ which we employ, the average tax rate $\tau(y)$ is defined by $y_a = (1 - \tau(y))\, y$; thus $\theta_0 y^{1-\theta_1} = (1 - \tau(y))\, y$, which implies $1 - \tau(y) = \theta_0 y^{-\theta_1}$. In this way, the tax wedge for any two incomes $(y_1, y_2)$ is given by $1 - \frac{1 - \tau(y_2)}{1 - \tau(y_1)} = 1 - \left(\frac{y_2}{y_1}\right)^{-\theta_1}$ and is therefore independent of the scaling parameter $\theta_0$. In this manner, one can raise average taxes by lowering $\theta_0$ without changing the progressivity of the tax code, since progressivity is uniquely determined by the parameter $\theta_1$.
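The independence of the tax wedge from $\theta_0$ can be checked numerically with a short sketch. The wedge is defined here as $1 - (1 - \tau(y_2))/(1 - \tau(y_1))$, which is our assumed definition, and the parameter values are illustrative only.

```python
def after_tax_income(y, theta0, theta1):
    """After-tax labor income under y_a = theta0 * y ** (1 - theta1)."""
    return theta0 * y ** (1.0 - theta1)


def average_tax_rate(y, theta0, theta1):
    """Average tax rate implied by y_a = (1 - tau(y)) * y."""
    return 1.0 - theta0 * y ** (-theta1)


def tax_wedge(y1, y2, theta0, theta1):
    """Assumed wedge definition: 1 - (1 - tau(y2)) / (1 - tau(y1))."""
    keep1 = 1.0 - average_tax_rate(y1, theta0, theta1)
    keep2 = 1.0 - average_tax_rate(y2, theta0, theta1)
    return 1.0 - keep2 / keep1


if __name__ == "__main__":
    # Illustrative values: same progressivity, two different levels.
    for theta0 in (0.9, 0.7):
        print(theta0, average_tax_rate(1.0, theta0, 0.15), tax_wedge(1.0, 2.0, theta0, 0.15))
```

Lowering `theta0` raises the average tax rate at every income while leaving the wedge between the two incomes unchanged, which is the property used to separate level from progressivity.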

Information on O-NET Surveys
Exposure to diseases or infections
This survey is based on the question "How often does this job require exposure to disease/infections?"

Contact with others
This survey is based on the question "How much does this job require the worker to be in contact with others (face-to-face, by telephone, or otherwise) in order to perform it?"

Characteristics of PSID "Head" and "Spouse"
For each family, the head is the person with the most financial responsibility in the household unit, and must be at least 16 years old. The head can also be female, unless she is married and her husband is present in the family unit, or she has a boyfriend who has been living with her for at least one year; in those cases he is designated as head. When the head of a family dies, becomes incapacitated, or simply moves out, a new head is selected for the subsequent surveys. Also, if the family splits, a new family unit is created with its own new head.
Heads are identified in the panel by sequence number 1, meaning that they are the reference person in the household, in combination with the variable "Relation to Head" equal to 1 before the 1983 survey wave and 10 after. Spouses have sequence number 2 and relation to head 2 before 1983 and 20 or 22 after (the latter indicates female cohabitors who have lived with the Head for 12 months or more, or who were mover-out nonresponse by the time of the interview).

File structure and data quality of the PSID
Data have been retrieved from the PSID website, where both family-level and individual-level series have been used to import or generate time-consistent series for different variables. Information from household variables has been disentangled to match only the individual to whom it referred, and almost all the variables used are from this source. The only variables imported from individual-level data were "Relation to Head" and "Interview Number 1968".

Variable Definitions
Most of the series contained in the family-level data are consistent and can be used directly; however, some of them have changed over the years, and in these cases specific amendments have to be made. A description of all the modified variables follows:
− Education: Total grades completed by the individual at the time of the interview. Before 1984 a single variable included all types of education, regardless of whether it was college or high school; after that the series has missing years and restarts only after 10 years. To overcome this issue we used the combination of two other series specifying, respectively, the years of education before college and the years of college completed.
− Wage and Income from Labor - Head: Total income from wages and salaries plus overtime, bonuses, commissions and other job-related income. These are reported together until 1993; after that, all non-wage sources of income are split into different series.
− Wage and Income from Labor - Spouse: Total income from labor. In 1984 the labor-asset part of any income from farming, business, market gardening, or roomers and boarders was added to the series. The respective series with these amounts have been used to isolate income from labor only.
− Sex of Spouse: This variable has been imputed using a combination of Sex of Head, Relation to Head and sequence number.