A Unified Framework for Examining the Effect of Retirement on Cognitive Performance
Preliminary Draft

János K. Divényi †
Central European University

July 1, 2015

Abstract

All papers dealing with the effect of retirement on old age cognitive performance try to sell their strategy as the single best one. I take a different route: instead of throwing out all previous work, I put the pieces together to see the broader picture. I draw attention to potential biases and assess their magnitude. Then I build a new approach on the lessons learnt from these studies and, utilizing the panel structure of the Survey of Health, Ageing and Retirement in Europe (SHARE), I show that if retirement has any adverse effect on cognitive performance, it should be very small in magnitude.

This paper improves upon my Master's thesis at Central European University. I thank Gábor Kézdi for advice, and Zsombor Cseres-Gergely, Gábor Rappai and participants of the Hungarian Economics Association/Pécs University of Sciences, Faculty of Business and Economics’ 2013 summer workshop for doctoral students and of the Venice International University Summer Institute on Ageing for helpful discussion and comments. The analysis was carried out in Stata with the help of the ivreg2 package of Baum et al. (2014), and the tables were produced by the estout package of Jann (2005, 2007). The corresponding code and the current version of the paper can be found on GitHub. First draft: May 24, 2012.
† divenyi [email protected]

1 Introduction [1]

In developed countries, increased life expectancy, together with the parallel decline in the average retirement age, has lengthened the average spell of retirement in recent decades (e.g. from 10.5 years in 1970 to 19.8 years in 2007 in Germany [2]). Even though eligibility ages have been raised recently, people often spend 15-20 years of their lives as pensioners, which makes this phase of life more and more relevant. Beyond the individual level, the period of retirement is of growing importance at the social level as well, because the proportion of retirees is increasing in the aging population. As a natural consequence, various fields of research have begun to deal with the quality of life of retirees. Within this agenda, one particular aspect, namely the cognitive performance of old age individuals, has captured the attention of economists, as it strongly influences the consumption and saving decisions these individuals make, which affect the functioning of the economy to an increasing extent. Therefore, the age profile of cognitive abilities at the later stages of life is fundamental for many fields from marketing to pension and health policy.

It has been widely documented that individual cognitive performance tends to decline in older ages. According to Schaie (1989), cognitive abilities are relatively stable until the age of 50 but begin to decline afterwards. However, there is large heterogeneity in the progress of cognitive decay, raising the natural question of what drives the decay and whether there is a way to decelerate it in order to maintain cognitive abilities as long as possible. A popular hypothesis, often called the use-it-or-lose-it hypothesis (see for example Rohwedder and Willis, 2010), suggests that the natural decay of cognitive abilities in older ages can be mitigated by intellectually engaging activities.
Thus, retirement, which goes together with the ceasing of cognitively demanding tasks at work, might accelerate the natural declining process, having a causal effect on cognition. In this respect, the notion of retirement simply refers to not working, and thus incorporates a broader definition than usual (for example, people on disability benefit or who are unemployed could also be regarded as retirees).

Many papers have recently investigated whether retirement has a causal effect on cognitive abilities in developed countries, yet the results they have delivered are ambiguous. The inconclusive results are most likely due to the difficulty of identification and the resulting variety in identification strategies.

The main problem is the endogeneity of retirement: a simple comparison of the cognitive abilities of retirees and employees is likely to lead to biased estimates, as retirees and employees can hardly be considered as randomly assigned to their groups. One can conveniently argue that the decay of cognitive abilities may induce the individual to retire, that is, there is reverse causality going from cognition to retirement, which may result in overestimating the retirement effect on cognition in a simple comparison, even if we control for age. The other part of the confusion might come from the difference in the time horizon for which the effect is identified (the short run effect is likely to differ from the long run one).

In this paper I apply a novel identification strategy which aims to handle the problems from which the current literature suffers. I make use of the first, second and fourth waves of the Survey of Health, Ageing and Retirement in Europe (SHARE), [3] which collects rich multidisciplinary data about the socio-economic status, health (including cognitive functioning), and other relevant characteristics (like social networks) of people aged 50 or over across 10 developed European countries. I identify the yearly effect of retirement on a long run panel by applying a difference-in-differences approach. Besides accounting for time-invariant individual heterogeneity and controlling for past labor market status, I also handle possible endogeneity by using public policy rules as instrumental variables. To my knowledge, this paper is the first to make use of a large longitudinal cross-country sample to go after this effect. Contrary to previous findings, my results suggest that retirement does not seem to harm cognition.

[1] I may need to reconsider this.
[2] Allianz Demographic Pulse, March 2011.
[3] This paper uses data from SHARE wave 4 release 1.1.1, as of March 28th 2013 (DOI: 10.6103/SHARE.w4.111) or SHARE wave 1 and 2 release 2.6.0, as of November 29 2013 (DOI: 10.6103/SHARE.w1.260 and 10.6103/SHARE.w2.260) or SHARELIFE release 1, as of November 24th 2010 (DOI: 10.6103/SHARE.w3.100). The SHARE data collection has been primarily funded by the European Commission through the 5th Framework Programme (project QLK6-CT-2001-00360 in the thematic programme Quality of Life), through the 6th Framework Programme (projects SHARE-I3, RII-CT-2006-062193, COMPARE, CIT5-CT-2005-028857, and SHARELIFE, CIT4-CT-2006-028812) and through the 7th Framework Programme (SHARE-PREP, No 211909, SHARE-LEAP, No 227822 and SHARE M4, No 261982). Additional funding from the U.S. National Institute on Aging (U01 AG09740-13S2, P01 AG005842, P01 AG08291, P30 AG12815, R21 AG025169, Y1-AG-4553-01, IAG BSR06-11 and OGHA 04-064) and the German Ministry of Education and Research as well as from various national sources is gratefully acknowledged (see for a full list of funding institutions).

2 Model

A general way to model parametrically the relationship between cognitive performance and retirement is the following:

$$CP_i = \alpha + f(YR_i; \beta) + u_i \qquad (1)$$
$$u_i = X_i'\gamma + \varepsilon_i \qquad (2)$$

where $CP_i$ denotes the cognitive performance of individual $i$ and $YR_i$ is the number of years the individual has spent in retirement (i.e. not working). I allow cognitive performance to depend upon these years through an arbitrary function $f$, where $\beta$ expresses the marginal effect of one more year spent in retirement. The term $u_i$ contains all factors associated with $CP_i$ other than $YR_i$, such as age. Equation (2) makes this dependency explicit, where $X_i$ is the vector containing these factors.

Clearly, $E[CP_i \mid YR_i] = \alpha + f(YR_i; \beta) + E[u_i \mid YR_i]$. Assuming that we know $f$ and have a good measure for $CP_i$, the parameter of interest ($\beta$) can be consistently estimated if $E[u_i \mid YR_i] = 0$. However, this is hardly the case. There are two sources which make the exogeneity assumption dubious: (1) omitted variable bias and (2) reverse causality.

Omitted variable bias. There are many factors which are associated with both cognitive performance and the years spent in retirement. These are the factors in $X_i$ which are correlated with $YR_i$. The most obvious candidate is age: older individuals are expected to have spent more years in retirement and they also have worse cognitive skills due to age-related decline. Education also belongs to $X_i$: less educated individuals retire earlier and they also have worse cognition. One should take care of these factors when estimating the effect of retirement on cognitive performance. The main challenge here is that we do not know exactly which factors are in $X_i$.

Reverse causality. One can conveniently argue that the decay of cognitive abilities may induce the individual to retire, that is, there is reverse causality going from cognition to retirement. That may result in overestimating the retirement effect on cognition in a simple comparison, even if we control for all factors in $X_i$.

Most attempts to uncover $\beta$ apply instrumental variables, as they might be able to eliminate both of the problems. Good instrumental variables (let us denote them by the vector $Z_i$) satisfy two requirements: first, they are correlated with the possibly endogenous variable ($Cov(Z_i, YR_i) \neq 0$), and second, they are related to cognitive performance only through years of retirement ($E[u_i \mid Z_i] = 0$).
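The omitted variable problem and the instrumental variable remedy described above can be illustrated with a small simulation. The sketch below is not part of the paper's analysis: the data-generating process, the coefficient values and the names `ols_slope`, `iv_slope` and `simulate` are assumptions chosen for illustration, with a single unobserved factor standing in for $X_i$ and a binary instrument standing in for $Z_i$. Reverse causality is not modelled.

```python
import numpy as np


def ols_slope(y, x):
    """Slope from a simple regression of y on x (with intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


def iv_slope(y, x, z):
    """IV (Wald) estimator with one instrument: Cov(z, y) / Cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


def simulate(n=200_000, beta=-0.05, seed=0):
    """Return (OLS, IV) estimates of beta from one simulated sample."""
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, n).astype(float)        # hypothetical eligibility dummy
    x = rng.normal(size=n)                         # unobserved factor in X_i
    yr = 2.0 * z + x + rng.normal(size=n)          # years in retirement (stylized)
    cp = 1.0 + beta * yr - x + rng.normal(size=n)  # cognitive performance
    return ols_slope(cp, yr), iv_slope(cp, yr, z)


ols, iv = simulate()
print(f"true beta = -0.050, OLS = {ols:.3f}, IV = {iv:.3f}")
```

Because the unobserved factor raises years in retirement and lowers cognition in this assumed setup, OLS overstates the adverse effect, while the IV estimate should stay close to the true value.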
If these two assumptions hold, both omitted variable bias and reverse causality are overcome.

3 Data

Most papers which are after the effect of interest use the same sources of data provided by three large longitudinal surveys: the Health and Retirement Study (HRS), the English Longitudinal Survey of Ageing (ELSA) and the Survey of Health, Ageing and Retirement in Europe (SHARE).

Aiming to provide multidisciplinary data about ageing, the United States launched the Health and Retirement Study (HRS) in 1992, and since then the study has collected detailed information about the socio-economic status, health (including cognitive functioning), and other relevant characteristics (like social networks) of people aged 50 or over. Respondents of the survey are visited every two years and given in-depth interviews to collect rich panel micro data about the aging population. The English Longitudinal Survey of Ageing (ELSA) was designed according to the HRS, with its first wave launched in 2002. Two years later Continental Europe also decided to set up an aging database by establishing the Survey of Health, Ageing and Retirement in Europe (SHARE), a cross-nationally comparable panel database of micro data. SHARE started with 12 countries (Austria, Belgium, Denmark, France, Germany, Greece, Israel, Italy, the Netherlands, Spain, Sweden and Switzerland) in 2004 with wave 1, three countries (the Czech Republic, Ireland and Poland) joined in wave 2, and another four countries (Estonia, Hungary, Portugal and Slovenia) joined in wave 4. The three surveys (HRS, ELSA and SHARE) are carefully harmonized, and thus provide an excellent basis for cross-country investigation of the aging population in developed countries.

What makes the surveys appropriate for this particular analysis is that they include a battery of tests of cognitive abilities (memory, verbal fluency and numeracy). The test of memory is done as follows: 10 simple words are read out by the interviewer and the respondent should recall them once immediately after hearing them and then again at the end of the cognitive functioning module. As a result, both immediate recall and delayed recall scores range from 0 to 10. Often, the two variables are merged into a composite one by adding them up, which is called total word recall. Verbal fluency is tested by asking the respondent to name as many distinct animals as she can within one minute. The length of this list provides a measure of verbal fluency. SHARE also includes several questions about individual numeracy skills. Respondents who answer the first one correctly get a more difficult one, while those who fail get an easier one. The last question requires the respondent to calculate compound interest. The number of correct answers to these questions provides an objective measure of numeracy ranging from 0 to 4.
Finally, there is a test of orientation consisting of four questions, which examines whether the respondent is aware of the date of the interview (day, month, year) and the day of the week. This test may be used to detect individuals with serious cognitive problems or advanced dementia.

Different measures of cognitive skills might capture different aspects of cognition, as argued in Mazzonna and Peracchi (2012). As most of the papers use the results of the memory tests, I also focus on that measure for comparison purposes. To have a common unit, I use standardized scores, so that effects are expressed in standard deviations.

Throughout the paper I make use of the first, second and fourth waves of SHARE. The third wave of data collection (SHARELIFE) is omitted, as it is of a different nature: it focuses on people’s life histories instead of current characteristics.

4 Replications

In this section I replicate the main results of the literature, namely those of Rohwedder and Willis (2010), Mazzonna and Peracchi (2012), Coe et al. (2012) and Bonsang et al. (2012). I put all of these results in my unified framework and show that their differing conclusions actually fit in the broader picture. The ambiguity of their results stems from the differences in their identification strategies, which imply that their estimated “effects” of retirement on cognitive performance measure different kinds of things.

The papers differ in three crucial aspects: first, their assumption about how retirement should affect cognitive performance (i.e. their assumption for $f$); second, which factors from $X_i$ they control for; and third, their choice of instrumental variable. Of course, they also differ in the data they use for estimation, but considering the goal of uncovering a general relationship this does not really matter (as long as the measurements are comparable across the datasets).

The structural equation the papers try to estimate can be summarized as follows:

$$CS_i = \alpha + f(YR_i; \beta) + X_i'\gamma + \tilde{u}_i \qquad (3)$$
$$\tilde{u}_i = \tilde{X}_i'\tilde{\gamma} + \varepsilon_i \qquad (4)$$

where $CS_i$ is a cognitive score, a measurement of cognitive performance. This formulation helps to differentiate between factors which are controlled for ($X_i$) and factors which remain in the error term ($\tilde{X}_i$). To get a clear causal effect, equation (3) is estimated by a 2SLS procedure where the first stage is

$$YR_i = Z_i'\pi + X_i'\rho + \nu_i \qquad (5)$$

For now, let us assume that the cognitive measurements detailed in the previous section describe the actual cognitive skills well. To be more precise, I assume that $CP_i = CS_i + e_i$ where $e_i$ is a classical measurement error in the dependent variable, i.e. $Cov(e_i, CS_i) = Cov(e_i, YR_i) = 0$. In this case our estimators remain consistent, although less precise. Later on I will relax this assumption to assess how crucial it is.

All of the papers use various public policy rules to instrument retirement (such as pension eligibility rules). Such rules are good candidates for instruments as they vary across country and gender and are strongly correlated with employment status.
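Equations (3) and (5), together with the measurement error assumption, can be sketched as a small Monte Carlo exercise. Everything in the block is an assumption made for illustration rather than the paper's Stata code: the function names, the simulated data and the parameter values are hypothetical, $f$ is taken to be linear, and the noise is added to the true score independently of $YR_i$ and $Z_i$. The point estimate is computed by running the two stages by hand, which reproduces the 2SLS coefficient but not its standard errors.

```python
import numpy as np


def two_sls(y, endog, instrument, control):
    """2SLS coefficient on `endog`, with a constant and one control."""
    n = len(y)
    exog = np.column_stack([np.ones(n), control])
    # First stage, as in equation (5): endog on instrument and controls.
    first = np.column_stack([instrument, exog])
    pi, *_ = np.linalg.lstsq(first, endog, rcond=None)
    endog_hat = first @ pi
    # Second stage, as in equation (3) with a linear f: y on fitted endog.
    second = np.column_stack([endog_hat, exog])
    coef, *_ = np.linalg.lstsq(second, y, rcond=None)
    return coef[0]


def draw_sample(rng, n, beta, noise_sd):
    """Simulate (score, years retired, eligibility, age) for n people."""
    age = rng.uniform(50, 70, n)
    z = rng.integers(0, 2, n).astype(float)      # hypothetical eligibility dummy
    hidden = rng.normal(size=n)                  # unobserved factor
    yr = 3.0 * z + 0.2 * (age - 60) + hidden + rng.normal(size=n)
    cp = 1.0 + beta * yr - 0.03 * (age - 60) - 0.5 * hidden + rng.normal(size=n)
    cs = cp + noise_sd * rng.normal(size=n)      # noisy measured score
    return cs, yr, z, age


def sampling_spread(noise_sd, beta=-0.05, reps=200, n=2000, seed=0):
    """Mean and standard deviation of the 2SLS estimate across samples."""
    rng = np.random.default_rng(seed)
    est = [two_sls(*draw_sample(rng, n, beta, noise_sd)) for _ in range(reps)]
    return float(np.mean(est)), float(np.std(est))


clean_mean, clean_sd = sampling_spread(noise_sd=0.0)
noisy_mean, noisy_sd = sampling_spread(noise_sd=2.0)
print(f"no noise:   mean = {clean_mean:.3f}, sd = {clean_sd:.3f}")
print(f"with noise: mean = {noisy_mean:.3f}, sd = {noisy_sd:.3f}")
```

Under these assumptions both means should lie close to the true value of -0.05, while the spread across samples is larger when noise is added to the score, in line with the claim that the estimator stays consistent but becomes less precise.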
The crucial question is whether they also satisfy the other assumption, namely the assumption of exogeneity. Using the formal notation, the exogeneity assumption can be expressed as $E[\tilde{u}_i \mid Z_i] = 0$. It essentially says that there is no systematic difference related to cognitive performance between an eligible and a non-eligible individual in the sample (once controlling for some other features).

Rohwedder and Willis (2010) provide the first serious attempt to uncover the causal relationship between retirement and individual cognitive performance. Their framework is the simplest one among the papers: they only include a dummy for not working on the right hand side, on a restricted sample of people aged between 60 and 64. This is equivalent to estimating the average effect of retirement on cognition conditional on the average duration of retirement in the sample, that is, assuming that $f(YR_i; \beta) = \tilde{\beta}\,\mathbf{1}(YR_i > 0)$ where $\tilde{\beta} = \beta\,\overline{YR}$ and $\overline{YR}$ is the average duration of retirement. Besides restricting the sample to a narrow age range, they do not include anything in $X_i$. To handle endogeneity they use public pension eligibility rules as instruments: whether the individual is eligible for early or full benefits. See Table A.1 for a summary of the methodologies.

Rohwedder and Willis (2010) estimate their model on the 2004 waves of SHARE, ELSA and HRS, and find that retirement has a large adverse effect on cognition among 60-64 year olds, amounting to one and a half standard deviations. Unfortunately, they do not report the average duration of retirement in their sample, so I cannot convert this number to a yearly average.

Using only the first wave of SHARE (and thus having a much smaller sample than theirs, 4464 versus 8828 observations) I was able to replicate their main findings (see the first column of Table 1). The pattern is the same: retirement seems to decrease cognitive performance. However, my estimate is somewhat smaller, amounting to only 1 standard deviation. Considering that the average duration of retirement in my sample is 6.6 years, it translates to an average yearly decline of 0.15 standard deviation.

Table 1: Comparing the methodology of Rohwedder and Willis (2010) by two versions of the instrumental variable: 2SLS estimation

                (1)                            (2)
                Rohwedder and Willis (2010)    Mazzonna and Peracchi (2012)
Retired         -1.010***                      …
                (0.14)                         …
Observations    4,464                          4,464

Both results are from the second stage estimation of $CS_i = \alpha + \beta R_i + u_i$ where $R_i$ is a retirement dummy ($\mathbf{1}(YR_i > 0)$) which is instrumented by early and normal eligibility dummies. The coefficient of interest in Rohwedder and Willis (2010) is -4.66*** on a sample of 8828 observations, which amounts to 1.5 standard deviation.
* p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors in parentheses.

In order to be able to interpret the previous result as a causal effect we should be sure that $E[\tilde{u}_i \mid Z_i] = 0$. Clearly, eligibility rules are not related to unobserved individual idiosyncrasies in cognition, as they generally apply to everyone. So using the instrument indeed helps to handle our problems. However, there are other factors left in $\tilde{u}_i$ which are unlikely to be uncorrelated with the instrument. For example, in most countries eligibility rules differ for males and females: women tend to become eligible earlier. Women also have higher memory scores than men of the same age (for this sample the mean difference amounts to 0.17 standard deviation). Not controlling for gender is likely to lead to underestimated effects, as women with better scores are overrepresented in the eligible population. Bingley and Martinello (2013) draw attention to the fact that people from different countries might differ in their average education as well (e.g. because of different compulsory schooling laws affecting today’s pensioners). As different countries also have different eligibility rules, ignoring schooling is also likely to undermine the exogeneity of the instruments. Bingley and Martinello (2013) show that countries with higher eligibility ages also tend to have better educated old age people, and thus the effect of Rohwedder and Willis (2010) is overestimated. The failure of the exogeneity assumption makes the causal interpretation of the results in the first column of Table 1 questionable.

There is an easy way to improve upon the estimation of Rohwedder and Willis (2010). They use the eligibility rules that were in effect at the time when the interviews were conducted. Mazzonna and Peracchi (2012), in contrast, use the same kind of eligibility rules but take into account that the rules might have changed over time. For each individual they apply the eligibility rules which were in effect for the individual’s cohort. This way they have some variation in the rules within country-gender cells. Looking at the first stage regression results in Table 2 we can see that both instrumental variables reach the same level of relevance (which is also comparable to that reached by Rohwedder and Willis (2010)). However, using this IV results in a significant drop in the coefficient of interest (see the second column of Table 1).
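The gender argument made in this section can be checked in a stylized simulation: if women are more likely to be eligible and also score higher, an eligibility instrument that ignores gender pulls the estimated retirement effect toward zero. The sketch below is an illustration under assumed parameter values (only the 0.17 standard deviation score gap is taken from the text); the eligibility probabilities, the first stage and the function names are hypothetical.

```python
import numpy as np


def iv_estimate(y, endog, instrument, controls=None):
    """Just-identified IV coefficient on `endog` (constant always included)."""
    n = len(y)
    cols = [np.ones(n)] if controls is None else [np.ones(n), controls]
    X = np.column_stack([endog] + cols)        # regressors
    Z = np.column_stack([instrument] + cols)   # instruments
    coef = np.linalg.solve(Z.T @ X, Z.T @ y)   # (Z'X)^{-1} Z'y
    return coef[0]


def simulate(n=400_000, beta=-0.05, seed=0):
    """Return IV estimates without and with a gender control."""
    rng = np.random.default_rng(seed)
    female = rng.integers(0, 2, n).astype(float)
    # Assumed: women are eligible with probability 0.7, men with 0.4.
    eligible = (rng.uniform(size=n) < 0.4 + 0.3 * female).astype(float)
    yr = 3.0 * eligible + rng.normal(size=n)   # years in retirement (stylized)
    score = beta * yr + 0.17 * female + rng.normal(size=n)
    naive = iv_estimate(score, yr, eligible)
    controlled = iv_estimate(score, yr, eligible, controls=female)
    return naive, controlled


naive, controlled = simulate()
print(f"true beta = -0.050, no gender control = {naive:.3f}, "
      f"with gender control = {controlled:.3f}")
```

In this assumed setup the estimate without the gender control is closer to zero than the true effect, while adding gender as a control should bring the estimate back close to the true value.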
Introducing within-country-gender variation into the instrumental variable halves the effect, to a 0.075 standard deviation decline per year.

Table 2: Comparing the methodology of Rohwedder and Willis (2010) by two versions of the instrumental variable: first stage

                               (1)                            (2)
                               Rohwedder and Willis (2010)    Mazzonna and Peracchi (2012)
Eligible for early benefits    …                              …
Eligible for full benefits     …                              …
Constant                       …                              …
Observations                   4,464                          4,464
Adjusted R2                    0.0643                         0.0650

Both results are from the first stage estimation of $CS_i = \alpha + \beta R_i + u_i$ where $R_i$ is a retirement dummy ($\mathbf{1}(YR_i > 0)$) which is instrumented by early and normal eligibility dummies. The corresponding coefficients for early and full benefits in Rohwedder and Willis (2010) are 0.19*** and 0.16**, respectively, with the adjusted R2 being 0.059 on a sample of 8828 observations.
* p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors in parentheses.

The methodology of Mazzonna and Peracchi (2012) differs from that of Rohwedder and Willis (2010) not only in respect of the instrumental variable. They also assume a different functional form, and control for a different set of features. Instead of using just a retirement dummy (and thus estimating the effect conditional on the average duration of retirement) they enter the number of years spent in retirement linearly in the equation (i.e. they assume that $f(YR_i; \beta) = \beta YR_i$). They control for age in a different manner: instead of restricting the sample to 60-64 year olds they control for age linearly in a sample of people aged 50-70. They also control for country dummies and estimate the equation separately for men and women.

Table 3 summarizes the results of moving from the strategy of Rohwedder and Willis (2010) to that of Mazzonna and Peracchi (2012) step by step. Table A.2 shows the corresponding first stage regression results.

Table 3: Moving from the strategy of Rohwedder and Willis (2010) to that of Mazzonna and Peracchi (2012)

                       (1)           (2)           (3)               (4)          (5)
                       aged 60-64    aged 50-70    + worked at 50    + age        + country
Years in retirement    -0.051***     -0.053***     -0.083***         -0.176***    0.167***
                       (0.0072)      (0.0021)      (0.0029)          (0.016)      (0.035)
Age                                                                  0.047***     -0.116***
                                                                     (0.0077)     (0.017)
Constant               0.376***      …             …                 …            …
                       (0.051)
Country dummies        No            No            No                No           Yes

Of the Observations and Weak IV F rows only the following entries survive, without a clear column assignment: 17,448 observations with F = 1614.48, 13,973 observations with F = 6337.18, and F = 355.57.
Weak IV F statistic is calculated according to Angrist and Pischke (2008). Stock et al. (2002) suggest that an F below 10 should make us worry about the potential bias in the IV estimation.
* p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors in parentheses.

Lessons from Table 3 in bullet points:

(1) The yearly effect is 0.05, compared with the 0.08 yearly average implied by the second column of Table 1.

(2) Extending the age range does not really matter.

(3) Restricting to those with a labor market history makes the effect a bit larger (this makes the sample of retirees more select, which is acceptable in this case as the main goal is to assess the effect of moving from working to retirement).

(4) Controlling for age delivers puzzling results. The effect doubles and the coefficient on age is positive: until retirement, age seems to improve cognitive performance; after that, cognitive performance decreases by 0.14 standard deviation per year. This could be explained by country differences: as Bingley and Martinello (2013) point out, eligibility age and schooling are positively correlated (in my sample the correlation is 0.20 and 0.14 for the early and normal eligibility age, respectively). Therefore, comparing two individuals of the same age but with differing years after eligibility likely means comparing two individuals from different countries, with the one who became eligible later being from the better educated country. This reasoning explains the positive age coefficient and underlines the importance of controlling for both age and country.

(5) Controlling for country indeed solves the puzzle of the positive age coefficient, changing its sign to the expected negative. However, now the coefficient of interest changes sign and becomes positive. The unexpected sign results again from omitted variable bias: gender is not controlled for. As mentioned previously, women perform significantly better on memory tests (controlling for age) than men (for this sample, they are better by 0.28 standard deviation). Thus, when we control for both age and country, we mainly identify the retirement effect from gender variation. To see that this is really the case, check the results in the first two columns of Table 4, where I also include a control for gender. The positive sign of the coefficient of interest reverts to the expected negative. The next two columns of the table show the same result when the numeracy score is used to measure cognitive skills. Women perform on average 0.27 standard deviation worse on the numeracy test and, correspondingly, we see a larger negative effect of years in retirement on numeracy when not controlling for gender.

There is one more puzzle in Table 4. Why is the coefficient on age positive for numeracy when controlling for country but not for gender? (The same coefficient is negative for total word recall, TWR.) There is a possible explanation. Interestingly, the share of women decreases as the sample ages (mortality rates would predict the opposite). The coefficient on age is mainly identified on the non-eligible population (as for the eligible population the age effect is actually the sum of the coefficients on age and years in retirement).
As women are better in memory tests, and there are fewer of them in the relatively older cohorts, the composition effect implies a negative coefficient for age. By contrast, the opposite is true for numeracy: the composition effect implies a positive coefficient for age, as women are worse in numeracy. Controlling for gender eliminates the level differences in cognitive scores by gender, but a possibly different rate of decline (i.e. a differential retirement effect for men and women) could further complicate the results and make it hard to assess the direction of the possible bias.

Table 4: Control for gender

                       (1)          (2)          (3)          (4)
                       TWR          TWR          numeracy     numeracy
Years in retirement    0.167***     -0.197***    -0.390***    -0.010
                       (0.035)      (0.040)      (0.047)      (0.036)
Age                    -0.116***    0.059***     0.165***     -0.018
                       (0.017)      (0.019)      (0.022)      (0.017)
Female                              0.291***                  -0.303***
                                    (0.022)                   (0.019)
Constant               6.344***     -2.792***    -8.054***    1.478*
                       (0.86)       (1.00)       (1.15)       (0.88)
Country dummies        Yes          Yes          Yes          Yes
Observations           …            …            …            …
Weak IV F              …            …            …            …

Weak IV F statistic is calculated according to Angrist and Pischke (2008). Stock et al. (2002) suggest that an F below 10 should make us worry about the potential bias in the IV estimation.
* p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors in parentheses.

That is why Mazzonna and Peracchi (2012) estimate the equation separately for men and women. Table 5 shows my replication of their strategy for total word recall and numeracy. According to my results, the rate of decline is indeed different: the relatively better performing gender suffers a larger decline. This estimation is the closest to that of Mazzonna and Peracchi (2012); the only difference is in the measurements used: they adjust the cognitive scores by the time spent on answering, which I do not. This might be a reason why my estimates do not lie very close to theirs. In columns A of their Table 5 they report a consistent negative effect of retirement amounting to 0.006-0.015 standard deviation per year (I rescale the coefficients to standard deviations using the numbers reported in their Table 3). My results show bigger effects but they are also less precise. However, the gender difference in the rate of decline is the same. I was also able to broadly reproduce the simple OLS results, which are available upon request.

There is an additional problem which could contaminate the results.
According to previous evidence (Banks and Mazzonna, 2012), education matters for old age cognitive skills even if gained in the early stages of life. Today’s pensioners are highly affected by the expansion of average schooling: the cohort aged 50 spent on average 2.7 more years in school than the cohort aged 70. Mazzonna and Peracchi (2012) try to control for education by including a low-education dummy (they also interact this dummy with the effect) and show that education indeed plays a significant role in explaining the heterogeneity in the levels of cognitive skills (and to a smaller extent in their age-related decline). However, there is also evidence (see for example the PISA surveys) that countries differ in how effectively they improve cognitive abilities in childhood. The first PISA survey was performed in 2000; the mean scores of countries in math, science and reading are positively correlated with the eligibility ages. If there is some persistence in the quality of education systems from the time when today’s pensioners went to school and that of today, this might introduce a new type of bias in the estimations even if the number o

Table 5: Estimating separately by gender, closest to Mazzonna and Peracchi (2012)

                       (1)          (2)           (3)              (4)
                       TWR, men     TWR, women    numeracy, men    numeracy, women
Years in retirement    0.017        -0.043*       -0.061           -0.036
                       (0.039)      (0.025)       (0.040)          (0.025)
Age                    -0.042**     -0.016        0.009            -0.009
                       (0.018)      (0.012)       (0.019)          (0.013)
Constant               2.475***     1.252**       -0.006           0.830
                       (0.95)       (0.63)        (0.99)           (0.64)
Country dummies        Yes          Yes           Yes              Yes
Observations           …            …             …                …
Weak IV F              …            …             …                …

Weak IV F statistic is calculated according to Angrist and Pischke (2008). Stock et al. (2002) suggest that an F below 10 should make us worry about the potential bias in the IV estimation.
* p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors in parentheses.
