Taxation, Incomplete Markets and Social Security
Munich Lectures, November 14-16, 2000
Peter Diamond∗

Lecture 1: Pension Insurance Reform in Germany
Lecture 2: Taxation and Social Security
Lecture 3: Incomplete Markets and Social Security

November 8, 2000 Draft.

Lecture 3. Incomplete Markets and Social Security
Part I. Incomplete Markets
Part II. Retirement Incentives

In Lecture 2, I argued that from the perspective of taxation, competitive equilibrium without distorting taxes was generically nonoptimal. Indeed, I argued that an equilibrium without distorting taxes was not feasible. In this lecture, I turn to issues of risk-sharing, which include both incompleteness of the set of widely-available risk-sharing trades and incompleteness in the use of available trades. In parallel to the reversal of the fundamental welfare theorem when there are not ideal lump-sum taxes, so, too, we have the result that with incomplete markets the competitive allocation is generically not Pareto optimal (Geanakoplos).

∗ I am grateful to Tom Davidoff for research assistance and Emmanuel Saez for comments. The research reported on here was supported by the National Science Foundation, under grant: SBR-9618698. The views expressed are my own.

There are interesting analyses focusing on the incompleteness of organized markets, particularly the incompleteness that accompanies an overlapping-generations perspective on risk-sharing. While this literature on trades that individuals cannot make is very interesting, I want to focus on the available trades that individuals can but do not make, particularly the extremely limited use of annuity markets. Just as social security is needed because many workers would not save enough for retirement, so too social insurance is needed because many workers would not adequately insure both risks to earnings and the risk inherent in a variable length of life.
In Lecture 2, I considered redistribution across workers with different earnings levels, ignoring any variation in retirement ages. In this lecture, I will focus on the choice of retirement age, ignoring variation in earnings levels. That is, on reaching the earliest age at which retirement benefits can be claimed, there is some level of benefits available for a worker with a given history of earnings. The increase in benefits as a consequence of delayed retirement is the focus of attention, ignoring any effect such rules might have on earlier behavior. In principle, such adjustment for delayed retirement could vary by earnings level. In practice, systems have rules for determination of benefit levels and rules for determination of increases for delayed retirement which interact in limited ways. Thus there is a need at some point to put together the two aspects I am exploring in these lectures. But that will not happen today.

I have two bottom lines. One is that for reasons of income distribution and insurance, there should be some taxation of work for those eligible for retirement benefits. Second is that the return to work should show up in both higher net earnings when working and higher later benefits for those who delay retirement.

Part I. Incomplete Markets

Some transactions are organized in markets where one does not know with whom one is trading. But trade in most commodities occurs pairwise. Much pairwise trade involves standardized commodities, available possibly at multiple outlets. Other pairwise trade is individualized - whether it is a made-to-order suit[1] or individualized insurance through friends or family[2] or arranged through Lloyds.

[1] I am indebted to Stavros Panageas for drawing my attention to the relevance of this distinction.
[2] See, e.g., Ben-Porath (19), Kotlikoff and Spivack (19).

Individuals do not purchase most of the standardized commodities that exist.
And they do not partake in many of the individualized trades that they might organize. So there are two interacting phenomena: the extent of availability of standardized products (or wide availability of possible trading partners pairwise) and the extent of use of these "markets." The extent of availability of standardized products is dependent on the extent to which individuals are interested in the products. Thus there is an interaction between the availability of standardized products and the market participation of economic agents. For example, Shiller () has argued for the risk-sharing advantages of markets in financial derivatives based on various indices. Trade in regional house price indices would permit people to hedge the risk associated with strong reasons to move from a low-cost housing region to a high-cost housing region. Or indices of wages by industry would let an individual hedge the risk to the demand for labor in the industry, while still preserving the full incentive to do well and be paid well. But an organized exchange in such derivatives would need a sufficient volume of transactions to make creation of such a market worthwhile. And many homeowners and workers would not participate in such markets. And those that would participate are likely to buy and hold and so not contribute much to the volume of transactions that would make such a market liquid.

Individuals do not make many detailed arrangements for the distant future. And they do not make many arrangements that distinguish among low probability events. Thus, the Arrow-Debreu model with a complete set of markets is very wide of the mark relative to the extent of trading that does take place. As Foley (19) and Hahn (19) have noted, any fixed costs of arranging trades will result in rational agents choosing not to make some transactions.
And that, in turn, implies that in the future there will be opportunities for new trades, opportunities that would not arise if markets had been complete and trades had been executed over the full range of opportunities.

The literature on incompleteness takes two forms. Some of the literature assumes full participation in the markets that exist and explores the implications of incompleteness of markets. Particularly interesting is incompleteness where the set of available trades is endogenous because relative prices are endogenous. And some of the literature focuses on limited participation in a given set of markets, a set that might even be complete. I want to briefly review the central insights of the first set of models, and then turn to non-participation - focusing on the reasons that individuals do not participate in particular markets.

The presence of an incomplete set of perfectly competitive markets has two implications. First, the markets give all participants the opportunity to equalize their marginal rates of substitution between each pair of (composite) commodities that are competitively traded, but marginal rates of substitution between commodities not separately traded are not generally equalized (Diamond). Second, and more subtle and more interesting, is that the variation in future endogenous prices affects current trading opportunities for future outcomes, resulting in the endogeneity of trading opportunities (Hart). This endogeneity, in turn, implies that generically such a competitive equilibrium is not Pareto optimal relative to the interventions that a government could do without violating the information restrictions consistent with the market structure. Redistributing income today changes future relative prices and so changes current trading opportunities in assets where the payoff is dependent on future prices.
Generically there is some redistribution such that the resulting change in equilibrium would result in a Pareto gain (Geanakoplos).

Given the set of markets that do exist, some individuals may not participate in the markets, even though their net demand would not be zero if they did participate. There are two classes of models that explore this idea. In one, participation is impossible for some potential agents, for example because they are not yet born. Intergenerational risk sharing is limited to the set of generations that overlap. I will not explore the literature that uses OLG models to explore improved risk sharing from government programs (e.g., Gale,...). In addition, if there is a fixed cost of participating in a market, then some individuals will choose not to participate. Limited participation can affect the characteristics of the goods being traded (e.g., the volatility of the stock market depends on the number of people trading). Thus, with endogenous participation, more participation might well raise a social welfare function or even be a Pareto improvement, because of the positive feedback of the degree of participation on the value of participation (Chatterjee, Pagano). This is readily seen in models with multiple equilibria.

Life Insurance and Annuitization

These models consider standard rational decisions about participation. But there is concern that failure to understand the properties of insurance and other psychological barriers to purchasing insurance may lead to underuse of insurance (Kunreuther and Slovick).[3] We can relate incomplete use of available trades to transactions costs in a physical sense and/or to decision failures. The latter includes the cost of genuinely learning about the product at hand. If this is large (and it may not be feasible for some) then one can view this as a decision failure or as a transaction cost.
But it may not be helpful to think in terms of transaction costs when the concept of decision failures naturally links to the kind of thinking that some people do and the kind of decisions that some people make. The cognitive psychology literature has documented the fact that many properties of random variables are not intuitive. This leads readily to decision failures in some contexts. It does not seem helpful to lose this connection and think instead in terms of the categories of transactions costs. Generally, I will simply assume that certain trades do not happen, without necessarily deriving the non-trading behavior from a particular underlying cause. For many issues the findings are robust to cause. For some they may not be and one needs the more complex analysis.

It would be natural to start consideration of annuities in their most common form - contracts that last for an entire remaining lifetime and have some particular intertemporal structure.[4] Instead I want to start with an annuity as a special example of an Arrow security. This will be closer in spirit to the argument in Yaari (196?), who bases his analysis on a complete market Arrow-Debreu competitive equilibrium and is the source of the commonly cited result that individuals without an interest in bequests, with access to actuarially fairly priced insurance and without other risks should annuitize all of their wealth.

[3] I ignore the apparent tendency of some individuals to purchase insurance policies that do not seem to make sense. Purchasing policies covering small losses appears to make little sense. If individuals are sufficiently risk averse to cover loads for small losses, their risk aversion for large losses would be massive (Rabin, 2000).
[4] In terms of time shape, available forms of annuities include constant nominal, constant real, graded nominal (i.e., fixed percentage growth each time period) or linked to returns on some particular asset portfolio.

An Arrow security pays off at a particular time in a single state of nature. Let us consider the special case where a single individual's mortality risk is the only risk in the economy. It is easy to generalize afterwards. Then, at each point in time the individual is either alive or dead. (I ignore the relevance of the history of the time of death.) If we have complete markets, the individual can purchase separately income at each time conditional on being alive and conditional on being dead. Purchase of a conventional zero-coupon bond of some maturity is purchasing equal amounts of the two contingent commodities for that point in time. By the arbitrage condition, we know that the cost of a unit of unconditional consumption at that time equals the sum of the costs of the two conditional consumptions. If an individual has no bequest motive, then income conditional on being alive is as good for the consumer as an unconditional payment. The only way an individual would not annuitize all of the wealth being used to purchase consumption for that time is if it is not cheaper to purchase the conditional consumption. But as long as there is a positive probability of being dead, and as long as the transaction cost of delivering conditional consumption is not too much larger than the cost of delivering unconditional consumption, then the annuitized income provides a dominant asset. This is the Yaari result. It is clear that it extends to an Arrow-Debreu setting with lots of additional risks as a statement comparing payments that distinguish other risks and do or do not distinguish whether the individual is alive. This argument holds for insurance that need not be actuarially fair as long as contingent income is cheaper than noncontingent.

Next consider the analysis when some people do value income even if dead, perhaps because of family members. Individuals can contract separately for income if they are dead or income if they are alive.
The former is life insurance, although payoffs are usually triggered by dying rather than being conditional on being dead. But that is a small difference.[5] Income if alive has a less familiar form. We are familiar with annuities which pay a flow of income for the rest of the time that a person is alive. The flow might be constant in nominal or real terms, or have some particular simple time shape (graded) or be conditional on the return on some particular portfolio (variable annuities). But if we decompose this into separate dates and possibly different amounts at different dates, we see that a standard annuity contract is a combination of Arrow securities, paying particular amounts at different dates in the event of being alive.

[5] There is bundling of some life insurance products in that a payoff on dying is made over a range of dates when the death might occur. If the life insurance were paid as an annuity to a survivor, it would be closer to the theoretical structure.

An individual contemplating some future date and holding some particular asset portfolio has a marginal rate of substitution between income if alive and income if dead. For an active worker with dependents, in the absence of any insurance, there is likely to be higher marginal utility for income when dead (relative to its price as fair insurance). Or someone with no dependents might have a clear preference for income if alive (given an actuarially fair annuity). If pricing is fair, it would be a zero-probability knife edge that one did not want either life insurance or annuity coverage (Bernheim). That is, without any insurance, there are wealth amounts in the two states. It would be unusual that this pattern of equalized wealths should result in a marginal rate of substitution that equalled the price ratio of the two types of insurance.
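The decomposition of an annuity into date-by-date Arrow securities can be made concrete with a small numerical sketch. This is an illustration only: the function names, the survival probabilities, the zero interest rate, and the proportional load are hypothetical choices, not values from the lecture. Each payment conditional on being alive is priced as one Arrow security, and the same payment stream bought unconditionally corresponds to a set of zero-coupon bonds.

```python
def arrow_price(survival_prob, discount_factor, load=0.0):
    """Price today of one unit of income at a future date, paid only if alive.

    Assumption (not from the text): the actuarially fair price,
    scaled up by a proportional load.
    """
    return survival_prob * discount_factor * (1.0 + load)


def annuity_cost(payments, survival_probs, discount_factors, load=0.0):
    """Cost of an annuity viewed as a bundle of date-by-date Arrow securities."""
    return sum(
        pay * arrow_price(s, d, load)
        for pay, s, d in zip(payments, survival_probs, discount_factors)
    )


def bond_cost(payments, discount_factors):
    """Cost of the same payments bought unconditionally (zero-coupon bonds)."""
    return sum(pay * d for pay, d in zip(payments, discount_factors))


# Hypothetical numbers: three future dates, zero interest rate.
payments = [1.0, 1.0, 1.0]
survival = [0.9, 0.8, 0.7]
discount = [1.0, 1.0, 1.0]

fair = annuity_cost(payments, survival, discount)
loaded = annuity_cost(payments, survival, discount, load=0.10)
unconditional = bond_cost(payments, discount)

# With no bequest motive, the annuity dominates whenever it is cheaper,
# even though the loaded price is not actuarially fair.
annuity_dominates = loaded < unconditional
```

In this sketch the loaded annuity remains cheaper than the bonds, which mirrors the point that dominance requires only that contingent income cost less than noncontingent income, not that pricing be fair.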
With load factors on the two types of insurance, there is a kink in the budget constraint and some fraction of the population would be expected to be found in the position of no insurance of either kind. This is similar to analysis of borrowing and lending. If borrowing and lending rates are equal, we would not find many people deciding to do neither. With a higher borrowing than lending rate, some people will be at the kink in the budget constraint of zero borrowing and lending.

Study of the extent to which people purchase life insurance (Bernheim and Kotlikoff) suggests that Americans with families are widely underinsured, as measured by the ability of the surviving family to sustain the same standard of living. And studies of the value of annuitization, even with simple annuities, find the value is so large that it is surprising how little use people without dependents make of annuitization (Poterba and Warshawsky, Brown, Mitchell, Poterba and Warshawsky).[6] Even those with dependents would want to combine life insurance for some dates with some annuitized payouts at others.[7]

[6] There is a sizable market in the US for what are called variable annuities. However, while these insurance products include an option to annuitize, they do not commit the investor to an annuity. Moreover, it appears that very few people saving in this form do purchase annuities.
[7] Considerable annuitization happens around the world in three common forms. Private pension plans often provide benefits only as an annuity. Public mandatory pension plans often do too. Some tax-favored retirement savings plans require or strongly tax-favor some annuitization (UK). But, left to their own devices, individuals do very little annuitization. This suggests a role for public programs that encourage or require annuitization (such as paying retirement benefits as an annuity rather than a lump-sum). To explore this issue, we need to consider reasons why individuals might not annuitize. There are several types of reasons. One is that government programs provide sufficient annuitization. This includes both formally annuitized ones and ones that are annuitized in practice by being annual programs. This is an awkward line of argument. If the programs exist because individuals would not otherwise annuitize, then provision of considerable annuitization undercuts the ability to judge the need for annuitization. The lack of annuitization where government retirement programs are small suggests that government provision is not filling a role that the private market would fill otherwise. Second, we can consider the terms on which annuities are offered. If they are very expensive because of high costs of administering them, then there may not be much role for them. However, as I will argue in a moment, the costs need not be high and much of the high costs come from consumer behavior. Indeed high selling costs because of the need to convince people to annuitize suggest that the market does not function smoothly. I pass over for now the consistent track record that the government can provide insurance more cheaply than the market can. This leaves us with two reasons based on consumer behavior. Either consumers do not have an advantage from annuities or consumers are not making good decisions because of failures to appreciate and take advantage of the nature of annuities.

The Yaari argument assumed actuarially fair insurance. But it is clear that the argument does not require actuarially fair insurance. For someone with no bequest motive, the argument just needs the result that the annuitized return exceeds the zero-coupon bond return. Let us see how this might look in the more familiar setting of mutual funds. The sense in which an annuity is a dominant asset for someone with no bequest motive becomes clear if one contemplates a short period investment opportunity with the properties of a TIAA-CREF annuity. Consider investing in some mutual fund for a given time period (a year or a month). Consider instead investing in the same mutual fund with the condition that the accounts of all the people who die during the investment period are divided proportionally among the accounts of the survivors. If one does not value resources after death, the annuitized mutual fund is a dominant asset as long as the administrative cost of determining death and reapportioning the accounts does not exceed the value of the accounts freed up by death. One does not need a long time horizon for such an investment opportunity to be a dominant asset. Thus it is interesting that such given horizon investment opportunities do not exist.[8] Presumably this is an interaction of companies (insurance companies and mutual funds in cooperation) not offering what they believe the public is not ready to purchase, at least without a selling campaign that would be too costly to be justified. Note that with this plan the supplier does not take on any risk from mortality. But this suggests strongly that the determination of such behavior does not lie in consumer costs of participation, since the potential gains are large.

[8] If they did exist, it would be natural for providers to try to draw distinctions in risk classes. With risk classification, then one would not get insurance against one's future risk classification merely by rolling over short-term purchases.

As noted above, one does not need risk classification for this to be a dominant asset. While I would prefer to be grouped with people with a high probability of dying, being grouped with people who have any positive probability of dying is sufficient for this to dominate as an asset. This result does depend on the richness of alternative annuity payout streams.
If individuals are restricted to some particular class of annuities, such as constant real payments for the rest of life, then the inefficiency in the shape of the stream might lead some to hold unannuitized wealth. Yet the simple structures would be easy to administer. And they would permit people to divide their wealth between traditional mutual funds and annuitized mutual funds and so have whatever different relative amounts of wealth conditional on survival or not. That these simple financial instruments do not exist is strongly suggestive that there is no demand for them, a lack of demand that suggests that people do not appreciate the potential in this type of insurance.

Another issue that arises in the limited uses of annuity markets that do occur relates to the timing of annuitization. People tend to accumulate sums and then purchase an annuity late in life. Yet this exposes an individual to risk of how they will be classified by insurance companies for annuity purchase. That risk can be avoided by purchasing the annuity early, before the arrival of news that determines the placement in whatever risk classification scheme exists (Brugiavini). Late purchase also does not convert wealth in the state of dying before retirement into higher retirement benefits conditional on surviving to retirement. Defined benefit pensions take care of these problems by having uniform risk classification (uniform relationship between earnings history and benefits) and by paying benefits to workers who survive to retirement age (although there may be life insurance benefits as well).[9] Thus we have another role for social insurance - providing insurance that the market does not provide, perhaps because individuals do not appreciate its virtues. Annuitization is central in the design of defined benefit pension systems without lump sum payout options.
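The short-horizon annuitized mutual fund discussed above can be sketched in a few lines. This is a hedged illustration, not a description of any existing product: the function name, the account balances, the 5 percent return, and the flat administrative cost are all hypothetical. Accounts of those who die during the period are divided among survivors in proportion to their balances, net of the cost of determining deaths and reapportioning.

```python
def annuitized_fund_period(balances, died, gross_return, admin_cost=0.0):
    """One investment period of a hypothetical 'annuitized mutual fund'.

    balances: dict mapping account holder -> balance at start of period
    died: set of holders who die during the period
    gross_return: gross return of the underlying fund (e.g. 1.05)
    admin_cost: total cost of verifying deaths and reapportioning accounts
    """
    grown = {name: bal * gross_return for name, bal in balances.items()}
    freed = sum(bal for name, bal in grown.items() if name in died)
    survivors = {name: bal for name, bal in grown.items() if name not in died}
    survivor_total = sum(survivors.values())
    if survivor_total == 0:
        return {}
    # The pool can be negative if costs exceed the value of freed accounts,
    # in which case the annuitized fund no longer dominates.
    pool = freed - admin_cost
    return {
        name: bal + pool * bal / survivor_total
        for name, bal in survivors.items()
    }


# Hypothetical example: three accounts, one death, 5% return, cost of 10.
balances = {"a": 100.0, "b": 100.0, "c": 200.0}
end = annuitized_fund_period(balances, died={"c"}, gross_return=1.05, admin_cost=10.0)
plain = {name: bal * 1.05 for name, bal in balances.items()}
```

In this example each survivor ends the period with more than the plain fund would have delivered, and the supplier bears no mortality risk because payouts are financed entirely by the freed accounts.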
Insuring the length of one's career, or one's earnings trajectory (Dulitzky), is another opportunity for a system basing benefits on lifetime earnings records. It should be recognized, though, that an annual income tax provides some of this insurance by itself.

[9] It is common for annuities to come with a guarantee of some minimum payout. This is a peculiar form of bequest since it is random, and purchase of an annuity without such a guarantee would permit a nonstochastic bequest and the same level of annuitized income flow.

I conclude that failures to annuitize, like failures to save for retirement, represent a consumer failure that government intervention could potentially improve. This is a separate argument from the justification of intervention based on adverse selection issues. The difference leads to different ways of modelling individual behavior and therefore possibly different descriptions of optimized interventions. The next question is to analyze how such interventions might be organized and what implications follow from the form of realistic intervention, if well executed.

Part II. Retirement Incentives

A central question for pension design is how benefits should vary with the age of retirement beyond the earliest eligibility age. I will not consider the selection of an early entitlement age, but just consider retirement incentives beyond that age.

When some individuals do not save enough for retirement, mandating some savings can help those saving less than the mandate who should save more. However, it may hurt those who should save less than the mandate and may also hurt those who would have saved anyway and suffer from some of the restrictions or lost opportunities associated with the mandated savings program. Moreover, when the quantity of mandated savings is linked to earnings, then there is some implicit taxation of earnings associated with the mandate.
The implications of that implicit taxation, along with any explicit redistribution, need to be considered as part of the evaluation of the program.

When we turn to annuities, we have the further issue that full annuitization may be too much for some. A further issue comes with individuals with different life expectancies lumped into a single risk class for annuitization purposes. Indeed the market has heterogeneous individuals in each risk class, but the government use of risk classes is likely to be very different from that of the market. Indeed as a matter of principle, the government might choose to have a single risk class, ignoring all easily observable as well as expensively available indicators of life expectancy. Thus such a program will tend to redistribute to the longer lived compared with annuitization with separate risk classes. The comparison with a market outcome without annuities is more complex since everyone gains from getting insurance. These gains and the redistributions together determine the pattern of gains and losses for this counterfactual. A program that is breakeven for the entire population will be a subsidy of earnings for some and a tax on earnings for others. Of course a lack of annuitization, as a way to avoid this issue, has problems of its own. Restricting attention to workers with the same earnings potential, those who expect to live longer would, with the same preferences and career length, have higher marginal utility of consumption since they have more periods over which to spread the accumulated earnings.

To consider this risk issue along with the risk of the length of a career, I now consider formal models of social insurance, beginning with retirement incentives in a certainty setting with a heterogeneous population. Three types of models will be considered, having variation in the disutility of labor, having randomness in the length of life, and having randomness in the length of career.
It is assumed that all workers have the same earnings potential. And the models will assume no private savings. After considering the model with forward-looking retirement decisions, we will contrast that model with one where retirement decisions are highly myopic.

Put differently, the pension system (along with the wage) is part of the determination of the incentives to continue working despite eligibility for retirement benefits. In thinking about this incentive, there are two issues. One is the size of the financial incentive to continue working. Second is how that incentive is divided between current compensation and increased future benefits. Today, I want to address both of these questions in a series of special models considering separately different reasons why different workers might retire at different ages. The analysis will derive the optimal uniform system, not drawing any distinctions between public and private systems. This will ignore any presence of UI or DI programs covering the same ages. (For analysis of DI and retirement programs together, see Diamond and Sheshinski.)

If workers were time consistent and if there were perfect capital and insurance markets, it would not matter how the financial incentive for retirement was divided between current compensation and larger future benefits. But none of these three conditions holds in practice. Some workers are liquidity constrained, and so prevented from using larger future benefits to finance additional current consumption. Future benefits are paid as an annuity. As noted by Crawford and Lilien, this can be valued more highly than current compensation by some workers who would not otherwise have access to as good an annuity (in terms of either time shape of benefits or price).
Thus the trade-off between current compensation and larger future benefits would be different for different workers according to their risk classification given the alternative use of current compensation to purchase an annuity (or to decrease reliance on purchased annuities or to purchase life insurance) to offset a larger future benefit. And third, some workers are not time consistent in savings decisions, and so their consumption behavior may vary with the form of compensation.

Two simple ways of modeling retirement decisions given some time inconsistency in savings decisions are to model the retirement decision as rational given a correct forecast of savings behavior and to consider a retirement decision that is also short-sighted in some form. And separate from individual behavior we have the issue of how social welfare should evaluate individual decisions. This is obviously an issue when workers are not time consistent. It is also an issue when their behavior affects others, through bequests. And it may be an issue even if workers are time consistent and leave no bequests if the social evaluation does not coincide with individual preferences because society concludes that discounting by the worker is not socially appropriate. This issue is further compounded if we are considering a couple, rather than a single worker. But I will not pursue the interesting and not much developed issues of family behavior around savings, retirement, and annuitization.

Today, I want to concentrate on workers who simply do no saving. Both rational retirement decisions and retirement decisions that are too present-oriented will be considered. And I will concentrate on workers who are heterogeneous, either ex ante or ex post, with asymmetric information, making this a form of the standard optimal tax problem.
First, I want to review why it is that different workers might be retiring at different ages, given a pension system that does not induce everyone to retire at the same age. First, I want to set up a simple model of the portion of the life cycle coming after the early entitlement age, with a focus on the retirement decision for a nonsaving worker. Implicitly, I am assuming that the form of retirement system has no important implications for earlier behavior, or that someone else will extend the model to earlier ages. What may matter for earlier behavior, and is held constant, is the level of resources accumulated by the early entitlement age and available for retirement benefits. I just think that the particular pattern of incentives beyond the early entitlement age may not be too important for earlier behavior, at least not until the worker is close to the early entitlement age.

For convenience, I will assume that both the interest rate earned on all accumulations and the discount rate of workers are zero. The substantive part of this assumption, apart from the simplification of notation, is that I will ignore variation in utility discount rates in worker retirement behavior. In discrete time, we write the realization of worker lifetime utility as:

\[ U = \sum_{z=0}^{R} \{ u[x_z] - a_z \} + (L - R)\, v[c[R]], \tag{1} \]

where we have assumed that preferences are intertemporally additive, that period utility functions are independent of age, that benefits are paid as a constant real stream, and we have measured time from the early entitlement age.
This setup ignores an effect of labor intensity on disutility, treating work as a zero-one variable.[10] In designing a pension plan, the realized cost of benefits for this worker is:

\[ C = \sum_{z=0}^{R} \{ x_z - n_z \} + (L - R)\, c[R] \tag{2} \]

Notation
$a_z$: disutility of labor at age $z$
$U$: realized lifetime utility
$R$: retirement age
$L$: length of life
$n_z$: productivity at age $z$
$c[R]$: consumption per period after retirement at age $R$
$x_z$: consumption at age $z$ before retirement
$u[x] - a$: utility function before retirement, differentiable and strictly concave
$v[c]$: utility function after retirement, differentiable and strictly concave

[10] This makes productivity observable for those who work, but not for those who are retired.

There are different important elements affecting retirement for which we can use this model as a starting place. For example, we start by assuming that neither the disutility of labor, $a$, nor productivity, $n$, varies with age. Ignoring productivity differences, the variables which can vary across the population in a certainty setting are disutility and length of life. Disutility is presumed not to be observable. And length of life becomes revealed, although expected length of life is not observable. I will ignore the latter element by restricting the pension plan to have uniform benefits over time, for a given retirement age.

With the population identical ex ante, uncertainty can be added in this simple setting by assuming that productivity becomes zero stochastically, or equivalently, disutility becomes infinitely high. I have explored this model in a series of joint papers with Jim Mirrlees, which I will summarize later. Making length of life stochastic (with identical workers ex ante) is a minor modification if there is full annuitization. And these stochastic elements could be added in a setting where workers differed ex ante.
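Equations (1) and (2) can be evaluated directly for a given consumption path. The following is a minimal sketch under stated assumptions: the square-root utility functions and all numerical values are hypothetical choices made only so that the arithmetic is easy to follow, and the function names are not from the lecture. The sums run over working ages $z = 0, \dots, R$, followed by $L - R$ periods of retirement, exactly as the two equations are written.

```python
import math


def lifetime_utility(x, a, c_R, L, u=math.sqrt, v=math.sqrt):
    """Equation (1): sum over working ages z = 0..R of u[x_z] - a_z,
    plus (L - R) retirement periods at utility v[c[R]].

    x and a are lists indexed by age z = 0..R, so R = len(x) - 1.
    The square-root utility functions are illustrative assumptions.
    """
    R = len(x) - 1
    working = sum(u(x_z) - a_z for x_z, a_z in zip(x, a))
    return working + (L - R) * v(c_R)


def pension_cost(x, n, c_R, L):
    """Equation (2): consumption minus productivity while working,
    plus benefits c[R] paid over the (L - R) retirement periods."""
    R = len(x) - 1
    working = sum(x_z - n_z for x_z, n_z in zip(x, n))
    return working + (L - R) * c_R


# Hypothetical worker: works at ages 0, 1, 2 (R = 2) and lives to L = 5.
x = [4.0, 4.0, 4.0]   # consumption while working
a = [1.0, 1.0, 1.0]   # disutility of labor, constant across ages
n = [5.0, 5.0, 5.0]   # productivity, constant across ages
c_R = 1.0             # benefit per period after retiring at R
L = 5

U = lifetime_utility(x, a, c_R, L)
C = pension_cost(x, n, c_R, L)
```

With these numbers the plan breaks even for this worker: the excess of productivity over consumption during the working ages exactly finances the retirement benefits.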
Further complications, particularly in keeping track of what is happening in the model, would arise from combining age-varying patterns with stochastic elements. For example, people could be learning more about their mortality rates as they age.

What I want to do today is simplify this model to three and four periods. This will lose the age structure, but allow consideration of both ex ante heterogeneity and individual uncertainty (with an assumption of no aggregate risk). This will bring out a common message in these models. Taxing work beyond the early entitlement age is part of optimal insurance and optimal redistribution. Moreover, the compensation for work should take the form of adding to both current compensation and future benefits, not just one or the other.

Let me describe how this might work in practice (which is a proposal I made for the US nearly 20 years ago, based on the work with Mirrlees, and a belief it would carry over to more models). Starting at the early entitlement age, part of benefits is paid independent of retirement and part is paid only if the worker retires. The two fractions shift with age. Delayed benefits are increased to partially offset the expected loss in income from withholding part of benefits. Administratively, this would be easy in any system that has a retirement requirement for benefits, but would be an administrative cost for systems which paid benefits independent of retirement.

I turn now to a series of simple models. The first is a three-period model, with variation in the additive disutility of labor being the only element of variation in the population. After considering this model with a forward-looking retirement decision, I consider the same model with myopic retirement decisions. Then I will consider varying life expectancy. In all cases the proofs are in the appendix. First, I need to explore an assumption that will be made throughout.
Moral hazard constraint

In a first-best solution, the marginal utility of consumption is the same for a worker whether working or not at any particular time. We can ask whether with the levels of consumption that equate marginal utilities with and without work, utility is higher with or without work. The plausible assumption is that if marginal utilities were equated, then the level of utility would be higher without work. This is certainly the case if the only difference between utility functions is an additive disutility of work. More generally, we state the assumption:

Assume that the moral hazard condition is met: that if marginal utilities are equated in period two, utility would be higher without work for all values of $a$:

\[ \text{If } u'[x] = v'[c], \text{ then } u[x] < v[c]. \tag{3} \]

With this assumption we have a second-best problem if we have asymmetric information about the disutility of work or the ability to work. That is the interesting case and the one I explore.

A. Varying disutility

Consider a three-period model where everyone works in the first period, no one works in the third period and some people do work and some do not in period two. Assume that at the optimum there is some, but not complete, second-period work. Assume that individuals do no saving.

Assume that the disutility of labor, $a$, $a \ge 0$, is distributed in the population, with distribution $F[a]$.

Denote consumption for workers by the period of consumption, and consumption for retirees by the length of their work career. We assume constant consumption after retirement. This will be optimal in some but not all of the models to be analyzed.

1. Forward-looking case

Assume a forward-looking lifetime-utility maximization when making the retirement decision.
Then the marginal second-period worker has disutility that satisfies

\[ a^* = u[x_2] + v[c_2] - 2v[c_1]. \tag{4} \]

The implicit tax on second-period work is the wage plus future benefits if not working less the sum of current consumption and future benefits if working. Thus the implicit tax is:

\[ T = n - x_2 - c_2 + 2c_1. \tag{5} \]

Social welfare maximization is:

\[
\begin{aligned}
\text{Maximize}_{x,c}\quad & \int_0^{a^*} \{u[x_1] + u[x_2] - 2a + v[c_2]\}\,dF[a] + \int_{a^*}^{\infty} \{u[x_1] - a + 2v[c_1]\}\,dF[a] \\
\text{subject to:}\quad & E + \int_0^{a^*} \{x_1 + x_2 - 2n + c_2\}\,dF[a] + \int_{a^*}^{\infty} \{x_1 - n + 2c_1\}\,dF[a] \le 0 \\
\text{where}\quad & a^* = u[x_2] + v[c_2] - 2v[c_1].
\end{aligned}
\tag{6}
\]

Solving the FOC, we have:

\[
\begin{aligned}
u'[x_1] &= \lambda, \\
u'[x_2] &= \lambda \Big/ \Big\{1 + \lambda T \frac{f[a^*]}{F[a^*]}\Big\}, \\
v'[c_1] &= \lambda \Big/ \Big\{1 - \lambda T \frac{f[a^*]}{1 - F[a^*]}\Big\}, \\
v'[c_2] &= \lambda \Big/ \Big\{1 + \lambda T \frac{f[a^*]}{F[a^*]}\Big\}.
\end{aligned}
\tag{7}
\]

The FOC reflect the direct marginal utility gain and resource cost of increasing any consumption level. In addition, there is an indirect effect coming from the induced change in labor supply, reflected in the change in $a^*$. By the envelope theorem, there is no first-order change in utility for someone at the margin choosing to work. There is a first-order impact on resource use, equal to the implicit tax on work. From the FOC, the moral hazard constraint and the assumption that the optimum has some work in period 2, we have

Theorem. At the optimum, there is positive taxation of second-period work:

\[ T = n - x_2 - c_2 + 2c_1 > 0. \tag{8} \]

And we have:

\[ v'[c_2] = u'[x_2] < u'[x_1] < v'[c_1]. \tag{9} \]

Corollary. If $u$ and $v$ differ by an additive constant, then

\[ c_2 = x_2 > x_1 > c_1. \tag{10} \]

Thus, we have implicit taxation of work in the second period. Moreover, second-period work results in higher consumption than first-period work, and retirement benefits in period 3 are larger if there is work in period 2. If we have additive disutility, then the replacement rate rises with the age of retirement.

2. Forward looking case with consumption constraint.
We now modify this model by assuming that consumption of workers must be the same in both periods:

\[ x_1 = x_2 = x. \tag{11} \]

This would be the case if there were a payroll tax with no age variation and no way to pay part of benefits. With an additional constraint, the optimum is not as good as above, since the constraint would be violated in the optimum without the constraint. Note that the implicit tax on second-period work is now:

\[ T = n - x - c_2 + 2c_1. \tag{12} \]

Then the optimization becomes:

\[
\begin{aligned}
\text{Maximize}_{x,c}\quad & \int_0^{a^*} \{2u[x] - 2a + v[c_2]\}\,dF[a] + \int_{a^*}^{\infty} \{u[x] - a + 2v[c_1]\}\,dF[a] \\
\text{subject to:}\quad & E + \int_0^{a^*} \{2x - 2n + c_2\}\,dF[a] + \int_{a^*}^{\infty} \{x - n + 2c_1\}\,dF[a] \le 0 \\
\text{where}\quad & a^* = u[x] + v[c_2] - 2v[c_1].
\end{aligned}
\tag{13}
\]

Solving, we have the FOC:

\[
\begin{aligned}
u'[x] &= \lambda \Big/ \Big\{1 + \lambda T \frac{f[a^*]}{1 + F[a^*]}\Big\}, \\
v'[c_1] &= \lambda \Big/ \Big\{1 - \lambda T \frac{f[a^*]}{1 - F[a^*]}\Big\}, \\
v'[c_2] &= \lambda \Big/ \Big\{1 + \lambda T \frac{f[a^*]}{F[a^*]}\Big\}.
\end{aligned}
\tag{14}
\]

Examining the FOC, from the moral hazard constraint and the assumption of an interior optimum we have

Theorem. At the optimum there is positive taxation of second-period work:

\[ T = n - x - c_2 + 2c_1 > 0, \tag{15} \]

and we have:

\[ v'[c_2] < u'[x] < v'[c_1], \tag{16} \]

\[ c_2 > c_1. \tag{17} \]

Corollary. If $u$ and $v$ differ by an additive constant, then

\[ c_2 > x > c_1. \tag{18} \]

3. Myopic model

We now assume that the marginal second-period worker considers only the utility in the second period when making the retirement decision. Thus the marginal worker has disutility that satisfies

\[ a^* = u[x_2] - v[c_1]. \tag{19} \]

The assumption of an interior optimum now requires different conditions than the same assumption in the forward-looking case.

Note that the implicit tax on second-period work is the wage plus future benefits if not working less the sum of current consumption and future benefits if working. Thus the implicit tax is:

\[ T = n - x_2 - c_2 + 2c_1. \tag{20} \]

Define the apparent tax for a myopic worker as the wage plus current benefit less current consumption if working:

\[ A = n - x_2 + c_1. \tag{21} \]

Social welfare maximization is now:

\[
\begin{aligned}
\text{Maximize}_{x,c}\quad & \int_0^{a^*} \{u[x_1] + u[x_2] - 2a + v[c_2]\}\,dF[a] + \int_{a^*}^{\infty} \{u[x_1] - a + 2v[c_1]\}\,dF[a] \\
\text{subject to:}\quad & E + \int_0^{a^*} \{x_1 + x_2 - 2n + c_2\}\,dF[a] + \int_{a^*}^{\infty} \{x_1 - n + 2c_1\}\,dF[a] \le 0 \\
\text{where}\quad & a^* = u[x_2] - v[c_1].
\end{aligned}
\tag{22}
\]

Solving the FOC, we have:

\[
\begin{aligned}
u'[x_1] &= v'[c_2] = \lambda, \\
u'[x_2] &= \lambda \Big/ \Big\{1 + (v[c_2] - v[c_1] + \lambda T) \frac{f[a^*]}{F[a^*]}\Big\}, \\
v'[c_1] &= \lambda \Big/ \Big\{1 - (v[c_2] - v[c_1] + \lambda T) \frac{f[a^*]}{1 - F[a^*]} \Big/ 2\Big\}.
\end{aligned}
\tag{23}
\]

Theorem. At the optimum we have:

\[
\begin{aligned}
& (v[c_2] - v[c_1] + \lambda T) > 0, \\
& u'[x_2] < u'[x_1] = v'[c_2] < v'[c_1], \\
& x_2 > x_1, \quad c_2 > c_1.
\end{aligned}
\tag{24}
\]

Corollary. If $u$ and $v$ differ by an additive constant, then

\[ x_2 > x_1 = c_2 > c_1. \tag{25} \]

Since neither $x_1$ nor $c_2$ affects labor supply, the marginal utilities of these consumption levels are equated. In contrast, changes in the other two consumption levels do affect labor supply, and in opposite directions. It appears that the tax on work may be positive or negative since there is a need to subsidize work to offset myopia. The apparent tax is larger than the tax (akin to a Pigouvian tax), tending to offset the incentive arguments above.

\[ \lambda A = \lambda(n - x_2 + c_1) = \lambda\{n - x_2 - c_2 + 2c_1\} + \lambda(c_2 - c_1) = \lambda T + \lambda(c_2 - c_1) > \lambda T. \tag{26} \]

Extension

If benefits could be different in the two periods for a worker retiring after one period, the optimum has higher benefits in period 3 than in period 2.

B. Varying life span and disutility

We now consider varying life spans. For this example, we use a four-period model where everyone works in the first period, no one works in the third or fourth periods and some people do work and some do not in period two. Assume that everyone survives to period 3, but only some survive to period 4. Assume that real annuities are the same in periods 3 and 4.[11] Let $p$ be the probability of surviving to period 4. Let $a_p$ ($a_p \ge 0$) be the additive disutility of labor in period 2 for someone who has the probability $p$ of surviving to period 4.
We assume that $a_p$ is nonincreasing in $p$, $da_p/dp \le 0$. That is, those with longer lives are more capable of working in period 2, as measured by disutility. Assume $p$ is distributed in the population, with distribution $F[p]$.

As above, assume that at the optimum there is some, but not complete, second-period work. With a forward-looking retirement decision, the marginal second-period worker has survival probability $p^*$ and disutility that satisfies

\[ a_{p^*} = u[x_2] + (1 + p^*)\,v[c_2] - (2 + p^*)\,v[c_1]. \tag{27} \]

[11] It would be interesting to examine a time-varying annuity.

Differentiating implicitly, we have:

\[
\begin{aligned}
\frac{dp^*}{dx_2} &= -u'[x_2]/D, \\
\frac{dp^*}{dc_1} &= (2 + p^*)\,v'[c_1]/D, \\
\frac{dp^*}{dc_2} &= -(1 + p^*)\,v'[c_2]/D,
\end{aligned}
\tag{28}
\]

where $D = v[c_2] - v[c_1] - da_p/dp$.

There is a unique solution to this marginal condition provided $c_2 \ge c_1$. We will show that the optimum has this property.

Note that the implicit tax on second-period work for worker-$p$ is the wage plus expected future benefits if not working less the sum of current consumption and expected future benefits if working. Thus the implicit tax is:

\[ T[p] = n - x_2 - (1 + p)\,c_2 + (2 + p)\,c_1. \tag{29} \]

Implicit taxes are less for those with longer life expectancies.

Social welfare maximization is now (later we will add $G$[lifetime utility]):

\[
\begin{aligned}
\text{Maximize}_{x,c}\quad & \int_{p_0}^{p^*} \{u[x_1] - a_p + (2 + p)\,v[c_1]\}\,dF[p] + \int_{p^*}^{p_1} \{u[x_1] + u[x_2] - 2a_p + (1 + p)\,v[c_2]\}\,dF[p] \\
\text{subject to:}\quad & E + \int_{p_0}^{p^*} \{x_1 - n + (2 + p)\,c_1\}\,dF[p] + \int_{p^*}^{p_1} \{x_1 + x_2 - 2n + (1 + p)\,c_2\}\,dF[p] \le 0 \\
\text{where } p^* \text{ satisfies}\quad & a_{p^*} = u[x_2] + (1 + p^*)\,v[c_2] - (2 + p^*)\,v[c_1].
\end{aligned}
\tag{30}
\]

FOC:

\[
\begin{aligned}
u'[x_1] &= \lambda, \\
u'[x_2] &= \lambda \Big/ \Big\{1 + \lambda \frac{T[p^*]}{D} \frac{f[p^*]}{1 - F[p^*]}\Big\}, \\
v'[c_1] &= \lambda \Big/ \Big\{1 - \lambda \frac{T[p^*]}{D} \frac{f[p^*]}{F[p^*]} \frac{2 + p^*}{2 + P_0[p^*]}\Big\}, \\
v'[c_2] &= \lambda \Big/ \Big\{1 + \lambda \frac{T[p^*]}{D} \frac{f[p^*]}{1 - F[p^*]} \frac{1 + p^*}{1 + P_1[p^*]}\Big\},
\end{aligned}
\tag{31}
\]

where $P_0[p]$ and $P_1[p]$ are the average survival probabilities for those below and above $p$:

\[
\begin{aligned}
P_0[p] &= \int_{p_0}^{p} s\,dF[s] \Big/ F[p], \\
P_1[p] &= \int_{p}^{p_1} s\,dF[s] \Big/ (1 - F[p]).
\end{aligned}
\tag{32}
\]

Theorem.
At the optimum we have:

\[ T[p^*] > 0, \quad D > 0, \quad u'[x_2] < v'[c_2] < u'[x_1] < v'[c_1]. \tag{33} \]

Corollary. If $u$ and $v$ differ by an additive constant, then

\[ x_2 > c_2 > x_1 > c_1. \tag{34} \]

C. Stochastic earnings

In the models considered above, there was ex ante heterogeneity. The only individual uncertainty considered was length of life. I want to turn now to an important issue of individual uncertainty - uncertainty as to length of career. Some careers are cut short because of measurable disability. A disability insurance program is a valuable social insurance program, even though disability will only be measured with both type I and type II errors. Unemployment insurance can also help with the transition to retirement for some workers. Again, there will be errors in measuring true unemployment. Some careers will be sensibly cut short without being covered by such programs. In part this will happen because of errors in measurement. In part it can happen because work has become too taxing, even though there is no measurable disability. In part it can happen because the employment opportunities, while available, are not worth pursuing.

For dealing with such uncertainties, socially provided insurance against a short career can add to expected lifetime utilities. To explore this issue, Jim Mirrlees and I have written four papers, examining models where all workers are ex ante identical.[12] Today, I want to consider them briefly.

Continue with the basic three-period model we have employed. Assume that everyone works in period one and no one works in period three and there are no savings. Stochastically, some people can work in period two and some cannot. If there are no savings opportunities, then this is equivalent to a heterogeneous agent model. However, once we extend the model to incorporate more periods of possible inability to work, then the two models become different. Given asymmetry of information as to who can work in period two, we have a moral hazard problem.
Given the moral hazard constraint, the compensation for early retirees should be as large as possible without resulting in retirement of those able to work. The homogeneity of those able to work converts this problem into one of just inducing labor supply. Even with this difference, we get the same results as above - there should be some taxation of work and the return to work should show up in both a rising wage and a rising benefit for later retirement. Extending the model to more periods preserves these results and provides a further result - that the implicit tax on work should decrease with age, reaching zero at an age where everyone still able to work chooses to retire. This further result can be seen intuitively, since a higher later wage serves as an incentive to work at all earlier ages. So there is more incentive effect for a later wage than an earlier one.

These results extend to two additional models - one where the wage must be constant (e.g., payroll tax financed) and one where individuals can save without the government being able to observe savings. The latter is interesting since it involves individuals saving in order to weaken the government's ability to discourage early retirement. That is, someone considering an earlier retirement would save more to finance an earlier retirement. Thus the government would like to discourage savings in order to do a better job of providing insurance.

[12] There are also old notes by Jim combining individual uncertainty with ex ante heterogeneity in the disutility of labor.

Conclusion

I have not considered what would happen with varying productivity. That would either resemble income taxation with fixed hours (Saez, Diamond) or require a more complex structure with varying intensity of work. It would be interesting to extend these discrete-time models to continuous time.
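As a compact numerical recap of the forward-looking three-period model of Part I.A, the sketch below evaluates the marginal disutility in (4), the implicit tax in (5), and the resource use in the constraint of (6). It is only an illustration under stated assumptions: logarithmic $u$ and $v$, an exponential distribution for $a$, and the particular allocation are hypothetical choices, and the allocation is not claimed to be optimal.

```python
import math


def marginal_disutility(x2, c1, c2, u=math.log, v=math.log):
    """a* from equation (4): the worker indifferent about period-2 work."""
    return u(x2) + v(c2) - 2.0 * v(c1)


def implicit_tax(n, x2, c1, c2):
    """T from equation (5)."""
    return n - x2 - c2 + 2.0 * c1


def worker_cost(n, x1, x2, c2):
    """Net resources used by someone who works in periods 1 and 2."""
    return x1 + x2 - 2.0 * n + c2


def retiree_cost(n, x1, c1):
    """Net resources used by someone who retires after period 1."""
    return x1 - n + 2.0 * c1


def net_resource_use(n, x1, x2, c1, c2, F):
    """Left side of the constraint in (6), excluding E."""
    share_working = F(marginal_disutility(x2, c1, c2))  # those with a <= a* work
    return (share_working * worker_cost(n, x1, x2, c2)
            + (1.0 - share_working) * retiree_cost(n, x1, c1))


# Hypothetical inputs: exponential F with mean 1, and an allocation that
# respects the ordering c2 = x2 > x1 > c1 of the corollary.
F = lambda a: 1.0 - math.exp(-a)
n, x1, x2, c1, c2 = 2.0, 1.0, 1.1, 0.7, 1.1

a_star = marginal_disutility(x2, c1, c2)
T = implicit_tax(n, x2, c1, c2)
gap = retiree_cost(n, x1, c1) - worker_cost(n, x1, x2, c2)
resources = net_resource_use(n, x1, x2, c1, c2, F)
```

In this sketch `gap` equals `T`: a person who switches from retirement to second-period work frees up exactly the implicit tax in resources, which is the first-order effect on resource use that enters the FOC.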
Appendix

Varying disutility - forward-looking case

Assume a forward-looking lifetime-utility maximization when making the retirement decision. Then the marginal second-period worker has disutility that satisfies

\[ a^* = u[x_2] + v[c_2] - 2v[c_1]. \tag{35} \]

The implicit tax on second-period work is the wage plus future benefits if not working less the sum of current consumption and future benefits if working. Thus the implicit tax is:

\[ T = n - x_2 - c_2 + 2c_1. \tag{36} \]

Social welfare maximization is:

\[
\begin{aligned}
\text{Maximize}_{x,c}\quad & \int_0^{a^*} \{u[x_1] + u[x_2] - 2a + v[c_2]\}\,dF[a] + \int_{a^*}^{\infty} \{u[x_1] - a + 2v[c_1]\}\,dF[a] \\
\text{subject to:}\quad & E + \int_0^{a^*} \{x_1 + x_2 - 2n + c_2\}\,dF[a] + \int_{a^*}^{\infty} \{x_1 - n + 2c_1\}\,dF[a] \le 0 \\
\text{where}\quad & a^* = u[x_2] + v[c_2] - 2v[c_1].
\end{aligned}
\tag{37}
\]

FOC:

\[
\begin{aligned}
u'[x_1] &= \lambda, \\
u'[x_2] &= \lambda - \lambda\{n - x_2 - c_2 + 2c_1\} \frac{f[a^*]}{F[a^*]} \frac{da^*}{dx_2} = \lambda - \lambda T \frac{f[a^*]}{F[a^*]}\,u'[x_2], \\
v'[c_1] &= \lambda - \lambda\{n - x_2 - c_2 + 2c_1\} \frac{f[a^*]}{1 - F[a^*]} \frac{da^*}{dc_1} \Big/ 2 = \lambda + \lambda T \frac{f[a^*]}{1 - F[a^*]}\,v'[c_1], \\
v'[c_2] &= \lambda - \lambda\{n - x_2 - c_2 + 2c_1\} \frac{f[a^*]}{F[a^*]} \frac{da^*}{dc_2} = \lambda - \lambda T \frac{f[a^*]}{F[a^*]}\,v'[c_2].
\end{aligned}
\tag{38}
\]

Solving, we have:

\[
\begin{aligned}
u'[x_1] &= \lambda, \\
u'[x_2] &= \lambda \Big/ \Big\{1 + \lambda T \frac{f[a^*]}{F[a^*]}\Big\}, \\
v'[c_1] &= \lambda \Big/ \Big\{1 - \lambda T \frac{f[a^*]}{1 - F[a^*]}\Big\}, \\
v'[c_2] &= \lambda \Big/ \Big\{1 + \lambda T \frac{f[a^*]}{F[a^*]}\Big\}.
\end{aligned}
\tag{39}
\]

Examining the FOC, we have $u'[x_2] = v'[c_2]$, implying that if $u$ and $v$ differ by an additive constant, then $x_2 = c_2$. Moreover, we have

Lemma 1. If and only if $T = n - x_2 - c_2 + 2c_1 > 0$, then $x_2 > x_1$, $c_2 > c_1$, and $u'[x_1] < v'[c_1]$.

Proof. The sign of $T$ determines the difference between the ratio of marginal utilities and 1.

\[ \frac{u'[x_2]}{u'[x_1]} - 1 = \frac{1}{1 + \lambda T \frac{f[a^*]}{F[a^*]}} - 1. \tag{40} \]

\[ \frac{v'[c_2]}{v'[c_1]} - 1 = \frac{1 - \lambda T \frac{f[a^*]}{1 - F[a^*]}}{1 + \lambda T \frac{f[a^*]}{F[a^*]}} - 1. \tag{41} \]

\[ \frac{v'[c_1]}{u'[x_1]} - 1 = \frac{1}{1 - \lambda T \frac{f[a^*]}{1 - F[a^*]}} - 1. \tag{42} \]

Theorem. At the optimum, there is positive taxation of second-period work:

\[ T = n - x_2 - c_2 + 2c_1 > 0. \tag{43} \]

And we have:

\[ v'[c_2] = u'[x_2] < u'[x_1] < v'[c_1]. \tag{44} \]

Proof. If $T < 0$, then $a^*$ is zero from the Lemma and the moral hazard constraint.
Corollary. If $u$ and $v$ differ by an additive constant, then $c_2 = x_2 > x_1 > c_1$.

Varying disutility - forward looking case with consumption constraint.

We now modify this model by assuming that consumption of workers must be the same in both periods:

\[ x_1 = x_2 = x. \tag{45} \]

This would be the case if there were a payroll tax with no age variation and no way to pay part of benefits. With an additional constraint, the optimum is not as good as above, since the constraint would be violated in the optimum without the constraint. Note that the implicit tax on second-period work is now:

\[ T = n - x - c_2 + 2c_1. \tag{46} \]

Then the optimization becomes:

\[
\begin{aligned}
\text{Maximize}_{x,c}\quad & \int_0^{a^*} \{2u[x] - 2a + v[c_2]\}\,dF[a] + \int_{a^*}^{\infty} \{u[x] - a + 2v[c_1]\}\,dF[a] \\
\text{subject to:}\quad & E + \int_0^{a^*} \{2x - 2n + c_2\}\,dF[a] + \int_{a^*}^{\infty} \{x - n + 2c_1\}\,dF[a] \le 0 \\
\text{where}\quad & a^* = u[x] + v[c_2] - 2v[c_1].
\end{aligned}
\tag{47}
\]

FOC:

\[
\begin{aligned}
u'[x] &= \lambda - \lambda\{n - x - c_2 + 2c_1\} \frac{f[a^*]}{1 + F[a^*]} \frac{da^*}{dx} = \lambda - \lambda T \frac{f[a^*]}{1 + F[a^*]}\,u'[x], \\
v'[c_1] &= \lambda - \lambda\{n - x - c_2 + 2c_1\} \frac{f[a^*]}{1 - F[a^*]} \frac{da^*}{dc_1} \Big/ 2 = \lambda + \lambda T \frac{f[a^*]}{1 - F[a^*]}\,v'[c_1], \\
v'[c_2] &= \lambda - \lambda\{n - x - c_2 + 2c_1\} \frac{f[a^*]}{F[a^*]} \frac{da^*}{dc_2} = \lambda - \lambda T \frac{f[a^*]}{F[a^*]}\,v'[c_2].
\end{aligned}
\tag{48}
\]

Solving, we have:

\[
\begin{aligned}
u'[x] &= \lambda \Big/ \Big\{1 + \lambda T \frac{f[a^*]}{1 + F[a^*]}\Big\}, \\
v'[c_1] &= \lambda \Big/ \Big\{1 - \lambda T \frac{f[a^*]}{1 - F[a^*]}\Big\}, \\
v'[c_2] &= \lambda \Big/ \Big\{1 + \lambda T \frac{f[a^*]}{F[a^*]}\Big\}.
\end{aligned}
\tag{49}
\]

Examining the FOC, we have

Lemma 1. If and only if $T = n - x - c_2 + 2c_1 > 0$, then $c_2 > c_1$, $v'[c_2] < u'[x] < v'[c_1]$.

Proof.

\[ \frac{v'[c_2]}{v'[c_1]} - 1 = \frac{1 - \lambda T \frac{f[a^*]}{1 - F[a^*]}}{1 + \lambda T \frac{f[a^*]}{F[a^*]}} - 1. \tag{50} \]

\[ \frac{u'[x]}{v'[c_1]} - 1 = \frac{1 - \lambda T \frac{f[a^*]}{1 - F[a^*]}}{1 + \lambda T \frac{f[a^*]}{1 + F[a^*]}} - 1. \tag{51} \]

\[ \frac{u'[x]}{v'[c_2]} - 1 = \frac{1 + \lambda T \frac{f[a^*]}{F[a^*]}}{1 + \lambda T \frac{f[a^*]}{1 + F[a^*]}} - 1. \tag{52} \]

Theorem. At the optimum there is positive taxation of second-period work:

\[ T = n - x - c_2 + 2c_1 > 0, \tag{53} \]

and we have:

\[ v'[c_2] < u'[x] < v'[c_1], \tag{54} \]

\[ c_2 > c_1. \tag{55} \]

Proof. If $T < 0$, then $a^*$ is zero from the Lemma and the moral hazard constraint, contradicting the assumption of an interior optimum.

Corollary.
If $u$ and $v$ differ by an additive constant, then $c_2 > x > c_1$.

Varying disutility - myopic model

We now assume that the marginal second-period worker considers only the utility in the second period when making the retirement decision. Thus the marginal worker has disutility that satisfies

\[ a^* = u[x_2] - v[c_1]. \tag{56} \]

The assumption of an interior optimum now requires different conditions than the same assumption in the forward-looking case.

Note that the implicit tax on second-period work is the wage plus future benefits if not working less the sum of current consumption and future benefits if working. Thus the implicit tax is:

\[ T = n - x_2 - c_2 + 2c_1. \tag{57} \]

Define the "apparent tax" for a myopic worker as the wage plus current benefit less current consumption if working:

\[ A = n - x_2 + c_1. \tag{58} \]

Social welfare maximization is now:

\[
\begin{aligned}
\text{Maximize}_{x,c}\quad & \int_0^{a^*} \{u[x_1] + u[x_2] - 2a + v[c_2]\}\,dF[a] + \int_{a^*}^{\infty} \{u[x_1] - a + 2v[c_1]\}\,dF[a] \\
\text{subject to:}\quad & E + \int_0^{a^*} \{x_1 + x_2 - 2n + c_2\}\,dF[a] + \int_{a^*}^{\infty} \{x_1 - n + 2c_1\}\,dF[a] \le 0 \\
\text{where}\quad & a^* = u[x_2] - v[c_1].
\end{aligned}
\tag{59}
\]

FOC:

\[
\begin{aligned}
u'[x_1] &= \lambda, \\
u'[x_2] + \{v[c_2] - v[c_1]\} \frac{f[a^*]}{F[a^*]}\,u'[x_2] &= \lambda - \lambda\{n - x_2 - c_2 + 2c_1\} \frac{f[a^*]}{F[a^*]}\,u'[x_2] = \lambda - \lambda T \frac{f[a^*]}{F[a^*]}\,u'[x_2], \\
v'[c_1] - \{v[c_2] - v[c_1]\} \frac{f[a^*]}{1 - F[a^*]}\,v'[c_1]/2 &= \lambda + \lambda\{n - x_2 - c_2 + 2c_1\} \frac{f[a^*]}{1 - F[a^*]}\,v'[c_1]/2 = \lambda + \lambda T \frac{f[a^*]}{1 - F[a^*]}\,v'[c_1]/2, \\
v'[c_2] &= \lambda.
\end{aligned}
\tag{60}
\]

Solving the FOC, we have:

\[
\begin{aligned}
u'[x_1] &= v'[c_2] = \lambda, \\
u'[x_2] &= \lambda \Big/ \Big\{1 + (v[c_2] - v[c_1] + \lambda T) \frac{f[a^*]}{F[a^*]}\Big\}, \\
v'[c_1] &= \lambda \Big/ \Big\{1 - (v[c_2] - v[c_1] + \lambda T) \frac{f[a^*]}{1 - F[a^*]} \Big/ 2\Big\}.
\end{aligned}
\tag{61}
\]

Theorem. At the optimum we have:

\[
\begin{aligned}
& (v[c_2] - v[c_1] + \lambda T) > 0, \\
& u'[x_2] < u'[x_1] = v'[c_2] < v'[c_1], \\
& x_2 > x_1, \quad c_2 > c_1.
\end{aligned}
\tag{62}
\]

Proof. At the optimum, $v'[c_2] = u'[x_1]$. Since there is some work, $u[x_2] > v[c_1]$. The moral hazard constraint then implies $u'[x_2] < v'[c_1]$. This implies that $(v[c_2] - v[c_1] + \lambda T) > 0$ and the remaining condition follows from the FOC.

Corollary.
If $u$ and $v$ differ by an additive constant, then $x_2 > x_1 = c_2 > c_1$.

It appears that the tax may be positive or negative since there is a need to subsidize work to offset myopia. The apparent tax is larger than the tax.

\[ \lambda A = \lambda(n - x_2 + c_1) = \lambda\{n - x_2 - c_2 + 2c_1\} + \lambda(c_2 - c_1) = \lambda T + \lambda(c_2 - c_1) > \lambda T. \tag{63} \]

Varying life span and disutility

We use a four-period model where everyone works in the first period, no one works in the third or fourth periods and some people do work and some do not in period two. Assume that everyone survives to period 3, but only some survive to period 4. Let $p$ be the probability of surviving to period 4. Let $a_p$ ($a_p \ge 0$) be the additive disutility of labor in period 2 for someone who has the probability $p$ of surviving to period 4. We assume that $a_p$ is nonincreasing in $p$, $da_p/dp \le 0$. That is, those with longer lives are more capable of working in period 2, as measured by disutility. Assume $p$ is distributed in the population, with distribution $F[p]$. Let $P_0[p]$ and $P_1[p]$ be the average survival probabilities for those below and above $p$:

\[
\begin{aligned}
P_0[p] &= \int_{p_0}^{p} s\,dF[s] \Big/ F[p], \\
P_1[p] &= \int_{p}^{p_1} s\,dF[s] \Big/ (1 - F[p]).
\end{aligned}
\tag{64}
\]

Thus we have:

\[ P_0[p] < p < P_1[p]. \tag{65} \]

As above, assume that at the optimum there is some, but not complete, second-period work. With a forward-looking retirement decision, the marginal second-period worker has survival probability $p^*$ and disutility that satisfies

\[ a_{p^*} = u[x_2] + (1 + p^*)\,v[c_2] - (2 + p^*)\,v[c_1]. \tag{66} \]

Differentiating implicitly, we have:

\[
\begin{aligned}
\frac{dp^*}{dx_2} &= -u'[x_2]/D, \\
\frac{dp^*}{dc_1} &= (2 + p^*)\,v'[c_1]/D, \\
\frac{dp^*}{dc_2} &= -(1 + p^*)\,v'[c_2]/D,
\end{aligned}
\tag{67}
\]

where $D = v[c_2] - v[c_1] - da_p/dp$.

There is a unique solution to this marginal condition provided $c_2 \ge c_1$. We will show that the optimum has this property.

Note that the implicit tax on second-period work for worker-$p$ is the wage plus expected future benefits if not working less the sum of current consumption and expected future benefits if working.
Thus the implicit tax is:

\[ T[p] = n - x_2 - (1 + p)\,c_2 + (2 + p)\,c_1. \tag{68} \]

Implicit taxes are less for those with longer life expectancies.

Social welfare maximization is now (later we will add $G$[lifetime utility]):

\[
\begin{aligned}
\text{Maximize}_{x,c}\quad & \int_{p_0}^{p^*} \{u[x_1] - a_p + (2 + p)\,v[c_1]\}\,dF[p] + \int_{p^*}^{p_1} \{u[x_1] + u[x_2] - 2a_p + (1 + p)\,v[c_2]\}\,dF[p] \\
\text{subject to:}\quad & E + \int_{p_0}^{p^*} \{x_1 - n + (2 + p)\,c_1\}\,dF[p] + \int_{p^*}^{p_1} \{x_1 + x_2 - 2n + (1 + p)\,c_2\}\,dF[p] \le 0 \\
\text{where } p^* \text{ satisfies}\quad & a_{p^*} = u[x_2] + (1 + p^*)\,v[c_2] - (2 + p^*)\,v[c_1].
\end{aligned}
\]

FOC:

\[
\begin{aligned}
u'[x_1] &= \lambda, \\
(u'[x_2] - \lambda)(1 - F[p^*]) &= \lambda T[p^*] f[p^*] \frac{dp^*}{dx_2} = -\lambda T[p^*] f[p^*]\,u'[x_2]/D, \\
(v'[c_1] - \lambda)(2 + P_0[p^*])\,F[p^*] &= \lambda T[p^*] f[p^*] \frac{dp^*}{dc_1} = \lambda T[p^*] f[p^*]\,(2 + p^*)\,v'[c_1]/D, \\
(v'[c_2] - \lambda)(1 + P_1[p^*])(1 - F[p^*]) &= \lambda T[p^*] f[p^*] \frac{dp^*}{dc_2} = -\lambda T[p^*] f[p^*]\,(1 + p^*)\,v'[c_2]/D,
\end{aligned}
\tag{69}
\]

or

\[
\begin{aligned}
u'[x_1] &= \lambda, \\
u'[x_2] &= \lambda \Big/ \Big\{1 + \lambda \frac{T[p^*]}{D} \frac{f[p^*]}{1 - F[p^*]}\Big\}, \\
v'[c_1] &= \lambda \Big/ \Big\{1 - \lambda \frac{T[p^*]}{D} \frac{f[p^*]}{F[p^*]} \frac{2 + p^*}{2 + P_0[p^*]}\Big\}, \\
v'[c_2] &= \lambda \Big/ \Big\{1 + \lambda \frac{T[p^*]}{D} \frac{f[p^*]}{1 - F[p^*]} \frac{1 + p^*}{1 + P_1[p^*]}\Big\}.
\end{aligned}
\tag{70}
\]

Lemma 1. If and only if $T[p^*]/D > 0$, then $u'[x_2] < v'[c_2]$, $v'[c_2] < u'[x_1] < v'[c_1]$.

Proof.

\[ \frac{u'[x_2]}{v'[c_2]} = \frac{K + (1 + p^*)/(1 + P_1[p^*])}{K + 1} < 1, \tag{71} \]

where $K^{-1} = \lambda \frac{T[p^*]}{D} \frac{f[p^*]}{1 - F[p^*]}$. The first inequality follows from the sign of $K > 0$ and $(1 + p^*)/(1 + P_1[p^*]) < 1$. The rest follow from the FOC.

Lemma 2. If $T[p^*]/D > 0$, then $T[p^*] > 0$, $D > 0$.

Proof. If $D < 0$ then $v[c_2] < v[c_1]$, since $da_p/dp \le 0$. From Lemma 1, $v'[c_2] < v'[c_1]$. This is a contradiction.

Theorem. At the optimum we have:

\[ T[p^*] > 0, \quad D > 0, \quad u'[x_2] < v'[c_2] < u'[x_1] < v'[c_1]. \tag{72} \]

Proof. If $T[p^*]/D > 0$, the results follow from the lemmas. If $T[p^*]/D \le 0$, then we contradict the assumption that $a_{p^*} > 0$ by the following argument. If $T[p^*]$ is zero then all marginal utilities are equal and from the moral hazard constraint we have a contradiction. With $T[p^*] < 0$, Lemma 2 gives the same contradiction.

Corollary.
If $u$ and $v$ differ by an additive constant, then $x_2 > c_2 > x_1 > c_1$.
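The four-period marginal condition (66) and the implicit tax (68) can be illustrated numerically. The sketch below is only a hedged illustration: the logarithmic utilities, the linear disutility schedule $a_p = 1 - 0.5p$, the bisection solver, and every number are hypothetical assumptions, and the allocation is not claimed to be optimal.

```python
import math


def implicit_tax_by_survival(p, n, x2, c1, c2):
    """T[p] from equation (68)."""
    return n - x2 - (1.0 + p) * c2 + (2.0 + p) * c1


def marginal_gap(p, x2, c1, c2, a_of_p, u=math.log, v=math.log):
    """Right side of (66) minus a_p; zero at the marginal worker p*."""
    return u(x2) + (1.0 + p) * v(c2) - (2.0 + p) * v(c1) - a_of_p(p)


def solve_p_star(x2, c1, c2, a_of_p, lo=0.0, hi=1.0, tol=1e-12):
    """Bisection for p*, assuming the gap changes sign once on [lo, hi]."""
    g_lo = marginal_gap(lo, x2, c1, c2, a_of_p)
    g_hi = marginal_gap(hi, x2, c1, c2, a_of_p)
    if g_lo * g_hi > 0:
        raise ValueError("no interior marginal worker on the given interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = marginal_gap(mid, x2, c1, c2, a_of_p)
        if g_lo * g_mid <= 0:
            hi = mid                 # root lies in [lo, mid]
        else:
            lo, g_lo = mid, g_mid    # root lies in [mid, hi]
    return 0.5 * (lo + hi)


# Hypothetical inputs with c2 > c1 and disutility nonincreasing in p.
a_of_p = lambda p: 1.0 - 0.5 * p
n, x2, c1, c2 = 1.5, 1.2, 0.8, 1.0

p_star = solve_p_star(x2, c1, c2, a_of_p)
tax_short_lived = implicit_tax_by_survival(0.0, n, x2, c1, c2)
tax_long_lived = implicit_tax_by_survival(1.0, n, x2, c1, c2)
```

With $c_2 > c_1$ and $a_p$ nonincreasing, the gap is increasing in $p$ (its slope is $D > 0$), so the root is unique and workers with $p$ above `p_star` choose to work. The tax falls with $p$ at rate $c_2 - c_1$, matching the remark that implicit taxes are less for those with longer life expectancies.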